Perspectives

Importing Flat Files with Newline Characters in Informatica

During a project that included loading flat file accounting data, I came across a very interesting issue. We were given multiple comma-delimited CSV files with several fields using a double-quote text qualifier. Normally this would not be an issue; however, inside the double quotes were newline characters. Below is an example of the issue […]

How Do I Define Business Rules that Constantly Change?

Over the years I’ve seen companies large and small struggle with managing their business rules. It seems like teams are always asking the same questions: How many rules do we have? Which version is correct? Where did these calculations come from? Having ready answers for these questions is key when it comes to effectively managing […]

Legacy Application Retirement – SP Execution Template Using PowerCenter

A major multinational communications company located in the Midwest, providing communication and data services to its residential and commercial customers, wanted to retire one of its legacy applications to free up system and employee resources. The application consisted of over 150 stored procedures (SPs) physically running on different servers at different schedules. Most of these […]

What Is Apache Kafka?

Apache Kafka is a publish-subscribe messaging system that works with Apache Storm, Apache HBase and Apache Spark for real-time analysis and rendering of streaming data. Kafka is “a high-throughput distributed messaging system.” It was developed at LinkedIn as their streaming data backbone. Part of the Hortonworks Data Platform, it is fast, scalable, and fault tolerant. […]

What Is Apache NiFi?

Apache NiFi is “an easy to use, powerful, and reliable system to process and distribute data.” In simpler terms, NiFi is a system for moving, filtering, and enhancing data with point source security and a nice UI wrapper. NiFi (formerly NiagaraFiles) was developed at the National Security Agency (NSA) and donated to the Apache Software […]

What Is Apache Storm?

Welcome! In this blog I will introduce you to Apache Storm (or just Storm) and share some of its capabilities, along with some potential uses for it. I hope this whets your appetite so you will want to join me and dig in even more on Storm and other topics! First of all, let’s address […]

Data Warehousing and Analytics – Not an IT Function

Does your company’s IT department own your data warehouse or data lake? Data warehouses and data lakes are both storage repositories used to consolidate data, differing only in the format of the data stored. Warehouses are more structured while lakes use a flatter, flexible architecture. What about the analytics tool? It’s a common allocation, and why […]

What Is Apache Cassandra?

Welcome! In this blog I will introduce you to Apache Cassandra (or just Cassandra) and share some of its capabilities, along with some potential uses for it. I hope this whets your appetite so you will want to join me and dig in even more on Cassandra and other topics!

What Is Apache Spark?

Welcome! In this blog I will introduce you to Apache Spark (or just Spark) and share some of its capabilities, along with some potential uses for it. I hope this whets your appetite so you will want to join me and dig in even more on Spark and other topics!

What Is the Network of Things?

Welcome! In this blog I will introduce you to the Network of Things (or NoT) and share some background, along with some potential uses for it. This blog has been inspired by a talk given by Dr. Voas on the topic. I hope this whets your appetite so you will want to join me and […]

Shifting trends require shifting responsibilities

Developers and analysts must expand their skills if the business is to succeed in its efforts to improve agility. A challenge many IT leaders face today is an edict from the business to improve IT agility. Traditionally, analysts have worked with their business counterparts to define requirements. Analysts, in turn, work with developers to deliver […]

Introducing a ‘define once, govern everywhere’ data management style

The sanity afforded by defining data standards only once and applying them anywhere will create time to investigate innovative uses for that data. Until recently, the information manager’s jurisdiction was the enterprise and the data generated by its applications. Information managers now fight a seemingly overwhelming influx of data created by mobile devices, applications in […]

3 reasons you need to re-evaluate your information strategy

A flood of big data is bringing with it pressure for real-time insights from business users and from security and privacy regulations. All organizations now have—or should have—a strategy for managing metadata. But much has likely changed since you implemented your tools. The sheer volume of data is exploding: It is proliferating into the cloud […]

Free your data: 5 steps to data independence

Understand your power as a CIO and use it to effect change by prioritizing innovation. IT (“information technology”) is at an inflection point. We are now defining leadership not by how well we manage the “technology” in IT but instead by how well we can harness the data and “information” to: Deliver the perfect product because […]

How fresh is your data?

Simply getting data is not good enough. You must get it to the right people at the right time while it is still fresh enough to be useful. Data is critical to a company’s success—data about customers, data about products, data about revenue, data about every aspect of the business. You can use it to […]

The role data architects can play in strengthening the IT-business relationship

Automation lets IT meet the ever-changing needs of the business by enhancing speed, reducing costs and risk, and growing revenue. In many organizations, the business-IT relationship is akin to a bad marriage, aggravated by poorly aligned priorities and inadequate communication. Automation can save the relationship, however. It opens the door to “managed” self-service for the […]

Take the leap into big data

Use big data to shape the future of your company. Because if you do not, your successor will. CIOs commonly confuse what is urgent with what is most important to the business. Of course, your IT staff needs to ensure that the email server is up and e-commerce transactions are being processed. But your CEO […]

Rise of the machines: The Internet of Things

Are devices that track our every move poised to unlock new potential in humankind or are they just downright invasive? The privacy and security implications of a network of devices that tracks our preferences, our financial transactions, our location, and our health may seem alarming. The possibilities of using that data to glean deeper insight […]

3 ways to help business leaders become data integration champions

Data integration does not just concern IT. It’s an essential business process. Since the conception of middleware, business users have been more than happy to let IT handle the complex task of integrating enterprise data. Data has been the exclusive purview of IT, whereas business leaders have been more focused on outward-facing agendas. Big data […]

Hortonworks

Hortonworks is a leading innovator in the industry, creating, distributing and supporting enterprise-ready open data platforms and modern data applications. Our mission is to manage the world’s data. We have a single-minded focus on driving innovation in open source communities such as Apache Hadoop, NiFi, and Spark. We, along with our 1,600+ partners, provide the expertise, training and services that allow our customers to unlock transformational value for their organizations across any line of business. Our connected data platforms power modern data applications that deliver actionable intelligence from all data: data-in-motion and data-at-rest. We are Powering the Future of Data™.