ETL vs ESB

Jared Hillam

March 9, 2016

 

Imagine you needed to deliver a package to an office building a block away, but the only way to get it there was to schedule it on a giant truck dispatched from another city. Conversely, imagine needing to deliver thousands of boxes to a warehouse when the only thing you have is a bicycle.

These strange analogies have a close parallel in the data world. When we're moving large quantities of data, the tool of choice is often an ETL tool (ETL stands for extract, transform, and load). However, when we're communicating between individual application processes, we often use an Enterprise Service Bus, or ESB. So in this video, we're going to briefly touch on the unique challenges of moving bulk data as well as real-time transaction data. We'll also discuss how vendors are beginning to blur the lines between these two requirements with unified integration platforms.

Moving bulk data requires a platform that is highly tuned for that very task. It also requires the ability to transform the data at scale. For example, you may want to convert all the state names to their abbreviated form, or run a highly parameterized transformation that sorts the data into separate buckets. And you may want to apply this to millions of records on a schedule. This would be akin to using the truck in our previous example. It's specifically designed for delivering in large quantities.
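To make the bulk transformation idea concrete, here is a minimal Python sketch, not how any particular ETL tool works. The lookup table, the field names (`state`, `amount`), and the bucket thresholds are all hypothetical and chosen only for illustration.

```python
from typing import Iterable, Iterator

# Hypothetical lookup table; a real job would load the full list from reference data.
STATE_ABBREVIATIONS = {"Utah": "UT", "Texas": "TX", "New York": "NY"}


def bucket_for(amount: float, small_max: float = 100.0, medium_max: float = 1000.0) -> str:
    """Parameterized bucketing: the thresholds are assumptions, passed in as arguments."""
    if amount < small_max:
        return "small"
    if amount < medium_max:
        return "medium"
    return "large"


def transform(records: Iterable[dict]) -> Iterator[dict]:
    """Apply the same transformations to every record in the extract."""
    for record in records:
        state = record["state"]
        yield {
            **record,
            # Unknown state names pass through unchanged.
            "state": STATE_ABBREVIATIONS.get(state, state),
            "bucket": bucket_for(record["amount"]),
        }


# A scheduler (cron, for example) would call this over the full extract.
extract = [
    {"id": 1, "state": "Utah", "amount": 50.0},
    {"id": 2, "state": "Ohio", "amount": 5000.0},
]
loaded = list(transform(extract))
```

Because `transform` is a generator, records stream through one at a time, which is one simple way to keep memory flat when the extract holds millions of rows.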

Application integration, on the other hand, requires a real-time service that constantly monitors transactions, because ESB tools typically carry messages between applications. For example, imagine an opportunity closes in your CRM and you want that closed opportunity to trigger an invoicing process in your accounts receivable system. Rather than entering all the information by hand, you would set up a service bus to carry the relevant information to your accounts receivable group, speeding up invoice creation. This would be akin to using the bicycle in our previous example. It's not great for moving a lot of stuff, but it's nimble and purpose-built for getting a few packages from A to B.
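The CRM-to-invoicing flow can be sketched as a tiny in-process publish/subscribe bus in Python. This is an illustration of the pattern only, not the API of any real ESB product. The `ServiceBus` class, the topic name `crm.opportunity.closed`, and the message fields are all hypothetical.

```python
from collections import defaultdict


class ServiceBus:
    """Toy in-memory bus: routes each message to the handlers subscribed to its topic."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._handlers[topic].append(handler)

    def publish(self, topic, message):
        # Deliver immediately, one message at a time (low latency, low volume).
        for handler in self._handlers.get(topic, []):
            handler(message)


# Stand-in for the accounts receivable system's invoice queue.
invoice_drafts = []


def create_invoice_draft(message):
    """Accounts receivable handler: build an invoice draft from the CRM event."""
    invoice_drafts.append(
        {
            "customer": message["customer"],
            "amount_due": message["amount"],
            "source_opportunity": message["opportunity_id"],
        }
    )


bus = ServiceBus()
bus.subscribe("crm.opportunity.closed", create_invoice_draft)

# The CRM publishes an event the moment the opportunity closes.
bus.publish(
    "crm.opportunity.closed",
    {"opportunity_id": "OPP-1", "customer": "Acme", "amount": 1200.0},
)
```

A production bus would add durable queues, retries, and message format translation. The point here is only that the publisher and the subscriber never call each other directly.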

In the past, the lines between these two middleware functions were fairly stark. Organizations would simply have one tool for application integration and another for bulk data loading. This can be frustrating, however, because ETL and ESB share so many of the same basic data movement requirements, namely the ability to deliver data through a structured process. And in the data world, we're not talking about trucks and bicycles with physical-world limitations; we're talking about data and software. So in the back of everybody's mind, we've always known that these two functions could reside in the same place.

And sure enough, the market is finally starting to deliver true multi-latency and multi-volume integration, with tools built from the ground up to cover both functions. Organizations shouldn't rush out to replace their current environments with these new capabilities. However, when defining a new project, latency flexibility and transformation flexibility in the same platform should be a priority going forward. We now live in a world where end-to-end processes can be established that support application business processes all the way through to analytics, using shared transformation and routing objects.
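One way to picture a shared transformation object is a single function that both the scheduled bulk path and the real-time message path call. The Python sketch below assumes a hypothetical `normalize_state` rule and a `state` field. It illustrates the idea and does not describe any vendor's platform.

```python
def normalize_state(record: dict) -> dict:
    """Shared transformation: trim and upper-case the state code."""
    return {**record, "state": record["state"].strip().upper()}


def load_batch(records: list) -> list:
    """High-volume, scheduled path: transform a whole extract at once."""
    return [normalize_state(record) for record in records]


def handle_event(record: dict) -> dict:
    """Low-latency path: transform a single message as it arrives."""
    return normalize_state(record)


nightly_result = load_batch([{"id": 1, "state": " ut "}, {"id": 2, "state": "tx"}])
realtime_result = handle_event({"id": 3, "state": "ny "})
```

Because the rule lives in one place, a change to it takes effect in both the analytics load and the application flow, so the two never drift apart.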

Intricity can help you determine your organization's requirements for best-of-breed multi-latency and multi-volume data management tools. I recommend you reach out to Intricity and talk with one of our specialists. We can help you identify a balanced approach that future-proofs your data integration investment.

 
