If you’ve seen my video on ETL then you’re aware of the important role these kinds of integration tools play within an organization. However, in that video, I don’t spend a lot of time talking about ELT. So in this video, I’d like to draw a distinction between these two options.
ETL stands for Extract Transform and Load whereas ELT stands for Extract Load and Transform. The distinction is based on the order of events. In ETL we apply transformations to the data while it’s being moved. Whereas in an ELT scenario, we are transforming the data after it has been moved. Each has its own distinct advantages.
ETL has the advantage of landing data in its finished and transformed state. So it can handle near real-time scenarios if the transformation logic isn’t too crazy. This is a bit of a balancing act because as the complexity and volume of data increases, the ability of the ETL tool to load the data in a timely fashion is compromised. Thus leading us to ELT.
ELT has the advantage of leveraging a database to drive transformations. So rather than running the transformations while the data is in flight, we load the data without any transformations. Then after it is loaded we use the target database’s horsepower to make changes to the data. This allows us to massively increase the data footprint and transformation capacity.
Lately, ELT has been an exciting option because of what is happening with MPP database platforms in the cloud. These new platforms are now allowing for dynamic allocation of compute resources that don’t conflict with users' queries or other compute needs. Additionally, these platforms are enabling companies to only pay for the compute resources that they use, essentially allowing companies to be non-committal to any hardware footprint. Now that's a mouthful of detail, but what it essentially means is that if I use ELT as my primary data transformation method; the moment the data leaves my source system, it lands into a place with near-infinite storage and compute resources.
We at Intricity have experienced some amazing results leveraging ELT methods on these platforms. Complex data transformations which took 12 hours to run on in-memory on-premise hardware were brought down to 10 minutes when re-architected to use a dynamic cloud MPP architecture.
Intricity has a methodology for migrating existing Data Warehousing solutions over to this highly elastic cloud architecture. I’ve written a white paper titled: The “Do No Harm” DW Migration, that describes this methodology at a high level. If you would like to review our approach click here, and of course you can always reach out to Intricity to talk with a specialist about your existing data warehouse.