Case Study: Hadoop to Snowflake
Written by Jared Hillam
HBASE to Snowflake on Azure using Snowflake’s Stored Procedures
About CLIENT
CLIENT measures how consumers shop across all channels, sourcing data from both retailers and consumers to quantify sales, share, distribution, and velocity. The market research company collects point-of-sale data, tracking retailers, distributors, and foodservice operators, measuring what’s selling at 1,250 retailers, across 300,000 stores. CLIENT also interviews 12 million consumers annually and tracks millions of their receipts—following the same consumers over time—to understand shifting tastes and trends.
CLIENT helps retailers, manufacturers, financial analysts, and the public sector measure performance, predict future performance, improve marketing and product development, and identify business and consumer trends and market opportunities. CLIENT tracks spending and has dedicated advisers and analysts in more than 20 industries: apparel, appliances, automotive, beauty, books, consumer electronics, e-commerce, entertainment, fashion accessories, food consumption, foodservice, footwear, home, juvenile products, mobile, office supplies, retail, sports, technology, toys, travel retail, video games, and watches/jewelry.
About Intricity
Intricity is a team of specialized Data Management, Data Warehousing, and Business Intelligence experts. The team members at Intricity have been handpicked over the course of 20 years and represent top global talent in data-oriented disciplines.
Challenge and Wins
Challenge
CLIENT adopted a Hadoop ecosystem to process its records. While innovative at the time, it required heavy spending on labor and carried significant technical debt to maintain. Even fundamental tasks like partitioning were completely manual procedures requiring hours of hands-on work.
Intricity reviewed CLIENT’s complex data assembly and merging for CPG data syndication. The data came from multiple data sets, including consumer purchasing events and the products being purchased. Intricity assisted CLIENT in setting up the ingestion of data from HDFS PSV (pipe-separated values) files straight into the Azure Blob Store and copying it into Snowflake. Additionally, Intricity set up the process to handle updates and changes to the data being onboarded. The merging also required products to be appropriately classified so they could be measured, and Intricity assisted CLIENT in bringing this logic into Snowflake. To do this, however, Intricity needed a method of executing the strings of logic procedurally, so Intricity requested early preview access to Snowflake’s Stored Procedures. With Snowflake’s Stored Procedures, Intricity was able to organize the data and distill it into a form that is ready to syndicate to CLIENT’s customers. Once this merging and classification was done, PSV files were generated to initiate legacy processes that support CLIENT’s existing landscape.
Win 1: Ease Of Loading
Landing the PSV files out of HDFS into Blob Storage on Azure and then copying them into Snowflake made loading Snowflake a non-event and an instant win. This greatly simplified the adoption of Snowflake as a platform.
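The case study does not show the actual load commands, so the following is only a minimal sketch of what the copy step could look like. It builds the text of a Snowflake COPY INTO statement for pipe-delimited files. The table name, stage name, and path are hypothetical, and the sketch assumes an external stage already points at the Azure Blob container holding the PSV files.

```python
def build_copy_statement(table: str, stage: str, path: str = "",
                         skip_header: int = 0) -> str:
    """Return a Snowflake COPY INTO statement for pipe-separated files.

    All identifiers passed in are illustrative assumptions; nothing here
    is taken from CLIENT's actual environment.
    """
    # A stage reference is written as @stage_name, optionally with a path.
    location = f"@{stage}/{path}" if path else f"@{stage}"
    return (
        f"COPY INTO {table}\n"
        f"FROM {location}\n"
        # PSV files are treated as CSV with a pipe as the field delimiter.
        "FILE_FORMAT = (TYPE = CSV FIELD_DELIMITER = '|' "
        f"SKIP_HEADER = {skip_header})"
    )


# Hypothetical usage: load purchase events from an assumed stage.
print(build_copy_statement("purchase_events", "azure_psv_stage", "events/"))
```

The generated text would still need to be run through a Snowflake client or worksheet; the sketch only assembles the statement.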
Win 2: Native SQL support with a complex Big Data use case
CLIENT was accustomed to engaging an army of programmers to do anything with Hadoop. However, when they discovered that Snowflake treated SQL as a first-class citizen, they began to see how much easier data manipulation would actually be. The complex merge statements could be written as SQL commands that were much easier to read and write.
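To illustrate the kind of SQL this refers to, here is a hedged sketch that assembles a MERGE statement for applying updates and new records from a staging table. It is not CLIENT's actual merge logic, which the case study does not reproduce. The table and column names are hypothetical.

```python
from typing import Sequence


def build_merge_statement(target: str, source: str,
                          keys: Sequence[str],
                          columns: Sequence[str]) -> str:
    """Return a MERGE statement that updates matched rows and inserts new ones.

    `keys` are the join columns; `columns` are the non-key columns to carry
    over. All names are illustrative assumptions.
    """
    if not keys or not columns:
        raise ValueError("at least one key and one non-key column are required")
    on_clause = " AND ".join(f"t.{k} = s.{k}" for k in keys)
    set_clause = ", ".join(f"t.{c} = s.{c}" for c in columns)
    all_cols = list(keys) + list(columns)
    insert_cols = ", ".join(all_cols)
    insert_vals = ", ".join(f"s.{c}" for c in all_cols)
    return (
        f"MERGE INTO {target} t\n"
        f"USING {source} s\n"
        f"ON {on_clause}\n"
        f"WHEN MATCHED THEN UPDATE SET {set_clause}\n"
        f"WHEN NOT MATCHED THEN INSERT ({insert_cols}) VALUES ({insert_vals})"
    )


# Hypothetical usage: fold staged product changes into a products table.
print(build_merge_statement("products", "products_staging",
                            ["product_id"], ["category", "brand"]))
```

A single declarative statement of this shape covers both the update and the insert path, which is the readability gain the paragraph above describes.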
Win 3: Integrated Processing of Stored Procedures
CLIENT’s data merging processes required a highly complex series of procedures. Intricity was given preview access to Snowflake’s Stored Procedures and leveraged them at CLIENT. The use of Stored Procedures greatly increased the speed at which the complex merges could occur and greatly advanced the preparation of CLIENT’s data for syndication.
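The control flow a stored procedure provides here, running a series of dependent steps in order and stopping when one fails, can be sketched in plain Python. This is a stand-in for the idea only and does not use Snowflake's stored procedure API. The step names and the `execute` callable are hypothetical.

```python
from typing import Callable, List, Tuple


def run_steps(steps: List[Tuple[str, str]],
              execute: Callable[[str], None]) -> List[str]:
    """Run named SQL steps in order and return the names that completed.

    `execute` is any callable that runs one SQL string (for example, a
    database cursor's execute method). Execution stops at the first failure
    so that later steps never run against half-prepared data.
    """
    completed: List[str] = []
    for name, sql in steps:
        try:
            execute(sql)
        except Exception as exc:
            raise RuntimeError(
                f"step '{name}' failed; completed steps: {completed}"
            ) from exc
        completed.append(name)
    return completed


# Hypothetical usage with a fake executor that only records statements.
recorded: List[str] = []
finished = run_steps(
    [("load", "COPY INTO ..."), ("merge", "MERGE INTO ..."),
     ("classify", "UPDATE ...")],
    recorded.append,
)
print(finished)
```

Keeping this orchestration inside the database, as stored procedures do, avoids moving data out to an external program between steps.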
Call to Action
To schedule time to discuss your landscape with an Intricity specialist, go to https://www.intricity.com/contact-us and register to talk with a specialist, or call the office near you.
Who is Intricity?
Intricity is a specialized selection of over 100 Data Management Professionals, with offices located across the USA and headquarters in New York City. Our team of experts has implemented solutions in a variety of industries, including Healthcare, Insurance, Manufacturing, Financial Services, Media, Pharmaceutical, Retail, and others. Intricity is uniquely positioned as a partner to the business that deeply understands what makes the data tick. This joint knowledge and acumen has positioned Intricity to beat out its Big 4 competitors time and time again. Intricity’s area of expertise spans the entirety of the information lifecycle. This means that when your problem involves data, Intricity will be a trusted partner. Intricity's services cover a broad range of data-to-information engineering needs:
What Makes Intricity Different?
While Intricity conducts highly intricate and complex data management projects, Intricity is first and foremost a business-user-centric consulting company. Our internal slogan is to Simplify Complexity. This means that we take complex data management challenges and not only make them understandable to the business but also make them easier to operate. Intricity does this by using tools and techniques that are familiar to business people but adapted for IT content.
Thought Leadership
Intricity authors a highly sought-after Data Management Video Series targeted toward Business Stakeholders at https://www.intricity.com/videos. These videos are used in universities across the world. Here is a small set of universities leveraging Intricity’s videos as a teaching tool:
Talk With a Specialist
If you would like to talk with an Intricity Specialist about your particular scenario, don’t hesitate to reach out to us. You can write us an email: specialist@intricity.com
(C) 2023 by Intricity, LLC
This content is the sole property of Intricity LLC. No reproduction can be made without Intricity's explicit consent.
Intricity, LLC. 244 Fifth Avenue Suite 2026 New York, NY 10001
Phone: 212.461.1100 • Fax: 212.461.1110 • Website: www.intricity.com