Our latest video describes how NoSQL provides unique benefits of scalability and flexibility. Jared describes it as the "Goldilocks Zone" of databases.
Text from the video:
"When you look at a blog or a news site, your mind kicks into high gear and begins to parse all the pieces and parts of the site you are looking at, then it stores in memory the things that interest you. Our minds are so good at doing this that we can even take in information from a website that we’ve never seen before which might follow a different way of displaying the data. But for a moment I’m going to do a little thought experiment: Imagine that before you consumed anything from a webpage you had to understand all the relationships that the webpage had in advance. For example, you would require a perfect understanding of the relationship between the writer and the article, or perfectly understand the relationship of the article to the advertisement. And if you didn’t understand that relationship and all the others, your brain would literally say no to taking in the information… OK, I know this is a weird thought experiment, but this is literally what happens when we store data in a relational database. Because in a relational database we have to know upfront what the structure or schema of the data is before we write data into it. And this prerequisite is one of the primary reasons why NoSQL and Hadoop came about. Now you may have seen my videos that compare Hadoop to SQL, but in this video, I want to spend time discussing NoSQL.
NoSQL represents a broad category of databases that allow large quantities of unstructured and semi-structured data to be stored and managed. Additionally, they’re designed to handle high levels of reads and writes while scaling horizontally. To better explain this, let’s go back to our previous example, but this time I’m going to challenge an assumption I made earlier about our ability to consume data. If you take an honest assessment of your capacity to parse a webpage or data visualization, you’ll find that your mind actually needs quite a bit of context before it can make heads or tails of what you’re looking at. To test this, take a look at some of the data visualizations on d3js, a popular open-source data visualization template site. Pick out some of the less common data visualizations and you’ll come to the realization that your understanding of what is going on is highly reliant on your ability to draw on a previously learned context or experience. Compare that to browsing your favorite web page. You don’t fumble around trying to figure out how to use it; instead, you go immediately into data processing and storage mode.
So there is a performance gain when we’re reading records if we can apply a structure to the information we’re storing. However, it takes a little more upfront time to define those categories. Relational databases take this to one extreme, and Hadoop takes it to another. NoSQL databases, on the other hand, sit in the Goldilocks zone. They allow you to apply a structure to the data but they don’t require it upfront, meaning I can store data even though there isn’t a logical category for it yet. Let me give you an example using SQL as a comparison.
Imagine that I’m storing a product catalog in a relational database, and I get new SKUs on a regular basis from our suppliers. In a relational database, I would have my data model ready to consume data from our suppliers. As data would come in I would need to ensure that the data fits the structure that I’ve defined, and if it didn’t I would need to make a decision. Either I would transform the data to fit our model or change the model to fit the new data element. In most cases, organizations begin transforming the data, and it takes a serious amount of discipline and effort to do so because there are lots of SKUs regularly coming in.
Now imagine that I have a NoSQL database on the back end. I would still have data transformations that apply the tagging to the data, but if data came in that had not been previously categorized, it would still be able to land in the NoSQL database. I could then see the data and decide if it’s a common enough category to create a semi-structured tag for it, which will enable it to be processed quickly for future reads and writes.
In my example, I’ve narrowed my NoSQL type to a document store. However, there are many NoSQL database types, with various structural advantages.
-Column Store: Stores data in columns and can expand to billions of columns allowing for data to be captured and stored, then later categorized and combined
-Graph Database: Stores natural data relationships between data elements to reveal networks like social networks
-Hybrid Cache Store: Combines the power of a document store with sophisticated caching to deliver scalability and speed
The bottom line is, we no longer live in a world of a one-size-fits-all database. There are tangible benefits to taking processes that were previously forced into a relational database and considering the use of NoSQL and Hadoop. Intricity can help you evaluate your backend landscape and conduct a modernization review. I recommend you reach out to Intricity and talk with a specialist. We can provide an unbiased evaluation that helps you better support your internal business processes without putting a square peg in a round hole."
-Jared Hillam
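
The product catalog comparison in the transcript can be illustrated with a rough Python sketch. This is not from the video: the SKU records, the `wifi_band` field, and the `products` table are made-up examples, SQLite stands in for "a relational database", and a plain list of dictionaries stands in for a document store. It only shows the general idea that a fixed schema rejects an unexpected field at write time, while a schema-flexible store accepts the record and lets you decide about the new field later.

```python
import sqlite3
from collections import Counter

# Hypothetical supplier feed: the second SKU carries a field
# that the data model never anticipated.
incoming_skus = [
    {"sku": "A-100", "name": "Desk lamp", "price": 24.99},
    {"sku": "B-200", "name": "Smart bulb", "price": 12.50, "wifi_band": "2.4GHz"},
]

# Relational side: the schema must exist before any row is written.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (sku TEXT PRIMARY KEY, name TEXT, price REAL)")

rejected = []
for record in incoming_skus:
    # Column names come from trusted sample data in this sketch;
    # never build SQL from untrusted keys like this in real code.
    columns = ", ".join(record)
    placeholders = ", ".join(f":{key}" for key in record)
    try:
        conn.execute(
            f"INSERT INTO products ({columns}) VALUES ({placeholders})", record
        )
    except sqlite3.OperationalError:
        # Decision point: transform the record to fit the model,
        # or change the model to fit the new data element.
        rejected.append(record["sku"])

relational_count = conn.execute("SELECT COUNT(*) FROM products").fetchone()[0]

# Document side: every record lands, whatever fields it carries.
document_store = [dict(record) for record in incoming_skus]

# Later, count the fields that have no category yet to decide
# whether any is common enough to deserve one.
known_fields = {"sku", "name", "price"}
uncategorized = Counter(
    field for doc in document_store for field in doc if field not in known_fields
)

print("rows in relational table:", relational_count)
print("SKUs rejected by the schema:", rejected)
print("documents stored:", len(document_store))
print("uncategorized fields seen:", dict(uncategorized))
```

A real document database adds indexing, querying, and persistence on top of this, so the sketch should be read as a picture of the schema trade-off rather than a model of any particular product.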