Consider the following scenarios:
In which of these situations do you need a data warehouse and when can you continue using your workflow?
In this article, we're going to help you identify quantitative and qualitative reasons that'll let you answer these questions for your specific company. It will allow you to diagnose major pain points and justify the reasons for implementing a data warehouse.
And if you decide that you do want to go the data warehouse route, we've got you covered here to help pick which one.
We'll be using a variation of the Input-Process-Output model for analyzing team performance to help guide the decision.
Any data storage technology has an obvious input: data. What separates a data warehouse from a database is that a warehouse integrates data from many different sources. This, in the long run, saves your team time, especially if you have a high update frequency. This gives us two quantifiable input factors:
Having overall high input factors increases the need for a data warehouse. Of course, "high" depends on your company's budget/bandwidth.
Once your data is in and you've spent time loading it in, process factors help break down important dimensions to consider in your decision.
However, if you want to answer questions that require wrangling data from many sources, transforming them, and then creating new metrics from them, then you have high query complexity. A good heuristic is the processing time behind your queries. If it takes you hours, or even days to find business answers then you likely have high query complexity.
The reason why this is important is that data warehouses excel at allowing you to create/update analytical data without slowing down your transactional data backups. Analytical queries take up processing power and time. Time away from carefully maintaining your transactional data! So, the more you create analytical data, the more you need a data warehouse.
This is the factor to consider in your data warehouse decision. Who is the end user of your reporting? If the end users are just your small team, you can perhaps get away with using your current stack. On the other hand, if you're just out of Series B funding (congrats!) and have to report to investors then you most likely need a data warehouse to consistently create professional reports.
Besides the IPO model, there may be qualitative reasons to consider a data warehouse. Your team may want to go a step further and set up a data warehouse now so that you don't have to worry about it down the road. There may be intense pressure for reporting because of a recent funding round and you want to get a warehouse to alleviate that stress. Or, your engineering team just hates the manual process of updating files and it's now affecting performance. All these reasons are hard to measure but shouldn't be ignored to arrive at the right decision for your team.
We covered quantitative and qualitative reasons to help with the data warehouse decision. If you do conclude that you need a data warehouse, there are still more decisions to be made. A big one is choosing among Snowflake, Redshift, or BigQuery. You could also decide to use a turnkey data integration tool like Dropbase to help collect, clean, and centralize all your flat Excel and .csv files. The choice is yours and happy deciding!