Data pipelines are sequences of steps that extract data from many sources, optionally apply transformations, and load the data to a destination. These pipelines are commonly labeled ETL or ELT, depending on when the data is transformed. For example, a company might want to store customer analytics from different device types (mobile app data, website traffic data, etc.) after combining the data geographically. They would start by connecting those data sources to Segment (extract), apply processing steps to group the data by geography (transform), and finally send it to their Snowflake data warehouse (load).
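The extract, transform, and load steps above can be sketched in plain Python. This is a minimal, hypothetical illustration: the data sources and the "warehouse" are stand-in Python objects, not real Segment or Snowflake connections.

```python
def extract(sources):
    """Pull raw event records from each connected source (extract)."""
    records = []
    for source in sources:
        records.extend(source)
    return records

def transform(records):
    """Group customer events by geography (transform)."""
    by_region = {}
    for record in records:
        by_region.setdefault(record["region"], []).append(record)
    return by_region

def load(by_region, warehouse):
    """Write the grouped data to the destination (load)."""
    warehouse.update(by_region)
    return warehouse

# Stand-ins for mobile app data and website traffic data.
mobile = [{"region": "NA", "event": "signup"}]
web = [{"region": "EU", "event": "pageview"}]

warehouse = load(transform(extract([mobile, web])), {})
print(sorted(warehouse))
```

A real pipeline would replace each function with a connector, but the three-stage shape stays the same.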
Once automated, data pipelines give your business intelligence team a strong foundation for creating actionable insights. However, most ETL tools, like Stitch, only extract from online and SaaS sources. What happens if your company spends too much time wrangling flat files (like CSVs and Excel files)? You could spend precious hours editing large CSVs by hand and loading them to an online source, only to realize you made a mistake and have to start all over again. Rinse and repeat every week. This is where Dropbase comes in.
In addition to the usual ETL features for connecting to online data sources, Dropbase accepts a wide variety of flat file formats, lets you apply fully repeatable transformation steps to them, and creates a live, analytics-ready database. The ability to create an automated ETL pipeline that begins with offline flat files is what we call "at the edge ETL".
In Dropbase, Pipelines can be used to solve many types of problems. In general, they are most useful when you regularly process incoming data with the same original schema. You simply specify the processing steps (like steps in a recipe) on your worksheet and deploy them as a pipeline. Let's see how to deploy a pipeline in Dropbase!
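To make the recipe metaphor concrete, here is a hypothetical sketch in plain Python (not Dropbase's actual implementation): a "pipeline" is just an ordered list of transformation steps, and deploying it means re-applying the same steps to every new file that arrives with the same schema. The step names and the sample CSV are invented for illustration.

```python
import csv
import io

def drop_empty_rows(rows):
    """Discard rows where every value is blank."""
    return [r for r in rows if any(v.strip() for v in r.values())]

def normalize_region(rows):
    """Clean up the (assumed) 'region' column in place."""
    for r in rows:
        r["region"] = r["region"].strip().upper()
    return rows

# The pipeline: an ordered recipe of repeatable steps.
PIPELINE = [drop_empty_rows, normalize_region]

def run_pipeline(csv_text, steps=PIPELINE):
    """Parse a CSV with the expected schema and apply each step in order."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    for step in steps:
        rows = step(rows)
    return rows

# A new weekly file with the same original schema.
weekly_file = "region,sales\n na ,100\n,\neu,250\n"
print(run_pipeline(weekly_file))
```

Because the steps are data, not hand edits, next week's file gets exactly the same treatment with no manual CSV surgery.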
Now, whenever you have a new file with the same original schema, simply drop it into the Upload window after clicking your Pipeline. Check out our docs for more details and the video below for an example.