When Is It Time To Move To An Automated Data Pipeline?
Toric recently launched no-code data automations. But how do you know when it's time to use them in your organization? In this article, I'm going to explain what an automated data pipeline is and when to use triggers to take advantage of no-code data automation.
What Is An Automated Data Pipeline?
A data pipeline is a system or process that moves data from its source to a destination, where it arrives optimized and ready for analysis. An automated data pipeline is a pipeline that has been set up to run automatically using triggers. Sometimes this is accomplished by a data engineer writing code.
With Toric automations, you can automate any part of the ETL (extract, transform, load) process that makes up the data pipeline, without code.
When do you want to move to an automated pipeline?
My bias is showing, but I believe that anything that can be automated should be, especially if it takes very little setup time. Let's go over the particular situations that have inspired our users to move to an automated data pipeline.
1. When it's difficult to connect to your data sources.
The first step in any data analysis is extracting data. However, this process can be difficult and time consuming, and sometimes it requires the help of an IT team or software engineers.
You can avoid this with direct connections to data sources, including data warehouses, or sometimes directly to the software itself. However, this is not enough. A one-off extraction is useful for initial insights, but an automation is set up so that you can run the extraction again and again at different points in time as the data is updated.
2. When your data constantly changes and you need to track it.
Automations are great when you want to tell the differences between data sets for projects over time. For example, suppose you are a construction company running projects in Procore. You may have native data that tells you what is happening at a particular point in time, but not over time. This data is constantly changing, but the historical records are useful for understanding how to make better decisions for future projects, or how current projects are progressing and at what pace.
In this instance, you can set up an automation with a time-based trigger to create a historical record for analysis.
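To make the idea of a historical record concrete, here is a minimal sketch in plain Python of what a time-based run does conceptually: each time it fires, it stores a timestamped copy of the data as it looked at that moment. This is not how Toric implements it; `take_snapshot`, the `history` list, and the sample task data are all hypothetical names chosen for illustration.

```python
from datetime import datetime, timezone
from typing import Any

Row = dict[str, Any]


def take_snapshot(history: list[dict[str, Any]], rows: list[Row], taken_at: datetime) -> None:
    """Append a timestamped copy of the current rows to the history."""
    # Copy each row so later changes at the source do not alter old snapshots.
    history.append({"taken_at": taken_at.isoformat(), "rows": [dict(r) for r in rows]})


# Hypothetical project data as it looks "right now" in the source system.
current = [{"task": "Foundation", "percent_complete": 40}]
history: list[dict[str, Any]] = []

# First scheduled run.
take_snapshot(history, current, datetime(2024, 1, 1, tzinfo=timezone.utc))

# The source data changes; the source itself only ever shows the latest value.
current[0]["percent_complete"] = 65

# Second scheduled run, one week later.
take_snapshot(history, current, datetime(2024, 1, 8, tzinfo=timezone.utc))

# The history now shows progress over time, which the source alone cannot.
progress = [snap["rows"][0]["percent_complete"] for snap in history]
print(progress)  # [40, 65]
```

The source system only knows the current value (65), while the snapshots preserve the pace of change between runs.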
3. When you want to instantly tell the differences between data sets.
Sometimes it's not enough to have data over time. Instead of, or on top of, the time-based trigger, you want to understand where your data has changed. Maybe you want to trigger an automation based on a change in the data.
In this case, you can use the Toric Diff Node in a data flow and run a dataflow automation. The differences are identified instantly, and a record of the changes over time is created as well.
The diff node allows you to instantly identify differences between different data sets.
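For readers who want to see what a diff between two data sets amounts to, here is a rough sketch in plain Python. It is not the Diff Node's implementation; it assumes each row has a unique key column (here a hypothetical `id`) and reports rows that were added, removed, or changed between two versions. The function name `diff_rows` and the sample data are made up for illustration.

```python
from typing import Any

Row = dict[str, Any]


def diff_rows(old: list[Row], new: list[Row], key: str) -> dict[str, list]:
    """Compare two versions of a data set, matching rows on a unique key."""
    old_by_key = {r[key]: r for r in old}
    new_by_key = {r[key]: r for r in new}

    # Rows whose key exists only in the new version.
    added = [new_by_key[k] for k in new_by_key if k not in old_by_key]
    # Rows whose key exists only in the old version.
    removed = [old_by_key[k] for k in old_by_key if k not in new_by_key]
    # Rows present in both versions but with different values.
    changed = [
        {"key": k, "before": old_by_key[k], "after": new_by_key[k]}
        for k in old_by_key
        if k in new_by_key and old_by_key[k] != new_by_key[k]
    ]
    return {"added": added, "removed": removed, "changed": changed}


# Two hypothetical versions of the same element list.
v1 = [{"id": 1, "type": "Wall", "area": 12.0}, {"id": 2, "type": "Door", "area": 2.0}]
v2 = [{"id": 1, "type": "Wall", "area": 14.5}, {"id": 3, "type": "Window", "area": 1.5}]

result = diff_rows(v1, v2, key="id")
print(result["added"])    # the Window row
print(result["removed"])  # the Door row
print(result["changed"])  # the Wall row, before and after
```

Matching on a key rather than on row position is a design choice: it keeps the comparison stable even when rows are reordered between versions.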
4. When you are working with 3D model data.
The diff node and automation triggers have been particularly useful for users who work in Revit or have a warehouse of Revit files. For these projects to progress in a meaningful way, you need to tell the difference between models and model versions. See our model comparison app as an example.
5. When you want standardized data cleanup.
Toric's data transformations are non-destructive, so another great use case for data automation is the cleanup and transformation of historical database data. Once you understand how you want to standardize your data, you can build a customized dataflow from the data sources, run the automation through a Toric dataflow or project, and then make the now standardized data available from Toric's data warehouse or another data warehouse.
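As a small illustration of what non-destructive standardization means, the sketch below (plain Python, not Toric's transformation engine) builds a cleaned copy of the data and leaves the original rows untouched. The column names `project`, `cost`, and `cost_usd` and the cleanup rules are assumptions invented for the example.

```python
from typing import Any

Row = dict[str, Any]


def standardize(rows: list[Row]) -> list[Row]:
    """Return a cleaned copy of the rows without modifying the input."""
    cleaned = []
    for row in rows:
        cleaned.append({
            # Trim stray whitespace and use consistent capitalization.
            "project": row["project"].strip().title(),
            # Accept numbers or strings like "$1,200.50" and store a float.
            "cost_usd": float(str(row["cost"]).replace("$", "").replace(",", "")),
        })
    return cleaned


# Hypothetical messy historical records.
raw = [
    {"project": "  north tower ", "cost": "$1,200.50"},
    {"project": "SOUTH LOT", "cost": 300},
]

clean = standardize(raw)
print(clean)
print(raw[0]["project"] == "  north tower ")  # True: the source rows are unchanged
```

Because the same function runs on every automation cycle, each new batch of data is cleaned by the same rules.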
How do no-code data automations work?
Connect directly to your data source and build an automation using Toric's no-code tools, such as automation triggers.
Watch our Toric 101 video series to learn how to build your own, or the video below on how to create a data automation.
4 Types of Data Automation Triggers
Once I've connected to my data source, I need to set the trigger for this automation.
1. Data Automation Triggers by Source Updated.
By default, the trigger is going to be source updated, which means my automation is going to run every time my source has a new version. I can then select the source type and the project for the specific source.
2. Time Based Data Automation Triggers.
The other options for the trigger are time based: I can select an interval in hours and minutes, and my automation is going to run at that frequency. This trigger is great for tracking projects and comparing progress over time.
3. Manual Data Automations.
I could also set up a manual automation, which only runs when I want it to. There will be a run automation button, and once I click it, my automation is going to run.
4. Webhook Based Automations.
Finally, I have a webhook based automation. A webhook allows one application to send real-time data to another whenever a given event occurs.
So a webhook automation pulls data from the source every time a change event is triggered. This is different from a source updated automation, which runs when the latest version of the source is available in Toric. A webhook automation runs when a change in the data is made natively at the source.
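The difference between the four triggers is easier to see side by side. The sketch below models them in plain Python as a simple lookup from trigger type to the kind of event that fires it. It is only a mental model, not Toric's API: the `Trigger` enum, the `Event` class, and the event names (`new_version`, `tick`, `button_click`, `change_event`) are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Trigger(Enum):
    SOURCE_UPDATED = "source_updated"  # a new version of the source is available
    TIME_BASED = "time_based"          # a scheduled interval has elapsed
    MANUAL = "manual"                  # someone clicked the run button
    WEBHOOK = "webhook"                # the source application reported a change


@dataclass
class Event:
    kind: str  # one of the hypothetical event names below


# Which kind of event fires which trigger type (names are illustrative).
FIRES_ON = {
    Trigger.SOURCE_UPDATED: "new_version",
    Trigger.TIME_BASED: "tick",
    Trigger.MANUAL: "button_click",
    Trigger.WEBHOOK: "change_event",
}


def should_run(trigger: Trigger, event: Event) -> bool:
    """Return True when this event is the one the trigger is waiting for."""
    return FIRES_ON[trigger] == event.kind


# A change made natively at the source fires a webhook automation...
print(should_run(Trigger.WEBHOOK, Event("change_event")))         # True
# ...but not a source updated automation, which waits for a new version.
print(should_run(Trigger.SOURCE_UPDATED, Event("change_event")))  # False
```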
Set Up Automation Actions
The next thing I need to do is set up the action for the automation.
This is what happens once the automation is triggered. By default, it's going to be run data flow.
Dataflow Automation Action Trigger.
Run data flow means that if I select a project and then a corresponding data flow, the data flow is going to be executed every time my automation runs.
Import Data Automation Action Trigger.
Similarly, I have the import data action type.
The way this works is that I select an application, which is going to be one of my integrations, and then set up the configuration for that application.
So if I select target broad, I can then select a destination project and the corresponding channel, and finally create my automation.
Enable the Automation
Remember that once you create your automation, there's going to be a tab in the top right corner which is disabled. The automation will only actually be executed once you enable it.
Automation triggers can be incredibly useful for anyone looking to simplify their data analysis. Get in touch with our team to get started on your own data pipeline.
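The relationship between trigger, action, and the enabled state can be summed up in a few lines of plain Python. This is a conceptual sketch only, not Toric's implementation: the `Automation` class, its `fire` method, and the placeholder action are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Automation:
    name: str
    action: Callable[[], str]  # what happens when the automation is triggered
    enabled: bool = False      # a newly created automation starts disabled

    def fire(self) -> Optional[str]:
        """Called when the trigger occurs; runs the action only if enabled."""
        if not self.enabled:
            return None
        return self.action()


# A placeholder action standing in for "run data flow".
automation = Automation("weekly snapshot", action=lambda: "ran data flow")

before = automation.fire()  # trigger occurs while disabled: nothing runs
automation.enabled = True   # the equivalent of enabling it in the UI
after = automation.fire()   # trigger occurs again: the action runs

print(before)  # None
print(after)   # ran data flow
```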