What Is Fivetran And Why You Should Use It
When your team needs to create dashboards and metrics using data from external sources like Salesforce and MySQL, they face a difficult problem.
How do they get all their data into one place in order to merge it all together?
Traditionally you would need to have a large team of data engineers to connect to all these various data sources, set up complex ETL jobs using either custom scripts or python libraries like Airflow and Luigi and then they would have to ensure that those pipelines were constantly running.
This method is both time consuming and expensive.
Thankfully, there are modern tools that can help your team simplify your data workflow and instead of 4, 5 maybe 10 data engineers. You can go down to just one.
One of those tools is Fivetran.
What Is Fivetran
Fivetran is a highly comprehensive ELT tool that is becoming more popular every day.
You’re probably wondering, what is an ELT?
ELT stands for extract, load, transform.
These are the steps required to take your data from a system like Saleforce or Quickbooks and insert it into your centralized data storage system (often referred to as a data warehouse).
Basically, Fivetran allows for efficient collection of business processes and customer data from related applications, websites, and servers. The data collected is then transferred to other tools for analytics, marketing, and data warehousing purposes.
Why Are Companies Turning To Fivetran?
As more third-party tools provide access to their data sources, companies are looking to build more and more data pipelines that get data into their data warehouses and data lakes.
Therein lies a major problem.
Data engineers, the ones responsible for building those pipelines are often the bottleneck.
Between having to build and deploy multiple new pipelines, building new datasets, and the constant one-off requests to add new columns. All of these requests start to quickly weigh down any data engineer team.
One option is to continually hire new data engineers. However, this is a very expensive endeavor. A data engineer can easily cost a company upwards of 100k per year. That’s a large extra expense to take on.
So that why Fivetran and other ELT tools have become popular.
Just to see how valuable Fivetran is. Here are some great Fivetran use cases.
Do You Need Fivetran And Data Infrastructure Consulting? Contact Our Team Of Data Infrastructure And Machine Learning Experts Today For A Free 30 Minute Consultation
Fivetran Case Studies
Fivetran recently released a great set of case studies on how they were able to improve companies like MVF, Starva, Square, Falcion.io, and several other companies’ modern data stacks. These companies were able to use Fivetran to reduce their costs as well as increase the speed they were able to get data into their data warehouses.
Let’s look at a few of those use cases.
MVF Increases Monthly Income By £400,000
MVF is a customer generation company that helps clients gain leads and delivers massive volumes of high-quality leads on a pay-per-lead basis. MVF’s main focus was it’s customer facing technology.
However, on their data side they had a brittle, error-prone solutions and hacks: Excel, Google Sheets, Access databases and one-off SQL queries. This one system was trying to manage 20-30 different data sources. Including data sources like Amazon S3, Bing Ads, Google Ads, Google Analytics, Iterable, LinkedIn Ad Analytics, MySQL and several others.
All with just two data engineers.
The team at MVF found their solution in Fivetran.
MVF’s Head of Analytics found that “Fivetrans connector coverage is vast, visibility on monitoring alerts is great, and the support has been brilliant — that is key.” MVF was able to take their small team of data engineers and create a large amount of impact. They were able to take two engineers powered by Fivetran and create an increases revenue of about £400,000 per month. All based on new ability to visualize leads.
But that’s not all.
MVF’s team was also able to reduce the maintenance cost of managing their complex pipelines, they were able to reassign engineering time towards bigger picture strategic tasks as well as increase data literacy by creating automated Looker reports.
Fivetran (and similar tools like Fivetran) is a key piece of the modern data stack.
Strava Saved $120,000 And Gained Insights Into Their Customers
Building customer 360 tools and digging into a customer’s journey is a necessity in a world with personally targeted ads and the expectation of user personalization.
Strava needed to better understand the customer journey and refine its marketing investment strategy.
There was one major problem. Their engineering team was too small to dedicate an engineer to extract and centralize all siloed data. So they needed a solution that let them easily pull data quickly without dedicating an entire FTE to just pull data from Apple Search Ads, Facebook Ad Insights, Google Ads, Google Play, Google Search Console, iTunes Connect, and Zendesk.
Traditionally this would take a lot of time requiring a custom connector or data pull from all these various sources.
Instead, Strava took advantage of Fivetran.
Using Fivetran, Strava was able to reduce the need to hire an FTE. as put by Michael Li, a data scientist at Starva. “Without Fivetran, we would need an additional full-time engineer to support the marketing team’s data needs, but Fivetran does more than free up time — it enhances our capabilities by enriching our marketing data set.” Again, a single data engineer can cost a company upwards of $100,000 a year(not including benefits).
At the end of the day, Fivetran allowed Strava to centralize their data from multiple external platforms and eliminate the need for an additional full-time engineer to support the marketing team’s data needs. All of this leads to them also being able to quickly and efficiently prioritizes product features and then build an attribution model to better understand user acquisition and map the entire customer journey.
Square Optimizing Limited Engineering Resources
Square suffered from a similar problem that I have seen multiple companies have. The need to do operational data work like adding columns, maintain pipelines, and make add-on tables can quickly divert engineering resources away from more impactful work such as integrating data sets and improving internal tooling.
This is why Square turned to Fivetran. In the words of Guli Zhu, Head of Marketing Analytics, this allowed Square to “spend less time on data plumbing, so we can focus on innovative initiatives like leveraging chatbots for lead qualification. And we’ve significantly elevated our infrastructure capabilities — a necessity now that we’re a $20 billion public company.”
Even large companies are finding Fivetran useful. Square was able too frees engineering resources to focus on innovation and product improvement (e.g., leveraging chatbots for lead qualification, building sales-enablement tools) and easily accommodates maintenance and internal tool improvement requests (e.g., a better SFTP loader) instead of deprioritizing them This all lead to them improving their analytical outcomes with continually updated data.
These were just three of the many case studies that Fivetran put together. If you would like to see some of the rest, then you can find them here.
But now let’s go over some of Fivetran’s features.
Fivetran Data Connectors
Fivetran does all this by offering a myriad of connectors to both source and destination connectors that allows for both pushing and pulling of data. The pull connectors will pull from data sources in a variety of methods including ODBC, JDBC, and multiple API methods. That’s a lot of technical abbreviations that translate to Fivetran being able to connect to nearly a hundred different data sources.
What Are Pushing And Pulling Data Connectors
Fivetran’s push connectors receive data that a source sends, or pushes, to them. In push connectors, such as Webhooks or Snowplow, source systems send data to Fivetran as events. This is very useful to make sure your data is as real-time as possible. From there, you can start implementing logic, and dashboards. This is where Fivetran data transformations come into play.
How Does Fivetran Transformations Work
This tool isn’t limited to only extracting and loading data.
After loading the data, Fivetran offers data teams the ability to easily set up custom data transformations. These transformations can run in your destination after you load your data in, so your raw data is always available alongside your transformed data. If a transformation fails, you can look into the logs to try to figure out the issue. If your analytical needs change, you can edit your transformations and run them again on your raw data.
There are two types of transformations you can use. Custom SQL and dbt transformations. Both can be set to run when you need and can help shape your data as need be.
What’s also great is if your team uses dbt transformations they can be version controlled to ensure your team can roll back to a previous data transformation.
For those unfamilar with dbt.
dbt is an open-source software that enables you to perform sophisticated data transformations in your destination using simple SQL statements. With dbt, you can, write and test SQL transformations, create and share documentation about your dbt transformations and view data lineage graphs.
Some other high level differences are listed below.
Once you have created your transformations. You will need to schedule them to run.
How Does Fivetran Manage Data Scheduling
One of the hacky things data engineers do to manage scheduling is put everything into cron or Windows Task Scheduler. Both of these tools allow you to schedule jobs at specific times and intervals.
The problem is, this can quickly become very messy.
Fivetran makes it easy. You can set your transformations to run either at specific intervals through the user interface or set it to run when new data is added to your source.
It’s also important to note that depending on how you set up your sources as well as which sources you are using can change how your data is updated.
Fivetran refers to the initial data load essentially as a sync. After the initial sync of your historical data, Fivetran moves to performing incremental updates of any new or modified data from your source database. During incremental syncs, Fivetran uses your database’s native change capture mechanism to request only the data that has changed since our last sync, including deletes. Each database uses a different change capture mechanism.
You can also set up re-syncs that will essentially wipe out all the previous data in the past.
The point here is that most of this data loading, is done without setting up complex API connections.
Should I hire a BI Developer/Data Engineer For Fivetran
Tools like Fivetran make developing ELTs very easy. The scheduling, connectors and transformations remove a lot of the heavy infrastructure lift that requires data and software engineers.
However, this doesn’t mean you don’t need a data expert.
Between needing to build dashboards off the data as well as creating SQL queries that are reliable and robust, having a strong data expert is important.
Our team can help you implement your Fivetran transformations as well as develop dashboards and metrics. Reach out to us today for a free consultation.
If you are interested in reading more about data science or data engineering, then read the articles below.
Ready to Graduate From Your ELT Tool? Here’s Your Next Move
Unleashing the full power of data insights using Hyperquery
Fivetran Custom Connectors: How They Work & How To Get One
4 SQL Tips For Data Scientists
What Are The Benefits Of Cloud Data Warehousing And Why You Should Migrate