Author: research@theseattledataguy.com

managed workflows for apache airflow

What Is Managed Workflows for Apache Airflow On AWS And Why Companies Should Migrate To It

Apache Airflow is a popular open-source tool that helps teams create, schedule, and monitor sequences of tasks, known as “workflows.” Managed Workflows for Apache Airflow (MWAA) is a managed orchestration service built on Apache Airflow. With MWAA, users can easily operate data pipelines at scale, setting up and managing them from…


September 16, 2021 0
segment data consulting

What Is Segment And Why You Should Use It For Your Customer Data Platform

Companies of all sizes are looking for faster and easier ways to take advantage of data through analytics and BI. In particular, knowing who your customers are and how they interact with your business, products, and services is a modern necessity. In a customer-centric world where data can help improve the overall customer experience and relationships,…


August 8, 2021 0
data engineering consulting

What Is The Modern Data Stack And Why You Need to Migrate to It

Photo by Myriam Jessier on Unsplash

The modern data stack (MDS) is a new approach to data integration capable of saving your engineers time while allowing both engineers and analysts to focus on high-value pursuits. With a suite of tools to support data integration, the modern data stack will free your teams of monotony while empowering…


August 4, 2021 0
data engineering consulting

What Is Snowflake And Why You Should Use It For Your Cloud Data Warehouse

Source: https://unsplash.com/photos/sb7RUrRMaC4

Snowflake is a cloud data platform. To be more specific, it’s the first cloud-built data platform. Its architecture allows data specialists to create not only data warehouses but also cloud data lakehouses, because it can easily manage both structured and unstructured data. Because Snowflake is based in the cloud, it…


July 26, 2021 0
data engineering roadmap

Data Engineering Roadmap For 2021 – 12 Steps To Help You Go From 0 To Data Engineering

Maybe it’s the six-figure salaries or the opportunity to work with cool technology, or maybe people are finally learning that data engineering is where everything starts in the data field. Whatever the reason, people are noticing. VCs are investing in data storage and ingestion platforms, and companies are interviewing more data engineers than in previous years. But how…


July 9, 2021 1
data engineering scaling hiring

Scaling a Data Analytics Team for a Billion Dollar Start-Up With Veronica Zhai Of Fivetran

Scaling data talent is hard. It’s hard to hire, hard to grow, and just overall hard in terms of managing processes and output. But there are brilliant leaders and managers doing it every day. I recently interviewed Veronica Zhai, the principal product manager at Fivetran. She has led Fivetran’s analytical team for nearly the last year…


July 4, 2021 0
data strategy cost reduction

Is Data Engineering For You? Maybe you’re a recovering data scientist?

Photo by Christina @ wocintechchat.com on Unsplash

Why does it feel like so many more articles are discussing the data engineering profession? Perhaps it’s because Dice’s 2020 tech jobs report cites data engineering as the fastest-growing field in 2020, increasing by a staggering 50%, while data science roles only increased by 10%? Or maybe it’s just because the Medium…


June 23, 2021 0

Rivery.io – What Is It and How It Can Help You Develop Your Data Pipelines

Photo by Martin Sanchez on Unsplash

There are plenty of clichés about data and its likeness to oil. But it’s far from easy to get value from data. Companies are creating more data than ever before. At our current pace, 2.5 quintillion bytes of data are created every day. Companies went from pulling data solely…


June 22, 2021 0
spark consulting data

What is Apache Spark? And Why Use It?

Photo by Jakub Skafiriak on Unsplash

Born out of frustration with the only open-source distributed programming implementation of the time, Apache Spark was created in the UC Berkeley AMPLab in 2009 to replace its predecessor, Hadoop MapReduce. MapReduce was robust but burdened by excessive boilerplate, serialization, and deserialization. MapReduce was created to anticipate node…


May 29, 2021 0
data engineering consulting

3 Heads Of Data And Founders’ Perspective On Where Data Is Going

Photo by ThisisEngineering RAEng on Unsplash

2021 is almost halfway over, and it seems like hundreds of millions of dollars have gone into investing in data, data start-ups, and machine learning. In particular, funding has shifted heavily from focusing only on the data science and machine learning space to the data engineering and data management…


May 27, 2021 0