Author: research@theseattledataguy.com

cloud consulting

Airbnb’s Airflow Versus Spotify’s Luigi

Photo by Marcin Jozwiak on Unsplash. We recently wrote about ETLs and why they’re important. We wanted to provide an outline of what ETL tools are. You could refer to these ETL tools as workflow tools that help manage moving data from point A to point B. Two of these popular workflow tools are Luigi by Spotify and…


November 25, 2019

Healthcare Fraud Detection With Python

This April, a 1.5-billion-dollar Medicare scheme took advantage of hundreds of thousands of seniors in the US. In reality, this is just a small sliver of the billions of dollars that healthcare fraud costs both consumers and insurance providers annually. Healthcare fraud can come from many different directions. Some people might think of the…


November 6, 2019

What Are ETLs and Why Are They Important?

Creating a world of self-service analytics. Photo by chuttersnap on Unsplash. The rise in self-service analytics is a significant selling point in the business intelligence world. Part of the point of creating self-service analytics is having easy access to the data from your organization. The question is how do you get your data from external application…


November 2, 2019

5 Use Cases for DynamoDB

Introduction: Web-based applications face scaling challenges due to the growth in users and the increasing complexity of data traffic. Along with the modern complexity of business comes the need to process data faster and more robustly. Because of this, standard transactional databases aren’t always the best fit. Instead, databases such as DynamoDB have been designed…


October 30, 2019
predictive modeling

What Is Predictive Modeling?

Photo by Roman Mager on Unsplash. In this modern world, it is hard to imagine visiting a website that doesn’t automatically personalize what you see or predict what product you will want to buy. It seems like the whole World Wide Web already knows who we are. Well, this is what predictive modeling enables us to…


October 15, 2019

Why Big Data Analytics is a Necessity in Digital Advertising

In the digital age, the traditional way of advertising is slowly fading away. This means fewer brochures and more Facebook videos, fewer mailers and more personalized e-mails. An article on The Balance lists some of the cons of traditional advertising methods, which include hard-to-quantify results, expensive legwork, and unappealing hard-sell…


October 15, 2019

DynamoDB vs. Hadoop vs. MongoDB

Are All NoSQL Systems The Same? Photo by Campaign Creators on Unsplash. Which database is best for your current business needs usually depends on the skill set of your dev team and the applications already in place. Understanding which database system will best fit your company’s current and future needs is an important step. Databases…


October 5, 2019
data science agile

Using Agile Methodologies in Data Science

Photo by Matteo Vistocco on Unsplash. Agile is an umbrella term that refers to several methodologies that focus on being iterative and on getting tangible products and features out quickly at the end of what are often called sprints. This framework has been adapted for multiple domains, including programming and design. Similarly, data science has also…


October 4, 2019

Why Do You Need Data Engineers And What Do They Do?

Photo by Jan Kolar / VUI Designer on Unsplash. Introduction: As vital members of the data analytics team in any firm, data engineers are responsible for overseeing, optimizing, monitoring, and managing data storage, distribution, and retrieval throughout the company. They are responsible for developing new algorithms and finding the latest trends in data sets that can be used…


September 26, 2019
healthcare analytics

Using BigQuery And SaturnCloud To Analyze Medical Data

Today we wanted to discuss using cloud tools that are available to everyone to analyze a medical data set. In particular, we will be using the Kaggle data set for Medicare providers. This has information on diagnosis-related group average costs, hospital locations, and interesting facts about providers and their service quality. These data sets…


September 17, 2019