Author: research@theseattledataguy.com


How To Automate PDF Data Extraction – 3 Different Methods To Parse PDFs For Analytics

If you work in data, then at some point in your career, you’ll likely need to parse data from a PDF. You might need to parse thousands of PDFs in order to pull out invoice information. Or maybe you need to parse financial filing documents such as 10-Ks. This can seem challenging at first. After all,…


October 2, 2024

How To Modernize Your Data Strategy And Infrastructure For 2025

We are still in the early days of data and the value it can add to companies. You’ll read plenty of statistics about how much value data can drive and how far behind companies that aren’t using data are. And as a data consultant, I have helped companies find that value in their data. It…


September 20, 2024

Real-time Analytics Vs Stream Processing – What Is The Difference?

One of the holy grails that many data teams seem to chase is real-time data analytics. After all, if you can have real-time analytics, you can make better decisions faster. However, real-time data analytics and stream processing are often conflated. These are two different concepts that are crucial to understanding how to…


September 3, 2024

Essential Skills for Data Engineers in the Age of AI

If you work in data, then AI is everywhere at this point. But whether AI is hype or reality doesn’t change the fact that data engineers will play a major role in ensuring that the data sets that are utilized for the growing use cases are usable both by machines and humans. Whether that data…


August 8, 2024

How To Run A Data Team As A New Head Of Data

What would you do if you became the head or director of data for a 1,000-person company? Yesterday, you were plugging along as an analyst, and now, suddenly, you have all these new responsibilities. Figuring out where to start is part of the job. You’d probably feel a strong temptation to freak out. Who wouldn’t?…


August 1, 2024

9 Habits Of Effective Data Managers – Running A Data Team

Running a successful data team is hard. Data teams are expected to juggle a combination of ad-hoc requests, big bet projects, migrations, and more, all while keeping up with the latest changes in technology. In the past few years, I have gotten to work with dozens of teams and see how various directors and managers deal…


July 2, 2024

How To Data Model – Real Life Examples Of How Companies Model Their Data

How companies model their data varies widely. They might say they use Kimball dimensional modeling. However, when you look in their data warehouse, the only parts you recognize are the words fact and dim. Over nearly the past decade, I have worked for and with different companies that have used various methods to capture this data.…


May 31, 2024

Why Data Analysts And Engineers Make Great Consultants

Many data engineers and analysts don’t realize how valuable the knowledge they have is. They’ve spent hours upon hours learning SQL, Python, how to properly analyze data, build data warehouses, and understand the differences between eight different ETL solutions. Even what they might think is basic knowledge could be worth $10,000 to $100,000+ for a…


May 27, 2024

4 ELT Alternatives To Airbyte – How To Ingest Your Data

Getting data out of source systems and into a data warehouse or data lake is one of the first steps in making it usable by analysts and data scientists. The question is, how will your team do that? Will they write custom data connectors, pay for an out-of-the-box data connector, or perhaps…


May 8, 2024

Terms You Should Know If You’re Planning To Use Change Data Capture

If you’ve worked in data long enough, then you’ve likely come across the term change data capture. Often called CDC, change data capture involves tracking and recording changes in a database as they happen, and then transmitting these changes to designated targets. This can be crucial because some pipelines, in particular batch pipelines, don’t capture…


April 29, 2024