Tag: Data Science

parsing pdfs with python

Challenges You Will Face When Parsing PDFs With Python – How To Parse PDFs With Python

Scraping data from PDFs is a rite of passage if you work in data. Someone somewhere always needs help getting invoices parsed, contracts read through, or help with dozens of other use cases. Most of us will turn to Python and our trusty list of Python libraries and start plugging away. Of course, there are many challenges…


November 19, 2024
metrics consulting

How Data Teams Drive Business Success by Understanding Core Metrics

A key responsibility for any data team is to understand the core metrics driving their business. Starting from the top, these metrics often include figures like gross revenue and expenses. However, these high-level metrics can feel too far removed and abstract from the actual business. Many companies, therefore, break down these top-line metrics into more…


November 13, 2024
how to lead a data team

9 Must-Watch Videos for Aspiring Data Leaders: Bridging Tech and Business for Data Team Success

Leading data teams can be challenging. You’ve got management and non-technical teams constantly reaching out with ad-hoc data requests, and you’re likely trying to figure out which tools will work best without breaking the bank. Not to mention, you’ve got to bridge the gap between business and technology. All while trying to grow your data…


November 6, 2024
leading a data analytics team

How To Run A Data Team As A New Head Of Data

What would you do if you became the head or director of data for a 1,000-person company? Yesterday, you were plugging along as an analyst, and now, suddenly, you have all these new responsibilities. Figuring out where to start is part of the job. You’d probably feel a strong temptation to freak out. Who wouldn’t?…


August 1, 2024
apache druid architecture

Apache Druid’s Architecture – How Druid Processes Data In Real Time At Scale

Recently, I wrote an article diving into what Druid is and which companies are using it. Now I wanted to do a deeper dive into Apache Druid’s architecture. Apache Druid has several unique features that allow it to be used as a real-time OLAP. Everything from its various nodes and processes that each have unique…


March 11, 2024
ssis migration project

Alternatives to SSIS (SQL Server Integration Services) – How To Migrate Away From SSIS

SQL Server Integration Services (SSIS) comes with a lot of functionality useful for extracting, transforming, and loading data. It can also play important roles in application development and other projects. But SSIS is far from the only platform that can provide these services. You might seek alternatives to SSIS because you want a more agile…


February 27, 2024
data quality

Why Your Team Needs To Implement Data Quality For Your AI Strategy

Companies ranging from start-ups to enterprises are looking to implement AI and ML in their data strategy. With that, it’s important not to forget about data quality. Regardless of how fancy or sophisticated a company’s AI model might be, poor data quality will break it. It will make the outputs of these models useless…


February 12, 2024
cut data stack costs

Cutting Your Data Stack Costs: How To Approach It And Common Issues

I once had an engineer tell me that they essentially didn’t want to consider cost as they were building a solution. I was baffled. Don’t get me wrong, yes, when you’re building, you iterate and aim to improve your solution’s cost. But from my perspective, I don’t think completely ignoring costs from day one is…


January 5, 2024

Why Is Data Modeling So Challenging – How To Data Model For Analytics

Photo by Shubham Dhage on Unsplash

Learning how to data model from basic star schemas on the internet is like learning data science using the Iris data set. It works great as a toy example. But it doesn’t match real life at all. Data modeling in real life requires that you fully understand the data…


August 9, 2023
snowflake consulting

Data Warehouses Vs Operational Data Stores Vs Data Lakes – How To Store Your Data For Analytics

Photo by Leif Christoph Gottwald on Unsplash

A few months ago, I uploaded a video where I discussed data warehouses, data lakes, and transactional databases. However, the world of data management is evolving rapidly, especially with the resurgence of AI and machine learning. There are numerous other methods that technical teams are utilizing to handle…


August 3, 2023