Blog

What Is PDFMiner And Should You Use It – How To Extract Data From PDFs

PDF files are one of the most popular file formats today. Because they can preserve the visual layout of documents and are compatible with a wide range of devices and operating systems, PDFs are used for everything from business forms and educational material to creative designs. However, PDF files also present multiple challenges when it…
January 18, 2025

The Basics of SFTP: Authentication, Encryption, and File Management

If you’re looking to pass hundreds of GBs of data quickly, you’re likely not going to use a REST API. That’s why every day, companies share data sets of users, patient claims, financial transactions, and more via SFTP. If you’ve been in the industry for a while, you’ve probably come across automated SFTP jobs that…
December 23, 2024

Alternatives to Azure Document Intelligence Studio: Exploring Powerful Document Analysis Tools

Document Intelligence Studio is a data extraction tool that can pull unstructured data from diverse documents, including invoices, contracts, bank statements, pay stubs, and health insurance cards. The cloud-based tool from Microsoft Azure comes with several prebuilt models designed to extract data from popular document types. However, you can also use labeled datasets to train…
December 13, 2024

Preparing Your Data Infrastructure for 2025: Lessons from the Past, Strategies for the Future

When I broke into the data world, everyone wanted to hire data scientists who would help their companies become more data-driven. There were statistics about the exabytes of data that we were creating and the value it would provide. However, a few years into my career, the data world started to pivot…
December 3, 2024

Challenges You Will Face When Parsing PDFs With Python – How To Parse PDFs With Python

Scraping data from PDFs is a rite of passage if you work in data. Someone somewhere always needs help getting invoices parsed, contracts read through, or dozens of other use cases. Most of us will turn to Python and our trusty list of Python libraries and start plugging away. Of course, there are many challenges…
November 19, 2024

From IC to Data Leader: Key Strategies for Managing and Growing Data Teams

There are plenty of statistics about the speed at which we are creating data today. On the flip side of all that data creation is the need to manage it, and that’s where data teams come in. But leading these data teams is challenging, and yet many new data…
November 18, 2024

What is Unstructured Data? A Guide to Storage, Processing, and Analysis

Much of the data we have used for analysis in traditional enterprises has been structured data. It’s easy for humans to break down, understand, and, in turn, find insights from it. Yet much of the data being created, now and in the future, comes in some unstructured format. However, the digital era…
November 13, 2024

How Data Teams Drive Business Success by Understanding Core Metrics

A key responsibility for any data team is to understand the core metrics driving their business. Starting from the top, these metrics often include figures like gross revenue and expenses. However, these high-level metrics can feel too far removed and abstract from the actual business. Many companies, therefore, break down these top-line metrics into more…
November 13, 2024

What Is AWS DMS And Why You Shouldn’t Use It As An ELT

Recently, I’ve encountered a few projects that used AWS DMS as something close to an ELT solution, whether that meant moving data from a local database instance to S3 or to some other data storage layer. It was interesting to see AWS DMS used in this manner, but it’s not what DMS was built for. As…
November 8, 2024

9 Must-Watch Videos for Aspiring Data Leaders: Bridging Tech and Business for Data Team Success

Leading data teams can be challenging. You’ve got management and non-technical teams constantly reaching out with ad-hoc data requests, and you’re likely trying to figure out which tools will work best without breaking the bank. Not to mention, you’ve got to bridge the gap between business and technology. All while trying to grow your data…
November 6, 2024