From IC to Data Leader: Key Strategies for Managing and Growing Data Teams
There are plenty of statistics about the speed at which we are creating data today. On the flip side of all that data creation is the need to manage it, and that’s where data teams come in.
But leading these data teams is challenging, and yet many new data leaders get very little training. Instead, they are just thrown into the mix.
There is just the expectation that they will do a good job because they were good individual contributors.
So I wanted to start to put together a list of what you need to know and do to be successful as a leader in the data space.
This ranges from taking even more time to understand the business to being intentional about how you hire and place talent.
So if you’re thinking of leading a data team, or perhaps already are, this is for you!
Understand the Business
If you’re doing these in any sort of priority order, this should be number one. – Tom Rampley Head Of Data At LastPass
Many people in the data world like to talk about the business side of organizations needing to become data fluent and not enough talk about how data professionals need to become business fluent.
How?
You must talk to and understand the business while doing your own research on how your industry operates. Someone has to act as the liaison between the business and data, and it’s likely going to be whoever is in charge of the data team. Otherwise, you’ll have a hard time building what the business needs.
Yes, it’s important for the business to have a basic understanding of how to interpret charts and grasp fundamental data concepts.
However, it’s often easier for a data professional to learn about how a sales funnel operates, or how patient workflows in a hospital translate into data, than for non-data professionals to dive deep into analytics.
After all, data leaders are the ones analyzing the data, which is essentially a reflection of the business itself. They are well-positioned to ask the right questions and uncover insights.
On the topic of understanding the business, I really appreciated Veronika Durgin’s insight on making your data team indispensable.
Data teams become critical and irreplaceable when they are working on critical company initiatives – Veronika Durgin VP of Data at Saks
If you want to be part of company initiatives and not just be an ad-hoc help desk, you’re going to have to get into the weeds of what goes on in the business.
In the end, the business won’t be getting into the weeds of how your data infrastructure works.
The Business Doesn’t Care About How You Solve the Problem
“Never talk about data technology, infrastructure, or queries with people outside the data team — they just don’t care.” – Ethan Aaron CEO of Portable
“Never” is perhaps too strong. As with many topics, there is nuance.
Jeff Nemecek, the Director of Engineering & Architecture at The Walt Disney Company, made a great comment here covering some of that nuance, stating that:
Instead of using a vendor reference (Snowflake, Kafka, AWS, S3, Airflow…) speak of functions (data warehouse, streaming events, process orchestration, data pipeline) in ways a business person can connect with. The purpose is to help them understand the complexity of your work with clarity, not confusion. Why is it going to take a month to get the new product line integrated into the daily reports? They often need to understand the functional steps, but not the details of the underlying technologies.
I’ll add a few follow-ups. First, make sure everyone is on the same page in terms of what the words you use mean. I have seen teams struggle to move forward with projects because no one clearly defined what they meant when they were referencing their Postgres instance.
Was it a data warehouse, an ODS, a database? Every term in the book got thrown around. Not everything fits into a perfect box, so sometimes you just need to define it and its general function, and move on.
Second, just to make a point clear, if you ever have to open up an IDE or start explaining why a query doesn’t work with the C-suite on the same phone call, you’ve likely f*cked up.
When you speak to the business, it’s not about you and your technical problems. The business wants to know what you’re doing to move the project forward and drive the outcomes they are looking for, the ones that help them look good in front of shareholders or their boss. Much of what you think is important isn’t important to them, unless you’re blowing up your cloud bill. Then they are suddenly going to ask you about specific technologies.
Overall, you need to communicate with the business about what the business cares about: the functional components and business outcomes. You still want the executives to be able to ask questions confidently without needing to understand all the technical details.
Bad Data Quality Will Cost You
Your data must be accurate. You only have so many at-bats, so focus on building reliable systems.
Being $1 off today means you could be $100,000 off tomorrow.
Those small details matter. You’d be surprised what a CFO will notice when they are looking at a report, only to see that units sold or total expenses for an account are slightly off, even by $5. The more you can ensure the data is accurate in the source as well as in whatever you decide to call your data analytics storage layer, the less likely you are to deal with these callouts.
Because once you lose trust, you’re going to have a very hard time gaining it back.
What’s worse, it further encourages shadow data teams and decentralized processes that other departments might take on to build the reports and numbers they want to see.
“A lack of data quality will cause other departments to silo their data when zero executives trust the enterprise data, and each come up with their own numbers.” – Eric Gonzalez VP, Business Intelligence Architecture at Eastern Bank
So keep that in mind when you think, “Hey, we are only $5 off!”
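One way to catch discrepancies like these before a CFO does is a simple reconciliation check between the source system and the analytics layer. The sketch below is a minimal illustration, assuming both sides can be reduced to per-account totals. The function name `reconcile_totals`, the account names, and the amounts are all hypothetical and exist only to show the idea.

```python
from decimal import Decimal


def reconcile_totals(source, warehouse, tolerance=Decimal("0")):
    """Return accounts whose totals differ between source and warehouse.

    Both arguments map an account name to a Decimal total. An account
    missing on one side is treated as having a total of zero there.
    """
    mismatches = {}
    for account in sorted(set(source) | set(warehouse)):
        src = source.get(account, Decimal("0"))
        wh = warehouse.get(account, Decimal("0"))
        diff = wh - src
        if abs(diff) > tolerance:
            mismatches[account] = diff
    return mismatches


# Hypothetical totals, for illustration only.
source_totals = {"marketing": Decimal("1200.00"), "travel": Decimal("860.50")}
warehouse_totals = {"marketing": Decimal("1200.00"), "travel": Decimal("855.50")}

for account, diff in reconcile_totals(source_totals, warehouse_totals).items():
    print(f"{account}: warehouse is off by {diff}")
```

Using `Decimal` rather than floats keeps the comparison exact for currency. A check like this could run after each load, so that a $5 gap is something the data team finds first.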
Take What Vendors Say With A Grain Of Salt
There are plenty of articles out there talking about the best way to set up your data infrastructure.
But it’s important to know the source of said article. For example, I am sure there were plenty of articles talking about how schema-on-read was going to be THE method to help reduce the time to insights because you no longer had to spend as much time developing a data warehouse. Instead you could use a data lake. Forget ever needing a data warehouse!
Now I still see plenty of data lakes, but often they act more as a place to do initial processing with cheaper compute that will then get loaded into the data warehouse. But man, I remember back in 2015-2016, it seemed like everyone was talking about schema-on-read like it was a best practice.
But really it was a combination of vendors pushing Hadoop and a new paradigm that we were only beginning to figure out how to utilize (but there was a Google paper on it, so we better make it work). To be clear, Hadoop continues to play a massive role in the data world today. But there has been a shift away from heavily relying on schema-on-read.
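For readers who have not run into the term, the difference between schema-on-write and schema-on-read comes down to when structure gets enforced. The sketch below is a simplified illustration in plain Python, not any vendor’s API. The record fields, the `SCHEMA` mapping, and the helpers `apply_schema`, `write_with_schema`, and `read_with_schema` are all hypothetical.

```python
import json

# Hypothetical schema: field name -> type to coerce to.
SCHEMA = {"order_id": int, "amount": float}


def apply_schema(record):
    """Coerce a raw record to SCHEMA; raises if a field is missing or invalid."""
    return {field: cast(record[field]) for field, cast in SCHEMA.items()}


def write_with_schema(table, raw_line):
    """Schema-on-write: validate at load time, so bad rows never land."""
    table.append(apply_schema(json.loads(raw_line)))


def read_with_schema(lake):
    """Schema-on-read: raw text is stored as-is, structure is applied at query time."""
    rows, rejected = [], []
    for raw_line in lake:
        try:
            rows.append(apply_schema(json.loads(raw_line)))
        except (KeyError, ValueError, TypeError):
            rejected.append(raw_line)
    return rows, rejected


raw_lines = ['{"order_id": "1", "amount": "19.99"}', '{"order_id": "2"}']

# Data lake style: loading is trivial, problems surface when someone queries.
lake = list(raw_lines)
rows, rejected = read_with_schema(lake)
print(len(rows), "usable rows,", len(rejected), "rejected at read time")

# Warehouse style: the malformed row is rejected at load time.
table = []
for line in raw_lines:
    try:
        write_with_schema(table, line)
    except KeyError:
        print("rejected at load time:", line)
```

In this toy version, the modeling work is the same either way. Schema-on-read only moves it from the load step to every query that touches the raw data.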
Technology and best practices take time to actually settle and find their place. So if you believe a vendor is trying to push a methodology or approach that is new and untested just to sell their product, they probably are.
Be Intentional With Data And Your Data Roles
At the end of the day, the goals of data engineers and data architects are just different from those of data scientists and analysts. In turn, we approach the development of data pipelines and data sets with different intent.
That’s why data engineers and data architects should be in charge of the core data layer of a company, the data that represents the core aspects of the business. Some people may call these the core entities and relationships. But they are the building blocks, from a data perspective, that all other analytics and machine learning will be built on (obviously many companies just have a data analyst, so this is likely less relevant to them).
Whether this is one team or a data engineer in each team, this layer of data should be treated like infrastructure with everything else being built on top of it.
Now I do want to be clear: in the ideal state I envision, once the core layer of data is developed, the layers that follow would be less restrictive. Perhaps an analytics engineer builds their own models, or, as Bill Shube, Sr. Manager, AMS Supply Chain Operations Technology and Analytics at the Lego Group, referenced, the analysts and business users might build out the “analytics final mile.” I have no qualms with that and honestly see it as an ideal outcome.
It allows data engineers to really focus on building high-quality and reliable datasets and the analysts to build more ad-hoc reports, dashboards and one-off use cases.
Navigating the Role of New Data Leaders
This article was inspired by a multitude of things. As stated earlier, the data leaders calling most data teams failures really struck a chord. I don’t believe that most data teams are failing. However, I do believe that we’ve started to shift away from the discipline required to build reliable and trustworthy data systems.
I also believe that new data leaders are often thrown into the role with little care for how they will adjust. So if you really enjoyed this topic, please let me know and I can start to dig even deeper!
With that, as always, thanks for reading!
And if you are a data leader who is looking for advice on how to better lead your team, feel free to set up a consultation.