Scaling a Data Analytics Team for a Billion-Dollar Start-Up With Veronica Zhai of Fivetran
Scaling data talent is hard.
It’s hard to hire.
Hard to grow and just overall hard in terms of managing processes and output.
But there are brilliant leaders and managers doing it every day.
I recently interviewed Veronica Zhai, the principal product manager at Fivetran. She has led Fivetran’s analytics team for nearly a year and has managed to grow it, establish a defined roadmap and deliver on goals.
In this community update, I wanted to condense the interview down to its key points, such as hiring data talent, dealing with data flow pain points and discussing the modern data stack.
Let’s dive in!
Introduction to Veronica and Her Background
Veronica graduated from Columbia University in 2015 before becoming a derivatives trader at J.P. Morgan. There, she spearheaded the data transformation through data analytics, AI and machine learning while scaling earnings by over 152% YoY.
After finding massive success in the financial markets, Veronica has now made her way to Fivetran, which uses data analytics to help businesses scale their operations.
Veronica is currently a principal product manager on the analytics team at Fivetran. She’s responsible for the overall roadmap and execution of their analytics platform, which handles millions of transactions per day. Veronica has helped increase staff from fewer than ten people to almost 20, and she knows that scaling will be one more thing under her belt as time goes on.
How do you get to manage a data product?
I started my career as an options trader. As a trader, I was very impressed with how quantitative the systems were. We were able to use a system that allowed us as traders to generate over $100 million annually.
However, I noticed that there was not a major emphasis on collecting and optimizing the market data. So, I took on a project that would allow us to trade better and monetize the insights from the data.
This was the birth of my passion for understanding analytics and using market data to increase business revenues. From there, I created the first analytics stack for J.P. Morgan, where we centralized all the data in one place so we could extract it more efficiently.
What were the major pain points you were running into while developing this data project?
What I have noticed is that different companies will have various pain points.
Let’s first look at larger enterprises. A company like J.P. Morgan has very mature and clean data. The pain for a large enterprise won’t be in the cleaning or organization of the data but in the adoption of a modern data stack.
Many of these older enterprises are using older infrastructure that may be limited when attempting to scale with vast quantities of data.
A second pain point that I saw at J.P. Morgan was the time that it took for data engineers to collect the necessary data for me to complete my project.
In this case, it took six months for the data engineers to complete that work.
Smaller enterprises tend to be more agile and adopt new technology more readily, but the issue is often in the operational systems.
When a company is smaller and rapidly scaling, system and operational issues are very common.
To summarize, one of the most common pain points for large enterprises is infrastructure, whereas smaller businesses are often more agile and more likely to face operational pain points.
What was the big “ah-ha” moment when you thought that Fivetran might have the right idea?
Being the individual responsible for delivering insights to the business, I was personally frustrated that I could not deliver things faster.
This is what motivated me to start researching the different technologies out there that can enable us to do data engineering and build data pipelines faster.
This was when I stumbled upon Fivetran.
Fivetran can move data from sources to destinations in a matter of days instead of weeks or months.
Moving data at rates that could save months of waiting was precisely the solution I was looking for. The more I learned about Fivetran, the more I fell in love with the company and what they stand for.
What does the modern data stack mean to you?
I think of the modern data stack as a suite of cloud-native data tools designed to automate work, be easy to use, and lower costs so that companies can manage data better.
What do you think the state of most companies’ data stacks is currently?
The exciting thing about the current data stack industry is the ability to now integrate various new technologies.
For example, cloud data warehouses like Snowflake and BigQuery, BI tools such as Tableau, and data transformation tools like dbt can all integrate and work together.
Companies now need to use software that works together in order to make efficient sense of the amount of data they have.
The CEO of Accenture mentioned that 90% of the data was produced in the last two years, and only 1% of this data has actually been utilized.
This shows the need for centralized systems and software that integrate well with one another.
Hiring data professionals is hard, whether it be data scientists, analysts or data engineers. What are you looking for in general in a data professional?
I think of these as soft skills and technical skills.
In each of these categories, there are two key pillars.
Technical Skills: The first pillar in technical skills is: can you code? Can you do SQL really well? Can you use visualization and BI tools?
The second pillar is business domain knowledge.
To be successful as a data analyst, you have to have enough business intuition to solve a problem by looking at data. Having marketing analytics experience or other business-related experience goes a long way. Problem-solving is essential if you want to be successful in this field.
Soft Skills: On the soft skills side, I really value “the spark.” How excited do you get when you can take data and find solutions for businesses?
This, to me, is what separates a skilled data analyst from an exceptional one.
The second aspect is your ability to communicate. Engaging with stakeholders and end-users and clearly sharing the findings of the data research is an essential element of working in the data industry.
Are There Data Specialist Archetypes?
I think the traditional way of approaching managing and scaling a team is thinking of different archetypes of a role.
However, I think about this more from a skills perspective.
What skills do we need within the team? And what skills can we gain by using different software available to us?
1. SQL skill
The number one skill that I am looking for in a data team is SQL.
It takes six months to a year to become very good at SQL, so this is something we highly value on our teams.
2. Analytics
The second most important skill is analytics.
Can a person critically think about a problem and solve the problem analytically?
3. Infrastructure
The third skill I would say is being able to build the actual infrastructure. This is more the data engineer kind of role.
However, using software, you can often replace the data engineer role.
4. PM skills
Because you are delivering products for either internal or external end-users, you need someone who can collect requirements, run prioritization, communicate with stakeholders and manage the project.
5. Prediction skills
Analysts focus more on understanding what has happened in the past, but you also need data scientists who can look at that same data and predict what will happen in the future to manage your business more practically.
What challenges have you faced scaling a data team?
The point of scaling a team is to develop a team that can keep up with the organization’s growth.
One of the most significant issues we found was that we started our organizational design with a distributed model.
What we realized was that a distributed model is not very efficient when you want to scale an organization quickly. The reason is that a non-centralized structure causes misalignment in strategy and execution.
The whole point of building a data stack is to centralize all the data into one place, so if you do not have a centralized team as well, it defeats the purpose of creating a centralized data stack.
What’s your organizational design for developing a data roadmap for your team?
As the team grows, if we were to have thousands of employees, I can see us switching to a hybrid centralized-decentralized model like Facebook’s.
At some point, when an organization has enough employees, a new model will have to be looked at. However, having a key part of the organization as a centralized structure and team will always be very important.
Finally! Why should CEOs invest in data?
Data is an integral part of today’s world. If companies are not investing in data, they are facing business extinction.
Competitors are using these data models, and they are optimizing the way their businesses operate.
If all other businesses are doing this and yours is not, I do not think your company can survive in this data-forward world.
Closing
Veronica and I continued to talk a bit after this question as people asked questions on the livestream.
Overall, I want to say thank you to Veronica for her time! I learned a lot from talking to her and I hope you will too.