KPIs for data teams: What works? What doesn't?

KPIs for data science projects: what makes sense to measure?

Standard project management metrics fall short

These metrics measure the time, budget, and scope of a project and compare them to the baseline set at the outset. Because the size and deliverables of a data science project are hard to estimate accurately, such metrics have limited use.

1. Agile Metrics

Agile metrics help measure the velocity of a data science team, that is, the speed at which the team can produce value. Examples:

  • Rate of meaningful insights

  • Time from project request to kick-off

  • Time from kick-off to delivery of an MVP
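As a rough sketch of how the two timing metrics above could be tracked, the snippet below computes them from a project log. The dates, the `projects` list, and the `lead_times` helper are all made up for illustration, and using the median as the summary is just one reasonable choice.

```python
from datetime import date
from statistics import median

def lead_times(requested, kicked_off, mvp_delivered):
    """Return (days from request to kick-off, days from kick-off to MVP)."""
    return (kicked_off - requested).days, (mvp_delivered - kicked_off).days

# Hypothetical project log: (requested, kicked off, MVP delivered).
projects = [
    (date(2023, 1, 2), date(2023, 1, 16), date(2023, 3, 1)),
    (date(2023, 2, 6), date(2023, 2, 13), date(2023, 4, 10)),
    (date(2023, 3, 1), date(2023, 3, 22), date(2023, 5, 3)),
]

times = [lead_times(*p) for p in projects]

# The median is less sensitive than the mean to one unusually slow project.
median_request_to_kickoff = median(t[0] for t in times)
median_kickoff_to_mvp = median(t[1] for t in times)

print(f"Median request -> kick-off: {median_request_to_kickoff} days")
print(f"Median kick-off -> MVP:     {median_kickoff_to_mvp} days")
```

Tracking these numbers per quarter shows whether the team is getting faster at starting and shipping work.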

2. Lean Metrics

Lean metrics answer the question of how much of the team's time leads to actual added value: the percentage of time spent on project tasks versus the percentage spent on documentation, meetings, and administrative work.
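A minimal sketch of that ratio, assuming the team logs hours by category. The category names, the hours, and the `value_added_share` helper are hypothetical, and which categories count as value-adding is a choice each team has to make for itself.

```python
def value_added_share(hours_by_category, value_adding=("project tasks",)):
    """Fraction of logged hours spent in categories treated as value-adding."""
    total = sum(hours_by_category.values())
    if total == 0:
        raise ValueError("no hours logged")
    added = sum(h for c, h in hours_by_category.items() if c in value_adding)
    return added / total

# Hypothetical hours logged by one team member in one week.
week = {
    "project tasks": 24,
    "documentation": 4,
    "meetings": 8,
    "administrative work": 4,
}

share = value_added_share(week)
print(f"Value-adding time: {share:.0%}")
```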

3. Artifact Creation

Over the course of a project, value is derived not just from fulfilling organizational goals, but also from creating components that can be reused to save time and effort in the future. Reusable artifacts include:

  • Data scraping or collection tools

  • Frameworks

  • ML models

6. Competencies Gained

This measures the progress of individuals on the team by tracking the valuable skills that data scientists learn while executing a project.

7. Data Science Model Metrics

These metrics measure the performance of the models created over the course of a project.

  • Is model performance significantly better than baseline performance?

  • Is there continuing improvement over successive models?
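The two questions above can be sketched in a few lines. This example assumes a classification task scored by accuracy, with a majority-class predictor as the baseline. The labels, the version scores, and the `min_lift` threshold are invented for illustration, and a fixed threshold is a simplification rather than a statistical significance test.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Share of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def majority_baseline_accuracy(y_true):
    """Accuracy of always predicting the most common label."""
    return Counter(y_true).most_common(1)[0][1] / len(y_true)

def improves_each_version(scores):
    """True if every model version scores strictly higher than the previous one."""
    return all(b > a for a, b in zip(scores, scores[1:]))

# Hypothetical held-out labels and model predictions.
y_true = [1, 0, 1, 1, 0, 1, 0, 1, 1, 1]
y_pred = [1, 0, 1, 1, 0, 1, 1, 1, 1, 1]

model_acc = accuracy(y_true, y_pred)
baseline_acc = majority_baseline_accuracy(y_true)

min_lift = 0.05  # assumed threshold, not a significance test
beats_baseline = (model_acc - baseline_acc) >= min_lift

# Hypothetical accuracy of successive model versions on the same test set.
version_scores = [0.74, 0.81, 0.90]
still_improving = improves_each_version(version_scores)

print(f"model={model_acc:.2f} baseline={baseline_acc:.2f} beats={beats_baseline}")
print(f"improving across versions: {still_improving}")
```

With a larger test set, a bootstrap confidence interval on the difference would be a more defensible way to judge "significantly better".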

8. Financial Metrics and Impact to Organizational Goals

These are often the broadest goals for a data science team.

Financial metrics might include:

  • Incremental revenue earned

  • Incremental profits made

  • Incremental costs reduced

  • Net Present Value (NPV)

  • Return On Investment (ROI)
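For the last two items, here is a small sketch using the standard textbook definitions: NPV discounts each period's cash flow back to today, and ROI is net gain divided by cost. The function names, cash flows, and 10% discount rate are hypothetical examples.

```python
def npv(rate, cash_flows):
    """Net present value; cash_flows[0] occurs today, cash_flows[t] after t periods."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))

def roi(gain, cost):
    """Return on investment as a fraction of cost."""
    if cost == 0:
        raise ValueError("cost must be non-zero")
    return (gain - cost) / cost

# Hypothetical project: 1,000 spent up front, 500 of incremental profit per year for 3 years.
project_npv = npv(0.10, [-1000, 500, 500, 500])
project_roi = roi(gain=1500, cost=1000)

print(f"NPV at 10%: {project_npv:.2f}")
print(f"ROI: {project_roi:.0%}")
```

A positive NPV means the project is worth more than it costs at the chosen discount rate. ROI ignores timing, so the two are best read together.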

Organizational goals might be financial in nature, but they may also measure impact in other ways. For an educational nonprofit, the goal might be an increase in the literacy rate or in the number of children attending classes.

Tool Highlight

(Not sponsored 😃 )

  • Mintlify - A neat tool that converts your code to documentation.

  • DeepCode - An AI code quality tool that examines your code for vulnerabilities and bugs.

🎙️ A chat with Alan Roan of Cervus.ai

The Managing Director of Cervus.ai, Alan Roan, answered some questions on their recruitment process and the experience they’ve had with graduates as a startup. 

Do educational institutions (e.g. colleges, universities) provide enough specialized training? Have you had an issue with skill set disconnect when hiring graduates?

Some institutions are better than others. We have developed partnerships with key talent-yielding institutions in an attempt to better understand their issues, but also to help them understand ours.

What is your main source of funding? Where do you think the majority of Data Science / AI funding will come from in the next 10+ years? 

The real question should be: where is the best source of money to create innovation in Data Science? Government and major companies are shifting their large budgets in response to understanding the benefits, but these budgets tend to constrain creativity. Venture funding will be a key source of funding to generate new Data Science/AI approaches. However, this may be affected by blowback from experiences such as ‘Metaverse’ dents in the market.

Can students undertake the workload required to keep up with industry standards? 

I think today’s young people will change future industry standards. They think in a way that is essential to a digital economy, and industry should review the way it deals with workloads and standards.

Watch out for part 2 of this interview in the next issue.

💡Tip of the week: Data Governance

Over-communication is a good thing when it comes to making sure your team follows data governance best practices.

Start by communicating the benefit: it’s about making sure the data you collect is ultimately useful for the team as a whole. Then explain how to get there (policies, standards, defining data models).

📊 Future Focused

From McKinsey’s 'State of AI in 2022' report:

Many companies report continued challenges in hiring, risk management, and developing AI / ML skills in their tech and non-tech staff.

“Despite knowing for close to a decade about the growing need for roles like data scientists and data engineers, we still haven’t moved the needle enough on the supply side. Hiring from boot camps is picking up because experienced talent is just not available”

Helen Mayhew, Sydney-based Partner at McKinsey

Key takeaways:

Nearly half of all respondents said they are making efforts to reskill employees as a route to getting more AI talent. AI leaders have instituted specific capacity-building programs to develop AI skills in their workforce, and are twice as likely to offer peer-to-peer learning and certification programs.

With the role of machine learning increasing in data science, we are seeing the emergence of advanced optimization practices like MLOps, which is used to train, deploy, and test hundreds or thousands of ML models on an industrial scale. Organizations are twice as likely to have hired a machine learning engineer or an AI product manager in the past year, according to the report.
