Mapping Data Scientist Competencies

DTP #14

A competency map enables an organization to identify the specific skills, knowledge, abilities, and attributes required for a data scientist to operate effectively.

This is especially important because the skills expected of data science candidates vary widely from one employer to the next. In some cases, individuals lacking the necessary qualifications still manage to secure roles as data scientists.

Companies then risk wasting valuable time and missing opportunities, as inadequate implementations lead to disappointing outcomes.

Over the years, several models for mapping competencies have been put forth:

IBM’s Data Scientist competency model

In 2018, IBM put together the first Data Science Apprenticeship program in the United States, with the aim of developing a transparent blueprint to define the competencies needed for a data scientist. The blueprint is meant to be used:

  • As a recruitment guideline,

  • As a skill development chart,

  • As a way to set job expectations.

When the data scientist takes part in or oversees the AI enterprise workflow (as in the chart below), they are required to:

  • Grasp the business opportunity and its implications

  • Collaborate with data engineers and the IT department to identify suitable data sources

  • Process and structure the data while constructing machine learning and custom AI models

  • Contribute to the integration of these models into the organization's operational processes

  • Evaluate the success of the implemented models and effectively convey the outcomes to the business stakeholders.

Source: IBM Data Science Skills Competency Model

IBM’s model for Data Skill Competencies is organized into seven criteria:

Statistics and programming foundation
In this field, the necessary skills revolve around understanding important statistical concepts and methods to identify patterns in data and make predictions.

Job applicants should be proficient in Python or other statistical programming languages. They must also possess the ability to visualize data, draw meaningful conclusions, and effectively communicate those insights in a straightforward and understandable way.

Data science foundation
A data scientist should be capable of:

  • Identifying and describing a business problem

  • Creating a hypothesis to solve the problem

  • Applying various methodologies in the analytics cycle

  • Planning and organizing the execution of the solution

Data preparation
To ensure that data scientists can create useful datasets, the essential competencies include:

  • Identifying and gathering the necessary data

  • Manipulating, transforming, and cleaning the data

Model building
This involves training models on the data with various algorithms and ultimately selecting the best one. A data scientist should possess knowledge in the following areas:

  • Multiple modeling techniques

  • Techniques for model validation and selection
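To make "model validation and selection" concrete, here is a minimal sketch of k-fold cross-validation choosing between two candidate models. It is an illustration only and not part of IBM's model: the toy data, the two candidates (a mean baseline and a one-variable linear regression), and the helper names `fit_mean`, `fit_linear`, and `cross_val_mse` are all assumptions made for this example.

```python
from statistics import mean

def fit_mean(xs, ys):
    """Baseline model: always predict the mean of the training targets."""
    m = mean(ys)
    return lambda x: m

def fit_linear(xs, ys):
    """One-variable least-squares regression."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    return lambda x: intercept + slope * x

def cross_val_mse(fit, xs, ys, k=4):
    """Average held-out mean squared error over k folds."""
    n = len(xs)
    fold_errors = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))  # every k-th point is held out
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        model = fit(train_x, train_y)
        fold_errors.append(mean((ys[i] - model(xs[i])) ** 2 for i in test_idx))
    return mean(fold_errors)

# Toy data (hypothetical), roughly following y = 2x.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 3.9, 6.2, 8.1, 9.8, 12.1, 14.2, 15.9]

candidates = {"mean_baseline": fit_mean, "linear_regression": fit_linear}
scores = {name: cross_val_mse(fit, xs, ys) for name, fit in candidates.items()}
best = min(scores, key=scores.get)  # lowest held-out error wins
print(best, scores)
```

The point of the sketch is that each candidate is scored on data it was not trained on, so the comparison rewards generalization rather than memorization. In practice a team would more likely use a library's cross-validation utilities than hand-rolled helpers like these.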

Model deployment
An ML model becomes valuable when it's smoothly integrated into the existing system and used for making business decisions. Being able to deploy a validated model and monitor its accuracy is an essential skill.

Big data foundation
A data scientist needs to show their comprehension of how big data is utilized, including the big data ecosystem and its main components. Moreover, they should exhibit proficiency in working with big data platforms like Hadoop and Spark.

Leadership and professional development
A data scientist should grasp the business opportunity before implementing solutions, work rigorously and thoroughly, and effectively communicate their findings. They must also understand how to analyze business risk, make process improvements, and be familiar with systems engineering concepts.

While IBM’s model is certainly holistic in the sense that it covers the skills involved in a data scientist's day-to-day work, there is also the realm of interpersonal and workplace competencies, which is better defined in other models. For example:

BHEF model for new graduates

The Business-Higher Education Forum (BHEF) defines Data Scientist competencies across four tiers (especially for newly minted data professionals):

Source: BHEF DSA Competency Map

Tier 1 – Personal Effectiveness Competencies - These represent personal attributes that can be challenging to teach or assess.

These include:

  • Integrity,

  • Initiative,

  • Dependability and Reliability,

  • Adaptability,

  • Professionalism,

  • Teamwork,

  • Interpersonal Communication.

Tier 2 – Academic Competencies - These include cognitive functions and thinking styles that are likely to apply to most industries and occupations, and that are typically taught by higher education institutions. Namely:

  • Deriving value from data,

  • Data literacy,

  • Data Governance and ethics,

  • Technology expertise,

  • Programming and data management,

  • Analytic planning.

Tier 3 – Workplace Competencies - These represent attributes, skills, and abilities, as well as interpersonal and self-management styles:

  • Planning and organizing,

  • Problem solving,

  • Decision making,

  • Business fundamentals,

  • Customer focus.

Tier 4 – Industry-Wide Technical Competencies - The knowledge and skills that are common across sectors within a broader industry.

These are more specific than the competencies in the previous tiers.

While these models presented by businesses and organizations can serve as a stepping stone, a business looking to build its own competency map for its data scientists must assess its current situation and needs.
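As a rough illustration of what such an assessment might look like, the sketch below stores a competency map as a simple dictionary keyed by IBM's seven areas and reports where a person falls short of the target. The 0-4 proficiency scale, the target levels, the sample assessment, and the `competency_gaps` helper are all hypothetical and chosen only for this example; neither IBM nor BHEF prescribes them.

```python
# Target proficiency per competency area, on a made-up 0-4 scale.
REQUIRED_LEVELS = {
    "Statistics and programming foundation": 3,
    "Data science foundation": 3,
    "Data preparation": 3,
    "Model building": 3,
    "Model deployment": 2,
    "Big data foundation": 2,
    "Leadership and professional development": 2,
}

def competency_gaps(required, assessed):
    """Return {area: shortfall} for every area below its target level.

    Areas missing from the assessment are treated as level 0.
    """
    gaps = {}
    for area, target in required.items():
        shortfall = target - assessed.get(area, 0)
        if shortfall > 0:
            gaps[area] = shortfall
    return gaps

# Hypothetical assessment of one candidate (no big data experience recorded).
candidate = {
    "Statistics and programming foundation": 4,
    "Data science foundation": 3,
    "Data preparation": 2,
    "Model building": 3,
    "Model deployment": 0,
    "Leadership and professional development": 2,
}

gaps = competency_gaps(REQUIRED_LEVELS, candidate)
# List the largest gaps first.
for area, shortfall in sorted(gaps.items(), key=lambda kv: -kv[1]):
    print(f"{area}: {shortfall} level(s) below target")
```

The same structure could be reused for recruitment, skill development plans, or job expectations by swapping in a different set of target levels for each role.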

See you next time,
Mukundan

Do you have a unique perspective on developing and managing data science and AI talent? We want to hear from you! Reach out to us by replying to this email.
