Data Science: Past, Present and Future

DTP #13

“We have gradually added layers of abstraction and grapple now with more social concerns than technical concerns”

An editorial piece from the Data Science Journal presents a compilation of research papers from data science scholars on the past, present and future of the field.

The initial observation of the piece highlights that the term "data science" is now applied to various scientific and technical endeavors due to the increasing prevalence of data and computing power.

However, the conclusion emphasizes that the term will continue to be debated and to lack a precise definition, particularly in light of the expanding influence of generative AI and other data-intensive undertakings.

Key insights from the papers listed in the collection:

Can data science stakeholders use the lack of disciplinary clarity as a strength?

Conceptual advances are being made by non-data scientists in the understanding of data, leading to a rediscovery of ideas and periodic circularity of topics. This circularity stems from the ongoing nature of data challenges and the lack of knowledge about historical predecessors. Embracing porous boundaries and being open to new ideas and diverse participants could refresh and diversify the field of data science.

The Academic Data Science Alliance and the CARE principles are positive examples of inclusivity efforts in data science.

The debates on disciplinary boundaries have been intellectually generative within information science. Understanding data requires the use of multiple research methods, emphasizing the need for methodological versatility.

Can data science feed into an “empowering profession”?

Data literacy becomes a crucial aspect of this empowerment, enabling clients to understand the assumptions, limitations, and ethical considerations of data science projects.

Transparency and interpretability of data science outcomes are also important, especially for decision-making processes.

The move toward empowerment involves considering vulnerability, trust, autonomy, and agency in relation to data and information, supporting people in their cultural contexts and personal interests.

Data scientists are inherently political and ethical actors, and it remains an open question to what extent the data science profession embraces political, ethical, and empowerment-focused research agendas and norms. Empowering clients to understand their own data fosters trustworthiness and social responsibility, which can enhance the reputation and future development of the data science field.

The consequences of myopia (or surface-level analysis) in data science

Take COVID data analysis as an example. The pandemic posed significant challenges to data science. While the sharing of pandemic data and researchers' efforts to help were commendable, some data scientists who lacked expertise in disease spread made simplistic extrapolations that could induce panic.

Data science has grown immensely in the past 20 years, but its academic categorization remains unclear. The author of the paper linked above observes that data science has evolved and now defines it as the science of data and research on methods and tools for learning from data. The ethical implications of data science and its impact on marginalised communities are crucial considerations.

Looking ahead, the author hopes AI will serve as a support tool for experts rather than replacing human decision-making. They also hope for data scientists to prioritize the social and environmental impact of their work and take a "first do no harm" approach.

The future of data science is unpredictable, but sharing research and experiences with good intentions can contribute to making the world a better place.

Preparing to embrace the future of Data Science

The future of data science presents various challenges and opportunities. Datafication, the process of turning activities into quantified data, is prevalent in the big data world but runs into societal conflicts.

Embracing open science principles like FAIR, CARE, and TRUST can harness the power of datafication responsibly.

In addition, data education is essential for data literacy empowerment, covering programming, algorithms, data culture, ethics, and critical thinking to prepare individuals for the changing world.

Finally, future data science calls for a full data picture at the macro level, establishing a data ecosystem based on mutual trust.

Open science principles and collaboration between communities will help sustain and enhance the data ecosystem, promoting a data-sharing culture and supporting research across domains and regions. Overall, embracing these essentials will prepare us for the future of data science.

See you next time,

Do you have a unique perspective on developing and managing data science and AI talent? We want to hear from you! Reach out to us by replying to this email.
