👩‍💻 Communicating the Value of Data Governance

DTP #24: Q&A with Susan Walsh


We spoke to Susan Walsh, Founder of The Classification Guru, about the importance of data governance, how and when businesses must take data management into consideration, and how to make the case to stakeholders.

Here’s a quick summary of the Q&A below:

  • Issues faced by data scientists and analysts when it comes to messy data.

  • When and why businesses typically start thinking about data governance.

  • Making the case for data governance to stakeholders.

  • The issues businesses face when it comes to cleaning data.

  • The first step to implementing data governance processes.

  • Where AI can and cannot help with data governance.

Quotations lightly edited for concision and readability.

🌐 From the Web

Data governance challenges include integrating data from multiple sources, ensuring data accuracy, addressing biases, maintaining data quality, and complying with data privacy laws. Data governance tools like Collibra, Informatica, Alation, Erwin, OneTrust, and SAP Master Data Governance help manage these challenges and ensure data consistency and trust. 

Analytics and IT leaders face pressure to establish a unified data strategy and better data management to support generative AI efforts, according to research by Salesforce. Line-of-business leaders are eager to adopt generative AI, with 77% of business leaders fearing they are missing out on its benefits.

I came across a social post recently where a data scientist complained that they weren't allowed to go back and clean messy data, leading to inaccurate analysis, which they then had to take the blame for. 

Yeah, sometimes people are told to leave it and not clean it. It's crazy, but it happens a lot, because once you start to clean it, you start to find more and more things that you don't want to know.

I guess it would be frustrating if you're someone who has to work with that data. I think communication, making sure that business leaders get the point, becomes all the more important, right?

But it's communication, too, because a lot of data scientists, I think, struggle in silence. They just get on with it and do it. And they don't tell or show evidence of all the work they've had to do to get the finished product. 

At what point, in your experience, do businesses start thinking about cleaning their data and maintaining it? Or do they need to be reminded by something going wrong before they actually do something about it?

Oh yeah, I can tell you exactly when they realise they need our services. And that is when it has gone really wrong and they need to fix it quickly. Outside of that, it is very hard to get budget or funding to clean data without a reason, because a lot of the time people or the decision makers just don't see the value in it. They want something tangible. But of course you don't see the tangible benefits of that until after you've cleaned the data. So, it's a very hard sell.

Have you seen people convince decision makers to buy in? How should one communicate to them that this needs to be prevented?

I talk at events, I do webinars, I do podcasts. And I don't just stick to procurement, [which] is my kind of base where most of my clients come from. I talk on finance podcasts, I talk on data podcasts, because this is an everything issue. This is not specifically for one area, and the more I can get myself out there and spread that message, then hopefully it'll start to sink in. I mean, I meet so many people who would love to get me and my team in and help them out, but they know they don't have any money to do that.

So, it is frustrating, but it's like, [when it goes wrong and] they give me a call, then we can help. At that point, it's always more expensive because it's a bigger mess to solve.

So money [constraints] are definitely a part of it. What are your thoughts on the time needed to be spent upfront?  

I don't think senior management realize how much time is being spent cleaning data or fixing data because it's not right or trying to join things together. It's an issue and people aren't measuring the impact of that on the business because it's not just delays on projects, it's costs. 

It's time. It's other things being sidelined and neglected. There's such a wider knock-on effect to that. Or you know, people having to work over weekends to get their jobs done. 

How would you guide a company or an organization that wants to set up processes (for data governance) when they come around to it? What are the first steps that they should take? 

Don't just look at the immediate problem or issue that you've got. You need to think bigger and longer term. So if you're cleaning your data now, what might you need in the future that you're not using right now?

And that could be people's names, it could be contact information, it could be addresses, it could be part numbers, it could be weather information or geolocation data. You might think it's not important now, but it could be in a year's time. So have a think about what's coming up, what you need for the longer term, and work on that instead.

Do you find that different industries have different needs when it comes to data governance or can they use a similar basic framework and just go from there? 

I think if you're really top level, no. Ultimately it comes down to the people who are inputting the data and their motivation to keep it clean and organized. We work across all different industries and ultimately the issues are always the same. The data might look different, but how we fix it is the same. 

Do you have a plan in place before you get into conversations with prospective clients to understand what they need from you? 

The first conversation we have is to kind of start with the end in mind. [Such as,] what is your long term? Think about your long-term goals. 

It's kind of like [being] a therapist too. I'm like, "Tell me your problems," and then I listen. [Then we] try and figure out how we can help them and what they really need. Because sometimes what they need is not what they think they need.

Or we might just know. Because we work with all kinds of different data and different problems, we have such a broad view of different things to fix. You know, we might just come in with a different idea.

Have you had any experience using AI with your work? 

We tested it out a while ago now and it just wasn't for the type of work that we do. It's not there. There are a lot of areas and industries and instances where AI will 100% improve processes and make things easier. But we're in a very niche area that is very subjective and contextual and it's really hard to get AI to be able to understand that. 

I think if we worked in one industry or for one company, then it would be easier to build something or work with something. But because every week we're working with different clients, different countries, different industries, it's too broad and vague for us to be able to do something. 

If AI was able to work for you, what part of the job would you want help with the most? Which part might be the most time consuming (for example)? 

I think when it comes to the work we do, it's with classification. We have to search for a lot of things, like medical products and chemicals and reagents, and getting help with that would be good. And then on the normalization side, some of the translation could be done, which would be helpful, and there might be opportunities to try that out.

You know, we're not saying no. The problem really is not the data itself but the mess it's in, and it's a bit too complicated for tech to understand right now. I like to think of the TCG team as the foundation: we get the data ready to be used in AI, because it's got to learn from clean data and someone's got to get that data clean.

💻 Platform Highlight 

OvalEdge: Data governance platform and data catalog that allows SMEs to configure the solution to meet their business requirements.

Collibra: Software providing automated data management and governance with cross-team collaboration features.

Ataccama: Platform-as-a-service (PaaS) tool that provides AI-powered services to simplify data management tasks.

💼 AI in Business

AI Enhancing Customer Experience

An article from CMSWire explores how AI is being harnessed by leading companies to enhance customer experience in various sectors such as retail, healthcare, and travel. AI integration optimizes resources and improves results, leading to personalized marketing and enhanced support. 

  1. Ulta Beauty (Automated Personalization):

  • Consolidated customer data to enable personalized marketing.

  • Utilized SAS Customer Intelligence 360 for marketing automation and granular customer recommendations.

  • Achieved a 95% sales contribution from returning customers.

  2. Liberty London (Customer Service):

  • Used AI to classify and route customer support tickets efficiently.

  • Reduced ticket resolution time by 11% and first reply time by 73%.

  • Increased customer satisfaction by 9%.

  3. TGH Urgent Care (Omnichannel Customer Service):

  • Transitioned to an omnichannel experience using LivePerson's AI-based engagement options.

  • Deflected calls to SMS conversations and used an FAQ AI chatbot.

  • Achieved a 40% reduction in incoming calls and increased the call answer rate to 80%.

  4. CAA Club Group (Predictive Analytics):

  • Employed Pecan AI for automated forecasts on member assistance and staffing.

  • Developed dozens of models predicting member call volume by interval, region, and service type.

  • Reduced the time required to generate service forecasts by 30%.

  5. IndiGo (High-Volume Customer Support):

  • Implemented 24/7 automated customer support using Yellow.ai's AI chatbot.

  • Automated over 35 use cases and 300 customer journeys.

  • Achieved an average customer satisfaction score of 87% and 400,000 opt-ins for WhatsApp campaigns.

💬 Social Highlight

Data scientists on Reddit offer some suggestions for HR and hiring managers.

Thread: “Open-source models are not antithetical to guardrails and caution.”

🤖 Prompt of the week

[Describe your data team] 

[Describe your data maturity] 

[Add your timeline constraints] 

Can you suggest the best rollout plan for a data catalog project?

See you next week,

