Is Your Company’s Data Ready for Generative AI?
Originally published on HBR.org / March 26, 2024 / Reprint H082DP
By Thomas H. Davenport and Priyanka Tiwari
What/focus
A 2023 survey of data leaders across industries, supported by individual interviews, suggests that most companies have a long way to go before their data is ready for generative AI. Organisations are mobilising to take advantage of generative AI in other ways, such as running educational workshops, identifying potential use cases to develop and experimenting with the technology at the department level, but data preparedness lags behind. This crucial work falls to chief data officers (CDOs), data engineers, and knowledge curators. With a few exceptions, however, companies have yet to create new data strategies, and people in these roles have not started to manage data in the ways needed to make generative AI work for their companies.
How (details/methods)
The survey findings and interviews highlight the next steps for data leaders.
Historically, AI has worked with structured data, typically numbers. In contrast, generative AI uses unstructured data: text, images, even video. This data has to be made ready for generative AI in terms of accuracy, recency, and uniqueness, among other attributes. Poor-quality data will lead to poor-quality responses from generative AI models. However, although many respondents identified data quality as the biggest challenge, the majority of data leaders have not begun to make the needed changes to their data strategies.
The authors acknowledge that other important data projects compete for data officers’ time and resources, including making data available for traditional analytics and machine learning. The previous survey in the series (2022) found that CDOs were expected to deliver value quickly. They are now also under pressure to facilitate the implementation of generative AI. It is therefore surprising that 71% of the CDOs in the 2023 survey agreed that while “generative AI is interesting […] we are more focused on other data initiatives to deliver more tangible value”.
So what?
CDOs and other data leaders are enthusiastic about the potential of generative AI, and it has brought a lot of attention to their roles. However, the survey results suggest they have been slow to pivot from managing and improving structured data to managing unstructured content. The authors argue that it makes no sense to delay data preparation, since that work could take years.
However, because curating, cleaning, and integrating all of a company’s unstructured data requires a huge effort, companies should focus on the data domains where they expect to implement generative AI in the near future. Customer operations, software engineering/code generation, marketing and sales activities, and R&D/product design and development are areas where some companies are doing data preparation work.
Another possible explanation for the apparent lack of action is that generative AI is a contested space within companies, with CDOs competing with CIOs and CTOs for leadership of the hot new technology.