It’s never been harder to be a Chief Data Officer (CDO). On the one hand, demand for CDOs is higher than ever, with more than two-thirds of enterprises appointing a CDO, up from fewer than one in eight in 2012. On the other hand, job security is lacking: the average CDO lasts just 2.5 years. That is shorter than the tenure of any other C-level executive, and half the overall C-suite average of five years.
Part of this is due to the rapid expansion of CDO duties over the past decade. First-generation CDOs focused on data management, in particular setting up data marts and warehouses. They also layered data governance onto those warehouses, creating and enforcing processes to make sure data was used efficiently and safely, and protected from cybertheft, privacy risk, and degradation. In other words, they primarily played the role of bad cop, policing how employees used data.
A purely defensive role was demoralizing for both workers and the CDO, though. And CDOs, who know data better than anyone, understood the power of applying analytics and creating data pipelines and data applications. So second-wave CDOs became internal innovation leaders, championing the transformation of how their companies view and use data: from a passive archive, like books in a dusty library, to the lifeblood of a digital enterprise.
Today we are seeing the emergence of the third wave of CDOs.
Convinced of the value of analytics, businesses are looking to incorporate more data sources such as real-time event streams. They want to build operational dashboards and make data available regularly to their businesses. They’re seeking machine learning-generated predictive analytics for better decision making. And they’re clamoring for AI-based workflows to automate processes for better efficiency, agility and cost savings. Wave three CDOs are not just being asked to ideate, plan and champion. They are being asked to execute these transformations. And to do so, they are being handed dedicated data operations teams, oversight over data technologies and domains, and responsibility for overall data reliability and data delivery.
If CDOs are so important, then why are they getting fired so quickly, so often?
One reason is that CDOs who are naive about business risk try to modernize their data infrastructure while cutting the wrong costs. Today’s third-wave CDO has partial or total responsibility for a diverse, multi-cloud setup that includes ERP systems, Salesforce instances, traditional databases, data lakes and other big data deployments, and cloud-native data warehouses.
To deliver more value from their data pipelines and data repositories, they are constantly tinkering and upgrading their infrastructure. Bringing data into the cloud is the most common upgrade. Due to the ease of switching and scaling in the cloud, such migrations look deceptively easy.
But let’s not trivialize the complexity and work involved in making these migrations successful. Moving data from an on-premises data warehouse to a cloud instance, whether it is a simple lift-and-shift or a total refactoring, requires close monitoring and certification that the data was migrated with all of its datasets, schemas, and dependencies intact. Validating and reconciling data pre- and post-migration is labor-intensive work that a time-pressured CDO and their team may feel they don’t have the bandwidth for.
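To make the reconciliation step concrete, here is a minimal Python sketch of one common technique: comparing row counts and order-insensitive content fingerprints between the source and target copies of a migrated table. The row format and function names are illustrative assumptions, not a prescription for any particular warehouse or tool.

```python
import hashlib

def table_fingerprint(rows):
    """Return (row count, order-insensitive fingerprint) for a table.

    Each row is hashed individually and the digests are XORed together,
    so the result does not depend on the order rows were fetched in.
    """
    combined = 0
    for row in rows:
        digest = hashlib.sha256(repr(row).encode()).hexdigest()
        combined ^= int(digest, 16)
    return len(rows), combined

def reconcile(source_rows, target_rows):
    """Compare one table pre- and post-migration; return a list of issues."""
    src_count, src_fp = table_fingerprint(source_rows)
    tgt_count, tgt_fp = table_fingerprint(target_rows)
    issues = []
    if src_count != tgt_count:
        issues.append(f"row count mismatch: {src_count} vs {tgt_count}")
    elif src_fp != tgt_fp:
        issues.append("row contents differ despite matching counts")
    return issues
```

In practice the fingerprinting would be pushed down into each database as a SQL aggregate rather than pulling rows client-side, but the shape of the check, counts first, then content, is the same.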
We see too many CDOs settling for having their operational teams quickly eyeball the migrated data for errors or compromised reliability. That is a big risk for any company, and a massive one for companies whose data supports sales, business operations, or anything else mission critical. In such scenarios, data errors and broken data pipelines inevitably emerge. The worst part is that without strong oversight during the actual migration, these problems will continue to crop up for a long time, and at the worst times. Failed data migrations are a huge reason why CDOs lose their jobs.
Reason number two: the inability to support the business’s ravenous appetite for new data workflows. Even if they are not actively migrating data from clusters to cloud, most CDOs are still constantly adding new data sources. There are real-time customer clickstreams, Change Data Capture (CDC) synchronizations from internal repositories and third-party data marts, IoT sensor data that is ingested first by your ERP systems before being shared for wider analytics, and more.
This data does not conform to a single structure. Moreover, the modern approach to data is no longer schema on write but largely schema on read, a more flexible strategy that makes it easier to store a diversity of unstructured and semi-structured data types in large data lakes. But when it comes time for machine learning and analytics, especially the subtle, hard-to-detect anomalies and the bold, sweeping trends that data scientists live for, petabytes of data coming from different sources must be harmonized before they can be processed and queried.
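As a rough illustration of what schema-on-read harmonization means, the Python sketch below projects raw records from differently shaped sources onto one shared schema at read time. The target fields and per-source mappings here are hypothetical, invented for the example rather than taken from any real system.

```python
# Hypothetical shared schema that analytics queries expect.
TARGET_FIELDS = ("event_id", "user_id", "timestamp")

# Hypothetical per-source field mappings: each source names things differently.
SOURCE_MAPPINGS = {
    "clickstream": {"event_id": "id", "user_id": "uid", "timestamp": "ts"},
    "cdc":         {"event_id": "pk", "user_id": "user", "timestamp": "changed_at"},
}

def harmonize(record: dict, source: str) -> dict:
    """Project one raw record onto the shared schema.

    Applied lazily at read time, so the lake can keep storing each
    source's native shape; missing fields come back as None.
    """
    mapping = SOURCE_MAPPINGS[source]
    return {field: record.get(mapping[field]) for field in TARGET_FIELDS}
```

Real pipelines do this with dataframe or SQL view layers over the lake rather than per-record Python, but the idea is identical: the structure is imposed when the data is read, not when it is written.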
All of this is heavy, complicated work that data scientists are not equipped to handle on their own. And it means plenty of drudgery for CDOs and well-trained data engineers if they lack the right tools. But because of the high priority given to many ML/AI projects today, CDOs who cannot quickly build these data pipelines are at risk of looking like blockers to the business.
In addition to their duties on offense, the third-wave CDO is still responsible for playing defense with the data: data governance, security, and access control. The big shift is that all these areas are now operational in nature and have to be conducted in real time. Data governance is no longer a one-time annual audit; it must be performed continuously and flawlessly.
The 24/7, mission-critical nature of data mandates that today’s CDO has access to a best-of-breed, purpose-built data observability solution which allows them to monitor their data landscape. This observability solution should provide a common framework to all stakeholders in an organization — not just the CDO and their direct reports — so everyone can understand and interpret the state of the supply chain of data, and get a better handle on the state of the operations. Operational intelligence, in other words.
The data observability platform must also be multi-layered, able to examine every layer of your data infrastructure from multiple vantage points. It must track data end-to-end as it moves through data pipelines and data platforms, monitoring its velocity, consumption, and reliability as it is consumed by various business and data applications.
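To ground what such monitoring looks like at its simplest, here is a Python sketch of three checks that observability tooling commonly runs on each batch a pipeline delivers: volume, freshness, and completeness. The thresholds, field names, and record shape are assumptions made up for the example, not the behavior of any particular platform.

```python
import time

def check_batch(rows, expected_min_rows, max_null_rate, max_age_seconds, now=None):
    """Run basic observability checks on one pipeline batch.

    rows: list of dicts, each with an 'ingested_at' epoch timestamp and
    (hypothetically) a 'user_id' key field. Returns a list of alert
    strings; an empty list means the batch looks healthy.
    """
    now = now if now is not None else time.time()
    alerts = []
    # Volume: did we receive at least as much data as expected?
    if len(rows) < expected_min_rows:
        alerts.append(f"volume: got {len(rows)} rows, expected >= {expected_min_rows}")
    if rows:
        # Freshness: is the newest record recent enough?
        newest = max(r["ingested_at"] for r in rows)
        if now - newest > max_age_seconds:
            alerts.append(f"freshness: newest record is {now - newest:.0f}s old")
        # Completeness: is the key field populated often enough?
        nulls = sum(1 for r in rows if r.get("user_id") is None)
        if nulls / len(rows) > max_null_rate:
            alerts.append(f"completeness: null rate {nulls / len(rows):.0%} over threshold")
    return alerts
```

A real platform layers learned baselines, lineage, and alert routing on top, but each individual signal reduces to simple assertions like these, run continuously against every hop in the pipeline.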
If you are a third-wave CDO, you must think of your internal data landscape in these terms: data pipelines/platforms and data applications. Integrating these layers into one operational view is key for data engineers and other CDO teams to succeed and deliver outsized returns. Observability is the glue that provides this neutral, single pane of glass into all your data, all of the time, regardless of how far-flung your data repositories and pipelines are.
The first two waves of CDOs didn’t have access to data observability. That’s why their data journeys have become so complicated and, judging by their short job tenures, so treacherous. A data observability solution like the Acceldata Data Observability Platform can serve as a strategic map for CDOs to successfully navigate the new data landscape and thrive for years to come.