When it comes to data, many businesses still suffer a disconnect. Awareness of the potential value of data has never been higher, yet enterprises still struggle to make the most of it. Companies have more tools and capabilities at their disposal than ever, but can’t necessarily translate information into commercial intelligence. Data is seen as a strategic resource, but many organizations have yet to define a data strategy.
One recent Harvard Business Review study found barely a quarter of companies feel they’re able to effectively measure and report on the business value of their data and analytics investments - despite 80% agreeing it’s important to do so. Research into the issues facing CEOs by PwC also points to a significant gap between the data business leaders know they need to make critical decisions, and the adequacy of the data they actually get.
The data disconnect
i. Data lakes everywhere (and not a drop to drink)
The failure to achieve this state of trust is rooted in data’s historical trajectory.
“We have to challenge this very fundamental assumption that for any company or business unit to engage in data-driven experimentation they must have access to centralized data to get any meaning out of it,” Dehghani says. “That paradigm has become a blocker to scale in any meaningful way, which impacts how we build organizations and teams, and leads to how technology has been built bottom-up.”
The bias towards centralization, in the form of data warehouses and later on data lakes, means teams that are not intimately familiar with the data, its origin or its usage are centrally created and made responsible for it. This leads to blockages and some of the truthfulness of the data being ‘lost in translation.’
A typical data lake/organizational structure
ii. Shifting towards a mesh
The business case is clear: instead of creating lakes, or silos, organizations should pursue a nimbler approach to data, bringing it closer to parts of the business where it’s directly relevant.
This can be achieved by applying two core principles - domain-oriented data and data as a product. Domain-oriented ownership and distribution breaks data architecture down around individual functions while maintaining overarching connectedness and integrity. Treated as a product, and not just a resource, data becomes something that’s a pleasure to consume and use. These practices are the basis of a data architecture designed for a resilient, and fast-acting, digital business: Data Mesh.
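As a rough illustration of these two principles, the Python sketch below models each dataset as a product owned by a named domain team and registered in a shared catalog so that it stays discoverable across the business. The class names DataProduct and MeshCatalog, their fields, and the example domains are hypothetical, chosen only for illustration; they are not part of any Data Mesh standard or tool.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """A dataset published by the domain team that owns it (hypothetical model)."""
    name: str
    domain: str   # owning business function, e.g. "orders"
    owner: str    # accountable domain team, not a central data group
    schema: dict  # column name -> type: the contract consumers rely on


@dataclass
class MeshCatalog:
    """Shared index that keeps domain-owned products connected and discoverable."""
    _products: dict = field(default_factory=dict)

    def publish(self, product: DataProduct) -> None:
        # Each (domain, name) pair may be published only once.
        key = (product.domain, product.name)
        if key in self._products:
            raise ValueError(f"{product.domain}/{product.name} is already published")
        self._products[key] = product

    def find(self, domain: str) -> list:
        # Return the products of one domain, sorted by name.
        return sorted(
            (p for p in self._products.values() if p.domain == domain),
            key=lambda p: p.name,
        )


catalog = MeshCatalog()
catalog.publish(DataProduct("daily_orders", "orders", "orders-team",
                            {"order_id": "string", "total": "decimal"}))
catalog.publish(DataProduct("churn_scores", "marketing", "marketing-team",
                            {"customer_id": "string", "score": "float"}))

orders_products = [p.name for p in catalog.find("orders")]
print(orders_products)
```

The point of the sketch is the split of responsibilities: ownership and schema live with the domain, while the catalog provides only the thin shared layer needed to find and trust what other domains publish.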
When developing a data platform, enterprises won’t always be starting from scratch. Dehghani notes that existing cloud technologies can act as a “utility layer,” providing the storage and streaming capabilities and standards upon which more mature layers of the platform are built to support interactions with distributed architecture and decentralized teams.
At most organizations, “the utility layer is there, but it’s been built to assume data is going to be centralized, and there’s a layer of technology that’s absent around the orchestration of the distribution of data,” she explains. “If you decide to put ownership of the data in the hands of different domains, not a central group of hyper-specialized data engineers, you need to raise the abstraction of the platform to a level that a generalist developer can also get the analytical data they need to build a microservice or application. That shift of power, from the specialist to the generalist being able to generate meaningful, useful data, requires engineering commitment.”
Data platform model
v. Embedding security and governance
Allowing all this freedom may seem problematic, given business leaders remain highly, and correctly, concerned about any potential weaknesses in data security and governance. Decentralization can be seen as risky, because it removes a single gateway or point of control.
Top data security concerns of CIOs/IT leaders
vi. Dealing with the data talent shortage
In implementing security and other policies around data, businesses often fret about a shortage of expertise – and indeed studies show demand for data skills continues to outstrip supply.
However, as Gorcenski points out, companies are often “sitting on data talent that they don’t realize they have” - people who may have a strong interest in data but who have been prevented from interacting with systems or working with developers because these tasks don’t fall under their formal role.
Biggest technology skills shortages for businesses
According to Pendse, while the focus is often on software and services, many of the more exciting recent developments have been on the hardware side. “The whole fabric of computing is changing, with fit-for-purpose chip design,” he says. “Then there are developments like non-volatile memory, which basically means if you shut off your computer, your RAM doesn’t go away – persistent memory in other words. What happens to the idea of a database if an application is persistent even when the server shuts down?”
Gorcenski, meanwhile, sees massive potential in the vast amounts of data left untapped in the Internet of Things (IoT) space – and in enterprises striving to do genuinely new things with data, rather than emulating the approaches of luminaries like Google or Facebook.
“We need to look at how to use data to disrupt our own industries, not to do what Google is doing, but to do what nobody’s done before,” she says. “We need to stop thinking of other businesses as living in different worlds and start to see them as potential partners, finding ways to augment each other with data. Collaboration creates a better business ecosystem than competition in many cases. Recognizing those benefits requires bold thinkers who are willing to do challenging and complicated things and make that investment. It’s not going to happen in a quarter or a year, but it certainly is possible. There are more unsolved data problems than there are solved ones.”