What Is Data Mesh, And Why Does It Solve Problems?

Data Mesh offers an alternative to the centralized organizational and architectural model of the data lake: a distributed, decentralized architecture designed to help enterprises gain business agility and scalability. Despite the countless benefits it offers, we often feel that technology, rather than increasing productivity, makes our work more complex and convoluted. This happens when it is poorly implemented, with overly rigid rules or an operating logic that does not follow the user’s needs, or when it seems to interfere for no good reason (try moving an image in Word by a millimeter and watch an exaggerated reaction that changes the whole document).

This doesn’t mean that we miss the days when we had to drive with a paper map in our lap, before the arrival of satellite navigation, or when we had to leaf through the business directory looking for a hotel, a restaurant, or even just a company’s phone number. In short, we want technologies that genuinely help us and that are pleasant to work with. Data Mesh solves problems effectively, quickly, and without unnecessary constraints. Let’s see what this means in practice.

Data Mesh And Real Problems

The challenge in a modern company today is to find reliable data. It’s not just about knowing where the data is, but also whether we can trust it. In an increasingly competitive business world, avoiding extra expenses and not missing an opportunity is an absolute must. And you can’t afford these missteps “just” because you can’t find or trust your data. Today it is unlikely that a company lacks the data for a report, a new KPI, a new Business Intelligence or Business Analytics initiative, an analysis to validate a new business proposition, and so on.

On the contrary, we seem overwhelmed by data and struggle to manage it, with constant questions about its quality. Is it reliable? Who created it? Did it come from within the company, or was it purchased? Was it derived from a different source dataset? Is it up to date? Can I find all the information I need in one place? Is it in a format compatible with my needs? What is the complexity (and therefore the cost) of extracting the information I need from that data? Has anyone else in the company already used that dataset, and if so, what has their experience been like?
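Several of these questions can be answered up front when each dataset carries descriptive metadata. The following is a minimal sketch in Python, assuming a hypothetical `DatasetDescriptor` record and a hypothetical `open_questions` helper; the names and fields are illustrative assumptions, not part of any Data Mesh standard or library.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class DatasetDescriptor:
    """Hypothetical metadata record; every field name is an illustrative assumption."""
    name: str
    owner: str                      # who created the data ("" if unknown)
    origin: str                     # "internal" or "purchased"
    derived_from: tuple[str, ...]   # upstream source datasets, if any
    data_format: str                # e.g. "parquet", "csv"
    last_updated: datetime          # when the data was last refreshed

    def is_fresh(self, max_age: timedelta, now: datetime) -> bool:
        """Return True if the data was refreshed within max_age of now."""
        return now - self.last_updated <= max_age


def open_questions(d: DatasetDescriptor, needed_format: str,
                   max_age: timedelta, now: datetime) -> list[str]:
    """List the trust issues a consumer would still have to resolve."""
    issues = []
    if not d.owner:
        issues.append("owner unknown")
    if d.data_format != needed_format:
        issues.append(f"format is {d.data_format}, needed {needed_format}")
    if not d.is_fresh(max_age, now):
        issues.append("data is stale")
    return issues
```

A consumer could call `open_questions` before committing to a dataset; an empty list means that none of the checks modeled here raised a concern, not that the data is guaranteed to be correct.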

There may therefore be issues that need to be resolved before a particular data asset can be used. This is no surprise: every new business analysis and BI initiative has its own set of obstacles. The real challenge, however, lies in estimating these efforts in advance so as not to run into budget overruns or other costly delays. Every time you try to analyze the situation, you cannot get a definitive answer from the data engineers about the quality of the data, or even about the time it would take to determine whether the data is adequate. There is no way to set a budget (in terms of both time and resources) for solving problems when you can’t know in advance what problems your data might have and how expensive it would be to find out.

It is impossible to estimate a Time-To-Market if you don’t know the challenges you will face. So how can you determine whether a product or service will be relevant in the market when there is no way of knowing how long it will take to launch? Is it a risk worth taking, or should the entire project be blocked? This is the kind of problem that poor data observability and reusability lead to, combined with ineffective data governance policies. Companies are beginning to understand that the problem of data integration must be approached as an organizational issue rather than a technical one. Business Units are (or should be) responsible for their data assets, taking ownership of them both technically and functionally.

In the last decade, on the contrary, Data Warehouse and Data Lake architectures, in all their forms, have freed data owners from the technical burden while leaving the knowledge of, and competence in, this data in the hands of the Business Units.

Unfortunately, a direct consequence has been that central IT (or the data engineering team), once the first ingestion process is in place, has “taken” ownership of that data, thereby forcing centralized ownership. Here the integration stops: potential consumers who could create value from the data must now go through the data engineering team, which has no actual business knowledge of the data it provides as ETL results. As a result, potential consumers do not trust or exploit the data resources and therefore produce no value further along the chain.

The Four Principles Of Data Mesh

The above implies that now is the time not merely for another architecture or technology, but for a completely new data paradigm that can solve all these integration (and organization) problems.

This is where Data Mesh comes in. It is, first of all, a new organizational and architectural model based on the principle of domain-driven design. This concept has proved very successful in microservices and is now applied to data assets, so that they are managed with strategies and domains oriented to the business and not only to the application or technology.

In other words, it means making the data work for our needs rather than spending our effort on its technical complexities. Data Mesh is both revolutionary in its results and evolutionary, as it exploits existing technologies and is not tied to any specific underlying technology. Now let’s try to understand how it works from a problem-solving point of view.

Data Mesh is based on four principles:

  • Decentralized, domain-oriented data ownership and architecture
  • Data as a product
  • Self-service data infrastructure as a platform
  • Federated computational governance

As already mentioned, one of the most frequent reasons for failures in data strategies is the centralized ownership model due to its inherent “bottleneck shape” and inability to scale up.

Adopting Data Mesh breaks down this model, transforming it into a decentralized (domain-driven) one. Domains must be the owners of the data they know and provide to the business. This ownership must be both functional, with respect to the company, and technical, so that the domains can move at their own speed, with the technology they are most comfortable with, while providing data that is valuable and accessible to any potential consumer.
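Domain ownership of this kind can be sketched as a shared catalog in which each domain publishes its own data products and consumers discover them directly, without going through a central team. The `DataProduct` and `Catalog` names below are hypothetical and chosen only for illustration; a real platform would add access control, versioning, and governance checks.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataProduct:
    """Hypothetical data product entry; the fields are illustrative assumptions."""
    name: str
    domain: str        # owning business domain
    technology: str    # storage or serving technology chosen by the domain
    description: str


@dataclass
class Catalog:
    """Shared, discoverable index; each entry stays owned by its domain."""
    _products: dict[str, DataProduct] = field(default_factory=dict)

    def publish(self, product: DataProduct, publisher_domain: str) -> None:
        """Register or update a product; only its owning domain may do so."""
        if publisher_domain != product.domain:
            raise PermissionError("a domain may only publish its own products")
        existing = self._products.get(product.name)
        if existing is not None and existing.domain != publisher_domain:
            raise PermissionError(f"{product.name} is owned by {existing.domain}")
        self._products[product.name] = product

    def find(self, keyword: str) -> list[DataProduct]:
        """Let any consumer search by keyword, sorted by product name."""
        kw = keyword.lower()
        matches = (
            p for p in self._products.values()
            if kw in p.name.lower() or kw in p.description.lower()
        )
        return sorted(matches, key=lambda p: p.name)
```

In this sketch the ownership rule lives in `publish`, while `find` is open to everyone: each domain keeps its choice of technology, and consumers still have a single place to look.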