Data is an asset that has a distinguishing characteristic: the more it is shared, the more its value and potential impact increase. It is important to identify the main stakeholders and elements which are involved in a data marketplace.
Who are the participants and stakeholders of a Data Marketplace within an organization?
Ideally, a Data Marketplace should be used by most data stakeholders in a data-driven organization.
In this context, a Data Marketplace is an ecosystem that must consider a variety of participants and stakeholders since its purpose is to serve as a one-stop-shop for data governance and a central hub for data sharing within an organization. In this sense, a Data Marketplace cannot be defined only from one point of view; one must try to put oneself in the shoes of each stakeholder who will be part of this new ecosystem.
Broadly, the participants and stakeholders of a Data Marketplace can be classified into 3 groups, each with its own roles, functions, responsibilities, interests, and corresponding benefits:
Historically, the roles of producers/suppliers and consumers of information have been performed by IT staff with knowledge and skills in data development and exploitation (software/data/IT/BI developers/engineers/architects, operations/data technicians, …). As data's visibility increases in an organization and data is identified as a strategic asset, this paradigm changes, producing the following effects:
- New roles appear with functions and responsibilities over data assets within specific data domains related to business areas (data stewards, data owners, …), transferring responsibility for those assets from IT profiles to business profiles.
- Business roles are gaining more training and greater capabilities for exploiting and analyzing available data, moving from being mere consumers of prefabricated reports to becoming data analysts. In addition, these roles have access to additional technologies that improve the availability, analysis, and consumption of information in different ways, so their “appetite for data” is growing rapidly.
- New roles, such as data scientists, combine hybrid knowledge and capabilities across technology, data, and business, and are usually more attached to business areas than to IT. Likewise, this type of data specialist often has a wide range of technologies available that bring data much closer to the business and provide it with a set of previously unavailable capabilities.
From this, it is evident that the focus is no longer only on technological solutions for technical IT roles. Data-related technological solutions must increasingly be designed to facilitate adoption by business roles, offering them capabilities to which they did not previously have access, largely because they did not previously need them.
Other roles span both technical profiles (IT architects, DBAs, systems technicians, CISO, CIO, CTO, …) and business profiles (data architects, compliance, legal, audit, …), while new roles have appeared that did not exist before (CDO, DGO, Data Office, …). All of them also need their needs met from a data governance point of view and must be considered stakeholders in the Data Marketplace ecosystem.
What are the elements of a Data Marketplace?
The first and most important aspect of a Data Marketplace is to abstract data governance from the underlying technologies used for data capture, integration, storage, processing, and exploitation.
These technologies are usually designed from a technical point of view, where the most important variables considered are performance, integration capabilities, processing capabilities, data volumes, fault tolerance, high availability, etc. However, many of them do not consider data governance, metadata management, or interoperability capabilities from a more functional point of view.
Organizations also need an enterprise vision of data that is much closer to the language of the business and that facilitates access and exploitation by potential consumers. Together, these factors mean that the underlying technologies must be complemented by a higher (semantic) layer that allows an effective and efficient data governance initiative to be implemented and operationalized with a technology-agnostic vision.
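As a rough illustration of what such a technology-agnostic layer might look like, the sketch below describes a data asset in business terms and keeps its physical locations as plain attributes, so that a search works the same way regardless of the platform underneath. All names here (`DataAsset`, `PhysicalLocation`, `find_assets`, and the sample platforms, owners, and paths) are hypothetical and are not taken from any specific product.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class PhysicalLocation:
    """Where a copy of the data physically lives (hypothetical model)."""
    platform: str  # e.g. "dwh" or "data_lake"
    path: str      # platform-specific identifier


@dataclass
class DataAsset:
    """Business-facing description of a data asset, independent of platform."""
    business_name: str
    domain: str
    owner: str
    tags: set[str] = field(default_factory=set)
    locations: list[PhysicalLocation] = field(default_factory=list)


def find_assets(catalog: list[DataAsset], term: str) -> list[DataAsset]:
    """Return assets whose business name or tags contain the term (case-insensitive)."""
    needle = term.lower()
    return [
        asset for asset in catalog
        if needle in asset.business_name.lower()
        or any(needle in tag.lower() for tag in asset.tags)
    ]


# Sample catalog with made-up entries, for illustration only.
catalog = [
    DataAsset(
        business_name="Customer Orders",
        domain="Sales",
        owner="sales.data.owner",
        tags={"orders", "revenue"},
        locations=[
            PhysicalLocation("dwh", "sales.fact_orders"),
            PhysicalLocation("data_lake", "raw/sales/orders/"),
        ],
    ),
    DataAsset(
        business_name="Employee Directory",
        domain="HR",
        owner="hr.data.owner",
        tags={"people"},
        locations=[PhysicalLocation("dwh", "hr.dim_employee")],
    ),
]

matches = find_assets(catalog, "orders")
print([asset.business_name for asset in matches])
```

In this sketch the consumer searches by business vocabulary only; the list of physical locations is metadata attached to the asset rather than something the consumer needs to know in advance.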
Each section of the architecture can be identified according to its purpose:
Data Platforms Layer (technologies where data live and move):
- Technologies for data movement and ingestion (ETL/ELT, FTP, events, streaming, …)
- Technologies for data storage (relational databases, NoSQL, DWH, Data Lake, …)
- Technologies for data processing (preparation, transformation, modeling, quality, …)
- Technologies for data consumption and exploitation (BI, Reporting, Analytics, Self-Service, …)
- Technologies for data integration (ETL/ELT, virtualization, APIs, DaaS, …)
Security Layer (technologies that control access to data at the physical level):
- Identity Management Systems (AD, LDAP, IAM, …)
- Data security and permissions management systems (ACL/grants, RBAC/PBAC/ABAC, anonymization/masking/filtering, …)
Data Governance and Metadata Management Layer:
- Centralized metadata repository (Data Dictionary, Data Catalog, and Business Glossary)
- Reference Metadata Management (semantics/ontology, taxonomies, classifications, tags, …)
- Complete data asset lifecycle management with versioning support
- Governance model based on data domains, roles, and permissions
- Operational model based on automatable workflows for policy and procedures implementation
- Data portal with Google-like search engine and advanced filters
- One-stop-shop integrated with demand management
- Data access management through data sharing agreements and data contracts
- Automation of common technical processes over data marketplace technologies and data platforms
- Complete data quality lifecycle management
- Collaborative, interactive, and intuitive environment for non-technical users with messaging, notifications, and alerts
- Global, hybrid, and extended view of data lineage and traceability along with knowledge graphs of the data ecosystem
- Support for both internal and external auditing
- Continuous monitoring and improvement of both data marketplace processes and data governance implementation
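To make one of the capabilities above more concrete, the following minimal sketch shows how access to a data asset could be decided from data sharing agreements rather than from platform-level grants. It is only an assumption about how such a check might be modeled: `SharingAgreement`, `can_access`, and the sample consumer, asset, and dates are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SharingAgreement:
    """Hypothetical record of an approved data sharing agreement."""
    asset: str
    consumer: str
    purpose: str
    valid_until: date


def can_access(agreements, consumer, asset, today):
    """Grant access only if an unexpired agreement covers this consumer and asset."""
    return any(
        ag.consumer == consumer and ag.asset == asset and today <= ag.valid_until
        for ag in agreements
    )


# Made-up agreement, for illustration only.
agreements = [
    SharingAgreement(
        asset="Customer Orders",
        consumer="marketing.analyst",
        purpose="campaign analysis",
        valid_until=date(2025, 12, 31),
    ),
]

# Within the agreement's validity period.
print(can_access(agreements, "marketing.analyst", "Customer Orders", date(2025, 6, 1)))
# After the agreement has expired.
print(can_access(agreements, "marketing.analyst", "Customer Orders", date(2026, 1, 1)))
```

A real implementation would presumably translate an approved agreement into permissions on the underlying platforms through the security layer; here the decision is kept in memory to show the idea only.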
The first two layers are usually quite well covered by a multitude of technologies in those organizations that want to be data-driven, but the last layer is the one that is usually left behind. This third layer is the focus of the Data Marketplace paradigm, especially with a multi-platform vision and a metadata-centric approach.
A Data Marketplace involves many different stakeholders in diverse roles who interact and collaborate with one another in a complex ecosystem, using several mechanisms and tools organized in 3 layers for different purposes.