Evolution is a constant, and for data and data management, evolution is essential
Evolution of Data
Change is the only constant in life, and technology, civilizations, and culture have evolved over history. What has not changed are facts.
With the passage of time and the evolution of technologies, civilizations, and culture, the methodologies used to capture, store, process, and use facts have evolved. Similarly, data (a representation of facts) and data management have had their own evolution cycles and they continue to evolve.
Until the advent of computers, few facts were documented, because the resources and effort needed to store and maintain them were scarce and expensive. In ancient times, it was not uncommon for knowledge to be passed from one generation to the next through oral learning. The oral tradition stands in contrast to the current digital age, which has elaborate document and content management systems that store knowledge in the form of documents and records (Mahanti 2021b).
With the advent of computers and subsequent innovations in computing and industrial automation, data processing shifted markedly toward the electronic recording and processing of data to support business operations. Although electronic storage and processing of data started at the end of the 19th century, the cost and limitations of storage meant that the amount of data that could be stored was relatively small, and data management as a discipline was less complex. Technology was seen as a means of reducing the manual overhead of generating correct reports, and data was seen as a by-product.
However, advances in technology, the decreasing cost of disk hardware, and the availability of cloud storage have made it possible to store large volumes of data at much lower cost. The growing number of data-generating devices, including smartphones, gadgets with sensors, and Internet of Things devices, together with cloud computing, has culminated in an explosion of data.
As Eric Schmidt stated, “there were 5 exabytes of information created between the dawn of civilization through 2003, but that much information is now created every 2 days.” From 2010 to 2020, the amount of data created, captured, copied, and consumed in the world increased from 1.2 trillion gigabytes to 59 trillion gigabytes, a growth of almost 5,000% (Press 2020).
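The growth figure quoted from Press (2020) can be checked with simple arithmetic. The short Python sketch below is only an illustration: the function name `percent_growth` and the variable names are assumptions introduced here, and the two volumes are the figures cited above, expressed in trillions of gigabytes.

```python
def percent_growth(start: float, end: float) -> float:
    """Return the growth from start to end as a percentage of start."""
    if start <= 0:
        raise ValueError("start must be positive")
    return (end - start) / start * 100.0


# Figures cited from Press (2020), in trillions of gigabytes.
data_2010 = 1.2
data_2020 = 59.0

growth = percent_growth(data_2010, data_2020)
multiple = data_2020 / data_2010

print(f"Growth 2010-2020: {growth:,.0f}%")        # Growth 2010-2020: 4,817%
print(f"2020 volume vs. 2010: {multiple:.1f}x")   # 2020 volume vs. 2010: 49.2x
```

The result, roughly 4,817%, is consistent with the "almost 5,000%" figure. Put differently, the 2020 volume is about 49 times the 2010 volume.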
Data Deluge Challenges
The digital age is characterized by an over-abundance of information. It can be considered a pandemic in its own right, hence the name infodemic. Whereas people were once starved for knowledge and insights because of limited or no data, the increasing volumes of data in the present age bring challenges of their own. Unfortunately, people are still starved for knowledge as well as insights. The quote by John Naisbitt (2016), “we are drowning in information but starved for knowledge,” very aptly summarizes the current information and knowledge situation in the digital age.
Some of the challenges with the explosion of data are:
Very few businesses have the time, resources, or expertise to make use of all the data that they capture and store. According to Forbes, “On average, companies only use a fraction of the data they collect and store” (Marr 2016). With too much data, it is a challenge to locate the right data. A survey conducted by the Compliance, Governance and Oversight Council (CGOC) showed that only 1% of the information being retained was subject to legal hold requirements (that is, required to be preserved because it is related to the subject matter of actual or reasonably anticipated litigation or regulatory proceedings) (Baker and Sjoberg, 2018). With large amounts of data stored and replicated in multiple repositories across the organization, locating this 1% of data is like trying to find a needle in a haystack (Mahanti, 2021a).
Information versus Misinformation
Misinformation is false information that is spread, regardless of any intent to mislead. The abundance of information and advances in communication technologies such as the internet, social media, and the telephone have amplified the problem, allowing misinformation to spread at exceptional speed (Mahanti, 2022). Research indicates that in the first three months of 2020, roughly 6,000 people around the globe were hospitalized because of coronavirus misinformation. During this period, researchers state that at least 800 people may have died due to misinformation related to COVID-19 (WHO, 2021). With such huge amounts of data, it is hard to distinguish between information and misinformation. Another problem is the speed with which misinformation travels compared to information (Mahanti, 2022). Misinformation travels much faster and reaches more people than information. The quote “Information walks. Misinformation flies” very aptly summarizes the problem.
Security and Quality
With the huge amounts of data that an organization captures and stores, data security is a challenge. Security, privacy, and compliance need to be taken into consideration, and appropriate measures and controls need to be implemented. Data quality issues are also bound to arise. Decisions based on poor-quality data are bad decisions that an organization does not know about until later.
Evolution of Strategy, Data Management, and Governance
With enterprises capturing and storing exponentially growing volumes of data, adequate strategy, data management, and data governance are needed to derive the best value and drive competitive advantage. Data management is no longer the simple discipline that it was in the early days of computing. Today, it is a multifaceted discipline with several closely interacting sub-disciplines or functions, including data quality, data security, data architecture, metadata management, and master data management; data governance is a core function connecting all the other data management functions (Mahanti, 2021b).
Without an adequate strategy, all these data management functions will be implemented through different data initiatives, and solutions will be developed piecemeal or in departmental silos without assessing the enterprise-level implications. The lack of an enterprise view can magnify risks and cause alignment with the organization’s strategic objectives to be neglected, resulting in sub-optimized solutions, inconsistencies in data and information, or systems that cannot be easily integrated (Mahanti, 2019).
Evolution of data management has many steps. A data strategy containing a proposed balance of defense (control) versus offense (flexibility/growth) coupled with planned execution through data initiatives should be accompanied by periodic review and revision, and measurement of impact and progress. An evolutionary data strategy can deliver a real and tangible impact, drive change, improve operational efficiencies, and provide substantial opportunities for new revenue and competitive advantage.
This article draws significantly from the research presented in the following books: Data Governance and Compliance: Evolving to Our Current High Stakes Environment and Data Governance and Data Management: Contextualizing Data Governance Drivers, Technologies, and Tools, both published by Springer in 2021; Data Quality: Dimensions, Measurement, Strategy, Management and Governance, published by ASQ Quality Press in 2019; and How Data can Manage Global Pandemics: Analysing and Understanding COVID-19, published by Routledge Press in 2022. Future research will focus on strategies and management to derive maximum value from data.