There are many characteristics of information quality. Some are intrinsic to the nature of the data, including objectivity, believability and reputation.
Many discussions about data quality revolve around accuracy, and many books and articles have focused on that characteristic. However, researchers at MIT and other universities associated with MIT's Total Data Quality Management program have identified other characteristics that contribute to the overall quality of the information used within organizations.
Fifteen characteristics have been found to affect information quality. Some other characteristics, which are intrinsic to the information itself, can be included in this list.
The first of these is the objectivity of the data. This characteristic is usually a problem for information that contains codes representing an event or condition. If the data consists of the number of items shipped to a customer, it is a count that can easily be checked against the actual event. If the information also contains a code or a piece of interpreted data, the process used to produce the information has a subjective aspect, which will influence the quality of the data.
The second characteristic is the believability of the information. Information that is believable will be considered to have higher overall quality. Believability and accuracy are not the same thing, however: there are many situations in which data that was highly believable was inaccurate. In those situations, consumers of the information found it extremely difficult to overcome the concerns that arose when the accuracy of the data was improved and the changes caused discomfort. The believability of information reflects the information consumer's feeling about how well it matches their view of the world and whether it supports the actions taken or decisions made.
When the information supports their beliefs and understanding, the information is thought to be believable and have good data quality. It takes a long time for an information consumer to develop a sense that the information they are using is believable. It only takes one incident for that believability to be shaken and for the information consumer to lose confidence in the information provided. If this occurs, the process of rebuilding believability again takes time and effort.
The third characteristic is reputation. If information consumers think that the data is accurate, that it objectively represents the events or conditions in which they are interested, and that it supports their view of that external environment, they will consider that the data can be used confidently for its intended purpose. As with believability, it takes a long time and great effort to build and maintain a good reputation for the information being used. Similarly, this reputation can be quickly damaged, and information consumers can lose confidence in using the information.
Examples of Characteristics’ Effect on Information Quality
Mismatches among different sources of the “same” data are a common cause of intrinsic information quality concerns. Initially, information consumers are not able to identify the source to which quality problems should be attributed; they know only that the data is in conflict. These concerns first appear as believability problems. Over time, information about the causes of mismatches accumulates from evaluations of the accuracy of the different sources, which leads to a poor reputation for the less accurate sources. As a reputation for poor-quality data becomes common knowledge, those data sources are viewed as adding little value for the organization, resulting in reduced use of the data.
Organizations can have a history of mismatches between their inventory system data and physical warehouse counts. In these instances, the warehouse counts serve as a standard against which to measure the accuracy of the system data, if the metadata of the warehouse data is more accurate and complete than the metadata for the inventory system.
The system data source is thought to be inaccurate and not believable, and it is adjusted periodically to match actual warehouse counts. The system data then gradually develops mismatches again, which results in a steadily worsening reputation until the data is no longer used for decision making. If this occurs, how does one defend the production and maintenance of such data if it has so little value for the organization? How does the organization support data quality without data governance?
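The inventory scenario above can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the item identifiers, the quantities, and the function names `find_mismatches` and `mismatch_rate` are hypothetical and do not come from any particular inventory system. The physical warehouse count is treated as the standard, as in the example above.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Mismatch:
    """One item whose system quantity disagrees with the physical count."""
    item: str
    system_qty: int
    counted_qty: int

    @property
    def delta(self) -> int:
        # Positive: the system overstates stock; negative: it understates stock.
        return self.system_qty - self.counted_qty


def find_mismatches(system: Dict[str, int], counted: Dict[str, int]) -> List[Mismatch]:
    """Compare system data with warehouse counts, treating the counts as the standard.

    An item missing from either source is treated as quantity 0 (an assumption).
    """
    mismatches = []
    for item in sorted(set(system) | set(counted)):
        system_qty = system.get(item, 0)
        counted_qty = counted.get(item, 0)
        if system_qty != counted_qty:
            mismatches.append(Mismatch(item, system_qty, counted_qty))
    return mismatches


def mismatch_rate(system: Dict[str, int], counted: Dict[str, int]) -> float:
    """Share of all known items whose two quantities disagree."""
    items = set(system) | set(counted)
    if not items:
        return 0.0
    return len(find_mismatches(system, counted)) / len(items)


# Hypothetical data, for illustration only.
system_data = {"A-100": 40, "B-200": 12, "C-300": 7}
warehouse_count = {"A-100": 40, "B-200": 10, "C-300": 9}

for m in find_mismatches(system_data, warehouse_count):
    print(f"{m.item}: system={m.system_qty} counted={m.counted_qty} delta={m.delta:+d}")
print(f"mismatch rate: {mismatch_rate(system_data, warehouse_count):.0%}")
```

Tracking a figure such as the mismatch rate after each physical count would give a rough indication of how quickly the system data drifts between adjustments, which is the pattern that erodes its reputation.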
Judgment or subjectivity in the information production process is another common cause of quality concerns. Initially, only those with knowledge of the information production processes are aware of the potential problems, which usually surface as concerns about data objectivity. Over time, information about the subjective nature of the production process accumulates, which calls the believability and reputation of the data into question. The overall result is again reduced use of the suspect data.
Although the fifteen tangible characteristics of information quality are important, it is crucial that organizations also pay attention to the intrinsic characteristics of objectivity, believability, and reputation to ensure full confidence in the quality of their information and data.