
The Value of Context to Textual Data Analysis


Context is essential to the meaning of any text. The value of context in textual data analysis cannot be overlooked.

We take context for granted. Context is free and natural and just comes with data. Right?

When we look into this proposition, we find that it is anything but true. Context is necessary, but it does not come free and is not naturally associated with data. To be useful, the association between the data and its metadata must be explicit and definite, and the context itself must be usable.

To illustrate the importance of context, take a very common circumstance.

The teacher asks the question – “What is the answer?”

The young man – a student – answers the question – “The answer is 7.”

Now, is this an answer? Does the young man mean seven seas? Seven dwarfs? Seven continents? Seven dollars? Seven days of the week?

Unless the young man adds context to the number, the number is meaningless. This leads us to the following set of statements –

Number – meaningless

Number + Context = information

Numbers by themselves are naked. Any number stripped of context is meaningless. When you understand information this way, you start to understand the value of context. Moreover, it becomes obvious that numbers have to have context to be meaningful.
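The statement "Number + Context = information" can be sketched in a few lines of Python. This is only an illustration, not a prescribed design: the `Information` class, its `context` field, and the sample contexts are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Information:
    """A number paired with the context that gives it meaning (hypothetical type)."""
    number: float
    context: str  # assumed free-text description, e.g. "days in a week"

    def describe(self) -> str:
        return f"{self.number:g} ({self.context})"


bare_number = 7  # on its own, this could mean anything

# The same number becomes different information under different contexts.
days = Information(7, "days in a week")
dollars = Information(7, "dollars owed")

print(days.describe())     # 7 (days in a week)
print(dollars.describe())  # 7 (dollars owed)
print(days == dollars)     # False: same number, different information
```

The two objects hold the same number yet compare as unequal, which is the point of the statements above: the context is part of the information, not an optional extra.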

So where did the notion that numbers naturally contain context originate? It dates from the early days of database management systems, when everyone was taught to define records. And in these records there are fields.

Typical fields for a customer’s record might include –

  •  Name
  •  Address
  •  Telephone number
  •  And so forth.

On the other hand, typical fields for a transaction might include –

  •  Item purchased
  •  Date of purchase
  •  Transaction amount
  •  Location of purchase
  •  And so forth.

The database record is designed based on these requirements, and the database records contain fields. Then, as transactions are made, the data from the transactions is entered into the appropriate field. In a database record there indeed is a direct correlation between numbers and context. That context is termed “metadata.”
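As a rough sketch of how a record definition carries context with it, the transaction fields listed above could be modeled as a Python dataclass. The class name, field names, and sample values here are assumptions made for illustration; a real database would express the same idea through its schema.

```python
from dataclasses import dataclass, fields
from datetime import date


@dataclass
class Transaction:
    """Hypothetical record layout based on the transaction fields listed above."""
    item_purchased: str
    date_of_purchase: date
    transaction_amount: float
    location_of_purchase: str


# Hypothetical sample values, for illustration only.
txn = Transaction("notebook", date(2024, 1, 15), 7.00, "Denver")

# The field names act as the metadata: they travel with every value.
metadata = [f.name for f in fields(Transaction)]
print(metadata)

# Because the 7.0 sits in a named field, its context is unambiguous.
for f in fields(txn):
    print(f"{f.name} = {getattr(txn, f.name)!r}")
```

Here the number 7.0 is never seen without the label `transaction_amount`, which is why numbers in a database record appear to come with context for free.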

This process of designing database records and building systems around those designs is a well-established practice. In the case described, it is easy to see how the assumption arises that numbers come with context. In systems that are built around database records and that process business activities, numbers DO come with context in a very natural manner.

So it is easy to see where the assumption that numbers and context go hand in hand comes from: in standard database management systems, numbers and context really do go hand in hand. This is one of the foundational components of enterprise data management. However, database records are hardly the only kind of data there is.

In the case of the young man answering his teacher, it is likely that both the teacher and the student understand the context of the question. When the student answers – “7,” both the teacher and the student understand the context of the answer, since they know the question that was asked and the topic under discussion. So there is implied context which gives meaning to the answer. However, where there is no implied context to a conversation, it is necessary to state context explicitly to make the answer meaningful.

Outside of the world of database management systems, the world is governed by the basic understanding: Number + Context = useful information.


Bill Inmon

Bill Inmon is best-known as the “Father of Data Warehousing” and textual data integration. He has become the most prolific and well-known author worldwide in the data warehousing and business intelligence arena, and has opened the field of textual data integration. In addition to authoring more than 50 books and 650 articles, Bill lectures on data warehousing, textual data integration and related topics. Bill consults with a large number of Fortune 1000 clients, and supports IT executives on data warehousing, business intelligence, and database management issues around the world.

© Since 1997 to the present – Enterprise Warehousing Solutions, Inc. (EWSolutions). All Rights Reserved
