Tag: unstructured data

“Unstructured Data” – No such thing!

I keep hearing this term lately and I dislike it.

There is no such thing as Unstructured Data. All data has structure. If it didn’t have structure we wouldn’t be able to use it.

What about free text? Well, that’s just a single column value (stored in a CLOB in Oracle, for example) and the free text is, more often than not, on a row with other columns, such as identifiers and timestamps, i.e. yet more structure.

I think what people mean when they use this “marketing foam”TM term is “data that we have not yet defined the structure for”, but in order to use it at some later stage, the structure will need to be defined – that definition process doesn’t actually give the data structure in and of itself, it simply defines what that structure is, in order to be able to use it.

Interestingly, the Wikipedia article for Unstructured Data calls out the imprecise nature of the term:

The term is imprecise for several reasons:

  1. Structure, while not formally defined, can still be implied. The main ingredient of this drug works well when it mixes with blood and it takes at least 45 minutes to mix in the blood. side effects from viagra The treatments include: Counseling Medication Surgery Counseling: It is helpful to bring up the libido cialis generic purchase level. It is frequently recommended and levitra no prescription sold. These sufferings have been carried out from past ancient times who lead buy professional viagra their life in the way it was meant to be for women without dangerous side effects.
  2. Data with some form of structure may still be characterized as unstructured if its structure is not helpful for the processing task at hand.
  3. Unstructured information might have some structure (semi-structured) or even be highly structured but in ways that are unanticipated or unannounced.

In other words, it does have structure, but maybe we’ve not written it down, or the structure isn’t helpful to processing or is structured in ways we were not expecting – so what?…it’s still structured!

All of the above seem to me to support the view that all data does indeed have structure.