The always insightful Pete Warden recently penned a blog post on “What the Sumerians can teach us about data.” There is much to praise and react to in his analysis. But I’m struck in particular by a semantic matter: does Pete really mean “data” or “information”? I usually hate this genre of challenge; it’s the most tedious in our business. But this time it deserves to be raised.
The reason is that the idea of quantification is really a phenomenon of the Middle Ages in Europe (laying to rest the old canard that they were “dark ages” devoid of progress). On the other hand, the period of antiquity is typified by man describing his world as one of qualities. (Remember Plato’s “forms”? And Aristotle’s taxonomy of just about everything?)
To be sure, in the area of money we can talk about quantification and thus data as we think about it today. But in many of Pete’s terrific examples of how the Sumerians recorded their world — in the “fixed media” of clay tablets and the like — I am unsure if the term data fits.
Ought “writing” be considered data? If so, how about cave paintings? Surely the Egyptian hieroglyphs imparted information, but should we call it “data” per se? The only way to answer that question is to define data.
The word data is the plural of datum, neuter past participle of the Latin dare, “to give”, hence “something given,” instructs Wikipedia. “1. Facts and statistics collected together for reference or analysis. 2. The quantities, characters, or symbols on which operations are performed by a computer, being stored and transmitted in the form of…” reports a Google definition.
Building on the idea that data may be something different from merely recorded information, at what point does something go from being simply info to data?
I have a few ideas on how to answer this — I am scribbling away on a large work that looks at this topic among others. But I’m not quite ready to share it with the world, since the thoughts are still fermenting. In the meantime, Pete’s post is a wonderful look at how an early society recorded and used information. Among my favorite points:
* “Written records remove the problem of fallible memories, but replaces it with a second-degree question of provenance. How do you know the data accurately reflects what happened?”
* “We still have a disturbing tendency to trust anything that’s recorded, without understanding the subjective process that went into creating the record.”
* “The main way Sumerians protected the integrity of their data was through curses. This may seem laughable to a modern audience, but I don’t think we’re so different. Do you expect the FBI to actually raid your house if you copy that VHS tape?”
* “In the absence of real answers, we’ll take bogus ones painted with a veneer of data, just like the Sumerians.”
* “If there’s any way you can, please think about how to open up data you control, it’s the best way to pass it on to posterity.”
Having pointed out what I enjoyed most, let me close on a final quibble. Pete writes:
“The Sumerians recorded everything on stone or clay tablets … This data exhaust gives a rich view into trade, worship, life, death, medicine and almost every other aspect of the Sumerian’s world.”
It is absolutely not “data exhaust” in the way that the term has come to be known (and how I helped popularize it in a report a few years ago). The idea was that the byproduct of people’s interactions with information could itself be collected and analyzed. The simplest example is tracking readers’ activities to show website visitors the most-read articles, as a simple heuristic to indicate what might interest them.
What Pete describes, and what the Sumerians recorded, was information (or perhaps data) pure and simple. No “exhaust” about it — other than that the tablets had been thrown away by the Sumerians before modern archeologists dug them up.
But all this ranting is only meant to add momentum to my appreciation for Pete’s splendid work in this post and others!