Freight data makes me feel like a failure

Freight, originally uploaded by Thomas Hawk.

Yesterday I got one of my least favourite types of reference questions: somebody wanted freight data. Not just that, but freight data about how much it costs to ship dry goods in different types of container ships.

Why do I hate freight data? It’s opaque. It’s inaccessible. It’s expensive. There’s no sign of it changing and I feel impotent to do anything about it.

Every so often I get these questions. “How much does it cost to ship goods from there to here?” Well… there are some places you could look, like the Baltic Dry Index or maybe the Coal Transportation Rates and Trends, but those tend to be too broad or general for what most users want.

The frustrating thing is that you know the data does exist. Freight companies collect all sorts of data about everything they do. It helps them stay efficient and profitable, but they also know they can resell this data for a lot of money, and that’s sort of where the problem comes in. Pricing. I asked somebody about this at TRB last year and they summed it up like this: The more they charge for the data, the more valuable it is because fewer people will use it… they think. It’s not rocket science, it’s speculation, but that’s sort of what these private companies have to do. It would be interesting to see how things would change if they made this sort of data more accessible to researchers, without the embargoes and prices that very few researchers from universities or public agencies can afford. The optimist in me thinks everyone would benefit, but alas… the rest of the freight world (or most of the transportation engineering/economics world) isn’t there yet.

In some ways these questions are a good, though perhaps harsh, lesson in the realities of commoditized data. Unfortunately, sometimes you have to work with what you can get, not what you want. It’s sort of a buzz kill.

As long as there are people who can shell out thousands of dollars for a single data set, and that market and model continue to thrive, we either need to cough up the money or shrug and do something else. Sort of depressing, really.

Data.Gov — transparency overload, PR move, tool.

The SLA Gov Info division posted this morning a link to a Harvard Business School case study, Data.Gov: Matching Government Data with Rapid Innovation. (There’s a free copy of the study for government employees linked on the DGI blog.)

It’s a good read about Vivek Kundra and the whole project, but I think it doesn’t really address some of the major usability issues of Data.Gov. Not everything is in there, the organization could be better, it needs more usable analysis tools, etc. It’s a fine line because on one hand, it’s commendable for the Obama administration to take on such a task and pay service to the idea of Open Data, but then on the other it’s frustrating because not all data is equal. Some agencies are barely represented because their data is such a mess, though it is available elsewhere. It makes my job difficult because I can’t yet sell Data.Gov to my users as the one-stop source for their data needs.

It would be great if the Obama Administration could make better data collection and sharing a top priority for all of the different departments, but I also know it’s pretty low on the list of things to do (and was even before the economy collapsed). When it has enough transportation data for the needs of my community, I will be happy, but I know we’re not alone.