You know why metadata matters.


Beauty queens with time capsule rock, Dana Point, 1966 flickr photo by Orange County Archives shared under a Creative Commons (BY) license

I often forget that information workers (librarians, archivists, what have you) have a different way of viewing the world than others. How things are described and presented so that people might be able to find something at a later date is something I think about a lot. It’s metadata. It’s preservation. It’s a time capsule. It should be a deliberate part of anything (online), but it’s often overlooked, on the assumption that search engines will take care of it or something.

This weekend Rachel Swan, a journalist I know, discovered that their bylines from a previous publication were erased:

It’s frustrating. It’s erasure. It’s also (as Swan seems to understand) likely just an oversight, the result of a number of migrations between different content management systems (CMS) where nobody thought about how to preserve all of the metadata from migration to migration, because it’s messy and hard.* So much time is spent moving over the content that stuff like authors who are no longer in the system as creators might get lost in the cracks or get turned into “staff”, whether that’s nefarious or not. *cough*Deadspin*cough*
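For the CMS-minded, here is a minimal sketch of what "preserving the byline" could look like during a migration. Everything in it is hypothetical: the field names (`author`, `published`), the `MigratedArticle` record, and the `new_users` lookup are my assumptions for illustration, not how any particular CMS works. The point is only that the name readers saw is stored as its own piece of text, separate from whether that person still has an account.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MigratedArticle:
    title: str
    byline: str                # name shown to readers, stored as plain text
    author_id: Optional[int]   # account in the new CMS, if one exists
    published: str             # original publication date, carried over as-is


def migrate_article(old: dict, new_users: dict) -> MigratedArticle:
    """Carry the byline and date across even when the author has no account."""
    name = (old.get("author") or "").strip()
    return MigratedArticle(
        title=old["title"],
        # Fall back to "Staff" only if the old record truly has no name.
        byline=name or "Staff",
        # A missing account yields None instead of erasing the byline.
        author_id=new_users.get(name),
        published=old["published"],
    )
```

With something like this, a writer who left years ago still shows up by name on their old stories; only the account link is empty.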

But I see this a lot from other sources. Like white papers or tech reports that don’t have any context: Who wrote it? When was it written? Who sponsored it? Where is it being published? The internet has made it so easy to publish stuff, and I guess people have a lot of faith in major search engines and web archives to take care of their stuff. Making sure your work is minimally described makes it so much easier for people to find it at a later date (unless you don’t want people to see it). Even then, we can get into where your stuff is indexed to be found, and how to ensure that the link won’t be broken with the next CMS upgrade. Someday AI and machine learning will take care of it all I guess, but even then it will only be successful if the building blocks are deliberately in place.

So my call to action for you is if you’re publishing something – a blog, an email blast, a paper, a report, an article, a mixtape – include attribution and a date at the very least.

* Also important to note that the SF Weekly is an alt-weekly that probably doesn’t have the resources or staff to think about these issues because it’s often all overhead. It’s hard not to think about that fresh on the heels of news that the OC Weekly is shutting down. Or remember when Gothamist and DNAinfo were shut down with no provision for an archive or anything.
