Reflections on #transpo: Licensing public data so that everybody wins?

Idle Transit Data is the Devil’s Playground, originally uploaded by williac.

So yesterday I posted my first thoughts about Transportation Camp. Today I’m following up. Let’s talk about data licensing.

In the afternoon I attended a couple of sessions focused on transit data. One, led by Ian White of Urban Mapping, discussed data licenses for transit data, though it could be extended to any type of public/government data. The other session, led by Jim Donovan, was sort of a wishlist/future of transit data feeds prognosis. Both sessions featured the close relationship between the developers and the transit agencies, and the clear benefits for both groups of such a partnership. Few agencies have the time, money, and muscle to really do much with their data, so opening it up to others to make apps and add value isn’t a bad thing at all. It wouldn’t really happen any other way.

When it comes to licenses for transit data, there’s a pretty good list in the Google Transit Data Feed wiki. The sort of gold standards, in terms of use, are those of TriMet and BART. They aren’t too restrictive, they cover their asses, and they encourage developers to do something with the data. BART and SFMTA were interested in hearing what developers would like from their data feeds in the future, so that the developers can continue to do things that make riders happy. (Or happier than being in the dark.) It’s a win-win. Right?

Well… I’m torn on this issue. I love that agencies are starting to really share their data. I want that data out there. I want researchers to do stuff with it and make better, more practical models and proposals for the future. I like seeing developers and other people in the tech community do interesting and novel things with the data. I think there’s a lot of creativity going on right now, and everybody benefits. I can’t help but have this nagging worry in the back of my mind, though, about the increasing privatization of public data. NextBus is sort of an example, but now I also worry about Streetline and all of the data they’re collecting for cities about parking. Cities, counties, states, MPOs: they’re contracting out the collection of data, and then buying it back. It’s frustrating because researchers, particularly those from universities, know it exists but can’t obtain it. This is different than buying freight data from operators or traffic data from INRIX. This is publicly funded data housed privately. It’s outsourcing, and unless the agencies start writing access and licensing of the data for research and experimentation into the contracts, it’s going to be a very expensive and difficult road ahead. This isn’t exactly new, as lots of public agencies have paid consulting firms for traffic counts and other studies requiring data collection, only to have to buy back the data for later use. But I really fear that it’s going to be more common as agencies are forced to do more with far less.

That fear doesn’t really keep me up at night. More than anything, I’m really excited about all the sharing that is going on, and would rather work on ways to promote such openness and make it easier for all agencies. That’s my dream, but it sort of goes back to what librarians like to do. We like to make stuff accessible and findable.





