tecznotes

Michal Migurski's notebook, listening post, and soapbox.

Jan 8, 2013 6:55pm

work in progress: green means go

Since State Of The Map in Portland, I’ve been applying simple raster methods to OpenStreetMap data to draw a picture of the current state of U.S. Government TIGER/Line data in the project. TIGER data is a street-level dataset of widely varying quality covering the whole of the United States, and much of OSM in this country is built on TIGER. OSM US Board member Martijn Van Exel explains, in his TIGER deserts post:

Back in 2007, we imported TIGER/Line data from the U.S. Census into OpenStreetMap. TIGER/Line was and is pretty crappy geodata, never meant to make pretty maps with, let alone do frivolous things like routing. But we did it anyway, because it gave us more or less complete base data for the U.S. to work with. …there’s lots of places where we haven’t been taking as good care of the data. Vast expanses of U.S. territory where the majority of the data in OSM is still TIGER as it was imported all those years ago. The TIGER deserts.

TIGER data has been a fantastic leg up for the U.S. map, but elsewhere in the world data imports are frowned upon. The German community in particular feels that imports are antithetical to local community mapping. The U.S. is very different from Europe in terms of population density and driving distances. As Toby Murray said in this message last year, the imbalance between mapper population and surface area between Kansas and Germany is potentially insurmountable:

It is a 9 hour drive from Topeka to Denver and I think you go past a total of 3 cities with a population of over 10,000. In fact, out of the 54 counties west of Wichita, only 7 have a population for the whole county of over 10,000. So while we might be able to start OSM communities in some of the larger cities, vast stretches of the country would remain completely empty.

In many rural parts of the country, the prospective local OSM mapping population and the creators of government data are exactly the same people. I talked to an Esri employee at SOTM this year who told me that at every year's User Conference, she gets a regular stream of these folks approaching her with data in hand, asking how they can get it into OSM. They are the local community we want, and it’s not always clear how we can help them help us.

Based on the full history dump, I’ve been working on a map that I’m calling “Green Means Go,” a visualization of the state of TIGER/Line data in OpenStreetMap. The map shows a grid of 1km×1km squares covering the continental United States. Green squares show places where data imports are unlikely to interfere with community mapping, based on a count of unique participating mappers who don’t appear to be part of any of the three big TIGER imports.
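As a rough illustration of the counting described above (a sketch, not the actual process used to render the map), the idea can be reduced to binning edits into 1km squares and counting the distinct non-import usernames seen in each. Everything here is an assumption made for the example: the `(x, y, user)` input format, coordinates already projected into meters, the placeholder import account names, and the zero-mapper threshold for calling a square green.

```python
from collections import defaultdict
from math import floor

CELL_SIZE_M = 1000  # 1km x 1km squares, as described in the post

# Hypothetical placeholder names standing in for the TIGER import accounts.
IMPORT_ACCOUNTS = {"tiger_import_a", "tiger_import_b", "tiger_import_c"}


def count_mappers_per_cell(edits, import_accounts=IMPORT_ACCOUNTS):
    """Return {(col, row): count of unique non-import users}.

    `edits` is an iterable of (x_m, y_m, user) tuples. The coordinates are
    assumed to be in a projected system measured in meters.
    """
    users_by_cell = defaultdict(set)
    for x_m, y_m, user in edits:
        cell = (floor(x_m / CELL_SIZE_M), floor(y_m / CELL_SIZE_M))
        # Touch the cell first so import-only squares are reported with a zero count.
        cell_users = users_by_cell[cell]
        if user not in import_accounts:
            cell_users.add(user)
    return {cell: len(users) for cell, users in users_by_cell.items()}


def is_green(mapper_count, threshold=0):
    """Assumed rule: a square is green when at most `threshold` mappers touched it."""
    return mapper_count <= threshold


# Small made-up sample: three squares along one row.
sample_edits = [
    (100, 200, "tiger_import_a"),
    (900, 900, "alice"),
    (950, 10, "alice"),
    (1500, 200, "tiger_import_a"),
    (1500, 300, "bob"),
    (1700, 100, "carol"),
    (2500, 0, "tiger_import_b"),
]
counts = count_mappers_per_cell(sample_edits)
green_cells = sorted(cell for cell, n in counts.items() if is_green(n))
```

In this sample only the square touched solely by an import account ends up green. A real run would need a proper map projection and a considered threshold, neither of which is specified here.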

Large, densely populated urban areas show a similar pattern, with a dark center where many individual mappers have contributed, surrounded by a green rural fringe where no OSM community members have participated in the cleanup and checking of TIGER data.

This pattern shows a lot of local variety. For example, the area around Portland and Salem in Oregon, where we held last year’s SOTM-US conference, shows a broad swath of edited area. Portland in particular has shown a strong local uptake of OSM, basing its official TriMet trip planner on OpenStreetMap.

Other parts of the country, especially in the Great Plains, show the pattern of relative non-participation described by Toby Murray:

Good data does exist in these places, and in fact can be found in the more recent TIGER data sets which rely much more heavily on data generated directly by local county officials. In an area like the one above, the Green Means Go map should help a GIS data owner see that his or her own data and local knowledge would interfere minimally (if at all) with local community mappers.

In some cases, we see patterns that are worth exploring further. Entire counties in Pennsylvania show up as edited, but it’s not obvious to me that there is a county-wide local community here. Have these areas already been replaced by county-level importers who’ve improved the data, or is there some portion of the 2007 TIGER import that I’m missing?

In this other image, the relative lack of any kind of data (OSM or TIGER) is visible on the grounds of Eglin Air Force Base, south of Interstate 10 and east of Pensacola in Florida:

This work is heavily in progress. I’d also like to write about the process of making it, using the National Landcover Dataset and Hadoop to generate this imagery. Some possible next steps include:

  • Collaborating with Ian Dees, Alex Barth, Ruben Mendoza and others from the US OSM community to develop better ways of seeing TIGER data.
  • Creating static, per-county and Census Place views.
  • Developing a plan to regenerate these map tiles for future data updates.

Comments (1)

  1. I now know how to edit OSM (thanks to some tips from Ian Dees). I had to learn because when I downloaded extracts from OSM planet (thanks in part to your website, but I get it from BBBike Extract because your last one was in September), there was old stuff in there that no one had edited (like a bridge no longer under construction). Anyway, is there a webpage I can look at to see what in my area (Chicagoland) has been flagged as needing improvement/editing?

    Posted by Steven Vance on Tuesday, January 15 2013 2:04am PST
