tecznotes

Michal Migurski's notebook, listening post, and soapbox.

Nov 19, 2014 10:07pm

open address machine

The OpenAddresses project is super-interesting right now:

OpenAddresses is a global repository for open address data. In good open source fashion, OpenAddresses provides a space to collaborate. Today, OpenAddresses is a downloadable archive of address files, it is an API to ingest those address files into your application and, more than anything, it is a place to gather more addresses and create a movement: add your government’s address file and if there isn’t one online yet, petition for it. —Launching OpenAddresses.

OA is the free and open global address collection, but it’s just getting off the ground. Ian Dees, a longtime OpenStreetMap contributor, kicked off the project early this year when OSM balked at bulk address imports. It’s more sensible as a separate project anyway.

I’ve been working on data.openaddresses.io to make the project more legible and responsive.

I’m about six months late to the party, but there’s a ton to do right now. Thinking back on my own involvement in OSM, I remembered that around 2006 the street map tiles were being updated infrequently, and my own willingness to add data was gated by the turnaround time of seeing my input on the real, live map. I’d add some stuff, then twiddle my thumbs for days (or weeks) while the render refreshed. My satisfaction from adding data improved every time OSM’s rendering stack got faster at re-rendering. Seeing your effect on the data set is an important motivational factor.

OA has a similar issue for me. It’s implemented as a giant bag of JSON files stored in GitHub, so it’s not immediately obvious where the data lives, how up-to-date it is, or (if you’re submitting new files) whether a data source even works. The processing code works, but it’s just as unclear how to make all the pieces fit together.
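To make the "bag of JSON files" problem concrete, here is a minimal sketch of the kind of status listing that helps: walk a local checkout of source files and report which ones parse and carry the fields a run would need. This is not the project's actual code. The function name `summarize_sources`, the directory layout, and the required keys `data` and `conform` are assumptions for illustration; check the real source schema before relying on them.

```python
import json
from pathlib import Path

# Assumed field names for an address source file; adjust to the real schema.
REQUIRED_KEYS = ("data", "conform")


def summarize_sources(source_dir):
    """Return one status row (a dict) per *.json file directly under source_dir."""
    rows = []
    for path in sorted(Path(source_dir).glob("*.json")):
        try:
            source = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, ValueError) as err:
            # ValueError covers both invalid JSON and bad UTF-8.
            rows.append({"file": path.name, "ok": False,
                         "problem": f"unreadable: {err}"})
            continue

        if not isinstance(source, dict):
            rows.append({"file": path.name, "ok": False,
                         "problem": "top level is not a JSON object"})
            continue

        missing = [key for key in REQUIRED_KEYS if key not in source]
        problem = "missing " + ", ".join(missing) if missing else ""
        rows.append({"file": path.name, "ok": not missing, "problem": problem})
    return rows


if __name__ == "__main__":
    # "sources" is a hypothetical path to a local checkout of the JSON files.
    for row in summarize_sources("sources"):
        status = "ok" if row["ok"] else "FAIL"
        print(f'{status:4}  {row["file"]}  {row["problem"]}')
```

A listing like this only says whether a file is well-formed; whether the remote data behind it still downloads and converts is a separate, slower check.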

I have been working on machine, a harness for running the whole process on a more regular cycle. There are a bunch of interesting moving pieces.

I’ve taken Andy Allan’s chef advice to heart and created a chef recipe collection for preparing OA to run on a bare Ubuntu 14.04 machine. Chef is a no-brainer for me now, and I use it for everything that stands any chance of being important. Andy says:

Configuration management really kicks in to its own when you have dozens of servers, but how few are too few to be worth the hassle? It’s a tough one. Nowadays I’d say if you have only one server it’s still worth it – just – since one server really means three, right? The one you’re running, the VM on your laptop that you’re messing around with for the next big software upgrade, and the next one you haven’t installed yet.

If you want to add a skeletal chef script to any existing repository, start here:

git pull https://github.com/migurski/chefbase.git master

It’s now possible to run the whole OA codebase on a scratch machine, which means that once each week I can start an EC2-XXXL server and have it set up with complete OA code in minutes. It takes a few hours to run everything. We can keep data.openaddresses.io up-to-date with the status of the data, including a fresh map of data from US states and counties (even though OA is international), a complete listing of cached and processed status for all data, and small data samples to provide hints for correctly mapping (“conforming”) source data to OA’s needs.

There remains a lengthy ticket backlog, but I am hoping that OA provides a way to better expose and unify the world’s municipal government spatial data. Today, addresses. Tomorrow, parcels.

Apr 24, 2014 9:07am

making the right job for the tool

Near the second half of most nerd debates, your likelihood of hearing the phrase “pick the right tool for the job” approaches 100% (cf. frameworks, rails, more rails, node, drupal, jquery, rails again). “Right tool for the job” is a conversation killer, because no shit. You shouldn’t be using the wrong tool. And yet, working in code is working in language (naming things is the second hard problem) so it’s equally in-bounds to debate the choice of job for the tool. “Right tool” assumes that the Job is a constant and the Tool is a variable, but this is an arbitrary choice and notably contradicted by our own research into the motivations of idealistic geeks. Volunteers know the tools they know, and are looking for ways to use their existing powers for good. They are selecting a job to fit the tool. Martin Seay’s brilliant essay on pop music, Ke$ha’s TiK ToK, pro wrestling and conservatism critiques the type of realist resignation that assumes the environment (the job) is immutable:

It is a sterling example of what a number of commentators—I’ll refer you to k-punk—have characterized as the fantasy of realism: an expedient and comfortable confusion of what is politically difficult with what is physically impossible. … This kind of “realism” offers something even more desirable than a clear-eyed assessment of your current circumstances, namely the feeling that you’ve made such an assessment, and that you’ve come away with the conclusion that this is as good as it gets. … This is professional wrestling again: the comforting notion that you know what you need to know, that everything is clear.

At some level, our tools come preselected. At Day Job, we have tool guy Eric Ries on our board, and by design stick to the universes of web scripting languages and user needs research. Going in to a government partnership, we know that the set of jobs for which we are suited is bounded by time and scale. Instead, we look for opportunities where governments are creating the wrong jobs based on the tools they have available. One example is Digital Front Door, an emerging project on publishing and content management where we’re looking at the intertwined evils of CMS software and omnibus vendor contracts. Given a late 90s consensus on content publishing, it seems inevitable that every website project must result in a massive single-source RFP, design, and migration effort. So much risk to pile onto a single spot. How would a city government change the scope of a job if it knew it had other tools available? Would the presence of static site generators and workflows based on git-style sharing models influence the redefinition of the job to be smaller, lower-risk, more agile? I think yes.

“Pick the right tool” is common-sense advice that elides a more interesting set of possibilities. When you can redefine the job, the best tool may be the one you already have.

Apr 12, 2014 10:03pm

the hard part

The hard part of coming to State of the Map is that I’m only a little bit connected to the OpenStreetMap project right now, and not spending most of my time on geospatial open source like I used to. I’ll come back to it, but today I’ve had a number of conversations about projects of mine, their status, and whether I have abandoned them. Metropolitan Extracts have not yet been run during 2014, TileStache is stable but has a few outstanding pull requests, and it’s high time I merged Walking Papers with Stamen’s more-stable Field Papers offshoot. Thankfully Vector Tiles remain happily running on the US OSM server.

I wish I could say I had easy answers for these projects; they seem genuinely useful to people but not something I can maintain at the moment and not something I can exactly delegate at CfA.

Apr 5, 2014 11:36am

end the age of gotham-everywhere

If you’ve attended a movie or generally looked at things in the past five years, you’ll know that we’re in the age of ubiquitous gotham.

The MPAA has switched to Gotham for their ratings screens. If you pay attention to the previews before a movie, it’s now the go-to font for all movie titles. Obama’s 2008 campaign standardized on Gotham and Sentinel (another H&FJ face) for their celebrated visual identity. In 2012, Obama switched things up and standardized on Sentinel and Gotham. Code for America uses these two excellent fonts on our website via the H&FJ cloud service.

Gotham is the inception horn of typefaces.

It’s a major, inescapable part of the visual landscape, and I think it needs a boycott.

You might be interested to know that Hoefler & Frere-Jones, the celebrated type foundry that created the typeface, is going through an acrimonious divorce right now. Do yourself a quick favor and read Tobias Frere-Jones’s opposition to Hoefler’s motion to dismiss. The short version is that Frere-Jones joined Hoefler’s foundry over ten years ago, and brought with him a set of “dowry” fonts that he had developed for his previous employer. In return, Hoefler allegedly promised 50% of the company, changed its name to Hoefler & Frere-Jones, and spent a complete decade referring to Frere-Jones as “partner” while quietly stalling on making the status official. Frere-Jones finally got fed up waiting and forced the issue, things went pear-shaped, and the studio is now called Hoefler & Co.

I’ve had my issues with H&FJ in the past, but this current situation is an object lesson in the perils of half-assing legal relationships. Hoefler is a celebrated designer himself, but in this story plays the role of jerkface suit. Frere-Jones is probably too naïve for his own good, but here he is serving as a critical example for why you should always get it in writing, even just a one-sentence napkin scrawl of intent. I left my own former company Stamen Design in 2012 after nine years, and my experience working with Eric and Shawn and eventually departing was a cakewalk thanks to Eric’s above-board handling of my 25-year-old self in 2003.

So, to draw attention to the need for businesses to treat designers with respect, and the need for designers to insist that business processes happen by the book, I think it’s time we put an end to the age of Gotham everywhere.

Apr 1, 2014 10:14pm

on this day

Today, we wrapped up a bunch of fake cities from the goofy tail end of our Code for America site redesign process into a 2015 cities April Fools joke.

Today, everyone on the tech team dressed up as me, right down to the yellow DURR.

Today, Frances and I merged streams and made progress on a proof-of-concept we’ve been thinking about for the Digital Front Door project. Lane wrote that linked post.

Today, I participated on a panel hosted by Zipfian Academy on Data Science For Social Good, with folks who work in education, power, petitions, and health. These are my notes:

Code for America works at the source of data: cities, governments, and the primary source data they produce.
Governments are famously behind on technology, while Silicon Valley is famously out front. So, the biggest technical challenge we face is bridging the trough of disillusionment on the rollercoaster ride of the Gartner Hype Cycle.
We try to build that bridge by working on things that matter to cities, like public records, public services, and communications.
At the same time, we are building and supporting an international community of civic hackers, through projects like the Brigade and a new API for collecting and hosting information about civic tech projects.
Right now, one of the emergent data science issues we see is ETL or Extract/Transform/Load. At our weekly Open Oakland Brigade hack night, people like former fellow Dave Guarino are helping city staff publish ethics commission data.
Come to the Open Oakland meeting every Tuesday, 6:30pm at city hall to participate. Search Google for “ETL for America” to learn more.
If you live in SF, come to the SF Brigade meeting every Wednesday, 6:30pm at Code for America, 9th & Natoma.

All in all a good day.
