Michal Migurski's notebook, listening post, and soapbox. Subscribe to this blog. Check out the rest of my site as well.

May 1, 2016 8:40pm

notes on debian packaging for ubuntu

I’ve devoted some time over the past month to learning how to create distribution packages for Ubuntu, using the Debian packaging system. This has been a longstanding interest for me since Dane Springmeyer and Robert Coup created a Personal Package Archive (PPA) to easily and quickly distribute reliable versions of Mapnik. It’s become more important lately as programming languages have developed their own mutually-incompatible packaging systems like npm, Ruby Gems, or PyPi, while developer interest has veered toward container-style technology such as Vagrant or Docker. My immediate interest comes from an OpenAddresses question from Waldo Jaquith: what would a packaged OA geocoder look like? I’ve watched numerous software projects create Vangrantfile or Dockerfile scripts before, only to let those fall out of date and eventually become a source of unanswerable help requests. OS-level packages offer a stable, fully-enclosed download with no external dependencies on 3rd party hosting services.

The source for the package I’ve generated can be found under openaddresses/pelias-api-ubuntu-xenial on Github. The resulting package is published under ~openaddresses on Launchpad.

What does it take to prepare a package?

The staff at Launchpad, particularly Colin Watson, have been helpful in answering my questions as I moved from getting a tiny “hello world” package onto a PPA to wrapping up Mapzen’s worldwide geocoder software, Pelias.

I started with a relatively-simple tutorial from Ask Ubuntu, where a community member steps you through each part of wrapping a complete Debian package from a single shell script source. This mostly worked, but there are a few tricks along the way. The various required files are often shown inside a directory called “DEBIAN”, but I’ve found that it needs to be lower-case “debian” in order to work with the various preparation scripts. The control file, debian/control, is the most important one, and has a set of required fields arranged in stanzas that must conform to a particular pattern. My first Launchpad question addressed a series of mistakes I was making.

The file debian/changelog was the second challenge. It needs to conform to an exact syntax, and it’s easiest to have the utility dch (available in the devscripts package) do it for you. You need to provide a meaningful version number, release target, notes, and signature for this file to work. The release target is usually a Debian or Ubuntu codename, in this case “xenial” for Ubuntu’s 16.04 Xenial Xerus release. The version number is also tricky; it’s generally assumed that a package maintainer is downstream from the original developer, so the version number will be a combination of upstream version and downstream release target, in my case Pelias 2.2.0 + “ubuntu” + “xenial”.

A debian/rules file is also required, but it seems to be sufficient to use a short default file that calls out to the Debian helper script dh.

I have not been able to determine how to test my debian director package locally, but I have found that the emails sent from Launchpad after using dput to post versions of my packages can be helpful when debugging. I tested with a simple package called “hellodeb”; here is a complete listing of each attempt I made to publish this package in my “hello” PPA as I learned the process.

My second Launchpad question concerned the contents of the package: why wasn’t anything being installed? The various Debian helper scripts try to do a lot of work for you, and as a newcomer it’s sometimes hard to guess where it’s being helpful, and where it’s subtly chastising you for doing the wrong thing. For example, after I determined that including an “install” make target in the default project Makefile that wrote files to $DESTDIR was the way to create output, it turned out that my attempt to install under /usr/local was being thwarted by dh_usrlocal, a script which enforces the Linux filesystem standard convention that only users should write files to /usr/local, never OS-level packages. In the end, while it’s possible to simply list everything in debian/install, it seems better to do that work in a more central and easy-to-find Makefile.

Finally, I learned through trial-and-error that the Launchpad build system prevents network access. Since Pelias is written in Node, it is necessary to send the complete code along with all dependencies under node_modules to the build system. This ensures that builds are more predictable and reliable, and circumvents many of the SNAFU situations that can result from dynamic build systems.

Rolling a new release is a four step process:

  1. Create a new entry in the debian/changelog file using dch, which will determine the version number.
  2. From inside the project directory, run debuild -k'8CBDE645' -S (“8CBDE645” is my GPG key ID, used by Launchpad to be sure that I’m me) to create a set of files with names like pelias-api_2.2.0-ubuntu1~xenial5_source.*.
  3. From outside the project directory, run dput ppa:migurski/hello "pelias-api_2.2.0-ubuntu1~xenial5_source.changes” to push the new package version to Launchpad.
  4. Wait.

Now, we’re at a point where a possible Dockerfile is much simpler.

Apr 27, 2016 9:17pm

guyana trip report

Guyana is about to celebrate its 50th anniversary of independence from Great Britain. It’s not a big country, but it’s two-thirds rain forest and contains the oldest exposed rocks on earth, aged about two billion years. In the inland south of the country, the Wapichan and Macushi Amerindian tribes are asserting their place in the modern world using a combination of political, cultural, and technological means to map their territory. I just spent seven days there observing the material support work of Digital Democracy (Dd), where I’m a board member.

Saddle Mountain is culturally and spiritually significant to the tribe

Dd’s work in Guyana is arranged with the district council of elected leaders, and focuses on a group of environmental monitors covering this enormous 3,000 square mile savannah and the 7,000 square mile forest to the east. The monitors do two kinds of work: they map static sites using OpenStreetMap and Esri tools to compile a replacement for decades-old low-accuracy official maps, and they record evidence for economic/environmental abuses such as cross-border cattle rustling from Brazil to the west and destructive gold mining in the hilly forest rivers to the east. There are just a handful of monitors covering this area: Ezra, Tessa, Gavin, Timothy, Angelbert, and Phillip. They cover the entire Rupununi savannah, and their work typically centers on the northern village of Shulinab where we were guests of Nicholas and Faye Fredericks. Nick is the elected head of the village; he recently accepted the Equator Prize from the UN Development Program to indigenous organizations working on climate change issues.

The guest house where we stayed, newly-built to encourage visitors and possibly tourism

The monitoring work uses drones and phones to collect aerial imagery, and is definitely the more charismatic of the two efforts. Gold mining operations along the rivers in the hills alter the natural habitats of economically-important species, particularly through mercury dumping that can poison un-mined sections of downstream river. Dd’s Gregor MacLennan has been visiting Guyana for the past ten years, and has experimented with a variety of imagery techniques to create a temporal portrait of mining operations along the rivers in the hills. Due to persistent cloud cover, satellite imagery is only intermittently useful, while cheap, low-flying drones can be used in more kinds of weather. The current technology challenge centers on image-stitching; the best results we’ve achieved have come from the expensive Pix4D package, which demands an active internet connection for piracy control purposes. QZ recently wrote an excellent article about the monitoring work.

Dismantled Parrot quadcopter

Border monitoring relies on data collection with OpenDataKit (ODK) and smartphones. The smartphones work well in the savannah, especially now that the monitors have begun using motorcycle-to-USB charging kits to keep batteries powered. ODK is just okay; it’s a clumsy system, likes to have a central server for reporting and collating data, and seems to be designed for a institutional use-case that doesn’t match the need here.

Ron, Ezra, and the monitor team huddling around a smartphone demonstration of new ODK input forms

The static mapping effort is less exciting for outsiders, but more directly useful for the community. ODK has been in use here as well, but we’re working on some new concepts in peer-to-peer OSM editing using distributed logs with James Halliday a.k.a. Substack. James has previously been involved in Max Ogden’s Dat Project, he’s the author of a bunch of popular Node things including Browserify, and p2p architectures are his current jam. This part of the work, dubbed Peermaps, is super promising, and implements a number of concepts borrowed from Git and other distributed data tools. Gregor and James have submitted a talk proposal to State Of The Map U.S. The end goal here is a map of the territory, something the group has been collecting in Esri ArcMap format for the past few years. We saw a wall-sized laminated version of the map, and several area schools have requested print versions for use in classrooms. The data features accurate river lines, and numerous points of interest for cultural and natural features significant to the community. Mapping these features allows the community to support their territorial claims in discussions with the natural government more effectively than previous narrative descriptions.

Tessa, considered the most computer-savvy of the monitors, explained the territory map for me

Probably the most surprising thing about this trip for me was the sophistication of the political organization behind this effort. I had some “white savior complex” worries before I showed up because it’s often hard to tell from secondhand stories who’s really pushing the work, and whether partners are passive or active. In Guyana, the local community is organized into a district council of elected leaders (each called a Toshao) from the seventeen villages, strategy is determined through deliberation, and decisions are made and supported consistently. I spoke with people in Georgetown about the effort and word of Nick and Faye’s energy and creativity here has definitely gotten around. I was honored to be their guest. Dd provides material support, but credit for the recognition of the importance of data collection and the decision to support it through the monitoring program belongs to the community.

Angelbert, Ezra, Timothy, and the monitor team working on the data collection kit; James is on the far right

Ezra, one of the monitors, said at one point that “photos justify our land.” This phrase stuck with me.

Feb 15, 2016 1:27am

openaddresses population comparison

Lots going on with OpenAddresses since I last wrote about it in July. The continuous integration service is alive and well, we have downloadable collections of all addresses, the dot map is updating regularly, and the full collection has ticked over 220 million records.

Last year, Tom Lee (of Sunlight Foundation and Mapbox) made an offhand estimate of two people per address which has had me thinking about completeness and coverage estimation. At that number, we’re at approximately 3% of global addresses. Fortunately, there are a few gridded population datasets with worldwide coverage that make it possible to estimate coverage more precisely:

G-Econ is the smallest data set, and as a simple spreadsheet it’s the easiest to get started with. It has estimates through 2005 which is close enough for an experiment, and a number of interesting data columns beyond simple population counts about economic output, soil type, availability of water, and climate.

Using G-Econ, it’s possible to generate address/person comparisons for places where OpenAddresses has data, such as Europe:

From this image, it looks like Tom’s estimate of 0.5 addresses per person holds for much of France, Spain, and Denmark, but not at all for Poland where it’s lower at 0.2 (five people per address). The values for Estonia are strange, with almost two addresses per person in certain grid squares.

In the Western U.S., we can see the effect of overcounting Montana addresses via overlapping statewide and county sources:

I’m just getting started on this analysis, and hope to create fresh data on a regular basis as we generate scheduled downloads of OA data. I’d like to understand more about the relationships between population density and address availability, and potentially switch to the more complex and current GPWv4 dataset if this is interesting.

For now, check out these two things:

Jan 31, 2016 11:59pm

blog all oft-played tracks VII

This music:

  1. made its way to iTunes in 2015,
  2. and got listened to a lot.

I’ve made these for 2014, 2013, 2012, 2011, 2010, and 2009. Also: everything as an .m3u playlist.

1. Grimes: Flesh without Blood

2. Trust: Shoom

3. Klatsch!: God Save The Queer

4. Zed’s Dead: Lost You

5. The Communards: Disenchanted

6. Delia Gonzalez & Gavin Russom: Relevee (Carl Craig Remix)

7. The Smiths: Rubber Ring

8. Hot Chip: Need You Now

9. The All Seeing I: 1st Man In Space

10. Missy Elliott: WTF (Where They From) ft. Pharrell Williams

Dec 1, 2015 1:33pm

week 1,984: back to the map

After 2½ years as Code for America CTO, I’m moving on to the next thing. Starting December 14, I’ll be joining a crew of former Stamen colleagues & clients, CfA friends, OpenStreetMappers, and co-geobreakfasters at Mapzen, part of Samsung Accelerator. If Mapzen was a game show, it’d be This Is Your Life. I’ll be combining my background in open source mapping and my more recent experience working on CfA technology products to lead a team making writings, demos, tools, and entry points for Mapzen’s work on routing, search, transit, and the brainmelting beauty of Tangram. We’re actively hiring (especially front-end developers), so please get in touch.

I will miss Code for America greatly, particularly the technology and product crew we built to deliver new communications and engagement approaches for digital government, the three years of fellowship classes we collaborated with, the whole staff of people making it work, and that one time my team dressed like me for April Fools.

This month is an especially hard time to go, with a major victory from Dan Hon on Child Welfare Services technology procurement—if you are a California design or dev shop, bid on this project to literally save children’s lives. It’s also an auspicious time to go, with a few key colleagues like Cyd Harrell and Frances Berriman heading out and a break between the 2015 and 2016 fellowship classes.


May 2016
Su M Tu W Th F Sa

Recent Entries

  1. notes on debian packaging for ubuntu
  2. guyana trip report
  3. openaddresses population comparison
  4. blog all oft-played tracks VII
  5. week 1,984: back to the map
  6. bike eleven: trek roadie
  7. code like you don’t have the time
  8. projecting elevation data
  9. the bike rack burrito n’ beer box
  10. a historical map for moving bodies, moving culture
  11. the other openstreetmap churches post
  12. platforminess, smartness, and meaningfulness
  13. writing a new continuous integration service for openaddresses
  14. state of the map 2015
  15. bike ten: schwinn touring, v2
  16. blog all oft-played tracks VI
  17. 2015 fellowship reader
  18. bike ten: schwinn touring
  19. more open address machine
  20. open address machine