tecznotes

Michal Migurski's notebook, listening post, and soapbox.

Apr 13, 2008 6:24am

index supercuts

Andy has a collection of fanboy supercuts, a "genre of video meme, where some obsessive-compulsive superfan collects every phrase/action/cliche from an episode (or entire series) of their favorite show/film/game into a single massive video montage." His collection includes some of the excellent and bizarre Lovelines isolation studies by Chuck Jones.

I'm reminded of how these constitute a kind of search index, a concept first introduced to me 11 years ago via Brian Slesinsky's Webmonkey article, Roll Your Own Search Engine. That was the first of many demystifications of big, web-scale technology for me. The thread running through all these fan cuts is the inverted index, identical to the concept introduced in that ancient article. An inverted index maps elements such as words to their source locations in a data corpus. Each of the pieces Andy links to is a kind of inverted index, pointing to locations of obscenities, audible inhalations, Wilhelm screams, and so on.
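For anyone who hasn't seen one spelled out, here is a minimal sketch of an inverted index in Python. It is not taken from Slesinsky's article; the function name `build_inverted_index`, the episode names, and the transcripts are all made up for illustration, and the "tokenizer" is just a lowercase split on whitespace.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each word to a list of (document id, word position) pairs.

    `documents` is a dict of {document id: text}. This is a hypothetical
    helper for illustration, not part of any library.
    """
    index = defaultdict(list)
    for doc_id, text in documents.items():
        # Naive tokenization: lowercase, split on whitespace.
        for position, word in enumerate(text.lower().split()):
            index[word].append((doc_id, position))
    return dict(index)


# Made-up "transcripts" standing in for a show's episodes.
episodes = {
    "episode_1": "come on come on",
    "episode_2": "oh come on",
}

index = build_inverted_index(episodes)

# Every place the word "come" occurs, as (episode, word position) pairs.
print(index["come"])
```

A supercut is roughly what you get by looking up one key in such an index and splicing together everything it points to, with timestamps in place of word positions.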

The other thing it reminded me of was Simon Winchester's excellent book, The Professor And The Madman, an account of W.C. Minor's assistance in constructing the first edition of the Oxford English Dictionary. Minor was a confined lunatic with an extensive personal library, and the OED required that every sense of a word in its definition be traceable to an original, printed quotation. These were crowd-sourced from literate Englishmen of the time, but Minor's contribution went above and beyond because he noted interesting words as he read, constructing an inverted index of his library for OED-worthy terms. When dictionary editor James Murray needed a quotation for a particular word, there was a good chance Minor had already encountered and indexed it.

The works pointed to by Andy's blog post (and additions in the comments) are a special form of indexing, made possible by cheap communication and digital media. Let's hope the RIAA/MPAA don't fuck everything for an emerging form of media consumption.

Comments (3)

  1. I've already seen a number of videos forced offline by their copyright owners, even though they're clearly not infringement. In particular, all the Arrested Development supercuts have been pulled, probably in a sweep of all videos mentioning the show's name. This is a huge loss, considering the time it takes to make each video.

    Posted by Andy Baio on Sunday, April 13 2008 4:14pm UTC

  2. People should counter-notice if they feel their use is fair... if they get sued, I'm sure we can find someone to represent them pro bono. This would be an interesting test case (check out the Kelly v. Arriba Soft case)

    Posted by joe on Sunday, April 13 2008 4:41pm UTC

  3. Is counter-noticing a simple matter of filling out a form as well? I assume that the AD takedowns Andy mentions were a wholesale affair, part of a largely automated search-and-notify process. What if YouTube or someone else made it equally easy to counter-notify? I'm picturing something like http://gethuman.com with target mailing addresses for the resulting forms.

    Posted by Michal Migurski on Sunday, April 13 2008 4:56pm UTC
