tecznotes

Michal Migurski's notebook, listening post, and soapbox.

Apr 13, 2008 2:24am

index supercuts

Andy has a collection of fanboy supercuts, a "genre of video meme, where some obsessive-compulsive superfan collects every phrase/action/cliche from an episode (or entire series) of their favorite show/film/game into a single massive video montage." His collection includes some of the excellent and bizarre Lovelines isolation studies by Chuck Jones.

I'm reminded of how these constitute a kind of search index, a concept first introduced to me 11 years ago via Brian Slesinsky's Webmonkey article, Roll Your Own Search Engine. That was the first of many demystifications of big, web-scale technology for me. The thread running through all these fan cuts is the inverted index, the same concept introduced in that ancient article. An inverted index maps elements such as words to the locations where they occur in a data corpus. Each of the pieces Andy links to is a kind of inverted index, pointing to the locations of obscenities, audible inhalations, Wilhelm screams, and so on.
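To make the comparison concrete, here is a minimal sketch of an inverted index over a timed transcript, in Python. Everything in it is an assumption for illustration: the transcript lines are invented, and the names build_inverted_index and supercut are hypothetical, not taken from the Webmonkey article or from any of the videos.

```python
from collections import defaultdict

# Hypothetical corpus: (episode, seconds into the episode, spoken line).
# These lines are invented for illustration only.
TRANSCRIPT = [
    ("ep1", 12, "oh no not again"),
    ("ep1", 95, "come on"),
    ("ep2", 40, "oh come on"),
    ("ep2", 310, "not again"),
]


def build_inverted_index(transcript):
    """Map each word to the list of (episode, seconds) where it is spoken."""
    index = defaultdict(list)
    for episode, seconds, line in transcript:
        # set() records a word once per location even if the line repeats it.
        for word in set(line.split()):
            index[word].append((episode, seconds))
    return index


def supercut(index, word):
    """Return every location of a word, sorted by episode and then by time."""
    # .get() avoids creating an empty entry for words that never occur.
    return sorted(index.get(word, []))


index = build_inverted_index(TRANSCRIPT)
print(supercut(index, "again"))
```

Looking up a word returns the list of places to cut to, which is roughly what a supercut editor assembles by hand: the forward direction is episode to words, and the index inverts it to word to episodes and timestamps.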

The other thing it reminded me of was Simon Winchester's excellent book, The Professor and the Madman, an account of W.C. Minor's assistance in constructing the first edition of the Oxford English Dictionary. Minor was a confined lunatic with an extensive personal library, and the OED required that every sense of every word it defined be traceable to an original, printed quotation. These quotations were crowd-sourced from literate Englishmen of the time, but Minor's contribution went above and beyond: he noted interesting words as he read, constructing an inverted index of his library for OED-worthy terms. When dictionary editor James Murray needed a quotation for a particular word, there was a good chance Minor had already encountered and indexed it.

The works pointed to by Andy's blog post (and additions in the comments) are a special form of indexing, made possible by cheap communication and digital media. Let's hope the RIAA/MPAA don't fuck everything for an emerging form of media consumption.

Comments (3)

  1. I've already seen a number of videos forced offline by their copyright owners, even though they're clearly not infringement. In particular, all the Arrested Development supercuts have been pulled, probably in a sweep of all videos mentioning the show's name. This is a huge loss, considering the time it takes to make each video.

    Posted by Andy Baio on Sunday, April 13 2008 12:14pm EDT

  2. People should counter-notice if they feel their use is fair... if they get sued, I'm sure we can find someone to represent them pro bono. This would be an interesting test case (check out the Kelly v. Arriba Soft case)

    Posted by joe on Sunday, April 13 2008 12:41pm EDT

  3. Is counter-noticing a simple matter of filling out a form as well? I assume that the AD takedowns Andy mentions were a wholesale affair, part of a largely automated search-and-notify process. What if YouTube or someone else made it equally easy to counter-notify? I'm picturing something like http://gethuman.com with target mailing addresses for the resulting forms.

    Posted by Michal Migurski on Sunday, April 13 2008 12:56pm EDT

