tecznotes

Michal Migurski's notebook, listening post, and soapbox.

Aug 21, 2004 7:56pm

temporary troubles with In The News

I discovered today that Google had made some small tweaks to the HTML structure of their news page. The change interrupted my ability to scrape their site between the afternoon of August 20th and the morning of August 21st. Hopefully the gap won't be noticeable by tomorrow, as my fix works its way through various levels of caching, but it does highlight the difficulties of working with capricious methods like page scraping. I'm currently looking at the feasibility of doing a similar visualization investigation with Open Secrets, and troubles with unexpected modifications to data formats underscore the usefulness of web services with stable APIs. "Small pieces, loosely joined": Flickr gets it, and even Google gets it with their normal search service. Here's hoping that Google News moves out of beta soon.
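To make the fragility concrete, here is a minimal sketch of the kind of failure described above. It is not the actual In The News scraper: the `headline` class name, the sample markup, and the `extract_headlines` and `StructureChanged` names are all hypothetical, chosen only for illustration. The idea is that a scraper tied to one piece of markup should at least fail loudly when that markup disappears, rather than quietly returning nothing.

```python
from html.parser import HTMLParser


class StructureChanged(Exception):
    """Raised when the page no longer matches the markup the scraper expects."""


class HeadlineParser(HTMLParser):
    """Collects the text of every <a> tag carrying a given class name."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.headlines = []
        self._buffer = None  # text chunks gathered while inside a matching <a>

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "a" and self.css_class in classes:
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._buffer is not None:
            text = "".join(self._buffer).strip()
            if text:
                self.headlines.append(text)
            self._buffer = None


def extract_headlines(html, css_class="headline"):
    """Return headline strings, or raise if the expected markup is gone."""
    parser = HeadlineParser(css_class)
    parser.feed(html)
    parser.close()
    if not parser.headlines:
        # An empty result most likely means the page structure changed.
        raise StructureChanged(f"no <a class={css_class!r}> elements found")
    return parser.headlines
```

With a page that still uses the assumed class name, `extract_headlines` returns the link texts; rename the class on the page and the same call raises `StructureChanged`. That is roughly what a small HTML tweak does to a scraper, and it is the kind of breakage a stable, documented API is meant to prevent.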


