tecznotes
Michal Migurski's notebook, listening post, and soapbox.
Aug 22, 2004 2:56am
temporary troubles with In The News
I discovered today that Google had made some small tweaks to the
HTML structure of their news page. The change interrupted my ability
to scrape their site between the afternoon of August 20th and
the morning of August 21st.
Hopefully the gap won't be noticeable by tomorrow, once my fix works its
way through various levels of caching, but it does highlight the difficulty
of working with capricious methods like page scraping. I'm currently
looking at the feasibility of doing a similar visualization investigation
with Open Secrets, and troubles
with unexpected modifications to data formats underscore the usefulness
of web services with stable APIs.
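As a rough illustration of why scraping is so brittle, here is a minimal
sketch of a scraper that at least fails loudly when the markup shifts
underneath it. This is not the code behind In The News, and the
`headline` class name, the `LayoutChangedError` exception, and the
`extract_headlines` helper are all made up for the example; the real
news page's markup is not shown here.

```python
from html.parser import HTMLParser


class LayoutChangedError(Exception):
    """Raised when the page no longer matches the structure we expect."""


class HeadlineParser(HTMLParser):
    """Collects the text of each <a> tag carrying a given CSS class.

    The class name is a hypothetical stand-in for whatever hook a real
    page happens to offer.
    """

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.headlines = []
        self._in_headline = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        # A valueless class attribute parses as None, hence the `or ""`.
        classes = (dict(attrs).get("class") or "").split()
        if tag == "a" and self.css_class in classes:
            self._in_headline = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_headline:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_headline:
            self.headlines.append("".join(self._buffer).strip())
            self._in_headline = False


def extract_headlines(html, css_class="headline", minimum=1):
    """Return headline strings, or raise if the expected markup is gone."""
    parser = HeadlineParser(css_class)
    parser.feed(html)
    parser.close()
    if len(parser.headlines) < minimum:
        # Finding nothing more likely means the layout changed than that
        # there is no news, so stop instead of storing an empty result.
        raise LayoutChangedError(
            "found %d headline(s), expected at least %d"
            % (len(parser.headlines), minimum)
        )
    return parser.headlines
```

A check like this does not prevent an outage, but it turns a silent gap
in the data into an error that shows up the first time a fetch runs
against the changed page.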
"Small pieces, loosely joined": Flickr gets it, even Google gets it
with their normal search service. Here's hoping that Google News moves
out of beta soon.