tecznotes

Michal Migurski's notebook, listening post, and soapbox. Subscribe to this blog. Check out the rest of my site as well.

Feb 8, 2007 5:07pm

polite loops

A recent message to the Yahoo JSON user group got me thinking about social norms as programming techniques. The question was about iterations over long sets of data:

I am trying to iterate through a big JSON variable (about 1500 nodes). It works but FF pops up with the message saying the script is not responding (A script on this page may be busy... do you want to stop the script, debug, continue). If I select continue, it works fine. It is just that the iteration takes a bit of time to go through all the nodes. Is there a way to avoid the above? (Carl Harroch)

There's a handy technique I picked up from Twisted Python that I like to call polite loops. Polite, because sharing resources in an environment like a browser takes tact. When you're at a party with an open bar, you don't hog the bartender's attention, downing pint after pint while he's force to wait on you to have your fill. You're polite; you get your Newkie Brown and walk away so someone else can order.

I've found a similar approach especially useful in languages typically used in web browsers, like Actionscript and Javascript. A greedy loop in a typical web browser affects not only the page it's on, but any other page open at the same time. There are usually allowances for this, such as messages that offer you the ability to kill a script in progress. These only appear after a prolonged period of uselessness, so relying on them is a bad idea.

Twisted's built-in event loop, the Reactor, is single-threaded but built to handle asynchronous events such as network traffic or user input. It does this by encouraging finely-grained function calls that hand control back to the Reactor frequently. For example, this recursive function:

def perturbItems(items):
    if len(items):
        item.pop().perturb()
        perturbItems(items)

...might be rewritten like this:

def perturbItemsPolitely(items):
    if len(items):
        item.pop().perturb()
        Reactor.callLater(0, perturbItemsPolitely, items)

It continues to do exactly the same thing: perturbItems() takes the last item, perturbs it (perturbation might be a lengthy and expensive procedure), then passes the remainder of the list back recursively. Instead of calling itself directly, it schedules a call with the Reactor so that other pending or more urgent tasks can be handled in the meantime. The Javascript analogues could look like this:

function perturbItems(items)
{
    if(items.length) {
        items.pop().perturb();
        perturbItems(items);
    }
}

...and the polite version:

function perturbItemsPolitely(items)
{
    var perturbItem = function()
    {
        if(items.length) {
            items.pop().perturb();
            window.setTimeout(perturbItem, 100);
        }
    }
    perturbItem();
}

This function sets up a closure and uses the Javascript window.setTimeout() method to introduce a 100 msec delay between calls. The short delay greatly increases the amount of time the function needs to run, but is also enough of a breather to allow other scripts or user input to have an effect. It prevents the "this script is running slowly..." issue.

Another way to write the same function does it by chunks:

function perturbItemsPolitely(items)
{
    var perturbChunk = function()
    {
        var start = new Date();
        while(items.length) {
            items.pop().perturb();
            if(start.getTime() + 100 < new Date().getTime()) {
                window.setTimeout(perturbChunk, 100);
                break;
            }
        }
    }
    perturbChunk();
}

Same function, same closure. This time, the function perturbs as many items as it can (note the while loop) before reaching 100 msec and introducing a 100 msec rest. Best case, it's no slower than the original recursive loop. Worst case, it takes twice as long with plenty of rest stops. Going back to the original post above, I'd solve the problem by introducing two polite loops: one to request the 1,500 element data set in chunks of a hundred, and a second to iterate over those hundred-element chunks with occasional rest stops.

Comments

Sorry, no new comments on old posts.

December 2024
Su M Tu W Th F Sa
    

Recent Entries

  1. Mapping Remote Roads with OpenStreetMap, RapiD, and QGIS
  2. How It’s Made: A PlanScore Predictive Model for Partisan Elections
  3. Micromobility Data Policies: A Survey of City Needs
  4. Open Precinct Data
  5. Scoring Pennsylvania
  6. Coming To A Street Near You: Help Remix Create a New Tool for Street Designers
  7. planscore: a project to score gerrymandered district plans
  8. blog all dog-eared pages: human transit
  9. the levity of serverlessness
  10. three open data projects: openstreetmap, openaddresses, and who’s on first
  11. building up redistricting data for North Carolina
  12. district plans by the hundredweight
  13. baby steps towards measuring the efficiency gap
  14. things I’ve recently learned about legislative redistricting
  15. oh no
  16. landsat satellite imagery is easy to use
  17. openstreetmap: robots, crisis, and craft mappers
  18. quoted in the news
  19. dockering address data
  20. blog all dog-eared pages: the best and the brightest

Archives