tecznotes

Michal Migurski's notebook, listening post, and soapbox. Subscribe to this blog. Check out the rest of my site as well.

Feb 8, 2007 9:07am

polite loops

A recent message to the Yahoo JSON user group got me thinking about social norms as programming techniques. The question was about iterations over long sets of data:

I am trying to iterate through a big JSON variable (about 1500 nodes). It works but FF pops up with the message saying the script is not responding (A script on this page may be busy... do you want to stop the script, debug, continue). If I select continue, it works fine. It is just that the iteration takes a bit of time to go through all the nodes. Is there a way to avoid the above? (Carl Harroch)

There's a handy technique I picked up from Twisted Python that I like to call polite loops. Polite, because sharing resources in an environment like a browser takes tact. When you're at a party with an open bar, you don't hog the bartender's attention, downing pint after pint while he's force to wait on you to have your fill. You're polite; you get your Newkie Brown and walk away so someone else can order.

I've found a similar approach especially useful in languages typically used in web browsers, like Actionscript and Javascript. A greedy loop in a typical web browser affects not only the page it's on, but any other page open at the same time. There are usually allowances for this, such as messages that offer you the ability to kill a script in progress. These only appear after a prolonged period of uselessness, so relying on them is a bad idea.

Twisted's built-in event loop, the Reactor, is single-threaded but built to handle asynchronous events such as network traffic or user input. It does this by encouraging finely-grained function calls that hand control back to the Reactor frequently. For example, this recursive function:

def perturbItems(items):
    if len(items):
        item.pop().perturb()
        perturbItems(items)

...might be rewritten like this:

def perturbItemsPolitely(items):
    if len(items):
        item.pop().perturb()
        Reactor.callLater(0, perturbItemsPolitely, items)

It continues to do exactly the same thing: perturbItems() takes the last item, perturbs it (perturbation might be a lengthy and expensive procedure), then passes the remainder of the list back recursively. Instead of calling itself directly, it schedules a call with the Reactor so that other pending or more urgent tasks can be handled in the meantime. The Javascript analogues could look like this:

function perturbItems(items)
{
    if(items.length) {
        items.pop().perturb();
        perturbItems(items);
    }
}

...and the polite version:

function perturbItemsPolitely(items)
{
    var perturbItem = function()
    {
        if(items.length) {
            items.pop().perturb();
            window.setTimeout(perturbItem, 100);
        }
    }
    perturbItem();
}

This function sets up a closure and uses the Javascript window.setTimeout() method to introduce a 100 msec delay between calls. The short delay greatly increases the amount of time the function needs to run, but is also enough of a breather to allow other scripts or user input to have an effect. It prevents the "this script is running slowly..." issue.

Another way to write the same function does it by chunks:

function perturbItemsPolitely(items)
{
    var perturbChunk = function()
    {
        var start = new Date();
        while(items.length) {
            items.pop().perturb();
            if(start.getTime() + 100 < new Date().getTime()) {
                window.setTimeout(perturbChunk, 100);
                break;
            }
        }
    }
    perturbChunk();
}

Same function, same closure. This time, the function perturbs as many items as it can (note the while loop) before reaching 100 msec and introducing a 100 msec rest. Best case, it's no slower than the original recursive loop. Worst case, it takes twice as long with plenty of rest stops. Going back to the original post above, I'd solve the problem by introducing two polite loops: one to request the 1,500 element data set in chunks of a hundred, and a second to iterate over those hundred-element chunks with occasional rest stops.

Comments

Sorry, no new comments on old posts.

April 2014
Su M Tu W Th F Sa
  
   

Recent Entries

  1. the hard part
  2. end the age of gotham-everywhere
  3. on this day
  4. write code
  5. managers are awesome / managers are cool when they’re part of your team
  6. bike seven: french parts
  7. being a client
  8. bike seven: building a cargo bike
  9. blog all video timecodes: how buildings learn, part 3
  10. talk notes, urban airship speaker series
  11. john mcphee on structure
  12. blog all oft-played tracks V
  13. tiled vectors update, with math
  14. disposable development boxes: linux containers on virtualbox
  15. week 1,851: week one
  16. tilestache 0.7% better
  17. south end of lake merritt construction
  18. network time machine backups
  19. week 1,846: ladders
  20. documentation for tiled vectors in mapnik

Archives