Sunday, April 24, 2011

Should we avoid exceptions in JS ? (Or the problem with jsPerf micro-benchmarks)

In general my take on benchmarking is: it's too easy to screw up something. The biggest problem is that with normal programming/engineering it's easy to see if you're doing it right. With benchmarking you have some (plausibly looking or otherwise) numbers. I actually think that most of benchmarks we're seeing are flawed in some way. So it's very easy to jump to conclusions, only to be found guilty of screwing up some important aspect. So first of all I'd like to make the following statement:
I understand that benchmarks suck. Please be careful and don't blindly believe any benchmarking numbers. Yes. Even mine.
Now back to main story. Today my pull request for underscore was rejected. I've proposed nice call/cc like helper to do non-local return from JS functions. It should be especially helpful for breaking out of forEach/map loops. More ideally ES5 designers could have added some ability to terminate loops early. Like via returning some special value. But apparently they're too conservative. Maybe I'll post some rants about that some other day.

Anyway, my patch was rejected because underscore author does not want to use exceptions for control flow. His point is that exceptions are very slow.

But is that true? I'm not sure. If we want all performance we can get, then probably we shouldn't use forEach & friends in first place. And then ES3's break statement can actually break out of several loop nestings (via break to label). But note, that standard forbids breaking across function calls, otherwise it would be nice (and potentially faster) alternative to exceptions.

Part of the problem is that, apparentlyV8 forces heap allocation of frames of functions with try/catch block.

The fact is, any code that creates closures has potential to force heap allocation of stack frames (or parts of), which if done often enough will trigger (relatively) expensive GC pauses. That's because if closure is 'leaked' out of it's dynamic scope, it effectively out-lives function that created it. So any variables that are needed by closure cannot live on normal stack.

Lot's of tricks in modern JS interpreters allow them to avoid heap allocation in many important cases. But sometimes you still have to pay that price.

My point is: there is no other way to do non-local return in JS other than by throwing exception. And it doesn't seem to be too slow. I also expect JS-engine vendors to gradually optimize that case.

But even though I understand micro-benchmarking drawbacks. Here's my (potentially flawed) take on for-loop-with-break vs. forEach with throw problem.

First, here's jsPerf page on plain for vs. forEach performance. We see that on FF3.6 forEach is much slower than loop. But on V8 it's about same.

Second, here's jsPerf page on for with break vs. forEach with throw performance. We see that throwing out of loop is not too bad. It actually seems to be faster on Firefox and only 1.5-to-2 times slower on Chrome's V8.

Also note, that somehow manual inlining of add function produces much larger scores, which might indicate  major flow in this microbenchmark. In general of course we'd like to compile/optimize benchmark code once and then run it multiple times. But lack of automatic inlining in this very trivial case hints that we might be causing JIT to optimize it each time it's run. So feel free to play with that benchmarks more and find some flaw!

Note that there's some plausible evidence that throwing non-Errors is much faster. I think that's because when you throw Error instance, it causes runtime to collect backtrace and fill in 'stack' attribute, which seems to be much costlier then just throwing. In fact you can google for Java exception throwing performance and you'll discover that optimal throwing performance is reached by throwing 'prepared' exception instance. Exactly for same reason.

But returning to original question, I still feel that in some cases exceptions are perfectly good way to do some non-local returns. There are so many creative ways to use all language features. And blindly rejecting any of them, just because it seems to be slow is not smart, IMHO. Sure, exceptions have their cost, but sometimes avoiding them is costlier. And all trade-offs should be carefully analyzed and understood.