tag:blogger.com,1999:blog-21098996940468220242024-02-07T08:50:03.048+03:00Tales about Aliaksey's lifeAliaksei Kandratsenkahttp://www.blogger.com/profile/03257803952496012458noreply@blogger.comBlogger28125tag:blogger.com,1999:blog-2109899694046822024.post-29907141026204793772015-02-15T09:34:00.001+03:002015-02-15T09:34:30.454+03:00Visualizing perf profiles using pprof<div dir="ltr" style="text-align: left;" trbidi="on">
Perf is a pretty powerful profiling tool (in addition to its other features). But it is not as good as pprof at visualizing profiles.<br />
Now you can capture profiles using perf (including profiling already-running processes), but process them using pprof, with the help of the <a href="https://github.com/alk/perf2pprof">perf2pprof tool</a>. It is also available via RubyGems: simply run gem install perf2pprof and you're ready to rock.<br />
<br /></div>
Aliaksei Kandratsenkahttp://www.blogger.com/profile/03257803952496012458noreply@blogger.com0tag:blogger.com,1999:blog-2109899694046822024.post-73833542730922261522014-12-12T11:03:00.000+03:002014-12-12T11:03:11.642+03:00Better productivity writing golang code with help of supermegadoc<div dir="ltr" style="text-align: left;" trbidi="on">
At work I'm writing more and more golang code recently. It is a nice little language. But as a language noob I have yet to internalize which features are available in the standard library and how to use them. So I'm spending a lot of time staring at pages under golang.org/pkg looking for the types, functions, or methods I need.<br />
For <a href="https://github.com/couchbase/ns_server">my Erlang work</a> I'm a quite regular user of the <a href="https://github.com/alk/supermegadoc">supermegadoc</a> erlang integration (a video demo with my nice/odd accent is <a href="https://www.youtube.com/watch?v=cfec6u13twE">here</a>). And I was seriously missing something similar for go. Today I finally spent a few hours on it, and I have something that looks like what I need.<br />
<br />
Observe:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjVdw1Mcy6nGP1NxCA6BFauXWz7sGXkjXeVpDZxnz4To1U5fLJWKrp_ld_qwTuSuj3C6e-4lpUPoSX34LYo9gISsV9Ng-pV0MaRsdmRc7Yov9myrhENgkbFH4pQKJZzZOVevyd3dEd5LRQ/s1600/SuperGo-screen.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjVdw1Mcy6nGP1NxCA6BFauXWz7sGXkjXeVpDZxnz4To1U5fLJWKrp_ld_qwTuSuj3C6e-4lpUPoSX34LYo9gISsV9Ng-pV0MaRsdmRc7Yov9myrhENgkbFH4pQKJZzZOVevyd3dEd5LRQ/s1600/SuperGo-screen.png" height="189" width="320" /></a></div>
<br />
I can quickly find things that are visible to godoc. Just like with the other supermegadoc integrations, I can see function signatures and constant and variable values. And I can see whether a type is a struct, a "typedef", or an interface. From experience working with Erlang's supermegadoc, I know that this means I often don't even need to open the corresponding doc entry. It is often enough to see that it's there and (in the case of functions or methods) what its signature is.<br />
<br />
I expect my golang productivity to increase.<br />
<br />
Have a nice day folks!<br />
<br /></div>
Aliaksei Kandratsenkahttp://www.blogger.com/profile/03257803952496012458noreply@blogger.com0tag:blogger.com,1999:blog-2109899694046822024.post-80309713068896050442013-12-09T23:59:00.000+03:002013-12-09T23:59:26.780+03:00Massive power of (liblzma based) XZ archiver<div dir="ltr" style="text-align: left;" trbidi="on">
I've recently restarted gathering bitcoin market data. I'm grabbing samples of market depth every 3 seconds and I'm collecting trade events.<br />
<br />
Market depth samples can be quite large. Every mtgox sample appears to be about 90 kilobytes. So 1 hour of samples is about 100 megs of data, and a month is about 3 gigs. Which is a bit too much.<br />
<br />
gzip is able to compress that about 5x. But that's still a bit too large.<br />
<br />
I've found xz to really shine on that kind of data. More than a gig of data gets squeezed down to less than a meg! And what's extra cool is that xz is very quick to decompress. For static data like a btc market archive that's very useful.<br />
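The effect is easy to reproduce with Python's standard library (zlib for gzip-style DEFLATE, lzma for xz). The snapshot data below is synthetic, not real mtgox samples, but it is redundant over long ranges in the same way consecutive depth samples are:

```python
import json
import lzma
import zlib

# Fabricated "market depth" snapshots (hypothetical data): consecutive
# snapshots repeat with a period of ~101 samples, i.e. far beyond
# DEFLATE's 32 KB window, but well within xz's much larger dictionary.
snapshots = []
for i in range(1500):
    book = {"bids": [[1000 + (i % 101), size] for size in range(60)],
            "asks": [[1010 + (i % 101), size] for size in range(60)]}
    snapshots.append(json.dumps(book))
raw = "\n".join(snapshots).encode()

deflated = zlib.compress(raw, 9)     # roughly what gzip -9 achieves
xzed = lzma.compress(raw, preset=9)  # roughly what xz -9 achieves

print(len(raw), len(deflated), len(xzed))
```

On this kind of data xz finds the long-range repetition that DEFLATE's 32 KB window simply cannot see, which is why the ratios differ so dramatically.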
<br />
So quality compression does matter. And I just wanted to express my ultimate respect to the authors of that extremely useful software.<br />
<br />
Have a nice day and happy hacking!<br />
<br /></div>
Aliaksei Kandratsenkahttp://www.blogger.com/profile/03257803952496012458noreply@blogger.com0tag:blogger.com,1999:blog-2109899694046822024.post-768788870208649612013-09-22T03:39:00.000+03:002013-09-22T03:39:35.680+03:00Playing with Intel TSX<div dir="ltr" style="text-align: left;" trbidi="on">
I've recently got access to a box with an Intel Haswell CPU inside. And I was quite looking forward to playing with one of its most interesting features: hardware transactional memory. My particular interest is to see how cheap it is.<br />
<br />
My use case is per-processor data structures (e.g. malloc caches). Without explicitly binding threads to processors, there's only an optimistic way of doing it, which requires some synchronization to defend against the pessimistic case of a thread being rescheduled to a different cpu. That would look like: take the cpu id, lock its corresponding lock (which in most cases would be in cache and uncontended, and thus reasonably quick), and then do something with the per-cpu data. So in this approach we always pay some performance price, even if the majority of actual runs hit the fast path. The lack of really cheap optimistic locking makes that price significant, which makes the approach less attractive.<br />
<br />
So let's return to Intel's implementation of transactional memory (aka TSX). The <a href="http://en.wikipedia.org/wiki/Transactional_Synchronization_Extensions">Wikipedia article</a> describes it pretty well. My understanding is that it's expected to be most useful for somewhat coarse locks where multiple threads would normally contend for the lock yet touch different memory locations. E.g. imagine different threads touching different buckets of a hash table or different branches of a binary search tree. It can also be used as a compare-and-exchange operation that lets you process multiple memory locations at once. There's already glibc support that optimises pthread mutex operations, described in a typically <a href="https://lwn.net/Articles/534758/">nice lwn.net article</a>.<br />
<br />
My hope was that this feature would end up being even faster than atomic operations on the fastest path (everything in L1 cache), given its optimistic nature. And that it might be useful for the quick optimistic locking I'd like to have.<br />
<br />
You can see my test case <a href="https://gist.github.com/alk/6655145">here</a>. It simulates the fast path of a "lock" that guards a counter. There is no locking itself, just a check that the "lock" is free, which is what glibc's lock elision code does. And you can see how TSX allows avoiding actual locking. "On the other side of the ring" is code that changes the counter via a traditional compare-exchange atomic operation (no locking either, to give me purer numbers).<br />
<br />
On the box I have access to (with an Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz) I'm getting about 67 cycles per loop iteration for the TSX case. And about 27 cycles for the atomic CAS code (and the same for the more traditional locking instruction "xchg mem, register"). Note that it's very likely that larger transactions will have bigger overhead. Also note that a usual synchronized region is two atomic ops (the unlocking atomic operation being potentially significantly cheaper than the locking one), so in this limited case TSX appears to be somewhat competitive with traditional locking, but not faster.<br />
<br />
So TSX is not faster than a single atomic operation. Which breaks my hope of using it for quick optimistic locking. It is somewhat sad that on today's hardware there's seemingly no way to make the fast path of locks lightning fast without playing crazy tricks (e.g. a slow path that stops the lock-owner thread via a signal or ptrace, as JVM "biased locking" appears to do).<br />
<br />
Anyways, being slightly more than 2x slower than a simple atomic operation is pretty good news IMHO for the use cases TSX is designed for. And its "multi-word CAS" application appears to be very interesting and useful too. So I'm looking forward to using it somewhere.<br />
<br />
And finally I have to note that, especially at the beginning, debugging transactional memory code can be quite tricky and very weird. That's because a transaction is fully isolated while it runs, so there's no way to printf something to see why it fails, or to set a breakpoint inside it and inspect things. This hit me initially because my simplistic code wasn't at all prepared to handle transaction failures. I.e. my code was only supposed to test the fast path without any real-world synchronization contention. After a few minutes of struggling with it I realized that even otherwise conflict-free code will abort from time to time. For example, any interrupt (e.g. a timer tick) will abort an in-flight transaction, as will in fact any user- to kernel-space transition.<br />
<br />
So lesson number one is that debugging hardware transactional memory code should be done very carefully, especially if the code path differs significantly between the successful and aborted cases. I.e. imagine some real transaction that spans several layers of code, and consider that a debugger/printf will never be able to see or expose the "guts" of aborted transactions. And lesson number two is that aborts always have to be handled, even in toy code.<br />
<br />
Have a nice day and happy hacking.<br />
<br /></div>
Aliaksei Kandratsenkahttp://www.blogger.com/profile/03257803952496012458noreply@blogger.com0tag:blogger.com,1999:blog-2109899694046822024.post-71706869285571705762012-04-08T23:29:00.000+03:002012-04-08T23:29:31.710+03:00gpicker 2.2 is out!<div dir="ltr" style="text-align: left;" trbidi="on">Hello there! I've just made the long-overdue release of gpicker 2.2. Some notable changes are:<br />
<br />
<br />
<ul style="text-align: left;"><li>a new project type -- script -- which I'm using to handle multi-repository projects (i.e. couchbase)</li>
<li>implemented a poor man's isearch on steroids -- gpicker-isearch</li>
<li>big improvements for gpicker-imenu</li>
<li>more optimizations</li>
</ul><br />
<div><a href="https://savannah.gnu.org/projects/gpicker">Savannah project page</a> links to the download area with source .tar.{gz,bz2,xz} archives and binary .deb packages (built on lenny) for i386 and amd64. If you haven't heard of gpicker before, also check out <a href="http://github.com/alk/supermegadoc">supermegadoc</a>, which is a very convenient gpicker-using tool.</div><div><br />
</div></div>Aliaksei Kandratsenkahttp://www.blogger.com/profile/03257803952496012458noreply@blogger.com3tag:blogger.com,1999:blog-2109899694046822024.post-88799887615717857902011-12-15T11:25:00.000+03:002011-12-15T11:25:01.418+03:00Me and Gnome3<div dir="ltr" style="text-align: left;" trbidi="on">Hi. Quite a bit of time has passed since my last post. That was a busy time, with continued hard work on the (still forthcoming) Couchbase Server 2.0 release and, most importantly, I found a beautiful girl and got married!<br />
<br />
Anyway, I just got reminded that I should not forget to write something from time to time. And today's "hot" topic is Gnome 3.<br />
<br />
About a month ago (or was it 2? Time flies so weirdly with so much happening around me now) Debian Sid got Gnome 3. Even earlier it got some components of Gnome 3. Most noticeable was the upgrade of gnome-terminal to the Gnome 3 version. And that was almost immediately reverted back to the gnome-terminal 2 from the last Debian stable. The reason is very simple. The default theme of gtk3 (which is, naturally, used by all gnome 3 apps) is ugly. Like very, very ugly. And, surprisingly, there's only one non-default theme engine for gtk3: the one that heavily uses CSS3. I don't like its look either, but the most worrisome aspect of it is its quite noticeable slowness. There are ways to adjust the look with CSS3 hackery, after all. I've found that some porting work on old gtk engines has been initiated. But the quick and minimalistic Mist engine I'm used to is not yet ported.<br />
<br />
That's basically my whole Gnome 3 story. I cannot tolerate Gnome 3 not because of its experimental UI, but because I need a usable gtk3 theme first. I cannot even say what I think about gnome's UI, because I haven't even tried using it on a daily basis.<br />
<br />
Whoever makes Mist work on gtk3 will become my hero. Meanwhile, I was forced to find refuge in XFCE land, which is missing a few things I had on my gnome 2 desktop.<br />
<br />
</div>Aliaksei Kandratsenkahttp://www.blogger.com/profile/03257803952496012458noreply@blogger.com3tag:blogger.com,1999:blog-2109899694046822024.post-1367705579494930842011-05-16T23:32:00.000+03:002011-05-16T23:32:40.785+03:00Unbreaking LXC on latest Debian unstableWith the recent <a href="http://wiki.debian.org/ReleaseGoals/RunDirectory">switch</a> to the /run directory in Debian, I was getting an error from lxc when it tried to mount /dev/shm in a container and failed, because /dev/shm is now a symlink into /run. The simplest fix I found is replacing the symlink with a bind mount. Here's what I've added to /etc/rc.local:<br />
<br />
<pre>if [ -L /dev/shm ]
then
    mv /dev/shm /dev/shm~
    mkdir /dev/shm
    mount --bind "`readlink -f /dev/shm~`" /dev/shm
fi
</pre><div><br />
</div>Aliaksei Kandratsenkahttp://www.blogger.com/profile/03257803952496012458noreply@blogger.com0tag:blogger.com,1999:blog-2109899694046822024.post-12415580247898167042011-05-03T10:50:00.000+03:002011-05-03T10:50:31.490+03:00Setting up Distel and erlang remote shell for membaseI decided to do myself a small present. I've just written some Emacs Lisp code that grabs the erlang otp node and cookie via the REST API and connects my emacs to that node. I've made integrations with <a href="https://github.com/massemanet/distel">Distel</a> and the erlang remote shell. The latter was the most problematic because of the quite weird TTY handling in the Erlang shell. But in the end it seems to work great! Grab the code here: <a href="https://gist.github.com/952958">https://gist.github.com/952958</a> The usage is M-x alk-membase-shell and M-x alk-membase-setup-distel.<br />
<br />
<br />
<script src="https://gist.github.com/952958.js">
</script>Aliaksei Kandratsenkahttp://www.blogger.com/profile/03257803952496012458noreply@blogger.com0tag:blogger.com,1999:blog-2109899694046822024.post-9644620554009495392011-04-24T00:18:00.000+03:002011-04-24T00:18:27.597+03:00Should we avoid exceptions in JS ? (Or the problem with jsPerf micro-benchmarks)In general my take on benchmarking is: it's too easy to screw something up. The biggest problem is that with normal programming/engineering it's easy to see if you're doing it right; with benchmarking you just have some (plausible-looking or otherwise) numbers. I actually think that most of the benchmarks we're seeing are flawed in some way. So it's very easy to jump to conclusions, only to be found guilty of screwing up some important aspect. So first of all I'd like to make the following statement:<br />
<blockquote>I understand that benchmarks suck. Please be careful and don't blindly believe any benchmarking numbers. Yes. Even mine.</blockquote>Now back to the main story. Today <a href="https://github.com/documentcloud/underscore/pull/179">my pull request for underscore</a> was rejected. I had proposed a nice call/cc-like helper to do non-local returns from JS functions. It should be especially helpful for breaking out of forEach/map loops. More ideally, ES5's designers could have added some ability to terminate loops early, like via returning some special value. But apparently they're too conservative. Maybe I'll post some rants about that some other day.<br />
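The proposed helper was JavaScript, but the trick is language-agnostic. Here is a hypothetical Python rendering of the same idea (all names are mine, not from the actual pull request): a private exception carries the value out of a forEach-style traversal, and the wrapper makes sure the throw never escapes to the caller.

```python
class _Break(Exception):
    """Private exception used purely as a control-flow vehicle."""
    def __init__(self, value):
        self.value = value

def breakable(body):
    """Run body(brk); a call to brk(value) aborts body and makes
    breakable return value. The throw never escapes this function."""
    def brk(value=None):
        raise _Break(value)
    try:
        return body(brk)
    except _Break as e:
        return e.value

def each(seq, f):
    """forEach-style traversal: an ordinary `break` cannot escape it,
    because the loop body is a function call."""
    for x in seq:
        f(x)

def find_first_even(numbers):
    def body(brk):
        each(numbers, lambda n: brk(n) if n % 2 == 0 else None)
        return None  # traversal finished without breaking
    return breakable(body)

print(find_first_even([3, 5, 8, 11]))  # -> 8
```

The point is exactly the one argued below: once iteration is expressed as callbacks, a thrown exception is the only way to unwind out of the traversal early.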
<br />
Anyway, my patch was rejected because the underscore author does not want to use exceptions for control flow. His point is that exceptions are very slow.<br />
<br />
But is that true? I'm not sure. If we want all the performance we can get, then probably we shouldn't use forEach & friends in the first place. And then ES3's break statement can actually break out of several loop nestings (via break to a label). But note that the standard forbids breaking across function calls; otherwise it would be a nice (and potentially faster) alternative to exceptions.<br />
<br />
<div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">Part of the problem is that, <a href="http://jsperf.com/try-catch-performance-overhead">apparently</a>, <a href="http://code.google.com/p/v8/">V8</a> <a href="https://groups.google.com/forum/#!topic/nodejs-dev/E-Re9KDDo5w">forces</a> heap allocation of the frames of functions with a try/catch block.</div><div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"><br />
</div>The fact is, any code that creates closures has the potential to force heap allocation of stack frames (or parts of them), which, if done often enough, will trigger (relatively) expensive GC pauses. That's because if a closure is 'leaked' out of its dynamic scope, it effectively outlives the function that created it. So any variables that are needed by the closure cannot live on the normal stack.<br />
<br />
Lots of tricks in modern JS interpreters allow them to avoid heap allocation in many important cases. But sometimes you still have to pay that price.<br />
<br />
My point is: there is no way to do a non-local return in JS other than by throwing an exception. And it doesn't seem to be too slow. I also expect JS-engine vendors to gradually optimize that case.<br />
<br />
But even though I understand the <a href="http://www.ibm.com/developerworks/java/library/j-jtp02225/index.html">drawbacks of micro-benchmarking</a>, here's my (potentially flawed) take on the for-loop-with-break vs. forEach-with-throw problem.<br />
<br />
First, here's a jsPerf page on <a href="http://www.browserscope.org/user/tests/table/agt1YS1wcm9maWxlcnINCxIEVGVzdBiDm9EFDA">plain for vs. forEach performance</a>. We see that on FF3.6 forEach is much slower than a plain loop. But on V8 it's about the same.<br />
<br />
Second, here's a jsPerf page on <a href="http://www.browserscope.org/user/tests/table/agt1YS1wcm9maWxlcnINCxIEVGVzdBiYnMkFDA">for with break vs. forEach with throw</a> performance. We see that throwing out of a loop is not too bad. It actually seems to be faster on Firefox and only 1.5-to-2 times slower on Chrome's V8.<br />
<br />
Also note that, somehow, manual inlining of the add function produces much larger scores, which might indicate a major flaw in this microbenchmark. In general, of course, we'd like to compile/optimize benchmark code once and then run it multiple times. But the lack of automatic inlining in this very trivial case hints that we might be causing the JIT to optimize it each time it's run. So feel free to play with these benchmarks more and find some flaw!<br />
<br />
Note that there's <a href="http://jsperf.com/exception-performance/3">some plausible evidence</a> that throwing non-Errors is much faster. I think that's because throwing an Error instance causes the runtime to collect a backtrace and fill in the 'stack' attribute, which seems to be much costlier than just throwing. In fact, you can google for Java exception-throwing performance and you'll discover that optimal throwing performance is reached by throwing a 'prepared' exception instance. For exactly the same reason.<br />
<br />
But returning to the original question, I still feel that in some cases exceptions are a perfectly good way to do non-local returns. There are so many creative ways to use all language features. And blindly rejecting any of them just because it seems slow is not smart, IMHO. Sure, exceptions have their cost, but sometimes avoiding them is costlier. And all the trade-offs should be carefully analyzed and understood.Aliaksei Kandratsenkahttp://www.blogger.com/profile/03257803952496012458noreply@blogger.com0tag:blogger.com,1999:blog-2109899694046822024.post-64126132022554256672011-02-21T05:12:00.000+02:002011-02-21T05:12:42.789+02:00Caching autotools outputs for fun and profit<a href="http://www.gnu.org/software/make/manual/html_node/Parallel.html#Parallel">GNU make</a>'s parallel building feature is cool. And <a href="http://ccache.samba.org/">ccache</a>, which speeds up rebuilds of the same files, is cool too. But even with these great tools, rebuilding some projects is still not as fast as it could be. Why? Because preparing the GNU autotools files (./configure & friends) takes quite a long time. And there is no way to parallelize it.<br />
<br />
But we can cache autotools products just as ccache caches compiler products. And now there is a tool capable of doing that! You can grab it <a href="https://github.com/alk/caching_fabricate">here</a>. I've adapted <a href="http://code.google.com/p/fabricate/">fabricate.py</a> for this job.<br />
<br />
Just put fabricate.py into your PATH and prepend it to your command. Like:<br />
<br />
fabricate.py ./bootstrap<br />
<br />
or<br />
<br />
fabricate.py sh -c './bootstrap && ./configure'<br />
<br />
And it'll copy command outputs from the cache if all the dependencies are the same. Nearly instantaneous!Aliaksei Kandratsenkahttp://www.blogger.com/profile/03257803952496012458noreply@blogger.com0tag:blogger.com,1999:blog-2109899694046822024.post-22116691373214972542011-02-08T15:39:00.000+02:002011-02-08T15:39:04.690+02:00Welcome to 21st century, Aliaksey!I'm used to slow, expensive & unreliable Internet. Because of that I tended to keep as much as possible on my local hard drive. I had tons of specs, articles, e-books. And I even kept a local mirror of the entire Debian unstable repository! That's because at home, and until recently at the Altoros office, I had quite limited and expensive Internet access. I tried (successfully) to reduce my dependency on the Internet as much as possible.<br />
<br />
Around 2 weeks ago I accidentally deleted most of my 'stuff'. And now I'm forced to use online resources. And quite surprisingly, that's not a bad experience at all! The Internet here in the Bay Area is fast enough. And googling for information and reading it online sometimes seems to be even faster than locating and opening it from the local disk.<br />
<br />
This is an unusual and new experience. When I, for example, need some information on how something is done in python, I just ask Google. And within mere seconds I have precisely what I need! I've even started reading PDF articles online via Chrome's plugin, without downloading them first.<br />
<br />
I'm slowly getting used to the 21st-century Internet. I'm still not quite 100% comfortable with depending on Internet availability. And some lookups are definitely much faster against local data. For example, I don't think anything can be faster than a local <a href="http://www.google.com/search?sourceid=chrome&ie=UTF-8&q=supermegadoc">supermegadoc</a> documentation lookup of some function. But I'm going to stop downloading interesting stuff and will instead bookmark it. Let's see how it goes. I'm especially curious how it will work when I return home. Hopefully, we'll have a speedy ADSL connection by then :)<br />
<div><br />
</div>Aliaksei Kandratsenkahttp://www.blogger.com/profile/03257803952496012458noreply@blogger.com1tag:blogger.com,1999:blog-2109899694046822024.post-86973155110187115432011-02-07T01:51:00.000+02:002011-02-07T01:51:44.419+02:00How to use two monitors in IntelliJ IDEAI'm working on an educational Scala project. And IDEA is the natural choice for that. Normally I use Emacs for my work, but for anything Java-related it's not as good as IDEA. And I cannot express how powerful and polished this nearly free software IDE is.<br />
<br />
However, I was missing one very useful feature of Emacs, which is the ability to open several frames to effectively use two monitors. Until a few minutes ago.<br />
<br />
When you want to open the same "buffer" on two monitors, you can first split it horizontally (C-x 2 if you use Emacs key bindings) and then simply drag one of the windows out of its frame to the second monitor! If you want different files, you can just drag tabs out of the main frame.Aliaksei Kandratsenkahttp://www.blogger.com/profile/03257803952496012458noreply@blogger.com5tag:blogger.com,1999:blog-2109899694046822024.post-12016183369707839762011-01-29T23:00:00.001+02:002011-04-12T02:16:08.188+03:00How to testament... childs' deathOften Unix processes spawn children. Contrary to popular belief, children DO NOT automatically die when their parent exits.<br />
<br />
Unix job control has a concept of sessions and session leaders. The session leader is the process that 'owns' the terminal, and when this guy dies, each process of the session gets SIGHUP. This usually leads to death. Every time you open a new tab in your terminal emulator, you're creating a new session. And bash (or another shell) is the leader of this session. So your children will likely die when the terminal emulator tab is closed, but they will easily outlive your process. And they will live forever when your main process is spawned from cron or your favorite continuous integration tool.<br />
<br />
Of course the whole topic of Unix job control is slightly more complex. You can read more in the <a href="http://www.gnu.org/software/libc/manual/">GNU libc manual</a> (or aptitude install glibc-doc && info libc). The POSIX man pages (aptitude install manpages-posix{,-dev}) contain lots of details, much more than the usual Linux man pages.<br />
<br />
So one of the ways to ensure your children die is by using sessions. This is usually implemented by creating a pseudo-terminal pair (PTY) and giving the slave side of that pair to the child. The session leader (i.e. the child) gets SIGHUP when the master side of the PTY pair is closed. The master side is closed when your main process exits (if you don't pass its fd to the child by mistake). When the child then dies, that causes delivery of SIGHUP to the whole session. {spawn,fork}pty-like functions are used for that.<br />
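The PTY variant can be demonstrated in a few lines of Python. This is a Linux-oriented sketch (the function name is mine); pty.fork does the setsid-plus-controlling-terminal setup for us:

```python
import os
import pty
import signal
import time

def demo_pty_hangup():
    """Show that the child on the slave side of a PTY is hung up
    (SIGHUP, whose default action is to terminate) when the master
    side closes."""
    pid, master_fd = pty.fork()
    if pid == 0:
        # Child: pty.fork has already made us a session leader with the
        # slave PTY as controlling terminal. Just wait to be hung up.
        time.sleep(30)
        os._exit(1)  # should never be reached
    time.sleep(0.2)      # let the child settle into its sleep
    os.close(master_fd)  # closing the master hangs up the session
    _, status = os.waitpid(pid, 0)
    return os.WIFSIGNALED(status) and os.WTERMSIG(status) == signal.SIGHUP

print(demo_pty_hangup())
```

On Linux this returns True: closing the master delivers SIGHUP to the foreground process group of the slave's session, which kills the sleeping child.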
<br />
And there's another, even less widely known way. There is a concept of an orphaned process group that can be used to ensure SIGHUP delivery without relying on PTYs. I'll refer you to the <a href="http://pubs.opengroup.org/onlinepubs/009695399/functions/exit.html">posix man page</a> for details. But basically you fork and deliver SIGSTOP to the child. When your process group becomes orphaned (e.g. when the main process dies), each member will be sent SIGCONT & SIGHUP. An added benefit of this approach is that if the parent of your main process dies, your process and its children will die too. This relies on the main process or its parent being the process group leader, but that usually holds.<br />
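Here is an illustrative Python sketch of that fork-and-SIGSTOP dance (Linux-oriented; the names are mine, and the real code linked below is more elaborate). A throwaway group leader spawns a watcher that stops itself; when the leader dies, the group becomes orphaned and the kernel delivers SIGHUP + SIGCONT to the stopped watcher:

```python
import os
import select
import signal

def orphaned_group_gets_hup():
    """Demonstrate SIGHUP+SIGCONT delivery to a stopped member of a
    process group that becomes orphaned when its leader dies."""
    r, w = os.pipe()
    leader = os.fork()
    if leader == 0:
        os.setpgid(0, 0)            # become a process group leader
        watcher = os.fork()
        if watcher == 0:            # the child we want to die with us
            os.close(r)
            def on_hup(signum, frame):
                os.write(w, b"HUP")
                os._exit(0)
            signal.signal(signal.SIGHUP, on_hup)
            signal.alarm(15)        # self-destruct if the trick fails
            os.kill(os.getpid(), signal.SIGSTOP)
            signal.pause()          # wait for the pending SIGHUP
        # Wait until the watcher has actually stopped, then die. Our
        # death orphans the group, and the kernel sends SIGHUP+SIGCONT
        # to its stopped members.
        os.waitpid(watcher, os.WUNTRACED)
        os._exit(0)
    os.close(w)
    os.waitpid(leader, 0)
    ready, _, _ = select.select([r], [], [], 10)
    msg = os.read(r, 3) if ready else b""
    os.close(r)
    return msg
```

On Linux orphaned_group_gets_hup() returns b"HUP". Note the leader waits for the watcher to stop (waitpid with WUNTRACED) before exiting; without that there's a race where the group orphans before any member is stopped and no SIGHUP is sent.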
<br />
I'm using this to control a bunch of erlang VMs when running <a href="http://www.membase.org/">Membase</a> in development mode. You can grab the python code for doing that <a href="http://review.membase.org/4380">here</a>. The Ruby code for doing the same has been removed, but you can dig up its <a href="https://github.com/alk/ns_server/blob/f57a6b516e3e2f9c8c2a008c0acd469e8ec010f5/test/orphaner.rb">grave</a>.<br />
<br />
UPDATE: unfortunately this trick doesn't always work on OSX/Darwin due to a different definition of an orphaned process group in OSX: <a href="http://developer.apple.com/library/mac/#documentation/darwin/reference/manpages/man2/intro.2.html">http://developer.apple.com/library/mac/#documentation/darwin/reference/manpages/man2/intro.2.html</a>Aliaksei Kandratsenkahttp://www.blogger.com/profile/03257803952496012458noreply@blogger.com0tag:blogger.com,1999:blog-2109899694046822024.post-33206889848676474362010-12-26T12:44:00.000+02:002010-12-26T12:44:38.676+02:00Two envelopes sophism puzzleI'm currently on vacation. And it allows me to return to some small topics that I postponed earlier. One of them is the <a href="http://en.wikipedia.org/wiki/Two_envelopes_problem">Two envelopes problem</a>. This is a quite interesting and difficult <a href="http://en.wikipedia.org/wiki/Sophism">sophism</a> to solve and 'get'. This morning I'm not in a 100% work-ready mood, so I decided to return to my list of 'Later'. Surprisingly, this time it took quite a small amount of time to solve this puzzle. Actually, I'm a bit puzzled why I postponed it earlier. Maybe I should take vacations more seriously?<br />
<br />
I'm copying the <a href="http://en.wikipedia.org/wiki/Two_envelopes_problem">definition from wikipedia</a>:<br />
<br />
<span class="Apple-style-span" style="font-family: sans-serif; font-size: 13px; line-height: 19px;"><b>The basic setup</b></span><span class="Apple-style-span" style="font-family: sans-serif; font-size: 13px; line-height: 19px;">: You are presented with two indistinguishable envelopes containing some money. You are further informed that one of the envelopes contains twice as much money as the other. You may select any one of the envelopes and you will receive the money in the selected envelope. When you have selected one of the envelopes at random but not yet opened it, you get the opportunity to take the other envelope instead. Should you switch to the other envelope?</span><br />
<br />
<div style="font-family: sans-serif; font-size: 13px; line-height: 1.5em; margin-bottom: 0.5em; margin-left: 0px; margin-right: 0px; margin-top: 0.4em;"><b>The switching argument</b>: One line of reasoning proceeds as follows:</div><ol style="font-family: sans-serif; font-size: 13px; line-height: 1.5em; list-style-image: none; margin-bottom: 0.5em; margin-left: 3.2em; margin-right: 0px; margin-top: 0.3em; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"><li style="margin-bottom: 0.1em;">I denote by <i>A</i> the amount in my selected envelope.</li>
<li style="margin-bottom: 0.1em;">The probability that <i>A</i> is the smaller amount is 1/2, and that it is the larger amount is also 1/2.</li>
<li style="margin-bottom: 0.1em;">The other envelope may contain either 2<i>A</i> or <i>A</i>/2.</li>
<li style="margin-bottom: 0.1em;">If <i>A</i> is the smaller amount the other envelope contains 2<i>A</i>.</li>
<li style="margin-bottom: 0.1em;">If <i>A</i> is the larger amount the other envelope contains <i>A</i>/2.</li>
<li style="margin-bottom: 0.1em;">Thus the other envelope contains 2<i>A</i> with probability 1/2 and <i>A</i>/2 with probability 1/2.</li>
<li style="margin-bottom: 0.1em;">So the <a href="http://en.wikipedia.org/wiki/Expected_value" style="background-attachment: initial; background-clip: initial; background-color: initial; background-image: none; background-origin: initial; background-position: initial initial; background-repeat: initial initial; color: #0645ad; text-decoration: none;" title="Expected value">expected value</a> of the money in the other envelope is<br />
<br />
<img alt="{1 \over 2} 2A + {1 \over 2} {A \over 2} = {5 \over 4}A" class="tex" src="http://upload.wikimedia.org/math/6/8/d/68db9f468f376d8ea117dd56af9f7c5a.png" style="border-bottom-style: none; border-color: initial; border-left-style: none; border-right-style: none; border-top-style: none; border-width: initial; vertical-align: middle;" /><br />
</li>
<li style="margin-bottom: 0.1em;">This is greater than <i>A</i>, so I gain on average by switching.</li>
<li style="margin-bottom: 0.1em;">After the switch, I can denote that content by <i>B</i> and reason in exactly the same manner as above.</li>
<li style="margin-bottom: 0.1em;">I will conclude that the most rational thing to do is to swap back again.</li>
<li style="margin-bottom: 0.1em;">To be rational, I will thus end up swapping envelopes indefinitely.</li>
<li style="margin-bottom: 0.1em;">As it seems more rational to open just any envelope than to swap indefinitely, we have a contradiction.</li>
</ol><div style="font-family: sans-serif; font-size: 13px; line-height: 1.5em; margin-bottom: 0.5em; margin-left: 0px; margin-right: 0px; margin-top: 0.4em;"><b>The puzzle</b>: find the flaw in the very compelling line of reasoning above.</div><br />
IMO the most easily understood flaw in that reasoning is that the expected-value computation adds values of A taken from two different possible worlds: in one term A denotes the smaller amount, in the other it denotes the larger one, and those two values obviously differ. Fix the actual amounts instead, say x and 2x: switching then yields (1/2)·2x + (1/2)·x = 3x/2, which is exactly the expected value of keeping your envelope, so switching does not make sense. So the lesson is: be careful with your variables.Aliaksei Kandratsenkahttp://www.blogger.com/profile/03257803952496012458noreply@blogger.com0tag:blogger.com,1999:blog-2109899694046822024.post-51499571484678146632010-12-25T11:31:00.000+02:002010-12-25T11:31:00.397+02:00Reflections on Ruby vs. Python meetup<span class="Apple-style-span" style="color: #737373; font-family: Arial, sans-serif; font-size: 12px;"></span><br />
<div style="line-height: 1.4; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 10px; padding-left: 0px; padding-right: 0px; padding-top: 10px;"><span class="Apple-style-span" style="color: #737373; font-family: Arial, sans-serif; font-size: 12px; line-height: normal;"></span><br />
<div style="line-height: 1.4; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 10px; padding-left: 0px; padding-right: 0px; padding-top: 10px;">This is cross-posted from <a href="http://blog.altoros.com/ruby-vs-python.html">Altoros' developers blog</a>. I want to thank the Altoros PR department for editing this text.</div><div style="line-height: 1.4; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 10px; padding-left: 0px; padding-right: 0px; padding-top: 10px;">I never understood developers and customers who willingly choose Python over Ruby. Ruby does feel like a native language to me, while everything else doesn’t. No, this doesn’t mean that I don’t like working with other technologies. Over the last year and a half, I was really inspired by my positive experience of using Erlang and JavaScript, while working on the <a href="http://www.membase.org/">Membase</a> project (where Python is widely utilized). I really enjoyed doing my work, so it’s obviously possible to have fun with languages that are less well designed than Ruby.</div><div style="line-height: 1.4; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 10px; padding-left: 0px; padding-right: 0px; padding-top: 10px;">However, I always wanted to understand why this happens. Why do people so often choose languages other than Ruby? That is why I was excited to accept the invitation to the <a href="http://meetup.by/RPMM">“Ruby vs. Python” meetup</a> that was sponsored by our company. 
I chose the topic “Why Ruby Is a Brilliantly Designed Language” and wanted to summarize what makes Ruby different.</div><div style="line-height: 1.4; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 10px; padding-left: 0px; padding-right: 0px; padding-top: 10px;"><strong>The Holy War</strong></div><div style="line-height: 1.4; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 10px; padding-left: 0px; padding-right: 0px; padding-top: 10px;">The event was very dynamic and intriguing. The speakers and attendees were split into two groups: one defended Python, while the other fought for Ruby. Each group prepared presentations describing the history, community, main concepts, development features, and perspectives of their language of choice.</div><div style="line-height: 1.4; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 10px; padding-left: 0px; padding-right: 0px; padding-top: 10px;">I was a part of the group that presented Ruby. It was my first public speech in many years. My previous presentation, the defense of my diploma, was a big failure. I’m glad I’ve learned some lessons since then and this time wasn’t so bad. 
<img alt=":)" class="wp-smiley" src="http://blog.altoros.com/wp-includes/images/smilies/icon_smile.gif" style="border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; cursor: move;" /></div><div style="line-height: 1.4; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 10px; padding-left: 0px; padding-right: 0px; padding-top: 10px;"><strong>So, why is Ruby a brilliantly designed language?</strong></div><div style="line-height: 1.4; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 10px; padding-left: 0px; padding-right: 0px; padding-top: 10px;">In hindsight, it is obvious that I picked a very hard topic. Squeezing my experience into a 15-minute speech was really tricky. While preparing for the session, I reinforced my belief that Ruby is the best designed programming language I’ve ever used. However, when there was just one day left before the meetup, I realized that I could bring in only the most essential arguments to prove that.</div><ol style="list-style-position: inside; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;">1. My first pro-Ruby argument was that, from the moment the language was created, there were no missing features to add and no design bugs to fix. With over 10 years of history, Ruby has remained mostly stable, with only minor syntactic features added or tweaked.<div style="line-height: 1.4; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 10px; padding-left: 0px; padding-right: 0px; padding-top: 10px;"></div><div style="line-height: 1.4; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 10px; padding-left: 0px; padding-right: 0px; padding-top: 10px;">2. 
Second, Ruby provides all the powerful object-oriented programming (OOP) features it adopted from Smalltalk. It is a “pure” OOP language, where everything is an object. The pureness, open classes, method_missing, and other OOP features provide huge semantic power. Surprisingly, nothing game-changing has been introduced to programming language design over the last 30 years.</div><div style="line-height: 1.4; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 10px; padding-left: 0px; padding-right: 0px; padding-top: 10px;">3. After that, I talked about what powerful features Ruby adds on top of the traditional form of OOP. For instance, mix-ins, a powerful alternative to multiple inheritance, are a great invention. During the presentation, I demonstrated how they make functional programming even more natural than in LISP. Another remarkable Ruby feature worth mentioning was the ability to extend individual objects. It really makes defining “static” methods natural and can be useful in some other cases.</div><div style="line-height: 1.4; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 10px; padding-left: 0px; padding-right: 0px; padding-top: 10px;">4. I concluded by touching on Ruby’s concise syntax. As a part of this argument, I demonstrated Ruby’s syntactical killer feature: blocks.</div></ol><div style="line-height: 1.4; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 10px; padding-left: 0px; padding-right: 0px; padding-top: 10px;"><strong>Revealed pros and cons</strong></div><div style="line-height: 1.4; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 10px; padding-left: 0px; padding-right: 0px; padding-top: 10px;">There were some other nice discussions during and in between the speeches. For example, the Python group argued that its language is more widely used. 
However, I noted that one could resort to such an argument only because of the lack of other worthy pro-language arguments.</div><div style="line-height: 1.4; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 10px; padding-left: 0px; padding-right: 0px; padding-top: 10px;">Another controversial point was that Python is good for education and is adopted by the Massachusetts Institute of Technology as the primary programming language for students. I objected that Python is not the best language to start with. I think it is wiser to begin developing your programming skills with a pure OOP language, such as Ruby, because it builds a more solid foundation for object-oriented design and thinking.</div><div style="line-height: 1.4; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 10px; padding-left: 0px; padding-right: 0px; padding-top: 10px;"></div><div lang="en-US" style="margin-bottom: 0cm; margin-left: 0px; margin-right: 0px; margin-top: 0px;">The Ruby group was also represented by <a href="http://www.google.com/profiles/saksmlz">Aliaksandr Rahalevich</a>, an Altoros alumnus, who spoke about how Ruby allows him to design great APIs, and Hleb Stiblo, who presented an “Introduction to Ruby.” At the end, my university classmate <a href="http://twitter.com/#!/alovak">Pavel Gabriel</a> called upon pro-Python and pro-Ruby folks to summarize their arguments and language trivia.</div><br />
<div style="line-height: 1.4; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 10px; padding-left: 0px; padding-right: 0px; padding-top: 10px;">The discussion ended with confessions from both sides about what sucks in their language of choice. Python 2.6/3 incompatibility was the biggest issue that Pythonistas admitted. Ruby developers noted the lack of speed, the lack of support for real concurrency, and the cumbersome multiple-encodings support in Ruby 1.9. The first two negative points were well understood by Pythonistas, since they suffer from them as well. Ruby’s incompatibility of 1.9 vs. 1.8 was also mentioned, although nobody pointed out that the difference between these versions is far less than the difference between Python 2.6 and Python 3.</div><div style="line-height: 1.4; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 10px; padding-left: 0px; padding-right: 0px; padding-top: 10px;">I really enjoyed attending this event and making a presentation there. However, it seems like everyone, including me, forgot to mention that it is the hard work of human beings and their attention to detail that mostly determines the end result. I think that as long as the technology doesn’t actively interfere with the programmer’s job, it affects the end result very slightly. The language is merely the tool, while the engineer is the master.</div></div>Aliaksei Kandratsenkahttp://www.blogger.com/profile/03257803952496012458noreply@blogger.com1tag:blogger.com,1999:blog-2109899694046822024.post-33685204648054240822010-10-22T06:23:00.000+03:002010-10-22T06:23:05.863+03:00How to use time travel for testing javascript librariesThe project that I'm working on right now is <a href="http://membase.org/">Membase</a>. It has been an awesome experience in many ways. One cool part of that project is that I'm allowed to do quite advanced stuff for the Membase UI. 
Part of that is the Cells library, which is growing as part of the Membase UI effort. But this post is not about cells; I promise to write about cells later. This post is about mocking the JS clock for testing cells.<br />
<div><br />
</div><div>The code is <a href="http://github.com/membase/ns_server/tree/19211ffd8914370fc40b9d24f44ce79630ab8226/deps/menelaus/priv/js-unit-tests">here</a>. The files of interest are mockTheClock.js and wrapped-date.js. It is inspired by the equivalent code in <a href="http://github.com/alk/jsunit">JSUnit</a>, but it's more fully featured and more efficient. In particular, I'm using a binary heap to order pending timers.<br />
<br />
The code is Apache-licensed, so feel free to grab and re-use it for your own projects.</div>Aliaksei Kandratsenkahttp://www.blogger.com/profile/03257803952496012458noreply@blogger.com0tag:blogger.com,1999:blog-2109899694046822024.post-25031311598426311982010-09-02T23:58:00.000+03:002010-09-02T23:58:49.384+03:00News from the past — gpicker 2 is out!I released gpicker 2.0 more than half a year ago and have planned to write about it ever since; lack of time and laziness delayed that more and more. But some people outside of my immediate 'circle of influence' are now using it for some very creative things, and I cannot wait anymore. It's time to realize my plan at last! :)<br />
<br />
So gpicker 2.0 is out and it has some nice improvements. All new features are detailed in the README file, but I'd like to describe the most important things here.<br />
<br />
When I <a href="http://alkandratsenka.blogspot.com/2008/10/gpicker-tale.html">announced gpicker</a>, the project didn't even have a home. Its official home is now here: <a href="http://savannah.nongnu.org/projects/gpicker">http://savannah.nongnu.org/projects/gpicker</a>.<br />
<br />
gpicker became even faster. Loading and filtration were moved into separate threads, so that the UI is always ultra-responsive. With the speed improvements and separated filtration (which nicely aborts/restarts when the user changes the filter) it's now very responsive even on my whole filesystem, which has more than 1E6 files.<br />
<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; margin-left: 1em; text-align: right;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh1rOsyZKt-4P2lL7J3eqm2pw4M_Vn0K284EFxYZqeDtXVjcDiSvR9Ha4vp7Yb5cTQ-cccDa_KNn1usePUESDr2uP2ED93Pg3OCJjATpBq4eEiuuZABSXOL2x-fnDCBsVP_q2tK29yff5A/s1600/Screenshot.png" imageanchor="1" style="clear: right; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" height="225" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh1rOsyZKt-4P2lL7J3eqm2pw4M_Vn0K284EFxYZqeDtXVjcDiSvR9Ha4vp7Yb5cTQ-cccDa_KNn1usePUESDr2uP2ED93Pg3OCJjATpBq4eEiuuZABSXOL2x-fnDCBsVP_q2tK29yff5A/s400/Screenshot.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">supermegadoc and erlang docs</td></tr>
</tbody></table>Perhaps the most significant improvement is the ability to select from an arbitrary list. gpicker is very good at selecting files, but you can also feed it a list of arbitrary strings via standard input. And I have quite an interesting use of that feature: <a href="http://github.com/alk/supermegadoc">supermegadoc</a>. This thing indexes man pages, ri, devhelp & erlang documentation and allows you to quickly choose and open the documentation you need. But it's also quite useful for exploring documentation: you can try typing various words and see if there's anything like that in the documentation. I found it especially helpful for getting up to speed with the Erlang standard library. I also did a very basic imenu & tags completer for Emacs based on this feature.<br />
<br />
Another area of improvement is gpicker integration. VIM integration (contributed by <a href="http://avsej.github.com/">Sergey Avseyev</a>) is included in the gpicker distribution. And we also have plug-ins for <a href="http://github.com/avsej/gpicker-netbeans/downloads">netbeans</a> (by Sergey Avseyev) and for <a href="http://github.com/yltsrc/gedit-gpicker">gedit</a> (by <a href="http://github.com/yltsrc">Yury Tolstik</a>).<br />
<br />
We also have some support for building/running on <a href="http://www.apple.com/macosx/">OSX</a>. But unfortunately it's not very easy (with the limited Internet & OSX access I have) to get a proper installation of Gtk+ on OSX. The canonical way seems to be building from source via <a href="http://live.gnome.org/Jhbuild">jhbuild</a>. But the gpicker-on-OSX work was done against an ancient distribution of Gtk+ which has long been withdrawn from the web. So any success/failure reports with a proper Gtk+ installation, as well as contributions in this area, are very much welcomed. I expect that only minor configure.ac fixes should be required, at most. Contributing pre-built OSX binaries is very much appreciated too. This can be done with <a href="http://sourceforge.net/apps/trac/gtk-osx/wiki/Bundle">Bundler</a>.<br />
<br />
Yet another improvement is the ability to build gpicker without Gtk+ and glib. This flavor of gpicker doesn't have any GUI, obviously, but it's able to filter standard input against a filter specified on the command line, which should be enough for some useful things. Some people, for example, would like to have gpicker-like functionality in text-mode Emacs. gpicker-simple can help here. In fact I did some very preliminary (and buggy) integration of gpicker-simple filtration and the <a href="http://www.emacswiki.org/emacs/InteractivelyDoThings">IDO</a> user interface. If anyone is interested in making this work, please contribute. You can find it in gpicker.el from the gpicker distribution.<br />
<br />
The creative work by other people that I mentioned above is a very interesting idea: switching windows by auto-completing their names (and other info) in gpicker. See <a href="http://www.doitian.com/2010/08/switch-window-using-fuzz-matching/">Switch Window Using Fuzz Matching :: DoIT</a> for more info.<br />
<br />
I'm very glad that I can say thank you to the people who helped with code (Sergey & Yura) and/or bug reports/suggestions (the folks from the Ruby department at <a href="http://altoros.com/">Altoros</a>). I'm happy that I have some users outside of Altoros. In particular, <a href="http://www.fullofcaffeine.com/2010/01/01/gpicker-on-osx/">Marcelo De Moraes Serpa</a> has been very helpful with suggestions and spreading the word. Thank you very much, guys!<br />
<br />
I thought that gpicker would take over the world (it's awesome, after all), but my (not at all evil) plan utterly failed. This is a bit unfortunate :) But I'm very glad that gpicker is slowly getting some adoption. The future is bright!Aliaksei Kandratsenkahttp://www.blogger.com/profile/03257803952496012458noreply@blogger.com1tag:blogger.com,1999:blog-2109899694046822024.post-41720023857289000312010-07-31T23:33:00.001+03:002010-07-31T23:33:49.378+03:00My life with Debian<h4>Tribute</h4>I'm a long-time Debian user. And I'm a happy Debian user. My current Debian installation is very (and I mean very, very) old. I don't remember exactly how old it is, but it's more than 6 years old, and probably as much as 9 years old. Since its early months it has been on the unstable branch of Debian. It is updated regularly, as often as my (quite poor at times) network connectivity allows.<br />
<br />
When I get a new machine, I simply copy my existing installation onto it. And my particular OS has already been reincarnated from one machine to another countless times.<br />
<br />
I'm not trying to set any records here; I'm just lazy enough to avoid re-installing my OS and setting it up. But I seriously doubt that any other OS or distro could handle it. It's the unique combination of Debian's approach to distro development, package upgrade-ability policies, and attention to software quality that makes it possible.<br />
<br />
So I owe a big thank you to the Debian guys for their outstanding job during all these years. Thanks a lot!<br />
<br />
Of course I sometimes had issues during package upgrades. That is inevitable when you run unstable and do, probably, thousands of package upgrades per year. But I don't remember having anything really major; I was able to resolve all the issues that I had.<br />
<br />
<h4>Tips</h4>During these years of running and constantly upgrading my system I've learned a couple of tricks to keep it in its best shape. I'd like to share them, though none of this is really new.<br />
<br />
Sometimes, for various reasons, old versions of some libraries stay installed on your system. It's useful to run deborphan (http://www.debian-administration.org/articles/134) periodically. With the newer (well, a number of years old already) features of apt & aptitude this should happen less often, but some cruft can still accumulate.<br />
<br />
Another trick that I discovered quite recently is package database de-fragmentation. It's described here (http://ubuntuforums.org/showthread.php?t=1004376).<br />
<br />
Probably the most widely known tip is to purge instead of just uninstalling when removing packages. After a normal uninstallation, Debian keeps config files; purging a package removes those too. The Synaptic package management tool can be quite useful for purging any packages that were uninstalled in the default mode.<br />
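As a concrete sketch of what purging cleans up (the `rc` state is how dpkg really marks removed-but-not-purged packages; the sample listing below is fabricated for illustration):

```shell
# dpkg -l marks removed-but-not-purged packages with "rc" in the first column
dpkg_list='ii  bash    5.0-4  amd64  GNU Bourne Again SHell
rc  oldpkg  1.2-1  amd64  removed, but conffiles were kept'
printf '%s\n' "$dpkg_list" | awk '/^rc/ {print $2}'   # prints: oldpkg
# real usage (needs root):
#   dpkg -l | awk '/^rc/ {print $2}' | xargs -r dpkg --purge
```

The same awk filter works on real `dpkg -l` output; review the list before piping it into `dpkg --purge`.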
<br />
I found another cleanup opportunity in the package database. It turned out that dpkg keeps information about uninstalled packages in its /var/lib/dpkg/available file. And during all these years this file had grown quite substantially (15 megs). I wrote a simple script that cleans this up and used it a couple of months ago. So far I don't see any bad effects, so I can recommend it for anyone with a long-lived and often-upgraded Debian-based distro. Here it is.<br />
<script src="http://gist.github.com/502538.js"></script><br />
<br />
<h4>amd64 versus i386</h4>Another piece of knowledge that I can share is that Debian is probably the only distro that more or less supports running a 64-bit kernel with a 32-bit user-space. I think that's the best combination: you get all 4 gigs of address space for 32-bit apps, you can run 64-bit apps if you want, and you don't waste memory on twice-as-large pointers.<br />
<br />
It would be great to have the advantages of amd64: first of all, the larger register file and the modern instruction set (i386 Debian still targets i386). But in my opinion, for a typical desktop & developer machine the larger memory consumption of 64-bit programs outweighs the benefits. In particular, most java programs really consume very close to twice as much memory on 64-bit. And don't forget that, AFAIK, there's still no lightweight 'client' JIT for amd64. Larger memory consumption causes fewer cache hits and more memory bandwidth usage, so amd64 is often slower. I also did some benchmarks with ruby & rails and found that the i386 version is faster.<br />
<br />
So I've decided to stick with a 64-bit kernel and a 32-bit user-space (and avoid a re-install yet again). I've been running this combo for around half a year. One thing that was a bit annoying after the switch is that uname reported amd64, which caused issues with most (all?) configure scripts. My simple trick, which relies on Debian's default init, is to put the following file in /etc/initscript:<br />
<script src="http://gist.github.com/502563.js?file=initscript">
</script><br />
<br />
It makes sure that everything spawned by init has the i386 personality. This fixed the issues with configure scripts.<br />
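As a rough sketch (assumed content, modeled on the sample in the initscript(5) man page; the gist above has the actual file), such an /etc/initscript amounts to something like:

```shell
# /etc/initscript (sketch): init runs this for every program it spawns;
# "$4" holds the command line of the service. Prefixing it with setarch
# gives everything an i386 personality, so uname -m reports i686.
eval exec setarch i386 "$4"
```

The same effect can be had for a single shell with `setarch i386 bash`, which is handy for one-off builds.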
<br />
For some rare programs that require a kernel component and don't support mixed user- & kernel-space (VirtualBox), I have an amd64 version installed in a chroot. This also helps with development.<br />
<br />
That's all. Keep your systems clean and efficient.Aliaksei Kandratsenkahttp://www.blogger.com/profile/03257803952496012458noreply@blogger.com0tag:blogger.com,1999:blog-2109899694046822024.post-63709046514762472112010-07-01T11:21:00.001+03:002010-07-01T11:22:05.153+03:00How to cheaply turn single machine into a cluster<p>For development of <a href="http://membase.org">membase</a>, which is a distributed storage system, I often need to run it on a cluster of machines. Luckily I have two machines at home and at work, so with trivial use of rsync & ssh running a cluster of two nodes was easy. But I sometimes need to run more than two nodes, so I decided to find something cheap that allows me to run multiple true nodes even when I have a single machine. This is more important now because I'll be on a business trip for the next month, so I'll have only a single machine at my disposal.<br />
<p>One approach is to use virtualization. I tried it around a year ago for some project. Starting a complete OS just to run a single application is a bit too slow. That can be alleviated by the use of snapshots, but in practice it is still too painful: even with snapshots it's slow, and the free software virtualization products either get it wrong (VirtualBox) or are buggy (KVM & QEMU). Bridged networking is relatively slow to come up, and I remember having some networking issues when restoring from a snapshot.<br />
<p>Yesterday I finally tamed 'virtualization' that's cheap and works. My approach is to use <a href="http://lxc.sourceforge.net/">LXC</a>, which is Linux's built-in containers implementation. It supports network virtualization as a core feature and doesn't require a separate root for its instances. So with my current solution I can create a large number of 'servers', all having a shared filesystem but different hostnames and network stacks. It's reliable, starts quickly, and is easy to kill.<br />
<p>One of the problems was that I needed to create a virtual host-only network that connects the host and all containers. The problem is that the current implementation of the macvlan link type doesn't support networking with the host. I worked around that by creating a virtual ethernet (veth) pair and linking macvlans to one side of it, while using the other side as the host's end. So far it works beautifully!<br />
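A rough recreation of that trick with iproute2 (interface names and addresses here are made up for illustration; the linked gist has the real script, and these commands require root):

```shell
# veth pair: one end stays with the host, the other anchors the macvlans
ip link add host0 type veth peer name containers0
ip addr add 10.0.5.1/24 dev host0    # host's end of the host-only network
ip link set host0 up
ip link set containers0 up
# per container: a macvlan slave of the containers0 end, in bridge mode,
# so the macvlans (and, through the veth pair, the host) can all talk
ip link add link containers0 name c1eth0 type macvlan mode bridge
```

Each container then gets one such macvlan device moved into its network namespace, with its own address from the same subnet.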
<p>The main script is at the following gist: <a href="http://gist.github.com/459693">http://gist.github.com/459693</a>. It takes care of IP address & hostname allocation and simply runs the provided command inside a container. It also takes care to kill everything inside the container when the main command exits. This is useful for killing any daemons (e.g. the erlang port mapper) that might still be running inside.<br />
<p>Here's what I added to the project's makefile to launch multiple instances of membase:<br />
<script src="http://gist.github.com/459694.js"></script><br />
<p>'make lxc-run' starts one instance of membase inside a container. 'make lxc-cluster' starts three instances in tabs of a new terminal window.Aliaksei Kandratsenkahttp://www.blogger.com/profile/03257803952496012458noreply@blogger.com0tag:blogger.com,1999:blog-2109899694046822024.post-92166597123044166252010-06-03T09:32:00.002+03:002010-06-03T09:45:26.708+03:00Multiple return values in JavascriptWhile writing some javascript a few days ago I encountered a case where it was really nice to be able to return multiple values from a function. Usually you pack your return values in an array, return that array, and unpack it at the call site. While creating an array with values is easy in javascript, unpacking those values back is neither elegant nor short, because javascript (the portable subset at least) has no destructuring. E.g. code normally looks like this:<br />
<br />
<pre> function findRoutingFor(method, path) {
var foundRoute, routeArgs;
// some computation here
if (foundRoute)
return [foundRoute, routeArgs];
}
// and at call site
var rv = findRoutingFor(method, path);
if (!rv)
throw new Error(/* */);
var foundRoute = rv[0];
var routeArgs = rv[1];
</pre><br />
I believe I came up with a more elegant way to do that. <a href="http://en.wikipedia.org/wiki/Continuation_passing_style">Continuation passing style</a> saves us here. Here's how:<br />
<br />
<pre> // notice extra parameter
function findRoutingFor(method, path, body) {
var foundRoute, routeArgs;
// same computation here
// and then we invoke continuation
return body(foundRoute, routeArgs);
}
// then at call site...
return findRoutingFor(method, path, function (foundRoute, routeArgs) {
if (!foundRoute)
throw new Error(/* */);
// rest of call site is here
});
</pre><br />
So <a href="http://en.wikipedia.org/wiki/Continuation_passing_style">continuation passing style</a> allows us to avoid the packing & unpacking of multiple return values. There are some costs, like the extra indentation of the rest of the call site, but IMO in many cases it's better than the usual alternative.<br />
<div><br />
</div>Aliaksei Kandratsenkahttp://www.blogger.com/profile/03257803952496012458noreply@blogger.com0tag:blogger.com,1999:blog-2109899694046822024.post-19462606565616965922010-01-24T13:00:00.001+02:002010-01-24T13:08:26.218+02:00Emacs and broken syntax highlightingI've been struggling with syntax highlighting bugs in ruby-mode and javascript-mode for quite some time. Today I tried <a href="http://www.nongnu.org/espresso/">espresso-mode</a>, which claims to be a fixed javascript mode, and it fails too. I'm not sure why, but this seems to be quite typical for Emacs and not typical for other advanced editors (vim & textmate at least). Maybe something is wrong with the syntax highlighting facility of Emacs, or maybe major mode authors just cannot use it correctly.<br />
<br />
One of the potential problems is the regular expression facility of Emacs Lisp. First of all, it uses a quite inconvenient dialect of regexps: many constructs need to be escaped with '\' (alternation (|) and grouping, for example). Second, ELisp doesn't have special syntax for regexps; you have to use strings, where the backslash has to be escaped again. So even the simple regexp a|b turns into "a\\|b".<br />
<br />
Two years ago I wrote a small regexp DSL for Emacs Lisp. Using this DSL you can express complex regular expressions readably.<br />
<br />
So today I found the regexp literal expression in espresso.el -- "[=(,:]\\(?:\\s-\\|\n\\)*\\(/\\)\\(?:\\\\/\\|[^/*]\\)\\(?:\\\\/\\|[^/]\\)*\\(/\\)" -- translated it to my DSL and got:<br />
<br />
<pre>(redsl-to-regexp `(concat (char-set "=(,:")
(* (or (whitespace) "\n"))
(cap-group /)
(or "\\/" (neg-char-set "/*"))
(* (or "\\/" (neg-char-set "/")))
(cap-group /)))
</pre><br />
which is simply wrong. It obviously fails for a simple regexp -- /\\/<br />
<br />
I transformed it into:<br />
<br />
<pre>(redsl-to-regexp `(concat (char-set "=(,:")
(* (or (whitespace) "\n"))
(cap-group /)
(+ (or (neg-char-set "\\/")
(concat "\\" (anychar))))
(cap-group /)))
</pre><br />
<div>which is logical and works. And hooray, hooray, I finally have working syntax highlighting in javascript mode!<br />
<br />
I've uploaded regex-dsl to <a href="http://github.com/alk/elisp-regex-dsl">http://github.com/alk/elisp-regex-dsl</a> and will send the espresso.el fix soon.<br />
</div><div><br />
</div>Aliaksei Kandratsenkahttp://www.blogger.com/profile/03257803952496012458noreply@blogger.com0tag:blogger.com,1999:blog-2109899694046822024.post-22930678532288489092010-01-11T20:01:00.004+02:002010-01-11T20:30:44.202+02:00Speeding up ruby interpreter in Debian & UbuntuIt's an ancient trick, but I decided to blog about it while I still remember it.<br />
<br />
Unix (or ELF, to be more precise) shared libraries suck. Well, there are many useful features that are not found on other platforms (like LD_PRELOAD), but the performance is worse than on other systems. And sometimes dramatically so.<br />
<br />
The problem is that shared libraries are usually compiled as <a href="http://en.wikipedia.org/wiki/Position-independent_code">position independent code</a> (PIC). This is quite a bit slower (especially on older ISAs, like i386, that don't have PC-relative addressing) than normal code. Another reason for shared library slowness is the indirection for almost all function calls. This indirection gives you extra flexibility: ELF shared libraries allow you to replace almost every function. Sometimes this is very convenient, but it is slow.<br />
<br />
Now imagine modern C or C++ code, with small functions that frequently call each other. The extra indirection and the PIC cost in function prologues become quite high. GCC's -fvisibility switch allows you to trade away some of this flexibility for gains in speed, but few libraries use it yet.<br />
<br />
You can run a simple experiment (as I did a few years ago). Write a small program that does malloc/free in a tight loop. Link it normally (i.e. with shared libc) and statically, then compare their speed. I remember as much as a 50% gain with a statically linked libc, which doesn't pay the performance price for ELF shared library flexibility.<br />
<br />
There's a sort-of-hack that trades memory efficiency for speed. It is possible to link normal (i.e. non-PIC) code as a shared library (at least on i386). This way you'll have minimal function call indirection and no variable access indirection at all. You won't pay the PIC price either. The only downside of this method is that the dynamic linker has to patch TEXT section pages with relocated addresses, so those pages can't be shared between different processes. The performance gain may well be worth it, though. NVIDIA folks, for example, build their libGL this way. And I'm sure they know what they're doing.<br />
<br />
Why am I mentioning ELF shared libraries? Because Debian (and Ubuntu too) build the ruby interpreter as a shared library, and according to my measurements, this gives around a 10% performance penalty. To regain part of this loss, I simply edit ruby's 'configure' script and remove all mentions of -fPIC. I then run 'dpkg-buildpackage' and install the resultant .deb-s as usual. This brings the performance of the ruby interpreter back to its normal level (i.e. when it's built statically). Another optimization I use is passing better GCC optimization flags. The '-march=' flag is quite important on i386.<br />
<br />
P.S. Read the excellent <a href="http://people.redhat.com/drepper/dsohowto.pdf">http://people.redhat.com/drepper/dsohowto.pdf</a> by GNU libc guru Ulrich Drepper if you want more details about ELF shared libraries.Aliaksei Kandratsenkahttp://www.blogger.com/profile/03257803952496012458noreply@blogger.com0tag:blogger.com,1999:blog-2109899694046822024.post-61110152125268072922010-01-04T21:39:00.000+02:002010-01-04T21:39:39.120+02:00Power of modern javascript interpretersToday my colleague (Vlad Zarakovsky), who is interested in porting gpicker scoring to JS or finding something close to it, showed me some filtration code and its performance results. He demonstrated that some (quite non-trivial, it seems, from a quick glimpse of the source) filtration implementation takes 33 milliseconds on 22k filenames, which is not bad. This result was obtained on Safari, which sports a state-of-the-art javascript interpreter.<br />
<br />
Another thing we noted is that sorting takes an additional 14 milliseconds, which is quite large compared to the much more sophisticated scoring. I thought it should be due to the cost of transitioning from native qsort code to the JS code that implements the comparator function. And out of curiosity I proposed trying a pure JS sort instead.<br />
<br />
The results were interesting. At first it was much slower than the built-in sort, but Vlad then discovered that it's actually faster on random data (I haven't asked by how much, but that's not very important). All this led me to the following conclusions:<br />
<br />
<ul><li>A naive qsort on real-world non-random data may be 100 times slower than a proper qsort implementation. Which is understandable, but this is the first time I've personally run into such a case.</li>
<li>I was right about a pure JS qsort being able to beat the C qsort, which has to use a slow and barely optimizable comparator.</li>
<li>In general, state-of-the-art javascript interpreters are indeed approaching the speed of optimized C. If only we had a Ruby interpreter of the same quality :)</li>
</ul><br />
So next time your JS code depends on fast sorting, consider picking a pure JS sort implementation, but pick wisely. Also don't forget that there are still some very slow JS interpreters around (yes, I'm talking about 'the-cursed-browser').<br />
<br />
BTW, I'm sure the same applies to C's standard qsort. In cases where the comparator is quite simple, an implementation that is able to inline the comparator should be much faster. For gpicker I've stolen the source of glibc's qsort implementation and made sure it is able to inline the comparator.Aliaksei Kandratsenkahttp://www.blogger.com/profile/03257803952496012458noreply@blogger.com0tag:blogger.com,1999:blog-2109899694046822024.post-11180823738643602122009-12-27T23:22:00.004+02:002009-12-30T23:47:55.659+02:00Speeding up firewatirMy recent work is a pure ajax single-page application. And now I'm trying to cover it with some tests. Yes, I'm not TDD-infected, but that's a story for another post. And I've tried <a href="http://wiki.openqa.org/display/WTR/FireWatir/">Firewatir</a>. I quickly encountered its known issue of sometimes being slow.<br />
<br />
After a bit of investigation it turned out to be a lack of TCP_NODELAY on both sides of the firefox control socket. It's surprisingly common for programmers to be unaware that <a href="http://en.wikipedia.org/wiki/Nagle's_algorithm">Nagle's</a> algorithm affects localhost-to-localhost sockets too. In this case short requests and responses were being unnecessarily delayed in the kernel because of it.<br />
<br />
The most interesting thing was fixing this issue. It seems that (re)building the <a href="http://www.croczilla.com/bits_and_pieces/jssh/">JSSh</a> extension, which is the firefox side of Firewatir's power, is not a very convenient or quick process. So I made a funny hack instead.<br />
<br />
In one of my earlier posts I described a piece of code that drives GDB from ruby to get a backtrace from a running ruby process. This time I decided to use the same approach to set TCP_NODELAY on the connected socket inside the running firefox process. I extended the old code and published it at <a href="http://github.com/alk/alk-ruby-gdb-hacks/blob/master/gdb.rb">http://github.com/alk/alk-ruby-gdb-hacks/blob/master/gdb.rb</a>. After creating the Firewatir instance you simply need to call<br />
<code>GDB.force_socket_nodelay($jssh_socket)<br />
</code><br />
<span style="font-family: monospace;"><br />
</span>Aliaksei Kandratsenkahttp://www.blogger.com/profile/03257803952496012458noreply@blogger.com0tag:blogger.com,1999:blog-2109899694046822024.post-70257628814309276682009-12-27T22:01:00.008+02:002009-12-30T23:49:42.369+02:00On biggest numbersThere's quite a long, but very interesting essay on big numbers. Here:<br />
<div><a href="http://www.scottaaronson.com/writings/bignumbers.html">http://www.scottaaronson.com/writings/bignumbers.html</a>.<br />
<div></div><div>The author describes a math competition to write down a representation of as big a number as possible in 30 seconds. It's quite an interesting read, as it stretches your imagination.<br />
</div><div></div><div>The biggest joy was reading about the 'Busy Beaver' numbers. Those are amazing. But there's one mistake IMO. While it's possible to prove that the busy beaver sequence grows faster than any computable sequence, that alone doesn't let you win the biggest number competition. It only says that for each 'opponent' sequence there exists some number N starting from which the Busy Beaver sequence is greater. You would still need to prove that, say, busy beaver number 1111 is greater than Ackermann number 1111. For example BB(4) is only 107, while the 4th Ackermann number is quite large - around 4^(4^256), i.e. roughly 10^(10^154).<br />
</div><div></div><div>It may be quite tricky to judge such a competition. For estimating the largest entries we can use floating point numbers, but it seems we can easily meet entrants so big that the resulting floating point exponent will exceed available computer memory! So we may need to use a floating point approximation for the exponents themselves! And it's not hard to imagine yet another level of such approximation.<br />
</div><div></div><div>Computation time can also be challenging and may require special methods. How about estimating A(1E1111)? My math skills and available computation facilities do not allow me to get even an estimate of that number in my entire lifetime. But it should be a truly huge number.<br />
</div><div></div></div>Aliaksei Kandratsenkahttp://www.blogger.com/profile/03257803952496012458noreply@blogger.com0