Via daniel’s blog (the creator of curl) I arrived to this page: why the Java implementation (JGit) doesn’t run nearly as fast as the C implementation. The short version of it is: even after many tunings JGit is twice as slow as the C implementation. One of the problems which got my attention, was the different ways a SHA-1 sum got sliced and diced. So I’ve done a microbenchmark and here are my (not very scientific) results:
- The fastest way to compare two SHA-1 sums in Java (that I found) was to use its string representation. I’ve tried cramming the hash in Unicode characters (two bytes per character) and byte arrays. The first was only slightly slower, while the second was orders of magnitude slower (~15x slower)
- Compared to the naive C implementation (using strcmp over the string representation) the Java solution was 100x times (!) slower
What is the end-conclusion? Yes, Java is slower. This is an extreme case of course (amongst other problems, the test ran for very short period of times and possibly the JIT didn’t kick in) and in real life the performance loss is much smaller. In fact the email linked above talks about a 2x performance loss and 2x bigger memory consumption. What it doesn’t talk about however, is the number of bugs (of the “walk all over your memory and you are scratching your head” kind) in the C implementation versus the Java implementation. In my opinion:
- The speed of Java is “good enough”. In fact it is (orders of magnitude) better than many other high-level languages which are widely used (like PHP, Perl, Python, Ruby).
- Yes, you can implement things in C, but you will do it in 10x the time with 10x the bugs and probably go mad (unless your aim is job security rather than getting work done)
- There is an incredible amount of work going into improving the performance of the JVM. Check out this episode from the Java Posse (great podcast btw!) if you are interested in the subject
- Always profile before deciding that you need to optimize a certain part of your code. Humans are notoriously bad at guessing the bottlenecks
- “Good enough” means “good enough”. Ok, so the Java implementation was a 100 times slower. Still, it managed to compare over 10 million (that is 10^7) hashes in one second! I find it hard to believe that the main bottleneck in a source-code versioning system this is the comparing of hashes (or the CPU more generally). Even my crappy CVS saturates the disk I/O over a high latency VPN.
- Related to the above point: set (realistic) goals and don’t obsess about the fact that you could be “doing better”. For example: it needs to render the HTML page in less than 100 ms in 95% of the cases. Could you do it in less tha 50 ms? Maybe, but if 100 ms is good enough, it is good enough.
- Finally, after you profiled, you always have the option of reimplementing problematic parts in C if you think that it’s worth your time
Picture taken from Tahmid Munaz's photostream with permission.
Better than Python as in faster?
ReplyDeleteThat's completely the reverse of my own personal, non developer experience.
Python programs usually run great, stable (!!) and fast, Java apps are buggy, crash and are really slow. I suspect Java is ok because some folks like it (something works for you guys) but i never liked a Java app.
Sorry if i actually triple posted, i didn't get a confirmation on the first 2..
ReplyDeleteI'm anonymous, but i'm a regular reader :)
@Anonymous: you didn't triple post (or at least it doesn't seem that way from my comment moderation side). Sorry for moderating the comments, but it is just a spam prevention measure.
ReplyDeleteFrom a speed point of view: there are many aspects of it (for example: the average speed of a long running application, the startup speed of the application - which again can be subdivided in multiple cases - like the first time the splashscreen shows up, or until GUI becomes usable. Then we have the whole latency vs. throughput problem).
My end-point I wanted to make is that there are many ways to skin a cat (what a horrible metaphor btw!) and comparison like "X is N times faster than Y" are mostly academic. The right way (IMHO) is to set goals (ie. GUI delay less than 100ms) and then work towards that goal. Going beyond the goal has no benefit. If you think that the goal is not correctly set, modify it, but don't optimize for the sake of optimization.