There’s a new essay called “Optimizing Code for Speed” available on Shlomi Fish’s site. It covers the motivation, methods and terminology of optimizing code, and provides many links and examples to illustrate this important subject.
There’s a new essay called “Optimizing Code for Speed” available on Shlomi Fish’s site. It covers the motivation, methods and terminology of optimizing code, and provides many links and examples to illustrate this important subject.
Thanks Shlomi!
Seconded. Well worth a read and I’ve learned some new things, too.
Let me join here. It was an interesting article, worth reading, and worth thinking about it (as the following comments show). I never thought that I would read something about optimization for speed – today, when computers have more and more power and nobody really seems to care for any kind of optimization, when “bloatware” is everywhere, making the most modern computer react as it was already 10 years old. Optimization is considered to be “outdated” by many “modern” programmers, I sometimes think. But it’s basics. You should know about complexity classes, iterations (if you have to do them), memory consumption and COW. Thanks for putting such an article on OSAlert.
All of the techniques in the article are valid for the objective of decreasing execution time, and in fact I would expect anyone with a BS in Computer Science to understand and be familiar with the techniques. I believe there should be a greater emphasis on the order of optimization. Time and time again I have seen novices worrying about porting their C# or Java application to C/C++ only to find they have not done other optimization. When a novice was skeptical that anything could beat the “almighty” C, I told him to proceed with his port. His original Java program to about a minute, his new C program took about 35 seconds. He felt this has been a success until he was given my Java implementation which ran in 0.6 seconds.
My point is basics first. I tend to follow this order:
1. Have a well structured design.
2. Optimize IO (File or database)
3. Minimize in memory usage and copying
4. Replace your algorithms with ones with better runtime characteristics.
5. Replace the 10% most executed portion of your program with assembly (ie. java native method, C/C++ inline)
Note: the application should be profiled between every step.
Just my 2 cents.
P.S. The article didn’t mention that in some cases reusable modular functions are faster, despite the call overhead, if the result is that most functions used remain in cache during execution.
Wonderful tips — thank you.
Good article, but overall I don’t think they emphasized the point enough that you shouldn’t start optimizing if you can help it. While the author does touch on that, they don’t emphasize it enough. After all:
For example, quotes from the article such as this worry me:
If your program is built on glib, USE GLIB FUNCTIONS. If you’re using Qt, USE QT FUNCTIONS. If that’s not fast enough (and it is for 99% of cases) then investigate a more specialized library, but avoid at all costs writing your own data structures (unless you have a very unusual requirement and have gotten at least 2 other expert opinions on the matter).
I understand the lure of optimization. It’s tons of fun to profile something and save some cycles here and there, and I always scroll right to the optimization section of the KDE commit digest, but I also understand that it is often just that, “fun”. I’m not against making stuff faster, but you really have to get your facts straight first.
I think the take home message is “think before you optimize”. I recall a post to a message board complaining that painting an image in Qt was taking half a second. At least 3 different people immediately replied that the original poster should just use OpenGL and it would be faster, without even considering that there must be something weird going on for painting to be that slow. Well I asked for his code, and it turned out he was reading each pixel from the image and painting it to the screen (so there were hundreds of thousands of function calls for every paint event). Of course a change to paintImage() solved everything.
Edited 2009-02-13 19:39 UTC
the number one priority when optimizing is finding the bottleneck. optimizing for memory useage makes no sense when the program is slow because it waits for io, and vice versa.
anybody out there, benchmarking is the ONLY way to make your program faster. i’ve seen people “optimize” code which didn’t need optimizations and ended up with a piece of unmaintainable crap.
please, benchmark, benchmark and benchmark again. just because you think something is slow does not make it slow. i’ve tried that in the beginning and i’ve been proven wrong too many times.
bottleneck is the what and benchmarking is the how when optimizing.
and just a general tip, don’t be too smart. other software writers optimize for the most common case.
On this article, especially since it’s not easy to do: you have to measure CPU cycle, memory usage, cache usage, IO usage, etc.
Changing the algorithm or the implementation is the easiest part once you understand where and why it’s slow.
Before I took the Algorithmic course in my Computer Science BSc, I never had an idea why some algos where faster than other. Once you get to understand what is O(n log n) and why it is faster than O(n^A^3), it change a lot the way you code. Doing something right is not because it works ok, but because you’re efficient doing it. People may say that today we have plenty of Ghz to spare, but in this age of processing power efficiency, optimized code is important. The problems however is not about doing algos, there is course for this. Most algos are well documented and have been know for a while. Nowadays, most compiler have builtin optimizer that can recognize patterns. Optimization is now more of: how can I get my code to stay longer in the processor cache, how I can make better us of disk IO, how can I parallize my application?
Copy-on-write is not always a good optimization. In case of multithreaded programs the locking kills your performance.
Think about your optimizations, and don’t assume that a optimization will work in all cases. And second, time your optimization to see that it is really an improvement.
Even complexity optimizations don’t always work out the way you think. Maybe for N=1000 a N Log N sort is faster than a N^2 sort, but for N=5, maybe the N^2 sort if both smaller and faster.
Have a look at this article as well:
http://www.gotw.ca/publications/optimizations.htm
Edited 2009-02-16 09:57 UTC
This is true. I have used bubble sort, which is O(N^2), in a case where N was guaranteed to always be 5 (a hand of 5 cards), since it is a simple algorithm to write and understand and the speed was more than adequate.