C# performance tuning, part 7. Summary.

Previous parts of this series:
Part 1: Introduction
Part 2: Collisions explained
Part 3: Obvious optimizations
Part 4: Quad Tree
Part 5: Quad Tree improvements and small fixes
Part 6: Minimize loop repeats

I’ve went through whole optimization process. Getting better and better results at each step, finally reaching the limit when getting 362 times faster than first version of code. That’s a lot. Is that a limit? I doubt. We could do few things more at this point. Replacing property calls with private field should give a little bit more performance. Investigating algorithms more deeply may uncover some more optimizations.

Is that needed? At that point to me – no. I’ve been testing it under pretty stressful conditions – small area packed with 1000 triangles means a lot of collisions happening. Changing area size to 10 000 x 10 000 gives serial collision check time at 2ms level. Leave area size 1000 x 1000 and limit triangle number to 500 and time is as low as 2ms as well. And it’s still probably way more collisions going on per frame than in usual 2D game where we either don’t check for collisions against objects close to each other (player vs. floor) or objects are usually not colliding (player vs. wall or vs. player).

How does the data look?

Change Collisions serial Time serial Collisions parallel Time parallel
initial 8058 1814 8058 562
line segments initialized 8058 1061 8058 308
bounding boxes 8058 25 8058 7
quad tree basic 7054 30 7054 10
_allChildren added 7054 25 7054 7
two segments check 6685 19 6685 7
rectangle optimization 6685 19 6685 7
line precalculations 6685 18 6685 7
triangle contains linq replacement 6685 13 6685 4
QuadTree collison check 3163 5 3163 3

As you can see collisions detected over time decreased as I’ve made the code more precise at the same time improving performance.

What’s to learn from this exercise? First – performance often comes with the price. You can do some things faster, but you will need to pay the price in memory usage (when precalculatig some values), code readability (when replacing LINQ with code duplications, however one could argue if resulted code was harder to read in this particular case) or time take to rewrite the code (changing simple list and replacing it with more complex Quad Tree).

Also – but that’s pretty common knowledge – there is no point in doing performance improvements without profiler set up. You may get better results, you may not, but you will only be guessing without something that will measure your code’s performance.

And final thought – many times there are complains that C# and Java are slow. Well, I’m pretty sure those collisions written in C or C++ would be faster. But would it make that much of a difference? I don’t think so, at least not in my case. For some applications it can be critical to get code running as fast as possible, for others 2ms is fast enough. And tool you know will serve you better at that point than tool you are not that experienced with.


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s