Archives For General Computing

General CS related posts

Sailing simulation demo

February 18, 2014

A few years ago (2010) I made a sailing simulation demo. This video shows some footage from it:


The sailboat model is motivated by the real physics of sailing, but it is not 100% correct. I built a dynamical system that roughly approximates the real thing. After reading several white papers, I realized that making a “real” simulator would be a huge endeavor, so I cut a few corners :-). Even with my simplifications, the model can tack realistically, can get stuck “in irons”, the wind can blow the boat backwards, etc. I also implemented a simple collision system, AI that drives the NPC boats (the ones with red sails), line of sight, wave/ocean simulation, skeletal animation, lighting, hand-crafted 3D models, etc. The NPC AI does not cheat: it controls the same dynamical system as the player’s boat, by doing global optimization over a control/goal function. The goal of the NPC boats is to catch the player’s boat (the one with white sails). The demo is written against DirectX 9.0 and uses the basic framework that ships with the DX9 code samples.
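To give a feel for the NPC control idea, here is a minimal sketch in Python. The post does not spell out the optimization, so everything below is a hypothetical stand-in: a toy boat model with a constant speed (the real model depends on wind and sail trim), and a goal function that is just distance to the target boat. The AI tries candidate control inputs, simulates each one forward with the same dynamics the player would get, and keeps the one that scores best.

```python
import math

def step(state, rudder, dt=0.1):
    """Advance a toy boat state (x, y, heading) by one time step.
    This is a stand-in for the demo's dynamical system, not its actual physics."""
    x, y, heading = state
    heading += rudder * dt
    speed = 1.0  # constant speed for the sketch; the real model depends on wind
    return (x + speed * math.cos(heading) * dt,
            y + speed * math.sin(heading) * dt,
            heading)

def choose_rudder(state, target, candidates=(-1.0, -0.5, 0.0, 0.5, 1.0),
                  horizon=20):
    """Pick the rudder input whose simulated trajectory ends closest to target.

    The AI does not cheat: it only evaluates candidate inputs against the same
    dynamics, scoring each by a goal function (here, distance to the target)."""
    def cost(rudder):
        s = state
        for _ in range(horizon):
            s = step(s, rudder)
        return math.hypot(s[0] - target[0], s[1] - target[1])
    return min(candidates, key=cost)

# An NPC at the origin, heading along +x, chasing a target off to its left:
rudder = choose_rudder((0.0, 0.0, 0.0), target=(0.0, 5.0))
print(rudder)
```

The real demo optimizes over a richer control space (sail trim as well as rudder) and a more elaborate goal function, but the shape of the search is the same: simulate, score, pick the minimum.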

You can use the demo source code if you wish (as is and with no warranty). The ZIP file with the C++ source code (a VS 2010 project), the shaders and the demo executable (runs on Windows with DX9 installed) is located in this file –



This is a really cool demo of what will be possible in computer graphics –

Those lights and reflections are just too crazy to believe! I wonder how fast it would drain a mobile device battery…


Found a pretty interesting article about CS progress over the last 20 years –

The results are not that great, which is also clearly demonstrated in this video –


OK, I decided to revive this blog from hibernation. I guess it is more like a website than a real blog that you can subscribe to. Anyway, I was recently evaluating one pretty good search engine and hit a performance issue that might be interesting to some people. The issue was with the speed of data indexing for search – after some basic perf tuning, I reached a certain speed (in “documents per second”), but it was still not sufficient for me. So I did more parameter tweaking to see if it could be improved, but nothing helped. It looked like I had hit its upper perf limit.

This graph shows how different resources were utilized during the indexing run (there were some minor variations, but this is the most representative part):

Disk IO vs CPU During Data Indexing

As you can see, the most used resource was disk IO (not a surprise). Specifically, the engine was writing data most of the time. This makes sense, since it is creating the search index on the disk :-). What’s interesting is that even though the search indexer needs to do word breaking and some textual processing, which are processor-intensive operations, the CPU sat mostly idle. The hardware I had was: an Intel quad-core 2.3 GHz CPU, 8GB RAM and two 500GB SATA hard drives, with one drive dedicated to indexing (the OS ran on the other). So my computer had plenty of CPU power, a medium amount of memory and pretty slow commodity hard drives. When I ran the same indexing on a better SCSI drive, it worked faster (as expected).

I started thinking about how to make it run faster on my existing SATA drive and tried different variations of parameters (increased memory caching, changed the number of threads, changed the number of documents per indexing batch, etc.), but they had almost zero effect on the speed of indexing. Then I stumbled on some parameters that controlled compression of the index chunks and of some temporary text files used during indexing. The manual for this search engine said clearly that if I turned compression “off”, indexing should run faster. This made sense, since I remembered that in the “old” days compression was expensive, so I turned it off. To my surprise, the indexing process became much slower without compression. When I turned it back on, indexing performance improved again.

At this point I realized that in this setup, where the biggest bottleneck is disk IO and where CPU power and memory are in abundance, compression can actually speed up disk-intensive operations – data gets compressed/decompressed in memory (the indexing chunks in my case were pretty small) and is then written to or read from the disk much faster. The time to compress/decompress small chunks of data in memory is negligible compared to the time needed to write/read them to/from the disk, provided you have plenty of CPU power and your data compresses well. My intuition from the old days, when compression was costly and was all about saving hard disk space, was wrong. Modern systems are often bound by disk IO rather than by CPU/memory, so compression can improve the performance of disk-intensive applications/servers.
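The trade-off above is easy to see with a toy sketch. The numbers below are purely illustrative (real indexing chunks and compression ratios will differ), but the mechanism is the same: text-like data compresses well, so far fewer bytes have to cross the disk interface, while the CPU cost of compressing a small chunk in memory stays modest.

```python
import zlib

# A chunk of repetitive, text-like data, roughly the size of a small index chunk.
chunk = b"the quick brown fox jumps over the lazy dog " * 1000  # ~44 KB

compressed = zlib.compress(chunk, level=6)
ratio = len(compressed) / len(chunk)

print(f"raw bytes:        {len(chunk)}")
print(f"compressed bytes: {len(compressed)}")
print(f"ratio:            {ratio:.2%}")

# Round-trip check: decompression restores the data exactly.
assert zlib.decompress(compressed) == chunk

# If the disk sustains D bytes/s, writing the compressed chunk takes roughly
# ratio * (len(chunk) / D) seconds instead of len(chunk) / D -- a big win
# whenever the CPU can compress much faster than the disk can write.
```

With a compression ratio like this, the disk does a small fraction of the writing it would otherwise do, which is exactly why turning compression off made my indexing slower.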

    Later, I found that other people knew this fact all along. For example, in this excellent book – Introduction to Information Retrieval, I found the following:

“The second, more subtle advantage of compression is faster transfer of data from disk to memory … We can reduce input/output (IO) time by loading a much smaller compressed posting list, even when you add on the cost of decompression. So, in most cases, the retrieval system runs faster on compressed postings lists than on uncompressed postings lists.”

And later also:

“Choosing the optimal encoding for an inverted index is an ever-changing game for the system builder, because it is strongly dependent on underlying computer technologies and their relative speeds and sizes. Traditionally, CPUs were slow, and so highly compressed techniques were not optimal. Now CPUs are fast and disk is slow, so reducing disk postings list size dominates. However, if you’re running a search engine with everything in memory, the equation changes again.” 

So, if you develop data-intensive applications, compression might be your friend when disk IO is the bottleneck. This may change again with the arrival of solid state disks…