Hello Infernauts!!
One of the many changes we made in the recently released Infer.NET 2.3 beta was allowing parallel inference on multiple cores. We are keen to make Infer.NET scale to ever larger datasets and supporting parallelism has always been an important part of that goal. Here I'll show how to use multi-core parallelism to speed up inference - of course, you'll only see the speedup on a multi-core machine.
The first thing you need to do is to install the Microsoft Parallel Extensions CTP. This is the library that Infer.NET uses to do multi-core parallelism. We have chosen to work with the CTP for the time being - the Parallel Extensions library is planned to be part of version 4.0 of the Microsoft .NET framework and we will move to using that version at a future date. You can read more about current and forthcoming Microsoft parallelism technologies at the MSDN Parallel Computing Developer Center.
OK, so you've installed the CTP. Now pick a model you want to parallelise. I'll use the click model which you can run from inside the Infer.NET examples browser, under 'Applications'. Make sure that inference on your model is working as you would expect. Now you can configure your inference engine to use parallel loops by adding the following line:
engine.Compiler.UseParallelForLoops = true;
The click model uses two engines, one for training and one for test, so I added this line in both cases. You must configure the inference engine before any calls to Infer() for this to have an effect. The result of this change is that certain for loops in the generated inference code will be replaced with Parallel.For loops. To see how this affects the speed of inference, you can use the handy built-in ability to see timings for various stages of inference:
engine.ShowTimings = true;
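Putting the two settings together, a minimal sketch might look like the following. The two-coin model is only a hypothetical stand-in for your own model, and the namespaces are assumed to be those of the Infer.NET 2.x releases. A toy model this small has no large loops, so do not expect a speedup from it; it only shows where the settings go relative to the first call to Infer().

```csharp
using System;
using MicrosoftResearch.Infer;         // assumed namespace for Infer.NET 2.x
using MicrosoftResearch.Infer.Models;  // assumed namespace for Infer.NET 2.x

class ParallelInferenceSketch
{
    static void Main()
    {
        // Hypothetical toy model standing in for your own model.
        Variable<bool> firstCoin = Variable.Bernoulli(0.5);
        Variable<bool> secondCoin = Variable.Bernoulli(0.5);
        Variable<bool> bothHeads = firstCoin & secondCoin;

        InferenceEngine engine = new InferenceEngine();

        // Configure the engine BEFORE the first call to Infer().
        engine.Compiler.UseParallelForLoops = true; // emit Parallel.For loops
        engine.ShowTimings = true;                  // print timings per stage

        Console.WriteLine(engine.Infer(bothHeads));
    }
}
```

If your application uses several engines, as the click model does, each one needs the same two lines.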
Typical results on my 8-core machine are shown in the table below. For comparison, I have also shown results for running a Hidden Markov Model on a large data set.
| | Time per iteration - Normal | Time per iteration - Parallel For |
|---|---|---|
| Click Model Training | 218 ms | 177 ms |
| Click Model Test | 22 ms | 30 ms |
| Hidden Markov Model | 5341 ms | 1352 ms |
These results may be unexpected. The speedup for click model training is about 20% using parallel for loops (218 ms down to 177 ms per iteration), but for click model testing, using parallel for loops actually slows things down (22 ms up to 30 ms). This is because parallel for loops introduce additional overhead compared to ordinary for loops, so you will only get a benefit if the loop is large and a reasonable amount of work is being done inside the loop. In the test case, the iterations are quick and the time is dominated by the parallel overhead. This point is illustrated by the Hidden Markov Model results, where an almost 4x speedup is achieved using parallel for loops (5341 ms down to 1352 ms) - quite a significant improvement for one line of code!!
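To get a feel for this overhead outside Infer.NET, here is a rough sketch that times an ordinary for loop against Parallel.For, once with a tiny loop and once with a large one. It is not the code Infer.NET generates: the Work method and the loop sizes are made up for illustration, and it is written against the System.Threading.Tasks namespace of .NET 4 (the CTP may expose Parallel under a different namespace).

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks; // .NET 4 namespace; an assumption for CTP users

class LoopOverheadSketch
{
    // Hypothetical stand-in for the work done in one loop iteration.
    static double Work(int i, int innerSteps)
    {
        double sum = 0.0;
        for (int k = 1; k <= innerSteps; k++) sum += Math.Log(i + k);
        return sum;
    }

    // Runs the same loop serially and in parallel and prints both timings.
    static void Compare(string label, int n, int innerSteps)
    {
        double[] results = new double[n];

        Stopwatch serial = Stopwatch.StartNew();
        for (int i = 0; i < n; i++) results[i] = Work(i, innerSteps);
        serial.Stop();

        // Each iteration writes only to its own slot, so no locking is needed.
        Stopwatch parallel = Stopwatch.StartNew();
        Parallel.For(0, n, i => { results[i] = Work(i, innerSteps); });
        parallel.Stop();

        Console.WriteLine("{0}: for = {1} ms, Parallel.For = {2} ms",
            label, serial.ElapsedMilliseconds, parallel.ElapsedMilliseconds);
    }

    static void Main()
    {
        Compare("small loop", 100, 1);        // little work: overhead dominates
        Compare("large loop", 100000, 1000);  // lots of work: cores can help
    }
}
```

The numbers will vary from machine to machine, and the first Parallel.For call also pays a one-off warm-up cost, so treat the output as a rough indication only.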
So, in summary, parallel for loops can speed up inference, especially in larger models, and may speed it up considerably. The exact speedup you will get depends on your hardware, model and inference algorithm. The easiest way to determine it is just to try it out and see.
Whilst this post has discussed multi-core parallelism, it is also possible to distribute Infer.NET inference across a cluster, by dividing your model into chunks using shared variables. At the moment, you have to wire together the chunks manually, e.g. using MPI. We will be looking at how to make this more automatic in a future version of Infer.NET and we'll be sure to blog about it when we do!
John W.
Posted 08-13-2009 2:55 PM by jwinn