Infer.NET 2.3 Beta 1 released

This is our first release since the 2.2 Beta in January; the version number change reflects the fact that an initial version of Gibbs sampling is now available. Just as importantly, there have been many practical improvements, particularly to indexing and jagged arrays, which enable a larger class of models to be built. Some new factors are also available, including a logistic factor, and there are wrappers that make the syntax a bit more natural for F# and for IronPython. Finally, we have looked into what was preventing Infer.NET from running on Linux under Mono (a combination of a bug in Mono and some obfuscation issues) and have worked around these. We have verified that the examples, at least, now run under Mono - let us know how you get along. There have been one or two minor API changes, but your models should mostly work without change unless you have been using the InferAll method - more on this later.

The list of changes can be found in the User Guide, and I will pick out and expand on a few of the changes below.

Gibbs Sampling

This is the first release of Gibbs sampling within the Infer.NET framework. Gibbs sampling is a very different algorithm from Expectation Propagation and Variational Message Passing (which are approximate deterministic algorithms); it generates a chain of samples from the posterior distribution, and marginal distributions can be estimated from the sample lists. Block Gibbs sampling is also supported as part of the implementation. This allows more efficient sampling for highly correlated variables, and is also required for dealing with deterministic factors and constraints.

In Infer.NET, you specify a model in a way that is completely independent of the inference algorithm; it is only when you want to run inference that you specify the algorithm, and your model code is then compiled for that algorithm. The examples browser installed with Infer.NET allows you to choose which algorithm to run on a given model (however, note that not all algorithms support all factors so you will get an error message for certain combinations of model and algorithm). As an example, take a look at the 'Learning a Gaussian with Ranges' example in the tutorial browser; you can run the same model with all three algorithms and compare results.
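As a rough sketch of this separation, the program below defines one small model (a Gaussian with unknown mean and precision) and runs it under each of the three algorithms by passing a different algorithm object to the InferenceEngine constructor. The class name, the Report helper and the labels are purely illustrative, and the using directives assume the MicrosoftResearch.Infer namespaces of the 2.x releases; adjust them if your installation differs.

```csharp
using System;
using MicrosoftResearch.Infer;               // assumed namespace for InferenceEngine and the algorithms
using MicrosoftResearch.Infer.Models;        // assumed namespace for Variable, Range, VariableArray
using MicrosoftResearch.Infer.Distributions; // assumed namespace for Gaussian, Gamma

class AlgorithmComparison  // hypothetical class name
{
    // Hypothetical helper: runs inference with the given engine and prints both marginals.
    static void Report(string label, InferenceEngine engine,
                       Variable<double> mean, Variable<double> prec)
    {
        Gaussian meanMarg = engine.Infer<Gaussian>(mean);
        Gamma precMarg = engine.Infer<Gamma>(prec);
        Console.WriteLine("{0}: mean = {1}, precision = {2}", label, meanMarg, precMarg);
    }

    static void Main()
    {
        // The model is written once, with no reference to any inference algorithm.
        Variable<double> mean = Variable.GaussianFromMeanAndVariance(0, 100);
        Variable<double> prec = Variable.GammaFromShapeAndScale(1, 1);
        Range i = new Range(2);
        VariableArray<double> data = Variable.Array<double>(i);
        data[i] = Variable.GaussianFromMeanAndPrecision(mean, prec).ForEach(i);
        data.ObservedValue = new double[] { 5.0, 7.0 };

        // The algorithm is chosen only when an engine is created; the model
        // is then compiled separately for each algorithm.
        Report("EP", new InferenceEngine(new ExpectationPropagation()), mean, prec);
        Report("VMP", new InferenceEngine(new VariationalMessagePassing()), mean, prec);
        Report("Gibbs", new InferenceEngine(new GibbsSampling()), mean, prec);
    }
}
```

The three sets of marginals will generally differ somewhat, since each algorithm makes a different approximation, and the Gibbs results depend on the samples drawn.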

Here is a very simple example of Gibbs sampling where we infer the mean and precision of a Gaussian. Note how we can recover both the marginal and the chain of samples.

// The model
Variable<double> mean = Variable.GaussianFromMeanAndVariance(0, 100);
Variable<double> prec = Variable.GammaFromShapeAndScale(1, 1);
Range i = new Range(2);
VariableArray<double> data = Variable.Array<double>(i);
data[i] = Variable.GaussianFromMeanAndPrecision(mean, prec).ForEach(i);

// The observations
data.ObservedValue = new double[] { 5.0, 7.0 };

// The inference
InferenceEngine engine = new InferenceEngine(new GibbsSampling());
var meanMarg = engine.Infer<Gaussian>(mean);
var meanSamples = engine.Infer<SampleList<double>>(mean, QueryTypes.Samples);
var precMarg = engine.Infer<Gamma>(prec);
var precSamples = engine.Infer<SampleList<double>>(prec, QueryTypes.Samples);

Note that Gibbs sampling is a work in progress, and some factors are not yet supported. In particular, gates are not yet handled, so you will not be able to use Variable.Switch or Variable.If statements. Factors that take an array argument, such as 'Sum', are not yet supported either. Also on our to-do list are some speed improvements in the generated code, especially for array variables.

Improvements in jagged arrays and indexing

Most real-world problems have complex and irregular dependencies between variables. Support for such problems is now greatly improved; the generated code is more efficient, and several bugs have been fixed.

As an example, consider a social network. We might have various observations about the individuals in the network, and the goal is to infer something about the link between two individuals. Variable ranges, which were available in Infer.NET 2.2, can be used to encode the graph structure - so we can range over all users, and for each user range over that user's friends. Improvements now allow edge indexing to be encoded via a 2D observed index array, and allow switching based on that edge index, so that you can encode latent variables on the edges of the social network.

A later blog will look at these types of model in more detail.

API changes

API changes have occurred in two areas: first in the Infer and InferAll methods, and second in the evidence operators. You won't need to worry about the latter unless you've been writing your own factors and operators, so I'll concentrate on the former.

The Infer function (a method on the InferenceEngine) was previously a bit tricky to use with variable arrays (especially complex arrays such as jagged arrays of 2D arrays). The return types in these cases were (and still are) quite difficult to understand and work with; each return type is a form of DistributionArray, which can be thought of as a distribution over an array domain. It is more natural and less confusing to work with .NET arrays of distributions over non-array domains, and this is now the recommended usage (though the old usage will still work).

For example, suppose you have a jagged variable array of type

VariableArray<VariableArray<double>, double[][]>

where the individual items are drawn from Gaussians. You can now get the inference results as a jagged array of Gaussians as follows:

Gaussian[][] posterior = engine.Infer<Gaussian[][]>(w);
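For completeness, here is a minimal sketch of how a jagged array such as w might be built and then queried. The range names, the inner sizes and the class name are illustrative only, and the using directives assume the MicrosoftResearch.Infer namespaces of the 2.x releases. Since nothing is observed here, the inferred marginals should simply be the priors.

```csharp
using System;
using MicrosoftResearch.Infer;               // assumed namespace for InferenceEngine
using MicrosoftResearch.Infer.Models;        // assumed namespace for Variable, Range, VariableArray
using MicrosoftResearch.Infer.Distributions; // assumed namespace for Gaussian

class JaggedInferExample  // hypothetical class name
{
    static void Main()
    {
        // Illustrative inner-array sizes: the first inner array has 2 items, the second has 3.
        int[] sizes = { 2, 3 };
        Range outer = new Range(sizes.Length);
        VariableArray<int> sizesVar = Variable.Constant(sizes, outer);

        // The size of the inner range depends on the position in the outer range.
        Range inner = new Range(sizesVar[outer]);

        // w has type VariableArray<VariableArray<double>, double[][]>.
        var w = Variable.Array(Variable.Array<double>(inner), outer);
        w[outer][inner] = Variable.GaussianFromMeanAndVariance(0, 1).ForEach(outer, inner);

        // Ask for the result as a plain .NET jagged array of Gaussians.
        InferenceEngine engine = new InferenceEngine();
        Gaussian[][] posterior = engine.Infer<Gaussian[][]>(w);

        for (int a = 0; a < posterior.Length; a++)
            for (int b = 0; b < posterior[a].Length; b++)
                Console.WriteLine("w[{0}][{1}] = {2}", a, b, posterior[a][b]);
    }
}
```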

As part of the overhaul of the inference methods, the InferAll method no longer returns anything. You can call it as before, and it will have the same behaviour as before, and run the inference for the specified variables, but you will not be able to assign the results. To retrieve the results, just call individual Infers for each variable you are interested in; if this variable was specified in a previous call to InferAll, the marginal will be returned immediately without further computation.
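The pattern described above might look like the following sketch, which reuses a small Gaussian model with unknown mean and precision. The class name is illustrative, and the using directives assume the MicrosoftResearch.Infer namespaces of the 2.x releases.

```csharp
using System;
using MicrosoftResearch.Infer;               // assumed namespace for InferenceEngine
using MicrosoftResearch.Infer.Models;        // assumed namespace for Variable, Range, VariableArray
using MicrosoftResearch.Infer.Distributions; // assumed namespace for Gaussian, Gamma

class InferAllExample  // hypothetical class name
{
    static void Main()
    {
        Variable<double> mean = Variable.GaussianFromMeanAndVariance(0, 100);
        Variable<double> prec = Variable.GammaFromShapeAndScale(1, 1);
        Range i = new Range(2);
        VariableArray<double> data = Variable.Array<double>(i);
        data[i] = Variable.GaussianFromMeanAndPrecision(mean, prec).ForEach(i);
        data.ObservedValue = new double[] { 5.0, 7.0 };

        InferenceEngine engine = new InferenceEngine();

        // InferAll no longer returns anything; it only runs inference
        // for the listed variables.
        engine.InferAll(mean, prec);

        // The marginals are then retrieved with individual Infer calls,
        // which should not trigger any further computation.
        Gaussian meanMarg = engine.Infer<Gaussian>(mean);
        Gamma precMarg = engine.Infer<Gamma>(prec);

        Console.WriteLine("mean = {0}", meanMarg);
        Console.WriteLine("precision = {0}", precMarg);
    }
}
```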

Language Wrappers

The extensive use of generic methods in Infer.NET can make calling it from F# a bit cumbersome and unintuitive. A set of wrapper types for distribution arrays (for example GaussianArrayOfArray) and variable arrays (for example VariableGaussianArrayOfArray) is provided for commonly used distributions and domain types. The wrapper also provides alternative operators for cases where Infer.NET operator overloads are not recognised; for example, the "<" operator (though supported in Infer.NET for creating a boolean variable from two real variables) is not recognised when calling from F#, so the wrapper provides "<<" as an alternative. Another part of the wrapper provides more F#-like syntax for working with Infer.NET if blocks and switch blocks. The F# wrapper is documented in the User Guide, and is distributed as a DLL (Infer.FSharp.dll) and as a source file (FSharpWrapper.fs) - these can be found under the bin and source folders respectively in the installation folder.

There is also a wrapper for IronPython, which is distributed as IronPythonWrapper.py under the source folder under the installation folder. This wrapper is only useful/tested for IronPython 2.6 Beta 2 and above.

Well, that's a summary of some of the more visible changes. There have, of course, been many changes under the hood, fixing bugs, and making things more efficient. We plan to do some more focussed blogs on individual areas in the coming weeks, so stay tuned.

John G. and the Infer.NET team.


Posted 08-11-2009 7:03 PM by John Guiver