Hi,
I've been trying to do some Bayesian linear regression as a first trial, but without much success:
double[,] data = new double[,] {
    {1,-3}, {1,-2.1}, {1,-1.3}, {1,0.5}, {1,1.2}, {1,3.3}, {1,4.4}, {1,5.5}
};
Range rows = new Range(data.GetLength(0));
Range columns = new Range(data.GetLength(1));
Variable<Matrix> x = Variable.Constant<Matrix>(new Matrix(data)).Named("x");
Variable<Vector> w = Variable.VectorGaussianFromMeanAndPrecision(
    new Vector(new double[]{0,0}), PositiveDefiniteMatrix.Identity(2)).Named("w");
Variable<Vector> yVar = Variable.MatrixTimesVector(x, w).Named("y");
yVar.ObservedValue = new Vector(30, 45, 40, 80, 70, 100, 130, 110);
InferenceEngine engine = new InferenceEngine(new VariationalMessagePassing());
VectorGaussian postW = engine.Infer<VectorGaussian>(w);
I suspect that it doesn't work because Variable.MatrixTimesVector is not implemented yet. What would be the best way to solve this problem? Implement MatrixTimesVector, quit because it won't work at all because of..., or try to find a way around it?
Thanks,
Joao
I tried to do it as a series of inner products of vectors, but that is also not supported. :)
Guess I will have no choice?
The problem here is not with Infer.NET, but with the Variational Message Passing algorithm and the particular model being used here. This is not really a standard linear regression model, since normally you would add Gaussian noise before observing the product. Here there is no noise, and that is the source of the problem. You are directly observing the product of two variables, which VMP cannot support. It is not a case of Infer.NET being incomplete. VMP simply does not handle the case when a derived variable is observed. You will run into this limitation no matter how you rewrite the model. So, you should either use EP or change the model to have some additional noise.
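To make the distinction concrete, the two models can be written side by side. This is only a sketch, and the notation is not from the thread: X is assumed to be the 8x2 data matrix with rows x_i^T, w the weight vector, and \sigma^2 a noise variance that the modeller would have to choose.

\[
\text{without noise (original post):}\qquad w \sim \mathcal{N}(0, I), \qquad y = Xw \quad \text{(observed exactly)}
\]
\[
\text{with noise (standard regression):}\qquad w \sim \mathcal{N}(0, I), \qquad y_i \mid w \sim \mathcal{N}(x_i^\top w,\ \sigma^2), \quad i = 1, \dots, 8
\]

In the first form the observation acts as a hard constraint on w. The eight observed points do not lie on a single line (y goes 30, 45, 40 while the input increases), so no w satisfies y = Xw exactly. The noise term is what lets the model explain the data.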
You can get some insight into why VMP breaks down here by reading my paper "Divergence measures and message passing" (http://research.microsoft.com/en-us/um/people/minka/papers/message-passing/). As shown there, VMP will not represent the posterior distribution but simply pick one possible solution and put all probability mass there. This happens due to the zero-forcing nature of the divergence being minimized. Rather than have VMP return degenerate solutions, we opted to have Infer.NET throw an exception in these cases.
So to build on Tom's answer, here is a solution using Vectors and InnerProduct which adds Gaussian noise:
Vector[] data = new Vector[] {
    new Vector(1.0, -3), new Vector(1.0, -2.1), new Vector(1.0, -1.3), new Vector(1.0, 0.5),
    new Vector(1.0, 1.2), new Vector(1.0, 3.3), new Vector(1.0, 4.4), new Vector(1.0, 5.5)
};
Range rows = new Range(data.Length);
VariableArray<Vector> x = Variable.Constant(data, rows).Named("x");
Variable<Vector> w = Variable.VectorGaussianFromMeanAndPrecision(
    new Vector(new double[] { 0, 0 }), PositiveDefiniteMatrix.Identity(2)).Named("w");
VariableArray<double> y = Variable.Array<double>(rows);
y[rows] = Variable.GaussianFromMeanAndVariance(Variable.InnerProduct(x[rows], w), 1.0);
y.ObservedValue = new double[] { 30, 45, 40, 80, 70, 100, 130, 110 };
InferenceEngine engine = new InferenceEngine(new VariationalMessagePassing());
VectorGaussian postW = engine.Infer<VectorGaussian>(w);
Console.WriteLine("Posterior over the weights: " + Environment.NewLine + postW);
Best,
John W.
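As a sanity check on the printed result, the posterior for the noisy model above is available in closed form, because the prior and the likelihood are both Gaussian. What follows is the standard conjugate result, not something stated in the thread. The notation is assumed for illustration: X is the 8x2 matrix whose rows are the data vectors, y is the vector of observations, the noise variance is 1 and the prior precision is the identity, as in the code. The numbers were worked out by hand from the data and rounded.

\[
p(w \mid y) = \mathcal{N}\!\left(w \mid m,\ \Lambda^{-1}\right), \qquad \Lambda = I + X^\top X, \qquad m = \Lambda^{-1} X^\top y
\]
\[
X^\top X = \begin{pmatrix} 8 & 8.5 \\ 8.5 & 77.29 \end{pmatrix}, \qquad
X^\top y = \begin{pmatrix} 605 \\ 1394.5 \end{pmatrix}, \qquad
\Lambda = \begin{pmatrix} 9 & 8.5 \\ 8.5 & 78.29 \end{pmatrix}
\]
\[
m = \frac{1}{632.36} \begin{pmatrix} 78.29 & -8.5 \\ -8.5 & 9 \end{pmatrix} \begin{pmatrix} 605 \\ 1394.5 \end{pmatrix} \approx \begin{pmatrix} 56.16 \\ 11.71 \end{pmatrix}
\]

If the model is set up as intended, the mean printed for the weights should be close to these values: an intercept near 56 and a slope near 11.7. This has not been checked against an actual run.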
Many thanks. That's what I meant to do in the first place but I misconceived the model.