ropencvimage-processingnon-linear-regressionstructure-from-motion

Simple Orthographic Structure from Motion using R -- Determining Metric Constraints


I would like to build a simple structure from motion program according to Tomasi and Kanade [1992]. The article can be found below:

https://people.eecs.berkeley.edu/~yang/courses/cs294-6/papers/TomasiC_Shape%20and%20motion%20from%20image%20streams%20under%20orthography.pdf

This method seems elegant and simple, however, I am having trouble calculating the metric constraints outlined in equation 16 of the above reference.

I am using R and have outlined my work thus far below:

Given a set of images

enter image description here

I want to track the corners of the three cabinet doors and the one picture (black points on images). First we read in the points as a matrix w where

w

Ultimately, we want to factorize w into a rotation matrix R and shape matrix S that describe the 3 dimensional points. I will spare as many details as I can but a complete description of the maths can be gleaned from the Tomasi and Kanade [1992] paper.

I supply w below:

w.vector=c(0.2076,0.1369,0.1918,0.1862,0.1741,0.1434,0.176,0.1723,0.2047,0.233,0.3593,0.3668,0.3744,0.3593,0.3876,0.3574,0.3639,0.3062,0.3295,0.3267,0.3128,0.2811,0.2979,0.2876,0.2782,0.2876,0.3838,0.3819,0.3819,0.3649,0.3913,0.3555,0.3593,0.2997,0.3202,0.3137,0.31,0.2718,0.2895,0.2867,0.825,0.7703,0.742,0.7251,0.7232,0.7138,0.7345,0.6911,0.1937,0.1248,0.1723,0.1741,0.1657,0.1313,0.162,0.1657,0.8834,0.8118,0.7552,0.727,0.7364,0.7232,0.7288,0.6892,0.4309,0.3798,0.4021,0.3965,0.3844,0.3546,0.3695,0.3583,0.314,0.3065,0.3989,0.3876,0.3857,0.3781,0.3989,0.3593,0.5184,0.4849,0.5147,0.5193,0.5109,0.4812,0.4979,0.4849,0.3536,0.3517,0.4121,0.3951,0.3951,0.3781,0.397,0.348,0.5175,0.484,0.5091,0.5147,0.5128,0.4784,0.4905,0.4821,0.7722,0.7326,0.7326,0.7232,0.7232,0.7119,0.7402,0.7006,0.4281,0.3779,0.3918,0.3863,0.3825,0.3472,0.3611,0.3537,0.8043,0.7628,0.7458,0.7288,0.727,0.7213,0.7364,0.6949,0.5789,0.5491,0.5761,0.5817,0.5733,0.5444,0.5537,0.5379,0.3649,0.3536,0.4177,0.3951,0.3857,0.3819,0.397,0.3461,0.697,0.671,0.6821,0.6821,0.6719,0.6412,0.6468,0.6235,0.3744,0.3649,0.4159,0.3819,0.3781,0.3612,0.3763,0.314,0.7008,0.6691,0.6794,0.6812,0.6747,0.6393,0.6412,0.6235,0.7571,0.7345,0.7439,0.7496,0.7402,0.742,0.7647,0.7213,0.5817,0.5463,0.5696,0.5779,0.5761,0.5398,0.551,0.5398,0.7665,0.7326,0.7439,0.7345,0.7288,0.727,0.7515,0.7062,0.8301,0.818,0.8571,0.8878,0.8766,0.8561,0.858,0.8394,0.4121,0.3876,0.4347,0.397,0.38,0.3631,0.3668,0.2971,0.912,0.8962,0.9185,0.939,0.9259,0.898,0.8887,0.8571,0.3989,0.3781,0.4215,0.3725,0.3612,0.3461,0.3423,0.2782,0.9092,0.8952,0.9176,0.9399,0.925,0.8971,0.8887,0.8571,0.4743,0.4536,0.4894,0.4517,0.446,0.4328,0.4385,0.3706,0.8273,0.8171,0.8571,0.8878,0.8766,0.8543,0.8561,0.8394,0.4743,0.4554,0.4969,0.4668,0.4536,0.4404,0.4536,0.3857)

w=matrix(w.vector,ncol=16,nrow=16,byrow=FALSE)

Then create registered measurement matrix wm according to equation 2 as

equation 2

by

wm = w - rowMeans(w)

We can decompose wm into a '2FxP' matrix o1 a diagonal 'PxP' matrix e and 'PxP' matrix o2 by using a singular value decomposition.

svdwm <- svd(wm)

o1 <- svdwm$u
e <-  diag(svdwm$d)
o2 <- t(svdwm$v) ## dont forget the transpose!

However, because of noise, we only pay attention to the first 3 columns of o1, first 3 values of e and the first 3 rows of o2 by:

o1p <- svdwm$u[,1:3]
ep <-  diag(svdwm$d[1:3])
o2p <- t(svdwm$v)[1:3,] ## dont forget the transpose!

Now we can solve for our rhat and shat in equation (14)

equation 14

by

rhat <- o1p%*%ep^(1/2)
shat <- ep^(1/2) %*% o2p

However, these results are not unique and we still need to solve for R and S by equation (15)

equation 15

by using the metric constraints of equation (16)

equation 16

Now I need to find Q. I believe there are two potential methods but am unclear how to employ either.

Method 1 involves solving for B where B=Q%*%solve(Q) then using Cholesky decomposition to find Q. Method 1 appears to be the common choice in literature, however, little detail is given as to how to actually solve the linear system. It is apparent that B is a '3x3' symmetric matrix of 6 unknowns. However, given the metric constraints (equations 16), I don't know how to solve for 6 unknowns given 3 equations. Am I forgetting a property of symmetric matrices?

Method II involves using non-linear methods to estimate Q and is less commonly used in structure from motion literature.

Can anyone offer some advice as to how to go about solving this problem? Thanks in advance and let me know if I need to be more clear in my question.


Solution

  • i^f can be written as {a_if b_if c_if}.

    j^f can be written as {a_jf b_jf c_jf}.

    Q can be written as {{q11,q12,q13}{q12,q22,q23}{q13,q23,q33}}.

    so our equations are:

    So the first equation can be written as:


    which is equivalent to

    To keep it short we define now:

    (I know the spacings are terrably small, but yes, this is a Vector...)

    So for all equations in all different Frames f, we can write one big equation:

    (sorry for the ugly formulas...) Now you just need to solve the QQ^T-Matrix using Cholesky decomposition or whatever...