[SOLVED] DICOM: how to resample multi modality data with different origins?

For any two images A and B deemed to represent the same object, registration is the act of identifying for each pixel / landmark in A the equivalent pixel / landmark in B.
Assuming each pixel in both A and B can be embedded in a coordinate system, registration usually entails transforming A such that, after the transformation, the coordinates of each pixel in A coincide with those of the equivalent pixel in B (i.e. the objective is for the two objects to overlap in that coordinate space).
An isometric transformation is one that preserves distances: the distance between any two pixels in A is the same as the distance between the equivalent two pixels in B after the transformation has been applied. For instance, rotation in space, reflection (i.e. mirror image), and translation (i.e. shifting the object in a particular direction) are all isometric transformations. A registration algorithm applying only isometric transformations is said to be rigid.
An affine transformation is similar to an isometric one, except that scaling and shearing may also be involved (i.e. the object can also grow, shrink, or skew).
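To make the distinction concrete, here is a minimal MATLAB sketch. The two points, the rotation angle and the scale factor are arbitrary values picked for illustration. It checks that a rotation leaves the distance between two points unchanged, whereas a scaling does not.

```matlab
% Two arbitrary points (hypothetical values chosen for illustration).
p = [1; 2; 3];
q = [4; 6; 3];

% Rotation by 30 degrees about the z-axis (an isometric transformation).
theta = pi / 6;
R = [cos(theta) -sin(theta) 0;
     sin(theta)  cos(theta) 0;
     0           0          1];

% Uniform scaling by a factor of 2 (affine, but not isometric).
S = diag([2 2 2]);

d_before = norm(p - q);        % 5
d_rot    = norm(R*p - R*q);    % still 5, up to rounding error
d_scaled = norm(S*p - S*q);    % 10

fprintf('before: %g, rotated: %g, scaled: %g\n', d_before, d_rot, d_scaled);
```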
In medical imaging, if A and B were obtained at different times, it is highly unlikely that the transformation is a simple affine or isometric one. For instance, say during scan A the patient had their arms down by their side, and in scan B the patient had their arms over their head. There is no rigid registration of A that would result in perfect overlap with B, since distances between equivalent points have changed (e.g. the head-to-hand and hand-to-foot distances differ between the two scans). Therefore more elaborate non-rigid registration algorithms would need to be used.
The fact that in your case A and B were obtained during the same scanning session in the same machine means that it's a reasonable assumption that the transformation will be a simple affine one. I.e. you will probably only need to rotate and translate the object a bit; if the coordinate system of A is 'denser' than B, you might also need to grow / shrink it a bit. But that's it, no weird 'warping' will be necessary to compensate for 'movement' occurring between scans A and B being obtained, since they happened at the same time.
A 3D vector, denoting a 'magnitude and direction' in 3D space, can be transformed to another 3D vector using a 3x3 transformation matrix T. For example, if you apply the transformation

$$T = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix}$$

to the vector v = [x; y; z] (using matrix multiplication), the resulting vector u is

$$u = T v = \begin{bmatrix} ax + by + cz \\ dx + ey + fz \\ gx + hy + iz \end{bmatrix}$$

In other words, the 'new' x-coordinate depends on the old x, y, and z coordinates in a manner specified by the transformation matrix, and similarly for the new y and new z coordinates.
If you apply a 3x3 transformation T to three vectors at the same time, you'll get three transformed vectors out. e.g. for v = [v1, v2, v3] where v1 = [1; 2; 3], v2 = [2; 3; 4], v3 = [3; 4; 5], then T*v will give you a 3x3 matrix u, where each column corresponds to a transformed vector of x,y,z coordinates.
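As a quick illustration of the point above, here is a small MATLAB sketch using the three vectors from the text. The matrix T is just an assumed example (a 90-degree rotation about the z-axis), not anything derived from your data.

```matlab
% Assumed example transformation: 90-degree rotation about the z-axis.
T = [0 -1 0;
     1  0 0;
     0  0 1];

% Each column of v is one point [x; y; z]: v1, v2 and v3 from the text.
v = [1 2 3;
     2 3 4;
     3 4 5];

% A single matrix product transforms all three columns at once.
u = T * v;

% Column k of u is the same as transforming column k of v on its own.
u1 = T * v(:, 1);

disp(u);
```

Here u comes out as [-2 -3 -4; 1 2 3; 3 4 5], i.e. the first column [-2; 1; 3] is the transformed v1.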
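Now, suppose the transformation matrix T is unknown and we want to discover it. Say we have a known point p = [x; y; z] and we know that after the transformation it becomes a known point p' = [x'; y'; z']. We have:

$$\begin{bmatrix} x' \\ y' \\ z' \end{bmatrix} = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix}$$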

Consider the top row; even if you know p and p', it should be clear that you cannot determine a, b, and c from a single point. You have three unknowns and only one equation. Therefore to solve for a, b, and c, you need at least a system of three equations. The same applies for the other two rows. Therefore, to find the transformation matrix T you need three known points (before and after transformation).
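Written out for the top row, with three point pairs, the system looks like this. The subscripts 1 to 3 are notation introduced here to tell the three points apart.

$$
\begin{aligned}
x'_1 &= a\,x_1 + b\,y_1 + c\,z_1 \\
x'_2 &= a\,x_2 + b\,y_2 + c\,z_2 \\
x'_3 &= a\,x_3 + b\,y_3 + c\,z_3
\end{aligned}
$$

That is three equations in the three unknowns a, b, and c. The middle and bottom rows of T give two more systems of the same shape, using the y' and z' values instead.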
In MATLAB, you can solve such a system of equations where T*v = u, by typing T = u/v. For a 3x3 transformation matrix T, u and v need to contain at least 3 vectors (linearly independent ones, otherwise there is no unique solution), but they can contain more (i.e. the system of equations is overdetermined), in which case u/v returns the least-squares solution. The more vectors you pass in, the less the resulting matrix is affected by errors in any individual point. But in theory you only need three.
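Here is a minimal sketch of that solve. T_true and all the point values are assumptions made up so the result can be checked; in practice u would come from your landmarks rather than from a known matrix. Note that the example vectors [1; 2; 3], [2; 3; 4], [3; 4; 5] used earlier are linearly dependent (v1 + v3 = 2*v2), so they would not work here.

```matlab
% Assumed "true" transformation, used only to generate test data.
T_true = [0 -1 0;
          1  0 0;
          0  0 2];

% Three linearly independent points, one per column.
v = [1 0 2;
     0 1 1;
     0 0 3];
u = T_true * v;

% Recover T from the point pairs: solves T * v = u.
T = u / v;

% With more than three points the system is overdetermined and
% u / v returns the least-squares solution.
v_many = [v, [1; 1; 1], [2; -1; 0.5]];
u_many = T_true * v_many;
T_ls = u_many / v_many;

% Both errors should be zero up to rounding.
fprintf('error (3 points): %g\n', norm(T - T_true));
fprintf('error (5 points): %g\n', norm(T_ls - T_true));
```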
If your transformation also involves a translation element, then you need to do the trick described in the image you posted. I.e. you represent a 3D vector [x,y,z] as a homogeneous-coordinates vector [x,y,z,1]. This enables you to add a 4th column in your transformation matrix, which results in a 'translation' for each point, i.e. adding an extra value to the new x', y' and z' coordinates, which is independent of the input vector. Since the translation coefficients are also unknown, you now have 12 instead of 9 unknowns, and therefore you need 4 points (not all lying in the same plane) to solve this system, i.e.

$$\begin{bmatrix} x' \\ y' \\ z' \\ 1 \end{bmatrix} = \begin{bmatrix} a & b & c & t_x \\ d & e & f & t_y \\ g & h & i & t_z \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}$$
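A rough MATLAB sketch of the homogeneous version follows. The landmark values and T_true are hypothetical, made up so the recovered matrix can be compared against a known answer; with real data the landmarks in B are picked by hand, not computed.

```matlab
% Assumed ground truth: rotate 90 degrees about z, then shift by (10, -5, 2).
T_true = [0 -1 0 10;
          1  0 0 -5;
          0  0 1  2;
          0  0 0  1];

% Four non-coplanar landmarks in A, one per column (hypothetical values).
pA = [0 1 0 0;
      0 0 1 0;
      0 0 0 1];

% Append a row of ones to get homogeneous coordinates [x; y; z; 1].
vA = [pA; ones(1, size(pA, 2))];

% Matching landmarks in B (generated here only so the result can be checked).
vB = T_true * vA;

% Solve T * vA = vB for the 4x4 matrix T.
T = vB / vA;

disp(T);
```

The last column of T holds the translation, and the bottom row comes out as [0 0 0 1] because the bottom row of both vA and vB is all ones.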

To summarise:

To transform your image A to occupy the same space as B, interpret the coordinates of A as if they were in the same coordinate system as B, find four equivalent landmarks in both, and obtain a suitable transformation matrix as above by solving this system of equations using the right matrix division operator (/). You can then use the transformation matrix T you found to transform all the coordinates in A (expressed as homogeneous coordinates) to the new ones.
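The last step might look something like the sketch below. The matrix T here is a made-up stand-in for whatever you obtain from your landmarks, and the 4x3x2 volume size is an assumption chosen to keep the example small.

```matlab
% Hypothetical 4x4 transform, standing in for the T found from landmarks.
T = [1 0 0  5;
     0 1 0 -3;
     0 0 1  2;
     0 0 0  1];

% Coordinates of every voxel in a small 4x3x2 volume (assumed size).
[x, y, z] = ndgrid(1:4, 1:3, 1:2);

% One homogeneous column [x; y; z; 1] per voxel.
coordsA = [x(:)'; y(:)'; z(:)'; ones(1, numel(x))];

% Transform all coordinates in one product, then drop the row of ones.
coordsB = T * coordsA;
coordsB = coordsB(1:3, :);

disp(size(coordsB));
```

This only moves the coordinates. Resampling the intensities of A onto the grid of B is a separate interpolation step that is not shown here.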