I am trying to figure out a good architecture for a neural network that takes projections (2D images) from different angles and creates a volume consisting of 2D slices (CT-like).
So for example:
I have ground truth volumes.
I came up with the idea of using a ResNet as the encoder, but I'm not sure how to implement the decoder or what model would be a good choice for this kind of problem. I did think of a U-Net architecture, but the output dimensions differ from the input, so I abandoned the idea.
I am using PyTorch.
Specifying the whole network is out of scope of a single answer, but generally you want something like this:
ConvTranspose3d layers to upsample the latent tensor to the desired output size.

You can do a UNet-like setup where you have skip connections between encoder layers and decoder layers; you would just need a projection layer to map the encoder activations into a shape compatible with the decoder activations.
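To make the idea above concrete, here is a minimal sketch rather than a tuned architecture. It assumes the projections are stacked along the channel axis, uses a small strided-conv stack as a stand-in for the ResNet encoder, and the names `ProjectionsToVolume`, `n_views`, `base`, `depth0`, `to3d` and `skip_proj` are all made up for illustration. The skip projection is a 1x1 `Conv2d` that expands channels and is then reshaped so the extra channels become the depth axis.

```python
import torch
import torch.nn as nn


class ProjectionsToVolume(nn.Module):
    """Sketch: (B, n_views, H, W) projections -> (B, D, H, W) volume."""

    def __init__(self, n_views=8, base=16, depth0=4):
        super().__init__()
        self.base, self.depth0 = base, depth0

        def down(cin, cout):
            # Strided 2D conv halves H and W (stand-in for a ResNet stage).
            return nn.Sequential(
                nn.Conv2d(cin, cout, kernel_size=3, stride=2, padding=1),
                nn.ReLU(inplace=True),
            )

        # 2D encoder: H -> H/2 -> H/4 -> H/8
        self.enc1 = down(n_views, base)
        self.enc2 = down(base, base * 2)
        self.enc3 = down(base * 2, base * 4)

        # Lift the 2D latent to 3D: extra channels are reshaped into depth.
        self.to3d = nn.Conv2d(base * 4, base * 4 * depth0, kernel_size=1)

        # 3D decoder: each ConvTranspose3d doubles D, H and W.
        self.up1 = nn.ConvTranspose3d(base * 4, base * 2, kernel_size=2, stride=2)
        self.up2 = nn.ConvTranspose3d(base * 2, base, kernel_size=2, stride=2)
        self.up3 = nn.ConvTranspose3d(base, 1, kernel_size=2, stride=2)

        # Projection for one skip connection: maps the 2D enc2 activations
        # to the channel count and depth of the output of up1.
        self.skip_proj = nn.Conv2d(base * 2, base * 2 * depth0 * 2, kernel_size=1)

    def forward(self, x):
        b = x.shape[0]
        e1 = self.enc1(x)
        e2 = self.enc2(e1)
        e3 = self.enc3(e2)

        # (B, C*D, h, w) -> (B, C, D, h, w)
        z = self.to3d(e3)
        z = z.reshape(b, self.base * 4, self.depth0, z.shape[-2], z.shape[-1])

        d1 = torch.relu(self.up1(z))

        # Skip connection: project 2D features to 3D and add them.
        s = self.skip_proj(e2)
        s = s.reshape(b, self.base * 2, self.depth0 * 2, s.shape[-2], s.shape[-1])
        d1 = d1 + s

        d2 = torch.relu(self.up2(d1))
        out = self.up3(d2)       # (B, 1, D, H, W)
        return out.squeeze(1)    # (B, D, H, W): a stack of 2D slices
```

With these assumed settings, a batch of shape `(2, 8, 32, 32)` comes out as `(2, 32, 32, 32)`: the output depth is `depth0 * 8`, so `depth0` and the number of upsampling stages need to be chosen to match the ground truth volumes. Training against those volumes could use a voxel-wise loss such as `nn.MSELoss`, though that choice is also an assumption.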