I am extracting features from pictures using a pretrained CNN. Would it make sense to use those features as inputs for a new CNN/NN? Has it been done before?
This is called finetuning. It is very commonly used. Usually, one deletes the last few layers of a VGG or a similar network, adds layers which suit to the task and trains the network on the new data.
See: