Previous work

Several works have reported that a model with fixed convolutional weights performs not much worse than one in which the whole model is trained; Jarrett et al., 2009 is one example. Later, Saxe et al. investigated this phenomenon from a theoretical point of view and concluded that fine-tuning the fully connected layers has the biggest effect on training.
It is very tempting to use fixed weights for hyperparameter search, since training a model with fixed weights should be easier and faster. I am going to run several experiments to find the advantages and drawbacks of this approach and to analyze the behaviour of the training procedure in this setup.
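To make the idea of "fixed weights" concrete, here is a minimal toy sketch in plain Python. It is not the setup used in the experiments: the 1D signals, the random filters, the logistic-regression readout standing in for the fully connected layers, and every function name (`features`, `train_readout`, `run_experiment`) are assumptions made purely for illustration. The convolutional filters are drawn once and never updated; only the readout is trained.

```python
import math
import random

def conv1d_valid(signal, kernel):
    """Valid 1D cross-correlation of a signal with a kernel."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def features(signal, kernels):
    """Fixed feature extractor: convolution -> ReLU -> global average pooling."""
    feats = []
    for kernel in kernels:
        response = [max(0.0, v) for v in conv1d_valid(signal, kernel)]
        feats.append(sum(response) / len(response))
    return feats

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(weights, bias, x):
    return sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)

def log_loss(weights, bias, X, y):
    """Mean binary cross-entropy of the readout on features X."""
    eps = 1e-12
    total = 0.0
    for x, t in zip(X, y):
        p = predict(weights, bias, x)
        total -= t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
    return total / len(X)

def train_readout(X, y, lr=0.1, epochs=200):
    """Full-batch gradient descent on the readout only; the filters are not touched."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, t in zip(X, y):
            err = predict(weights, bias, x) - t
            for i, v in enumerate(x):
                grad_w[i] += err * v
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def make_dataset(rng, n_per_class=20, length=32):
    """Toy data (assumption): class 0 is a slow sine wave, class 1 a fast one."""
    signals, labels = [], []
    for label, cycles in ((0, 1), (1, 8)):
        for _ in range(n_per_class):
            phase = rng.uniform(0.0, 2.0 * math.pi)
            signals.append([math.sin(2.0 * math.pi * cycles * i / length + phase)
                            + rng.gauss(0.0, 0.1) for i in range(length)])
            labels.append(label)
    return signals, labels

def run_experiment(seed=0, n_kernels=4, kernel_size=3):
    rng = random.Random(seed)
    # Random filters, drawn once and kept fixed for the whole run.
    kernels = [[rng.gauss(0.0, 1.0) for _ in range(kernel_size)]
               for _ in range(n_kernels)]
    kernels_before = [list(k) for k in kernels]
    signals, labels = make_dataset(rng)
    X = [features(s, kernels) for s in signals]
    initial_loss = log_loss([0.0] * n_kernels, 0.0, X, labels)
    weights, bias = train_readout(X, labels)
    final_loss = log_loss(weights, bias, X, labels)
    return initial_loss, final_loss, kernels_before, kernels

if __name__ == "__main__":
    initial, final, _, _ = run_experiment()
    print(f"readout loss before training: {initial:.4f}, after: {final:.4f}")
```

Because the filters never change, the features can be computed once and reused for every hyperparameter setting of the readout, which is where the hoped-for speed-up in a hyperparameter search would come from in a sketch like this.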