Deep Representation Learning: Running time analysis

Measuring time

The main purpose of this project is to speed up the architecture selection process. To make the experiments comparable, I ran all the models on the same GPU with the same optimization method (RMSProp) and the same hyperparameters.
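For reference, per-epoch timing of this kind can be sketched roughly as follows. This is a minimal illustration and not the code used for the experiments: `run_epoch` and `sync` are hypothetical callables standing in for whatever the training framework provides. The optional `sync` hook is there because GPU work is often queued asynchronously, so the clock should only be read once the device has finished.

```python
import statistics
import time


def time_epochs(run_epoch, n_epochs, sync=None):
    """Return the wall-clock duration in seconds of each call to run_epoch.

    run_epoch: hypothetical zero-argument callable that trains for one epoch.
    sync: optional callable that blocks until queued GPU work has finished.
    """
    durations = []
    for _ in range(n_epochs):
        if sync is not None:
            sync()  # make sure earlier work is not billed to this epoch
        start = time.perf_counter()
        run_epoch()
        if sync is not None:
            sync()  # wait for this epoch's work before stopping the clock
        durations.append(time.perf_counter() - start)
    return durations


def epoch_time_stats(durations):
    """Summarize epoch times; the median is less sensitive to outliers than the mean."""
    return {
        "median": statistics.median(durations),
        "min": min(durations),
        "max": max(durations),
    }
```

Timing several epochs and reporting the median, rather than a single epoch, makes the number less sensitive to whatever else is running on the machine.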

Results

The model I used for this task is quite small, and most of its parameters belong to the fully connected layer. Nevertheless, keeping the weights of the first layers fixed at their random initial values helps to speed up training.
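To make the idea of fixed random layers concrete, here is a small sketch in plain Python. It is an illustration under simplifying assumptions, not the model from the experiments: each layer is a flat list of weights, the layer sizes are toy values, and the update is plain SGD rather than RMSProp. The point is only that the first `n_fixed` layers keep their random initial weights and are skipped by the update.

```python
import random


def init_layers(sizes, seed=0):
    """Create one flat list of random weights per layer (toy representation)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 0.1) for _ in range(n)] for n in sizes]


def sgd_step(layers, grads, n_fixed, lr=0.01):
    """Update every layer in place except the first n_fixed ones.

    Plain SGD is used here for brevity; the experiments used RMSProp.
    """
    for layer, grad in zip(layers[n_fixed:], grads[n_fixed:]):
        for i, g in enumerate(grad):
            layer[i] -= lr * g


def count_trained(layers, n_fixed):
    """Number of parameters that the optimizer actually updates."""
    return sum(len(layer) for layer in layers[n_fixed:])
```

For example, with `layers = init_layers([3, 4, 5])` and `n_fixed=1`, `count_trained` gives 9 and a call to `sgd_step` leaves the first layer untouched. The fixed layers need no weight gradients and no update, which is where the time saving comes from.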

I first ran the experiments with 2-5 fixed layers and then, about a week later, ran the reference model and the model with 1 fixed layer. Something seems to have changed during that week, because the two later models ran much faster. I concluded that for models this small the running time also depends on other processes. The table below lists the number of trained parameters and the time per epoch for each model.

Fixed layers	Trained parameters	Time per epoch, s
reference	486,000	237
1	484,000	204
2	482,000	347
3	470,000	238
4	434,000	218
5	360,000	183
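The numbers in the table can be checked with a little arithmetic. The sketch below copies the rows, assuming the last row has 360,000 trained parameters, and derives two things: the approximate size of each fixed layer (the difference between consecutive rows) and each model's epoch time relative to the reference. The function names are mine and purely illustrative.

```python
# Rows copied from the table: (fixed layers, trained parameters, seconds per epoch).
# The last row is assumed to be 360,000 trained parameters.
ROWS = [
    (0, 486_000, 237),  # reference model
    (1, 484_000, 204),
    (2, 482_000, 347),
    (3, 470_000, 238),
    (4, 434_000, 218),
    (5, 360_000, 183),
]


def layer_sizes(rows):
    """Parameters in each fixed layer, from differences between consecutive rows."""
    return [prev[1] - cur[1] for prev, cur in zip(rows, rows[1:])]


def relative_to_reference(rows):
    """For each row: (fixed layers, fraction of parameters fixed, time / reference time)."""
    ref_params, ref_time = rows[0][1], rows[0][2]
    return [
        (n_fixed, 1 - params / ref_params, seconds / ref_time)
        for n_fixed, params, seconds in rows
    ]


if __name__ == "__main__":
    print(layer_sizes(ROWS))
    for n_fixed, frac_fixed, rel_time in relative_to_reference(ROWS):
        print(f"{n_fixed} fixed: {frac_fixed:.1%} of parameters fixed, "
              f"{rel_time:.2f}x reference time")
```

Even with all five layers fixed, only about a quarter of the parameters stop being trained, which is consistent with the fully connected layer holding most of the parameters.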

Conclusions

Fixed random features help to speed up learning
The impact is small for small models
Optimization may be more difficult for models with fixed weights

Thursday, 9 April 2015
