Title: Merging of Neural Networks
Supervisor: Mgr. Vladimír Boža, PhD.
Keywords: merging neural networks, L0 regularization, neural network pruning, knowledge distillation
Abstract: We propose two simple strategies for merging two trained neural networks (teachers) into one network (student) of the same size. We use relaxed L0 regularization to identify the essential parts of the teachers. In the first strategy, we train the student to mimic the teachers layer by layer. In the second, we create the student by concatenating the teachers along the channel dimension and pruning the unimportant parts. These strategies can be applied whenever two trained models are already available, but we also show that training two teachers and then merging them outperforms classical training within the same number of epochs. We compare our strategy (train two teachers and merge them) with the best-of-three (bo3) strategy (train three models, pick the best) and the one-model strategy (spend the whole training budget on a single model). In all performed experiments, our strategy achieves significantly better results.
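The second strategy (concatenate teachers along the channel dimension, then prune back to the original size) can be sketched in a few lines. This is a minimal illustrative sketch, not the thesis implementation: the thesis scores channels with relaxed L0 gates learned during training, whereas here a simple weight-magnitude score stands in as a proxy, and a single linear layer stands in for a full network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two "teacher" layers of the same shape (out_channels x in_channels).
W1 = rng.normal(size=(4, 3))
W2 = rng.normal(size=(4, 3))

# Step 1: concatenate the teachers along the output (channel) dimension,
# producing a double-width merged layer.
W_cat = np.concatenate([W1, W2], axis=0)   # shape (8, 3)

# Step 2: score each output channel. The thesis learns relaxed L0 gates
# for this; a magnitude score is used here only as a stand-in.
scores = np.abs(W_cat).sum(axis=1)

# Step 3: keep the top-k channels so the student matches the teacher size.
k = W1.shape[0]
keep = np.sort(np.argsort(scores)[-k:])
W_student = W_cat[keep]                    # shape (4, 3)
```

Note that in a multi-layer network, pruning the output channels of one layer also requires removing the corresponding input channels of the next layer.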

Master's thesis files:

Diplomová práca Martin Pašen.pdf

Defense presentation files:

Martin Pašen.pdf