A Psychological Interpretation of Deep Learning (Special Lecture in Psychology IIIA)

Shin Asakawa, all rights reserved.
Date: 03/Jul/2020
Apache 2.0 license


Session 09: Machine translation, text summarization, transfer learning, multimodal learning, multi-task learning

Multi-task learning, transfer learning

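  • Being able to apply what one has learned is arguably a measure of intelligence.

For example, in the movie The Karate Kid (1984), Mr. Miyagi taught Daniel-san how to wax cars and clean floors :-) The punchline is that waxing and floor scrubbing turned out to be skills needed for learning karate techniques.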

Exercise file



Hard parameter sharing


Left: multi-task learning; right: transfer learning. Both from Sebastian Ruder's blog.
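A minimal sketch of hard parameter sharing, assuming the usual arrangement in which all tasks share the hidden layers and each task keeps its own output layer. The weights, the task names "a" and "b", and the helper functions are hypothetical and chosen only so the arithmetic is easy to follow; nothing here is trained.

```python
def linear(weights, bias, x):
    """Affine map: one output value per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


# Shared hidden layer: the same parameters are used by every task.
SHARED_W = [[1.0, -1.0], [0.5, 0.5]]
SHARED_B = [0.0, 0.0]

# Task-specific output heads (hypothetical tasks "a" and "b").
HEADS = {
    "a": ([[1.0, 0.0]], [0.0]),
    "b": ([[0.0, 2.0]], [1.0]),
}


def forward(x, task):
    """Run the shared layer, then the head that belongs to `task`."""
    hidden = relu(linear(SHARED_W, SHARED_B, x))
    head_w, head_b = HEADS[task]
    return linear(head_w, head_b, hidden)


out_a = forward([2.0, 1.0], "a")
out_b = forward([2.0, 1.0], "b")
```

Both calls pass through the same shared layer, so during training a gradient from either task would update SHARED_W and SHARED_B, while each head would only receive gradients from its own task.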


Soft parameter sharing

In soft parameter sharing, on the other hand, each task has its own model with its own parameters. The distance between the parameters of the models is then regularized in order to encourage the parameters to be similar. [8], for instance, use the ℓ2 norm for regularization, while [9] use the trace norm.

  • [8]: Duong, L., Cohn, T., Bird, S., & Cook, P. (2015). Low Resource Dependency Parsing: Cross-lingual Parameter Sharing in a Neural Network Parser. Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Short Papers), 845–850.
  • [9]: Yang, Y., & Hospedales, T. M. (2017). Trace Norm Regularised Deep Multi-Task Learning. In Workshop track - ICLR 2017. Retrieved from http://arxiv.org/abs/1606.04038
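The regularization idea can be sketched as follows, assuming two tasks whose parameters have been flattened into equal-length lists and a squared Euclidean (L2) distance as the penalty. The function names and the `strength` coefficient are illustrative assumptions, not the formulation of any of the cited papers.

```python
def l2_sharing_penalty(params_a, params_b):
    """Sum of squared differences between corresponding parameters."""
    if len(params_a) != len(params_b):
        raise ValueError("both models need the same number of parameters")
    return sum((a - b) ** 2 for a, b in zip(params_a, params_b))


def soft_sharing_loss(loss_a, loss_b, params_a, params_b, strength=0.1):
    """Both task losses plus the weighted distance penalty."""
    penalty = l2_sharing_penalty(params_a, params_b)
    return loss_a + loss_b + strength * penalty


# Each task keeps its own parameter vector (flattened for simplicity).
task_a_params = [1.0, 2.0, 3.0]
task_b_params = [1.0, 2.5, 2.0]

penalty = l2_sharing_penalty(task_a_params, task_b_params)
total = soft_sharing_loss(0.5, 0.7, task_a_params, task_b_params)
```

A larger `strength` pulls the two parameter sets closer together; with `strength=0` the two models are trained independently.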


Recent work on MTL for Deep Learning

Deep Relationship Networks

A Deep Relationship Network with shared convolutional and task-specific fully connected layers with matrix priors (Long and Wang, 2015).

  • Long, M., & Wang, J. (2015). Learning Multiple Tasks with Deep Relationship Networks. arXiv Preprint arXiv:1506.02117. Retrieved from http://arxiv.org/abs/1506.02117

Fully-Adaptive Feature Sharing


The widening procedure for fully-adaptive feature sharing (Lu et al., 2016).

  • Lu, Y., Kumar, A., Zhai, S., Cheng, Y., Javidi, T., & Feris, R. (2016). Fully-adaptive Feature Sharing in Multi-Task Networks with Applications in Person Attribute Classification. Retrieved from http://arxiv.org/abs/1611.05377


Cross-stitch Networks


Cross-stitch networks for two tasks (Misra et al., 2016).

  • Misra, I., Shrivastava, A., Gupta, A., & Hebert, M. (2016). Cross-stitch Networks for Multi-task Learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. https://doi.org/10.1109/CVPR.2016.433
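As a rough illustration, a cross-stitch unit for two tasks can be viewed as a 2x2 linear mixing of the two networks' activations at the same layer. The sketch below assumes fixed mixing weights and plain Python lists as activations; in the paper the mixing weights are learned together with the networks, and the function name and values here are hypothetical.

```python
def cross_stitch(x_a, x_b, alpha):
    """Mix the activations of task A and task B with a 2x2 matrix `alpha`.

    alpha[0] holds the weights used to build task A's output,
    alpha[1] the weights used to build task B's output.
    """
    (a_aa, a_ab), (a_ba, a_bb) = alpha
    out_a = [a_aa * xa + a_ab * xb for xa, xb in zip(x_a, x_b)]
    out_b = [a_ba * xa + a_bb * xb for xa, xb in zip(x_a, x_b)]
    return out_a, out_b


x_a = [1.0, 2.0]  # activations from task A's network
x_b = [3.0, 4.0]  # activations from task B's network

# Mostly task-specific, with a little sharing in each direction.
mixed_a, mixed_b = cross_stitch(x_a, x_b, [[0.9, 0.1], [0.2, 0.8]])

# The identity matrix means no sharing at all.
same_a, same_b = cross_stitch(x_a, x_b, [[1.0, 0.0], [0.0, 1.0]])
```

Larger off-diagonal entries mean that a task borrows more from the other task's representation at that layer.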


A Joint Many-Task Model


A Joint Many-Task Model (Hashimoto et al., 2016).


Weighting losses with uncertainty


Uncertainty-based loss function weighting for multi-task learning (Kendall et al., 2017).

  • Kendall, A., Gal, Y., & Cipolla, R. (2017). Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics. Retrieved from http://arxiv.org/abs/1705.07115
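A small sketch of the weighting idea, assuming the regression-style form in which each task loss L_i is scaled by 1/(2σ_i²) and a log σ_i term discourages the noise parameter from growing without bound. The code uses s_i = log σ_i² as the free parameter, a common choice for numerical stability; the function name and the sample values are assumptions made for illustration, and classification losses are not covered here.

```python
import math


def uncertainty_weighted_loss(task_losses, log_variances):
    """Combine task losses as sum_i 0.5 * exp(-s_i) * L_i + 0.5 * s_i.

    s_i = log(sigma_i ** 2), so exp(-s_i) = 1 / sigma_i ** 2
    and 0.5 * s_i = log(sigma_i).
    """
    total = 0.0
    for loss, s in zip(task_losses, log_variances):
        total += 0.5 * math.exp(-s) * loss + 0.5 * s
    return total


losses = [2.0, 0.5]

# With sigma_i = 1 for every task, each loss is simply halved.
equal = uncertainty_weighted_loss(losses, [0.0, 0.0])

# A larger variance for the first task shrinks that task's weight.
down_weighted = uncertainty_weighted_loss(losses, [math.log(2.0), 0.0])
```

In training, the s_i values would be optimized together with the network weights, so the relative weighting of the tasks is learned rather than hand-tuned.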


Sluice Networks


A sluice network for two tasks (Ruder et al., 2017).

  • Ruder, S., Bingel, J., Augenstein, I., & Søgaard, A. (2017). Sluice networks: Learning what to share between loosely related tasks. Retrieved from http://arxiv.org/abs/1705.08142