12 - Deep Learning / ClipID:21061

Recording date: 2020-10-11
Course: Deep Learning
Language: English
Organisational Unit: Friedrich-Alexander-Universität Erlangen-Nürnberg
Producer: Friedrich-Alexander-Universität Erlangen-Nürnberg

Deep Learning - Loss and Optimization Part 3

This video discusses the details of optimization and several variants of the gradient descent procedure, such as momentum and Adam.

Video References:
Lex Fridman's Channel

References

[1] Christopher M. Bishop. Pattern Recognition and Machine Learning (Information Science and Statistics). Secaucus, NJ, USA: Springer-Verlag New York, Inc., 2006.
[2] Anna Choromanska, Mikael Henaff, Michael Mathieu, et al. “The Loss Surfaces of Multilayer Networks”. In: AISTATS. 2015.
[3] Yann N. Dauphin, Razvan Pascanu, Caglar Gulcehre, et al. “Identifying and attacking the saddle point problem in high-dimensional non-convex optimization”. In: Advances in Neural Information Processing Systems. 2014, pp. 2933–2941.
[4] Yichuan Tang. “Deep learning using linear support vector machines”. In: arXiv preprint arXiv:1306.0239 (2013).
[5] Sashank J. Reddi, Satyen Kale, and Sanjiv Kumar. “On the Convergence of Adam and Beyond”. In: International Conference on Learning Representations. 2018.
[6] Katarzyna Janocha and Wojciech Marian Czarnecki. “On Loss Functions for Deep Neural Networks in Classification”. In: arXiv preprint arXiv:1702.05659 (2017).
[7] Jeffrey Dean, Greg Corrado, Rajat Monga, et al. “Large scale distributed deep networks”. In: Advances in Neural Information Processing Systems. 2012, pp. 1223–1231.
[8] Maren Mahsereci and Philipp Hennig. “Probabilistic line searches for stochastic optimization”. In: Advances in Neural Information Processing Systems. 2015, pp. 181–189.
[9] Jason Weston, Chris Watkins, et al. “Support vector machines for multi-class pattern recognition”. In: ESANN. Vol. 99. 1999, pp. 219–224.
[10] Chiyuan Zhang, Samy Bengio, Moritz Hardt, et al. “Understanding deep learning requires rethinking generalization”. In: arXiv preprint arXiv:1611.03530 (2016).

Further Reading:
A gentle Introduction to Deep Learning
