dictionary][ SGD ( Stochastic Gradient Descent )

Source 1: http://shuuki4.github.io/deep%20learning/2016/05/20/Gradient-Descent-Algorithm-Overview.html

Summary of Gradient Descent Optimization Algorithms

The weights of a neural network are usually adjusted with a method called 'Gradient Descent'. Writing the network's parameters as $\theta$, the method uses the gradient $\nabla_{\theta} J(\theta)$ to minimize the value of the loss function $J(\theta)$, which defines the difference between the output the network produces and the actual target output. Gradient Descent…

shuuki4.github.io
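As a compact restatement of the excerpt above, one gradient descent step is commonly written as follows. The learning rate $\eta$ is a symbol assumed here for illustration; it does not appear in the excerpt.

$$\theta \leftarrow \theta - \eta \, \nabla_{\theta} J(\theta)$$

Because $\nabla_{\theta} J(\theta)$ points in the direction in which the loss increases fastest, subtracting a small multiple of it moves $\theta$ toward lower loss.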

Instead of computing the loss function over the entire data set (the full batch), SGD computes the loss function only on a small subset of the data (a mini-batch) at each update.
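The idea above can be sketched in a few lines of plain Python. This is a minimal illustration and not taken from the source post: the linear model, the toy data, the function name `minibatch_sgd`, and every hyperparameter value (`lr`, `batch_size`, `epochs`) are assumptions chosen only to keep the example small.

```python
import random

def minibatch_sgd(xs, ys, lr=0.05, batch_size=4, epochs=200, seed=0):
    """Fit y ~ w*x + b by minimizing mean squared error with mini-batch SGD.

    Hypothetical helper for illustration; all defaults are assumptions.
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the data in a new random order each epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            # Gradient of the loss computed on this mini-batch only,
            # not on the whole data set.
            grad_w = sum(2 * (w * xs[i] + b - ys[i]) * xs[i] for i in batch) / len(batch)
            grad_b = sum(2 * (w * xs[i] + b - ys[i]) for i in batch) / len(batch)
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b

# Toy data generated from y = 3x + 1 (noise-free, purely illustrative).
xs = [i / 10 for i in range(20)]
ys = [3 * x + 1 for x in xs]
w, b = minibatch_sgd(xs, ys)
print(w, b)
```

Each parameter update here looks at only `batch_size` samples, so one pass over the data produces several updates instead of one.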

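Drawback: somewhat inaccurate, because the gradient from a mini-batch is only a noisy estimate of the gradient over the full data set.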

Advantage: fast, because each update needs only a small part of the data.
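The trade-off can be seen by comparing the gradient over the full data set with gradients from many small mini-batches. The sketch below is an illustration under assumed settings: the one-parameter model `y ~ w*x`, the synthetic data, the helper name `mse_gradient`, and the sizes (1000 samples, mini-batches of 10) are all hypothetical.

```python
import random

def mse_gradient(w, xs, ys, indices):
    """Gradient of mean squared error for the model y ~ w*x over the given indices."""
    return sum(2 * (w * xs[i] - ys[i]) * xs[i] for i in indices) / len(indices)

rng = random.Random(0)
xs = [rng.uniform(-1, 1) for _ in range(1000)]
ys = [2 * x + rng.gauss(0, 0.1) for x in xs]

w = 0.0
all_indices = list(range(len(xs)))

# Full-batch gradient: touches all 1000 samples for a single update.
full_grad = mse_gradient(w, xs, ys, all_indices)

# Mini-batch gradients: each touches only 10 samples, so it is cheaper but noisy.
batch_grads = [
    mse_gradient(w, xs, ys, rng.sample(all_indices, 10)) for _ in range(2000)
]
mean_batch_grad = sum(batch_grads) / len(batch_grads)
spread = max(batch_grads) - min(batch_grads)

print("full gradient:       ", full_grad)
print("mean of mini-batches:", mean_batch_grad)
print("mini-batch spread:   ", spread)
```

In this setup, individual mini-batch gradients scatter widely around the full gradient, while their average stays close to it. That is the sense in which a mini-batch step is less accurate but much cheaper.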

Attribution, NonCommercial, NoDerivatives


장경칩연구소 : [ Black Falcon 대장 ]
