Example of a linear model adapted by gradient descent

(We could solve this directly by least squares, but gradient descent can be extended to nonlinear models and does not require a matrix inverse.)

 

Fundamental gradient descent rule:
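The rule itself is not reproduced in this copy. A common form is sketched here under two assumptions that are not stated in the source: the model is linear, yn = wᵀx, and the criterion is the squared error Q = ½e². The symbols k (sample index), w (weight vector) and x (input vector) are also assumed names.

```latex
\begin{aligned}
e(k) &= y(k) - y_n(k), \qquad y_n(k) = \mathbf{w}(k)^{\top}\mathbf{x}(k),\\
Q(k) &= \tfrac{1}{2}\,e(k)^{2},\\
\mathbf{w}(k+1) &= \mathbf{w}(k) - \mu\,\frac{\partial Q(k)}{\partial \mathbf{w}}
                 = \mathbf{w}(k) + \mu\,e(k)\,\mathbf{x}(k).
\end{aligned}
```

The sign flips in the last step because ∂e/∂w = −x for this model.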

 



Single run of adaptation on longer data (1000 samples): the model approaches the real value

μ=.1

error e = y - yn (real output y minus model output yn)
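A single pass over 1000 samples with μ = 0.1 could look roughly like the sketch below. It assumes a linear model yn = w·x updated sample by sample with w ← w + μ·e·x. The data, the "true" weights `true_w`, the zero initial weights and the function name `adapt_single_run` are all hypothetical and only serve the illustration.

```python
import random

def adapt_single_run(xs, ys, mu):
    """One sample-by-sample pass of w <- w + mu * e * x for the model yn = w . x."""
    w = [0.0] * len(xs[0])                              # zero initial weights (assumption)
    errors = []
    for x, y in zip(xs, ys):
        yn = sum(wi * xi for wi, xi in zip(w, x))       # model output
        e = y - yn                                      # error e = y - yn
        w = [wi + mu * e * xi for wi, xi in zip(w, x)]  # gradient descent step
        errors.append(e)
    return w, errors

# Hypothetical noise-free data: y = 2*1 - 1*u, with u uniform in [-1, 1].
random.seed(0)
true_w = [2.0, -1.0]
xs = [[1.0, random.uniform(-1.0, 1.0)] for _ in range(1000)]
ys = [sum(t * xi for t, xi in zip(true_w, x)) for x in xs]

w, errors = adapt_single_run(xs, ys, mu=0.1)
print("adapted weights:", w)
print("first and last error:", errors[0], errors[-1])
```

On data like this, the adapted weights end up close to `true_w` and the last error is much smaller than the first one.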

 

model

 


 

Adaptation in Epochs

epochs=10

μ =.01

When the learning rate μ is chosen too small, the adaptation needs more epochs (if it converges for the data at all).
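Adaptation in epochs repeats the same pass over a short data set several times. The sketch below uses the settings named on these slides (μ = 0.01 with 10 and 20 epochs, μ = 0.1 with 3 epochs). The 21-sample data set, `true_w`, the zero initial weights and the helper names `train_epochs` and `distance` are hypothetical.

```python
def train_epochs(xs, ys, mu, epochs):
    """Repeat the sample-by-sample update w <- w + mu * e * x for several epochs."""
    w = [0.0] * len(xs[0])                    # zero initial weights (assumption)
    sse_per_epoch = []
    for _ in range(epochs):
        sse = 0.0
        for x, y in zip(xs, ys):
            yn = sum(wi * xi for wi, xi in zip(w, x))       # model output
            e = y - yn                                      # error e = y - yn
            w = [wi + mu * e * xi for wi, xi in zip(w, x)]  # gradient descent step
            sse += e * e
        sse_per_epoch.append(sse)             # sum of squared errors in this epoch
    return w, sse_per_epoch

def distance(w, target):
    """Euclidean distance between adapted and assumed weights."""
    return sum((a - b) ** 2 for a, b in zip(w, target)) ** 0.5

# Hypothetical short, noise-free data set: 21 samples of y = 2*1 - 1*u.
true_w = [2.0, -1.0]
xs = [[1.0, k / 10.0 - 1.0] for k in range(21)]
ys = [sum(t * xi for t, xi in zip(true_w, x)) for x in xs]

results = {}
for mu, epochs in [(0.01, 10), (0.01, 20), (0.1, 3)]:
    results[(mu, epochs)] = train_epochs(xs, ys, mu, epochs)
    w, sse = results[(mu, epochs)]
    print(f"mu={mu}, epochs={epochs}: distance to true_w = {distance(w, true_w):.4f}, "
          f"last-epoch SSE = {sse[-1]:.6f}")
```

With the learning rate fixed at 0.01, the 20-epoch run ends closer to the assumed weights than the 10-epoch run, which is the effect described above.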


epochs=20

μ =.01


epochs=3

μ =.1


When the learning rate μ is chosen too big, the adaptation becomes unstable.

μ =1 !

epochs=10
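The instability can be reproduced in a small sketch. For the update w ← w + μ·e·x, the error on the current sample after the step is (1 − μ·‖x‖²)·e, so a step overshoots and grows the error whenever μ·‖x‖² > 2. The one-weight data below are hypothetical and chosen so that this holds for every sample at μ = 1 but not at μ = 0.1; the function name `adapt` is also an assumption.

```python
def adapt(xs, ys, mu, epochs):
    """Run w <- w + mu * e * x over the data for the given number of epochs."""
    w = [0.0] * len(xs[0])                    # zero initial weights (assumption)
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            yn = sum(wi * xi for wi, xi in zip(w, x))       # model output
            e = y - yn                                      # error e = y - yn
            w = [wi + mu * e * xi for wi, xi in zip(w, x)]  # gradient descent step
    return w

# Hypothetical data: one weight, y = 2*x, inputs from 1.5 to 3.0,
# so x**2 lies between 2.25 and 9 for every sample.
xs = [[1.5 + 0.25 * k] for k in range(7)]
ys = [2.0 * x[0] for x in xs]

w_stable = adapt(xs, ys, mu=0.1, epochs=10)    # mu * x**2 < 2 for all samples
w_unstable = adapt(xs, ys, mu=1.0, epochs=10)  # mu * x**2 > 2 for all samples

print("mu = 0.1:", w_stable)
print("mu = 1.0:", w_unstable)
```

With μ = 0.1 the weight settles at the assumed value 2, while with μ = 1 its magnitude grows with every update. The threshold depends on the input scale, so a value of μ that diverges here may still be usable on data with smaller inputs.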