**Example of a linear model adapted by gradient descent**

(We could solve this directly with least squares, but gradient descent extends to nonlinear models and needs no matrix inversion.)

Fundamental gradient descent rule:

For a linear model with output yn = w·x, the weights are updated after each sample:

w(k+1) = w(k) + μ · e(k) · x(k),

where e = y − yn is the model error and μ is the learning rate.

**Single run of adaptation** on longer data
(1000 samples) – the model approaches the real value.

μ = 0.1

(Plot: model output vs. data over the samples; error e = y − yn.)
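The single run above can be sketched in NumPy. The synthetic system y = 2x + 1, the noise level, and the data range are assumptions for illustration; only μ = 0.1 and the 1000-sample length come from the text:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: the "real" system is assumed to be y = 2*x + 1 plus noise
N = 1000
x = rng.uniform(-1, 1, N)
y = 2.0 * x + 1.0 + 0.05 * rng.standard_normal(N)

w = np.zeros(2)      # model parameters [slope, intercept], start at zero
mu = 0.1             # learning rate

for k in range(N):   # single run: one pass over the data
    xk = np.array([x[k], 1.0])  # input extended with a bias term
    yn = w @ xk                 # model output
    e = y[k] - yn               # error e = y - yn
    w = w + mu * e * xk         # fundamental gradient descent update

print(w)             # approaches the real parameters [2, 1]
```

After the pass, w is close to the true parameters, illustrating how the model "approaches the real value" within a single run on long data.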

**Adaptation in Epochs**

epochs = 10

μ = 0.01

When the learning rate μ is chosen too small, the adaptation needs more
epochs (if it can work for the data at all).

epochs = 20

μ = 0.01

epochs = 3

μ = 0.1

When the learning rate μ is chosen too large, the adaptation becomes unstable.

μ = 1 (!)

epochs = 10
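The instability can be shown with a deterministic scalar sketch (all numbers here are assumptions for illustration, not the slide's data): for a model yn = w·x adapted on a repeated sample with x = 2, each update multiplies the error by (1 − μ·x²), so any μ above 2/x² = 0.5 makes the error grow instead of shrink:

```python
# Scalar illustration of stability: model yn = w*x, repeated sample x = 2
def run(mu, steps=10):
    w, x, y = 0.0, 2.0, 4.0    # the true weight is 2
    for _ in range(steps):
        e = y - w * x          # error e = y - yn
        w = w + mu * e * x     # gradient descent update
    return w

print(run(0.1))  # error shrinks by factor 0.6 per step: converges toward 2
print(run(1.0))  # error is multiplied by -3 per step: oscillates and diverges
```

With μ = 0.1 the weight settles near the true value; with μ = 1 the sign of the error flips every step while its magnitude grows, which is exactly the unstable behavior the slide warns about.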