Gradient Methods

The linear system AX = B is solved by finding the minimum of the functional F(X) = 1/2 X^T A X - X^T B. For a symmetric A the gradient of F is AX - B, so the residual R = B - AX is the direction of steepest descent, and the minimum of F is attained exactly at the solution of AX = B.

Assumptions on the matrix A

- A is symmetric: A = AT
- A is positive definite: this can be verified, for example, by checking that all leading principal minors are positive, that all eigenvalues of A are positive, or by some other sufficient condition (see the sketch below)
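
Both assumptions can be checked quickly in MATLAB/Octave; the following is a minimal sketch (the test via chol is just one possible choice of a sufficient condition):

>> isequal(A, A')            % symmetry: A equals its transpose
>> all(eig(A) > 0)           % all eigenvalues positive => positive definite
>> [~, p] = chol(A); p == 0  % Cholesky factorization succeeds only for a positive definite matrix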

Method of Gradient Descent (or Steepest Descent)

>> A = [ 4 1 1 0; 1 4 1 1; 1 1 4 1; 0 1 1 4] % the matrix
>> B = [ -2; 4; 10; 14]                      % the right hand side
>> X = [ 0; 0; 0; 0]                         % the first approximation - any vector

Repeat the following 4 steps until the norm of the residual is less than some chosen tolerance:

>> R = B - A * X                     % residual - the direction for searching the new approx. 
>> norm(R)                           % norm of the residual - the criterion of convergence
>> alfa = (R' * R) / (R' * (A * R))  % the coefficient (assuming norm(R) > 0)
>> X = X + alfa * R                  % the new approximation

After 10 iterations the result X = [ -1; 0; 2; 3] is obtained to a precision of 4 decimal places; the Euclidean norm of the residual is then less than 0.001.
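
For a quick check, the iterate can be compared with the solution returned by the backslash operator, and the functional F from above can be evaluated at both points (a minimal sketch; the names X_exact and F are only illustrative):

>> X_exact = A \ B                      % direct solution, for comparison only
>> norm(X - X_exact)                    % distance of the iterate from the exact solution
>> F = @(Y) 0.5 * Y' * A * Y - Y' * B;  % the minimized functional
>> F(X) - F(X_exact)                    % small and nonnegative, since X_exact minimizes F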

A more efficient implementation saves one multiplication by the matrix when computing the residual: since the new approximation is X + alfa*R, the new residual equals B - A*(X + alfa*R) = R - alfa*A*R, so the product A*R already computed for alfa can be reused.
(In this variant the first approximation must be the zero vector, so that the residual can be initialized simply as R = B.)

>> X = [ 0; 0; 0; 0]                 
>> R = B                      % only holds for X = zero vector

Repeat the following steps until the norm of the residual is less than some chosen tolerance:

>> Q = A * R ;                
>> alfa = (R' * R) / (R' * Q) % the coefficient (assuming norm(R) > 0)
>> X = X + alfa * R           % the new approximation
>> R = R - alfa * Q           % residual - the direction for searching the new approx.  
>> norm(R)                    % norm of the residual - the criterion of convergence

The Method of Gradient Descent implemented as a function (in the file nejv_spad.m):

function [X, res, i] = nejv_spad( A, B, tol, numit)
% [X, res, i] = nejv_spad( A, B, tol, numit)
% input: A - the system matrix (symmetric positive definite; this is not checked)
%        B - the right hand side 
%        tol - tolerance for norm of the residual
%        numit - maximal number of iterations
% output: X the solution of AX=B computed with given tolerance or after numit iterations
%         res - norm of the residual
%         i - number of iterations
  X = zeros(size(B));   % the first approx. (zero vector of size of B)                
  R = B;                % residual
  res = norm(R);        % norm of the residual
  i = 0;
  % iterations
  while ((i < numit) && (res > tol))
    i = i+1;
    Q = A * R;
    alfa = (R' * R) / (R' * Q);
    X = X + alfa * R;
    R = R - alfa * Q;
    res = norm(R);
  end  % while
end  % function

The function can be called in different ways (assuming A and B are already defined):

>> [X, r, i] = nejv_spad(A, B, 0.1 ,6);  % all the results saved in X, r, i  
>> X = nejv_spad(A, B, 0.001, 15);       % X only
>> nejv_spad(A, B, 0.01, 10)             % results will be displayed on the screen
>> help nejv_spad                        % the help - displays the leading comments 

Conjugate Gradient Method

Here the new approximation is sought along a direction V which is kept A-conjugate to the previous search direction; in exact arithmetic the method therefore reaches the exact solution in at most n iterations, where n is the order of A.

>> X = [ 0; 0; 0; 0]          % the first approx. - any vector          
>> R = B - A*X   
>> V = R                        

Repeat the following steps until the norm of the residual is less than some chosen tolerance:

>> Q = A * V ;                
>> alfa = (V' * R) / (V' * Q)        % the coefficient (assuming norm(R) > 0)
>> X = X + alfa * V                  % the new approximation
>> R = R - alfa * Q                  % the residual 
>> norm(R)                           % norm of the residual - the criterion of convergence
>> beta = - (V' * A * R) / (V' * Q)  % the coefficient making the new direction A-conjugate to V
>> V = R + beta * V                  % the new direction for searching the minimum
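
The Conjugate Gradient Method can be wrapped in a function in the same way as nejv_spad.m above. The following is a sketch (the file name sdruz_grad.m is only a suggestion); it reuses Q = A*V when computing beta, which is allowed because V'*A*R = (A*V)'*R = Q'*R for a symmetric A:

function [X, res, i] = sdruz_grad( A, B, tol, numit)
% [X, res, i] = sdruz_grad( A, B, tol, numit)
% input and output parameters have the same meaning as in nejv_spad
  X = zeros(size(B));   % the first approx. (zero vector of size of B)
  R = B;                % residual (equals B - A*X for the zero vector X)
  V = R;                % the first search direction
  res = norm(R);        % norm of the residual
  i = 0;
  % iterations
  while ((i < numit) && (res > tol))
    i = i+1;
    Q = A * V;
    alfa = (V' * R) / (V' * Q);
    X = X + alfa * V;
    R = R - alfa * Q;
    res = norm(R);
    beta = - (Q' * R) / (V' * Q);  % equals -(V'*A*R)/(V'*Q), since A is symmetric
    V = R + beta * V;              % the new direction for searching the minimum
  end  % while
end  % function

It can be called in the same ways as nejv_spad, for example:

>> [X, r, i] = sdruz_grad(A, B, 0.001, 10);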