The Botany of Matrices
Diagonal, Symmetric, Banded, Hessenberg, Circulant, Hankel,
Triangular, Sparse,
Orthogonal, semi-positive definite,
My notations for:
transpose, inner product,norm, matrix norms,
Close singularity is just as bad because the roundoff could take us there, we will have to measure this.
Single precision is usually enough for 30-50 equations.
Double precision makes it possible to solve for a few hundred.
This does NOT hold for pathological systems we will look into these and how to recognize them when we talk about conditionning.
Suppose the matrix of coefficients is a square matrix,
(of full rank also, as we saw, non-degenerate)
then the equation looks like this:
Formally you would say : invert it, in fact a computational mistake, except if you have multiple right hand sides.
Doing what one would first think of doing if you had to solve a system by hand.
The easiest system: triangular.
Bringing to triangular shape.
Pivotting. This is NOT to solve ill-conditionning, but unlucky diagonal elements that are very small make the algorithm unstable.
LU decomposition, with a permutation matrix
(prepivotting) is what is best to do.
Flops: Of an order
(the same as the cost of the multiplication
of two matrices, not so bad).
Try a small perturbation of
to
,
call the solution of the perturbed system:
,

We will study it several times, here is my first introduction: a little made-up example with matlab to start with:
>> u=[3 1 -1 2]'
u =
3
1
-1
2
>> v=(1:4)'
v =
1
2
3
4
>> X=u*v'
X =
3 6 9 12
1 2 3 4
-1 -2 -3 -4
2 4 6 8
>>svd(X)
ans =
21.2132
0.0000
0
0
>> E=10^(-3)*randn(4);
>> XE=X+E
XE =
3.0012 5.9993 9.0003 12.0012
1.0006 2.0017 3.0009 3.9994
-0.9999 -1.9999 -3.0014 -3.9994
2.0004 4.0018 5.9993 7.9996
>> svd(XE)
ans =
21.2143
0.0028
0.0019
0.0005
>> cond(X)
Condition is infinite
ans =
Inf
>> cond(XE)
ans =
4.2567e+04
%An example you can't see with your bare eyes:
>> X2=u'*v
X2 =
0.2468 0.2499 0.0166 0.2487
0.3225 0.3266 0.0218 0.3250
0.1495 0.1514 0.0101 0.1506
0.4367 0.4423 0.0295 0.4401
>> svd(X2)
ans =
1.0730
0.0000
0
0
>> [U,S,V]=svd(X2);
>> U*S*V'
ans =
0.2468 0.2499 0.0166 0.2487
0.3225 0.3266 0.0218 0.3250
0.1495 0.1514 0.0101 0.1506
0.4367 0.4423 0.0295 0.4401
>> 10000*(X2-U*S*V')
ans =
1.0e-11 *
-0.0278 0.0833 -0.0035 0.0555
-0.0555 0.0555 0 0.0555
0 0.0555 0.0017 0.0278
-0.0555 0.1665 0 0.1110
>> A=rand(4)
A =
0.5045 0.4940 0.0737 0.9138
0.5163 0.2661 0.5007 0.5297
0.3190 0.0907 0.3841 0.4644
0.9866 0.9478 0.2771 0.9410
>> flops(0)
>> [L,U,P]=lu(A)
L =
1.0000 0 0 0
0.5233 1.0000 0 0
0.5114 -0.0406 1.0000 0
0.3234 0.9388 0.7363 1.0000
U =
0.9866 0.9478 0.2771 0.9410
0 -0.2298 0.3557 0.0373
0 0 -0.0535 0.4342
0 0 0 -0.1945
>> flops
ans =
34
>> P*A
ans =
0.9866 0.9478 0.2771 0.9410
0.5163 0.2661 0.5007 0.5297
0.5045 0.4940 0.0737 0.9138
0.3190 0.0907 0.3841 0.4644
>> L*U
ans =
0.9866 0.9478 0.2771 0.9410
0.5163 0.2661 0.5007 0.5297
0.5045 0.4940 0.0737 0.9138
0.3190 0.0907 0.3841 0.4644
>> P*A-L*U
ans = 1.0e-15 *
0 0 0 0
-0.1110 0 0 0
0 0 0 0
-0.0555 -0.0139 0 0
Here is what we need to remember:
The condition number in
norm has the simple interpretation
as a function of the singular values, ration of the largest to the smallest:
Read 2.6 of NR (pages 61-71).
Of an order of
steps are sufficient, see NR 51.
This enables, a the price of storing A, to improve the solution by using the residual. Do the following in double precision:

We try to minimize the error which is usually measured
as the norm
Important remark: For any orthogonal Q, this norm is unchanged
by trnasformation by
, so
>> t = (1900:10:1990)';
>> y = [75.995 91.972 105.711 123.203 131.669 ...
150.697 179.323 203.212 226.505 249.633]';
>> X=[t ones(10,1)]
X = 1900 1
1910 1
1920 1
1930 1
1940 1
1950 1
1960 1
1970 1
1980 1
1990 1
>> bhat=X\p
bhat = 1.93
-3594.01
>>yhat=polyval(bhat,t);
>>[polyval(bhat,t) y]
ans = 67.08 76.00
86.35 91.97
105.62 105.71
124.89 123.20
144.16 131.67
163.43 150.70
182.70 179.32
201.96 203.21
221.23 226.50
240.50 249.63
>> norm(yhat-y)
ans =
23.5794
>> [Q,RC]=qr([X y]);
RC =
-6151.30 -3.16 -488.86
0 -0.05 167.82
0 0 23.58
0 0 0
0 0 0
0 0 0
0 0 0
0 0 0
0 0 0
0 0 0
>> R1=RC(1:2,1:2)
R1 = -6151.30 -3.16
0 -0.05
>> C1=RC(1:2,3)
C1 = -488.8643
167.8181
>> R1\C1
ans = 1.93
-3594.01
>> C2=RC(3,3)
C2 = 23.5794
>>[Q*[C1;zeros(8,1)] yhat]
ans 67.08 67.08
86.35 86.35
105.62 105.62
124.89 124.89
144.16 144.16
163.43 163.43
182.70 182.70
201.96 201.96
221.23 221.23
240.50 240.50
You can use them to zero out individual isolated elements in any matrix, without changing any of the norms of the vectors, these transformations are orthogonal.
They are based on simple
ones of the form:
You can choose
and
so that a certain vector
is orthogonally transformed (actually, rotated)
into a vector
whose second coordinate is zero.
This is of course easy by solving the equations
| (2.1) | |||
| (2.2) |
The examples that follow show two ways of doing this:
%%%%Van Loan's Function, Chapter 7%%%%%%%%
function [c,s] = Rotate(x1,x2);
% Pre:
% x1,x2 scalars
% Post:
% c,s c^2+s^2=1 so -s*x1 + c*x2 = 0.
%
if x2==0
c = 1;
s = 0;
else
if abs(x2)>=abs(x1)
cotangent = x1/x2;
s = 1/sqrt(1+cotangent^2);
c = s*cotangent;
else
tangent = x2/x1;
c = 1/sqrt(1+tangent^2);
s = c*tangent;
end
end
%%%%%%%%%%%%%%%The resident function:%%%%%%%
function [G,x] = planerot(x)
%PLANEROT Generate a Givens plane rotation.
% [G,Y] = PLANEROT(X), where X is a 2-component column vector,
% returns a 2-by-2 orthogonal matrix G so that Y = G*X has Y(2) = 0.
% See also QRINSERT, QRDELETE.
if x(2) ~= 0
r = norm(x);
G = [x'; -x(2) x(1)]/r;
x = [r; 0];
else
G = eye(2);
end
>> x=[2 3]';
>> [G,y]=planerot(x)
G =
0.55 0.83
-0.83 0.55
y =
3.61
0
Now in order to extend this to a general context you
just notice that if you fill out the rest of the matrix
with
s on the diagonal, it will not change the other coordinates
thus:
backsub=function(X,y){
#Backsubstitution to find solution in a triangular case
l=dim(X)
p=l[2]
b=matrix(0,p,1)
b[p,1]=y[p,1]/X[pp]
for (j in seq(p-1,1,-1){
b[j,1]=(y[j,1]-X[j,(j+1):n]*b[(j+1):n,1])/X[j,j]
}
return(b)}
Algorithm (Householder Vector) in R
house= function(x){
#Householder vector calculation
p = length(x);
mu = sqrt(x%*%x)
v = x
if (mu!= 0)
{
beta = x[1] + sign(x[1])*mu
v[2 : p,1] = v[2 : p,1]/beta
}
v[1] = 1
return(v)
}
Algorithm (Householder Vector) in matlab
function v = house(x);
%Compute a Householder vector
m = length(x);
mu = norm(x,2);
v = x;
if mu ~= 0
beta = x(1) + sign(x(1))*mu;
v(2 : m,1) = v(2 : m,1)/beta;
end
v(1) = 1
rowhouse=function(X,v){
beta=-2/(t(v)%*%v)
w=beta*t(X)%*%v
X=X+v%*%t(w)
return(X)}
qr package:base R Documentation
The QR Decomposition of a Matrix
Description:
'qr' computes the QR decomposition of a matrix. It provides an
interface to the techniques used in the LINPACK routine DQRDC or
the LAPACK routines DGEQP3 and (for complex matrices) ZGEQP3.
Usage:
qr(x, tol = 1e-07 , LAPACK = FALSE)
qr.coef(qr, y)
qr.qy(qr, y)
qr.qty(qr, y)
qr.resid(qr, y)
qr.fitted(qr, y, k = qr$rank)
qr.solve(a, b, tol = 1e-7)
## S3 method for class 'qr':
solve(a, b, ...)
is.qr(x)
as.qr(x)
Arguments:
x: a matrix whose QR decomposition is to be computed.
tol: the tolerance for detecting linear dependencies in the
columns of 'x'. Only used if 'LAPACK' is false and 'x' is
real.
qr: a QR decomposition of the type computed by 'qr'.
y, b: a vector or matrix of right-hand sides of equations.
a: A QR decomposition or ('qr.solve' only) a rectangular matrix.
k: effective rank.
LAPACK: logical. For real 'x', if true use LAPACK otherwise use
LINPACK.
...: further arguments passed to or from other methods
Details:
The QR decomposition plays an important role in many statistical
techniques. In particular it can be used to solve the equation
*Ax* = *b* for given matrix *A*, and vector *b*. It is useful for
computing regression coefficients and in applying the
Newton-Raphson algorithm.
The functions 'qr.coef', 'qr.resid', and 'qr.fitted' return the
coefficients, residuals and fitted values obtained when fitting
'y' to the matrix with QR decomposition 'qr'. 'qr.qy' and 'qr.qty'
return 'Q %*% y' and 't(Q) %*% y', where 'Q' is the (complete) *Q*
matrix.
All the above functions keep 'dimnames' (and 'names') of 'x' and
'y' if there are.
'solve.qr' is the method for 'solve' for 'qr' objects. 'qr.solve'
solves systems of equations via the QR decomposition: if 'a' is a
QR decomposition it is the same as 'solve.qr', but if 'a' is a
rectangular matrix the QR decomposition is computed first. Either
will handle over- and under-determined systems, providing a
minimal-length solution or a least-squares fit if appropriate.
'is.qr' returns 'TRUE' if 'x' is a 'list' with components named
'qr', 'rank' and 'qraux' and 'FALSE' otherwise.
It is not possible to coerce objects to mode '"qr"'. Objects
either are QR decompositions or they are not.
Value:
The QR decomposition of the matrix as computed by LINPACK or
LAPACK. The components in the returned value correspond directly
to the values returned by DQRDC/DGEQP3/ZGEQP3.
qr: a matrix with the same dimensions as 'x'. The upper triangle
contains the *R* of the decomposition and the lower triangle
contains information on the *Q* of the decomposition (stored
in compact form). Note that the storage used by DQRDC and
DGEQP3 differs.
qraux: a vector of length 'ncol(x)' which contains additional
information on *Q*.
rank: the rank of 'x' as computed by the decomposition: always full
rank in the LAPACK case.
pivot: information on the pivoting strategy used during the
decomposition.
Non-complex QR objects computed by LAPACK have the attribute
'"useLAPACK"' with value 'TRUE'.
Note:
To compute the determinant of a matrix (do you _really_ need it?),
the QR decomposition is much more efficient than using Eigen
values ('eigen'). See 'det'.
Using LAPACK (including in the complex case) uses column pivoting
and does not attempt to detect rank-deficient matrices.
References:
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S
Language_. Wadsworth & Brooks/Cole.
Dongarra, J. J., Bunch, J. R., Moler, C. B. and Stewart, G. W.
(1978) _LINPACK Users Guide._ Philadelphia: SIAM Publications.
Anderson. E. and ten others (1999) _LAPACK Users' Guide_. Third
Edition. SIAM.
Available on-line at <URL:
http://www.netlib.org/lapack/lug/lapack_lug.html>.
See Also:
'qr.Q', 'qr.R', 'qr.X' for reconstruction of the matrices.
'lm.fit', 'lsfit', 'eigen', 'svd'.
'det' (using 'qr') to compute the determinant of a matrix.
Examples:
hilbert <- function(n) { i <- 1:n; 1 / outer(i - 1, i, "+") }
h9 <- hilbert(9); h9
qr(h9)$rank #--> only 7
qrh9 <- qr(h9, tol = 1e-10)
qrh9$rank #--> 9
##-- Solve linear equation system H %*% x = y :
y <- 1:9/10
x <- qr.solve(h9, y, tol = 1e-10) # or equivalently :
x <- qr.coef(qrh9, y) #-- is == but much better than
#-- solve(h9) %*% y
h9 %*% x # = y
h9 %*% (solve(h9) %*% y)
[,1]
[1,] 0.09999996
[2,] 0.20000011
[3,] 0.30000012
[4,] 0.40000010
[5,] 0.50000008
[6,] 0.60000007
[7,] 0.70000005
[8,] 0.80000004
[9,] 0.90000003
> h9 %*% qr.coef(qrh9, y)
[,1]
[1,] 0.1
[2,] 0.2
[3,] 0.3
[4,] 0.4
[5,] 0.5
[6,] 0.6
[7,] 0.7
[8,] 0.8
[9,] 0.9
> round(h9,4)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 1.0000 0.5000 0.3333 0.2500 0.2000 0.1667 0.1429 0.1250 0.1111
[2,] 0.5000 0.3333 0.2500 0.2000 0.1667 0.1429 0.1250 0.1111 0.1000
[3,] 0.3333 0.2500 0.2000 0.1667 0.1429 0.1250 0.1111 0.1000 0.0909
[4,] 0.2500 0.2000 0.1667 0.1429 0.1250 0.1111 0.1000 0.0909 0.0833
[5,] 0.2000 0.1667 0.1429 0.1250 0.1111 0.1000 0.0909 0.0833 0.0769
[6,] 0.1667 0.1429 0.1250 0.1111 0.1000 0.0909 0.0833 0.0769 0.0714
[7,] 0.1429 0.1250 0.1111 0.1000 0.0909 0.0833 0.0769 0.0714 0.0667
[8,] 0.1250 0.1111 0.1000 0.0909 0.0833 0.0769 0.0714 0.0667 0.0625
[9,] 0.1111 0.1000 0.0909 0.0833 0.0769 0.0714 0.0667 0.0625 0.0588