
Selcuk Journal of Applied Mathematics
Vol. 4, No. 2, pp. 23–32, 2003

Linear algebra without determinant

Haydar Bulgak1 and Christoph Zenger2

1 Research Centre of Applied Mathematics, Selçuk University, Konya, Turkey; e-mail: hbulgak@selcuk.edu.tr

2 Institute of Informatics, Technical University of Munich, Boltzmannstrasse 3, 85748 Garching, Germany; e-mail: zenger@in.tum.de

Received: October 31, 2003

Summary. The course of linear algebra is one of the basic courses in modern university education. Since the work by J. von Neumann and H. Goldstine (1947), the epoch of the determinant in the practical solution of linear algebraic equations has ended. In the present article, a brief description of the structure of a standard course of linear algebra for beginners is given. This course uses only concepts that are relevant from the point of view of calculations with finite accuracy. Thus, the spectral theory and the theory of linear equations do not use the concept of a determinant.

Key words: Linear algebra, pseudo-spectrum, condition number

2000 Mathematics Subject Classification: 15A06, 15A18, 15A29

1. Introduction

The problems of university education are well known to all of us. In the modern sea of scientific information, the basic question is how to teach students. During the last years another important question was added: how to use computers effectively in the educational process. All this occurs against a background of congestion in educational programs.

We are constantly faced with the necessity to reconsider curricula and to reconstruct the contents of training courses in view of the latest scientific and technological achievements.


One of the key courses of modern university education is a course of linear algebra. The latter includes many beautiful mathematical theories (see, e.g., [3]–[6], [8], [10]–[14]). On which of them one should concentrate attention depends on the speciality for which the students are being trained.

The authors have prepared a variant of a linear algebra textbook which is supported by the computer program MVC [15]. It is an extended version of the textbook [5]. The book gives an introduction to linear algebra, and it also includes some important material from numerical linear algebra and mathematical analysis. Generally, introductory courses on linear algebra are frequently spoilt, as they are misused as an introduction to algebra, mathematical logic and projective geometry; thus students might not be aware of the connections to analysis when the course is over. However, these connections are very important, particularly in applied mathematics and numerical analysis. In that sense, the present book tries to bridge the gap both between traditional linear algebra and numerical linear algebra, and between linear algebra and functional analysis.

We would like to stress that we see an urgent need for a textbook emphasizing the fact that certain questions from linear algebra are no longer well-posed when the problem comes from an application where perturbations have to be considered. The eigenvalue problem for nonsymmetric matrices is just one of several examples.

The main results are introduced and proved without any reference to determinants. Since the papers [1], [2] show that the determinant is an ill-posed concept of linear algebra in the general case, we think that the determinant is a complex concept of linear algebra and a subject for separate articles.

The material of the book is organized in such a manner that the reader can master the material without omissions. All statements in the book are proven. All concepts of the book can be correctly computed using the MVC [15]. The MVC is a C++ toolbox. The users do not have to be familiar with C++; they only need to call the functions that are listed in the book [5].

Written primarily for first- and second-year undergraduates in mathematics, this book features a host of diverse and interesting examples, making it an entertaining and stimulating companion that will also be accessible to students of statistics, computer science and engineering, as well as to professionals in these fields.


2. Vector space

One of the most important ideas of modern mathematics is the idea of vector spaces. As an example of a vector space over the real field, we take the set of real polynomials. We have chosen this example since the students are familiar with polynomials and, besides, it is far from the traditional interpretation of a vector as an arrow with a starting and an ending point.

The idea of a basis in a vector space V leads us to finite dimensional vector spaces. We work only with such vector spaces in this book.

For an N-dimensional vector space V (dim(V) = N), the co-ordinates of a vector with respect to a basis lead us to the space R^N. Here an idea of a linear algebraic system appears for the first time. If we have a bank of co-ordinates of vectors of a vector space V with respect to a basis e_1, e_2, ..., e_N, then we can easily find their co-ordinates with respect to a new basis w_1, w_2, ..., w_N, provided we have the matrix which relates these bases. The matrix appears here for the very first time as one of the objects of linear algebra theory.
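To see the basis-change matrix in action, here is a minimal numpy sketch (an illustration of ours; the MVC toolbox mentioned in the introduction is not required). The columns of T store the co-ordinates of the new basis vectors w_j with respect to the old basis, and the new co-ordinates of a vector come from solving one linear system.

    import numpy as np

    # Old basis: the standard basis e1, e2 of R^2.
    # New basis: w1 = e1 + e2 and w2 = e1 - e2, stored column-wise in T.
    T = np.array([[1.0,  1.0],
                  [1.0, -1.0]])

    c_old = np.array([3.0, 1.0])       # co-ordinates of x w.r.t. e1, e2
    c_new = np.linalg.solve(T, c_old)  # co-ordinates of the same x w.r.t. w1, w2

    # The vector itself is unchanged: T @ c_new reproduces c_old.
    assert np.allclose(T @ c_new, c_old)
    print(c_new)                       # [2. 1.], i.e. x = 2*w1 + 1*w2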

3. Linear equations

Let P : V → W be a linear transformation of a vector space V to a vector space W. Let N = dim V and M = dim W, and suppose we have two ordered bases v_1, v_2, ..., v_N and w_1, w_2, ..., w_M in V and W respectively. Then it is convenient, using the co-ordinates of a vector x ∈ V and a matrix with N columns and M rows, to compute the co-ordinates of the vector Px. This matrix is the matrix of the transformation P with respect to the bases. The question of choosing convenient bases is a very important one from the theoretical and practical points of view. This topic leads us to linear equations.

Let P : V → V be a linear operator. We can speak about the identity operator and about singular and regular operators. We have the subspaces ker P and Im P. Here we also meet Krylov subspaces, invariant subspaces and projectors. All of these concepts are illustrated by simple examples.

4. Euclidean and Hermitian spaces

A distance and an angle between two vectors are meaningful in Euclidean and Hermitian spaces. The adjoint, self-adjoint and Hermitian operators are objects of Hermitian space. The transpose, symmetric and orthogonal operators are objects of Euclidean space.


The Hermitian and symmetric operators allow us to speak about the sign of an operator. In these spaces we can speak about well-posed and ill-posed operators.

Let V = {p(x) : p(x) = ax + b, a, b ∈ R, 0 ≤ x ≤ 1} be a vector space. The following examples illustrate some aspects of the theory of operator equations.

- Define an inner product in the vector space V such that the vectors 1 + x and 1 + 2x (0 ≤ x ≤ 1) are orthogonal.

- Consider the operators P and Q in V,

P[ax + b] = (a + b)x + a − b,  Q[ax + b] = (2a − b)x + 3a,  a, b ∈ R, 0 ≤ x ≤ 1.

Check that there is no inner product such that Q is a self-adjoint operator with respect to it.

Check that P is not a self-adjoint operator with respect to the inner product

(ax + b, cx + d)_1 = 2ac + 7bd,  a, b, c, d ∈ R, 0 ≤ x ≤ 1,

and that it is a self-adjoint operator with respect to the inner product

(ax + b, cx + d) = ac + bd,  a, b, c, d ∈ R, 0 ≤ x ≤ 1.
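These checks are easy to mechanise. In the co-ordinates (a, b), an inner product is given by a Gram matrix G via (u, w)_G = uᵀGw, and an operator with matrix M is self-adjoint with respect to it exactly when MᵀG = GM. The following numpy sketch (our illustration, not the authors' MVC code) confirms the claims above; the complex eigenvalues of Q show why no inner product at all can make it self-adjoint.

    import numpy as np

    # Co-ordinates (a, b) represent the polynomial ax + b.
    P = np.array([[1.0,  1.0],    # P[ax+b] = (a+b)x + (a-b)
                  [1.0, -1.0]])
    Q = np.array([[2.0, -1.0],    # Q[ax+b] = (2a-b)x + 3a
                  [3.0,  0.0]])

    G1 = np.diag([2.0, 7.0])      # inner product (ax+b, cx+d)_1 = 2ac + 7bd
    G  = np.eye(2)                # inner product (ax+b, cx+d)  = ac + bd

    def self_adjoint(M, Gram):
        # M is self-adjoint w.r.t. (u, w) = u^T Gram w  iff  M^T Gram == Gram M.
        return np.allclose(M.T @ Gram, Gram @ M)

    print(self_adjoint(P, G1))    # False: P is not self-adjoint w.r.t. (.,.)_1
    print(self_adjoint(P, G))     # True:  P is self-adjoint w.r.t. (.,.)
    print(np.linalg.eigvals(Q))   # complex eigenvalues, so no inner product
                                  # can make Q self-adjoint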

It is interesting that a symmetric operator has a symmetric matrix with respect to an orthonormal basis.

In this chapter the orthogonal subspaces, the orthogonal direct sum of subspaces, the orthogonal operators and the orthogonal projectors are introduced.

An important example of an orthogonal operator is a Householder reflection. Another example of an orthogonal operator is a Givens rotation [9].
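Both constructions are short enough to sketch here (a hedged numpy illustration; the formula H = I − 2vvᵀ/(vᵀv) for the reflector is the standard one, and orthogonality is verified directly).

    import numpy as np

    def householder(v):
        # Reflection across the hyperplane orthogonal to v: H = I - 2 v v^T / (v^T v).
        v = v / np.linalg.norm(v)
        return np.eye(len(v)) - 2.0 * np.outer(v, v)

    def givens(n, i, j, theta):
        # Rotation by theta in the (i, j) co-ordinate plane of R^n.
        G = np.eye(n)
        c, s = np.cos(theta), np.sin(theta)
        G[i, i] = G[j, j] = c
        G[i, j], G[j, i] = s, -s
        return G

    H = householder(np.array([1.0, 2.0, 2.0]))
    G = givens(3, 0, 2, 0.3)
    # Both are orthogonal: the transpose is the inverse.
    assert np.allclose(H.T @ H, np.eye(3))
    assert np.allclose(G.T @ G, np.eye(3))
    assert np.allclose(H @ H, np.eye(3))   # a reflection is also an involution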

5. Symmetric eigenvalue problem

A couple (λ, L) is known as an eigencouple of a symmetric matrix S if for any non-zero vector x of a one-dimensional subspace L the equality Sx = λx holds. In this case λ is known as an eigenvalue of S and L as its eigensubspace corresponding to λ. Any vector of L is known as an eigenvector of S corresponding to λ.

An N-dimensional symmetric matrix S has N different eigensubspaces L_1, L_2, ..., L_N such that R^N = L_1 ⊕ L_2 ⊕ ... ⊕ L_N is an orthogonal direct sum.

Let L_1 and L_2 be two eigensubspaces which correspond to distinct eigenvalues λ_1 and λ_2 of a symmetric matrix S. These eigensubspaces are orthogonal to one another. Indeed, take any non-zero vectors u ∈ L_1 and w ∈ L_2. Since

(Su, w) = (λ_1 u, w) = λ_1 (u, w),

(Su, w) = (u, S^*w) = (u, Sw) = (u, λ_2 w) = λ_2 (u, w)

and λ_1 ≠ λ_2, we can conclude that (u, w) = 0.

We know that any symmetric matrix S can be reduced to a tridiagonal symmetric one by a sequence of Householder transformations. Any symmetric tridiagonal matrix is represented by a number of symmetric Jacobi matrices.

Some words about the symmetric eigenvalue problem for a symmetric tridiagonal Jacobi matrix. Let d_j (j = 1, 2, ..., N) and b_j ≠ 0 (j = 2, 3, ..., N) be real numbers. The symmetric N-dimensional tridiagonal matrix

S = \begin{pmatrix}
d_1 & b_2    &        & O   \\
b_2 & d_2    & \ddots &     \\
    & \ddots & \ddots & b_N \\
O   &        & b_N    & d_N
\end{pmatrix}

is known as a symmetric Jacobi matrix. The sequence

D_0(λ) = 1;  D_1(λ) = d_1 − λ;

D_{k+1}(λ) = (d_{k+1} − λ)D_k(λ) − b_{k+1}^2 D_{k−1}(λ),  k = 1, 2, ..., N − 1,

is known as a Sturm sequence (see, for example, [3], [4], [15]).

The roots of the polynomials D_k(λ), k = 1, 2, 3, ..., N, are real and simple (not multiple). Between two neighbouring roots of the polynomial D_j(λ) there is one root of the polynomial D_{j−1}(λ). Hence the equation D_N(λ) = 0 has N different real roots.
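The interlacing property is what makes the Sturm sequence computationally useful: the number of sign changes along D_0(λ), D_1(λ), ..., D_N(λ) counts the eigenvalues below λ, which is the basis of the bisection method. A small Python sketch of ours (the storage convention for d and b is an assumption):

    def sturm(d, b, lam):
        # Sturm sequence D_0, ..., D_N for the Jacobi matrix with diagonal
        # d[0..N-1] and off-diagonal b[1..N-1]; the entry b[0] is unused.
        N = len(d)
        D = [1.0, d[0] - lam]
        for k in range(1, N):
            D.append((d[k] - lam) * D[k] - b[k] ** 2 * D[k - 1])
        return D

    def count_below(d, b, lam):
        # Number of sign changes in the Sturm sequence = number of
        # eigenvalues smaller than lam (lam assumed not a root of any D_k).
        D = sturm(d, b, lam)
        return sum(1 for x, y in zip(D, D[1:]) if x * y < 0)

    d = [2.0, 2.0, 2.0]             # eigenvalues: 2 - sqrt(2), 2, 2 + sqrt(2)
    b = [None, 1.0, 1.0]
    print(count_below(d, b, 0.9))   # 1
    print(count_below(d, b, 2.5))   # 2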

It is easy to check that if (λ, L) is an eigencouple of S, then for any non-zero vector u = (u_1, u_2, ..., u_N)^T ∈ L we have u_1 ≠ 0.

The vector u with the components

u_1 = 1;  u_2 = −(d_1 − λ)/b_2;

u_k = −(b_{k−1}u_{k−2} + (d_{k−1} − λ)u_{k−1})/b_k,  k = 3, 4, ..., N,

satisfies the first N − 1 equations of the system (S − λI)u = 0, the first of which reads −(d_1 − λ)u_1 − b_2u_2 = 0, while substitution into the last equation leaves the residual

b_N u_{N−1} + (d_N − λ)u_N = (−1)^{N+1} D_N(λ)/[b_2 b_3 ⋯ b_N].

In particular, if λ is a root of D_N, then u is an eigenvector of S corresponding to λ.
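The recursion and the residual identity can be checked numerically. The sketch below (ours, reusing sturm from the previous sketch and the same storage convention) builds u and compares the last row of (S − λI)u against (−1)^{N+1}D_N(λ)/(b_2 ⋯ b_N).

    import numpy as np

    def jacobi_vector(d, b, lam):
        # Components u_1, ..., u_N from the three-term recursion above.
        N = len(d)
        u = [1.0, -(d[0] - lam) / b[1]]
        for i in range(2, N):
            u.append(-(b[i - 1] * u[i - 2] + (d[i - 1] - lam) * u[i - 1]) / b[i])
        return u

    d = [2.0, 2.0, 2.0]
    b = [None, 1.0, 1.0]
    lam = 0.9                     # an arbitrary trial value, not an eigenvalue
    u = jacobi_vector(d, b, lam)

    lhs = b[2] * u[1] + (d[2] - lam) * u[2]          # last row of (S - lam I)u
    rhs = (-1) ** (3 + 1) * sturm(d, b, lam)[3] / (b[1] * b[2])
    assert np.isclose(lhs, rhs)   # the residual identity holds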

Hence, for λ_1, λ_2, ..., λ_N, the N different roots of D_N(λ) = 0, there exist N orthonormal eigenvectors x_1, x_2, ..., x_N that define N eigensubspaces L_1, L_2, ..., L_N.

The equalities Sx_j = λ_j x_j, j = 1, 2, ..., N, allow us to conclude that the diagonal matrix Λ with the diagonal entries λ_1, λ_2, ..., λ_N and the orthogonal matrix U = [x_1, x_2, ..., x_N] with the columns x_j, j = 1, 2, ..., N, are related by the equality SU = UΛ. Hence we have proved that for any symmetric Jacobi matrix S there exists an orthogonal matrix U such that U^*SU is a diagonal matrix. S = UΛU^* is known as an eigenvalue decomposition (EVD) of S.
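In floating point the decomposition can be confirmed with numpy's symmetric eigensolver (again only our illustration; the book's tool is MVC):

    import numpy as np

    S = np.array([[2.0, 1.0, 0.0],
                  [1.0, 2.0, 1.0],
                  [0.0, 1.0, 2.0]])    # a symmetric Jacobi matrix

    lam, U = np.linalg.eigh(S)         # ascending eigenvalues, orthonormal columns
    assert np.allclose(S @ U, U @ np.diag(lam))    # S U = U Lambda
    assert np.allclose(U @ np.diag(lam) @ U.T, S)  # S = U Lambda U^T
    print(lam)   # approx. [0.586, 2.0, 3.414], i.e. 2 - sqrt(2), 2, 2 + sqrt(2)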

It is easy to check that a block-diagonal symmetric matrix whose diagonal blocks are Jacobi matrices has an EVD. Hence an EVD exists for any symmetric matrix. The set of the eigenvalues of a symmetric matrix S is known as its spectrum.

An EVD exists for the Hermitian matrices too. For any Hermitian matrix S there exists a unitary matrix U such that U^*SU is a real diagonal matrix.

6. SVD - singular value decomposition

The EVD of a symmetric matrix has a generalisation to non-symmetric matrices. Let A be a real square N-dimensional matrix. It can be shown that the spectrum of

H = \begin{pmatrix} O & A^* \\ A & O \end{pmatrix}

is symmetric with respect to 0. The EVD of the symmetric matrix H guarantees that there exist non-negative numbers 0 ≤ σ_1 ≤ σ_2 ≤ ... ≤ σ_N and two sets of N orthonormal N-dimensional vectors u_1, u_2, ..., u_N and q_1, q_2, ..., q_N such that

Au_j = σ_j q_j,  A^*q_j = σ_j u_j,  j = 1, 2, ..., N.

The diagonal matrix Σ with the diagonal entries σ_1, σ_2, ..., σ_N and the orthogonal matrices U = [u_1, u_2, ..., u_N], Q = [q_1, q_2, ..., q_N] are related by the equality AU = QΣ. The factorisation A = QΣU^* is known as a singular value decomposition (SVD) of A. The numbers σ_1, σ_2, ..., σ_N are orthogonal invariants of A and they are known as the singular values of A.
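The defining relations are easy to test with numpy's SVD; note that numpy returns the singular values in descending order, while the text numbers them in ascending order (a sketch of ours, not the authors' code):

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((4, 4))

    Q, sigma, Ut = np.linalg.svd(A)   # A = Q diag(sigma) U^T, sigma descending
    U = Ut.T
    for j in range(4):
        # the defining pairs of relations for singular values and vectors
        assert np.allclose(A @ U[:, j], sigma[j] * Q[:, j])
        assert np.allclose(A.T @ Q[:, j], sigma[j] * U[:, j])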

An SVD exists for the non-self-adjoint complex matrices too. For any complex matrix A there exist unitary matrices U and Q such that Q^*AU is a real diagonal matrix with non-negative entries on the diagonal.

The SVD has a generalisation to rectangular matrices. If A is a matrix with M rows and N columns, then there exist orthogonal (unitary) matrices Q and U such that Q^*AU is a real pseudo-diagonal matrix with non-negative entries on the main diagonal. If Σ is the diagonal K-dimensional matrix, K = min(N, M), whose diagonal entries are the singular values of A, then either Q^*AU = (Σ  O) or

Q^*AU = \begin{pmatrix} Σ \\ O \end{pmatrix}.

The SVD gives a natural way to introduce the Fredholm alternatives.

7. Singular and regular matrices

A square matrix A is recognised as a regular matrix if there exists its inverse matrix X = A^{−1}, such that AX = I, where I is the identity matrix. If A is not a regular matrix, then it is known as a singular one.

7.1. Condition number

For checking regularity of a square matrix A on a computer, one needs a condition number μ(A). For regular matrices this number is introduced as follows:

μ(A) = \max_{x ≠ 0} \frac{‖Ax‖}{‖x‖} \cdot \max_{ξ ≠ 0} \frac{‖ξ‖}{‖Aξ‖} = ‖A‖ ‖A^{−1}‖ = \frac{σ_N(A)}{σ_1(A)}.

If A is a singular matrix, i.e. σ_1(A) = 0, then μ(A) = ∞. It is easy to check that the following statements are true:

1) μ(A) = μ(ρA) for any ρ ≠ 0;
2) μ(A) ≥ 1;
3) μ(A) = μ(QAP) for any orthogonal matrices P, Q;
4) if σ_1(A) ≠ 0, then μ(A) < ∞ and μ(A) = σ_N(A)/σ_1(A).
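A sketch of μ(A) computed from the singular values, together with checks of properties 1)–3) (our numpy illustration):

    import numpy as np

    def mu(A):
        # Condition number as the ratio of the largest to the smallest singular value.
        s = np.linalg.svd(A, compute_uv=False)   # descending order
        return np.inf if s[-1] == 0.0 else s[0] / s[-1]

    rng = np.random.default_rng(1)
    A = rng.standard_normal((4, 4))
    Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))   # random orthogonal matrices
    P, _ = np.linalg.qr(rng.standard_normal((4, 4)))

    assert np.isclose(mu(A), mu(5.0 * A))      # 1) mu(A) = mu(rho * A)
    assert mu(A) >= 1.0                        # 2) mu(A) >= 1
    assert np.isclose(mu(A), mu(Q @ A @ P))    # 3) orthogonal invariance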

It is clear that the equations AA^{−1} = I and A(A^{−1} + H) = I + G lead us to the inequality

‖H‖/‖A^{−1}‖ ≤ μ(A)‖G‖.

This inequality shows that the relative error in the computed approximation to the inverse matrix can be as great as the relative residual multiplied by the condition number. Therefore, if the condition number is large, the residual gives little information about the accuracy of the computed approximation.
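The same loss of accuracy is visible for linear systems. In the hedged numpy sketch below, the Hilbert matrix (a classical ill-conditioned example) yields a residual near machine precision while the error in the solution is larger by roughly the factor μ(A):

    import numpy as np

    n = 10
    # Hilbert matrix H[i, j] = 1 / (i + j + 1): a classical ill-conditioned case.
    H = 1.0 / (np.arange(n)[:, None] + np.arange(n)[None, :] + 1.0)

    x_true = np.ones(n)
    b = H @ x_true
    x = np.linalg.solve(H, b)

    residual = np.linalg.norm(H @ x - b) / np.linalg.norm(b)
    error = np.linalg.norm(x - x_true) / np.linalg.norm(x_true)
    print(f"mu(H) ~ {np.linalg.cond(H):.1e}")   # about 1e13
    print(f"residual ~ {residual:.1e}")         # near machine precision
    print(f"error ~ {error:.1e}")               # many digits lost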


7.2. Practically regular and practically singular matrices

There are many different ways to express the quality of regularity of a matrix A. One of them is given by the so-called parameter of practical regularity (denoted μ∗).

If μ(A) < μ∗, then the matrix A is called μ∗-regular (practically regular). Introduction of the value μ∗ allows us to organise the process of computation of A^{−1} in such a way that we either compute it with all valid digits (except the last digit of its computer representation) or detect the ill-conditioning of the problem, signalled by the inequality μ(A) > μ∗.
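In code the test is a single comparison; everything of substance lies in the reliable computation of μ(A). A sketch reusing mu() from Section 7.1 and the Hilbert matrix H from the previous sketch:

    def practically_regular(A, mu_star):
        # A is declared mu*-regular when mu(A) < mu*; otherwise it is
        # treated as practically singular for this choice of mu*.
        return mu(A) < mu_star

    # With mu* = 1e8, the 10-dimensional Hilbert matrix above is practically
    # singular even though it is regular in exact arithmetic.
    print(practically_regular(H, 1e8))   # False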

8. Non-symmetric eigenvalue problem

The set Σ(A) of the complex numbers z for which A − zI is singular is the spectrum of A, and its elements are the eigenvalues of A. Since any matrix is similar to a Hessenberg matrix H, it suffices to show that H has an eigenvalue. Let

H = \begin{pmatrix}
a_{11}    & b_2       &        &             & O      \\
a_{21}    & a_{22}    & b_3    &             &        \\
\vdots    & \vdots    & \ddots & \ddots      &        \\
a_{N-1,1} & a_{N-1,2} & \cdots & a_{N-1,N-1} & b_N    \\
a_{N1}    & a_{N2}    & \cdots & a_{N,N-1}   & a_{NN}
\end{pmatrix}

be a strong lower Hessenberg matrix with real elements a_{ij} (j ≤ i) and b_i ≠ 0, i = 2, 3, ..., N. The sequence

D_0(λ) = 1;  D_1(λ) = a_{11} − λ;  and, for k = 2, 3, ..., N,

D_k(λ) = (a_{kk} − λ)D_{k−1}(λ) − b_k a_{k,k−1}D_{k−2}(λ) + b_k b_{k−1} a_{k,k−2}D_{k−3}(λ) − ... − (−1)^k b_k b_{k−1} ⋯ b_2 a_{k1}D_0(λ),

is a generalisation of a Sturm sequence for a strong Hessenberg matrix. D_N(λ) is known as a characteristic polynomial of H. Let λ satisfy D_N(λ) = 0. It is easy to check that the vector x with the components

x_1 = 1,  x_2 = −D_1(λ)/b_2,  ...,  x_N = (−1)^{N+1} D_{N−1}(λ)/[b_2 b_3 ⋯ b_N]

satisfies the equation (H − λI)x = 0.
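The generalized Sturm sequence is straightforward to evaluate; the sketch below (ours, with an assumed storage convention: the superdiagonal entry H[k−1, k] holds b_{k+1}) checks that D_N vanishes at the eigenvalues of H.

    import numpy as np

    def hessenberg_sturm(H, lam):
        # Generalized Sturm sequence for a strong lower Hessenberg matrix,
        # implementing the displayed recursion for D_0, D_1, ..., D_N.
        N = H.shape[0]
        D = [1.0, H[0, 0] - lam]
        for k in range(2, N + 1):
            term = (H[k - 1, k - 1] - lam) * D[k - 1]
            coeff = 1.0
            for j in range(k - 1, 0, -1):
                coeff *= -H[j - 1, j]            # accumulates -b_k, +b_k b_{k-1}, ...
                term += coeff * H[k - 1, j - 1] * D[j - 1]
            D.append(term)
        return D

    H = np.array([[1.0, 2.0, 0.0],
                  [3.0, 4.0, 5.0],
                  [6.0, 7.0, 8.0]])   # strong lower Hessenberg: b_2, b_3 != 0

    # D_N vanishes exactly at the eigenvalues of H.
    for lam in np.linalg.eigvals(H):
        assert abs(hessenberg_sturm(H, lam)[-1]) < 1e-8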

Hence any N-dimensional Hessenberg matrix A has N (possibly multiple) eigenvalues. For each of them there exists a one-dimensional subspace invariant with respect to A (an eigensubspace). The same is true for any square N-dimensional matrix A.


8.1. Schur decomposition

There are two main generalisations of the EVD to non-symmetric matrices. The first one is the SVD. Another generalisation is the well-known Schur theorem: let λ_1(A), λ_2(A), ..., λ_m(A) be a set of eigenvalues of A, symmetric with respect to the Ox-axis, lying inside a domain Ω_1 and well-separated by a given closed curve Γ from the remaining part of the spectrum of A, which lies inside a domain Ω_2; then there exists an orthogonal matrix U such that

U^*AU = \begin{pmatrix} B & D \\ O & C \end{pmatrix}.

Here B and C are respectively m- and (N − m)-dimensional square matrices; all eigenvalues of B are inside Ω_1 and all eigenvalues of C are inside Ω_2.
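scipy's real Schur decomposition can produce exactly this block form when one asks for the eigenvalues inside a given curve to be sorted into the leading block. In the sketch below (an illustration of ours; Γ is taken to be the unit circle, scipy's 'iuc' option), B collects the eigenvalues inside Γ and C those outside.

    import numpy as np
    from scipy.linalg import schur

    rng = np.random.default_rng(2)
    A = rng.standard_normal((5, 5))

    # Real Schur form Z^T A Z = T with the eigenvalues inside the unit
    # circle ('iuc') sorted into the leading m-by-m block B.
    T, Z, m = schur(A, output='real', sort='iuc')
    B, C = T[:m, :m], T[m:, m:]

    assert np.allclose(Z.T @ A @ Z, T)
    assert np.allclose(T[m:, :m], 0.0)                     # the O block
    if 0 < m < 5:
        assert np.all(np.abs(np.linalg.eigvals(B)) < 1.0)  # spectrum of B inside Gamma
        assert np.all(np.abs(np.linalg.eigvals(C)) > 1.0)  # spectrum of C outside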

8.2. Maximal invariant eigensubspaces

Let Ω_1, Ω_2, ..., Ω_m be m pairwise disjoint domains. Assume that each Ω_j (j = 1, 2, ..., m) is symmetric with respect to the Ox-axis and such that Σ(A) ∩ Ω_j ≠ ∅, j = 1, 2, ..., m, and Σ(A) ⊂ Ω_1 ∪ Ω_2 ∪ ... ∪ Ω_m.

Then there exist m subspaces L(A, Ω_j), j = 1, 2, ..., m. Each of them is an A-invariant subspace, and the spectrum of the restriction of A to L(A, Ω_j) lies strictly inside Ω_j. L(A, Ω_j) is known as the maximal invariant eigensubspace of A corresponding to Ω_j. In this case

R^N = L(A, Ω_1) ⊕ L(A, Ω_2) ⊕ ... ⊕ L(A, Ω_m).

In each L(A, Ω_j) there exists an orthonormal basis v_{1,j}, v_{2,j}, ..., v_{k(j),j}. In the general case, the union of such bases is not an orthonormal basis in R^N.

8.3. Spectral portrait of a matrix

The problem of computing eigenvalues cannot be considered a correctly stated one. From the point of view of the "indefiniteness principle", we can only guarantee that instead of the matrix A we are dealing with a certain matrix A_0 which is close to A.

About that matrix we know only that the inequality ‖A − A_0‖ ≤ ε‖A‖ holds for a small positive value ε. The value ε characterises either the number of digits in a computer cell or the uncertainties in the data. Therefore, instead of the eigenvalues of the matrix A, we deal with the spots of its "ε-spectrum".


Let ε > 0 be given. A complex number λ is in the ε-spectrum of A, which we denote by Σ_ε(A), if the smallest singular value of A − λI is less than or equal to ε‖A‖. There are some other equivalent definitions of the ε-spectrum of A (see, for example, [4]–[7]).
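This definition translates directly into a test, and scanning a grid of complex points gives a crude text-mode spectral portrait. A numpy sketch of ours (MVC draws real portraits): the deliberately non-normal matrix below has the single eigenvalue 1, yet its ε-spectrum is a large spot around it.

    import numpy as np

    def in_eps_spectrum(A, z, eps):
        # z is in the eps-spectrum iff the smallest singular value of
        # A - z I does not exceed eps * ||A||.
        smin = np.linalg.svd(A - z * np.eye(A.shape[0]), compute_uv=False)[-1]
        return smin <= eps * np.linalg.norm(A, 2)

    A = np.array([[1.0, 100.0],
                  [0.0,   1.0]])   # non-normal: double eigenvalue at 1
    eps = 1e-5

    # Crude spectral portrait on the rectangle [0, 2] x [-1, 1].
    for im in np.linspace(1.0, -1.0, 9):
        print("".join("#" if in_eps_spectrum(A, re + 1j * im, eps) else "."
                      for re in np.linspace(0.0, 2.0, 33)))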

At present, when an eigenvalue problem is computed, we have a visualisation of the eigenvalue spots (see [15]). This visualisation is known as a spectral portrait of a matrix. And now, instead of solving the full eigenvalue problem, in practice we can only solve a partial eigenvalue problem. The aim of this method is to recognise the eigenvalue spots and to compute the corresponding maximal invariant subspaces.

Acknowledgement. The authors are thankful to T. Huckle and J. Staudacher, who read the first version of the book and gave useful advice that led to improvements in some chapters of the book.

References

1. von Neumann, J. and Goldstine, H.H. (1947): Numerical inverting of matrices of high order. Bull. Amer. Math. Soc. 53, 1021–1099.

2. Turing, A.M. (1948): Rounding-off errors in matrix processes. Quart. J. Mech. Appl. Math. 1, 287–308.

3. Wilkinson, J.H. (1965): The Algebraic Eigenvalue Problem. Clarendon Press, Oxford.

4. Godunov, S.K. (1998): Modern Aspects of Linear Algebra. Translations of Mathematical Monographs 175. American Mathematical Society, Providence, RI.

5. Bulgak, A. and Bulgak, H. (2001): Linear Algebra (in Turkish). Selcuk University, Research Centre of Applied Mathematics, Konya.

6. Trefethen, L.N. and Bau, D. III (1997): Numerical Linear Algebra. SIAM.

7. Kostin, V.I. (1991): On definition of matrices' spectra. High Performance Computing II.

8. Gantmaher, F.R. (1959): The Theory of Matrices, vols. 1–2. Chelsea, New York.

9. Givens, W. (1958): Computation of plane unitary rotations transforming a general matrix to triangular form. J. Soc. Industr. Appl. Math. 6, 26–50.

10. Halmos, P. (1958): Finite Dimensional Vector Spaces. Van Nostrand, New York.

11. Householder, A.S. (1964): The Theory of Matrices in Numerical Analysis. Blaisdell, New York.

12. Malt'cev, A.I. (1970): Foundations of Linear Algebra (in Russian). Nauka, Moscow.

13. Kostrikin, A.I. and Manin, Yu.I. (1980): Linear Algebra and Geometry (in Russian). Izdatel'stvo Moskovskogo Universiteta, Moskva.

14. Godunov, S.K. (2002): Lectures on Modern Aspects of Linear Algebra (in Russian). A University Series 12, Nauchnaya Kniga, Novosibirsk.

15. Bulgak, H. and Eminov, D. (2001): Computer dialogue system MVC. Selcuk J. Appl. Math. 2, No. 2, 17–38.
