Matrices and determinants
The beginnings of matrices and determinants go back to the second century BC, although traces can be seen as far back as the fourth century BC. However, it was not until near the end of the 17th Century that the ideas reappeared and development really got underway.
It is not surprising that the beginnings of matrices and determinants should arise through the study of systems of linear equations. The Babylonians studied problems which lead to simultaneous linear equations and some of these are preserved in clay tablets which survive. For example a tablet dating from around 300 BC contains the following problem:-
There are two fields whose total area is 1800 square yards. One produces grain at the rate of $\large\frac{2}{3}\normalsize$ of a bushel per square yard while the other produces grain at the rate of $\large\frac{1}{2}\normalsize$ a bushel per square yard. If the total yield is 1100 bushels, what is the size of each field?
The Chinese, between 200 BC and 100 BC, came much closer to matrices than the Babylonians. Indeed it is fair to say that the text Nine Chapters on the Mathematical Art written during the Han Dynasty gives the first known example of matrix methods. First a problem is set up which is similar to the Babylonian example given above:-
There are three types of corn, of which three bundles of the first, two of the second, and one of the third make 39 measures. Two of the first, three of the second and one of the third make 34 measures. And one of the first, two of the second and three of the third make 26 measures. How many measures of corn are contained of one bundle of each type?
Now the author does something quite remarkable. He sets up the coefficients of the system of three linear equations in three unknowns as a table on a 'counting board'.
$\begin{array}{ccc} 1 & 2 & 3 \\ 2 & 3 & 2 \\ 3 & 1 & 1 \\ 26 & 34 & 39 \end{array}$
Our late 20th Century methods would have us write the linear equations as the rows of the matrix rather than the columns but of course the method is identical. Most remarkably the author, writing in 200 BC, instructs the reader to multiply the middle column by 3 and subtract the right column as many times as possible; the same is then done subtracting the right column as many times as possible from 3 times the first column. This gives
$\begin{array}{ccc} 0 & 0 & 3 \\ 4 & 5 & 2 \\ 8 & 1 & 1 \\ 39 & 24 & 39 \end{array}$
Next the leftmost column is multiplied by 5 and then the middle column is subtracted as many times as possible. This gives
$\begin{array}{ccc} 0 & 0 & 3 \\ 0 & 5 & 2 \\ 36 & 1 & 1 \\ 99 & 24 & 39 \end{array}$
from which the solution can be found for the third type of corn, then for the second, then the first by back substitution. This method, now known as Gaussian elimination, would not become well known until the early 19th Century.
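The column operations just described can be replayed in a few lines of Python. This is only a sketch under stated assumptions: the names `board`, `scale_and_subtract` and `pivot_row` are invented here for illustration, and 'as many times as possible' is read as 'until the leading entry of the column becomes zero'. It makes no claim to reproduce how the counting board was actually manipulated.

```python
from fractions import Fraction

# Counting-board layout from the text: each COLUMN is one equation and the
# bottom row holds the totals. (Variable names are illustrative assumptions.)
board = [
    [1, 2, 3],     # bundles of the first type
    [2, 3, 2],     # bundles of the second type
    [3, 1, 1],     # bundles of the third type
    [26, 34, 39],  # measures
]
LEFT, MIDDLE, RIGHT = 0, 1, 2


def column(b, j):
    """Return column j of the board as a list."""
    return [row[j] for row in b]


def scale_and_subtract(b, target, factor, source, pivot_row):
    """Multiply column `target` by `factor`, then subtract column `source`
    as often as needed to make the entry in `pivot_row` zero."""
    scaled = [factor * v for v in column(b, target)]
    src = column(b, source)
    times = scaled[pivot_row] // src[pivot_row]
    for row, t, s in zip(b, scaled, src):
        row[target] = t - times * s


# The three steps given in the text.
scale_and_subtract(board, MIDDLE, 3, RIGHT, pivot_row=0)
scale_and_subtract(board, LEFT, 3, RIGHT, pivot_row=0)
scale_and_subtract(board, LEFT, 5, MIDDLE, pivot_row=1)

# Back substitution: third type from the left column, then second, then first.
third = Fraction(board[3][LEFT], board[2][LEFT])
second = (board[3][MIDDLE] - board[2][MIDDLE] * third) / board[1][MIDDLE]
first = (board[3][RIGHT] - board[1][RIGHT] * second
         - board[2][RIGHT] * third) / board[0][RIGHT]

print(first, second, third)  # measures in one bundle of each type
```

Exact fractions are used because the solution is not a whole number of measures: the final board gives 36 of the third type equal to 99 measures, that is $\large\frac{11}{4}\normalsize$ measures per bundle.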
Cardan, in Ars Magna (1545), gives a rule for solving a system of two linear equations which he calls regula de modo and which [7] calls 'mother of rules'! This rule gives what is essentially Cramer's rule for solving a 2 × 2 system although Cardan does not make the final step. Cardan therefore does not reach the definition of a determinant but, with the advantage of hindsight, we can see that his method does lead to the definition.
Many standard results of elementary matrix theory first appeared long before matrices were the object of mathematical investigation. For example, de Witt in Elements of curves, published as a part of the commentaries on the 1660 Latin version of Descartes' Géométrie, showed how a transformation of the axes reduces a given equation for a conic to canonical form. This amounts to diagonalising a symmetric matrix but de Witt never thought in these terms.
The idea of a determinant appeared in Japan before it appeared in Europe. In 1683 Seki wrote Method of solving the dissimulated problems which contains matrix methods written as tables in exactly the way the Chinese methods described above were constructed. Without having any word which corresponds to 'determinant' Seki still introduced determinants and gave general methods for calculating them based on examples. Using his 'determinants' Seki was able to find determinants of 2 × 2, 3 × 3, 4 × 4 and 5 × 5 matrices and applied them to solving equations but not systems of linear equations.
The first appearance of a determinant in Europe was ten years later. In 1693 Leibniz wrote to de l'Hôpital. He explained that the system of equations
$10 + 11x + 12y = 0\newline 20 + 21x + 22y = 0\newline 30 + 31x + 32y = 0$
had a solution because
10.21.32 + 11.22.30 + 12.20.31 = 10.22.31 + 11.20.32 + 12.21.30
which is exactly the condition that the coefficient matrix has determinant 0. Notice that here Leibniz is not using numerical coefficients but
two characters, the first marking in which equation it occurs, the second marking which letter it belongs to.
Hence 21 denotes what we might write as $a_{21}$.
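In modern notation, writing $a_{ij}$ for Leibniz's two-character coefficient $ij$ (a convention adopted here for illustration, not Leibniz's own symbolism), moving every term of his condition to one side gives the expansion of a 3 × 3 determinant:
$\begin{vmatrix} a_{10} & a_{11} & a_{12} \\ a_{20} & a_{21} & a_{22} \\ a_{30} & a_{31} & a_{32} \end{vmatrix} = a_{10}a_{21}a_{32} + a_{11}a_{22}a_{30} + a_{12}a_{20}a_{31} - a_{10}a_{22}a_{31} - a_{11}a_{20}a_{32} - a_{12}a_{21}a_{30} = 0$
The three positive products are those on the left of Leibniz's equation and the three negative products are those on its right.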
Leibniz was convinced that good mathematical notation was the key to progress so he experimented with different notation for coefficient systems. His unpublished manuscripts contain more than 50 different ways of writing coefficient systems which he worked on during a period of 50 years beginning in 1678. Only two publications (1700 and 1710) contain results on coefficient systems and these use the same notation as in his letter to de l'Hôpital mentioned above.
Leibniz used the word 'resultant' for certain combinatorial sums of terms of a determinant. He proved various results on resultants including what is essentially Cramer's rule. He also knew that a determinant could be expanded using any column, which is what is now called the Laplace expansion. As well as studying coefficient systems of equations which led him to determinants, Leibniz also studied coefficient systems of quadratic forms which led naturally towards matrix theory.
In the 1730's Maclaurin wrote Treatise of algebra although it was not published until 1748, two years after his death. It contains the first published results on determinants, proving Cramer's rule for 2 × 2 and 3 × 3 systems and indicating how the 4 × 4 case would work. Cramer gave the general rule for $n \times n$ systems in a paper Introduction to the analysis of algebraic curves (1750). It arose out of a desire to find the equation of a plane curve passing through a number of given points. The rule appears in an Appendix to the paper but no proof is given:-
One finds the value of each unknown by forming n fractions of which the common denominator has as many terms as there are permutations of n things.
Cramer does go on to explain precisely how one calculates these terms as products of certain coefficients in the equations and how one determines the sign. He also says how the n numerators of the fractions can be found by replacing certain coefficients in this calculation by constant terms of the system.
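The rule as paraphrased above can be sketched in Python. This is a modern illustration rather than Cramer's own presentation: the function names `sign`, `det` and `cramer` are assumptions made here, the sign of each term is taken from the parity of the number of inversions of the permutation, and the test system is the corn problem from the Nine Chapters quoted earlier.

```python
from fractions import Fraction
from itertools import permutations


def sign(perm):
    """Sign of a permutation: +1 for an even number of inversions, else -1."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def det(m):
    """Common denominator of the rule: one signed product per permutation."""
    total = 0
    for perm in permutations(range(len(m))):
        term = sign(perm)
        for row, col in enumerate(perm):
            term *= m[row][col]
        total += term
    return total


def cramer(a, b):
    """Solve a x = b for integer entries.

    Raises ZeroDivisionError when det(a) is zero.
    """
    d = det(a)
    solution = []
    for k in range(len(a)):
        # Numerator k: replace column k of the coefficients by the constants.
        replaced = [row[:k] + [b_i] + row[k + 1:] for row, b_i in zip(a, b)]
        solution.append(Fraction(det(replaced), d))
    return solution


# Rows are equations here: 3x + 2y + z = 39, 2x + 3y + z = 34, x + 2y + 3z = 26.
corn = [[3, 2, 1], [2, 3, 1], [1, 2, 3]]
measures = [39, 34, 26]
print(cramer(corn, measures))
```

The number of terms grows as $n!$, which gives some idea of why a method of this kind might be regarded as impractical for larger systems.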
Work on determinants now began to appear regularly. In 1764 Bezout gave methods of calculating determinants as did Vandermonde in 1771. In 1772 Laplace claimed that the methods introduced by Cramer and Bezout were impractical and, in a paper where he studied the orbits of the inner planets, he discussed the solution of systems of linear equations without actually calculating it, by using determinants. Rather surprisingly Laplace used the word 'resultant' for what we now call the determinant: surprising since it is the same word as used by Leibniz yet Laplace must have been unaware of Leibniz's work. Laplace gave the expansion of a determinant which is now named after him.
Lagrange, in a paper of 1773, studied identities for 3 × 3 functional determinants. However this comment is made with hindsight since Lagrange himself saw no connection between his work and that of Laplace and Vandermonde. This 1773 paper on mechanics, however, contains what we now think of as the volume interpretation of a determinant for the first time. Lagrange showed that the tetrahedron formed by $O(0,0,0)$ and the three points $M(x,y,z), M'(x',y',z'), M''(x'',y'',z'')$ has volume
$\large\frac{1}{6}\normalsize [z(x'y'' - y'x'') + z'(yx'' - xy'') + z''(xy' - yx')]$.
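Read with hindsight, and in modern determinant notation that Lagrange did not use, the bracketed expression is the expansion along the third column of the determinant whose rows are the coordinates of the three points:
$\large\frac{1}{6}\normalsize \begin{vmatrix} x & y & z \\ x' & y' & z' \\ x'' & y'' & z'' \end{vmatrix} = \large\frac{1}{6}\normalsize [z(x'y'' - y'x'') - z'(xy'' - yx'') + z''(xy' - yx')]$
The sign of this quantity depends on the order in which the three points are taken, so it is its absolute value that gives the volume.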
The term 'determinant' was first introduced by Gauss in Disquisitiones arithmeticae (1801) while discussing quadratic forms. He used the term because the determinant determines the properties of the quadratic form. However the concept is not the same as that of our determinant. In the same work Gauss lays out the coefficients of his quadratic forms in rectangular arrays. He describes matrix multiplication (which he thinks of as composition so he has not yet reached the concept of matrix algebra) and the inverse of a matrix in the particular context of the arrays of coefficients of quadratic forms.
Gaussian elimination, which first appeared in the text Nine Chapters on the Mathematical Art written in 200 BC, was used by Gauss in his work which studied the orbit of the asteroid Pallas. Using observations of Pallas taken between 1803 and 1809, Gauss obtained a system of six linear equations in six unknowns. Gauss gave a systematic method for solving such equations which is precisely Gaussian elimination on the coefficient matrix.
It was Cauchy in 1812 who used 'determinant' in its modern sense. Cauchy's work is the most complete of the early works on determinants. He re-proved the earlier results and gave new results of his own on minors and adjoints. In the 1812 paper the multiplication theorem for determinants is proved for the first time, although at the same meeting of the Institut de France Binet also read a paper containing a proof of the multiplication theorem, one less satisfactory than that given by Cauchy.
In 1826 Cauchy, in the context of quadratic forms in n variables, used the term 'tableau' for the matrix of coefficients. He found the eigenvalues and gave results on diagonalisation of a matrix in the context of converting a form to the sum of squares. Cauchy also introduced the idea of similar matrices (but not the term) and showed that if two matrices are similar they have the same characteristic equation. He also, again in the context of quadratic forms, proved that every real symmetric matrix is diagonalisable.
Jacques Sturm gave a generalisation of the eigenvalue problem in the context of solving systems of ordinary differential equations. In fact the concept of an eigenvalue appeared 80 years earlier, again in work on systems of linear differential equations, by D'Alembert studying the motion of a string with masses attached to it at various points.
It should be stressed that neither Cauchy nor Jacques Sturm realised the generality of the ideas they were introducing and saw them only in the specific contexts in which they were working. Jacobi from around 1830 and then Kronecker and Weierstrass in the 1850's and 1860's also looked at matrix results but again in a special context, this time the notion of a linear transformation. Jacobi published three treatises on determinants in 1841. These were important in that for the first time the definition of the determinant was made in an algorithmic way and the entries in the determinant were not specified, so his results applied equally well to cases where the entries were numbers and to cases where they were functions. These three papers by Jacobi made the idea of a determinant widely known.
Cayley, also writing in 1841, published the first English contribution to the theory of determinants. In this paper he used two vertical lines on either side of the array to denote the determinant, a notation which has now become standard.
Eisenstein in 1844 denoted linear substitutions by a single letter and showed how to add and multiply them like ordinary numbers except for the lack of commutativity. It is fair to say that Eisenstein was the first to think of linear substitutions as forming an algebra as can be seen in this quote from his 1844 paper:-
An algorithm for calculation can be based on this, it consists of applying the usual rules for the operations of multiplication, division, and exponentiation to symbolic equations between linear systems, correct symbolic equations are always obtained, the sole consideration being that the order of the factors may not be altered.
The first to use the term 'matrix' was Sylvester in 1850. Sylvester defined a matrix to be an oblong arrangement of terms and saw it as something which led to various determinants from square arrays contained within it. After leaving America and returning to England in 1851, Sylvester became a lawyer and met Cayley, a fellow lawyer who shared his interest in mathematics. Cayley quickly saw the significance of the matrix concept and by 1853 Cayley had published a note giving, for the first time, the inverse of a matrix.
Cayley in 1858 published Memoir on the theory of matrices which is remarkable for containing the first abstract definition of a matrix. He shows that the coefficient arrays studied earlier for quadratic forms and for linear transformations are special cases of his general concept. Cayley gave a matrix algebra defining addition, multiplication, scalar multiplication and inverses. He gave an explicit construction of the inverse of a matrix in terms of the determinant of the matrix. Cayley also proved that, in the case of 2 × 2 matrices, a matrix satisfies its own characteristic equation. He stated that he had checked the result for 3 × 3 matrices, indicating its proof, but says:-
I have not thought it necessary to undertake the labour of a formal proof of the theorem in the general case of a matrix of any degree.
That a matrix satisfies its own characteristic equation is called the Cayley-Hamilton theorem so it is reasonable to ask what it has to do with Hamilton. In fact he also proved a special case of the theorem, the 4 × 4 case, in the course of his investigations into quaternions.
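For the 2 × 2 case the theorem can be checked directly. The following is a sketch in modern notation, with the symbols $a, b, c, d$, $I$ and $\lambda$ chosen here for illustration rather than taken from Cayley's memoir. For $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ the characteristic equation is $\lambda^2 - (a+d)\lambda + (ad - bc) = 0$, and substituting $A$ for $\lambda$ (with $I$ the identity matrix multiplying the constant term) gives
$A^2 - (a+d)A + (ad-bc)I = \begin{pmatrix} a^2+bc & ab+bd \\ ac+cd & bc+d^2 \end{pmatrix} - \begin{pmatrix} a^2+ad & ab+bd \\ ac+cd & ad+d^2 \end{pmatrix} + \begin{pmatrix} ad-bc & 0 \\ 0 & ad-bc \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$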
In 1870 the Jordan canonical form appeared in Treatise on substitutions and algebraic equations by Jordan. It appears in the context of a canonical form for linear substitutions over the finite field of order a prime.
Frobenius, in 1878, wrote an important work on matrices On linear substitutions and bilinear forms although he seemed unaware of Cayley's work. Frobenius in this paper deals with coefficients of forms and does not use the term matrix. However he proved important results on canonical matrices as representatives of equivalence classes of matrices. He cites Kronecker and Weierstrass as having considered special cases of his results in 1874 and 1868 respectively. Frobenius also proved the general result that a matrix satisfies its characteristic equation. This 1878 paper by Frobenius also contains the definition of the rank of a matrix which he used in his work on canonical forms and the definition of orthogonal matrices.
The nullity of a square matrix was defined by Sylvester in 1884. He defined the nullity of $A$, $n(A)$, to be the largest $i$ such that every minor of $A$ of order $n-i+1$ is zero. Sylvester was interested in invariants of matrices, that is, properties which are not changed by certain transformations. Sylvester proved that
$\max\{n(A), n(B)\} \le n(AB) \le n(A) + n(B)$.
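The inequality can be tried out on a small example. The sketch below is not Sylvester's argument: it assumes the modern equivalent of his definition, namely that the nullity of an $n \times n$ matrix is $n$ minus its rank, and the helper names `rank`, `nullity` and `matmul` as well as the matrices `a` and `b` are chosen here purely for illustration.

```python
from fractions import Fraction


def rank(m):
    """Rank by row reduction over exact fractions."""
    rows = [[Fraction(v) for v in row] for row in m]
    r = 0  # number of pivots found so far
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def nullity(m):
    """Nullity of a square matrix, taken as its size minus its rank."""
    return len(m) - rank(m)


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


a = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]
b = [[1, 0, 0], [0, 0, 0], [0, 0, 1]]
ab = matmul(a, b)

print(nullity(a), nullity(b), nullity(ab))
print(max(nullity(a), nullity(b)) <= nullity(ab) <= nullity(a) + nullity(b))
```

In this example each factor has nullity 1 and the product has nullity 2, so the upper bound is attained.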
In 1896 Frobenius became aware of Cayley's 1858 Memoir on the theory of matrices and after this started to use the term matrix. Although Cayley had proved the Cayley-Hamilton theorem only for 2 × 2 and 3 × 3 matrices, Frobenius generously attributed the result to Cayley, even though Frobenius had been the first to prove the general theorem.
An axiomatic definition of a determinant was used by Weierstrass in his lectures and, after his death, it was published in 1903 in the note On determinant theory. In the same year Kronecker's lectures on determinants were also published, again after his death. With these two publications the modern theory of determinants was in place but matrix theory took slightly longer to become a fully accepted theory. An important early text which brought matrices into their proper place within mathematics was Introduction to higher algebra by Bôcher in 1907. Turnbull and Aitken wrote influential texts in the 1930's and Mirsky's An introduction to linear algebra in 1955 saw matrix theory reach its present major role as one of the most important undergraduate mathematics topics.
References
- I Grattan-Guinness and W Ledermann, Matrix theory, in I Grattan-Guinness (ed.), Companion Encyclopedia of the History and Philosophy of the Mathematical Sciences (London, 1994), 775-786.
- T W Hawkins, Cauchy and the spectral theory of matrices, Historia Mathematica 2 (1975), 1-29.
- T W Hawkins, Another look at Cayley and the theory of matrices, Archives Internationales d'Histoire des Sciences 26 (100) (1977), 82-112.
- T W Hawkins, Weierstrass and the theory of matrices, Archive for History of Exact Sciences 17 (1977), 119-163.
- T Hawkins, The theory of matrices in the 19th century, Proceedings of the International Congress of Mathematicians 2 (Montreal, Que., 1975), 561-570.
- A Jennings, Matrices, ancient and modern, Bull. Inst. Math. Appl. 13 (5) (1977), 117-123.
- E Knobloch, Determinants, in I Grattan-Guinness (ed.), Companion Encyclopedia of the History and Philosophy of the Mathematical Sciences (London, 1994), 766-774.
- E Knobloch, From Gauss to Weierstrass : determinant theory and its historical evaluations, in The intersection of history and mathematics (Basel, 1994), 51-66.
- E Knobloch, Zur Vorgeschichte der Determinantentheorie, Theoria cum praxi : on the relationship of theory and praxis in the seventeenth and eighteenth centuries IV (Hannover, 1977), 96-118.
- E Knobloch, Der Beginn der Determinantentheorie, Leibnizens nachgelassene Studien zum Determinantenkalkül (Hildesheim, 1980).
- A E Malykh, Development of the general theory of determinants up to the beginning of the nineteenth century (Russian), Mathematical analysis (Leningrad, 1990), 88-97.
- T Muir, The Theory of Determinants in the Historical Order of Development (4 Volumes) (London, 1960).
- J Tvrdá, On the origin of the theory of matrices, Acta Historiae Rerum Naturalium necnon Technicarum (Prague, 1971), 335-354.
Written by
J J O'Connor and E F Robertson
Last Update February 1996