Vandermonde matrix: Difference between revisions
mNo edit summary |
→Confluent Vandermonde matrices: This section is about Hermite interpolation |
||
(329 intermediate revisions by more than 100 users not shown) | |||
Line 1: | Line 1: | ||
{{Short description|Mathematical concept}} |
|||
In [[linear algebra]], a '''Vandermonde matrix''' is a [[matrix]] with a [[geometric progression]] in each column, i.e; |
|||
In [[linear algebra]], a '''Vandermonde matrix''', named after [[Alexandre-Théophile Vandermonde]], is a [[matrix (mathematics)|matrix]] with the terms of a [[geometric progression]] in each row: an <math>(m + 1) \times (n + 1)</math> matrix |
|||
:<math>V = V(x_0, x_1, \cdots, x_m) = |
|||
\begin{bmatrix} |
|||
1 & x_0 & x_0^2 & \dots & x_0^n\\ |
|||
1 & x_1 & x_1^2 & \dots & x_1^n\\ |
|||
1 & x_2 & x_2^2 & \dots & x_2^n\\ |
|||
\vdots & \vdots & \vdots & \ddots &\vdots \\ |
|||
1 & x_m & x_m^2 & \dots & x_m^n |
|||
\end{bmatrix}</math> |
|||
with entries <math>V_{i,j} = x_i^j </math>, the ''j''<sup>th</sup> power of the number <math>x_i</math>, for all [[Zero-based numbering|zero-based]] indices <math>i </math> and <math>j </math>.<ref>Roger A. Horn and Charles R. Johnson (1991), ''Topics in matrix analysis'', Cambridge University Press. ''See Section 6.1''.</ref> Some authors define the Vandermonde matrix as the [[transpose]] of the above matrix.<ref name=":0">{{Cite book |last1=Golub |first1=Gene H. |title=Matrix Computations |last2=Van Loan |first2=Charles F. |publisher=The Johns Hopkins University Press |year=2013 |isbn=978-1-4214-0859-0 |edition=4th |pages=203–207}}</ref><ref name="MS" /> |
|||
The [[determinant]] of a [[square matrix|square]] Vandermonde matrix (when <math>n=m</math>) is called a '''Vandermonde determinant''' or [[Vandermonde polynomial]]. Its value is: |
|||
:<math>V=\begin{bmatrix} |
|||
:<math>\det(V) = \prod_{0 \le i < j \le m} (x_j - x_i). </math> |
|||
1 & 1 & 1 & \dots & 1\\ |
|||
This is non-zero if and only if all <math>x_i</math> are distinct (no two are equal), making the Vandermonde matrix [[Invertible matrix|invertible]]. |
|||
\alpha_1 & \alpha_2 & \alpha_3 & \dots & \alpha_n\\ |
|||
\alpha_1^2 & \alpha_2^2 & \alpha_3^2 & \dots & \alpha_n^2\\ |
|||
==Applications== |
|||
\alpha_1^3 & \alpha_2^3 & \alpha_3^3 & \dots & \alpha_n^3\\ |
|||
The [[polynomial interpolation]] problem is to find a polynomial <math>p(x) = a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n</math> which satisfies <math>p(x_0)=y_0, \ldots,p(x_m)=y_m</math> for given data points <math>(x_0,y_0),\ldots,(x_m,y_m)</math>. This problem can be reformulated in terms of [[linear algebra]] by means of the Vandermonde matrix, as follows. <math>V</math> computes the values of <math>p(x)</math> at the points <math>x=x_0,\ x_1,\dots,\ x_m </math> via a matrix multiplication <math>Va = y</math>, where <math>a = (a_0,\ldots,a_n)</math> is the vector of coefficients and <math>y = (y_0,\ldots,y_m)= (p(x_0),\ldots,p(x_m))</math> is the vector of values (both written as column vectors): |
|||
\vdots & \vdots & \vdots & &\vdots \\ |
|||
\alpha_1^{m-1} & \alpha_2^{m-1} & \alpha_3^{m-1} & \dots & \alpha_n^{m-1}\\ |
|||
<math display="block">\begin{bmatrix} |
|||
1 & x_0 & x_0^2 & \dots & x_0^n\\ |
|||
1 & x_1 & x_1^2 & \dots & x_1^n\\ |
|||
1 & x_2 & x_2^2 & \dots & x_2^n\\ |
|||
\vdots & \vdots & \vdots & \ddots &\vdots \\ |
|||
1 & x_m & x_m^2 & \dots & x_m^n |
|||
\end{bmatrix} |
|||
\cdot |
|||
\begin{bmatrix} |
|||
a_0\\ |
|||
a_1\\ |
|||
\vdots\\ |
|||
a_n |
|||
\end{bmatrix} |
|||
= |
|||
\begin{bmatrix} |
|||
p(x_0)\\ |
|||
p(x_1)\\ |
|||
\vdots\\ |
|||
p(x_m) |
|||
\end{bmatrix}. |
|||
</math>If <math>n = m</math> and <math>x_0,\dots,\ x_n </math> are distinct, then ''V'' is a square matrix with non-zero determinant, i.e. an [[invertible matrix]]. Thus, given ''V'' and ''y'', one can find the required <math>p(x)</math> by solving for its coefficients <math>a </math> in the equation <math>Va = y</math>:<ref>François Viète (1540-1603), Vieta's formulas, https://en.wikipedia.org/wiki/Vieta%27s_formulas</ref> <blockquote><math>a = V^{-1}y</math>. </blockquote>That is, the map from coefficients to values of polynomials is a bijective linear mapping with matrix ''V'', and the interpolation problem has a unique solution. This result is called the [[unisolvence theorem]], and is a special case of the [[Chinese remainder theorem#Over univariate polynomial rings and Euclidean domains|Chinese remainder theorem for polynomials]]. |
|||
In [[statistics]], the equation <math>Va = y</math> means that the Vandermonde matrix is the [[design matrix]] of [[polynomial regression]]. |
|||
In [[numerical analysis]], solving the equation <math>Va = y</math> [[System of linear equations#Solving a linear system|naïvely by Gaussian elimination]] results in an algorithm with [[time complexity]] O(''n''<sup>3</sup>). Exploiting the structure of the Vandermonde matrix, one can use [[Newton polynomial|Newton's divided differences method]]<ref>{{Cite journal |last1=Björck |first1=Å. |last2=Pereyra |first2=V. |date=1970 |title=Solution of Vandermonde Systems of Equations |journal=[[American Mathematical Society]] |volume=24 |issue=112 |pages=893–903 |doi=10.1090/S0025-5718-1970-0290541-1|s2cid=122006253 |doi-access=free }}</ref> (or the [[Lagrange polynomials|Lagrange interpolation formula]]<ref>{{Cite book |last1=Press |first1=WH |title=Numerical Recipes: The Art of Scientific Computing |last2=Teukolsky |first2=SA |last3=Vetterling |first3=WT |last4=Flannery |first4=BP |publisher=Cambridge University Press |year=2007 |isbn=978-0-521-88068-8 |edition=3rd |location=New York |chapter=Section 2.8.1. Vandermonde Matrices |chapter-url=http://apps.nrbook.com/empanel/index.html?pg=94}}</ref><ref>Inverse of Vandermonde Matrix (2018), https://proofwiki.org/wiki/Inverse_of_Vandermonde_Matrix</ref>) to solve the equation in O(''n''<sup>2</sup>) time, which also gives the [[LU decomposition|UL factorization]] of <math>V^{-1}</math>. The resulting algorithm produces extremely accurate solutions, even if <math>V</math> is [[Condition number|ill-conditioned]].<ref name=":0" /> (See [[polynomial interpolation]].) |
|||
The Vandermonde determinant is used in the [[representation theory of the symmetric group]].<ref>{{Fulton-Harris}} ''Lecture 4 reviews the representation theory of symmetric groups, including the role of the Vandermonde determinant''.</ref> |
|||
When the values <math>x_i</math> belong to a [[finite field]], the Vandermonde determinant is also called the [[Moore matrix|Moore determinant]], and has properties which are important in the theory of [[BCH code]]s and [[Reed–Solomon error correction]] codes. |
|||
The [[discrete Fourier transform]] is defined by a specific Vandermonde matrix, the [[DFT matrix]], where the <math>x_i</math> are chosen to be {{math|''n''}}th [[roots of unity]]. The [[Fast Fourier transform]] computes the product of this matrix with a vector in <math>O(n\log^2n)</math> time.<ref>Gauthier, J. "Fast Multipoint Evaluation On n Arbitrary Points." ''Simon Fraser University, Tech. Rep'' (2017).</ref> |
|||
In the [[Physics|physical]] theory of the [[quantum Hall effect]], the Vandermonde determinant shows that the [[Laughlin wavefunction]] with filling factor 1 is equal to a [[Slater determinant]]. This is no longer true for filling factors different from 1 in the [[fractional quantum Hall effect]]. |
|||
In the geometry of [[polyhedra]], the Vandermonde matrix gives the normalized volume of arbitrary <math>k</math>-faces of [[cyclic polytope]]s. Specifically, if <math>F = C_{d}(t_{i_{1}}, \dots, t_{i_{k + 1}})</math> is a <math>k</math>-face of the cyclic polytope <math>C_d(T) \subset \mathbb{R}^{d}</math> corresponding to <math>T = \{t_{1}< \cdots < t_{N}\} \subset \mathbb{R}</math>, then<math display="block">\mathrm{nvol}(F) = \frac{1}{k!}\prod_{1 \leq m < n \leq k + 1}{(t_{i_{n}} - t_{i_{m}})}.</math> |
|||
== Determinant == |
|||
The [[determinant]] of a [[square matrix|square]] Vandermonde matrix is called a ''[[Vandermonde polynomial]]'' or ''Vandermonde determinant''. Its value is the [[polynomial]] |
|||
:<math>\det(V) = \prod_{0 \le i < j \le n} (x_j - x_i) </math> |
|||
which is non-zero if and only if all <math>x_i</math> are distinct. |
|||
The Vandermonde determinant was formerly sometimes called the ''discriminant'', but in current terminology the [[discriminant]] of a polynomial <math>p(x)=(x-x_0)\cdots(x-x_n)</math> is the ''square'' of the Vandermonde determinant of the [[root of a function|roots]] <math>x_i</math>. The Vandermonde determinant is an [[alternating form]] in the <math>x_i</math>, meaning that exchanging two <math>x_i</math> changes the sign, and <math>\det(V)</math> thus depends on order for the <math>x_i</math>. By contrast, the discriminant <math>\det(V)^2</math> does not depend on any order, so that [[Galois theory]] implies that the discriminant is a [[polynomial function]] of the coefficients of <math>p(x)</math>. |
|||
The determinant formula is proved below in three ways. The first uses polynomial properties, especially the [[unique factorization domain|unique factorization property]] of [[multivariate polynomial]]s. Although conceptually simple, it involves non-elementary concepts of [[abstract algebra]]. The second proof is based on the [[linear algebra]] concepts of [[change of basis]] in a vector space and the determinant of a [[linear map]]. In the process, it computes the [[LU decomposition]] of the Vandermonde matrix. The third proof is more elementary but more complicated, using only [[elementary matrix|elementary row and column operations]]. |
|||
===First proof: Polynomial properties=== |
|||
The first proof relies on properties of polynomials. |
|||
By the [[Leibniz formula (determinant)|Leibniz formula]], <math>\det(V)</math> is a polynomial in the <math>x_i</math>, with integer coefficients. All entries of the <math>(i-1)</math>-th column have [[total degree]] <math>i</math>. Thus, again by the Leibniz formula, all terms of the determinant have total degree |
|||
:<math>0 + 1 + 2 + \cdots + n = \frac{n(n+1)}2;</math> |
|||
(that is, the determinant is a [[homogeneous polynomial]] of this degree). |
|||
If, for <math>i \neq j</math>, one substitutes <math>x_i</math> for <math>x_j</math>, one gets a matrix with two equal rows, which has thus a zero determinant. Thus, considering the determinant as univariate in <math>x_i,</math> the [[factor theorem]] implies that <math>x_j-x_i</math> is a divisor of <math>\det(V).</math> It thus follows that for all <math>i</math> and <math>j</math>, <math>x_j-x_i</math> is a divisor of <math>\det(V).</math> |
|||
This will now be strengthened to show that the product of all those divisors of <math>\det(V)</math> is a divisor of <math>\det(V).</math> Indeed, let <math>p</math> be a polynomial with <math>x_i-x_j</math> as a factor, then <math>p=(x_i-x_j)\,q,</math> for some polynomial <math>q.</math> If <math>x_k-x_l</math> is another factor of <math>p,</math> then <math>p</math> becomes zero after the substitution of <math>x_k</math> for <math>x_l.</math> If <math>\{x_i,x_j\}\neq \{x_k, x_l\}, </math> the factor <math>q</math> becomes zero after this substitution, since the factor <math>x_i-x_j</math> remains nonzero. So, by the factor theorem, <math>x_k-x_l</math> divides <math>q,</math> and <math>(x_i-x_j)\,(x_k-x_l)</math> divides <math>p.</math> |
|||
Iterating this process by starting from <math>\det(V),</math> one gets that <math>\det(V)</math> is divisible by the product of all <math>x_i-x_j</math> with <math>i<j;</math> that is |
|||
:<math>\det(V)=Q\prod_{0\le i<j\le n} (x_j-x_i),</math> |
|||
where <math>Q</math> is a polynomial. As the product of all <math>x_j-x_i</math> and <math>\det(V)</math> have the same degree <math>n(n + 1)/2</math>, the polynomial <math>Q</math> is, in fact, a constant. This constant is one, because the product of the diagonal entries of <math>V</math> is <math>x_1 x_2^2\cdots x_n^n</math>, which is also the [[monomial]] that is obtained by taking the first term of all factors in <math>\textstyle \prod_{0\le i<j\le n} (x_j-x_i).</math> This proves that <math>Q=1,</math> and finishes the proof. |
|||
:<math>\det(V)=\prod_{0\le i<j\le n} (x_j-x_i).</math> |
|||
===Second proof: linear maps=== |
|||
Let {{mvar|F}} be a [[field (mathematics)|field]] containing all <math>x_i,</math> and <math>P_n</math> the {{mvar|F}} [[vector space]] of the polynomials of degree less than or equal to {{mvar|n}} with coefficients in {{mvar|F}}. Let |
|||
:<math>\varphi:P_n\to F^{n+1}</math> |
|||
be the [[linear map]] defined by |
|||
:<math>p(x) \mapsto (p(x_0), p(x_1), \ldots, p(x_n))</math>. |
|||
The Vandermonde matrix is the matrix of <math>\varphi</math> with respect to the [[canonical basis|canonical bases]] of <math>P_n</math> and <math>F^{n+1}.</math> |
|||
[[Change of basis|Changing the basis]] of <math>P_n</math> amounts to multiplying the Vandermonde matrix by a change-of-basis matrix {{mvar|M}} (from the right). This does not change the determinant, if the determinant of {{mvar|M}} is {{val|1}}. |
|||
The polynomials <math>1</math>, <math>x-x_0</math>, <math>(x-x_0)(x-x_1)</math>, …, <math>(x-x_0) (x-x_1) \cdots (x-x_{n-1})</math> are [[monic polynomial|monic]] of respective degrees 0, 1, …, {{mvar|n}}. Their matrix on the [[monomial basis]] is an [[upper-triangular matrix]] {{mvar|U}} (if the monomials are ordered in increasing degrees), with all diagonal entries equal to one. This matrix is thus a change-of-basis matrix of determinant one. The matrix of <math>\varphi</math> on this new basis is |
|||
:<math>\begin{bmatrix} |
|||
1 & 0 & 0 & \ldots & 0 \\ |
|||
1 & x_1-x_0 & 0 & \ldots & 0 \\ |
|||
1 & x_2-x_0 & (x_2-x_0)(x_2-x_1) & \ldots & 0 \\ |
|||
\vdots & \vdots & \vdots & \ddots & \vdots \\ |
|||
1 & x_n-x_0 & (x_n-x_0)(x_n-x_1) & \ldots & |
|||
(x_n-x_0)(x_n-x_1)\cdots (x_n-x_{n-1}) |
|||
\end{bmatrix}</math>. |
|||
Thus Vandermonde determinant equals the determinant of this matrix, which is the product of its diagonal entries. |
|||
This proves the desired equality. Moreover, one gets the [[LU decomposition]] of {{mvar|V}} as <math>V=LU^{-1}</math>. |
|||
===Third proof: row and column operations=== |
|||
The third proof is based on the fact that if one adds to a column of a matrix the product by a scalar of another column then the determinant remains unchanged. |
|||
So, by subtracting to each column – except the first one – the preceding column multiplied by <math>x_0</math>, the determinant is not changed. (These subtractions must be done by starting from last columns, for subtracting a column that has not yet been changed). This gives the matrix |
|||
:<math>V = \begin{bmatrix} |
|||
1&0&0&0&\cdots&0\\ |
|||
1&x_1-x_0&x_1(x_1-x_0)&x_1^2(x_1-x_0)&\cdots&x_1^{n-1}(x_1-x_0)\\ |
|||
1&x_2-x_0&x_2(x_2-x_0)&x_2^2(x_2-x_0)&\cdots&x_2^{n-1}(x_2-x_0)\\ |
|||
\vdots&\vdots&\vdots&\vdots&\ddots&\vdots\\ |
|||
1&x_n-x_0&x_n(x_n-x_0)&x_n^2(x_n-x_0)&\cdots&x_n^{n-1}(x_n-x_0)\\ |
|||
\end{bmatrix}</math> |
\end{bmatrix}</math> |
||
Applying the [[Laplace_expansion|Laplace expansion formula]] along the first row, we obtain <math>\det(V)=\det(B)</math>, with |
|||
:<math>B = \begin{bmatrix} |
|||
x_1-x_0&x_1(x_1-x_0)&x_1^2(x_1-x_0)&\cdots&x_1^{n-1}(x_1-x_0)\\ |
|||
x_2-x_0&x_2(x_2-x_0)&x_2^2(x_2-x_0)&\cdots&x_2^{n-1}(x_2-x_0)\\ |
|||
\vdots&\vdots&\vdots&\ddots&\vdots\\ |
|||
x_n-x_0&x_n(x_n-x_0)&x_n^2(x_n-x_0)&\cdots&x_n^{n-1}(x_n-x_0)\\ |
|||
\end{bmatrix}</math> |
|||
As all the entries in the <math>i</math>-th row of <math>B</math> have a factor of <math>x_{i+1}-x_0</math>, one can take these factors out and obtain |
|||
:<math>\det(V)=(x_1-x_0)(x_2-x_0)\cdots(x_n-x_0)\begin{vmatrix} |
|||
1&x_1&x_1^2&\cdots&x_1^{n-1}\\ |
|||
1&x_2&x_2^2&\cdots&x_2^{n-1}\\ |
|||
\vdots&\vdots&\vdots&\ddots&\vdots\\ |
|||
1&x_n&x_n^2&\cdots&x_n^{n-1}\\ |
|||
\end{vmatrix}=\prod_{1<i\leq n}(x_i-x_0)\det(V')</math>, |
|||
where <math>V'</math> is a Vandermonde matrix in <math>x_1,\ldots, x_n</math>. Iterating this process on this smaller Vandermonde matrix, one eventually gets the desired expression of <math>\det(V)</math> as the product of all <math>x_j-x_i</math> such that <math>i<j</math>. |
|||
===Rank of the Vandermonde matrix=== |
|||
* An {{math|''m'' × ''n''}} rectangular Vandermonde matrix such that {{math|''m'' ≤ ''n''}} has [[rank of a matrix|rank]] {{math|''m''}} [[if and only if]] all {{math|''x''<sub>''i''</sub>}} are distinct. |
|||
* An {{math|''m'' × ''n''}} rectangular Vandermonde matrix such that {{math|''m'' ≥ ''n''}} has [[rank of a matrix|rank]] {{math|''n''}} if and only if there are {{mvar|n}} of the {{math|''x''<sub>''i''</sub>}} that are distinct. |
|||
* A square Vandermonde matrix is [[invertible matrix|invertible]] if and only if the {{math|''x''<sub>''i''</sub>}} are distinct. An explicit formula for the inverse is known (see below).<ref>{{Cite book |
|||
| title = Inverse of the Vandermonde matrix with applications |
|||
| last = Turner |
|||
| first = L. Richard |
|||
| date = August 1966 |
|||
| url =https://ntrs.nasa.gov/api/citations/19660023042/downloads/19660023042.pdf |
|||
}}</ref><ref name="MS">{{Cite journal |
|||
| volume = 65 |
|||
| issue = 2 |
|||
| pages = 95–100 |
|||
| last = Macon |
|||
| first = N. |
|||
|author2=A. Spitzbart |
|||
| title = Inverses of Vandermonde Matrices |
|||
| journal = The American Mathematical Monthly |
|||
| date = February 1958 |
|||
| jstor = 2308881 |
|||
| doi = 10.2307/2308881}}</ref> |
|||
== Inverse Vandermonde matrix == |
|||
As explained above in Applications, the [[polynomial interpolation]] problem for <math>p(x) = a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n</math>satisfying <math>p(x_0)=y_0, \ldots,p(x_n)=y_n</math> is equivalent to the matrix equation <math>Va = y</math>, which has the unique solution <math>a = V^{-1}y</math>. There are other known formulas which solve the interpolation problem, which must be equivalent to the unique <math>a = V^{-1}y</math>, so they must give explicit formulas for the inverse matrix <math>V^{-1}</math>. In particular, [[Lagrange polynomial|Lagrange interpolation]] shows that the columns of the inverse matrix |
|||
<math display="block">V^{-1}= |
|||
\begin{bmatrix} |
|||
1 & x_0 & \dots & x_0^n\\ |
|||
\vdots & \vdots & &\vdots \\[.5em] |
|||
1 & x_n & \dots & x_n^n |
|||
\end{bmatrix}^{-1} |
|||
= L = |
|||
\begin{bmatrix} |
|||
L_{00} & \!\!\!\!\cdots\!\!\!\! & L_{0n} |
|||
\\ |
|||
\vdots & & \vdots |
|||
\\ |
|||
L_{n0} & \!\!\!\!\cdots\!\!\!\! & L_{nn} |
|||
\end{bmatrix}</math> |
|||
are the coefficients of the Lagrange polynomials <blockquote><math>L_j(x)=L_{0j}+L_{1j}x+\cdots+L_{nj}x^{n} |
|||
= \prod_{0\leq i\leq n \atop i\neq j}\frac{x-x_i}{x_j-x_i} |
|||
= \frac{f(x)}{(x-x_j)\,f'(x_j)}\,, </math></blockquote>where <math>f(x)=(x-x_0)\cdots(x-x_n)</math>. This is easily demonstrated: the polynomials clearly satisfy <math>L_{j}(x_i)=0</math> for <math>i\neq j</math> while <math>L_{j}(x_j)=1</math>, so we may compute the product <math>VL = [L_j(x_i)]_{i,j=0}^n = I</math>, the identity matrix. |
|||
== Confluent Vandermonde matrices == |
|||
{{see also|Hermite interpolation}} |
|||
As described before, a Vandermonde matrix describes the linear algebra [[polynomial interpolation|interpolation problem]] of finding the coefficients of a polynomial <math>p(x)</math> of degree <math>n - 1</math> based on the values <math> p(x_1),\, ...,\, p(x_n)</math>, where <math>x_1,\, ...,\, x_n</math> are ''distinct'' points. If <math>x_i</math> are not distinct, then this problem does not have a unique solution (and the corresponding Vandermonde matrix is singular). However, if we specify the values of the derivatives at the repeated points, then the problem can have a unique solution. For example, the problem |
|||
:<math>\begin{cases} |
|||
p(0) = y_1 \\ |
|||
p'(0) = y_2 \\ |
|||
p(1) = y_3 |
|||
\end{cases}</math> |
|||
where <math>p(x) = ax^2+bx+c</math>, has a unique solution for all <math>y_1,y_2,y_3</math> with <math>y_1\neq y_3</math>. In general, suppose that <math>x_1, x_2, ..., x_n</math> are (not necessarily distinct) numbers, and suppose for simplicity that equal values are adjacent: |
|||
:<math> |
|||
x_1 = \cdots = x_{m_1},\ |
|||
x_{m_1+1} = \cdots = x_{m_2},\ |
|||
\ldots,\ |
|||
x_{m_{k-1}+1} = \cdots = x_{m_k} |
|||
</math> |
|||
where <math>m_1 < m_2 < \cdots < m_k=n,</math> and <math>x_{m_1}, \ldots ,x_{m_k}</math> are distinct. Then the corresponding interpolation problem is |
|||
:<math>\begin{cases} |
|||
p(x_{m_1}) = y_1, & p'(x_{m_1}) = y_2, & \ldots, & p^{(m_1-1)}(x_{m_1}) = y_{m_1}, \\ |
|||
p(x_{m_2}) = y_{m_1+1}, & p'(x_{m_2})=y_{m_1+2}, & \ldots, & p^{(m_2-m_1-1)}(x_{m_2}) = y_{m_2}, \\ |
|||
\qquad \vdots & & & \qquad\vdots \\ |
|||
p(x_{m_k}) = y_{m_{k-1}+1}, & p'(x_{m_k}) = y_{m_{k-1}+2}, & \ldots, & p^{(m_k-m_{k-1}-1)}(x_{m_k}) = y_{m_k}. |
|||
\end{cases}</math> |
|||
The corresponding matrix for this problem is called a '''confluent Vandermonde matrix''', given as follows |
|||
In mathematical terms: |
|||
<ref>{{Cite journal | volume = 57| issue = 1| pages = 15-21| last = Kalman| first = D.| title = The Generalized Vandermonde Matrix| journal = Mathematics Magazine| date = 1984| doi = 10.1080/0025570X.1984.11977069}}</ref>. If <math>1 \leq i,j \leq n</math>, then <math>m_\ell < i \leq m_{\ell + 1} </math> for a unique <math>0 \leq \ell \leq k-1</math> (denoting <math>m_0 = 0</math>). We let |
|||
:<math>V_{i,j} = \ |
:<math>V_{i,j} = \begin{cases} |
||
0 & \text{if } j < i - m_\ell, \\[6pt] |
|||
\dfrac{(j-1)!}{(j - (i - m_\ell))!} x_i^{j-(i-m_\ell)} & \text{if } j \geq i - m_\ell. |
|||
\end{cases}</math> |
|||
This generalization of the Vandermonde matrix makes it [[Invertible matrix|non-singular]], so that there exists a unique solution to the system of equations, and it possesses most of the other properties of the Vandermonde matrix. Its rows are derivatives (of some order) of the original Vandermonde rows. It exists an algorithm for the inverse of the confluent Vandermonde matrix that works in quadratic time for evey input allowed by the definition.<ref>{{Cite journal |last1=Shui-Hung |first1=H. |last2=Wan-Kai |first2=P. |date=2002 |title=Inversion of confluent Vandermonde matrices |journal=[[Computers & Mathematics with Applications]] |volume=43 |issue=12 |pages=1539-1547 |doi=10.1016/S0898-1221(02)00117-7|doi-access=free }}</ref><ref>{{Cite journal | volume = 218| issue = 5| pages = 2044-2054| last = [[Jerzy Respondek|Respondek, J.S.]]| title = Numerical recipes for the high efficient inverse of the confluent Vandermonde matrices| journal = Applied Mathematics and Computation| date = 2011| doi = 10.1016/j.amc.2011.07.017}}</ref> |
|||
These matrices are useful in [[polynomial interpolation]], since solving an equation <math>Va=b</math> for <math>a</math>, is equivalent to finding the [[coefficient]]s <math>a_i</math> of a polynomial that has values <math>b_i</math> at <math>\alpha_i</math>. |
|||
Another way to derive the above formula is by taking a limit of the Vandermonde matrix as the <math>x_i</math>'s approach each other. For example, to get the case of <math>x_1 = x_2</math>, take subtract the first row from second in the original Vandermonde matrix, and let <math>x_2\to x_1</math>: this yields the corresponding row in the confluent Vandermonde matrix. This derives the generalized interpolation problem with given values and derivatives as a limit of the original case with distinct points: giving <math>p(x_i), p'(x_i)</math> is similar to giving <math>p(x_i), p(x_i + \varepsilon)</math> for small <math>\varepsilon</math>. Geometers have studied the problem of tracking confluent points along their tangent lines, known as [[Configuration space (mathematics)|compacitification of configuration space]]. |
|||
The [[determinant]] of a square Vandermonde matrix of a dimension <math>n\times n</math>can be expressed as follows: |
|||
==See also== |
|||
:<math>\det(V) = \prod_{1\le i<j\le n} (\alpha_j-\alpha_i) </math> |
|||
* {{slink|Companion matrix#Diagonalizability}} |
|||
* [[Schur polynomial]] – a generalization |
|||
* [[Alternant matrix]] |
|||
* [[Lagrange polynomial]] |
|||
* [[Wronskian]] |
|||
* [[List of matrices]] |
|||
* [[Moore determinant over a finite field]] |
|||
* [[Vieta's formulas]] |
|||
==References== |
|||
If two or more exponents <math>\alpha_i</math> are equal, the rank of the matrix decreases (if all <math>\alpha_i</math> are distinct, then <math>V</math> is of full rank). This problem can alleviated by using a generalisation called confluent Vandermonde matrices, where the ''k''-multiple columns are replaced by: |
|||
{{reflist}} |
|||
==Further reading== |
|||
:<math>V_{i,j} = \frac{i!}{(i-j-1)!}\alpha_j^{i-j-1}</math> where <math>0\leq j<k</math> |
|||
*{{Citation |last=Ycart |first=Bernard |title=A case of mathematical eponymy: the Vandermonde determinant |journal=Revue d'Histoire des Mathématiques |volume=13 |year=2013 |arxiv=1204.4716|bibcode=2012arXiv1204.4716Y }}. |
|||
{{Matrix classes}} |
|||
[[Category:Matrices]] |
|||
Vandermonde matrices were named after [[Alexandre-Théophile Vandermonde]] ([[1735]]-[[1796]]), a French mathematician and musician. |
|||
[[Category:Determinants]] |
|||
[[Category:Numerical linear algebra]] |
Latest revision as of 03:02, 16 December 2024
In linear algebra, a Vandermonde matrix, named after Alexandre-Théophile Vandermonde, is a matrix with the terms of a geometric progression in each row: an matrix
with entries , the jth power of the number , for all zero-based indices and .[1] Some authors define the Vandermonde matrix as the transpose of the above matrix.[2][3]
The determinant of a square Vandermonde matrix (when ) is called a Vandermonde determinant or Vandermonde polynomial. Its value is:
This is non-zero if and only if all are distinct (no two are equal), making the Vandermonde matrix invertible.
Applications
[edit]The polynomial interpolation problem is to find a polynomial which satisfies for given data points . This problem can be reformulated in terms of linear algebra by means of the Vandermonde matrix, as follows. computes the values of at the points via a matrix multiplication , where is the vector of coefficients and is the vector of values (both written as column vectors):
If and are distinct, then V is a square matrix with non-zero determinant, i.e. an invertible matrix. Thus, given V and y, one can find the required by solving for its coefficients in the equation :[4]
.
That is, the map from coefficients to values of polynomials is a bijective linear mapping with matrix V, and the interpolation problem has a unique solution. This result is called the unisolvence theorem, and is a special case of the Chinese remainder theorem for polynomials.
In statistics, the equation means that the Vandermonde matrix is the design matrix of polynomial regression.
In numerical analysis, solving the equation naïvely by Gaussian elimination results in an algorithm with time complexity O(n3). Exploiting the structure of the Vandermonde matrix, one can use Newton's divided differences method[5] (or the Lagrange interpolation formula[6][7]) to solve the equation in O(n2) time, which also gives the UL factorization of . The resulting algorithm produces extremely accurate solutions, even if is ill-conditioned.[2] (See polynomial interpolation.)
The Vandermonde determinant is used in the representation theory of the symmetric group.[8]
When the values belong to a finite field, the Vandermonde determinant is also called the Moore determinant, and has properties which are important in the theory of BCH codes and Reed–Solomon error correction codes.
The discrete Fourier transform is defined by a specific Vandermonde matrix, the DFT matrix, where the are chosen to be nth roots of unity. The Fast Fourier transform computes the product of this matrix with a vector in time.[9]
In the physical theory of the quantum Hall effect, the Vandermonde determinant shows that the Laughlin wavefunction with filling factor 1 is equal to a Slater determinant. This is no longer true for filling factors different from 1 in the fractional quantum Hall effect.
In the geometry of polyhedra, the Vandermonde matrix gives the normalized volume of arbitrary -faces of cyclic polytopes. Specifically, if is a -face of the cyclic polytope corresponding to , then
Determinant
[edit]The determinant of a square Vandermonde matrix is called a Vandermonde polynomial or Vandermonde determinant. Its value is the polynomial
which is non-zero if and only if all are distinct.
The Vandermonde determinant was formerly sometimes called the discriminant, but in current terminology the discriminant of a polynomial is the square of the Vandermonde determinant of the roots . The Vandermonde determinant is an alternating form in the , meaning that exchanging two changes the sign, and thus depends on order for the . By contrast, the discriminant does not depend on any order, so that Galois theory implies that the discriminant is a polynomial function of the coefficients of .
The determinant formula is proved below in three ways. The first uses polynomial properties, especially the unique factorization property of multivariate polynomials. Although conceptually simple, it involves non-elementary concepts of abstract algebra. The second proof is based on the linear algebra concepts of change of basis in a vector space and the determinant of a linear map. In the process, it computes the LU decomposition of the Vandermonde matrix. The third proof is more elementary but more complicated, using only elementary row and column operations.
First proof: Polynomial properties
[edit]The first proof relies on properties of polynomials.
By the Leibniz formula, is a polynomial in the , with integer coefficients. All entries of the -th column have total degree . Thus, again by the Leibniz formula, all terms of the determinant have total degree
(that is, the determinant is a homogeneous polynomial of this degree).
If, for , one substitutes for , one gets a matrix with two equal rows, which has thus a zero determinant. Thus, considering the determinant as univariate in the factor theorem implies that is a divisor of It thus follows that for all and , is a divisor of
This will now be strengthened to show that the product of all those divisors of is a divisor of Indeed, let be a polynomial with as a factor, then for some polynomial If is another factor of then becomes zero after the substitution of for If the factor becomes zero after this substitution, since the factor remains nonzero. So, by the factor theorem, divides and divides
Iterating this process by starting from one gets that is divisible by the product of all with that is
where is a polynomial. As the product of all and have the same degree , the polynomial is, in fact, a constant. This constant is one, because the product of the diagonal entries of is , which is also the monomial that is obtained by taking the first term of all factors in This proves that and finishes the proof.
Second proof: linear maps
[edit]Let F be a field containing all and the F vector space of the polynomials of degree less than or equal to n with coefficients in F. Let
be the linear map defined by
- .
The Vandermonde matrix is the matrix of with respect to the canonical bases of and
Changing the basis of amounts to multiplying the Vandermonde matrix by a change-of-basis matrix M (from the right). This does not change the determinant, if the determinant of M is 1.
The polynomials , , , …, are monic of respective degrees 0, 1, …, n. Their matrix on the monomial basis is an upper-triangular matrix U (if the monomials are ordered in increasing degrees), with all diagonal entries equal to one. This matrix is thus a change-of-basis matrix of determinant one. The matrix of on this new basis is
- .
Thus Vandermonde determinant equals the determinant of this matrix, which is the product of its diagonal entries.
This proves the desired equality. Moreover, one gets the LU decomposition of V as .
Third proof: row and column operations
[edit]The third proof is based on the fact that if one adds to a column of a matrix the product by a scalar of another column then the determinant remains unchanged.
So, by subtracting to each column – except the first one – the preceding column multiplied by , the determinant is not changed. (These subtractions must be done by starting from last columns, for subtracting a column that has not yet been changed). This gives the matrix
Applying the Laplace expansion formula along the first row, we obtain , with
As all the entries in the -th row of have a factor of , one can take these factors out and obtain
- ,
where is a Vandermonde matrix in . Iterating this process on this smaller Vandermonde matrix, one eventually gets the desired expression of as the product of all such that .
Rank of the Vandermonde matrix
[edit]- An m × n rectangular Vandermonde matrix such that m ≤ n has rank m if and only if all xi are distinct.
- An m × n rectangular Vandermonde matrix such that m ≥ n has rank n if and only if there are n of the xi that are distinct.
- A square Vandermonde matrix is invertible if and only if the xi are distinct. An explicit formula for the inverse is known (see below).[10][3]
Inverse Vandermonde matrix
[edit]As explained above in Applications, the polynomial interpolation problem for satisfying is equivalent to the matrix equation , which has the unique solution . There are other known formulas which solve the interpolation problem, which must be equivalent to the unique , so they must give explicit formulas for the inverse matrix . In particular, Lagrange interpolation shows that the columns of the inverse matrix
are the coefficients of the Lagrange polynomials
where . This is easily demonstrated: the polynomials clearly satisfy for while , so we may compute the product , the identity matrix.
Confluent Vandermonde matrices
[edit]As described before, a Vandermonde matrix describes the linear algebra interpolation problem of finding the coefficients of a polynomial of degree based on the values , where are distinct points. If are not distinct, then this problem does not have a unique solution (and the corresponding Vandermonde matrix is singular). However, if we specify the values of the derivatives at the repeated points, then the problem can have a unique solution. For example, the problem
where , has a unique solution for all with . In general, suppose that are (not necessarily distinct) numbers, and suppose for simplicity that equal values are adjacent:
where and are distinct. Then the corresponding interpolation problem is
The corresponding matrix for this problem is called a confluent Vandermonde matrix, given as follows [11]. If , then for a unique (denoting ). We let
This generalization of the Vandermonde matrix makes it non-singular, so that there exists a unique solution to the system of equations, and it possesses most of the other properties of the Vandermonde matrix. Its rows are derivatives (of some order) of the original Vandermonde rows. It exists an algorithm for the inverse of the confluent Vandermonde matrix that works in quadratic time for evey input allowed by the definition.[12][13]
Another way to derive the above formula is by taking a limit of the Vandermonde matrix as the 's approach each other. For example, to get the case of , take subtract the first row from second in the original Vandermonde matrix, and let : this yields the corresponding row in the confluent Vandermonde matrix. This derives the generalized interpolation problem with given values and derivatives as a limit of the original case with distinct points: giving is similar to giving for small . Geometers have studied the problem of tracking confluent points along their tangent lines, known as compacitification of configuration space.
See also
[edit]- Companion matrix § Diagonalizability
- Schur polynomial – a generalization
- Alternant matrix
- Lagrange polynomial
- Wronskian
- List of matrices
- Moore determinant over a finite field
- Vieta's formulas
References
[edit]- ^ Roger A. Horn and Charles R. Johnson (1991), Topics in matrix analysis, Cambridge University Press. See Section 6.1.
- ^ a b Golub, Gene H.; Van Loan, Charles F. (2013). Matrix Computations (4th ed.). The Johns Hopkins University Press. pp. 203–207. ISBN 978-1-4214-0859-0.
- ^ a b Macon, N.; A. Spitzbart (February 1958). "Inverses of Vandermonde Matrices". The American Mathematical Monthly. 65 (2): 95–100. doi:10.2307/2308881. JSTOR 2308881.
- ^ François Viète (1540-1603), Vieta's formulas, https://en.wikipedia.org/wiki/Vieta%27s_formulas
- ^ Björck, Å.; Pereyra, V. (1970). "Solution of Vandermonde Systems of Equations". American Mathematical Society. 24 (112): 893–903. doi:10.1090/S0025-5718-1970-0290541-1. S2CID 122006253.
- ^ Press, WH; Teukolsky, SA; Vetterling, WT; Flannery, BP (2007). "Section 2.8.1. Vandermonde Matrices". Numerical Recipes: The Art of Scientific Computing (3rd ed.). New York: Cambridge University Press. ISBN 978-0-521-88068-8.
- ^ Inverse of Vandermonde Matrix (2018), https://proofwiki.org/wiki/Inverse_of_Vandermonde_Matrix
- ^ Fulton, William; Harris, Joe (1991). Representation theory. A first course. Graduate Texts in Mathematics, Readings in Mathematics. Vol. 129. New York: Springer-Verlag. doi:10.1007/978-1-4612-0979-9. ISBN 978-0-387-97495-8. MR 1153249. OCLC 246650103. Lecture 4 reviews the representation theory of symmetric groups, including the role of the Vandermonde determinant.
- ^ Gauthier, J. "Fast Multipoint Evaluation On n Arbitrary Points." Simon Fraser University, Tech. Rep (2017).
- ^ Turner, L. Richard (August 1966). Inverse of the Vandermonde matrix with applications (PDF).
- ^ Kalman, D. (1984). "The Generalized Vandermonde Matrix". Mathematics Magazine. 57 (1): 15–21. doi:10.1080/0025570X.1984.11977069.
- ^ Shui-Hung, H.; Wan-Kai, P. (2002). "Inversion of confluent Vandermonde matrices". Computers & Mathematics with Applications. 43 (12): 1539–1547. doi:10.1016/S0898-1221(02)00117-7.
- ^ Respondek, J.S. (2011). "Numerical recipes for the high efficient inverse of the confluent Vandermonde matrices". Applied Mathematics and Computation. 218 (5): 2044–2054. doi:10.1016/j.amc.2011.07.017.
Further reading
[edit]- Ycart, Bernard (2013), "A case of mathematical eponymy: the Vandermonde determinant", Revue d'Histoire des Mathématiques, 13, arXiv:1204.4716, Bibcode:2012arXiv1204.4716Y.