
Kronecker Products and Matrix Calculus with Applications

Alexander Graham
Dover Publications
Reviewed by Michael Berg

I remember my first true experience working with tensor products, many years ago. In some number theory work I was doing, I needed to go beyond their characterization as universal objects modulo bilinearity, i.e. their beautiful but at the time not very revealing property that they should uniquely linearize bilinear maps courtesy of a commutative diagram. Fortuitously, I found out about Kronecker products: using these I could grapple pretty effectively with what I was studying in explicit, computational terms, and that shone a lot of light into the darkness for me. Taking a bird’s-eye view, this is not all that unnatural, since (working for convenience in the category of vector spaces) the dimension of \(V\otimes W\) is the product \(\dim(V)\dim(W)\), and if one takes the Kronecker product of an \(n\times n\) matrix and an \(m\times m\) matrix, one ends up with an \(nm\times nm\) matrix. So morally it all comes down to thinking about linearity (i.e. linear transformations) in the right way.

Well, let’s get down to brass tacks. If, more generally than in the preceding comments, we have matrices \(A=(a_{ij})\) and \(B\) of respective dimensions \(m\times n\) and \(r\times s\), then their Kronecker product, denoted (suggestively) \(A\otimes B\), is defined as the block matrix \((a_{ij}B)\) of dimension \(mr\times ns\). The author introduces this notion on p. 21 of the book under review, and notes that it’s also referred to as the direct product or even the tensor product of \(A\) and \(B\). He also observes that one of the raisons d’être for the Kronecker product is its “important application in particle physics.” He then goes on to establish a host of very nice properties of the Kronecker product, and here is a pair of perhaps somewhat unexpected examples (cf. p. 122: Graham includes tables of formulae in his book): \((A\otimes B)(C\otimes D) = AC\otimes BD\) whenever the products \(AC\) and \(BD\) are defined (caveat: the correct statement is on p. 24; it’s misstated on p. 122); and \(A\otimes B=U_1(B\otimes A)U_2\) for certain permutation matrices \(U_1\) and \(U_2\).
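For readers who like to see such identities checked on a small case, here is a minimal sketch in plain Python (no libraries). The helper names `kron`, `matmul` and `perm` are illustrative choices, not anything taken from Graham’s book, and the particular indexing of the permutation matrices is one convenient convention assumed for this example; the matrices are arbitrary small integer examples.

```python
def kron(A, B):
    """Kronecker product of list-of-lists matrices: the block matrix (a_ij * B)."""
    return [
        [A[i][j] * B[k][l] for j in range(len(A[0])) for l in range(len(B[0]))]
        for i in range(len(A))
        for k in range(len(B))
    ]


def matmul(A, B):
    """Ordinary matrix product of list-of-lists matrices."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


def perm(p, q):
    """pq x pq permutation matrix with a 1 at position (i*q + k, k*p + i).

    This indexing is an assumption made for the sketch; other texts use
    other (equivalent) conventions.
    """
    P = [[0] * (p * q) for _ in range(p * q)]
    for i in range(p):
        for k in range(q):
            P[i * q + k][k * p + i] = 1
    return P


# Arbitrary small examples: A is 2x3, B is 2x2, C is 3x2, D is 2x1.
A = [[1, 2, 3], [4, 5, 6]]
B = [[0, 1], [2, 3]]
C = [[1, 0], [2, 1], [0, 3]]
D = [[1], [2]]

AB = kron(A, B)  # (2*2) x (3*2) = 4 x 6

# Mixed-product property: (A (x) B)(C (x) D) = AC (x) BD.
mixed_lhs = matmul(AB, kron(C, D))
mixed_rhs = kron(matmul(A, C), matmul(B, D))

# A (x) B = U1 (B (x) A) U2 with U1 = perm(m, r), U2 = perm(s, n),
# where A is m x n and B is r x s.
U1 = perm(2, 2)
U2 = perm(2, 3)
swapped = matmul(matmul(U1, kron(B, A)), U2)

print(mixed_lhs == mixed_rhs, swapped == AB)
```

Integer entries keep every comparison exact, so the equality tests are not affected by rounding.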

The other notion the book under review is concerned with is that of matrix derivatives. The first definition is this (p. 37): given the matrix (valued function) \(A(t)=(a_{ij}(t))\), the derivative is just \(\frac{d}{dt}A(t)=\left(\frac{da_{ij}}{dt}(t)\right)\); Graham defines the integral of a matrix in the obvious complementary way. Well, what’s so special about that? Fair enough. But what about this (cf. p. 56)? Let \(X=(x_{ij})\) be of size \(m\times n\), and let \(y=f(X)\) be a scalar function of the matrix \(X\). Then, by definition, \(\frac{\partial y}{\partial X} =\frac{\partial f(X)}{\partial X} =\left(\frac{\partial y}{\partial x_{ij}}\right)\). And then there’s this (p. 81): if \(Y=(y_{rs})\) is of size \(p\times q\) and \(X=(x_{ij})\) is of size \(m\times n\), then we get nothing less than the derivative of the matrix \(Y\) with respect to the matrix \(X\), defined as the block matrix \(\frac{\partial Y}{\partial X} = \left(\frac{\partial Y}{\partial x_{ij}}\right)\). This is evidently of size \(pm\times qn\). Indeed, in terms of elementary matrices, if we let, as always, \(E_{rs}\) be the \(m\times n\) matrix with a 1 in position \((r,s)\) and zeros elsewhere, we get \(\frac{\partial Y}{\partial X}=\sum_{r,s} E_{rs}\otimes \frac{\partial Y}{\partial x_{rs}}\). So the plot is thickening in the obvious manner: one simply systematically employs the yoga of the Kronecker product. For this latter beast, we get, for example (cf. p. 125), \(\frac{\partial XY}{\partial Z} = \frac{\partial X}{\partial Z}(I\otimes Y) + (I\otimes X)\frac{\partial Y}{\partial Z}\), where the two identity matrices are sized so that the products make sense, and, perhaps somewhat perversely, \(\frac{\partial X}{\partial X}\neq I\) in general; see p. 125 for what it actually is equal to. Very cool stuff, really.
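To make the last two claims concrete, here is a small sketch in plain Python that assembles \(\sum_{r,s} E_{rs}\otimes \frac{\partial Y}{\partial x_{rs}}\) directly from the definition quoted above. The helper names (`kron`, `unit`, `matrix_derivative`, and so on) are hypothetical, and the choice \(X=Y=Z\) with a \(2\times 2\) integer matrix \(Z\) is an assumption made purely so that the partial derivatives can be written down exactly: \(\frac{\partial Z}{\partial z_{rs}}=E_{rs}\) and \(\frac{\partial Z^2}{\partial z_{rs}}=E_{rs}Z+ZE_{rs}\).

```python
def kron(A, B):
    """Kronecker product: the block matrix (a_ij * B)."""
    return [
        [A[i][j] * B[k][l] for j in range(len(A[0])) for l in range(len(B[0]))]
        for i in range(len(A))
        for k in range(len(B))
    ]


def matmul(A, B):
    """Ordinary matrix product."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


def madd(A, B):
    """Entrywise sum of two matrices of the same size."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def unit(r, s, m, n):
    """Elementary m x n matrix E_rs: 1 at (r, s), 0 elsewhere (0-based indices)."""
    E = [[0] * n for _ in range(m)]
    E[r][s] = 1
    return E


def eye(n):
    """n x n identity matrix."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def matrix_derivative(partial, m, n):
    """Sum over (r, s) of E_rs (x) partial(r, s), where partial(r, s) = dY/dx_rs."""
    total = None
    for r in range(m):
        for s in range(n):
            term = kron(unit(r, s, m, n), partial(r, s))
            total = term if total is None else madd(total, term)
    return total


# dX/dX for a 2 x 2 matrix X: each partial dX/dx_rs is just E_rs.
dX_dX = matrix_derivative(lambda r, s: unit(r, s, 2, 2), 2, 2)
# dX_dX is [[1, 0, 0, 1], [0, 0, 0, 0], [0, 0, 0, 0], [1, 0, 0, 1]], not the identity.

# Product rule with X = Y = Z: d(Z^2)/dz_rs = E_rs Z + Z E_rs.
Z = [[1, 2], [3, 4]]
lhs = matrix_derivative(
    lambda r, s: madd(matmul(unit(r, s, 2, 2), Z), matmul(Z, unit(r, s, 2, 2))),
    2,
    2,
)
I_kron_Z = kron(eye(2), Z)
rhs = madd(matmul(dX_dX, I_kron_Z), matmul(I_kron_Z, dX_dX))

print(dX_dX != eye(4), lhs == rhs)
```

The check of the product rule here is really the mixed-product property in disguise, which is perhaps the point: the calculus rides entirely on the Kronecker algebra.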

The book under review is therefore a cornucopia of very unusual results, at least to the uninitiated, and these days this is a set with a very small complement. It is therefore fun, in and of itself, to learn this material; it is eminently accessible and would even provide a lot of material for a clever undergraduate to cut his teeth on. But this is not to say that there are no applications for this business: see Graham’s last chapter, where we encounter such themes as least squares and constrained optimization, the general least squares problem, and the evaluation of certain Jacobians.

I do like this perhaps somewhat anachronistic (and certainly recherché) material a lot, and I think Graham’s book is a very nice source from which to be introduced to it.

Michael Berg is Professor of Mathematics at Loyola Marymount University in Los Angeles, CA.

The table of contents is not available.