Tensor reshaping

In multilinear algebra, a reshaping of tensors is any bijection between the set of indices of an order-$M$ tensor and the set of indices of an order-$L$ tensor, where $L\leq M$. The use of indices presupposes tensors in coordinate representation with respect to a basis. The coordinate representation of a tensor can be regarded as a multi-dimensional array, and a bijection from one set of indices to another therefore amounts to a rearrangement of the array elements into an array of a different shape. Such a rearrangement constitutes a particular kind of linear map between the vector space of order-$M$ tensors and the vector space of order-$L$ tensors.

Definition

Given a positive integer $M$ , the notation $[M]$ refers to the set $\{1,\dots ,M\}$ of the first $M$ positive integers.

Let $M$ be a positive integer and, for each integer $m$ with $1\leq m\leq M$, let $V_{m}$ denote an $I_{m}$-dimensional vector space over a field $F$. Then there are vector space isomorphisms (invertible linear maps)

${\begin{aligned}V_{1}\otimes \cdots \otimes V_{M}&\simeq F^{I_{1}}\otimes \cdots \otimes F^{I_{M}}\\&\simeq F^{I_{\pi _{1}}}\otimes \cdots \otimes F^{I_{\pi _{M}}}\\&\simeq F^{I_{\pi _{1}}I_{\pi _{2}}}\otimes F^{I_{\pi _{3}}}\otimes \cdots \otimes F^{I_{\pi _{M}}}\\&\simeq F^{I_{\pi _{1}}I_{\pi _{3}}}\otimes F^{I_{\pi _{2}}}\otimes F^{I_{\pi _{4}}}\otimes \cdots \otimes F^{I_{\pi _{M}}}\\&\,\,\,\vdots \\&\simeq F^{I_{1}I_{2}\ldots I_{M}},\end{aligned}}$

where $\pi \in {\mathfrak {S}}_{M}$ is any permutation and ${\mathfrak {S}}_{M}$ is the symmetric group on $M$ elements. Via these (and other) vector space isomorphisms, a tensor can be interpreted in several ways as an order- $L$ tensor where $L\leq M$ .
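As a small worked illustration (the dimensions are chosen here for concreteness and do not come from the source), take $M=3$, $(I_{1},I_{2},I_{3})=(2,3,4)$ and the identity permutation. The chain of isomorphisms then reads

$F^{2}\otimes F^{3}\otimes F^{4}\simeq F^{6}\otimes F^{4}\simeq F^{24},$

so the same 24 coefficients can be arranged as a $2\times 3\times 4$ array (order 3), a $6\times 4$ matrix (order 2), or a vector with 24 entries (order 1).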

Coordinate representation

The first vector space isomorphism on the list above, $V_{1}\otimes \cdots \otimes V_{M}\simeq F^{I_{1}}\otimes \cdots \otimes F^{I_{M}}$, gives the coordinate representation of an abstract tensor. Assume that each of the $M$ vector spaces $V_{m}$ has a basis $\{v_{1}^{m},v_{2}^{m},\ldots ,v_{I_{m}}^{m}\}$. The expression of a tensor with respect to this basis has the form ${\mathcal {A}}=\sum _{i_{1}=1}^{I_{1}}\ldots \sum _{i_{M}=1}^{I_{M}}a_{i_{1},i_{2},\ldots ,i_{M}}v_{i_{1}}^{1}\otimes v_{i_{2}}^{2}\otimes \cdots \otimes v_{i_{M}}^{M},$ where the coefficients $a_{i_{1},i_{2},\ldots ,i_{M}}$ are elements of $F$. The coordinate representation of ${\mathcal {A}}$ is $\sum _{i_{1}=1}^{I_{1}}\ldots \sum _{i_{M}=1}^{I_{M}}a_{i_{1},i_{2},\ldots ,i_{M}}\mathbf {e} _{i_{1}}^{1}\otimes \mathbf {e} _{i_{2}}^{2}\otimes \cdots \otimes \mathbf {e} _{i_{M}}^{M},$ where $\mathbf {e} _{i}^{m}$ is the $i^{\text{th}}$ standard basis vector of $F^{I_{m}}$. This can be regarded as an $M$-way array whose elements are the coefficients $a_{i_{1},i_{2},\ldots ,i_{M}}$.

General flattenings

For any permutation $\pi \in {\mathfrak {S}}_{M}$ there is a canonical isomorphism between the two tensor products of vector spaces $V_{1}\otimes V_{2}\otimes \cdots \otimes V_{M}$ and $V_{\pi (1)}\otimes V_{\pi (2)}\otimes \cdots \otimes V_{\pi (M)}$ . Parentheses are usually omitted from such products due to the natural isomorphism between $V_{i}\otimes (V_{j}\otimes V_{k})$ and $(V_{i}\otimes V_{j})\otimes V_{k}$ , but may, of course, be reintroduced to emphasize a particular grouping of factors. In the grouping, $(V_{\pi (1)}\otimes \cdots \otimes V_{\pi (r_{1})})\otimes (V_{\pi (r_{1}+1)}\otimes \cdots \otimes V_{\pi (r_{2})})\otimes \cdots \otimes (V_{\pi (r_{L-1}+1)}\otimes \cdots \otimes V_{\pi (r_{L})}),$ there are $L$ groups with $r_{l}-r_{l-1}$ factors in the $l^{\text{th}}$ group (where $r_{0}=0$ and $r_{L}=M$ ).

Letting $S_{l}=(\pi (r_{l-1}+1),\pi (r_{l-1}+2),\ldots ,\pi (r_{l}))$ for each $l$ satisfying $1\leq l\leq L$, an $(S_{1},S_{2},\ldots ,S_{L})$-flattening of a tensor ${\mathcal {A}}$, denoted ${\mathcal {A}}_{(S_{1},S_{2},\ldots ,S_{L})}$, is obtained by applying two processes, coordinate representation and vectorization, within each of the $L$ groups of factors. That is, the coordinate representation of the $l^{\text{th}}$ group of factors is obtained using the isomorphism $(V_{\pi (r_{l-1}+1)}\otimes V_{\pi (r_{l-1}+2)}\otimes \cdots \otimes V_{\pi (r_{l})})\simeq (F^{I_{\pi (r_{l-1}+1)}}\otimes F^{I_{\pi (r_{l-1}+2)}}\otimes \cdots \otimes F^{I_{\pi (r_{l})}})$, which requires specifying bases for all of the vector spaces $V_{k}$. The result is then vectorized using a bijection $\mu _{l}:[I_{\pi (r_{l-1}+1)}]\times [I_{\pi (r_{l-1}+2)}]\times \cdots \times [I_{\pi (r_{l})}]\to [I_{S_{l}}]$ to obtain an element of $F^{I_{S_{l}}}$, where ${\textstyle I_{S_{l}}:=\prod _{i=r_{l-1}+1}^{r_{l}}I_{\pi (i)}}$ is the product of the dimensions of the vector spaces in the $l^{\text{th}}$ group of factors. The result of applying these isomorphisms within each group of factors is an element of $F^{I_{S_{1}}}\otimes \cdots \otimes F^{I_{S_{L}}}$, which is a tensor of order $L$.
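The construction of an $(S_{1},\ldots ,S_{L})$-flattening can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the tensor is assumed to be already in coordinate representation and stored as a dictionary from 1-based multi-indices to coefficients, each bijection $\mu _{l}$ is assumed to be the column-major one (first index in the group varying fastest), and the names `linear_index` and `flatten` are hypothetical helpers introduced here.

```python
from itertools import product
from math import prod


def linear_index(index, dims):
    """Column-major bijection from a 1-based multi-index to a 1-based integer."""
    k, stride = 1, 1
    for i, size in zip(index, dims):
        k += (i - 1) * stride
        stride *= size
    return k


def flatten(a, dims, groups):
    """(S_1, ..., S_L)-flattening of a tensor stored as {multi-index: coefficient}.

    groups is a tuple of tuples of 1-based mode numbers; taken together they
    must list every mode 1..M exactly once (assumed column-major mu_l per group).
    """
    modes = [m for group in groups for m in group]
    if sorted(modes) != list(range(1, len(dims) + 1)):
        raise ValueError("groups must list each mode 1..M exactly once")
    # Dimension of the l-th new mode: product of the dimensions in group S_l.
    new_dims = tuple(prod(dims[m - 1] for m in group) for group in groups)
    flattened = {}
    for index, value in a.items():
        new_index = tuple(
            linear_index([index[m - 1] for m in group],
                         [dims[m - 1] for m in group])
            for group in groups
        )
        flattened[new_index] = value
    return new_dims, flattened


# Example data (made up for illustration): a 2 x 3 x 4 tensor whose
# coefficient a_{i1,i2,i3} encodes its own index as 100*i1 + 10*i2 + i3.
dims = (2, 3, 4)
a = {idx: 100 * idx[0] + 10 * idx[1] + idx[2]
     for idx in product(*(range(1, d + 1) for d in dims))}

new_dims, b = flatten(a, dims, groups=((2,), (1, 3)))
print(new_dims)   # (3, 8)
print(b[(3, 8)])  # 234
```

With `groups=((1, 2, 3),)` the same function returns an order-1 tensor with 24 entries, which corresponds to grouping all factors together.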

Vectorization

By means of a bijective map $\mu :[I_{1}]\times \cdots \times [I_{M}]\to [I_{1}\cdots I_{M}]$, a vector space isomorphism between $F^{I_{1}}\otimes \cdots \otimes F^{I_{M}}$ and $F^{I_{1}\cdots I_{M}}$ is constructed via the mapping $\mathbf {e} _{i_{1}}^{1}\otimes \cdots \otimes \mathbf {e} _{i_{m}}^{m}\otimes \cdots \otimes \mathbf {e} _{i_{M}}^{M}\mapsto \mathbf {e} _{\mu (i_{1},i_{2},\ldots ,i_{M})},$ where for every natural number $i$ such that $1\leq i\leq I_{1}\cdots I_{M}$, the vector $\mathbf {e} _{i}$ denotes the $i^{\text{th}}$ standard basis vector of $F^{I_{1}\cdots I_{M}}$. In such a reshaping, the tensor is simply interpreted as a vector in $F^{I_{1}\cdots I_{M}}$. This is known as vectorization, and is analogous to vectorization of matrices. A standard choice of bijection $\mu$ is such that

$\operatorname {vec} ({\mathcal {A}})={\begin{bmatrix}a_{1,1,\ldots ,1}&a_{2,1,\ldots ,1}&\cdots &a_{I_{1},1,\ldots ,1}&a_{1,2,1,\ldots ,1}&\cdots &a_{I_{1},I_{2},\ldots ,I_{M}}\end{bmatrix}}^{T},$

which is consistent with the way in which the colon operator in Matlab and GNU Octave reshapes a higher-order tensor into a vector. In general, the vectorization of ${\mathcal {A}}$ is the vector $[a_{\mu ^{-1}(i)}]_{i=1}^{I_{1}\cdots I_{M}}$ .

The vectorization of ${\mathcal {A}}$, denoted by $\operatorname {vec} ({\mathcal {A}})$ or ${\mathcal {A}}_{[:]}$, is an $[S_{1},S_{2}]$-reshaping where $S_{1}=(1,2,\ldots ,M)$ and $S_{2}=\emptyset$.
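The standard column-major choice of $\mu$ can be sketched in Python as follows. This is only an illustration: the tensor is assumed to be stored as a dictionary from 1-based multi-indices to coefficients, and the function names `mu` and `vec` and the example data are assumptions made here, not part of the source.

```python
from itertools import product


def mu(index, dims):
    """Column-major bijection [I_1] x ... x [I_M] -> [I_1 * ... * I_M] (1-based)."""
    k, stride = 1, 1
    for i, size in zip(index, dims):
        k += (i - 1) * stride   # the first index varies fastest
        stride *= size
    return k


def vec(a, dims):
    """Vectorize a tensor stored as {multi-index: coefficient} into a list."""
    out = [None] * len(a)
    for index, value in a.items():
        out[mu(index, dims) - 1] = value
    return out


# Example data (made up for illustration): a 2 x 3 array with a_{i1,i2} = 10*i1 + i2.
dims = (2, 3)
a = {idx: 10 * idx[0] + idx[1]
     for idx in product(*(range(1, d + 1) for d in dims))}

v = vec(a, dims)
print(v)  # [11, 21, 12, 22, 13, 23]
```

The output lists the first column of the array, then the second, then the third, matching the ordering $a_{1,1},a_{2,1},a_{1,2},\ldots$ of the standard choice.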

Mode-m Flattening / Mode-m Matrixization

Let ${\mathcal {A}}\in F^{I_{1}}\otimes F^{I_{2}}\otimes \cdots \otimes F^{I_{M}}$ be the coordinate representation of an abstract tensor with respect to a basis. Mode-m matrixizing (a.k.a. flattening) of ${\mathcal {A}}$ is an $[S_{1},S_{2}]$ -reshaping in which $S_{1}=(m)$ and $S_{2}=(1,2,\ldots ,m-1,m+1,\ldots ,M)$ . Usually, a standard matrixizing is denoted by

${\mathbf {A} }_{[m]}={\mathcal {A}}_{[S_{1},S_{2}]}$

This reshaping is sometimes called matrixizing, matricizing, flattening or unfolding in the literature. A standard choice for the bijections $\mu _{1},\ \mu _{2}$ is the one that is consistent with the reshape function in Matlab and GNU Octave, namely

${\mathbf {A} }_{[m]}:={\begin{bmatrix}a_{1,1,\ldots ,1,1,1,\ldots ,1}&a_{2,1,\ldots ,1,1,1,\ldots ,1}&\cdots &a_{I_{1},I_{2},\ldots ,I_{m-1},1,I_{m+1},\ldots ,I_{M}}\\a_{1,1,\ldots ,1,2,1,\ldots ,1}&a_{2,1,\ldots ,1,2,1,\ldots ,1}&\cdots &a_{I_{1},I_{2},\ldots ,I_{m-1},2,I_{m+1},\ldots ,I_{M}}\\\vdots &\vdots &&\vdots \\a_{1,1,\ldots ,1,I_{m},1,\ldots ,1}&a_{2,1,\ldots ,1,I_{m},1,\ldots ,1}&\cdots &a_{I_{1},I_{2},\ldots ,I_{m-1},I_{m},I_{m+1},\ldots ,I_{M}}\end{bmatrix}}$

Definition (mode-m matrixizing):^[1] The mode-m matrixizing of a tensor ${\mathcal {A}}\in F^{I_{1}\times \cdots \times I_{M}}$ is defined as the matrix ${\mathbf {A} }_{[m]}\in F^{I_{m}\times (I_{1}\dots I_{m-1}I_{m+1}\dots I_{M})}$. As the parenthetical ordering indicates, the mode-m column vectors are arranged by sweeping all the other mode indices through their ranges, with smaller mode indices varying more rapidly than larger ones; thus

$[{\mathbf {A} }_{[m]}]_{jk}=a_{i_{1}\dots i_{m}\dots i_{M}},\;\;{\text{ where }}j=i_{m}{\text{ and }}k=1+\sum _{n=1 \atop n\neq m}^{M}(i_{n}-1)\prod _{l=1 \atop l\neq m}^{n-1}I_{l}.$
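The index rule for mode-m matrixizing, with $j=i_{m}$ and the remaining indices swept so that smaller mode indices vary fastest, can be sketched in Python as follows. This is a minimal illustration under assumptions: modes are numbered from 1, the tensor is stored as a dictionary from 1-based multi-indices to coefficients, and the function name `unfold` and the example data are hypothetical.

```python
from itertools import product


def unfold(a, dims, m):
    """Mode-m matrixizing of a tensor stored as {multi-index: coefficient}.

    Returns a list of I_m rows; each row has prod(I_n for n != m) entries.
    """
    M = len(dims)
    n_cols = 1
    for n in range(1, M + 1):
        if n != m:
            n_cols *= dims[n - 1]
    rows = [[None] * n_cols for _ in range(dims[m - 1])]
    for index, value in a.items():
        j = index[m - 1]                 # row index: j = i_m
        k, stride = 1, 1                 # column index, built mode by mode
        for n in range(1, M + 1):
            if n == m:
                continue                 # mode m is skipped in sum and product
            k += (index[n - 1] - 1) * stride
            stride *= dims[n - 1]
        rows[j - 1][k - 1] = value
    return rows


# Example data (made up for illustration): a 2 x 2 x 3 tensor whose
# coefficient a_{i1,i2,i3} encodes its own index as 100*i1 + 10*i2 + i3.
dims = (2, 2, 3)
a = {idx: 100 * idx[0] + 10 * idx[1] + idx[2]
     for idx in product(*(range(1, d + 1) for d in dims))}

A2 = unfold(a, dims, 2)
for row in A2:
    print(row)
# [111, 211, 112, 212, 113, 213]
# [121, 221, 122, 222, 123, 223]
```

In each row the second index is fixed, while the first index cycles fastest and the third index slowest across the columns.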

References

[1] Vasilescu, M. Alex O. (2009), "Multilinear (Tensor) Algebraic Framework for Computer Graphics, Computer Vision and Machine Learning" (PDF), University of Toronto, p. 21
