Recently there has been great interest in sparse representations of signals under the assumption that signals (data sets) can be well approximated by a linear combination of few elements of a known basis (dictionary). Many algorithms have been developed to find such representations for one-dimensional signals (vectors), which requires finding the sparsest solution of an underdetermined linear system of algebraic equations. In this letter, we generalize the theory of sparse representations of vectors to multiway arrays (tensors)—signals with a multidimensional structure—by using the Tucker model. Thus, the problem is reduced to solving a large-scale underdetermined linear system of equations possessing a Kronecker structure, for which we have developed a greedy algorithm, Kronecker-OMP, as a generalization of the classical orthogonal matching pursuit (OMP) algorithm for vectors. We also introduce the concept of multiway block-sparse representation of N-way arrays and develop a new greedy algorithm that exploits not only the Kronecker structure but also block sparsity. This allows us to derive a very fast and memory-efficient algorithm called N-BOMP (N-way block OMP). We theoretically demonstrate that under the block-sparsity assumption, our N-BOMP algorithm not only has a considerably lower complexity but is also more precise than the classic OMP algorithm. Moreover, our algorithms can be used for very large-scale problems, which are intractable using standard approaches. We provide several simulations illustrating our results and comparing our algorithms to classical algorithms such as OMP and BP (basis pursuit) algorithms. We also apply the N-BOMP algorithm as a fast solution for the compressed sensing (CS) problem with large-scale data sets, in particular, for 2D compressive imaging (CI) and 3D hyperspectral CI, and we show examples with real-world multidimensional signals.
A concept that underlies some recent developments in many fields such as image and signal processing is sparsity. Usually signals living in a vector space do not cover the entire space uniformly (Lu & Do, 2008). In particular, it was discovered that most signals of interest can be well approximated by a sparse representation over a known dictionary, that is, by a linear combination of a few dictionary elements (atoms) (Donoho, 2006; Candès, Romberg, & Tao, 2006). Moreover, it is sometimes useful to consider overcomplete dictionaries, where the number of atoms is larger than the signal size (Elad, Figueiredo, & Ma, 2010). In fact, sparse representations are found in the way the visual cortex codes natural images in the brain (Olshausen & Field, 1996). The implications of the sparsity assumption have recently driven the development of many exciting applications in signal processing. For example, it was proved that signals with a sparse representation can be reconstructed from a reduced number of measurements, which is the main objective of compressed sensing (CS) (Donoho, 2006; Candès et al., 2006); some algorithms for blind source separation are based on the sparsity assumption (Bobin, Starck, Fadili, & Moudden, 2007; Li et al., 2004; Caiafa & Cichocki, 2009); and techniques for denoising and inpainting of images have been developed using similar ideas (Elad, Starck, Querre, & Donoho, 2005; Elad & Aharon, 2006). In this framework, the main problem is to solve an underdetermined system of linear algebraic equations under a sparsity constraint: among the infinitely many solutions, the sparsest one may be unique.
To this end, many algorithms were proposed, including greedy methods such as matching pursuit (MP) (Davis, Mallat, & Zhang, 1994), orthogonal matching pursuit (OMP) (Tropp, 2004), compressive sampling matching pursuit (CoSaMP) (Needell & Tropp, 2009), norm minimization methods such as basis pursuit (Chen & Donoho, 2001), gradient projection sparse reconstruction (GPSR) (Figueiredo, Nowak, & Wright, 2007), and many others (see Tropp & Wright, 2010, for an up-to-date review of algorithms).
Another characteristic of signals found in modern applications is that they often have a multidimensional structure, where each dimension (mode) has a particular physical meaning (e.g., space, time, frequency, trials). For example, a 3D image produced by a computed tomography (CT) system or a magnetic resonance imaging (MRI) system corresponds to a sampled version of a 3D function f(x1, x2, x3). In this case, the multidimensional image is stored in memory as a multiway array (tensor) whose elements are samples taken on a grid, that is, Y(i1, i2, i3) = f(i1h, i2h, i3h) (in = 1, 2, ...), with h being the discretization step in each dimension. Multiway arrays are the generalization of vectors and matrices to a higher number of dimensions and are usually referred to in the literature as N-way arrays or tensors. They are very attractive mathematical tools possessing their own properties (Kolda & Bader, 2009; Cichocki, Zdunek, Phan, & Amari, 2009).
Due to the availability of massive data sets occurring in new applications and requirements in scientific computation, a great deal of effort has been devoted to the development of new algorithms for multiway structured data sets, sometimes with a high number of dimensions (Oseledets, Savostianov, & Tyrtyshnikov, 2008; Caiafa & Cichocki, 2010; Oseledets, 2011). The curse of dimensionality makes the task of finding sparse representations of N-way arrays very expensive in terms of memory storage and computation load, since the number of entries grows exponentially with the number of dimensions. To solve this problem, one may try to find some structure in data sets and build approximate models using fewer parameters than the number of entries, as is the case with the Tucker model (Lathauwer, Moor, & Vandewalle, 2000b; Kolda & Bader, 2009). In fact, multidimensional signals often reveal a structure in each mode that allows one to adopt good approximate representations based on the Kronecker product of dictionaries associated with each of the modes (Rivenson & Stern, 2009a). For example, the most popular transforms applied to two-dimensional signals (2D images) are based on the application of a transformation of rows followed by a transformation of columns; these are known as separable transforms. In this case, the dictionary associated with the vectorized image can be written as the Kronecker product of the dictionaries associated with rows and columns. The Kronecker structure has been extensively used in the image processing community (Nagy & Kilmer, 2006; Rivenson & Stern, 2009a). The Kronecker structure has also been proposed in the matrix analysis community as a good preconditioner to solve linear systems (Loan & Pitsianis, 1992) and to approximate function-related linear systems (Ford & Tyrtyshnikov, 2003; Tyrtyshnikov, 2004).
This letter is organized as follows. In section 2, the basic notation is introduced, and some important previous results are presented; in section 3, we generalize the theory of sparse representations of vectors to multiway arrays and introduce the concept of block sparsity; in section 4, we introduce the multidimensional compressed sensing problem; in section 5, our new greedy algorithms are developed for the case of multidimensional sparsity (Kronecker-OMP algorithm) and multiway block sparsity (N-BOMP), including a detailed analysis of their computational complexities; in section 6, we present a new theoretical result about the performance guarantee of the N-BOMP algorithm, which we show to be much less restrictive than that of the classical OMP algorithm; in section 7, extensive simulations are presented using synthetically generated N-way arrays, as well as the application to CS of real-world multidimensional signals. In section 8, we outline the main conclusions of this work.
2. Notation and Preliminaries
In this letter, N-way arrays (multidimensional signals) are denoted by underlined boldface capital letters, matrices (two-way arrays) by boldface uppercase letters, and vectors by boldface lowercase letters; for example, Y, Y, and y denote an N-way array, a matrix, and a vector, respectively. The ith entry of a vector y is denoted by yi, and the element (i, j) of a matrix Y is denoted by Y(i, j) = yij. The same notation is used for N-way arrays by referring to the element Y(i1, i2, ..., iN). As a natural generalization, we define the Frobenius norm of an N-way array as the square root of the sum of its squared entries. Sometimes we use MATLAB notation to indicate the full range of indices in certain modes; for example, Y(:, j) means a vector composed of all the entries of column j. Additionally, we refer to a subarray (or block) by restricting the indices to certain subsets; for example, given subsets In of Sn indices in each mode (n = 1, 2, ..., N), the subarray Y(I1, I2, ..., IN) is obtained by keeping the entries of the original N-way array at the selected index subsets. We denote the cardinality of an index subset In by |In| = Sn.
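As a hedged illustration of this notation (the array size and the index subsets below are our own, chosen only for the example), the subarray extraction Y(I1, I2, I3) and the N-way Frobenius norm can be sketched in NumPy:

```python
import numpy as np

# Illustrative 3-way array; sizes are arbitrary.
Y = np.arange(4 * 5 * 6).reshape(4, 5, 6)

# Index subsets I1, I2, I3 (hypothetical, for the example only).
I1, I2, I3 = [0, 2], [1, 3, 4], [0, 5]

# Subarray (block) Y(I1, I2, I3): keep entries at the selected indices.
block = Y[np.ix_(I1, I2, I3)]            # shape (2, 3, 2) = (|I1|, |I2|, |I3|)

# Frobenius norm of an N-way array: sqrt of the sum of squared entries.
fro = np.sqrt(np.sum(Y.astype(float) ** 2))
```

Note that `np.ix_` is simply NumPy's mechanism for restricting each mode to an index subset, matching the subarray notation above.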
2.1. Sparse Solutions of Underdetermined Linear Systems.
We say that a one-dimensional signal (vector) has a sparse representation in a known dictionary if it can be recovered exactly by a linear combination of a few selected elements of the dictionary (atoms), as definition 1 states.
Several algorithms have been proposed to find the sparsest representation of a signal for a given dictionary by solving the corresponding underdetermined linear system of algebraic equations under the sparsity constraint (Tropp & Wright, 2010). These methods can be divided into two main groups: basis pursuit (BP) methods and matching pursuit (MP) methods.
2.2. Kronecker Dictionaries.
The Kronecker product A ⊗ B of two matrices is defined as the block matrix obtained by replacing each entry aij of A with the block aij B.
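The definition is easiest to see on a small example. The following NumPy sketch (with illustrative 2 × 2 matrices of our choosing) shows the block structure of A ⊗ B:

```python
import numpy as np

# Small illustrative matrices (not from the letter).
A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [1, 0]])

# Kronecker product: each entry a_ij of A is replaced by the block a_ij * B.
K = np.kron(A, B)   # shape (4, 4)

# e.g., the top-left 2x2 block of K equals a_11 * B = 1 * B,
# and the bottom-left 2x2 block equals a_21 * B = 3 * B.
```

More generally, if A is I1 × J1 and B is I2 × J2, then A ⊗ B is I1I2 × J1J2, which is why Kronecker dictionaries grow so quickly with the number of modes.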
2.3. Multiway Arrays (N-way arrays) and the Tucker Model.
Given a multiway array (tensor), its mode-n vectors are obtained by fixing every index but the one in mode n. The mode-n unfolding matrix is defined by arranging all the mode-n vectors as columns of a matrix. Note that for the 2D case, mode-1 and mode-2 vectors are the columns and rows, respectively, and the mode-2 unfolding matrix is the transpose of the original matrix.
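The unfolding operation can be sketched as follows; the helper name `unfold` is ours, and the ordering of the mode-n vectors as columns follows one common convention (orderings differ across the literature):

```python
import numpy as np

def unfold(T, mode):
    """Mode-n unfolding: bring `mode` to the front, flatten the rest,
    so the mode-n vectors of T become the columns of the result."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

# Illustrative 3-way array.
Y = np.arange(2 * 3 * 4).reshape(2, 3, 4)
Y1 = unfold(Y, 0)   # shape (2, 12): mode-1 vectors as columns
Y2 = unfold(Y, 1)   # shape (3, 8)
Y3 = unfold(Y, 2)   # shape (4, 6)

# 2D sanity check: for a matrix, the mode-2 unfolding is the transpose.
M = np.array([[1, 2], [3, 4]])
assert (unfold(M, 1) == M.T).all()
```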
The result of proposition 1 tells us that the multilinear Tucker model structure for a multidimensional signal is equivalent to a linear representation (matrix equation) of the vectorized signal, where the dictionary has a Kronecker structure. In the standard Tucker model, the core tensor usually has a much smaller size than the data tensor (Mn < In), and the main objective is to find such a compressed decomposition, that is, to compute the core tensor and the factor matrices An, usually with additional constraints. In contrast, in our approach, the data set (measurements) and the dictionaries Dn are assumed to be known, and our objective is to compute the core tensor, which is assumed to be very sparse and is larger than the data tensor (Mn > In).
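The equivalence stated in proposition 1 can be verified numerically for N = 2, under the standard column-major vectorization convention vec(D1 X D2^T) = (D2 ⊗ D1) vec(X); sizes and the random dictionaries below are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative mode dictionaries and core (here dense; in the letter
# the core is assumed sparse). Sizes are arbitrary, with Mn > In
# as in the overcomplete setting.
D1 = rng.standard_normal((4, 6))   # mode-1 dictionary, I1 x M1
D2 = rng.standard_normal((5, 7))   # mode-2 dictionary, I2 x M2
X = rng.standard_normal((6, 7))    # core, M1 x M2

# Tucker representation for N = 2.
Y = D1 @ X @ D2.T

# Equivalent linear system on the vectorized signal (column-major vec):
# vec(Y) = (D2 kron D1) vec(X).
lhs = Y.ravel(order='F')
rhs = np.kron(D2, D1) @ X.ravel(order='F')
assert np.allclose(lhs, rhs)
```

The same identity extends to N modes, with the global dictionary being the Kronecker product of all mode dictionaries.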
3. Sparse Representations of N-Way Arrays
Here we generalize the concept of sparse representations of vectors to N-way arrays through the equivalence between the Tucker model and the Kronecker structure. Formally, definition 1 can be applied to a vectorized version of an N-way array with respect to a Kronecker dictionary arriving at the following obvious generalization:
(multiway sparsity). A multidimensional signal (N-way array) is K-sparse with respect to the factors Dn (n = 1, 2, ..., N) if its vectorized version y admits a K-sparse representation over the Kronecker dictionary D = DN ⊗ ··· ⊗ D1 (under column-major vectorization) according to representation 2.14.
The columns of the factor matrices Dn are interpreted as dictionary elements or atoms associated with each mode. For many problems, these atoms can be selected to resemble the coherent structures that appear in each mode of the input N-way array. In other applications, we may want to design such dictionaries in order to favor a sparse representation. In this work, we assume that the dictionaries Dn for n = 1, 2, ..., N are known (mode-n dictionaries).
Without loss of generality, we assume throughout the letter that the mode-n dictionaries have unit-norm columns.
In the multiway sparsity definition, definition 3, the nonzero entries can be located anywhere within the core tensor. Sparsity is a simple assumption that naturally arises in real life, but it leads to mathematical problems that are not easy to solve. Therefore, additional assumptions are needed in order to simplify the problems and develop more efficient algorithms, especially for high-dimensional data sets. Here, we propose block sparsity for tensor data sets, motivated by the fact that in the real world, the nonzero coefficients are not evenly distributed and are likely to be grouped in blocks (Figure 2 shows two motivating examples of natural images revealing block-sparse representations). In other words, block sparsity is a natural and realistic assumption that incorporates valuable prior information about signals in nature, because block-sparse signals are more likely to occur than totally random sparse representations.
We typically assume that Sn ≪ In and Sn ≪ Mn (n = 1, 2, ..., N). In other words, multiway block sparsity assumes that the nonzero entries of the core tensor are located within a subarray (block) X(I1, I2, ..., IN) with |In| = Sn. In Figure 1, a comparison of the sparsity types used in this work is shown.
The following basic results are easily derived from the previous definitions:
(multiway block sparsity implies sparsity of the vectorized version of the signal). If an N-way array is (S1, S2, ..., SN)-block sparse with respect to factor matrices Dn (n = 1, 2, ..., N), then its vectorized version is K-sparse (K = S1 S2 ··· SN) with respect to the Kronecker dictionary D.
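The counting behind proposition 2 is immediate to check numerically; the core size and the index subsets below are hypothetical, chosen only to illustrate that an (S1, S2, S3)-block-sparse core has exactly K = S1 S2 S3 nonzero entries in its vectorization:

```python
import numpy as np

# Illustrative core size and index subsets (S1, S2, S3) = (2, 3, 2).
M1, M2, M3 = 6, 6, 6
I1, I2, I3 = [1, 4], [0, 2, 5], [3, 4]

# Build a block-sparse core: nonzeros confined to the subarray
# X(I1, I2, I3).
X = np.zeros((M1, M2, M3))
X[np.ix_(I1, I2, I3)] = 1.0

# The vectorized core has K = S1 * S2 * S3 nonzeros.
K = np.count_nonzero(X.ravel(order='F'))
# K == 2 * 3 * 2 == 12
```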
(multiway block sparsity implies sparsity of mode-n vectors). If an N-way array is (S1, S2, ..., SN)-block sparse, then its mode-n vectors have an Sn-sparse representation with respect to Dn for each n = 1, 2, ..., N.
The N-way array has a Tucker representation with a core tensor and factors Dn (n = 1, 2, ..., N). By using equation 2.12, the result follows.
Note that all mode-n vectors belong to the same subspace, of dimension at most Sn; that is, not only can all these vectors be sparsely represented over Dn, but they all use the same reduced set of dictionary elements. This also means that the converse of proposition 3 is not true in general; that is, sparsity of the mode-n vectors does not imply multiway block sparsity.
It is important to mention that our definition of multiway block sparsity is different from the common definition of block-sparse representation introduced for 1D signals in Eldar, Kuppinger, and Bolcskei (2010). For 1D signals, block sparsity of order K usually refers to the fact that the vector of coefficients x is composed of a concatenation of blocks of length d, with no more than K nonzero blocks.
4. Multidimensional Compressed Sensing
The generalization of CS to higher dimensions is straightforward by assuming a Kronecker structure for both the dictionary and the sensing matrix. This was recently proposed by Rivenson and Stern (2009a) for 2D signals and by Duarte and Baraniuk (2011) for the general N-dimensional case. Formally, the vectorization y of an N-way array is assumed to have a K-sparse representation over a Kronecker dictionary D. Then, by using a global sensing matrix with Kronecker structure, Φ = ΦN ⊗ ··· ⊗ Φ1, we obtain a large-scale linear system of equations with Kronecker structure as given by equation 2.14.
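A practical consequence of the Kronecker structure is that the global sensing matrix never needs to be formed: applying the small mode-n sensing matrices via mode products yields the same measurements. The following sketch checks this equivalence on a toy 3-way signal (all sizes and the `mode_prod` helper are our own, for illustration only):

```python
import numpy as np

rng = np.random.default_rng(1)

def mode_prod(T, M, mode):
    """Mode-n product T x_n M (helper name ours): contract M's columns
    with mode `mode` of T, keeping the new axis in the same position."""
    return np.moveaxis(np.tensordot(M, T, axes=(1, mode)), 0, mode)

# Toy 3D signal and small gaussian mode sensing matrices (3 x 8 each).
Y = rng.standard_normal((8, 8, 8))
Phis = [rng.standard_normal((3, 8)) for _ in range(3)]

# Measurements via successive mode products: never forms the big matrix.
Z = Y
for n, Phi in enumerate(Phis):
    Z = mode_prod(Z, Phi, n)          # final shape (3, 3, 3)

# Same measurements via the explicit global Kronecker sensing matrix
# Phi3 kron Phi2 kron Phi1 (column-major vectorization).
Phi_glob = np.kron(np.kron(Phis[2], Phis[1]), Phis[0])
assert np.allclose(Z.ravel(order='F'), Phi_glob @ Y.ravel(order='F'))
```

For realistic sizes the explicit matrix would be enormous (I1I2I3 × M1M2M3), which is precisely why the mode-product route matters.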
5. Solving Large, Underdetermined Linear Systems with Kronecker Structure
It is clear that if the dictionary does not have a predefined structure, we can use any of the available algorithms such as OMP or BP. With a Kronecker dictionary, we can exploit its structure in order to reduce the computational complexity and memory requirements of a greedy algorithm, as presented in section 5.1. It is worth mentioning that a BP algorithm that exploits the Kronecker structure has been proposed by Rivenson and Stern (2009a, 2009b), which we refer to here as the Kronecker-BP (or Kronecker-SPGL1) algorithm. But we can do better: if the multiway block-sparsity model is valid, we can use this structure to develop a very fast algorithm called N-BOMP, which we introduce in section 5.2. In Table 1 we summarize the algorithms and their applicability conditions that are covered in this letter.
|Dictionary D|Sparsity Type|Algorithms|
|Unstructured|Sparsity|OMP, BP|
|Kronecker|Multiway sparsity|Kronecker-OMP, Kronecker-BP|
|Kronecker|Multiway block sparsity|N-BOMP|
In the following sections, we consider algorithms to compute a sparse representation of a multidimensional signal (N-way array) given a fixed Kronecker dictionary D = DN ⊗ ··· ⊗ D1 built from mode-n dictionaries Dn. To simplify the notation and facilitate the analysis of the associated complexity, we consider N-way arrays with the same size in each mode and dictionaries with the same number of atoms in each mode: In = I and Mn = M (n = 1, 2, ..., N).
5.1. The Kronecker-OMP Algorithm for Multiway Sparsity.
We exploit here the Kronecker structure of the dictionary to avoid storing it explicitly and thus save memory resources. In other words, we can compute the equivalent multiway product in algorithm 1, step 3, which also gives us much lower complexity. Note that M is the number of dictionary elements associated with each mode, with usually M > I (an underdetermined system of equations). For example, if M = 2I, with this simple technique we achieve a complexity of order 2^N I^(N+1), which is small compared to the complexity of the OMP algorithm applied to the vectorized signal (approximately 2^N I^(2N)).
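The correlation step can be sketched as follows, under our reading of step 3: the correlations D^T r are computed as a multiway product of the residual tensor with the transposed mode dictionaries, and the most correlated atom is the entry of largest magnitude, whose multi-index gives one atom per mode. Helper names and sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)

def mode_prod(T, M, mode):
    """Mode-n product T x_n M (helper name ours)."""
    return np.moveaxis(np.tensordot(M, T, axes=(1, mode)), 0, mode)

# Toy sizes: I = 6 samples per mode, M = 12 atoms per mode, N = 3 modes.
I, M, N = 6, 12, 3
Ds = [rng.standard_normal((I, M)) for _ in range(N)]
R = rng.standard_normal((I,) * N)        # residual tensor

# Correlation tensor: R x_1 D1^T x_2 D2^T x_3 D3^T, with M^N entries,
# computed without ever forming the I^N x M^N Kronecker dictionary.
C = R
for n, D in enumerate(Ds):
    C = mode_prod(C, D.T, n)

# Most correlated atom: multi-index of the largest-magnitude entry.
idx = np.unravel_index(np.argmax(np.abs(C)), C.shape)

# Cross-check against the explicit vectorized computation D^T r.
D_glob = np.kron(np.kron(Ds[2], Ds[1]), Ds[0])
assert np.allclose(C.ravel(order='F'), D_glob.T @ R.ravel(order='F'))
```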
5.2. N-BOMP Algorithm: An Efficient Algorithm to Find Multiway Block-Sparse Representations.
The N-BOMP algorithm not only optimizes the memory storage but also requires far fewer iterations compared to the classic OMP algorithm, because the maximum number of iterations is NS, whereas OMP needs K = S^N iterations, with K being the number of nonzero entries within the core tensor (see the complexity analysis below).
5.3. Complexity Analysis.
Here we analyze the computational complexity associated with each of the algorithms discussed in this letter. For this analysis, we assume an N-way array with In = I and mode-n dictionaries given by matrices Dn with Mn = M. We also assume an (S, S, ..., S)-block-sparse representation with factors Dn (n = 1, 2, ..., N), which means that there are S^N nonzero coefficients. We consider the arithmetic operations required by step 3 (maximum correlated atom detection), step 5 (least squares estimation of nonzero coefficients), and step 6 (residual update) in all algorithms at iteration number k in terms of I (mode size), M (number of atoms per mode), and N (number of dimensions). The comparative results are summarized in Table 2 (see the details in the appendix), where the advantage of N-BOMP over Kronecker-OMP and OMP is evident. From this table, we observe that for very sparse representations, the complexity is dominated by step 3, which is exactly the same for Kronecker-OMP and N-BOMP. The key advantage of N-BOMP over the other algorithms is that it requires many fewer iterations (at most NS against S^N iterations in OMP/Kronecker-OMP). In addition, in step 5, the N-BOMP algorithm complexity in terms of the number of entries I^N is sublinear, compared to a linear dependence for the standard OMP and Kronecker-OMP algorithms. In section 7, we show several numerical results with comparisons of the computation times required by different algorithms applied to multidimensional signals.
| |OMP|Kronecker-OMP|N-BOMP|
|Number of Iterations|S^N|S^N|between S and NS|
Note: Iterations and operations required at iteration k for the standard OMP, Kronecker-OMP, and N-BOMP algorithms for an N-way array with mode-n dictionaries Dn and block-sparsity parameter S (usually S ≪ I).
6. Algorithm Performance Guarantees
The following theorem shows that for an N-way array generated as described before, the N-BOMP algorithm is able to obtain the true sparse representation in fewer iterations and under a less restrictive condition compared to equation 6.1.
Condition 6.2 establishes an upper bound on the number of nonzero entries S^N in the multiway block-sparse representation, which is less restrictive than condition 6.1, as illustrated in Figure 3, where this bound was computed numerically. We can see that the N-BOMP bound is always much larger than the classic OMP bound, and the situation improves for a higher number of dimensions N (at a fixed global coherence value). It is interesting to note that in practice, even beyond this upper bound, the algorithms are able to recover the correct representation with high probability, and N-BOMP always performs better than OMP for multiway block-sparse representations, as our simulations in section 7 demonstrate.
7. Experimental Evaluation
In this section, we present several simulation results on synthetically generated signals and real-world signals in order to compare the performance of our algorithms against state-of-the-art methods like the classic OMP and BP. In particular, we evaluated an optimized version of a BP algorithm called SPGL1 (a solver for large-scale sparse reconstruction) (Berg & Friedlander, 2007), referred to here as Kronecker-SPGL1, which is optimized for large data sets and takes into account the Kronecker structure of the linear operator as used in Rivenson and Stern (2009a, 2009b) in the CI context. All the experiments were performed on a 2.65 GHz Intel Core i5 with 8GB RAM with Matlab using the Tensor Toolbox (Bader & Kolda, 2007) for efficient handling of multiway arrays. (A downloadable Matlab Demo package is available at http://web.fi.uba.ar/~ccaiafa/Cesar/N-BOMPdemo.html, which reproduces our results for 2D and 3D CS as reported in this letter.)
7.1. Recovery of Exact Multiway Block-Sparse Representations.
Here we analyze the capability of our algorithms to recover the correct set of nonzero coefficients and compare them against the classic OMP and BP (SPGL1). We generated N-way arrays having exact multiway block-sparse representations according to definition 4, with Mn = M, In = I, and Sn = S for n = 1, 2, ..., N. The indices that define the subarray of nonzero coefficients were randomly chosen, and the values of the nonzero entries were generated as independent and identically distributed (i.i.d.) gaussian numbers. We decided that a sparse representation was correctly recovered if the relative error based on the Frobenius norm was less than a threshold of 10^-2, that is, if ||Y - Yhat||F / ||Y||F < 10^-2.
In experiment 1 (deterministic dictionary), we generated the mode dictionaries by concatenating two orthogonal bases: the discrete cosine transform (DCT) basis and the canonical basis (spikes). More specifically, the dictionaries are generated as Dn = [C|I], where C is the M × M orthonormal DCT matrix and I is the M × M identity matrix. It is known that sines and spikes provide highly incoherent matrices (Tropp, 2008), which is desirable in order to guarantee the recovery of sparse representations. In fact, the coherence of these matrices is close to sqrt(2/M), and the global coherence of the Kronecker dictionary is determined by the mode coherences. These kinds of dictionaries (i.e., concatenations of orthogonal bases) have already been tested in the compressed sensing literature (Donoho & Elad, 2003; Gribonval, 2003). In Figure 4, we compare the percentage of correctly recovered representations over an ensemble of 100 simulations by using standard OMP/Kronecker-OMP, BP (implemented by SPGL1/Kronecker-SPGL1), and N-BOMP algorithms applied to generated signals for the 1D, 2D, and 3D cases as a function of the total number of nonzero entries (S^N). It is important to note that the N-BOMP algorithm always performed better than standard OMP and BP. Furthermore, it is interesting to note that the maximum number of nonzero entries allowed by conditions 6.1 and 6.2 to guarantee the success of OMP and N-BOMP, respectively, is much lower than the number actually tolerated in practice. For example, in Figure 4c, the bounds give a maximum number of nonzero entries of S3max = 1.87 (OMP) and S3max = 4.05 (N-BOMP); nevertheless, our experimental results show that 100% of the cases were correctly recovered even with S3 = 27 (OMP) and S3 = 64 (N-BOMP), which reveals that the theoretical guarantees are very conservative (pessimistic) and the algorithms also perform well beyond those limits.
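The experiment-1 dictionary and its coherence can be sketched as follows. The DCT normalization and the size M = 32 are our own choices for illustration; the mutual coherence (largest absolute inner product between distinct unit-norm atoms) comes out close to sqrt(2/M), as the text indicates:

```python
import numpy as np

M = 32
k = np.arange(M)

# Orthonormal DCT-II basis as columns: C[i, j] ~ cos(pi*(2i+1)*j / (2M)),
# with the constant (j = 0) atom rescaled for orthonormality.
C = np.sqrt(2.0 / M) * np.cos(np.pi * (2 * k[:, None] + 1) * k[None, :] / (2 * M))
C[:, 0] = np.sqrt(1.0 / M)

# Concatenation of the DCT basis and the canonical basis (spikes):
# Dn = [C | I], 2M unit-norm atoms.
Dn = np.hstack([C, np.eye(M)])

# Mutual coherence: max off-diagonal entry of the Gram matrix.
G = np.abs(Dn.T @ Dn)
np.fill_diagonal(G, 0.0)
mu = G.max()   # close to sqrt(2 / M)
```

Within each orthonormal block the inner products vanish, so the coherence is driven entirely by the DCT-versus-spike cross terms.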
In experiment 2 (random dictionary), we consider random dictionaries Dn, as in the case of CS, because they are the result of multiplying a random sensing matrix by an orthogonal matrix. Dictionaries were generated using i.i.d. gaussian variables followed by normalization of the columns. It is well known that gaussian matrices have low coherence with high probability (Tropp & Gilbert, 2007). In Figure 5, we compare the percentage of correctly recovered representations over an ensemble of 100 simulations by using standard OMP/Kronecker-OMP, BP (implemented by SPGL1/Kronecker-SPGL1), and N-BOMP algorithms applied to generated signals for the 1D, 2D, and 3D cases as a function of the total number of measurements (I^N). N-BOMP provides a higher percentage of correctly recovered signals when few measurements are available. For example, in the 3D case with approximately 3000 measurements, which represents 3000/24^3 ≈ 21.7% of the whole signal, N-BOMP recovers almost 90% of the representations successfully, against 65% and 35% obtained by BP and Kronecker-OMP, respectively.
In Figure 6, the computation times required by the Kronecker-OMP, Kronecker-BP, and N-BOMP algorithms are shown for experiment 1 (time versus the number of nonzero coefficients S^N) and experiment 2 (time versus the number of measurements I^N). It is important to highlight that we used the SPGL1 algorithm for Kronecker-BP, which is optimally implemented by Berg and Friedlander (2007) in Matlab, including fast C implementations; our implementation of Kronecker-OMP and N-BOMP does not include any C code. For experiment 1 (see Figures 6a and 6b), the complexity of Kronecker-OMP as a function of the number of nonzero entries grows faster than those of Kronecker-BP and N-BOMP. This is because the number of iterations required by Kronecker-OMP is S^N, while for N-BOMP it is between S and NS (see section 5.3). On the other hand, the number of iterations of Kronecker-BP depends on the convergence of the algorithm, not directly on the number of nonzero entries. For experiment 2 (see Figures 6c and 6d), we observe that Kronecker-OMP and N-BOMP show approximately flat plots because the number of nonzero entries is kept fixed at S^2 = 8^2 = 64 in Figure 6c and S^3 = 3^3 = 27 in Figure 6d. On the other hand, as more measurements become available, the Kronecker-BP algorithm runs faster because it requires fewer iterations to converge. From all these plots, the advantage of the N-BOMP algorithm over the other algorithms is clear.
7.2. Application to Multidimensional CS.
Here we apply our N-BOMP algorithm to CS and compare it to a state-of-the-art method for the case of multidimensional images (large-scale data sets). More specifically, we compare it to the optimized implementation of Kronecker-SPGL1 proposed in Rivenson and Stern (2009a, 2009b), where the Kronecker structure is considered. We use the peak signal-to-noise ratio (PSNR) to measure the quality of reconstruction of a multidimensional signal, defined as PSNR(dB) = 10 log10(ymax^2 / MSE), where ymax is the peak value of the original signal and MSE is the mean squared reconstruction error. Our results demonstrate that by taking into account the multiway block sparsity, we are able to obtain much better results in terms of computation time and quality of reconstruction, as the following experiments clearly show.
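For concreteness, the PSNR measure can be sketched as below; the helper name `psnr_db` and the toy signal are ours, and the peak is taken as the maximum absolute value of the reference signal (conventions vary, e.g., a fixed peak of 255 for 8-bit images):

```python
import numpy as np

def psnr_db(Y, Y_hat):
    """PSNR in dB: peak value of the reference over the mean squared
    reconstruction error (helper name ours)."""
    mse = np.mean((Y.astype(float) - Y_hat.astype(float)) ** 2)
    peak = np.abs(Y).max()
    return 10.0 * np.log10(peak ** 2 / mse)

# Toy check: a unit perturbation of a constant image.
Y = np.full((8, 8), 100.0)
Y_hat = Y + 1.0     # MSE = 1, peak = 100
# psnr_db(Y, Y_hat) -> 10 * log10(100**2 / 1) = 40 dB
```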
7.2.1. Compressive Imaging.
Our N-BOMP algorithm is very attractive for CI since the Kronecker structure can be enforced in the implementation of sensors by separable optics, as demonstrated in Rivenson and Stern (2009a, 2009b). In order to put our algorithms in the context of state-of-the-art methods for CI, in this section we consider the reconstruction of a 1-megapixel 2D image (M = 1024 per mode), a benchmark already used in Candès and Romberg (2006), Candès and Wakin (2008), and Rivenson and Stern (2009a), where the original image was first processed by thresholding the largest 25,000 wavelet coefficients. Here, we kept 99,078 coefficients contained in a 2D block, which concentrates most of the signal energy in the Daubechies-separable wavelet transform domain (this block was selected by applying the N-BOMP algorithm to the whole image with a target reconstruction relative error of 0.0874, the same error obtained by keeping the 25,000 largest coefficients). We projected the input image by using gaussian sensing matrices Φn (n = 1, 2) for a wide range of the sampling ratio I^2/M^2. In Figure 7a, the PSNRs obtained by the Kronecker-SPGL1 and N-BOMP algorithms are shown as a function of the sampling ratio I^2/M^2. We see that for small sampling ratios, N-BOMP outperforms the BP strategy. For example, for a sampling ratio of 15%, we obtained PSNR values of 27 dB (Kronecker-SPGL1) and 35 dB (N-BOMP), as illustrated in Figure 8. In Figure 7b, the computation times required by these algorithms are shown. It is interesting to note that N-BOMP is always faster than the highly optimized Kronecker-SPGL1 algorithm, which is extremely expensive for low sampling ratios due to the number of iterations required to converge to the solution.
7.2.2. Hyperspectral Compressive Imaging.
Our sparse models for tensors are particularly relevant when dealing with large data sets of high dimensionality, because existing vector techniques such as the OMP or SPGL1 algorithms become prohibitively expensive or intractable. For example, large-scale problems with 3D signals cannot be processed with state-of-the-art algorithms (such as the Kronecker-BP based on SPGL1) without the aid of a supercomputer. One example of such a problem is hyperspectral compressive imaging (HCI), introduced in section 4, which we illustrate in this section.
We show the results of applying our N-BOMP algorithm to two examples of natural-scene hyperspectral images, corresponding to the signals extracted from scenes 7 and 8 in the database of Foster, Nascimento, and Amano (2004; available online at http://personalpages.manchester.ac.uk). No preprocessing steps were applied to these data sets. These hyperspectral images contain scene reflectances measured at 32 different frequency channels, acquired by a low-noise Peltier-cooled digital camera in the wavelength range of 400 to 720 nm (see the details in Foster et al., 2004).
As explained in section 4, for each channel we apply a separable random-sensing matrix whose factors Φn are gaussian random matrices. We also assume that the data set has a multiway block-sparse representation on the separable Daubechies wavelet transform basis. The resulting sampling ratio is the number of compressive measurements divided by the total number of entries. Results are shown in Figures 9 and 10. Quantitatively, the reconstructions perform very well, with PSNR = 35.26 dB (global PSNR) for scene 7 and PSNR = 42.52 dB (global PSNR) for scene 8. This can be qualitatively verified by visual inspection of the images (the worst and best reconstructed slices are shown in Figures 9 and 10 with their corresponding PSNR values). It is important to highlight that these large data sets required only about 1 hour of computation time with our N-BOMP algorithm, which is very fast compared to the computation required by other state-of-the-art algorithms.
8. Conclusions

Sparsity is a simple assumption that naturally arises in real life, but it leads to mathematical problems that are not easy to solve. Therefore, additional assumptions are needed in order to simplify the problems and develop more efficient algorithms, especially for high-dimensional data sets. In this work, we demonstrated the advantage of keeping the multidimensional structure of a data set by applying N-way array algorithms instead of classical vector algorithms. Since multidimensional signals usually have a sparse representation over, for example, the cosine, Fourier, or wavelet separable transforms, it is reasonable to consider that dictionaries of multidimensional signals can be modeled by a Kronecker product of mode dictionaries. Additionally, motivated by the fact that in the real world, the nonzero coefficients are not evenly distributed and are likely to be grouped in blocks, we have introduced multiway block sparsity, which allowed us to develop N-BOMP, a very efficient algorithm. In other words, block sparsity is a natural and realistic assumption that incorporates valuable prior information about signals, because block-sparse signals are more likely to occur than totally random sparse representations. Moreover, block-sparse signals are better recovered; in fact, the theoretical bound for N-BOMP is better than the theoretical bound corresponding to the vector case (OMP). We have provided extensive experimental results demonstrating the tremendous advantages of using algorithm 3 (N-BOMP), especially for higher-dimensional signals. Examples of the application of the N-BOMP algorithm to compressed sensing (CS), especially to 2D compressive imaging (CI) and 3D hyperspectral CI, are presented, showing the efficiency and usefulness of our algorithm compared to available state-of-the-art algorithms such as classical OMP and Kronecker-BP.
Summarizing, our work showed that block sparsity arises naturally in real-world signals; that incorporating it allows one to dramatically reduce the complexity of algorithms; and that the block-sparsity assumption also yields better-quality signal reconstructions, as our experimental results demonstrated.
Appendix: Detailed Complexity Analysis
A.1. OMP Algorithm (Algorithm 1).
We begin with the analysis of algorithm 1 in its recent optimized version, as described in Rubinstein et al. (2008). This algorithm takes as input the vector y and an explicit dictionary D with M > I (overcomplete case). Assuming that the algorithm is guaranteed to obtain the true sparse representation, it will need exactly S^N iterations:
Step 3: This step includes computing D^T r and taking the absolute-value maximum (2(MI)^N + 2M^N operations).
Step 5: By using the Cholesky factorization method, this step requires (kI)^N + 3k^{2N} operations (Rubinstein et al., 2008).
Step 6: This step includes computing the current approximation and subtracting it from y (2(Ik)^N + I^N operations).
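A minimal, unoptimized sketch of these three steps, using a direct least-squares solve in place of the progressive Cholesky update of Rubinstein et al. (2008); the dictionary sizes and the sparse vector below are hypothetical:

```python
import numpy as np

def omp(y, D, n_nonzero):
    """Plain OMP: greedy atom selection (step 3), least-squares
    coefficient update (step 5), residual update (step 6).
    Uses lstsq instead of the faster progressive Cholesky update."""
    r = y.copy()
    support = []
    x = np.zeros(D.shape[1])
    for _ in range(n_nonzero):
        # Step 3: correlate the residual with all atoms, pick the max.
        support.append(int(np.argmax(np.abs(D.T @ r))))
        # Step 5: refit the coefficients on the current support.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        # Step 6: recompute the residual.
        r = y - D[:, support] @ coef
    x[support] = coef
    return x

# Usage: attempt to recover a 3-sparse vector from an overcomplete
# gaussian dictionary (exact recovery holds with high probability).
rng = np.random.default_rng(1)
D = rng.standard_normal((64, 128))
D /= np.linalg.norm(D, axis=0)           # unit-norm atoms
x_true = np.zeros(128)
x_true[[5, 40, 99]] = [1.0, -2.0, 1.5]
x_hat = omp(D @ x_true, D, 3)
assert np.count_nonzero(x_hat) <= 3
```

The correlation D.T @ r is the dominant cost per iteration, which is exactly the 2(MI)^N term in the count above once the N-way signal is vectorized.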
A.2. Kronecker-OMP Algorithm (Algorithm 2).
As in the previous case, assuming that the algorithm is guaranteed to obtain the true sparse representation, it will need exactly S^N iterations:
Step 3: This step includes computing the mode-n products of the residual tensor with the transposed mode dictionaries (at most 2NM^N I operations when M ≥ I) and taking the absolute-value maximum of the result (2M^N operations).
Step 5: This step includes computing the correlation of the newly selected atom with the data tensor (2I^N operations), computing the vector b of correlations between the new atom and the previously selected atoms (2Nk^N I + Nk^N operations), computing d = (Z^{(k−1)})^{−1} b (2k^{2N} operations), computing b^T d (2k^N operations), updating the inverse matrix according to equation 5.4 (2k^N + 3k^{2N} operations), and computing the nonzero coefficients according to equation 5.7 (2k^{2N} operations). This gives a total of 2I^N + 2Nk^N I + (N + 4)k^N + 7k^{2N} operations.
Step 6: This step includes computing the current approximation and subtracting it from the data tensor (N(N − 1)I^N + I^N operations).
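For the 2D case (N = 2), step 3 can be sketched as follows; the sizes below are hypothetical. The correlations are obtained with two small mode-n products instead of one multiplication by the explicit Kronecker dictionary:

```python
import numpy as np

rng = np.random.default_rng(2)
I, M = 16, 24                             # hypothetical mode sizes, M > I
D1 = rng.standard_normal((I, M))          # mode-1 dictionary
D2 = rng.standard_normal((I, M))          # mode-2 dictionary
R = rng.standard_normal((I, I))           # residual kept as a 2-way array

# Step 3 via mode-n products: two small multiplications,
# O(M*I^2 + M^2*I) operations.
C = D1.T @ R @ D2                         # M x M correlation array

# Same result through the explicit Kronecker dictionary:
# O((M*I)^2) operations and O((M*I)^2) memory.
C_vec = np.kron(D2, D1).T @ R.flatten(order="F")
assert np.allclose(C.flatten(order="F"), C_vec)

# The atom to add corresponds to the absolute-value maximum.
i1, i2 = np.unravel_index(np.argmax(np.abs(C)), C.shape)
```

This is the identity vec(D1^T R D2) = (D2 ⊗ D1)^T vec(R) that lets Kronecker-OMP avoid ever building the full dictionary.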
A.3. N-BOMP Algorithm (Algorithm 3).
A distinctive characteristic of this algorithm compared to the previous ones is that, assuming the algorithm is guaranteed to obtain the true sparse representation, it requires far fewer iterations. More specifically, after the maximally correlated atom is detected in step 3, its position within the multiway array determines the indices to be added to the current mode index subsets. Some of these indices may already be included in the corresponding subsets, but it is guaranteed that at least one new index in at least one mode will be added. Thus, every iteration lies between the following two extreme cases: case 1, only the index subset of a single mode n is incremented by 1; case 2, a new index is selected in every mode, so all N subsets are incremented by 1. Below, we analyze the complexity for the worst case (case 2). We note that the minimum number of iterations is S (case 2) and the maximum number of iterations is NS (case 1):
Step 3: The same as for the Kronecker-OMP algorithm (at most 2NM^N I operations).
Step 5: The update of the Cholesky factorization for each mode, as in algorithm 1, requires 2kI + k^2 + 2k operations, and solving the set of N equations of type 5.10, again using the Cholesky factorization, requires 2Nk^{N+1} operations. Thus, in the worst case (case 2), where the Cholesky factorization must be updated for every mode, the total number of operations is N(2kI + k^2 + 2k) + 2Nk^{N+1}.
Step 6: This step includes computing the current approximation by successive mode-n products (approximately 2kI^N operations) and subtracting it from the data tensor (I^N operations), giving approximately 2kI^N + I^N operations, where we assumed that k ≪ I.
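Once the block support is known, the coefficient computation is separable, because the pseudo-inverse of a Kronecker product is the Kronecker product of the pseudo-inverses; it reduces to one small solve per mode. A minimal 2D sketch with hypothetical sizes and index sets:

```python
import numpy as np

rng = np.random.default_rng(3)
I, M, k = 16, 24, 3
D1 = rng.standard_normal((I, M))          # mode-1 dictionary
D2 = rng.standard_normal((I, M))          # mode-2 dictionary

# An exactly (k, k)-block-sparse 2-way signal: the nonzero core sits
# on the mode index sets I1 and I2.
I1, I2 = [2, 7, 11], [0, 5, 20]
X_core = rng.standard_normal((k, k))
Y = D1[:, I1] @ X_core @ D2[:, I2].T

# Separable least squares over the block support: since
# pinv(A2 kron A1) = pinv(A2) kron pinv(A1), the coefficients follow
# from one small pseudo-inverse per mode.
X_hat = np.linalg.pinv(D1[:, I1]) @ Y @ np.linalg.pinv(D2[:, I2]).T
assert np.allclose(X_hat, X_core)
```

For an exactly block-sparse signal this recovers the core exactly, and each factor involved is only I x k rather than I^2 x k^2.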
Acknowledgments
We are grateful to the anonymous reviewers for their comments, which helped us to improve this letter. This work has been supported in part by CONICET under grant PIP 2012-2014, number 11420110100021.
Note that the ℓ0 norm is not formally a norm, since ‖αx‖0 = ‖x‖0 for any α ≠ 0, violating the homogeneity property.
Classical examples of such separable transforms in the area of image processing are the discrete Fourier transform (DFT), the discrete cosine transform (DCT), and the discrete wavelet transform (DWT), among others.
It is worth mentioning that equation 2.12 is also valid for factor matrices with more columns than rows, that is, with R_n > I_n.
This multiway block-sparsity structure was recently proposed in Caiafa and Cichocki (2012).
The Khatri-Rao product is obtained by applying the Kronecker product column-wise; that is, given matrices A = [a_1 a_2 ⋯ a_M] and B = [b_1 b_2 ⋯ b_M], then A ⊙ B = [a_1 ⊗ b_1  a_2 ⊗ b_2  ⋯  a_M ⊗ b_M].
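A minimal NumPy sketch of this definition (the helper name khatri_rao and the example matrices are ours):

```python
import numpy as np

def khatri_rao(A, B):
    """Column-wise Kronecker product of two matrices that share
    the same number of columns."""
    return np.stack(
        [np.kron(A[:, m], B[:, m]) for m in range(A.shape[1])], axis=1
    )

A = np.arange(6.0).reshape(2, 3)
B = np.arange(9.0).reshape(3, 3)
C = khatri_rao(A, B)
assert C.shape == (6, 3)                  # (2*3) rows, 3 columns
# Column m is a_m kron b_m:
assert np.allclose(C[:, 1], np.kron(A[:, 1], B[:, 1]))
```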