logo

SCIENCE CHINA Information Sciences, Volume 60, Issue 11: 112101(2017) https://doi.org/10.1007/s11432-016-9021-9

A generalized power iteration method for solving quadratic problem on the Stiefel manifold

More info
  • ReceivedOct 18, 2016
  • AcceptedJan 10, 2017
  • PublishedMay 5, 2017

Abstract

In this paper, we first propose a novel generalized power iteration (GPI) method to solve the quadratic problem on the Stiefel manifold (QPSM) as $\min_{W^{\rm~T}W=I}$ ${\rm~Tr}(W^{\rm~T}AW-2W^{\rm~T}B)$ along with the theoretical analysis. Accordingly, its special case known as the orthogonal least square regression (OLSR) is under further investigation. Based on the aforementioned studies, we then majorly focus on solving the unbalanced orthogonal procrustes problem (UOPP). As a result, not only a general convergent algorithm is derived theoretically but the efficiency of the proposed approach is verified empirically as well.


References

[1] Green B F. The orthogonal approximation of an oblique structure in factor analysis. Psychometrika, 1952, 17: 429-440 CrossRef Google Scholar

[2] Hurley J, Cattell R. The procrustes program: producing direct rotation to test a hypothesized factor structure. Behav Sci, 1962, 6: 258--262. Google Scholar

[3] Golub G H, van Loan C F. Matrix Computations. Baltimore: The Johns Hopkins University Press, 1989. Google Scholar

[4] Thomas V. Algorithms for the weighted orthogonal procrustes problem and other least squares problems. Dissertation for Ph.D. Degree. Umeå: Umeå University, 2006. Google Scholar

[5] Souza R C S N P, Leite S C, Borges C C H. Online algorithm based on support vectors for orthogonal regression. Pattern Recognition Lett, 2013, 34: 1394-1404 CrossRef Google Scholar

[6] Chu M T, Trendafilov N T. The Orthogonally Constrained Regression Revisited. J Comp Graphical Stat, 2001, 10: 746-771 CrossRef Google Scholar

[7] Green B, Goers J. A problem with congruence. In: Proceedings of the Annual Meeting of the Psychometric Society, Monterey, 1979. Google Scholar

[8] Park H. A parallel algorithm for the unbalanced orthogonal procrustes problem. Parallel Computing, 1991, 17: 913-923 CrossRef Google Scholar

[9] Bojanczyk A W, Lutoborski A. The Procrustes Problem for Orthogonal Stiefel Matrices. SIAM J Sci Comput, 1999, 21: 1291-1304 CrossRef Google Scholar

[10] Zhang Z, Du K. Successive projection method for solving the unbalanced Procrustes problem. SCI CHINA SER A, 2006, 49: 971-986 CrossRef ADS Google Scholar

[11] Zhang Z, Huang Y. A Projection Method for Least Squares Problems with a Quadratic Equality Constraint. SIAM J Matrix Anal Appl, 2003, 25: 188-212 CrossRef Google Scholar

[12] Xia Y, Han Y W. Partial Lagrangian relaxation for the unbalanced orthogonal Procrustes problem. Math Meth Oper Res, 2014, 79: 225-237 CrossRef Google Scholar

[13] Journée M, Nesterov Y, Richtárik P, et al. Generalized power method for sparse principal component analysis. J Mach Learn Res, 2008, 11: 517--553. Google Scholar

[14] Fiori S. Formulation and integration of learning differential equations on the stiefel manifold.. IEEE Trans Neural Netw, 2005, 16: 1697-1701 CrossRef PubMed Google Scholar

  • Figure 1

    (Color online) Comparisons of 6 different values of $\delta$ are performed under the GPI method with 3 different data matrices. (a) (50, 100, 30); (b) (80, 170, 80); (c) (40, 120, 60).

  • Figure 2

    (Color online) Comparisons of the convergence rate are performed for 6 approaches including EB [7], RSR [8], LSR [9], SP [10]LR [12]and our GPI method under 3 different data matrices. (a) (100, 10, 100); (b) (100, 15, 100); (c) (200, 15, 200).

  • Figure 3

    (Color online) Comparisons of PMCT [11]and GPI are performed over 2 different data matrices. (a) (900, 1000); (b) (2000, 1700).

  • Figure 4

    (Color online) CPU time comparison under case 3, dimension($n$, $m$).

  • Table 1   Orders of complexity for 6 algorithms$^{\rm~a)}$
    RSR [8] LSR [9] SP [10]
    Order of the complexity $O(mnk+m^3kt)$ $O(mnk+m^3kt)$ $O(mnk+(m^2n+m^3)t)$
    LR [12] EB [7] GPI (ours)
    Order of the complexity $O(m^2n+nk^2+m^2kt)$ $O(m^3+(m^2n+m^3)t)$ $O(m^2n+m^2kt)$
    a) $t$ stands for the iteration number and $(n,m,k)$ stands for the dimension.
  •   

    Algorithm 1 Generalized power iteration method (GPI)

    Input: The symmetric matrix $A\in\mathbb{R}^{m\times~m}$ and the matrix $B\in\mathbb{R}^{m\times~k}$.

    Initialize a random $W\in\mathbb{R}^{m\times~k}$ satisfying $W^{\rm~T}W=I_k$ and $\alpha$ via power method such that ${A}=\alpha~I_m-A\in\mathbb{R}^{m\times~m}$ is a positive definite matrix.

    (

    1) Update $M\leftarrow~2{A}W+2B$.

    (

    2) STATE Calculate $USV^{\rm~T}=M$ via the compact SVD method of $M$ where $U\in\mathbb{R}^{m\times~k}$, $S\in\mathbb{R}^{k\times~k}$ and $V\in\mathbb{R}^{k\times~k}$.

    (

    3) STATE Update $W\leftarrow~UV^{\rm~T}$.

    Iteratively perform the steps (1)–(3) until the algorithm converges.

  • Table 2   Comparisons of CPU time (s) under thesquare matrix $E$ for case 2$^{\rm~b)}$
    $(n=m=200)$ RSR[8] LSR[9] SP[10] LR[12] EB[7] GPI (ours)
    $k=10$ 64.940 23.426 2.386 0.541 0.337 0.228
    $k=15$ 136.020 21.635 3.221 1.134 0.347 0.226
    $k=20$ 229.851 20.560 5.054 1.806 0.445s 0.273
    $(n=m=1000)$ RSR[8] LSR[9] SP[10] LR[12] EB[7] GPI (ours)
    $k=10$ 842.849 132.232 3.869 11.440 1.290
    $k=15$ 851.231 196.761 5.180 12.534 1.434
    $k=20$ 860.746 260.132 7.700 12.625 1.575
    b) Iteration stops when $\Vert~EQ_{i-1}-G\Vert_F^2-\Vert~EQ_i-G\Vert_F^2\leq\tau$ where $~\tau=10^{-3}$.
  •   

    Algorithm 2 GPI for solving UOPP in (sect. 4.2)

    Input: The matrix $E\in\mathbb{R}^{n\times~m}$ and the matrix $G\in\mathbb{R}^{n\times~k}$ where $m>k$.

    Initialize $Q\in\mathbb{R}^{m\times~k}$ and $\gamma$ such that $Q^{\rm~T}Q=I_k$ and the matrix $\gamma~I_m-E^{\rm~T}E$ is positive definite, respectively.

    While not converge do

    (

    1) Update matrix $M\leftarrow~2(\gamma~I_m-E^{\rm~T}E)Q+2E^{\rm~T}G$.

    (

    2) Calculate $U\in\mathbb{R}^{m\times~k}$ and $V\in\mathbb{R}^{~k\times~k}$ via the compact SVD of $M$ as $M=USV^{\rm~T}$.

    (

    3) Update $Q\leftarrow~UV^{\rm~T}$.

    End while

    Return $Q$.

  • Table 3   Comparisons of CPU time (s) under the general dimension for case 2$^{\rm~b)}$
    Dimension $(n,m,k)$ RSR[8] LSR[9] SP[10] LR[12] EB[7] GPI (ours)
    $(5000,500,15)$ 713.156 528.034 450.028 20.709 16.554 3.581
    $(10000,1000,30)$ 56.772 191.970 9.384
    $(3000,3000,90)$ 186.125 395.401 17.320
    $(30000,1500,30)$ 306.132 1056.311 19.440
    $(5000,4000,100)$ 405.937 1187.512 30.128
    $(100000,3000,50)$ 215.173

Copyright 2020 Science China Press Co., Ltd. 《中国科学》杂志社有限责任公司 版权所有

京ICP备18024590号-1