Spectral Regularization Algorithms for Learning Large Incomplete Matrices
Rahul Mazumder
Department of Statistics, Stanford University
rahulm@stanford.edu

Trevor Hastie
Statistics Department and Department of Health, Research and Policy, Stanford University

hastie@stanford.edu

Robert Tibshirani
Department of Health, Research and Policy, and Statistics Department, Stanford University
tibs@stanford.edu

Submitted for publication July 30, 2009

Abstract
We use convex relaxation techniques to provide a sequence of regularized low-rank solutions for large-scale matrix completion problems. Using the nuclear norm as a regularizer, we provide a simple and very efficient convex algorithm for minimizing the reconstruction error subject to a bound on the nuclear norm. Our algorithm Soft-Impute iteratively replaces the missing elements with those obtained from a soft-thresholded SVD. With warm starts this allows us to efficiently compute an entire regularization path of solutions on a grid of values of the regularization parameter. The computationally intensive part of our algorithm is in computing a low-rank SVD of a dense matrix. Exploiting the problem structure, we show that the task can be performed with a complexity linear in the matrix dimensions. Our semidefinite-programming algorithm is readily scalable to large matrices: for example, it can obtain a rank-80 approximation of a 10^6 × 10^6 incomplete matrix with 10^5 observed entries in 2.5 hours, and can fit a rank-40 approximation to the full Netflix training set in 6.6 hours. Our methods show very good performance both in training and test error when compared to other competitive state-of-the-art techniques.
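To make the abstract's description concrete, here is a minimal numpy sketch of the Soft-Impute iteration: fill the missing entries with the current estimate, soft-threshold the singular values, and repeat, warm-starting along a decreasing grid of regularization values. The function name, grid, and stopping rule below are our illustrative choices, not the paper's implementation; we assume a dense array with NaN marking missing entries.

```python
import numpy as np

def soft_impute(X, lam, Z0=None, max_iter=100, tol=1e-4):
    """Sketch of one Soft-Impute path step.

    X   : matrix with np.nan at unobserved entries.
    lam : soft-threshold level on the singular values
          (the nuclear-norm regularization parameter).
    Z0  : optional warm start from a previous (larger) lam.
    """
    observed = ~np.isnan(X)
    Z = np.zeros_like(X) if Z0 is None else Z0
    for _ in range(max_iter):
        # Replace the missing elements with the current estimate's values.
        filled = np.where(observed, X, Z)
        # Soft-thresholded SVD: shrink every singular value toward zero by lam.
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        Z_new = (U * np.maximum(s - lam, 0.0)) @ Vt
        # Stop when the relative Frobenius-norm change is small.
        if np.linalg.norm(Z_new - Z) <= tol * max(np.linalg.norm(Z), 1e-12):
            return Z_new
        Z = Z_new
    return Z

# Tiny demo: a rank-1 matrix with ~30% of its entries removed,
# completed along a decreasing grid of lam values with warm starts.
rng = np.random.default_rng(0)
A = np.outer(rng.normal(size=8), rng.normal(size=6))
X = A.copy()
X[rng.random(A.shape) < 0.3] = np.nan
Z = None
for lam in [1.0, 0.5, 0.1]:
    Z = soft_impute(X, lam, Z0=Z)
```

The warm start is what makes computing a whole regularization path cheap: each solution is a good initialization for the next, smaller value of the parameter. In this dense sketch the SVD dominates the cost; the paper's scalability claim rests on exploiting the sparse-plus-low-rank structure of the filled matrix so that the low-rank SVD costs only linear time in the matrix dimensions.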

1. Introduction
In many applications measured data can be represented in a matrix $X_{m \times n}$, for which only a relatively small number of entries are observed. The problem is to “complete” the matrix based on the observed entries, and has been dubbed the matrix completion problem [CCS08, CR08, RFP07, CT09, KOM09, RS05]. The “Netflix” competition (e.g. [SN07]) is a popular example, where the data is the basis for a recommender system. The rows correspond to viewers and the columns to movies, with the entry $X_{ij}$ being the rating $\in \{1, \ldots, 5\}$ by viewer $i$ for movie $j$. There are 480K viewers and 18K movies, and hence 8.6 billion (8.6 × 10^9) potential entries. However, on average each viewer rates about 200 movies, so only 1.2%, or about 10^8 entries, are observed. The task is to predict the ratings that viewers would give to movies they have not yet rated.
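A quick back-of-the-envelope check of these sparsity figures, using the rounded counts quoted above rather than the exact Netflix numbers:

```python
viewers, movies = 480_000, 18_000
potential = viewers * movies        # 8.64e9, i.e. ~8.6 billion potential entries
observed = viewers * 200            # ~200 ratings per viewer -> ~9.6e7
print(f"{potential:.2e} potential, {observed / potential:.1%} observed")
# -> 8.64e+09 potential, 1.1% observed; the text's 1.2% corresponds to
#    the exact count of roughly 10^8 ratings in the full training set.
```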

These problems can be phrased as learning an unknown parameter (a matrix $Z_{m \times n}$) with very high dimensionality, based on very few observations. In order for such inference to be meaningful, we assume that the parameter $Z$ lies in a much lower dimensional manifold. In this paper, as is relevant in many real-life applications, we assume that $Z$ can be well represented by a matrix of low rank, i.e. $Z \approx V_{m \times k} G_{k \times n}$, where $k \ll \min(n, m)$. In this recommender-system example, low-rank structure suggests that movies can be grouped into a small number of “genres”, with $G_{\ell j}$ the relative score for movie $j$ in genre $\ell$. Viewer $i$, on the other hand, has an affinity $V_{i\ell}$ for genre $\ell$, and hence the modeled score for viewer $i$ on movie $j$ is the sum $\sum_{\ell=1}^{k} V_{i\ell} G_{\ell j}$ of genre affinities times genre scores. Typically we view the observed entries in $X$ as the corresponding entries from $Z$ contaminated with noise. Recently [CR08, CT09, KOM09] showed theoretically that under certain assumptions on the entries of the matrix, locations, and proportion of unobserved entries, the true underlying matrix can be recovered within very high accuracy. [SAJ05] studied generalization error bounds for learning low-rank matrices.

For a matrix $X_{m \times n}$ let $\Omega \subset \{1, \ldots, m\} \times \{1, \ldots, n\}$ denote the indices of observed entries. We consider the following optimization problem:

minimize    $\operatorname{rank}(Z)$
subject to  $\sum_{(i,j) \in \Omega} (X_{ij} - Z_{ij})^2 \le \delta$,    (1)

where $\delta \ge 0$ is a regularization parameter...
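To make the factorization $Z \approx V G$ concrete, here is a minimal numpy sketch of the genre model described above; the dimensions and random values are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, k = 6, 5, 2                    # 6 viewers, 5 movies, 2 "genres"
V = rng.normal(size=(m, k))          # viewer-by-genre affinities V_{il}
G = rng.normal(size=(k, n))          # genre-by-movie scores G_{lj}
Z = V @ G                            # modeled ratings; rank(Z) <= k

# Entry (i, j) is the sum over genres of affinity times genre score.
i, j = 3, 2
assert np.isclose(Z[i, j], sum(V[i, l] * G[l, j] for l in range(k)))
```

Problem (1) seeks the lowest-rank such $Z$ whose entries match the observed ratings to within total squared error $\delta$; since the rank objective is combinatorially hard, the paper replaces it with its convex surrogate, the nuclear norm.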