Joint CIAM and Optimization and Systems Theory Seminar
November 15, 2007, 14.00-15.00, Room 3721, Lindstedtsvägen 25

Shankar Sastry
University of California, Berkeley, CA, USA E-mail:

Generalized Principal Component Analysis: An Introduction 

Many problems require modeling large amounts of data by what is referred to as a "mixture of models": the data can be segmented into finitely many subcomponents, each of which can be modeled separately. In the identification of hybrid systems, this situation arises naturally when the input-output behavior depends on the "discrete state" of the hybrid system. Applications in computer vision, signal and image processing, and more generally in statistics are extremely numerous. This area has seen a tremendous outpouring of effort and methods in recent years in the signal processing, hybrid systems, statistics and learning systems literature. However, it is our perception that the conceptual and theoretical underpinnings of the bulk of this literature are weak.

In a recent set of papers with Yi Ma of the University of Illinois at Urbana-Champaign, Rene Vidal of Johns Hopkins University, and their students, we have developed what we believe to be an interesting new approach to simultaneously segmenting and modeling data from mixtures of models. The heart of our approach lies in what is called "Generalized Principal Component Analysis" (GPCA). This in turn has many connections with classical results such as Hilbert's Nullstellensatz and with many unsolved problems in statistics. In my talk, I will give a brief overview of the approaches and their applications to date. The work is being incorporated into a monograph to appear in 2008 and a preview of this monograph is available at

 The website for the code for GPCA is

Last update: November 1, 2007 by Marie Lundin.