What is it about?

Covariance matrices are used throughout the physical and social sciences to measure how variables move together. Sampling error in an estimated covariance matrix, and particularly in its leading eigenvector, can give rise to inaccurate conclusions, especially in quadratic optimization. We develop a shrinkage formula that improves on the leading sample eigenvector as an estimate of the truth. In high dimensions this leads to greater accuracy in variance minimization. Our methods come with rigorous theoretical guarantees that do not depend on Gaussian assumptions.
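The shrinkage idea can be illustrated with a toy sketch, not the paper's exact estimator: pull the entries of the leading sample eigenvector toward a constant target (their common mean) and renormalize. The fixed intensity `c`, the function name, and the simulated one-factor data below are all illustrative assumptions.

```python
import numpy as np

def js_shrink_leading_eigenvector(X, c=0.5):
    """Illustrative James-Stein-style shrinkage of the leading sample
    eigenvector toward the constant vector. A sketch only; the paper
    derives a data-driven shrinkage intensity rather than a fixed c.

    X : (n, p) data matrix with n observations of p variables.
    c : assumed shrinkage intensity in [0, 1].
    """
    S = np.cov(X, rowvar=False)           # p x p sample covariance
    eigvals, eigvecs = np.linalg.eigh(S)  # eigenvalues in ascending order
    h = eigvecs[:, -1]                    # leading sample eigenvector
    if h.sum() < 0:                       # fix the sign convention
        h = -h
    target = np.full_like(h, h.mean())    # constant shrinkage target
    h_js = (1 - c) * h + c * target       # pull entries toward their mean
    return h_js / np.linalg.norm(h_js)    # renormalize to unit length

# Toy one-factor data: a common factor plus noise, so the true leading
# eigenvector is close to the constant direction.
rng = np.random.default_rng(0)
p, n = 50, 20
X = rng.standard_normal((n, 1)) @ np.ones((1, p)) \
    + 0.5 * rng.standard_normal((n, p))
h_js = js_shrink_leading_eigenvector(X)
```

In this toy setting, shrinking toward the constant target moves the estimate closer to the true (constant) direction than the raw sample eigenvector, which is the qualitative effect the paper quantifies.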


Why is it important?

Our shrinkage formulas generate improved estimates of the most significant direction of covariance in high-dimensional data. They in turn lead to better decisions and optimization results that are less affected by eigenvector bias. We illustrate the power of our estimators on optimized portfolios; potential applications to other high-dimensional estimation problems, such as genome-wide association studies and machine learning, await exploration.


Our results link the transformative works of economist Harry Markowitz and statistician Charles Stein from the 1950s. They solve practical problems that have been open for more than seven decades.

Lisa Goldberg
University of California Berkeley

Success in estimating covariances of high dimensional data is often elusive, especially with small sample sizes. Yet this problem arises naturally in disparate fields, especially portfolio optimization. This paper is part of a recent and fast-moving stream of research that is finding fresh success overcoming some long-standing difficulties.

Alec Kercheval
Florida State University

Read the Original

This page is a summary of: James–Stein for the leading eigenvector, Proceedings of the National Academy of Sciences, January 2023, DOI: 10.1073/pnas.2207046120.