The Wharton School

In 1881, American entrepreneur and industrialist Joseph Wharton established the world’s first collegiate school of business at the University of Pennsylvania — a radical idea that revolutionized both business practice and higher education.

Since then, the Wharton School has continued innovating to meet mounting global demand for new ideas, deeper insights, and transformative leadership. We blaze trails, from the nation’s first collegiate center for entrepreneurship in 1973 to our latest research centers in alternative investments and neuroscience.

Wharton's faculty members generate the intellectual innovations that fuel business growth around the world. Actively engaged with the leading global companies, governments, and non-profit organizations, they represent the world's most comprehensive source of business knowledge.

For more information, see the Research, Directory & Publications site.

Search results

  • Publication
    Large-Scale Multiple Testing of Correlations
    (2016-05-05) Cai, T. Tony; Liu, Weidong
    Multiple testing of correlations arises in many applications including gene coexpression network analysis and brain connectivity analysis. In this article, we consider large-scale simultaneous testing for correlations in both the one-sample and two-sample settings. New multiple testing procedures are proposed and a bootstrap method is introduced for estimating the proportion of the nulls falsely rejected among all the true nulls. We investigate the properties of the proposed procedures both theoretically and numerically. It is shown that the procedures asymptotically control the overall false discovery rate and false discovery proportion at the nominal level. Simulation results show that the methods perform well numerically in terms of both the size and power of the test and that they significantly outperform two alternative methods. The two-sample procedure is also illustrated by an analysis of a prostate cancer dataset for the detection of changes in coexpression patterns between gene expression levels. Supplementary materials for this article are available online.
  • Publication
    A Max-Norm Constrained Minimization Approach to 1-Bit Matrix Completion
    (2013-12-01) Cai, T. Tony; Zhou, Wen-Xin
    We consider in this paper the problem of noisy 1-bit matrix completion under a general non-uniform sampling distribution using the max-norm as a convex relaxation for the rank. A max-norm constrained maximum likelihood estimate is introduced and studied. The rate of convergence for the estimate is obtained. Information-theoretical methods are used to establish a minimax lower bound under the general sampling model. The minimax upper and lower bounds together yield the optimal rate of convergence for the Frobenius norm loss. Computational algorithms and numerical performance are also discussed.
  • Publication
    Optimal Rates of Convergence for Sparse Covariance Matrix Estimation
    (2012-01-01) Cai, T. Tony; Zhou, Harrison H
    This paper considers estimation of sparse covariance matrices and establishes the optimal rate of convergence under a range of matrix operator norm and Bregman divergence losses. A major focus is on the derivation of a rate sharp minimax lower bound. The problem exhibits new features that are significantly different from those that occur in the conventional nonparametric function estimation problems. Standard techniques fail to yield good results, and new tools are thus needed. We first develop a lower bound technique that is particularly well suited for treating “two-directional” problems such as estimating sparse covariance matrices. The result can be viewed as a generalization of Le Cam’s method in one direction and Assouad’s Lemma in another. This lower bound technique is of independent interest and can be used for other matrix estimation problems. We then establish a rate sharp minimax lower bound for estimating sparse covariance matrices under the spectral norm by applying the general lower bound technique. A thresholding estimator is shown to attain the optimal rate of convergence under the spectral norm. The results are then extended to the general matrix ℓ_w operator norms for 1 ≤ w ≤ ∞. In addition, we give a unified result on the minimax rate of convergence for sparse covariance matrix estimation under a class of Bregman divergence losses.
  • Publication
    Adaptive Wavelet Estimation: A Block Thresholding and Oracle Inequality Approach
    (1999) Cai, T. Tony
    We study wavelet function estimation via the approach of block thresholding and ideal adaptation with oracle. Oracle inequalities are derived and serve as guides for the selection of smoothing parameters. Based on an oracle inequality and motivated by the data compression and localization properties of wavelets, an adaptive wavelet estimator for nonparametric regression is proposed and the optimality of the procedure is investigated. We show that the estimator achieves simultaneously three objectives: adaptivity, spatial adaptivity and computational efficiency. Specifically, it is proved that the estimator attains the exact optimal rates of convergence over a range of Besov classes and the estimator achieves adaptive local minimax rate for estimating functions at a point. The estimator is easy to implement, at the computational cost of O(n). Simulation shows that the estimator has excellent numerical performance relative to more traditional wavelet estimators.
  • Publication
    Nonquadratic Estimators of a Quadratic Functional
    (2005-01-01) Cai, T. Tony; Low, Mark G
    Estimation of a quadratic functional over parameter spaces that are not quadratically convex is considered. It is shown, in contrast to the theory for quadratically convex parameter spaces, that optimal quadratic rules are often rate suboptimal. In such cases, minimax rate optimal procedures are constructed based on local thresholding. These nonquadratic procedures are sometimes fully efficient even when optimal quadratic rules have slow rates of convergence. Moreover, it is shown that when estimating a quadratic functional, nonquadratic procedures may exhibit different elbow phenomena than quadratic procedures.
  • Publication
    Simultaneous Testing of Grouped Hypotheses: Finding Needles in Multiple Haystacks
    (2009-01-01) Cai, T. Tony; Sun, Wenguang
    In large-scale multiple testing problems, data are often collected from heterogeneous sources and hypotheses form into groups that exhibit different characteristics. Conventional approaches, including the pooled and separate analyses, fail to efficiently utilize the external grouping information. We develop a compound decision theoretic framework for testing grouped hypotheses and introduce an oracle procedure that minimizes the false nondiscovery rate subject to a constraint on the false discovery rate. It is shown that both the pooled and separate analyses can be uniformly improved by the oracle procedure. We then propose a data-driven procedure that is shown to be asymptotically optimal. Simulation studies show that our procedures enjoy superior performance and yield the most accurate results in comparison with both the pooled and separate procedures. A real-data example with grouped hypotheses is studied in detail using different methods. Both theoretical and numerical results demonstrate that exploiting external information of the sample can greatly improve the efficiency of a multiple testing procedure. The results also provide insights on how the grouping information is incorporated for optimal simultaneous inference.
  • Publication
    Limiting Laws of Coherence of Random Matrices With Applications to Testing Covariance Structure and Construction of Compressed Sensing Matrices
    (2011-01-01) Cai, T. Tony; Jiang, Tiefeng
    Testing covariance structure is of significant interest in many areas of statistical analysis and construction of compressed sensing matrices is an important problem in signal processing. Motivated by these applications, we study in this paper the limiting laws of the coherence of an n × p random matrix in the high-dimensional setting where p can be much larger than n. Both the law of large numbers and the limiting distribution are derived. We then consider testing the bandedness of the covariance matrix of a high-dimensional Gaussian distribution which includes testing for independence as a special case. The limiting laws of the coherence of the data matrix play a critical role in the construction of the test. We also apply the asymptotic results to the construction of compressed sensing matrices.
  • Publication
    Discussion: “A Significance Test for the Lasso”
    (2014-01-01) Cai, T. Tony; Yuan, Ming
  • Publication
    Asymptotic Equivalence and Adaptive Estimation for Robust Nonparametric Regression
    (2009-01-01) Cai, T. Tony; Zhou, Harrison H
    Asymptotic equivalence theory developed in the literature so far is only for bounded loss functions. This limits the potential applications of the theory because many commonly used loss functions in statistical inference are unbounded. In this paper we develop asymptotic equivalence results for robust nonparametric regression with unbounded loss functions. The results imply that all the Gaussian nonparametric regression procedures can be robustified in a unified way. A key step in our equivalence argument is to bin the data and then take the median of each bin. The asymptotic equivalence results have significant practical implications. To illustrate the general principles of the equivalence argument we consider two important nonparametric inference problems: robust estimation of the regression function and the estimation of a quadratic functional. In both cases easily implementable procedures are constructed and are shown to enjoy simultaneously a high degree of robustness and adaptivity. Other problems such as construction of confidence sets and nonparametric hypothesis testing can be handled in a similar fashion.
  • Publication
    Optimal Estimation of the Mean Function Based on Discretely Sampled Functional Data: Phase Transition
    (2011-01-01) Cai, T. Tony; Yuan, Ming
    The problem of estimating the mean of random functions based on discretely sampled data arises naturally in functional data analysis. In this paper, we study optimal estimation of the mean function under both common and independent designs. Minimax rates of convergence are established and easily implementable rate-optimal estimators are introduced. The analysis reveals interesting and different phase transition phenomena in the two cases. Under the common design, the sampling frequency solely determines the optimal rate of convergence when it is relatively small and the sampling frequency has no effect on the optimal rate when it is large. On the other hand, under the independent design, the optimal rate of convergence is determined jointly by the sampling frequency and the number of curves when the sampling frequency is relatively small. When it is large, the sampling frequency has no effect on the optimal rate. Another interesting contrast between the two settings is that smoothing is necessary under the independent design, while, somewhat surprisingly, it is not essential under the common design.
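
As an illustration of one step named in the listing above, the abstract of "Asymptotic Equivalence and Adaptive Estimation for Robust Nonparametric Regression" describes binning the data and taking the median of each bin. The following is a minimal sketch of that step only, not the authors' procedure. It assumes the observations are already ordered by design point and that the bins are equal-width and contiguous; the function name `bin_medians` and the handling of leftover observations are hypothetical choices made for this example.

```python
from statistics import median


def bin_medians(y, num_bins):
    """Return the median of each of `num_bins` contiguous, equal-size bins of `y`.

    Assumption for this sketch: `y` is ordered by design point, and any
    trailing observations that do not fill a complete bin are dropped.
    """
    if num_bins <= 0 or num_bins > len(y):
        raise ValueError("num_bins must be between 1 and len(y)")
    size = len(y) // num_bins
    return [median(y[i * size:(i + 1) * size]) for i in range(num_bins)]


# One gross outlier (100) barely moves the bin medians.
medians = bin_medians([1, 2, 100, 4, 5, 6], num_bins=2)
print(medians)
```

The bin medians can then be passed to a regression procedure in place of the raw observations; which procedure to use, and how many bins to choose, is outside the scope of this sketch.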