An Introduction to Envelopes: Dimension Reduction for Efficient Estimation in Multivariate Statistics

R. Dennis Cook

ISBN: 978-1-119-42296-9

Sep 2018

320 pages

$120.99

Description

Written by the leading expert in the field, this text reviews the major new developments in envelope models and methods.

An Introduction to Envelopes provides an overview of the theory and methods of envelopes, a class of procedures for increasing efficiency in multivariate analyses without altering traditional objectives. The author offers a balance between foundations and methodology by integrating illustrative examples that show how envelopes can be used in practice. He discusses how to use envelopes to target selected coefficients and explores predictor envelopes and their connection with partial least squares regression. The book reveals the potential for envelope methodology to improve estimation of a multivariate mean.

The text also covers how envelopes can be used in generalized linear models and in regressions with a matrix-valued response, and it reviews work on sparse and Bayesian response envelopes. In addition, the text explores relationships between envelopes and other dimension reduction methods, including canonical correlations, reduced-rank regression, supervised singular value decomposition, sufficient dimension reduction, principal components, and principal fitted components. This important resource:

•    Offers a text written by the leading expert in this field

•    Describes groundbreaking work that puts the focus on this burgeoning area of study

•    Covers the important new developments in the field and highlights the most important directions

•    Discusses the underlying mathematics and linear algebra

•    Includes an online companion site with both R and Matlab support

Written for researchers and graduate students in multivariate analysis and dimension reduction, as well as practitioners interested in statistical methodology, An Introduction to Envelopes is the first book on the theory and methods of envelopes.

Preface xv

Notation and Definitions xix

1 Response Envelopes 1

1.1 The Multivariate Linear Model 2

1.1.1 Partitioned Models and Added Variable Plots 5

1.1.2 Alternative Model Forms 6

1.2 Envelope Model for Response Reduction 6

1.3 Illustrations 10

1.3.1 A Schematic Example 10

1.3.2 Compound Symmetry 13

1.3.3 Wheat Protein: Introductory Illustration 13

1.3.4 Cattle Weights: Initial Fit 14

1.4 More on the Envelope Model 19

1.4.1 Relationship with Sufficiency 19

1.4.2 Parameter Count 19

1.4.3 Potential Gains 20

1.5 Maximum Likelihood Estimation 21

1.5.1 Derivation 21

1.5.2 Cattle Weights: Variation of the X-Variant Parts of Y 23 

1.5.3 Insights into Ê_Σ(B) 24

1.5.4 Scaling the Responses 25

1.6 Asymptotic Distributions 25

1.7 Fitted Values and Predictions 28

1.8 Testing the Responses 29

1.8.1 Test Development 29

1.8.2 Testing Individual Responses 32

1.8.3 Testing Containment Only 34

1.9 Nonnormal Errors 34

1.10 Selecting the Envelope Dimension, u 36

1.10.1 Selection Methods 36

1.10.1.1 Likelihood Ratio Testing 36

1.10.1.2 Information Criteria 37

1.10.1.3 Cross-validation 37

1.10.2 Inferring About rank(β) 38

1.10.3 Asymptotic Considerations 38

1.10.4 Overestimation Versus Underestimation of u 41

1.10.5 Cattle Weights: Influence of u 43

1.11 Bootstrap and Uncertainty in the Envelope Dimension 45

1.11.1 Bootstrap for Envelope Models 45

1.11.2 Wheat Protein: Bootstrap and Asymptotic Standard Errors, u Fixed 46

1.11.3 Cattle Weights: Bootstrapping u 47

1.11.4 Bootstrap Smoothing 48

1.11.5 Cattle Data: Bootstrap Smoothing 49

2 Illustrative Analyses Using Response Envelopes 51

2.1 Wheat Protein: Full Data 51

2.2 Berkeley Guidance Study 51

2.3 Banknotes 54

2.4 Egyptian Skulls 55

2.5 Australian Institute of Sport: Response Envelopes 58

2.6 Air Pollution 59

2.7 Multivariate Bioassay 63

2.8 Brain Volumes 65

2.9 Reducing Lead Levels in Children 67

3 Partial Response Envelopes 69

3.1 Partial Envelope Model 69

3.2 Estimation 71

3.2.1 Asymptotic Distribution of β̂₁ 72

3.2.2 Selecting u1 73

3.3 Illustrations 74

3.3.1 Cattle Weight: Incorporating Basal Weight 74

3.3.2 Men’s Urine 74

3.4 Partial Envelopes for Prediction 77

3.4.1 Rationale 77

3.4.2 Pulp Fibers: Partial Envelopes and Prediction 78

3.5 Reducing Part of the Response 79

4 Predictor Envelopes 81

4.1 Model Formulations 81

4.1.1 Linear Predictor Reduction 81

4.1.1.1 Predictor Envelope Model 83

4.1.1.2 Expository Example 83

4.1.2 Latent Variable Formulation of Partial Least Squares Regression 84

4.1.3 Potential Advantages 86

4.2 SIMPLS 88

4.2.1 SIMPLS Algorithm 88

4.2.2 SIMPLS When n < p 90

4.2.2.1 Behavior of the SIMPLS Algorithm 90

4.2.2.2 Asymptotic Properties of SIMPLS 91

4.3 Likelihood-Based Predictor Envelopes 94

4.3.1 Estimation 95

4.3.2 Comparisons with SIMPLS and Principal Component Regression 97

4.3.2.1 Principal Component Regression 98

4.3.2.2 SIMPLS 98

4.3.3 Asymptotic Properties 98

4.3.4 Fitted Values and Prediction 100

4.3.5 Choice of Dimension 101

4.3.6 Relevant Components 101

4.4 Illustrations 102

4.4.1 Expository Example, Continued 102

4.4.2 Australian Institute of Sport: Predictor Envelopes 103

4.4.3 Wheat Protein: Predicting Protein Content 105

4.4.4 Mussels’ Muscles: Predictor Envelopes 106

4.4.5 Meat Properties 109

4.5 Simultaneous Predictor–Response Envelopes 109

4.5.1 Model Formulation 109

4.5.2 Potential Gain 110

4.5.3 Estimation 113

5 Enveloping Multivariate Means 117

5.1 Enveloping a Single Mean 117

5.1.1 Envelope Structure 117

5.1.2 Envelope Model 119

5.1.3 Estimation 120

5.1.4 Minneapolis Schools 122

5.1.4.2 Four Untransformed Responses 124

5.1.5 Functional Data 126

5.2 Enveloping Multiple Means with Heteroscedastic Errors 126

5.2.1 Heteroscedastic Envelopes 126

5.2.2 Estimation 128

5.2.3 Cattle Weights: Heteroscedastic Envelope Fit 129

5.3 Extension to Heteroscedastic Regressions 130

6 Envelope Algorithms 133

6.1 Likelihood-Based Envelope Estimation 133

6.2 Starting Values 135

6.2.1 Choosing the Starting Value from the Eigenvectors of M̂ 135

6.2.2 Choosing the Starting Value from the Eigenvectors of M̂ + Û 137

6.2.3 Summary 138

6.3 A Non-Grassmann Algorithm for Estimating EM(V) 139

6.4 Sequential Likelihood-Based Envelope Estimation 141

6.4.1 The 1D Algorithm 141

6.4.2 Envelope Component Screening 142

6.4.2.1 ECS Algorithm 143

6.4.2.2 Alternative ECS Algorithm 144

6.5 Sequential Moment-Based Envelope Estimation 145

6.5.1 Basic Algorithm 145

6.5.2 Krylov Matrices and dim(V) = 1 147

6.5.3 Variations on the Basic Algorithm 147

7 Envelope Extensions 149

7.1 Envelopes for Vector-Valued Parameters 149

7.1.1 Illustrations 151

7.1.2 Estimation Based on a Complete Likelihood 154

7.1.2.1 Likelihood Construction 154

7.1.2.2 Aster Models 156

7.2 Envelopes for Matrix-Valued Parameters 157

7.3 Envelopes for Matrix-Valued Responses 160

7.3.1 Initial Modeling 161

7.3.2 Models with Kronecker Structure 163

7.3.3 Envelope Models with Kronecker Structure 164

7.4 Spatial Envelopes 166

7.5 Sparse Response Envelopes 168

7.5.1 Sparse Response Envelopes when r ≪ n 168

7.5.2 Cattle Weights and Brain Volumes: Sparse Fits 169

7.5.3 Sparse Envelopes when r > n 170

7.6 Bayesian Response Envelopes 171

8 Inner and Scaled Envelopes 173

8.1 Inner Envelopes 173

8.1.1 Definition and Properties of Inner Envelopes 174

8.1.2 Inner Response Envelopes 175

8.1.3 Maximum Likelihood Estimators 176

8.1.4 Race Times: Inner Envelopes 179

8.2 Scaled Response Envelopes 182

8.2.1 Scaled Response Model 183

8.2.2 Estimation 184

8.2.3 Race Times: Scaled Response Envelopes 185

8.3 Scaled Predictor Envelopes 186

8.3.1 Scaled Predictor Model 187

8.3.2 Estimation 188

8.3.3 Scaled SIMPLS Algorithm 189

9 Connections and Adaptations 191

9.1 Canonical Correlations 191

9.1.1 Construction of Canonical Variates and Correlations 191

9.1.2 Derivation of Canonical Variates 193

9.1.3 Connection to Envelopes 194

9.2 Reduced-Rank Regression 195

9.2.1 Reduced-Rank Model and Estimation 195

9.2.2 Contrasts with Envelopes 196

9.2.3 Reduced-Rank Response Envelopes 197

9.2.4 Reduced-Rank Predictor Envelopes 199

9.3 Supervised Singular Value Decomposition 199

9.4 Sufficient Dimension Reduction 202

9.5 Sliced Inverse Regression 204

9.5.1 SIR Methodology 204

9.5.2 Mussels’ Muscles: Sliced Inverse Regression 205

9.5.3 The “Envelope Method” 206

9.5.4 Envelopes and SIR 207

9.6 Dimension Reduction for the Conditional Mean 207

9.6.1 Estimating One Vector in SE(Y|X) 208

9.6.2 Estimating SE(Y|X) 209

9.7 Functional Envelopes for SDR 211

9.7.1 Functional SDR 211

9.7.2 Functional Predictor Envelopes 211

9.8 Comparing Covariance Matrices 212

9.8.1 SDR for Covariance Matrices 213

9.8.2 Connections with Envelopes 215

9.8.3 Illustrations 216

9.8.4 SDR for Means and Covariance Matrices 217

9.9 Principal Components 217

9.9.1 Introduction 217

9.9.2 Random Latent Variables 219

9.9.2.1 Envelopes 220

9.9.2.2 Envelopes with Isotropic Intrinsic and Extrinsic Variation 222

9.9.2.3 Envelopes with Isotropic Intrinsic Variation 223

9.9.2.4 Selection of the Dimension u 225

9.9.3 Fixed Latent Variables and Isotropic Errors 225

9.9.4 Numerical Illustrations 226

9.10 Principal Fitted Components 229

9.10.1 Isotropic Errors, Σ_{X|Y} = σ²I_p 230

9.10.2 Anisotropic Errors, Σ_{X|Y} > 0 231

9.10.3 Nonnormal Errors and the Choice of f 232

9.10.3.1 Graphical Choices 232

9.10.3.2 Basis Functions 232

9.10.3.3 Categorical Response 232

9.10.3.4 Sliced Inverse Regression 233

9.10.4 High-Dimensional PFC 233

Appendix A Envelope Algebra 235

A.1 Invariant and Reducing Subspaces 235

A.2 M-Envelopes 240

A.3 Relationships Between Envelopes 241

A.3.1 Invariance and Equivariance 241

A.3.2 Direct Sums of Envelopes 244

A.3.3 Coordinate Reduction 244

A.4 Kronecker Products, vec and vech 246

A.5 Commutation, Expansion, and Contraction Matrices 248

A.6 Derivatives 249

A.6.1 Derivatives for 𝜂, Ω, and Ω0 249

A.6.2 Derivatives with Respect to Γ 250

A.6.3 Derivatives of Grassmann Objective Functions 251

A.7 Miscellaneous Results 252

A.8 Matrix Normal Distribution 255

A.9 Literature Notes 256

Appendix B Proofs for Envelope Algorithms 257

B.1 The 1D Algorithm 257

B.2 Sequential Moment-Based Algorithm 262

B.2.1 First Direction Vector w_1 263

B.2.2 Second Direction Vector w_2 263

B.2.3 (q + 1)st Direction Vector w_{q+1}, q < u 264

B.2.4 Termination 265

Appendix C Grassmann Manifold Optimization 267

C.1 Gradient Algorithm 268

C.2 Construction of B 269

C.3 Construction of exp{𝛿A(B)} 271

C.4 Starting and Stopping 272

Bibliography 273

Author Index 283

Subject Index 287