Abstract
lassopack is a suite of programs for penalized regression methods suitable for the high-dimensional setting where the number of predictors p may be large and possibly greater than the number of observations. The pdslasso package contains routines for estimating structural parameters in linear models with many controls and/or instruments. The lassopack package consists of six main programs: lasso2 implements lasso, square-root lasso, elastic net, ridge regression, adaptive lasso and post-estimation OLS. cvlasso supports K-fold cross-validation and rolling cross-validation for cross-section, panel and time-series data. rlasso implements theory-driven penalization for the lasso and square-root lasso for cross-section and panel data. lassologit, cvlassologit and rlassologit are the corresponding programs for logistic lasso regression. The lasso (Least Absolute Shrinkage and Selection Operator, Tibshirani 1996), the square-root lasso (Belloni et al. 2011) and the adaptive lasso (Zou 2006) are regularization methods that use L1 norm penalization to achieve sparse solutions: of the full set of p predictors, typically most will have coefficients set to zero. Ridge regression (Hoerl & Kennard 1970) relies on L2 norm penalization; the elastic net (Zou & Hastie 2005) uses a mix of L1 and L2 penalization. lasso2 implements all these estimators. rlasso uses the theory-driven penalization methodology of Belloni et al. (2012, 2013, 2014, 2016) for the lasso and square-root lasso. cvlasso implements K-fold cross-validation and h-step ahead rolling cross-validation (for time-series and panel data) to choose the penalization parameters for all the implemented estimators. lassologit, rlassologit and cvlassologit extend support to the case where the dependent variable is a binary response. In addition, rlasso implements the Chernozhukov et al. (2013) sup-score test of joint significance of the regressors that is suitable for the high-dimensional setting.
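To illustrate why L1 penalization yields sparse solutions while L2 penalization only shrinks coefficients, the following is a minimal sketch of elastic-net coordinate descent in plain Python. It is not the lassopack implementation: the function names (`soft_threshold`, `elastic_net`), the objective scaling (1/(2n) times the residual sum of squares plus `lam` times the penalty) and the toy data are assumptions made for illustration, and the scaling of the penalty in lasso2 may differ. Setting `alpha=1` gives the lasso and `alpha=0` gives ridge regression.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t; return exactly 0.0 when |z| <= t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def elastic_net(X, y, lam, alpha=1.0, n_iter=200):
    """Coordinate descent for an assumed objective (no intercept):
    (1/(2n)) * RSS + lam * (alpha * |b|_1 + (1 - alpha)/2 * |b|_2^2).
    X is a list of rows, y a list of outcomes.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residuals for beta = 0
    col_ss = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_ss[j] == 0.0:
                continue  # skip all-zero columns
            # Correlation of column j with the partial residual (beta_j added back).
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new = soft_threshold(rho, lam * alpha) / (col_ss[j] + lam * (1.0 - alpha))
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta


# Toy data (made up): three orthogonal predictors, one strong and two weak effects.
# y = 2.0 * x1 + 0.1 * x2 - 0.2 * x3
X = [[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]
y = [1.9, 2.1, -1.7, -2.3]

beta_lasso = elastic_net(X, y, lam=0.5, alpha=1.0)  # L1 penalty only
beta_ridge = elastic_net(X, y, lam=1.0, alpha=0.0)  # L2 penalty only
print("lasso:", beta_lasso)
print("ridge:", beta_ridge)
```

On this toy design the lasso fit sets the two weak coefficients exactly to zero and shrinks the strong one, whereas the ridge fit shrinks all three but leaves every coefficient nonzero. The example uses fewer predictors than observations purely for readability.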
The pdslasso package consists of two programs: pdslasso and ivlasso are routines for estimating structural parameters in linear models with many controls and/or instruments. The routines use methods for estimating sparse high-dimensional models, specifically the lasso (Least Absolute Shrinkage and Selection Operator, Tibshirani 1996) and the square-root lasso (Belloni et al. 2011, 2014). These estimators are used to select controls (pdslasso) and/or instruments (ivlasso) from a large set of variables (possibly numbering more than the number of observations), in a setting where the researcher is interested in estimating the causal impact of one or more (possibly endogenous) causal variables of interest. Two approaches are implemented in pdslasso and ivlasso: (1) The "post-double-selection" (PDS) methodology of Belloni et al. (2012, 2013, 2014, 2015, 2016). (2) The "post-regularization" (CHS) methodology of Chernozhukov, Hansen and Spindler (2015). For instrumental variable estimation, ivlasso implements weak-identification-robust hypothesis tests and confidence sets using the Chernozhukov et al. (2013) sup-score test. The implementation of these methods in pdslasso and ivlasso requires the Stata program rlasso (available in the separate Stata module lassopack).
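As a rough guide to what "post-double-selection" involves, the steps can be sketched as follows for the simplest case of one exogenous causal variable d and a large set of candidate controls x. This follows the way the method is usually presented in the cited literature; the notation is illustrative and is not taken from the package, and the details of how pdslasso chooses penalties or handles multiple and endogenous variables are not shown.

```latex
% Sketch of post-double-selection for one causal variable d and controls x_1, ..., x_p.
% The symbols \alpha, \beta, \gamma, \hat{I}_1, \hat{I}_2 are illustrative notation.
\begin{align*}
\text{Model:}\quad  & y_i = \alpha d_i + x_i'\beta + \varepsilon_i \\
\text{Step 1:}\quad & \text{lasso of } y_i \text{ on } x_i
    \;\Rightarrow\; \hat{I}_1 = \{\, j : \hat{\beta}_j \neq 0 \,\} \\
\text{Step 2:}\quad & \text{lasso of } d_i \text{ on } x_i
    \;\Rightarrow\; \hat{I}_2 = \{\, j : \hat{\gamma}_j \neq 0 \,\} \\
\text{Step 3:}\quad & \text{OLS of } y_i \text{ on } d_i \text{ and }
    \{\, x_{ij} : j \in \hat{I}_1 \cup \hat{I}_2 \,\}
    \;\Rightarrow\; \hat{\alpha}
\end{align*}
```

Using the union of the two selected sets in the final regression is what distinguishes this from a single selection step: a control is kept if it predicts either the outcome or the causal variable.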
Original language  English 

Place of Publication  Boston, USA 
Publisher  Boston College Department of Economics 
Media of output  Online 
Publication status  Published  24 Jan 2019 
Keywords
 econometrics
 high-dimensional models
 inference
 lasso
 elastic net
 sparsity
Fingerprint
Dive into the research topics of 'PDSLASSO & LASSOPACK: Stata module for post-selection and post-regularization OLS or IV estimation and inference'. Together they form a unique fingerprint.
Profiles

Mark Edwin Schaffer
 School of Social Sciences, Edinburgh Business School  Professor
 School of Social Sciences  Professor
Person: Academic (Research & Teaching)