IRTG 1792 Discussion Paper 2018-060
RESIDUAL'S INFLUENCE INDEX (RINFIN), BAD LEVERAGE AND UNMASKING IN HIGH DIMENSIONAL L2-REGRESSION
Yannis G. Yatracos
Abstract
In linear regression of $Y$ on $X \in \mathbb{R}^p$ with parameters $\beta \in \mathbb{R}^{p+1}$,
statistical inference is unreliable when observations are obtained
from a gross-error model, $F_{\epsilon,G} = (1-\epsilon)F + \epsilon G$, instead of the assumed
probability $F$; $G$ is the gross-error probability, $0 < \epsilon < 1$. When $G$ is unit
mass at $(x, y)$, the Residual's Influence Index, $\mathrm{RINFIN}(x, y; \epsilon, \beta)$, measures
the difference in small $x$-perturbations of the $L_2$-residual, $r(x, y)$,
for model $F$ and for $F_{\epsilon,G}$ via $r$'s $x$-partial derivatives. Asymptotic
properties are presented for the sample RINFIN, which is successful in
extracting indications of influential and bad leverage cases in microarray
data and simulated, high dimensional data. Its performance
improves as $p$ increases, and it can also be used in multiple response
linear regression. RINFIN's advantage is the following: in influence
functions of $L_2$-regression coefficients, each $x$-coordinate and $r(x, y)$
appear in a sum as a product, which has moderate size when $(x, y)$ is a bad
leverage case and masking makes $r(x, y)$ nearly vanish; RINFIN's
$x$-partial derivatives convert the product into a sum, allowing for unmasking.
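
The masking effect described in the abstract can be illustrated with a minimal numerical sketch. It does not compute RINFIN. It only shows, under assumed settings, that the $L_2$-residual at a bad leverage point nearly vanishes when the fit uses data from the gross-error model. The sample size, dimension, contamination fraction, the point `(x0, y0)`, and the helper names `ols_fit` and `residual` are all hypothetical choices made for illustration.

```python
import numpy as np

def ols_fit(X, y):
    """Least-squares coefficients (intercept first) for y regressed on X."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def residual(beta, x, y):
    """L2-regression residual r(x, y) = y - beta_0 - x' beta_(1:p)."""
    return y - (beta[0] + x @ beta[1:])

# Assumed illustration settings (not taken from the paper).
rng = np.random.default_rng(0)
n, p, eps = 200, 5, 0.05
beta_true = np.ones(p + 1)

# Sample from the assumed model F: standard normal covariates, linear response.
X = rng.normal(size=(n, p))
y = beta_true[0] + X @ beta_true[1:] + rng.normal(scale=0.5, size=n)

# Gross-error model (1 - eps) F + eps G with G a unit mass at (x0, y0):
# replace a fraction eps of the cases by the same bad leverage point.
m = int(eps * n)
x0 = np.full(p, 10.0)   # remote in x
y0 = 0.0                # far from the model's response at x0
Xc, yc = X.copy(), y.copy()
Xc[:m] = x0
yc[:m] = y0

beta_clean = ols_fit(X[m:], y[m:])   # fit without the contaminating cases
beta_cont = ols_fit(Xc, yc)          # fit on the contaminated sample

r_clean = residual(beta_clean, x0, y0)    # residual of (x0, y0) under the clean fit
r_masked = residual(beta_cont, x0, y0)    # residual of (x0, y0) under the contaminated fit

print("residual under clean fit:       ", r_clean)
print("residual under contaminated fit:", r_masked)
```

In this sketch the contaminated fit is pulled toward `(x0, y0)`, so `r_masked` is much smaller in absolute value than `r_clean`. A diagnostic that relies on the size of the residual alone would therefore miss the bad leverage case.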
Keywords:
Big Data, Data Science, Influence Function, Leverage, Masking, Residual's Influence Index (RINFIN)
JEL Classification:
AMS 2010 subject classifications:
62-07, 62-09, 62J05, 62F35, 62G35