Humboldt-Universität zu Berlin - High Dimensional Nonstationary Time Series

IRTG 1792 Discussion Paper 2018-018

Adaptive Nonparametric Clustering

Kirill Efimov
Larisa Adamyan
Vladimir Spokoiny

This paper presents a new approach to non-parametric cluster analysis
called Adaptive Weights Clustering (AWC). The idea is to identify the
clustering structure by checking, at different points and for different scales,
for departures from local homogeneity. The proposed procedure describes
the clustering structure in terms of weights $w_{ij}$, each of which measures
the degree of local inhomogeneity for two neighboring local clusters using
statistical tests of "no gap" between them. The procedure starts from
a very local scale, then the locality parameter grows by some factor
at each step. The method is fully adaptive and does not require
specifying the number of clusters or their structure. The clustering results
are not sensitive to noise and outliers, and the procedure is able to recover
different clusters with sharp edges or manifold structure. The method
is scalable and computationally feasible. An intensive numerical study
shows state-of-the-art performance of the method on various artificial
examples and in applications to text data. Our theoretical study establishes the
optimal sensitivity of AWC to local inhomogeneity.
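
To make the multiscale idea concrete, the following is a minimal sketch of an adaptive-weights loop on one-dimensional data. It is an illustration under stated assumptions, not the procedure analyzed in the paper: the function names, the parameters `h0`, `factor`, `n_steps` and `lam`, and in particular the overlap-fraction rule used in place of the statistical "no gap" test are all hypothetical simplifications. The rule compares the observed share of points in the overlap of two local clusters with the share `q` expected for two intervals of radius `h_prev` under a homogeneous density.

```python
def adaptive_weights_1d(x, h0=0.15, factor=1.5, n_steps=9, lam=0.5):
    """Toy adaptive-weights loop (illustrative stand-in, not the paper's test).

    Assumes 1 < factor < 2 so that two candidate neighbors at the new scale
    still have overlapping intervals at the previous scale.
    """
    n = len(x)
    dist = [[abs(a - b) for b in x] for a in x]
    h = h0
    # Start from a very local scale: connect only points closer than h0.
    w = [[1 if dist[i][j] <= h else 0 for j in range(n)] for i in range(n)]
    for _ in range(n_steps):
        h_prev, h = h, h * factor  # the locality parameter grows by a factor
        # Local cluster of point i: all points currently linked to i.
        local = [{l for l in range(n) if w[i][l]} for i in range(n)]
        new_w = [[0] * n for _ in range(n)]
        for i in range(n):
            new_w[i][i] = 1
            for j in range(i + 1, n):
                d = dist[i][j]
                if d > h:
                    continue  # outside the current scale: weight stays 0
                # Observed share of points lying in both local clusters.
                theta = len(local[i] & local[j]) / len(local[i] | local[j])
                # Share expected for two intervals of radius h_prev whose
                # centers are d apart, under a homogeneous density.
                q = max(0.0, (2 * h_prev - d) / (2 * h_prev + d))
                # Hypothetical "no gap" rule: keep the link unless the
                # overlap is much emptier than expected.
                if theta >= lam * q:
                    new_w[i][j] = new_w[j][i] = 1
        w = new_w
    return w


def connected_components(w):
    """Label points by connected components of the final weight graph."""
    n = len(w)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if w[i][j] and labels[j] == -1:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Two evenly spaced groups separated by an empty stretch.
x = [i / 10 for i in range(10)] + [3 + i / 10 for i in range(10)]
w = adaptive_weights_1d(x)
labels = connected_components(w)
print(labels)
```

On this toy input the weights between the two groups are never switched on, so the labels split the points into two groups without the number of clusters being supplied. The threshold `lam` plays the role of a tuning parameter here; its value in this sketch is an arbitrary choice.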

Keywords: adaptive weights, clustering, gap coefficient, manifold clustering

JEL classification:

AMS 2000 Subject Classification:
Primary 62H30. Secondary 62G10