Matteo Ruffini, Ricard Gavaldà, Esther Limón

In this paper we present a method for the unsupervised clustering of high-dimensional binary data, with a special focus on electronic healthcare records. The method is a robust and efficient heuristic based on tensor decomposition. We explain why, for tasks such as clustering patient records, this approach is preferable to the more commonly used distance-based methods. We run the algorithm on two datasets of healthcare records and obtain clinically meaningful results.