A new information theory based clustering fusion method for multi-view representations of text documents

Juan Francisco Zamora Osorio, Jérémie Sublime

Research output: Chapter in book/report/conference proceeding; Conference contribution; peer-reviewed

Abstract

Multi-view clustering is a complex problem that consists of extracting partitions from multiple representations of the same objects. In text mining and natural language processing, such views may take the form of word frequencies, topic-based representations, and many other encodings produced by various vector space model algorithms. In this paper, we propose a clustering fusion algorithm that takes clustering results obtained from multiple vector space models of a given set of documents and merges them into a single partition. Our fusion method relies on an information-theoretic model based on Kolmogorov complexity that was previously used for collaborative clustering applications. We apply our algorithm to several text corpora frequently used in the literature, with results that we find very satisfactory.

Original language: English
Title of host publication: Social Computing and Social Media. Design, Ethics, User Behavior, and Social Network Analysis - 12th International Conference, SCSM 2020, Held as Part of the 22nd HCI International Conference, HCII 2020, Proceedings
Editors: Gabriele Meiselwitz
Publisher: Springer
Pages: 156-167
Number of pages: 12
ISBN (print): 9783030495695
DOI
Publication status: Published - 2020
Externally published
Event: 12th International Conference on Social Computing and Social Media, SCSM 2020, held as part of the 22nd International Conference on Human-Computer Interaction, HCII 2020 - Copenhagen, Denmark
Duration: 19 Jul 2020 to 24 Jul 2020

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12194 LNCS
ISSN (print): 0302-9743
ISSN (electronic): 1611-3349

Conference

Conference: 12th International Conference on Social Computing and Social Media, SCSM 2020, held as part of the 22nd International Conference on Human-Computer Interaction, HCII 2020
Country/Territory: Denmark
City: Copenhagen
Period: 19/07/20 to 24/07/20
