Subsampling the concurrent AdaBoost algorithm: An efficient approach for large datasets

Héctor Allende-Cid, Diego Acuña, Héctor Allende

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-reviewed

Abstract

In this work we propose a subsampled version of the Concurrent AdaBoost algorithm to handle large datasets efficiently. The proposal builds on a concurrent computing approach that improves the estimation of the distribution weights in the algorithm and thereby its generalization capacity. In each round, we train several weak hypotheses in parallel and use a weighted ensemble of them to update the distribution weights for the following boosting rounds. Instead of creating resamples of the same size as the original dataset, we subsample the dataset to speed up the training phase. We validate our proposal with different resample sizes on three datasets, obtaining promising results: the size of the resamples does not considerably affect the performance of the algorithm, while the execution time improves greatly.
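The round structure described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the update formulas, so the sketch assumes binary labels in {-1, +1}, decision stumps as weak hypotheses, the standard AdaBoost coefficient and exponential reweighting, and a per-round vote weighted by those coefficients. The names `fit`, `predict`, `ratio`, and `workers` are hypothetical.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def train_stump(X, y):
    """Fit a decision stump by exhaustive search over midpoint thresholds."""
    best = None
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        thresholds = [values[0] - 1.0] + [(a + b) / 2 for a, b in zip(values, values[1:])]
        for thr in thresholds:
            for pol in (1, -1):
                errs = sum(1 for row, lab in zip(X, y)
                           if (pol if row[j] > thr else -pol) != lab)
                if best is None or errs < best[0]:
                    best = (errs, j, thr, pol)
    _, j, thr, pol = best
    return lambda row: pol if row[j] > thr else -pol


def weighted_error(h, X, y, D):
    """Total distribution weight of the examples that h misclassifies."""
    return sum(d for row, lab, d in zip(X, y, D) if h(row) != lab)


def alpha_from_error(eps):
    """Standard AdaBoost coefficient (assumed here); eps is clamped to avoid log(0)."""
    eps = min(max(eps, 1e-10), 1 - 1e-10)
    return 0.5 * math.log((1 - eps) / eps)


def vote(members, row):
    """Sign of the weighted vote of (coefficient, hypothesis) pairs."""
    return 1 if sum(a * h(row) for a, h in members) >= 0 else -1


def fit(X, y, rounds=10, workers=4, ratio=0.5, seed=0):
    rng = random.Random(seed)
    n = len(X)
    m = max(1, int(ratio * n))   # subsample size: a fraction of the dataset
    D = [1.0 / n] * n            # distribution weights over the examples
    model = []

    def job(idx):
        return train_stump([X[i] for i in idx], [y[i] for i in idx])

    # Threads only illustrate the concurrent structure; in CPython a real
    # speed-up for CPU-bound training would need processes instead.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(rounds):
            # One subsample per worker, drawn according to D with replacement.
            samples = [rng.choices(range(n), weights=D, k=m) for _ in range(workers)]
            stumps = list(pool.map(job, samples))
            # Weight each weak hypothesis by its error on the full dataset.
            members = [(alpha_from_error(weighted_error(h, X, y, D)), h) for h in stumps]
            round_h = lambda row, members=members: vote(members, row)
            alpha = alpha_from_error(weighted_error(round_h, X, y, D))
            # Exponential reweighting driven by the round's ensemble.
            D = [d * math.exp(-alpha * lab * round_h(row)) for row, lab, d in zip(X, y, D)]
            z = sum(D)
            D = [d / z for d in D]
            model.append((alpha, round_h))
    return model


def predict(model, row):
    return vote(model, row)
```

Here `ratio` plays the role of the resample size studied in the paper: `ratio=1.0` would correspond to resamples as large as the original dataset, and smaller values reduce the cost of training each weak hypothesis.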

Original language: English
Title of host publication: Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications - 21st Iberoamerican Congress, CIARP 2016, Proceedings
Editors: Cesar Beltran-Castanon, Fazel Famili, Ingela Nystrom
Publisher: Springer Verlag
Pages: 318-325
Number of pages: 8
ISBN (print): 9783319522760
DOI
Status: Published - 2017
Externally published
Event: 21st Iberoamerican Congress on Pattern Recognition, CIARP 2016 - Lima, Peru
Duration: 8 Nov 2016 to 11 Nov 2016

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10125 LNCS
ISSN (print): 0302-9743
ISSN (electronic): 1611-3349

Conference

Conference: 21st Iberoamerican Congress on Pattern Recognition, CIARP 2016
Country/Territory: Peru
City: Lima
Period: 8 Nov 2016 to 11 Nov 2016
