Data mining technique for fast retrieval of similar waveforms in Fusion massive databases

J. Vega, A. Pereira, A. Portas, S. Dormido-Canto, G. Farias, R. Dormido, J. Sánchez, N. Duro, M. Santos, E. Sánchez, G. Pajares

Research output: Contribution to journal › Article › peer-review

17 Scopus citations


Fusion measurement systems generate similar waveforms for reproducible behavior. A major difficulty in data analysis is identifying, rapidly and automatically, a set of discharges with comparable behavior, i.e. discharges with "similar" waveforms. Here we introduce a new technique for rapid searching and retrieval of "similar" signals. The approach consists of building a classification system that avoids traversing the whole database in search of similarities. The classification system reduces the dimensionality of the problem (by means of waveform feature extraction) and restricts the search space to the most probable "similar" waveforms (by means of clustering techniques). In the search procedure, the input waveform is first assigned to one of the existing clusters. A similarity measure is then computed between the input signal and every element of that cluster in order to identify the most similar waveforms. The inner product of normalized vectors is used as the similarity measure because it makes the search independent of signal gain and polarity. This development has recently been applied to the TJ-II stellarator databases and has been integrated into the TJ-II remote participation system.
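The two-stage search described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the feature extraction or the clustering algorithm, so the sketch assumes that waveforms are already reduced to plain feature vectors (raw samples here), that clusters already exist, and that each cluster is represented by the mean of its members. All function names, cluster labels, and synthetic signals are hypothetical. Polarity independence is obtained here by taking the absolute value of the inner product, which is also an assumption.

```python
import math

def normalize(v):
    """Scale a vector to unit Euclidean length (an all-zero vector is returned as-is)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n > 0 else list(v)

def similarity(a, b):
    """Absolute inner product of the unit-normalized vectors, in [0, 1].

    Normalization removes the dependence on gain; the absolute value
    (an assumption of this sketch) removes the dependence on polarity.
    """
    return abs(sum(x * y for x, y in zip(normalize(a), normalize(b))))

def centroid(members):
    """Element-wise mean of the member vectors (assumed cluster representative)."""
    vectors = [vec for _, vec in members]
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def search(query, clusters, top_k=3):
    """Assign the query to the best-matching cluster, then rank only that cluster."""
    centroids = [centroid(members) for members in clusters]
    best = max(range(len(clusters)), key=lambda i: similarity(query, centroids[i]))
    scored = [(name, similarity(query, vec)) for name, vec in clusters[best]]
    scored.sort(key=lambda item: item[1], reverse=True)
    return best, scored[:top_k]

# Hypothetical synthetic "waveforms": two ramps and two Gaussian bumps.
t = [i / 63 for i in range(64)]
ramp = list(t)
bump = [math.exp(-((x - 0.5) ** 2) / 0.01) for x in t]
clusters = [
    [("ramp_a", ramp), ("ramp_b", [x + 0.05 * math.sin(20 * x) for x in t])],
    [("bump_a", bump), ("bump_b", [math.exp(-((x - 0.55) ** 2) / 0.01) for x in t])],
]

# Same shape as bump_a, but with a different gain and inverted polarity.
query = [-3.0 * x for x in bump]
cluster_id, hits = search(query, clusters)
```

Only the members of the selected cluster are compared with the query, which is what allows the search to avoid traversing the whole database. In this toy setup the scaled and inverted query is expected to land in the bump cluster and to match `bump_a` best.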

Original language: English
Pages (from-to): 132-139
Number of pages: 8
Journal: Fusion Engineering and Design
Issue number: 1
State: Published - Jan 2008
Externally published: Yes


  • Data mining
  • Fusion databases
  • Pattern recognition
  • Similar waveforms
  • TJ-II

