Databases in fusion experiments are made up of thousands of signals. For this reason, data analysis must be simplified by developing automatic mechanisms for fast search and retrieval of specific data in the waveform database. In particular, a method for finding similar waveforms would be very helpful. The term 'similar' implies the use of proximity measures to quantify how close two signals are. In this way, it would be possible to define several categories (clusters) and to classify the waveforms according to them; this classification can serve as a starting point for exploratory data analysis in large databases. The clustering process is divided into two stages. The first is feature extraction, i.e., choosing the set of properties that encodes as much information as possible about a signal. The second establishes the number of clusters according to a proximity measure.
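The two stages can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the text: the features are the magnitudes of the first few Fourier coefficients of a normalized waveform, the proximity measure is the Euclidean distance, and the number of clusters is established by a simple threshold-based "leader" scheme. The names `extract_features`, `leader_clustering`, `n_coeffs` and `threshold` are hypothetical, and the waveforms are synthetic.

```python
import numpy as np


def extract_features(signal, n_coeffs=8):
    """Stage 1 (assumed choice): magnitudes of the first Fourier coefficients."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()                 # remove the offset
    norm = np.linalg.norm(x)
    if norm > 0:
        x = x / norm                 # remove the amplitude scale
    spectrum = np.abs(np.fft.rfft(x))
    return spectrum[:n_coeffs]


def leader_clustering(features, threshold):
    """Stage 2 (assumed choice): threshold-based clustering on Euclidean distance.

    A vector joins the nearest existing cluster leader if it lies within
    `threshold`; otherwise it starts a new cluster. The number of clusters
    therefore follows from the proximity measure and the threshold.
    """
    leaders = []
    labels = []
    for f in features:
        if leaders:
            dists = [np.linalg.norm(f - leader) for leader in leaders]
            best = int(np.argmin(dists))
            if dists[best] <= threshold:
                labels.append(best)
                continue
        leaders.append(f)
        labels.append(len(leaders) - 1)
    return labels, len(leaders)


# Synthetic waveforms (not real experimental signals).
t = np.linspace(0.0, 1.0, 256, endpoint=False)
waveforms = [
    np.sin(2 * np.pi * 3 * t),
    1.05 * np.sin(2 * np.pi * 3 * t),   # same shape, slightly larger amplitude
    np.sin(2 * np.pi * 7 * t),
    np.sin(2 * np.pi * 7 * t) + 2.0,    # same shape, shifted offset
]

features = [extract_features(w) for w in waveforms]
labels, n_clusters = leader_clustering(features, threshold=0.5)
print(labels, n_clusters)
```

In this sketch the two 3-cycle sinusoids fall into one cluster and the two 7-cycle sinusoids into another, because the normalization in the feature stage discards offset and amplitude before distances are compared. A different feature set or proximity measure would change the grouping.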
- Feature extraction
- TJ-II signals