Machine Learning Approaches
To determine the most appropriate classifier for each problem, we employ a variety of the algorithms available through the scikit-learn package. For each phenomenon we train using the simulated data sets, test with a sub-sample of the simulated data which is kept separate from that used for training, and finally test using a sample of real lightcurves from the XMM serendipitous source catalogue. For details of the observations and sources used for testing with real observational data, see Testing on sample of XMM SSC. Below we detail the results of this testing, listing for each classifier, in order: accuracy on simulated testing data; accuracy on observational data; purity on observational data; completeness on observational data; F1 score on observational data; and, optionally, the F beta score for imbalanced data sets.
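The workflow described above can be sketched as follows. This is a minimal illustration and not the project's actual pipeline: the synthetic features from `make_classification` stand in for features extracted from simulated lightcurves, the default hyperparameters are assumptions, and the held-out split plays the role of the separate simulated testing sample. Purity and completeness are taken here to mean precision and recall respectively.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    ExtraTreesClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.metrics import (
    accuracy_score,
    f1_score,
    fbeta_score,
    precision_score,
    recall_score,
)
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

# Placeholder data (assumption): an imbalanced two-class problem standing in
# for features derived from simulated lightcurves.
X, y = make_classification(
    n_samples=600, n_features=10, weights=[0.8, 0.2], random_state=0
)

# Keep the testing sub-sample separate from the data used for training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# The six classifier families listed in the results, with default settings.
classifiers = {
    "Ada Boost": AdaBoostClassifier(random_state=0),
    "Extra-Trees": ExtraTreesClassifier(random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
    "Neural Network": MLPClassifier(max_iter=500, random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Support Vector Machine": SVC(random_state=0),
}


def evaluate(y_true, y_pred, beta=0.5):
    """Return the metrics reported in this section for one set of predictions."""
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "purity": precision_score(y_true, y_pred, zero_division=0),
        "completeness": recall_score(y_true, y_pred, zero_division=0),
        "f1": f1_score(y_true, y_pred, zero_division=0),
        "f_beta": fbeta_score(y_true, y_pred, beta=beta, zero_division=0),
    }


results = {}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    results[name] = evaluate(y_test, clf.predict(X_test))

for name, scores in results.items():
    print(name, ", ".join(f"{value:.3f}" for value in scores.values()))
```

To reproduce the observational columns, the same `evaluate` call would be applied to predictions on the real XMM lightcurve sample instead of the held-out simulated split.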
Quasi-Periodic Eruptions
For this phenomenon we test using a combination of observations of QPE and QPO sources, as well as observations of AGN. This set contains 109 observations in total: 34 with eruptions and 75 without. As this phenomenon is expected to be very rare, we also present the F beta score for beta=0.5, which weights purity more heavily than completeness and so favours classifiers with fewer false positives.
- Ada Boost : 0.989, 0.784, 0.778, 0.259, 0.389, 0.556
- Extra-Trees : 0.987, 0.922, 0.828, 0.889, 0.857, 0.839
- Gradient Boosting : 0.992, 0.303, 0.278, 1.000, 0.432, 0.322
- Neural Network : 0.982, 0.755, 1.000, 0.074, 0.138, 0.286
- Random Forest : 0.989, 0.765, 0.800, 0.148, 0.250, 0.426
- Support Vector Machine : 0.879, 0.735, 0.000, 0.000, 0.000, 0.000
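The last two columns can be recovered from the purity and completeness columns. The sketch below assumes the standard definition of the F beta score, with purity treated as precision and completeness as recall; the helper name `f_beta` is hypothetical. Because the inputs are the rounded values from the list, the results agree with the listed scores only to within rounding.

```python
def f_beta(purity, completeness, beta=1.0):
    """F beta score from purity (precision) and completeness (recall).

    beta < 1 weights purity more heavily; beta = 1 gives the F1 score.
    """
    denominator = beta**2 * purity + completeness
    if denominator == 0:
        # No true positives at all (e.g. the Support Vector Machine row).
        return 0.0
    return (1 + beta**2) * purity * completeness / denominator


# Extra-Trees row: purity 0.828, completeness 0.889.
print(round(f_beta(0.828, 0.889), 3))            # compare with listed F1
print(round(f_beta(0.828, 0.889, beta=0.5), 3))  # compare with listed F beta
```

With beta=0.5 a classifier such as Gradient Boosting, which reaches full completeness at low purity, is penalised more strongly than it is by the F1 score.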
Quasi-Periodic Oscillations
For this phenomenon we test using a combination of observations of QPO and QPE sources, as well as observations of AGN. This set contains 109 observations in total: 18 with oscillations and 91 without. As this phenomenon is expected to be very rare, we also present the F beta score for beta=0.5, which weights purity more heavily than completeness and so favours classifiers with fewer false positives.
- Ada Boost : 0.783, 0.823, 0.000, 0.000, 0.000, 0.000
- Extra-Trees : 0.813, 0.627, 0.237, 0.500, 0.321, 0.265
- Gradient Boosting : 0.821, 0.814, 0.000, 0.000, 0.000, 0.000
- Neural Network : 0.650, 0.608, 0.211, 0.444, 0.286, 0.235
- Random Forest : 0.817, 0.490, 0.160, 0.444, 0.235, 0.183
- Support Vector Machine : 0.804, 0.823, 0.000, 0.000, 0.000, 0.000
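One way to compare the classifiers is to rank them by the final column. The sketch below copies the F beta (beta=0.5) values from the two lists in this section and sorts them; using this single score as the selection criterion is an assumption made for illustration, and the names `f_beta_scores` and `rank` are hypothetical.

```python
# F beta (beta = 0.5) on observational data, copied from the lists above.
f_beta_scores = {
    "Quasi-Periodic Eruptions": {
        "Ada Boost": 0.556,
        "Extra-Trees": 0.839,
        "Gradient Boosting": 0.322,
        "Neural Network": 0.286,
        "Random Forest": 0.426,
        "Support Vector Machine": 0.000,
    },
    "Quasi-Periodic Oscillations": {
        "Ada Boost": 0.000,
        "Extra-Trees": 0.265,
        "Gradient Boosting": 0.000,
        "Neural Network": 0.235,
        "Random Forest": 0.183,
        "Support Vector Machine": 0.000,
    },
}


def rank(scores):
    """Return (classifier, score) pairs sorted from highest to lowest score."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


best = {phenomenon: rank(scores)[0][0] for phenomenon, scores in f_beta_scores.items()}

for phenomenon, scores in f_beta_scores.items():
    print(phenomenon)
    for name, score in rank(scores):
        print(f"  {name}: {score:.3f}")
```

On this criterion Extra-Trees ranks first for both phenomena, although its F beta score for oscillations is far lower than for eruptions.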