Systematic Serendipity: A Test of Unsupervised Machine Learning as a Method for Anomaly Detection
- Publication Year :
- 2018
Abstract
- Advances in astronomy are often driven by serendipitous discoveries. As survey astronomy continues to grow, the size and complexity of astronomical databases will increase, and the ability of astronomers to manually scour data and make such discoveries decreases. In this work, we introduce a machine learning-based method to identify anomalies in large datasets to facilitate such discoveries, and apply this method to long cadence lightcurves from NASA's Kepler Mission. Our method clusters data based on density, identifying anomalies as data that lie outside of dense regions. This work serves as a proof-of-concept case study and we test our method on four quarters of the Kepler long cadence lightcurves. We use Kepler's most notorious anomaly, Boyajian's Star (KIC 8462852), as a rare 'ground truth' for testing outlier identification to verify that objects of genuine scientific interest are included among the identified anomalies. We evaluate the method's ability to identify known anomalies by identifying unusual behavior in Boyajian's Star, we report the full list of identified anomalies for these quarters, and present a sample subset of identified outliers that includes unusual phenomena, objects that are rare in the Kepler field, and data artifacts. By identifying <4% of each quarter as outlying data, we demonstrate that this anomaly detection method can create a more targeted approach in searching for rare and novel phenomena.
- Comment: 21 pages, 10 figures, Submitted to MNRAS
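The abstract's core idea, treating points that lie outside dense regions as anomalies, can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not name the clustering algorithm, the features, or any parameter values, so the function `density_outliers`, the thresholds `eps` and `min_neighbors`, and the synthetic two-dimensional "features" below are all assumptions chosen for illustration. The sketch only counts neighbours within a fixed radius and flags sparsely surrounded points; it does not form clusters.

```python
import math
import random


def density_outliers(points, eps, min_neighbors):
    """Return indices of points with fewer than `min_neighbors`
    other points within distance `eps` (hypothetical helper)."""
    outliers = []
    for i, p in enumerate(points):
        # Count every other point that falls inside the eps-ball around p.
        neighbors = sum(
            1
            for j, q in enumerate(points)
            if j != i and math.dist(p, q) <= eps
        )
        if neighbors < min_neighbors:
            outliers.append(i)
    return outliers


# Synthetic stand-in for per-lightcurve feature vectors (assumed, not Kepler data):
# one dense clump of 200 points plus two isolated points far from it.
rng = random.Random(0)
features = [(rng.gauss(0.0, 0.1), rng.gauss(0.0, 0.1)) for _ in range(200)]
features += [(3.0, 3.0), (-4.0, 2.5)]

# eps and min_neighbors are illustrative values, not taken from the paper.
flagged = density_outliers(features, eps=0.5, min_neighbors=5)
print(flagged)
```

The brute-force neighbour count is quadratic in the number of points, which is fine for a toy example but would need a spatial index or a library clusterer at survey scale.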
Details
- Database :
- arXiv
- Publication Type :
- Report
- Accession number :
- edsarx.1812.07156
- Document Type :
- Working Paper
- Full Text :
- https://doi.org/10.1093/mnras/sty3461