- Open Access
- Authors : Rahul Noronha, Sahana M, Sandesh Bhat, Karishma Chavhan
- Paper ID : IJERTV11IS040238
- Volume & Issue : Volume 11, Issue 04 (April 2022)
- Published (First Online): 06-05-2022
- ISSN (Online) : 2278-0181
- Publisher Name : IJERT
- License: This work is licensed under a Creative Commons Attribution 4.0 International License
Comparison of Various Techniques to Classify Galaxies based on Morphology and to Detect Potential Exoplanets
Karishma Chavhan
Dept. of Computer Science and Engineering, Dayananda Sagar University, School of Engineering, Bengaluru, India
Sandesh Bhat
Dept. of Computer Science and Engineering, Dayananda Sagar University, School of Engineering, Bengaluru, India
Rahul Noronha
Dept. of Computer Science and Engineering, Dayananda Sagar University, School of Engineering, Bengaluru, India
Sahana M
Dept. of Computer Science and Engineering, Dayananda Sagar University, School of Engineering, Bengaluru, India
Abstract: In this paper, we examine which method is most suitable for classifying galaxies by morphology into spiral, elliptical, and irregular shapes. We also determine which method works best for detecting potential exoplanets.
Keywords: Galaxy morphology, Exoplanet Detection, ImageNet, Artificial Neural Network, Hubble type, Decision trees.
INTRODUCTION
We adopt a transfer learning approach and use the ResNet50 model on the crowdsourced Galaxy Zoo dataset. For the potential exoplanet detection task, we compare tree-based methods, Naive Bayes, and logistic regression. Alongside these machine learning models, we also use deep learning models such as a perceptron (artificial neuron) and compare the results.
A. Abbreviations
Abbreviations used:
ANN – Artificial Neural Network.
ResNet50 – Residual Neural Network (50 layers).
ResNet152 – Residual Neural Network (152 layers).
Xception – Extreme Inception.
KNN – K-Nearest Neighbors.
RBF – Radial Basis Function.
PROBLEM STATEMENT
To determine which method is best for classifying galaxies based on their morphology, and to examine which method performs better for potential exoplanet detection. To use an ANN for potential exoplanet detection, and to try different ImageNet models such as ResNet152 and Xception to see how they compare with ResNet50.
LITERATURE SURVEY
We surveyed the different methods available for classifying galaxies based on their morphology. Looking through the relevant research papers on potential exoplanet detection, we identified some key techniques, taking into account their time and resource usage. We cover a few methods for each of these two tasks.
Galaxy Morphology
Rules-Based Approach
In the first method, we use a rules-based approach in which we derive the Hubble type by following the Galaxy Zoo decision tree. In the crowdsourced Galaxy Zoo 2 project, online participants are shown an image of a galaxy and are first asked whether it is simply smooth and rounded with no sign of a disk. Depending on each response, a follow-up question is asked about the same image, until the galaxy is finally classified as spiral, elliptical, or irregular. A small drawback is that labelling by non-experts may introduce human error.
Fig. 1. The Hubble type decision tree.
Fig. 2. Flowchart showing the rules-based mapping onto Hubble type
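As a rough illustration of how a rules-based mapping can turn crowd responses into a class, the sketch below applies threshold rules to vote fractions. The dictionary keys (`smooth`, `features_or_disk`, `spiral_arms`, `irregular`), the 0.5 threshold, and the order of the rules are hypothetical simplifications chosen for this example. They are not the actual Galaxy Zoo decision tree, nor the mapping shown in Fig. 2.

```python
def classify_galaxy(votes, threshold=0.5):
    """Map crowd vote fractions (0..1) to a coarse morphology label.

    `votes` is a dict with hypothetical keys; missing keys count as 0.0.
    """
    smooth = votes.get("smooth", 0.0)
    disk = votes.get("features_or_disk", 0.0)
    spiral = votes.get("spiral_arms", 0.0)
    irregular = votes.get("irregular", 0.0)

    # Rule 1: most participants saw a smooth, rounded galaxy with no disk.
    if smooth >= threshold:
        return "elliptical"
    # Rule 2: most saw a disk or features, and most saw spiral arms.
    if disk >= threshold and spiral >= threshold:
        return "spiral"
    # Rule 3: most flagged the shape as irregular.
    if irregular >= threshold:
        return "irregular"
    # No rule fired: leave the galaxy unlabelled rather than guess.
    return "unclassified"


print(classify_galaxy({"smooth": 0.82, "features_or_disk": 0.15}))
print(classify_galaxy({"features_or_disk": 0.9, "spiral_arms": 0.7}))
```

Returning "unclassified" when no rule fires keeps ambiguous galaxies out of the labelled set, which is one possible way to limit the effect of noisy non-expert votes.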
Transfer Learning using ImageNet Models
In the second method, we use one of the available pre-trained ImageNet models (see the ImageNet Models page of the Neural Network Libraries 1.25.0 documentation, nnabla.readthedocs.io) with a transfer learning approach. Because training deep learning models takes considerable time and resources, transfer learning has emerged as a popular technique, especially in this field. In transfer learning, a neural network trained on one set of images is repurposed for a different use case. This is done by removing the output layer of the ImageNet model and replacing it with a new output layer when only a small amount of labelled data is available, or by replacing the last few, or even many, of the ImageNet model's layers when a large amount of labelled data is available. ResNet50 is an ImageNet model that appeared around the time of Galaxy Zoo and can be used effectively for galaxy morphology classification. Other ImageNet models, such as ResNet152 and Xception, could also potentially be used for this purpose.
Fig. 3. ResNet50 and ResNet152 architectures alongside other ResNet model architectures.
Fig. 4. Xception ImageNet architecture
Hubble Tuning Fork method
In the third method, we use the Hubble tuning fork classification scheme for galaxies. Here, techniques such as PCA (principal component analysis) are applied first, after which artificial neural networks are trained using locally weighted regression. However, to map the 37 vectors used onto the Hubble classification scheme, we rely on the first two methods: the rules-based approach and the transfer learning approach.
Fig. 5. Hubble Classification Scheme
Potential Exoplanet Detection
Planets orbiting stars outside our solar system are called extrasolar planets or exoplanets. Astronomers have proposed several approaches for detecting them, of which the fine-grained analysis of periodicities in stellar light curves has been the most successful so far.
The methods available for potential exoplanet detection are:
Transient Light curve analysis
The Hubble telescope captures the light coming from a star. By analysing how the light reaching the telescope varies over time (the light curve), we can determine whether an object such as an exoplanet has passed in front of the star. This method is resource intensive and requires more training time. CNNs are usually used for this light curve analysis.
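To make the idea of a dip in a light curve concrete, the following minimal sketch flags samples whose brightness falls below the typical level. It uses a simple median threshold rather than a CNN, so it only illustrates the signal being looked for. The function name `find_dips`, the 1% depth threshold, and the synthetic flux values are assumptions made for this example.

```python
from statistics import median


def find_dips(flux, depth=0.01):
    """Return indices where flux falls more than `depth` (fractional) below the median."""
    baseline = median(flux)
    limit = baseline * (1.0 - depth)
    return [i for i, value in enumerate(flux) if value < limit]


# Synthetic light curve: constant brightness 1.0 with a 2% dip at samples 5 to 7.
flux = [1.0] * 5 + [0.98] * 3 + [1.0] * 5
print(find_dips(flux))  # [5, 6, 7]
```

Real light curves are noisy and the dips repeat once per orbit, which is part of why learned models such as CNNs are preferred over a fixed threshold.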
Logistic Regression
Logistic regression is a useful analysis method for classification problems, where the goal is to determine which category a new sample fits best.
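A minimal sketch of logistic regression on a single feature is shown below, assuming a binary label (1 for a candidate exoplanet, 0 otherwise). The feature values, learning rate, and epoch count are made up for illustration, and the model is fitted with plain batch gradient descent rather than with any particular library.

```python
import math


def sigmoid(z):
    """Squash a real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, lr=0.5, epochs=2000):
    """Fit weight w and bias b by batch gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction errors (p - y) for every sample with the current parameters.
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Return class 1 when the estimated probability is at least 0.5."""
    return 1 if sigmoid(w * x + b) >= 0.5 else 0


# Hypothetical one-dimensional feature with two well-separated classes.
xs = [0.5, 1.0, 1.5, 3.0, 3.5, 4.0]
ys = [0, 0, 0, 1, 1, 1]
w, b = train_logistic(xs, ys)
print(predict(w, b, 1.0), predict(w, b, 3.5))
```

The fitted probability `sigmoid(w * x + b)` is what lets the model express how well a new sample fits a category, rather than only giving a hard label.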
CONCLUSIONS
This paper aims to perform a comparative analysis and to introduce different techniques from the field of astronomy, combining them with machine learning to obtain more accurate results. Transfer learning improves training and hence accuracy for galaxy images. A single- or multi-layer perceptron is expected to give better results for the exoplanet metadata. Among KNN, Random Forest, SVM, and logistic regression, we adopt the model that gives the best accuracy and save it. The two saved models are used in a web app and deployed.
Fig. 6. Transient light curve.
c-SVM
For the c-SVM we use a Gaussian kernel as the Radial Basis Function (RBF) because it produces a more flexible decision boundary.
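For reference, the Gaussian (RBF) kernel measures similarity between two feature vectors as K(x, y) = exp(-gamma * ||x - y||^2). The sketch below computes it directly. The parameter name `gamma` and its default value are assumptions for this example, since the paper does not state the kernel width used.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel: K(x, y) = exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


print(rbf_kernel([1.0, 2.0], [1.0, 2.0]))  # identical points give 1.0
print(rbf_kernel([0.0, 0.0], [3.0, 4.0], gamma=0.1))
```

Because the kernel value decays smoothly with distance, the resulting decision boundary can curve around clusters of points instead of being restricted to a straight line. A larger `gamma` makes the boundary follow the training data more closely.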
KNN
1. Load the data.
2. Initialize K to your chosen number of neighbors.
3. For each example in the data:
   a. Calculate the distance between the query example and the current example from the data.
   b. Add the distance and the index of the example to an ordered collection.
4. Sort the ordered collection of distances and indices from smallest to largest (in ascending order) by the distances.
5. Pick the first K entries from the sorted collection.
6. Get the labels of the selected K entries.
7. If regression, return the mean of the K labels.
8. If classification, return the mode of the K labels.
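The KNN procedure described here can be sketched in a few lines of Python. The function name `knn_predict`, the use of Euclidean distance, and the toy data points are assumptions made for this illustration. The paper does not specify which distance metric or implementation was used.

```python
import math
from collections import Counter


def knn_predict(data, query, k, task="classification"):
    """Predict a label for `query` from its k nearest neighbors.

    `data` is a list of (features, label) pairs.
    """
    # Compute the distance from the query to every example, keeping its index.
    distances = []
    for index, (features, _label) in enumerate(data):
        distances.append((math.dist(features, query), index))

    # Sort by distance (ascending) and keep the labels of the k closest examples.
    distances.sort()
    k_labels = [data[index][1] for _distance, index in distances[:k]]

    # Regression returns the mean label, classification returns the mode.
    if task == "regression":
        return sum(k_labels) / len(k_labels)
    return Counter(k_labels).most_common(1)[0][0]


# Toy data: two hypothetical features per star and a class label.
data = [
    ([1.0, 1.0], "planet"),
    ([1.2, 0.9], "planet"),
    ([5.0, 5.0], "no planet"),
    ([5.5, 4.5], "no planet"),
]
print(knn_predict(data, [1.1, 1.0], k=3))
```

Choosing an odd K for two-class problems avoids ties in the vote.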
Random Forest
One of the most important features of the Random Forest algorithm is that it can handle data sets containing categorical variables, as in the case of classification. It generally performs well on classification problems.