- Open Access
- Authors : Dishi Patangiya , Bhavya Sharma , Dr. Vikas Khare
- Paper ID : IJERTV11IS030142
- Volume & Issue : Volume 11, Issue 03 (March 2022)
- Published (First Online): 04-04-2022
- ISSN (Online) : 2278-0181
- Publisher Name : IJERT
- License: This work is licensed under a Creative Commons Attribution 4.0 International License
Stock Assessment of Tatasteel using Time Series Analysis
Dishi Patangiya¹, Bhavya Sharma¹, Dr. Vikas Khare²
¹MBA Tech III Year Students, STME, NMIMS, Indore, INDIA
²Associate Professor, STME, NMIMS, Indore, INDIA
Abstract: A time series is a collection of data points ordered by date and time. Analysing data in this way can help to understand a stock's behaviour and quantify the risk associated with it. The tools used to predict the future price of TATASTEEL stock are the ARIMA (autoregressive integrated moving average) model, Python, and Power BI. The Python libraries used are NumPy, Matplotlib, pandas, scikit-learn, and pmdarima. Power BI is an interactive data visualisation software aimed largely at business intelligence. The Power BI dashboard is a story-telling one-page display. Reports are used to create dashboard displays, and each report is based on a dataset. The TATASTEEL dataset used here covers 11 years.
Keywords: Time Series Analysis, Python, Power BI, Stock Market
INTRODUCTION
A time series is a collection of discrete data points arranged chronologically, so that the observations form a continuous sequence in time. Time series data analysis is used to derive relevant statistics and other data characteristics [1]. Before making any investment, statistical calculations and analysis of time series data can be used to gain a better understanding of the stock's behaviour and to estimate the risk. Time series forecasting goes a step further: it applies mathematical models to forecast future values from past data [3]. In this paper, the Augmented Dickey-Fuller test is used to determine the stationarity of the time series data, and the Auto-Regressive Integrated Moving Average (ARIMA) model is used to estimate future stock prices for a specified period of time.
The stock market is a market that allows people to buy and sell business equity. Each stock exchange has its own stock index [2]. The index is the average value generated by combining the prices of several stocks. This makes it simpler to see the entire stock market as well as market forecasts over time. Individuals and the economy as a whole are heavily influenced by the stock market. As a result, correctly predicting market trends can lower the risk of losing money while increasing profits.
The National Stock Exchange of India (NSE), often referred to as the Indian Stock Exchange, is a private company based in India. Formed in 1992 and located in India's economic metropolis, it was the country's first demutualized electronic exchange. The NSE was also India's first exchange to offer a modern, fully automated screen-based electronic trading system, allowing investors from throughout the country to trade with ease. Nifty is frequently used as a barometer of the Indian capital market by investors in India and throughout the world.
Kwon and Shin (1999), Christiansen et al. (2012), Engle et al. (2013), and Bekrios et al. (2013) all underline the relevance of economic factors, particularly economic growth, on stock market return or volatility (2016). Several studies, such as Erb, Harvey, and Viskanta (1995) and Hassan et al. (2003), have looked into the impact of country risk on the stock market. Erb et al. (1995) indicate, using a panel-based model, that country risk indicators, such as political, economic, and financial risks, are essential for predicting expected stock returns. Hassan et al. (2003), on the other hand, examine the impact of country risk on stock market volatility in the Middle East and Africa from 1984 to 1999. According to their findings, country risk characteristics are significant determinants of stock market return volatility.
METHODOLOGY
This paper uses the ARIMA model, several Python libraries, and Power BI to identify, select, process, and analyse information for the stock assessment using time series. These methodologies are shown in Figure (1).
2.1 ARIMA MODEL
The Autoregressive Integrated Moving Average (ARIMA) model converts non-stationary data to stationary data before working with it. It is one of the most widely used models for predicting linear time series data. The ARIMA model has been widely used in banking and economics because it has been shown to be dependable, efficient, and capable of anticipating short-term share market fluctuations.
Figure 1: Methodologies (methods shown: ARIMA model, Power BI, Python)
ARIMA is a time series model used in statistics and econometrics to track events over time. The model is used to interpret past data or to predict future values in a series.
It applies when a metric is measured at regular intervals, such as every fraction of a second, daily, weekly, or monthly. ARIMA is based on the Box-Jenkins approach. To see the autoregressive idea, consider a value A that is influenced by another value B; a linear regression then has to determine the relationship between A and B. If B is A's own previous value, A's present value depends on its past value, and the present value of A in turn influences its future values.
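The dependence of a value on its own previous value can be illustrated with a first-order autoregression. The following is a minimal sketch, not the paper's code: it simulates a synthetic series with an assumed coefficient of 0.7, then recovers that coefficient by regressing each value on the one before it. The function names `simulate_ar1` and `fit_ar1` are hypothetical.

```python
import numpy as np

def simulate_ar1(phi, n, sigma=1.0, seed=0):
    """Simulate x[t] = phi * x[t-1] + noise[t], starting from x[0] = 0."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, sigma, size=n)
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + noise[t]
    return x

def fit_ar1(x):
    """Least-squares slope from regressing x[t] on x[t-1] (no intercept)."""
    prev, curr = x[:-1], x[1:]
    return float(prev @ curr / (prev @ prev))

# Synthetic data only: phi = 0.7 is an assumption for illustration.
series = simulate_ar1(phi=0.7, n=5000)
phi_hat = fit_ar1(series)

# One-step-ahead forecast: the latest value decides the next one.
next_value = phi_hat * series[-1]
print(f"estimated phi = {phi_hat:.3f}, forecast = {next_value:.3f}")
```

A full ARIMA model adds differencing (the "integrated" part) and moving-average terms on top of this autoregressive component.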
PYTHON LIBRARIES
Pandas: Pandas is an open-source library that makes working with relational or labelled data simple and intuitive. It provides a number of data structures and methods for working with numerical and time series data, and it is built on top of the NumPy library. Pandas is fast and gives its users a high level of performance and productivity. It includes tools for data analysis, cleansing, exploration, and manipulation. Pandas makes it possible to analyse large volumes of data and draw conclusions based on statistical theory. It can clean up data sets and make them readable and useful, which matters because relevant data is crucial in data science. Figure (2) shows all of the Python libraries that were used in the analysis.
Figure (2): Python Libraries (libraries shown: Pandas, Matplotlib, Numpy, Scikit Learn, Pmdarima)
NumPy: It's a Python library that includes a multidimensional array object, derived objects (such as masked arrays and matrices), and a number of routines for performing fast array operations such as mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation, and more.
Matplotlib: It's a useful Python visualisation package for 2D plots of arrays. It's a multi-platform data visualisation package built on NumPy arrays and designed to work with the broader SciPy stack. One of the most important benefits of visualisation is that it allows us to see large amounts of data in easily understood images. Many plot types are available, including line, bar, scatter, and histogram.
Scikit learn: Scikit-learn (sklearn) is one of Python's most useful and robust machine learning libraries. Through a consistent Python interface, it provides a set of efficient machine learning and statistical modelling methods such as classification, regression, clustering, and dimensionality reduction. This library is mostly written in Python and is built on NumPy, SciPy, and Matplotlib.
Pmdarima: Pmdarima is a statistical library that fills gaps in Python's time series analysis capabilities. It includes a collection of statistical tests for stationarity and seasonality; time series utilities such as differencing and inverse differencing; endogenous and exogenous transformers and featurizers, including Box-Cox and Fourier transformations; seasonal time series decomposition; cross-validation utilities; a large collection of built-in time series datasets for experimentation and examples; and scikit-learn-style pipelines to consolidate estimators and support production use.
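Differencing and inverse differencing, mentioned among the utilities above, can be sketched with NumPy alone. This is an illustrative sketch rather than pmdarima's own implementation; the helper names `difference` and `inverse_difference` are hypothetical, and the input values are the first five Close prices from Table (1).

```python
import numpy as np

def difference(x, d=1):
    """Apply first differencing d times; also return the leading values needed to invert."""
    x = np.asarray(x, dtype=float)
    heads = []
    for _ in range(d):
        heads.append(x[0])      # remember the first value lost by this pass
        x = np.diff(x)
    return x, heads

def inverse_difference(dx, heads):
    """Undo `difference` by cumulative summation, restoring each stored leading value."""
    x = np.asarray(dx, dtype=float)
    for head in reversed(heads):
        x = np.concatenate(([head], head + np.cumsum(x)))
    return x

# First five weekly Close values from Table (1).
close = np.array([629.997131, 593.030945, 599.795349, 607.32196, 605.845215])

diffed, heads = difference(close, d=2)      # d = 2 is an arbitrary choice here
restored = inverse_difference(diffed, heads)
print(np.allclose(restored, close))
```

Differencing is what the "integrated" part of ARIMA refers to: the model is fitted on the differenced series, and forecasts are mapped back to price levels by the inverse operation.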
POWER BI
Power BI is an interactive data visualisation product from Microsoft focused on business intelligence. Its data warehousing capabilities include data preparation, data discovery, and interactive dashboards. Microsoft Power BI is a tool for creating reports and gaining insights from a company's data. It connects to many datasets and "cleans up" the data so that it can be processed and understood more easily. The Power BI architecture is a service built on Azure that can connect to a variety of data sources. Power BI Desktop is used to build reports and data visualisations on a dataset. The Power BI gateway connects to on-premises data sources to provide continuous data for reporting and analysis. The Power BI service is a cloud-based service for publishing Power BI reports and data visualisations. The Power BI mobile app, available for Windows, iOS, and Android, allows data to be viewed from anywhere. A Power BI dashboard is a one-page presentation that tells a story. Dashboard displays are created from reports, and each report is based on a dataset. The one-page dashboard is called a canvas. The visualisations displayed on the dashboard are known as tiles, which the report creator pins to the dashboard. The components of Power BI are Power Query, Power Pivot, Power View, Power Map, Power Q&A, and Power BI Desktop (the development tool). Power BI's features include Stream Analytics, multiple data sources, and custom visualisation.
DATA
Tata Steel Limited is an Indian multinational steel-making firm based in Jamshedpur, Jharkhand, and headquartered in Mumbai, Maharashtra, India. The corporation is owned by the Tata Group. With an annual crude steel capacity of 34 million tonnes, Tata Steel, formerly known as Tata Iron and Steel Company Limited (TISCO), is one of the world's major steel makers. It is one of the world's most geographically diverse steel producers, with operations and a commercial presence all over the globe. The group generated a consolidated turnover of US$19.7 billion in the financial year ending March 31, 2020 (excluding SEA activities). Tata Steel employs more than 80,500 people across 26 countries, with the majority of its operations in India, the Netherlands, and the United Kingdom. The company's largest plant (10 MTPA capacity) is located in Jamshedpur, Jharkhand. In 2007, Tata Steel purchased Corus, a steel manufacturer based in the United Kingdom. It was ranked 486th on the Fortune Global 500 list of the world's largest companies in 2014. After Steel Authority of India Ltd (SAIL), it is India's second largest steel company (measured by domestic production), with an annual capacity of 13 million tonnes. The TATASTEEL dataset used here covers an 11-year period, from January 1, 2011 to January 1, 2022. Table (1) shows the stock data used in the analysis, with the attributes Date, Open, Close, High, Low, Adj. Close, and Volume, and lists the first 10 tuples of the dataset.
Table (1): Dataset of TATASTEEL stock

| Date | Open | High | Low | Close | Adj Close | Volume |
|---|---|---|---|---|---|---|
| 01-01-2011 | 652.815125 | 680.158691 | 625.138123 | 629.997131 | 485.806061 | 30599552 |
| 08-01-2011 | 628.806152 | 634.236755 | 588.31488 | 593.030945 | 457.300568 | 35957677 |
| 15-01-2011 | 587.981445 | 614.038757 | 584.218079 | 599.795349 | 462.516724 | 28110844 |
| 22-01-2011 | 599.176086 | 633.093506 | 595.460388 | 607.32196 | 468.320618 | 23250152 |
| 29-01-2011 | 591.649475 | 621.184265 | 586.695251 | 605.845215 | 467.181946 | 43123736 |
| 05-02-2011 | 609.465637 | 614.991516 | 547.918823 | 567.164124 | 437.354004 | 39067026 |
| 12-02-2011 | 578.596985 | 630.140015 | 571.165588 | 608.179443 | 468.981903 | 34980617 |
| 19-02-2011 | 611.085266 | 702.166931 | 563.257874 | 578.168213 | 445.839508 | 34980617 |
| 26-02-2011 | 584.980286 | 605.749939 | 570.975098 | 588.981812 | 454.178131 | 22202239 |
| 05-03-2011 | 584.456299 | 591.220703 | 549.871948 | 553.825806 | 427.068512 | 27059203 |
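As an illustration of how data shaped like Table (1) might be handled in pandas, the sketch below builds a DataFrame from the first four rows, parses the dates, and derives two simple columns. It assumes the dates are in day-month-year order, which the weekly spacing of the rows suggests; the derived column names `Range` and `Return` are hypothetical and not part of the dataset.

```python
import pandas as pd

# First four rows of Table (1); the column order follows the table header.
rows = [
    ("01-01-2011", 652.815125, 680.158691, 625.138123, 629.997131, 485.806061, 30599552),
    ("08-01-2011", 628.806152, 634.236755, 588.31488, 593.030945, 457.300568, 35957677),
    ("15-01-2011", 587.981445, 614.038757, 584.218079, 599.795349, 462.516724, 28110844),
    ("22-01-2011", 599.176086, 633.093506, 595.460388, 607.32196, 468.320618, 23250152),
]
columns = ["Date", "Open", "High", "Low", "Close", "Adj Close", "Volume"]

df = pd.DataFrame(rows, columns=columns)
# Assumption: dates are DD-MM-YYYY.
df["Date"] = pd.to_datetime(df["Date"], format="%d-%m-%Y")
df = df.set_index("Date").sort_index()

# Sanity check: High and Low should bracket both Open and Close in every row.
valid = (
    (df["High"] >= df[["Open", "Close"]].max(axis=1))
    & (df["Low"] <= df[["Open", "Close"]].min(axis=1))
).all()

df["Range"] = df["High"] - df["Low"]       # trading range within each period
df["Return"] = df["Close"].pct_change()    # period-over-period change in Close
print(df[["Close", "Range", "Return"]])
```

With a date index in place, the same frame can be sliced by year or aggregated by month for the plots discussed in the analysis.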
ANALYSIS
Figure (3): Highest Stock Price Graph
Figure (4): Visualization of the stock's daily closing price
Figure (5): Probability distribution of closing price of stock
Figure (6)
In Figure (6), the researchers applied the Augmented Dickey-Fuller (ADF) test, one of the most widely used statistical tests for stationarity. It tests whether a series has a unit root or is stationary.
The null and alternative hypotheses are:
Null hypothesis: The series has a unit root.
Alternative hypothesis: The series has no unit root.
If the null hypothesis is rejected, the series is taken to be stationary, and its rolling mean and standard deviation appear as roughly flat lines. The graph leads to the conclusion that the series is non-stationary.
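The logic of the unit-root test can be sketched with a plain Dickey-Fuller regression in NumPy. This is a simplified sketch, not the procedure behind Figure (6): it omits the lagged difference terms that make the test "augmented", it runs on synthetic data, and it uses -2.86 as an approximate large-sample 5% critical value for the constant-only case. In practice a library routine such as `adfuller` in `statsmodels.tsa.stattools` would normally be used. The function name `dickey_fuller_t` is hypothetical.

```python
import numpy as np

def dickey_fuller_t(y):
    """t-statistic of gamma in the regression diff(y)[t] = alpha + gamma * y[t-1] + error."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)
    X = np.column_stack([np.ones(len(dy)), y[:-1]])
    beta, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ beta
    sigma2 = resid @ resid / (len(dy) - X.shape[1])   # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)             # OLS covariance of beta
    return float(beta[1] / np.sqrt(cov[1, 1]))

# Approximate asymptotic 5% critical value (constant, no trend); an assumption here.
CRITICAL_5PCT = -2.86

rng = np.random.default_rng(42)
shocks = rng.normal(size=500)
random_walk = np.cumsum(shocks)   # synthetic series with a unit root
white_noise = shocks              # synthetic stationary series

t_walk = dickey_fuller_t(random_walk)
t_noise = dickey_fuller_t(white_noise)

for name, t in [("random walk", t_walk), ("white noise", t_noise)]:
    verdict = "reject unit root" if t < CRITICAL_5PCT else "cannot reject unit root"
    print(f"{name}: t = {t:.2f} -> {verdict}")
```

A strongly negative statistic rejects the null hypothesis of a unit root; a statistic above the critical value means the series cannot be treated as stationary without further transformation such as differencing.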
Figure (7): Separation of Trend and Seasonality
Figure (8): Log of the series and calculating rolling average
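The transformation named in the caption of Figure (8) can be sketched in pandas as follows. The prices here are synthetic, and the 12-period window is an assumption; the paper does not state the window that was used.

```python
import numpy as np
import pandas as pd

# Synthetic weekly prices for illustration only (not the TATASTEEL data).
dates = pd.date_range("2011-01-01", periods=104, freq="7D")
rng = np.random.default_rng(1)
prices = pd.Series(
    600 * np.exp(np.cumsum(rng.normal(0.002, 0.03, size=104))), index=dates
)

log_prices = np.log(prices)   # the log compresses large swings in level

window = 12                   # assumed window length, in periods
rolling_mean = log_prices.rolling(window).mean()
rolling_std = log_prices.rolling(window).std()

# Subtracting the rolling mean removes much of the slow-moving trend.
detrended = (log_prices - rolling_mean).dropna()
print(detrended.tail())
```

The first `window - 1` rolling values are undefined because a full window is not yet available, which is why they are dropped before further use.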
Figure (9): ARIMA model to train the data
Figure (10)
In Figure (10), the researchers choose the parameters of the ARIMA model, which are p, d, and q.
They used Auto ARIMA, which automatically discovers the optimal order for an ARIMA model, to select the parameters. The auto ARIMA procedure calculates the optimal parameters and returns the best-fitting ARIMA model.
In the top-left graph of the figure, the residual errors appear to fluctuate around a mean of zero with uniform variance.
The density plot in the top-right graph suggests a normal distribution with a mean of zero. In the bottom-left graph, the red line is not completely aligned with the dots, which indicates skewness in the data. The bottom-right graph shows that the residual errors are not autocorrelated.
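The four diagnostic panels described above have simple numerical counterparts: the residual mean, the skewness, and the autocorrelation at the first few lags. The sketch below computes them with NumPy on synthetic residuals that stand in for the fitted model's residuals, which are not available in the text. The helper names `autocorr` and `skewness` are hypothetical.

```python
import numpy as np

def autocorr(x, lag):
    """Sample autocorrelation of x at the given positive lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    return float(x[lag:] @ x[:-lag] / (x @ x))

def skewness(x):
    """Sample skewness: third central moment over the second central moment to the power 1.5."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    return float(np.mean(x**3) / np.mean(x**2) ** 1.5)

# Synthetic stand-in for model residuals (assumption for illustration).
rng = np.random.default_rng(7)
residuals = rng.normal(0.0, 1.0, size=400)

# Approximate 95% band for the autocorrelations of white noise.
band = 1.96 / np.sqrt(len(residuals))

summary = {
    "mean": float(np.mean(residuals)),
    "skewness": skewness(residuals),
    "acf": [autocorr(residuals, k) for k in range(1, 6)],
    "band": float(band),
}
print(summary)
```

A mean near zero, skewness near zero, and autocorrelations that mostly stay inside the band correspond to the well-behaved residuals the diagnostic plots are meant to confirm; a clearly non-zero skewness corresponds to the misalignment seen in the bottom-left graph.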
Figure (11): Forecasting with 95% confidence level
Figure (12)(a): Candlestick Graph
Figure (12) (b): Candlestick Graph
In Figure (12)(b), the researchers examined the monthly stock price for the years 2019 and 2020 using a line chart and a candlestick graph.
The world was hit by COVID-19 during this period and stock values plummeted; over these two years the lowest price was Rs. 250.85 and the highest was Rs. 653.50.
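A candlestick chart needs one open, high, low, and close value per period. The sketch below shows one way such monthly values could be derived from weekly rows in pandas. Because the 2019 and 2020 data are not reproduced in the text, it uses the first six rows of Table (1) instead, and it assumes day-month-year dates.

```python
import pandas as pd

# First six weekly rows of Table (1), used here only as sample input.
weekly = pd.DataFrame(
    {
        "Open": [652.815125, 628.806152, 587.981445, 599.176086, 591.649475, 609.465637],
        "High": [680.158691, 634.236755, 614.038757, 633.093506, 621.184265, 614.991516],
        "Low": [625.138123, 588.31488, 584.218079, 595.460388, 586.695251, 547.918823],
        "Close": [629.997131, 593.030945, 599.795349, 607.32196, 605.845215, 567.164124],
    },
    index=pd.to_datetime(
        ["01-01-2011", "08-01-2011", "15-01-2011", "22-01-2011", "29-01-2011", "05-02-2011"],
        format="%d-%m-%Y",
    ),
)

# One candle per calendar month: first open, highest high, lowest low, last close.
monthly = weekly.groupby(weekly.index.to_period("M")).agg(
    {"Open": "first", "High": "max", "Low": "min", "Close": "last"}
)
print(monthly)
```

The lowest and highest prices over any span, such as those quoted above for 2019 and 2020, are then the minimum of the `Low` column and the maximum of the `High` column over that span.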
Figure (13) Prediction of stock price for next 6 years
CONCLUSION:
From the analysis, the researchers conclude that the stock hit its lowest price at the start of the COVID-19 pandemic and its highest price (until January 2022) in 2021. They compared the open and close prices of Tata Steel stock over the past ten years. Using the forecasting tool in Power BI together with time series analysis, they also predicted the price of this stock for the next 6 years. The upper bounds for the 6 years are 1434.15, 1534.45, 1613.83, 1681.62, 1741.77, and 1796.41, and the lower bounds are 853.24, 752.93, 673.56, 605.77, 545.62, and 490.99.
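The reported bounds can be summarised with a little arithmetic. The sketch below takes the upper and lower bounds exactly as stated above and computes the midpoint and half-width of each interval; interpreting the midpoint as the point forecast is an assumption, since the text does not report the point forecasts themselves.

```python
# Bounds as reported in the conclusion, one pair per forecast year.
upper = [1434.15, 1534.45, 1613.83, 1681.62, 1741.77, 1796.41]
lower = [853.24, 752.93, 673.56, 605.77, 545.62, 490.99]

# Midpoint of each interval (assumed to approximate the point forecast).
midpoints = [(u + l) / 2 for u, l in zip(upper, lower)]

# Half-width of each interval: how far the bounds sit from the midpoint.
half_widths = [(u - l) / 2 for u, l in zip(upper, lower)]

for year, (m, h) in enumerate(zip(midpoints, half_widths), start=1):
    print(f"year {year}: midpoint = {m:.2f}, half-width = {h:.2f}")
```

The midpoints are essentially constant while the half-widths grow every year, so the stated bounds describe a nearly flat central forecast whose uncertainty widens with the horizon.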
REFERENCES:
[1] Kirikkaleli D., 2020, The effect of domestic and foreign risks on an emerging stock market: A time series analysis, North American Journal of Economics and Finance (51).
[2] Christy Jackson J., Prassanna J., Abdul Quadir Md., Sivakumar V., 2020, Stock market analysis and prediction using time series analysis, Materials Today: Proceedings.
[3] Jia Zhu, Daijun Wei, 2021, Analysis of stock market based on visibility graph and structure entropy, Physica A: Statistical Mechanics and its Applications, Volume 576.