CS419 Assignment 1: Stock Price Prediction using Machine Learning


GitHub link: https://github.com/akj-new-era/Stock-Price-Prediction

Abstract

This project examines a number of forecasting techniques for predicting future stock returns based on past returns. We do this by applying supervised learning methods to market data for stock price forecasting. We primarily apply linear models and later move on to neural networks.

Introduction

Stock (also known as equity) is a security that represents the ownership of a fraction of a corporation. This entitles the owner of the stock to a proportion of the corporation’s assets and profits equal to how much stock they own. Units of stock are called “shares.” Stock is a general term used to describe the ownership certificates of any company.

Stock prices change every day because of market forces. By this we mean that share prices change because of supply and demand. If more people want to buy a stock (demand) than sell it (supply), then the price moves up. Conversely, if more people want to sell a stock than buy it, there is greater supply than demand, and the price falls.

Data Set

To train and test our model we used data for HDFC, ITC, and Infosys from 2000 to 2021, taken from Kaggle (https://www.kaggle.com/datasets/rohanrao/nifty50-stock-market-data). This dataset contains daily opening and closing prices, daily high and low values, traded volume and turnover, the previous day's stock price, etc.

[Figures: HDFC, ITC, INFOSYS]

Linear Regression

Linear regression is one of the easiest and most popular Machine Learning algorithms. It is a statistical method that is used for predictive analysis. Linear regression makes predictions for continuous/real or numeric variables such as sales, salary, age, product price, etc.

The linear regression algorithm models a linear relationship between a dependent variable (y) and one or more independent variables (xi), and shows how the value of the dependent variable changes with the values of the independent variables.

In this method we find the best fit line as follows.

1) Define a linear model

$$y = a_0 + a_1 x + \varepsilon$$

Here,

y = Dependent Variable (Target Variable)
x = Independent Variable (Predictor Variable)
a0 = Intercept of the line (gives an additional degree of freedom)
a1 = Linear regression coefficient (scale factor applied to each input value)
ε = Random error

Writing it in matrix form, where each row of X is (1, xi) and A is the column vector of a0 and a1, we get

$$Y = XA + E$$
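2) Define a cost function

Here we use the mean squared error (MSE) cost function

$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(Y_i - (a_1 x_i + a_0)\right)^2$$

Where,

N = Total number of observations
Yi = Actual value
(a1xi + a0) = Predicted value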
3) Use the matrix method to minimize the error between predicted values and actual values

$$A = (X^T X)^{-1} X^T Y$$

This equation gives the optimized value of the coefficient matrix (A).
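4) Check the performance using R-squared (the coefficient of determination)

$$R^2 = 1 - \frac{\sum_{i=1}^{N}(Y_i - \hat{Y}_i)^2}{\sum_{i=1}^{N}(Y_i - \bar{Y})^2}$$

where \hat{Y}_i is the predicted value and \bar{Y} is the mean of the actual values. A high R-squared value means the difference between the predicted values and the actual values is small, and hence indicates a good model.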
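The four steps above can be sketched in a few lines of plain Python. This is only an illustration under assumptions, not the project's code: the function names (`fit_line`, `predict`, `r_squared`) and the toy numbers are made up for the example, and the 2x2 system is inverted by hand because the model has a single predictor. It also shows that R-squared can be negative on held-out data, which happens when the fitted line predicts worse than simply using the mean of the actual values.

```python
def fit_line(xs, ys):
    """Solve A = (X^T X)^-1 X^T Y for the model y = a0 + a1*x.

    Each row of X is [1, x_i], so X^T X is a 2x2 matrix.
    """
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))

    # X^T X = [[n, sx], [sx, sxx]] and X^T Y = [sy, sxy]
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("all x values are identical; X^T X is singular")

    # Apply the explicit inverse of the 2x2 matrix to X^T Y.
    a0 = (sxx * sy - sx * sxy) / det
    a1 = (n * sxy - sx * sy) / det
    return a0, a1


def predict(xs, a0, a1):
    return [a0 + a1 * x for x in xs]


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(actual) / len(actual)
    ss_res = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in actual)
    if ss_tot == 0:
        raise ValueError("actual values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


# Toy training data lying exactly on y = 2 + 3x (not from the stock dataset).
train_x = [0.0, 1.0, 2.0, 3.0]
train_y = [2.0, 5.0, 8.0, 11.0]
a0, a1 = fit_line(train_x, train_y)
train_r2 = r_squared(train_y, predict(train_x, a0, a1))

# Held-out points that break away from the training trend.
test_x = [4.0, 5.0, 6.0]
test_y = [10.0, 9.0, 8.0]
test_r2 = r_squared(test_y, predict(test_x, a0, a1))

print("a0 =", a0, "a1 =", a1)
print("train R^2 =", train_r2, "test R^2 =", test_r2)
```

With more than one predictor the same formula applies, but a linear algebra library would normally be used to solve the system instead of a hand-written inverse.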

Results:

[Figures: linear regression on train data and test data for ITC, HDFC, and INFOSYS]

Recurrent Neural Networks

A recurrent neural network (RNN) is a class of neural networks tailored to deal with temporal data. The neurons of an RNN have a cell state/memory, and input is processed according to this internal state, which is achieved with the help of loops within the neural network. Recurring module(s) of ‘tanh’ layers allow RNNs to retain information, but not for long, which is why we need LSTM models.
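To make the loop concrete, here is a minimal sketch of a single-unit recurrent layer with a tanh activation. The scalar weights `w_x`, `w_h`, and `b` are hand-picked assumptions for illustration only; a trained network would learn them and would use vectors and matrices rather than scalars.

```python
import math


def rnn_forward(inputs, w_x, w_h, b, h0=0.0):
    """Run a one-unit tanh RNN over a sequence and return every hidden state.

    The same weights are reused at each step; the previous hidden state h
    is the only memory carried forward.
    """
    h = h0
    states = []
    for x in inputs:
        h = math.tanh(w_x * x + w_h * h + b)
        states.append(h)
    return states


# Hypothetical inputs and weights, chosen only to show the recurrence.
states = rnn_forward([0.5, -0.1, 0.3], w_x=0.8, w_h=0.5, b=0.0)
print(states)
```

Because each state is squashed through tanh and multiplied by `w_h` again at the next step, the influence of an early input fades over long sequences, which is the limitation the text refers to.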

LSTM Model

Recurrent neural networks (RNNs) have proved to be among the most powerful models for processing sequential data. Long Short-Term Memory (LSTM) is one of the most successful RNN architectures.

LSTM introduces the memory cell, a unit of computation that replaces traditional artificial neurons in the hidden layer of the network. With these memory cells, networks can effectively associate memories with inputs that are remote in time, which makes them suited to grasping the structure of data dynamically over time with high prediction capacity.

LSTMs have a chain-like structure, but the repeating module has a different structure. Instead of having a single neural network layer, there are four, interacting in a very special way.

[Figure: The repeating module in an LSTM]
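The four interacting layers are commonly described as the forget gate, the input gate, the candidate layer, and the output gate. The sketch below implements one step of a single-unit LSTM cell using that standard formulation. The dictionary layout, the parameter names, and the weight values are assumptions made for the example, not values from the project.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One time step of a single-unit LSTM cell.

    params maps each layer name to a (w_x, w_h, b) tuple of scalars.
    Returns the new hidden state h and the new cell state c.
    """
    def pre_activation(name):
        w_x, w_h, b = params[name]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre_activation("forget"))       # how much old memory to keep
    i = sigmoid(pre_activation("input"))        # how much new content to write
    g = math.tanh(pre_activation("candidate"))  # the new content itself
    o = sigmoid(pre_activation("output"))       # how much memory to expose

    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c


# Hypothetical weights: forget gate wide open, input gate shut,
# so the cell state should pass through almost unchanged.
keep_params = {
    "forget": (0.0, 0.0, 20.0),
    "input": (0.0, 0.0, -20.0),
    "candidate": (1.0, 1.0, 0.0),
    "output": (0.0, 0.0, 0.0),
}
h, c = lstm_step(x=0.3, h_prev=0.1, c_prev=0.7, params=keep_params)
print(h, c)
```

The additive update of `c` is what lets an LSTM carry information across many steps: when the forget gate stays near 1 and the input gate near 0, the cell state is copied forward rather than repeatedly squashed.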

Analysis

To analyze the accuracy of the system we used the Mean Squared Error (MSE). Training minimizes the MSE, that is, the mean of the squared differences between the target and the obtained output values. MSE is very commonly used and makes an excellent general-purpose error metric for numerical predictions.
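As a small illustration, the MSE and the mean absolute error (MAE) reported later in the tables can be computed as follows. The function names and the sample prices are assumptions for the example.

```python
def mean_squared_error(actual, predicted):
    """Mean of the squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_error(actual, predicted):
    """Mean of the absolute differences between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical closing prices and predictions.
actual = [100.0, 102.0, 104.0]
predicted = [101.0, 102.0, 101.0]

mse = mean_squared_error(actual, predicted)
mae = mean_absolute_error(actual, predicted)
print("MSE =", mse, "MAE =", mae)
```

MSE is expressed in squared price units and penalizes large misses more heavily, while MAE stays in the original price units, so the two are not directly comparable in magnitude.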

Implementation:

1. Using scikit-learn (Machine Learning model)

  1. Data Preprocessing using dataset
  2. Visualization of Dataset
  3. Feature Scaling
  4. Preparing the Datasets for training
  5. Reshaping the datasets
  7. Model development
  8. Implementation of sequential, dense, LSTM and dropout
  9. Preprocessing the Data
  10. Predicting the Output
  11. Result visualization
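The feature scaling, dataset preparation, and reshaping steps might look like the sketch below. It assumes min-max scaling to the range [0, 1] and a sliding window over closing prices; the window length of 3 and the helper names are hypothetical, since the text does not state which scaler or window size was used.

```python
def min_max_scale(values):
    """Scale values linearly to [0, 1]; also return the bounds for inversion."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot scale a constant series")
    return [(v - lo) / (hi - lo) for v in values], lo, hi


def make_windows(series, window):
    """Turn a 1-D series into (samples, timesteps, 1) inputs and next-step targets."""
    xs, ys = [], []
    for end in range(window, len(series)):
        # Each sample holds `window` consecutive values, one feature per step.
        xs.append([[v] for v in series[end - window:end]])
        ys.append(series[end])
    return xs, ys


# Hypothetical closing prices.
closes = [100.0, 102.0, 101.0, 105.0, 110.0, 108.0]
scaled, lo, hi = min_max_scale(closes)
xs, ys = make_windows(scaled, window=3)

print(len(xs), "samples of shape", (len(xs[0]), len(xs[0][0])))
```

The nested (samples, timesteps, features) layout is the three-dimensional input shape that recurrent layers in common deep learning libraries expect.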

Methodology

Stage 1: Raw Data: In this stage, the historical stock data is collected as described in the Data Set section, and this historical data is used for the prediction of future stock prices.

Stage 2: Data Preprocessing: The pre-processing stage involves

a) Data discretization: Part of data reduction but with particular importance, especially for numerical data.

b) Data transformation: Normalization.

c) Data cleaning: Fill in missing values.

d) Data integration: Integration of data files.

After the dataset is transformed into a clean dataset, it is divided into training and testing sets for evaluation. Here, the training values are taken as the more recent values. Testing data is kept as 5-10 percent of the total dataset.
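A rough sketch of the cleaning and splitting steps is given below. Forward filling is only one possible way to fill in missing values and is an assumption here, as are the helper names and the 10 percent test fraction (picked from the stated 5-10 percent range). The split follows the text by default, keeping the more recent rows for training, with a flag to hold out the most recent rows instead.

```python
def forward_fill(values):
    """Replace each missing value (None) with the last observed value."""
    filled = []
    last = None
    for v in values:
        if v is None:
            if last is None:
                raise ValueError("series starts with a missing value")
            v = last
        filled.append(v)
        last = v
    return filled


def chronological_split(rows, test_fraction=0.1, recent_for_training=True):
    """Split date-ordered rows into (train, test) without shuffling."""
    n_test = max(1, round(len(rows) * test_fraction))
    if recent_for_training:
        # Oldest rows are held out; the more recent rows are used for training.
        return rows[n_test:], rows[:n_test]
    # Alternative: hold out the most recent rows for testing.
    return rows[:-n_test], rows[-n_test:]


# Hypothetical daily closes with gaps.
cleaned = forward_fill([10.0, None, 12.0, None, None, 13.0])
train, test = chronological_split(list(range(20)), test_fraction=0.1)

print(cleaned)
print(len(train), len(test))
```

Either way, the rows are never shuffled, so every window of past values stays in date order.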

Stage 3: Feature Extraction: In this stage, only the features that are to be fed to the neural network are chosen. The candidate features are Date, open, high, low, close, and volume. Here we have chosen Date as the feature.

Stage 4: Training Neural Network: In this stage, the data is fed to the neural network, which is trained for prediction starting from randomly assigned biases and weights. Our LSTM model is composed of a sequential input layer followed by 2 LSTM layers, a dense layer with ReLU activation, and finally a dense output layer with a linear activation function.
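The layer stack described in Stage 4 could be written with the Keras API roughly as follows. This is a sketch under assumptions, not the project's actual model: the unit counts, the Adam optimizer, and the function name `build_lstm_model` are not given in the text and are chosen only for illustration. TensorFlow is imported inside the function so the definition itself does not require it to be installed.

```python
def build_lstm_model(timesteps, n_features, lstm_units=50, dense_units=25):
    """Input -> LSTM -> LSTM -> Dense(ReLU) -> Dense(linear), compiled with MSE loss.

    lstm_units and dense_units are assumed sizes, not values from the report.
    """
    # Lazy import: TensorFlow is only needed when the model is actually built.
    from tensorflow import keras
    from tensorflow.keras import layers

    model = keras.Sequential([
        layers.Input(shape=(timesteps, n_features)),
        # The first LSTM returns the full sequence so the second can consume it.
        layers.LSTM(lstm_units, return_sequences=True),
        layers.LSTM(lstm_units),
        layers.Dense(dense_units, activation="relu"),
        layers.Dense(1, activation="linear"),
    ])
    # The optimizer is an assumption; the loss matches the MSE used in the analysis.
    model.compile(optimizer="adam", loss="mean_squared_error")
    return model
```

A call such as `build_lstm_model(timesteps=60, n_features=1)` would then return a model that can be trained with `model.fit` on windowed inputs of shape (samples, 60, 1); the window length of 60 is again a hypothetical value.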

Results
[Figures: LSTM results on testing data for ITC, Infosys, and HDFC]

Analysis:
Errors obtained using Linear Regression

Companies                       HDFC            ITC           INFOSYS
Mean Squared Error              245146.57934    53423.3812    159334.2670
Mean Absolute Error             416.1567        228.669       280.6234
Coefficient of Determination    -1.5125         -29.9926      -2.1797

Errors obtained using LSTM

Companies                       HDFC                  ITC                   INFOSYS
Mean Squared Error              5933.854308397654     61.66858715680009     3004.878401029405
Mean Absolute Error             56.56770254532921     5.757666215539615     28.037047148515832
Coefficient of Determination    0.9259315325191522    0.9683613473676155    0.9542569842578266

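Thus, we observe that the error for LSTM is far lower than that for linear regression on all three stocks. For HDFC, for example, the MSE falls from 245146.58 to 5933.85 and the coefficient of determination rises from -1.5125 to 0.9259. The LSTM provides a much better fit as it accounts for past data to predict future values.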

Comparative Results Using Different Parameters and Epochs

Set of normalized features    No. of epochs    Mean squared error for normalized features
Prices + Volume               25               0.0022
Prices + Volume               50               0.0018
7_MA + Volume                 25               0.0015
7_MA + Volume                 50               0.0013
Price + Volume + 7_MA         25               0.0018
Price + Volume + 7_MA         50               0.0012
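Assuming 7_MA denotes a 7-day simple moving average of the price (the text does not define it), such a feature could be derived as in the sketch below. The function name and the sample values are hypothetical.

```python
def moving_average(values, window=7):
    """Trailing simple moving average; None until a full window is available."""
    if window <= 0:
        raise ValueError("window must be positive")
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            # Drop the value that just left the window.
            running -= values[i - window]
        out.append(running / window if i >= window - 1 else None)
    return out


# Hypothetical closing prices.
ma7 = moving_average([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0], window=7)
print(ma7)
```

The first six entries are left empty because a trailing average over seven days is not defined until seven prices have been observed; those rows would typically be dropped before scaling and training.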

Conclusion

The popularity of stock market trading is continuously increasing, prompting experts to develop new methods for predicting the future utilizing new techniques. The forecasting technique benefits not only scholars, but also investors and anyone involved in the stock market. A forecasting model with high accuracy is required to assist in the prediction of stock indexes. In this work we used one of the most precise forecasting technologies, a Recurrent Neural Network with Long Short-Term Memory units, to predict the stock market’s future situation. We can also conclude that by taking different combinations of the given variables as parameters for the model, we can achieve higher accuracies, which makes us aware of the fact that “More data with better models produces outstanding results”.
