This project develops, evaluates, and compares autoregressive, gradient boosting, and neural network algorithms for forecasting stock market data, incorporating sentiment analysis from social media as exogenous data. The objective is to analyze the effectiveness of advanced data analytics techniques in conjunction with Big Data processing.
The companies selected for the experiment are AAPL, AMZN, NFLX, NVDA, and TSLA.
The project is divided into two Jupyter notebooks.
Part 1:
Focuses on Big Data techniques, including the use of HDFS and of MySQL and NoSQL databases; exploratory data analysis; sentiment analysis; and big data streaming techniques using PySpark.
Part 2:
Covers advanced data analytics, such as the development and application of autoregressive, gradient boosting, and neural network models; sentiment analysis; and the incorporation of an interactive dashboard.
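To make the idea of sentiment as exogenous data concrete, here is a minimal sketch of an autoregressive model with one exogenous input (often called ARX), fitted by ordinary least squares. It is not the code from the notebooks. The function names `build_arx_design` and `fit_arx` are hypothetical, the series is synthetic rather than real stock or social media data, and the one-step lag on sentiment is an assumption chosen so that each forecast uses only information available before the target period.

```python
import numpy as np


def build_arx_design(y, x, p):
    """Build the regression matrix for an ARX(p) model.

    Each row holds an intercept, the p previous values of y,
    and the previous value of the exogenous series x.
    """
    rows = []
    for t in range(p, len(y)):
        lags = y[t - p:t][::-1]  # y[t-1], ..., y[t-p]
        rows.append(np.concatenate(([1.0], lags, [x[t - 1]])))
    return np.array(rows), y[p:]


def fit_arx(y, x, p):
    """Estimate ARX(p) coefficients by ordinary least squares."""
    design, target = build_arx_design(y, x, p)
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef  # [intercept, lag-1 ... lag-p weights, sentiment weight]


# Synthetic data (illustrative only): a return series driven by its own
# previous value and by the previous period's sentiment score.
rng = np.random.default_rng(0)
n = 500
sentiment = rng.normal(size=n)
returns = np.zeros(n)
for t in range(1, n):
    returns[t] = (
        0.5 * returns[t - 1]
        + 0.3 * sentiment[t - 1]
        + 0.1 * rng.normal()
    )

coef = fit_arx(returns, sentiment, p=1)
print("intercept, AR(1) weight, sentiment weight:", coef)
```

On this synthetic series, the fitted weights should land close to the values used to generate it (0.5 for the lagged return, 0.3 for the lagged sentiment). The same design matrix could also be passed to a gradient boosting or neural network regressor, which is one way the three model families can be compared on identical inputs.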