Stock Analysis with LLMs: Research Automation for Quantitative Investment

This repository explores the use of Large Language Models (LLMs) to support stock selection. The system automates research and streamlines the identification of promising investment opportunities by analyzing textual data alongside market metrics.

Project Overview

The project focuses on building an intelligent Research Automation System capable of:

- Understanding natural language queries to identify relevant stocks (e.g., "What are companies that build data centers?").
- Filtering and searching all stocks listed on the New York Stock Exchange (NYSE) by key metrics, including Market Capitalization, Volume, Sector, and more.
- Combining LLMs, embeddings, and market data to connect textual analysis with actionable investment insights.

Features

1. Natural Language Stock Queries

Users can enter complex queries in natural language to identify stocks meeting specific criteria.
Example: "Show me tech companies with a market capitalization greater than $10 billion."

2. Search by Metrics

Advanced search options for stocks based on:
- Market Capitalization
- Volume
- Industry/Sector
- And more.
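
A minimal sketch of metric-based filtering, assuming stock records have already been fetched into plain dictionaries. The field names (`market_cap`, `volume`, `sector`), the sample tickers, and the `filter_stocks` helper are hypothetical and are not taken from the repository's code.

```python
from typing import Optional

# Hypothetical sample records; real data would come from a market data source.
STOCKS = [
    {"ticker": "AAA", "sector": "Technology", "market_cap": 25e9, "volume": 3_000_000},
    {"ticker": "BBB", "sector": "Technology", "market_cap": 4e9, "volume": 900_000},
    {"ticker": "CCC", "sector": "Energy", "market_cap": 60e9, "volume": 5_500_000},
]


def filter_stocks(stocks, sector: Optional[str] = None,
                  min_market_cap: float = 0.0, min_volume: int = 0):
    """Return the records that satisfy every supplied criterion."""
    return [
        s for s in stocks
        if (sector is None or s["sector"] == sector)
        and s["market_cap"] >= min_market_cap
        and s["volume"] >= min_volume
    ]


# Technology companies with a market capitalization above $10 billion.
matches = filter_stocks(STOCKS, sector="Technology", min_market_cap=10e9)
print([s["ticker"] for s in matches])
```

Leaving a criterion at its default skips that filter, so the same helper can serve sector-only or volume-only searches.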

3. Sentiment Analysis for Trading

Integrates Large Language Models to analyze sentiment from news articles, reports, and other textual sources, enhancing decision-making.

4. Real-Time Data Integration

Retrieves up-to-date market data using Yahoo Finance (yFinance).

5. Embeddings and Similarity Search

Uses vector embeddings and similarity search to match user queries with relevant stocks efficiently.
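
To illustrate the idea, here is a small sketch of ranking stocks by cosine similarity to a query vector. It uses hand-written three-dimensional vectors and plain Python; in the project the embeddings would come from a Sentence Transformers model and the search would run in Pinecone. The tickers, vectors, and `rank_by_similarity` helper are assumptions made for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Treat a zero vector as dissimilar to everything.
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, stock_vecs, top_k=2):
    """Return the top_k (ticker, score) pairs, most similar first."""
    scored = [(ticker, cosine_similarity(query_vec, vec))
              for ticker, vec in stock_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy embeddings standing in for embedded company descriptions.
STOCK_VECS = {
    "AAA": [0.9, 0.1, 0.0],
    "BBB": [0.1, 0.9, 0.0],
    "CCC": [0.7, 0.3, 0.1],
}
query = [1.0, 0.0, 0.0]
for ticker, score in rank_by_similarity(query, STOCK_VECS):
    print(ticker, round(score, 3))
```

scikit-learn's `cosine_similarity` computes the same quantity over whole matrices, which is the better fit once there are many embeddings to compare.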

Tech Stack

Core Technologies

- Streamlit: Interactive web interface.
- Pinecone: Vector database for fast similarity search.
- OpenAI API: Natural Language Processing (NLP) with GPT models.
- Groq API: Low-latency hosted inference for LLMs.

Libraries and Tools

- LangChain: Framework for working with LLMs and embeddings.
- HuggingFace Sentence Transformers: For embedding textual data.
- scikit-learn: To compute cosine similarity between embeddings.
- yFinance: Real-time market data retrieval.

Other Dependencies

- dotenv: Loads environment variables (such as API keys) from a .env file.
- NumPy: Numerical array operations.
- Requests: HTTP client for API requests.

Installation

Prerequisites

- Python 3.8+
- API keys for OpenAI, Pinecone, and Groq.

Steps to Run Locally

Clone the repository:

```
git clone https://github.com/sheicky/stock_analysis_with_LLM.git
cd stock_analysis_with_LLM
```

Install dependencies:
```
pip install -r requirements.txt  
```

Set up environment variables:
Create a .env file in the root directory and add the following:

```
OPENAI_API_KEY=<your_openai_api_key>
PINECONE_API_KEY=<your_pinecone_api_key>
GROQ_API_KEY=<your_groq_api_key>
```
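
Before launching the app, it can help to confirm that all three keys are visible to the process. The following is a small sketch, not code from the repository: the `missing_keys` helper is hypothetical, and it assumes the `.env` values have already been loaded into the environment (for example with python-dotenv's `load_dotenv()`).

```python
import os

# The variable names listed in the .env file above.
REQUIRED_KEYS = ("OPENAI_API_KEY", "PINECONE_API_KEY", "GROQ_API_KEY")


def missing_keys(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name)]


missing = missing_keys()
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All required API keys are set.")
```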

Run the Streamlit application:
```
streamlit run app.py  
```
Watch the demo on YouTube: https://www.youtube.com/watch?v=M9TzqpBcggg

How It Works

Query Processing
- User inputs are processed using OpenAI’s GPT model, converting natural language queries into actionable search commands.
Stock Retrieval
- Stocks are filtered using Yahoo Finance data and further refined using vector similarity with Pinecone.
Sentiment Analysis
- News articles and reports are embedded using HuggingFace Sentence Transformers, and sentiment scores are computed to aid trading decisions.
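
One way to picture the query-processing step: the model is asked to return JSON describing the search, and the application validates that JSON before using it. The sketch below assumes such a JSON shape. The `SearchCommand` fields and the `parse_search_command` function are hypothetical, and the model response is hard-coded, so no API call is made.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class SearchCommand:
    """Hypothetical structured form of a natural language stock query."""
    sector: Optional[str] = None
    min_market_cap: float = 0.0
    keywords: str = ""


def parse_search_command(raw: str) -> SearchCommand:
    """Validate a JSON string (as an LLM might return) into a SearchCommand."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    min_cap = float(data.get("min_market_cap", 0))
    if min_cap < 0:
        raise ValueError("min_market_cap must be non-negative")
    sector = data.get("sector")
    return SearchCommand(
        sector=str(sector) if sector is not None else None,
        min_market_cap=min_cap,
        keywords=str(data.get("keywords", "")),
    )


# Stand-in for a model response to the query:
# "Show me tech companies with a market capitalization greater than $10 billion."
raw_response = (
    '{"sector": "Technology", "min_market_cap": 10000000000, '
    '"keywords": "tech companies"}'
)
command = parse_search_command(raw_response)
print(command)
```

Validating the response before acting on it keeps malformed or unexpected model output from reaching the filtering and similarity-search stages.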

Future Enhancements

- Deep Sentiment Analysis: Integrate advanced LLMs for context-aware sentiment scoring.
- Multi-Market Support: Extend coverage to global stock markets beyond the NYSE.
- Prediction Models: Incorporate time-series forecasting for price and volume trends.
