FinanceLake is an open-source financial data platform that ingests, analyzes, and visualizes market and financial data, similar in ambition to platforms like Bloomberg Terminal but powered by open technologies.
Whether you're a quant, data engineer, open-source maintainer, or trading enthusiast, FinanceLake offers a scalable and intelligent data stack to support real-time insights, financial research, and data-driven decision-making.
- **🔥 Data Ingestion**: Real-time and batch ingestion pipelines using Apache Kafka, Apache NiFi, and API connectors (e.g., Yahoo Finance, Alpha Vantage, Quandl).
- **⚙️ Big Data Processing**: Built on top of Apache Spark, Hadoop, and Delta Lake for scalable and resilient analytics.
- **📊 Advanced Analytics**: Analyze financial trends, compute indicators, perform backtesting, and build custom financial metrics.
- **📈 Interactive Visualization**: Visual dashboards powered by Grafana, Apache Superset, or Streamlit.
- **🧠 Query Engine**: Ask questions and get answers using a simple SQL-like interface or a natural language layer (NLQ) with optional LLM integration.
- **📡 Data APIs**: REST & GraphQL APIs to expose insights and dashboards to downstream systems or external apps.
- Market trend monitoring for trading teams
- Quantitative research and strategy testing
- Portfolio performance visualization
- Risk metrics computation
- Real-time financial data streaming and alerting
FinanceLake empowers users to unlock value from vast streams of financial and economic data. Here's what you can achieve:
- Track price movements, volatility, and trends across global markets
- Identify leading/lagging indicators to guide investment decisions
- Measure performance against custom benchmarks or indices
- Backtest strategies using historical tick/OHLCV data
- Generate buy/sell signals using technical indicators (RSI, MACD, EMA, …)
- Evaluate drawdown, Sharpe ratio, beta, and other risk metrics
- Build dynamic dashboards to monitor positions, portfolios, and KPIs
- Stream live feeds for asset prices, news sentiment, or macro indicators
- Trigger alerts on thresholds or anomalies (via webhook, email, Slack)
- Explore structured and unstructured financial data using SQL or natural language
- Query fundamentals, earnings, economic events, ESG scores, and more
- Slice and dice data per sector, geography, or custom segments
- Create custom dashboards for hedge funds, fintech apps, or research teams
- Feed data into machine learning pipelines (e.g., predictive models)
- Connect external systems (trading bots, ML models, BI tools) via API
- Add custom connectors to new data sources (e.g., crypto exchanges, alt-data)
- Contribute notebooks, indicators, or data visualizations
- Help shape the roadmap of an open, transparent financial platform
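As a small taste of the indicator and risk computations listed above, here is a self-contained sketch in plain Python. It is illustrative only, not FinanceLake's actual API: the function names, the sample prices, and the 252-trading-day annualization convention are all assumptions for the example.

```python
# Illustrative sketch only -- not FinanceLake's API. Shows two of the
# computations mentioned above: an exponential moving average (EMA) and
# an annualized Sharpe ratio, from a short list of daily closing prices.
import statistics

def ema(prices, span):
    """Exponential moving average with smoothing factor alpha = 2 / (span + 1)."""
    alpha = 2 / (span + 1)
    out = [prices[0]]  # seed with the first price
    for price in prices[1:]:
        out.append(alpha * price + (1 - alpha) * out[-1])
    return out

def sharpe_ratio(daily_returns, risk_free=0.0, periods=252):
    """Annualized Sharpe ratio from daily returns (assumes 252 trading days/year)."""
    excess = [r - risk_free for r in daily_returns]
    return (statistics.mean(excess) / statistics.stdev(excess)) * periods ** 0.5

closes = [100.0, 101.0, 99.0, 102.0, 104.0, 103.0, 105.0]  # made-up sample data
returns = [b / a - 1 for a, b in zip(closes, closes[1:])]
print(ema(closes, span=3)[-1])
print(sharpe_ratio(returns))
```

In a real deployment, the same computations would run over ingested OHLCV data in Spark rather than Python lists.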
Coming soon!
Coming soon!
You can set up FinanceLake by following our step-by-step instructions for either Docker Compose or Helm. Feel free to ask the community if you get stuck at any point.
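For Docker Compose, the infrastructure services would be declared in a compose file along the lines of the sketch below. The service names, images, and settings here are illustrative assumptions, not FinanceLake's shipped configuration; follow the project's own step-by-step instructions for the real file.

```yaml
# Hypothetical docker-compose.yml sketch -- images and settings are
# illustrative; use the project's official instructions for real values.
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:7.5.0
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
  kafka:
    image: confluentinc/cp-kafka:7.5.0
    depends_on: [zookeeper]
    environment:
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
```

A Helm deployment would define equivalent services as chart values instead.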
Please see the detailed usage instructions. Here's an overview of how to get started using FinanceLake.
Please read the contribution guidelines before making a contribution. The following docs list the resources you may need once you decide to contribute.
- Create an Issue: Report a bug or feature request to FinanceLake
- Submit a PR: Start with good first issues or issues with no assignees
- Join Mailing list: Initiate or participate in project discussions on the mailing list
- Write a Blog: Write a blog to share your use cases about FinanceLake
- Develop a Plugin: Integrate FinanceLake with more data sources as requested by the community
If you plan to contribute code to FinanceLake, we have instructions on how to get started with setting up your development environment.
One of the best ways to get started contributing is by improving FinanceLake's documentation.
- FinanceLake's documentation is hosted at FinanceLake
- We have a separate GitHub repository for FinanceLake's documentation: github.com/FinanceLake/financelake-docs
- Roadmap: Detailed roadmaps for FinanceLake.
Message us on Discord
Before running the project, configure your environment variables.
- Copy the `.env.example` file and create your own `.env` file:

  ```shell
  cp .env.example .env
  ```

- Edit the `.env` file and fill in your specific configuration:

  - `DB_HOST`: Database host (e.g., localhost)
  - `KAFKA_BROKER`: Kafka broker address
  - `SPARK_MASTER`: Spark master URL
  - `API_KEY`: Your API key
  - `DATA_SOURCE_URL`: URL to fetch data from
  - `RAW_DATA_PATH`: Path for storing raw data
  - `DASHBOARD_USER`: Dashboard login user
  - `LOG_LEVEL`: Logging level (e.g., INFO, DEBUG)
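A filled-in `.env` might look like the following. Every value here is an illustrative placeholder, not a default shipped with FinanceLake:

```shell
# Example .env -- all values are illustrative placeholders
DB_HOST=localhost
KAFKA_BROKER=localhost:9092
SPARK_MASTER=spark://localhost:7077
API_KEY=replace-with-your-api-key
DATA_SOURCE_URL=https://example.com/market-data
RAW_DATA_PATH=/data/raw
DASHBOARD_USER=admin
LOG_LEVEL=INFO
```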
- Start Kafka & ZooKeeper:

  ```shell
  bin/zookeeper-server-start.sh config/zookeeper.properties
  bin/kafka-server-start.sh config/server.properties
  ```
- Start HDFS
Make sure HDFS is running on `localhost:9000`.
- Start the Kafka producer:

  ```shell
  python producer.py
  ```
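  For orientation, a `producer.py` could look something like this minimal sketch. It is a hypothetical example, not FinanceLake's actual producer: the `kafka-python` client, the `stock_prices` topic name, and the `FINANCELAKE_PRODUCE` environment flag are all assumptions made for illustration.

  ```python
  # Hypothetical producer.py sketch -- FinanceLake's real producer may differ.
  # Builds JSON-encoded price ticks and, when explicitly enabled, publishes
  # them to Kafka with the kafka-python client (an assumption).
  import json
  import os
  import time

  def make_tick(symbol: str, price: float) -> bytes:
      """Serialize one price tick as the JSON payload for a Kafka message."""
      return json.dumps(
          {"symbol": symbol, "price": price, "ts": time.time()}
      ).encode("utf-8")

  # Guarded so the module can be imported without a running broker.
  if os.environ.get("FINANCELAKE_PRODUCE") == "1":
      from kafka import KafkaProducer  # pip install kafka-python (assumption)

      producer = KafkaProducer(
          bootstrap_servers=os.environ.get("KAFKA_BROKER", "localhost:9092")
      )
      producer.send("stock_prices", make_tick("AAPL", 189.42))  # hypothetical topic
      producer.flush()
  ```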
- Deploy the HDFS connector:

  ```shell
  curl -X POST -H "Content-Type: application/json" \
       --data @hdfs-sink.json http://localhost:8083/connectors
  ```
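  The `hdfs-sink.json` file referenced above is not shown in this guide. A typical Kafka Connect HDFS sink payload, assuming Confluent's HDFS sink connector and the hypothetical `stock_prices` topic, looks roughly like:

  ```json
  {
    "name": "hdfs-sink",
    "config": {
      "connector.class": "io.confluent.connect.hdfs.HdfsSinkConnector",
      "tasks.max": "1",
      "topics": "stock_prices",
      "hdfs.url": "hdfs://localhost:9000",
      "flush.size": "1000"
    }
  }
  ```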
