Releases: AtaCanYmc/ForexFactoryScrapper

v1.1.1 - ForexFactoryScrapper

07 Mar 14:51

ForexFactoryScrapper

CI Python License: MIT

ForexFactoryScrapper is a small Flask-based API that exposes scraping logic for several economic-calendar sources (ForexFactory, CryptoCraft, EnergyExch, MetalsMine).

What this repository provides:

  • Flask HTTP API endpoints returning JSON (or HTML for the root page)
  • Per-site scrapers under src/scrapper/ (site-specific logic)
  • A simple pytest test suite under tests/
  • A minimal OpenAPI spec (served at /openapi.json) and a Swagger UI at /swagger

Quick start

  1. Create and activate a virtual environment:

     python -m venv .venv
     source .venv/bin/activate

  2. Install dependencies:

     pip install -r requirements.txt

  3. Run the app locally:

     python main.py
     # or
     python src/app.py

By default the app listens on 0.0.0.0:5000. You can configure HOST, PORT, and DEBUG via environment variables or a .env file (loaded with python-dotenv when it is installed).

Open the welcome page in your browser: http://localhost:5000/
Open API docs: http://localhost:5000/swagger
Open raw OpenAPI JSON: http://localhost:5000/openapi.json
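Reading that configuration takes only a few lines. The sketch below uses just the standard library and assumes the HOST/PORT/DEBUG names listed later in this document; it is an illustration, not the code in src/app.py (python-dotenv, when installed, populates os.environ the same way before this runs).

```python
import os

def load_config() -> dict:
    """Read HOST, PORT and DEBUG from the environment, falling back to
    the documented defaults (0.0.0.0:5000, debug enabled)."""
    debug_raw = os.environ.get("DEBUG", "True")
    return {
        "host": os.environ.get("HOST", "0.0.0.0"),
        "port": int(os.environ.get("PORT", "5000")),
        # Accept common truthy spellings: "1", "true", "yes", "on"
        "debug": debug_raw.strip().lower() in ("1", "true", "yes", "on"),
    }
```

With no variables set, load_config() returns the defaults; setting PORT=8080 and DEBUG=no before launch changes the bind port and disables debug mode.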


Available endpoints

  • GET / — Welcome HTML page (quick links)
  • GET /api/hello — simple hello response
  • GET /api/health — quick health check
  • GET /api/forex/daily — ForexFactory daily events (query params: day, month, year, optional limit, offset)
  • GET /api/cryptocraft/daily — CryptoCraft daily events (same parameters)
  • GET /api/energyexch/daily — EnergyExch daily events (same parameters)
  • GET /api/metalsmine/daily — MetalsMine daily events (same parameters)

All /.../daily endpoints follow the same validation and paging semantics:

  • Required query parameters: day, month, year (integers)
  • Optional limit and offset (integers, >= 0)
  • On success, list results are wrapped in a pagination object: { total, offset, limit, results }.
  • On parameter validation error, endpoints return HTTP 400 with JSON: { "error": "..." }.
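The validation and paging semantics above can be sketched as one pure helper, independent of Flask (the real route code lives in src/app.py; the function below is an illustration, and the error strings for limit/offset are hypothetical wording, not necessarily the app's exact messages — range checks on day/month/year are omitted for brevity):

```python
def paginate_daily(args: dict, records: list):
    """Validate day/month/year plus optional limit/offset, then wrap list
    results the way the /.../daily endpoints do. Returns (body, status)."""
    try:
        day, month, year = (int(args[k]) for k in ("day", "month", "year"))
    except KeyError:
        return {"error": "Missing one or more required parameters: day, month, year"}, 400
    except ValueError:
        return {"error": "Parameters day, month and year must be integers"}, 400

    try:
        offset = int(args.get("offset", 0))
        limit = args.get("limit")
        limit = int(limit) if limit is not None else None
    except ValueError:
        return {"error": "limit and offset must be integers"}, 400
    if offset < 0 or (limit is not None and limit < 0):
        return {"error": "limit and offset must be >= 0"}, 400

    # Apply offset first, then limit, and wrap in the pagination object.
    page = records[offset:] if limit is None else records[offset:offset + limit]
    return {"total": len(records), "offset": offset, "limit": limit, "results": page}, 200
```

For example, paginate_daily({"day": "1", "month": "1", "year": "2020", "limit": "2"}, [a, b, c, d]) yields a 200 wrapper with total 4 and the first two records; omitting day/month/year yields a 400 error body.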

OpenAPI / Swagger

  • The OpenAPI document is available at /openapi.json and is generated from src/openapi_spec.py.
  • The interactive Swagger UI is served at /swagger and uses the OpenAPI JSON. If your environment blocks external CDN assets, the UI falls back to an inline minimal page.

If you update endpoints or schemas, please update src/openapi_spec.py accordingly so the docs stay accurate.


Environment variables

  • HOST — host to bind (default 0.0.0.0)
  • PORT — port to bind (default 5000)
  • DEBUG — debug mode (default True)
  • DOTENV_PATH — optional path to a .env file

Tests

Run tests with:

python -m pytest -q

Tests are under tests/ and use pytest and the Flask test client. Many tests monkeypatch src.app and main to avoid network calls.
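pytest's monkeypatch fixture performs an attribute swap and restores it automatically after the test; the dependency-free sketch below shows the same pattern by hand, with a hypothetical stand-in scraper (the real tests patch src.app and main, not these names):

```python
# Hypothetical stand-in for a scraper that would normally hit the network.
def get_records(url):
    raise RuntimeError("network disabled in tests")

def fetch_daily():
    # Imagine this is the route helper under test.
    return get_records("https://example.invalid/calendar")

def test_fetch_daily_is_isolated():
    fake = [{"Event": "NFP", "Actual": "120k"}]
    g = fetch_daily.__globals__
    original = g["get_records"]
    g["get_records"] = lambda url: fake   # install the fake scraper
    try:
        assert fetch_daily() == fake      # no network call is made
    finally:
        g["get_records"] = original       # restore, as monkeypatch does automatically
```

In the actual suite, monkeypatch.setattr(module, "get_records", fake) replaces these manual try/finally steps.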


Docker

A Dockerfile is provided for convenience; if you prefer to run inside Docker, build and run the image as usual (adjust ports as needed).


Contributing

Contributions welcome. Suggested workflow:

  1. Create a branch for your change
  2. Add tests for any behavior you modify
  3. Run the full test suite
  4. Open a pull request describing the change

If you modify or add a new scraper under src/scrapper/, try to keep the get_records(url) and get_url(day, month, year, timeline) function signatures so the route helpers can call them interchangeably.
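A new scraper can follow that two-function shape. Everything in the sketch below (the URL scheme, the host, the parser) is hypothetical and only illustrates the expected signatures:

```python
import urllib.request

MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]

def get_url(day, month, year, timeline="day"):
    """Build the calendar URL for one date. The query format here is a
    made-up example, not any real site's scheme."""
    return f"https://example.invalid/calendar?{timeline}={MONTHS[month - 1]}{day}.{year}"

def get_records(url):
    """Fetch the page and return a list of event dicts. Parsing is stubbed
    out here; a real scraper would extract rows with BeautifulSoup."""
    with urllib.request.urlopen(url) as resp:
        html = resp.read().decode("utf-8")
    return parse_rows(html)

def parse_rows(html):
    return []  # site-specific parsing goes here
```

Keeping these signatures lets the route helpers call any scraper interchangeably: they build the URL with get_url(...) and pass it straight to get_records(...).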

Code of Conduct: Please read CODE_OF_CONDUCT.md before contributing — it describes expected behaviour and reporting contacts.


Contact

Maintainer: Ata Can — atacanymc@gmail.com

v1.1.0 - ForexFactoryScrapper

23 Feb 04:08

ForexFactoryScrapper

ForexFactoryScrapper is a Python-based web scraping tool designed to extract financial event data from the ForexFactory website. This project provides a simple and effective way to scrape calendar events, forecast data, actual values, and other relevant information for forex trading analysis.

Features

  • Scrape calendar events, including date, time, currency, event name, forecast, actual, and previous values.
  • Export or process extracted data in structured formats suitable for analysis.
  • Simple and customizable scraping logic using BeautifulSoup.
  • Includes examples for extracting data and creating basic reports.

Requirements

  • Python 3.9 or newer
  • See requirements.txt for dependency versions used during development and testing.

Installation

  1. Create and activate a virtual environment:

     python -m venv .venv
     source .venv/bin/activate

  2. Install dependencies:

     pip install -r requirements.txt

Running locally

Start the application locally:

python app.py

By default this will start the app on 0.0.0.0:5000. Example endpoints you can call:

  • GET /api/hello
  • GET /api/health
  • GET /api/forex/daily?day=1&month=1&year=2020

(Adjust host/port or endpoint parameters as needed in main.py.)

Example requests

Below are simple example requests you can use to interact with the running application. Replace localhost:5000 with the host/port where your app is listening if different.

1) Hello

Curl:

curl -sS http://localhost:5000/api/hello

Expected JSON response (HTTP 200):

{
  "message": "Hello, World!",
  "status": "success"
}

2) Health

Curl:

curl -sS http://localhost:5000/api/health

Expected JSON response (HTTP 200):

{
  "status": "ok"
}

3) Forex daily — missing or invalid parameters

  • Missing parameters (HTTP 400):

    curl -sS http://localhost:5000/api/forex/daily

    Response body:

    { "error": "Missing one or more required parameters: day, month, year" }

  • Invalid (non-integer) parameters (HTTP 400):

    curl -sS "http://localhost:5000/api/forex/daily?day=aa&month=bb&year=cc"

    Response body:

    { "error": "Parameters day, month and year must be integers" }

  • Out-of-range parameters (HTTP 400):

    curl -sS "http://localhost:5000/api/forex/daily?day=99&month=99&year=3000"

    Response body:

    { "error": "Parameters out of reasonable range" }

4) Forex daily — success

Curl (example):

curl -sS "http://localhost:5000/api/forex/daily?day=1&month=1&year=2020"

Expected JSON response (HTTP 200): a JSON array of records. Example record format:

[
  {
    "Time": "01/01/2020 00:00",
    "Currency": "USD",
    "Event": "NFP",
    "Forecast": "100k",
    "Actual": "120k",
    "Previous": "90k"
  }
]

Python requests example:

import requests

resp = requests.get(
    'http://localhost:5000/api/forex/daily',
    params={'day': 1, 'month': 1, 'year': 2020},
)
print(resp.status_code)
print(resp.json())

5) Forex daily — paging (limit & offset)

This release adds optional paging support to the /api/forex/daily endpoint via two query parameters: limit and offset.

  • offset (optional): integer >= 0, default 0. Skip this many records from the start.
  • limit (optional): integer >= 0, default is unlimited. Return at most this many records after applying the offset.

Behavior and validation:

  • Both limit and offset must be integers. Non-integer values return HTTP 400.
  • Negative values return HTTP 400.
  • If offset is greater than or equal to the number of available records, the endpoint returns an empty list and HTTP 200.
  • limit=0 returns an empty list (valid request).
  • If the scraper returns a non-list structure, paging is not applied and the raw response is returned.

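The rules above amount to a small slice helper. The function below is an illustration of the documented behavior, not the code in main.py; note the plain-list return, matching this release's unwrapped response format:

```python
def apply_paging(records, limit=None, offset=0):
    """Slice a list per the documented limit/offset rules. Raises
    ValueError for input the endpoint would reject with HTTP 400."""
    if not isinstance(records, list):
        return records                # non-list scraper output passes through untouched
    offset = int(offset)              # non-integer -> ValueError -> HTTP 400
    if offset < 0:
        raise ValueError("offset must be >= 0")
    if limit is None:
        return records[offset:]       # no limit: everything after the offset
    limit = int(limit)
    if limit < 0:
        raise ValueError("limit must be >= 0")
    return records[offset:offset + limit]
```

For instance, apply_paging(list(range(10)), limit=3, offset=4) returns [4, 5, 6]; an offset past the end of the list returns [], as does limit=0.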
Examples:

  • First 10 records:

    curl -sS "http://localhost:5000/api/forex/daily?day=1&month=1&year=2020&limit=10"

  • Start from the 5th record and return up to 3 records:

    curl -sS "http://localhost:5000/api/forex/daily?day=1&month=1&year=2020&offset=4&limit=3"

  • Non-integer or negative paging parameters (HTTP 400):

    curl -sS "http://localhost:5000/api/forex/daily?day=1&month=1&year=2020&limit=abc"
    curl -sS "http://localhost:5000/api/forex/daily?day=1&month=1&year=2020&offset=-1"

Notes and suggestions:

  • There is no enforced maximum limit in the current implementation. For production use you may want to cap limit (for example 500 or 1000) to avoid large responses or memory spikes.
  • Consider returning a pagination wrapper like { "total": N, "offset": X, "limit": Y, "results": [...] } if clients benefit from metadata. Current response remains a plain JSON array for backward compatibility.

Notes:

  • The exact fields and values depend on the parser and target site's HTML structure. When running the real scraper, values reflect what is parsed from ForexFactory for the given date.
  • The examples above match the app behavior implemented in main.py and the test fixtures in tests/test_app.py.

Tests

Run the test suite with pytest:

pytest -q

Unit tests are located in the tests/ folder. Network calls and external dependencies are isolated using monkeypatching to keep tests deterministic.

Notes and caveats

  • The scraper depends on the target site's HTML structure. If ForexFactory changes its markup, the parsing code will need updating.
  • requirements.txt pins versions that were used during development; consider updating or pinning further for deployments.
  • Respect the target site's robots.txt and terms of service when scraping.

Contributing

Contributions, bug reports, and feature requests are welcome. Please open an issue or a pull request.

License

This project is licensed under the MIT License — see the LICENSE file for details.