
NeuralFusionCore: Direct Portfolio Weight Forecasting with Cross‑Gated Attention Fusion

This variant directly forecasts portfolio weights using multi‑modal inputs (news + OHLCV) and fuses the streams with Cross‑Gated Attention (CGA).
CGA lets each stream attend to the other via gates that modulate information flow, improving robustness over naive concatenation.



Architecture Overview

  • Timeframe: 3‑minute bars
  • Input Window: 80 timestamps (~4 hours)
  • Prediction Horizon: next 80 timestamps (~4 hours)
  • Assets: configurable universe
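
For concreteness, a minimal sketch of how these settings might be expressed in configuration; the variable names below are illustrative assumptions, not necessarily those in config.py:

# Illustrative configuration values (hypothetical names; see config.py
# for the repository's actual settings).
BAR_INTERVAL_MIN = 3                 # 3-minute bars
INPUT_WINDOW = 80                    # 80 timestamps ≈ 4 hours of history
PREDICTION_HORIZON = 80              # next 80 timestamps ≈ 4 hours ahead
ASSETS = ["AAPL", "MSFT", "GOOG"]    # configurable universe (example tickers)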

Encoders

  1. News stream (single LSTM)

    • Each article → BigBird embedding
    • Average embeddings of all articles per 3-min window
    • If no news: use learned [NO_NEWS] embedding
    • Coverage one-hot (which stocks are mentioned) is concatenated to the news embedding at each timestamp
    • The sequence is fed to one LSTM → produces news sequence embedding
  2. OHLCV stream (TimesNet)

    • A TimesNetBlock processes per-asset OHLCV sequences → produces market embedding
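
A minimal PyTorch sketch of the news-stream encoder described above; the OHLCV stream would analogously pass per-asset sequences through a TimesNet block from the Time-Series-Library. All names, dimensions, and defaults here are illustrative assumptions, not the repository's code:

import torch
import torch.nn as nn

class NewsEncoder(nn.Module):
    """News stream sketch: averaged BigBird embeddings per window,
    a learned [NO_NEWS] vector, coverage one-hot, then a single LSTM."""
    def __init__(self, emb_dim=768, n_assets=50, hidden=128):
        super().__init__()
        self.no_news = nn.Parameter(torch.zeros(emb_dim))  # learned [NO_NEWS]
        self.lstm = nn.LSTM(emb_dim + n_assets, hidden, batch_first=True)

    def forward(self, news_emb, has_news, coverage):
        # news_emb:  (B, T, emb_dim)  averaged article embeddings per 3-min window
        # has_news:  (B, T, 1)        1 where the window contains at least one article
        # coverage:  (B, T, n_assets) one-hot of mentioned stocks
        x = torch.where(has_news.bool(), news_emb, self.no_news)  # fill empty windows
        x = torch.cat([x, coverage], dim=-1)                      # append coverage
        out, _ = self.lstm(x)
        return out  # (B, T, hidden) news sequence embedding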

Fusion — Cross‑Gated Attention (CGA)

  • Let N be the news embedding and M the market (OHLCV) embedding
  • Compute cross‑attention in both directions (N→M and M→N)
  • Apply gates (sigmoid/tanh) to the attended features before adding residuals:
$$\tilde{N} = g_N \odot \text{Attn}(N \rightarrow M) + (1-g_N) \odot N$$

$$\tilde{M} = g_M \odot \text{Attn}(M \rightarrow N) + (1-g_M) \odot M$$
  • Concatenate or sum $\tilde{N}$ and $\tilde{M}$ to form the fused embedding
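
A compact sketch of the fusion step under one plausible reading of the equations above, where each gate is a sigmoid over the concatenation of the residual and the attended features. Shapes and defaults are assumptions, and both streams are assumed to share the same sequence length:

import torch
import torch.nn as nn

class CrossGatedAttention(nn.Module):
    """One direction of CGA: queries from x attend to y, and a sigmoid
    gate mixes the attended features with the residual stream."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, x, y):
        attended, _ = self.attn(query=x, key=y, value=y)            # Attn(x -> y)
        g = torch.sigmoid(self.gate(torch.cat([x, attended], dim=-1)))
        return g * attended + (1 - g) * x                           # gated residual

class CGAFusion(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.n2m = CrossGatedAttention(dim)   # news attends to market
        self.m2n = CrossGatedAttention(dim)   # market attends to news

    def forward(self, news, market):
        n_tilde = self.n2m(news, market)
        m_tilde = self.m2n(market, news)
        return torch.cat([n_tilde, m_tilde], dim=-1)  # fused embedding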

Output Head

  • A linear layer maps the fused embedding to portfolio weights for all assets

Training Objective

The model uses a top-k long/short portfolio construction and optimizes a risk-adjusted return loss with regularization.

Let:

  • $w \in \mathbb{R}^N$ be the portfolio weights computed from logits
  • $R \in \mathbb{R}^{H \times N}$ be the returns matrix for a batch (H time steps, N assets)
  • $k$ be the number of assets to select for active trading
  • $\epsilon$ a small constant for numerical stability

Portfolio Weighting (Top-k Long/Short)

Weights are computed as:

$$ w = \text{topk\_long\_short\_abs}(\text{logits}, k) $$

The function topk_long_short_abs keeps only the k logits with the largest absolute values, zeros out the rest, and normalizes so that the absolute weights sum to one:

$$ w_i = \begin{cases} \dfrac{\text{sign}(\text{logits}_i) \cdot |\text{logits}_i|}{\sum_{j \in \text{top-k}} |\text{logits}_j|}, & i \in \text{top-k} \\ 0, & \text{otherwise} \end{cases} $$
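
A sketch of how topk_long_short_abs might be implemented (PyTorch; this follows the formula above but is not necessarily the repository's exact code):

import torch

def topk_long_short_abs(logits, k, eps=1e-8):
    """Keep the k largest-|logit| assets, zero the rest, and normalize
    so the absolute weights sum to one (signs give long/short side)."""
    _, idx = torch.topk(logits.abs(), k, dim=-1)
    mask = torch.zeros_like(logits).scatter_(-1, idx, 1.0)
    selected = logits * mask
    return selected / (selected.abs().sum(dim=-1, keepdim=True) + eps)

For example, with logits [0.5, -2.0, 0.1, 1.5] and k = 2, assets 1 and 3 are selected, giving weights [0, -2.0/3.5, 0, 1.5/3.5] ≈ [0, -0.57, 0, 0.43].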


Sharpe Ratio Loss (maximize risk-adjusted return)

Portfolio returns:

$$ R_p = \sum_{i=1}^{N} w_i \cdot R_i $$

Sharpe ratio:

$$ \text{Sharpe} = \frac{\mathbb{E}[R_p]}{\sqrt{\text{Var}(R_p) + \epsilon}} $$

Sharpe loss:

$$ \mathcal{L}_{\text{Sharpe}} = - \text{Sharpe} $$
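
As a sketch, the same loss in differentiable form (shapes as defined above):

import torch

def sharpe_loss(weights, returns, eps=1e-8):
    # weights: (N,)   portfolio weights
    # returns: (H, N) asset returns over H time steps
    port_ret = returns @ weights                            # R_p, shape (H,)
    sharpe = port_ret.mean() / torch.sqrt(port_ret.var() + eps)
    return -sharpe                                          # maximize Sharpe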


Regularization Terms

  1. Distribution regularizer (prevents concentration):

$$ \mathcal{L}_{\text{dist}} = \lambda_{\text{div}} \cdot \frac{1}{N} \sum_{i=1}^N w_i^2 $$

  2. Net exposure regularizer (encourages market-neutral portfolio):

$$ \mathcal{L}_{\text{net}} = \lambda_{\text{net}} \cdot \left(\sum_{i=1}^N w_i \right)^2 $$

  3. Turnover regularizer (optional, penalizes large changes in weights):

$$ \mathcal{L}_{\text{turnover}} = \lambda_{\text{turnover}} \cdot \sum_{i=1}^N | w_i - w_i^{\text{prev}} | $$


Total Loss

The total loss optimized:

$$ \mathcal{L}_{\text{total}} = \mathcal{L}_{\text{Sharpe}} + \mathcal{L}_{\text{dist}} + \mathcal{L}_{\text{net}} + \mathcal{L}_{\text{turnover}} $$

  • The turnover regularizer $\mathcal{L}_{\text{turnover}}$ is applied only when previous weights $w^{\text{prev}}$ are provided.
  • $\lambda_{\text{div}}, \lambda_{\text{net}}, \lambda_{\text{turnover}}$ control the regularization strength.
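
Putting the pieces together, a minimal sketch of the full objective (names are illustrative; the regularizers follow the formulas above):

import torch

def total_loss(weights, returns, lambda_div, lambda_net,
               lambda_turnover=0.0, w_prev=None, eps=1e-8):
    port_ret = returns @ weights
    sharpe = port_ret.mean() / torch.sqrt(port_ret.var() + eps)
    loss = -sharpe                                         # L_Sharpe
    loss = loss + lambda_div * weights.pow(2).mean()       # L_dist
    loss = loss + lambda_net * weights.sum().pow(2)        # L_net
    if w_prev is not None:                                 # L_turnover (optional)
        loss = loss + lambda_turnover * (weights - w_prev).abs().sum()
    return loss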

Repository Layout

NeuralFusionCore/
     ├── data/
     │   ├── outputs/
     │   │   └── model_weights.pt        
     │   └── processed/
     │       └── show_files.py                   
     │   
     ├── lib/
     │   ├── backtest.py
     │   ├── backtest_weights.py        
     │   ├── dataset.py
     │   ├── loss_weights.py            
     │   ├── model.py
     │   ├── train.py
     │   └── utils.py
     ├── __init__.py
     ├── README.md
     ├── requirements.txt
     ├── config.py
     └── scripts/
          ├── train_service.py
          ├── finetune_service.py
          ├── prediction_service.py 
          ├── backtesting_service.py
          └── api_service.py

Any folders missing on your machine will be created by the scripts if needed.


Setup

# Clone repository
git clone https://github.com/Novoxpert/NeuralFusionCore.git
cd NeuralFusionCore


# (optional) create a virtual environment
python -m venv .venv

# Linux/macOS:
source .venv/bin/activate

# Windows (PowerShell):
.\.venv\Scripts\Activate.ps1

# install exact dependencies
pip install -r requirements.txt

Script Cheat‑Sheet

  • lib/*.py — internal modules for datasets, models, training loops, utilities, and backtesting, specialized for direct weight forecasting.
  • config.py — central configuration / argument helpers used by the scripts.
  • scripts/train_service.py — Train from scratch on processed/train.parquet and processed/val.parquet. Usage example:
python -m scripts.train_service --epochs 50
  • scripts/finetune_service.py — Fine-tune an existing saved model on the latest features. If validation loss improves, the saved model is replaced and the previous version is kept with a timestamp.

Usage Example:

python -m scripts.finetune_service --epochs 10 --save_best
  • scripts/prediction_service.py — Scheduled inference: fetch the latest data, compute features, run the model, transform logits into portfolio weights, and save predictions to MongoDB and Redis (see the storage sketch after this list).

Usage Example:

python -m scripts.prediction_service --hours 4 
  • scripts/backtesting_service.py — Backtesting and model-evaluation service for the market-news fusion model.

Usage Example:

python -m scripts.backtesting_service --epochs 50 --mode fetch --hours 12 
  • scripts/api_service.py — Serves an API for retrieving NeuralFusionCore weights from MongoDB.
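
As an illustration of the storage step in prediction_service.py, a minimal sketch using pymongo and redis-py; the connection details, database/collection names, key names, and document schema below are assumptions, not the repository's actual layout:

import json
from datetime import datetime, timezone

import redis
from pymongo import MongoClient

def save_weights(weights):
    """Persist one prediction to MongoDB and cache the latest in Redis.
    All names and connection details here are hypothetical."""
    doc = {"timestamp": datetime.now(timezone.utc), "weights": weights}
    MongoClient("mongodb://localhost:27017")["neuralfusion"]["predictions"].insert_one(doc)
    r = redis.Redis(host="localhost", port=6379)
    r.set("neuralfusion:latest_weights", json.dumps(weights))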

Pipeline (Direct Weights)

1) run data_ingest_service
2) run features_service
3) run train_service
4) run prediction_service

Dependencies

  • Python 3.12+
  • PyTorch 2.x
  • Hugging Face transformers (BigBird)

Outputs

  • Predicted weights per timestamp
  • Performance metrics (see the sketch below):
    • Sharpe ratio
    • Cumulative P&L
    • Max Drawdown
    • Turnover
  • Plots: equity curve, rolling Sharpe, weights heatmap
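
A sketch of how these metrics could be computed from per-step portfolio returns and the weight history (NumPy; function and variable names are illustrative):

import numpy as np

def performance_metrics(port_returns, weights_seq, eps=1e-8):
    # port_returns: (T,)   per-step portfolio returns
    # weights_seq:  (T, N) predicted weights at each timestamp
    equity = np.cumprod(1.0 + port_returns)                  # equity curve
    sharpe = port_returns.mean() / (port_returns.std() + eps)
    drawdown = 1.0 - equity / np.maximum.accumulate(equity)
    turnover = np.abs(np.diff(weights_seq, axis=0)).sum(axis=1).mean()
    return {
        "sharpe": sharpe,
        "cumulative_pnl": equity[-1] - 1.0,
        "max_drawdown": drawdown.max(),
        "turnover": turnover,
    }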


Notes

  • CGA allows directional, gated cross‑attention between news and market signals
  • The distribution loss helps prevent one-asset collapse
  • The total loss combines Sharpe-ratio maximization with the regularization terms

Appendix

Upstream Repositories

Influential upstream repositories:

  • BigBird: A sparse-attention transformer model enabling efficient processing of longer sequences
  • finBERT: A pre-trained NLP model fine-tuned for financial sentiment analysis
  • Time-Series-Library (TSlib): Library providing deep learning-based time series analysis, covering forecasting, anomaly detection, and classification

Inspiration

This work is inspired by the article:


Authors & Citation

Developed by the Novoxpert Research Team
If you use this repository or build upon our work, please cite:

Novoxpert Research (2025). NeuralFusionCore: Direct Portfolio Weight Forecasting with Cross-Gated Attention Fusion.
GitHub: https://github.com/Novoxpert/NeuralFusionCore

@software{novoxpert_neuralfusioncore_2025,
  author       = {Elham Esmaeilnia and Hamidreza Naeini},
  title        = {NeuralFusionCore: Direct Portfolio Weight Forecasting with Cross-Gated Attention Fusion},
  organization = {Novoxpert Research},
  year         = {2025},
  url          = {https://github.com/Novoxpert/NeuralFusionCore}
}
