
NeuralFusionCore: Direct Portfolio Weight Forecasting with Cross‑Gated Attention Fusion

This variant directly forecasts portfolio weights using multi‑modal inputs (news + OHLCV) and fuses the streams with Cross‑Gated Attention (CGA).
CGA lets each stream attend to the other via gates that modulate information flow, improving robustness over naive concatenation.



Architecture Overview

  • Timeframe: 3‑minute bars
  • Input Window: 80 timestamps (~4 hours)
  • Prediction Horizon: next 80 timestamps (~4 hours)
  • Assets: configurable universe
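
For concreteness, a minimal sketch of how these settings might be expressed in configuration; the variable names below are illustrative assumptions, not necessarily those in config.py:

# Illustrative configuration values (hypothetical names; see config.py
# for the repository's actual settings).
BAR_INTERVAL_MIN = 3                 # 3-minute bars
INPUT_WINDOW = 80                    # 80 timestamps ≈ 4 hours of history
PREDICTION_HORIZON = 80              # next 80 timestamps ≈ 4 hours ahead
ASSETS = ["AAPL", "MSFT", "GOOG"]    # configurable universe (example tickers)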

Encoders

  1. News stream (single LSTM)

    • Each article → BigBird embedding
    • Average embeddings of all articles per 3-min window
    • If no news: use learned [NO_NEWS] embedding
    • Coverage one-hot (which stocks are mentioned) is concatenated to the news embedding at each timestamp
    • The sequence is fed to one LSTM → produces news sequence embedding
  2. OHLCV stream (TimesNet)

    • A TimesNetBlock processes per-asset OHLCV sequences → produces market embedding
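
A minimal PyTorch sketch of the news-stream encoder described above; the OHLCV stream would analogously pass per-asset sequences through a TimesNet block from the Time-Series-Library. All names, dimensions, and defaults here are illustrative assumptions, not the repository's code:

import torch
import torch.nn as nn

class NewsEncoder(nn.Module):
    """News stream sketch: averaged BigBird embeddings per window,
    a learned [NO_NEWS] vector, coverage one-hot, then a single LSTM."""
    def __init__(self, emb_dim=768, n_assets=50, hidden=128):
        super().__init__()
        self.no_news = nn.Parameter(torch.zeros(emb_dim))  # learned [NO_NEWS]
        self.lstm = nn.LSTM(emb_dim + n_assets, hidden, batch_first=True)

    def forward(self, news_emb, has_news, coverage):
        # news_emb:  (B, T, emb_dim)  averaged article embeddings per 3-min window
        # has_news:  (B, T, 1)        1 where the window contains at least one article
        # coverage:  (B, T, n_assets) one-hot of mentioned stocks
        x = torch.where(has_news.bool(), news_emb, self.no_news)  # fill empty windows
        x = torch.cat([x, coverage], dim=-1)                      # append coverage
        out, _ = self.lstm(x)
        return out  # (B, T, hidden) news sequence embedding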

Fusion — Cross‑Gated Attention (CGA)

  • Let N be the news embedding and M the market (OHLCV) embedding
  • Compute cross‑attention in both directions (N→M and M→N)
  • Apply gates (sigmoid/tanh) to the attended features before adding residuals:
$$\tilde{N} = g_N \odot \text{Attn}(N \rightarrow M) + (1-g_N) \odot N$$

$$\tilde{M} = g_M \odot \text{Attn}(M \rightarrow N) + (1-g_M) \odot M$$
  • Concatenate or sum $\tilde{N}$ and $\tilde{M}$ to form the fused embedding
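
A compact sketch of the fusion step under one plausible reading of the equations above, where each gate is a sigmoid over the concatenation of the residual and the attended features. Shapes and defaults are assumptions, and both streams are assumed to share the same sequence length:

import torch
import torch.nn as nn

class CrossGatedAttention(nn.Module):
    """One direction of CGA: queries from x attend to y, and a sigmoid
    gate mixes the attended features with the residual stream."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, x, y):
        attended, _ = self.attn(query=x, key=y, value=y)            # Attn(x -> y)
        g = torch.sigmoid(self.gate(torch.cat([x, attended], dim=-1)))
        return g * attended + (1 - g) * x                           # gated residual

class CGAFusion(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.n2m = CrossGatedAttention(dim)   # news attends to market
        self.m2n = CrossGatedAttention(dim)   # market attends to news

    def forward(self, news, market):
        n_tilde = self.n2m(news, market)
        m_tilde = self.m2n(market, news)
        return torch.cat([n_tilde, m_tilde], dim=-1)  # fused embedding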

Output Head

  • A linear layer maps the fused embedding to portfolio weights for all assets

Training Objective

The model uses a top-k long/short portfolio construction and optimizes a risk-adjusted return loss with regularization.

Let:

  • $w \in \mathbb{R}^N$ be the portfolio weights computed from logits
  • $R \in \mathbb{R}^{H \times N}$ be the returns matrix for a batch (H time steps, N assets)
  • $k$ be the number of assets to select for active trading
  • $\epsilon$ a small constant for numerical stability

Portfolio Weighting (Top-k Long/Short)

Weights are computed as:

$$ w = \text{topk\_long\_short\_abs}(\text{logits}, k) $$

The function topk_long_short_abs keeps only the k logits with the largest absolute values, zeros out the rest, and normalizes so that the absolute weights sum to one:

$$ w_i = \begin{cases} \dfrac{\text{sign}(\text{logits}_i) \cdot |\text{logits}_i|}{\sum_{j \in \text{top-k}} |\text{logits}_j|}, & i \in \text{top-k} \\ 0, & \text{otherwise} \end{cases} $$
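
A sketch of how topk_long_short_abs might be implemented (PyTorch; this follows the formula above but is not necessarily the repository's exact code):

import torch

def topk_long_short_abs(logits, k, eps=1e-8):
    """Keep the k largest-|logit| assets, zero the rest, and normalize
    so the absolute weights sum to one (signs give long/short side)."""
    _, idx = torch.topk(logits.abs(), k, dim=-1)
    mask = torch.zeros_like(logits).scatter_(-1, idx, 1.0)
    selected = logits * mask
    return selected / (selected.abs().sum(dim=-1, keepdim=True) + eps)

For example, with logits [0.5, -2.0, 0.1, 1.5] and k = 2, assets 1 and 3 are selected, giving weights [0, -2.0/3.5, 0, 1.5/3.5] ≈ [0, -0.57, 0, 0.43].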


Sharpe Ratio Loss (maximize risk-adjusted return)

Portfolio returns:

$$ R_p = \sum_{i=1}^{N} w_i \cdot R_i $$

Sharpe ratio:

$$ \text{Sharpe} = \frac{\mathbb{E}[R_p]}{\sqrt{\text{Var}(R_p) + \epsilon}} $$

Sharpe loss:

$$ \mathcal{L}_{\text{Sharpe}} = - \text{Sharpe} $$
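
As a sketch, the same loss in differentiable form (shapes as defined above):

import torch

def sharpe_loss(weights, returns, eps=1e-8):
    # weights: (N,)   portfolio weights
    # returns: (H, N) asset returns over H time steps
    port_ret = returns @ weights                            # R_p, shape (H,)
    sharpe = port_ret.mean() / torch.sqrt(port_ret.var() + eps)
    return -sharpe                                          # maximize Sharpe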


Regularization Terms

  1. Distribution regularizer (prevents concentration):

$$ \mathcal{L}_{\text{dist}} = \lambda_{\text{div}} \cdot \frac{1}{N} \sum_{i=1}^N w_i^2 $$

  2. Net exposure regularizer (encourages market-neutral portfolio):

$$ \mathcal{L}_{\text{net}} = \lambda_{\text{net}} \cdot \left(\sum_{i=1}^N w_i \right)^2 $$

  3. Turnover regularizer (optional, penalizes large changes in weights):

$$ \mathcal{L}_{\text{turnover}} = \lambda_{\text{turnover}} \cdot \sum_{i=1}^N | w_i - w_i^{\text{prev}} | $$


Total Loss

The total loss optimized:

$$ \mathcal{L}_{\text{total}} = \mathcal{L}_{\text{Sharpe}} + \mathcal{L}_{\text{dist}} + \mathcal{L}_{\text{net}} + \mathcal{L}_{\text{turnover}} $$

  • The turnover regularizer $\mathcal{L}_{\text{turnover}}$ is applied only when previous weights $w^{\text{prev}}$ are provided.
  • $\lambda_{\text{div}}, \lambda_{\text{net}}, \lambda_{\text{turnover}}$ control the regularization strength.
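
Putting the pieces together, a minimal sketch of the full objective (names are illustrative; the regularizers follow the formulas above):

import torch

def total_loss(weights, returns, lambda_div, lambda_net,
               lambda_turnover=0.0, w_prev=None, eps=1e-8):
    port_ret = returns @ weights
    sharpe = port_ret.mean() / torch.sqrt(port_ret.var() + eps)
    loss = -sharpe                                         # L_Sharpe
    loss = loss + lambda_div * weights.pow(2).mean()       # L_dist
    loss = loss + lambda_net * weights.sum().pow(2)        # L_net
    if w_prev is not None:                                 # L_turnover (optional)
        loss = loss + lambda_turnover * (weights - w_prev).abs().sum()
    return loss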

Repository Layout

NeuralFusionCore/
     ├── data/
     │   ├── outputs/
     │   │   └── model_weights.pt        
     │   └── processed/
     │       └── show_files.py                   
     │   
     ├── lib/
     │   ├── backtest.py
     │   ├── backtest_weights.py        
     │   ├── dataset.py
     │   ├── loss_weights.py            
     │   ├── model.py
     │   ├── train.py
     │   └── utils.py
     ├── __init__.py
     ├── README.md
     ├── requirements.txt
     ├── config.py
     └── scripts/
          ├── train_service.py
          ├── finetune_service.py
          ├── prediction_service.py 
          ├── backtesting_service.py
          └── api_service.py

Any folders missing on your machine will be created by the scripts if needed.


Setup

# Clone repository
git clone https://github.com/Novoxpert/NeuralFusionCore.git
cd NeuralFusionCore


# (optional) create a virtual environment
python -m venv .venv

# Linux/macOS:
source .venv/bin/activate

# Windows (PowerShell):
.\.venv\Scripts\Activate.ps1

# install exact dependencies
pip install -r requirements.txt

Script Cheat‑Sheet

  • lib/*.py — internal modules for datasets, models, training loops, utilities, and backtesting, specialized for direct weight forecasting.
  • config.py — central configuration / argument helpers used by the scripts.
  • scripts/train_service.py — Train from scratch on processed/train.parquet and processed/val.parquet. Usage example:
python -m scripts.train_service --epochs 50
  • scripts/finetune_service.py — Fine-tune an existing saved model on the latest features. If validation loss improves, the saved model is replaced and the previous version is kept with a timestamp.

Usage Example:

python -m scripts.finetune_service --epochs 10 --save_best
  • scripts/prediction_service.py — Scheduled inference: fetch the latest data, compute features, run the model, transform logits into portfolio weights, and save predictions to MongoDB and Redis (see the storage sketch after this list).

Usage Example:

python -m scripts.prediction_service --hours 4 
  • scripts/backtesting_service.py — Backtesting and model-evaluation service for the market-news fusion model.

Usage Example:

python -m scripts.backtesting_service --epochs 50 --mode fetch --hours 12 
  • scripts/api_service.py — Serves an API for retrieving NeuralFusionCore weights from MongoDB.
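
As an illustration of the storage step in prediction_service.py, a minimal sketch using pymongo and redis-py; the connection details, database/collection names, key names, and document schema below are assumptions, not the repository's actual layout:

import json
from datetime import datetime, timezone

import redis
from pymongo import MongoClient

def save_weights(weights):
    """Persist one prediction to MongoDB and cache the latest in Redis.
    All names and connection details here are hypothetical."""
    doc = {"timestamp": datetime.now(timezone.utc), "weights": weights}
    MongoClient("mongodb://localhost:27017")["neuralfusion"]["predictions"].insert_one(doc)
    r = redis.Redis(host="localhost", port=6379)
    r.set("neuralfusion:latest_weights", json.dumps(weights))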

Pipeline (Direct Weights)

1) run data_ingest_service
2) run features_service
3) run train_service
4) run prediction_service

Dependencies

  • Python 3.12+
  • PyTorch 2.x
  • Hugging Face transformers (BigBird)

Outputs

  • Predicted weights per timestamp
  • Performance metrics (see the sketch below):
    • Sharpe ratio
    • Cumulative P&L
    • Max Drawdown
    • Turnover
  • Plots: equity curve, rolling Sharpe, weights heatmap
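
A sketch of how these metrics could be computed from per-step portfolio returns and the weight history (NumPy; function and variable names are illustrative):

import numpy as np

def performance_metrics(port_returns, weights_seq, eps=1e-8):
    # port_returns: (T,)   per-step portfolio returns
    # weights_seq:  (T, N) predicted weights at each timestamp
    equity = np.cumprod(1.0 + port_returns)                  # equity curve
    sharpe = port_returns.mean() / (port_returns.std() + eps)
    drawdown = 1.0 - equity / np.maximum.accumulate(equity)
    turnover = np.abs(np.diff(weights_seq, axis=0)).sum(axis=1).mean()
    return {
        "sharpe": sharpe,
        "cumulative_pnl": equity[-1] - 1.0,
        "max_drawdown": drawdown.max(),
        "turnover": turnover,
    }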


Notes

  • CGA allows directional, gated cross‑attention between news and market signals
  • The distribution loss helps prevent one-asset collapse
  • The total loss combines Sharpe-ratio maximization with the regularization terms

Appendix

Upstream Repositories

Influential upstream repositories:

  • BigBird: A sparse-attention transformer model enabling efficient processing of longer sequences
  • finBERT: A pre-trained NLP model fine-tuned for financial sentiment analysis
  • Time-Series-Library (TSlib): Library providing deep learning-based time series analysis, covering forecasting, anomaly detection, and classification

Inspiration

This work is inspired by the article:


Authors & Citation

Developed by the Novoxpert Research Team
If you use this repository or build upon our work, please cite:

Novoxpert Research (2025). NeuralFusionCore: Direct Portfolio Weight Forecasting with Cross-Gated Attention Fusion.
GitHub: https://github.com/Novoxpert/NeuralFusionCore

@software{novoxpert_neuralfusioncore_2025,
  author       = {Elham Esmaeilnia and Hamidreza Naeini},
  title        = {NeuralFusionCore: Direct Portfolio Weight Forecasting with Cross-Gated Attention Fusion},
  organization = {Novoxpert Research},
  year         = {2025},
  url          = {https://github.com/Novoxpert/NeuralFusionCore}
}
