[FEATURE] Categorical Embeddings for Ticker, Period, Sector, etc. Information #7

@GongJr0

Feature Details

Implement categorical embedding layers for metadata features such as ticker ID, sector classification, and temporal context (e.g., day of week, month, quarter). These embeddings will provide dense vector representations of categorical features for integration into the monolithic neural network.

Embeddings should:

  • Be initialized randomly (e.g., from a uniform distribution), with the option of loading pre-trained weights later
  • Allow flexible dimension sizing per feature (e.g., ticker might need a larger embedding than sector)
  • Output torch.Tensor representations suitable for concatenation with numeric features
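
The requirements above can be sketched roughly as follows. This is a minimal illustration, not the project's actual design: the class name `CategoricalEmbedder`, the feature names, the cardinalities, the embedding sizes, and the uniform range of ±0.05 are all assumptions chosen for the example.

```python
import torch
from torch import nn


class CategoricalEmbedder(nn.Module):
    """Hypothetical module holding one nn.Embedding per categorical feature."""

    def __init__(self, cardinalities: dict[str, int], dims: dict[str, int]):
        super().__init__()
        # Fix the feature order so the concatenated layout is deterministic.
        self.features = list(cardinalities)
        self.embeddings = nn.ModuleDict({
            name: nn.Embedding(cardinalities[name], dims[name])
            for name in self.features
        })
        # Assumed init range; nn.Embedding defaults to a standard normal.
        for emb in self.embeddings.values():
            nn.init.uniform_(emb.weight, -0.05, 0.05)
        self.output_dim = sum(dims[name] for name in self.features)

    def forward(self, ids: dict[str, torch.Tensor]) -> torch.Tensor:
        # Each lookup returns (batch, dim_i); concatenate along the last axis.
        parts = [self.embeddings[name](ids[name]) for name in self.features]
        return torch.cat(parts, dim=-1)


# Illustrative sizes only: ticker gets a larger embedding than sector.
embedder = CategoricalEmbedder(
    cardinalities={"ticker": 500, "sector": 11, "day_of_week": 5},
    dims={"ticker": 16, "sector": 4, "day_of_week": 2},
)
batch = {
    "ticker": torch.tensor([3, 42, 7, 499]),
    "sector": torch.tensor([0, 10, 5, 2]),
    "day_of_week": torch.tensor([0, 1, 2, 4]),
}
cat_features = embedder(batch)   # shape (4, 22)
numeric = torch.randn(4, 5)      # stand-in for 5 numeric features (e.g., lags)
combined = torch.cat([cat_features, numeric], dim=-1)  # shape (4, 27)
```

Keeping the per-feature sizes in a dictionary makes it easy to tune each embedding independently, and `output_dim` tells downstream layers how wide the categorical block is.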

Affected Modules

As stated in the parent issue.

Implementation Checklist

  • Define embedding dimensions for each categorical feature (ticker, sector, period, etc.)
  • Add embedding layers to FeatureGen (using nn.Embedding from PyTorch)
  • Utility to map raw categorical values → integer IDs for embedding lookup
  • Integrate embeddings with AR(n) lag features in the concatenation module
  • Unit tests:
    • Verify correct tensor shapes per feature
    • Ensure unseen categories are handled (mask or fallback index)
    • Confirm embeddings update during training
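
As one possible shape for the mapping utility and the unseen-category check, the sketch below reserves index 0 as a fallback for values not seen when the vocabulary was built. The class name `CategoryEncoder`, the choice of 0 as the fallback, and the sample sector names are assumptions made for illustration.

```python
class CategoryEncoder:
    """Hypothetical utility mapping raw categorical values to integer IDs."""

    UNKNOWN_ID = 0  # assumed fallback index for categories never seen before

    def __init__(self, values):
        # Sort the unique values so IDs are stable across runs; start at 1
        # because 0 is reserved for unknown categories.
        unique = sorted(set(values))
        self.to_id = {value: i + 1 for i, value in enumerate(unique)}

    @property
    def num_embeddings(self):
        # Known categories plus the reserved unknown slot.
        return len(self.to_id) + 1

    def encode(self, values):
        # Unseen values fall back to UNKNOWN_ID instead of raising KeyError.
        return [self.to_id.get(value, self.UNKNOWN_ID) for value in values]


sector_encoder = CategoryEncoder(["Tech", "Energy", "Tech", "Utilities"])
ids = sector_encoder.encode(["Energy", "Tech", "Crypto"])
print(ids)  # [1, 2, 0]
```

The resulting IDs can be wrapped with `torch.tensor(ids, dtype=torch.long)` for lookup, and `num_embeddings` can be passed as the first argument to `nn.Embedding` so the fallback row exists. With this approach the fallback row is trained like any other; passing `padding_idx=0` to `nn.Embedding` instead would give a masked row that receives no gradient updates.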

Limitations

As stated in the parent issue.

Metadata

Labels

feature: Implementation tracking for approved features

Projects

Status

In progress
