EvolutionaryScale is a New York-based biology foundation model lab spun out of Meta AI's ESM team that develops AI to deepen scientific understanding of biology. Its flagship ESM3 model is a multimodal generative protein language model that reasons jointly across sequence, structure, and function, scaling to 98B parameters trained on 771B tokens from 2.78B natural proteins. The companion ESM Cambrian (ESM C) family provides protein representation learning at 300M–6B parameters as a performant ESM2 replacement. Models are accessible via the hosted Forge inference API (forge.evolutionaryscale.ai), an open-source Python SDK (pip install esm), open weights on Hugging Face, and AWS Marketplace (SageMaker, NVIDIA BioNeMo and NIM). EvolutionaryScale was integrated into the Biohub organization in 2025; the ESM SDK now lives at github.com/Biohub/esm.
URL: Visit APIs.json
Run: Capabilities Using Naftiko
- AI, Artificial Intelligence, Biology, Bioinformatics, Computational Biology, Drug Discovery, ESM, ESM3, ESM Cambrian, Foundation Models, Generative Biology, Life Sciences, Machine Learning, Protein Design, Protein Folding, Protein Language Models, Proteins, Representation Learning, Structure Prediction
- Created: 2026-05-24
- Modified: 2026-05-24
| Model | Family | Parameters | Access |
|---|---|---|---|
| esm3-large-2024-03 | ESM3 | 98B | Forge |
| esm3-medium-2024-08 | ESM3 | 7B | Forge |
| esm3-small-2024-08 | ESM3 | 1.4B | Forge |
| esm3-open (biohub/esm3-sm-open-v1) | ESM3 | 1.4B | Open weights (research) |
| esmc-6b-2024-12 | ESM Cambrian | 6B | Forge |
| esmc-600m-2024-12 | ESM Cambrian | 600M | Forge + Open weights |
| esmc-300m-2024-12 | ESM Cambrian | 300M | Forge + Open weights |
ESM3 was trained on 771B tokens from 2.78B natural proteins (1e24 FLOPs).
Hosted inference for the ESM3 multimodal protein language model. Generate, batch-generate, encode, decode, forward-and-sample, and logits across small (1.4B), medium (7B), and large (98B) checkpoints. Reasons jointly across sequence, structure, secondary_structure, sasa, and function tracks.
Human URL: https://forge.evolutionaryscale.ai
- Documentation
- SourceCode
- OpenAPI
- JSON Schema — ESMProtein
- JSON Schema — GenerationConfig
- JSON-LD
- Naftiko Capability — Generation
- Naftiko Capability — Encoding
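A minimal sketch of the ESM3 generate operation described above, assuming the `esm` SDK is installed and a Forge token is available. The `client(...)`, `ESMProtein`, and `GenerationConfig` calls follow the pattern shown in the SDK README. `build_prompt` and `generate_sequence` are hypothetical helper names introduced only for illustration, and the motif and lengths are placeholders. SDK imports are deferred so the prompt helper works without the package installed.

```python
MASK = "_"  # character marking unknown residues in an ESM3 sequence prompt


def build_prompt(length: int, motif: str, start: int) -> str:
    """Return a masked prompt of `length` residues with `motif` fixed at `start`."""
    if start < 0 or start + len(motif) > length:
        raise ValueError("motif does not fit inside the prompt")
    return MASK * start + motif + MASK * (length - start - len(motif))


def generate_sequence(prompt: str, token: str,
                      model_name: str = "esm3-small-2024-08") -> str:
    """Fill the masked positions of `prompt` using a Forge-hosted ESM3 model."""
    # Imported lazily: requires `pip install esm` and network access to Forge.
    from esm.sdk import client
    from esm.sdk.api import ESMProtein, GenerationConfig

    model = client(model=model_name,
                   url="https://forge.evolutionaryscale.ai",
                   token=token)
    result = model.generate(
        ESMProtein(sequence=prompt),
        GenerationConfig(track="sequence", num_steps=8, temperature=0.7),
    )
    # The SDK can return an error object instead of a protein on failure.
    if not isinstance(result, ESMProtein):
        raise RuntimeError(f"generation failed: {result}")
    return result.sequence


if __name__ == "__main__":
    print(build_prompt(10, "DQAT", 3))  # local step only, no Forge call
```

Calling `generate_sequence(build_prompt(...), token)` would then request a completed sequence from Forge; the step count and temperature here are illustrative values, not recommendations.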
Hosted inference for the ESM Cambrian (ESM C) representation learning family. Drop-in ESM2 replacement with lower memory footprint, available at 300M, 600M, and 6B parameters. Exposes sequence-only encode and logits for embedding workflows.
Human URL: https://forge.evolutionaryscale.ai
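The encode and logits operations for ESM C can be sketched as follows, assuming the `ESM3ForgeInferenceClient` and `LogitsConfig` usage shown in the SDK README. Mean pooling is a common way to turn per-residue embeddings into one vector per protein, but it is an assumption of this sketch and not something the listing prescribes. `mean_pool` and `embed_sequence` are hypothetical helper names, and the embeddings are assumed to arrive as a tensor with an optional leading batch dimension.

```python
def mean_pool(per_residue):
    """Average a list of per-residue vectors into one fixed-size vector."""
    if not per_residue:
        raise ValueError("no residue embeddings to pool")
    dim = len(per_residue[0])
    count = len(per_residue)
    return [sum(row[d] for row in per_residue) / count for d in range(dim)]


def embed_sequence(sequence: str, token: str,
                   model_name: str = "esmc-300m-2024-12"):
    """Return one pooled embedding for `sequence` from a Forge-hosted ESM C model."""
    # Imported lazily: requires `pip install esm` and network access to Forge.
    from esm.sdk.api import ESMProtein, LogitsConfig
    from esm.sdk.forge import ESM3ForgeInferenceClient

    forge_client = ESM3ForgeInferenceClient(
        model=model_name,
        url="https://forge.evolutionaryscale.ai",
        token=token,
    )
    protein_tensor = forge_client.encode(ESMProtein(sequence=sequence))
    output = forge_client.logits(
        protein_tensor,
        LogitsConfig(sequence=True, return_embeddings=True),
    )
    # Assumption: embeddings is a tensor shaped (1, length, dim) or (length, dim).
    # Any start/end special tokens are included in the average.
    return mean_pool(output.embeddings.squeeze(0).tolist())
```

The pooled vector can feed a downstream classifier or similarity search; whether to exclude special-token positions before pooling is a design choice left open here.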
Hosted folding and inverse-folding inference. fold predicts atom37 backbone coordinates plus pLDDT/PTM confidence; inverse_fold designs candidate sequences for a target structure; msa fetches multiple sequence alignments used to condition fold predictions.
Human URL: https://forge.evolutionaryscale.ai
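The listing does not give signatures for the fold and inverse_fold endpoints, so the sketch below stands in for them with the generate-based route from the SDK README: predict the structure track from a sequence, clear the sequence, then regenerate it from the structure. Sequence identity between the input and the redesigned sequence is a common sanity check for such a round trip. `sequence_identity` and `round_trip` are hypothetical helper names.

```python
def sequence_identity(a: str, b: str) -> float:
    """Fraction of positions at which two equal-length sequences agree."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def round_trip(sequence: str, token: str,
               model_name: str = "esm3-small-2024-08") -> float:
    """Fold `sequence`, redesign a sequence for that structure, and compare."""
    # Imported lazily: requires `pip install esm` and network access to Forge.
    from esm.sdk import client
    from esm.sdk.api import ESMProtein, GenerationConfig

    model = client(model=model_name,
                   url="https://forge.evolutionaryscale.ai",
                   token=token)

    # Step 1: predict structure from sequence.
    folded = model.generate(ESMProtein(sequence=sequence),
                            GenerationConfig(track="structure", num_steps=8))
    if not isinstance(folded, ESMProtein):
        raise RuntimeError(f"structure prediction failed: {folded}")

    # Step 2: drop the sequence and design a new one for the same structure.
    folded.sequence = None
    designed = model.generate(folded,
                              GenerationConfig(track="sequence", num_steps=8))
    if not isinstance(designed, ESMProtein):
        raise RuntimeError(f"sequence design failed: {designed}")

    return sequence_identity(sequence, designed.sequence)
```

A low identity does not by itself mean the design is poor, since many sequences can share a fold; the score is only a quick consistency signal.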
Official Python SDK packaging ESM3 and ESM Cambrian model loaders, the ESMProtein multi-track data model, generation/sampling configurations, structure tokenization utilities, and an `esm.sdk.client()` factory that swaps local checkpoints for Forge-hosted inference without changes to the calling code. Installable from PyPI as `esm`. Mixed commercial / non-commercial licenses.
Human URL: https://github.com/Biohub/esm
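The local-versus-hosted swap described above might look like the sketch below. It assumes the `ESM3.from_pretrained("esm3-open")` loader and the `client(...)` factory shown in the SDK README. The environment variable name `ESM_FORGE_TOKEN` and the helpers `pick_backend` and `load_esm3` are hypothetical, chosen only for this example.

```python
import os


def pick_backend(env) -> str:
    """Choose "forge" when a token is configured, otherwise "local"."""
    return "forge" if env.get("ESM_FORGE_TOKEN") else "local"


def load_esm3(env=os.environ):
    """Return an ESM3 inference object from Forge or from local open weights."""
    if pick_backend(env) == "forge":
        # Hosted inference: requires `pip install esm` and a valid Forge token.
        from esm.sdk import client
        return client(model="esm3-small-2024-08",
                      url="https://forge.evolutionaryscale.ai",
                      token=env["ESM_FORGE_TOKEN"])
    # Local inference: downloads the open 1.4B weights on first use.
    from esm.models.esm3 import ESM3
    return ESM3.from_pretrained("esm3-open")
```

Because both branches return an object exposing `generate`, downstream code can stay the same whichever backend is selected.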
- Portal
- SignUp — EvolutionaryScale Forge
- SourceCode — ESM SDK on GitHub
- SDK — `esm` package on PyPI
- Biohub on Hugging Face
- Models — ESM3-open (1.4B) on Hugging Face
- Models — ESM C 300M on Hugging Face
- Models — ESM C 600M on Hugging Face
- CodeExamples — ESM Cookbook
- Tutorials — ESM Tutorials
- Documentation — ESM3 Science paper (Hayes et al. 2025)
- Blog — ESM3 release announcement
- Blog — ESM Cambrian announcement
- Blog
- Marketplace — EvolutionaryScale on AWS Marketplace (SageMaker)
- CodeExamples — ESM on Amazon SageMaker examples
- CodeExamples — Partner integrations
- TermsOfService — Cambrian Inference Clickthrough License
- Documentation — Responsible Biodesign Framework
- Forum — ESM Community Slack
- GitHub Organization — EvolutionaryScale
- GitHub Organization — Biohub (ESM home)
- Plans
- Rate Limits
- FinOps
- Vocabulary
- ESM3 — multimodal generative model jointly conditioning on protein sequence, structure, and function
- 98B-parameter ESM3 trained on 771B tokens from 2.78B natural proteins (1e24 FLOPs)
- ESM Cambrian (ESM C) representation models at 300M, 600M, and 6B parameters
- Forge API providing generate, batch_generate, encode, decode, forward_and_sample, and logits operations
- Fold and inverse-fold endpoints for structure prediction and structure-conditioned sequence design
- MSA endpoint for fetching multiple sequence alignments used by structure prediction
- Iterative masked sampling with configurable num_steps, temperature, top_p, and decoding schedules
- Per-track generation across sequence, structure, secondary_structure, sasa, and function tracks
- Structure tokenizer converting PDB / atom37 coordinates to and from discrete tokens
- ESMProtein and ESMProteinTensor data model unifying raw and tokenized representations
- Async/sync client surface (`async_generate`, `async_fold`, `async_encode`, ...) for high-throughput jobs
- Drop-in Forge client (`esm.sdk.client(model, token=...)`) replaces local checkpoints with hosted inference
- Open-weights ESM3-open (1.4B) and ESM Cambrian distributions on Hugging Face under research license
- AWS Marketplace deployment via SageMaker, NVIDIA BioNeMo, and NVIDIA NIM microservice
- Cookbook tutorials covering protein generation, embedding workflows, and esmGFP-style design
- Responsible Biodesign Framework governing model release and biosecurity review
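To illustrate the iterative masked sampling and decoding schedules mentioned in the list above, here is a toy, pure-Python sketch. It is not the SDK's internal implementation: it assumes a MaskGIT-style cosine schedule in which the fraction of positions still masked after step t follows cos(pi/2 * t / num_steps), and it replaces the model with a random residue picker. All function names are hypothetical.

```python
import math
import random

MASK = "_"
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def cosine_unmask_counts(num_masked: int, num_steps: int) -> list:
    """How many masked positions to reveal at each step under a cosine schedule."""
    counts, revealed = [], 0
    for step in range(1, num_steps + 1):
        still_masked = round(num_masked * math.cos(math.pi / 2 * step / num_steps))
        target = num_masked - still_masked
        counts.append(target - revealed)
        revealed = target
    return counts


def toy_iterative_sample(prompt: str, num_steps: int, seed: int = 0) -> str:
    """Fill masked positions over several steps; fixed residues are untouched."""
    rng = random.Random(seed)
    tokens = list(prompt)
    masked = [i for i, tok in enumerate(tokens) if tok == MASK]
    for count in cosine_unmask_counts(len(masked), num_steps):
        # A real model would rank positions by confidence; here the order is random.
        rng.shuffle(masked)
        chosen, masked = masked[:count], masked[count:]
        for i in chosen:
            # Stand-in for sampling from the model's predicted distribution.
            tokens[i] = rng.choice(AMINO_ACIDS)
    return "".join(tokens)


if __name__ == "__main__":
    print(cosine_unmask_counts(10, 4))
    print(toy_iterative_sample("MK____LV__", num_steps=4))
```

With this schedule few positions are committed early and more are committed in later steps, which is the usual motivation for a cosine shape: later predictions can condition on the residues fixed earlier.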
Consuming
- Kin Lane (info@apievangelist.com) — apievangelist.com