api-evangelist/evolutionaryscale

EvolutionaryScale (evolutionaryscale)

EvolutionaryScale is a New York-based biology foundation model lab spun out of Meta AI's ESM team that develops AI to deepen scientific understanding of biology. Its flagship ESM3 model is a multimodal generative protein language model that reasons jointly across sequence, structure, and function, scaling to 98B parameters trained on 771B tokens from 2.78B natural proteins. The companion ESM Cambrian (ESM C) family provides protein representation learning at 300M–6B parameters as a performant ESM2 replacement. Models are accessible via the hosted Forge inference API (forge.evolutionaryscale.ai), an open-source Python SDK (pip install esm), open weights on Hugging Face, and AWS Marketplace (SageMaker, NVIDIA BioNeMo and NIM). EvolutionaryScale was integrated into the Biohub organization in 2025; the ESM SDK now lives at github.com/Biohub/esm.

URL: Visit APIs.json

Run: Capabilities Using Naftiko

Tags

  • AI, Artificial Intelligence, Biology, Bioinformatics, Computational Biology, Drug Discovery, ESM, ESM3, ESM Cambrian, Foundation Models, Generative Biology, Life Sciences, Machine Learning, Protein Design, Protein Folding, Protein Language Models, Proteins, Representation Learning, Structure Prediction

Timestamps

  • Created: 2026-05-24
  • Modified: 2026-05-24

Models

Model                                | Family       | Parameters | Access
esm3-large-2024-03                   | ESM3         | 98B        | Forge
esm3-medium-2024-08                  | ESM3         | 7B         | Forge
esm3-small-2024-08                   | ESM3         | 1.4B       | Forge
esm3-open (biohub/esm3-sm-open-v1)   | ESM3         | 1.4B       | Open weights (research)
esmc-6b-2024-12                      | ESM Cambrian | 6B         | Forge
esmc-600m-2024-12                    | ESM Cambrian | 600M       | Forge + Open weights
esmc-300m-2024-12                    | ESM Cambrian | 300M       | Forge + Open weights

ESM3 was trained on 771B tokens from 2.78B natural proteins (1e24 FLOPs).
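
For scripts that need to pick a checkpoint programmatically, the model list above can be kept as a small lookup structure. This is only a sketch: the ModelEntry class, the CATALOG list, and the smallest_model helper are hypothetical names introduced here, and the entries simply restate the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelEntry:
    name: str
    family: str
    parameters: int      # parameter count, as listed in the table
    forge: bool          # available through the hosted Forge API
    open_weights: bool   # downloadable weights


# Entries restate the Models table; nothing here is queried from a live service.
CATALOG = [
    ModelEntry("esm3-large-2024-03", "ESM3", 98_000_000_000, True, False),
    ModelEntry("esm3-medium-2024-08", "ESM3", 7_000_000_000, True, False),
    ModelEntry("esm3-small-2024-08", "ESM3", 1_400_000_000, True, False),
    ModelEntry("esm3-open", "ESM3", 1_400_000_000, False, True),
    ModelEntry("esmc-6b-2024-12", "ESM Cambrian", 6_000_000_000, True, False),
    ModelEntry("esmc-600m-2024-12", "ESM Cambrian", 600_000_000, True, True),
    ModelEntry("esmc-300m-2024-12", "ESM Cambrian", 300_000_000, True, True),
]


def smallest_model(family: str, *, need_open_weights: bool = False) -> ModelEntry:
    """Return the smallest listed model of a family, optionally open-weights only."""
    candidates = [
        m for m in CATALOG
        if m.family == family and (m.open_weights or not need_open_weights)
    ]
    if not candidates:
        raise LookupError(f"no model matches family={family!r}")
    return min(candidates, key=lambda m: m.parameters)
```

For example, smallest_model("ESM Cambrian") selects esmc-300m-2024-12, and smallest_model("ESM3", need_open_weights=True) selects esm3-open.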

APIs

EvolutionaryScale Forge ESM3 API

Hosted inference for the ESM3 multimodal protein language model. Supports the generate, batch_generate, encode, decode, forward_and_sample, and logits operations across the small (1.4B), medium (7B), and large (98B) checkpoints. The model reasons jointly across the sequence, structure, secondary_structure, sasa, and function tracks.

Human URL: https://forge.evolutionaryscale.ai

EvolutionaryScale Forge ESM Cambrian API

Hosted inference for the ESM Cambrian (ESM C) representation learning family. Drop-in ESM2 replacement with lower memory footprint, available at 300M, 600M, and 6B parameters. Exposes sequence-only encode and logits for embedding workflows.

Human URL: https://forge.evolutionaryscale.ai
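
A common embedding workflow is to pool per-residue vectors into one vector per protein and compare proteins by cosine similarity. The sketch below illustrates that step only. It assumes the encoder returns one vector per residue, uses toy two-dimensional numbers in place of real ESM C embeddings, and makes no Forge call; mean_pool and cosine_similarity are hypothetical helper names.

```python
import math


def mean_pool(per_residue):
    """Average per-residue vectors (length L, dimension D) into one D-dim vector."""
    if not per_residue:
        raise ValueError("need at least one residue embedding")
    length = len(per_residue)
    dim = len(per_residue[0])
    return [sum(row[d] for row in per_residue) / length for d in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / norm


# Toy stand-ins for per-residue embeddings of two proteins of different length.
protein_a = [[1.0, 0.0], [3.0, 0.0]]
protein_b = [[0.0, 2.0], [0.0, 4.0], [0.0, 6.0]]

vec_a = mean_pool(protein_a)
vec_b = mean_pool(protein_b)
similarity = cosine_similarity(vec_a, vec_b)
```

Mean pooling gives fixed-size vectors regardless of sequence length, which is why the two toy proteins above can be compared directly.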

EvolutionaryScale Forge Folding API

Hosted folding and inverse-folding inference. fold predicts atom37 backbone coordinates along with pLDDT and pTM confidence scores; inverse_fold designs candidate sequences for a target structure; msa fetches the multiple sequence alignments used to condition fold predictions.

Human URL: https://forge.evolutionaryscale.ai
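
To make the atom37 term concrete, the sketch below lays out one residue as 37 fixed heavy-atom slots and reads the backbone atoms from it. The slot ordering is the AlphaFold-style convention; that Forge fold results use exactly this ordering is an assumption of this sketch, and the helper names (empty_residue, set_atom, backbone, ca_distance) are hypothetical. Coordinates are toy values, not predictions.

```python
import math

# AlphaFold-style atom37 slot order (assumed here, not confirmed for Forge output).
ATOM37 = [
    "N", "CA", "C", "CB", "O", "CG", "CG1", "CG2", "OG", "OG1", "SG", "CD",
    "CD1", "CD2", "ND1", "ND2", "OD1", "OD2", "SD", "CE", "CE1", "CE2", "CE3",
    "NE", "NE1", "NE2", "OE1", "OE2", "CH2", "NH1", "NH2", "OH", "CZ", "CZ2",
    "CZ3", "NZ", "OXT",
]
BACKBONE = ("N", "CA", "C")


def empty_residue():
    """One residue: 37 slots, None where the atom is absent."""
    return [None] * len(ATOM37)


def set_atom(residue, name, xyz):
    """Store an (x, y, z) coordinate in the slot for the named atom."""
    residue[ATOM37.index(name)] = tuple(xyz)


def backbone(residue):
    """Return the N, CA, and C coordinates of a residue as a dict."""
    return {name: residue[ATOM37.index(name)] for name in BACKBONE}


def ca_distance(res_a, res_b):
    """Euclidean distance between the CA atoms of two residues."""
    ca = ATOM37.index("CA")
    return math.dist(res_a[ca], res_b[ca])


# Two toy residues with only their CA atoms placed.
first, second = empty_residue(), empty_residue()
set_atom(first, "CA", (0.0, 0.0, 0.0))
set_atom(second, "CA", (3.8, 0.0, 0.0))
```

A fixed slot layout keeps every residue the same shape, so a structure of L residues fits an L x 37 x 3 array with a mask for the empty slots.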

EvolutionaryScale ESM Python SDK

Official Python SDK packaging ESM3 and ESM Cambrian model loaders, the ESMProtein multi-track data model, generation/sampling configurations, structure tokenization utilities, and an esm.sdk.client() factory that swaps local checkpoints for Forge-hosted inference without code changes. Installable from PyPI as esm. Mixed commercial / non-commercial licenses.

Human URL: https://github.com/Biohub/esm

Common Properties

Examples

Rules

Features

  • ESM3 — multimodal generative model jointly conditioning on protein sequence, structure, and function
  • 98B-parameter ESM3 trained on 771B tokens from 2.78B natural proteins (1e24 FLOPs)
  • ESM Cambrian (ESM C) representation models at 300M, 600M, and 6B parameters
  • Forge API providing generate, batch_generate, encode, decode, forward_and_sample, and logits operations
  • Fold and inverse-fold endpoints for structure prediction and structure-conditioned sequence design
  • MSA endpoint for fetching multiple sequence alignments used by structure prediction
  • Iterative masked sampling with configurable num_steps, temperature, top_p, and decoding schedules
  • Per-track generation across sequence, structure, secondary_structure, sasa, and function tracks
  • Structure tokenizer converting PDB / atom37 coordinates to and from discrete tokens
  • ESMProtein and ESMProteinTensor data model unifying raw and tokenized representations
  • Async/sync client surface (async_generate, async_fold, async_encode, ...) for high-throughput jobs
  • Drop-in Forge client (esm.sdk.client(model, token=...)) replaces local checkpoints with hosted inference
  • Open-weights ESM3-open (1.4B) and ESM Cambrian distributions on Hugging Face under research license
  • AWS Marketplace deployment via SageMaker, NVIDIA BioNeMo, and NVIDIA NIM microservice
  • Cookbook tutorials covering protein generation, embedding workflows, and esmGFP-style design
  • Responsible Biodesign Framework governing model release and biosecurity review
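
The iterative masked sampling feature listed above can be illustrated with a toy loop: at each step a share of the masked positions is filled by sampling from temperature-scaled, top_p-truncated probabilities. This is a sketch under stated assumptions, not ESM3's implementation. The toy_logits function is a random stand-in for a model forward pass, the schedule is linear, positions are chosen at random, and all function names are hypothetical.

```python
import math
import random

MASK = "_"
ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def toy_logits(tokens, position):
    """Stand-in for a model: fixed pseudo-logits per position (ignores context)."""
    rng = random.Random(position)
    return [rng.uniform(-2.0, 2.0) for _ in ALPHABET]


def sample_token(logits, temperature, top_p, rng):
    """Sample one residue using temperature scaling and nucleus (top_p) truncation."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [value / temperature for value in logits]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]  # numerically stable softmax
    total = sum(weights)
    probs = [w / total for w in weights]
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for index in ranked:
        # Keep the most likely tokens until their total mass reaches top_p.
        kept.append(index)
        mass += probs[index]
        if mass >= top_p:
            break
    choice = rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]
    return ALPHABET[choice]


def iterative_unmask(prompt, num_steps, temperature=0.7, top_p=1.0, seed=0):
    """Fill every masked position of prompt within num_steps sampling rounds."""
    if num_steps < 1:
        raise ValueError("num_steps must be at least 1")
    rng = random.Random(seed)
    tokens = list(prompt)
    for step in range(num_steps):
        masked = [i for i, tok in enumerate(tokens) if tok == MASK]
        if not masked:
            break
        # Linear schedule: spread the remaining masks evenly over the remaining steps.
        count = math.ceil(len(masked) / (num_steps - step))
        for position in rng.sample(masked, count):
            logits = toy_logits(tokens, position)
            tokens[position] = sample_token(logits, temperature, top_p, rng)
    return "".join(tokens)


design = iterative_unmask("MK" + MASK * 10 + "LV", num_steps=4, seed=1)
```

Lower temperature and smaller top_p concentrate sampling on the most likely residues, while more steps mean fewer positions are committed per round.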

Position

Consuming

Maintainer