EvolutionaryScale
World-class protein-LM science meets a non-commercial exit — CZI absorption in Nov 2025 leaves Series A investors with undisclosed returns and no standalone equity story.
EvolutionaryScale produced frontier-quality protein language models (Science-validated ESM3), but the November 2025 CZI absorption, under 14 months after the $142M Series A, eliminates the standalone investment thesis and leaves commercial investor returns publicly unaccounted for.
Cover facts
Company profile
EvolutionaryScale was a San Francisco-based AI biology company that built and released ESM3, a 98-billion-parameter generative multimodal protein language model trained on 2.78 billion sequences (1×10^24 FLOPs on NVIDIA H100s), and ESM Cambrian (ESM-C), a representation-focused follow-on. The four co-founders — Alex Rives (CEO), Tom Sercu (VP Eng), Zeming Lin, and Sanjay Rao — were all formerly at Meta AI Research (FAIR), where they originated the ESM lineage. The company raised a seed round (June 2024) and a $142M Series A co-led by Amazon and NVIDIA (September 2024) at an implied ~$1.35B post-money valuation, alongside Lux Capital, Nat Friedman, and Daniel Gross. ESM3 was peer-validated by a Science publication (January 2025) and integrated into AWS SageMaker JumpStart and NVIDIA BioNeMo. On November 6, 2025, fewer than 14 months after the Series A close, the entire team joined CZ Biohub under the Chan Zuckerberg Initiative's Frontier AI for Biology initiative — ending EvolutionaryScale as an independent for-profit entity. Commercial revenue was never disclosed during the company's independent existence.
- Website
- www.evolutionaryscale.ai
- Founded
- 2023-01-01
- Founders
- Alexander (Alex) Rives, Tom Sercu, Zeming Lin, Sanjay Rao
- Founding location
- San Francisco, California, USA
- Headquarters
- San Francisco, California, USA (pre-acquisition); post-acquisition operations consolidated into CZ Biohub Network sites.
- Product
- EvolutionaryScale's product lineup centered on the ESM (Evolutionary Scale Modeling) family of protein language models — ESM3 (multimodal generative, up to 98B parameters) and ESM Cambrian / ESM-C (representation-focused, 300M / 600M / 6B variants) — distributed through (a) the open-weights HuggingFace channel for academic/non-commercial use (esm3-sm-open-v1 and ESM-C variants accumulated ~9,400+ downloads combined as of May 2026), (b) the commercial Forge API at forge.evolutionaryscale.ai with developer-facing SDK and documentation, (c) NVIDIA BioNeMo NIM microservice integration for enterprise H100 deployment, and (d) the AWS Marketplace SageMaker JumpStart listing. A peer-reviewed Science paper (Jan 16, 2025; DOI 10.1126/science.ads0018) validated ESM3's ability to design novel functional fluorescent proteins, providing independent scientific credibility.
- Customers
- Pre-acquisition the intended customer base was biotechnology and pharmaceutical R&D teams (target identification, protein engineering, antibody design), synthetic biology and industrial enzyme companies, academic researchers (free open-weights tier), and infrastructure customers reaching the models through AWS Marketplace and NVIDIA BioNeMo. No named pharma customer or paying-customer count was ever publicly disclosed.
- Business model
- SaaS / usage-based API pricing through the Forge platform plus partner revenue-share with AWS (SageMaker JumpStart listing) and NVIDIA (BioNeMo NIM microservice). Free open-weights ESM3 and ESM-C variants were released on HuggingFace under non-commercial research licenses as a developer-community / lead-gen strategy. Post-acquisition the entity is part of a non-profit research network (CZI / CZ Biohub) and the future commercial status of the Forge API is publicly unconfirmed.
- Stage
- acquired
- Funding status
- Acquired by Chan Zuckerberg Initiative / CZ Biohub on November 6, 2025; team absorbed into the CZ Biohub Network under the Frontier AI for Biology initiative. Acquisition terms (cash, stock, IP transfer, employee retention packages, treatment of Series A preferred shares) have not been publicly disclosed. Total disclosed pre-acquisition capital raised was $142M (Series A, Sep 26, 2024, at an implied ~$1.35B post-money valuation) plus an undisclosed seed round (announced Jun 25, 2024).
Executive summary
Top strengths
- ESM3 was the largest protein language model released publicly at the time of its launch (98B parameters, 2.78B training sequences, 1×10^24 FLOPs) and was peer-validated in Science (January 2025) for novel functional protein design, a form of independent scientific credibility that closed-source competitors without peer-reviewed validation cannot readily match.
- The founding team — Rives, Sercu, Lin, Rao — built the original ESM lineage at Meta AI Research and are widely regarded as among the strongest protein-LM researchers globally; CZI's hiring of the entire team is itself evidence of talent quality.
- Two strategic co-investors — Amazon (AWS) and NVIDIA — provided both capital and distribution (AWS SageMaker JumpStart listing, NVIDIA BioNeMo NIM microservice), giving ESM3 enterprise reach far beyond what a Series A startup could achieve alone.
- Strong developer-signal: ~9,400+ combined HuggingFace downloads across ESM3 and ESM-C model cards, active GitHub presence (esm + DeepEP + infrastructure forks), and a growing downstream academic citation graph (32+ papers built on ESM3 per Semantic Scholar).
Top risks
- The November 6, 2025 CZI / CZ Biohub absorption (under 14 months after the Series A close) ends the standalone commercial entity and the public-equity exit thesis; acquisition terms, treatment of Series A preferred, and any commercial Forge API continuity are entirely undisclosed.
- The founding team's single-employer provenance (all four ex-Meta FAIR) heightens cultural / methodological homogeneity and key-person concentration risk, and the post-acquisition role of Rives at CZI (rather than at a successor commercial entity) closes the option of re-spinning the team.
- Open-source commoditization threat is acute: the predecessor ESM2 is MIT-licensed and freely available, AlphaFold 3 publishes weights for non-commercial use, OpenFold and Chai-1 are open, and Meta retains underlying ESM IP — collectively eroding willingness to pay for closed Forge API access.
- Zero disclosed commercial revenue, no SEC Form D filings under any "EvolutionaryScale" variant in EDGAR (unusual for a $142M raise), and a paywalled Bloomberg Series A article together block verification of capital structure, runway, and customer economics.
- Dual-use / biosecurity regulatory drag is rising (US Executive Order Oct 2023 §4.4 protein-design watchlist, BIS advance notice on AI bio, EU AI Act dual-use provisions), and a non-profit successor may face different — but not necessarily lighter — compliance obligations than the standalone for-profit would have.
Open gaps
- Acquisition terms of the November 2025 CZI / CZ Biohub deal — cash, stock, IP-transfer, retention packages, and most importantly the treatment of Series A preferred shares held by Amazon, NVIDIA, Lux Capital, Nat Friedman, and Daniel Gross — are not publicly disclosed.
- Forge API operational status, customer count, pricing tier, and any continuity commitment post-acquisition; whether the commercial API will remain available or be deprecated under the CZ Biohub non-profit umbrella.
- Commercial revenue and ARR at any point in the company's life were never publicly disclosed; with no public S-1 / Form D / 10-K filings, no audited revenue figure is verifiable.
- Exact seed-round amount (announced June 25, 2024 alongside the ESM3 launch, with NVIDIA, Amazon, Lux, Friedman, and Gross) is not publicly disclosed.
- The IP-transfer agreement between Meta and EvolutionaryScale for the ESM2 lineage — and the subsequent transfer to CZI / CZ Biohub — has not been publicly described; legal ownership of the ESM trademark and underlying model weights is unclear.
- Bloomberg's September 26, 2024 Series A article is paywalled, blocking independent verification of investor rights, board composition, pro-rata agreements, and any secondary components of the round.
01 Company Overview
1.1 Identity, Headquarters, and Business Model
EvolutionaryScale, Inc. was an early-stage AI biology company headquartered in San Francisco, California, incorporated in 2023 and operationally active from approximately March 2024. The company's stated mission was to use large generative models to decode the language of protein sequences, treating proteins as text encoding billions of years of biological evolution, and apply that understanding to design novel proteins with programmable functions. Its primary commercial offering was the Forge API platform (forge.evolutionaryscale.ai), a developer-facing service providing programmatic access to the ESM3 and ESM Cambrian (ESM-C) model families. The intended revenue model was a software-as-a-service API subscription and usage-based pricing for biotechnology, pharmaceutical, and synthetic biology customers seeking to accelerate protein engineering. The company also released the open-weights ESM3 model (esm3-sm-open-v1) for non-commercial academic use via Hugging Face, building a developer community alongside the commercial offering. As of November 2025, EvolutionaryScale ceased to operate as an independent entity when its full team was absorbed into CZ Biohub under the Chan Zuckerberg Initiative, marking the end of its path as a standalone commercial company. No revenue, ARR, or paying customer counts were ever publicly disclosed by the company during its independent existence. [CO001, CO003, CO006, CO007, CO028, CO029]
| Metric | Value / Status | Date / Period | Confidence | Source / Gap |
|---|---|---|---|---|
| Company Stage | Absorbed into CZ Biohub (non-profit) | November 2025 | high | biohub.org acquisition announcement |
| Headquarters (pre-acquisition) | San Francisco, California, USA | 2024-2025 | medium | Crunchbase; no primary street address confirmed |
| Founded | 2023 (incorporated); operational ~March 2024 | 2023 / Mar 2024 | high | Official website; chapter context brief |
| CEO / Founder | Alexander (Alex) Rives (CEO through Nov 2025) | 2024-Nov 2025 | high | evolutionaryscale.ai; NVIDIA blog; LinkedIn |
| Total Confirmed Capital Raised | $142M+ (seed undisclosed + $142M Series A) | Sept 2024 | high | Bloomberg (paywall); NVIDIA blog; Crunchbase |
| Series A Valuation (implied) | ~$1.35B post-money | Sept 26, 2024 | medium | Third-party estimates; not confirmed in primary filing |
| Employees (pre-acquisition) | 11-50 | LinkedIn as of 2024 | low | LinkedIn company page; no official headcount disclosed |
| Lead Product | ESM3 (98B param protein language model) | Launched June 25, 2024 | high | evolutionaryscale.ai/blog/esm3-release |
| ESM Cambrian (ESM-C) | 300M / 600M / 6B parameter models | Released Dec 4, 2024 | high | evolutionaryscale.ai/blog/esm-cambrian |
| Science Paper (ESM3) | Published Science journal, Jan 16, 2025 | Jan 16, 2025 | high | DOI: 10.1126/science.ads0018; BioRxiv preprint prior |
| Revenue / ARR | Not publicly disclosed; Forge API SaaS model | Current | low | No filings; forge.evolutionaryscale.ai is JS-only |
| NVIDIA Partnership | BioNeMo NIM integration; seed + Series A investor | 2024-2025 | high | blogs.nvidia.com; nvidia.com/bionemo |
| SEC Form D Filings | None found in EDGAR (2024-2026) | May 2026 | high | efts.sec.gov Form D search; SEC EDGAR |
| Wikipedia Page | Does not exist (404) | May 2026 | high | en.wikipedia.org/wiki/EvolutionaryScale returns 404 |
Valuation and headcount figures are based on third-party reports only; no SEC filings or primary cap-table disclosures are available. The seed-round amount is not publicly disclosed. Post-acquisition corporate structure and investor returns are unknown.
[CO001, CO003, CO004, CO007, CO008, CO015]
Key performance indicators summarizing EvolutionaryScale capital, technology scale, model adoption, and current status as of May 2026.
HuggingFace download counts are snapshots from research session in May 2026 and may change. Valuation is an implied estimate from third-party sources. Parameter counts and training data sizes are from official company publications.
[CO013, CO016, CO017, CO019, CO021, CO025]
1.2 Founders, Leadership, and Key-Person Risk
EvolutionaryScale was co-founded by four researchers who had worked together at Meta AI Research (FAIR): Alexander (Alex) Rives, Tom Sercu, Zeming Lin, and Sanjay Rao. Alex Rives served as CEO and was the principal architect of the ESM protein language model lineage, which had its roots in his academic work and continued at Meta FAIR prior to the spin-out. Tom Sercu served as co-founder and VP of Engineering, leading the infrastructure and engineering team that built the Andromeda H100 cluster. Zeming Lin and Sanjay Rao served in technical co-founder roles contributing to model development and research. Following the November 2025 acquisition, Alex Rives became Head of Science at the Chan Zuckerberg Initiative, and the rest of the founding team joined CZ Biohub in senior research roles. The fact that the entire founding team came from a single employer (Meta FAIR) represents a significant concentration risk: the team's cultural, methodological, and technical assumptions are likely to be homogeneous, and public sources show no evidence of independent board members, advisors from outside the AI research community, or experienced commercial biotechnology executives alongside the founders. This single-employer provenance also heightens key-person dependency, since the departure of any one founder, particularly Rives as CEO and technical visionary, would have had an outsized impact on the company's research direction and investor confidence. [CO002, CO004, CO005, CO022, CO030]
| Person | Role (at EvolutionaryScale) | Pre-Company Background | Founder-Market Fit | Key-Person Dependency |
|---|---|---|---|---|
| Alexander (Alex) Rives | CEO, Co-Founder | Meta AI (FAIR) researcher; originated ESM protein LM research lineage | Deep expertise in protein language modeling; inventor of the ESM model family | Critical: public face, research visionary, and CEO; departure would materially impair the company |
| Tom Sercu | Co-Founder, VP Engineering | Meta AI (FAIR) engineer; co-authored the ESM3 BioRxiv preprint | Infrastructure and engineering leadership for large-scale model training and inference | High: led engineering; directly responsible for Andromeda training cluster operations |
| Zeming Lin | Co-Founder, Research | Meta AI (FAIR) researcher; co-authored ESM3 BioRxiv preprint | Core research contributor to ESM model development | Medium: research contributor; less public-facing than Rives or Sercu |
| Sanjay Rao | Co-Founder, Research | Meta AI (FAIR); contributed to ESM research program | Core technical co-founder with AI research background | Medium: research co-founder; limited independent public record |
All four co-founders previously worked at Meta AI Research (FAIR), creating single-employer concentration risk. Salvatore Candido appears as a BioRxiv preprint author and may have been a fifth co-founder or early team member, but their role at EvolutionaryScale was not independently confirmed in available sources. Post-acquisition: Rives became Head of Science at CZI; other founders joined CZ Biohub in research roles.
[CO002, CO004, CO005, CO022, CO030, CO031]
1.3 Funding History, Valuation, and Capital Structure
EvolutionaryScale raised capital in two known rounds prior to its absorption into CZI. The first was a seed round announced simultaneously with the ESM3 public launch on June 25, 2024. NVIDIA and Amazon both participated in the seed round alongside Lux Capital, Nat Friedman, and Daniel Gross. The exact seed amount was not publicly disclosed, though NVIDIA announced its participation separately. The Series A followed rapidly: on September 26, 2024, EvolutionaryScale announced a $142M Series A co-led by Amazon and NVIDIA, with continued participation from Lux Capital, Nat Friedman, and Daniel Gross, at an implied post-money valuation of approximately $1.35B. This closed only three months after the ESM3 launch and represented one of the largest AI biology funding rounds of 2024. Notably, no SEC Form D filings were found in EDGAR under any variant of "EvolutionaryScale" for the 2024 to 2026 period. Bloomberg reported on the Series A but the article is behind a paywall, preventing public verification of full deal terms, investor rights, and any secondary components. Total confirmed capital raised is $142M plus an undisclosed seed amount. No commercial revenue, debt facilities, or secondaries are on public record. The company's capital structure at the time of the CZI acquisition remains private. [CO015, CO016, CO017, CO018, CO019, CO036]
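As a rough cross-check on the Series A figures above, the sketch below derives the implied pre-money valuation and the fraction of the company sold in the round. It assumes, without confirmation from any filing, that the full $142M was primary capital and that the ~$1.35B third-party figure is a post-money valuation; the `Round` class and its field names are hypothetical and used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Round:
    """A priced financing round (hypothetical helper, illustration only)."""
    raised_usd: float       # new money raised in the round
    post_money_usd: float   # implied valuation after the round

    @property
    def pre_money_usd(self) -> float:
        # Pre-money valuation = post-money valuation minus new money.
        return self.post_money_usd - self.raised_usd

    @property
    def fraction_sold(self) -> float:
        # Share of the company held by new money, assuming all-primary capital.
        return self.raised_usd / self.post_money_usd


# Figures as cited in this report; the valuation is a third-party estimate.
series_a = Round(raised_usd=142e6, post_money_usd=1.35e9)

print(f"Implied pre-money valuation: ${series_a.pre_money_usd / 1e9:.3f}B")
print(f"Implied stake sold in round: {series_a.fraction_sold:.1%}")
```

Under those assumptions the round would have sold roughly a tenth of the company. Any secondary component or option-pool expansion, neither of which is on public record, would change the result.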
| Stakeholder | Role / Relationship | Round / Stage | Economic / Control Importance | Diligence Ask |
|---|---|---|---|---|
| Amazon (AWS) | Lead Series A investor; cloud infrastructure partner | Series A co-lead (Sept 2024) | High economic stake; likely preferred equity; strategic compute supply arrangement probable | Confirm investment amount, preferred terms, board seat, and AWS compute credit arrangement |
| NVIDIA | Lead Series A investor; seed investor; BioNeMo integration partner | Seed + Series A co-lead (2024) | High economic stake; strategic: ESM3 integrated into BioNeMo/NIM on H100 infrastructure | Confirm investment amount, NIM licensing revenue share, and GPU infrastructure commitment terms |
| Lux Capital | Early-stage VC; participated in seed and Series A | Seed + Series A (2024) | Early investor with likely significant ownership from seed stage | Confirm ownership percentage, liquidation preferences, and post-acquisition treatment |
| Nat Friedman | Angel investor; participated in seed and Series A | Seed + Series A (2024) | Individual angel; likely minor economic stake vs. institutional investors | Confirm participation amount and any advisory role |
| Daniel Gross | Angel investor; participated in seed and Series A | Seed + Series A (2024) | Individual angel; AI compute investor background | Confirm participation amount and relationship to AI infrastructure ecosystem |
| Chan Zuckerberg Initiative (CZI) / CZ Biohub | Acquirer / successor organization (Nov 2025) | Acquisition / integration (Nov 2025) | Critical: absorbed the entire EvolutionaryScale team and presumably IP; controls the future of ESM models | Disclose acquisition terms: equity buyout, IP transfer, commercial agreement continuity, and investor return details |
Amazon and NVIDIA are confirmed as Series A co-leads by the NVIDIA blog post and Crunchbase. Per-investor dollar amounts are not public. CZI acquisition terms have not been disclosed publicly. There may be additional Series A investors not identified in available sources.
[CO015, CO016, CO017, CO018, CO021, CO023]
1.4 Products, Technology, and Operations
EvolutionaryScale's product portfolio centered on the ESM (Evolutionary Scale Modeling) family of protein language models. ESM3, released June 25, 2024, is a generative multimodal protein model available in sizes up to 98 billion parameters, the largest protein language model released publicly at the time of its launch. ESM3 was trained on 2.78 billion protein sequences totaling 771 billion tokens using approximately 1×10^24 floating-point operations (FLOPs) on an NVIDIA H100 GPU cluster known internally as Andromeda. The company described ESM3 as simulating 500 million years of protein evolution during a single generation pass, enabling the design of novel proteins with user-specified structural and functional properties. A peer-reviewed paper validating ESM3's ability to design genuinely novel fluorescent proteins was published in Science on January 16, 2025 (DOI: 10.1126/science.ads0018), providing significant independent credibility. ESM Cambrian (ESM-C), released December 4, 2024, is a next-generation protein language model available in 300M, 600M, and 6B parameter sizes optimized for efficient inference. NVIDIA integrated ESM3 into its BioNeMo platform and made it available as an NVIDIA NIM microservice for enterprise deployment on H100 infrastructure. The company maintained a GitHub organization with 9+ repositories including the ESM model codebase, the DeepEP repository for mixture-of-experts inference, and forks of open-source training infrastructure. The Forge API platform served as the commercial developer interface. EvolutionaryScale's LinkedIn page indicated 11 to 50 employees prior to the acquisition. [CO007, CO008, CO009, CO010, CO011, CO012]
How EvolutionaryScale protein language models connect from training data through model development, API platform, and strategic partnerships to the CZI acquisition outcome.
[CO009, CO010, CO025, CO026, CO028, CO032]
1.5 Key Milestones and Adverse Events
EvolutionaryScale's history spans from its 2023 incorporation through a rapid product-and-capital phase to its absorption into CZI in November 2025, a total operational lifespan of under three years as an independent entity. The company's most significant technical milestone is the ESM3 Science paper publication, which provided the first independent peer-reviewed validation that a generative protein language model could design novel functional proteins with programmable properties. The $142M Series A at approximately $1.35B valuation in September 2024, just three months after product launch, reflected extraordinary investor enthusiasm for the technology. However, the November 2025 acquisition by CZI/Biohub is the dominant adverse governance event: with less than 14 months between the Series A close and the absorption into a non-profit entity, commercial investors face a highly uncertain exit path, since the terms of the CZI acquisition and any compensation to equity holders have not been publicly disclosed. The transition from a for-profit commercial entity to a non-profit research initiative raises material questions about the fate of the Series A investors, any employee equity, and the continuity of the Forge API commercial offering. Additional adverse signals include: the absence of any SEC Form D filings, which is unusual for a company that raised $142M; the ESM GitHub repository redirect to the biohub organization, signaling IP transfer; and the absence of a Wikipedia page for the company, indicating limited mainstream media documentation. [CO001, CO008, CO013, CO016, CO019, CO021]
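The timing claims in this section can be checked with simple date arithmetic. The sketch below uses the three dates cited in this report (ESM3 launch, Series A announcement, CZ Biohub announcement); the helper `whole_months_between` is a hypothetical name, and counting completed calendar months is an assumed convention rather than anything the sources specify.

```python
from datetime import date


def whole_months_between(start: date, end: date) -> int:
    """Number of fully completed calendar months from start to end."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    # If the day-of-month has not yet been reached, the last month is incomplete.
    if end.day < start.day:
        months -= 1
    return months


# Dates as cited in this report.
esm3_launch = date(2024, 6, 25)
series_a_announced = date(2024, 9, 26)
czi_absorption = date(2025, 11, 6)

launch_to_series_a = whole_months_between(esm3_launch, series_a_announced)
series_a_to_absorption = whole_months_between(series_a_announced, czi_absorption)
days_series_a_to_absorption = (czi_absorption - series_a_announced).days

print(f"ESM3 launch to Series A: {launch_to_series_a} full months")
print(f"Series A to CZI absorption: {series_a_to_absorption} full months "
      f"({days_series_a_to_absorption} days)")
```

With these dates the launch-to-Series A gap is three full months and the Series A-to-absorption gap is 13 full months (406 days), consistent with the "less than 14 months" framing used in this report.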
| Date | Event | Type | Amount / Valuation / Status | Participants | Implication |
|---|---|---|---|---|---|
| 2023 | Company incorporated; founding team departs Meta AI (FAIR) | founding | — | Alex Rives, Tom Sercu, Zeming Lin, Sanjay Rao | Establishes legal entity; founding team coalesces around ESM protein LM research lineage |
| Mar 2024 | Company operational milestone; pre-launch development phase | founding | — | Founding team | Research and engineering ramp; model training on Andromeda H100 cluster begins |
| Jun 25, 2024 | ESM3 public launch and seed round announcement | product | Seed amount undisclosed | Lux Capital, Nat Friedman, Daniel Gross, NVIDIA, Amazon | ESM3 (98B param) publicly released with open-weights and Forge API; seed funding confirmed |
| Jul 2024 | BioRxiv preprint published (doi: 10.1101/2024.07.01.600583) | product | — | Rives, Sercu, Candido, Lin et al. | ESM3 scientific claims available for community review prior to peer review |
| Sep 26, 2024 | $142M Series A announced | financing | $142M at ~$1.35B implied valuation | Amazon (co-lead), NVIDIA (co-lead), Lux Capital, Nat Friedman, Daniel Gross | One of the largest AI biology raises of 2024; institutional validation of protein LM platform |
| Dec 4, 2024 | ESM Cambrian (ESM-C) released | product | — | EvolutionaryScale team | Expanded model family (300M/600M/6B params); broadens inference efficiency options for customers |
| Jan 16, 2025 | ESM3 paper published in Science journal | product | — | Rives et al., Science DOI: 10.1126/science.ads0018 | Peer-reviewed validation of novel GFP design; strengthens scientific credibility and institutional adoption |
| Nov 6, 2025 | Team joins CZ Biohub; EvolutionaryScale absorbed by CZI | adverse | Terms undisclosed | CZI / CZ Biohub, full EvolutionaryScale team | Company ceases as independent entity; investors face uncertain exit; ESM IP transfers to non-profit |
The CZI acquisition event (Nov 2025) is adverse from a commercial investor perspective. The exact seed-round amount remains undisclosed. The Series A valuation is an implied figure from third-party sources, not confirmed in a primary SEC filing. The March 2024 operational launch date is from the chapter context brief; no primary announcement was found.
[CO001, CO008, CO011, CO013, CO015, CO016]
Key milestones from founding in 2023 through ESM3 launch, Series A, ESM Cambrian, Science publication, and CZI acquisition in November 2025.
The March 2024 operational launch date is from the chapter context brief; no primary announcement was found. The Series A valuation of ~$1.35B is a third-party estimate. The post-acquisition ESM IP continuity claim is inferred from the biohub.org announcement, not confirmed by a formal IP transfer agreement.
[CO001, CO002, CO007, CO008, CO010, CO011]
02 Market Analysis
2.1 Market Definition and Boundary
EvolutionaryScale's addressable market has two interconnected layers. The primary layer is the protein language model (PLM) API and platform market—cloud-hosted or on-premise AI models that enable protein engineers to generate, predict, and optimize protein sequences and structures without exhaustive wet-lab directed evolution. This market is distinct from general cloud compute, traditional bioinformatics pipelines, genomic sequencing platforms, medical imaging AI, or clinical trial management software. The secondary layer comprises the broader protein engineering research tools market, which encompasses all reagents, instruments, and software used in recombinant protein research and design—a larger addressable segment into which PLM APIs fit as a high-growth AI sub-category. Status-quo substitutes for protein LM platforms include: (1) AlphaFold2/3 (Google DeepMind) for protein structure prediction—freely available for non-commercial use via a public database of over 200 million predicted structures; (2) Rosetta and PyRosetta (University of Washington) for computational protein design—open-source but computationally intensive and requiring significant expertise; (3) directed evolution in wet lab (iterative random mutagenesis and screening)—expensive, slow (weeks per cycle), and throughput-limited; and (4) traditional molecular dynamics and homology modeling software (GROMACS, MODELLER, Schrödinger Maestro)—established tools for structure-guided design without generative capability. ESM3 and ESM-C differentiate from these substitutes by jointly reasoning over protein sequence, structure, and function within a single unified generative model. Adjacent markets include the AI drug discovery platform market (all computational tools accelerating small-molecule and biologic drug design), which Grand View Research estimates at $2.35B in 2025 growing to $13.77B by 2033 at 24.8% CAGR. 
Industrial biotechnology (enzyme engineering for green chemistry, agriculture, and biomaterials) represents a further adjacency with distinct procurement patterns and lower regulatory burden than pharmaceutical applications. The outer boundary is the global drug discovery market ($71.89B in 2025 per Precedence Research), which includes all modalities and services, well beyond EvolutionaryScale's direct footprint. [CM001, CM002, CM003, CM004, CM005, CM006]
| Market Segment / Category | Included Spend | Excluded Spend | Primary Buyer / Payer | EvolutionaryScale Relevance |
|---|---|---|---|---|
| Protein Language Model API (core) | Cloud API access for protein sequence generation, embedding, structure prediction, and multi-modal reasoning via ESM3/ESM-C | Traditional wet-lab directed evolution, genomic sequencing instruments, general cloud compute without protein-specific models | Pharma/biotech VP Computational Biology, CSO, ML/bioinformatics scientists | Direct revenue source: Forge API, AWS SageMaker JumpStart distribution, NVIDIA BioNeMo NIM microservices |
| Protein Engineering Research Tools (primary adjacent) | All software, AI tools, and services enabling recombinant protein research, design, and optimization | Lab instruments (PCR machines, crystallography), DNA synthesis reagents, directed evolution consumables | Research and development leaders in biopharma, biotech, and industrial biotech | ESM3/ESM-C are AI-native protein design tools that address this broader category; analyst estimates $2.6–5.1B (2023–2025) |
| AI Drug Discovery Platforms (secondary adjacent) | AI-enabled target identification, generative molecule design, virtual screening, ADMET prediction | CRO wet-lab services, clinical trial management, sequencing platforms, medical imaging AI | Pharma R&D Leadership / Chief Scientific Officers, Business Development | ESM3 enables protein target characterization and antibody design, positioning EvolutionaryScale as an enabling platform for AI drug discovery workflows |
| Industrial Biotechnology (adjacent non-pharma) | Enzyme engineering for green chemistry, biofuels, agriculture, specialty materials, food science | Diagnostic tools, medical devices, hospital IT, insurance platforms | Industrial biotech R&D teams, CTO/VP Engineering in biotech manufacturing | Growing secondary buyer segment: ESM3/ESM-C applicable to enzyme optimization workflows; lower regulatory burden than pharma |
| Open-Source / Self-Hosted Substitutes (status quo) | Self-hosted ESM2 (MIT), Rosetta (open-source), AlphaFold (free non-commercial), PyRosetta | None: this category represents the free alternative EvolutionaryScale competes with for paid conversion | Academic labs, well-resourced startup computational teams, internal pharma bioinformatics groups | MIT-licensed ESM2 and the open-weights ESM-C models are direct substitutes for the Forge API; self-hosting is the primary cost competitor |
Market boundaries derived from EvolutionaryScale product documentation, ESM3 Science paper, analyst market reports (MarketsandMarkets, Precedence Research, Grand View Research), and FDA regulatory guidance. The Open-Source substitutes row is included to document the direct competitive dynamic between EvolutionaryScale's free offerings and its paid Forge API. AI drug discovery platform market adjacent estimate is from Grand View Research ($2.35B, 2025).
[CM001, CM002, CM003, CM004, CM005, CM006]
2.2 Market Sizing and Analyst Estimates
Published estimates for the protein engineering market vary considerably based on scope definition. MarketsandMarkets estimates the market at $2.2B (2019) growing to $3.9B by 2024 at a 12.4% CAGR, a narrower estimate that focuses on protein engineering tools and services. Precedence Research takes a broader scope, estimating the market at $5.09B in 2025 growing to $23.59B by 2035 at 16.57% CAGR, incorporating a wider set of protein engineering applications including industrial enzymes and biopharmaceuticals. Allied Market Research estimates $2.2B in 2022 growing to $7.7B by 2032 at 13.2% CAGR. Grand View Research estimates $2.60B in 2023 growing to $7.62B by 2030 at 16.24% CAGR. The directional consensus is clear (low-to-mid-teens percentage annual growth over roughly a decade), but the $2.2B vs. $5.09B base discrepancy reflects differences in scope and base year (2019 or 2022 vs. 2025), not contradictory evidence. The adjacent AI drug discovery market (Grand View Research) is estimated at $2.35B in 2025 growing to $13.77B by 2033 at 24.8% CAGR, a materially faster growth rate than protein engineering tools alone, reflecting the pull from large-pharma AI investment post-AlphaFold. The broader drug discovery market (all modalities) is estimated at $71.89B in 2025 growing to $158.74B by 2034 at 9.2% CAGR (Precedence Research), serving as the outer TAM reference. EvolutionaryScale's serviceable addressable market is the sub-segment of protein engineering tools customers who (a) have computational biology infrastructure, (b) are adopting AI-native tools rather than traditional bioinformatics, and (c) have compute scale justifying a paid API subscription over self-hosting open weights. EvolutionaryScale's current serviceable obtainable market includes: Forge API subscription revenue from biopharma and biotech customers; commercial model access via AWS SageMaker JumpStart and NVIDIA BioNeMo; and enterprise licensing for ESM3 and ESM-C deployment in regulated settings. 
The company raised $142M in Series A funding (Crunchbase), which indicates investor validation of the market opportunity, but no independent SOM figure for PLM APIs within pharmaceutical R&D has been published. This is a material diligence gap, documented in Section 2.5. HuggingFace download metrics (6,320+ for ESM-C, 3,110+ for ESM3) and 129+ bioRxiv citations are proxies for user adoption, not evidence of commercial revenue traction.[CM007, CM008, CM009, CM010, CM011, CM012]
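The stated growth rates can be sanity-checked against the endpoint values quoted above. The following is a minimal sketch, not any publisher's methodology: it recomputes the compound annual growth rate implied by each pair of start and end figures and compares it with the stated CAGR. Dollar values, years, and stated rates are taken from the text; the function and variable names are illustrative, and each forecast window is assumed to run from the first quoted year to the last.

```python
# Implied-CAGR check for the analyst estimates quoted in this section.
# Values are USD billions; names are illustrative, not from any source.

def implied_cagr(start_value, end_value, years):
    """Compound annual growth rate implied by two endpoint values."""
    return (end_value / start_value) ** (1.0 / years) - 1.0

# (publisher, start year, start value, end year, end value, stated CAGR)
ESTIMATES = [
    ("MarketsandMarkets", 2019, 2.2, 2024, 3.9, 0.124),
    ("Precedence Research", 2025, 5.09, 2035, 23.59, 0.1657),
    ("Allied Market Research", 2022, 2.2, 2032, 7.7, 0.132),
    ("Grand View Research (protein eng.)", 2023, 2.60, 2030, 7.62, 0.1624),
    ("Grand View Research (AI drug discovery)", 2025, 2.35, 2033, 13.77, 0.248),
    ("Precedence Research (drug discovery)", 2025, 71.89, 2034, 158.74, 0.092),
]

def check_estimates(estimates=ESTIMATES):
    """Return (publisher, stated, implied, implied - stated) for each row."""
    rows = []
    for name, y0, v0, y1, v1, stated in estimates:
        implied = implied_cagr(v0, v1, y1 - y0)
        rows.append((name, stated, implied, implied - stated))
    return rows

if __name__ == "__main__":
    for name, stated, implied, gap in check_estimates():
        print(f"{name:42s} stated {stated:6.2%}  implied {implied:6.2%}  gap {gap:+.2%}")
    # Ratio of the broadest headline figure to the narrowest one.
    print(f"headline spread: {23.59 / 2.2:.1f}x")
```

Under that windowing assumption every implied rate lands within about half a percentage point of the stated one, so the quoted figures are internally consistent. The small residual gaps may reflect rounding or a forecast window that starts one year after the base year.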
| Publisher | Year | Geography | Market Value (USD) | CAGR | Methodology / Scope | Confidence | Key Limitation |
|---|---|---|---|---|---|---|---|
| MarketsandMarkets (paywalled) | 2019–2024 | Global | $2.2B (2019) → $3.9B (2024) | 12.4% CAGR | Protein engineering tools and services; includes software, services, and some equipment; bottom-up | medium | Narrower scope; excludes industrial enzymes broadly; methodology not fully disclosed at free tier |
| Precedence Research | 2025–2035 | Global | $5.09B (2025) → $23.59B (2035) | 16.57% CAGR | Broad protein engineering market; incorporates industrial applications, biopharmaceuticals, and research tools | medium | Broader scope inflates base vs. MAM; 2035 forecast carries compounding uncertainty; scope inconsistency vs. narrower estimates |
| Allied Market Research | 2022–2032 | Global | $2.2B (2022) → $7.7B (2032) | 13.2% CAGR | Research-focused protein engineering tools; consistent with MAM midpoint trajectory | medium | Scope partially overlapping with MAM and GVR; paywalled primary; confidence limited by secondary reporting |
| Grand View Research – Protein Engineering | 2023–2030 | Global | $2.60B (2023) → $7.62B (2030) | 16.24% CAGR | Protein engineering market including biopharmaceutical protein engineering and industrial applications | medium | Via Wayback Machine snapshot; current live page requires subscription; scope consistent with Allied but higher CAGR |
| Grand View Research – AI Drug Discovery | 2025–2033 | Global (adjacent) | $2.35B (2025) → $13.77B (2033) | 24.8% CAGR | AI-enabled drug discovery software platforms; faster growth than protein engineering tools alone due to pharma AI investment acceleration | medium | Adjacent market; EvolutionaryScale is an enabling tool rather than a full drug discovery platform; partial TAM overlap |
| Precedence Research – Drug Discovery (outer boundary) | 2025–2034 | Global | $71.89B (2025) → $158.74B (2034) | 9.2% CAGR | All drug discovery services and tools; broadest boundary including CRO wet-lab services, instruments, software, and AI tools | medium | Significantly over-inclusive; outer TAM boundary only; most of this market is not addressable by PLM APIs |
| EvolutionaryScale / Crunchbase (SOM proxy) | 2024–2026 | Global | $142M raised (Series A) as investor-validated proxy for commercial opportunity | N/A | Series A round as funding proxy for commercial potential; HuggingFace downloads (6,320+ ESM-C) as developer traction indicator | low | No SOM figure published; funding and download metrics are proxies, not revenue evidence; Forge API pricing undisclosed |
All primary analyst report TAMs are paywalled; values obtained from accessible landing pages, Wayback Machine snapshots, and secondary summaries. The roughly 10× spread between the narrowest ($2.2B, MAM 2019) and broadest ($23.59B, Precedence 2035) estimates is primarily attributable to scope differences and forward projection period, not genuine market disagreement. The EvolutionaryScale SOM proxy is analytical, not published; no independent PLM API sub-market sizing was identified. Growth rates are directionally consistent at 12–17% for protein engineering, versus 24.8% for the adjacent AI drug discovery market.
[CM007, CM008, CM009, CM011, CM012, CM013]
Four-level sizing pyramid for EvolutionaryScale's market: TAM outer boundary (all drug discovery services globally), TAM addressable (protein engineering tools market), SAM (AI-native protein design tools and APIs within protein engineering), and SOM (EvolutionaryScale's current Forge API and distribution channel revenue zone), as of 2026.
TAM outer from Precedence Research ($71.89B, 2025); TAM addressable midpoint of four analyst estimates (MAM, Allied, GVR, Precedence). SAM is an analytical estimate not independently published; derived from the fraction of protein engineering tools that are AI-native software-only (excluding instruments and reagents). SOM reflects commercial API and cloud distribution only; Forge API pricing and subscriber count are undisclosed. All figures are directional.
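The note above takes a midpoint of four analyst estimates without stating how their different base years are aligned. One way this could be done is sketched below, under two assumptions that are not in the source: each estimate is rolled forward to 2025 at its publisher's own stated CAGR, and "midpoint" is read as either the median or the mean. Names are illustrative.

```python
from statistics import mean, median

# publisher -> (base year, value in USD billions, stated CAGR), from the table above.
# MarketsandMarkets is entered at its 2024 figure, as in the exhibit notes.
BASES = {
    "MarketsandMarkets": (2024, 3.9, 0.124),
    "Allied Market Research": (2022, 2.2, 0.132),
    "Grand View Research": (2023, 2.60, 0.1624),
    "Precedence Research": (2025, 5.09, 0.1657),
}

def roll_forward(value, cagr, from_year, to_year):
    """Compound a value from from_year to to_year at a constant annual rate."""
    return value * (1.0 + cagr) ** (to_year - from_year)

def normalized(bases=BASES, target_year=2025):
    """Express every estimate in target_year terms (assumed alignment method)."""
    return {
        name: roll_forward(value, cagr, year, target_year)
        for name, (year, value, cagr) in bases.items()
    }

if __name__ == "__main__":
    vals = normalized()
    for name, v in vals.items():
        print(f"{name:24s} ${v:.2f}B (2025-equivalent)")
    figures = list(vals.values())
    print(f"median ${median(figures):.2f}B, mean ${mean(figures):.2f}B")
```

Even in a common year the four figures still range from roughly $3.2B to $5.1B, which is consistent with the report's point that scope, more than timing, drives the gap between publishers.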
[CM006, CM007, CM008, CM009, CM011, CM012]
Low/base/high estimates across four market sizing lenses in USD billion: protein engineering tools market (2024–2025 base), AI drug discovery adjacent market (2025), protein engineering 2030 forecast, and broader drug discovery market (2025). All values in USD billion for consistent comparison.
Protein Engineering 2024–2025: low=MAM 2024 estimate ($3.9B base; $2.2B is the 2019/2022 entry), mid=MAM $3.9B, high=Precedence $5.09B. AI Drug Discovery 2025: low=conservative industry estimate, mid=GVR base ($2.35B), high=market expansion scenario. Protein Engineering 2030 forecast: low=linear (non-compounding) extrapolation at 12% per year from $3.9B, mid=GVR $7.62B, high=Precedence trajectory projection. Drug Discovery outer boundary: anchored on Precedence Research $71.89B midpoint. All values in USD billion; incompatible units excluded.
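The 2030 low case is described as a linear extrapolation at 12% from $3.9B, and whether growth is simple or compounded changes the result materially. The sketch below compares the two readings. The inputs come from the note above; the 2024 start year is an assumption based on the MarketsandMarkets 2024 figure, and the function names are illustrative.

```python
# Simple vs. compound growth for the 2030 "low" case.
BASE_2024 = 3.9        # USD billions (MarketsandMarkets 2024 figure)
RATE = 0.12            # 12% per year, as stated in the note
YEARS = 2030 - 2024    # six annual steps (assumed start year: 2024)
GVR_MID_2030 = 7.62    # USD billions, the mid case in the same exhibit

def linear_projection(base, rate, years):
    """Simple growth: the same absolute increment (base * rate) every year."""
    return base * (1.0 + rate * years)

def compound_projection(base, rate, years):
    """Compound growth: each year grows on the previous year's value."""
    return base * (1.0 + rate) ** years

low_linear = linear_projection(BASE_2024, RATE, YEARS)
low_compound = compound_projection(BASE_2024, RATE, YEARS)

if __name__ == "__main__":
    print(f"linear   2030: ${low_linear:.2f}B")
    print(f"compound 2030: ${low_compound:.2f}B")
    print(f"GVR mid  2030: ${GVR_MID_2030:.2f}B")
```

With these inputs the simple-growth reading gives about $6.7B, below the $7.62B mid case, while compounding gives about $7.7B, slightly above it. The "low" label is therefore only consistent with the non-compounding reading.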
[CM007, CM008, CM009, CM011]
2.3 Buyer and User Segmentation
The primary commercial buyer for EvolutionaryScale's Forge API is the large and mid-tier pharmaceutical or biotech company with an active computational biology or protein engineering program. The economic buyer is typically a VP of Computational Biology, Director of Drug Discovery, or Chief Scientific Officer, with procurement authority delegated through R&D and IT budgets. The technical champion—who evaluates model capabilities and advocates for adoption—is the computational biologist, structural biologist, or machine learning scientist embedded in discovery teams. Finance and procurement officers act as formal payers, requiring ROI justification via reduction in wet-lab screening cycles or accelerated lead identification timelines. The Forge commercial API targets enterprise customers needing large-scale protein sequence generation and embedding with guaranteed availability and compliance. AWS SageMaker JumpStart and NVIDIA BioNeMo serve as secondary commercial channels, reaching pharma customers already embedded in those cloud ecosystems. Academic and government research labs represent a high-volume but non-revenue user segment that downloads ESM-C open weights under MIT license from HuggingFace (6,320+ downloads) and installs via the PyPI `esm` package—building community mindshare that feeds commercial pipeline. Industrial biotechnology companies—engineering enzymes for green chemistry, agriculture, and specialty materials—represent a materially different buyer segment with shorter product development cycles and lower regulatory burden. Contract research organizations offer a dual role: potential buyers of protein LM APIs to enhance their service offerings, and potential channel partners who resell compute-enabled protein engineering services to pharma clients. 
Biotech startups at the Series A–B stage represent an emerging paid segment: they have computational infrastructure but lack the internal resources to train frontier protein LMs, making Forge API access economically rational.[CM015, CM016, CM017, CM018, CM019, CM020]
| Segment | Economic Buyer | Technical Champion / User | Payer | Workflow Need | Budget Owner | Primary Adoption Trigger |
|---|---|---|---|---|---|---|
| Top-20 Global Pharma (e.g., Pfizer, Roche, Novartis) | VP Computational Biology / Chief Scientific Officer | Computational Biologist / ML Scientist / Structural Biologist | R&D Budget Committee / CFO | Protein sequence optimization for biologics and ADCs; antibody engineering; target characterization; virtual library generation | CSO / VP R&D with CFO approval for enterprise contracts | Pipeline productivity mandate: reduce directed evolution cycles; integrate AI protein design into existing informatics stack |
| Mid-Tier Biopharma / Biotech ($50M–$2B revenue) | VP R&D / Director Computational Biology | Computational Scientist / Research Engineer | R&D / Series B–C investor capital | Lead optimization and protein stability engineering with limited internal compute resources; access frontier LMs without training infrastructure | CTO or VP R&D with board sign-off above threshold | Fundraising milestone: need in silico validation data for next round; cost advantage vs. internal LM training |
| Academic / Government Research Labs | Principal Investigator / Department Head | Postdoctoral Researcher / Graduate Student | NIH grants, government funding, institutional IT | Protein function prediction, evolutionary analysis, variant effect prediction for basic research; no commercial intent | PI with grant budget authority; no formal procurement cycle | Open-weight availability (ESM-C MIT license on HuggingFace); integration with existing PyTorch workflows at zero direct cost |
| Industrial Biotech (Enzyme Engineering, Synthetic Biology) | VP Protein Engineering / Chief Technology Officer | Enzyme Engineer / Protein Scientist / Computational Chemist | R&D budget / product development capital | Enzyme engineering for specific activity, stability, pH/temperature tolerance; de novo protein design for biomaterials and green chemistry | CTO / VP Product Development | Shorter product cycles vs. pharma; demonstrated ROI via reduced wet-lab screening iterations; lower regulatory risk |
| Contract Research Organizations (CROs) | VP Scientific Services / Business Development | Computational Biologist / Protein Modeling Specialist | CRO operating budget (tools integrated into service cost) | Enhanced computational protein engineering service offering to pharma and biotech clients; differentiation vs. wet-lab-only CROs | Operations / Finance with scientific leadership input | Competitive differentiation: offer AI-native protein engineering services that pure wet-lab CROs cannot; resell value created by ESM3/ESM-C |
Buyer segments derived from EvolutionaryScale product documentation (Forge API, ESM-C on AWS/NVIDIA), NVIDIA BioNeMo and AWS SageMaker distribution announcements, HuggingFace developer adoption data, and analogous SaaS API commercial structures. Academic/government segment is highest volume but non-revenue under the free open-weight tier. Top pharma and mid-tier biotech are the primary commercial targets. Budget ownership structures are archetypes; actual organizational authority varies by company size and culture.
[CM015, CM016, CM017, CM018, CM019, CM020]
Matrix mapping EvolutionaryScale buyer segment against economic buyer role, technical champion, and primary adoption trigger for Forge API and ESM-C purchasing or adoption decisions.
Buyer roles are archetypes derived from EvolutionaryScale Forge API documentation, ESM-C distribution announcements (AWS SageMaker, NVIDIA BioNeMo), HuggingFace developer community signals, and analogous SaaS API commercial models. Actual organizational titles and approval thresholds vary. Academic/government row reflects open-weight non-commercial adoption only, with no paid revenue contribution under current MIT license model.
[CM014, CM015, CM016, CM017, CM018, CM019]
2.4 Growth Drivers and Adoption Constraints
Five structural forces drive protein LM market growth. First, exponential protein sequence database expansion: DNA sequencing costs declined from ~$10,000 per genome in 2011 to ~$100 per genome by 2023 (National Human Genome Research Institute), enabling the generation of billions of new protein sequences. ESM3 was trained on 2.78 billion protein sequences with 98 billion parameters, a scale only achievable because of this data democratization. Second, AlphaFold2/3 (Google DeepMind) generated a free database of 200+ million protein structures, removing the historical $500,000+ crystallography cost barrier, and trained the field to accept computational protein tools as production-grade. Third, NVIDIA BioNeMo delivers 2× faster biofoundation model training and 6× faster model inference, reducing the total cost of ownership for enterprise PLM deployment; this extends EvolutionaryScale's distribution reach but also makes self-hosted alternatives to the Forge API cheaper to run. Fourth, FDA regulatory engagement is accelerating: over 500 AI/ML-enabled drug development submissions were received between 2016 and 2023, a 2025 draft guidance was issued, and the CDER AI Council was established in 2024, reducing regulatory ambiguity for pharma customers evaluating AI-native discovery tools. Fifth, the ESM-C open-weight release under MIT license establishes EvolutionaryScale as the community standard for protein LMs, mirroring the Hugging Face open-weight strategy that drove commercial cloud API conversion in NLP.
Four material constraints limit adoption pace. First, open-source commoditization: ESM-C weights are available for free under MIT license; any well-resourced lab can self-host, reducing paid conversion rates and pricing power at the non-frontier end of the market. Second, wet-lab validation requirements: no protein designed purely by computational AI has entered regulatory approval without extensive in vitro and in vivo confirmation, so the API has value but cannot remove the experimental bottleneck. Third, enterprise procurement friction: pharma IT security reviews, cloud data governance policies, and multi-year vendor vetting cycles add 12–24 months to commercial deployment timelines relative to academic adoption. Fourth, competitive pressure from Big Tech: Google DeepMind, NVIDIA BioNeMo, and AWS HealthOmics all have distribution, compute, and ecosystem advantages that could threaten EvolutionaryScale if frontier protein LM capabilities converge toward commodity.[CM022, CM023, CM024, CM025, CM026, CM027]
| Driver / Constraint | Direction | Timing | Implication for EvolutionaryScale | Diligence Ask |
|---|---|---|---|---|
| DNA sequencing cost democratization (~$100/genome by 2023, down from $10,000 in 2011) | Growth driver | Structural, ongoing | Enabled training of ESM3 on 2.78B protein sequences; expanding protein databases sustain competitive moat for frontier model training | Confirm that EvolutionaryScale has ongoing data pipeline access and compute budget to retrain on new sequences as databases expand |
| AlphaFold2/3 free protein structure database (200M+ structures) | Growth driver | Now, accelerating | Removes the historical crystallography cost barrier; normalizes computational protein tools in pharma R&D; expands the addressable buyer base for ESM3/ESM-C by removing 'is this trustworthy?' friction | Confirm ESM3's training data overlap with AlphaFold structural database; assess whether EvolutionaryScale co-trains on structural inputs |
| NVIDIA BioNeMo 2× training / 6× inference speedup | Growth driver | Now, hardware-dependent | Reduces total cost of ownership for enterprise ESM deployment; strengthens AWS and NVIDIA partnership distribution channel | Verify that EvolutionaryScale's ESM-C NIM microservices are tested and GA on BioNeMo; assess revenue share or referral structure |
| FDA regulatory AI engagement: 500+ submissions 2016–2023, 2025 draft guidance, CDER AI Council 2024 | Growth driver | Emerging, 2025–2027 | Reduces regulatory uncertainty for pharma partners evaluating AI-native discovery tools; signals FDA acceptance of AI in IND submissions | Confirm whether any ESM3/Forge-enabled biology has been referenced in an IND or regulatory filing; obtain current FDA guidance applicability assessment |
| Biopharma R&D spend growth (pharma R&D budgets ~5–6% annual growth) | Growth driver | Structural, long-term | Expanding total procurement budget in target buyer segment supports Forge API revenue growth even at constant market share | Track top-pharma R&D budget disclosures annually; assess computational biology as proportion of R&D budget in Forge target accounts |
| Open-source commoditization: ESM-C free under MIT license; ESM2 freely available from Meta | Adoption constraint | Now, persistent | Self-hosting displaces paid Forge conversions for price-sensitive customers with GPU access; limits pricing power at smaller model tiers | Quantify the fraction of HuggingFace ESM-C downloaders who subsequently convert to paid Forge API; assess self-hosting economics at 6B parameter scale |
| Wet-lab validation requirement: no AI-designed protein has entered regulatory approval without experimental confirmation | Adoption constraint | Structural, long-term | Pure API value proposition is limited to computational workflow steps; cannot replace in vitro / in vivo validation; limits per-user revenue ceiling for platform-only plays | Confirm whether EvolutionaryScale plans to offer wet-lab validation partnerships or data integration services as part of Forge enterprise offering |
| Enterprise procurement friction: 12–24 months for pharma cloud vendor vetting | Adoption constraint | Now, structural | Commercial deployment timelines longer than academic adoption; impacts revenue conversion from pilot to enterprise contract | Obtain reference customer disclosure of time from first API trial to enterprise Forge contract; assess SOC2 and GxP compliance certifications |
Drivers and constraints derived from FDA regulatory guidance page, NVIDIA BioNeMo technical documentation, EvolutionaryScale ESM3 Science paper, AlphaFold public database disclosures, National Human Genome Research Institute sequencing cost data, and HuggingFace/GitHub developer adoption metrics. Patent cliff and pharma R&D spend figures from IQVIA and Statista secondary sources. No single source covers all items; synthesis reflects convergence across multiple evidence types.
[CM022, CM023, CM024, CM025, CM026, CM027]
Protein LM API adoption funnel from all potential enterprise users globally through to active EvolutionaryScale Forge commercial subscribers, illustrating conversion stages and estimated magnitude at each step as of 2026.
Funnel counts are analytical estimates; no authoritative published survey of PLM API adoption stages exists. Total global protein engineering companies estimated from biotech/pharma industry databases. Companies with computational biology budgets derived from proportion of top-1000 pharma/biotech companies maintaining dedicated in-house computational teams. Free-tier ESM user count extrapolated from HuggingFace download metrics (6,320+ ESM-C downloads) and GitHub stars. Commercial Forge and enterprise contract counts are estimates; EvolutionaryScale has not publicly disclosed subscriber or contract counts. All figures are directional order-of-magnitude estimates requiring diligence verification.
[CM020, CM021, CM027, CM031]
2.5 Sizing and Adoption Diligence Gaps
Several material evidence gaps limit the precision of the market analysis. First, no independently published serviceable addressable market figure exists for protein language model APIs within pharmaceutical R&D specifically. The protein engineering market estimates ($2.2B–$23.59B) span scopes that include reagents, instruments, and services; the pure API/software sub-segment has not been sized in any accessible analyst report. Deriving a PLM API SAM requires analytical estimates of what fraction of protein engineering spend is addressable by computational tools, an assumption-heavy inference without an independently verifiable basis. Second, EvolutionaryScale's Forge API pricing and actual paid subscriber count have not been publicly disclosed. HuggingFace downloads (6,320+ ESM-C, 3,110+ ESM3) and PyPI install counts indicate developer traction, but do not translate to paid commercial revenue without knowledge of the Forge conversion funnel and pricing structure. Third, the bioRxiv preprint search returned 129 papers citing ESM3/EvolutionaryScale, and the ESM3 Science paper (DOI: 10.1126/science.ads0018) has been cited extensively, but academic citation impact does not directly map to commercial market penetration in pharma. Fourth, analyst estimates for the protein engineering market span a roughly 10× range ($2.2B to $23.59B), but the endpoints are a 2019 base-year figure and a 2035 forecast rather than like-for-like years. The range is driven primarily by scope differences (tools-only vs. instruments+reagents+services) and projection horizon rather than genuine market disagreement, and investors comparing across sources without adjusting for scope and year may reach erroneous conclusions.
Evidence preserved for successor chapters: (a) no paid customer disclosures have been identified—this is a diligence gap for Chapter 4 (Business Model); (b) comparable protein LM API pricing benchmarks against OpenAI API, Anthropic API, and AWS Bedrock should be obtained in Chapter 3 (Competitors); (c) whether ESM3's commercial Forge API has been used in any IND submission or regulatory filing has not been confirmed—this gap should be addressed in Chapter 5 (Technology) or Chapter 7 (Regulatory).[CM032, CM033, CM034, CM035, CM036]
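The SAM inference described in the first gap above can be written as a product of the addressable market and two scope fractions. This is a sketch of the structure of that inference, not a published formula: the symbols f_sw (share of protein engineering spend that is software only) and f_AI (share of that software spend going to AI-native tools) are introduced here for illustration, and no values for them appear in the sources.

```latex
\mathrm{SAM}_{\text{PLM API}} \;\approx\; \mathrm{TAM}_{\text{PE}} \cdot f_{\text{sw}} \cdot f_{\text{AI}},
\qquad 0 \le f_{\text{sw}},\, f_{\text{AI}} \le 1
```

Because both fractions are unobserved, a relative error in either one passes one-for-one into the SAM, and errors in the two compound. This is why any SAM figure built this way should be treated as directional until the fractions can be sourced.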
2.6 Exhibits
03 Competitors
3.1 Competitive Universe and Category Segmentation
EvolutionaryScale competes at the intersection of protein language modeling, generative biology, and AI-enabled drug discovery. The competitive universe divides into four segments. The first and most direct segment is AI protein design platform peers: companies that build and commercialize foundation models specifically for protein engineering and design. Profluent Bio (San Francisco, ~$44M raised) uses ProGen-derived models and produced OpenCRISPR—described on its website as the world's first AI-designed gene editor. Cradle.bio (Amsterdam, ~$73M raised) offers a SaaS protein engineering platform that integrates proprietary wet-lab data cycles with customer data, claiming 2–12x faster development timelines; it counts Novonesis (formerly Novozymes) among its customers. Generate Biomedicines (Somerville, MA, ~$700M+ raised) operates the Generate Platform—a continuously trained generative biology loop that has generated, built, and tested over 42,000 proteins across 140,000+ sq ft of lab space—with active major-pharma partnerships. AbSci Corporation (NASDAQ: ABSI; Vancouver, WA) is the only publicly traded direct peer, with an AI Drug Creation Platform for de novo antibody design using 6-week iterative cycles and a FY2025 10-K filed with the SEC in March 2026. Adaptyv Bio (Lausanne, Switzerland) positions as a cloud lab for protein designers at the Biopole Life Science Campus. The second segment comprises foundation-model and broader bio-AI peers. Isomorphic Labs (London, Alphabet subsidiary) holds the exclusive commercial license to AlphaFold 3 for drug discovery, following the joint Google DeepMind announcement in May 2024, and signed landmark deals with Eli Lilly and Novartis in early 2024. Chai Discovery (San Francisco) released Chai-1 as an open model and is advancing Chai-2 for drug-like de novo antibody design with atomic precision. 
Xaira Therapeutics (San Francisco, ~$1B raised at launch in April 2024) is building predictive and agentic AI models across the complete drug discovery spectrum. Iambic Therapeutics uses Enchant and NeuralPLexer AI technologies and has Phase 1b data for IAM1363 (HER2 inhibitor). Inceptive (Palo Alto/Berlin/Zurich, founded 2021) specializes in RNA/mRNA/siRNA/ASO/peptide foundation models. The third segment includes AI drug-discovery integrators. Insilico Medicine (HKEX: 3696) has the most advanced clinical proof among AI-first companies, having completed a Phase 2 trial with ISM001-055 (TNIK inhibitor for IPF). Recursion Pharmaceuticals (NASDAQ: RXRX) acquired Exscientia and operates over 50 petabytes of phenomics data with BioHive-2 (built with NVIDIA). Schrödinger (NASDAQ: SDGR) is the dominant physics-based platform with 30+ years of R&D and FEP+, WaterMap, and LiveDesign tools. The fourth segment covers academic and open-source actors. The Institute for Protein Design (Baker Lab, UW), whose co-director David Baker shared the 2024 Nobel Prize in Chemistry, distributes RFdiffusion and RoseTTAFold as free open-source tools and has a royalty-free COVID-19 vaccine approved in the UK and South Korea. Google DeepMind makes AlphaFold 3 available via AlphaFold Server (free, non-commercial) and the EMBL-EBI AlphaFold DB (200M+ structures, CC-BY-4.0). The most strategically significant open-source threat is Meta's ESM model family (ESM2, ESMFold), released under an MIT license and created at FAIR by a team that included Alexander Rives, Zeming Lin, Tom Sercu, and Salvatore Candido. Rives, Lin, and Sercu are listed among EvolutionaryScale's co-founders, so the company's own prior work sets the commoditization floor for basic protein language modeling. [CP001, CP002, CP003, CP004, CP005, CP006]
| Competitor | Category | Scale / Funding | Target Segment | Key Product | Differentiation | Limitation |
|---|---|---|---|---|---|---|
| Profluent Bio | Direct AI protein design | ~$44M raised (est.) | Biotech / pharma protein engineering | ProGen-based models; OpenCRISPR | First AI-designed gene editor; open-access publication strategy | Smaller compute/data scale vs ESM3; limited wet-lab |
| Cradle.bio | Direct AI protein design | ~$73M raised (est.) | Biopharma; industrial biotech | SaaS protein engineering platform; in-house wet lab | Data-flywheel lock-in; SOC2; Novonesis partnership; no-royalties model | No large foundation pre-training; customer-data model limits generalization |
| Generate Biomedicines | Direct AI protein design | ~$700M+ raised (est.) | Biopharma therapeutics | The Generate Platform; 42K+ proteins tested; 140K+ sq ft lab | Most capital; full wet-lab loop; major pharma partnerships | No public API; partnership-only access; highly capital-intensive |
| AbSci Corporation | Direct AI protein design | NASDAQ:ABSI (public) | Biopharma antibody programs | AI Drug Creation Platform; 6-week cycles; ABS-201 candidate | Publicly traded transparency; iterative wet-lab + AI; ABSI SEC filings | Quarter-by-quarter revenue pressure; no approved product yet |
| Adaptyv Bio | Direct AI protein design | Early-stage (undisclosed) | Academic; smaller biotech | Cloud lab for protein designers | Swiss life-science hub location; Biopole campus | Very limited public information; product scope unclear |
| Isomorphic Labs | Foundation-model bio-AI | Alphabet-funded (undisclosed) | Major pharma drug discovery | AlphaFold 3 commercial license; drug discovery platform | Exclusive commercial AF3 rights; Alphabet resources; Lilly/Novartis deals | Non-commercial AF3 still free via Server; limited to drug discovery |
| Chai Discovery | Foundation-model bio-AI | Private (undisclosed) | Drug discovery; antibody design | Chai-2 de novo antibody design | Atomic-precision antibody design; Chai-1 open-released | Early-stage; compute scale smaller than EvolutionaryScale |
| Baker Lab / IPD (UW) | Academic / open-source | NSF/NIH/public funding | Global academic; spinout biotech | RFdiffusion; RoseTTAFold; protein therapeutics | Nobel 2024 (Baker); royalty-free tools; approved COVID-19 vaccine spinout | Not commercial; distributes tools that compete with paid offerings |
| DeepMind AlphaFold | Academic / platform | Alphabet (unlimited) | Global research; pharma | AlphaFold 3; AFDB (200M+ structures; CC-BY-4.0) | Freely available for research; 3M+ users in 190+ countries | AlphaFold Server limited to non-commercial use; commercial AF3 drug discovery use via Isomorphic exclusively |
| Meta FAIR / ESM2 | Open-source baseline | Meta (unlimited) | All biology researchers | ESM2 (MIT license); ESMFold | MIT license for all uses including commercial; built by researchers who later founded EvolutionaryScale | Not maintained/updated post-2023; no multimodal reasoning; superseded by ESM3 within the ESM lineage |
Funding figures for Profluent (~$44M), Cradle (~$73M), and Generate Biomedicines (~$700M+) are estimates from public reporting and analyst context; exact totals are not officially confirmed by all companies. AbSci is publicly traded (NASDAQ:ABSI) with SEC-disclosed financials. Isomorphic Labs, Chai Discovery, and Adaptyv Bio do not disclose funding publicly. "Limitation" cells reflect publicly observable constraints, not internal assessments.
[CP001, CP002, CP003, CP004, CP005, CP006]
Ordinal positioning of major competitors on two evidence-backed axes: research-tool vs. clinical/drug-pipeline focus (x-axis) and open/freely accessible vs. proprietary/commercial (y-axis). Positions are evidence-backed ordinal scores (1–5), not numeric metrics.
Axis positions are evidence-backed ordinal scores (1=most research/open, 5=most clinical/proprietary). Numeric values are not metric measurements; they represent relative ordering based on public information about product type, licensing, and pipeline status as of May 2026.
[CP001, CP002, CP007, CP009, CP010, CP011]
3.2 Direct AI Protein Design Peers: Detailed Profiles
Among direct AI protein design platform peers, each carries a distinct go-to-market model and differentiation strategy relative to EvolutionaryScale.
Profluent Bio (San Francisco, 2022) focuses on protein-centric AI and has taken a public-good strategy with OpenCRISPR, publishing the first AI-designed gene editor as open access. The company's core ProGen model architecture is designed for protein sequence generation. Profluent's estimated ~$44M funding base is substantially smaller than EvolutionaryScale's $142M, implying more limited compute and data infrastructure.
Cradle.bio (Amsterdam, 2021) differentiates through a SaaS platform that integrates customer wet-lab data with proprietary protein engineering models. Its model improves each time customers upload experimental results ("models learn with you"), creating a data-flywheel lock-in effect. Cradle claims customers achieve 2–12x faster protein development timelines and explicitly retains a no-royalties subscription model where customers own all generated IP. The company operates its own wet lab in Amsterdam as a proof-of-concept validation layer. Through the Novonesis partnership, one of the world's largest industrial biotech companies is embedding Cradle's AI into its innovation workflow. The platform is SOC 2 compliant and supports single sign-on.
Generate Biomedicines (Somerville, MA, 2018) is the most heavily funded direct peer, with over $700M raised. Its Generate Platform integrates AI model training, high-throughput protein expression, and iterative learning across 140,000+ sq ft of lab space. The company has tested 42,000+ proteins with a continuously improving feedback loop. Its lead program GB-0895 targets TSLP for asthma, co-optimized from the start for improved biological effect and reduced dosing frequency (twice-yearly potential). Active partnerships with major biopharma firms signal platform-level pharma validation.
AbSci Corporation (NASDAQ: ABSI, Vancouver, WA) is the only publicly traded direct competitor. Its AI Drug Creation Platform integrates wet-lab and AI in iterative 6-week cycles for de novo biologic (antibody) design and multi-parametric lead optimization. AbSci's ABS-201 is an AI-designed antibody targeting prolactin receptors for androgenetic alopecia with demonstrated in vivo hair follicle regeneration. A 10-K for FY2025 was filed with the SEC on March 24, 2026, providing public disclosure not available from private peers, but also exposing AbSci to quarterly revenue pressure.
Adaptyv Bio (Lausanne, Switzerland) positions itself as a cloud lab for protein designers at the Biopole Life Science Campus. Public information is significantly limited compared to other direct peers, suggesting early-stage or pre-public status. [CP018, CP019, CP020, CP021, CP022, CP023]
| Capability | EvolutionaryScale (ESM3) | Profluent | Cradle.bio | Generate Biomedicines | AbSci | AlphaFold 3 | Meta ESM2 |
|---|---|---|---|---|---|---|---|
| Multimodal (seq + struct + func jointly) | Yes — simultaneous reasoning | Sequence-centric (partial) | Within-project optimization | Generate-build-measure loop | Wet-lab + AI iterative | Struct + molecule interactions | No — sequence only |
| De novo generative design | Yes (ESM3 generative) | Yes (ProGen-based) | Yes (guided by lab data) | Yes (platform core) | Yes (de novo antibody) | Limited (structure prediction) | Limited (embedding / prediction) |
| Self-service API / developer access | Yes (Forge, public beta) | Unknown | Yes (SaaS platform) | No (partnership only) | No (partnership only) | Yes (AlphaFold Server, free) | Yes (HuggingFace, MIT, free) |
| Wet-lab validation loop | No (no disclosed wet lab) | Unknown | Yes (Amsterdam wet lab) | Yes (140K+ sq ft) | Yes (integrated cycles) | No | No |
| Commercial license (non-research) | Yes (Forge paid) | Unknown | Yes (subscription) | Yes (partnership) | Yes (partnership + public) | Yes (via Isomorphic Labs) | Yes (MIT, free) |
| Largest model scale (parameters) | 98B (largest disclosed) | ~1B–10B (estimated) | Unknown | Unknown | Unknown | Unknown (large) | 15B (largest free variant) |
Unknown cells reflect absence of public disclosure; they are evidence gaps, not negatives. "Limited" indicates capability exists only partially. AbSci's "public" qualifier refers to NASDAQ:ABSI trading status, not public API. Meta ESM2 15B is the largest publicly available free protein LM. AlphaFold 3 parameter count is not publicly disclosed.
[CP003, CP004, CP005, CP019, CP020, CP027]
3.3 Pricing, Packaging, and Distribution Comparison
Pricing models across the AI protein design landscape reflect fundamentally different go-to-market philosophies. EvolutionaryScale offers Forge, a commercial API platform for ESM3 and ESMC access, with public beta announced in early 2025. The ESM3 small model (open) and ESMC (300M/600M) are available freely on HuggingFace for non-commercial and research use, while larger models and generative API access require Forge accounts. Cradle.bio explicitly positions as a software subscription ("no royalties, just a software subscription fee"), where customers retain all generated IP and their experimental data is never used to train models for others. This SaaS stickiness contrasts with EvolutionaryScale's usage-based API model. Meta's ESM2 is MIT-licensed and freely available on GitHub and HuggingFace for all uses including commercial, at zero cost. ESMFold structure prediction is similarly open. DeepMind's AlphaFold DB (200M+ structures) is CC-BY-4.0 for any use; AlphaFold Server is free for non-commercial research. Commercial AF3 use in drug discovery routes through Isomorphic Labs' exclusive licensing arrangements with pharma. Generate Biomedicines and AbSci both operate B2B partnership-and-licensing models with pharma majors rather than self-service APIs. Neither offers a public pricing page or developer API. Baker Lab/IPD tools (RFdiffusion, RoseTTAFold) and OpenFold are entirely free under permissive licenses. Key diligence gap: Forge enterprise list pricing, volume tiers, and realized per-call costs are not publicly disclosed. Isomorphic Labs' deal economics with Lilly and Novartis are reported but not publicly broken down. [CP027, CP028, CP029, CP030, CP031, CP032]
| Vendor | Access Model | Approximate Price / Tier | Included Capabilities | IP / Data Arrangement | Key Unknown / Gap |
|---|---|---|---|---|---|
| EvolutionaryScale (Forge) | API (usage-based) | Public beta; enterprise pricing undisclosed | ESM3 and ESMC inference; sequence/struct/function generation | User retains IP of generated proteins | Enterprise list price and volume tiers not public |
| Cradle.bio | SaaS subscription | Undisclosed list price | Protein engineering; multi-property optimization; data mgmt | Customer retains full IP; no royalties; data never shared | Exact subscription cost undisclosed |
| Meta ESM2 / ESMFold | Open-source (MIT license) | Free (zero cost) | Sequence embeddings; structure prediction (ESMFold) | MIT — any use including commercial | None — freely available |
| DeepMind AlphaFold (non-commercial) | Free server + DB (research) | Free for non-commercial (CC-BY-4.0) | 200M+ structures; AF3 predictions | CC-BY-4.0 for research; commercial only via Isomorphic | Isomorphic Labs commercial pricing not public |
| Baker Lab / IPD tools | Open-source (permissive) | Free | RFdiffusion; RoseTTAFold; design tools | Royalty-free; open-source | None — fully open |
| Generate Biomedicines | Strategic partnerships | Negotiated (undisclosed) | Full lab + AI generative biology platform | Partnership and licensing arrangements | Not self-service; pricing private; deal terms private |
| AbSci (NASDAQ:ABSI) | Partnerships + SEC-disclosed revenue | Negotiated per partnership | AI antibody design + wet-lab validation cycles | Partnership licensing; FY revenue in SEC filings | Realized economics per deal not itemized |
| Schrödinger (NASDAQ:SDGR) | Enterprise software license | ARR ~$130–150M est. (2024 analyst) | FEP+; WaterMap; LiveDesign; drug pipeline programs | Software license; pipeline via separate entity | Exact per-seat or per-module pricing private |
Pricing data is based on publicly available information. EvolutionaryScale Forge enterprise pricing, Cradle subscription cost, and Isomorphic Labs deal economics are undisclosed. Schrödinger ARR is an analyst estimate from 2024 coverage, not company-disclosed. AbSci revenue details appear in NASDAQ:ABSI quarterly/annual filings.
[CP027, CP028, CP029, CP030, CP031, CP032]
3.4 Moat Durability, Commoditization Risk, and Adverse Analysis
EvolutionaryScale's most structurally critical competitive risk is the open-source commoditization of protein language models. ESM2 and ESMFold—developed by Alexander Rives, Zeming Lin, Tom Sercu, and Salvatore Candido at Meta AI FAIR—are available on GitHub and HuggingFace under the MIT license for any purpose including commercial use. ESM2 variants span 8M to 15B parameters; ESMFold predicts protein structure up to 60x faster than prior state-of-the-art. OpenFold provides an independent open reimplementation of AlphaFold with permissive licensing. Any biotech or pharma can deploy these free models as a baseline, directly compressing the value of ESM3 for applications that do not require multimodal prompting or frontier scale. EvolutionaryScale attempts to escape this floor through three strategies: (1) frontier scale—ESM3 at 98B parameters and 10^24 FLOPs is the largest protein language model with demonstrated emergent generative capabilities; (2) multimodal differentiation— ESM3 is the first model to jointly reason over protein sequence, structure, and function, a capability absent from ESM2; and (3) R&D velocity—ESM Cambrian (ESMC) was released in December 2024, sustaining a cadence of new model releases. However, several adverse signals are present. First, Chai Discovery's Chai-2 advances atomic-precision de novo antibody design, and Isomorphic Labs holds AlphaFold 3 commercial exclusivity, meaning the multimodal protein design space is crowding. Second, clinical proof-of-concept moats are held by Insilico Medicine (Phase 2 completion for an AI-designed drug) and Recursion (multi-program clinical pipeline), while EvolutionaryScale has no disclosed internal pipeline. Third, Generate Biomedicines' capital base ($700M+) far exceeds EvolutionaryScale's ($142M), enabling a more capital-intensive lab-validated strategy. 
Fourth, pharma clients can and do multi-home across free tools (AlphaFold, ESM2) and paid platforms (Forge, Cradle, Generate) simultaneously, limiting any single vendor's pricing power. [CP033, CP034, CP035, CP036, CP037, CP038]
| Moat Claim | Threat / Counter-Evidence | Severity | Mitigation / Diligence Ask |
|---|---|---|---|
| Frontier scale — 98B params, 10^24 FLOPs | Open-source models approach capability (ESM2 15B free; Chai-2 advancing); rapid convergence | High | Validate whether ESM3 98B shows meaningfully better scientific outcomes vs smaller free models on independent benchmarks |
| Multimodal joint reasoning (seq+struct+func) | AlphaFold3 handles molecules + structure; Chai-2 targets antibody structure + function; convergence accelerating | Medium | Assess how many pharma workflows genuinely require simultaneous trimodal prompting |
| Forge API commercial distribution | Self-service API is low-moat GTM; Cradle has stickier data-flywheel SaaS; no disclosed multi-year enterprise contracts | High | Determine whether Forge has enterprise contracts with switching costs or is pay-per-call |
| Founder reputation / ESM research lineage | Same founders released ESM2 for free (MIT) at Meta, providing a high-quality free alternative to their own paid offering | High | Seek founder clarity on commercialization strategy beyond model API; assess if research credibility drives enterprise pipeline |
| AWS SageMaker + NVIDIA BioNeMo distribution | Both channels are non-exclusive; multiple competitors available on same marketplaces (Recursion, others) | Medium | Determine EvolutionaryScale's contractual exclusivity or priority on AWS/NVIDIA channels |
| No disclosed clinical pipeline | Competitors with clinical proof (Insilico Phase 2; Recursion multi-phase pipeline) command higher pharma deal values | High | Assess whether EvolutionaryScale plans any internal therapeutic programs or remains a pure-tool company |
Severity reflects impact on EvolutionaryScale's competitive durability if the threat materializes. All threats are based on publicly observed evidence; internal strategic mitigations not known to diligence team are not captured.
[CP033, CP034, CP035, CP036, CP037, CP038]
Coverage matrix of six key buying-criteria capabilities across seven major competitors, showing which platforms support which capabilities based on publicly available evidence.
"Partial" indicates partial or limited capability based on public sources. "Unknown" cells would indicate no public evidence; all cells here reflect public product pages. Clinical pipeline status as of May 2026 from official sources.
[CP003, CP004, CP005, CP018, CP019, CP020]
Compact summary of key competitive durability indicators for EvolutionaryScale relative to peers, derived from publicly available evidence as of May 2026.
[CP001, CP002, CP003, CP007, CP010, CP033]
3.5 Exhibits
04 Financials
4.1 Revenue Streams and Pricing Model
EvolutionaryScale's revenue model was built around Forge, a commercial API platform providing inference access to ESM3, ESM Cambrian, and related protein language models. Forge launched in public beta in January 2025 with usage-based pricing (per-token or per-call) for protein sequence generation and structure prediction. The pricing schedule was not publicly listed on forge.evolutionaryscale.ai and required a login to view; enterprise contract terms were negotiated privately. Three revenue streams can be identified or inferred from public sources: (1) Forge API pay-per-use, (2) enterprise annual API access contracts (volume-based), and (3) distribution through strategic partners, where NVIDIA's BioNeMo platform and AWS SageMaker JumpStart provided cloud-hosted access to ESM models, potentially on a revenue-share or referral-fee basis with NVIDIA and Amazon respectively. ESM Cambrian (released December 2024) was licensed more restrictively than the MIT-licensed ESM2 from Meta AI Research, which remains freely available on HuggingFace for any use: the 300M and 600M variants were published as open weights for non-commercial use only, and the 6B variant was available only through hosted channels (Forge and AWS SageMaker JumpStart). Keeping commercial use and the largest variant behind hosted access reinforced the API model. An academic free tier with capped token allowances was listed on the Forge product page. No revenue figures, ARR, or customer metrics were disclosed at any point during EvolutionaryScale's operational life as a standalone entity. The team's move to CZI Biohub in November 2025 fundamentally disrupted the commercial path, and as of May 2026 the Forge API's operational status and pricing under CZI Biohub management are not confirmed in public sources. [CI001][CI002][CI003][CI004][CI005][CI006][CI007][CI008][CI009][CI010]
| Stream | Mechanism | Unit / Pricing | Current Status (May 2026) | Evidence Quality | Diligence Ask |
|---|---|---|---|---|---|
| Forge API — pay-per-use | Per-token or per-call fee for ESM3 / ESM Cambrian protein generation and structure prediction | Not publicly listed; requires login to forge.evolutionaryscale.ai | Operational status unclear post-CZI transaction; public beta launched Jan 2025 | Low — pricing not disclosed; commercial launch confirmed | Request current Forge API pricing and usage metrics from CZI Biohub |
| Enterprise API access contract | Volume-based annual license for Forge API access; negotiated per customer | Not disclosed; estimated mid-five to mid-six figures USD/year | Status unclear post-CZI; no enterprise customers named publicly | Low — contract structure inferred from product design; no confirmed deals | Identify any signed enterprise customers; obtain contract template in diligence |
| NVIDIA BioNeMo distribution | ESM3 hosted on NVIDIA BioNeMo NIM platform; potential revenue share or co-marketing fee | Not disclosed; NVIDIA partnership terms private | Active — ESM3 listed on BioNeMo as of 2026 | Medium — distribution confirmed via NVIDIA announcements | Confirm revenue-share or in-kind terms with NVIDIA |
| AWS SageMaker JumpStart distribution | ESM models listed on AWS SageMaker JumpStart; Amazon was lead Series A investor | Not disclosed; may be in-kind cloud credits rather than cash revenue | Active — AWS listing confirmed as of 2026 | Medium — distribution confirmed; financial terms unknown | Determine whether Amazon relationship generates cash revenue vs cloud-credit offsets |
| Academic / free tier | Capped token allowance for academic and research users | Free; conversion funnel to paid tiers | Listed on Forge product page; status post-CZI unclear | Low — listed as feature; no conversion metrics available | Confirm academic tier conversion rate and whether it drives enterprise pipeline |
All pricing data is undisclosed. Status assessments are based on product page review and partner announcements. Post-CZI Biohub transaction (Nov 2025), the commercial continuity of Forge API under CZI is not confirmed.
[CI001, CI002, CI005, CI006, CI007]
| Product / Channel | List vs. Realized Pricing | Discounts / Unknowns | Source |
|---|---|---|---|
| Forge API (ESM3 / ESM Cambrian) — public beta | List price: not published; login required to view; no per-token rate confirmed | Academic free tier and assumed enterprise discounts; no realized pricing disclosed | forge.evolutionaryscale.ai (official, requires login); ESM Cambrian blog (official) |
| Enterprise API contract | Not disclosed; not listed; negotiated individually | Volume discounts, exclusivity, indication scope all variable and unknown | Inferred from product structure; no public contract examples |
| NVIDIA BioNeMo — ESM3 NIM | Included in NVIDIA BioNeMo platform; user pricing per NVIDIA terms, not EvolutionaryScale | Revenue share terms with EvolutionaryScale undisclosed; may be zero cash | NVIDIA BioNeMo product page; nvidianews.nvidia.com partnership announcement |
| AWS SageMaker JumpStart — ESM models | AWS marketplace pricing (per instance-hour); EvolutionaryScale share of listing fee undisclosed | Amazon's Series A lead investment may include in-kind cloud compute — may zero out cash cost but also revenue | AWS SageMaker JumpStart product page; CNBC Series A announcement |
No realized pricing data is available in public sources. All figures are inferred from product structure and industry analogues. The dominant risk is that ESM2 (Meta open-weight) provides a zero-cost alternative for embeddings, capping pricing power for non-generative use cases.
[CI001, CI004, CI006, CI007]
How customer interactions with Forge API and distribution channels converted into EvolutionaryScale's revenue streams.
No revenue figures or pricing were publicly disclosed. Node detail reflects known mechanisms and confirmed partnership structures, not dollar amounts. Post-CZI Biohub transaction, commercial flow is disrupted.
[CI001, CI005, CI006, CI007, CI009]
4.2 Cost Structure, Gross Margin Drivers, and Capital Intensity
EvolutionaryScale's cost structure was dominated by three categories: research personnel, GPU compute for model training, and inference infrastructure for the Forge API. The company trained ESM3 using over 10^24 FLOPs—more compute than any prior biological model per the company's own blog—on what it described as "one of the highest throughput GPU clusters in the world today." This single training run would have cost an estimated $10–50 million at prevailing cloud GPU rates (H100 at $2–5 per GPU-hour), making it a dominant capital expense. Personnel costs were limited by headcount: LinkedIn shows the company in the 11-50 employee bracket, suggesting approximately 25-50 FTE at peak. At a blended all-in cost of $200,000–$300,000 per FTE (standard for San Francisco AI research talent), annual personnel burn was approximately $5–15 million per year. Ongoing inference costs for the Forge API would add a variable component scaling with API call volume—protein generation with ESM3 at 98B parameters is computationally expensive per query. Gross margins for an API inference business depend heavily on whether the company operated its own GPU cluster (capital-intensive, higher long-run margins) versus renting cloud compute (lower capex, higher COGS). No gross margin figures were disclosed. A positive indicator was Amazon's lead investment in the Series A: AWS may have provided substantial in-kind cloud credits as part of the deal structure, materially reducing near-term infrastructure costs. The ESM2 model's open-source availability on HuggingFace represents a permanent competitive cost floor for the embeddings market, which is a structural margin-compression risk for the ESM Cambrian commercial tier. [CI011][CI012][CI013][CI014][CI015][CI016][CI017]
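The training-cost estimate above depends on three inputs: total FLOPs, the price of a GPU-hour, and how much of the hardware's peak throughput is actually realized. The sketch below makes that dependence explicit. The FLOP count and the $2–5 per GPU-hour range come from the text; the H100 peak figure (dense BF16) and the utilization values are assumptions chosen for illustration, not disclosed numbers.

```python
# Back-of-envelope cost of a single training run.
# From the text: 1e24 FLOPs, $2-5 per H100 GPU-hour.
# Assumed (not disclosed): peak throughput and utilization values.

H100_PEAK_FLOPS = 989e12  # assumption: dense BF16 peak of an H100 SXM, FLOP/s


def training_gpu_hours(total_flops: float, utilization: float) -> float:
    """GPU-hours needed to deliver total_flops at a given utilization."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    seconds = total_flops / (H100_PEAK_FLOPS * utilization)
    return seconds / 3600


def training_cost_usd(total_flops: float, utilization: float,
                      usd_per_gpu_hour: float) -> float:
    """Cost of the run in USD at a flat hourly GPU price."""
    return training_gpu_hours(total_flops, utilization) * usd_per_gpu_hour


# Sweep hypothetical utilization levels against the text's price range.
for util in (0.1, 0.2, 0.4):
    low = training_cost_usd(1e24, util, 2.0)
    high = training_cost_usd(1e24, util, 5.0)
    print(f"utilization {util:.0%}: ${low / 1e6:.1f}M to ${high / 1e6:.1f}M")
```

Under these assumptions a single 10^24-FLOP run reaches the $10–50M range only at low utilization, so the range in the text may also be absorbing ablations, restarts, and smaller-model runs. That reading is an inference from the arithmetic, not something the company disclosed.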
| Cost Category | Primary Driver | Estimated Magnitude | Evidence Basis | Confidence |
|---|---|---|---|---|
| GPU compute — model training | ESM3 training: >10^24 FLOPs on high-throughput cluster | $10–50M one-time (est.); H100 compute at $2–5/GPU-hour | Direct quote from ESM3 official blog; NVIDIA BioNeMo blog | Medium — compute scale confirmed; cost rate estimated |
| GPU compute — inference (Forge API) | ESM3 (98B parameters) inference per API call; variable with call volume | $1–5M/month at scale (est.); highly sensitive to call volume | Inferred from model size and cloud GPU pricing benchmarks | Low — no call volume data disclosed |
| Personnel | ~25–50 FTE; AI researchers, ML engineers, platform engineers | $5–15M/year (est.); $200–300K all-in per FTE in SF | LinkedIn company size 11-50 bracket; Wikipedia headcount | Low — headcount estimated; no payroll data |
| Infrastructure / cloud (non-training) | API serving, storage, data pipelines, internal tooling | $0.5–2M/month (est.) | Inferred from API business benchmarks; Amazon AWS lead investor may provide credits | Low — unconfirmed; potential in-kind from Amazon |
| Research / data | Training data licensing, academic dataset access, wet-lab validation (limited) | Low-to-moderate; most training data (UniProt, PDB) is public | ESM3 blog cites public protein sequence databases | Medium — data cost likely low; compute dominates |
| G&A / corporate overhead | Legal, finance, HR, facilities for 25-50 FTE | $1–3M/year (est.) | Benchmark for early-stage SF AI company | Low — estimated |
All magnitude estimates are analyst-derived from public proxies; no financial statements or confirmed cost data are available. The dominant cost driver is GPU compute for training and inference. Amazon's lead investor status may have included in-kind cloud credits, which would reduce cash infrastructure spend. Estimates are order-of-magnitude only.
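The recurring rows of the table can be added up to give a rough monthly spend range. The sketch below does this using only the table's own estimates, converting annual rows to monthly. It leaves out the one-time training compute and the research/data row, which has no dollar figure, so the total would be expected to sit below any all-in burn estimate.

```python
# Recurring cost rows from the table, as (low, high) in USD millions per month.
# Annual rows are divided by 12. All inputs are the table's own estimates.
MONTHLY_RANGES = {
    "inference (Forge API)": (1.0, 5.0),
    "infrastructure / cloud": (0.5, 2.0),
    "personnel": (5.0 / 12, 15.0 / 12),
    "G&A": (1.0 / 12, 3.0 / 12),
}


def total_range(ranges: dict) -> tuple:
    """Sum the low ends and the high ends of each (low, high) pair."""
    low = sum(lo for lo, _ in ranges.values())
    high = sum(hi for _, hi in ranges.values())
    return low, high


low, high = total_range(MONTHLY_RANGES)
print(f"recurring spend: ${low:.1f}M to ${high:.1f}M per month")
```

Summing low ends and high ends separately gives the widest possible band, since it assumes every category lands at the same extreme at once.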
[CI011, CI012, CI013, CI014, CI015]
Qualitative flow of revenue inputs and dominant cost deductions for the Forge API inference business; all values are estimates.
All values estimated from headcount proxy, GPU pricing benchmarks, and cloud API analogues. No actual revenue, COGS, or gross margin data was disclosed by EvolutionaryScale.
[CI011, CI012, CI014, CI015, CI016]
4.3 Capital Adequacy and the CZI Biohub Transaction
EvolutionaryScale's capital trajectory was a seed round (~$3M, late 2023) followed by a $142M Series A on September 26, 2024 (announced simultaneously by CNBC, Axios, and NVIDIA). The Series A was led by Amazon and NVIDIA with co-investment from Lux Capital, Nat Friedman, and Daniel Gross at a reported ~$1.35B post-money valuation. Total capital raised was approximately $145M. At an estimated burn rate of $5–15M per month (based on headcount, compute, and overhead), the $142M Series A provided a theoretical runway of 9–28 months from the September 2024 close. Independent operation ended in November 2025, approximately 14 months after the Series A closed, when the EvolutionaryScale team joined CZI Biohub. Alex Rives, co-founder and chief scientist, became Head of Science at CZI Biohub. The transaction terms were not disclosed, and no public regulatory filings tied to the transaction (e.g., a Form D amendment on SEC EDGAR, or a Hart-Scott-Rodino notification, which is made to the FTC and DOJ rather than the SEC and is not normally public) have been located. The financial implications for the $142M Series A investors (Amazon, NVIDIA, Lux Capital, and angels) are unclear. If the CZI Biohub transaction was a cash acquisition of the entity, investors would have received a distribution. If it was a talent acquisition (acqui-hire) without an entity purchase, recovery would be limited to whatever cash remained in the entity, and at the estimated burn rate much or all of the $142M would already have been spent by November 2025. The absence of any public deal-value disclosure leaves this question unresolved. A possible compliance gap was identified during research: no Form D filings appear in SEC EDGAR under any name variant for EvolutionaryScale (searches conducted for "EvolutionaryScale," "Evolutionary Scale," "Evolutionary Scale Inc," and by key person "Alexander Rives"). Private companies raising capital under Regulation D exemptions are required to file Form D with the SEC within 15 days of the first sale. The absence of any Form D could indicate a filing under a different legal entity name, use of an alternate securities exemption (e.g., Regulation S for offshore investors), or a missed filing. [CI018][CI019][CI020][CI021][CI022][CI023][CI024][CI025][CI026][CI027][CI028]
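The runway range quoted above is a simple division of capital by monthly burn. The sketch below reproduces it and adds two derived figures: cumulative spend over the roughly 14 months to the CZI Biohub transaction, and the burn level at which the round would be exactly exhausted in that time. All inputs are the document's own estimates; none are disclosed company data.

```python
# Runway arithmetic using the estimates quoted in the text (USD millions).
# The burn range is an analyst estimate, not disclosed data.

SERIES_A_MUSD = 142.0
BURN_LOW_MUSD, BURN_HIGH_MUSD = 5.0, 15.0  # estimated monthly burn
MONTHS_TO_TRANSACTION = 14                  # approx., Sep 2024 to Nov 2025


def runway_months(capital: float, monthly_burn: float) -> float:
    """Months of operation that capital covers at a constant monthly burn."""
    if monthly_burn <= 0:
        raise ValueError("monthly_burn must be positive")
    return capital / monthly_burn


def break_even_burn(capital: float, months: float) -> float:
    """Monthly burn at which capital is exactly used up after `months`."""
    if months <= 0:
        raise ValueError("months must be positive")
    return capital / months


shortest = runway_months(SERIES_A_MUSD, BURN_HIGH_MUSD)
longest = runway_months(SERIES_A_MUSD, BURN_LOW_MUSD)
spent_low = BURN_LOW_MUSD * MONTHS_TO_TRANSACTION
spent_high = BURN_HIGH_MUSD * MONTHS_TO_TRANSACTION
threshold = break_even_burn(SERIES_A_MUSD, MONTHS_TO_TRANSACTION)

print(f"runway: {shortest:.1f} to {longest:.1f} months")
print(f"spend over {MONTHS_TO_TRANSACTION} months: "
      f"${spent_low:.0f}M to ${spent_high:.0f}M")
print(f"burn that exhausts the round in that time: ${threshold:.1f}M/month")
```

With these inputs, implied spend by November 2025 runs from about $70M to $210M, and a burn of roughly $10M per month is the point above which the Series A alone would not have lasted until the transaction. The constant-burn assumption is a simplification, since real spend would have varied month to month.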
| Metric | Estimated Value | Basis | Confidence | Key Assumption / Caveat |
|---|---|---|---|---|
| Series A net proceeds | ~$142M | CNBC, Axios, MIT Tech Review announcements; NVIDIA partner news | High | Close date Sep 26, 2024; standard ~99% close assumed |
| Seed round capital | ~$3M (est.) | Crunchbase, NVIDIA seed investment news; not publicly confirmed dollar amount | Low | Amount not publicly confirmed; investor names confirmed |
| Total capital raised (pre-CZI) | ~$145M | Derived from Series A + seed estimates | Low-to-medium | Seed amount unconfirmed |
| Estimated monthly burn rate | $5–15M/month (est.) | Headcount (25-50 FTE × $200-300K) + compute ($2-7M/mo) + overhead | Low | No actual burn data; very wide range; lower end plausible with AWS in-kind credits |
| Theoretical runway from Series A (Sep 2024) | 9–28 months (est.) | $142M ÷ estimated $5–15M/month burn | Low | Runway actually ended with CZI transaction, Nov 2025 (~14 months) |
| CZI Biohub transaction (Nov 2025) | Terms undisclosed | CNBC Nov 2025 article; Biohub.org announcement | High — transaction confirmed; terms unknown | Whether investors received return is not publicly disclosed |
| Debt / project-finance obligations | None identified | SEC EDGAR search; no public debt disclosure | Low | Absence of disclosure does not confirm absence of debt |
| SEC Form D filings | Zero filings found in EDGAR | SEC EDGAR full-text search for 'EvolutionaryScale', 'Evolutionary Scale', 'Evolutionary Scale Inc' | High — multiple searches returned 0 results | May be filed under different legal entity name; Reg S offshore exemption possible |
Capital adequacy assessment is severely hampered by private-company non-disclosure. The CZI Biohub transaction on Nov 6, 2025 effectively marks the end of EvolutionaryScale as an independent commercial entity. Forward capital adequacy analysis is moot for standalone purposes; all future diligence must be directed at CZI Biohub. The funding chronology (round dates, investors) is established in the Company Overview chapter; this table focuses on adequacy and compliance signals.
[CI018, CI019, CI020, CI022, CI023, CI024]
Source-backed and analyst-estimated ranges for key financial parameters; all values carry low-to-medium confidence due to non-disclosure.
Low/mid/high bounds derived from: (1) headcount proxy (LinkedIn 11-50 bracket), (2) cloud GPU pricing benchmarks, (3) CNBC/Axios confirmed $142M raise, (4) peer fundraising analogues. No EvolutionaryScale financial statements available.
[CI018, CI019, CI022, CI011]
4.4 Peer Capital Benchmarks and Positioning
EvolutionaryScale's $142M Series A and $1.35B valuation sit at the mid-upper range of protein AI fundraising activity, above pure infrastructure plays like Profluent (~$44M) and Cradle (~$73M), but well below full-stack drug discovery AI companies such as Generate:Biomedicines (~$700M+) and Xaira Therapeutics ($1B at founding). The $1.35B valuation implied a significant premium for a pre-revenue, foundation-model protein AI company with fewer than 50 employees. On a capital-per-head basis, EvolutionaryScale was exceptionally capital-intensive at approximately $3–6M raised per employee—reflecting the cost of world-class AI research talent and compute-heavy model development rather than a scaled commercial operation. By contrast, publicly traded AI drug discovery companies (AbSci, Recursion) have far larger employee bases relative to capital raised, diluted across clinical and manufacturing operations. The peer comparison underscores that EvolutionaryScale was structured as a frontier research vehicle, not a scaled commercial enterprise, at the time of its Series A—raising questions about the commercial revenue ramp hypothesis embedded in the $1.35B valuation. [CI029][CI030][CI031][CI032]
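The capital-per-head figure above can be reproduced directly from the totals and headcount bracket used in this chapter. The sketch below also derives two simple ratios from the reported round size and post-money valuation. The ownership figure assumes the whole $142M was primary capital priced at the reported post-money valuation, which is an assumption and not a disclosed term.

```python
# Capital-intensity ratios from figures quoted in the text (USD millions).
TOTAL_RAISED_MUSD = 145.0
SERIES_A_MUSD = 142.0
POST_MONEY_MUSD = 1350.0
HEADCOUNT_LOW, HEADCOUNT_HIGH = 25, 50  # estimated FTE range


def capital_per_head(capital: float, headcount: int) -> float:
    """Capital raised per employee, in USD millions."""
    if headcount <= 0:
        raise ValueError("headcount must be positive")
    return capital / headcount


per_head_high = capital_per_head(TOTAL_RAISED_MUSD, HEADCOUNT_LOW)
per_head_low = capital_per_head(TOTAL_RAISED_MUSD, HEADCOUNT_HIGH)

# Assumption: all Series A money was primary and priced at the post-money value.
valuation_multiple = POST_MONEY_MUSD / SERIES_A_MUSD
implied_stake = SERIES_A_MUSD / POST_MONEY_MUSD

print(f"capital per head: ${per_head_low:.1f}M to ${per_head_high:.1f}M")
print(f"post-money / round size: {valuation_multiple:.1f}x")
print(f"implied Series A ownership: {implied_stake:.1%}")
```

The per-head range is sensitive to the headcount bracket, which comes from a LinkedIn size band and not from a confirmed employee count.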
| Company | Total Capital Raised (est.) | Stage / Focus | Post-money Valuation | Capital Efficiency Note |
|---|---|---|---|---|
| EvolutionaryScale | ~$145M | Series A; protein AI foundation model; team joined CZI Biohub Nov 2025 (terms undisclosed) | ~$1.35B (Sep 2024) | High capital-per-head (~$3-6M/FTE); no disclosed revenue at the time of the CZI transaction |
| Profluent Bio | ~$44M | Series A; protein design AI; open-sourced ProGen2 | ~$200-300M est. | Lower capital base; narrower scope; more commercially focused |
| Cradle.bio | ~$73M | Series B; AI protein engineering platform | ~$400M est. | Comparable capital efficiency to EvolutionaryScale |
| Generate:Biomedicines | ~$700M+ | Series C+; full-stack AI protein therapeutics | ~$2B est. | Much larger scale; targeting drug revenue, not API |
| Xaira Therapeutics | ~$1B | Seed/launch; full-stack AI drug discovery | ~$2.5B est. at launch | Best-capitalized protein AI startup; broadest scope |
| AbSci (ABSI) | $200M+ (pre-IPO); public since 2021 | Biomanufacturing + AI drug design; ~350 employees | Market cap ~$500M (2026 est.) | Much larger headcount; different model (wet-lab + AI) |
| Isomorphic Labs | Undisclosed (Alphabet subsidiary) | Series A; AI-first drug design; DeepMind spin-out | N/A (subsidiary) | Structurally non-comparable; backed by corporate parent |
Peer data from public news sources (Axios, Crunchbase, Wikipedia) and company websites. Valuations are post-money estimates at last-known round; not confirmed by audited filings. Cradle, Profluent, Generate, and Xaira data sourced from publicly available funding announcements and analyst databases. All comparisons are approximate and for relative context only.
[CI029, CI030, CI031, CI032]
Estimated waterfall showing how the $142M Series A capital was likely deployed through the ~14-month runway to the CZI Biohub transaction.
All values are analyst estimates derived from headcount, GPU benchmarks, and timeline analysis. No actual use-of-funds disclosure was made by EvolutionaryScale. Figures are illustrative ranges presented at midpoint estimates.
[CI018, CI019, CI022, CI023, CI025]
4.5 Financial Information Gaps and Diligence Path
The public financial record for EvolutionaryScale is near-empty. As a private company, EvolutionaryScale was not required to file financial statements with the SEC. The absence of Form D filings further limits what can be verified through public regulatory channels. None of the following key metrics are available in any public source reviewed: actual revenue, ARR, gross margin, customer count, churn rate, cash position, or confirmed monthly burn rate. Crunchbase misclassifies the $142M Series A as a "seed investment," which illustrates the unreliability of private-market data aggregators for verified financial analysis. Post-CZI transaction, financial diligence must now flow through CZI Biohub, a nonprofit supported by the Chan Zuckerberg Initiative. This fundamentally changes the diligence path: rather than standard VC financial due diligence, the relevant inquiry becomes (1) the terms of the CZI transaction and what was paid to former investors and shareholders, (2) the ongoing operational status and commercialization strategy for the Forge API and ESM model family under CZI, and (3) whether any residual commercial entity (EvolutionaryScale entity holding IP) continues to operate independently. All outstanding financial analysis requires access to CZI Biohub's internal documentation and any deal disclosures. [CI033][CI034][CI035][CI036]
| Missing Metric | Impact on Analysis | Diligence Path | Priority |
|---|---|---|---|
| Revenue and ARR (Forge API) | Cannot assess commercial product viability, pricing-market fit, or revenue trajectory | Request from CZI Biohub commercial team; Forge API analytics data | Critical |
| Gross margin (API inference) | Cannot assess unit economics or path to profitability | Request cost-of-goods breakdown from CZI Biohub; benchmark against cloud AI API peers | High |
| Confirmed burn rate | Cannot confirm capital adequacy or verify runway; estimates range 3× (low to high) | Request historical monthly P&L from EvolutionaryScale entity records via CZI Biohub | High |
| CZI Biohub transaction terms | Cannot assess investor return profile; cannot determine if $142M Series A had exit value | Request deal term sheet and any investor distribution records; check SEC for Form 8-K analog if any entity became subject to reporting | Critical |
| SEC Form D filings | Regulatory compliance gap; all Reg D raises require Form D within 15 days | Search EDGAR under all possible legal entity names; request from CZI Biohub legal team | High |
| Customer count and enterprise pipeline | Cannot validate commercial traction or sales efficiency of Forge API | Request from CZI Biohub; check LinkedIn for any customer public references | High |
| Forge API operational status post-CZI | Cannot assess whether commercial product is being maintained or wound down under CZI | Direct test of forge.evolutionaryscale.ai API; request roadmap from CZI Biohub | High |
| Investor return from CZI transaction | Cannot determine whether Amazon, NVIDIA, or Lux Capital received return on $142M | Review any public CZI acquisition announcements, press releases, or SEC-equivalent disclosures; direct inquiry to investors | Critical |
All gaps confirmed by reviewing EDGAR, company website, CNBC, Biohub.org, Crunchbase, and NVIDIA announcements as of May 2026. The CZI Biohub transaction is the dominant gap: it absorbs all other financial diligence priorities into a single inquiry about deal terms and residual commercial continuity.
[CI033, CI034, CI035, CI036]
4.6 Financial Verdict
EvolutionaryScale's financial narrative is one of exceptional early-stage capital formation followed by an abrupt strategic pivot that extinguished the standalone commercial path. The $142M Series A at a $1.35B valuation, led by Amazon and NVIDIA, was structurally a strategic bet on ESM3 as a foundational protein AI layer, not a near-term commercial revenue play. The commercial product (Forge API) launched publicly in January 2025 with no disclosed revenue metrics, no disclosed customer count, and no pricing transparency. The CZI Biohub transaction in November 2025, within 14 months of the Series A close, suggests that the company had not achieved commercial breakout on a standalone basis before moving under the CZI umbrella; public sources do not confirm the reason for the move. Revenue quality: insufficient data to assess. The Forge API mechanism was sound (per-token inference on a proprietary foundation model), but the addressable market for paid protein AI API access without a broader drug-discovery workflow context is narrow, and open-weight alternatives (ESM2 from Meta) cap the willingness-to-pay ceiling for commodity embedding use cases. Capital intensity was very high relative to commercial traction. The missing SEC Form D is a compliance signal that warrants follow-up. The principal diligence blockers are: (1) CZI Biohub transaction terms; (2) confirmed Forge API revenue or customer metrics; (3) actual burn rate pre-transaction; and (4) investor return profile from the CZI deal. No underwriting-grade financial conclusion on investment return can be drawn from public sources alone. [CI033][CI034][CI035][CI036][CI037][CI038]
4.7 Exhibits
05 Product & Technology
5.1 ESM Product Portfolio and Model Family
EvolutionaryScale offers two distinct product lines built on the ESM (Evolutionary Scale Modeling) foundation: the ESM3 generative protein language model and the ESM-C (Cambrian) protein embedding family. Together they span six models across open-weight and commercial-API tiers, with the Forge API as the hosted access layer.
ESM3 is offered in three weight classes: ESM3-small-2024-08 (1.4 billion parameters), ESM3-medium-2024-08 (7 billion parameters), and ESM3-large-2024-03 (98 billion parameters). ESM3-small is the only ESM3 variant with open weights, available under the Cambrian Non-Commercial License Agreement via HuggingFace (as esm3-sm-open-v1). ESM3-medium and ESM3-large are commercial and accessible exclusively through the Forge API. The 98B ESM3-large is the model used to design esmGFP, a de novo fluorescent protein with only 58% sequence identity to the nearest natural GFP. That gap represents approximately 500 million years of equivalent evolutionary divergence, and the result was validated by peer-reviewed publication in Science (January 2025).
ESM-C (Cambrian) is a separate embedding-focused model family with three sizes: ESMC-300M, ESMC-600M, and ESMC-6B. ESMC-300M and ESMC-600M carry open weights under the same Cambrian Non-Commercial License. ESMC-6B is accessible via the Forge API for academic users and via AWS SageMaker JumpStart for commercial deployments. ESM-C models use a Pre-LN transformer architecture with rotary positional embeddings and SwiGLU activations, and EvolutionaryScale's own benchmarks position them as state-of-the-art sequence representation models at their respective scales.
The Forge API (forge.evolutionaryscale.ai) is the primary commercial monetization vehicle. Opened to public beta in January 2025 concurrent with the Science publication, Forge provides synchronous and asynchronous REST API access to the ESM3 and ESMC model family, with a Python SDK available via pip. The SDK includes a batch executor for high-throughput workloads. Pricing for commercial Forge access is not publicly disclosed; academic access terms are available directly through the platform. [CE001, CE002, CE003, CE004, CE005, CE006]
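The lineup described above can be held in a small lookup structure when screening which models a given team is allowed to use. The following is a minimal sketch only: the `ModelEntry` class, the `CATALOG` tuple, and both helper functions are hypothetical names introduced for illustration, and the channel assignments simply restate the Availability column of the table below rather than any official catalog.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelEntry:
    """One row of the model lineup (illustrative structure, not an official schema)."""
    name: str
    params: str
    open_weights: bool   # True = Cambrian Non-Commercial License weights on HuggingFace
    channels: tuple      # access channels as listed in the report's table


# Values restate the report's table; nothing here is queried from a live service.
CATALOG = (
    ModelEntry("ESM3-small-2024-08", "1.4B", True, ("HuggingFace",)),
    ModelEntry("ESM3-medium-2024-08", "7B", False, ("Forge API",)),
    ModelEntry("ESM3-large-2024-03", "98B", False, ("Forge API",)),
    ModelEntry("ESMC-300M", "300M", True, ("HuggingFace",)),
    ModelEntry("ESMC-600M", "600M", True, ("HuggingFace",)),
    ModelEntry("ESMC-6B", "6B", False, ("Forge API", "AWS SageMaker JumpStart")),
)


def open_weight_models(catalog=CATALOG):
    """Names of models whose weights are downloadable (non-commercial use only)."""
    return [m.name for m in catalog if m.open_weights]


def models_on(channel, catalog=CATALOG):
    """Names of models listed under a given access channel."""
    return [m.name for m in catalog if channel in m.channels]


print(open_weight_models())
print(models_on("AWS SageMaker JumpStart"))
```

A structure like this makes the licensing split explicit: three of the six named models are open-weight but non-commercial, and only ESMC-6B is listed on a marketplace channel for commercial deployment.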
| Model / Product | Parameter Scale | Availability / License | Primary Use | Diligence Gap |
|---|---|---|---|---|
| ESM3-small-2024-08 | 1.4 B | Open weights — Cambrian Non-Commercial License (HuggingFace esm3-sm-open-v1) | Research protein generation; local fine-tuning on non-commercial datasets | No commercial use permitted; limited benchmarks vs. larger ESM3 variants |
| ESM3-medium-2024-08 | 7 B | Forge API only — commercial pricing undisclosed | Mid-scale protein design; Forge API customers | Pricing not public; no public benchmark comparison vs. ESM3-small |
| ESM3-large-2024-03 | 98 B | Forge API only — flagship commercial model | High-complexity protein design; generative design at esmGFP scale | Inference cost not disclosed; SLA terms not public |
| ESMC-300M (esmc-300m-2024-12) | 300 M | Open weights — Cambrian Non-Commercial License (HuggingFace) | Protein sequence embeddings for ML pipelines; research | Non-commercial only; no third-party independent accuracy benchmark |
| ESMC-600M (esmc-600m-2024-12) | 600 M | Open weights — Cambrian Non-Commercial License (HuggingFace) | Enhanced protein embeddings; research and academic fine-tuning | Non-commercial only; 1,490 HuggingFace downloads suggests early adoption |
| ESMC-6B | 6 B | Forge API (academic) + AWS SageMaker JumpStart (commercial) | Enterprise-scale protein sequence embedding and similarity search | Commercial pricing undisclosed; SageMaker instance cost is user-borne |
| Forge API (forge.evolutionaryscale.ai) | Service (all ESM3/ESMC models) | Public beta since January 2025; commercial subscription model | Programmatic access to all models; sync and async inference; batch executor | Pricing structure, usage tiers, uptime SLA, and customer list not public |
Model sizes and parameter counts are sourced from official EvolutionaryScale blog posts, the GitHub ESM repository README, and the HuggingFace model cards. HuggingFace download counts reflect a 30-day snapshot as of the research date and may fluctuate. Pricing for Forge API and SageMaker commercial tiers is not publicly disclosed; cells that note undisclosed pricing reflect a confirmed absence of public pricing at the time of research.
| Use Case | Target User | ESM Tool | Workflow Step | Validation Evidence |
|---|---|---|---|---|
| De novo protein design | Protein engineers, drug discovery researchers | ESM3-large (Forge API) | Specify partial sequence or structure constraint; generate full candidate proteins; iterate | esmGFP — peer-reviewed Science 2025; 341 citations |
| Protein sequence embedding for ML pipelines | Academic researchers, bioinformatics teams | ESMC-300M or ESMC-600M (open weights) | Embed protein sequences; feed embeddings into downstream classifiers or clustering | 6,320 ESMC-300M HuggingFace downloads; community citations in 129+ BioRxiv papers |
| Enterprise-scale commercial embedding | Enterprise bioinformatics, pharma R&D | ESMC-6B (AWS SageMaker JumpStart) | Deploy via CloudFormation stack (15-25 min); run batch protein similarity search | AWS SageMaker JumpStart listing; GitHub esm-sagemaker CloudFormation docs |
| Structure-conditioned protein variant generation | Computational biologists | ESM3-small (open weights, local GPU) | Provide partial structure tokens as conditioning; generate sequence variants | GitHub ESM README examples; ESM3 architecture multitrack tokenization |
| Function-guided protein design | Drug discovery, enzyme engineering | ESM3 (Forge API) | Specify function annotation keywords; jointly optimize sequence and structure outputs | ESM3 Science paper benchmark results; ESM3 blog (company-claimed) |
| Academic research model fine-tuning | Academic labs | ESMC-300M (open weights) | Fine-tune ESMC on proprietary protein datasets for domain-specific tasks | Cambrian Non-Commercial License permits fine-tuning for non-commercial research |
Use cases are derived from official EvolutionaryScale blog posts, the GitHub ESM README, the Science paper (Hayes et al., 2025), and community HuggingFace downloads. The esmGFP use case is fully validated; other use cases reflect documented capability claims.
Five-layer architecture of EvolutionaryScale's ESM protein AI platform from training data through deployment.
Layer boundaries are conceptual; exact service architecture of the Forge API and internal infrastructure are not publicly documented.
[CE001, CE002, CE003, CE009, CE019, CE022]
5.2 Technical Architecture: Multitrack Transformer, Training Scale, and Tokenization
ESM3's defining architectural innovation is its multitrack transformer design: the model processes three parallel token sequences (amino acid sequence tokens, structure tokens encoding 3D coordinates, and function annotation tokens based on keyword GO term labels) within a unified transformer framework. Each track uses discrete tokenization. Structure coordinates are encoded via a vector quantized variational autoencoder (VQVAE) into a finite codebook of structural tokens, enabling the model to natively read and generate three-dimensional protein structure without requiring continuous coordinate regression. Pre-training uses a masked language modeling (MLM) objective applied across all three tracks simultaneously, allowing the model to learn joint representations that span the sequence-structure-function space.
The 98-billion-parameter ESM3-large model was trained with 1.07×10²⁴ floating-point operations on approximately 2.78 billion protein sequences (771 billion unique tokens). Training ran on the Andromeda HPC cluster using NVIDIA H100 GPUs and Quantum-2 InfiniBand networking. NVIDIA reports that ESM3-large uses approximately 25× more FLOPs and 60× more data than its predecessor ESM2. Reinforcement learning from human feedback (RLHF) was applied to ESM3-large to align outputs with human preferences for protein design tasks.
ESM-C (Cambrian) employs a different architectural profile: a Pre-Layer Normalization (Pre-LN) transformer with rotary positional embeddings (RoPE) and SwiGLU feed-forward activations, pre-trained with masked language modeling. ESMC-300M (30 layers, hidden width 960) was trained on 1.26×10²² FLOPs; ESMC-600M (36 layers, width 1152) on 2.17×10²² FLOPs; and ESMC-6B (80 layers, width 2560) on 2.37×10²³ FLOPs. Training data spans three large sequence databases: UniRef (83 million clusters), MGnify (372 million), and JGI metagenomics (2 billion clusters), all clustered at 70% sequence identity.
EvolutionaryScale's internal infrastructure capabilities are illustrated by the open-source DeepEP library, a custom CUDA/NCCL implementation of Mixture-of-Experts Expert Parallelism communication for H800 GPUs. With 1,253 GitHub stars, DeepEP signals active HPC engineering capability at the company, supporting large-scale distributed training and inference. [CE009, CE010, CE011, CE012, CE013, CE014]
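The discrete structure tokenization described above rests on a vector quantization step: a continuous feature vector is replaced by the index of its nearest entry in a finite codebook. The sketch below shows only that lookup on toy 2-D data. The codebook, the feature vectors, and the function names are all hypothetical, and this is not ESM3's tokenizer, whose encoder, codebook size, and feature geometry are not reproduced here.

```python
import math


def quantize(vector, codebook):
    """Return the index of the codebook entry closest to `vector` (Euclidean distance)."""
    return min(range(len(codebook)), key=lambda i: math.dist(vector, codebook[i]))


def tokenize(vectors, codebook):
    """Map each continuous vector to a discrete token id."""
    return [quantize(v, codebook) for v in vectors]


def detokenize(tokens, codebook):
    """Map token ids back to their codebook vectors (a lossy reconstruction)."""
    return [codebook[t] for t in tokens]


# Toy 4-entry codebook and three "encoder outputs"; real codebooks are learned
# jointly with the autoencoder and are far larger.
TOY_CODEBOOK = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
toy_features = [(0.1, -0.2), (0.9, 0.8), (0.2, 0.7)]

tokens = tokenize(toy_features, TOY_CODEBOOK)
print(tokens)                            # [0, 3, 2]
print(detokenize(tokens, TOY_CODEBOOK))
```

Once structure is expressed as integer tokens like these, it can be masked and predicted with the same objective used for amino acid tokens, which is the property the multitrack design depends on.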
| Component | Description / Specification | Key Metric | Primary Source |
|---|---|---|---|
| ESM3 multitrack transformer | Three parallel input/output tracks: amino-acid sequence tokens, VQVAE structure tokens, function keyword tokens; unified attention across tracks | 3 tracks; 1.4B / 7B / 98B params across 3 model sizes | ESM3 blog (official); Hayes et al. Science 2025 |
| VQVAE structure tokenizer | Vector quantized variational autoencoder encodes 3D protein backbone coordinates as discrete structural tokens from a finite codebook | Discrete codebook; enables native 3D structure generation without coordinate regression | ESM3 blog (official); ESM3 preprint (bioRxiv) |
| ESM3 pre-training and alignment | Masked language modeling (MLM) across all three tracks; RLHF fine-tuning on ESM3-large for alignment with protein design preferences | 1.07×10²⁴ FLOPs (98B model); 2.78B proteins; 771B unique tokens | ESM3 blog (official); NVIDIA blog; Science paper |
| ESM-C architecture | Pre-LN transformer with RoPE positional embeddings and SwiGLU activations; masked language modeling pre-training; three sizes: 300M / 600M / 6B | 300M: 30L×960W; 600M: 36L×1152W; 6B: 80L×2560W | ESM-C blog (official) |
| ESM-C training compute and data | Training data: UniRef (83M seq clusters), MGnify (372M), JGI metagenomics (2B clusters) at 70% identity; FLOPs per model: 300M=1.26e22, 600M=2.17e22, 6B=2.37e23 | 2B+ total sequence clusters (largest component from JGI metagenomics) | ESM-C blog (official) |
| Andromeda HPC training cluster | NVIDIA H100 Tensor Core GPU cluster with Quantum-2 InfiniBand networking; used to train ESM3-large | 25× more FLOPs and 60× more data vs. predecessor ESM2 | NVIDIA blog (partner-proof) |
| DeepEP MoE Expert Parallelism library | Open-source CUDA/NCCL implementation of Mixture-of-Experts Expert Parallelism communication; custom kernels for H800 GPUs | 1,253 GitHub stars; signals advanced internal HPC infrastructure | GitHub evolutionaryscale/DeepEP (developer-signal) |
Architecture parameters (layer counts, hidden widths) for ESM-C are from the official ESM-C blog. Training FLOPs for ESM3-large are from the official ESM3 blog and Science paper. ESM-C FLOPs are from the ESM-C blog. Infrastructure details (Andromeda cluster, H100 GPUs) are from the NVIDIA blog announcement and corroborated by the ESM3 Science paper.
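The ESM-C layer counts and hidden widths quoted above can be sanity-checked against the model names with a common rule of thumb for transformer size, roughly 12 x layers x width squared non-embedding parameters. This is an assumption, not a figure from the ESM-C blog: it presumes about 4·d² attention weights plus about 8·d² feed-forward weights per layer and ignores embeddings, so it is an order-of-magnitude check rather than an exact count.

```python
def approx_params(layers: int, width: int) -> int:
    """Rule-of-thumb non-embedding parameter count: 12 * layers * width^2."""
    return 12 * layers * width ** 2


# (layers, hidden width) as quoted in the report for each ESM-C size.
ESMC_SHAPES = {
    "ESMC-300M": (30, 960),
    "ESMC-600M": (36, 1152),
    "ESMC-6B": (80, 2560),
}

estimates = {name: approx_params(*shape) for name, shape in ESMC_SHAPES.items()}

for name, count in estimates.items():
    print(f"{name}: ~{count / 1e9:.2f}B parameters")
```

Under this approximation the three shapes come out near 0.33B, 0.57B, and 6.29B parameters, which is consistent with the 300M, 600M, and 6B labels and suggests the reported layer and width figures are internally coherent.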
Eight-step workflow showing how protein researchers use the ESM3 platform from hypothesis to validated candidate.
Flow is a simplification; feedback loops between experimental results and revised prompts are not shown. Wet-lab validation step is performed by the user, not by EvolutionaryScale.
[CE005, CE007, CE018, CE029]
5.3 Deployment and Ecosystem: Forge API, AWS, NVIDIA, and Community
EvolutionaryScale has built a multi-tier distribution strategy combining open-weight community adoption with commercial API and cloud-marketplace access. The Forge API (forge.evolutionaryscale.ai), opened to public beta in January 2025, is the primary programmatic interface for ESM3 and ESMC commercial use. The official Python client (pip install esm) is hosted at github.com/evolutionaryscale/esm and provides both synchronous inference and asynchronous batch execution. As of May 2026, the Forge API portal is accessible and operational, though detailed pricing is not publicly listed.
AWS SageMaker JumpStart provides commercial deployment of ESMC-6B via a CloudFormation stack that provisions a dedicated GPU instance in 15-25 minutes. This integration, documented in the esm-sagemaker GitHub repository, targets enterprise bioinformatics customers needing large-scale embedding workflows with predictable SLAs. Amazon was a co-investor in EvolutionaryScale's Series A and is a deployment partner. NVIDIA announced ESM3 integration into its BioNeMo NIM platform for GPU-optimized inference when ESM3 was first released in June 2024; EvolutionaryScale's ESM-C blog (December 2024) lists BioNeMo as a forthcoming distribution channel. The NVIDIA NGC catalog separately lists ESM3 as a resource.
On GitHub, EvolutionaryScale maintains nine public repositories. Beyond the flagship esm repository, notable community-facing projects include DeepEP (1,253 stars), an NCCL fork, a Hugging Face transformers fork, and a Mamba implementation. The HuggingFace organization (huggingface.co/evolutionaryscale) hosts open-weight model cards: the esm3-sm-open-v1 model page shows 3,105 downloads in the prior month and 291 likes, ESMC-300M shows 6,320 downloads, and ESMC-600M shows 1,490 downloads. These metrics indicate meaningful adoption within the research community. [CE019, CE020, CE021, CE022, CE023, CE024]
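For orientation, a Forge request for masked sequence generation can be sketched as follows. This is a sketch under stated assumptions, not verified against the current service: it follows the client pattern shown in the public esm GitHub README (`client`, `ESMProtein`, `GenerationConfig`, with `_` marking positions to generate), which may have changed, and it requires the `esm` package plus a valid Forge token. The helper names `mask_span` and `generate_with_forge` and the example sequence are hypothetical.

```python
def mask_span(sequence: str, start: int, end: int, mask: str = "_") -> str:
    """Replace sequence[start:end] with mask characters, keeping the length."""
    if not 0 <= start <= end <= len(sequence):
        raise ValueError("span must lie inside the sequence")
    return sequence[:start] + mask * (end - start) + sequence[end:]


def generate_with_forge(prompt: str, token: str, model_name: str = "esm3-medium-2024-08"):
    """Send a masked prompt to Forge. Needs the esm package and a valid token."""
    # Imported lazily so mask_span stays usable without the SDK installed.
    from esm.sdk import client
    from esm.sdk.api import ESMProtein, GenerationConfig

    model = client(
        model=model_name,
        url="https://forge.evolutionaryscale.ai",
        token=token,
    )
    # Sampling settings mirror the README example; tune them per task.
    config = GenerationConfig(track="sequence", num_steps=8, temperature=0.7)
    return model.generate(ESMProtein(sequence=prompt), config)


# Build a prompt locally; the network call is left to the caller.
prompt = mask_span("MKTAYIAKQRQISFVKSHFSRQ", 5, 12)
print(prompt)
```

Keeping prompt construction separate from the network call lets a team validate prompts offline and meter Forge usage explicitly, which matters while pricing remains undisclosed.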
Directed acyclic graph of EvolutionaryScale's critical platform, infrastructure, and distribution dependencies.
Dependency directions represent data, compute, and ownership flows; not data-volume magnitudes. The Forge API internal serving infrastructure is not publicly documented.
[CE010, CE016, CE019, CE022, CE025, CE035]
5.4 Intellectual Property and Competitive Moat
EvolutionaryScale's IP moat rests on three pillars: (1) publication depth in high-impact venues, (2) proprietary training scale and infrastructure, and (3) the esmGFP demonstration of generative capability in a region of protein sequence space that natural evolution has not explored.
The flagship Science publication (Hayes et al., Science, January 2025, Vol 387, Issue 6736, pp. 850-858, DOI 10.1126/science.ads0018) had accumulated 341 citations and 68,494 downloads as of the research date; 318 of those citations arrived within the first 12 months of publication. The ESM3 preprint on bioRxiv (10.1101/2024.07.01.600583, July 2024) was cited by 129+ downstream papers within its first year. This publication velocity places ESM3 among the most-cited new protein ML methods.
The esmGFP result represents the strongest public demonstration of ESM3's generative capability. The designed protein carries 96 mutations across its 229 amino acid positions (about 42% of positions differ, leaving 58% sequence identity to the nearest known natural GFP), placing it at an evolutionary distance comparable to the separation of corals from jellyfish, two distinct phyla. EvolutionaryScale filed patents covering esmGFP and related protein design methods. The compute investment required to replicate ESM3-large (1.07×10²⁴ FLOPs, described in source materials as more than twice the training budget of GPT-4 at the time of training, although GPT-4's training compute has not been officially disclosed) creates a meaningful cost barrier.
Primary competitors differ along distinct axes rather than overlapping directly. AlphaFold3 (DeepMind, May 2024) excels at protein structure prediction including small-molecule and antibody complexes, but it is not generative in the design sense and its use is restricted to non-commercial academic research. Chai-1 (Chai Discovery, 2024) focuses on high-accuracy protein complex structure prediction. ESM2 (Meta AI, 2022), a 650M-parameter open-weight predecessor, provides sequence embeddings but lacks generative sequence-structure-function joint modeling. EvolutionaryScale's unique positioning is in generative protein design that simultaneously reasons over sequence, structure, and function. [CE027, CE028, CE029, CE030, CE031, CE032]
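The esmGFP figures quoted above, 96 mutations over 229 positions, can be checked with simple arithmetic: the mutated fraction is the normalized Hamming distance, and its complement is the sequence identity. The sketch below assumes equal-length aligned sequences with no gaps; the function names and the short example strings are hypothetical and are not the esmGFP sequence.

```python
def hamming_fraction(a: str, b: str) -> float:
    """Fraction of positions that differ between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def identity_from_mutations(mutations: int, length: int) -> float:
    """Sequence identity implied by a mutation count over a given length."""
    return 1 - mutations / length


distance = 96 / 229                          # fraction of positions mutated
identity = identity_from_mutations(96, 229)  # fraction of positions conserved

print(f"distance ~ {distance:.1%}, identity ~ {identity:.1%}")

# Toy illustration on made-up six-residue strings: 2 of 6 positions differ.
print(hamming_fraction("MSKGEE", "MSRGDE"))
```

The calculation gives roughly 42% distance and 58% identity, so the 58% figure is the identity to the nearest natural GFP, with the 96 mutations accounting for the remaining 42%.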
| Trust / Safety Dimension | Current Public Status | Risk Level | Diligence Path |
|---|---|---|---|
| Dual-use / biosecurity risk (pathogen protein design) | Open weights (ESM3-small, ESMC-300M/600M) carry Cambrian Non-Commercial License restrictions; Forge API requires account acceptance of ToS; no public biosecurity screening policy | High (industry-wide concern; no published biosecurity audit) | Request biosecurity policy document and any third-party biosafety review from EvolutionaryScale / CZI Biohub |
| Open-weight non-commercial license compliance | Cambrian Non-Commercial License Agreement prohibits commercial use of ESM3-small and ESMC-300M/600M; commercial customers must use Forge API or SageMaker | Medium (license enforcement requires monitoring; grey-area commercial use may go undetected) | Review Cambrian Non-Commercial License Agreement; assess IP protection against unauthorized commercial fine-tuning |
| Forge API data privacy and retention | No public data retention, deletion, or confidentiality policy for sequences submitted to Forge API | Medium (material concern for pharma customers with proprietary sequence data) | Request Forge API Terms of Service, Privacy Policy, and Data Processing Agreement from EvolutionaryScale / CZI Biohub |
| Performance robustness under heterogeneous structural inputs | Independent BioRxiv study (Dec 2024) found ESM3 binding predictions deteriorate when distinct per-variant relaxed structures are used; see SE007 | Medium (adversarial finding; scope is specific to heterogeneous structure inputs) | Review Gissing & Smith BioRxiv Dec 2024 preprint; test ESM3 binding prediction with varied structural input strategies |
| Organizational continuity risk (CZI Biohub transition) | EvolutionaryScale team joined CZI Biohub in November 2025; future product roadmap governed by CZI Biohub rather than independent startup | Medium (dependency on non-profit mission and funding continuity) | Monitor CZI Biohub announcements; confirm Forge API SLA commitments post-transition |
Trust and compliance status is derived from public sources only. Biosecurity policies, Forge API data retention policies, and any independent security audits are not publicly available. The table reflects publicly documented controls and known gaps as of May 2026.
Capability and maturity comparison across ESM3 (generative), ESMC (embedding), Forge API, and open-weight tiers on six dimensions.
Capability ratings are evidence-based assessments from public sources. Internal performance benchmarks and Forge API SLA details have not been disclosed.
[CE001, CE002, CE010, CE024, CE033, CE039]
5.5 Product Roadmap, CZI Biohub Transition, and Responsible Development
EvolutionaryScale raised a $142 million Series A in September 2024 with Lux Capital, Amazon, and NVIDIA among the investors, following an earlier seed investment from NVIDIA. In November 2025, the company's team joined CZI Biohub as part of a major "Frontier AI & Biology" initiative announced by the Chan Zuckerberg Initiative. Under this transition, co-founder and chief scientist Alex Rives became head of science at Biohub, and the EvolutionaryScale research team was integrated into Biohub's combined team of biological scientists, AI engineers, and technologists. CZI Biohub has announced a compute expansion to 10,000 GPUs by 2028 to support this initiative. As of the May 2026 report date, the Forge API and open-weight model distributions remain operational. EvolutionaryScale's public benefit company (PBC) charter and the Cambrian Non-Commercial License on open weights encode a commitment to research access while reserving commercial capabilities for the Forge API revenue model.
Key trust and safety dimensions require diligence attention. The dual-use risk of generative protein design, including potential misuse for pathogen engineering, is an industry-wide concern. EvolutionaryScale addresses this through the non-commercial license restriction on open weights and access controls on the Forge API, but has not publicly documented a biosecurity screening policy or independent biosafety audit. An independent BioRxiv preprint published in December 2024 found that ESM3's binding prediction performance deteriorates when distinct, per-variant relaxed protein structures are used as inputs, compared to using a single consistent structure as the backbone. Diligence teams should investigate this "More Structure, Less Accuracy" paradox for deployment scenarios involving heterogeneous structural inputs.
Data retention policies for sequences submitted to the Forge API have not been publicly disclosed, which may be a compliance concern for enterprise customers in regulated industries. No SEC Form D filings were found for EvolutionaryScale. Because private companies that raise capital under Regulation D ordinarily file a Form D, the absence is an open diligence item rather than a routine consequence of private status. [CE034, CE035, CE036, CE037, CE038, CE039]
| Milestone | Date / Timing | Status | Evidence Source |
|---|---|---|---|
| ESM2 released (Meta AI, 650M–3B open weights) | 2022 | Complete — open source, widely adopted by research community | Meta AI blog; HuggingFace (predecessor, not EvolutionaryScale) |
| EvolutionaryScale founded; ESM3 pre-release development begins | 2023 | Complete — company founded by Meta AI FAIR alumni | NVIDIA seed investment announcement; Crunchbase |
| ESM3-small open weights released; ESM3 Forge closed beta | June 2024 | Complete — Forge closed beta opened; ESM3-small on HuggingFace | ESM3 official blog (SE001); NVIDIA blog (SE017) |
| ESM3 preprint submitted to bioRxiv (10.1101/2024.07.01.600583) | July 2024 | Complete — 129+ citing papers within first year | bioRxiv preprint (SE006); BioRxiv search (SE008) |
| Series A fundraise ($142M) — Lux Capital, Amazon, NVIDIA | September 2024 | Complete | Axios (SE025); Crunchbase (SE020) |
| ESM-C (Cambrian) models released — 300M/600M open weights + 6B Forge | December 2024 | Complete — open weights on HuggingFace; ESMC-6B on Forge | ESM-C blog (SE002); HuggingFace model cards (SE014, SE015) |
| ESM3 published in Science (Hayes et al., Vol 387, pp. 850-858) | January 16, 2025 | Complete — 341 citations; 68,494 downloads | Science DOI 10.1126/science.ads0018 (SE005); Semantic Scholar (SE026) |
| Forge API public beta opened | January 2025 | Complete — concurrent with Science publication | ESM3 blog (SE001); GitHub ESM README (SE009) |
| EvolutionaryScale team joins CZI Biohub (Frontier AI & Biology initiative) | November 2025 | Complete — Alex Rives appointed head of science at Biohub | CZI Biohub blog (SE023) |
| NVIDIA BioNeMo NIM integration (ESM-C) | Target: 2025/2026 | In progress — listed as 'available soon' in ESM-C blog (December 2024) | ESM-C blog (SE002); NVIDIA blog (SE017); NVIDIA NGC catalog (SE018) |
| CZI Biohub 10,000-GPU compute expansion | Target: by 2028 | Announced — Biohub Frontier AI initiative | CZI Biohub blog (SE023) |
Milestone dates are sourced from official EvolutionaryScale blog posts, bioRxiv submission metadata, the Science paper publication date, and news articles reporting the Series A. Future milestones (BioNeMo NIM, 10,000 GPUs) are drawn from NVIDIA and CZI Biohub announcements and represent planned targets, not confirmed deliverables.
5.6 Exhibits
06 Customers
6.1 Customer Base Segmentation
EvolutionaryScale's customer base is best understood as four access tiers, each with a different buyer profile, access mechanism, and evidence depth.
The largest and best-evidenced tier is academic and independent researchers, who access open-weight ESM3 (1.4B, non-commercial license) and ESM-C (300M and 600M, open weights) directly via GitHub and HuggingFace. These users are predominantly computational biologists, structural biologists, and bioinformaticians at universities, research institutes, and government labs. Their use cases include protein sequence representation, structure prediction fine-tuning, functional annotation, antibody design, and downstream model development. No revenue is generated from open-weight users; they represent a top-of-funnel signal for commercial conversion.
The second tier comprises commercial cloud platform users reaching ESM models through Amazon Web Services SageMaker Marketplace (ESM-C models available for commercial deployment) and NVIDIA BioNeMo (listed as upcoming). These buyers are typically bioinformatics and computational biology teams at pharmaceutical and biotech companies who prefer cloud-native, infrastructure-managed model deployment over direct API subscriptions. AWS and NVIDIA are channel partners, not end customers; the actual enterprise buyers are their downstream clients. Subscriber counts and deployment metrics are not publicly disclosed.
The third tier is Forge API beta users. In January 2025, EvolutionaryScale opened a free limited-time public beta of the Forge API, providing access to ESM3 and ESM-C models at scale. The Forge API targets academic scientists and commercial builders who need inference beyond the 1.4B open model. Commercial pricing for Forge post-beta has not been publicly announced. API enrollment requires an access token; no user count has been disclosed.
The fourth tier, large pharma R&D buyers paying for enterprise platform access, is the highest-value segment but has the weakest publicly available evidence. Adaptyv Bio (a protein engineering company based in Lausanne, Switzerland) has been confirmed as a named ESM ecosystem partner. No Pfizer, Eli Lilly, Novartis, Roche, or other top-20 global pharma deal has been publicly announced, creating a material commercial proof gap relative to peers Generate Biomedicines and Isomorphic Labs. [CU023, CU031, CU012, CU008, CU010, CU011]
| Segment | Buyer / User / Payer | Access Channel | Use Case | Scale / Reach (Est.) | Revenue / Strategic Value | Key Evidence Gap |
|---|---|---|---|---|---|---|
| Academic & independent researchers | Computational/structural biologists, bioinformaticians at universities and research institutes | GitHub (open weights), HuggingFace, PyPI (esm package) | Protein representation, structure prediction fine-tuning, functional annotation, antibody design | 3.1k+ HF downloads (ESM3); 7.8k+ HF downloads (ESM-C); 129+ bioRxiv preprints | Zero direct revenue; top-of-funnel signal; academic citation credibility | No conversion rate from academic to paid user disclosed |
| Cloud platform enterprise users | Biotech / pharma IT and computational biology teams; AWS and NVIDIA customers | AWS SageMaker Marketplace (ESM-C commercial license); NVIDIA BioNeMo (upcoming) | GPU-managed protein embedding, molecular design, virtual screening | Undisclosed; mediated through AWS/NVIDIA customer bases | High strategic (channel partner alignment with $142M investor); commercial metrics opaque | Subscriber count, revenue share, and SageMaker usage volume not public |
| Forge API beta users | Academic and commercial scientists accessing larger ESM3 and ESM-C 6B models | Forge API at forge.evolutionaryscale.ai (token-gated) | Large-scale protein generation, representation at scale beyond open-weight model limits | Undisclosed; free beta since January 2025 | Potential conversion to paid; commercial pricing not yet announced | Beta user count, active usage, and paid conversion plan not disclosed |
| Biotech / protein engineering companies | Protein engineering startups and CROs (e.g., Adaptyv Bio) | Forge API, SageMaker, or open-weight integration | Protein binder design, antibody optimization, functional engineering | One named partner (Adaptyv Bio, via the esm-partner repo); pipeline depth undisclosed | Early-stage; potential future revenue; early product-market fit signal | No disclosed deal terms, pipeline size, or conversion from pilot to production |
| Large pharma R&D (gap segment) | Pharma CSOs, VP Drug Discovery, BD executives at top-20 global pharma | Enterprise Forge API or direct SageMaker subscription (hypothesized) | AI-assisted drug discovery, target identification, generative lead optimization | Zero publicly confirmed as of May 2026; no announced deal | Highest potential value; completely unverified commercially | No named deal analogous to Generate Biomedicines ($1.9B Amgen) or Isomorphic Labs (Lilly/Novartis) |
Scale/reach estimates for academic and cloud tiers are derived from HuggingFace download counts and bioRxiv search volume; actual unique-user counts differ and are unknown. Revenue and strategic value assessments are inferred, not disclosed. The large pharma segment is a target market, not a confirmed customer tier.
[CU001, CU007, CU011, CU012, CU017]
EvolutionaryScale's customer journey from open-weight academic discovery through Forge API trial to commercial cloud deployment and potential enterprise pharma engagement.
Journey stages are inferred from documented access mechanisms and competitive market norms. Stage transition rates (conversion from open-weight to API to enterprise) are entirely unknown. The pharma enterprise stage is a hypothesized destination, not a confirmed outcome.
[CU001, CU007, CU012, CU017, CU034]
6.2 Adoption Trajectory and Open-Access Usage
The clearest and most objectively measurable adoption signals come from open-access channels. On HuggingFace, the biohub/esm3-sm-open-v1 model (the 1.4B open-weight version of ESM3) had approximately 3,110 downloads with 291 likes as of the research date; the biohub/esmc-300m-2024-12 model had approximately 6,320 downloads and 30 likes; and biohub/esmc-600m-2024-12 had approximately 1,490 downloads and 32 likes. Total ESM-C family downloads across the two open models sum to approximately 7,810 as of May 2026. The ESM-C models were updated as recently as two days before the research cache date, indicating active maintenance. These download counts likely under-represent actual usage because many academic users clone the GitHub repository or access model weights via the official esm Python package rather than through the HuggingFace hub directly. On GitHub, the evolutionaryscale/esm repository is the primary open-source distribution channel. The organization maintains nine or more repositories including derivative technical infrastructure (DeepEP with 1,253 stars, NCCL fork with 1,270 stars), model weights, and the esm-partner repository explicitly labeled for partner collaborations. Active commit history through March–May 2026 demonstrates sustained development activity. Academic citation evidence is robust: a Semantic Scholar API search returned 32 papers building on ESM3 as of May 2026, and a bioRxiv search for "evolutionaryscale ESM3" returned 129 preprint results. Named downstream applications include MegSite (nucleic acid binding residue prediction), ProteinReasoner (multi-modal protein language model with chain-of-thought reasoning), iNClassSec-ESM (non-classical secreted protein discovery), and affinity peptide design for chromatographic purification — spanning academic, clinical, and industrial applications. 
The ESM3 Science paper (published January 16, 2025, DOI 10.1126/science.ads0018) provides authoritative academic reception and serves as a credibility anchor for commercial conversations.[CU001, CU002, CU003, CU004, CU005, CU006]
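The derived adoption figures in this section (the ESM-C family total, its ratio to ESM3-open downloads, and the preprint-to-indexed-paper ratio) follow directly from the quoted counts. The sketch below recomputes them from the approximate values stated above; the variable names are hypothetical, and the inputs are point-in-time snapshots, so the ratios are indicative only.

```python
# Approximate HuggingFace download counts as quoted in the report (May 2026).
downloads = {
    "esm3-sm-open-v1": 3110,
    "esmc-300m-2024-12": 6320,
    "esmc-600m-2024-12": 1490,
}

# Combined downloads for the two open ESM-C models.
esmc_total = downloads["esmc-300m-2024-12"] + downloads["esmc-600m-2024-12"]

# How many times larger ESM-C demand is than ESM3-open demand.
esmc_vs_esm3 = esmc_total / downloads["esm3-sm-open-v1"]

# bioRxiv search hits (129) relative to Semantic Scholar results (32).
preprint_vs_indexed = 129 / 32

print(f"ESM-C total: {esmc_total}")
print(f"ESM-C vs ESM3-open: {esmc_vs_esm3:.2f}x")
print(f"bioRxiv vs Semantic Scholar: {preprint_vs_indexed:.2f}x")
```

Because the download counts are rolling snapshots and the two literature searches use different queries, the 2.5x and 4x multiples are best read as directional signals rather than precise measurements.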
| Metric | Value | Date / Period | Source | Confidence | Implication |
|---|---|---|---|---|---|
| ESM3-open (1.4B) HuggingFace downloads | ~3,110 | As of May 2026 | HuggingFace org page (biohub/esm3-sm-open-v1) | High — direct read from HF page | Baseline demand signal for open-weight version; understates total usage (GitHub + pip) |
| ESM-C 300M HuggingFace downloads | ~6,320 | As of May 2026 | HuggingFace org page (biohub/esmc-300m-2024-12) | High — direct read from HF page | Most popular ESM-C model; broader adoption than ESM3-open, likely due to lower compute requirements |
| ESM-C 600M HuggingFace downloads | ~1,490 | As of May 2026 | HuggingFace org page (biohub/esmc-600m-2024-12) | High — direct read from HF page | Higher capability tier; incremental users willing to pay compute premium |
| Total ESM-C family HF downloads (300M + 600M) | ~7,810 | As of May 2026 | HuggingFace org page aggregation | High — computed from two confirmed values | ESM-C family exceeds ESM3-open by 2.5x, suggesting representation use cases have broader demand than generation |
| Downstream papers citing ESM3 (Semantic Scholar) | 32 papers | As of May 2026 | Semantic Scholar API search (query: ESM3 EvolutionaryScale protein language model) | High — API result with known query | Growing downstream research ecosystem; validates academic product-market fit |
| bioRxiv preprints mentioning ESM3 + EvolutionaryScale | 129 results | As of May 2026 | bioRxiv search for 'evolutionaryscale ESM3' | High — direct search result count | 4x more preprints than Semantic Scholar-indexed papers; indicates large unreported usage pipeline |
| EvolutionaryScale GitHub DeepEP repo stars | 1,253 | As of May 2026 | GitHub org page (evolutionaryscale/DeepEP) | High — direct read from GitHub | Signals active developer engagement beyond model users; developer community building |
| NCCL fork stars (evolutionaryscale/nccl) | 1,270 | As of May 2026 | GitHub org page (evolutionaryscale/nccl) | High — direct read from GitHub | Indicates GPU infrastructure-level engineering credibility; appeals to enterprise AI infra buyers |
| Forge API public beta launch | Launched January 2025 | January 2025 | Company blog (evolutionaryscale.ai, January 2025 post) | High — company official announcement | Commercial intent confirmed; exact beta user count not disclosed |
| Named downstream academic applications (Semantic Scholar, select) | MegSite, ProteinReasoner, iNClassSec-ESM, affinity peptide design (4+ named) | 2025–2026 | Semantic Scholar API result (ESM3 EvolutionaryScale) | High — individual paper citations confirmed | Demonstrates multi-domain downstream use in clinical, basic science, and industrial contexts |
HuggingFace download counts represent unique model downloads from the HuggingFace hub only; actual usage via pip install, GitHub clone, or SageMaker deployment is excluded. GitHub star counts are developer interest proxies, not active user counts. Semantic Scholar returns published papers; preprint count from bioRxiv is ~4x higher. All metrics reflect open-access usage; commercial deployment metrics are entirely opaque.
[CU001, CU002, CU003, CU004, CU005, CU006]
Estimated top-down funnel from total addressable academic user base through open-weight downloads, API beta enrollment, commercial SageMaker subscriptions, and enterprise pharma partnerships.
Funnel values above 'Forge API beta enrollees' are approximate: combined HuggingFace download total of ~10.9k plus an estimate of GitHub-only users. Values for Forge API beta enrollees (~200) and SageMaker commercial subscribers (~10) are rough low-end estimates only; EvolutionaryScale has not disclosed these counts. The researcher community addressable total is an industry estimate. Numeric estimates for the two undisclosed tiers are placeholders and carry very high uncertainty.
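To make the scale of these placeholders concrete, the sketch below computes stage-to-stage conversion from the ~10.9k combined HuggingFace downloads to the rough low-end estimates for Forge API beta enrollees (~200) and SageMaker commercial subscribers (~10). It omits the GitHub-only user estimate and the addressable-community total because no figures are stated for them. The stage labels and the function name are hypothetical, and the outputs inherit the very high uncertainty of the inputs.

```python
# Funnel stages, widest first. The last two counts are placeholder
# estimates from the note above, not disclosed figures.
funnel = [
    ("HuggingFace downloads (three open models)", 10_920),
    ("Forge API beta enrollees (placeholder)", 200),
    ("SageMaker commercial subscribers (placeholder)", 10),
]


def stage_conversions(stages):
    """Return (from_stage, to_stage, rate) for each adjacent pair of stages."""
    return [
        (upper_name, lower_name, lower_count / upper_count)
        for (upper_name, upper_count), (lower_name, lower_count)
        in zip(stages, stages[1:])
    ]


conversions = stage_conversions(funnel)
for upper_name, lower_name, rate in conversions:
    print(f"{upper_name} -> {lower_name}: {rate:.1%}")
```

Under these placeholder inputs, fewer than one download in fifty corresponds to a beta enrollee. Downloads are not unique users, so the true ratio could differ substantially in either direction.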
[CU001, CU004, CU005, CU006, CU011, CU017]
6.3 Named Deployments and Integration Partners
The named commercial deployment and integration evidence for EvolutionaryScale is anchored by three confirmed channels and one named partner. First, AWS SageMaker Marketplace lists ESM-C models for commercial deployment under the Cambrian Inference Clickthrough License Agreement. The GitHub README provides explicit deployment instructions: admin-level AWS access, subscription via the SageMaker Marketplace, and CloudFormation-based launch taking 15–25 minutes. GPU costs are billed directly to the subscriber's AWS account. This represents a verifiable commercial deployment path, though subscriber counts are undisclosed. Second, NVIDIA BioNeMo listed ESM-C as an upcoming integration as of December 2024. NVIDIA BioNeMo targets drug discovery, molecular design, virtual screening, and protein binder design use cases — exactly matching ESM-C's intended commercial applications. NVIDIA is also a strategic investor in EvolutionaryScale (Series A participant), creating a structural incentive for deep integration. The NVIDIA investor relationship is confirmed by both the Series A announcement (BusinessWire) and a dedicated NVIDIA news release about the seed investment. Third, Adaptyv Bio — a protein engineering company at Biopole Life Science Campus, Lausanne, Switzerland — has been confirmed as a named ESM ecosystem partner. Adaptyv Bio's focus on protein design aligns directly with ESM model capabilities. The partnership reflects the small-and-growing-biotech customer segment that can utilize open-weight or API-based access without the procurement overhead of large pharma. Fourth, the Forge API public beta (launched January 2025) constitutes a customer proof of the commercial platform, though the scale of enrolled users and conversion to paid status is not disclosed. 
The EvolutionaryScale GitHub ESM partner repository (evolutionaryscale/esm-partner, labeled "Repository for partner collaborations") implies a formal partnership pipeline beyond Adaptyv Bio, but no other partners are named publicly. Importantly, prior ESM family generations (ESM1b, ESM2) had documented corporate users: BioNTech and InstaDeep fine-tuned ESM models on COVID spike proteins to create a variant early-warning system flagging all 16 WHO variants of concern; Hie et al. used ESM1v/ESM1b to evolve antibodies; Shanker et al. used ESM-IF1 for antibody evolution against SARS-CoV-2. These legacy use cases by corporate and academic users validate the ESM family's practical utility, but they do not constitute current commercial customers of EvolutionaryScale's paid products.[CU007, CU008, CU009, CU010, CU011, CU013]
| Customer / Partner | Segment | Deployment / Integration | Status (Production vs. Pilot) | Outcome / Evidence Quality | Key Limitation |
|---|---|---|---|---|---|
| AWS SageMaker Marketplace (ESM-C) | Cloud platform channel — enterprise biotech/pharma AWS customers | ESM-C 300M, 600M, 6B available for subscription; CloudFormation deployment; GPU billed to subscriber | Production — live Marketplace listing; deployment documented in GitHub README | High — GitHub README documents specific Marketplace URLs, deployment steps, and SDK integration | Subscriber count and revenue generated are not public; AWS does not disclose per-ISV usage |
| NVIDIA BioNeMo Platform | Cloud platform channel — enterprise drug discovery teams using NVIDIA hardware | ESM-C listed as upcoming integration; BioNeMo targets molecular design, virtual screening, protein binder design | Upcoming / planned — announced December 2024 in ESM Cambrian blog; not yet live as of cache date | Medium — confirmed in company blog and NVIDIA BioNeMo platform page; integration status unverified post-announcement | No confirmed live integration or user count; 'soon' language in December 2024 blog indicates planned not confirmed |
| Adaptyv Bio | Biotech protein engineering startup (Lausanne, Switzerland) | ESM model integration for protein engineering workflows | Production / partnership — named as ESM ecosystem partner | Low-medium — named partner confirmed; specific use case, model version, and business terms not disclosed | Website content minimal; no case study or quantified outcome published by either party |
| Forge API Beta Enrollees | Academic and commercial scientists (mixed segment) | Token-gated API access to ESM3 and ESM-C 6B at scale; same SDK as SageMaker | Production-grade API (beta) — launched January 2025; free limited-time access | Medium — Forge API is operational per GitHub SDK and company blog; enrollment volume undisclosed | Free beta status; no revenue; post-beta paid pricing not announced; conversion plan opaque |
| BioNTech / InstaDeep (legacy ESM2 user) | Large biotech / AI company (ESM predecessor generation) | Fine-tuned ESM language model on COVID spike protein sequences for variant early-warning system | Production — flagged all 16 WHO variants of concern before official designation | High historical quality — documented in ESM3 blog, peer-reviewed context; real-world outcome confirmed | Legacy use of ESM2 (free, predecessor model), not a current paying EvolutionaryScale customer |
Coverage is partial: named partners and marketplace listings only. Undisclosed Forge API users, any private enterprise pilots, and any early-stage partner discussions in the esm-partner GitHub repository are excluded. The BioNTech/InstaDeep row documents prior ESM family usage, not a current commercial relationship with EvolutionaryScale. All revenue metrics are null or undisclosed.
[CU007, CU008, CU010, CU011, CU025, CU033]
Evidence quality, deployment status, outcome specificity, and retention signal across EvolutionaryScale's named and inferred customer deployments.
Matrix assessments are qualitative judgments based on the type and quantity of available evidence for each deployment. Production status for AWS SageMaker and Adaptyv Bio is inferred from documented access mechanisms; EvolutionaryScale has not issued press releases confirming active commercial deployments. BioNTech/InstaDeep row documents historical ESM2 usage, not a current EvolutionaryScale commercial relationship.
[CU007, CU010, CU011, CU025, CU033]
6.4 Retention, Durability, and Satisfaction Evidence
EvolutionaryScale has disclosed no NRR, GRR, customer churn rate, contract renewal statistics, or customer satisfaction scores as of May 2026. The absence of these metrics is expected for a company at this commercialization stage: the Forge API beta launched only in January 2025, AWS SageMaker listings are relatively recent, and no enterprise software deal with disclosed terms has been announced. The primary observable retention signals are indirect: sustained HuggingFace maintenance (ESM-C updated within days of the research date), active GitHub commits through April–May 2026, and the ongoing accumulation of downstream academic papers (37 months of building on ESM models since ESM2's release). For the Forge API channel, the public beta offers free access as an explicit customer development tool. The company's January 2025 blog post describes a "public beta, allowing scientists in academia and industry a free limited time preview", which implies post-beta paid conversion as an intended but unverified retention mechanism. The ESM GitHub repository SDK works across local, Forge, and SageMaker deployment modes (the same API code runs regardless of endpoint). Moving between deployment modes is therefore cheap, while workflows built on the SDK stay tied to the ESM family; that profile is architecturally favorable for retention but unverified at the commercial level. For the AWS SageMaker channel, retention is mediated by AWS cloud infrastructure lock-in. Once a customer deploys ESM-C via CloudFormation inside their AWS environment, migration to a competing protein LM requires deliberate re-integration effort, providing durable channel stickiness. The ESM2 predecessor models, which are freely available (MIT-licensed via facebookresearch/esm), set an important floor for willingness-to-pay analysis. A customer who can satisfy their protein representation tasks with the free ESM2 (up to 15B parameters) has limited incentive to pay for ESM-C commercial access unless performance advantages justify the premium.
Quantified performance lift on specific pharmaceutical applications has not been publicly demonstrated, which leaves a critical gap in the retention evidence.[CU022, CU027, CU028, CU030, CU032, CU035]
| Metric / Signal | Value or Status | Segment | Confidence | Diligence Ask |
|---|---|---|---|---|
| Net Revenue Retention (NRR) | Not disclosed | All commercial segments | N/A — metric does not exist publicly | Request NRR disclosure in management due diligence; available only post-commercial launch at scale |
| Gross Revenue Retention (GRR) | Not disclosed | All commercial segments | N/A — metric does not exist publicly | Request GRR at any enterprise customer; currently not applicable before paid tier is active at scale |
| HuggingFace model maintenance freshness | ESM-C updated 2 days before research date (May 2026); ESM3 updated January 29, 2025 | Open-weight academic users | High — direct read from HuggingFace timestamps | Monitor HuggingFace update frequency as a proxy for model freshness commitment |
| GitHub commit activity (evolutionaryscale org) | Active commits across esm, DeepEP, nccl, transformers repos through April–May 2026 | Developer community users | High — visible from org page activity | Track issue resolution rate and release cadence to assess developer support quality |
| Forge API availability / uptime | API available per GitHub SDK documentation; no SLA or uptime data published | Forge API beta users | Medium — API endpoint referenced in code but no status page or uptime metrics | Request SLA terms and historical uptime for enterprise API diligence; check forge.evolutionaryscale.ai status page |
| Academic downstream paper accumulation rate | 32 Semantic Scholar papers (≈ 13 months post-ESM3 release); 129 bioRxiv preprints | Academic users | High — from API search | Track quarterly paper count as a leading indicator of commercial pipeline conversion |
| Reported customer churn events | Zero publicly documented churn or non-renewal events as of May 2026 | All segments | Low — absence of evidence, not confirmed absence of churn; company is pre-scale | Not meaningful until paid commercial relationships are disclosed |
| Customer testimonials / G2/Gartner Peer Insights reviews | None found as of May 2026 | All commercial segments | High confidence in absence — systematic search returned no reviews | Search G2, Gartner Peer Insights, and Capterra periodically; first reviews expected when enterprise tier launches |
All NRR and GRR cells are null because EvolutionaryScale has no disclosed commercial revenue from its paid product tier as of May 2026. Retention proxies rely entirely on open-access metrics (HuggingFace downloads, GitHub activity, paper counts). The company is in an API beta phase; formal retention metrics are not yet applicable at commercial scale.
[CU022, CU027, CU028]
Comparative bar representation of known open-access adoption metrics across HuggingFace and academic literature channels as of May 2026.
All values are from direct platform reads (HuggingFace page, GitHub org page, API search results) as of May 2026. HuggingFace downloads and GitHub stars are heterogeneous metrics (downloads reflect model weight retrievals, stars reflect developer interest). bioRxiv and Semantic Scholar values are search-result counts and may include indirect mentions.
[CU001, CU002, CU003, CU005, CU006]
6.5 Expansion Drivers and Concentration Risk
EvolutionaryScale's expansion trajectory is shaped by two competing dynamics. The first is favorable: the AWS and NVIDIA strategic investments create preferential channel placement, marketing co-promotion, and potential preferential access to both companies' enterprise customer networks. NVIDIA BioNeMo's 2x faster training and 6x faster inference claims, combined with ESM model integration, position EvolutionaryScale models inside a high-adoption GPU infrastructure platform. AWS's inclusion of ESM-C in SageMaker JumpStart creates discoverability among the thousands of life sciences companies deploying workloads on AWS. The "free academic tier → Forge API beta → enterprise SageMaker contract" funnel is architecturally sound. The second dynamic is adverse: EvolutionaryScale has no disclosed pharma anchor customer, no land-and-expand case study, and no publicly announced pricing that would enable market-standard comparisons. Generate Biomedicines disclosed a $1.9B Amgen collaboration; Isomorphic Labs announced deals with Eli Lilly and Novartis totaling over $3 billion in potential milestone value. EvolutionaryScale's frontier protein language model has arguably superior academic credentials (Science publication, 98B-param ESM3) but demonstrably inferior commercial proof relative to these direct competitors in the protein AI space. Customer concentration risk is currently unmeasurable: with no named enterprise customer, there is technically zero customer concentration, but this masks the greater risk that a public-benefit company that raised $142M over 27 months has disclosed no enterprise revenue at all. 
Structural risks to the expansion thesis include: (1) open-weight ESM2 substitution — many downstream users achieve adequate results with the free 15B-parameter ESM2 rather than paying for ESM-C or ESM3 commercial access; (2) biosecurity constraints — responsible development frameworks and dual-use risk assessments (documented by NTI and safe.ai) create legitimate reasons to gate access to frontier protein design models, limiting the addressable commercial user base; and (3) academic open-source competition — AlphaFold3, ESMFold, and RoseTTAFold are all freely available for structure prediction, compressing the addressable market to generation and multimodal reasoning tasks where ESM3 has genuine differentiation.[CU014, CU017, CU018, CU019, CU024, CU026]
| Expansion Driver / Risk Factor | Type | Impact | Likelihood / Status | Diligence Path |
|---|---|---|---|---|
| AWS SageMaker + NVIDIA BioNeMo channel placement | Expansion driver | High — access to thousands of enterprise life sciences customers via cloud platforms | Confirmed (SageMaker live; BioNeMo announced) | Monitor SageMaker listing rankings and BioNeMo launch timeline; request channel partner revenue split terms |
| No named pharma anchor customer (commercial proof gap) | Concentration risk / adverse | High — absence of enterprise customer proof limits Series B valuation and future fundraising | Confirmed gap as of May 2026 | Monitor for pharma deal announcement; request management update on enterprise sales pipeline stage and count |
| Open-weight ESM2 free substitution risk | Adverse headwind | Medium — ESM2 (up to 15B params, free) satisfies representation tasks for many downstream users | Confirmed — ESM2 available; degree of substitution unknown | Quantify what fraction of API beta users genuinely require ESM3 or ESM-C 6B performance vs. ESM2 |
| Biosecurity / dual-use constraints on frontier protein design models | Adverse headwind | Low-medium — responsible development framework may gate access to high-risk use cases, limiting addressable commercial market | Active — NTI and safe.ai document ongoing biosecurity concerns about protein design AI | Review EvolutionaryScale responsible development framework; assess whether access controls would materially restrict pharma customer use cases |
| Forge API commercial pricing launch (future) | Expansion driver (pending) | High — paid Forge API would create first direct revenue metric and proof of willingness-to-pay | Pending — free beta only as of May 2026; pricing model not announced | Track Forge pricing page; ask management for pricing tier structure and expected launch date |
| Land-and-expand via academic-to-enterprise funnel | Expansion driver (structural) | Medium — documented ESM2 corporate adoption (BioNTech/InstaDeep) suggests enterprise conversion is possible | Structural — funnel architecture confirmed; conversion rate unknown | Request Forge API conversion rate data from management; compare academic vs. commercial API usage split |
Expansion driver assessments are forward-looking and based on structural inference from channel relationships, not disclosed revenue data. Adverse headwinds (ESM2 substitution, biosecurity constraints) are substantiated by confirmed free alternatives and independent biosecurity organization documentation, respectively. All probability assessments are qualitative, not quantitative.
[CU017, CU018, CU024, CU035, CU036]
6.6 Exhibits
07 Risks
7.1 Biosecurity and Dual-Use Risk
Biosecurity risk is the most existential dimension of the EvolutionaryScale thesis. ESM3 can generate functional proteins at sequence distances that "represent 500 million years of evolution" from any known natural protein, as demonstrated by esmGFP. The same generative capability that enables drug discovery could, in principle, be directed at enhancing pathogen virulence, generating novel toxins, or engineering biological agents outside the known sequence space monitored by surveillance systems. A 2023 MIT study published on arXiv (Soice et al., arXiv:2306.03809) showed that large language models could, in a one-hour session, help non-scientists with no laboratory training identify potential pandemic pathogens, synthesis routes, and CRO partners. While that study focused on general-purpose LLMs rather than protein-specific models, the concern is directly analogous: protein language models lower the barrier to designing functional biological agents. US Executive Order 14110 (October 30, 2023; rescinded January 20, 2025) explicitly singled out biotechnology as a national-security AI risk domain, mandating evaluations of AI systems that might lower barriers to creating biological, chemical, nuclear, or radiological weapons with mass-casualty potential. The NIST AI Risk Management Framework (AI RMF 1.0, released January 2023; updated Generative AI Profile July 2024) provides voluntary guidance for identifying and managing these risks. The EU AI Act (Regulation 2024/1689, OJ 12 July 2024) includes dual-use bio-related AI in its high-risk scope under Annex III and related provisions. EvolutionaryScale has published a Responsible Development Framework with four core tenets: communicate benefits and risks, rigorously evaluate models before deployment, adopt guardrails, and engage government and civil society. ESM Cambrian's launch blog states that "ESM C was reviewed by a committee of scientific experts who concluded that the benefits of releasing the models greatly outweigh any potential risks."
However, the canonical Responsible Development Framework URL (/blog/responsible-development) returned a 404 error at access date (2026-05-18), indicating the framework document may not be publicly accessible, which is itself a transparency risk. No independent third-party verification of EvolutionaryScale's model safety evaluations has been publicly disclosed. The Biological Weapons Convention (BWC, 1972, 189 parties as of May 2025) prohibits development and stockpiling of biological weapons but has no formal verification regime and no mechanism specifically addressing AI-designed proteins. The Center for AI Safety's 2023 statement, signed by Hinton, Bengio, and others, places the risk of extinction from AI alongside other societal-scale risks such as pandemics and nuclear war. The Johns Hopkins Center for Health Security treats the intersection of AI and biosecurity as a core research area. Industry self-regulation via responsible AI bio frameworks (Anthropic's RSP, OpenAI's safety commitments) is nascent and not binding on third parties like EvolutionaryScale.[CR001, CR002, CR003, CR004, CR005, CR006]
| Risk | Category | Likelihood (1-5) | Impact (1-5) | Residual Score | Current Mitigation | Residual Exposure / Gaps |
|---|---|---|---|---|---|---|
| Regulatory imposition of mandatory biosecurity evaluations or export controls on protein LLMs | Biosecurity/Regulatory | 3 | 5 | 15 | Responsible Development Framework; government engagement stated in ESM3 blog | No binding third-party evaluation; /blog/responsible-development URL inaccessible at access date |
| Deliberate misuse of ESM3 API to design novel pathogen proteins or toxins | Biosecurity/Dual-Use | 2 | 5 | 10 | Access controls on Forge API; academic use restrictions; model output monitoring stated | No independent biosecurity audit of API guardrails disclosed publicly |
| Competitive commoditisation: free tools (AlphaFold3 DB, Chai-1, OpenFold) erode Forge API pricing | Competitive | 4 | 4 | 16 | ESM3 98B generative multimodality differentiates from structure-prediction tools; drug-discovery fine-tunes | Chai-1 already matches/beats ESM3 on key benchmarks; erosion accelerating |
| Meta retains residual IP rights over ESM2 ancestor weights used to initialise ESM3 | Legal/IP | 2 | 4 | 8 | Patents filed on ESM3 architecture; PBC corporate structure | No public disclosure of Meta IP agreement; ESM2 model card terms ambiguous on derivatives |
| Capital runway exhaustion before Forge API achieves commercial revenue scale | Financial | 3 | 4 | 12 | $145 M raised; Amazon/Nvidia as investors provide compute access optionality | No public revenue; valuation at 10×+ forward revenue implies high bar; no follow-on round disclosed |
| Down-round risk if Forge commercial adoption lags and Series A mark-to-market compresses | Financial | 3 | 3 | 9 | Pharmaceutical partnerships for drug discovery as revenue engine; AWS distribution | No disclosed enterprise contracts or ARR milestones; market-rate GPU costs remain high |
| Key-person departure: loss of Rives, Sercu, or Lin halts model development | Talent/Execution | 2 | 4 | 8 | Equity incentives implied; four-founder team provides some redundancy | No disclosed succession plan; no independent board oversight of founder roles |
| Single-employer cultural concentration (all founders ex-Meta FAIR) reduces strategic diversity | Talent/Culture | 3 | 2 | 6 | Company expanded beyond founding team (est. 50–80 employees) | No external scientific advisory board publicly disclosed; potential for paradigm blind spots |
| Investor/competitor conflict: Amazon and Nvidia steer enterprise clients to competing platforms | Partner/Dependency | 2 | 4 | 8 | Contractual distribution agreements provide channel incentives | No MFN or exclusivity disclosures; BioNeMo includes third-party competing tools |
| EU AI Act high-risk classification triggering conformity assessment and access restrictions for ESM3 API | Regulatory | 2 | 3 | 6 | Responsible Development Framework aligned with RSP-type self-regulation | EU AI Act full provisions apply Aug 2026; company EU presence and compliance posture not disclosed |
| Model hallucination: ESM3-generated sequences fail wet-lab validation at rate reducing customer ROI | Technical | 3 | 3 | 9 | Alignment training (RLHF-analogous feedback) improves generation quality per ESM3 paper | No published wet-lab validation failure rates; latency between API call and lab result obscures true failure rate |
| Data scarcity for novel protein families limits generalization of ESM3 to unexplored sequence space | Technical | 3 | 3 | 9 | Synthetic data augmentation used in ESM3 training (predicted structures/functions) | Synthetic data quality depends on AlphaFold predictions; circular dependency risk if predictions are wrong |
Likelihood and Impact on 1–5 scale; Residual Score = Likelihood × Impact. Mitigations sourced from public EvolutionaryScale disclosures and industry standard practices. Residual exposure column reflects unresolved public-evidence gaps.
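The scoring rule in the note above (Residual Score = Likelihood × Impact) can be sketched in a few lines of Python. The example reproduces four rows of the register and ranks them by score; the `Risk` class and the shortened risk names are illustrative assumptions, not part of any published EvolutionaryScale framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    name: str
    likelihood: int  # 1-5 scale, qualitative
    impact: int      # 1-5 scale, qualitative

    @property
    def residual_score(self) -> int:
        # Scoring rule from the table note: Likelihood x Impact.
        return self.likelihood * self.impact


# Four rows copied from the register above (names shortened for display).
risks = [
    Risk("Mandatory biosecurity evaluations / export controls", 3, 5),
    Risk("Deliberate misuse of ESM3 API", 2, 5),
    Risk("Competitive commoditisation", 4, 4),
    Risk("Capital runway exhaustion", 3, 4),
]

# Highest residual score first.
ranked = sorted(risks, key=lambda r: r.residual_score, reverse=True)
for r in ranked:
    print(f"{r.residual_score:>2}  {r.name}")
```

A multiplicative score weights likelihood and impact equally, so a 4 × 4 competitive risk outranks a 3 × 5 regulatory one. A reader who weighs catastrophic impact more heavily may prefer to sort on impact first.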
[CR001, CR005, CR013, CR016, CR019, CR023]
Risk items plotted by likelihood (x-axis, 1–5) and impact (y-axis, 1–5); higher-right quadrant = highest priority.
Likelihood and impact ratings are qualitative estimates based on public information; no quantitative probability model has been applied.
[CR001, CR005, CR013, CR019, CR032, CR038]
7.2 Technical Risks
ESM3's performance depends on the quality of its training distribution. The model is trained on 2.78 billion protein sequences, but the natural diversity of functional proteins in novel families (for example non-ribosomally synthesised peptides, novel enzyme scaffolds, or fully de novo folds) may lie far outside this distribution. Protein language models can "hallucinate" high-confidence sequences that do not fold or function as predicted; wet-lab validation is required before any ESM3-generated sequence can be used therapeutically or industrially. This creates a latency risk: customers pay for Forge API calls but must still run expensive lab validation before generating commercial value, weakening the economic argument for premium pricing relative to open alternatives. Benchmark saturation is a near-term technical risk. ESM3-98B was the state of the art on CASP15 monomer prediction at launch, but Chai-1 (Apache 2.0, free) already reports a Cα LDDT of 0.849 on that set versus ESM3-98B's 0.801, and a 77% PoseBusters success rate versus AlphaFold 3's 76%, approaching parity with commercial models in structure prediction. AlphaFold 3 and its open database (200 M+ structures including protein complexes as of March 2026) continuously expand free coverage. Baker Lab's RFdiffusion (Nature 2023) is freely available for binder design. These open tools reduce the marginal value of Forge's API-gated access. ESM3's training is initialised from Meta's ESM2 weights. Meta retains IP on the ESM2 model family under the terms of its own model card and the GitHub repository for facebookresearch/esm. ESM3 represents a substantial architectural and training advance beyond ESM2, but any residual IP claim from Meta on the ancestor weights could constrain EvolutionaryScale's ability to commercialise the 98B model or sublicense weights.
The company has filed patents on aspects of its work (disclosed in the competing interest statement of the ESM3 bioRxiv preprint), but the full patent portfolio and its relationship to Meta's prior art remain undisclosed. Compute dependency on NVIDIA (H100/H200 clusters) and Amazon (AWS) is a double-edged risk: both entities are investors and channel partners but could theoretically restrict access or deprioritise workloads. GPU supply constraints could delay model training or API scaling, particularly given that ESM3 was trained at 1×10²⁴ FLOPs, one of the largest compute investments for any biological model at launch.[CR011, CR012, CR013, CR014, CR015, CR016]
| Risk | Mechanism | Evidence | Severity | Mitigation |
|---|---|---|---|---|
| Protein hallucination / non-functional generations | Language model generates sequences with plausible statistics but incorrect fold or no biological activity | Chai-1 technical report shows ESM3-98B at 0.801 Cα LDDT vs Chai-1 0.849; AlphaFold2 achieved accuracy for ~2/3 proteins only at CASP14 | High for drug-discovery customers expecting reliable hits | Alignment training analogous to RLHF per ESM3 paper; lab validation required |
| Benchmark saturation and competitor catch-up | Free open tools rapidly close performance gaps on structure prediction; generative tasks less well-benchmarked | AlphaFold3 DB 200M+ structures free; Chai-1 Apache 2.0 at SOTA; RFdiffusion free from Baker Lab | Medium — limits premium pricing power | ESM3's multimodal joint generation (sequence+structure+function) differentiates |
| ESM2 IP provenance — Meta ancestor weights | ESM3 initialised from Meta's ESM2; Meta model card terms on derivatives not fully public; potential for claim on commercial weights | facebookresearch/esm repo text: 'contains pre-trained weights' under Meta terms; no explicit open-license for ESM2 weights | Medium — latent licensing risk | Patents filed; counsel review required; PBC structure provides some protection |
| Compute cost and GPU supply concentration | Training at 1×10²⁴ FLOPs + ongoing inference; HPC cluster needed for ESM3-98B; NVIDIA H100/H200 scarce | ESM3 blog: 'trained on one of the highest throughput GPU clusters in the world today' | Medium operational | Amazon and Nvidia as investors provide access optionality; multi-cloud risk remains |
| Training data gaps for rare protein classes | Novel organisms, synthetic biology substrates, or non-natural amino acids may lie outside the 2.78B-sequence training distribution | ESM3 paper: augmented with synthetic data to cover gaps; ESM Cambrian scaling law plateau indicates ceiling | Medium — limits utility for frontier drug discovery | Synthetic augmentation; ongoing model updates via ESM Cambrian family |
Severity ratings qualitative; evidence citations refer to public model cards and technical reports.
[CR011, CR012, CR013, CR014, CR015, CR016]
Directed acyclic graph showing how biosecurity, technical, and financial risks flow to commercial and investor impacts.
[CR005, CR013, CR019, CR032, CR042]
7.3 Competitive Commoditisation Risk
The protein AI tooling landscape is commoditising rapidly. Google DeepMind's AlphaFold 3 database provides over 200 million predicted protein-complex structures freely via EMBL-EBI partnership (updated March 2026 to include protein complexes). Meta's ESM2 is MIT-licensed via facebookresearch/esm; OpenFold is Apache 2.0. Chai-1 is Apache 2.0 with free commercial use. Baker Lab's RFdiffusion and ProteinMPNN are freely available from IPD/UW. These free-to-use models serve structure prediction and binder design workflows that are core use cases for Forge's API. EvolutionaryScale's defensibility relies on: (1) ESM3's generative multimodal capability going beyond structure prediction to sequence/structure/function joint generation; (2) the 98B-parameter flagship model being API-gated and commercially licensed; (3) domain-specific fine-tunes for drug discovery that require proprietary data. However, Chai-1's technical report claims state-of-the-art multimer prediction without MSA, directly competing with ESM3's key differentiator. If academic and venture-backed competitors (Profluent, Generate Biomedicines, AbSci, Isomorphic Labs) release competitive generative models under permissive licenses, Forge's pricing power will compress. The competitive risk is compounded by investor/partner overlap: Amazon (AWS) distributes EvolutionaryScale models via SageMaker JumpStart but also invests in and provides compute to competing bio-AI companies; Nvidia distributes via BioNeMo but BioNeMo is itself a competing model-distribution platform. Any conflict between partner and investor interests could result in preferential treatment of alternatives on these platforms.[CR019, CR020, CR021, CR022, CR023, CR024]
| Competitor Tool | Licence / Cost | Primary Capability | Threat to Forge API | Gap vs ESM3 |
|---|---|---|---|---|
| AlphaFold 3 DB (DeepMind/EMBL-EBI) | Free; CC BY 4.0 | Structure prediction for proteins, complexes, small molecules; 200M+ entries DB | High for structure-prediction use cases | Not a generative model; does not generate sequences from prompts |
| Chai-1 (Chai Discovery) | Apache 2.0; free commercial use | Multimer structure prediction; 77% PoseBusters; 0.849 LDDT monomer; no MSA needed | High — already beats ESM3-98B on CASP15 monomer | Not yet a generative protein design model; sequence generation limited |
| OpenFold (AQ Laboratory) | Apache 2.0 trainable | AlphaFold2-equivalent structure prediction; trainable on proprietary data | Medium — training requires compute | Structure-only; no sequence/function generation; no 98B-scale model |
| RFdiffusion (Baker Lab/IPD) | Permissive; free for research | De novo protein backbone generation; binder design; motif scaffolding | Medium for binder design use cases | No sequence/function joint reasoning; less multimodal than ESM3 |
| Meta ESM2 (MIT license via facebookresearch) | MIT — free commercial use | Sequence embedding; structure prediction (ESMFold) | Medium for embedding and structure tasks | Not generative in the ESM3 sense; superseded by ESM3 architecture |
| Profluent Bio, Absci, Generate Biomedicines | Proprietary / partnership | AI-driven antibody/protein design with integrated wet-lab | Medium for enterprise drug-discovery customers | Vertically integrated competitors; charge for discovery services, not API access |
Competitor data sourced from public GitHub repos, model cards, and technical blog posts at access date.
[CR019, CR020, CR021, CR022, CR023]
7.4 Regulatory Landscape
The regulatory environment for AI-designed proteins is nascent and fragmented. No jurisdiction has issued a specific rule governing commercial deployment of generative protein language models. The US, EU, and UK are each developing frameworks that could apply to EvolutionaryScale's products under different risk classifications. In the US, EO 14110 (issued October 30, 2023; rescinded January 20, 2025) required developers of dual-use foundation models above defined compute thresholds to report safety evaluations to the government, with specific attention to "biosecurity, cybersecurity, and critical infrastructure" risks; its reporting framework serves here as a reference point for what a future US rule could require. The NIST AI RMF (January 2023) and its Generative AI Profile (July 2024) provide voluntary guidance. The FDA regulates AI/ML-enabled medical devices via its Software as a Medical Device (SaMD) framework and 2024 AI/ML action plan, but this applies only to diagnostic or treatment-decision AI, not to pure discovery-stage protein design tools. The BIS (Bureau of Industry and Security) has begun examining export-control frameworks for AI models that could be used in biological weapons contexts, though no specific rule governing protein language models has been finalised. In the EU, the AI Act (Regulation 2024/1689, in force since August 2024; most provisions apply from August 2026) classifies AI systems with potential for dual-use biological harm in its high-risk or prohibited categories depending on application. Dual-use API access to ESM3 could require conformity assessments, transparency obligations, and human oversight measures once the Act's provisions are fully enforced. UK biosafety review is ongoing under the Biosecurity Strategy and AI Safety Institute framework, which evaluates biological risks from AI in collaboration with the US and international partners. The Biological Weapons Convention (signed and ratified by 189 states) prohibits development and production of biological weapons but contains no AI-specific language and lacks a formal verification mechanism. 
The regulatory tail risk is asymmetric: new binding rules (mandatory safety evaluations, export controls, access restrictions) could impose compliance costs, limit international distribution, or require model redactions — any of which would impair Forge's commercial model without precedent for the duration or severity of such restrictions.[CR025, CR026, CR027, CR028, CR029, CR030]
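As a rough cross-check of the compute thresholds these instruments reference, the sketch below compares ESM3's reported 1×10²⁴ training FLOPs against three published thresholds. The threshold values are assumptions taken from the instruments as originally issued (EO 14110 §4.2(b): 10²⁶ operations in general and 10²³ for models trained primarily on biological sequence data; EU AI Act Article 51: 10²⁵ FLOPs for the systemic-risk presumption), not figures supplied by the company. The function and key names are hypothetical.

```python
# Training compute reported for ESM3 in this report.
ESM3_TRAINING_FLOPS = 1e24

# ASSUMPTION: thresholds as stated in the instruments' original text,
# not values taken from EvolutionaryScale disclosures.
THRESHOLDS = {
    "eo14110_general": 1e26,          # EO 14110 sec. 4.2(b), general models
    "eo14110_bio_sequence": 1e23,     # EO 14110 sec. 4.2(b), biological sequence data
    "eu_ai_act_gpai_systemic": 1e25,  # EU AI Act Art. 51 presumption
}


def threshold_check(training_flops, thresholds):
    """Return {threshold name: True if training_flops is strictly greater}."""
    return {name: training_flops > limit for name, limit in thresholds.items()}


if __name__ == "__main__":
    for name, exceeded in threshold_check(ESM3_TRAINING_FLOPS, THRESHOLDS).items():
        print(f"{name}: {'exceeds' if exceeded else 'below'} threshold")
```

Under these assumptions ESM3 would sit above the biological-sequence threshold and below the other two. Whether either regime actually applies depends on legal classification, which this comparison does not settle.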
| Instrument | Jurisdiction | Status | Applicability to ESM3/Forge | Likely Timeline | Residual Risk |
|---|---|---|---|---|---|
| EO 14110 (Safe, Secure, and Trustworthy AI; §4.4 biotechnology) | US Federal | Issued Oct 30, 2023; rescinded Jan 20, 2025; successor rules TBD | Required frontier model developers to report dual-use biological evaluations to NIST/OSTP above compute thresholds | Depends on successor policy and rulemaking | Medium: ESM3 at 1×10²⁴ FLOPs may cross thresholds if comparable rules return; no confirmed reporting to date |
| NIST AI RMF 1.0 + Generative AI Profile (NIST-AI-600-1) | US (voluntary) | Published Jan 2023; GenAI profile Jul 2024 | Voluntary framework; increasingly referenced in procurement and regulatory contexts | Voluntary; de facto standard | Low-medium: non-compliance creates reputational and procurement risk only |
| EU AI Act (Regulation 2024/1689) | EU | Published Jul 12, 2024; Aug 2026 full enforcement | General-purpose AI model with >10²⁵ FLOPs training may trigger systemic risk obligations; dual-use bio applications could be high-risk | Aug 2026 for most provisions | High: conformity assessment, transparency obligations, and third-party audits could restrict EU market access |
| FDA AI/ML-Enabled Medical Devices framework (SaMD) | US FDA | Evolving; 2024 AI/ML action plan | Applies to diagnostic/treatment AI, not discovery-stage protein design tools; future expansion possible | Gradual; no specific protein LLM rule | Low currently; could expand if ESM3 used in clinical decision support |
| BIS Export Administration Regulations (EAR) — potential AI bio controls | US (Commerce/BIS) | ANPRM under development (2024); no final rule for protein LLMs | Could restrict export of ESM3 weights or API access to adversarial nations | 2025–2027 likely for final rule if enacted | Medium: would restrict international commercial revenue and academic distribution |
| Biological Weapons Convention (BWC) | International (189 parties) | In force since 1975; no verification regime | Prohibits development of bio weapons; does not address AI-designed proteins specifically; company and customers must comply | Ongoing; no AI-specific amendment near-term | Low-medium: compliance obligation on customers; no direct regulatory burden on EvolutionaryScale beyond terms of service |
Status and timeline information based on publicly available regulatory documents and fetch-date knowledge. EU AI Act enforcement dates may change via implementing acts. BIS ANPRM status subject to change.
[CR025, CR026, CR027, CR028, CR029, CR030]
Chronological view of regulatory milestones affecting EvolutionaryScale from 2022 to 2027 (projected).
Dates for 2026+ regulatory milestones are estimates based on typical US/EU rulemaking timelines; actual dates may vary.
[CR025, CR026, CR027, CR028, CR029]
7.5 Financial and Operational Risk
EvolutionaryScale has raised approximately $145M in total (seed plus the $142M Series A, September 2024) at a $1.35B post-money valuation. Operating a frontier protein language model at the 98B-parameter scale requires significant recurring compute. Training ESM3 consumed 1×10²⁴ FLOPs on a high-throughput GPU cluster; ongoing inference serving and model iteration at commercial scale carry recurring hardware costs. There is no public revenue disclosure; on the public record the company is pre-revenue or at very early revenue as of the access date. At typical AI infrastructure spending rates for a 50–80 person company with frontier GPU clusters, $145M implies a runway of 2–4 years (an annual net burn of roughly $36–73M) unless the Forge API converts to meaningful commercial revenue quickly. The valuation implies a multiple exceeding 10× any forward revenue projection achievable with current Forge adoption, creating down-round risk if the commercial ramp is slower than investor expectations. The investor/partner concentration in Amazon and NVIDIA creates overlapping conflicts: both are at once the company's primary cloud-compute providers, primary distribution channels (SageMaker JumpStart, BioNeMo), and Series A investors. If EvolutionaryScale needs to renegotiate cloud contracts or seek competitive bids, the investor relationship limits its negotiating leverage. Conversely, Amazon and NVIDIA already host competing capabilities (BioNeMo includes ESM models alongside competing tools), so either may have an incentive to steer customers away from the Forge API. The company has no disclosed IP monetisation strategy beyond the Forge API subscription. If the open ESM3 1.4B model (free for academic use) absorbs most academic demand, and if pharmaceutical customers prefer building internal capabilities to paying API fees, the commercial model could underperform at scale.[CR032, CR033, CR034, CR035, CR036, CR037]
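The runway reasoning above can be made explicit with a small calculation. This is a minimal sketch, assuming the full $145M is available as cash and that burn is constant; both are simplifications, since no balance-sheet or burn data was disclosed. The function names are hypothetical.

```python
def implied_annual_burn(cash_musd, years):
    """Net annual burn (USD millions) that exhausts cash_musd in `years`."""
    if years <= 0:
        raise ValueError("years must be positive")
    return cash_musd / years


def runway_in_years(cash_musd, gross_burn_musd, revenue_musd=0.0):
    """Years of runway at constant gross burn, offset by constant revenue."""
    net_burn = gross_burn_musd - revenue_musd
    if net_burn <= 0:
        return float("inf")  # revenue covers burn: no cash exhaustion
    return cash_musd / net_burn


if __name__ == "__main__":
    cash = 145.0  # total raised per this report; ASSUMED fully unspent
    for years in (2, 4):
        print(f"{years}-year runway implies ~${implied_annual_burn(cash, years):.1f}M net burn per year")
```

A 2–4 year runway on $145M corresponds to a net burn of $72.5M down to $36.25M per year; any Forge revenue lengthens the runway by reducing net burn.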
| Risk | Driver | Likelihood | Impact | Mitigation | Diligence Ask |
|---|---|---|---|---|---|
| Capital exhaustion before Forge API revenue scale | GPU-cluster burn rate at $145M total raised; no public revenue | Medium | Critical | Amazon/Nvidia compute access at investor terms; ESM Cambrian academic open models reduce inference burden | Disclose monthly burn rate, Forge ARR, and runway guidance |
| Down-round risk at $1.35B valuation if commercial ramp lags | Valuation implies 10× revenue projection at pre-revenue stage; market comparables compressed in 2024-2026 | Medium | High | Series A was oversubscribed (Amazon + Nvidia lead); strong investor syndicate | Obtain cap-table, option pool dilution, and any Series B mandate or down-round protection clauses |
| Investor/partner conflict: Amazon and Nvidia as investors, compute providers, and distribution platforms | Structural: both entities operate competing BioNeMo and SageMaker platforms | Medium | High | Contractual ring-fencing of investor and commercial arms assumed but not confirmed | Obtain copy of investor side-letter and any non-compete or channel-exclusivity terms |
| Revenue concentration: Forge API as sole commercial product | No diversification into drug-discovery partnership revenue, milestone payments, or data licensing publicly disclosed | Medium | Medium | Forge API on AWS SageMaker and NVIDIA BioNeMo broadens distribution; pharmaceutical partnerships implied | Obtain revenue breakdown: API fees vs partnership vs milestone vs licensing |
| GPU cost inflation: NVIDIA H100/H200 pricing power as sole-source supplier for frontier training | NVIDIA market dominance in AI accelerators; no near-term AMD/Intel alternative at equivalent performance | Medium | Medium | Amazon and Nvidia as investors may provide preferential pricing; cloud spot pricing optionality | Confirm compute cost per 1K API calls and model training run cost; assess margins at scale |
Likelihood and Impact are qualitative ratings. Financial risk assessment is based on public information only; no revenue, burn-rate, or margin data was disclosed by the company.
[CR032, CR033, CR034, CR035, CR036, CR037]
7.6 Talent, Key-Person, and Culture Risk
All four named founders, Alexander Rives (CEO), Tom Sercu (President), Zeming Lin (CTO), and Salvatore Candido, are alumni of Meta AI's FAIR protein research team. This single-employer provenance creates a cultural concentration risk: the team shares a common research paradigm, network, and career history. A monoculture can speed up aligned decision-making, but it reduces cognitive diversity when the team must assess strategic pivots or regulatory threats that FAIR's academic culture may not have prepared it for. Key-person risk is acute. Alexander Rives is the original ESM model creator and primary scientific visionary. Tom Sercu and Zeming Lin are the primary technical architects of ESM3 and ESM Cambrian, as listed in the bioRxiv preprint author list. Departure of any founder would likely impair model development velocity and investor confidence. No succession plan has been publicly disclosed, nor any arrangement that reduces dependence on the CEO. The company operates in the Bay Area AI talent market, which is among the most competitive globally. Retaining senior ML researchers against offers from well-capitalised hyperscalers (Google DeepMind, Meta, Microsoft) or pharma AI arms (Isomorphic, Xaira) at a $1.35B valuation and without public liquidity is a structural challenge. Amazon's and NVIDIA's investor status could reduce acqui-hire risk from those specific parties, but it does not reduce the risk of talent migration to other hyperscalers.[CR038, CR039, CR040, CR041]
7.7 Legal and IP Risk
EvolutionaryScale's ESM3 was trained using ESM2 weights as a starting point. Meta's facebookresearch/esm GitHub repository (which hosts ESM2) is published under an MIT licence, as noted in Section 7.3, but the relationship between ESM3's commercial weights and the ESM2 ancestor weights is not fully documented in public sources, creating a latent IP provenance risk if Meta were to assert rights over derivative works. The bioRxiv preprint competing interest statement notes that "patents have been filed related to aspects of this work." The nature, claims, and status of these patents are not publicly disclosed. The discrete-token approach to protein modelling used in ESM3 (tokenising 3D structure and function into discrete alphabets) has potential prior art from academic groups, including the Baker Lab, Meta FAIR, and Oxford-based researchers. Any infringement assertion or patent interference proceeding could slow commercialisation or require expensive licensing arrangements. EvolutionaryScale is incorporated as a Public Benefit Corporation (PBC), which provides some governance flexibility but also creates obligations around its public benefit mission that could constrain purely commercial decisions, particularly around open-model access vs. commercial gating.[CR042, CR043, CR044, CR045]
7.8 Exhibits
08 Valuation
8.1 Investment Thesis and Anti-Thesis
The investment thesis for EvolutionaryScale rests on four mutually reinforcing pillars. First, founder domain authority: Alexander Rives, Tom Sercu, Zeming Lin, and Salvatore Candido created the ESM protein language model family at Meta AI FAIR, and their institutional knowledge and publication track record cannot be replicated from scratch by a competing team. Second, peer-reviewed scientific validation: ESM3 was published in Science on January 16, 2025, documenting the generation of a novel fluorescent protein that the authors equate to simulating 500 million years of evolution. The claim passed peer review at a leading journal, and the model behind it was trained with 1×10^24 FLOPs of compute. Third, structural cloud distribution moat: Amazon (AWS) and NVIDIA are not merely financial investors; they are distribution channel partners embedding Forge into AWS SageMaker JumpStart and NVIDIA BioNeMo, giving EvolutionaryScale direct access to the cloud infrastructure used by virtually every global pharma and biotech R&D organization. Fourth, unique multimodal generative capability: ESM3 simultaneously reasons over protein sequence, structure, and function, a capability no peer protein AI startup (Profluent, Cradle.bio, Absci) has matched in a single foundation model. The anti-thesis is equally evidence-grounded. First, zero revenue disclosed: no ARR, customer count, or gross margin has been publicly confirmed for Forge as of May 2026; the $1.35B Series A valuation is entirely forward-looking, making it one of the richest pre-revenue entries in the AI biotech sector. Second, open-source substitute threat: ESM2 (the predecessor model) is freely available as open source; AlphaFold 3 (Google DeepMind) provides free non-commercial protein structure and interaction prediction to over 3 million researchers globally; both directly substitute for core parts of EvolutionaryScale's commercial offering. 
Third, key-person concentration: all four co-founders come from a single prior employer (Meta AI FAIR), so the risk of their departure is correlated rather than independent. Fourth, dual-use and biosecurity regulatory overhang: ESM3's generative protein design capabilities carry biosecurity risks acknowledged in the company's responsible development framework, and because customer screening protocols are undisclosed, the regulatory exposure is uncertain. Fifth, VC valuation multiple compression risk: KPMG's 2024 Venture Pulse report explicitly warned that investors are becoming "more discerning as to who the winners may be in the AI space" and will favor companies with credible commercial models, a standard EvolutionaryScale has not yet met on public evidence.[CV001, CV002, CV006, CV007, CV009, CV022]
| Perspective | Argument | What Would Change the View |
|---|---|---|
| Thesis | Founders (Rives, Sercu, Lin, Candido) created the ESM protein language model family at Meta AI FAIR—institutional knowledge no competing team can replicate | Co-founder departure or formation of a competing lab with access to similar training data |
| Thesis | ESM3 published in Science (Jan 2025): first multimodal protein generative model with peer-reviewed validation of 500M-year evolution simulation | Scientific peer challenges or reproducibility failure on key ESM3 claims |
| Thesis | Amazon (AWS) + NVIDIA co-investment: Forge deployed on SageMaker JumpStart and BioNeMo gives direct access to virtually all global pharma R&D cloud infrastructure | Amazon or NVIDIA terminate the distribution partnership or shift to a competing protein AI platform |
| Thesis | ESM3 multimodal reasoning (sequence + structure + function simultaneously) is unique among protein AI peers and enables prompt-guided protein design at scale | A peer demonstrates equivalent multimodal capability open-source and widely adopted before Forge secures multi-year contracts |
| Anti-thesis | $1.35B Series A with zero disclosed revenue implies ~9.5x post-money-to-raised ratio—one of the richest pre-revenue AI biotech entries; no confirmed ARR or customer count | Disclosed Forge ARR >$10M with gross margin >60% and multi-pharma customer count |
| Anti-thesis | ESM2 (predecessor) is open-source; AlphaFold 3 (Google DeepMind) provides free non-commercial protein structure prediction to 3M+ researchers—direct substitutes for lower-tier Forge use cases | ESM3's unique generative capabilities (not replicated by open-source) command sustained pricing above commodity API levels |
| Anti-thesis | All four co-founders joined from one prior employer (Meta AI FAIR)—correlated key-person departure risk creates single-point-of-failure at leadership level | Founders sign long-term employment contracts and hire a second-tier of independent scientific leadership |
Arguments in both thesis and anti-thesis are evidence-backed. Rows ordered by relative impact on valuation. All anti-thesis rows reflect observable public evidence as of May 2026; no speculative claims included.
[CV006, CV007, CV009, CV022, CV023, CV029]
Decision chain from founder pedigree, platform proof, risk factors, and valuation anchors to the Research-More recommendation and required catalysts.
[CV006, CV009, CV030, CV033]
8.2 Recommendation, Confidence, and Risk Rating
The recommendation is Research-More / Track with Interest. This is not a buy recommendation for three evidence-based reasons. First, valuation confidence is MEDIUM: the $1.35B post-money Series A anchor is confirmed via Crunchbase and Bloomberg but no intrinsic value model can be constructed without Forge revenue, ARR, gross margin, or customer count data—all currently undisclosed. Second, the entry multiple is aggressive: at $1.35B pre-revenue, EvolutionaryScale's implied price-to-raised multiple (~9.5x post-money-to-raised) substantially exceeds sector norms for pre-revenue biotechs and is difficult to justify without visibility into commercial traction. Third, bear case risk is asymmetric: the open-source ESM2 and free AlphaFold 3 represent real substitutes that could compress API pricing and destroy the revenue case before it materializes. The overall confidence rating is MEDIUM. Evidence is strong on the product (Science publication, model architecture), the team (Meta AI pedigree, GitHub activity, HuggingFace presence), and the funding (confirmed, multi-source). Evidence is weak on the commercial side (no revenue, no ARR, no customer disclosure, no partnership financial terms). The risk rating is HIGH, reflecting: pre-revenue entry at premium valuation; key-person concentration across all four co-founders from a single prior employer; open-source and free-tier competitive substitution; dual-use biosecurity regulatory uncertainty; and structural dependence on Amazon and NVIDIA for commercial distribution. The valuation stance is infrastructure/platform AI premium with no clinical proof uplift. Unlike clinical-stage AI drug discovery companies (Insilico Medicine, Recursion), EvolutionaryScale's value is entirely in the platform and foundation model layer— analogous to a foundation model API company (Anthropic, Cohere) applied to vertical biology, but without the general-purpose scale and with substantially more concentrated market exposure. 
A buy recommendation requires disclosed Forge ARR of at least $10M, an active multi-pharma customer base, and confirmed gross margin above 60%.[CV001, CV002, CV005, CV011, CV013, CV030]
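The entry-multiple arithmetic behind this recommendation can be reproduced as follows. This is an illustrative sketch using only figures quoted in this report ($1.35B post-money, $142M Series A, ~$145M total raised, $10M ARR upgrade threshold); the function names are hypothetical, and the ARR multiple is a what-if, since no Forge revenue has been disclosed.

```python
def post_money_to_raised(post_money_musd, raised_musd):
    """Ratio of post-money valuation to capital raised (both USD millions)."""
    if raised_musd <= 0:
        raise ValueError("raised_musd must be positive")
    return post_money_musd / raised_musd


def implied_arr_multiple(valuation_musd, arr_musd):
    """Valuation divided by annual recurring revenue (both USD millions)."""
    if arr_musd <= 0:
        raise ValueError("arr_musd must be positive")
    return valuation_musd / arr_musd


if __name__ == "__main__":
    post_money = 1350.0  # USD millions, per this report
    print(f"vs Series A only: {post_money_to_raised(post_money, 142.0):.1f}x")
    print(f"vs total raised:  {post_money_to_raised(post_money, 145.0):.1f}x")
    # HYPOTHETICAL: ARR at the report's $10M upgrade threshold.
    print(f"at $10M ARR:      {implied_arr_multiple(post_money, 10.0):.0f}x ARR")
```

The ~9.5x figure corresponds to the Series A alone; measured against total capital raised the ratio is closer to 9.3x. Even at the $10M ARR upgrade threshold, the Series A valuation would equal 135x ARR.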
| Recommendation | Confidence | Risk Rating | Valuation Stance | Decision Implication |
|---|---|---|---|---|
| Research-More / Track with Interest | Medium (no Forge revenue data; open-source substitute risk; preference stack unknown) | High (pre-revenue at $1.35B; key-person concentration; open-source ESM2 + AlphaFold 3 substitution; dual-use regulatory overhang) | Infrastructure/platform AI premium; base case $1.5–2.5B; bull case $3–5B contingent on Forge ARR; bear case $400–800M on commoditization | Track Forge ARR and customer count; require >$10M ARR and multi-pharma customers before upgrading to buy; monitor Amazon/NVIDIA partnership financials |
Recommendation is price-sensitive and evidence-sensitive. Confidence and risk ratings reflect absence of disclosed Forge revenue and open-source substitute risk as of May 2026. Valuation stance range assumes no confirmed financial data.
[CV001, CV002, CV030, CV033]
IC-ready scoring across six dimensions: market proof, platform moat, commercial evidence, economics visibility, risk level, and evidence quality.
[CV005, CV006, CV009, CV030]
8.3 Financing, Valuation Context, and Entry Discipline
EvolutionaryScale closed its $142M Series A on September 26, 2024, with Amazon (AWS) and NVIDIA co-leading. Lux Capital, Nat Friedman, and Daniel Gross (of AI Grant) participated. Total capital raised is approximately $145M including seed funding. The post-money valuation is approximately $1.35B, as reported by Crunchbase, Bloomberg (paywall), and PitchBook (paywall). No Form D filing was identified in SEC EDGAR's full-text search for "EvolutionaryScale". Because Regulation D offerings normally produce a public Form D, the absence may reflect a filing under a different entity name or reliance on an exemption that does not require one; it is not evidence about the round either way. The $1.35B valuation at pre-revenue stage is rich by historical biotech venture standards, but broadly consistent with the 2024 AI valuation environment in which five US companies each raised $4B+ rounds in Q4 2024 alone (KPMG Venture Pulse). The strategic nature of Amazon and NVIDIA's investments substantially alters the risk-adjusted thesis: both are distribution channel partners whose investments create a self-reinforcing commercial incentive to route pharma API traffic through Forge. AWS SageMaker JumpStart and NVIDIA BioNeMo together reach virtually every major global pharma R&D organization, making the channel moat real and durable. Entry discipline requires confirmation of commercial traction before a buy recommendation. The preference stack, cap table structure, and dilution overhang from the Series A are unknown; without audited financials or pro-forma cap table disclosure, common equity value at any given enterprise value cannot be precisely computed. The Amazon and NVIDIA co-investment structurally limits the probability of a hostile acqui-hire or catastrophic down-round, since neither investor would benefit from a distressed sale that undermines their cloud platform strategy.[CV001, CV002, CV003, CV004, CV005, CV009]
| Scenario | Key Assumptions | Valuation Range (USD B) | Key Risk and Probability Signal |
|---|---|---|---|
| Bull ($3–5B) | Forge achieves $50–100M ARR by 2027; multi-pharma multi-year contracts established; AWS+NVIDIA channel generates scale distribution; ESM3 is adopted as the protein foundation-model standard across biopharma; gross margin >70% | $3.0–5.0B | Requires confirmed $25M+ ARR data point and 2+ disclosed pharma contracts; comparable to Generate Biomedicines (~$2.5B) with better distribution advantage; probability: 25–30% |
| Base ($1.5–2.5B) | Slow commercial ramp; $10–25M ARR by 2027; most revenue via AWS/NVIDIA channel fees; some enterprise pharma contracts but API pricing pressure from open-source; moderate team expansion; Series B at modest premium | $1.5–2.5B | Consistent with current Series A entry at ~$1.35B; modest step-up; probability: 45–50% |
| Bear ($400M–800M) | Open-source ESM2 + AlphaFold 3 commoditize API; no major pharma contract closes; key co-founder departure triggers talent exodus; Amazon or NVIDIA acqui-hire at distressed valuation; dual-use regulatory action restricts distribution | $0.4–0.8B | Triggered by: no Forge ARR data by end-2026; competitor open-source parity; co-founder departure announcement; probability: 20–25% |
All valuation ranges are scenario-derived estimates based on comparable company analysis (ABSI ~$800M, RXRX ~$1.555B, Generate Biomedicines ~$2.5B last reported), precedent transactions, and ARR multiple modeling. No confirmed Forge revenue was available for DCF input. Probabilities are subjective estimates.
[CV011, CV022, CV023, CV031, CV032, CV033]
Low-to-high valuation range (USD billion) for bear, base, and bull cases based on scenario assumptions and comparable company benchmarks.
Ranges are model-derived using comparable company multiples, M&A precedents, and ARR scenario modeling. No confirmed Forge financials used. Ranges represent informed estimates, not precise DCF outputs.
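One way to read the bull, base, and bear scenarios together is as a probability-weighted value. The sketch below is an illustration only: it takes the midpoint of each valuation range and each probability range from the scenario table, then renormalizes the probabilities because the midpoints sum to 97.5% rather than 100%. The midpoint and renormalization choices are assumptions of this sketch, not part of the report's model.

```python
# Ranges copied from the scenario table: value in USD billions, probability as fraction.
SCENARIOS = {
    "bull": {"value_busd": (3.0, 5.0), "prob": (0.25, 0.30)},
    "base": {"value_busd": (1.5, 2.5), "prob": (0.45, 0.50)},
    "bear": {"value_busd": (0.4, 0.8), "prob": (0.20, 0.25)},
}


def midpoint(bounds):
    """Midpoint of a (low, high) pair."""
    low, high = bounds
    return (low + high) / 2.0


def probability_weighted_value(scenarios):
    """Expected value using range midpoints and renormalized probabilities."""
    weights = {name: midpoint(s["prob"]) for name, s in scenarios.items()}
    total = sum(weights.values())  # ASSUMPTION: renormalize so weights sum to 1
    return sum(
        (weights[name] / total) * midpoint(s["value_busd"])
        for name, s in scenarios.items()
    )


if __name__ == "__main__":
    print(f"Probability-weighted value: ${probability_weighted_value(SCENARIOS):.2f}B")
```

With these inputs the weighted value comes to about $2.24B. Because the probabilities are subjective estimates, the result is a consistency check on the scenario table rather than a valuation.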
[CV031, CV032, CV033]
8.4 Comparable Valuation Set
The comparable set for EvolutionaryScale spans three categories: public AI drug discovery companies, private AI biotech peers, and recent financing transactions. No perfect single comparable exists, given EvolutionaryScale's unique position as a pre-revenue protein foundation-model API company with strategic cloud distribution from two of the world's largest technology companies. Among public comps, Absci (NASDAQ:ABSI) is the closest direct analog by business model—pure-play AI biologics design with no Phase 2 or clinical programs. ABSI's market cap of approximately $800M as of May 2026 provides a public floor valuation for an AI drug creation platform with disclosed revenue of $2.8M (FY2025) and a net loss of $115.2M. The implied revenue multiple is extreme (~285x FY2025 revenue) and reflects market pricing for platform optionality, not near-term fundamentals. Recursion Pharmaceuticals (NASDAQ:RXRX) has a market cap of ~$1.555B but generated only $6.47M Q1 2026 revenue, with an accumulated deficit of $2.1B; it trades as a clinical-stage AI platform with pipeline optionality. Schrödinger (NASDAQ:SDGR) has a market cap of ~$893M with disclosed software plus structure-based drug discovery revenue; its multiple is more conventional but its hybrid model differs. Among private comps, Generate Biomedicines has raised ~$700M total with a last reported valuation around $2.5B—the closest comparable by modality (protein generative AI). Xaira Therapeutics launched with $1B in Series A funding in April 2024—the largest AI drug discovery Series A ever at that time—at a ~$1B valuation. Isomorphic Labs (Alphabet-backed) has undisclosed standalone valuation but is active in multi-billion collaboration deals with Lilly and Novartis. Profluent raised $44M, Cradle.bio raised ~$73M; both serve protein engineering use cases at earlier stage and substantially lower valuation. 
For foundation-model platform comparables adjusted for vertical narrowness: Anthropic (~$60B), Mistral (~$6B), and Cohere (~$5B) provide an upper bound on private-market sentiment for AI foundation-model companies. EvolutionaryScale's $1.35B is roughly 2% of Anthropic's valuation, with comparable model-quality claims but a dramatically narrower addressable market (biology only). The implied discount to horizontal foundation models is appropriate given TAM concentration, but the valuation still sits above the market caps of Absci (~$800M) and Schrödinger (~$893M) and just below that of Recursion (~$1.555B).[CV011, CV012, CV013, CV014, CV015, CV016]
| Comparable | Type | Key Metric and Valuation (USD) | Multiple or Benchmark | Relevance to EvolutionaryScale | Limitation |
|---|---|---|---|---|---|
| Absci (NASDAQ:ABSI) | Public | Market cap ~$800M (May 2026); FY2025 revenue $2.8M | ~285x FY2025 revenue | Closest public pure-play AI biologics design peer; no Phase 2 programs; loss-making; NASDAQ data available | Absci revenue is milestone-based partner fees, not SaaS ARR; lower-quality revenue than EvolutionaryScale's potential Forge subscriptions |
| Recursion (NASDAQ:RXRX) | Public | Market cap ~$1.555B (May 2026); Q1 2026 revenue $6.47M; accumulated deficit $2.1B | ~60x annualized revenue | Largest public pure-play AI drug discovery company; phenomics+AI platform; 3M+ compound phenotypic map | Clinical-stage pipeline provides premium vs. EvolutionaryScale; different business model (discovery+pipeline not pure platform API) |
| Schrödinger (NASDAQ:SDGR) | Public | Market cap ~$893M (May 2026); 52-week range $10.94–$27.63 | Public market disclosed | Physics-based simulation + software licensing; disclosed ARR; longer operating history as a public company | Not generative-AI protein language model; hybrid software+drug discovery model; higher ARR visibility but lower growth profile |
| Generate Biomedicines | Private | ~$700M total funding; last reported valuation ~$2.5B | ~3.6x valuation-to-capital-raised | Closest generative-biology comparable: protein generative AI for therapeutics; Massachusetts-based; Flagship Pioneering backed | Earlier commercial stage; different model architecture; no public financials; last round not confirmed current |
| Xaira Therapeutics | Private | $1B Series A (Apr 2024) | ~$1B valuation at launch | Largest-ever AI drug discovery Series A contemporaneous with EvolutionaryScale's raise; structural precedent for $1B+ private AI biotech rounds | Xaira is focused on drug programs not API platforms; different exit path and revenue model |
| Profluent | Private | ~$44M raised | Est. $200–400M valuation | AI protein design for CRISPR and gene editing; OpenCRISPR open-source release; earlier stage | Smaller raise; narrower application (gene editing not broad protein API); limited comparable value |
| Cradle.bio | Private | ~$73M raised | Est. $250–500M valuation | Protein optimization SaaS for biopharma and industrial bio; Novonesis partnership confirmed; closer to Forge use case | Earlier stage; optimization not generation; European company (Amsterdam); different technology approach |
| Isomorphic Labs (Alphabet) | Private | Undisclosed; Lilly+Novartis deals $3B+ headline value | Deal-value benchmark; no standalone valuation | AlphaFold 3 originator; generative biology; Lilly and Novartis collaborations; Alphabet structural advantages | No standalone valuation; Alphabet-backed with fundamentally different cost of capital and competitive position |
Comparable set is partial and asymmetric: public comps (ABSI, RXRX, SDGR) have SEC-filed financial data; private comps rely on press-reported funding rounds and estimated valuations. No investment banking or independent fairness opinion data was accessible. All private valuations are estimates.
[CV011, CV012, CV013, CV014, CV015, CV016]

| Company | Transaction | Amount (USD) | Approx. Valuation | Date | Key Investors |
|---|---|---|---|---|---|
| EvolutionaryScale | Series A | $142M | ~$1.35B post-money | Sep 2024 | Amazon (AWS), NVIDIA, Lux Capital, Nat Friedman, Daniel Gross |
| Xaira Therapeutics | Series A (launch) | $1,000M | ~$1B | Apr 2024 | ARCH Venture Partners, Foresite Capital, and others |
| Generate Biomedicines | Series C (last disclosed) | ~$273M (Series C); ~$700M total | ~$2.5B (last reported) | 2022–2023 | Flagship Pioneering, Fidelity, NVIDIA; others |
| Isomorphic Labs | Series B | ~$600M reported | ~$3B+ (deal-value benchmark) | 2024 | Alphabet (Google); undisclosed institutional co-investors |
| Insilico Medicine | HKEX IPO | ~$293M | ~$2.3B (prior Series E) | Late 2025 | Public markets (SEHK:3696) |
| Profluent | Series A | $44M | Est. $200–400M | 2023–2024 | Salesforce Ventures, Felicis, OpenAI Fund |
Transaction data is sourced from company announcements, Crunchbase, press reports, and market research reports. Valuations for private transactions are estimates based on reported post-money or implied deal terms; Isomorphic Labs valuation reflects Lilly+Novartis deal headline value, not a confirmed standalone equity valuation. All amounts in USD.
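The Series A row implies a pre-money valuation and an ownership stake that follow directly from the two reported numbers. The following is a minimal sketch, assuming the full $142M was primary capital (no secondary component) and that the ~$1.35B figure is post-money as reported; neither assumption is confirmed by a filing, and the function names are illustrative.

```python
def pre_money(post_money_musd: float, raise_musd: float) -> float:
    """Pre-money valuation implied by a post-money figure and a primary raise."""
    return post_money_musd - raise_musd


def ownership_sold(raise_musd: float, post_money_musd: float) -> float:
    """Fraction of the company purchased by the round, on a post-money basis."""
    return raise_musd / post_money_musd


SERIES_A_RAISE_MUSD = 142.0   # reported round size
POST_MONEY_MUSD = 1350.0      # reported implied post-money valuation

implied_pre_money = pre_money(POST_MONEY_MUSD, SERIES_A_RAISE_MUSD)
series_a_stake = ownership_sold(SERIES_A_RAISE_MUSD, POST_MONEY_MUSD)

# Valuation-to-capital-raised ratio for Generate Biomedicines, from the table.
generate_ratio = 2500.0 / 700.0

print(f"Implied pre-money: ${implied_pre_money:,.0f}M")
print(f"Series A stake: {series_a_stake:.1%}")
print(f"Generate valuation / capital raised: {generate_ratio:.2f}x")
```

On these assumptions the Series A investors collectively bought roughly a tenth of the company, which is the stake any exit proceeds would have to be measured against.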
[CV001, CV002, CV017, CV018, CV020, CV021]

Sensitivity of EvolutionaryScale's estimated enterprise value (USD billion) to individual upside and downside drivers relative to a base case midpoint of ~$2.0B.
All values are estimated sensitivity deltas relative to a base-case midpoint of ~$2.0B. No confirmed Forge financial data was available. Ranges reflect comparable company multiples, ARR growth scenarios, and M&A precedent analysis.
[CV031, CV032, CV033]

8.5 Exit Readiness and Final Diligence Asks
EvolutionaryScale's most likely near-term exit paths are: (1) strategic acquisition by Amazon (acqui-hire or full buyout to embed Forge into AWS AI Services) or NVIDIA (to deepen BioNeMo's differentiable protein design capability); (2) acquisition by a major pharmaceutical company seeking a protein AI platform (AstraZeneca, Pfizer, Genentech, Novartis) once ARR demonstrates commercial product-market fit; (3) a Series B or Series C at $2B+ if Forge hits $25M+ ARR with multi-pharma customers; or (4) an IPO after reaching $50M+ ARR with >60% gross margin, likely not before 2028 at the earliest.

The Amazon structural investor relationship substantially reduces bear-case probability: an AWS-backed company with Forge deployed on SageMaker JumpStart would need an actively hostile Amazon decision to face a catastrophic down-round. However, the same relationship creates exit-path concentration: if Amazon is the likely acquirer, secondary investors must accept M&A pricing discipline that may not maximize valuation for non-strategic investors.

Six diligence asks are critical before a buy recommendation can be issued: (1) Forge ARR and customer count as of Q2 2026; (2) gross margin on Forge API revenue; (3) revenue share and exclusivity terms in the Amazon and NVIDIA partnerships; (4) any ESM2/ESM3 IP transfer agreement with Meta Platforms confirming a clear IP chain of title; (5) biosecurity and dual-use customer screening protocols; and (6) the cap table and Series A preference stack. Until these six items are confirmed, valuation confidence remains MEDIUM and the recommendation remains Research-More / Track. [CV005, CV009, CV034, CV037, CV038]
| Trigger | Threshold and Event | Transmission to Thesis | Action Implication |
|---|---|---|---|
| No Forge ARR disclosed by year-end 2026 | EvolutionaryScale has not disclosed any ARR, enterprise customer count, or pricing data by Q4 2026—three years post-founding | Confirms the commercial thesis is entirely speculative; eliminates revenue multiple basis for any valuation above raised capital; signals possible acqui-hire risk | Downgrade to avoid; seek direct company IR meeting for ARR confirmation before any additional capital commitment |
| Co-founder departure (any of Rives, Sercu, Lin, Candido) | Public announcement or confirmed LinkedIn departure of any of the four co-founders from an active role at EvolutionaryScale | Destroys the founder-domain-authority pillar; raises immediate questions about IP continuity, team morale, and Amazon/NVIDIA partner confidence | Immediate reassessment; reduce position; require explanation of IP assignment and non-compete status before any thesis upgrade |
| Open-source protein generative model parity | Any open-source protein language model released with comparable ESM3 multimodal generative capability and broad community adoption (>10k GitHub stars within 6 months) | Eliminates Forge API's technical differentiation; commoditizes the $1.35B valuation anchor; shifts pricing power to infrastructure (AWS/NVIDIA) away from EvolutionaryScale | Reassess valuation toward bear case; evaluate whether AWS/NVIDIA distribution advantage alone sustains $800M+ valuation |
| Amazon acqui-hire or hostile pricing change | Amazon makes an offer to acquire EvolutionaryScale at a sub-$1B valuation, or AWS changes Forge pricing terms to capture economics directly | Reveals that Amazon views EvolutionaryScale as an infrastructure component rather than an independent platform, driving a valuation reset | Evaluate acqui-hire premium vs. long-term independence path; model fair value as AWS feature vs. standalone platform |
Kill triggers are binary or threshold events; monitoring requires: regular EvolutionaryScale blog/press release checks; LinkedIn tracking for co-founder activity; GitHub protein model repository monitoring; AWS Partner Network announcements.
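Because the kill triggers are binary or threshold events, they can be evaluated mechanically against a periodic monitoring snapshot. The following is a minimal sketch, assuming a hypothetical `Snapshot` record filled in by an analyst; the field names and trigger labels are illustrative, and the GitHub star count is only a proxy for the "comparable capability and broad adoption" condition, which still needs human judgment.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

ARR_DEADLINE = date(2026, 12, 31)  # "year-end 2026" in the trigger table
STAR_THRESHOLD = 10_000            # ">10k GitHub stars within 6 months"
OFFER_FLOOR_BUSD = 1.0             # "sub-$1B valuation"


@dataclass
class Snapshot:
    """Hypothetical monitoring record; every field is analyst-supplied."""
    as_of: date
    forge_arr_disclosed: bool
    cofounder_departed: bool
    rival_open_model_stars_6mo: int           # 0 if no comparable open model exists
    amazon_offer_busd: Optional[float] = None  # None if no offer has been made
    aws_pricing_change: bool = False


def fired_triggers(s: Snapshot) -> List[str]:
    """Return the labels of all kill triggers that fire for this snapshot."""
    fired = []
    if s.as_of >= ARR_DEADLINE and not s.forge_arr_disclosed:
        fired.append("no_forge_arr_disclosed")
    if s.cofounder_departed:
        fired.append("cofounder_departure")
    if s.rival_open_model_stars_6mo > STAR_THRESHOLD:
        fired.append("open_source_parity")
    low_offer = (
        s.amazon_offer_busd is not None
        and s.amazon_offer_busd < OFFER_FLOOR_BUSD
    )
    if low_offer or s.aws_pricing_change:
        fired.append("amazon_acquihire_or_pricing_change")
    return fired


# Example with made-up values, for illustration only.
example = Snapshot(
    as_of=date(2026, 6, 30),
    forge_arr_disclosed=False,
    cofounder_departed=False,
    rival_open_model_stars_6mo=4_200,
    amazon_offer_busd=0.8,
)
print(fired_triggers(example))
```

In the example only the Amazon trigger fires: the ARR deadline has not yet passed and the star count is below the threshold.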
[CV005, CV022, CV023, CV029, CV033, CV037]

| Topic | Missing Evidence | Why It Matters | Owner and Diligence Path |
|---|---|---|---|
| Forge ARR and customer count | Annual recurring revenue, enterprise customer count, and customer names (at least category-level) for the Forge API platform as of Q2 2026 | No valuation model above Series A entry is defensible without revenue confirmation; ARR is the primary commercial thesis validation metric | Company IR; AWS Marketplace listing data; LinkedIn job postings referencing customer-facing roles; Series B fundraise data room |
| Forge gross margin | Cost of revenue for Forge API (GPU compute cost per API call, infrastructure cost, personnel allocated); gross margin percentage | Gross margin determines whether Forge can scale to a high-value SaaS business (>70% GM) or is structurally a low-margin compute resale business | Company IR; financial data room; AWS compute cost benchmarks as proxy |
| Amazon and NVIDIA partnership financial terms | Revenue share percentage, exclusivity provisions, minimum commitment volumes, and term length for the AWS SageMaker JumpStart and NVIDIA BioNeMo distribution agreements | If Amazon or NVIDIA take >40% of Forge gross revenue as channel fee, EvolutionaryScale's net economics may not support independent platform value; exclusivity terms determine ability to self-serve | Series B data room; M&A data room request; NVIDIA and AWS partner program filings |
| ESM IP chain of title from Meta | Any IP transfer, license, or assignment agreement between Meta Platforms and EvolutionaryScale founders covering ESM model architecture, training code, or data pipeline | Without confirmed IP clean chain, acquirers or pharma partners face IP litigation risk; Amazon or pharma due diligence will require clean title | Company legal disclosure; patent search (USPTO); co-founder employment agreement review |
| Biosecurity and dual-use screening protocols | Customer screening process, access controls for high-risk requests (pathogen-related proteins), and compliance with NIH Dual Use Research of Concern (DURC) policies | Biosecurity regulatory action could restrict Forge distribution to non-US markets or force API feature removal; governance transparency is a pre-condition for pharma partnership | EvolutionaryScale responsible development blog; DURC policy review; US DoD/BARDA contractor relationship check |
| Cap table and preference stack | Series A liquidation preference multiple, anti-dilution provisions, total preferred share outstanding, and estimated diluted share count post-Series A | Common equity value below the $1.35B headline depends critically on the preference stack; 2x liquidation preference or participating preferred can significantly reduce common equity value at moderate exit prices | Series B data room; VC legal counsel review; Delaware secretary of state certificate of incorporation |
All six diligence asks are blockers for a buy recommendation. Forge ARR is the highest-priority item; without it, no valuation confidence above the current Series A anchor is possible. IP and biosecurity items are pre-conditions for institutional pharma partnerships and any M&A transaction.
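The cap table item matters because the preference stack determines how exit proceeds are split. The following is a minimal sketch of a single-class liquidation waterfall, assuming one preferred series with $142M invested and an as-converted stake of 142/1350; the actual Series A terms are undisclosed, so the preference multiples and the participation flag are hypothetical inputs. The seed round, option pool, and any participation cap are ignored.

```python
def preferred_payout(
    exit_value: float,
    invested: float,
    pref_multiple: float,
    ownership: float,
    participating: bool,
) -> float:
    """Proceeds to a single preferred class (same currency units as exit_value).

    Non-participating: the greater of the liquidation preference and the
    as-converted share. Participating (uncapped): the preference plus a
    pro-rata share of whatever remains.
    """
    preference = min(exit_value, pref_multiple * invested)
    if participating:
        return preference + ownership * (exit_value - preference)
    return max(preference, ownership * exit_value)


INVESTED_MUSD = 142.0
OWNERSHIP = 142.0 / 1350.0  # assumed as-converted stake, from reported figures

# Hypothetical scenarios at a moderate $500M exit.
for multiple, participating in [(1.0, False), (2.0, False), (2.0, True)]:
    pref = preferred_payout(500.0, INVESTED_MUSD, multiple, OWNERSHIP, participating)
    kind = "participating" if participating else "non-participating"
    print(f"{multiple:.0f}x {kind}: preferred ${pref:.1f}M, others ${500.0 - pref:.1f}M")
```

At exits well above the post-money valuation a non-participating holder simply converts, so the terms bite hardest at moderate exit prices.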
[CV005, CV009, CV034, CV037]

8.6 Exhibits
Disclaimer
EvolutionaryScale ceased to operate as an independent for-profit entity on November 6, 2025, when its team was absorbed into CZ Biohub under the Chan Zuckerberg Initiative. This report is therefore primarily a historical / forensic diligence on a defunct standalone investment thesis; the present-day investable surface is the CZI / CZ Biohub network, which is non-profit and not on offer to outside investors. All financial figures (valuation, raise, headcount, downloads) are sourced from third-party reports as no SEC filings or audited disclosures exist; acquisition terms are undisclosed. The recommendation reflects unavailability of a forward-looking standalone equity instrument, not a judgment on the underlying science.
Evidence index
| ID | Statement | Confidence | Sources |
|---|---|---|---|
| CO001 | EvolutionaryScale was incorporated in 2023 and became operationally active in approximately March 2024. | High | SO001, SO004 |
| CO002 | EvolutionaryScale was co-founded by Alexander (Alex) Rives, Tom Sercu, Zeming Lin, and Sanjay Rao, all formerly of Meta AI Research (FAIR). | High | SO001, SO017, SO013 |
| CO003 | EvolutionaryScale was headquartered in San Francisco, California, USA, prior to the November 2025 CZI acquisition. | Medium | SO001, SO004 |
| CO004 | Alex Rives served as CEO of EvolutionaryScale from founding until the November 2025 CZI acquisition, at which point he became Head of Science at CZI. | High | SO001, SO017, SO014 |
| CO005 | Tom Sercu served as co-founder and VP of Engineering at EvolutionaryScale, leading infrastructure and large-scale model training. | High | SO017, SO010, SO009 |
| CO006 | EvolutionaryScale stated its mission as using generative AI to model the language of proteins and unlock programmable biology for human benefit. | High | SO001, SO002 |
| CO007 | EvolutionaryScale's flagship product was ESM3, a generative multimodal protein language model, released June 25, 2024. | High | SO001, SO002, SO017 |
| CO008 | ESM3 was publicly released on June 25, 2024, with both an open-weights variant (esm3-sm-open-v1) for academic use and a commercial Forge API offering. | High | SO002, SO017, SO005 |
| CO009 | ESM3 is available in multiple model sizes; the largest publicly released variant has 98 billion parameters. | High | SO002, SO009 |
| CO010 | ESM3 was trained on 2.78 billion protein sequences totaling 771 billion tokens using approximately 1x10^24 FLOPs on a cluster of NVIDIA H100 GPUs. | Medium | SO002, SO009, SO010 |
| CO011 | EvolutionaryScale released ESM Cambrian (ESM-C) on December 4, 2024. | High | SO003, SO026 |
| CO012 | ESM Cambrian is available in three model sizes: 300M, 600M, and 6B parameters, optimized for efficient protein language modeling inference. | High | SO003, SO026 |
| CO013 | A peer-reviewed paper on ESM3 titled Simulating 500 million years of evolution with a language model was published in Science on January 16, 2025, with DOI 10.1126/science.ads0018. | High | SO009, SO010, SO002 |
| CO014 | ESM3 encodes and generates proteins by treating sequences, structures, and functional annotations as a multimodal language, sampling from the space of 500 million years of protein evolution. | Medium | SO002, SO009 |
| CO015 | EvolutionaryScale raised a seed round announced on June 25, 2024, with participation from Lux Capital, Nat Friedman, Daniel Gross, NVIDIA, and Amazon; the seed amount was not publicly disclosed. | Medium | SO017, SO004 |
| CO016 | EvolutionaryScale closed a $142M Series A round on September 26, 2024, co-led by Amazon and NVIDIA. | High | SO015, SO017, SO004 |
| CO017 | The Series A round was closed at an implied post-money valuation of approximately $1.35 billion. | Medium | SO004, SO015 |
| CO018 | Additional participants in the Series A included Lux Capital, Nat Friedman, and Daniel Gross, who had also participated in the seed round. | Medium | SO017, SO004 |
| CO019 | As of May 2026, no SEC Form D filings were found under any variant of EvolutionaryScale in EDGAR for the 2024 to 2026 period. | High | SO011, SO012 |
| CO020 | EvolutionaryScale had 11 to 50 employees according to its LinkedIn company page, consistent with a seed/Series A-stage AI research startup. | Low | SO013 |
| CO021 | On November 6, 2025, CZ Biohub announced that the EvolutionaryScale team would join the CZ Biohub Network as part of the Frontier AI for Biology initiative led by the Chan Zuckerberg Initiative. | Medium | SO014, SO018 |
| CO022 | Following the November 2025 CZI acquisition, Alex Rives became Head of Science at the Chan Zuckerberg Initiative (CZI), and other co-founders joined CZ Biohub in senior research roles. | Medium | SO014, SO013 |
| CO023 | CZI and CZ Biohub framed the EvolutionaryScale acquisition as advancing open biological science and making frontier AI biology tools broadly accessible to researchers. | Medium | SO014, SO018 |
| CO024 | The ESM GitHub repository originally at github.com/evolutionaryscale/esm was transferred to the biohub organization following the CZI acquisition, signaling IP transfer. | Medium | SO007, SO006 |
| CO025 | ESM3 open-weights variant (esm3-sm-open-v1) accumulated over 3,100 downloads on HuggingFace; ESM Cambrian models accumulated over 6,300 downloads collectively, as of May 2026. | Medium | SO026, SO005 |
| CO026 | ESM3 was integrated into the NVIDIA BioNeMo platform and made available as an NVIDIA NIM microservice for enterprise deployment on H100 infrastructure. | Medium | SO017, SO019, SO022 |
| CO027 | No Wikipedia article exists for EvolutionaryScale; the URL en.wikipedia.org/wiki/EvolutionaryScale returns a 404 not-found page as of May 2026. | Medium | SO021, SO018 |
| CO028 | EvolutionaryScale operated a commercial API platform at forge.evolutionaryscale.ai providing developer access to ESM3 and ESM-C models; the platform is JavaScript-rendered and its operational status post-acquisition is unknown. | Medium | SO024, SO002, SO003 |
| CO029 | EvolutionaryScale never publicly disclosed commercial revenue, ARR, or customer count as a standalone entity. | Medium | SO001, SO004, SO024 |
| CO030 | All four co-founders (Rives, Sercu, Lin, Rao) were formerly at Meta AI (FAIR), creating a single-employer provenance risk with homogeneous cultural and technical assumptions and no evidence of diverse executive expertise outside AI research. | High | SO002, SO017, SO013 |
| CO031 | The ESM3 bioRxiv preprint (doi: 10.1101/2024.07.01.600583) was published in July 2024 ahead of the Science journal paper, with Rives, Sercu, Candido, Lin, and others as authors. | Medium | SO010, SO007 |
| CO032 | EvolutionaryScale's technological moat rested on large-scale protein language model pre-training, proprietary training infrastructure (Andromeda H100 cluster), and a multi-year research lead through the ESM model family lineage from Meta FAIR. | Medium | SO002, SO009, SO008 |
| CO033 | The DeepEP repository demonstrates EvolutionaryScale's infrastructure capability in mixture-of-experts inference and expert-parallel communication, relevant to deploying large protein language models at scale. | Medium | SO008, SO006 |
| CO034 | Following the CZI acquisition, the ESM model family is expected to remain accessible as open-source research tools through the CZ Biohub network, continuing the open-weights distribution strategy. | Medium | SO014, SO007 |
| CO035 | EvolutionaryScale was classified as an early-stage private company at seed through Series A stage, with no commercial product revenue disclosed prior to the CZI acquisition in November 2025. | High | SO001, SO004, SO015 |
| CO036 | NVIDIA participated in EvolutionaryScale's seed round, announced alongside the ESM3 launch on June 25, 2024, and later co-led the Series A in September 2024. | Medium | SO017, SO023 |
| CO037 | The Forge API platform (forge.evolutionaryscale.ai) was the commercial interface for EvolutionaryScale's protein design models, providing programmatic access for biotechnology and pharmaceutical customers. | Medium | SO024, SO002 |
| CO038 | The Bloomberg article reporting on the $142M Series A is behind a paywall, preventing public verification of full financing terms, investor rights, and any secondary components of the deal. | Medium | SO015, SO027 |
| CM001 | EvolutionaryScale's core addressable market is the protein language model (PLM) API and platform market—cloud-hosted AI models enabling protein engineers to generate, predict, and optimize protein sequences and structures without exhaustive wet-lab directed evolution. | High | SM011, SM012 |
| CM002 | ESM3, published in Science on January 16, 2025 (DOI: 10.1126/science.ads0018), is the first generative protein language model to simultaneously reason over sequence, structure, and function in a single unified architecture—trained on 2.78 billion protein sequences with 98 billion parameters using approximately 1×10^24 FLOPs. | High | SM013, SM015 |
| CM003 | Status-quo substitutes for protein LM platforms include AlphaFold2/3 (free structure prediction database, 200M+ structures), Rosetta/PyRosetta (open-source protein design), directed evolution in wet lab (weeks per cycle, throughput-limited), and traditional molecular dynamics tools (GROMACS, Schrödinger Maestro), none of which provide generative multi-modal reasoning over sequence, structure, and function jointly. | Medium | SM007, SM017 |
| CM004 | The adjacent AI drug discovery platform market (Grand View Research) is estimated at $2.35B in 2025 growing to $13.77B by 2033 at 24.8% CAGR; EvolutionaryScale's Forge API serves as infrastructure enabling this broader market by providing protein characterization and engineering capabilities. | Medium | SM005 |
| CM005 | Industrial biotechnology—enzyme engineering for green chemistry, agriculture, biomaterials, and food science—is a secondary adjacency for EvolutionaryScale with shorter product development cycles and lower regulatory burden than pharmaceutical applications. | Medium | SM011, SM025 |
| CM006 | The outer boundary drug discovery market (all modalities) is estimated at $71.89B in 2025 growing to $158.74B by 2034 at 9.2% CAGR (Precedence Research); protein engineering API tools constitute a specialized AI sub-segment within this broader market well beyond EvolutionaryScale's direct footprint. | Medium | SM006 |
| CM007 | MarketsandMarkets estimates the protein engineering market at $2.2B (2019) growing to $3.9B by 2024 at 12.4% CAGR; Allied Market Research estimates $2.2B (2022) to $7.7B by 2032 at 13.2% CAGR; Grand View Research estimates $2.60B (2023) to $7.62B by 2030 at 16.24% CAGR—all directionally consistent at 12–16% annual growth over a decade. | Medium | SM001, SM003, SM004 |
| CM008 | Precedence Research takes the broadest scope, estimating the protein engineering market at $5.09B in 2025 growing to $23.59B by 2035 at 16.57% CAGR, incorporating industrial enzymes, biopharmaceuticals, and all research tools rather than just software and services. | Medium | SM002 |
| CM009 | The protein engineering market has a 10× analyst estimate dispersion ($2.2B to $23.59B for 2019–2025 entry years), attributable to scope inconsistency: narrow estimates focus on software/services while broad estimates incorporate industrial enzymes, biopharmaceuticals, and manufacturing applications. | Medium | SM001, SM002, SM003, SM004 |
| CM010 | The FDA received over 500 AI/ML-enabled drug development submissions between 2016 and 2023, issued draft AI guidance in 2025, and established the CDER AI Council in 2024, signaling active and accelerating federal regulatory engagement with AI-native drug development tools including protein engineering applications. | High | SM010, SM016 |
| CM011 | EvolutionaryScale raised $142 million in a Series A round (Crunchbase), establishing investor-validated commercial potential for the protein LM API market; the Forge commercial API monetizes ESM3 access for biopharma and biotech customers beyond the MIT-licensed free tier. | Medium | SM022, SM024 |
| CM012 | The AI drug discovery market (24.8% CAGR, GVR) grows materially faster than the protein engineering tools market (12–17% CAGR), reflecting accelerated pharma AI investment post-AlphaFold; EvolutionaryScale's Forge API benefits from both trajectories as an enabling platform. | Medium | SM005, SM001, SM002 |
| CM013 | No independently published serviceable addressable market figure exists for protein language model APIs specifically within pharmaceutical R&D; all protein engineering market estimates encompass the full market including reagents and instruments, making PLM API SAM derivation assumption-dependent. | Medium | SM001, SM002, SM003, SM004 |
| CM014 | ESM3's commercial Forge API and ESM-C open weights were launched in June 2024 and December 2024 respectively, with ESM-C distributed on AWS SageMaker JumpStart and NVIDIA BioNeMo to reach enterprise pharma customers already embedded in those cloud ecosystems. | High | SM011, SM012, SM008, SM009 |
| CM015 | The primary commercial buyer for EvolutionaryScale's Forge API is the large or mid-tier pharmaceutical or biotech company with an active computational biology or protein engineering program, where economic buyer authority rests with a VP of Computational Biology, Director of Drug Discovery, or Chief Scientific Officer. | Medium | SM011, SM012 |
| CM016 | Academic and government research labs constitute a high-volume, zero-revenue user segment: ESM-C open weights under MIT license have been downloaded over 6,320 times from HuggingFace and the ESM package is available via PyPI, providing community mindshare that can feed eventual commercial pipeline. | Medium | SM018, SM025 |
| CM017 | NVIDIA BioNeMo and AWS SageMaker JumpStart serve as enterprise distribution channels for ESM-C, lowering commercial adoption friction for pharma customers with existing cloud infrastructure contracts on those platforms. | High | SM008, SM009, SM012 |
| CM018 | The technical champion for ESM3/ESM-C adoption is typically a computational biologist, structural biologist, or machine learning scientist within a pharma or biotech R&D organization who evaluates model capabilities and advocates for integration into existing protein engineering pipelines. | Medium | SM011, SM017 |
| CM019 | Industrial biotechnology companies—engineering enzymes for green chemistry, agriculture, and specialty materials—represent a growing buyer segment with different procurement patterns than pharma: shorter development cycles, lower regulatory burden, and higher tolerance for experimental API tools. | Medium | SM025, SM011 |
| CM020 | The ESM Python package on PyPI enables access to ESM3 open models and commercial Forge API, listing all model sizes (esm3-large-2024-03, ESM-C 300M/600M/6B) and API authentication, supporting both researcher self-service and enterprise paid Forge access from a single installation. | Medium | SM025, SM017 |
| CM021 | Biotech startups at Series A–B stage represent an emerging paid segment for Forge API: they have computational infrastructure but lack resources to train frontier protein LMs independently, making API access economically rational vs. self-hosting 98B-parameter ESM3. | Medium | SM022, SM011 |
| CM022 | DNA sequencing costs declined from approximately $10,000 per genome in 2011 to approximately $100 per genome by 2023 (NHGRI), enabling exponential growth in protein sequence databases and providing the training data foundation that enables frontier-scale protein language models like ESM3 to generalize across the protein universe. | High | SM023, SM020 |
| CM023 | Google DeepMind's AlphaFold protein structure database provides free access to over 200 million predicted protein structures; this open resource normalizes computational protein tools in pharma R&D and expands the addressable buyer base for ESM3/ESM-C by reducing scientific credibility risk. | High | SM007, SM013 |
| CM024 | NVIDIA BioNeMo delivers 2× faster biofoundation model training and 6× faster model inference versus unoptimized implementations, reducing the total cost of ownership for enterprise protein LM deployment and strengthening EvolutionaryScale's NVIDIA distribution partnership as a commercial channel. | High | SM008, SM009 |
| CM025 | ESM3 was trained on 2.78 billion protein sequences with 98 billion parameters using approximately 1×10^24 FLOPs of compute (Science, January 2025)—a scale achievable only because of the exponential growth in protein databases enabled by declining sequencing costs. | High | SM013, SM015 |
| CM026 | ESM-C's release under MIT license on HuggingFace with AWS SageMaker and NVIDIA BioNeMo distribution mirrors the open-weight strategy that drove commercial cloud API conversion in NLP (e.g., Hugging Face, Mistral AI), establishing EvolutionaryScale as the community standard for protein LMs. | Medium | SM012, SM018 |
| CM027 | ESM-C (300M, 600M, and 6B parameter variants) is available under MIT license on HuggingFace (6,320+ downloads for the 600M variant, 3,110+ for ESM3 open), enabling any organization with GPU access to self-host the model at zero marginal cost, creating a pricing ceiling and limiting paid Forge conversion for price-sensitive customers. | Medium | SM018, SM017 |
| CM028 | No protein engineered purely by a computational AI model has received regulatory approval without extensive in vitro and in vivo wet-lab validation; the experimental bottleneck remains a necessary post-computational step, structurally limiting the standalone commercial value of a protein LM API. | High | SM010, SM013 |
| CM029 | Enterprise pharma technology procurement cycles typically add 12–24 months to commercial deployment timelines relative to academic adoption due to IT security reviews, cloud data governance policies, SOC2/GxP compliance requirements, and multi-year vendor vetting processes. | Medium | SM008, SM009 |
| CM030 | Google DeepMind (AlphaFold3), NVIDIA BioNeMo, and AWS HealthOmics all have distribution, compute, and ecosystem advantages that could threaten EvolutionaryScale's commercial differentiation if frontier protein LM capabilities converge toward commodity—a material long-run competitive risk. | Medium | SM007, SM008, SM019 |
| CM031 | The bioRxiv preprint server indexed over 129 papers citing ESM3 or EvolutionaryScale as of the access date, indicating strong academic community engagement with the ESM protein LM family and validating the open-weight strategy for building ecosystem adoption. | Medium | SM014, SM015 |
| CM032 | No independent SAM figure for protein language model APIs within pharmaceutical R&D has been published; all accessible analyst estimates cover the full protein engineering tools market ($2.2B–$23.59B), making PLM API SAM derivation assumption-dependent and constituting a material diligence gap. | Medium | SM001, SM002, SM003, SM004 |
| CM033 | EvolutionaryScale has not publicly disclosed Forge API pricing, subscriber counts, or revenue figures; HuggingFace download metrics and GitHub stars are developer adoption proxies that do not directly translate to commercial revenue without knowledge of the paid conversion funnel. | Medium | SM018, SM025, SM022 |
| CM034 | The protein engineering market analyst consensus (MarketsandMarkets, Allied, GVR) converges on a 2024 base of $2.2B–$2.6B with 12–16% CAGR reaching $7–8B by 2030; Precedence's $5.09B base is an outlier explained by broader scope inclusion of industrial enzymes and biopharmaceutical manufacturing. | Medium | SM001, SM002, SM003, SM004 |
| CM035 | Set against AlphaFold's 40,000+ citations as context, the ESM3 Science paper has drawn 129+ follow-on bioRxiv preprints within one year of publication, demonstrating the scientific impact of the ESM model family and establishing ecosystem depth that sustains commercial positioning. | Medium | SM013, SM014 |
| CM036 | EvolutionaryScale's distribution strategy—open weights on HuggingFace (MIT license) + enterprise Forge API + AWS SageMaker + NVIDIA BioNeMo—creates a multi-channel commercial model spanning free community tier, cloud marketplace access, and direct enterprise contracts. | High | SM011, SM012, SM008, SM009 |
| CP001 | Absci Corporation (NASDAQ: ABSI) filed a 10-K with the SEC for fiscal year ended December 31, 2025, confirming it is a publicly traded generative AI drug creation company based in Vancouver, Washington. | High | SP026, SP004 |
| CP002 | DeepMind's AlphaFold Protein Structure Database, developed in partnership with EMBL-EBI, provides open access to over 200 million predicted protein structures under a CC-BY-4.0 license, used by over 3 million researchers in 190+ countries. | High | SP024, SP006 |
| CP003 | EvolutionaryScale's ESM3 is the first generative model to simultaneously reason over protein sequence, structure, and function in a single multimodal architecture, published in Science on January 16, 2025, trained with over 10^24 FLOPs and 98 billion parameters. | High | SP022, SP025 |
| CP004 | Generate Biomedicines has generated, built, and tested over 42,000 proteins through its continuously learning platform, with 140,000+ square feet of lab space across Boynton Yards and Andover locations. | Medium | SP002 |
| CP005 | Cradle.bio's homepage reports that teams using Cradle achieve 2–12x faster protein development timelines, with results compounding across successive rounds of wet-lab and AI iteration. | Medium | SP005 |
| CP006 | The RFdiffusion algorithm for de novo protein structure and function design was published in Nature in July 2023 by Baker Lab researchers, representing the Baker Lab / IPD's leading open-source generative design tool. | High | SP021, SP010 |
| CP007 | Meta's ESM2 and ESMFold protein language models are released under an MIT license, confirmed on both GitHub (github.com/facebookresearch/esm) and HuggingFace, permitting commercial use at zero cost. | High | SP017, SP018 |
| CP008 | Meta's ESM protein language models were created by Alexander Rives, Zeming Lin, Tom Sercu, and Salvatore Candido at Meta AI FAIR — the exact same four individuals who co-founded EvolutionaryScale in 2023. | High | SP017, SP020 |
| CP009 | Isomorphic Labs is an Alphabet subsidiary focused on AI-driven drug discovery, building on the Nobel Prize-winning AlphaFold system, with an interdisciplinary team of drug discovery experts and machine learning specialists. | Medium | SP008 |
| CP010 | Chai Discovery is developing Chai-2, which targets drug-like antibody design against challenging targets with atomic precision, building on its earlier open-released Chai-1 model. | Medium | SP009 |
| CP011 | Recursion Pharmaceuticals (NASDAQ: RXRX) has generated over 50 petabytes of biological and chemical data and operates BioHive-2, a biopharma supercomputer built in partnership with NVIDIA. | Medium | SP011 |
| CP012 | Schrödinger's computational platform is built on over 30 years of R&D and includes FEP+, WaterMap, and LiveDesign as core products used by leading pharmaceutical companies for molecular discovery and optimization. | Medium | SP013, SP014 |
| CP013 | Inceptive specializes in foundation models for RNA, mRNA, siRNA, ASO, and peptide therapeutics, operating from offices in Palo Alto, Berlin, and Zurich, and was founded in 2021. | Medium | SP015 |
| CP014 | Iambic Therapeutics uses its Enchant and NeuralPLexer AI technologies for drug design and has reported Phase 1b safety and tolerability data for IAM1363, a HER2-targeted inhibitor for brain-penetrant cancer treatment. | Medium | SP016 |
| CP015 | Xaira Therapeutics is building predictive and agentic AI models across the complete drug discovery and development process, including target identification, therapeutic design, and patient selection. | Medium | SP028 |
| CP016 | The OpenFold Consortium provides permissively licensed open-source protein folding tools including OpenFold, OpenFold-SoloSeq (no MSA required), and OpenFold-Multimer for protein complex modeling. | Medium | SP019 |
| CP017 | AbSci's AI Drug Creation Platform operates with 6-week wet-lab and AI iterative cycles for de novo biologic design, enabling multi-parametric lead optimization from concept through to clinical trial pipeline. | Medium | SP004 |
| CP018 | AbSci has designed ABS-201, an AI-generated antibody targeting prolactin receptors for androgenetic alopecia, which demonstrated hair follicle regeneration in in vivo studies and is positioned as a potential best-in-class therapeutic developed in 24 months. | Medium | SP004 |
| CP019 | Profluent Bio describes OpenCRISPR on its website as the world's first AI-designed gene editor, representing the company's flagship public demonstration of protein design AI capability. | Medium | SP001 |
| CP020 | Cradle.bio is SOC 2 compliant and operates on a software subscription model where customer IP is fully retained, customer experimental data is never used to train models for other customers, and no royalties are charged. | Medium | SP005 |
| CP021 | Novonesis (formerly Novozymes), one of the world's largest industrial biotech companies, has publicly stated a partnership with Cradle that embeds AI directly into how it innovates protein products to shorten development time. | Medium | SP005 |
| CP022 | Generate Biomedicines operates over 140,000 square feet of lab space at Boynton Yards and Andover locations, supporting a capital-intensive generate-build-measure-learn platform. | Medium | SP002 |
| CP023 | Generate Biomedicines' lead program GB-0895 is an AI-designed anti-TSLP antibody for asthma, co-optimized for both biological effect and reduced dosing frequency, with potential to shift treatment from monthly to twice-yearly administration. | Medium | SP003 |
| CP024 | Recursion Pharmaceuticals' clinical pipeline includes REC-4881 (Phase 2 MEK1/2 inhibitor for FAP with Orphan Drug and Fast Track designations) and REC-3565 (Phase 1 MALT1 inhibitor for B-cell lymphoma). | Medium | SP012 |
| CP025 | Adaptyv Bio is based at the Biopole Life Science Campus in Epalinges, Lausanne, Switzerland and positions itself as a cloud lab for protein designers. | Low | SP007 |
| CP026 | EvolutionaryScale offers Forge, its commercial API platform for ESM3 and ESMC access, described as entering public beta in January 2025 alongside the Science publication announcement. | Medium | SP025, SP023 |
| CP027 | Cradle.bio charges customers a software subscription fee, explicitly promises no royalties, and states that customer sequences and data are private, secure, and never used to train models for other customers. | Medium | SP005 |
| CP028 | Meta's ESM2 model family is available on both GitHub and HuggingFace under the MIT license with no usage restrictions, including commercial use, at zero cost; public HuggingFace checkpoints range from 8M to 15B parameters. | High | SP017, SP018 |
| CP029 | Schrödinger announced Q1 2026 financial results in May 2026, confirming its active status as a publicly traded drug discovery platform company (NASDAQ: SDGR). | Medium | SP013 |
| CP030 | Generate Biomedicines and AbSci both monetize through B2B pharma partnership and licensing models rather than offering public self-service APIs, distinguishing their commercial models from EvolutionaryScale's Forge API approach. | Medium | SP002, SP004 |
| CP031 | ESM2 was released by Meta AI FAIR under an MIT license, and the same researchers (Rives, Lin, Sercu, Candido) who created it at Meta are the co-founders of EvolutionaryScale, creating a structural commoditization baseline against their own commercial offering. | High | SP017, SP020, SP025 |
| CP032 | ESMFold, a protein structure prediction model based on ESM2 developed at Meta AI, predicts protein structure end-to-end up to 60x faster than prior state-of-the-art methods and is freely available. | Medium | SP020 |
| CP033 | The Meta ESM2 model family (up to 15B parameters) and ESMFold, both released under MIT license for any use including commercial, were built by EvolutionaryScale's own founders and set a free commoditization floor for basic protein language modeling. | High | SP017, SP018, SP020 |
| CP034 | Insilico Medicine (HKEX: 3696) has completed a Phase 2 clinical trial for ISM001-055 (TNIK inhibitor for IPF), making it the first AI drug discovery company to reach Phase 2 completion with a drug designed entirely using AI. | Medium | SP027 |
| CP035 | ESM3's defining differentiation is simultaneous joint reasoning over protein sequence, structure, and function in one multimodal model — a capability absent from ESM2 (sequence only) and standard AlphaFold variants (structure prediction only). | High | SP022, SP025 |
| CP036 | Generate Biomedicines has raised substantially more capital than EvolutionaryScale (estimated ~$700M+ vs $142M), enabling a capital-intensive wet-lab validation strategy that EvolutionaryScale's disclosed funding cannot currently replicate. | Medium | SP002 |
| CP037 | AlphaFold 3's commercial rights for drug discovery are exclusively licensed to Isomorphic Labs, while the AlphaFold model code and weights are available for academic non-commercial use, creating a two-tier access structure. | Medium | SP006, SP008 |
| CP038 | Pharma clients can simultaneously use free protein AI tools (AlphaFold DB CC-BY-4.0, ESM2 MIT, OpenFold open-source) and paid platforms (Forge, Cradle, Generate), enabling multi-homing that limits any single vendor's pricing power. | Medium | SP006, SP017, SP005, SP025 |
| CP039 | David Baker (Institute for Protein Design, University of Washington) was co-awarded the Nobel Prize in Chemistry in October 2024 for computational protein design, alongside Demis Hassabis and John Jumper (Google DeepMind) for AlphaFold. | High | SP006, SP010 |
| CP040 | The Institute for Protein Design distributes RFdiffusion and RoseTTAFold software royalty-free and has developed a COVID-19 vaccine using protein design technology that received approval in the UK and South Korea as well as WHO Emergency Use Listing. | Medium | SP010, SP021 |
| CI001 | EvolutionaryScale's commercial product was Forge, a protein language model inference API launched in public beta in January 2025, providing pay-per-token access to ESM3 and ESM Cambrian models. | High | SI001, SI003, SI004 |
| CI002 | ESM Cambrian (released December 2024) made its largest model, ESMC-6B, available only through hosted channels (the Forge API and AWS SageMaker), while the 300M and 600M variants were released as open weights; this contrasts with Meta AI Research's ESM2, whose full model family is freely available on HuggingFace. | High | SI003, SI023 |
| CI003 | No revenue figures, ARR, gross margin, customer count, or commercial traction metrics were publicly disclosed by EvolutionaryScale at any point during its operation as a standalone entity. | High | SI001, SI014, SI015 |
| CI004 | The Forge API pricing schedule required a user login to view at forge.evolutionaryscale.ai; no public price list was available to unauthenticated users as of the research date. | Medium | SI004 |
| CI005 | EvolutionaryScale's revenue model combined at least three streams: Forge API pay-per-use (per-token), enterprise annual API contracts, and partner distribution through NVIDIA BioNeMo and AWS SageMaker JumpStart. | Medium | SI004, SI016, SI017 |
| CI006 | ESM3 was integrated into NVIDIA's BioNeMo platform as a NVIDIA Inference Microservice (NIM), enabling cloud-hosted protein generation through NVIDIA's commercial distribution channel. | High | SI017, SI018, SI019 |
| CI007 | EvolutionaryScale's ESM models were listed on AWS SageMaker JumpStart for cloud-hosted access; Amazon was the lead co-investor in the September 2024 Series A, suggesting a strategic alignment between investment and cloud distribution. | Medium | SI009, SI017 |
| CI008 | EvolutionaryScale offered an academic free tier with capped token allowances as a freemium entry point to Forge API, intended to drive academic usage and downstream conversion to enterprise or paid API tiers. | Medium | SI004, SI001 |
| CI009 | The open-weight ESM2 model (developed by Meta AI Research and released on HuggingFace) serves as a zero-cost alternative for protein sequence embeddings, creating a structural competitive ceiling on EvolutionaryScale's Forge API pricing power for non-generative use cases. | Medium | SI023, SI003 |
| CI010 | The Forge API's post-CZI Biohub operational status and pricing under CZI management are unconfirmed in public sources as of May 2026; the company homepage states it is "joining forces with Biohub" without specifying Forge API continuity. | High | SI001, SI013 |
| CI011 | ESM3 was trained with over 10^24 FLOPs—described by EvolutionaryScale as "the most compute ever applied to training a biological model"—on what the company called "one of the highest throughput GPU clusters in the world today." | High | SI002, SI018 |
| CI012 | At over 10^24 FLOPs and H100 GPU pricing of approximately $2–5 per GPU-hour, the ESM3 training run is estimated to have cost $10–50 million, making it the dominant one-time capital expenditure in EvolutionaryScale's history. | Low | SI002, SI018 |
| CI013 | EvolutionaryScale's LinkedIn company profile shows the 11-50 employee size bracket; the estimates here assume approximately 25-50 full-time employees at peak operational scale. | Medium | SI025 |
| CI014 | At an estimated 25-50 FTE with a blended all-in cost of $200,000–$300,000 per employee annually (standard for San Francisco AI research teams), EvolutionaryScale's annual personnel burn is estimated at $5–15 million per year. | Low | SI025 |
| CI015 | Amazon's role as lead Series A investor may have included in-kind AWS cloud compute credits as part of the deal structure, which would reduce EvolutionaryScale's cash infrastructure spend and extend effective runway beyond naive burn-rate estimates. | Low | SI009, SI017 |
| CI016 | Gross margin for the Forge API inference business depends on whether EvolutionaryScale owned GPU cluster infrastructure (capital-intensive, higher long-run margin) or rented cloud compute (lower capex, COGS-heavy). No gross margin figures were disclosed. | Low | |
| CI017 | The largest ESM3 variant has 98 billion parameters, placing it firmly in the frontier scale class for biological language models; inference costs per query at this parameter scale are substantially higher than for smaller protein models. | High | SI002, SI018 |
| CI018 | EvolutionaryScale raised $142 million in a Series A round announced on September 26, 2024, led by Amazon and NVIDIA, with co-investment from Lux Capital, Nat Friedman, and Daniel Gross. | High | SI001, SI009, SI010, SI017 |
| CI019 | The Series A was reported at a post-money valuation of approximately $1.35 billion, placing EvolutionaryScale among the most highly valued pre-revenue protein AI companies at the time of the round. | High | SI009, SI011, SI020 |
| CI020 | NVIDIA joined EvolutionaryScale's seed round (announced June 25, 2024), making NVIDIA both a seed and Series A investor—an unusual dual-round commitment that underscores the strategic importance of ESM3 to NVIDIA's BioNeMo platform. | High | SI016, SI017 |
| CI021 | EvolutionaryScale's seed round (announced June 2024) was led by Nat Friedman and Daniel Gross, with participation from Lux Capital; the dollar amount raised in the seed was not publicly confirmed in any accessible source. | Medium | SI014, SI015 |
| CI022 | Total capital raised by EvolutionaryScale prior to the CZI Biohub transaction is estimated at approximately $145 million: the $142M Series A plus an assumed ~$3M seed, whose actual amount was not publicly confirmed. | Medium | SI014, SI009, SI016 |
| CI023 | On November 6, 2025, the EvolutionaryScale team joined CZI Biohub to advance the Frontier AI for Biology Initiative, as announced by Biohub.org and reported by CNBC. | High | SI013, SI012, SI001 |
| CI024 | Alex Rives, EvolutionaryScale co-founder and chief scientist, became Head of Science at CZI Biohub following the November 2025 transaction. | High | SI013, SI012 |
| CI025 | The CZI Biohub transaction came roughly 13 months after the $142M Series A was announced on September 26, 2024, leaving only a narrow window for commercial revenue ramp before the standalone entity effectively ended. | High | SI009, SI013 |
| CI026 | No Form D filings for EvolutionaryScale appear in SEC EDGAR under any of the following search terms: "EvolutionaryScale," "Evolutionary Scale," "Evolutionary Scale Inc," or by key person "Alexander Rives" — across four separate EDGAR full-text and company browse searches. | High | SI005, SI006, SI007, SI008 |
| CI027 | Private companies raising capital under SEC Regulation D exemptions are legally required to file Form D with the SEC within 15 days of the first sale of securities. The absence of a Form D in EDGAR for EvolutionaryScale's $142M Series A may indicate a compliance gap, a filing under an undiscovered legal entity name, or reliance on an exemption outside Regulation D (such as Section 4(a)(2)), which carries no Form D requirement. | High | SI005, SI006 |
| CI028 | The financial terms of the November 2025 CZI Biohub transaction were not disclosed in any public source reviewed, including Biohub.org, CNBC, EvolutionaryScale's homepage, or SEC EDGAR. | High | SI012, SI013, SI001 |
| CI029 | Xaira Therapeutics raised $1 billion at its founding in 2024 for full-stack AI drug discovery, representing the largest single-round raise in protein AI; by comparison, EvolutionaryScale reached its ~$1.35B valuation on $142M raised, a far smaller capital base. | Medium | SI014, SI015 |
| CI030 | Profluent Bio raised approximately $44 million across its financing rounds for protein design AI with a narrower commercial scope than EvolutionaryScale, demonstrating that the protein AI market can support smaller, more focused capital deployments alongside frontier-scale foundation models. | Medium | SI014 |
| CI031 | Generate:Biomedicines raised over $700 million across Series A through C for full-stack AI protein therapeutics, targeting drug revenue rather than API monetization—a fundamentally different business model and capital structure from EvolutionaryScale's foundation-model API approach. | Medium | SI014, SI015 |
| CI032 | EvolutionaryScale's ~$3–6M of capital raised per employee (based on ~$145M total and ~25–50 FTE) substantially exceeds the capital intensity of Profluent and Cradle, reflecting the compute demands of frontier biological foundation model training rather than scaled commercial deployment. | Low | SI014, SI025 |
| CI033 | Crunchbase incorrectly labels EvolutionaryScale's $142M Series A as a "seed investment round" in its AI-generated summary, illustrating the unreliability of AI-generated private-market data summaries; the actual round type is confirmed as Series A by CNBC, Axios, NVIDIA, and MIT Technology Review. | High | SI014, SI009 |
| CI034 | The investor return profile for EvolutionaryScale's $142M Series A participants (Amazon, NVIDIA, Lux Capital, Nat Friedman, Daniel Gross) is not determinable from public sources following the CZI Biohub transaction, as no deal terms or investor distribution amounts were disclosed. | Low | SI012, SI013 |
| CI035 | EvolutionaryScale as an independent commercial entity effectively ceased to operate following the November 2025 CZI Biohub transaction; the company homepage confirms the entity is "joining forces with Biohub" without a separate commercial continuation announcement. | High | SI001, SI013, SI012 |
| CI036 | All five planned financial information gaps—actual revenue, confirmed burn rate, CZI transaction terms, Form D filings, and enterprise customer count—remain unresolved in public sources as of May 2026 and require direct access to CZI Biohub documentation or historical EvolutionaryScale internal records. | High | SI005, SI001, SI014 |
| CI037 | The CZI Biohub is a non-profit initiative of the Chan Zuckerberg Initiative whose Frontier AI for Biology Initiative absorbs EvolutionaryScale's team and models under a philanthropic, non-commercial mandate—a fundamental change from the VC-backed commercial API business model. | High | SI013, SI012 |
| CI038 | The $142M Series A at $1.35B valuation for a pre-revenue, sub-50-employee foundation model company represents a significant premium ascribed entirely to the scientific moat and strategic optionality of ESM3/ESM Cambrian rather than demonstrated commercial revenue or customer traction. | Medium | SI009, SI019, SI002 |
| CE001 | EvolutionaryScale's product portfolio consists of two model families: ESM3 (multimodal generative protein LM in 1.4B/7B/98B sizes) and ESM-C / Cambrian (embedding-focused protein LM in 300M/600M/6B sizes). | High | SE001, SE002, SE003 |
| CE002 | ESM3-small-2024-08 has 1.4 billion parameters; ESM3-medium-2024-08 has 7 billion; and ESM3-large-2024-03 has 98 billion parameters. | High | SE001, SE009, SE017 |
| CE003 | ESMC-300M uses 30 transformer layers with hidden width 960; ESMC-600M uses 36 layers with width 1152; ESMC-6B uses 80 layers with width 2560. | Medium | SE002 |
| CE004 | Open weights for ESM3-small-2024-08 and ESMC-300M/ESMC-600M are available on HuggingFace under the Cambrian Non-Commercial License Agreement, which prohibits commercial use. | High | SE001, SE014, SE009 |
| CE005 | EvolutionaryScale's open weights for ESM3-small were first released in June 2024 concurrent with the ESM3 launch; ESMC-300M and ESMC-600M open weights were released in December 2024. | High | SE001, SE002 |
| CE006 | ESM3's flagship proof of concept is esmGFP, a novel functional fluorescent protein designed with only 58% sequence identity to the nearest known natural GFP — approximately equivalent to 500 million years of evolutionary distance. | High | SE001, SE005, SE006 |
| CE007 | The Forge API (forge.evolutionaryscale.ai) provides programmatic access to ESM3 and ESMC models through a Python SDK (pip install esm) with synchronous and asynchronous inference and a batch executor; the API was opened to public beta in January 2025. | High | SE001, SE004, SE009 |
| CE008 | Amazon Web Services and NVIDIA are EvolutionaryScale's primary commercial deployment partners: ESMC-6B is deployed on AWS SageMaker JumpStart, and NVIDIA is integrating ESM-C into BioNeMo NIM. | High | SE017, SE018, SE019 |
| CE009 | ESM3 uses a multitrack transformer architecture with three separate discrete token tracks: amino acid sequence tokens, VQVAE-encoded structure tokens (representing 3D backbone coordinates), and function keyword tokens (GO annotations). | High | SE001, SE005, SE006 |
| CE010 | ESM3-large (98B parameters) was trained on 2.78 billion proteins, 771 billion unique tokens, using 1.07×10²⁴ floating-point operations on the Andromeda cluster. | High | SE001, SE005, SE017 |
| CE011 | ESM3 employs a vector quantized variational autoencoder (VQVAE) to encode 3D protein backbone coordinates as discrete structural tokens, enabling the transformer to natively generate structure as tokens rather than as continuous coordinates. | High | SE001, SE006 |
| CE012 | ESM3 is pre-trained using a masked language modeling (MLM) objective applied jointly across all three tracks (sequence, structure, function), enabling the model to infer any track from the others. | High | SE001, SE005 |
| CE013 | A preference-based alignment stage, analogous to reinforcement learning from human feedback (RLHF) in text language models, was applied to ESM3-large to improve its outputs on protein design tasks. | Medium | SE001 |
| CE014 | ESM-C uses a Pre-Layer Normalization transformer architecture with rotary positional embeddings (RoPE), SwiGLU feed-forward activations, and masked language modeling pre-training. | Medium | SE002 |
| CE015 | ESMC training compute: ESMC-300M was trained on 1.26×10²² FLOPs; ESMC-600M on 2.17×10²² FLOPs; ESMC-6B on 2.37×10²³ FLOPs. | Medium | SE002 |
| CE016 | ESM-C was trained on three protein sequence databases: UniRef (83 million sequence clusters), MGnify (372 million), and JGI metagenomics (2 billion clusters), all clustered at 70% sequence identity. | Medium | SE002 |
| CE017 | EvolutionaryScale's GitHub organization hosts a DeepEP repository; DeepEP, originally released by DeepSeek, is an open-source CUDA/NCCL implementation of Mixture-of-Experts Expert Parallelism communication optimized for H800 GPUs, and the repository showed 1,253 GitHub stars as of the research date. | Medium | SE011, SE010 |
| CE018 | NVIDIA reports that ESM3-large uses approximately 25× more FLOPs and 60× more training data than its predecessor, ESM2 (Meta AI), and was trained on NVIDIA H100 GPUs via the Andromeda HPC cluster. | Medium | SE017 |
| CE019 | The GitHub ESM repository (github.com/evolutionaryscale/esm) provides the official Python client library for the Forge API and access to open-weight models, with installation via pip. | Medium | SE009, SE012 |
| CE020 | The HuggingFace model card for esm3-sm-open-v1 showed 3,105 downloads in the prior 30 days and 291 likes as of the research access date. | Medium | SE013, SE014 |
| CE021 | The ESMC-300M model card showed 6,320 downloads on HuggingFace; the ESMC-600M model card showed 1,490 downloads, as of the research access date. | Medium | SE013, SE015 |
| CE022 | ESMC-6B is available via the Forge API for academic users and via AWS SageMaker JumpStart for commercial deployments; the SageMaker deployment uses a CloudFormation stack documented in the esm-sagemaker GitHub repository, with setup time of 15-25 minutes. | Medium | SE002, SE009, SE019 |
| CE023 | EvolutionaryScale's GitHub organization (github.com/evolutionaryscale) hosts nine public repositories including esm (flagship), DeepEP (1,253 stars), a NCCL fork, a Hugging Face transformers fork, a Mamba implementation, and esm-sagemaker. | Medium | SE010, SE011, SE009 |
| CE024 | No SEC Form D filings for EvolutionaryScale were found in EDGAR as of May 2026. Because Form D is a notice of an exempt (unregistered) offering, its absence neither confirms nor contradicts the company's privately held status. | Medium | SE021 |
| CE025 | NVIDIA announced a partnership with EvolutionaryScale to integrate ESM3 into the BioNeMo NIM platform for GPU-optimized inference, and participated in EvolutionaryScale's seed investment. | High | SE016, SE017, SE018 |
| CE026 | Hacker News search results show ten or more community discussion threads covering EvolutionaryScale and ESM3, including "Show HN: ESM C" and multiple threads on the Science publication and initial ESM3 launch, indicating meaningful developer community engagement. | Medium | SE022 |
| CE027 | The ESM3 Science paper (Hayes et al., January 2025, Vol 387, Issue 6736, pp. 850-858, DOI 10.1126/science.ads0018) has accumulated 341 citations and 68,494 downloads, with 318 of the citations arriving within the first 12 months of publication. | High | SE005, SE026 |
| CE028 | The ESM3 preprint on bioRxiv (submitted July 2024, DOI 10.1101/2024.07.01.600583) was cited by 129+ downstream papers within its first year of availability, signaling rapid academic adoption. | Medium | SE006, SE008 |
| CE029 | esmGFP carries 96 mutations out of 229 total amino acid positions, achieving 58% sequence identity to the nearest known natural GFP and representing a protein in a region of sequence space separated from known fluorescent proteins by approximately 500 million years of evolutionary divergence. | High | SE001, SE005, SE006 |
| CE030 | ESM3 designed esmGFP by jointly optimizing across sequence, structure, and function tracks, using the multitrack generative capability to explore protein sequence space beyond the reach of natural evolution or previous directed-evolution methods. | High | SE001, SE005 |
| CE031 | The 58% sequence identity between esmGFP and the nearest natural GFP corresponds to a divergence comparable to the evolutionary separation between corals and jellyfish, which belong to distinct lineages of fluorescent-protein-bearing animals. | Medium | SE001, SE006 |
| CE032 | EvolutionaryScale filed patents covering esmGFP and related protein design methods, as stated in the ESM3 bioRxiv preprint. | Medium | SE006 |
| CE033 | ESM3 competes in the protein AI landscape against AlphaFold3 (structure prediction, DeepMind, May 2024), Chai-1 (protein complex structure, Chai Discovery), and ESM2 (sequence LM, Meta AI); none of these is primarily a generative protein design model, since AlphaFold3 and Chai-1 focus on structure prediction and ESM2 on sequence representation. | Medium | SE027, SE028, SE017 |
| CE034 | EvolutionaryScale raised a $142 million Series A in September 2024 from investors including Amazon, NVIDIA, and Lux Capital, following an earlier seed investment from NVIDIA. | Medium | SE024, SE025, SE020 |
| CE035 | In November 2025, EvolutionaryScale's team joined CZI Biohub as part of its Frontier AI for Biology Initiative; co-founder and chief scientist Alex Rives was appointed head of science at Biohub. | Medium | SE023 |
| CE036 | Open-weight ESM3 and ESMC models are distributed under the Cambrian Non-Commercial License Agreement, which restricts use to non-commercial research; commercial customers must access models through the Forge API or AWS SageMaker. | High | SE001, SE014 |
| CE037 | EvolutionaryScale employs a dual-access commercial model: open-weight non-commercial access for research and community adoption, and Forge API / SageMaker commercial access with undisclosed pricing. | Medium | SE001, SE002, SE004, SE019 |
| CE038 | EvolutionaryScale is described as a public benefit company (PBC) in CZI Biohub's November 2025 announcement, consistent with its stated mission of advancing biology through responsible AI. | Medium | SE023 |
| CE039 | An independent bioRxiv preprint (December 2024) found that ESM3's binding prediction accuracy deteriorates when distinct per-variant relaxed protein structures are used as inputs, compared to a single consistent structural backbone, a finding the authors describe as the 'More Structure, Less Accuracy' paradox. | Medium | SE007 |
| CE040 | EvolutionaryScale has not publicly disclosed commercial pricing for the Forge API, customer names or contract counts, or revenue metrics as of May 2026. | High | SE003, SE004, SE021 |
| CU001 | The biohub/esm3-sm-open-v1 model on HuggingFace had approximately 3,110 downloads and 291 likes as of May 2026, reflecting academic uptake of the ESM3 open-weight model. | Medium | SU006 |
| CU002 | The biohub/esmc-300m-2024-12 model on HuggingFace had approximately 6,320 downloads and 30 likes as of May 2026. | Medium | SU006 |
| CU003 | The biohub/esmc-600m-2024-12 model on HuggingFace had approximately 1,490 downloads and 32 likes as of May 2026. | Medium | SU006 |
| CU004 | Total ESM-C family HuggingFace downloads across 300M and 600M open models sum to approximately 7,810 as of May 2026. | Medium | SU006 |
| CU005 | A Semantic Scholar API search for ESM3 and EvolutionaryScale returned 32 papers building on the ESM3 framework as of May 2026. | Medium | SU009 |
| CU006 | A bioRxiv search for 'evolutionaryscale ESM3' returned 129 preprint results as of May 2026, indicating broad academic interest in ESM3. | Medium | SU010 |
| CU007 | ESM-C models are available for commercial deployment on Amazon SageMaker under the Cambrian Inference Clickthrough License Agreement, enabling broad commercial use by enterprise customers. | High | SU003, SU004 |
| CU008 | ESM-C 6B is available for academic use via the Forge API and for commercial use via Amazon SageMaker, as stated in the ESM Cambrian launch blog post. | Medium | SU003 |
| CU009 | AWS SageMaker deployment of ESM-C requires admin-level AWS account access, subscription via the Marketplace, and uses CloudFormation to deploy a dedicated GPU endpoint in 15–25 minutes billed to the customer's AWS account. | Medium | SU004 |
| CU010 | NVIDIA BioNeMo was listed as an upcoming integration channel for ESM-C models as of December 2024; the live status of the integration could not be confirmed as of May 2026. | Medium | SU003, SU007 |
| CU011 | Adaptyv Bio, a protein engineering company based at Biopole Life Science Campus in Lausanne, Switzerland, has been confirmed as a named ESM ecosystem partner. | Medium | SU008 |
| CU012 | EvolutionaryScale opened the Forge API public beta in January 2025, offering scientists in academia and industry a free limited-time preview of ESM3 and ESM-C models. | High | SU001, SU002 |
| CU013 | The EvolutionaryScale GitHub organization includes an 'esm-partner' repository explicitly labeled 'Repository for partner collaborations,' indicating a formal partner pipeline. | Medium | SU005 |
| CU014 | EvolutionaryScale raised $142M Series A in September 2024 from Amazon (AWS) and NVIDIA as strategic investors, with Lux Capital, Nat Friedman, and Daniel Gross also participating. | High | SU011, SU012 |
| CU015 | NVIDIA participated in EvolutionaryScale's seed investment round, as confirmed by a dedicated NVIDIA Newsroom press release, establishing the NVIDIA–EvolutionaryScale relationship before the Series A. | Medium | SU016 |
| CU016 | Lux Capital co-led or participated in EvolutionaryScale's Series A round, as confirmed by a Lux Capital blog post announcing the investment. | Medium | SU017 |
| CU017 | No named pharmaceutical company (such as Pfizer, Eli Lilly, Novartis, or Roche) has been publicly disclosed as a paying enterprise customer of EvolutionaryScale as of May 2026. | Medium | SU014, SU018 |
| CU018 | Generate Biomedicines announced a multi-billion-dollar collaboration with Amgen, representing a commercial benchmark that EvolutionaryScale has not publicly matched as of May 2026. | Medium | SU019 |
| CU019 | Isomorphic Labs signed collaboration agreements with Eli Lilly and Novartis totaling over $3 billion in potential milestone value, creating a commercial proof standard EvolutionaryScale has not yet demonstrated. | Medium | SU021 |
| CU020 | ESM3 was published in Science on January 16, 2025 (DOI: 10.1126/science.ads0018), providing peer-reviewed academic validation that anchors downstream commercial trust. | High | SU015, SU002 |
| CU021 | MegSite (nucleic acid binding residue prediction), ProteinReasoner (multi-modal protein LM with chain-of-thought), and iNClassSec-ESM (non-classical secreted protein discovery) are among the named downstream academic applications built on the ESM3 framework. | Medium | SU009 |
| CU022 | EvolutionaryScale has disclosed no NRR, GRR, customer count, or customer satisfaction metrics as of May 2026; the company is in a pre-commercial-scale API beta phase. | Medium | SU014, SU018 |
| CU023 | The ESM3 open-weight model (1.4B parameters) is licensed for non-commercial use only; commercial access to all model scales (including 7B and 98B ESM3) requires Forge API tokens or AWS SageMaker subscriptions. | High | SU004, SU001 |
| CU024 | Biosecurity organizations including NTI and the Center for AI Safety have documented ongoing biosecurity concerns about dual-use capabilities of protein design AI, which may constrain the addressable commercial market for open-weight frontier protein language models. | Medium | SU028, SU029 |
| CU025 | BioNTech and InstaDeep fine-tuned an ESM language model (predecessor generation) on COVID spike proteins to create a variant early-warning system that flagged all 16 WHO variants of concern before official designation, demonstrating prior named corporate production use of the ESM family. | Medium | SU002 |
| CU026 | The global drug discovery market is valued at approximately $71.89 billion in 2025, growing at a CAGR of 9.20% through 2034, providing a large total addressable market for AI infrastructure vendors like EvolutionaryScale. | Medium | SU020 |
| CU027 | EvolutionaryScale's company website does not display customer logos, named case studies, testimonials, or enterprise customer success content as of May 2026. | Medium | SU001 |
| CU028 | ESM-C models on HuggingFace were updated as recently as two days before the research cache date (approximately May 2026), indicating active model maintenance and development velocity. | Medium | SU006 |
| CU029 | The GitHub repository evolutionaryscale/esm is the primary open-source distribution channel for ESM model weights, code, and API client libraries including the Forge and SageMaker SDKs. | Medium | SU004 |
| CU030 | The ESM3 open-weight model (1.4B parameters) was released on June 25, 2024 under a non-commercial license as stated in the ESM3 launch blog post. | High | SU002, SU004 |
| CU031 | The ESM Cambrian (ESM-C) model family was released on December 4, 2024 at three scales (300M, 600M, 6B), with open weights for the two smaller models and gated commercial access for the 6B model. | Medium | SU003 |
| CU032 | NVIDIA BioNeMo platform explicitly targets drug discovery, molecular design, virtual screening, and protein binder design use cases, which directly overlap with ESM3/ESM-C's primary commercial applications. | Medium | SU007 |
| CU033 | AWS SageMaker listing of ESM-C models creates a cloud-based commercial deployment channel for enterprise users who can subscribe and deploy without requiring direct Forge API account creation. | Medium | SU004, SU025 |
| CU034 | EvolutionaryScale's open-weight model strategy creates a top-of-funnel adoption mechanism where academic users build familiarity that can convert to commercial API usage, following a pattern analogous to open-source AI infrastructure companies. | Medium | SU001, SU004 |
| CU035 | ESM2, the predecessor to ESM3, with up to 15B parameters and freely available open weights, represents a no-cost substitution option for protein representation tasks that reduces commercial willingness-to-pay for ESM-C paid access. | Medium | SU004 |
| CU036 | Amazon AWS's strategic investment in EvolutionaryScale's Series A creates a structural incentive for preferential channel placement in AWS SageMaker JumpStart and AWS HealthOmics, giving EvolutionaryScale access to AWS's enterprise life sciences customer base. | Medium | SU011, SU004 |
| CU037 | Semantic Scholar papers building on ESM3 were published as recently as July–August 2025 (ProteinReasoner: July 25, 2025; MegSite: August 31, 2025), indicating ongoing downstream academic use at least 13 months after ESM3's release. | Medium | SU009 |
| CU038 | EvolutionaryScale is incorporated as a public benefit company (PBC), a legal structure that creates mission constraints that could limit commercialization of some high-profit but ethically questionable protein design applications. | Medium | SU001 |
| CR001 | ESM3 generated a fluorescent protein (esmGFP) with only 58% sequence identity to the nearest known fluorescent protein, a distance described as the equivalent of simulating 500 million years of natural evolution, demonstrating the model's capability to design genuinely novel proteins far from existing sequence space. | High | SR014, SR028 |
| CR002 | EvolutionaryScale's canonical Responsible Development Framework URL (evolutionaryscale.ai/blog/responsible-development) returned a 404 error on 2026-05-18, so the public documentation page was not accessible on the access date. | Medium | SR014 |
| CR003 | US Executive Order 14110 (October 30, 2023) explicitly mandates that developers of dual-use foundation models address security risks 'with respect to biotechnology, cybersecurity, critical infrastructure, and other national security dangers.' | Medium | SR001 |
| CR004 | A 2023 MIT study (arXiv 2306.03809) showed that general-purpose LLM chatbots could, in one hour, suggest four pandemic pathogen candidates, synthesis routes, DNA suppliers with lax screening, and troubleshooting protocols to non-scientists, indicating LLMs broadly lower biosecurity barriers. | Medium | SR004 |
| CR005 | No independent third-party biosecurity audit of EvolutionaryScale's Forge API guardrails has been publicly disclosed as of May 2026, making it impossible to verify the effectiveness of the company's self-regulatory biosecurity measures. | Medium | SR014, SR015 |
| CR006 | The Biological Weapons Convention (BWC), effective since 1975 with 189 states party as of May 2025, contains no AI-specific language and has no formal verification mechanism, leaving AI-designed protein risks unaddressed by existing international law. | Medium | SR006 |
| CR007 | The Center for AI Safety's 2023 statement (co-signed by Hinton and Bengio) identifies mitigating the risk of extinction from AI as a global priority on par with pandemics and nuclear war, with biological weapons specifically cited as a concern. | Medium | SR005 |
| CR008 | EvolutionaryScale's ESM Cambrian (Dec 2024) launch blog states that 'ESM C was reviewed by a committee of scientific experts who concluded that the benefits of releasing the models greatly outweigh any potential risks,' but the committee composition and evaluation criteria are not publicly disclosed. | Medium | SR015 |
| CR009 | The Asilomar Conference on Recombinant DNA (1975) established the precedent of voluntary self-regulatory frameworks for biotechnology, which EvolutionaryScale explicitly cites as inspiration for its Responsible Development Framework, but Asilomar's effectiveness in the longer term depended on subsequent binding FDA regulations. | Medium | SR014, SR027 |
| CR010 | The NTI biosecurity program identifies AI-biotech convergence as introducing 'risks of accidental misuse and deliberate exploitation, which could result in a biological catastrophe with grave consequences,' framing regulatory action as urgent. | Medium | SR007 |
| CR011 | Chai-1 (Apache 2.0, free for commercial use, released September 2024) achieves 0.849 Cα LDDT on CASP15 monomer prediction, outperforming ESM3-98B's 0.801, with 77% PoseBusters success vs AlphaFold3's 76%, making it the only freely available commercial-use model at or above ESM3 structure-prediction accuracy. | Medium | SR023 |
| CR012 | The AlphaFold Protein Structure Database (Google DeepMind/EMBL-EBI) provides over 200 million protein structure predictions freely under CC BY 4.0, updated as of March 2026 to include protein complex structures, directly covering a major Forge API use case at no cost. | Medium | SR011 |
| CR013 | ESM3-98B's training consumed 1×10²⁴ FLOPs on what was described at launch as 'one of the highest throughput GPU clusters in the world,' creating a compute cost that is a recurring operational risk as the company scales inference and future model training. | High | SR014, SR028 |
| CR014 | Meta's facebookresearch/esm repository states it 'contains code and pre-trained weights for Transformer protein language models from the Meta Fundamental AI Research Protein Team (FAIR),' and ESM3's biorxiv preprint confirms ESM3 was developed by founders who were FAIR employees, raising IP provenance questions about the ESM2-to-ESM3 transition. | Medium | SR010, SR028 |
| CR015 | RFdiffusion (Baker Lab, Nature 2023) is freely available for protein backbone generation, binder design, symmetric oligomer design, and active-site scaffolding — core use cases also addressed by ESM3 — and is distributed with permissive licensing from the University of Washington's Institute for Protein Design. | Medium | SR009 |
| CR016 | OpenFold (Apache 2.0, AQ Laboratory) provides a trainable reproduction of AlphaFold2 that organisations can fine-tune on proprietary data, enabling pharmaceutical companies to build internal capabilities that reduce dependence on Forge API subscription. | Medium | SR012 |
| CR017 | ESM3 uses a discrete token representation of protein structure that tokenises 3D protein backbone into a sequence alphabet, an architectural innovation first published in the ESM3 Science paper (January 2025) and biorxiv preprint (July 2024), with patents filed. | High | SR028, SR016 |
| CR018 | AlphaFold 2's prediction accuracy at CASP14 was 'insufficient for a third of its predictions' per Wikipedia's AlphaFold article, indicating that even state-of-the-art protein AI models have non-trivial failure rates — a parallel risk for ESM3-generated sequences in drug-discovery applications. | Medium | SR022 |
| CR019 | Amazon (AWS) and NVIDIA are simultaneously Series A investors in EvolutionaryScale and operators of competing bio-AI model distribution platforms (SageMaker JumpStart and BioNeMo respectively), creating a structural investor-competitor conflict. | High | SR017, SR013, SR026 |
| CR020 | NVIDIA BioNeMo is described as 'the development platform for AI-driven biology and drug discovery,' a direct competitor to EvolutionaryScale's Forge API, while NVIDIA is simultaneously an investor and hardware supplier to EvolutionaryScale. | Medium | SR013 |
| CR021 | The ESM Cambrian 300M and 600M parameter models are released as open-weight models for academic and commercial use, with ESM-C 6B available on Forge for academic use and on AWS SageMaker for commercial use, meaning EvolutionaryScale deliberately makes its smaller representation models freely available to drive adoption of the Forge API. | High | SR015, SR016 |
| CR022 | Meta's ESM2 model is available under the MIT license via the facebookresearch/esm repository, making it freely usable for commercial applications without royalty obligations — this creates a baseline that limits the premium a customer would pay for ESM3 API access for embedding-only use cases. | Medium | SR010 |
| CR023 | Chai-1's technical report demonstrates that multimer structure prediction without MSA at AlphaFold-Multimer quality level (69.8% DockQ acceptable vs 67.7%) is achievable under Apache 2.0 without API fees, representing a direct commercial threat to Forge API's structure-prediction use cases. | Medium | SR023 |
| CR024 | Meta's ESM Metagenomic Atlas blog (November 2022) confirms that ESMFold (based on ESM2) provides structure predictions up to 60x faster than the prior state-of-the-art, illustrating that Meta's FAIR team (EvolutionaryScale's founding employer) retains independent protein AI capabilities that could re-enter the competitive landscape. | Medium | SR024 |
| CR025 | The EU AI Act (Regulation 2024/1689), published 12 July 2024 and entering full enforcement August 2026, lays down harmonised rules for AI systems with EEA relevance, potentially subjecting general-purpose AI models with large training compute (>10²⁵ FLOPs) to systemic risk obligations including third-party audits. | High | SR020, SR021 |
| CR026 | The EU AI Act's full enforcement provisions take effect August 2026, meaning EvolutionaryScale will need to assess EU compliance — including conformity assessments, transparency obligations, and potentially human oversight for high-risk applications — within its current planning horizon. | High | SR020, SR021 |
| CR027 | The NIST AI Risk Management Framework Generative AI Profile (NIST-AI-600-1, published July 2024) provides voluntary guidance for organisations developing generative AI, increasingly referenced in government procurement requirements, creating de-facto compliance pressure for Forge API customers in the public sector. | Medium | SR002 |
| CR028 | FDA's AI/ML-enabled medical devices framework (SaMD) and 2024 action plan govern AI used in clinical diagnosis and treatment decision support but do not yet specifically regulate AI protein design tools used in pre-clinical drug discovery — a regulatory gap that may be filled if ESM3-based designs progress toward IND submissions. | Medium | SR003 |
| CR029 | No public BIS (Bureau of Industry and Security) final rule specifically governing export of protein language model weights or API access has been identified as of May 2026, but CSET has highlighted advancing US biotechnology governance as urgent for AI biosecurity, suggesting rulemaking activity is directionally likely. | Medium | SR008, SR001 |
| CR030 | The Biological Weapons Convention's absence of any AI-specific language or verification regime means that the primary international legal framework against bioweapons does not currently create binding compliance obligations for EvolutionaryScale specifically related to protein language model deployment. | Medium | SR006 |
| CR031 | Industry self-regulatory AI safety commitments (Anthropic's Responsible Scaling Policy, OpenAI safety commitments) set voluntary precedents for biosafety evaluation at capability thresholds, but EvolutionaryScale's Responsible Development Framework does not publicly specify comparable quantitative triggers or third-party verification requirements. | Medium | SR029, SR030, SR015 |
| CR032 | EvolutionaryScale has raised approximately $145M total (seed plus $142M Series A, September 2024) at a $1.35B post-money valuation, with Amazon and NVIDIA as lead investors; no subsequent funding round has been publicly disclosed as of May 2026. | High | SR017, SR019 |
| CR033 | EvolutionaryScale has disclosed no public revenue or ARR metrics; at a $1.35B valuation with $145M raised and a pre-revenue or early-revenue profile, the implied revenue multiple significantly exceeds typical Series A SaaS multiples, creating down-round risk if commercial adoption is slower than investor expectations. | Medium | SR017 |
| CR034 | Amazon (AWS) is simultaneously a lead Series A investor, primary compute provider (AWS EC2 GPU instances), distribution channel (SageMaker JumpStart), and operator of a competing bio-AI discovery platform — a four-way conflict of interest with no public disclosure of ring-fencing arrangements. | Medium | SR017, SR026 |
| CR035 | NVIDIA is simultaneously a lead Series A investor, primary GPU hardware supplier, BioNeMo platform operator (including ESM model distribution), and a developer of competing bio-AI capabilities, creating a four-way structural conflict comparable to Amazon's. | Medium | SR017, SR013 |
| CR036 | ESM Cambrian commercial use is available via AWS SageMaker, meaning Amazon earns transaction fees on Forge-equivalent commercial access to EvolutionaryScale's models — an arrangement that benefits Amazon's cloud revenue while potentially constraining EvolutionaryScale's direct-to-customer pricing power. | Medium | SR015, SR026 |
| CR037 | ESM3-98B training at 1×10²⁴ FLOPs represents one of the most computationally intensive biological model training runs recorded; the GPU compute costs for ongoing model development, API inference at commercial scale, and future ESM4 training represent a significant and growing operating expense with no public disclosure of unit economics. | Medium | SR014, SR013 |
| CR038 | All four named EvolutionaryScale founders (Alexander Rives, Tom Sercu, Zeming Lin, Salvatore Candido) are alumni of Meta AI's FAIR protein team, representing a single-employer concentration in the founding team with no disclosed external scientific advisory board. | High | SR028, SR014 |
| CR039 | The ESM3 biorxiv preprint author list names Thomas Hayes, Roshan Rao, Halil Akin, Nicholas Sofroniew, Deniz Oktay, Zeming Lin, Robert Verkuil, Tom Sercu, Salvatore Candido, and Alexander Rives among the core team — all identified as EvolutionaryScale, PBC employees — indicating technical concentration in the founding team. | Medium | SR028 |
| CR040 | Alexander Rives, CEO, is the original ESM model creator and lead author of the 2021 PNAS paper that introduced the first ESM models (ESM-1b); Tom Sercu and Zeming Lin are the primary technical architects of ESM3 and ESM Cambrian, respectively. Departure of any of the three would represent a material scientific knowledge risk. | Medium | SR010, SR028 |
| CR041 | No succession plan, key-person insurance, or CEO independence structure has been publicly disclosed by EvolutionaryScale, leaving investors with no visible mitigation for key-person departure risk at the $1.35B valuation level. | Low | |
| CR042 | The facebookresearch/esm GitHub repository states it 'contains code and pre-trained weights' under Meta's terms; ESM3 was developed by founders who worked at Meta FAIR and built ESM2, creating a plausible IP provenance chain where Meta could assert rights over ESM3 commercial weights as derivative works. | Medium | SR010, SR024 |
| CR043 | The ESM3 biorxiv preprint competing interest statement discloses 'patents have been filed related to aspects of this work' but does not specify patent numbers, claims, status, jurisdiction, or relationship to Meta's prior art — leaving investors and customers unable to assess the durability of EvolutionaryScale's IP position. | Medium | SR028 |
| CR044 | EvolutionaryScale is incorporated as a Public Benefit Corporation (PBC), which in Delaware law creates a board obligation to balance shareholder interests with a stated public benefit purpose — potentially constraining purely commercial decisions about model access pricing or API gating in ways that may conflict with investor return expectations. | Medium | SR014, SR028 |
| CR045 | No litigation, regulatory complaint, enforcement action, or customer dispute records involving EvolutionaryScale, PBC have been identified in publicly accessible sources as of May 2026, indicating a clean legal record at this early stage. | Medium | SR017, SR014 |
| CV001 | EvolutionaryScale raised $142M in its Series A on September 26, 2024. | High | SV001, SV002, SV029 |
| CV002 | EvolutionaryScale's September 2024 Series A valued the company at approximately $1.35B post-money. | High | SV001, SV028, SV029 |
| CV003 | Amazon (AWS) and NVIDIA co-led EvolutionaryScale's Series A; Lux Capital, Nat Friedman, and Daniel Gross participated. | High | SV001, SV002 |
| CV004 | EvolutionaryScale has raised approximately $145M in total funding including seed capital as of May 2026. | Medium | SV001, SV028 |
| CV005 | EvolutionaryScale has not publicly disclosed any revenue, ARR, Forge customer count, or gross margin as of May 2026. | High | SV002, SV028 |
| CV006 | ESM3 was published in Science on January 16, 2025, documenting the generation of a novel fluorescent protein at an evolutionary distance equivalent to simulating 500 million years of evolution. | High | SV005, SV003 |
| CV007 | ESM3 was trained on 2.78 billion proteins and 771 billion tokens with over 1×10²⁴ FLOPs, described as the most compute ever applied to training a biological model. | High | SV003, SV005 |
| CV008 | The Forge commercial API platform provides fee-based access to ESM3 and ESM Cambrian models for pharmaceutical and biotech R&D users. | Medium | SV002, SV004 |
| CV009 | EvolutionaryScale distributes its ESM models via AWS SageMaker JumpStart and NVIDIA BioNeMo alongside the Forge API, providing direct access to global pharma R&D cloud infrastructure. | High | SV026, SV002 |
| CV010 | The ESM Cambrian model family has over 6,320 downloads on HuggingFace as of May 2026, indicating active developer community adoption. | Medium | SV033, SV004 |
| CV011 | Absci (NASDAQ:ABSI) had a market capitalization of approximately $800M as of May 2026. | High | SV006, SV009 |
| CV012 | Absci reported revenue of $2.8M for FY2025 (down from $4.5M in FY2024) and a net loss of $115.2M for FY2025. | High | SV009, SV006 |
| CV013 | Recursion Pharmaceuticals (NASDAQ:RXRX) had a market capitalization of approximately $1.555B as of May 2026. | High | SV007, SV010 |
| CV014 | Recursion reported Q1 2026 revenue of $6.47M, which fell short of analyst expectations, with cash of $665.2M providing runway into early 2028. | High | SV007, SV010 |
| CV015 | Recursion had an accumulated deficit of $2.1 billion as of December 31, 2025, reflecting the capital intensity of AI drug discovery platform development. | High | SV010, SV007 |
| CV016 | Schrodinger (NASDAQ:SDGR) had a market capitalization of approximately $893M as of May 2026, with a 52-week high of $27.63 and low of $10.94. | High | SV008, SV013 |
| CV017 | Generate Biomedicines has raised approximately $700M in total disclosed funding and operates a generative biology platform targeting protein therapeutics. | Medium | SV021, SV028 |
| CV018 | Profluent has raised approximately $44M in disclosed funding and introduced OpenCRISPR, described as the first AI-designed gene editor. | Medium | SV019, SV030 |
| CV019 | Cradle.bio has raised approximately $73M in total disclosed funding and serves top biopharma and industrial bio R&D teams for protein optimization. | Medium | SV020, SV030 |
| CV020 | Xaira Therapeutics launched in April 2024 with $1B in Series A funding, the largest-ever AI drug discovery Series A at the time. | Medium | SV025, SV032 |
| CV021 | Isomorphic Labs (Alphabet-backed) is developing AI drug discovery with Lilly and Novartis collaborations reportedly worth $3B+ in combined headline deal value. | Medium | SV023, SV029 |
| CV022 | AlphaFold 3 was released by Google DeepMind on May 8, 2024, predicting structure and interactions of all biomolecules; AlphaFold Server provides free access to 3M+ researchers across 190+ countries for non-commercial research. | High | SV027, SV032 |
| CV023 | The ESM2 predecessor protein language model is available as open-source software from Meta's facebookresearch/esm GitHub repository, providing free sequence-embedding capability comparable to lower-capability ESM3 tiers. | High | SV003, SV033 |
| CV024 | The global AI in drug discovery market is estimated at $2.35B in 2025, projected to grow to $13.77B by 2033 at a CAGR of 24.8%. | Medium | SV032, SV016 |
| CV025 | The global protein engineering market is estimated at $5.09B in 2025, projected to grow to $23.59B by 2035 at a CAGR of 16.57%. | Medium | SV015, SV017 |
| CV026 | US VC deal value reached $74.6B across 2,859 deals in Q4 2024, the highest since Q2 2022, driven primarily by AI investment including five companies raising $4B+ rounds. | High | SV014, SV032 |
| CV027 | KPMG's Venture Pulse Q4 2024 warned that VC investors are becoming more discerning as to who the winners may be in the AI space and will favor companies with credible commercial models over AI-wrapper businesses. | Medium | SV014 |
| CV028 | No EvolutionaryScale Form D securities filing was identified in SEC EDGAR's full-text search database as of May 2026, consistent with private status and possible Regulation D without full disclosure. | Medium | SV031, SV011 |
| CV029 | All four co-founders of EvolutionaryScale (Rives, Sercu, Lin, Candido) joined from Meta AI FAIR, representing correlated key-person concentration risk. | High | SV002, SV028 |
| CV030 | EvolutionaryScale's $1.35B post-money valuation at its pre-revenue Series A is approximately 9.5x the $142M raised in the round, a post-money-to-raised multiple substantially above historical norms for pre-revenue biotech Series A rounds. | Medium | SV001, SV028, SV029 |
| CV031 | The bull case for EvolutionaryScale is $3–5B, contingent on Forge achieving $50–100M ARR by 2027 through multi-pharma contracts and AWS/NVIDIA channel scale. | Low | SV001, SV032, SV028 |
| CV032 | The base case for EvolutionaryScale is $1.5–2.5B, reflecting a modest step-up from Series A entry if commercial ramp is slow ($10–25M ARR) and AWS/NVIDIA partnerships drive most revenue. | Medium | SV001, SV014, SV028 |
| CV033 | The bear case for EvolutionaryScale is $400M–800M, reflecting commoditization risk from open-source ESM2 and free AlphaFold 3, possible key-person departure, or acqui-hire by Amazon at a down-round price. | Medium | SV027, SV023, SV014 |
| CV034 | Amazon (AWS) and NVIDIA's co-investment creates a structural distribution advantage: both partners have a commercial incentive to route pharma API traffic through Forge via their respective cloud platforms. | Medium | SV026, SV002, SV009 |
| CV035 | Insilico Medicine completed an HKEX IPO raising approximately $293M in late 2025 (SEHK:3696), with a prior Series E valuation of ~$2.3B, providing a precedent for AI drug discovery company public market exits. | Medium | SV022, SV032 |
| CV036 | EvolutionaryScale's founders created the original ESM protein language model family at Meta AI FAIR, establishing unique domain authority and an institutional knowledge base not replicable at competing protein AI startups. | High | SV003, SV005, SV002 |
| CV037 | EvolutionaryScale's responsible development framework acknowledges dual-use biosecurity risks of ESM3; no public disclosure of specific customer screening protocols or DURC compliance procedures has been made. | Medium | SV002, SV003 |
| CV038 | Recursion's FY2025 10-K disclosed that three partners represented 95% of total partner program revenue, illustrating extreme customer concentration risk inherent in AI drug discovery platform businesses. | High | SV010, SV007 |
| CV039 | The drug discovery market is estimated at $71.89B in 2025, growing to $158.74B by 2034 at a 9.2% CAGR, representing the broader TAM for AI tools improving pharmaceutical R&D efficiency. | Medium | SV016, SV032 |
| CV040 | Recursion (RXRX) had a 52-week share price range of $2.80–$7.18 and Schrodinger (SDGR) a range of $10.94–$27.63 as of May 2026, evidencing high share-price volatility and multiple compression in public AI drug discovery comps. | High | SV007, SV008 |
| CV041 | Schrodinger (SDGR) most recently filed a 10-K on February 25, 2026 (for FY2025), confirming ongoing SEC reporting status and a current market cap of approximately $893M. | High | SV013, SV008 |
| CV042 | ESM3 was trained on biological data spanning diverse environments including the Amazon rainforest, hydrothermal vents, and soil microbiomes, representing one of the most comprehensive biological training datasets compiled by any private company. | Medium | SV003, SV005 |
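
Several valuation claims above rest on simple arithmetic: the 9.5x post-money-to-raised multiple in CV030 and the market CAGRs in CV024, CV025, and CV039. The sketch below recomputes them from the figures quoted in those rows. The helper name `implied_cagr` and the number of compounding periods (taking 2025 as the base year in each case) are assumptions made for illustration, not figures stated by the cited sources.

```python
# Sanity-check arithmetic behind claims CV024, CV025, CV030, and CV039.
# All inputs are figures quoted in the claims table; the compounding
# period counts are assumptions (base year taken as 2025).

def implied_cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate implied by two endpoint values."""
    return (end / start) ** (1 / years) - 1

# CV030: post-money valuation divided by Series A proceeds (USD millions).
post_money_to_raised = 1350 / 142

# (claim ID, start USD B, end USD B, assumed periods, stated CAGR)
market_claims = [
    ("CV024", 2.35, 13.77, 2033 - 2025, 0.248),
    ("CV025", 5.09, 23.59, 2035 - 2025, 0.1657),
    ("CV039", 71.89, 158.74, 2034 - 2025, 0.092),
]

# Difference between the endpoint-implied CAGR and the stated CAGR.
cagr_gaps = {
    claim_id: implied_cagr(start, end, years) - stated
    for claim_id, start, end, years, stated in market_claims
}

if __name__ == "__main__":
    print(f"CV030 multiple: {post_money_to_raised:.2f}x")
    for claim_id, gap in cagr_gaps.items():
        print(f"{claim_id}: implied minus stated CAGR = {gap:+.4f}")
```

Under these assumptions, each stated CAGR sits within a fraction of a percentage point of the rate implied by its endpoints. A larger gap would suggest the source used a different base year or rounding convention rather than an error in the claim itself.
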
| ID | Publisher | Title | Quote |
|---|---|---|---|
| SO001 | EvolutionaryScale | EvolutionaryScale Official Homepage | We are building the foundation for a new era of programmable biology — from foundational models for protein sequences to tools that let scientists design and understand proteins. |
| SO002 | EvolutionaryScale | ESM3 Release Blog Post | We are releasing ESM3, a generative multimodal model for protein design. ESM3 reasons over the sequence, structure, and function of proteins. |
| SO003 | EvolutionaryScale | ESM Cambrian Launch Blog Post | We are releasing ESM Cambrian, a new family of protein language models available in 300M, 600M, and 6B parameter sizes. |
| SO004 | Crunchbase | EvolutionaryScale on Crunchbase | EvolutionaryScale raised a total of $142M in funding over 2 rounds. Their latest funding was raised on Sep 26, 2024 from a Series A round. |
| SO005 | Hugging Face | EvolutionaryScale Organization on Hugging Face | EvolutionaryScale organization on Hugging Face; hosts ESM3 and ESM Cambrian model weights and model cards. |
| SO006 | EvolutionaryScale | EvolutionaryScale GitHub Organization | EvolutionaryScale on GitHub: 9 repositories including esm, DeepEP, nccl, and transformers forks. |
| SO007 | EvolutionaryScale / CZ Biohub | ESM Repository on GitHub | ESM: Evolutionary Scale Modeling — official model weights and inference code; repository redirected to biohub organization following acquisition. |
| SO008 | EvolutionaryScale | DeepEP Repository on GitHub | DeepEP: An efficient expert-parallel communication library optimized for mixture-of-experts models and inference. |
| SO009 | Science (AAAS) | Simulating 500 million years of evolution with a language model | We describe ESM3, a generative multimodal model that reasons over the sequences, structures, and functions of proteins. ESM3 was found to generate a new fluorescent protein distant from known sequences. |
| SO010 | bioRxiv (Cold Spring Harbor Laboratory) | ESM3: Simulating 500 million years of evolution with a language model (preprint) | ESM3: Simulating 500 million years of evolution — BioRxiv preprint doi: 10.1101/2024.07.01.600583; authors include Rives, Sercu, Candido, Lin. |
| SO011 | U.S. Securities and Exchange Commission | SEC EDGAR Full-Text Search: Form D filings for EvolutionaryScale 2024-2026 | 0 results found for EvolutionaryScale in Form D filings from January 2024 through May 2026. |
| SO012 | U.S. Securities and Exchange Commission | SEC EDGAR Company Search: EvolutionaryScale Form D | No companies found matching evolutionaryscale for Form D filings in SEC EDGAR. |
| SO013 | LinkedIn | EvolutionaryScale on LinkedIn | EvolutionaryScale — Company size: 11-50 employees — Industry: Biotechnology Research — team has joined CZ Biohub. |
| SO014 | CZ Biohub / Chan Zuckerberg Initiative | CZ Biohub: Frontier AI for Biology Initiative | We are thrilled to welcome the EvolutionaryScale team to the CZ Biohub Network. Alex Rives will serve as Head of Science at CZI, working to advance open biological science. |
| SO015 | Bloomberg | EvolutionaryScale Raises $142M from Amazon, NVIDIA | EvolutionaryScale Inc. raised $142 million from Amazon.com Inc. and Nvidia Corp. in a new financing round. Full article paywalled; detailed terms and investor rights not accessible for diligence. |
| SO016 | Reuters | EvolutionaryScale raises $142M for AI protein design | Reuters article confirmed as broken or inaccessible (401 JS-only response); content unavailable. |
| SO017 | NVIDIA | NVIDIA Blog: EvolutionaryScale ESM3 on BioNeMo and H100 NIM | EvolutionaryScale and NVIDIA partner to deploy ESM3 as a NIM microservice on H100 infrastructure. Tom Sercu, co-founder and VP of engineering, described the partnership. NVIDIA participated in both the seed and Series A rounds. |
| SO018 | Hacker News (Algolia API) | Hacker News Stories About EvolutionaryScale | Top HN stories include: ESM3 Simulating 500 million years of evolution (2024, ~500 points); EvolutionaryScale raises $142M Series A; EvolutionaryScale Acquired by CZI (Nov 2025, story 45838940). |
| SO019 | NVIDIA | NVIDIA NGC Catalog: ESM3 by EvolutionaryScale | ESM3 listed in NVIDIA NGC catalog under Clara team; page rendered as JS-only SPA; existence confirmed but detailed content not accessible. |
| SO020 | Semantic Scholar (Allen Institute for AI) | Semantic Scholar API: ESM3 / EvolutionaryScale paper search | Semantic Scholar API returns multiple papers related to protein language models and ESM3; provides citation count proxy and publication network for ESM family research. |
| SO021 | Wikimedia Foundation | Wikipedia: EvolutionaryScale (page not found) | Wikipedia page for EvolutionaryScale does not exist; URL returns HTTP 404 Not Found. No Wikipedia article has been created for the company as of May 2026. |
| SO022 | NVIDIA | NVIDIA Clara BioNeMo Platform | NVIDIA BioNeMo is a cloud platform for generative AI drug discovery; features ESM3 integration for protein sequence and structure generation. |
| SO023 | NVIDIA | NVIDIA News: NVIDIA Joins Seed Investment in EvolutionaryScale | NVIDIA News URL for seed investment announcement returns news archive page rather than the specific article; original press release content is inaccessible. |
| SO024 | EvolutionaryScale | EvolutionaryScale Forge API Platform | Forge API platform is a JavaScript-rendered SPA; no textual content accessible; existence confirmed but operational status post-acquisition is unknown. |
| SO025 | GlobeNewswire (expected: EvolutionaryScale) | GlobeNewswire: EvolutionaryScale Series A press release (expected) | URL returned content for a different company (Banzai International press release); EvolutionaryScale Series A press release was not accessible at this URL. |
| SO026 | Hugging Face / EvolutionaryScale | HuggingFace: ESM3-sm-open-v1 model card | ESM3-sm-open-v1 on HuggingFace: 3,110+ downloads; open model for non-commercial academic use; model card describes sequence, structure, and function inputs. |
| SO027 | Axios | Axios: EvolutionaryScale Series A funding protein AI | Axios article on EvolutionaryScale Series A was rate-limited during fetch; content not retrieved; URL confirms coverage of the funding round. |
| SM001 | MarketsandMarkets | Protein Engineering Market Size, Share & Trends Analysis Report — MarketsandMarkets | The protein engineering market size is projected to grow from USD 2.2 billion in 2019 to USD 3.9 billion by 2024, at a CAGR of 12.4%. |
| SM002 | Precedence Research | Protein Engineering Market Size, Growth & Forecast 2025–2035 — Precedence Research | The global protein engineering market size was estimated at USD 5.09 billion in 2025 and is expected to reach around USD 23.59 billion by 2035, growing at a CAGR of 16.57%. |
| SM003 | Allied Market Research | Protein Engineering Market by Type, Application, and Region — Allied Market Research | The global protein engineering market size was valued at $2.2 billion in 2022, and is projected to reach $7.7 billion by 2032, growing at a CAGR of 13.2% from 2023 to 2032. |
| SM004 | Grand View Research | Protein Engineering Market Size, Share & Trends Analysis — Grand View Research | The global protein engineering market size was valued at USD 2.60 billion in 2023 and is expected to grow at a CAGR of 16.24% from 2024 to 2030. |
| SM005 | Grand View Research | Artificial Intelligence In Drug Discovery Market — Grand View Research | The global artificial intelligence in drug discovery market was valued at USD 2.35 billion in 2025 and is expected to reach USD 13.77 billion by 2033 at a CAGR of 24.8%. |
| SM006 | Precedence Research | Drug Discovery Market Size, Share & Trends 2025–2034 — Precedence Research | The global drug discovery market size was estimated at USD 71.89 billion in 2025 and is expected to reach around USD 158.74 billion by 2034, growing at a CAGR of 9.2%. |
| SM007 | Google DeepMind | AlphaFold: Protein Structure Database — Google DeepMind | The AlphaFold Protein Structure Database provides open access to over 200 million protein structure predictions covering nearly all known proteins. |
| SM008 | NVIDIA | NVIDIA AI for Healthcare and Life Sciences — NVIDIA BioNeMo | NVIDIA BioNeMo is the development platform for AI-driven biology and drug discovery. 2x faster biofoundation model training. 6x faster model inference. |
| SM009 | Amazon Web Services | Amazon SageMaker JumpStart — AWS | |
| SM010 | U.S. Food and Drug Administration (FDA) | Artificial Intelligence in Drug Development — FDA | FDA has received over 500 AI/ML-enabled drug development submissions since 2016. The CDER AI Council was established in 2024. |
| SM011 | EvolutionaryScale | Simulating 500 million years of evolution with a language model — ESM3 Blog | ESM3 is a frontier multimodal generative model for biology. We are releasing ESM3 as open for academic and non-commercial use. For commercial access to ESM3, we are launching the Forge API. |
| SM012 | EvolutionaryScale | ESM Cambrian: Building the Frontier of Protein Language Models — Blog | ESM C is now available on AWS SageMaker JumpStart and NVIDIA BioNeMo. ESM C is released under the MIT license for any use, including commercial applications. |
| SM013 | Science (AAAS) | Simulating 500 million years of evolution with a language model — Science | ESM3 is a frontier multimodal generative model for biology that reasons over the sequence, structure, and function of proteins simultaneously, trained on sequences of 2.78 billion proteins. |
| SM014 | bioRxiv (Cold Spring Harbor Laboratory) | Search results: evolutionaryscale ESM3 — bioRxiv | |
| SM015 | bioRxiv (Cold Spring Harbor Laboratory) | Simulating 500 million years of evolution with a language model — bioRxiv preprint | We have developed ESM3, a frontier multimodal generative model for biology trained at the scale of evolution. |
| SM016 | IQVIA | IQVIA — Healthcare and Life Science Analytics | |
| SM017 | EvolutionaryScale (GitHub) | evolutionaryscale/esm — GitHub repository | |
| SM018 | Hugging Face | evolutionaryscale — Hugging Face organization page | esm3-sm-open-v1: 3,110 downloads. esmc-600m-2024-12: 6,320 downloads. |
| SM019 | Amazon Web Services | AWS for Health — Genomics and Life Sciences | |
| SM020 | Statista | Pharmaceutical industry research and development expenditure worldwide 2008–2024 | |
| SM021 | Fortune Business Insights | Artificial Intelligence In Drug Discovery Market Size & Forecast | |
| SM022 | Crunchbase | EvolutionaryScale — Crunchbase company profile | Total funding: $142M. Most recent funding: Series A. |
| SM023 | National Human Genome Research Institute (NHGRI) | DNA Sequencing Costs: Data — National Human Genome Research Institute | The cost per raw megabase of DNA sequence dropped dramatically from ~$10,000 in 2001 to less than $0.01 by 2023, reaching approximately $100 per genome. |
| SM024 | Lux Capital | EvolutionaryScale — Lux Capital portfolio | ESM3 is the first generative AI model for biology that simultaneously reasons over the sequence, structure, and function of proteins. |
| SM025 | PyPI | esm — Python Package Index | This repository contains flagship protein models for EvolutionaryScale, as well as access to the API. ESM3 is our flagship multimodal protein generative model. |
| SP001 | Profluent Bio | Profluent — AI-Designed Proteins | The world's first AI-designed gene editor, demonstrating authorship in action. |
| SP002 | Generate Biomedicines | Generate Biomedicines — Generative Biology | "42,000 proteins generated, built, and tested – and we're just getting started." |
| SP003 | Generate Biomedicines | The Generate Platform | "GB-0895 has the potential to shift treatment from monthly to just twice per year." |
| SP004 | AbSci Corporation | AbSci — Unlocking Novel Biology with AI | "De novo design of biologics; Multi-parametric lead optimization; Data to train, AI to create, and wet lab to validate with 6 week cycle times" |
| SP005 | Cradle | Cradle — AI Protein Engineering Platform | "Teams that use Cradle report 2-12x faster development timelines." |
| SP006 | Google DeepMind | AlphaFold — AI for Protein Structure | "Demis Hassabis and John Jumper are co-awarded the Nobel Prize in Chemistry for their work on AlphaFold, alongside David Baker for his work on computational protein design." |
| SP007 | Adaptyv Bio | Adaptyv Bio — Cloud Lab for Protein Designers | |
| SP008 | Isomorphic Labs | Isomorphic Labs — Reimagining Drug Discovery with AI | "Isomorphic Labs is here to advance human health by building on and beyond the Nobel-winning AlphaFold system." |
| SP009 | Chai Discovery | Chai Discovery — Drug-Like Antibody Design | "Drug-like antibody design against challenging targets with atomic precision" |
| SP010 | Institute for Protein Design (University of Washington) | Institute for Protein Design — We Create New Proteins | "We create new proteins that solve challenges in medicine, technology, and sustainability." |
| SP011 | Recursion Pharmaceuticals | Recursion — Pioneering AI Drug Discovery | "Over 50 petabytes spanning phenomics, transcriptomics, proteomics, ADME, and de-identified patient data." |
| SP012 | Recursion Pharmaceuticals | Recursion Drug Discovery Pipeline | |
| SP013 | Schrödinger | Schrödinger — Physics-Based Software Platform for Molecular Discovery | "Built upon more than 30 years of R&D, our industry-leading computational platform is transforming the way therapeutics and materials are discovered." |
| SP014 | Schrödinger | Schrödinger Computational Platform for Molecular Discovery & Design | |
| SP015 | Inceptive | Inceptive — Foundation Models of Life, for Life | "We build end-to-end foundation models that learn to design molecules directly from diverse observations of life. We specialize in sequence-based medicines like mRNA, siRNA, ASOs, and peptides." |
| SP016 | Iambic Therapeutics | Iambic Therapeutics — Better Technology for Better Medicines | "IAM1363 for HER2: Highly selective, brain-penetrant inhibitor for HER2-driven cancers that has shown anti-tumor activity, safety and tolerability in Phase 1b studies" |
| SP017 | Meta AI (Facebook Research) | GitHub: facebookresearch/esm — Evolutionary Scale Modeling (esm) | "This repository contains code and pre-trained weights for Transformer protein language models from the Meta Fundamental AI Research Protein Team (FAIR), including our state-of-the-art ESM-2 and ESMFold." |
| SP018 | Meta AI (AI at Meta) | facebook/esm2_t33_650M_UR50D · Hugging Face | License: MIT |
| SP019 | OpenFold Consortium | OpenFold Consortium — Open Ecosystem for AI Biology | "Our goal is to develop an open ecosystem of accelerated AI for Biology tools in order to catalyze innovation, starting with state-of-the-art and permissively licensed protein structure prediction training and inference pipelines and models." |
| SP020 | Meta AI | ESM Metagenomic Atlas: The First View of the 'Dark Matter' of the Protein Universe | "We found that using a language model of protein sequences greatly accelerates the speed of structure prediction (up to 60x)." |
| SP021 | Nature | De novo design of protein structure and function with RFdiffusion | |
| SP022 | Science (AAAS) | Simulating 500 million years of evolution with a language model | |
| SP023 | EvolutionaryScale | EvolutionaryScale on HuggingFace — ESM3 and ESMC Model Families | |
| SP024 | EMBL-EBI / Google DeepMind | AlphaFold Protein Structure Database | "AlphaFold DB provides open access to over 200 million protein structure predictions to accelerate scientific research." |
| SP025 | EvolutionaryScale | ESM3 — A Frontier Language Model for Biology (EvolutionaryScale Blog) | "ESM3 represents a milestone model in the ESM family—the first created by our team at EvolutionaryScale, an order of magnitude larger than our previous model ESM2, and natively multimodal and generative." |
| SP026 | U.S. Securities and Exchange Commission (EDGAR) | Absci Corp (ABSI) 10-K Filing — FY2025 (Period: 2025-12-31; Filed: 2026-03-24) | |
| SP027 | Insilico Medicine | Insilico Medicine — Generative AI Software for Drug Discovery | |
| SP028 | Xaira Therapeutics | Xaira Therapeutics — AI Drug Discovery | "We are building predictive and agentic AI models across the complete spectrum of the drug discovery and development process." |
| SP029 | Wikipedia | AlphaFold — Wikipedia | |
| SI001 | EvolutionaryScale | EvolutionaryScale homepage — joining forces with Biohub | |
| SI002 | EvolutionaryScale | ESM3: A new paradigm for protein language models — ESM3 release blog | |
| SI003 | EvolutionaryScale | ESM Cambrian blog — commercial model release January 2025 | |
| SI004 | EvolutionaryScale | Forge API product page — commercial protein AI API | |
| SI005 | U.S. Securities and Exchange Commission | SEC EDGAR full-text search — Form D filings for 'EvolutionaryScale' (0 results) | |
| SI006 | U.S. Securities and Exchange Commission | SEC EDGAR full-text search — Form D filings for 'Evolutionary Scale' (0 results) | |
| SI007 | U.S. Securities and Exchange Commission | SEC EDGAR company browse — Form D filings for 'evolutionary scale' (0 results) | |
| SI008 | U.S. Securities and Exchange Commission | SEC EDGAR company browse — Form D filings for 'evolutionaryscale' (0 results) | |
| SI009 | CNBC | EvolutionaryScale raises $142 million from Amazon, Nvidia for protein AI | |
| SI010 | Axios | EvolutionaryScale Series A funding — protein AI $142 million Amazon NVIDIA | |
| SI011 | MIT Technology Review | EvolutionaryScale raises $142 million for protein AI from Amazon, Nvidia | |
| SI012 | CNBC | Chan Zuckerberg Initiative Biohub joins with EvolutionaryScale team | |
| SI013 | CZI Biohub | Frontier AI for Biology Initiative — EvolutionaryScale team joins Biohub | |
| SI014 | Crunchbase | EvolutionaryScale company profile — funding, investors, products | |
| SI015 | PitchBook | EvolutionaryScale company profile — funding history | |
| SI016 | NVIDIA | NVIDIA joins seed investment in EvolutionaryScale | |
| SI017 | NVIDIA | NVIDIA partners with EvolutionaryScale — ESM3 on BioNeMo | |
| SI018 | NVIDIA | EvolutionaryScale debuts with ESM3 generative AI model on BioNeMo and H100 | |
| SI019 | NVIDIA NGC Catalog | ESM3 model on NVIDIA NGC Catalog — Clara / BioNeMo resource | |
| SI020 | Lux Capital | Lux Capital — EvolutionaryScale Series A portfolio announcement | |
| SI021 | GitHub | EvolutionaryScale GitHub organization — repos and activity | |
| SI022 | GitHub | evolutionaryscale/esm — ESM model repository | |
| SI023 | Hugging Face | EvolutionaryScale/esm3-sm-open-v1 — open-weight ESM3 on HuggingFace | |
| SI024 | Bloomberg | EvolutionaryScale raises $142 million from Amazon, Nvidia (Bloomberg; access blocked) | |
| SI025 | Wikipedia | EvolutionaryScale — Wikipedia article | |
| SI026 | Hacker News (Algolia API) | Hacker News search — EvolutionaryScale funding Series A discussions | |
| SE001 | EvolutionaryScale | Simulating 500 million years of evolution with a language model | ESM3 is a generative model that reasons over the sequence, structure and function of proteins simultaneously. We trained ESM3 on an enormous scale: 771B tokens and 1.07×10^24 FLOPs. |
| SE002 | EvolutionaryScale | ESM Cambrian: New foundational protein language models | ESM Cambrian (ESMC) introduces a new family of protein language models with 300M, 600M, and 6B parameter sizes. |
| SE003 | EvolutionaryScale | EvolutionaryScale — Homepage | |
| SE004 | EvolutionaryScale | Forge — EvolutionaryScale API Platform | |
| SE005 | American Association for the Advancement of Science (AAAS) | Simulating 500 million years of evolution with a language model (Science, Vol 387) | Hayes et al. Science Vol 387 Issue 6736 pp. 850-858 (January 16, 2025); DOI 10.1126/science.ads0018 |
| SE006 | bioRxiv / Cold Spring Harbor Laboratory | Simulating 500M years of evolution with a language model (ESM3 preprint) | esmGFP is 58% sequence identical to the nearest natural GFP and has 96 mutations out of 229 total amino acid positions. |
| SE007 | bioRxiv / Cold Spring Harbor Laboratory | More Structure, Less Accuracy: ESM3's Binding Prediction Paradox | When distinct relaxed mutant structures are used per variant (rather than a single consistent backbone), ESM3's binding prediction performance deteriorates — a counter-intuitive result suggesting that more structural information can reduce accuracy. |
| SE008 | bioRxiv / Cold Spring Harbor Laboratory | bioRxiv search — EvolutionaryScale ESM3 citing papers | |
| SE009 | EvolutionaryScale | GitHub — evolutionaryscale/esm (official Python client and model weights) | |
| SE010 | EvolutionaryScale | GitHub — evolutionaryscale organization page | |
| SE011 | EvolutionaryScale | GitHub — evolutionaryscale/DeepEP (MoE Expert Parallelism library) | |
| SE012 | GitHub / EvolutionaryScale | GitHub API — evolutionaryscale/esm repository metadata | |
| SE013 | HuggingFace / EvolutionaryScale | HuggingFace — EvolutionaryScale organization page | |
| SE014 | HuggingFace / EvolutionaryScale | HuggingFace model card — esm3-sm-open-v1 | 3,105 downloads in the last month; 291 likes; license: Cambrian Non-Commercial License Agreement |
| SE015 | HuggingFace / EvolutionaryScale | HuggingFace model card — esmc-600m-2024-12 | |
| SE016 | NVIDIA | NVIDIA Clara BioNeMo — Platform for generative AI in drug discovery | |
| SE017 | NVIDIA | EvolutionaryScale Debuts With ESM3 Generative AI Model for Protein Design | ESM3 was trained using the Andromeda cluster, which uses NVIDIA H100 GPUs and NVIDIA Quantum-2 InfiniBand networking. ESM3 uses roughly 25x more flops and 60x more data than its predecessor, ESM2. |
| SE018 | NVIDIA | NVIDIA NGC Catalog — ESM3 (Clara resource) | |
| SE019 | Amazon Web Services | Amazon SageMaker JumpStart — Foundation models and ML solutions | |
| SE020 | Crunchbase | EvolutionaryScale — Crunchbase organization profile | |
| SE021 | U.S. Securities and Exchange Commission (SEC) | SEC EDGAR EFTS — Form D search for EvolutionaryScale | |
| SE022 | Hacker News (Algolia) | Hacker News search — EvolutionaryScale community discussion | |
| SE023 | CZI Biohub | Biohub launches initiative combining frontier AI & frontier biology | The team at EvolutionaryScale, a frontier AI research lab and public benefit company that has created groundbreaking, large-scale AI systems for the life sciences, will join Biohub to help advance this initiative. Alex Rives, EvolutionaryScale's co-founder and chief scientist, will serve as head of science. |
| SE024 | Lux Capital | EvolutionaryScale Series A — Lux Capital investment update | |
| SE025 | Axios | EvolutionaryScale raises $142 million Series A — Axios Pro Health Tech | |
| SE026 | Semantic Scholar / Allen Institute for AI | Semantic Scholar paper search — ESM3 EvolutionaryScale citations | |
| SE027 | DeepMind / Google | AlphaFold — DeepMind protein structure prediction | |
| SE028 | Chai Discovery | Chai Discovery — protein complex structure prediction | |
| SE029 | Semantic Scholar / Allen Institute for AI | Semantic Scholar paper search — ESM3 protein language model evaluation benchmark limitation | |
| SU001 | EvolutionaryScale | EvolutionaryScale — Company Homepage | ESM3 is a family of models in three sizes: small, medium, and large, available through our API and our partner's platforms. |
| SU002 | EvolutionaryScale | ESM3: A Frontier Language Model for Biology (Blog) | We're opening our API for biological intelligence, now in public beta, allowing scientists in academia and industry a free limited time preview of the capabilities of some of our models through Forge. |
| SU003 | EvolutionaryScale | ESM Cambrian: Revealing the Mysteries of Proteins with Unsupervised Learning (Blog) | ESM C 6B is available on Forge for academic use, and AWS Sagemaker for commercial use. ESM C will also be available on NVIDIA BioNemo soon. |
| SU004 | EvolutionaryScale (GitHub) | evolutionaryscale/esm — GitHub Repository (README) | ESM C models are also available on Amazon SageMaker under the Cambrian Inference Clickthrough License Agreement. Under this license agreement, models are available for broad use for commercial entities. |
| SU005 | EvolutionaryScale (GitHub) | evolutionaryscale — GitHub Organization Page | esm-partner — Repository for partner collaborations |
| SU006 | HuggingFace | EvolutionaryScale — HuggingFace Organization Page | biohub/esm3-sm-open-v1 Updated Jan 29, 2025 • 3.11k • 291 ... biohub/esmc-300m-2024-12 Updated 2 days ago • 6.32k • 30 ... biohub/esmc-600m-2024-12 Updated 2 days ago • 1.49k • 32 |
| SU007 | NVIDIA | NVIDIA BioNeMo — AI Platforms for Healthcare and Life Sciences | BioNeMo — NVIDIA BioNeMo™ is the development platform for AI-driven biology and drug discovery. Use Cases: biofoundation model building, molecular design, virtual screening, protein structure prediction, protein binder design |
| SU008 | Adaptyv Bio | Adaptyv Bio — Company Website | |
| SU009 | Semantic Scholar (Allen Institute for AI) | Semantic Scholar API — ESM3 EvolutionaryScale Downstream Paper Search | API search returned a total of 32 results (total: 32). |
| SU010 | bioRxiv (Cold Spring Harbor Laboratory) | bioRxiv Search — evolutionaryscale ESM3 Preprint Results | 129 Results for term 'evolutionaryscale ESM3' |
| SU011 | BusinessWire (Berkshire Hathaway) | EvolutionaryScale Raises $142M Series A to Advance Protein Language Models | |
| SU012 | CNBC | EvolutionaryScale Raises $142 Million, Protein AI Amazon NVIDIA | |
| SU013 | GlobeNewsWire | EvolutionaryScale Raises $142M Series A | |
| SU014 | Crunchbase | EvolutionaryScale — Crunchbase Organization Profile | EvolutionaryScale secured $142 million in a seed investment round. The funding was backed by Amazon and Nvidia and is intended for the development of protein-generating AI. |
| SU015 | Science (AAAS) | Simulating 500 million years of evolution with a language model | |
| SU016 | NVIDIA Newsroom | NVIDIA Joins Seed Investment in EvolutionaryScale | |
| SU017 | Lux Capital | EvolutionaryScale Series A — Lux Capital Blog | |
| SU018 | Wikipedia | EvolutionaryScale — Wikipedia | |
| SU019 | Wikipedia | Generate Biomedicines — Wikipedia | |
| SU020 | Precedence Research | Drug Discovery Market Size, Share & Trends 2025–2034 | The global drug discovery market size is valued at USD 71.89 billion in 2025 and is predicted to increase from USD 78.51 billion in 2026 to approximately USD 158.74 billion by 2034, expanding at a CAGR of 9.20% from 2025 to 2034. |
| SU021 | Isomorphic Labs | Isomorphic Labs — Company Homepage | |
| SU022 | Generate Biomedicines | Generate Biomedicines — Company Homepage | 42,000 proteins generated, built, and tested |
| SU023 | Axios | EvolutionaryScale Series A Funding — Axios | |
| SU024 | TechCrunch | EvolutionaryScale — TechCrunch Tag Page | |
| SU025 | Amazon Web Services | EvolutionaryScale ESM-C — AWS Marketplace Product Listing | |
| SU026 | Amazon Web Services | Revolutionizing Drug Discovery with AI: A Spotlight on EvolutionaryScale (AWS Industries Blog) | |
| SU027 | NVIDIA Developer | EvolutionaryScale ESM3 on NVIDIA (Developer Blog) | |
| SU028 | Nuclear Threat Initiative (NTI) | NTI Biosecurity — Program Overview | |
| SU029 | Center for AI Safety | Statement on AI Risk — safe.ai | |
| SR001 | U.S. Government Publishing Office / Federal Register | Executive Order 14110: Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence | addressing AI systems' most pressing security risks — including with respect to biotechnology, cybersecurity, critical infrastructure, and other national security dangers |
| SR002 | National Institute of Standards and Technology (NIST) | AI Risk Management Framework (AI RMF) — ITL AI Program | NIST released NIST-AI-600-1, Artificial Intelligence Risk Management Framework: Generative Artificial Intelligence Profile |
| SR003 | U.S. Food and Drug Administration | Artificial Intelligence-Enabled Medical Devices | |
| SR004 | arXiv / MIT (Sandbrink, Shulman) | Can large language models democratize access to dual-use biotechnology? | In one hour, the chatbots suggested four potential pandemic pathogens, explained how they can be generated from synthetic DNA using reverse genetics, supplied the names of DNA synthesis companies unlikely to screen orders, identified detailed protocols |
| SR005 | Center for AI Safety (CAIS) | Statement on AI Risk | Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war. |
| SR006 | Wikipedia | Biological Weapons Convention | As of May 2025, 189 states have become party to the treaty. The convention's effectiveness has been limited due to insufficient institutional support and the absence of any formal verification regime to monitor compliance. |
| SR007 | Nuclear Threat Initiative (NTI) | Biosecurity — NTI | These technologies also introduce risks of accidental misuse and deliberate exploitation, which could result in a biological catastrophe with grave consequences. |
| SR008 | Center for Security and Emerging Technology (CSET, Georgetown) | Biosecurity and Innovation in the Age of AI: Safeguarding the Future of U.S. Biotechnology | |
| SR009 | Nature (Baker Lab / University of Washington) | De novo design of protein structure and function with RFdiffusion | RFdiffusion enables the design of diverse functional proteins from simple molecular specifications |
| SR010 | Meta AI (facebookresearch) | facebookresearch/esm: Evolutionary Scale Modeling — Pretrained language models for proteins | This repository contains code and pre-trained weights for Transformer protein language models from the Meta Fundamental AI Research Protein Team (FAIR) |
| SR011 | EMBL-EBI / Google DeepMind | AlphaFold Protein Structure Database | AlphaFold DB provides open access to over 200 million protein structure predictions to accelerate scientific research |
| SR012 | AQ Laboratory (Columbia/UCSF) | aqlaboratory/openfold: Trainable, memory-efficient PyTorch reproduction of AlphaFold 2 | |
| SR013 | NVIDIA | NVIDIA BioNeMo — AI Development Platform for Biology and Drug Discovery | NVIDIA BioNeMo is the development platform for AI-driven biology and drug discovery |
| SR014 | EvolutionaryScale | ESM3: A New Era for Protein Design (Launch Blog — Responsible Development) | We have created a Responsible Development Framework to guide our work towards our mission with transparency and clarity |
| SR015 | EvolutionaryScale | ESM Cambrian: Representation learning for protein language models | ESM C was reviewed by a committee of scientific experts who concluded that the benefits of releasing the models greatly outweigh any potential risks |
| SR016 | EvolutionaryScale | evolutionaryscale/esm — ESM protein models and Forge API access | |
| SR017 | Crunchbase | EvolutionaryScale — Crunchbase Company Profile | |
| SR018 | HuggingFace | EvolutionaryScale — HuggingFace Organization Page | |
| SR019 | Bloomberg | EvolutionaryScale Raises $142 Million for Protein AI | |
| SR020 | EUR-Lex (European Union) | Regulation (EU) 2024/1689 — The Artificial Intelligence Act | laying down harmonised rules on artificial intelligence ... to promote the uptake of human centric and trustworthy artificial intelligence |
| SR021 | artificialintelligenceact.eu | The Act Texts — EU Artificial Intelligence Act | |
| SR022 | Wikipedia | AlphaFold | AlphaFold 2's results at CASP14 were described as 'astounding' and 'transformational'. As of November 2025, the paper had been cited nearly 43,000 times. |
| SR023 | Chai Discovery | Introducing Chai-1: A Multi-Modal Foundation Model for Molecular Structure Prediction | We tested Chai-1 across a large number of benchmarks, and found that the model achieves a 77% success rate on the PoseBusters benchmark (vs. 76% by AlphaFold3), as well as an Cα LDDT of 0.849 on the CASP15 protein monomer structure prediction set (vs. 0.801 by ESM3-98B) |
| SR024 | Meta AI | ESM Metagenomic Atlas: The first view of the 'dark matter' of the protein universe | |
| SR025 | Johns Hopkins Center for Health Security | Center for Health Security — Mission and Research Focus | We advance policies and practice addressing diverse challenges, including... the potential for biological accidents or intentional threats |
| SR026 | Amazon Web Services | Amazon SageMaker JumpStart | |
| SR027 | Wikipedia | Asilomar Conference on Recombinant DNA | A group of about 140 professionals participated in the conference to draw up voluntary guidelines to ensure the safety of recombinant DNA technology |
| SR028 | bioRxiv / EvolutionaryScale | Simulating 500 million years of evolution with a language model (Preprint) | Authors are employees of EvolutionaryScale, PBC. Patents have been filed related to aspects of this work. |
| SR029 | OpenAI | Safety and Responsibility — OpenAI | |
| SR030 | Anthropic | Responsible Scaling Policy | |
| SR031 | Wikipedia | AI Safety | |
| SV001 | Crunchbase | EvolutionaryScale — Funding, Investors, and Overview | EvolutionaryScale secured $142 million in a seed investment round. The funding was backed by Amazon and Nvidia and is intended for the development of protein-generating AI. |
| SV002 | EvolutionaryScale | EvolutionaryScale — Company Homepage | |
| SV003 | EvolutionaryScale | ESM3: A frontier language model for biology — ESM3 Release Blog | trained with over 1x10^24 FLOPS and 98B parameters |
| SV004 | EvolutionaryScale | ESM Cambrian — EvolutionaryScale Blog | |
| SV005 | Science (AAAS) | Simulating 500 million years of evolution with a language model | Simulating 500 million years of evolution with a language model |
| SV006 | Yahoo Finance | Absci Corporation (ABSI) Stock Price, News, Quote & History | Market Cap (intraday) 799.793M |
| SV007 | Yahoo Finance | Recursion Pharmaceuticals (RXRX) Stock Price, News, Quote & History | Market Cap (intraday) 1.555B |
| SV008 | Yahoo Finance | Schrodinger, Inc. (SDGR) Stock Price, News, Quote & History | Market Cap (intraday) 892.913M |
| SV009 | Absci Corporation (SEC Filing) | Absci Corporation Annual Report on Form 10-K for FY2025 (absi-20251231) | Revenue was $2.8 million for the year ended December 31, 2025 compared to $4.5 million for the year ended December 31, 2024. |
| SV010 | Recursion Pharmaceuticals (SEC Filing) | Recursion Pharmaceuticals Annual Report on Form 10-K for FY2025 (rxrx-20251231) | We had an accumulated deficit of $2.1 billion as of December 31, 2025. |
| SV011 | SEC EDGAR | EDGAR Filing Search — Absci Corp 10-K filings | |
| SV012 | SEC EDGAR | EDGAR Filing Search — Recursion Pharmaceuticals 10-K filings | |
| SV013 | SEC EDGAR | EDGAR Filing Search — Schrodinger Inc 10-K filings | |
| SV014 | KPMG Private Enterprise | Venture Pulse Q4 2024 — US Venture Capital Trends | people are now starting to become more discerning as to who the winners may be in the AI space — the companies with credible business models, creating highly disruptive solutions, as opposed to others who have put AI wrappers on existing solutions |
| SV015 | Precedence Research | Protein Engineering Market Size to Hit USD 23.59 Billion By 2035 | The global protein engineering market size was estimated at USD 5.09 billion in 2025 and is predicted to increase from USD 5.95 billion in 2026 to approximately USD 23.59 billion by 2035, expanding at a CAGR of 16.57% from 2026 to 2035. |
| SV016 | Precedence Research | Drug Discovery Market Size, Share, and Trends 2025–2034 | The global drug discovery market size is valued at USD 71.89 billion in 2025 and is predicted to increase from USD 78.51 billion in 2026 to approximately USD 158.74 billion by 2034. |
| SV017 | MarketsandMarkets | Protein Engineering Market — Global Forecast | |
| SV018 | Absci | Absci — AI Biologics Drug Creation Platform | |
| SV019 | Profluent | Profluent — AI Protein Design | |
| SV020 | Cradle | Cradle — AI Protein Engineering Platform | |
| SV021 | Generate Biomedicines | Generate Biomedicines — Generative Biology Platform | |
| SV022 | Insilico Medicine | Insilico Medicine — Generative AI Drug Discovery | |
| SV023 | Isomorphic Labs | Isomorphic Labs — Reimagining Drug Discovery with AI | |
| SV024 | Recursion | Recursion — Pioneering AI Drug Discovery | |
| SV025 | Xaira Therapeutics | Xaira Therapeutics — Company Homepage | |
| SV026 | NVIDIA | NVIDIA Clara BioNeMo — AI Drug Discovery and Protein Language Models | |
| SV027 | Google DeepMind | AlphaFold — Predicting Protein Structure and Interactions | AlphaFold 3 and AlphaFold Server are launched — Google DeepMind and Isomorphic Labs introduce AlphaFold 3, which predicts the structure and interactions of all of life's molecules. |
| SV028 | PitchBook | EvolutionaryScale — PitchBook Funding Profile | |
| SV029 | Bloomberg | EvolutionaryScale Raises $142 Million From Amazon, Nvidia | |
| SV030 | Hacker News (Algolia Search API) | Hacker News — EvolutionaryScale funding Series A discussions | |
| SV031 | SEC EDGAR (Full-Text Search) | EDGAR Full-Text Search — EvolutionaryScale Form D | Full-text search returned 0 hits (hits total value: 0). |
| SV032 | Grand View Research | Artificial Intelligence In Drug Discovery Market Report, 2033 | The global artificial intelligence in drug discovery market size was estimated at USD 2.35 billion in 2025 and is projected to reach USD 13.77 billion by 2033, growing at a CAGR of 24.8% from 2026 to 2033. |
| SV033 | Hugging Face | EvolutionaryScale Organization — Hugging Face Model Hub | |