EvolutionaryScale
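World-class protein LM science met a non-commercial exit: after CZI absorbed the company in Nov 2025, returns to Series A investors went undisclosed and no standalone equity story remains.
EvolutionaryScale built frontier-quality protein language models (ESM3, validated in Science), but its absorption by CZI in November 2025, less than 14 months after the $142M Series A, ended the independent investment thesis and left commercial investors' returns without any public accounting.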
Cover Elements
Company Overview
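EvolutionaryScale is a San Francisco-based AI biology company that developed and released ESM3, a 98B-parameter generative multimodal protein language model trained on 2.78 billion sequences (consuming 1×10^24 FLOPs on NVIDIA H100 GPUs), followed by the representation-focused ESM Cambrian (ESM-C). The four co-founders, Alex Rives (CEO), Tom Sercu (VP of Engineering), Zeming Lin, and Sanjay Rao, all came from Meta AI Research (FAIR), where the ESM lineage originated. The company closed a seed round (June 2024) and a $142M Series A co-led by Amazon and NVIDIA (September 2024) at an implied post-money valuation of about $1.35B, with Lux Capital, Nat Friedman, and Daniel Gross also participating. ESM3 gained peer-reviewed validation through a Science paper (January 2025) and was integrated into AWS SageMaker JumpStart and NVIDIA BioNeMo. On November 6, 2025, less than 14 months after the Series A closed, the entire team joined CZ Biohub under the Chan Zuckerberg Initiative's Frontier AI for Biology initiative, ending EvolutionaryScale's life as an independent for-profit entity. The company never disclosed commercial revenue while independent.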
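- Founded: 2023-01-01
- Founders: Alexander (Alex) Rives, Tom Sercu, Zeming Lin, Sanjay Rao
- Founding location: San Francisco, California, USA
- Headquarters: San Francisco, California, USA (pre-acquisition); post-acquisition operations consolidated into CZ Biohub Network sites.
- Products: The product line centers on the ESM (Evolutionary Scale Modeling) family of protein language models: ESM3 (multimodal generative, up to 98B parameters) and ESM Cambrian / ESM-C (representation-focused, in 300M / 600M / 6B versions). Distribution paths were: (a) an open-weights channel on HuggingFace for academic and non-commercial use (as of May 2026, esm3-sm-open-v1 and the ESM-C releases total more than 9,400 downloads), (b) the commercial Forge API at forge.evolutionaryscale.ai, with a developer SDK and documentation, (c) an NVIDIA BioNeMo NIM microservice integration for enterprise H100 deployments, and (d) an AWS Marketplace SageMaker JumpStart listing. The peer-reviewed Science paper (Jan 16, 2025; DOI 10.1126/science.ads0018) validated ESM3's ability to design novel functional fluorescent proteins, giving the company independent scientific credibility.
- Customers: Before the acquisition, target customers included biotech and pharma R&D teams (target identification, protein engineering, antibody design), synthetic biology and industrial enzyme companies, academic researchers (the free open-weights tier), and infrastructure customers reaching the models through AWS Marketplace and NVIDIA BioNeMo. The company never publicly named a pharma customer or disclosed a paying-customer count.
- Business model: SaaS and usage-based API fees through the Forge platform, plus partner revenue sharing with AWS (SageMaker JumpStart listing) and NVIDIA (BioNeMo NIM microservice). The free open-weights ESM3 and ESM-C releases were published on HuggingFace under non-commercial research licenses as a developer-community and lead-generation strategy. After the acquisition the entity became part of a nonprofit research network (CZI / CZ Biohub), and the future commercial status of the Forge API has not been publicly confirmed.
- Stage: acquired
- Funding: Chan Zuckerberg Initiative / CZ Biohub acquired EvolutionaryScale on November 6, 2025; the team joined the CZ Biohub Network under the Frontier AI for Biology initiative. Acquisition terms (cash, stock, IP transfer, employee retention packages, treatment of Series A preferred stock) were not publicly disclosed. Disclosed cumulative funding before the acquisition was $142M (Series A, Sep 26, 2024), plus a seed round of undisclosed size (announced Jun 25, 2024); the Series A implied a post-money valuation of about $1.35B.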
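Executive Summary
Key Strengths
- ESM3 is the largest publicly released protein language model (98B parameters, 2.78B training sequences, 1×10^24 FLOPs) and was peer-validated in Science (January 2025) as able to design genuinely novel functional proteins; closed-source rivals would struggle to match that credibility moat.
- The founding team of Rives, Sercu, Lin, and Rao built the original ESM series at Meta AI Research and is widely regarded as among the strongest protein LM research groups in the world; CZI hiring the team wholesale is itself evidence of talent quality.
- Amazon (AWS) and NVIDIA, as strategic co-investors, supplied both capital and distribution (AWS SageMaker JumpStart listing, NVIDIA BioNeMo NIM microservice), putting ESM3 in front of far more enterprise customers than a Series A startup could reach alone.
- Developer signals are strong: the ESM3 and ESM-C model cards total more than 9,400 downloads on HuggingFace, the GitHub organization is active (esm, DeepEP, and infrastructure forks), and the downstream academic citation graph is growing (Semantic Scholar shows 32+ papers building on ESM3).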
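Key Risks
- The absorption by CZI / CZ Biohub on November 6, 2025 (less than 14 months after the Series A closed) ended both the independent commercial entity and any public-equity exit thesis; the acquisition terms, the treatment of Series A preferred stock, and the continuation of the commercial Forge API are all undisclosed.
- The founding team has a single origin (all four came from Meta FAIR), which raises cultural and methodological homogeneity and key-person concentration risk; after the acquisition Rives stayed at CZI rather than at a successor commercial entity, which also closes off the option of spinning the team back out.
- The open-source commoditization threat is acute: the predecessor ESM2 is MIT-licensed and free to use, AlphaFold 3 released weights for non-commercial use, OpenFold and Chai-1 are open source, and Meta retains the underlying ESM IP; together these weaken customers' willingness to pay for a closed Forge API.
- The company disclosed no commercial revenue, EDGAR holds no SEC Form D filing under any "EvolutionaryScale" variant (unusual for a $142M raise), and the Series A coverage sits behind the Bloomberg paywall, so capital structure, cash runway, and customer economics cannot be verified.
- Dual-use and biosecurity regulatory headwinds are rising (the October 2023 U.S. executive order §4.4 protein-design watch list, the BIS advance notice on AI in biology, the EU AI Act dual-use provisions); the nonprofit successor may face different compliance obligations, but not necessarily lighter ones.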
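Open Questions
- Terms of the November 2025 CZI / CZ Biohub transaction are not public: cash, stock, IP transfer, retention packages, and above all the treatment of the Series A preferred stock held by Amazon, NVIDIA, Lux Capital, Nat Friedman, and Daniel Gross.
- The Forge API's operating status, customer count, pricing tiers, and any post-acquisition continuity commitments; whether the commercial API remains available or is being retired under the CZ Biohub nonprofit structure.
- Commercial revenue and ARR were never disclosed at any stage of the company's life; with no public S-1, Form D, or 10-K filing, audited revenue cannot be verified.
- The exact seed round amount is undisclosed (announced June 25, 2024 alongside the ESM3 release, with investors including NVIDIA, Amazon, Lux, Friedman, and Gross).
- Neither the IP assignment agreement between Meta and EvolutionaryScale covering the ESM2 series nor the subsequent arrangement transferring it to CZI / CZ Biohub has been publicly described; legal ownership of the ESM trademark and underlying model weights is unclear.
- Bloomberg's September 26, 2024 Series A article is paywalled, which blocks independent verification of investor rights, board composition, pro-rata agreements, and any secondary component of the round.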
01 Company Overview
1.1 Identity, Headquarters, and Business Model
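EvolutionaryScale, Inc. is an early-stage AI biology company headquartered in San Francisco, California, incorporated in 2023 and operating from roughly March 2024. Its stated mission is to decode the language of protein sequences with large generative models, treating proteins as text that encodes billions of years of biological evolution, and to design new proteins with programmable function on that basis. Its core commercial product is the Forge API platform (forge.evolutionaryscale.ai), a developer-facing service that exposes the ESM3 and ESM Cambrian (ESM-C) model families programmatically. The intended revenue model is software-as-a-service API subscriptions and usage-based billing for biotech, pharma, and synthetic biology customers that need to accelerate protein engineering. The company also publishes an open-weights ESM3 model (esm3-sm-open-v1) on Hugging Face for non-commercial academic use, building a developer community alongside the commercial product. As of November 2025, after the entire team was absorbed into CZ Biohub under the Chan Zuckerberg Initiative, EvolutionaryScale no longer operates as an independent entity, and the path as a standalone commercial company has ended. The company never publicly disclosed revenue, ARR, or paying-customer counts while independent. [CO001, CO003, CO006, CO007, CO028, CO029]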
| Metric | Value / Status | Date / Period | Confidence | Source / Gap |
|---|---|---|---|---|
| Company stage | Merged into CZ Biohub (nonprofit) | November 2025 | High | biohub.org acquisition announcement |
| Headquarters (pre-acquisition) | San Francisco, California, USA | 2024-2025 | Medium | Crunchbase; no primary street address confirmed |
| Founded | 2023 (incorporation); operations from about March 2024 | 2023 / Mar 2024 | High | Company website; chapter background brief |
| CEO / founder | Alexander (Alex) Rives (CEO until Nov 2025) | 2024-Nov 2025 | High | evolutionaryscale.ai; NVIDIA blog; LinkedIn |
| Cumulative confirmed funding | $142M+ (undisclosed seed + $142M Series A) | Sept 2024 | High | Bloomberg (paywalled); NVIDIA blog; Crunchbase |
| Series A valuation (implied) | ~$1.35B post-money | Sept 26, 2024 | Medium | Third-party estimate; not confirmed by a primary filing |
| Headcount (pre-acquisition) | 11-50 | As of 2024 | Low | LinkedIn company page; no official headcount disclosed |
| Main product | ESM3 (98B-parameter protein language model) | Released June 25, 2024 | High | Official release page: evolutionaryscale.ai/blog/esm3-release |
| ESM Cambrian (ESM-C) | 300M / 600M / 6B parameter models | Released Dec 4, 2024 | High | Official release page: evolutionaryscale.ai/blog/esm-cambrian |
| Science paper (ESM3) | Published in Science | Jan 16, 2025 | High | DOI: 10.1126/science.ads0018; preceded by a bioRxiv preprint |
| Revenue / ARR | Not publicly disclosed; Forge API SaaS model | Current | Low | No filings; forge.evolutionaryscale.ai renders only via JavaScript |
| NVIDIA partnership | BioNeMo NIM integration; seed + Series A investor | 2024-2025 | High | NVIDIA blog: blogs.nvidia.com; BioNeMo page: nvidia.com/bionemo |
| SEC Form D filing | None found on EDGAR (2024-2026) | May 2026 | High | efts.sec.gov Form D search; SEC EDGAR |
| Wikipedia page | Does not exist (404) | May 2026 | High | en.wikipedia.org/wiki/EvolutionaryScale returns 404 |
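Valuation and headcount rest solely on third-party reporting; no SEC filing or primary cap table has been disclosed. The seed round amount was not publicly disclosed. The post-acquisition corporate structure and investor returns are unknown.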
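[CO001, CO003, CO004, CO007, CO008, CO015]
Figure: Core performance metrics summarizing EvolutionaryScale's capital, technical scale, model adoption, and current status as of May 2026.
HuggingFace download counts are a snapshot from the May 2026 research session and may change. The valuation is an implied estimate from third-party sources. Parameter counts and training-data scale come from the company's official publications.
[CO013, CO016, CO017, CO019, CO021, CO025]
1.2 Founders, Leadership, and Key-Person Risk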
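EvolutionaryScale was co-founded by four researchers who had worked together at Meta AI Research (FAIR): Alexander (Alex) Rives, Tom Sercu, Zeming Lin, and Sanjay Rao. Alex Rives served as CEO and is the principal architect of the ESM protein language model lineage, which grew out of his academic work and continued at Meta FAIR before the company spun out. Tom Sercu served as co-founder and VP of Engineering, leading the infrastructure and engineering teams that built the Andromeda H100 cluster. Zeming Lin and Sanjay Rao contributed to model development and research as technical co-founders. After the November 2025 acquisition, Alex Rives became Head of Science at the Chan Zuckerberg Initiative, and the rest of the founding team took senior research roles at CZ Biohub. Because the entire founding team came from one employer (Meta FAIR), concentration risk is significant: the team is highly homogeneous in culture, methodology, and technical assumptions, and there is no evidence of independent directors, advisors from outside the AI research community, or executives with commercial biotech experience. The single-employer origin also deepens key-person dependence; the departure of any founder, and especially of Rives as CEO and technical visionary, would have had an outsized effect on research direction and investor confidence. [CO002, CO004, CO005, CO022, CO030]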
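| Person | Role (at EvolutionaryScale) | Pre-founding background | Founder-market fit | Key-person dependence |
|---|---|---|---|---|
| Alexander (Alex) Rives | CEO, co-founder | Meta AI (FAIR) researcher; originated the ESM protein LM research lineage | Deep expert in protein language modeling; inventor of the ESM model family | Critical: public face, research vision owner, and CEO; departure would materially weaken the company |
| Tom Sercu | Co-founder, VP of Engineering | Meta AI (FAIR) engineer; co-author of ESM3 and the bioRxiv preprint | Infrastructure and engineering leadership for large-scale model training and inference | High: leads engineering; directly responsible for Andromeda training cluster operations |
| Zeming Lin | Co-founder, research | Meta AI (FAIR) researcher; co-author of the ESM3 bioRxiv preprint | Core research contributor to ESM model development | Medium: research contributor; lower public profile than Rives or Sercu |
| Sanjay Rao | Co-founder, research | Meta AI (FAIR); worked on the ESM research program | Core technical co-founder with an AI research background | Medium: research co-founder; limited independent public record |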
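All four co-founders previously worked at Meta AI Research (FAIR), creating single-employer concentration risk. Salvatore Candido appears in the bioRxiv preprint author list and may be a fifth co-founder or early team member, but public sources do not independently confirm his role at EvolutionaryScale. Post-acquisition: Rives became CZI's Head of Science; the other founders joined CZ Biohub in research roles.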
[CO002, CO004, CO005, CO022, CO030, CO031]
1.3 Funding History, Valuation, and Capital Structure
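Before CZI absorbed it, EvolutionaryScale completed two known financing rounds. The first was a seed round announced alongside the public release of ESM3 on June 25, 2024. NVIDIA and Amazon both took part, as did Lux Capital, Nat Friedman, and Daniel Gross. The exact seed amount was never disclosed, although NVIDIA announced its participation separately. The Series A followed quickly: on September 26, 2024, EvolutionaryScale announced a $142M Series A co-led by Amazon and NVIDIA, with Lux Capital, Nat Friedman, and Daniel Gross participating again, at an implied post-money valuation of about $1.35B. The round closed only three months after the ESM3 release and was one of the largest AI biology financings of 2024. Notably, no SEC Form D filing under any "EvolutionaryScale" variant was found on EDGAR for 2024 through 2026. Bloomberg reported the Series A, but the article is paywalled, so the full deal terms, investor rights, and any secondary component cannot be verified from public sources. Confirmed cumulative funding is $142M plus a seed round of undisclosed size. The public record shows no commercial revenue, debt financing, or secondary transactions. The company's capital structure at the time of the CZI acquisition remains undisclosed. [CO015, CO016, CO017, CO018, CO019, CO036]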
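| Stakeholder | Role / Relationship | Round / Stage | Economic / control significance | Diligence question |
|---|---|---|---|---|
| Amazon (AWS) | Series A co-lead; cloud infrastructure partner | Series A co-lead (Sept 2024) | High economic interest; likely preferred stock; strategic compute supply arrangement highly probable | Confirm investment amount, preference terms, board seat, and AWS compute credit arrangements |
| NVIDIA | Series A co-lead; seed investor; BioNeMo integration partner | Seed + Series A co-lead (2024) | High economic interest; strategic: ESM3 integrated into BioNeMo/NIM on H100 infrastructure | Confirm investment amount, NIM licensing revenue share, and GPU infrastructure commitment terms |
| Lux Capital | Early-stage VC; participated in seed and Series A | Seed + Series A (2024) | Early investor, likely holding meaningful ownership since the seed | Confirm ownership percentage, liquidation preference, and post-acquisition treatment |
| Nat Friedman | Angel investor; participated in seed and Series A | Seed + Series A (2024) | Individual angel; economic interest likely small relative to institutional investors | Confirm check size and any advisory role |
| Daniel Gross | Angel investor; participated in seed and Series A | Seed + Series A (2024) | Individual angel; background as an AI compute investor | Confirm check size and relationship to the AI infrastructure ecosystem |
| Chan Zuckerberg Initiative (CZI) / CZ Biohub | Acquirer / successor organization (Nov 2025) | Acquisition / integration (Nov 2025) | Critical: absorbed the entire EvolutionaryScale team and most likely received the IP; controls the future of the ESM models | Obtain acquisition terms: equity buyout, IP transfer, continuity of commercial agreements, and investor return details |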
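The NVIDIA blog and Crunchbase confirm Amazon and NVIDIA as Series A co-leads. Full amounts per investor are not public. The CZI acquisition terms were not publicly disclosed. There may be other Series A investors not identified in the available sources.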
[CO015, CO016, CO017, CO018, CO021, CO023]
1.4 Product, Technology, and Operations
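EvolutionaryScale's product portfolio centers on the ESM (Evolutionary Scale Modeling) family of protein language models. ESM3, released June 25, 2024, is a generative multimodal protein model with up to 98 billion parameters; at release it was the largest publicly released protein language model. ESM3 was trained on 2.78 billion protein sequences totaling 771 billion tokens, consuming about 1×10^24 floating-point operations on an NVIDIA H100 GPU cluster the company calls Andromeda. The company says ESM3 simulates 500 million years of protein evolution in a single generation run, allowing it to design new proteins with user-specified structural and functional properties. Science published the peer-reviewed paper on January 16, 2025 (DOI: 10.1126/science.ads0018), validating that ESM3 can design genuinely novel fluorescent proteins and materially raising independent credibility. ESM Cambrian (ESM-C), released December 4, 2024, is a next-generation protein language model offered in 300M, 600M, and 6B parameter versions and optimized for efficient inference. NVIDIA integrated ESM3 into the BioNeMo platform and exposed it as an NVIDIA NIM microservice for enterprise deployment on H100 infrastructure. The company maintains a GitHub organization with 9+ repositories, including the ESM model codebase, a DeepEP repository for mixture-of-experts inference, and several forks of open-source training infrastructure. The Forge API platform serves as the commercial developer interface. Before the acquisition, LinkedIn listed EvolutionaryScale's headcount at 11 to 50. [CO007, CO008, CO009, CO010, CO011, CO012]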
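
As a rough plausibility check on the compute figure above, a widely used rule of thumb for dense transformers puts training compute at about 6 FLOPs per parameter per training token. The sketch below applies that heuristic to the stated 98B parameters and 771 billion tokens. The 6× constant and the function name are assumptions introduced here for illustration, not the company's own accounting; the heuristic ignores attention cost, any auxiliary structure encoders or decoders, and repeated passes over the data.

```python
# Rule-of-thumb training compute: C ≈ 6 * N * D.
# This is an ASSUMED heuristic for dense transformers, not company accounting.
FLOPS_PER_PARAM_TOKEN = 6.0  # forward + backward pass, per parameter per token


def approx_training_flops(n_params: float, n_tokens: float,
                          flops_per_param_token: float = FLOPS_PER_PARAM_TOKEN) -> float:
    """Estimate total training FLOPs from parameter and token counts."""
    if n_params <= 0 or n_tokens <= 0:
        raise ValueError("parameter and token counts must be positive")
    return flops_per_param_token * n_params * n_tokens


# Figures as stated in the paragraph above.
ESM3_PARAMS = 98e9       # 98 billion parameters
ESM3_TOKENS = 771e9      # 771 billion tokens
REPORTED_FLOPS = 1e24    # about 1×10^24 FLOPs

estimate = approx_training_flops(ESM3_PARAMS, ESM3_TOKENS)
ratio = REPORTED_FLOPS / estimate  # how far the reported figure sits above the heuristic

if __name__ == "__main__":
    print(f"heuristic estimate: {estimate:.2e} FLOPs")
    print(f"reported / heuristic: {ratio:.1f}x")
```

The heuristic lands within a small factor (roughly 2×) of the reported figure, which is the level of agreement an order-of-magnitude check can offer; it neither confirms nor contradicts the disclosed number.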
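Figure: How EvolutionaryScale's protein language models connect from training data through model development, the API platform, and strategic partnerships to the eventual CZI acquisition.
[CO009, CO010, CO025, CO026, CO028, CO032]
1.5 Key Milestones and Adverse Events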
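EvolutionaryScale's history runs from incorporation in 2023, through a rapid product and fundraising phase, to absorption by CZI in November 2025; its total operating life as an independent entity was under three years. Its most important technical milestone was the ESM3 Science paper, the first independent peer-reviewed validation that a generative protein language model can design novel functional proteins with programmable properties. The $142M Series A in September 2024, at a valuation of about $1.35B and only three months after the product release, reflected unusual investor enthusiasm for the technology. The November 2025 acquisition by CZI / Biohub, however, is the most consequential adverse governance event: less than 14 months separated the Series A close from the merger into a nonprofit, and the exit path for commercial investors is highly uncertain because neither the acquisition terms nor any compensation to equity holders has been disclosed. The shift from a for-profit commercial entity to a nonprofit research program raises several material questions: how Series A investors were treated, how employee equity was handled, and whether the Forge API commercial product continues. Other adverse signals include the absence of any SEC Form D filing, which is unusual for a company that raised $142M; the redirect of the ESM GitHub repository to the biohub organization, which signals an IP transfer; and the lack of a Wikipedia page, which points to limited mainstream media coverage. [CO001, CO008, CO013, CO016, CO019, CO021]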
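| Date | Event | Type | Amount / valuation / status | Participants | Significance |
|---|---|---|---|---|---|
| 2023 | Company incorporated; founding team leaves Meta AI (FAIR) | Founding | n/a | Founders Alex Rives, Tom Sercu, Zeming Lin, Sanjay Rao | Legal entity established; founding team coalesces around the ESM protein LM research lineage |
| Mar 2024 | Operational milestone; pre-release development phase | Founding | n/a | Founding team | Research and engineering ramp-up; model training begins on the Andromeda H100 cluster |
| Jun 25, 2024 | ESM3 public release and seed round announcement | Product | Seed amount undisclosed | Investors including Lux Capital, Nat Friedman, Daniel Gross, NVIDIA, Amazon | ESM3 (98B parameters) released publicly with open weights and the Forge API; seed financing confirmed |
| Jul 2024 | bioRxiv preprint posted (doi: 10.1101/2024.07.01.600583) | Product | n/a | Rives, Sercu, Candido, Lin, et al. | ESM3's scientific claims available for community scrutiny ahead of peer review |
| Sep 26, 2024 | $142M Series A announced | Financing | $142M at an implied valuation of about $1.35B | Amazon (co-lead), NVIDIA (co-lead), Lux Capital, Nat Friedman, Daniel Gross | Largest AI biology financing of 2024 H2; institutional endorsement of the protein LM platform |
| Dec 4, 2024 | ESM Cambrian (ESM-C) released | Product | n/a | EvolutionaryScale team | Model family expanded (300M/600M/6B parameters); wider inference-efficiency options for customers |
| Jan 16, 2025 | ESM3 paper published in Science | Product | n/a | Rives et al., Science DOI: 10.1126/science.ads0018 | Peer review validates novel GFP design; strengthens scientific credibility and institutional adoption |
| Nov 6, 2025 | Team joins CZ Biohub; EvolutionaryScale merged into CZI | Adverse | Terms undisclosed | CZI / CZ Biohub, the full EvolutionaryScale team | Company ceases to exist as an independent entity; investor exit uncertain; ESM IP moves to a nonprofit |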
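From a commercial investor's perspective, the CZI acquisition (Nov 2025) is an adverse event. The exact seed amount remains undisclosed. The Series A valuation is an implied figure from third-party sources, not confirmed by a primary SEC filing. The March 2024 operating start date comes from the chapter background brief; no primary announcement was found.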
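[CO001, CO008, CO011, CO013, CO015, CO016]
Figure: Key milestones from the 2023 founding through the ESM3 release, the Series A, ESM Cambrian, and the Science publication to the November 2025 acquisition by CZI.
The March 2024 operating start date comes from the chapter background brief; no primary announcement was found. The roughly $1.35B Series A valuation is a third-party estimate. Statements about post-acquisition continuity of the ESM IP are inferred from the biohub.org announcement and are not confirmed by a formal IP assignment agreement.
[CO001, CO002, CO007, CO008, CO010, CO011]
02 Market Analysis
2.1 Market Definition and Boundaries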
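EvolutionaryScale's serviceable market has two connected layers. The first is the protein language model (PLM) API and platform market: cloud-hosted or on-premises AI models that let protein engineers generate, predict, and optimize protein sequences and structures without exhaustive wet-lab directed evolution. This market is distinct from general cloud computing, traditional bioinformatics pipelines, genome sequencing platforms, medical imaging AI, and clinical trial management software. The second layer is the broader protein engineering research tools market, covering all reagents, instruments, and software used in recombinant protein research and design; PLM APIs are a high-growth AI subcategory within that larger market.
Status-quo alternatives to a protein LM platform include: (1) AlphaFold2/3 (Google DeepMind) for protein structure prediction, available free through a public database of 200 million+ predicted structures (CC-BY-4.0) and through the AlphaFold Server for non-commercial use; (2) Rosetta and PyRosetta (University of Washington) for computational protein design, open source but compute-intensive and demanding of specialist expertise; (3) wet-lab directed evolution (repeated random mutagenesis and screening), which is costly, slow (weeks per cycle), and throughput-limited; and (4) traditional molecular dynamics and homology modeling software (GROMACS, MODELLER, Schrödinger Maestro), mature structure-guided design tools without generative capability. ESM3 and ESM-C differentiate by reasoning jointly over protein sequence, structure, and function within one unified generative model.
Adjacent markets include the AI drug discovery platform market (all computational tools that accelerate small-molecule and biologic design). Grand View Research estimates that market at $2.35B in 2025, rising to $13.77B by 2033 at a 24.8% CAGR. Industrial biotechnology (enzyme engineering for green chemistry, agriculture, and biomaterials) is another adjacent market, with procurement patterns that differ from pharma applications and a lighter regulatory burden. The outermost boundary is the global drug discovery market (estimated by Precedence Research at $71.89B in 2025), which spans all modalities and services and far exceeds EvolutionaryScale's direct reach. [CM001, CM002, CM003, CM004, CM005, CM006]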
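| Market segment / category | Spend included | Spend excluded | Primary buyer / payer | Relevance to EvolutionaryScale |
|---|---|---|---|---|
| Protein language model API (core) | Cloud API usage of ESM3/ESM-C for protein sequence generation, embeddings, structure prediction, and multimodal inference | Traditional wet-lab directed evolution, genome sequencing instruments, general cloud compute without protein-specific models | Pharma / biotech VP of computational biology, CSO, ML / bioinformatics scientists | Direct revenue source: Forge API, AWS SageMaker JumpStart distribution, NVIDIA BioNeMo NIM microservice |
| Protein engineering research tools (primary adjacent) | All software, AI tools, and services supporting recombinant protein research, design, and optimization | Lab instruments (PCR machines, crystallography), DNA synthesis reagents, directed-evolution consumables | R&D heads in biopharma, biotech, and industrial biotech | ESM3/ESM-C are AI-native protein design tools within this broader category; analysts size it at $2.6–5.1B (2023–2025) |
| AI drug discovery platforms (secondary adjacent) | AI-driven target identification, generative molecule design, virtual screening, ADMET prediction | CRO wet-lab services, clinical trial management, sequencing platforms, medical imaging AI | Pharma R&D heads / chief scientific officers, business development | ESM3 supports protein target characterization and antibody design, making EvolutionaryScale an enabling platform for AI drug discovery workflows |
| Industrial biotechnology (adjacent, non-pharma) | Enzyme engineering for green chemistry, biofuels, agriculture, specialty materials, and food science | Diagnostics, medical devices, hospital IT, insurance platforms | Industrial biotech R&D teams, CTOs / VPs of engineering at biomanufacturers | Growing secondary buyer segment: ESM3/ESM-C apply to enzyme optimization workflows; regulatory burden lower than pharma |
| Open-source / self-hosted alternatives (status quo) | Self-hosted ESM2 (MIT), Rosetta (open source), AlphaFold (free non-commercial), PyRosetta | None; this category represents the free alternatives EvolutionaryScale faces when seeking paid conversion | Academic labs, well-resourced startup compute teams, in-house pharma bioinformatics groups | Open-source ESM-C under the MIT license is a direct substitute for the Forge API; self-hosting is the main cost competitor |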
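Market boundaries are drawn from EvolutionaryScale product documentation, the ESM3 Science paper, analyst market reports (MarketsandMarkets, Precedence Research, Grand View Research), and FDA regulatory guidance. The open-source alternatives row records the direct competitive dynamic between EvolutionaryScale's free offerings and the paid Forge API. The AI drug discovery platform estimate comes from Grand View Research ($2.35B, 2025).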
[CM001, CM002, CM003, CM004, CM005, CM006]
2.2 Market Size and Analyst Estimates
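Published protein engineering market estimates vary widely with scope. MarketsandMarkets estimates growth from $2.2B (2019) to $3.9B in 2024 at a 12.4% CAGR; this is the narrower scope, focused on protein engineering tools and services. Precedence Research uses a wider scope, estimating $5.09B in 2025 rising to $23.59B by 2035 at a 16.57% CAGR, and includes more applications such as industrial enzymes and biopharma. Allied Market Research estimates $2.2B in 2022 rising to $7.7B by 2032 at a 13.2% CAGR. Grand View Research estimates $2.60B in 2023 rising to $7.62B by 2030 at a 16.24% CAGR. The directional consensus is clear, with annual growth of roughly 12–17% over the next decade or more, but the gap between the $2.2B and $5.09B bases comes from inconsistent scope, not from contradictory evidence.
The adjacent AI drug discovery market (Grand View Research) is estimated at $2.35B in 2025, rising to $13.77B by 2033 at a 24.8% CAGR, markedly faster than protein engineering tools alone and reflecting the pull of large-pharma AI investment after AlphaFold. The broader drug discovery market (all modalities) is estimated at $71.89B in 2025, rising to $158.74B by 2034 at a 9.2% CAGR (Precedence Research), and serves as an outer TAM reference. EvolutionaryScale's serviceable market is the subset of protein engineering tools customers that (a) have computational biology infrastructure, (b) are adopting AI-native tools over traditional bioinformatics, and (c) compute at a scale that justifies a paid API subscription over self-hosted open weights.
EvolutionaryScale's obtainable market consists of Forge API subscription revenue from biopharma and biotech customers; commercial model access revenue through AWS SageMaker JumpStart and NVIDIA BioNeMo; and enterprise licenses for deploying ESM3 and ESM-C in regulated environments. The $142M Series A (Crunchbase) shows that investors recognized the opportunity, but no standalone SOM figure for PLM APIs in pharma R&D has been published. This is a material diligence gap, recorded in the sizing and adoption diligence gaps section. HuggingFace download metrics (ESM-C 6,320+, ESM3 3,110+) and 129+ bioRxiv citations can serve as proxies for user adoption, but they do not demonstrate commercial revenue traction. [CM007, CM008, CM009, CM010, CM011, CM012]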
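
One way to sanity-check the figures above is to recompute the growth rate implied by each publisher's start and end values and compare it with the quoted CAGR. The sketch below does that for the four protein engineering estimates; the names `Estimate`, `implied_cagr`, and `cagr_gaps` are illustrative helpers introduced here, and the inputs are copied from the paragraph above.

```python
# Implied CAGR from published endpoints, compared with each publisher's quoted CAGR.
from typing import Dict, NamedTuple


class Estimate(NamedTuple):
    start_usd_b: float   # market size in the start year, USD billions
    end_usd_b: float     # market size in the end year, USD billions
    start_year: int
    end_year: int
    stated_cagr: float   # CAGR as quoted by the publisher


def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate implied by two endpoints."""
    if start_value <= 0 or end_value <= 0 or years <= 0:
        raise ValueError("values and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Inputs as stated in the text above.
ESTIMATES: Dict[str, Estimate] = {
    "MarketsandMarkets": Estimate(2.2, 3.9, 2019, 2024, 0.124),
    "Precedence Research": Estimate(5.09, 23.59, 2025, 2035, 0.1657),
    "Allied Market Research": Estimate(2.2, 7.7, 2022, 2032, 0.132),
    "Grand View Research": Estimate(2.60, 7.62, 2023, 2030, 0.1624),
}


def cagr_gaps(estimates: Dict[str, Estimate]) -> Dict[str, float]:
    """Return implied minus stated CAGR for each publisher."""
    gaps = {}
    for name, e in estimates.items():
        implied = implied_cagr(e.start_usd_b, e.end_usd_b, e.end_year - e.start_year)
        gaps[name] = implied - e.stated_cagr
    return gaps


if __name__ == "__main__":
    for name, gap in cagr_gaps(ESTIMATES).items():
        print(f"{name}: implied minus stated = {gap * 100:+.2f} pp")
```

With these inputs each implied rate falls within about half a percentage point of the quoted one. Small residual gaps of this kind can arise when a publisher quotes CAGR over a forecast window that starts a year after the base-year value, so they are not by themselves evidence of an error.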
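| Publisher | Years | Geography | Market size (USD) | CAGR | Method / scope | Confidence | Key limitation |
|---|---|---|---|---|---|---|---|
| MarketsandMarkets (paywalled) | 2019–2024 | Global | $2.2B (2019) → $3.9B (2024) | 12.4% CAGR | Protein engineering tools and services; includes software, services, and some equipment; bottom-up | Medium | Narrower scope; broad industrial enzymes excluded; methodology not fully disclosed in the free tier |
| Precedence Research | 2025–2035 | Global | $5.09B (2025) → $23.59B (2035) | 16.57% CAGR | Broad protein engineering market; includes industrial applications, biopharma, and research tools | Medium | Wider scope raises the base relative to MarketsandMarkets; the 2035 forecast carries compounding uncertainty; scope inconsistent with narrower estimates |
| Allied Market Research | 2022–2032 | Global | $2.2B (2022) → $7.7B (2032) | 13.2% CAGR | Research-centered protein engineering tools; consistent with the MarketsandMarkets midpoint trajectory | Medium | Scope partly overlaps MarketsandMarkets and Grand View Research; primary report is paid; confidence limited by reliance on secondary reporting |
| Grand View Research, protein engineering | 2023–2030 | Global | $2.60B (2023) → $7.62B (2030) | 16.24% CAGR | Protein engineering market including biopharma protein engineering and industrial applications | Medium | Taken from a Wayback Machine snapshot; current page requires a subscription; scope matches Allied but CAGR is higher |
| Grand View Research, AI drug discovery | 2025–2033 | Global (adjacent) | $2.35B (2025) → $13.77B (2033) | 24.8% CAGR | AI-driven drug discovery software platforms; grows faster than protein engineering tools alone as pharma AI investment accelerates | Medium | Adjacent market; EvolutionaryScale is an enabling tool, not a full drug discovery platform; partial overlap with TAM |
| Precedence Research, drug discovery (outer boundary) | 2025–2034 | Global | $71.89B (2025) → $158.74B (2034) | 9.2% CAGR | All drug discovery services and tools; widest scope, covering CRO wet-lab services, instruments, software, and AI tools | Medium | Clearly too broad; outer TAM boundary only; most of this market is not serviceable by a PLM API |
| EvolutionaryScale / Crunchbase (SOM proxy) | 2024–2026 | Global | $142M raised (Series A) as a proxy for an investor-validated commercial opportunity | N/A | Series A used as a funding proxy for commercial potential; HuggingFace downloads (ESM-C above 6,320) as a developer traction indicator | Low | No SOM figure disclosed; funding and download metrics are proxies, not revenue evidence; Forge API pricing undisclosed |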
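All major analyst TAM reports are paywalled; values come from accessible landing pages, Wayback Machine snapshots, and secondary summaries. The narrowest estimate ($2.2B, MarketsandMarkets 2019) and the widest ($23.59B, Precedence 2035) differ by roughly 10×, mainly because of scope and forecast-horizon differences, not genuine disagreement about the market. EvolutionaryScale's SOM proxy is an analytical estimate, not an official disclosure; no standalone sizing of a PLM API submarket was found. Protein engineering growth rates agree directionally at 12–17%; the AI drug discovery segment accelerates to 24.8%.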
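[CM007, CM008, CM009, CM011, CM012, CM013]
Figure: Four-tier pyramid of EvolutionaryScale's market as of 2026: outer TAM boundary (all global drug discovery services), addressable TAM (protein engineering tools market), SAM (AI-native protein design tools and APIs within protein engineering), and SOM (the revenue zone of EvolutionaryScale's current Forge API and distribution channels).
The outer TAM comes from Precedence Research ($71.89B, 2025); the addressable TAM takes the midpoint of four analyst estimates (MarketsandMarkets, Allied, Grand View Research, Precedence). The SAM is an analytical estimate that has not been independently published; it is derived from the share of AI-native, software-only protein engineering tools (excluding instruments and reagents). The SOM reflects only commercial API and cloud distribution; Forge API pricing and subscriber counts are undisclosed. All figures are directional estimates.
[CM006, CM007, CM008, CM009, CM011, CM012]
Figure: Low / mid / high estimates in USD billions under four market-size lenses: the protein engineering tools market (2024–2025 baseline), the adjacent AI drug discovery market (2025), the protein engineering 2030 forecast, and the broader drug discovery market (2025). All values use USD billions for consistent comparison.
Protein engineering 2024–2025: low = MarketsandMarkets 2024 estimate ($3.9B baseline; $2.2B is the 2019/2022 entry point), mid = MarketsandMarkets $3.9B, high = Precedence $5.09B. AI drug discovery 2025: low = conservative industry estimate, mid = Grand View Research baseline ($2.35B), high = market expansion scenario. Protein engineering 2030 forecast: low = $3.9B compounded forward at a 12% CAGR, mid = Grand View Research $7.62B, high = Precedence trajectory forecast. The outer drug discovery boundary is anchored at the Precedence Research $71.89B midpoint. All values are in USD billions; data with incompatible units are excluded.
[CM007, CM008, CM009, CM011]
2.3 Buyer and User Segmentation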
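The primary commercial buyers of the EvolutionaryScale Forge API are large and mid-sized pharma and biotech companies with active computational biology or protein engineering programs. The economic buyer is typically a VP of computational biology, a director of drug discovery, or a chief scientific officer, with purchasing authority delegated through R&D and IT budgets. The technical champion, who evaluates model capability and drives adoption, is usually a computational biologist, structural biologist, or machine learning scientist embedded in a discovery team. Finance and procurement leads act as the formal payer and need ROI justified by fewer wet-lab screening cycles or faster lead identification.
The commercial Forge API targets enterprise customers that need protein sequence generation and embeddings at scale with availability and compliance guarantees. AWS SageMaker JumpStart and NVIDIA BioNeMo are secondary commercial channels that reach pharma customers already embedded in those cloud ecosystems. Academic and government research labs form a high-volume but non-revenue user tier: they download the MIT-licensed ESM-C open weights from HuggingFace (6,320+ downloads) and install through the PyPI `esm` package, and that community mindshare feeds the commercial pipeline.
Industrial biotech companies, which design enzymes for green chemistry, agriculture, and specialty materials, are a materially different buyer tier with shorter product development cycles and a lighter regulatory burden. Contract research organizations play a dual role: they may buy protein LM APIs to enhance their own services, and they may act as channel partners reselling computationally enabled protein engineering services to pharma clients. Series A–B biotech startups are an emerging paying tier: they have compute infrastructure but lack the in-house resources to train a frontier protein LM, so buying Forge API access makes economic sense. [CM015, CM016, CM017, CM018, CM019, CM020]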
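| Segment | Economic buyer | Technical champion / user | Payer | Workflow need | Budget owner | Primary adoption trigger |
|---|---|---|---|---|---|---|
| Global top-20 pharma (e.g., Pfizer, Roche, Novartis) | VP of computational biology / chief scientific officer | Computational biologists / ML scientists / structural biologists | R&D budget committee / CFO | Protein sequence optimization for biologics and ADCs; antibody engineering; target characterization; virtual library generation | CSO / VP of R&D, with CFO approval for enterprise contracts | Pipeline efficiency pressure: fewer directed-evolution rounds; AI protein design integrated into the existing informatics stack |
| Mid-sized biopharma / biotech (revenue $50M–$2B) | VP of R&D / director of computational biology | Computational scientists / research engineers | R&D budget / Series B–C investor funding | Lead optimization and protein stability engineering with limited in-house compute; access to a frontier LM without training infrastructure | CTO or VP of R&D, with board sign-off above a threshold | Funding milestones: the next round needs in silico validation data; cost advantage over training an LM in-house |
| Academic / government research labs | Principal investigator / department head | Postdocs / graduate students | NIH grants, government funding, institutional IT | Protein function prediction, evolutionary analysis, and variant effect prediction in basic research; no commercial intent | PI holds grant budget authority; no formal procurement cycle | Open weights available (ESM-C under the MIT license on HuggingFace); zero direct cost to plug into existing PyTorch workflows |
| Industrial biotech (enzyme engineering, synthetic biology) | VP of protein engineering / chief technology officer | Enzyme engineers / protein scientists / computational chemists | R&D budget / product development funding | Enzyme engineering for specific activity, stability, and pH / temperature tolerance; de novo protein design for biomaterials and green chemistry | CTO / VP of product development | Product cycles shorter than pharma; ROI shown through fewer wet-lab screening rounds; lower regulatory risk |
| Contract research organizations (CROs) | VP of scientific services / business development | Computational biologists / protein modeling specialists | CRO operating budget (tool costs folded into service costs) | Enhanced computational protein engineering services for pharma and biotech clients; differentiation from wet-lab-only CROs | Operations / finance, with scientific leadership input | Competitive differentiation: AI-native protein engineering services that wet-lab-only CROs cannot offer; reselling value created by ESM3/ESM-C |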
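Buyer segments are derived from EvolutionaryScale product documentation (Forge API, ESM-C on AWS/NVIDIA), NVIDIA BioNeMo and AWS SageMaker distribution announcements, HuggingFace developer adoption data, and comparable SaaS API commercial structures. The academic / government segment has the highest usage but generates no revenue under the free open-weights tier. Large pharma and mid-sized biotech are the main commercial targets. Budget ownership structures are archetypes; actual organizational authority varies with company size and culture.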
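[CM015, CM016, CM017, CM018, CM019, CM020]
Figure: Matrix mapping EvolutionaryScale buyer segments to economic buyer roles, technical champions, and the primary trigger for Forge API and ESM-C purchasing or adoption decisions.
Buyer roles are archetypes derived from EvolutionaryScale Forge API documentation, ESM-C distribution announcements (AWS SageMaker, NVIDIA BioNeMo), HuggingFace developer community signals, and comparable SaaS API business models. Actual titles and approval thresholds will differ. The academic / government row reflects only non-commercial adoption of open weights and contributes no paid revenue under the current MIT license model.
[CM014, CM015, CM016, CM017, CM018, CM019]
2.4 Growth Drivers and Adoption Constraints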
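Five structural forces drive growth in the protein LM market. First, protein sequence databases are expanding exponentially: DNA sequencing costs fell from about $10,000 per genome in 2011 to about $100 per genome in 2023 (National Human Genome Research Institute), producing billions of new protein sequences. ESM3 was trained on 2.78 billion protein sequences with 98 billion parameters, a scale that became feasible only after data was democratized. Second, AlphaFold2/3 (Google DeepMind) produced a free database of 200+ million protein structures, removing the historical $500,000+ crystallography cost barrier and teaching the industry to accept computational protein tools as production-grade. Third, NVIDIA BioNeMo speeds up biological foundation model training by 2× and inference by 6×, which lowers the total cost of ownership for enterprise PLM deployment, lowers the economic barrier to alternatives to the Forge API, and at the same time widens EvolutionaryScale's distribution reach. Fourth, FDA regulatory engagement is accelerating: 500+ AI/ML-enabled drug development submissions were received between 2016 and 2023, draft guidance was issued in 2025, and the CDER AI Council was formed in 2024, all of which reduce regulatory uncertainty for pharma customers evaluating AI-native discovery tools. Fifth, the release of ESM-C open weights under the MIT license positioned EvolutionaryScale as a community standard for protein LMs, much as open-weights strategies on Hugging Face drove conversion to commercial cloud APIs in NLP.
Four material constraints limit the pace of adoption. First, open-source commoditization: ESM-C weights are free under the MIT license, and any well-resourced lab can self-host, depressing paid conversion and pricing power outside the frontier. Second, wet-lab validation requirements: no fully computationally designed protein has entered a regulatory approval process without extensive in vitro and in vivo confirmation, so the API is valuable but does not remove the experimental bottleneck. Third, enterprise procurement friction: pharma IT security reviews, cloud data governance policies, and multi-year vendor evaluation cycles add 12–24 months to commercial deployment relative to academic adoption. Fourth, Big Tech competitive pressure: Google DeepMind, NVIDIA BioNeMo, and AWS HealthOmics all hold distribution, compute, and ecosystem advantages, and they threaten EvolutionaryScale if frontier protein LM capability commoditizes. [CM022, CM023, CM024, CM025, CM026, CM027]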
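| Driver / constraint | Direction | Timing | Impact on EvolutionaryScale | Diligence question |
|---|---|---|---|---|
| Democratized DNA sequencing cost (about $100/genome by 2023, down from $10,000 in 2011) | Growth driver | Structural, ongoing | Made it possible to train ESM3 on 2.78B protein sequences; continued database growth underpins the competitive moat of frontier model training | Confirm whether EvolutionaryScale retains data pipeline access and a compute budget to retrain on new sequences as databases grow |
| AlphaFold2/3 free protein structure database (200M+ structures) | Growth driver | Current, accelerating | Removed the historical crystallography cost barrier; normalized computational protein tools in pharma R&D; removes the "can this be trusted?" friction and widens the addressable buyer base for ESM3/ESM-C | Confirm the overlap between ESM3 training data and the AlphaFold structure database; assess whether EvolutionaryScale includes structural inputs in joint training |
| NVIDIA BioNeMo 2× training / 6× inference speedup | Growth driver | Current, hardware-dependent | Lowers total cost of ownership for enterprise ESM deployment; strengthens the AWS and NVIDIA partner distribution channels | Verify whether the ESM-C NIM microservice has been tested and reached GA on BioNeMo; assess revenue-share or referral mechanics |
| FDA regulatory engagement on AI: 500+ submissions in 2016–2023, 2025 draft guidance, 2024 CDER AI Council | Growth driver | Emerging, 2025–2027 | Reduces regulatory uncertainty for pharma partners evaluating AI-native discovery tools; signals FDA acceptance of AI use in IND submissions | Confirm whether any biological result supported by ESM3/Forge has been cited in an IND or regulatory filing; obtain an assessment of current FDA guidance applicability |
| Biopharma R&D spending growth (pharma R&D budgets growing about 5–6% a year) | Growth driver | Structural, long-term | Total procurement budgets in target buyer segments expand, supporting Forge API revenue growth even at constant share | Track R&D budget disclosures from leading pharma annually; assess computational biology's share of R&D budgets at Forge target accounts |
| Open-source commoditization: ESM-C free under the MIT license; Meta ESM2 freely available | Adoption constraint | Current, ongoing | Price-sensitive customers with GPUs will self-host, crowding out paid Forge conversion; limits pricing power in the small-model tier | Quantify the share of HuggingFace ESM-C downloaders who later convert to paid Forge API; assess self-hosting economics at the 6B parameter scale |
| Wet-lab validation requirement: no AI-designed protein enters regulatory approval without experimental confirmation | Adoption constraint | Structural, long-term | A pure API value proposition covers only the computational part of the workflow; cannot replace in vitro / in vivo validation; caps per-user revenue for a platform play | Confirm whether EvolutionaryScale planned to add wet-lab validation partnerships or data integration services to the Forge enterprise offering |
| Enterprise procurement friction: pharma cloud vendor review takes 12–24 months | Adoption constraint | Current, structural | Commercial deployment lags academic adoption; delays revenue from pilot-to-enterprise conversion | Obtain reference-customer disclosures on time from first API trial to Forge enterprise contract; assess SOC2 and GxP compliance certifications |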
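Drivers and constraints are drawn from FDA regulatory guidance pages, NVIDIA BioNeMo technical documentation, the EvolutionaryScale ESM3 Science paper, AlphaFold public database disclosures, National Human Genome Research Institute sequencing cost data, and HuggingFace/GitHub developer adoption metrics. Patent cliff and pharma R&D spending figures come from IQVIA and Statista secondary sources. No single source covers every entry; the synthesis reflects cross-corroboration across multiple types of evidence.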
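[CM022, CM023, CM024, CM025, CM026, CM027]
Figure: Protein LM API adoption funnel, from all potential enterprise users worldwide down to active EvolutionaryScale Forge commercial subscribers, showing each conversion stage and its estimated magnitude as of 2026.
Funnel counts are analytical estimates; no authoritative published survey of PLM API adoption stages exists. The global count of protein engineering companies is estimated from biotech / pharma industry databases. The number of companies with computational biology budgets is derived from the share of the global top-1000 pharma / biotech companies that maintain dedicated in-house computational teams. Free-tier ESM user counts are extrapolated from HuggingFace download metrics (6,320+ ESM-C downloads) and GitHub star counts. Commercial Forge and enterprise contract counts are estimates; EvolutionaryScale never publicly disclosed subscriber or contract numbers. All figures are directional order-of-magnitude estimates that require diligence verification.
[CM020, CM021, CM027, CM031]
2.5 Sizing and Adoption Diligence Gaps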
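Several material evidence gaps limit the precision of the market analysis. First, no independently published size exists for the serviceable market of protein language model APIs in pharma R&D. Every protein engineering market estimate ($2.2B–$23.59B) covers the full market, including reagents, instruments, and services; no accessible analyst report sizes the API-only or software-only submarket separately. Deriving a PLM API SAM requires estimating what share of protein engineering spend computational tools can capture, an inference that depends heavily on assumptions and lacks an independently verifiable basis.
Second, EvolutionaryScale's Forge API pricing and actual paid subscriber count were never publicly disclosed. HuggingFace downloads (ESM-C 6,320+, ESM3 3,110+) and PyPI install counts show developer traction, but without the Forge conversion funnel and pricing structure they cannot be translated into paid commercial revenue. Third, a bioRxiv preprint search returns 129 papers citing ESM3/EvolutionaryScale, and the ESM3 Science paper (DOI: 10.1126/science.ads0018) is widely cited; academic citation impact, however, does not map directly onto commercial penetration in pharma. Fourth, analyst estimates for the protein engineering market span roughly 10× across nearby base years ($2.2B to $23.59B); the range is driven mainly by scope (tools only vs. instruments + reagents + services), not genuine market disagreement, but investors comparing sources without adjusting for scope may reach wrong conclusions.
Evidence reserved for later chapters: (a) no paying-customer disclosure was identified, a diligence gap for Chapter 4 (Business Model); (b) Chapter 3 (Competitors) should obtain comparable protein LM API pricing benchmarks against the OpenAI API, the Anthropic API, and AWS Bedrock; (c) whether the commercial ESM3 Forge API was used in any IND submission or regulatory filing is unconfirmed, a gap to be addressed in Chapter 5 (Technology) or Chapter 7 (Regulatory). [CM032, CM033, CM034, CM035, CM036]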
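
The assumption-dependence of a PLM API SAM can be made concrete with a small sensitivity grid. The sketch below multiplies a tools-market base by two shares: the portion of spend that is software, and the portion of that software spend that is AI-native. The two market bases come from the estimates cited in this report; every share value is a hypothetical placeholder chosen only to show the spread, not a sourced figure, and the function names are illustrative.

```python
# Sensitivity sketch: SAM = tool market × software share × AI-native share.
# All share values are HYPOTHETICAL placeholders, not sourced estimates.
from itertools import product
from typing import Iterable, List, Tuple


def serviceable_market(tool_market_usd_b: float, software_share: float,
                       ai_native_share: float) -> float:
    """Return an illustrative SAM in USD billions."""
    for share in (software_share, ai_native_share):
        if not 0.0 <= share <= 1.0:
            raise ValueError("shares must lie in [0, 1]")
    return tool_market_usd_b * software_share * ai_native_share


def sensitivity_grid(tool_markets: Iterable[float], software_shares: Iterable[float],
                     ai_native_shares: Iterable[float]) -> List[Tuple[float, float, float, float]]:
    """Evaluate every combination of inputs; each row is (market, sw, ai, sam)."""
    return [
        (m, s, a, serviceable_market(m, s, a))
        for m, s, a in product(tool_markets, software_shares, ai_native_shares)
    ]


TOOL_MARKETS_USD_B = (3.9, 5.09)   # from the report: MarketsandMarkets 2024, Precedence 2025
SOFTWARE_SHARES = (0.10, 0.20)     # hypothetical
AI_NATIVE_SHARES = (0.25, 0.50)    # hypothetical

GRID = sensitivity_grid(TOOL_MARKETS_USD_B, SOFTWARE_SHARES, AI_NATIVE_SHARES)

if __name__ == "__main__":
    for market, sw, ai, sam in GRID:
        print(f"base ${market}B, software {sw:.0%}, AI-native {ai:.0%} -> SAM ${sam:.3f}B")
```

Even with only two placeholder values per input, the result spans roughly a fivefold range (about $0.1B to $0.5B), which illustrates why a single SAM number would carry little weight without independently sourced shares.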
2.6 Exhibits
03 Competitive Landscape
3.1 Competitive Universe and Category Tiers
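EvolutionaryScale sits at the intersection of protein language modeling, generative biology, and AI-enabled drug discovery. The competitive universe divides into four categories.
The first and most direct category is AI protein design platform peers: companies that build and commercialize foundation models specifically for protein engineering and design. Profluent Bio (San Francisco, about $44M raised to date) uses ProGen-derived models and launched OpenCRISPR, which its website describes as the world's first AI-designed gene editor. Cradle.bio (Amsterdam, about $73M raised to date) offers a SaaS protein engineering platform that combines its own wet-lab data loop with customer data and claims 2–12x faster development cycles; its customers include Novonesis (formerly Novozymes). Generate Biomedicines (Somerville, MA, $700M+ raised to date) runs the Generate Platform, a continuously trained generative biology loop that has generated, built, and tested 42,000+ proteins across 140,000+ sq ft of lab space, with active large-pharma partnerships. AbSci Corporation (NASDAQ: ABSI; Vancouver, WA) is the only publicly listed direct peer, with an AI Drug Creation Platform for de novo antibody design on a 6-week iteration cycle, and it filed its FY2025 10-K with the SEC in March 2026. Adaptyv Bio (Lausanne, Switzerland) positions itself as a cloud lab for protein designers at the Biopole Life Science Campus.
The second category covers foundation model and broader bio-AI peers. Isomorphic Labs (London, an Alphabet subsidiary) holds the exclusive commercial license to AlphaFold 3 in drug discovery following the May 2024 joint announcement with Google DeepMind, and signed landmark deals with Eli Lilly and Novartis in early 2024. Chai Discovery (San Francisco) released Chai-1 as an open model and is advancing Chai-2 for drug-like de novo antibody design at atomic precision. Xaira Therapeutics (San Francisco, about $1B raised at its April 2024 launch) is building predictive and agentic AI models across the full drug discovery spectrum. Iambic Therapeutics uses its Enchant and NeuralPLexer AI technologies and has Phase 1b data for IAM1363 (a HER2 inhibitor). Inceptive (Palo Alto/Berlin/Zurich, founded 2021) focuses on foundation models for RNA/mRNA/siRNA/ASO/peptides.
The third category is AI drug discovery integrators. Insilico Medicine (HKEX: 3696) has the most advanced clinical proof among AI-first companies, having completed a Phase 2 trial of ISM001-055 (a TNIK inhibitor for IPF). Recursion Pharmaceuticals (NASDAQ: RXRX) acquired Exscientia, operates 50+ petabytes of phenomics data, and owns BioHive-2 (built with NVIDIA). Schrödinger (NASDAQ: SDGR) is the dominant physics-based modeling platform, with 30+ years of R&D and the FEP+, WaterMap, and LiveDesign tools.
The fourth category covers academic and open-source players. The Institute for Protein Design (Baker Lab, UW), whose co-director David Baker shared the 2024 Nobel Prize in Chemistry, distributes RFdiffusion and RoseTTAFold as free open-source tools and has a royalty-free COVID-19 vaccine approved in the UK and South Korea. Google DeepMind makes AlphaFold 3 available through the AlphaFold Server (free, non-commercial) and the EMBL-EBI AlphaFold DB (200M+ structures, CC-BY-4.0). The most adversarial open-source threat is Meta's ESM model family (ESM2, ESMFold), created by Alexander Rives, Zeming Lin, Tom Sercu, and Salvatore Candido, largely the same group that went on to found EvolutionaryScale, and released under the MIT license, which sets a commoditized floor for basic protein language modeling. [CP001, CP002, CP003, CP004, CP005, CP006]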
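| Competitor | Category | Scale / funding | Target segment | Core product | Differentiation | Limitations |
|---|---|---|---|---|---|---|
| Profluent Bio | Direct AI protein design | About $44M raised to date (estimate) | Biotech / pharma protein engineering | ProGen-based models; OpenCRISPR | First AI-designed gene editor; open-access publication strategy | Compute and data scale smaller than ESM3; limited wet lab |
| Cradle.bio | Direct AI protein design | About $73M raised to date (estimate) | Biopharma; industrial biotech | SaaS protein engineering platform; own wet lab | Data flywheel lock-in; SOC2; Novonesis partnership; royalty-free model | No large-scale foundation model pretraining; customer-data models limit generalization |
| Generate Biomedicines | Direct AI protein design | About $700M+ raised to date (estimate) | Biopharma therapeutics | The Generate Platform; 42K+ proteins tested; 140K+ sq ft of labs | Best funded; full wet-lab loop; large-pharma partnerships | No public API; partner access only; heavy capital expenditure |
| AbSci Corporation | Direct AI protein design | NASDAQ:ABSI (public) | Biopharma antibody programs | AI Drug Creation Platform; 6-week cycle; ABS-201 candidate | Public-company transparency; wet lab + AI iteration; SEC filings | Quarterly revenue pressure; no approved product yet |
| Adaptyv Bio | Direct AI protein design | Early stage (undisclosed) | Academic institutions; small biotechs | Cloud lab for protein designers | Location in a Swiss life-science hub; Biopole campus | Very little public information; product scope unclear |
| Isomorphic Labs | Foundation model bio-AI | Alphabet-funded (undisclosed) | Large-pharma drug discovery | AlphaFold 3 commercial license; drug discovery platform | Exclusive AF3 commercial rights; Alphabet resources; Lilly/Novartis deals | Non-commercial AF3 remains free through the Server; scope limited to drug discovery |
| Chai Discovery | Foundation model bio-AI | Private (undisclosed) | Drug discovery; antibody design | Chai-2 de novo antibody design | Atomic-precision antibody design; Chai-1 openly released | Early stage; compute scale smaller than EvolutionaryScale |
| Baker Lab / IPD (UW) | Academic / open source | NSF/NIH/public funding | Global academia; spinout biotechs | RFdiffusion; RoseTTAFold; protein therapeutics | 2024 Nobel Prize (Baker); royalty-free tools; spinout with an approved COVID-19 vaccine | Non-commercial entity; distributes tools that compete with paid products |
| DeepMind AlphaFold | Academic / platform | Alphabet (unconstrained) | Global research; pharma | AlphaFold 3; AFDB (200M+ structures; CC-BY-4.0) | Free for research use; 3M+ users in 190+ countries / regions | AlphaFold Server is non-commercial only; commercial AlphaFold 3 use in drug discovery runs exclusively through Isomorphic |
| Meta FAIR / ESM2 | Open-source baseline | Meta (unconstrained) | All biology researchers | ESM2 (MIT license); ESMFold | MIT license covers all uses including commercial; same founders as EvolutionaryScale | Unmaintained since 2023; no multimodal inference; superseded internally by ESM3 |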
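Funding figures for Profluent (about $44M), Cradle (about $73M), and Generate Biomedicines (about $700M+) are estimates from public reporting and analyst context; not every company has formally confirmed exact totals. AbSci is a public company (NASDAQ:ABSI) whose financials are disclosed to the SEC. Isomorphic Labs, Chai Discovery, and Adaptyv Bio have not publicly disclosed funding. The "Limitations" cells reflect publicly observable constraints, not internal assessments.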
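[CP001, CP002, CP003, CP004, CP005, CP006]
Figure: Ordinal positioning of the main competitors on two evidence-backed axes: research-tool side to clinical / drug-pipeline side (x-axis), and open / free access to proprietary / commercial (y-axis). Positions are evidence-backed ordinal scores (1–5), not numerical metrics.
Axis positions are evidence-backed ordinal scores (1 = most research-oriented / open, 5 = most clinical / proprietary). The values are not measurements; they are relative rankings based on public information about product type, licensing, and pipeline status as of May 2026.
[CP001, CP002, CP007, CP009, CP010, CP011]
3.2 Direct AI Protein Design Peers: Detailed Profiles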
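Among the direct AI protein design platform peers, each company has a distinct commercialization model and differentiation strategy relative to EvolutionaryScale.
Profluent Bio (San Francisco, 2022) focuses on protein-centric AI and takes a public-good approach with OpenCRISPR, releasing the first AI-designed gene editor as open access. Its core ProGen model architecture targets protein sequence generation. Profluent's estimated funding base of about $44M is markedly smaller than EvolutionaryScale's $142M, implying more limited compute and data infrastructure.
Cradle.bio (Amsterdam, 2021) differentiates through a SaaS platform that combines customer wet-lab data with its own protein engineering models. Each time a customer uploads experimental results the models improve ("models learn with you"), creating data flywheel lock-in. Cradle claims customers can speed protein development cycles by 2–12x and explicitly uses a royalty-free subscription model in which customers own all generated IP. The company runs its own wet lab in Amsterdam as a proof-of-concept layer. Its partnership with Novonesis represents one of the world's largest industrial biotech companies embedding AI in its innovation workflow. The platform is SOC 2 compliant and supports single sign-on.
Generate Biomedicines (Somerville, MA, 2018) is the most heavily funded direct peer, with more than $700M raised to date. The Generate Platform integrates AI model training, high-throughput protein expression, and iterative learning across 140,000+ sq ft of lab space. The company has tested 42,000+ proteins, forming a continuously improving feedback loop. Its lead program, GB-0895, targets TSLP in asthma and was co-optimized from the outset for stronger biological effect and less frequent dosing (with twice-yearly potential). Active partnerships with large biopharma companies indicate platform-level pharma validation.
AbSci Corporation (NASDAQ: ABSI, Vancouver, WA) is the only publicly listed direct competitor. Its AI Drug Creation Platform integrates wet lab and AI in a 6-week iteration cycle for de novo biologic (antibody) design and multi-parameter lead optimization. AbSci's ABS-201, an AI-designed antibody targeting prolactin receptors in androgenetic alopecia, has shown hair follicle regeneration in vivo. The FY2025 10-K, filed with the SEC on March 24, 2026, provides public disclosure that private peers lack, but it also exposes AbSci to quarterly revenue pressure.
Adaptyv Bio (Lausanne, Switzerland) positions itself as a cloud lab for protein designers at the Biopole Life Science Campus. Public information is markedly thinner than for the other direct peers, suggesting the company is still early stage or has not yet disclosed much. [CP018, CP019, CP020, CP021, CP022, CP023]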
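| Capability | EvolutionaryScale (ESM3) | Profluent | Cradle.bio | Generate Biomedicines | AbSci | AlphaFold 3 | Meta ESM2 |
|---|---|---|---|---|---|---|---|
| Multimodal (joint sequence + structure + function) | Yes, simultaneous inference | Sequence-centric (partial) | Within-project optimization | Generate-build-measure loop | Wet lab + AI iteration | Structure + molecular interactions | No, sequence only |
| De novo generative design | Yes (ESM3 generative) | Yes (ProGen-based) | Yes (guided by experimental data) | Yes (platform core) | Yes (de novo antibodies) | Limited (structure prediction) | Limited (embeddings / prediction) |
| Self-serve API / developer access | Yes (Forge, public beta) | Unknown | Yes (SaaS platform) | No (partners only) | No (partners only) | Yes (AlphaFold Server, free) | Yes (HuggingFace, MIT, free) |
| Wet-lab validation loop | No (no wet lab disclosed) | Unknown | Yes (Amsterdam wet lab) | Yes (140K+ sq ft) | Yes (integrated cycle) | No | No |
| Commercial license (non-research) | Yes (Forge paid) | Unknown | Yes (subscription) | Yes (partners) | Yes (partners + public company) | Yes (via Isomorphic Labs) | Yes (MIT, free) |
| Largest model size (parameters) | 98B (largest disclosed) | ~1B–10B (estimate) | Unknown | Unknown | Unknown | Unknown (large) | 15B (largest free variant) |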
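Unknown cells indicate a lack of public disclosure; they are evidence gaps, not negatives. "Limited" means the capability is only partly present. The "public company" qualifier for AbSci refers to its NASDAQ:ABSI listing, not to a public API. Meta ESM2 15B is the largest freely available public protein LM. AlphaFold 3's parameter count has not been publicly disclosed.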
[CP003, CP004, CP005, CP019, CP020, CP027]
3.3 Pricing, Packaging, and Distribution Comparison
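Pricing models in AI protein design reflect fundamentally different commercialization philosophies. EvolutionaryScale offers Forge, a commercial API platform for access to ESM3 and ESM-C, with a public beta announced in early 2025. The ESM3 small model (open) and ESM-C (300M/600M) are free on HuggingFace for non-commercial and research use; larger models and generative API access require a Forge account.
Cradle.bio positions itself explicitly as a software subscription ("no royalties, just a software subscription fee"): customers retain all generated IP, and experimental data is never used to train models for other customers. This SaaS stickiness differs from EvolutionaryScale's usage-based API model.
Meta's ESM2 is MIT-licensed and freely available on GitHub and HuggingFace at zero cost for all uses, including commercial. ESMFold structure prediction is equally open. DeepMind's AlphaFold DB (200M+ structures) is licensed CC-BY-4.0 for any use; the AlphaFold Server is free for non-commercial research. Commercial use of AlphaFold 3 in drug discovery runs through Isomorphic Labs' exclusive licensing arrangements with pharma companies.
Generate Biomedicines and AbSci both use B2B partnership and licensing models aimed at large pharma, not self-serve APIs. Neither has a public pricing page or developer API. Baker Lab/IPD tools (RFdiffusion, RoseTTAFold) and OpenFold are entirely free under permissive licenses.
Key diligence gaps: Forge enterprise list prices, volume tiers, and actual per-call costs are not publicly disclosed. The economics of Isomorphic Labs' deals with Lilly and Novartis have been reported, but public information does not break them down. [CP027, CP028, CP029, CP030, CP031, CP032]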
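| Vendor | Access model | Approximate price / tier | Capabilities included | IP / data arrangement | Key unknowns / gaps |
|---|---|---|---|---|---|
| EvolutionaryScale (Forge) | API (usage-based) | Public beta; enterprise pricing undisclosed | ESM3 and ESM-C inference; sequence / structure / function generation | Users retain IP in generated proteins | Enterprise list prices and usage tiers not public |
| Cradle.bio | SaaS subscription | List price undisclosed | Protein engineering; multi-property optimization; data management | Customer retains full IP; royalty-free; data not shared | Exact subscription cost undisclosed |
| Meta ESM2 / ESMFold | Open source (MIT license) | Free (zero cost) | Sequence embeddings; structure prediction (ESMFold) | MIT: any use, including commercial | None; freely usable |
| DeepMind AlphaFold (Server + DB) | Free Server + DB | Free (DB under CC-BY-4.0; Server non-commercial only) | 200M+ structures; AF3 predictions | DB: CC-BY-4.0 for any use; commercial AF3 use only through Isomorphic | Isomorphic Labs commercial pricing not public |
| Baker Lab / IPD tools | Open source (permissive license) | Free | RFdiffusion; RoseTTAFold; design tools | Royalty-free; open source | None; fully open |
| Generate Biomedicines | Strategic partnership | Negotiated pricing (undisclosed) | Full lab + AI generative biology platform | Partnership and licensing arrangements | Not self-serve; prices private; deal terms private |
| AbSci (NASDAQ:ABSI) | Partnership + SEC-disclosed revenue | Negotiated per partnership | AI antibody design + wet-lab validation cycle | Partnership licenses; SEC filings disclose fiscal-year revenue | Actual economics of individual deals not itemized |
| Schrödinger (NASDAQ:SDGR, public) | Enterprise software license | ARR about $130–150M (2024 analyst estimate) | FEP+; WaterMap; LiveDesign; drug pipeline programs | Software license; pipeline advanced by separate entities | Specific per-seat or per-module pricing not public |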
定价数据基于公开信息。EvolutionaryScale Forge 企业定价、Cradle 订阅成本和 Isomorphic Labs 交易经济性均未披露。Schrödinger ARR 是来自 2024 年覆盖报告的分析师估计,并非公司披露。AbSci 收入细节见 NASDAQ:ABSI 季度 / 年度文件。
[CP027, CP028, CP029, CP030, CP031, CP032]
3.4 Moat Durability, Commoditization Risk, and Counter-Analysis
EvolutionaryScale 最具结构性意义的竞争风险,是蛋白质语言模型的开源商品化。ESM2 和 ESMFold 由 Alexander Rives、Zeming Lin、Tom Sercu、Salvatore Candido 在 Meta AI FAIR 开发,在 GitHub 和 HuggingFace 上以 MIT 许可开放,可用于包括商业用途在内的任何目的。ESM2 版本覆盖 8M 至 15B 参数;ESMFold 预测蛋白质结构的速度最高可比此前 SOTA 快 60x。OpenFold 提供 AlphaFold 的独立开源复现,并采用宽松许可。任何生物技术或制药公司都可以部署这些免费模型作为基线,直接压缩 ESM3 在不需要多模态提示或前沿规模应用中的价值。 EvolutionaryScale 试图通过三种策略逃离这条底线:(1)前沿规模——ESM3 拥有 98B 参数和 10^24 FLOPs,是具备已验证涌现生成能力的最大蛋白质语言模型;(2)多模态差异化——ESM3 是首个同时推理蛋白质序列、结构和功能的模型,这一能力 ESM2 不具备;(3)R&D 速度——ESM Cambrian(ESMC)于 December 2024 发布,维持新模型发布节奏。 但反向信号也很明确。第一,Chai Discovery 的 Chai-2 推进原子级精度 de novo 抗体设计,Isomorphic Labs 持有 AlphaFold 3 商业独家权,说明多模态蛋白质设计空间正在变挤。第二,临床概念验证护城河掌握在 Insilico Medicine(AI 设计药物完成 Phase 2)和 Recursion(多项目临床管线)手中,而 EvolutionaryScale 没有披露内部管线。第三,Generate Biomedicines 的资本基础($700M+)远超 EvolutionaryScale($142M),能够支撑更资本密集、经实验室验证的策略。第四,制药客户可以并且确实会同时使用免费工具(AlphaFold、ESM2)和付费平台(Forge、Cradle、Generate),限制任何单一供应商的定价权。 [CP033, CP034, CP035, CP036, CP037, CP038]
| 护城河主张 | 威胁 / 反证 | 严重性 | 缓解 / 尽调问题 |
|---|---|---|---|
| 前沿规模——98B 参数、10^24 FLOPs | 开源模型能力逼近(ESM2 15B 免费;Chai-2 进展中);快速收敛 | 高 | 验证 ESM3 98B 在独立基准上是否相较较小免费模型给出显著更好的科学结果 |
| 多模态联合推理(序列 + 结构 + 功能) | AlphaFold3 处理分子 + 结构;Chai-2 瞄准抗体结构 + 功能;收敛在加速 | 中 | 评估有多少药企工作流真正需要同时进行三模态提示 |
| Forge API 商业分发 | 自助式 API 的 GTM 护城河低;Cradle 的数据飞轮 SaaS 粘性更强;未披露多年期企业合同 | 高 | 判断 Forge 是否有带切换成本的企业合同,还是按调用付费 |
| 创始人声誉 / ESM 研究传承 | 同一批创始人在 Meta 免费发布 ESM2(MIT),给自家付费产品留下了一个高质量免费替代品 | 高 | 向创始人确认模型 API 之外的商业化策略;评估研究信誉能否带动企业客户管线 |
| AWS SageMaker + NVIDIA BioNeMo 分发 | 两个渠道都非独家;同一市场上也有多个竞争对手(Recursion 等) | 中 | 确认 EvolutionaryScale 在 AWS / NVIDIA 渠道是否有合同独家性或优先级 |
| 未披露临床管线 | 已有临床证据的竞争对手(Insilico Phase 2;Recursion 多阶段管线)能拿到更高的药企交易金额 | 高 | 评估 EvolutionaryScale 是否计划做内部治疗项目,还是继续做纯工具公司 |
严重程度反映威胁一旦兑现会如何冲击 EvolutionaryScale 的竞争耐久性。所有威胁均基于公开可见证据;尽调团队不知道的内部战略缓释措施未纳入。
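[CP033, CP034, CP035, CP036, CP037, CP038]
Coverage matrix of seven major competitors across six core purchase-criteria capabilities, showing which capabilities each platform supports based on public evidence.
"Partial" means public sources show only partial or limited capability. "Unknown" cells mean no public evidence was found; all cells in this table are drawn from public product pages. Clinical pipeline status is as of May 2026, from official sources.
[CP003, CP004, CP005, CP018, CP019, CP020]
Summary, based on public evidence as of May 2026, of EvolutionaryScale's key competitive-durability metrics relative to peers.
[CP001, CP002, CP003, CP007, CP010, CP033]
3.5 Exhibits
04 Financials
4.1 Revenue Sources and Pricing Model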
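EvolutionaryScale's revenue model is built around Forge, the commercial API platform that provides inference access to ESM3, ESM Cambrian (ESM-C), and related protein language models. Forge entered public beta in January 2025. Its intended pricing unit is token usage for protein sequence generation and structure prediction, although beta access was offered free for a limited time. The forge.evolutionaryscale.ai interface requires login and publishes no price list; enterprise contract terms are negotiated privately. Public sources point to three revenue streams: (1) pay-per-use Forge API calls, (2) annual, volume-based enterprise API access contracts, and (3) distribution through strategic partners, where NVIDIA BioNeMo and AWS SageMaker JumpStart host ESM models in the cloud, possibly under revenue-share or referral-fee arrangements with NVIDIA and Amazon respectively.
ESM Cambrian (released December 2024) is split across tiers: the 300M and 600M models are open weights under a non-commercial license, while ESMC-6B is available only through Forge and AWS SageMaker JumpStart. This differs from Meta AI Research's ESM2, which is MIT-licensed and remains free on HuggingFace for any use. Keeping the largest ESM-C model behind the API reinforces the API-access model. The Forge product page lists a free academic tier with a capped token quota.
During its time as an independent entity, EvolutionaryScale never disclosed any revenue figure, ARR, or customer metric. The November 2025 acquisition by CZI Biohub fundamentally interrupted the commercial path; as of May 2026, public sources have not confirmed the commercial status or pricing of the Forge API under CZI Biohub. [CI001][CI002][CI003][CI004][CI005][CI006][CI007][CI008][CI009][CI010]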
| 收入来源 | 机制 | 单位 / 定价 | 当前状态(2026 年 5 月) | 证据质量 | 尽调追问 |
|---|---|---|---|---|---|
| Forge API — 按量付费 | 对 ESM3 / ESM Cambrian 蛋白生成和结构预测按 token 或按调用收费 | 未公开列示;需登录 forge.evolutionaryscale.ai | CZI 交易后运营状态不明;公测版于 2025 年 1 月上线 | 低 — 定价未披露;商业发布已确认 | 向 CZI Biohub 索取当前 Forge API 定价和使用指标 |
| 企业 API 访问合同 | 按用量定价的 Forge API 年度访问许可;逐客户谈判 | 未披露;估计每年为中五位数至中六位数美元 | CZI 之后状态不明;未公开点名企业客户 | 低 — 合同结构由产品设计推断;没有已确认交易 | 识别已签约企业客户;尽调中取得合同模板 |
| NVIDIA BioNeMo 分发 | ESM3 托管在 NVIDIA BioNeMo NIM 平台;可能有收入分成或联合营销费用 | 未披露;NVIDIA 合作条款未公开 | 活跃 — 截至 2026 年 ESM3 仍列在 BioNeMo | 中 — NVIDIA 公告确认分发 | 向 NVIDIA 确认收入分成或实物支持条款 |
| AWS SageMaker JumpStart 分发 | ESM 模型上架 AWS SageMaker JumpStart;Amazon 是 A 轮领投方 | 未披露;可能是云额度等实物支持,而非现金收入 | 活跃 — 截至 2026 年 AWS 上架已确认 | 中 — 分发已确认;财务条款未知 | 判断 Amazon 关系带来现金收入,还是仅抵扣云额度 |
| 学术 / 免费层 | 面向学术和研究用户的封顶 token 配额 | 免费;作为付费层转化漏斗 | Forge 产品页有列示;CZI 之后状态不明 | 低 — 仅作为功能列示;没有转化指标 | 确认学术层转化率,以及能否带动企业客户管线 |
所有定价数据均未披露。状态判断基于产品页审阅和合作伙伴公告。CZI Biohub 交易(2025 年 11 月)之后,Forge API 在 CZI 旗下能否延续商业化尚未确认。
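[CI001, CI002, CI005, CI006, CI007]
| Product / Channel | List Price vs. Realized Price | Discounts / Unknowns | Source |
|---|---|---|---|
| Forge API (ESM3 / ESM Cambrian), public beta | List price: unpublished; login required; per-token rate unconfirmed | Free academic tier and presumed enterprise discounts; realized prices undisclosed | forge.evolutionaryscale.ai (official, login required); ESM Cambrian blog (official) |
| Enterprise API contracts | Undisclosed; unlisted; negotiated individually | Volume discounts, exclusivity, and indication scope all variable and unknown | Inferred from product structure; no public contract samples |
| NVIDIA BioNeMo, ESM3 NIM | Included in the NVIDIA BioNeMo platform; user pricing follows NVIDIA terms and is not set by EvolutionaryScale | Revenue-share terms with EvolutionaryScale undisclosed; cash revenue may be zero | NVIDIA BioNeMo product page; nvidianews.nvidia.com partnership announcement |
| AWS SageMaker JumpStart, ESM models | AWS marketplace pricing (per instance-hour); EvolutionaryScale's share of listing fees undisclosed | Amazon's Series A lead role may include in-kind cloud compute, which could zero out both cash cost and revenue | AWS SageMaker JumpStart product page; CNBC Series A announcement |
Public sources contain no realized-price data. All figures are inferred from product structure and industry analogies. The dominant risk is that ESM2 (Meta open weights) offers a zero-cost substitute for embeddings, capping pricing power in non-generative use cases.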
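[CI001, CI004, CI006, CI007]
How customer interactions with the Forge API and distribution channels translate into EvolutionaryScale revenue streams.
No revenue figures or pricing are publicly disclosed. Node details reflect known mechanisms and confirmed partnership structures, not dollar amounts. The commercial flow was interrupted by the CZI Biohub transaction.
[CI001, CI005, CI006, CI007, CI009]
4.2 Cost Structure, Gross Margin Drivers, and Capital Intensity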
EvolutionaryScale 的成本结构由三类主导:研究人员、模型训练 GPU 算力,以及 Forge API 推理基础设施。公司训练 ESM3 消耗超过 10^24 FLOPs——按公司博客说法,高于此前任何生物模型——运行在其称为 “one of the highest throughput GPU clusters in the world today” 的集群上。按当时云端 GPU 价格(H100 每 GPU-hour $2–5)估算,单次训练成本约 $10–50 million,成为主导性资本开支。 人员成本受员工数限制:LinkedIn 显示公司处于 11-50 人区间,峰值可能约 25-50 FTE。按每名 FTE $200,000–$300,000 的综合全成本(San Francisco AI 研究人才的常规水平)计算,年度人员烧钱约 $5–15 million。Forge API 的持续推理成本会随 API 调用量增加而形成可变成本——98B 参数 ESM3 的蛋白质生成单次查询计算成本很高。 API 推理业务的毛利率,很大程度取决于公司自营 GPU 集群(资本密集、长期毛利更高)还是租用云算力(capex 更低、COGS 更高)。公司未披露毛利率数据。正向信号是 Amazon 领投 Series A:AWS 可能在交易结构中提供了可观的实物云抵扣,实质性降低近期基础设施成本。ESM2 模型在 HuggingFace 上开源,给向量嵌入市场建立了永久竞争成本底线,这是 ESM Cambrian 商业层的结构性毛利压缩风险。 [CI011][CI012][CI013][CI014][CI015][CI016][CI017]
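The training-cost estimate above can be sanity-checked with a back-of-envelope calculation. The sketch below converts a FLOP budget into GPU-hours and dollars. Two inputs are assumptions not found in the source: a dense BF16 peak of about 989 TFLOP/s per H100, and a hypothetical model FLOPs utilization (MFU) of 20-40%. The function names are illustrative only.

```python
# Back-of-envelope GPU-hours and cost for a fixed training FLOP budget.
# ASSUMPTION: ~989e12 FLOP/s dense BF16 peak per H100; MFU values are hypothetical.
H100_PEAK_FLOPS = 989e12


def training_cost_usd(total_flops, mfu, usd_per_gpu_hour, peak_flops=H100_PEAK_FLOPS):
    """Return (gpu_hours, cost_usd) for a single training run."""
    if not 0 < mfu <= 1:
        raise ValueError("mfu must be in (0, 1]")
    gpu_seconds = total_flops / (peak_flops * mfu)
    gpu_hours = gpu_seconds / 3600
    return gpu_hours, gpu_hours * usd_per_gpu_hour


def implied_mfu(total_flops, cost_usd, usd_per_gpu_hour, peak_flops=H100_PEAK_FLOPS):
    """Utilization at which a single run would cost exactly `cost_usd`."""
    gpu_hours = cost_usd / usd_per_gpu_hour
    return total_flops / (gpu_hours * 3600 * peak_flops)


if __name__ == "__main__":
    for mfu in (0.2, 0.4):
        for price in (2.0, 5.0):
            hours, cost = training_cost_usd(1.07e24, mfu, price)
            print(f"MFU={mfu:.0%}, ${price}/GPU-h: {hours:,.0f} GPU-h, ${cost / 1e6:.1f}M")
```

Under these assumptions a single pass at 1.07×10^24 FLOPs comes out in the low-to-mid single-digit millions of dollars. The $10–50M figure is therefore best read as covering more than one clean run (experiments, ablations, restarts, reserved capacity) or much lower utilization; `implied_mfu` shows what utilization a given dollar figure would imply.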
| 成本类别 | 主要驱动因素 | 估计规模 | 证据基础 | 置信度 |
|---|---|---|---|---|
| GPU 算力 — 模型训练 | ESM3 训练:在高吞吐集群上 >10^24 FLOPs | $10–50M 一次性(估计);H100 算力按 $2–5 / GPU-hour | ESM3 官方博客直接引用;NVIDIA BioNeMo 博客 | 中 — 算力规模已确认;成本费率为估计 |
| GPU 算力 — 推理(Forge API) | 每次 API 调用跑 ESM3(98B 参数)推理;随调用量变化 | $1–5M / 月,规模化后(估计);对调用量高度敏感 | 根据模型规模和云 GPU 定价基准推断 | 低 — 未披露调用量数据 |
| 人员 | ~25–50 FTE;AI 研究员、ML 工程师、平台工程师 | $5–15M / 年(估计);SF 每名 FTE 全包 $200–300K | LinkedIn 公司规模 11-50 档;Wikipedia 员工数 | 低 — 员工数为估计;无薪酬数据 |
| 基础设施 / 云(非训练) | API 服务、存储、数据管线、内部工具 | $0.5–2M / 月(估计) | 根据 API 业务基准推断;领投方 Amazon AWS 可能提供额度 | 低 — 未确认;Amazon 可能提供实物支持 |
| 研究 / 数据 | 训练数据许可、学术数据集访问、湿实验验证(有限) | 低至中等;大多数训练数据(UniProt、PDB)公开 | ESM3 博客引用公开蛋白序列数据库 | 中 — 数据成本可能较低;算力占主导 |
| G&A / 公司管理开销 | 支撑 25-50 FTE 的法务、财务、HR、办公场地 | $1–3M / 年(估计) | 早期 SF AI 公司基准 | 低 — 估计 |
所有规模估计都由分析师根据公开代理指标推导;没有财务报表或已确认成本数据。主导成本驱动因素是训练和推理所需的 GPU 算力。Amazon 作为领投方可能提供了云额度等实物支持,从而降低现金基础设施支出。估计仅为数量级。
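[CI011, CI012, CI013, CI014, CI015]
Qualitative flow of revenue inputs and major cost deductions for the Forge API inference business; all values are estimates.
All values are estimated from headcount proxies, GPU pricing benchmarks, and cloud API analogies. EvolutionaryScale has not disclosed actual revenue, COGS, or gross margin data.
[CI011, CI012, CI014, CI015, CI016]
4.3 Capital Adequacy and the CZI Biohub Transaction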
EvolutionaryScale 的资本路径很清楚:先是 2023 年底约 $3M 的种子轮,随后在 September 26, 2024 完成 $142M Series A(CNBC、Axios 和 NVIDIA 同步披露)。 这一轮由 Amazon 和 NVIDIA 领投,Lux Capital、Nat Friedman、Daniel Gross 跟投;据报道投后估值约 $1.35B。累计融资约 $145M。 按每月 $5–15M 的估计烧钱速度(员工数 + 算力 + 管理开销)测算,$142M Series A 自 September 2024 交割起,理论上可支撑 9–28 个月现金跑道。实际现金跑道在 November 2025 结束——距离 Series A 交割仅约 14 个月——EvolutionaryScale 团队随后加入 CZI Biohub。 联合创始人兼首席科学家 Alex Rives 出任 CZI Biohub 科学负责人。交易条款未披露;研究未在 EDGAR 或其他公开监管数据库中找到由该交易触发的 SEC 文件(例如 Form D 修订、Hart-Scott-Rodino 披露或收购通知)。 $142M Series A 投资人(Amazon、NVIDIA、Lux Capital 和天使投资人)的财务结果仍不清楚。 如果 CZI/Biohub 交易是对实体的现金收购,投资人应获得分配;如果只是未购买实体的人才收购(acqui-hire), 交易发生时 $142M 资本大概率已被大幅消耗,投资人回收有限。没有任何公开交易金额披露,这一风险评估只能保持开放。 研究中发现一个值得注意的财务合规缺口:SEC EDGAR 中没有出现 EvolutionaryScale 任何名称变体的 Form D 文件(检索过 "EvolutionaryScale"、"Evolutionary Scale"、"Evolutionary Scale Inc",以及关键人物 "Alexander Rives")。私营公司通过 Regulation D 豁免融资 $142M,通常需要在首次出售后 15 天内向 SEC 提交 Form D。没有任何 Form D,可能说明文件以其他法律实体名称提交、使用了其他证券豁免(例如面向离岸投资人的 Regulation S),也可能就是申报缺口。 [CI018][CI019][CI020][CI021][CI022][CI023][CI024][CI025][CI026][CI027][CI028]
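The runway range quoted above is simple division, and spelling it out makes the sensitivity to the burn assumption visible. The sketch below is illustrative only: the $5–15M monthly burn is the report's estimate, not disclosed data, and the helper names are hypothetical.

```python
def runway_months(cash_usd, monthly_burn_usd):
    """Months of runway at a constant monthly burn."""
    if monthly_burn_usd <= 0:
        raise ValueError("monthly burn must be positive")
    return cash_usd / monthly_burn_usd


def implied_average_burn(cash_usd, months, residual_cash_usd=0.0):
    """Average monthly burn implied if cash fell to `residual_cash_usd` after `months`."""
    if months <= 0:
        raise ValueError("months must be positive")
    return (cash_usd - residual_cash_usd) / months


if __name__ == "__main__":
    series_a = 142e6
    print(round(runway_months(series_a, 15e6), 1))  # high-burn case
    print(round(runway_months(series_a, 5e6), 1))   # low-burn case
    # Hypothetical: burn implied IF the full round had been spent by month 14.
    print(round(implied_average_burn(series_a, 14) / 1e6, 1))
```

The 14-month mark is when the CZI Biohub transaction happened, not a confirmed cash-out date, so the last line is an upper-bound scenario, not a finding. The residual cash at the time of the deal is undisclosed.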
| 指标 | 估计值 | 依据 | 置信度 | 关键假设 / 注意事项 |
|---|---|---|---|---|
| A 轮净募资 | ~$142M | CNBC、Axios、MIT Tech Review 公告;NVIDIA 合作伙伴新闻 | 高 | 交割日期为 2024 年 9 月 26 日;按标准假设约 99% 到账 |
| 种子轮资本 | ~$3M(估计) | Crunchbase、NVIDIA 种子投资新闻;金额未获公开确认 | 低 | 金额未公开确认;投资方名称已确认 |
| 累计融资(CZI 前) | ~$145M | 由 A 轮加种子轮估计推导 | 低至中 | 种子轮金额未确认 |
| 估计月度烧钱速度 | $5–15M / 月(估计) | 员工数(25-50 FTE × $200-300K)+ 算力($2-7M / 月)+ 管理开销 | 低 | 没有实际烧钱数据;区间很宽;若 AWS 提供实物额度,低端值也成立 |
| A 轮理论现金跑道(2024 年 9 月) | 9–28 个月(估计) | $142M ÷ 估计 $5–15M / 月烧钱 | 低 | 现金跑道实际随 2025 年 11 月 CZI 交易结束(约 14 个月) |
| CZI Biohub 交易(2025 年 11 月) | 条款未披露 | CNBC 2025 年 11 月文章;Biohub.org 公告 | 高 — 交易已确认;条款未知 | 投资者是否获得回报未公开披露 |
| 债务 / 项目融资义务 | 未发现 | SEC EDGAR 搜索;没有公开债务披露 | 低 | 未披露不等于没有债务 |
| SEC Form D 申报 | EDGAR 未找到任何申报 | SEC EDGAR 对 'EvolutionaryScale'、'Evolutionary Scale'、'Evolutionary Scale Inc' 做全文搜索 | 高 — 多次搜索均返回 0 条结果 | 可能以不同法律实体名称申报;也可能适用 Reg S 离岸豁免 |
私人公司不披露信息,严重限制资本充足性评估。2025 年 11 月 6 日的 CZI Biohub 交易实际上标志着 EvolutionaryScale 作为独立商业实体的终结。按独立公司口径做前瞻资本充足性分析已无意义;未来所有尽调都必须转向 CZI Biohub。融资时间线(轮次日期、投资者)已在公司概况章节确立;本表聚焦充足性和合规信号。
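[CI018, CI019, CI020, CI022, CI023, CI024]
Ranges for key financial parameters based on sources and analyst estimates; because nothing is disclosed, confidence is low to medium for all values.
Low / mid / high bounds are derived from: (1) headcount proxy (LinkedIn 11-50 band), (2) cloud GPU pricing benchmarks, (3) the $142M raise confirmed by CNBC/Axios, and (4) peer funding analogies. No EvolutionaryScale financial statements were found.
[CI018, CI019, CI022, CI011]
4.4 Peer Capital Benchmarks and Positioning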
EvolutionaryScale 的 $142M Series A 和 $1.35B 估值,处在蛋白质 AI 融资活动的中高区间: 高于 Profluent(约 $44M)、Cradle(约 $73M)这类纯基础设施公司,但明显低于 Generate:Biomedicines(约 $700M+)和 Xaira Therapeutics(创立时 $1B)等全栈药物发现 AI 公司。 对一家员工少于 50 人、尚未产生收入的基础模型蛋白质 AI 公司来说,$1.35B 估值隐含了显著溢价。 按人均融资额看,EvolutionaryScale 的资本强度极高,约每名员工融资 $3–6M。这反映的是世界级 AI 研究人才和重算力模型开发的成本,而不是已规模化的商业运营。相比之下,上市 AI 药物发现公司(AbSci、Recursion) 相对融资额拥有更大的员工基数,资金被临床和制造运营摊薄。同业对比说明,Series A 时的 EvolutionaryScale 更像前沿研究载体,而不是规模化商业企业;$1.35B 估值中内嵌的商业收入爬坡假设因此需要被追问。 [CI029][CI030][CI031][CI032]
| 公司 | 累计融资(估计) | 阶段 / 重点 | 投后估值 | 资本效率备注 |
|---|---|---|---|---|
| EvolutionaryScale | ~$145M | A 轮;蛋白 AI 基础模型;2025 年 11 月被 CZI 人才收购 | ~$1.35B(2024 年 9 月) | 人均资本高(~$3-6M / FTE);人才收购时仍未产生收入 |
| Profluent Bio | ~$44M | A 轮;蛋白设计 AI;开源 ProGen2 | ~$200-300M 估计 | 资本基数更低;范围更窄;商业化焦点更强 |
| Cradle.bio | ~$73M | B 轮;AI 蛋白工程平台 | ~$400M 估计 | 资本效率与 EvolutionaryScale 相近 |
| Generate:Biomedicines | ~$700M+ | C+ 轮;全栈 AI 蛋白疗法 | ~$2B 估计 | 规模大得多;瞄准药物收入,而非 API |
| Xaira Therapeutics | ~$1B | 种子轮 / 启动;全栈 AI 药物发现 | 启动时估值 ~$2.5B | 资本最充足的蛋白 AI 初创公司;范围最广 |
| AbSci (ABSI) | $200M+(IPO 前);2021 年上市 | 生物制造 + AI 药物设计;约 350 名员工 | 市值 ~$500M(2026 年估计) | 员工数大得多;模式不同(湿实验 + AI) |
| Isomorphic Labs | 未披露(Alphabet 子公司) | A 轮;以 AI 为先的药物设计;DeepMind 分拆公司 | N/A(子公司) | 结构上不可比;由企业母公司支持 |
同业数据来自公开新闻来源(Axios、Crunchbase、Wikipedia)和公司网站。估值为最近已知轮次的投后估计;未经审计文件确认。Cradle、Profluent、Generate 和 Xaira 数据来自公开融资公告和分析师数据库。所有比较均为近似,仅用于相对参照。
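The capital-per-employee range cited for EvolutionaryScale follows directly from the funding total and the headcount band. A minimal sketch, assuming the report's own estimates (about $145M raised, 25-50 FTE); the function names are hypothetical:

```python
def capital_per_fte(total_raised_usd, headcount):
    """Funding raised per full-time employee."""
    if headcount <= 0:
        raise ValueError("headcount must be positive")
    return total_raised_usd / headcount


def capital_per_fte_range(total_raised_usd, low_headcount, high_headcount):
    """(low, high) capital per FTE across a headcount band."""
    return (
        capital_per_fte(total_raised_usd, high_headcount),
        capital_per_fte(total_raised_usd, low_headcount),
    )


if __name__ == "__main__":
    low, high = capital_per_fte_range(145e6, 25, 50)
    print(f"${low / 1e6:.1f}M to ${high / 1e6:.1f}M per FTE")
```

Because headcount is itself an estimate from a LinkedIn size band, the output is an order-of-magnitude indicator only.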
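[CI029, CI030, CI031, CI032]
Estimated deployment of the $142M Series A over the roughly 14-month runway up to the CZI Biohub transaction.
All values are analyst estimates derived from headcount, GPU benchmarks, and timeline analysis. EvolutionaryScale has not disclosed actual use of funds. Values in the figure are illustrative ranges presented at midpoint estimates.
[CI018, CI019, CI022, CI023, CI025]
4.5 Financial Information Gaps and Diligence Path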
EvolutionaryScale 的公开财务记录几乎为空。作为私营公司,EvolutionaryScale 不需要向 SEC 提交财务报表; Form D 文件缺失,又进一步压缩了公开监管渠道可验证的信息。已审阅的公开来源中,没有任何一个披露以下关键指标: 实际收入、年经常性收入(ARR)、毛利率、客户数、流失率、现金余额或确认后的月度烧钱速度。Crunchbase 将 $142M Series A 错误归为「种子投资」,说明私募市场数据聚合器并不适合做可验证的财务分析。 CZI 交易之后,财务尽调现在必须穿透到 CZI Biohub——一家由 Chan Zuckerberg Initiative 支持的非营利机构。 尽调路径因此被根本改写:问题不再是标准 VC 财务尽调,而是(1)CZI 交易条款,以及前投资人和股东拿到了什么; (2)CZI 体系下 Forge API 与 ESM 模型家族的持续运营状态和商业化策略;(3)是否还有任何残留商业实体 (持有 IP 的 EvolutionaryScale 实体)继续独立运营。所有未完成的财务分析,都需要查看 CZI Biohub 内部文件和任何交易披露。 [CI033][CI034][CI035][CI036]
| 缺失指标 | 对分析的影响 | 尽调路径 | 优先级 |
|---|---|---|---|
| 收入和年经常性收入(ARR)(Forge API) | 无法评估商业产品可行性、定价与市场匹配度或收入轨迹 | 向 CZI Biohub 商业团队索取;获取 Forge API 分析数据 | 关键 |
| 毛利率(API 推理) | 无法评估单位经济性或盈利路径 | 向 CZI Biohub 索取销售成本拆分;对标云 AI API 同业 | 高 |
| 已确认烧钱速度 | 无法确认资本充足性或核实现金跑道;估计上下限相差 3× | 通过 CZI Biohub 调取 EvolutionaryScale 实体历史月度 P&L | 高 |
| CZI Biohub 交易条款 | 无法评估投资者回报;无法判断 $142M A 轮是否有退出价值 | 索取交易条款清单及任何投资者分配记录;若有实体进入报告义务,检查 SEC 是否有 Form 8-K 类似文件 | 关键 |
| SEC Form D 申报 | 监管合规缺口;所有 Reg D 融资都须在 15 天内提交 Form D | 用所有可能的法律实体名称搜索 EDGAR;向 CZI Biohub 法务团队索取 | 高 |
| 客户数和企业客户管线 | 无法验证 Forge API 的商业牵引力或销售效率 | 向 CZI Biohub 索取;在 LinkedIn 检查是否有客户公开提及 | 高 |
| CZI 后 Forge API 运营状态 | 无法评估商业产品在 CZI 旗下是继续维护还是收缩 | 直接测试 forge.evolutionaryscale.ai API;向 CZI Biohub 索取路线图 | 高 |
| CZI 交易中的投资者回报 | 无法判断 Amazon、NVIDIA 或 Lux Capital 是否从 $142M 投资中获得回报 | 审阅 CZI 公开收购公告、新闻稿或 SEC 等效披露;直接询问投资方 | 关键 |
截至 2026 年 5 月,通过审阅 EDGAR、公司网站、CNBC、Biohub.org、Crunchbase 和 NVIDIA 公告,已确认上述缺口。CZI Biohub 交易是主导缺口:它把其他所有财务尽调重点都压缩成一个问题——交易条款和剩余商业业务能否延续。
[CI033, CI034, CI035, CI036]
4.6 Financial Conclusions
EvolutionaryScale 的财务叙事,是异常强的早期融资之后,突然战略转向,独立商业化路径随之终止。 Amazon 和 NVIDIA 领投的 $142M Series A、$1.35B 估值,本质上是在押注 ESM3 成为蛋白质 AI 的基础层,而不是押注近期商业收入。商业产品 Forge API 于 January 2025 公开上线,但没有披露收入指标、 客户数或定价。November 2025 的 CZI Biohub 交易发生在 Series A 交割后 14 个月内,确认公司在进入 CZI 体系前,并未以独立公司身份实现商业突破。 收入质量评估:数据不足,无法判断。Forge API 机制本身成立——自研基础模型按 token 推理收费——但如果缺少更完整的药物发现工作流, 付费蛋白质 AI API 访问的可服务市场偏窄;开放权重替代品(Meta 的 ESM2)又压住了通用向量嵌入场景的付费意愿上限。 相比商业牵引,资本强度非常高。SEC Form D 申报缺口是值得关注的合规信号。主要尽调阻塞点包括:(1)CZI Biohub 交易条款;(2)经确认的 Forge API 收入或客户指标;(3)交易前实际烧钱速度;(4)CZI 交易中的投资人回报情况。 仅凭公开来源,无法对投资回报形成可承销级财务结论。 [CI033][CI034][CI035][CI036][CI037][CI038]
4.7 证据展品
05 Product and Technology
5.1 ESM 产品组合与模型家族
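Built on the ESM (Evolutionary Scale Modeling) foundation, EvolutionaryScale offers two distinct product lines: the ESM3 generative protein language models and the ESM-C (Cambrian) protein embedding family. Together they span an open-weights tier and a commercial API tier, with six models in total plus the Forge API service that fronts them.
ESM3 comes in three sizes: ESM3-small-2024-08 (1.4B parameters), ESM3-medium-2024-08 (7B parameters), and ESM3-large-2024-03 (98B parameters). ESM3-small is the only open-weights ESM3 variant, distributed on HuggingFace as esm3-sm-open-v1 under the Cambrian non-commercial license agreement. ESM3-medium and ESM3-large are commercial models available only through the Forge API. The 98B ESM3-large was used to design esmGFP, a de novo fluorescent protein that shares only 58% sequence identity with the nearest natural GFP, a divergence equated to roughly 500 million years of evolution; the result was validated in a peer-reviewed Science paper (January 2025).
ESM-C (Cambrian) is a separate family of embedding models in three sizes: ESMC-300M, ESMC-600M, and ESMC-6B. ESMC-300M and ESMC-600M are open weights under the same Cambrian non-commercial license. ESMC-6B is available to academic users through the Forge API and for commercial deployment through AWS SageMaker JumpStart. The ESM-C models use a Pre-LN transformer with rotary position embeddings and SwiGLU activations; EvolutionaryScale's own benchmarks report state-of-the-art sequence representation quality at each model size.
The Forge API (forge.evolutionaryscale.ai) is the primary commercial monetization vehicle. Forge opened its public beta in January 2025, alongside the Science paper, and exposes synchronous and asynchronous REST access to the ESM3 and ESMC families through a pip-installable Python SDK. The SDK includes a batch executor for high-throughput workloads. Commercial Forge pricing is not publicly disclosed; academic access terms are available directly on the platform. [CE001, CE002, CE003, CE004, CE005, CE006]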
| 模型 / 产品 | 参数规模 | 可用性 / 许可 | 主要用途 | 尽调缺口 |
|---|---|---|---|---|
| ESM3-small-2024-08 | 1.4 B | 开放权重 — Cambrian Non-Commercial License(HuggingFace esm3-sm-open-v1) | 研究用途蛋白生成;在非商业数据集上本地微调 | 不允许商业使用;与更大 ESM3 变体对比的基准有限 |
| ESM3-medium-2024-08 | 7 B | 仅 Forge API — 商业定价未披露 | 中等规模蛋白设计;Forge API 客户 | 定价不公开;没有与 ESM3-small 的公开基准对比 |
| ESM3-large-2024-03 | 98 B | 仅 Forge API — 旗舰商业模型 | 高复杂度蛋白设计;esmGFP 规模的生成式设计 | 推理成本未披露;SLA 条款不公开 |
| ESMC-300M(esmc-300m-2024-12 模型) | 300 M | 开放权重 — Cambrian Non-Commercial License(HuggingFace) | 用于 ML 管线的蛋白序列嵌入;研究用途 | 仅限非商业;没有第三方独立准确率基准 |
| ESMC-600M(esmc-600m-2024-12 模型) | 600 M | 开放权重 — Cambrian Non-Commercial License(HuggingFace) | 增强蛋白嵌入;研究和学术微调 | 仅限非商业;1,490 次 HuggingFace 下载显示早期采用 |
| ESMC-6B | 6 B | Forge API(学术)+ AWS SageMaker JumpStart(商业) | 企业级蛋白序列嵌入和相似性搜索 | 商业定价未披露;SageMaker 实例成本由用户承担 |
| Forge API(forge.evolutionaryscale.ai 服务) | 服务(所有 ESM3 / ESMC 模型) | 自 2025 年 1 月起公测;商业订阅模式 | 以程序方式访问所有模型;同步和异步推理;批量执行器 | 定价结构、使用层级、可用性 SLA 和客户名单不公开 |
模型规模和参数数量来自 EvolutionaryScale 官方博客、GitHub ESM 仓库 README 和 HuggingFace 模型卡。HuggingFace 下载量反映研究日期时的 30 天快照,可能波动。Forge API 和 SageMaker 商业层定价未公开披露;标为「未披露」的行反映研究时确认没有公开定价。
| 用例 | 目标用户 | ESM 工具 | 工作流步骤 | 验证证据 |
|---|---|---|---|---|
| 从头蛋白设计 | 蛋白工程师、药物发现研究员 | ESM3-large(Forge API) | 指定部分序列或结构约束;生成完整候选蛋白;迭代 | esmGFP — 2025 年 Science 同行评审论文;341 次引用 |
| 用于 ML 管线的蛋白序列嵌入 | 学术研究员、生物信息学团队 | ESMC-300M 或 ESMC-600M(开放权重) | 嵌入蛋白序列;将嵌入输入下游分类器或聚类 | ESMC-300M 在 HuggingFace 下载 6,320 次;社区在 129+ 篇 BioRxiv 论文中引用 |
| 企业级商业嵌入 | 企业生物信息学、药企 R&D | ESMC-6B(AWS SageMaker JumpStart) | 通过 CloudFormation 栈部署(15-25 分钟);运行批量蛋白相似性搜索 | AWS SageMaker JumpStart 上架;GitHub esm-sagemaker CloudFormation 文档 |
| 结构条件蛋白变体生成 | 计算生物学家 | ESM3-small(开放权重,本地 GPU) | 提供部分结构 token 作为条件;生成序列变体 | GitHub ESM README 示例;ESM3 架构多轨 token 化 |
| 功能引导蛋白设计 | 药物发现、酶工程 | ESM3(Forge API) | 指定功能注释关键词;联合优化序列和结构输出 | ESM3 Science 论文基准结果;ESM3 博客(公司声明) |
| 学术研究模型微调 | 学术实验室 | ESMC-300M(开放权重) | 在专有蛋白数据集上微调 ESMC,适配特定领域任务 | Cambrian Non-Commercial License 允许为非商业研究微调 |
用例来自 EvolutionaryScale 官方博客、GitHub ESM README、Science 论文(Hayes et al., 2025)和社区 HuggingFace 下载量。esmGFP 用例已完全验证;其他用例反映已记录的能力主张。
EvolutionaryScale 的 ESM 蛋白质 AI 平台五层架构,从训练数据到部署。
层边界是概念划分;Forge API 的确切服务架构和内部基础设施未公开记录。
[CE001, CE002, CE003, CE009, CE019, CE022]
5.2 Technical Architecture: Multi-Track Transformer, Training Scale, and Tokenization
ESM3 最核心的架构创新,是多轨 transformer 设计:在统一 transformer 框架内,模型同时处理三条并行 token 序列——氨基酸序列 token、结构 token(编码 3D 坐标)和功能注释 token(基于关键词的 GO term 标签)。 每一轨都采用离散 token 化。结构坐标通过向量量化变分自编码器(VQVAE)编码进有限的结构 token 码本, 让模型无需连续坐标回归,就能原生读取并生成三维蛋白质结构。 预训练使用掩码语言建模(MLM)目标,并同时施加在三条轨道上,使模型学到覆盖序列—结构—功能空间的联合表征。 98B 参数的 ESM3-large 在约 2.78 billion 条蛋白质序列(771 billion 个唯一 token)上训练, 训练消耗 1.07×10²⁴ 次浮点运算;训练集群为 Andromeda HPC,使用 NVIDIA H100 GPU 和 Quantum-2 InfiniBand 网络。NVIDIA 称,ESM3-large 相比前代 ESM2 使用约 25× 更多 FLOPs 和 60× 更多数据。ESM3-large 还应用了基于人类反馈的强化学习(RLHF),使蛋白质设计输出更贴合人类偏好。 ESM-C(Cambrian)采用另一套架构:带旋转位置嵌入(RoPE)和 SwiGLU 前馈激活的 Pre-Layer Normalization (Pre-LN)transformer,并用掩码语言建模预训练。ESMC-300M(30 层、隐藏宽度 960)训练消耗 1.26×10²² FLOPs; ESMC-600M(36 层、宽度 1152)为 2.17×10²² FLOPs;ESMC-6B(80 层、宽度 2560)为 2.37×10²³ FLOPs。 训练数据覆盖三个大型序列数据库:UniRef(83 million 个簇)、MGnify(372 million 个簇)和 JGI metagenomics (2 billion 个簇),全部按 70% 序列同一性聚类。 开源 DeepEP 库展示了 EvolutionaryScale 的内部基础设施能力。DeepEP 是面向 H800 GPU 的 Mixture-of-Experts 专家并行通信自定义 CUDA/NCCL 实现,拥有 1,253 个 GitHub 星标,说明公司具备活跃的 HPC 工程能力,可支撑大规模分布式训练和推理。[CE009, CE010, CE011, CE012, CE013, CE014]
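The discrete structure tokenization described above can be illustrated with the quantization step of a VQ-VAE: each continuous feature vector is replaced by the index of its nearest codebook entry. The sketch below is a toy under stated assumptions. The 2-D features and the four-entry codebook are hypothetical; the real ESM3 tokenizer uses a learned encoder and a learned codebook whose details are not reproduced here.

```python
import math


def nearest_code(vector, codebook):
    """Index of the codebook entry closest to `vector` (squared Euclidean distance)."""
    best_index, best_dist = -1, math.inf
    for index, code in enumerate(codebook):
        dist = sum((v - c) ** 2 for v, c in zip(vector, code))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def tokenize(features, codebook):
    """Map each continuous feature vector to a discrete token id."""
    return [nearest_code(f, codebook) for f in features]


def detokenize(tokens, codebook):
    """Map token ids back to their codebook vectors (lossy reconstruction)."""
    return [codebook[t] for t in tokens]


if __name__ == "__main__":
    # Hypothetical toy codebook and per-residue features.
    codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    features = [(0.1, -0.2), (0.9, 0.8), (0.2, 0.7)]
    tokens = tokenize(features, codebook)
    print(tokens)
    print(detokenize(tokens, codebook))
```

Once structure is expressed as token ids, it can share the same masked-token training objective as the sequence and function tracks, which is the design point the paragraph above makes.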
| 组件 | 描述 / 规格 | 关键指标 | 主要来源 |
|---|---|---|---|
| ESM3 多轨 Transformer | 三条并行输入 / 输出轨道:氨基酸序列 token、VQVAE 结构 token、功能关键词 token;跨轨道统一注意力 | 3 条轨道;3 个模型尺寸分别为 1.4B / 7B / 98B 参数 | ESM3 博客(官方);Hayes 等,Science 2025 |
| VQVAE 结构 tokenizer | 向量量化变分自编码器把 3D 蛋白骨架坐标编码为有限码本中的离散结构 token | 离散码本;无需坐标回归即可原生生成 3D 结构 | ESM3 博客(官方);ESM3 预印本(bioRxiv) |
| ESM3 预训练与对齐 | 三条轨道统一做掩码语言建模(MLM);在 ESM3-large 上用 RLHF 微调,使其对齐蛋白设计偏好 | 1.07×10²⁴ FLOPs(98B 模型);2.78B 个蛋白;771B 个唯一 token | ESM3 博客(官方);NVIDIA 博客;Science 论文 |
| ESM-C 架构 | Pre-LN Transformer,使用 RoPE 位置嵌入和 SwiGLU 激活;掩码语言建模预训练;三个尺寸:300M / 600M / 6B | 架构:300M: 30L×960W; 600M: 36L×1152W; 6B: 80L×2560W | ESM-C 博客(官方) |
| ESM-C 训练算力与数据 | 训练数据:UniRef(83M 序列簇)、MGnify(372M)、JGI metagenomics(2B 簇),按 70% 序列一致性聚类;各模型 FLOPs:300M=1.26e22、600M=2.17e22、6B=2.37e23 | 总序列簇 2B+(最大组成来自 JGI metagenomics) | ESM-C 博客(官方) |
| Andromeda HPC 训练集群 | NVIDIA H100 Tensor Core GPU 集群,配 Quantum-2 InfiniBand 网络;用于训练 ESM3-large | FLOPs 较前代 ESM2 高 25×,数据多 60× | NVIDIA 博客(伙伴证明) |
| DeepEP MoE 专家并行库 | 专家混合(MoE)专家并行通信的开源 CUDA/NCCL 实现;为 H800 GPU 定制内核 | 1,253 个 GitHub stars;显示内部 HPC 基础设施能力较强 | GitHub evolutionaryscale/DeepEP(开发者信号) |
ESM-C 的架构参数(层数、隐藏宽度)来自官方 ESM-C 博客。ESM3-large 的训练 FLOPs 来自官方 ESM3 博客和 Science 论文。ESM-C FLOPs 来自 ESM-C 博客。基础设施细节(Andromeda 集群、H100 GPU)来自 NVIDIA 博客公告,并由 ESM3 Science 论文交叉印证。
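The layer counts, widths, and FLOP figures in the table can be cross-checked against the model names with two standard rules of thumb. Both are assumptions, not statements from the source: a transformer block holds roughly 12·d² parameters (4·d² attention plus 8·d² feed-forward, which also holds for SwiGLU when the hidden size is about 8d/3), and training compute is roughly 6 · parameters · tokens.

```python
def approx_params(layers, width):
    """Approximate non-embedding parameter count: 12 * layers * width^2."""
    return 12 * layers * width ** 2


def approx_tokens(train_flops, params):
    """Tokens processed under the rule of thumb FLOPs ~= 6 * params * tokens."""
    return train_flops / (6 * params)


# (layers, hidden width, reported training FLOPs) as listed in the table.
ESMC_MODELS = {
    "ESMC-300M": (30, 960, 1.26e22),
    "ESMC-600M": (36, 1152, 2.17e22),
    "ESMC-6B": (80, 2560, 2.37e23),
}

if __name__ == "__main__":
    for name, (layers, width, flops) in ESMC_MODELS.items():
        params = approx_params(layers, width)
        tokens = approx_tokens(flops, params)
        print(f"{name}: ~{params / 1e6:,.0f}M params, ~{tokens / 1e12:.1f}T tokens")
```

Under these approximations the three architectures land near their nominal sizes (about 0.33B, 0.57B, and 6.3B parameters), and all three imply a similar token budget of roughly six trillion tokens. The reported dimensions and FLOP counts are therefore at least internally consistent.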
八步工作流,展示蛋白质研究者如何使用 ESM3 平台,从假设到验证候选物。
流程经过简化;未展示实验结果与修订提示词之间的反馈回路。湿实验验证步骤由用户执行,不由 EvolutionaryScale 执行。
[CE005, CE007, CE018, CE029]
5.3 Deployment and Ecosystem: Forge API, AWS, NVIDIA, and Community
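EvolutionaryScale built a multi-layer distribution strategy: open weights drive community adoption at one end, while a commercial API and cloud marketplaces provide access at the other. The Forge API (forge.evolutionaryscale.ai) opened its public beta in January 2025 and is the primary programmatic interface for commercial use of ESM3 and ESMC. The official Python client (the esm package on PyPI, installed with pip install esm) is hosted at github.com/evolutionaryscale/esm and supports both synchronous inference and asynchronous batch execution. As of May 2026 the Forge API portal is reachable and operating, but detailed pricing is not publicly listed.
AWS SageMaker JumpStart offers commercial deployment of ESMC-6B through a CloudFormation stack that provisions dedicated GPU instances in 15-25 minutes. The esm-sagemaker GitHub repository documents the integration, which targets enterprise bioinformatics users who need large-scale embedding workflows and predictable SLAs. Amazon is both a Series A investor in EvolutionaryScale and a deployment partner.
When ESM3 first launched in June 2024, NVIDIA announced that it would integrate ESM3 into the BioNeMo NIM platform for GPU-optimized inference; EvolutionaryScale's ESM-C blog (December 2024) listed BioNeMo as an upcoming distribution channel for ESM-C. The NVIDIA NGC catalog also lists ESM3 resources separately.
On GitHub, EvolutionaryScale maintains nine public repositories. Besides the flagship esm repository, notable community-facing projects include DeepEP (1,253 stars), an NCCL fork, a Hugging Face transformers fork, and a Mamba implementation. The HuggingFace organization page (huggingface.co/evolutionaryscale) hosts the open-weights model cards; the esm3-sm-open-v1 page shows 3,105 downloads in the last month and 291 likes, ESMC-300M shows 6,320 downloads, and ESMC-600M shows 1,490. These are recent-period snapshots, not cumulative totals, and they indicate meaningful adoption in the research community. [CE019, CE020, CE021, CE022, CE023, CE024]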
EvolutionaryScale 的关键平台、基础设施和分发依赖的有向无环图。
依赖方向代表数据、计算和所有权流,不代表数据量大小。Forge API 的内部服务基础设施未公开记录。
[CE010, CE016, CE019, CE022, CE025, CE035]
5.4 Intellectual Property and Competitive Moat
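EvolutionaryScale's IP moat rests on three pillars: (1) publication depth in high-impact journals; (2) proprietary training scale and infrastructure; and (3) esmGFP, which demonstrates that the model can generate novel proteins in regions of sequence space that natural evolution has not explored.
The flagship Science paper (Hayes et al., Science, January 2025, Vol 387, Issue 6736, pp. 850-858, DOI 10.1126/science.ads0018) had accumulated 341 citations and 68,494 downloads in the citation metrics observed for this research, with 318 of those citations arriving in the first 12 months after publication. The ESM3 preprint on bioRxiv (10.1101/2024.07.01.600583, July 2024) was cited by 129+ downstream papers within its first year. That pace places ESM3 among the most cited new protein ML methods.
The esmGFP result is the strongest public proof of ESM3's generative capability. The designed protein carries 96 mutations across 229 amino-acid positions relative to the nearest known natural GFP, which corresponds to about 58% sequence identity (a Hamming distance of about 42%); the evolutionary distance is likened to the divergence between corals and jellyfish. EvolutionaryScale has filed patents covering esmGFP and related protein design methods. Reproducing ESM3-large requires a compute investment of 1.07×10²⁴ FLOPs, which the company describes as more than any previous biological model, and that forms a meaningful cost barrier.
The main competitors differ along specific axes and do not overlap completely. AlphaFold3 (DeepMind, May 2024) excels at protein structure prediction, including small-molecule and antibody complexes, but it is not a generative design model, and its free access is limited to non-commercial research, with commercial use routed through Isomorphic Labs. Chai-1 (Chai Discovery, 2024) focuses on high-accuracy structure prediction for protein complexes. ESM2 (Meta AI, 2022) is the open-weights predecessor family (8M to 15B parameters) that provides sequence embeddings but lacks generative joint modeling of sequence, structure, and function. EvolutionaryScale's distinctive position is generative protein design that reasons over sequence, structure, and function together. [CE027, CE028, CE029, CE030, CE031, CE032]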
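The relationship between the mutation count and the percentage quoted for esmGFP is easy to mix up, so a short worked check helps: 96 differing positions out of 229 is the Hamming distance, and one minus that fraction is the sequence identity. The helpers below are a minimal sketch; the toy strings in the usage lines are hypothetical and are not real GFP sequences.

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    return sum(x != y for x, y in zip(a, b))


def identity_from_mutations(mutations, length):
    """Fraction of identical positions given a mutation count."""
    if length <= 0 or not 0 <= mutations <= length:
        raise ValueError("need 0 <= mutations <= length and length > 0")
    return 1 - mutations / length


if __name__ == "__main__":
    # Hypothetical toy peptides, for illustration only.
    print(hamming_distance("MSKGEE", "MSRGDE"))
    # esmGFP figures as quoted in the text: 96 mutations over 229 positions.
    identity = identity_from_mutations(96, 229)
    print(f"identity={identity:.1%}, distance={1 - identity:.1%}")
```

With the quoted figures, identity is about 58% and the normalized Hamming distance about 42%, so the 58% figure refers to identity.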
| 信任 / 安全维度 | 当前公开状态 | 风险等级 | 尽调路径 |
|---|---|---|---|
| 双用途 / 生物安全风险(病原体蛋白设计) | 开放权重(ESM3-small、ESMC-300M/600M)受 Cambrian Non-Commercial License 限制;Forge API 需要账户接受 ToS;没有公开的生物安全筛查政策 | 高(行业共性担忧;未发布生物安全审计) | 向 EvolutionaryScale / CZI Biohub 索取生物安全政策文件和任何第三方生物安全评审 |
| 开放权重非商业许可合规 | Cambrian Non-Commercial License Agreement 禁止商业使用 ESM3-small 和 ESMC-300M/600M;商业客户必须使用 Forge API 或 SageMaker | 中(许可执行需要监控;灰色地带商业使用可能漏检) | 审阅 Cambrian Non-Commercial License Agreement;评估对未经授权商业微调的 IP 保护 |
| Forge API 数据隐私与留存 | Forge API 未公开提交序列的数据留存、删除或保密政策 | 中(拥有专有序列数据的药企客户会实质关切) | 向 EvolutionaryScale / CZI Biohub 索取 Forge API Terms of Service、Privacy Policy 和 Data Processing Agreement |
| 异构结构输入下的性能鲁棒性 | 独立 bioRxiv 研究(Dec 2024)发现,使用每个变体分别弛豫后的不同结构时,ESM3 结合预测会变差;见 SE007 | 中(反向发现;范围限定在异构结构输入) | 审阅 Gissing & Smith bioRxiv Dec 2024 预印本;用不同结构输入策略测试 ESM3 结合预测 |
| 组织连续性风险(转入 CZI Biohub) | EvolutionaryScale 团队已于 November 2025 加入 CZI Biohub;未来产品路线图由 CZI Biohub 而非独立创业公司治理 | 中(依赖非营利使命和资金连续性) | 跟踪 CZI Biohub 公告;确认转入后 Forge API SLA 承诺 |
信任和合规状态仅来自公开资料。生物安全政策、Forge API 数据留存政策和任何独立安全审计均未公开。该表反映截至 May 2026 已公开记录的控制措施和已知缺口。
对 ESM3(生成式)、ESMC(嵌入)、Forge API 和开放权重层在六个维度上的能力与成熟度比较。
能力评级是基于公开来源的证据评估。内部性能基准和 Forge API SLA 细节尚未披露。
[CE001, CE002, CE010, CE024, CE033, CE039]
5.5 Product Roadmap, CZI Biohub Transition, and Responsible Development
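After an early seed investment from NVIDIA, EvolutionaryScale closed a $142 million Series A in September 2024 with investors including Lux Capital, Amazon, and NVIDIA. In November 2025 the company's team joined CZI Biohub as part of the major "Frontier AI & Biology" initiative announced by the Chan Zuckerberg Initiative. In the transition, co-founder and chief scientist Alex Rives became Biohub's Head of Science, and the EvolutionaryScale research team was folded into Biohub's integrated team of bioscientists, AI engineers, and technical staff. CZI Biohub has announced that it will expand compute to 10,000 GPUs by 2028 to support the initiative.
As of the May 2026 report date, the Forge API and open-weights model distribution are still operating. EvolutionaryScale's public benefit corporation (PBC) charter and the Cambrian non-commercial license on the open weights together codify a commitment to research access while reserving commercial capability for the Forge API revenue model.
Several trust and safety dimensions need diligence attention. Dual-use risk in generative protein design, including misuse for pathogen engineering, is an industry-wide issue. EvolutionaryScale addresses it through the non-commercial license restrictions on open weights and Forge API access controls, but it has not publicly documented a biosecurity screening policy or an independent biosecurity audit. An independent bioRxiv preprint published in December 2024 found that ESM3 binding prediction degrades when each variant is given its own separately relaxed structure as input instead of a single consistent backbone structure, a "more structure, less accuracy" paradox; deployments that involve heterogeneous structure inputs should be tested specifically by the diligence team. The data retention policy for sequences submitted to the Forge API is also undisclosed, which may be a compliance concern for enterprise customers in regulated industries. No SEC Form D filing was found for EvolutionaryScale; because private companies raising under Regulation D normally file one, the absence remains an open compliance question and is not explained by private-company status alone. [CE034, CE035, CE036, CE037, CE038, CE039]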
| 里程碑 | 日期 / 时间 | 状态 | 证据来源 |
|---|---|---|---|
| ESM2 release (Meta AI, 8M–15B open weights) | 2022 | Complete: open source, widely adopted by the research community | Meta AI blog; HuggingFace (predecessor, not an EvolutionaryScale product) |
| EvolutionaryScale 成立;ESM3 预发布开发启动 | 2023 | 已完成 —— 公司由 Meta AI FAIR 前成员创立 | NVIDIA 种子投资公告;Crunchbase |
| ESM3-small 开放权重发布;ESM3 Forge 封闭 beta | June 2024 | 已完成 —— Forge 封闭 beta 开放;ESM3-small 上架 HuggingFace | ESM3 官方博客(SE001);NVIDIA 博客(SE017) |
| ESM3 预印本提交至 bioRxiv(10.1101/2024.07.01.600583) | July 2024 | 已完成 —— 首年内 129+ 篇引用论文 | bioRxiv 预印本(SE006);bioRxiv 搜索(SE008) |
| Series A 融资($142M)—— Lux Capital、Amazon、NVIDIA | September 2024 | 已完成 | Axios(SE025);Crunchbase(SE020) |
| ESM-C(Cambrian)模型发布 —— 300M/600M 开放权重 + 6B Forge | December 2024 | 已完成 —— 开放权重上架 HuggingFace;ESMC-6B 上线 Forge | ESM-C 博客(SE002);HuggingFace 模型卡(SE014、SE015) |
| ESM3 发表于 Science(Hayes 等,Vol 387,pp. 850-858) | January 16, 2025 | 已完成 —— 341 次引用;68,494 次下载 | Science DOI 10.1126/science.ads0018(SE005);Semantic Scholar(SE026)引用来源 |
| Forge API 公共 beta 开放 | January 2025 | 已完成 —— 与 Science 论文发表同步 | ESM3 博客(SE001);GitHub ESM README(SE009) |
| EvolutionaryScale 团队加入 CZI Biohub(Frontier AI & Biology 计划) | November 2025 | 已完成 —— Alex Rives 被任命为 Biohub 科学负责人 | CZI Biohub 博客(SE023) |
| NVIDIA BioNeMo NIM 集成(ESM-C) | 目标:2025/2026 | 进行中 —— ESM-C 博客(December 2024)列为“available soon” | ESM-C 博客(SE002);NVIDIA 博客(SE017);NVIDIA NGC catalog(SE018) |
| CZI Biohub 10,000-GPU 算力扩张 | 目标:2028 年前 | 已宣布 —— Biohub Frontier AI 计划 | CZI Biohub 博客(SE023) |
里程碑日期来自 EvolutionaryScale 官方博客、bioRxiv 投稿元数据、Science 论文发布日期,以及报道 Series A 的新闻文章。未来里程碑(BioNeMo NIM、10,000 GPUs)来自 NVIDIA 和 CZI Biohub 公告,代表计划目标,并非已确认交付。
5.6 证据展品
06 Customers
6.1 客户基础分层
EvolutionaryScale 的客户基础最适合拆成四个访问层级,每层买方画像、访问机制和证据深度都不同。最大、证据最充分的一层是学术和独立研究人员; 他们通过 GitHub 和 HuggingFace 直接访问开放权重的 ESM3(1.4B,非商业许可)和 ESM-C(300M、600M,开放权重)。 这些用户主要来自大学、研究机构和政府实验室,身份多为计算生物学家、结构生物学家和生物信息学家。使用场景包括蛋白质序列表征、 结构预测微调、功能注释、抗体设计和下游模型开发。开放权重用户不产生收入,但构成商业转化漏斗顶端信号。 第二层是商业云平台用户,他们通过 Amazon Web Services SageMaker Marketplace(ESM-C 模型可商业部署)和 NVIDIA BioNeMo(列为即将上线)接触 ESM 模型。这些买方通常是药企和生物技术公司的生物信息学与计算生物学团队, 相比直接订阅 API,更偏好云原生、由基础设施托管的模型部署。AWS 和 NVIDIA 是渠道合作伙伴,不是终端客户; 真正的企业买方是它们的下游客户。订阅用户数和部署指标未公开披露。 第三层是 Forge API beta 用户。截至 January 2025,EvolutionaryScale 开放了限时免费的 Forge API 公测, 让用户规模化访问 ESM3 和 ESM-C 模型。Forge API 面向需要超过 1.4B 开放模型推理能力的学术科学家和商业构建者。 Forge 公测后的商业定价尚未公开宣布。API 注册需要访问 token;用户数未披露。 第四层是为企业平台访问付费的大型药企 R&D 买方。这是价值最高的细分群体,但公开证据最弱。Adaptyv Bio (一家位于瑞士 Lausanne 的蛋白质工程公司)已确认是具名 ESM 生态合作伙伴。Pfizer、Eli Lilly、 Novartis、Roche 或其他全球 top-20 药企交易均未公开宣布;相较 Generate Biomedicines 和 Isomorphic Labs, 这是实质性的商业证明缺口。[CU023, CU031, CU012, CU008, CU010, CU011]
| 客户分层 | 买方 / 用户 / 付款方 | 接入渠道 | 用例 | 规模 / 触达(估计) | 收入 / 战略价值 | 核心证据缺口 |
|---|---|---|---|---|---|---|
| 学术与独立研究者 | 高校和研究机构的计算 / 结构生物学家、生物信息学家 | GitHub(开放权重)、HuggingFace、PyPI(esm 包) | 蛋白表征、结构预测微调、功能注释、抗体设计 | 3.1k+ 次 HF 下载(ESM3);7.8k+ 次 HF 下载(ESM-C);129+ 篇 bioRxiv 预印本 | 直接收入为零;漏斗顶端信号;学术引用背书 | 未披露学术用户转付费用户的转化率 |
| 云平台企业用户 | 生物技术 / 制药 IT 和计算生物学团队;AWS 与 NVIDIA 客户 | AWS SageMaker Marketplace(ESM-C 商业许可);NVIDIA BioNeMo(即将上线) | GPU 托管的蛋白嵌入、分子设计、虚拟筛选 | 未披露;由 AWS/NVIDIA 客户群中介触达 | 战略价值高(与 $142M 投资方的渠道伙伴关系对齐);商业指标不透明 | 订阅用户数、收入分成和 SageMaker 使用量未公开 |
| Forge API beta 用户 | 接入更大 ESM3 和 ESM-C 6B 模型的学术与商业科学家 | forge.evolutionaryscale.ai 上的 Forge API(token 门控) | 大规模蛋白生成;突破开放权重模型限制的大规模表征 | 未披露;自 January 2025 起免费 beta | 可能转化为付费;商业定价尚未公布 | beta 用户数、活跃使用和付费转化计划未披露 |
| 生物技术 / 蛋白工程公司 | 蛋白工程初创公司和 CRO(如 Adaptyv Bio) | Forge API、SageMaker 或开放权重集成 | 蛋白结合体设计、抗体优化、功能工程 | 一个具名伙伴(Adaptyv Bio);esm-partner repo 透露的管线深度未披露 | 早期阶段;未来收入潜力;验证产品-市场匹配信号 | 未披露交易条款、管线规模,也未披露从试点到生产的转化 |
| 大型药企 R&D(缺口分层) | 全球前 20 大药企的 CSO、药物发现 VP、BD 高管 | 企业版 Forge API 或直接 SageMaker 订阅(假设) | AI 辅助药物发现、靶点识别、生成式先导化合物优化 | 截至 May 2026 公开确认数为零;未宣布交易 | 潜在价值最高;商业验证完全缺失 | 没有类似 Generate Biomedicines($1.9B Amgen)或 Isomorphic Labs(Lilly/Novartis)的具名交易 |
学术和云层级的规模 / 触达估计来自 HuggingFace 下载量和 bioRxiv 搜索量;实际独立用户数不同且未知。收入和战略价值评估为推断,未由公司披露。大型药企分层是目标市场,不是已确认客户层级。
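[CU001, CU007, CU011, CU012, CU017]
EvolutionaryScale's customer journey: from open-weights academic discovery, to Forge API trial, to commercial cloud deployment and potential enterprise pharma partnerships.
Journey stages are inferred from documented access mechanisms and norms in competing markets. Stage-to-stage conversion rates (open weights to API to enterprise) are entirely unknown. The enterprise pharma stage is a hypothesized destination, not a confirmed outcome.
[CU001, CU007, CU012, CU017, CU034]
6.2 Adoption Trajectory and Open-Access Usage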
最清晰、最客观可衡量的采用信号来自开放访问渠道。在 HuggingFace 上,截至研究日,biohub/esm3-sm-open-v1 模型(ESM3 的 1.4B 开放权重版本)约有 3,110 次下载、291 个点赞;biohub/esmc-300m-2024-12 约有 6,320 次下载、30 个点赞;biohub/esmc-600m-2024-12 约有 1,490 次下载、32 个点赞。截至 May 2026,两个开放模型合计带来约 7,810 次 ESM-C 家族下载。ESM-C 模型在研究缓存日期前两天仍有更新, 说明维护活跃。这些下载数很可能低估实际使用,因为很多学术用户会克隆 GitHub 仓库,或通过官方 esm Python package 访问模型权重,而不是直接走 HuggingFace hub。 在 GitHub 上,evolutionaryscale/esm 仓库是主要开源分发渠道。该组织维护九个或更多仓库,包括衍生技术基础设施 (DeepEP 1,253 个星标、NCCL 分叉 1,270 个星标)、模型权重,以及明确标注为合作伙伴协作的 esm-partner 仓库。 March–May 2026 期间仍有活跃 commit 记录,说明开发活动持续。 学术引用证据扎实:截至 May 2026,Semantic Scholar API 搜索返回 32 篇基于 ESM3 的论文;bioRxiv 搜索 "evolutionaryscale ESM3" 返回 129 个预印本结果。具名下游应用包括 MegSite(核酸结合残基预测)、 ProteinReasoner(带链式思维推理的多模态蛋白质语言模型)、iNClassSec-ESM(非经典分泌蛋白发现), 以及用于色谱纯化的亲和肽设计,覆盖学术、临床和工业应用。ESM3 Science 论文(发表于 January 16, 2025, DOI 10.1126/science.ads0018)提供权威学术认可,也成为商业沟通中的可信度锚点。[CU001, CU002, CU003, CU004, CU005, CU006]
| 指标 | 数值 | 日期 / 期间 | 来源 | 置信度 | 含义 |
|---|---|---|---|---|---|
| ESM3-open(1.4B)HuggingFace 下载量 | ~3,110 | 截至 May 2026 | HuggingFace 组织页(biohub/esm3-sm-open-v1) | 高 —— 直接读取 HF 页面 | 开放权重版本的基线需求信号;低估总使用量(GitHub + pip) |
| ESM-C 300M HuggingFace 下载量 | ~6,320 | 截至 May 2026 | HuggingFace 组织页(biohub/esmc-300m-2024-12) | 高 —— 直接读取 HF 页面 | 最受欢迎的 ESM-C 模型;采用更广,可能因为算力要求更低 |
| ESM-C 600M HuggingFace 下载量 | ~1,490 | 截至 May 2026 | HuggingFace 组织页(biohub/esmc-600m-2024-12) | 高 —— 直接读取 HF 页面 | 能力更高的层级;增量用户愿意支付算力溢价 |
| ESM-C 系列 HF 总下载量(300M + 600M) | ~7,810 | 截至 May 2026 | HuggingFace 组织页汇总 | 高 —— 根据两个已确认数值计算 | ESM-C 系列是 ESM3-open 的 2.5x 以上,说明表征用例比生成用例需求更广 |
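| ESM3-related papers returned by a Semantic Scholar keyword search (not a citation count) | 32 papers | As of May 2026 | Semantic Scholar API search (query: ESM3 EvolutionaryScale protein language model) | High: API result, query known | Downstream research ecosystem is growing; supports academic product-market fit |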
| 提及 ESM3 + EvolutionaryScale 的 bioRxiv 预印本 | 129 条结果 | 截至 May 2026 | bioRxiv 搜索“evolutionaryscale ESM3” | 高 —— 直接搜索结果数 | 预印本数量是 Semantic Scholar 收录论文的 4x;说明未披露使用管线很大 |
| EvolutionaryScale GitHub DeepEP 仓库 star 数 | 1,253 | 截至 May 2026 | GitHub 组织页(evolutionaryscale/DeepEP) | 高 —— 直接读取 GitHub | 显示模型用户之外还有活跃开发者互动;开发者社区在形成 |
| NCCL fork star 数(evolutionaryscale/nccl) | 1,270 | 截至 May 2026 | GitHub 组织页(evolutionaryscale/nccl) | 高 —— 直接读取 GitHub | 显示 GPU 基础设施层工程可信度;吸引企业 AI 基础设施买家 |
| Forge API 公共 beta 上线 | January 2025 上线 | January 2025 | 公司博客(evolutionaryscale.ai,January 2025 文章) | 高 —— 公司官方公告 | 商业意图已确认;准确 beta 用户数未披露 |
| 具名下游学术应用(Semantic Scholar,精选) | MegSite、ProteinReasoner、iNClassSec-ESM、亲和肽设计(4+ 个具名) | 2025–2026 | Semantic Scholar API 结果(ESM3 EvolutionaryScale) | 高 —— 单篇论文引用已确认 | 展示临床、基础科学和工业场景中的多领域下游使用 |
HuggingFace 下载量仅代表 HuggingFace hub 的唯一模型下载;不包括 pip install、GitHub clone 或 SageMaker 部署的实际使用。GitHub star 数是开发者兴趣代理,不是活跃用户数。Semantic Scholar 返回已发表论文;bioRxiv 预印本数量约高 4x。所有指标都反映开放接入使用;商业部署指标完全不透明。
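The derived figures in the adoption table (family total, the ESM-C to ESM3 ratio, and the preprint-to-indexed-paper ratio) can be reproduced from the raw counts. The snippet below only re-does that arithmetic with the numbers stated in the table; the dictionary and helper name are illustrative.

```python
# Download counts as stated in the table above (snapshot values).
DOWNLOADS = {
    "esm3-sm-open-v1": 3110,
    "esmc-300m-2024-12": 6320,
    "esmc-600m-2024-12": 1490,
}


def family_total(downloads, prefix):
    """Sum downloads for all models whose name starts with `prefix`."""
    return sum(count for name, count in downloads.items() if name.startswith(prefix))


esmc_total = family_total(DOWNLOADS, "esmc-")
esm3_total = family_total(DOWNLOADS, "esm3-")
all_total = sum(DOWNLOADS.values())
esmc_to_esm3 = esmc_total / esm3_total
preprint_to_indexed = 129 / 32  # bioRxiv search hits vs. Semantic Scholar search hits

if __name__ == "__main__":
    print(esmc_total, all_total)
    print(round(esmc_to_esm3, 2), round(preprint_to_indexed, 2))
```

The ratios compare search-hit counts and snapshot downloads, so they indicate relative interest, not unique users or citations.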
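[CU001, CU002, CU003, CU004, CU005, CU006]
Top-down estimated funnel: from the total addressable academic user base, to open-weights downloads, API beta registrations, commercial SageMaker subscriptions, and enterprise pharma partnerships.
Funnel values above "Forge API beta registrants" are approximate: about 10.9k combined HuggingFace downloads plus an estimate of GitHub-only users. The values for Forge API beta registrants (about 200) and SageMaker commercial subscribers (about 10) are rough low-end estimates; EvolutionaryScale has not disclosed these counts. The total addressable researcher community is an industry estimate. The numbers for the two undisclosed tiers are placeholders with very high uncertainty.
[CU001, CU004, CU005, CU006, CU011, CU017]
6.3 Named Deployments and Integration Partners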
EvolutionaryScale 的具名商业部署和集成证据,主要由三个已确认渠道和一个具名伙伴支撑。第一,AWS SageMaker Marketplace 在 Cambrian Inference Clickthrough License Agreement 下列出 ESM-C 模型,可用于商业部署。 GitHub README 给出明确部署说明:需要管理员级 AWS 访问权限,通过 SageMaker Marketplace 订阅,并基于 CloudFormation 启动,耗时 15–25 分钟。GPU 成本直接计入订阅者的 AWS 账户。这是一条可验证的商业部署路径, 但订阅用户数未披露。 第二,截至 December 2024,NVIDIA BioNeMo 将 ESM-C 列为即将上线的集成。NVIDIA BioNeMo 面向药物发现、 分子设计、虚拟筛选和蛋白质结合剂设计场景,正好匹配 ESM-C 的预期商业应用。NVIDIA 也是 EvolutionaryScale 战略投资人(Series A 参与方),对深度集成有结构性激励。Series A 公告(BusinessWire)和 NVIDIA 关于种子投资的专门新闻稿, 都确认了 NVIDIA 的投资人关系。 第三,Adaptyv Bio——位于瑞士 Lausanne 的 Biopole Life Science Campus 的一家蛋白质工程公司——已确认为具名 ESM 生态伙伴。Adaptyv Bio 聚焦蛋白质设计,与 ESM 模型能力直接对齐。该合作关系反映出小型且成长中的生物技术客户群: 它们可以使用开放权重或 API 访问,而不承担大型药企的采购开销。 第四,Forge API 公开 beta(January 2025 上线)构成商业平台的客户证明,尽管注册用户规模和付费转化未披露。 EvolutionaryScale GitHub ESM partner 仓库(evolutionaryscale/esm-partner,标注为「用于合作伙伴协作的仓库」) 暗示 Adaptyv Bio 之外还有正式合作伙伴管线,但没有公开具名其他伙伴。 重要的是,早期 ESM 家族(ESM1b、ESM2)已有企业用户记录:BioNTech 和 InstaDeep 在 COVID 刺突蛋白 上微调 ESM 模型,创建变体早期预警系统,标记全部 16 个 WHO 关注变体;Hie et al. 用 ESM1v/ESM1b 进化抗体; Shanker et al. 用 ESM-IF1 针对 SARS-CoV-2 做抗体进化。这些企业和学术用户的历史用例验证了 ESM 家族的实用性, 但并不构成 EvolutionaryScale 当前付费产品的商业客户。[CU007, CU008, CU009, CU010, CU011, CU013]
| 客户 / 伙伴 | 分层 | 部署 / 集成 | 状态(生产 vs. 试点) | 结果 / 证据质量 | 核心局限 |
|---|---|---|---|---|---|
| AWS SageMaker Marketplace(ESM-C) | 云平台渠道 —— 企业生物技术 / 制药 AWS 客户 | ESM-C 300M、600M、6B 可订阅;CloudFormation 部署;GPU 费用由订阅方承担 | 生产 —— Marketplace listing 已上线;GitHub README 记录部署 | 高 —— GitHub README 记录了具体 Marketplace URL、部署步骤和 SDK 集成 | 订阅用户数和产生收入未公开;AWS 不披露单个 ISV 使用量 |
| NVIDIA BioNeMo 平台 | 云平台渠道 —— 使用 NVIDIA 硬件的企业药物发现团队 | ESM-C 被列为即将集成;BioNeMo 面向分子设计、虚拟筛选、蛋白结合体设计 | 即将上线 / 计划中 —— December 2024 ESM Cambrian 博客宣布;截至缓存日期尚未上线 | 中 —— 公司博客和 NVIDIA BioNeMo 平台页面已确认;公告后的集成状态未验证 | 没有确认的线上集成或用户数;December 2024 博客中的“soon”措辞表示计划中,非已确认 |
| Adaptyv Bio | 生物技术蛋白工程初创公司(瑞士洛桑) | 将 ESM 模型集成到蛋白工程工作流 | 生产 / 伙伴关系 —— 具名 ESM 生态伙伴 | 低-中 —— 具名伙伴已确认;具体用例、模型版本和商业条款未披露 | 网站内容很少;双方均未发布案例研究或量化结果 |
| Forge API Beta 注册用户 | 学术和商业科学家(混合分层) | 以 token 门控方式大规模接入 ESM3 和 ESM-C 6B;与 SageMaker 使用同一 SDK | 生产级 API(beta)—— January 2025 上线;限时免费访问 | 中 —— 根据 GitHub SDK 和公司博客,Forge API 已运行;注册规模未披露 | 免费 beta 状态;无收入;beta 后付费定价未公布;转化计划不透明 |
| BioNTech / InstaDeep(legacy ESM2 用户) | 大型生物技术 / AI 公司(ESM 前代) | 在 COVID spike protein 序列上微调 ESM 语言模型,用于变体早期预警系统 | 生产 —— 在官方指定前标记出全部 16 个 WHO 关注变异株 | 历史质量高 —— ESM3 博客和同行评议语境有记录;现实结果已确认 | 使用的是 ESM2(免费前代模型),不是当前付费 EvolutionaryScale 客户 |
覆盖范围不完整:只包括具名伙伴和 marketplace listing。未披露的 Forge API 用户、任何私有企业试点,以及 esm-partner GitHub repository 中的任何早期伙伴讨论均未纳入。BioNTech/InstaDeep 行记录的是此前 ESM 系列使用,不是当前商业关系。所有收入指标均为 null 或未披露。
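[CU007, CU008, CU010, CU011, CU025, CU033]
Figure: Evidence quality, deployment status, outcome specificity, and retention signals for EvolutionaryScale's named and inferred customer deployments.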
矩阵评估是基于每个部署可得证据类型和数量的定性判断。AWS SageMaker 和 Adaptyv Bio 的生产状态由已记录的访问机制推断;EvolutionaryScale 未发布新闻稿确认活跃商业部署。BioNTech/InstaDeep 行记录的是历史 ESM2 使用,不是当前 EvolutionaryScale 商业关系。
[CU007, CU010, CU011, CU025, CU033]
6.4 Retention, durability, and satisfaction evidence
截至 May 2026,EvolutionaryScale 未披露净留存率(NRR)、总留存率(GRR)、客户流失率、合同续约统计或客户满意度分数。 以公司当前商业化阶段看,缺少这些指标并不意外:Forge API beta 只在 January 2025 上线,AWS SageMaker 上架也较新,且没有披露条款的企业软件交易宣布。主要可观察留存信号都是间接的:HuggingFace 下载持续增长(ESM-C 在研究日前数日内更新)、April–May 2026 期间 GitHub commit 活跃、下游学术论文持续累积(自 ESM2 发布以来, 已有 37 个月围绕 ESM 模型继续构建)。 对 Forge API 渠道而言,公开 beta 的免费访问明确服务于客户开发。公司 January 2025 博客称这是 「公开 beta,让学术界和产业界科学家限时免费预览」——这暗示 beta 后付费转化是预期留存机制, 但尚未验证。ESM GitHub 仓库 SDK 能在本地、Forge 和 SageMaker 部署模式之间无缝集成(同一套 API 代码不受端点差异 影响),架构上形成低切换成本、高粘性的留存条件,但商业层面尚未验证。 对 AWS SageMaker 渠道而言,留存由 AWS 云基础设施锁定效应中介。一旦客户通过 CloudFormation 在自有 AWS 环境内部署 ESM-C,迁移到竞争性蛋白质 LM 需要有意重新集成,因此渠道粘性更持久。 ESM2 前代模型可在非商业许可下免费使用,是付费意愿分析的重要下限。如果客户用免费的 ESM2(最高 15B 参数) 就能满足蛋白质表征任务,除非性能优势足以证明溢价合理,否则为 ESM-C 商业访问付费的动力有限。针对具体药物应用证明可量化性能提升, 是仍未解决的关键留存证据缺口。[CU022, CU027, CU028, CU030, CU032, CU035]
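The low-switching-cost argument above rests on one calling convention working across local, Forge, and SageMaker deployment. The sketch below illustrates that pattern in general terms only. It does not use the real ESM SDK: `InferenceClient`, `LocalClient`, `RemoteClient`, `make_client`, `embed_all`, and the toy `embed` method are all hypothetical names, and the remote call is stubbed rather than sent over the network.

```python
from dataclasses import dataclass
from typing import Protocol


class InferenceClient(Protocol):
    """Hypothetical interface shared by every deployment mode."""

    def embed(self, sequence: str) -> list[float]: ...


def _toy_embedding(sequence: str) -> list[float]:
    # Stand-in for a real model call: sequence length and fraction of alanine.
    return [float(len(sequence)), sequence.count("A") / max(len(sequence), 1)]


@dataclass
class LocalClient:
    model_name: str  # hypothetical identifier for locally loaded weights

    def embed(self, sequence: str) -> list[float]:
        return _toy_embedding(sequence)


@dataclass
class RemoteClient:
    url: str    # hosted API or cloud endpoint (assumed)
    token: str  # access token (assumed)

    def embed(self, sequence: str) -> list[float]:
        # A real client would send an authenticated request to self.url here.
        return _toy_embedding(sequence)


def make_client(mode: str, **kwargs) -> InferenceClient:
    """Pick a backend; callers never see which one they got."""
    if mode == "local":
        return LocalClient(**kwargs)
    if mode in ("forge", "sagemaker"):
        return RemoteClient(**kwargs)
    raise ValueError(f"unknown deployment mode: {mode}")


def embed_all(client: InferenceClient, sequences: list[str]) -> dict[str, list[float]]:
    # Application code depends only on the shared interface.
    return {seq: client.embed(seq) for seq in sequences}
```

Because `embed_all` depends only on the shared interface, moving a workload between backends is a one-line change at the `make_client` call. That is the sense in which the architecture lowers switching cost between endpoints; it says nothing about whether customers stay on a paid tier.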
| 指标 / 信号 | 数值或状态 | 分层 | 置信度 | 尽调要求 |
|---|---|---|---|---|
| 净收入留存率(NRR) | 未披露 | 所有商业分层 | N/A —— 指标未公开存在 | 在管理层尽调中要求披露 NRR;只有商业规模化上线后才可获得 |
| 总收入留存率(GRR) | 未披露 | 所有商业分层 | N/A —— 指标未公开存在 | 要求任何企业客户的 GRR;目前在付费层大规模启动前尚不适用 |
| HuggingFace 模型维护新鲜度 | ESM-C 在研究日期(May 2026)前 2 天更新;ESM3 于 January 29, 2025 更新 | 开放权重学术用户 | 高 —— 直接读取 HuggingFace 时间戳 | 跟踪 HuggingFace 更新频率,作为模型新鲜度承诺的代理指标 |
| GitHub commit 活跃度(evolutionaryscale org) | esm、DeepEP、nccl、transformers 仓库在 April–May 2026 仍有活跃 commit | 开发者社区用户 | 高 —— 组织页活动可见 | 跟踪 issue 解决率和发布节奏,评估开发者支持质量 |
| Forge API 可用性 / 正常运行时间 | GitHub SDK 文档显示 API 可用;未发布 SLA 或 uptime 数据 | Forge API beta 用户 | 中 —— 代码引用了 API endpoint,但没有状态页或 uptime 指标 | 企业 API 尽调中要求 SLA 条款和历史 uptime;检查 forge.evolutionaryscale.ai 状态页 |
| 学术下游论文累积速度 | 32 篇 Semantic Scholar 论文(ESM3 发布后约 13 个月);129 篇 bioRxiv 预印本 | 学术用户 | 高 —— 来自 API 搜索 | 按季度跟踪论文数量,作为商业管线转化的先行指标 |
| 已报道客户流失事件 | 截至 May 2026,公开记录的流失或不续约事件为零 | 所有分层 | 低 —— 没有证据不等于确认没有流失;公司尚未规模化 | 在付费商业关系披露前没有实际意义 |
| 客户证言 / G2、Gartner Peer Insights 评价 | 截至 2026 年 5 月未发现 | 全部商业细分市场 | 缺失结论可信度高——系统检索没有返回任何评价 | 定期检索 G2、Gartner Peer Insights 和 Capterra;企业版推出后,预计会出现首批评价 |
所有 NRR 和 GRR 单元格均为 null,因为截至 2026 年 5 月,EvolutionaryScale 尚未披露付费产品层级的商业收入。留存代理指标完全依赖开放访问指标(HuggingFace 下载量、GitHub 活动、论文数量)。公司仍处于 API beta 阶段;正式留存指标尚无法在商业规模下适用。
[CU022, CU027, CU028]
Figure: Bar chart comparing known open-access adoption metrics across the HuggingFace and academic literature channels as of May 2026.
所有数值来自截至 2026 年 5 月对平台的直接读取(HuggingFace 页面、GitHub 组织页面、API 搜索结果)。HuggingFace 下载量和 GitHub stars 是异质指标(下载反映模型权重获取,stars 反映开发者兴趣)。bioRxiv 和 Semantic Scholar 数值是搜索结果数量,可能包含间接提及。
[CU001, CU002, CU003, CU005, CU006]
6.5 Expansion drivers and concentration risk
EvolutionaryScale 的扩张轨迹由两股相互竞争的力量塑造。第一股有利:AWS 和 NVIDIA 的战略投资,带来优先渠道位置、 联合营销,以及潜在的优先触达两家公司企业客户网络。NVIDIA BioNeMo 声称训练快 2x、推理快 6x,再叠加 ESM 模型集成, 把 EvolutionaryScale 模型放进一个高采用度 GPU 基础设施平台。AWS 将 ESM-C 纳入 SageMaker JumpStart, 也让数千家在 AWS 上部署工作负载的生命科学公司更容易发现它。「免费学术层 → Forge API beta → 企业 SageMaker 合同」 这一漏斗在架构上成立。 第二股不利:EvolutionaryScale 没有披露药企锚定客户,没有落地后扩张案例研究,也没有公开宣布可支持市场标准比较的定价。 Generate Biomedicines 披露了 $1.9B 的 Amgen 合作;Isomorphic Labs 宣布与 Eli Lilly 和 Novartis 达成交易,潜在里程碑价值合计超过 $3B。EvolutionaryScale 的前沿蛋白质语言模型在学术资质上或许更强 (Science 论文、98B 参数 ESM3),但相对这些蛋白质 AI 直接竞争对手,商业证明明显更弱。客户集中度风险目前无法衡量—— 没有具名企业客户,技术上客户集中度为零,但这掩盖了更大的风险:一家 27 个月融资 $142M 的公益公司,完全没有企业收入。 扩张逻辑的结构性风险包括:(1)开放权重 ESM2 替代——很多下游用户用免费的 15B 参数 ESM2 就能得到足够好结果, 不必为 ESM-C 或 ESM3 商业访问付费;(2)生物安全约束——负责任开发框架和双重用途风险评估(NTI 和 safe.ai 有记录) 为前沿蛋白质设计模型设置访问门槛提供了正当理由,从而限制可服务商业用户基础;(3)学术开源竞争——AlphaFold3、ESMFold 和 RoseTTAFold 都可免费用于结构预测,把可服务市场压缩到 ESM3 真正有差异化的生成和多模态推理任务。[CU014, CU017, CU018, CU019, CU024, CU026]
| 扩张驱动因素 / 风险因素 | 类型 | 影响 | 可能性 / 状态 | 尽调路径 |
|---|---|---|---|---|
| AWS SageMaker + NVIDIA BioNeMo 渠道上架 | 扩张驱动因素 | 高——可借助云平台触达数千家企业级生命科学客户 | 已确认(SageMaker 已上线;BioNeMo 已宣布) | 跟踪 SageMaker 上架排名和 BioNeMo 上线时间表;要求提供渠道合作伙伴收入分成条款 |
| 未披露具名制药锚定客户(商业验证缺口) | 集中风险 / 反向 | 高——缺少企业客户验证,会压制 Series B 估值和后续融资 | 截至 2026 年 5 月,该缺口已确认 | 跟踪制药交易公告;要求管理层更新企业销售管线阶段和数量 |
| 开放权重 ESM2 免费替代风险 | 反向阻力 | 中——ESM2(最高 15B 参数、免费)能满足许多下游用户的表征任务 | 已确认——ESM2 可用;替代程度未知 | 量化 API beta 用户中真正需要 ESM3 或 ESM-C 6B 性能、而非 ESM2 的比例 |
| 前沿蛋白设计模型的生物安全 / 两用约束 | 反向阻力 | 低至中——负责任开发框架可能限制高风险用例访问,从而压缩可服务商业市场 | 活跃——NTI 和 safe.ai 记录了外界对蛋白设计 AI 的持续生物安全担忧 | 审阅 EvolutionaryScale 的负责任开发框架;评估访问控制是否会实质性限制制药客户用例 |
| Forge API 商业定价上线(未来) | 扩张驱动因素(待定) | 高——付费 Forge API 将形成首个直接收入指标,并验证付费意愿 | 待定——截至 2026 年 5 月仅有免费 beta;定价模式未公布 | 跟踪 Forge 定价页;向管理层询问定价层级结构和预计上线日期 |
| 学术到企业漏斗驱动先落地再扩张 | 扩张驱动因素(结构性) | 中——已有记录的 ESM2 企业采用(BioNTech/InstaDeep)说明存在企业转化可能 | 结构性——漏斗架构已确认;转化率未知 | 向管理层索取 Forge API 转化率数据;比较学术与商业 API 使用占比 |
扩张驱动因素评估具有前瞻性,基于渠道关系的结构推断,而非已披露收入数据。反向阻力(ESM2 替代、生物安全约束)分别由已确认的免费替代品和独立生物安全组织文件支撑。所有概率评估都是定性而非定量。
[CU017, CU018, CU024, CU035, CU036]
6.6 Evidence exhibits
07 Risks
7.1 Biosecurity and dual-use risk
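Biosecurity is the most fundamental risk dimension in the EvolutionaryScale investment thesis. esmGFP already showed that ESM3 can generate a functional protein whose sequence distance from any known natural protein is described as equivalent to 500 million years of evolution. The same generative capability that supports drug discovery could in principle be directed at enhancing pathogen virulence, generating novel toxins, or engineering biological agents outside the known sequence space that surveillance systems cover. A 2023 MIT study posted to arXiv (Soice et al., 2306.03809) showed that large language models could, within a one-hour session, help non-scientists without lab training identify potential pandemic pathogens, synthesis routes, and CRO partners. That study examined general-purpose LLMs rather than protein-specific models, but the concern carries over by analogy: protein language models lower the barrier to designing functional biological agents.
US Executive Order 14110 (October 30, 2023) explicitly named biotechnology as a national-security AI risk area and required evaluation of AI systems that could lower the barrier to building biological, chemical, nuclear, or radiological mass-casualty weapons; the order was rescinded on January 20, 2025. The NIST AI Risk Management Framework (AI RMF 1.0, released January 2023; the Generative AI Profile followed in July 2024) offers voluntary guidance for identifying and managing these risks. The EU AI Act (Regulation 2024/1689, OJ 12 July 2024) also brings biology-related dual-use AI into the high-risk scope under Annex III and related provisions.
EvolutionaryScale published a Responsible Development Framework built on four core principles: communicating benefits and risks, rigorously evaluating models before deployment, adopting guardrails, and engaging with government and civil society. The ESM Cambrian launch blog states that ESM C was reviewed by a committee of scientific experts, which concluded that the benefits of releasing the models greatly outweigh any potential risks. On the access date (2026-05-18), however, the framework's canonical URL (/blog/responsible-development) returned a 404 error, so the framework document may not be publicly accessible; that is itself a transparency risk. No independent third-party validation of EvolutionaryScale's model safety evaluations has been publicly disclosed.
The Biological Weapons Convention (BWC, 1972; 189 states parties as of May 2025) prohibits the development and stockpiling of biological weapons but has no formal verification mechanism and nothing specific to AI-designed proteins. The Center for AI Safety's 2023 statement, signed by Hinton, Bengio, and others, placed extinction risk from AI alongside societal-scale risks such as pandemics and nuclear war. The Johns Hopkins Center for Health Security treats the intersection of AI and biosecurity as a core research area. Industry self-regulation through responsible-AI frameworks (Anthropic's RSP, OpenAI's safety commitments) is still early and does not bind third parties such as EvolutionaryScale. [CR001, CR002, CR003, CR004, CR005, CR006]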
| 风险 | 类别 | 可能性(1-5) | 影响(1-5) | 剩余评分 | 当前缓释措施 | 剩余暴露 / 缺口 |
|---|---|---|---|---|---|---|
| 监管机构对蛋白 LLM 强制施加生物安全评估或出口管制 | 生物安全 / 监管 | 3 | 5 | 15 | 负责任开发框架;ESM3 博客称已与政府沟通 | 无约束性第三方评估;/blog/responsible-development URL 在访问日期无法打开 |
| 蓄意滥用 ESM3 API,设计新型病原体蛋白或毒素 | 生物安全 / 两用 | 2 | 5 | 10 | Forge API 访问控制;学术用途限制;声明会监测模型输出 | 未公开披露 API 护栏的独立生物安全审计 |
| 竞争商品化:免费工具(AlphaFold3 DB、Chai-1、OpenFold)侵蚀 Forge API 定价 | 竞争 | 4 | 4 | 16 | ESM3 98B 的生成式多模态能力区别于结构预测工具;面向药物发现的微调 | Chai-1 已在关键基准上追平 / 超过 ESM3;侵蚀在加速 |
| Meta 对用于初始化 ESM3 的 ESM2 祖先权重保留剩余知识产权 | 法律 / 知识产权 | 2 | 4 | 8 | ESM3 架构已提交专利;PBC 公司结构 | 未公开披露 Meta 知识产权协议;ESM2 模型卡对衍生品条款表述含糊 |
| Forge API 达到商业收入规模前耗尽现金跑道 | 财务 | 3 | 4 | 12 | 已融资 $145 M;Amazon/Nvidia 作为投资方提供算力获取可选性 | 无公开收入;按 10×+ 远期收入估值意味着门槛很高;未披露后续融资轮 |
| 若 Forge 商业采用滞后且 Series A 重估压缩,存在降价轮风险 | 财务 | 3 | 3 | 9 | 药物发现制药合作作为收入引擎;AWS 分发 | 未披露企业合同或 ARR 里程碑;市场价 GPU 成本仍高 |
| 关键人物离职:Rives、Sercu 或 Lin 流失会使模型开发停摆 | 人才 / 执行 | 2 | 4 | 8 | 暗含股权激励;四位创始人团队提供一定冗余 | 未披露继任计划;董事会未对创始人角色进行独立监督 |
| 单一前雇主文化集中(所有创始人均来自 Meta FAIR)削弱战略多样性 | 人才 / 文化 | 3 | 2 | 6 | 公司已扩展到创始团队之外(估计 50–80 名员工) | 未公开披露外部科学顾问委员会;可能存在范式盲点 |
| 投资方 / 竞争方冲突:Amazon 和 Nvidia 将企业客户导向竞争平台 | 伙伴 / 依赖 | 2 | 4 | 8 | 合同分发协议提供渠道激励 | 未披露 MFN 或排他性条款;BioNeMo 纳入第三方竞争工具 |
| EU AI Act 高风险分类触发 ESM3 API 的合规评估和访问限制 | 监管 | 2 | 3 | 6 | 负责任开发框架与 RSP 类自律机制对齐 | EU AI Act 全部条款于 2026 年 8 月适用;公司 EU 业务存在和合规姿态未披露 |
| 模型幻觉:ESM3 生成序列湿实验验证失败率降低客户 ROI | 技术 | 3 | 3 | 9 | ESM3 论文称,对齐训练(类似 RLHF 的反馈)提升生成质量 | 未发布湿实验验证失败率;API 调用到实验结果之间的延迟掩盖真实失败率 |
| 新型蛋白家族数据稀缺,限制 ESM3 泛化到未探索序列空间 | 技术 | 3 | 3 | 9 | ESM3 训练使用合成数据增强(预测结构 / 功能) | 合成数据质量依赖 AlphaFold 预测;若预测错误,会带来循环依赖风险 |
可能性和影响按 1–5 评分;剩余评分 = 可能性 × 影响。缓释措施来自 EvolutionaryScale 公开披露和行业标准实践。剩余暴露列反映尚未解决的公开证据缺口。
[CR001, CR005, CR013, CR016, CR019, CR023]
Figure: Risk items plotted by likelihood (x-axis, 1–5) and impact (y-axis, 1–5); upper-right quadrant = highest priority.
可能性和影响评级是基于公开信息的定性估计;未使用定量概率模型。
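The residual score in the risk register is simply likelihood multiplied by impact. The sketch below recomputes it for a few rows of the table above and ranks them. The row labels are shortened English paraphrases, and the names `RISKS` and `rank_risks` are illustrative rather than part of the report.

```python
# (risk label, likelihood 1-5, impact 1-5), taken from rows of the risk table.
RISKS = [
    ("Mandatory biosecurity evaluation or export controls", 3, 5),
    ("Deliberate misuse of the ESM3 API", 2, 5),
    ("Competitive commoditization by free tools", 4, 4),
    ("Residual Meta IP over ESM2 ancestor weights", 2, 4),
    ("Cash runway exhausted before Forge revenue scales", 3, 4),
]


def rank_risks(risks):
    """Return (label, residual score) pairs sorted from highest to lowest score."""
    scored = [(label, likelihood * impact) for label, likelihood, impact in risks]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for label, score in rank_risks(RISKS):
        print(f"{score:>2}  {label}")
```

A plain product treats a likely, moderate risk and an unlikely, severe one as comparable. That is a known limitation of this kind of matrix and a reason the ratings are described as qualitative.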
[CR001, CR005, CR013, CR019, CR032, CR038]
7.2 Technical risk
ESM3 的性能取决于训练分布质量。模型在 2.78 billion 条蛋白质序列上训练,但新型家族中功能蛋白的天然多样性——例如非核糖体合成肽、 新型酶骨架或完全从头折叠——可能远在这一分布之外。蛋白质语言模型可能「幻觉式」生成高置信度序列,但这些序列并不按预测折叠或发挥功能; 任何 ESM3 生成序列用于治疗或工业用途前,都需要湿实验室验证。由此产生延迟风险:客户为 Forge API 调用付费, 但在产生商业价值前仍要承担昂贵实验验证;相较开放替代品,溢价定价的经济论据因此变弱。 基准饱和是近期技术风险。ESM3-98B 发布时在 CASP15 单体预测上达到最先进水平,但 Chai-1(Apache 2.0,免费) 已报告 Cα LDDT 0.849 vs ESM3-98B 的 0.801,以及 77% vs 76% 的 PoseBusters 成功率,在结构预测上接近商业模型。 AlphaFold 3 及其开放数据库(截至 March 2026,含蛋白质复合物在内超过 200 M+ 个结构)持续扩大免费覆盖。 Baker Lab 的 RFdiffusion(Nature 2023)可免费用于结合体设计。这些开放工具降低了 Forge API 门控访问的边际价值。 ESM3 的训练从 Meta 的 ESM2 权重初始化。根据 Meta 自有模型卡和 facebookresearch/esm GitHub 仓库条款, Meta 保留 ESM2 模型家族的 IP。ESM3 相比 ESM2 在架构和训练上有实质进步,但如果 Meta 对祖先权重仍有任何残留 IP 主张, 可能限制 EvolutionaryScale 商业化 98B 模型或再许可权重的能力。公司已就部分工作提交专利(ESM3 bioRxiv 预印本竞争利益声明中披露), 但完整专利组合及其与 Meta 在先技术的关系仍未披露。 对 NVIDIA(H100/H200 集群)和 Amazon(AWS)的算力依赖是一把双刃剑:两者既是投资人和渠道伙伴,理论上也可能限制访问或降低工作负载优先级。 GPU 供应约束可能推迟模型训练或 API 扩容;尤其是 ESM3 训练消耗 1×10²⁴ FLOPs,是发布时任何生物模型中规模最大的算力投入之一。[CR011, CR012, CR013, CR014, CR015, CR016]
| 风险 | 机制 | 证据 | 严重性 | 缓释措施 |
|---|---|---|---|---|
| 蛋白幻觉 / 非功能性生成 | 语言模型生成的序列统计上可信,但折叠错误或没有生物活性 | Chai-1 技术报告显示,ESM3-98B Cα LDDT 为 0.801,低于 Chai-1 的 0.849;AlphaFold2 仅在 CASP14 对约 2/3 蛋白达到准确度 | 对期待可靠命中的药物发现客户,严重性高 | ESM3 论文称对齐训练类似 RLHF;仍需实验室验证 |
| 基准饱和与竞争者追赶 | 免费开放工具在结构预测上快速缩小性能差距;生成任务基准更不完善 | AlphaFold3 DB 免费提供 200M+ 结构;Chai-1 采用 Apache 2.0 且达到 SOTA;RFdiffusion 来自 Baker Lab,免费 | 中——限制溢价定价能力 | ESM3 的多模态联合生成(序列 + 结构 + 功能)形成差异 |
| ESM2 知识产权来源——Meta 祖先权重 | ESM3 以 Meta 的 ESM2 初始化;Meta 模型卡未完全公开衍生品条款;可能对商业权重提出权利主张 | facebookresearch/esm 仓库文本写明在 Meta 条款下“包含预训练权重”;未明确给出 ESM2 权重的开源许可证 | 中——潜在许可风险 | 已提交专利;需要律师审查;PBC 结构提供一定保护 |
| 算力成本与 GPU 供应集中 | 训练消耗 1×10²⁴ FLOPs,且需要持续推理;ESM3-98B 需要 HPC 集群;NVIDIA H100/H200 供应稀缺 | ESM3 博客:“在当今世界吞吐量最高的 GPU 集群之一上训练” | 中等运营风险 | Amazon 和 Nvidia 作为投资方提供算力获取可选性;多云风险仍在 |
| 稀有蛋白类别训练数据缺口 | 新型生物、合成生物学底物或非天然氨基酸可能落在 2.78B 序列训练分布之外 | ESM3 论文:用合成数据补足缺口;ESM Cambrian 缩放律平台期表明存在上限 | 中——限制其在前沿药物发现中的效用 | 合成数据增强;通过 ESM Cambrian 系列持续更新模型 |
严重性评级为定性;证据引用指向公开模型卡和技术报告。
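[CR011, CR012, CR013, CR014, CR015, CR016]
Figure: Directed acyclic graph showing how biosecurity, technical, and financial risks propagate to commercial outcomes and investor impact.
[CR005, CR013, CR019, CR032, CR042]
7.3 Competitive commoditization risk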
蛋白质 AI 工具正在快速商品化。Google DeepMind 的 AlphaFold 3 数据库通过与 EMBL-EBI 的合作,免费提供超过 2 亿个预测蛋白质复合体结构(March 2026 更新后纳入蛋白质复合体)。Meta 的 ESM2 通过 facebookresearch/esm 以 MIT 许可证发布;OpenFold 使用 Apache 2.0。Chai-1 使用 Apache 2.0,允许免费商用。 Baker Lab 的 RFdiffusion 和 ProteinMPNN 可从 IPD/UW 免费获取。这些可免费使用的模型覆盖结构预测和结合体设计工作流, 正是 Forge API 的核心用例。 EvolutionaryScale 的防御性靠三点支撑:(1)ESM3 的生成式多模态能力不止结构预测,而是把序列、结构、功能放在一起生成; (2)98B 参数旗舰模型通过 API 闸门和商业许可控制;(3)药物发现所需的领域微调离不开专有数据。不过,Chai-1 技术报告称其无需 MSA 就能达到最先进的多聚体预测水平,直接冲击 ESM3 的关键差异点。如果学术团队和 VC 支持的对手(Profluent、Generate Biomedicines、AbSci、Isomorphic Labs)用宽松许可证发布有竞争力的生成模型, Forge 的定价权会被压缩。 投资人和合作伙伴重叠进一步放大竞争风险:Amazon(AWS)通过 SageMaker JumpStart 分发 EvolutionaryScale 模型, 但也投资并向竞争性生物 AI 公司提供算力;Nvidia 通过 BioNeMo 分发模型,而 BioNeMo 本身也是竞争性的模型分发平台。 合作伙伴利益和投资人利益一旦冲突,这些平台可能优先扶持替代方案。[CR019, CR020, CR021, CR022, CR023, CR024]
| 竞争工具 | 许可 / 成本 | 主要能力 | 对 Forge API 的威胁 | 与 ESM3 的差距 |
|---|---|---|---|---|
| AlphaFold 3 DB (DeepMind/EMBL-EBI) | 免费;CC BY 4.0 | 蛋白、复合物和小分子的结构预测;数据库含 200M+ 条目 | 对结构预测用例威胁高 | 不是生成模型;不能根据提示生成序列 |
| Chai-1 (Chai Discovery) | Apache 2.0;免费商用 | 多聚体结构预测;PoseBusters 77%;单体 LDDT 0.849;无需 MSA | 高——已在 CASP15 单体上超过 ESM3-98B | 尚不是生成式蛋白设计模型;序列生成有限 |
| OpenFold (AQ Laboratory) | Apache 2.0;可训练 | 等同 AlphaFold2 的结构预测;可用自有数据训练 | 中——训练需要算力 | 仅结构;不生成序列 / 功能;无 98B 规模模型 |
| RFdiffusion (Baker Lab/IPD) | 宽松许可;研究免费 | 从头生成蛋白骨架;结合蛋白设计;基序支架搭建 | 对结合蛋白设计用例威胁中等 | 无序列 / 功能联合推理;多模态弱于 ESM3 |
| Meta ESM2(通过 facebookresearch 采用 MIT 许可) | MIT——免费商用 | 序列嵌入;结构预测(ESMFold) | 对嵌入和结构任务威胁中等 | 不是 ESM3 意义上的生成式模型;已被 ESM3 架构取代 |
| Profluent Bio、AbSci、Generate Biomedicines 等竞争方 | 自研 / 合作 | AI 驱动的抗体 / 蛋白设计,配套湿实验 | 对企业药物发现客户威胁中等 | 垂直整合竞争者;按发现服务收费,而不是按 API 访问收费 |
竞争对手数据来自访问日期的公开 GitHub 仓库、模型卡和技术博客文章。
[CR019, CR020, CR021, CR022, CR023]
7.4 Regulatory landscape
AI 设计蛋白质的监管环境仍在早期,且高度碎片化。还没有任何司法辖区出台专门规则,管控生成式蛋白质语言模型的商业部署。 美国、欧盟和英国都在制定框架,未来可能按不同风险分类覆盖 EvolutionaryScale 的产品。 在美国,EO 14110(October 30, 2023)要求超过定义阈值的双用途基础模型开发者向政府报告安全评估,尤其关注 “生物安全、网络安全和关键基础设施”风险。NIST AI RMF(January 2023)及其 Generative AI Profile(July 2024)提供自愿指引。FDA 通过软件即医疗器械(SaMD)框架和 2024 AI/ML 行动计划监管 AI/ML 医疗器械,但只覆盖诊断或治疗决策 AI,不覆盖纯发现阶段的蛋白质设计工具。BIS(工业与安全局) 已开始研究可用于生物武器场景的 AI 模型出口管制框架,但还没有最终规则专门管控蛋白质语言模型。 在欧盟,AI Act(Regulation 2024/1689,August 2024 生效;多数条款自 August 2026 起适用)会根据应用场景, 将可能造成双用途生物危害的 AI 系统划入高风险或禁止类别。待法案条款全面执行后,ESM3 的双用途 API 访问可能需要合格评定、 透明度义务和人工监督措施。 英国的生物安全审查仍在 Biosecurity Strategy 和 AI Safety Institute 框架下推进,该框架与美国及国际伙伴合作评估 AI 带来的生物风险。Biological Weapons Convention(已有 189 个国家签署并批准)禁止发展和生产生物武器,但没有 AI 专门条款,也缺少正式核查机制。 监管尾部风险不对称:新的强制性规则(强制安全评估、出口管制、访问限制)可能抬高合规成本、限制国际分发,或要求模型删减—— 任何一项都会削弱 Forge 的商业模式,而且没有先例可判断限制会持续多久、强度多高。[CR025, CR026, CR027, CR028, CR029, CR030]
| 监管工具 | 司法辖区 | 状态 | 对 ESM3/Forge 的适用性 | 可能时间线 | 剩余风险 |
|---|---|---|---|---|---|
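| EO 14110: Safe, Secure, and Trustworthy AI (§4.4 biotechnology) | US federal | Issued Oct 30, 2023; rescinded Jan 20, 2025 | Required frontier model developers above a compute threshold to report dual-use biological evaluations to the federal government | Reporting requirement lapsed with the rescission; any successor rule is pending | Medium: at 1×10²⁴ FLOPs, ESM3 sat above the order's 10²³ FLOPs threshold for models trained primarily on biological sequence data; no report has been confirmed |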
| NIST AI RMF 1.0 + 生成式 AI 概要(NIST-AI-600-1) | 美国(自愿) | 已发布 Jan 2023;GenAI 概要于 Jul 2024 发布 | 自愿框架;采购和监管语境中引用越来越多 | 自愿;事实标准 | 低至中:不合规仅带来声誉和采购风险 |
| EU AI Act (Regulation 2024/1689) | EU | 已发布 Jul 12, 2024;Aug 2026 全面执行 | 训练量 >10²⁵ FLOPs 的通用 AI 模型可能触发系统性风险义务;两用生物应用可能被列为高风险 | 大多数条款于 Aug 2026 适用 | 高:合规评估、透明度义务和第三方审计可能限制 EU 市场准入 |
| FDA AI/ML 医疗器械框架(SaMD) | US FDA | 演进中;2024 AI/ML 行动计划 | 适用于诊断 / 治疗 AI,不适用于发现阶段蛋白设计工具;未来可能扩展 | 渐进推进;暂无针对蛋白 LLM 的具体规则 | 目前低;若 ESM3 用于临床决策支持,可能扩大 |
| BIS 出口管理条例(EAR)——潜在 AI 生物管制 | 美国(Commerce/BIS) | ANPRM 开发中(2024);蛋白 LLM 尚无最终规则 | 可能限制向对抗性国家出口 ESM3 权重或提供 API 访问 | 若出台最终规则,可能在 2025–2027 年 | 中:会限制国际商业收入和学术分发 |
| 生物武器公约(BWC) | 国际(189 个缔约方) | 1975 年起生效;无核查机制 | 禁止开发生物武器;未专门处理 AI 设计蛋白;公司和客户均须合规 | 持续有效;近期无 AI 专门修正案 | 低至中:合规义务落在客户;除服务条款外,EvolutionaryScale 无直接监管负担 |
状态和时间线信息基于公开监管文件及抓取日期可得知识。EU AI Act 执行日期可能因实施法案变化。BIS ANPRM 状态可能变化。
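Whether ESM3's training compute crosses a regulatory line is a direct comparison. The sketch below checks the 1×10²⁴ FLOPs figure against three thresholds. The EU AI Act 10²⁵ figure appears in the table above. The two EO 14110 figures (10²⁶ in general, 10²³ for models trained primarily on biological sequence data) are taken from the order's text as issued in 2023, not from this report. The names `THRESHOLDS` and `crossed` are illustrative.

```python
ESM3_TRAINING_FLOPS = 1e24  # training compute reported for ESM3

# Compute thresholds in FLOPs. The EO 14110 values are an assumption drawn from
# the order as issued in 2023, not from this report.
THRESHOLDS = {
    "EU AI Act: general-purpose model systemic-risk presumption": 1e25,
    "EO 14110: general dual-use foundation model reporting": 1e26,
    "EO 14110: biological-sequence model reporting": 1e23,
}


def crossed(training_flops, thresholds):
    """Map each threshold name to True when training compute exceeds it."""
    return {name: training_flops > limit for name, limit in thresholds.items()}


if __name__ == "__main__":
    for name, over in crossed(ESM3_TRAINING_FLOPS, THRESHOLDS).items():
        print(f"{'above' if over else 'below'}  {name}")
```

On these inputs ESM3 sits below both general-purpose thresholds and above only the biology-specific one, so the regulatory exposure is concentrated in rules written specifically for biological-sequence models.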
[CR025, CR026, CR027, CR028, CR029, CR030]
Figure: Timeline of regulatory milestones affecting EvolutionaryScale, 2022 through 2027 (projected).
2026 年及以后监管节点的日期基于典型美欧规则制定节奏估算;实际日期可能不同。
[CR025, CR026, CR027, CR028, CR029]
7.5 Financial and operational risk
EvolutionaryScale 累计融资约 $145 M(种子轮加 $142 M Series A,September 2024),投后估值 $1.35 B。 运行 98B 参数级别的前沿蛋白质语言模型,需要大量持续算力。训练 ESM3 消耗 1×10²⁴ FLOPs,依托高吞吐 GPU 集群; 商业规模下的持续推理服务和模型迭代会反复产生硬件成本。公司没有公开收入披露;截至已知访问日,公司仍处于收入前或极早期收入阶段。 按一家拥有前沿 GPU 集群、50–80 人 AI 基础设施公司的典型开支速度,除非 Forge API 很快转化为有意义的商业收入, $145 M 对应的现金跑道大约只有 2–4 年。该估值意味着相对于当前 Forge 采用度可支撑的任何前瞻收入预测,倍数都超过 10×; 如果商业爬坡慢于投资人预期,下轮估值下调风险会出现。 Amazon 和 Nvidia 的投资人 / 合作伙伴集中带来双重冲突:两家公司同时是 EvolutionaryScale 的主要云算力供应商、 主要分发渠道(SageMaker JumpStart、BioNeMo)和 Series A 投资人。如果 EvolutionaryScale 需要重谈云合约或寻求竞争性报价, 投资人关系会限制谈判筹码。反过来,如果 Amazon 或 Nvidia 发展竞争能力(两者已经在做:BioNeMo 已纳入 ESM 模型, 也包含竞争工具),它们可能有动力把客户从 Forge API 引走。 除 Forge API 订阅外,公司没有披露任何 IP 变现策略。如果开放的 ESM3 1.4B 模型(学术用途免费)满足了大部分学术需求, 从而蚕食商业 Forge 采用;同时制药客户更愿意自建能力而不是支付 API 费用,商业模式在规模化时可能低于预期。[CR032, CR033, CR034, CR035, CR036, CR037]
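The 2–4 year runway quoted above implies a range for annual burn, since runway is cash divided by burn. The sketch below makes that arithmetic explicit. It is a simplification: it treats the full $145M as available cash and ignores revenue, interest, and money already spent. The function names are illustrative, and the company disclosed no burn figure.

```python
def runway_years(cash_musd: float, annual_burn_musd: float) -> float:
    """Years of runway for a given cash balance and annual burn (both in $M)."""
    if annual_burn_musd <= 0:
        raise ValueError("annual burn must be positive")
    return cash_musd / annual_burn_musd


def implied_annual_burn(cash_musd: float, runway_yrs: float) -> float:
    """Annual burn ($M) that would exhaust the cash in exactly runway_yrs."""
    if runway_yrs <= 0:
        raise ValueError("runway must be positive")
    return cash_musd / runway_yrs


if __name__ == "__main__":
    cash = 145.0  # cumulative funding in $M, used as a proxy for cash
    print(f"Implied burn for a 4-year runway: ${implied_annual_burn(cash, 4):.2f}M/yr")
    print(f"Implied burn for a 2-year runway: ${implied_annual_burn(cash, 2):.2f}M/yr")
```

Under these assumptions the 2–4 year range corresponds to burning roughly $36M to $73M a year.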
| 风险 | 驱动因素 | 可能性 | 影响 | 缓释措施 | 尽调询问 |
|---|---|---|---|---|---|
| Forge API 收入成规模前资金耗尽 | 累计融资 $145M,对应 GPU 集群烧钱速度;无公开收入 | 中 | 严重 | Amazon/Nvidia 按投资方条款提供算力访问;ESM Cambrian 学术开放模型降低推理负担 | 披露月度烧钱速度、Forge ARR 和现金跑道指引 |
| 若商业爬坡滞后,$1.35B 估值存在降价轮风险 | 预收入阶段估值隐含 10× 收入预测;2024-2026 年市场可比公司倍数压缩 | 中 | 高 | Series A 超额认购(Amazon + Nvidia 领投);投资人团强 | 获取股权结构表、期权池稀释,以及任何 Series B 授权或降价轮保护条款 |
| 投资方 / 合作方冲突:Amazon 和 Nvidia 兼具投资方、算力提供方和分发平台身份 | 结构性:两方分别运营 BioNeMo 和 SageMaker 这类竞争平台 | 中 | 高 | 假设投资部门与商业部门有合同隔离,但尚未确认 | 获取投资人附函副本及任何竞业限制或渠道排他条款 |
| 收入集中:Forge API 是唯一商业产品 | 未公开披露药物发现合作收入、里程碑付款或数据授权等多元化收入 | 中 | 中 | Forge API 通过 AWS SageMaker 和 NVIDIA BioNeMo 拓宽分发;暗示存在制药合作 | 获取收入拆分:API 费用 vs 合作 vs 里程碑 vs 授权 |
| GPU 成本通胀:NVIDIA H100/H200 作为前沿训练唯一供应方具备定价权 | NVIDIA 在 AI 加速器市场占优;短期内 AMD/Intel 无同等性能替代品 | 中 | 中 | Amazon 和 Nvidia 作为投资方可能提供优惠定价;云现货价格提供可选性 | 确认每 1K 次 API 调用算力成本和单次模型训练成本;评估规模化后的利润率 |
可能性和影响为定性评级。财务风险评估仅基于公开信息;公司未披露收入、烧钱速度或利润率数据。
[CR032, CR033, CR034, CR035, CR036, CR037]
7.6 Talent, key-person, and culture risk
已披露姓名的四位创始人——Alexander Rives(CEO)、Tom Sercu(President)、Zeming Lin(CTO)和 Salvatore Candido——都来自 Meta AI 的 FAIR 蛋白质研究团队。单一雇主出身带来文化集中风险:团队共享同一套研究范式、人脉网络和职业经历。 文化单一能加快一致决策,但在评估战略转向或监管威胁时,会削弱认知多样性;FAIR 的学术文化也未必让团队为这些问题做好准备。 关键人物风险很高。Alexander Rives 是最早的 ESM 模型创建者,也是主要科学愿景负责人。BioRxiv 预印本作者名单显示, Tom Sercu 和 Zeming Lin 是 ESM3 与 ESM Cambrian 的主要技术架构师。任何一位创始人离开,都可能拖慢模型开发速度并削弱投资人信心。 公司没有公开披露继任计划或 CEO 独立性安排。 公司位于湾区 AI 人才市场,这是全球竞争最激烈的人才市场之一。面对资本充足的超大规模云厂商(Google DeepMind、Meta、 Microsoft)或制药 AI 部门(Isomorphic、Xaira)的挖角,在 $1.35 B 估值且没有公开流动性的情况下留住资深 ML 研究员, 是结构性挑战。Amazon 和 Nvidia 的投资人身份可能降低来自这两方的团队收购风险,但无法降低人才流向其他超大规模云厂商的风险。[CR038, CR039, CR040, CR041]
7.7 Legal and IP risk
EvolutionaryScale 的 ESM3 以 ESM2 权重为起点训练。Meta 的 facebookresearch/esm GitHub 仓库(托管 ESM2) 没有为模型权重本身附上标准开源许可证;ESM2 模型权重适用 Meta 自己的模型卡条款。ESM3 商业权重与 ESM2 祖先权重之间的关系, 公开来源没有完整说明;如果 Meta 主张衍生作品权利,就会形成潜在 IP 来源风险。 biorxiv 预印本的利益冲突声明写明,“已就本工作的若干方面提交专利申请”。这些专利的性质、 权利要求和状态没有公开披露。ESM3 使用的蛋白质建模离散 token 方法(将 3D 结构和功能 token 化为离散字母表)可能面对来自学术团队的既有技术, 包括 Baker Lab、Meta FAIR 和 Oxford 研究者。任何侵权主张——或专利冲突程序——都可能拖慢商业化,或要求昂贵的授权安排。 EvolutionaryScale 注册为公益公司(PBC)。这一形式提供一定治理弹性,但也围绕公共利益使命产生义务, 可能约束纯商业决策,尤其是开放模型访问与商业闸门之间的取舍。[CR042, CR043, CR044, CR045]
7.8 Appendix
08 Valuation
8.1 Thesis and counter-thesis
EvolutionaryScale 的投资逻辑建立在四个相互强化的支柱上。第一,创始人的领域权威:Alexander Rives、Tom Sercu、 Zeming Lin 和 Salvatore Candido 是 Meta AI FAIR 时期 ESM 蛋白质语言模型家族的真正创造者,拥有任何竞争团队都无法从零复制的机构知识和发表记录。 第二,经同行评审的科学验证:ESM3 于 January 16, 2025 发表在顶级同行评审期刊 Science Magazine,记录了一个新型荧光蛋白的生成, 等同于模拟 5 亿年进化——这一非凡科学主张已经通过编辑评审公开验证,并以超过 1×10^24 FLOPs 训练算力被索引。第三,结构性的云分发护城河: Amazon(AWS)和 NVIDIA 不只是财务投资人;它们也是分发渠道伙伴,把 Forge 嵌入 AWS SageMaker JumpStart 和 NVIDIA BioNeMo, 让 EvolutionaryScale 直接触达几乎所有全球制药和生物技术 R&D 组织使用的云基础设施。第四,独特的多模态生成能力:ESM3 同时在蛋白质序列、结构和功能上推理——同类蛋白质 AI 创业公司(Profluent、Cradle.bio、Absci)还没有在单一基础模型中做出这一能力。 反向逻辑同样有证据支撑。第一,收入披露为零:截至 May 2026,Forge 没有公开确认 ARR、客户数或毛利率;$1.35B Series A 估值完全建立在前瞻预期上,使其成为 AI 生物技术领域估值最高的收入前项目之一。第二,开源替代威胁:ESM2(前代模型)可免费开源使用; AlphaFold 3(Google DeepMind)向全球超过 300 万研究人员免费提供非商业蛋白质结构和相互作用预测;两者都直接替代 EvolutionaryScale 商业产品的核心。第三,关键人物集中:四位联合创始人全部来自同一前雇主(Meta AI FAIR);如果同时离开, 这是高度相关的集中风险,在可比创业公司中几乎没有先例。第四,双用途和生物安全监管阴影:ESM3 的生成式蛋白质设计能力存在生物安全风险, 责任发展框架也承认这一点;客户筛查协议未披露,监管暴露仍不确定。第五,VC 估值倍数压缩风险:KPMG 2024 Venture Pulse 报告明确警告, 投资人正在“更审慎判断 AI 领域谁可能成为赢家”,并会偏好商业模式可信的公司;公开证据看, EvolutionaryScale 还没有达到这一标准。[CV001, CV002, CV006, CV007, CV009, CV022]
| 视角 | 论点 | 何种情况会改变观点 |
|---|---|---|
| 投资逻辑 | 创始人(Rives、Sercu、Lin、Candido)在 Meta AI FAIR 创建了 ESM 蛋白语言模型家族;这种机构知识,竞争团队无法复制 | 联合创始人离职,或组建了一个能获取类似训练数据的竞争实验室 |
| 投资逻辑 | ESM3 发表在 Science Magazine(Jan 2025):首个多模态蛋白生成模型,并通过同行评审验证了 5 亿年进化模拟 | 科学同行挑战关键 ESM3 主张,或复现失败 |
| 投资逻辑 | Amazon (AWS) + NVIDIA 共同投资:Forge 部署在 SageMaker JumpStart 和 BioNeMo,可直接触达几乎所有全球制药 R&D 云基础设施 | Amazon 或 NVIDIA 终止分发合作,或转向竞争性蛋白 AI 平台 |
| 投资逻辑 | ESM3 多模态推理(序列 + 结构 + 功能同步)在蛋白 AI 同行中独特,可规模化支持提示引导的蛋白设计 | 同行在 Forge 锁定多年期合同前,开源并广泛采用了同等多模态能力 |
| 反向逻辑 | $1.35B Series A 且零披露收入,意味着投后估值 / 融资额约 9.5x——预收入 AI 生物技术项目中最贵的进入点之一;未确认 ARR 或客户数 | Forge 披露 ARR >$10M、毛利率 >60%,且拥有多家制药客户 |
| 反向逻辑 | ESM2(前代模型)开源;AlphaFold 3(Google DeepMind)向 3M+ 研究者提供免费的非商业蛋白结构预测——可直接替代低端 Forge 用例 | ESM3 的独特生成能力(开源尚未复刻)能长期维持高于商品化 API 的定价 |
| 反向逻辑 | 四位联合创始人全部来自同一前雇主(Meta AI FAIR)——关键人物离职风险相关,在领导层形成单点故障 | 创始人签署长期雇佣合同,并招聘第二梯队独立科学领导层 |
投资逻辑和反向逻辑均有证据支撑。行顺序按对估值的相对影响排列。所有反向逻辑行均反映截至 2026 年 5 月可观察公开证据;未纳入推测性主张。
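[CV006, CV007, CV009, CV022, CV023, CV029]
Figure: Decision chain from founder track record, platform validation, risk factors, and valuation anchors to the "continue research" recommendation and the catalysts it requires.
[CV006, CV009, CV030, CV033]
8.2 Recommendation, confidence, and risk rating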
建议为继续研究 / 保持关注。这不是买入建议,原因有三点且都有证据支撑。第一,估值置信度为中等:$1.35B 投后 Series A 锚点经 Crunchbase 和 Bloomberg 确认,但没有 Forge 收入、ARR、毛利率或客户数数据——这些目前都未披露——就无法建立内在价值模型。 第二,进入倍数激进:在 $1.35B 且收入前的状态下,EvolutionaryScale 隐含的价格 / 已融资倍数(投后估值 / 已融资约 ~9.5x) 明显高于收入前生物技术公司的行业常态;没有商业牵引力可见度,很难证明这一估值合理。第三,悲观情景风险不对称:开源 ESM2 和免费的 AlphaFold 3 是真实替代品,可能压缩 API 定价,并在收入逻辑兑现前摧毁它。 整体置信度为中等。产品证据强(Science 发表、模型架构),团队证据强(Meta AI 背景、GitHub 活动、HuggingFace 存在), 融资证据也强(已确认且多来源)。商业侧证据弱(无收入、无 ARR、无客户披露、无合作财务条款)。风险评级为高,原因包括: 以高估值进入收入前公司;四位联合创始人均来自同一前雇主,关键人物高度集中;开源和免费层竞争替代;双用途生物安全监管不确定; 以及商业分发结构性依赖 Amazon 和 NVIDIA。 估值判断是基础设施 / 平台 AI 溢价,但没有临床验证上浮。不同于临床阶段 AI 药物发现公司(Insilico Medicine、Recursion), EvolutionaryScale 的价值完全在平台和基础模型层——更像一家把基础模型 API 公司(Anthropic、Cohere)套用到垂直生物领域的企业, 但没有通用模型的规模,同时市场暴露集中得多。买入建议需要三个条件:披露 Forge ARR 至少 $10M+、拥有活跃的多制药客户群、 并确认毛利率高于 60%。[CV001, CV002, CV005, CV011, CV013, CV030]
| 建议 | 信心 | 风险评级 | 估值立场 | 决策含义 |
|---|---|---|---|---|
| 继续研究 / 保持关注 | 中(无 Forge 收入数据;开源替代风险;优先股堆叠未知) | 高(预收入阶段估值 $1.35B;关键人物集中;开源 ESM2 + AlphaFold 3 替代;两用监管阴影) | 基础设施 / 平台 AI 溢价;基准情景 $1.5–2.5B;乐观情景 $3–5B,取决于 Forge ARR;悲观情景 $400–800M,受商品化拖累 | 跟踪 Forge ARR 和客户数;在上调至买入前,要求 ARR >$10M 且拥有多家制药客户;监控 Amazon/NVIDIA 合作财务条款 |
建议对价格和证据敏感。信心和风险评级反映截至 2026 年 5 月未披露 Forge 收入、且存在开源替代风险。估值区间假设没有已确认财务数据。
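[CV001, CV002, CV030, CV033]
Figure: Six-dimension scorecard for the investment committee: market validation, platform moat, commercial evidence, visibility into economics, risk level, and evidence quality.
[CV005, CV006, CV009, CV030]
8.3 Funding, valuation context, and entry discipline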
EvolutionaryScale 于 September 26, 2024 完成 $142M Series A,由 Amazon(AWS)和 NVIDIA 共同领投。 Lux Capital、Nat Friedman 和 Daniel Gross(AI Grant 组织)参投。包括种子轮在内,累计融资约 $145M。投后估值约 $1.35B——由 Crunchbase、Bloomberg(付费墙)和 PitchBook(付费墙)确认。SEC EDGAR 全文检索数据库中没有找到 “EvolutionaryScale” 的 Form D 证券备案;这与公司的私营状态一致,也可能意味着其使用 Regulation D 但未公开披露。 以历史生物技术创投标准看,收入前阶段给出 $1.35B 估值偏高;但放在 2024 AI 估值环境下,大体仍说得通:仅 Q4 2024, 美国就有五家公司各自完成 $4B+ 融资(KPMG Venture Pulse)。Amazon 和 NVIDIA 投资的战略性质显著改变了风险调整后的投资逻辑: 两者都是分发渠道伙伴,投资本身会形成自我强化的商业激励,把制药 API 流量导向 Forge。AWS SageMaker JumpStart 和 NVIDIA BioNeMo 合计触达几乎所有主要全球制药 R&D 组织,渠道护城河真实且持久。 进入纪律要求先确认商业牵引力,再给出买入建议。优先股堆叠、股权结构表和 Series A 带来的稀释压力都未知;没有经审计财务或备考股权结构披露, 就无法在任何给定企业价值下精确计算普通股价值。Amazon 和 NVIDIA 共同投资,从结构上降低了敌意人才收购或灾难性下轮的概率, 因为任何一方都不会受益于一笔破坏其云平台战略的困境出售。[CV001, CV002, CV003, CV004, CV005, CV009]
| 情景 | 关键假设 | 估值区间(USD B) | 关键风险和概率信号 |
|---|---|---|---|
| 乐观($3–5B) | 到 2027 年,Forge ARR 达到 $50–100M;签下多家制药公司多年期合同;AWS+NVIDIA 渠道带来规模化分发;ESM3 被生物制药行业采纳为蛋白基础模型标准;毛利率 >70% | $3.0–5.0B | 需要确认 $25M+ ARR 数据点和 2+ 个已披露制药合同;相较 Generate Biomedicines(~$2.5B)具备更好分发优势;概率:25–30% |
| 基准($1.5–2.5B) | 商业爬坡较慢;到 2027 年 ARR 为 $10–25M;大部分收入来自 AWS/NVIDIA 渠道费用;有部分企业级制药合同,但 API 定价受到开源压力;团队适度扩张;Series B 仅有温和溢价 | $1.5–2.5B | 与当前 Series A 约 $1.35B 的入场估值一致;小幅上修;概率:45–50% |
| 悲观情景($400M–800M) | 开源 ESM2 + AlphaFold 3 让 API 商品化;未签下大型药企合同;核心联合创始人离职引发人才流失;Amazon 或 NVIDIA 以承压估值收购团队;双用途监管行动限制分发 | $0.4–0.8B | 触发条件:到 2026 年底仍无 Forge ARR 数据;竞争对手开源能力追平;联合创始人离职公告;概率:20–25% |
所有估值区间都是按情景推导的估计,基于可比公司分析(ABSI 约 $800M、RXRX 约 $1.555B、Generate Biomedicines 最近披露约 $2.5B)、先例交易和 ARR 倍数建模。没有可用于 DCF 输入的已确认 Forge 收入。概率为主观估计。
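The three scenarios can be collapsed into one probability-weighted figure. The report itself does not compute this number, so the sketch below is purely illustrative. It takes the midpoint of each valuation range and each probability range from the table above and renormalizes the probabilities, because the midpoints sum to 97.5% rather than 100%. The names `SCENARIOS`, `midpoint`, and `expected_value` are hypothetical.

```python
# Valuation ranges in USD billions and subjective probability ranges per scenario.
SCENARIOS = {
    "bull": {"value": (3.0, 5.0), "prob": (0.25, 0.30)},
    "base": {"value": (1.5, 2.5), "prob": (0.45, 0.50)},
    "bear": {"value": (0.4, 0.8), "prob": (0.20, 0.25)},
}


def midpoint(bounds):
    low, high = bounds
    return (low + high) / 2


def expected_value(scenarios):
    """Probability-weighted valuation using range midpoints, weights renormalized."""
    weights = {name: midpoint(s["prob"]) for name, s in scenarios.items()}
    total = sum(weights.values())
    return sum(midpoint(s["value"]) * weights[name] / total
               for name, s in scenarios.items())


if __name__ == "__main__":
    print(f"Probability-weighted valuation: ${expected_value(SCENARIOS):.2f}B")
```

Midpoints are a crude choice. With probabilities this subjective, the weighted figure is better treated as a sanity check on the scenario table than as a valuation.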
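[CV011, CV022, CV023, CV031, CV032, CV033]
Figure: Low-to-high valuation ranges (USD billions) for the bear, base, and bull scenarios, based on scenario assumptions and comparable-company benchmarks.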
区间来自模型,使用可比公司倍数、M&A 先例与 ARR 情景建模。未使用已确认的 Forge 财务数据。区间是有依据的估算,不是精确 DCF 输出。
[CV031, CV032, CV033]
8.4 Comparable valuation set
EvolutionaryScale 的可比组横跨三类:上市 AI 药物发现公司、私有 AI 生物技术同行,以及近期融资交易。由于 EvolutionaryScale 的定位独特——一家收入前蛋白质基础模型 API 公司,并由全球最大的两家科技公司提供战略云分发——没有完美的单一可比对象。 上市可比公司中,Absci(NASDAQ:ABSI)按商业模式最接近——纯 AI 生物制剂设计,没有 Phase 2 或临床项目。ABSI 截至 May 2026 市值约 $800M,为一家 AI 药物创造平台提供了公开市场估值底部;其披露收入为 $2.8M(FY2025),净亏损 $115.2M。 隐含收入倍数极高(约 285x FY2025 收入),反映市场为平台可选性定价,而不是为近期基本面定价。Recursion Pharmaceuticals (NASDAQ:RXRX)市值约 ~$1.555B,但 Q1 2026 收入只有 $6.47M,累计亏损 $2.1B;它按拥有管线可选性的临床阶段 AI 平台交易。Schrödinger(NASDAQ:SDGR)市值约 ~$893M,披露了软件加基于结构的药物发现收入;其倍数更常规,但混合模式不同。 私有可比公司中,Generate Biomedicines 累计融资约 ~$700M,最近披露估值约 $2.5B——按模态(蛋白质生成式 AI)看最接近。 Xaira Therapeutics 于 April 2024 以 $1B Series A 启动——当时是有史以来最大的 AI 药物发现 Series A——估值约 ~$1B。Isomorphic Labs(Alphabet 支持)未披露独立估值,但正在与 Lilly 和 Novartis 进行数十亿美元级合作。Profluent 融资 $44M,Cradle.bio 融资约 ~$73M;两者都服务蛋白质工程用例,但阶段更早、估值低得多。 若以基础模型平台可比公司并按垂直市场狭窄度调整:Anthropic(约 ~$60B)、Mistral(约 ~$6B)和 Cohere(约 ~$5B) 提供了收入前 AI 基础模型公开市场情绪的上界。EvolutionaryScale 的 $1.35B 约为 Anthropic 估值的 2–3%,模型质量主张可比, 但可服务市场窄得多(仅生物)。鉴于总可用市场(TAM)集中,相比横向基础模型打折是合理的;但相较临床阶段 AI 药物发现上市可比公司, 它仍有溢价。[CV011, CV012, CV013, CV014, CV015, CV016]
| 可比公司 | 类型 | 核心指标和估值(USD) | 倍数或基准 | 对 EvolutionaryScale 的参考价值 | 局限 |
|---|---|---|---|---|---|
| Absci (NASDAQ:ABSI) | 上市公司 | 市值约 $800M(2026 年 5 月);FY2025 收入 $2.8M | 约 285x FY2025 收入 | 最接近的上市纯 AI 生物制品设计同业;没有二期项目;亏损;有 NASDAQ 数据 | Absci 收入来自里程碑式合作费用,不是 SaaS ARR;收入质量低于 EvolutionaryScale 潜在的 Forge 订阅 |
| Recursion (NASDAQ:RXRX) | 上市公司 | 市值约 $1.555B(2026 年 5 月);Q1 2026 收入 $6.47M;累计亏损 $2.1B | 约 60x 年化收入 | 最大的上市纯 AI 药物发现公司;表型组学 + AI 平台;3M+ 化合物表型图谱 | 临床阶段管线给 Recursion 相对 EvolutionaryScale 的溢价;商业模式不同(药物发现 + 管线,不是纯平台 API) |
| Schrodinger (NASDAQ:SDGR) | 上市公司 | 市值约 $893M(2026 年 5 月);52 周区间 $10.94–$27.63 | 公开市场披露 | 基于物理的模拟 + 软件授权;披露 ARR;作为上市公司运营历史更长 | 不是生成式 AI 蛋白语言模型;软件 + 药物发现混合模式;ARR 可见性更高,但增长曲线更低 |
| Generate Biomedicines | 未上市公司 | 累计融资约 $700M;最近披露估值约 $2.5B | 估值 / 融资额约 3.6x | 最接近的生成式生物学可比公司:用蛋白生成式 AI 做治疗药物;总部在 Massachusetts;Flagship Pioneering 支持 | 商业化阶段更早;模型架构不同;无公开财务;最近一轮估值未确认仍有效 |
| Xaira Therapeutics | 未上市公司 | $1B Series A(2024 年 4 月) | 启动时估值约 $1B | 与 EvolutionaryScale 融资同期、史上最大 AI 药物发现 Series A;为 $1B+ 未上市 AI 生物技术 融资轮提供结构性先例 | Xaira 聚焦药物项目而非 API 平台;退出路径和收入模式不同 |
| Profluent | 未上市公司 | 融资约 $44M | 估计估值 $200–400M | 面向 CRISPR 和基因编辑的 AI 蛋白设计;OpenCRISPR 开源发布;阶段更早 | 融资规模更小;应用更窄(基因编辑,不是广义蛋白 API);可比价值有限 |
| Cradle.bio | 未上市公司 | 融资约 $73M | 估计估值 $250–500M | 面向生物制药和工业生物的蛋白优化 SaaS;已确认 Novonesis 合作;更贴近 Forge 用例 | 商业化阶段更早;做优化而非生成;欧洲公司(Amsterdam);技术路径不同 |
| Isomorphic Labs (Alphabet) | 未上市公司 | 未披露;Lilly + Novartis 交易名义价值 $3B+ | 交易价值基准;无独立估值 | AlphaFold 3 缔造方;生成式生物学;Lilly 和 Novartis 合作;Alphabet 结构性优势 | 无独立估值;Alphabet 支持,资本成本和竞争位置根本不同 |
可比样本不完整且不对称:上市可比公司(ABSI、RXRX、SDGR)有 SEC 备案财务数据;未上市可比公司依赖媒体报道的融资轮和估计估值。无法取得投行或独立公平意见数据。所有未上市估值均为估计。
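The multiples in the comparables table follow from simple division, and it helps to see the inputs side by side. The sketch below recomputes them from the figures quoted above: market cap over revenue for the listed comparables, with Recursion's Q1 2026 revenue annualized by multiplying by four, and valuation over capital raised for the private ones. The helper name `multiple` is illustrative.

```python
def multiple(value_musd: float, base_musd: float) -> float:
    """Ratio of a valuation to a revenue or funding base (both in $M)."""
    return value_musd / base_musd


# Listed comparables: market cap / revenue.
absci_rev_multiple = multiple(800, 2.8)            # FY2025 revenue
recursion_rev_multiple = multiple(1555, 6.47 * 4)  # Q1 2026 revenue, annualized x4

# Private comparables: valuation / capital raised.
generate_raise_multiple = multiple(2500, 700)
evolutionaryscale_raise_multiple = multiple(1350, 142)  # post-money / Series A size

if __name__ == "__main__":
    print(f"Absci: {absci_rev_multiple:.1f}x revenue")
    print(f"Recursion: {recursion_rev_multiple:.1f}x annualized revenue")
    print(f"Generate Biomedicines: {generate_raise_multiple:.1f}x raised")
    print(f"EvolutionaryScale: {evolutionaryscale_raise_multiple:.1f}x raised")
```

The two kinds of multiple are not interchangeable. A revenue multiple prices realized sales, while valuation over capital raised only shows how much of the price is explained by cash put in.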
[CV011, CV012, CV013, CV014, CV015, CV016]
| Company | Deal | Amount (USD) | Estimated valuation | Date | Key investors |
|---|---|---|---|---|---|
| EvolutionaryScale | Series A | $142M | 投后约 $1.35B | 2024 年 9 月 | Amazon(AWS)、NVIDIA、Lux Capital、Nat Friedman、Daniel Gross 等投资人 |
| Xaira Therapeutics | Series A(启动) | $1,000M | ~$1B | 2024 年 4 月 | ARCH Venture Partners、Foresite Capital 等 |
| Generate Biomedicines | Series C(最近披露) | 约 $273M(Series C);累计约 $700M | 约 $2.5B(最近披露) | 2022–2023 | Flagship Pioneering、Fidelity、NVIDIA 等 |
| Isomorphic Labs | Series B | 据报道约 $600M | 约 $3B+(交易价值基准) | 2024 | Alphabet(Google);未披露机构联合投资方 |
| Insilico Medicine | 港交所 IPO | ~$293M | 约 $2.3B(此前 Series E) | 2025 年末 | 公开市场(SEHK:3696) |
| Profluent | Series A | $44M | 估计 $200–400M | 2023–2024 | Salesforce Ventures、Felicis、OpenAI Fund 等投资方 |
交易数据来自公司公告、Crunchbase、媒体报道和市场研究报告。未上市交易估值根据披露投后估值或隐含交易条款估计;Isomorphic Labs 估值反映 Lilly + Novartis 交易名义价值,并非已确认的独立股权估值。所有金额均为 USD。
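[CV001, CV002, CV017, CV018, CV020, CV021]
Figure: Sensitivity of EvolutionaryScale's estimated enterprise value (USD billions) to individual upside and downside drivers, relative to a base-case midpoint of about $2.0B.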
所有数值均为相对于约 $2.0B 基准情景中位数的估计敏感性变化。未获得 Forge 已确认财务数据。区间反映可比公司倍数、ARR 增长情景与 M&A 先例分析。
[CV031, CV032, CV033]
8.5 Exit readiness and final diligence questions
EvolutionaryScale 近期最可能的退出路径包括:(1)被 Amazon 战略收购(人才收购或整体收购,把 Forge 嵌入 AWS AI Services)或被 NVIDIA 收购(加深 BioNeMo 的可微蛋白质设计能力);(2)在 ARR 证明商业产品市场契合后,被寻求蛋白质 AI 平台的大型药企收购(AstraZeneca、Pfizer、Genentech、Novartis);(3)如果 Forge 达到 $25M+ ARR 并拥有多家制药客户, 以 $2B+ 估值完成 Series B 或 Series C;或(4)达到 $50M+ ARR、毛利率 >60% 后 IPO——最早也大概率不会早于 2028。 Amazon 的结构性投资人关系显著降低悲观情景概率:一家由 AWS 支持、且 Forge 已部署在 SageMaker JumpStart 上的公司, 除非 Amazon 主动采取敌意决策,否则很难遭遇灾难性下轮。不过,同一个 Amazon 关系也让退出路径集中:如果 Amazon 是最可能买家, 二级投资人必须接受 M&A 定价纪律,而这未必能为非战略投资人最大化估值。 买入建议发布前,有五个尽调问题至关重要:(1)截至 Q2 2026 的 Forge ARR 和客户数;(2)Forge API 收入毛利率; (3)Amazon 和 NVIDIA 合作中的收入分成与排他条款;(4)与 Meta Platforms 之间的 ESM2/ESM3 IP 转让协议(如有), 确认 IP 权属链清晰;(5)生物安全和双用途客户筛查协议。五项得到确认前,估值置信度仍为中等,建议仍是继续研究 / 观察。[CV005, CV009, CV034, CV037, CV038]
| 触发条件 | 阈值和事件 | 对投资逻辑的传导 | 操作含义 |
|---|---|---|---|
| 到 2026 年底仍未披露 Forge ARR | 到 2026 年 Q4,也就是成立三年后,EvolutionaryScale 仍未披露任何 ARR、企业客户数或定价数据 | 证实商业化逻辑完全停留在推测;任何高于已融资资本的估值都失去收入倍数基础;释放可能被收购团队的风险信号 | 下调至回避;追加投入前,要求与公司 IR 直接会面确认 ARR |
| 联合创始人离职(Rives、Sercu、Lin、Candido 任一人) | 四位联合创始人中任一人公开宣布离开 EvolutionaryScale 主动职务,或 LinkedIn 离职信息获确认 | 摧毁创始人领域权威这一支柱;立刻引出 IP 延续性、团队士气以及 Amazon / NVIDIA 合作方信心问题 | 立即重新评估;降低仓位;任何投资逻辑上调前,要求解释 IP 归属和竞业限制状态 |
| 开源蛋白生成模型追平 | 任何开源蛋白语言模型发布,并具备可比 ESM3 的多模态生成能力且被社区广泛采用(6 个月内 GitHub star 数 >10k) | 抹掉 Forge API 的技术差异;让 $1.35B 估值锚商品化;定价权从 EvolutionaryScale 转向基础设施(AWS/NVIDIA) | 按悲观情景重估;评估仅靠 AWS/NVIDIA 分发优势能否撑住 $800M+ 估值 |
| Amazon 收购团队或敌意定价变化 | Amazon 出价以低于 $1B 的估值收购 EvolutionaryScale,或 AWS 调整 Forge 定价条款、直接截留经济收益 | 暴露 Amazon 把 EvolutionaryScale 视为基础设施组件,而非独立平台,进而触发估值重置 | 评估收购团队溢价与长期独立路径;将公允价值按 AWS 功能而非独立平台建模 |
终止触发条件是二元事件或阈值事件;监控需要定期检查 EvolutionaryScale 博客 / 新闻稿、跟踪 LinkedIn 上联合创始人动态、监控 GitHub 蛋白模型仓库以及 AWS Partner Network 公告。
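Because the kill triggers are binary or threshold events, they can be written as a small checklist that is re-evaluated each time monitoring data is refreshed. The sketch below encodes the four triggers from the table above. The `Observation` fields, the trigger labels, and the use of December 31, 2026 as the "end of 2026" cut-off are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Observation:
    as_of: date
    forge_arr_disclosed: bool
    cofounder_departure: bool
    # GitHub stars reached within 6 months by an open model judged to have
    # comparable multimodal generation (the parity judgment is made upstream).
    rival_open_model_stars_6mo: int
    acquisition_offer_usd_b: Optional[float]  # None when no offer is known


def fired_triggers(obs: Observation) -> list[str]:
    """Return the labels of every kill trigger that the observation satisfies."""
    fired = []
    if obs.as_of >= date(2026, 12, 31) and not obs.forge_arr_disclosed:
        fired.append("no_forge_arr_by_end_2026")
    if obs.cofounder_departure:
        fired.append("cofounder_departure")
    if obs.rival_open_model_stars_6mo > 10_000:
        fired.append("rival_open_model_parity")
    if obs.acquisition_offer_usd_b is not None and obs.acquisition_offer_usd_b < 1.0:
        fired.append("sub_1b_acquisition_offer")
    return fired
```

A call such as `fired_triggers(Observation(date(2027, 1, 15), False, False, 12_000, 0.9))` returns the three labels whose conditions hold. In practice the inputs would come from the blog, LinkedIn, GitHub, and AWS Partner Network checks listed above.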
[CV005, CV022, CV023, CV029, CV033, CV037]
| Topic | Missing evidence | Why it matters | Owner and diligence path |
|---|---|---|---|
| Forge ARR 和客户数 | 截至 2026 年 Q2,Forge API 平台的年经常性收入(ARR)、企业客户数和客户名称(至少到类别层面) | 没有收入确认,任何高于 Series A 入场估值的模型都站不住;ARR 是验证商业化逻辑的首要指标 | 公司 IR;AWS Marketplace 上架数据;提到面向客户岗位的 LinkedIn 招聘;Series B 融资资料室 |
| Forge 毛利率 | Forge API 收入成本(每次 API 调用的 GPU 计算成本、基础设施成本、分摊人员);毛利率百分比 | 毛利率决定 Forge 能否扩成高价值 SaaS 业务(毛利率 >70%),还是结构上只是低毛利算力转售 | 公司 IR;财务资料室;以 AWS 计算成本基准作代理 |
| Amazon 和 NVIDIA 合作财务条款 | AWS SageMaker JumpStart 和 NVIDIA BioNeMo 分发协议的收入分成比例、排他条款、最低承诺量和期限 | 如果 Amazon 或 NVIDIA 作为渠道费拿走 Forge 总收入的 >40%,EvolutionaryScale 的净经济性可能撑不起独立平台价值;排他条款决定自助获客能力 | Series B 资料室;M&A 资料室请求;NVIDIA 和 AWS 合作伙伴计划备案 |
| 来自 Meta 的 ESM IP 权属链 | Meta Platforms 与 EvolutionaryScale 创始人之间,覆盖 ESM 模型架构、训练代码或数据管线的任何 IP 转让、许可或让与协议 | 若 IP 权属链未确认,收购方或药企合作伙伴会面对 IP 诉讼风险;Amazon 或药企尽调会要求权属清晰 | 公司法务披露;专利检索(USPTO);联合创始人雇佣协议审查 |
| 生物安全与双用途筛查流程 | 客户筛查流程、高风险请求(病原体相关蛋白)的访问控制,以及 NIH Dual Use Research of Concern (DURC) 政策合规 | 生物安全监管行动可能限制 Forge 只能在非美国市场分发,或迫使 API 功能下线;治理透明是药企合作的前提 | EvolutionaryScale 负责任开发博客;DURC 政策审查;US DoD/BARDA 承包商关系核查 |
| 股权结构表和优先股堆叠 | Series A 清算优先权倍数、反稀释条款、已发行优先股总量,以及 Series A 后估计摊薄股数 | 低于 $1.35B 名义估值的普通股价值高度取决于优先股堆叠;2x 清算优先权或参与型优先股会在中等退出价格下显著压低普通股价值 | Series B 资料室;VC 法律顾问审查;Delaware 州务卿公司注册证书 |
六项尽调待确认事项都是买入建议的阻断项。Forge ARR 优先级最高;没有它,无法对高于当前 Series A 锚点的估值建立信心。IP 和生物安全是机构药企合作及任何 M&A 交易的前提。
[CV005, CV009, CV034, CV037]
8.6 Appendix
Disclaimer
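On November 6, 2025, the EvolutionaryScale team was absorbed into CZ Biohub under the Chan Zuckerberg Initiative, and the company ceased to operate as an independent for-profit entity. This report is therefore mainly a historical, post-mortem diligence on an independent investment thesis that no longer exists; the only entry point now connected to this thread is the CZI / CZ Biohub network, which is a non-profit network and not open to outside investors. Because there are no SEC filings or audited disclosures, all financial data (valuation, funding, headcount, download counts) come from third-party reporting; the acquisition terms were not disclosed. The recommendation reflects the absence of a forward-looking independent equity instrument, not a judgment on the underlying science.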
Evidence index
| ID | Statement | Confidence | Sources |
|---|---|---|---|
| CO001 | EvolutionaryScale was incorporated in 2023 and became operationally active in approximately March 2024. | 高 | SO001, SO004 |
| CO002 | EvolutionaryScale was co-founded by Alexander (Alex) Rives, Tom Sercu, Zeming Lin, and Sanjay Rao, all formerly of Meta AI Research (FAIR). | 高 | SO001, SO017, SO013 |
| CO003 | EvolutionaryScale was headquartered in San Francisco, California, USA, prior to the November 2025 CZI acquisition. | 中 | SO001, SO004 |
| CO004 | Alex Rives served as CEO of EvolutionaryScale from founding until the November 2025 CZI acquisition, at which point he became Head of Science at CZI. | 高 | SO001, SO017, SO014 |
| CO005 | Tom Sercu served as co-founder and VP of Engineering at EvolutionaryScale, leading infrastructure and large-scale model training. | 高 | SO017, SO010, SO009 |
| CO006 | EvolutionaryScale stated its mission as using generative AI to model the language of proteins and unlock programmable biology for human benefit. | 高 | SO001, SO002 |
| CO007 | EvolutionaryScale's flagship product was ESM3, a generative multimodal protein language model, released June 25, 2024. | 高 | SO001, SO002, SO017 |
| CO008 | ESM3 was publicly released on June 25, 2024, with both an open-weights variant (esm3-sm-open-v1) for academic use and a commercial Forge API offering. | 高 | SO002, SO017, SO005 |
| CO009 | ESM3 is available in multiple model sizes; the largest publicly released variant has 98 billion parameters. | 高 | SO002, SO009 |
| CO010 | ESM3 was trained on 2.78 billion protein sequences totaling 771 billion tokens using approximately 1x10^24 FLOPs on a cluster of NVIDIA H100 GPUs. | 中 | SO002, SO009, SO010 |
| CO011 | EvolutionaryScale released ESM Cambrian (ESM-C) on December 4, 2024. | 高 | SO003, SO026 |
| CO012 | ESM Cambrian is available in three model sizes: 300M, 600M, and 6B parameters, optimized for efficient protein language modeling inference. | 高 | SO003, SO026 |
| CO013 | A peer-reviewed paper on ESM3 titled "Simulating 500 million years of evolution with a language model" was published in Science on January 16, 2025, with DOI 10.1126/science.ads0018. | 高 | SO009, SO010, SO002 |
| CO014 | ESM3 encodes and generates proteins by treating sequences, structures, and functional annotations as a multimodal language, sampling from the space of 500 million years of protein evolution. | 中 | SO002, SO009 |
| CO015 | EvolutionaryScale raised a seed round announced on June 25, 2024, with participation from Lux Capital, Nat Friedman, Daniel Gross, NVIDIA, and Amazon; the seed amount was not publicly disclosed. | 中 | SO017, SO004 |
| CO016 | EvolutionaryScale closed a $142M Series A round on September 26, 2024, co-led by Amazon and NVIDIA. | 高 | SO015, SO017, SO004 |
| CO017 | The Series A round was closed at an implied post-money valuation of approximately $1.35 billion. | 中 | SO004, SO015 |
| CO018 | Additional participants in the Series A included Lux Capital, Nat Friedman, and Daniel Gross, who had also participated in the seed round. | 中 | SO017, SO004 |
| CO019 | As of May 2026, no SEC Form D filings were found under any variant of EvolutionaryScale in EDGAR for the 2024 to 2026 period. | 高 | SO011, SO012 |
| CO020 | EvolutionaryScale had 11 to 50 employees according to its LinkedIn company page, consistent with a seed/Series A-stage AI research startup. | 低 | SO013 |
| CO021 | On November 6, 2025, CZ Biohub announced that the EvolutionaryScale team would join the CZ Biohub Network as part of the Frontier AI for Biology initiative led by the Chan Zuckerberg Initiative. | 中 | SO014, SO018 |
| CO022 | Following the November 2025 CZI acquisition, Alex Rives became Head of Science at the Chan Zuckerberg Initiative (CZI), and other co-founders joined CZ Biohub in senior research roles. | 中 | SO014, SO013 |
| CO023 | CZI and CZ Biohub framed the EvolutionaryScale acquisition as advancing open biological science and making frontier AI biology tools broadly accessible to researchers. | 中 | SO014, SO018 |
| CO024 | The ESM GitHub repository originally at github.com/evolutionaryscale/esm was transferred to the biohub organization following the CZI acquisition, signaling IP transfer. | 中 | SO007, SO006 |
| CO025 | ESM3 open-weights variant (esm3-sm-open-v1) accumulated over 3,100 downloads on HuggingFace; ESM Cambrian models accumulated over 6,300 downloads collectively, as of May 2026. | 中 | SO026, SO005 |
| CO026 | ESM3 was integrated into the NVIDIA BioNeMo platform and made available as an NVIDIA NIM microservice for enterprise deployment on H100 infrastructure. | 中 | SO017, SO019, SO022 |
| CO027 | No Wikipedia article exists for EvolutionaryScale; the URL en.wikipedia.org/wiki/EvolutionaryScale returns a 404 not-found page as of May 2026. | 中 | SO021, SO018 |
| CO028 | EvolutionaryScale operated a commercial API platform at forge.evolutionaryscale.ai providing developer access to ESM3 and ESM-C models; the platform is JavaScript-rendered and its operational status post-acquisition is unknown. | 中 | SO024, SO002, SO003 |
| CO029 | EvolutionaryScale never publicly disclosed commercial revenue, ARR, or customer count as a standalone entity. | 中 | SO001, SO004, SO024 |
| CO030 | All four co-founders (Rives, Sercu, Lin, Rao) were formerly at Meta AI (FAIR), creating a single-employer provenance risk with homogeneous cultural and technical assumptions and no evidence of diverse executive expertise outside AI research. | 高 | SO002, SO017, SO013 |
| CO031 | The ESM3 bioRxiv preprint (doi: 10.1101/2024.07.01.600583) was published in July 2024 ahead of the Science journal paper, with Rives, Sercu, Candido, Lin, and others as authors. | 中 | SO010, SO007 |
| CO032 | EvolutionaryScale's technological moat rested on large-scale protein language model pre-training, proprietary training infrastructure (Andromeda H100 cluster), and a multi-year research lead through the ESM model family lineage from Meta FAIR. | 中 | SO002, SO009, SO008 |
| CO033 | The DeepEP repository demonstrates EvolutionaryScale's infrastructure capability in mixture-of-experts inference and expert-parallel communication, relevant to deploying large protein language models at scale. | 中 | SO008, SO006 |
| CO034 | Following the CZI acquisition, the ESM model family is expected to remain accessible as open-source research tools through the CZ Biohub network, continuing the open-weights distribution strategy. | 中 | SO014, SO007 |
| CO035 | EvolutionaryScale was classified as an early-stage private company at seed through Series A stage, with no commercial product revenue disclosed prior to the CZI acquisition in November 2025. | 高 | SO001, SO004, SO015 |
| CO036 | NVIDIA participated in EvolutionaryScale's seed round, announced alongside the ESM3 launch on June 25, 2024, and later co-led the Series A in September 2024. | 中 | SO017, SO023 |
| CO037 | The Forge API platform (forge.evolutionaryscale.ai) was the commercial interface for EvolutionaryScale protein design models, providing programmatic access for biotechnology and pharmaceutical customers. | 中 | SO024, SO002 |
| CO038 | The Bloomberg article reporting on the $142M Series A is behind a paywall, preventing public verification of full financing terms, investor rights, and any secondary components of the deal. | 中 | SO015, SO027 |
| CM001 | EvolutionaryScale's core addressable market is the protein language model (PLM) API and platform market—cloud-hosted AI models enabling protein engineers to generate, predict, and optimize protein sequences and structures without exhaustive wet-lab directed evolution. | 高 | SM011, SM012 |
| CM002 | ESM3, published in Science on January 16, 2025 (DOI: 10.1126/science.ads0018), is the first generative protein language model to simultaneously reason over sequence, structure, and function in a single unified architecture—trained on 2.78 billion protein sequences with 98 billion parameters using approximately 1×10^24 FLOPs. | 高 | SM013, SM015 |
| CM003 | Status-quo substitutes for protein LM platforms include AlphaFold2/3 (free structure prediction database, 200M+ structures), Rosetta/PyRosetta (open-source protein design), directed evolution in wet lab (weeks per cycle, throughput-limited), and traditional molecular dynamics tools (GROMACS, Schrödinger Maestro), none of which provide generative multi-modal reasoning over sequence, structure, and function jointly. | 中 | SM007, SM017 |
| CM004 | The adjacent AI drug discovery platform market (Grand View Research) is estimated at $2.35B in 2025 growing to $13.77B by 2033 at 24.8% CAGR; EvolutionaryScale's Forge API serves as infrastructure enabling this broader market by providing protein characterization and engineering capabilities. | 中 | SM005 |
| CM005 | Industrial biotechnology—enzyme engineering for green chemistry, agriculture, biomaterials, and food science—is a secondary adjacency for EvolutionaryScale with shorter product development cycles and lower regulatory burden than pharmaceutical applications. | 中 | SM011, SM025 |
| CM006 | The outer boundary drug discovery market (all modalities) is estimated at $71.89B in 2025 growing to $158.74B by 2034 at 9.2% CAGR (Precedence Research); protein engineering API tools constitute a specialized AI sub-segment within this broader market well beyond EvolutionaryScale's direct footprint. | 中 | SM006 |
| CM007 | MarketsandMarkets estimates the protein engineering market at $2.2B (2019) growing to $3.9B by 2024 at 12.4% CAGR; Allied Market Research estimates $2.2B (2022) to $7.7B by 2032 at 13.2% CAGR; Grand View Research estimates $2.60B (2023) to $7.62B by 2030 at 16.24% CAGR—all directionally consistent at 12–16% annual growth over a decade. | 中 | SM001, SM003, SM004 |
| CM008 | Precedence Research takes the broadest scope, estimating the protein engineering market at $5.09B in 2025 growing to $23.59B by 2035 at 16.57% CAGR, incorporating industrial enzymes, biopharmaceuticals, and all research tools rather than just software and services. | 中 | SM002 |
| CM009 | Analyst estimates for the protein engineering market span roughly 10× (from $2.2B for MarketsandMarkets' 2019 base year to $23.59B for Precedence Research's 2035 forecast), a dispersion attributable to differing reference years and to scope inconsistency: narrow estimates focus on software/services while broad estimates incorporate industrial enzymes, biopharmaceuticals, and manufacturing applications. | 中 | SM001, SM002, SM003, SM004 |
| CM010 | The FDA received over 500 AI/ML-enabled drug development submissions between 2016 and 2023, issued draft AI guidance in 2025, and established the CDER AI Council in 2024, signaling active and accelerating federal regulatory engagement with AI-native drug development tools including protein engineering applications. | 高 | SM010, SM016 |
| CM011 | EvolutionaryScale raised $142 million in a Series A round (Crunchbase), establishing investor-validated commercial potential for the protein LM API market; the Forge commercial API monetizes ESM3 access for biopharma and biotech customers beyond the MIT-licensed free tier. | 中 | SM022, SM024 |
| CM012 | The AI drug discovery market (24.8% CAGR, GVR) grows materially faster than the protein engineering tools market (12–17% CAGR), reflecting accelerated pharma AI investment post-AlphaFold; EvolutionaryScale's Forge API benefits from both trajectories as an enabling platform. | 中 | SM005, SM001, SM002 |
| CM013 | No independently published serviceable addressable market figure exists for protein language model APIs specifically within pharmaceutical R&D; all protein engineering market estimates encompass the full market including reagents and instruments, making PLM API SAM derivation assumption-dependent. | 中 | SM001, SM002, SM003, SM004 |
| CM014 | ESM3's commercial Forge API and ESM-C open weights were launched in June 2024 and December 2024 respectively, with ESM-C distributed on AWS SageMaker JumpStart and NVIDIA BioNeMo to reach enterprise pharma customers already embedded in those cloud ecosystems. | 高 | SM011, SM012, SM008, SM009 |
| CM015 | The primary commercial buyer for EvolutionaryScale's Forge API is the large or mid-tier pharmaceutical or biotech company with an active computational biology or protein engineering program, where economic buyer authority rests with a VP of Computational Biology, Director of Drug Discovery, or Chief Scientific Officer. | 中 | SM011, SM012 |
| CM016 | Academic and government research labs constitute a high-volume, zero-revenue user segment: ESM-C open weights under MIT license have been downloaded over 6,320 times from HuggingFace and the ESM package is available via PyPI, providing community mindshare that can feed eventual commercial pipeline. | 中 | SM018, SM025 |
| CM017 | NVIDIA BioNeMo and AWS SageMaker JumpStart serve as enterprise distribution channels for ESM-C, lowering commercial adoption friction for pharma customers with existing cloud infrastructure contracts on those platforms. | 高 | SM008, SM009, SM012 |
| CM018 | The technical champion for ESM3/ESM-C adoption is typically a computational biologist, structural biologist, or machine learning scientist within a pharma or biotech R&D organization who evaluates model capabilities and advocates for integration into existing protein engineering pipelines. | 中 | SM011, SM017 |
| CM019 | Industrial biotechnology companies—engineering enzymes for green chemistry, agriculture, and specialty materials—represent a growing buyer segment with different procurement patterns than pharma: shorter development cycles, lower regulatory burden, and higher tolerance for experimental API tools. | 中 | SM025, SM011 |
| CM020 | The ESM Python package on PyPI enables access to ESM3 open models and commercial Forge API, listing all model sizes (esm3-large-2024-03, ESM-C 300M/600M/6B) and API authentication, supporting both researcher self-service and enterprise paid Forge access from a single installation. | 中 | SM025, SM017 |
| CM021 | Biotech startups at Series A–B stage represent an emerging paid segment for Forge API: they have computational infrastructure but lack resources to train frontier protein LMs independently, making API access economically rational vs. self-hosting 98B-parameter ESM3. | 中 | SM022, SM011 |
| CM022 | DNA sequencing costs declined from approximately $10,000 per genome in 2011 to approximately $100 per genome by 2023 (NHGRI), enabling exponential growth in protein sequence databases and providing the training data foundation that enables frontier-scale protein language models like ESM3 to generalize across the protein universe. | 高 | SM023, SM020 |
| CM023 | Google DeepMind's AlphaFold protein structure database provides free access to over 200 million predicted protein structures; this open resource normalizes computational protein tools in pharma R&D and expands the addressable buyer base for ESM3/ESM-C by reducing scientific credibility risk. | 高 | SM007, SM013 |
| CM024 | NVIDIA BioNeMo delivers 2× faster biofoundation model training and 6× faster model inference versus unoptimized implementations, reducing the total cost of ownership for enterprise protein LM deployment and strengthening EvolutionaryScale's NVIDIA distribution partnership as a commercial channel. | 高 | SM008, SM009 |
| CM025 | ESM3 was trained on 2.78 billion protein sequences with 98 billion parameters using approximately 1×10^24 FLOPs of compute (Science, January 2025)—a scale achievable only because of the exponential growth in protein databases enabled by declining sequencing costs. | 高 | SM013, SM015 |
| CM026 | ESM-C's release under MIT license on HuggingFace with AWS SageMaker and NVIDIA BioNeMo distribution mirrors the open-weight strategy that drove commercial cloud API conversion in NLP (e.g., Hugging Face, Mistral AI), establishing EvolutionaryScale as the community standard for protein LMs. | 中 | SM012, SM018 |
| CM027 | ESM-C (300M, 600M, and 6B parameter variants) is available under MIT license on HuggingFace (6,320+ downloads for the 600M variant, 3,110+ for ESM3 open), enabling any organization with GPU access to self-host the model at zero marginal cost, creating a pricing ceiling and limiting paid Forge conversion for price-sensitive customers. | 中 | SM018, SM017 |
| CM028 | No protein engineered purely by a computational AI model has received regulatory approval without extensive in vitro and in vivo wet-lab validation; the experimental bottleneck remains a necessary post-computational step, structurally limiting the standalone commercial value of a protein LM API. | 高 | SM010, SM013 |
| CM029 | Enterprise pharma technology procurement cycles typically add 12–24 months to commercial deployment timelines relative to academic adoption due to IT security reviews, cloud data governance policies, SOC2/GxP compliance requirements, and multi-year vendor vetting processes. | 中 | SM008, SM009 |
| CM030 | Google DeepMind (AlphaFold3), NVIDIA BioNeMo, and AWS HealthOmics all have distribution, compute, and ecosystem advantages that could threaten EvolutionaryScale's commercial differentiation if frontier protein LM capabilities converge toward commodity—a material long-run competitive risk. | 中 | SM007, SM008, SM019 |
| CM031 | The bioRxiv preprint server indexed over 129 papers citing ESM3 or EvolutionaryScale as of the access date, indicating strong academic community engagement with the ESM protein LM family and validating the open-weight strategy for building ecosystem adoption. | 中 | SM014, SM015 |
| CM032 | No independent SAM figure for protein language model APIs within pharmaceutical R&D has been published; all accessible analyst estimates cover the full protein engineering tools market ($2.2B–$23.59B), making PLM API SAM derivation assumption-dependent and constituting a material diligence gap. | 中 | SM001, SM002, SM003, SM004 |
| CM033 | EvolutionaryScale has not publicly disclosed Forge API pricing, subscriber counts, or revenue figures; HuggingFace download metrics and GitHub stars are developer adoption proxies that do not directly translate to commercial revenue without knowledge of the paid conversion funnel. | 中 | SM018, SM025, SM022 |
| CM034 | The protein engineering market analyst consensus (MarketsandMarkets, Allied, GVR) converges on a base of $2.2B–$2.6B (base years 2019–2023) with 12–16% CAGR, with the Allied and GVR forecasts reaching $7–8B by 2030–2032; Precedence's $5.09B base is an outlier explained by broader scope inclusion of industrial enzymes and biopharmaceutical manufacturing. | 中 | SM001, SM002, SM003, SM004 |
| CM035 | Set against AlphaFold's 40,000+ citations as context, the ESM3 Science paper drew 129+ follow-on bioRxiv preprints within one year of publication, demonstrating the scientific impact of the ESM model family and establishing ecosystem depth that sustains commercial positioning. | 中 | SM013, SM014 |
| CM036 | EvolutionaryScale's distribution strategy—open weights on HuggingFace (MIT license) + enterprise Forge API + AWS SageMaker + NVIDIA BioNeMo—creates a multi-channel commercial model spanning free community tier, cloud marketplace access, and direct enterprise contracts. | 高 | SM011, SM012, SM008, SM009 |
| CP001 | AbSci Corporation (NASDAQ: ABSI) filed a 10-K with the SEC for fiscal year ended December 31, 2025, confirming it is a publicly traded generative AI drug company based in Vancouver, Washington. | 高 | SP026, SP004 |
| CP002 | DeepMind's AlphaFold Protein Structure Database, developed in partnership with EMBL-EBI, provides open access to over 200 million predicted protein structures under a CC-BY-4.0 license, used by over 3 million researchers in 190+ countries. | 高 | SP024, SP006 |
| CP003 | EvolutionaryScale's ESM3 is the first generative model to simultaneously reason over protein sequence, structure, and function in a single multimodal architecture, published in Science on January 16, 2025, trained with over 10^24 FLOPs and 98 billion parameters. | 高 | SP022, SP025 |
| CP004 | Generate Biomedicines has generated, built, and tested over 42,000 proteins through its continuously learning platform, with 140,000+ square feet of lab space across Boynton Yards and Andover locations. | 中 | SP002 |
| CP005 | Cradle.bio's homepage reports that teams using Cradle achieve 2–12x faster protein development timelines, with results compounding across successive rounds of wet-lab and AI iteration. | 中 | SP005 |
| CP006 | The RFdiffusion algorithm for de novo protein structure and function design was published in Nature in July 2023 by Baker Lab researchers, representing the Baker Lab / IPD's leading open-source generative design tool. | 高 | SP021, SP010 |
| CP007 | Meta's ESM2 and ESMFold protein language models are released under an MIT license, confirmed on both GitHub (github.com/facebookresearch/esm) and HuggingFace, permitting commercial use at zero cost. | 高 | SP017, SP018 |
| CP008 | Meta's ESM protein language models were created by Alexander Rives, Zeming Lin, Tom Sercu, and Salvatore Candido at Meta AI FAIR — the exact same four individuals who co-founded EvolutionaryScale in 2023. | 高 | SP017, SP020 |
| CP009 | Isomorphic Labs is an Alphabet subsidiary focused on AI-driven drug discovery, building on the Nobel Prize-winning AlphaFold system, with an interdisciplinary team of drug discovery experts and machine learning specialists. | 中 | SP008 |
| CP010 | Chai Discovery is developing Chai-2, which targets drug-like antibody design against challenging targets with atomic precision, building on its earlier open-released Chai-1 model. | 中 | SP009 |
| CP011 | Recursion Pharmaceuticals (NASDAQ: RXRX) has generated over 50 petabytes of biological and chemical data and operates BioHive-2, a biopharma supercomputer built in partnership with NVIDIA. | 中 | SP011 |
| CP012 | Schrödinger's computational platform is built on over 30 years of R&D and includes FEP+, WaterMap, and LiveDesign as core products used by leading pharmaceutical companies for molecular discovery and optimization. | 中 | SP013, SP014 |
| CP013 | Inceptive specializes in foundation models for RNA, mRNA, siRNA, ASO, and peptide therapeutics, operating from offices in Palo Alto, Berlin, and Zurich, and was founded in 2021. | 中 | SP015 |
| CP014 | Iambic Therapeutics uses its Enchant and NeuralPLexer AI technologies for drug design and has reported Phase 1b safety and tolerability data for IAM1363, a HER2-targeted inhibitor for brain-penetrant cancer treatment. | 中 | SP016 |
| CP015 | Xaira Therapeutics is building predictive and agentic AI models across the complete drug discovery and development process, including target identification, therapeutic design, and patient selection. | 中 | SP028 |
| CP016 | The OpenFold Consortium provides permissively licensed open-source protein folding tools including OpenFold, OpenFold-SoloSeq (no MSA required), and OpenFold-Multimer for protein complex modeling. | 中 | SP019 |
| CP017 | AbSci's AI Drug Creation Platform operates with 6-week wet-lab and AI iterative cycles for de novo biologic design, enabling multi-parametric lead optimization from concept through to clinical trial pipeline. | 中 | SP004 |
| CP018 | AbSci has designed ABS-201, an AI-generated antibody targeting prolactin receptors for androgenetic alopecia, which demonstrated hair follicle regeneration in in vivo studies as a potential best-in-class therapeutic developed in 24 months. | 中 | SP004 |
| CP019 | Profluent Bio describes OpenCRISPR on its website as the world's first AI-designed gene editor, representing the company's flagship public demonstration of protein design AI capability. | 中 | SP001 |
| CP020 | Cradle.bio is SOC 2 compliant and operates on a software subscription model where customer IP is fully retained, customer experimental data is never used to train models for other customers, and no royalties are charged. | 中 | SP005 |
| CP021 | Novonesis (formerly Novozymes), one of the world's largest industrial biotech companies, has publicly stated a partnership with Cradle that embeds AI directly into how it innovates protein products to shorten development time. | 中 | SP005 |
| CP022 | Generate Biomedicines operates over 140,000 square feet of lab space at Boynton Yards and Andover locations, supporting a capital-intensive generate-build-measure-learn platform. | 中 | SP002 |
| CP023 | Generate Biomedicines' lead program GB-0895 is an AI-designed anti-TSLP antibody for asthma, co-optimized for both biological effect and reduced dosing frequency, with potential to shift treatment from monthly to twice-yearly administration. | 中 | SP003 |
| CP024 | Recursion Pharmaceuticals' clinical pipeline includes REC-4881 (Phase 2 MEK1/2 inhibitor for FAP with Orphan Drug and Fast Track designations) and REC-3565 (Phase 1 MALT1 inhibitor for B-cell lymphoma). | 中 | SP012 |
| CP025 | Adaptyv Bio is based at the Biopole Life Science Campus in Epalinges, Lausanne, Switzerland and positions itself as a cloud lab for protein designers. | 低 | SP007 |
| CP026 | EvolutionaryScale offers Forge, its commercial API platform for ESM3 and ESMC access, described as entering public beta in January 2025 alongside the Science publication announcement. | 中 | SP025, SP023 |
| CP027 | Cradle.bio charges customers a software subscription fee, explicitly promises no royalties, and states that customer sequences and data are private, secure, and never used to train models for other customers. | 中 | SP005 |
| CP028 | Meta's ESM2 model family is available on both GitHub and HuggingFace under the MIT license with no usage restrictions—including commercial use—at zero cost, for models ranging from 8M to 650M parameters publicly on HuggingFace. | 高 | SP017, SP018 |
| CP029 | Schrödinger announced Q1 2026 financial results in May 2026, confirming its active status as a publicly traded drug discovery platform company (NASDAQ: SDGR). | 中 | SP013 |
| CP030 | Generate Biomedicines and AbSci both monetize through B2B pharma partnership and licensing models rather than offering public self-service APIs, distinguishing their commercial models from EvolutionaryScale's Forge API approach. | 中 | SP002, SP004 |
| CP031 | ESM2 was released by Meta AI FAIR under an MIT license, and the same researchers (Rives, Lin, Sercu, Candido) who created it at Meta are the co-founders of EvolutionaryScale, creating a structural commoditization baseline against their own commercial offering. | 高 | SP017, SP020, SP025 |
| CP032 | ESMFold, a protein structure prediction model based on ESM2 developed at Meta AI, predicts protein structure end-to-end up to 60x faster than prior state-of-the-art methods and is freely available. | 中 | SP020 |
| CP033 | The Meta ESM2 model family (up to 15B parameters) and ESMFold, both released under MIT license for any use including commercial, were built by EvolutionaryScale's own founders and set a free commoditization floor for basic protein language modeling. | 高 | SP017, SP018, SP020 |
| CP034 | Insilico Medicine (HKEX: 3696) has completed a Phase 2 clinical trial for ISM001-055 (TNIK inhibitor for IPF), making it the first AI drug discovery company to reach Phase 2 completion with a drug designed entirely using AI. | 中 | SP027 |
| CP035 | ESM3's defining differentiation is simultaneous joint reasoning over protein sequence, structure, and function in one multimodal model — a capability absent from ESM2 (sequence only) and standard AlphaFold variants (structure prediction only). | 高 | SP022, SP025 |
| CP036 | Generate Biomedicines has raised substantially more capital than EvolutionaryScale (estimated ~$700M+ vs $142M), enabling a capital-intensive wet-lab validation strategy that EvolutionaryScale's disclosed funding cannot currently replicate. | 中 | SP002 |
| CP037 | AlphaFold 3's commercial rights for drug discovery are exclusively licensed to Isomorphic Labs, while the AlphaFold model code and weights are available for academic non-commercial use, creating a two-tier access structure. | 中 | SP006, SP008 |
| CP038 | Pharma clients can simultaneously use free protein AI tools (AlphaFold DB CC-BY-4.0, ESM2 MIT, OpenFold open-source) and paid platforms (Forge, Cradle, Generate), enabling multi-homing that limits any single vendor's pricing power. | 中 | SP006, SP017, SP005, SP025 |
| CP039 | David Baker (Institute for Protein Design, University of Washington) was co-awarded the Nobel Prize in Chemistry in October 2024 for computational protein design, alongside Demis Hassabis and John Jumper (Google DeepMind) for AlphaFold. | 高 | SP006, SP010 |
| CP040 | The Institute for Protein Design distributes RFdiffusion and RoseTTAFold software royalty-free and has developed a COVID-19 vaccine using protein design technology that received approval in the UK and South Korea under WHO Emergency Use Listing. | 中 | SP010, SP021 |
| CI001 | EvolutionaryScale's commercial product was Forge, a protein language model inference API launched in public beta in January 2025, providing pay-per-token access to ESM3 and ESM Cambrian models. | 高 | SI001, SI003, SI004 |
| CI002 | ESM Cambrian (released December 4, 2024) was made available exclusively as a commercial model through the Forge API, unlike the open-weight ESM2 from Meta AI Research which is freely available on HuggingFace. | 高 | SI003, SI023 |
| CI003 | No revenue figures, ARR, gross margin, customer count, or commercial traction metrics were publicly disclosed by EvolutionaryScale at any point during its operation as a standalone entity. | 高 | SI001, SI014, SI015 |
| CI004 | The Forge API pricing schedule required a user login to view at forge.evolutionaryscale.ai; no public price list was available to unauthenticated users as of the research date. | 中 | SI004 |
| CI005 | EvolutionaryScale's revenue model combined at least three streams: Forge API pay-per-use (per-token), enterprise annual API contracts, and partner distribution through NVIDIA BioNeMo and AWS SageMaker JumpStart. | 中 | SI004, SI016, SI017 |
| CI006 | ESM3 was integrated into NVIDIA's BioNeMo platform as an NVIDIA Inference Microservice (NIM), enabling cloud-hosted protein generation through NVIDIA's commercial distribution channel. | 高 | SI017, SI018, SI019 |
| CI007 | EvolutionaryScale's ESM models were listed on AWS SageMaker JumpStart for cloud-hosted access; Amazon was the lead co-investor in the September 2024 Series A, suggesting a strategic alignment between investment and cloud distribution. | 中 | SI009, SI017 |
| CI008 | EvolutionaryScale offered an academic free tier with capped token allowances as a freemium entry point to Forge API, intended to drive academic usage and downstream conversion to enterprise or paid API tiers. | 中 | SI004, SI001 |
| CI009 | The open-weight ESM2 model (developed by Meta AI Research and released on HuggingFace) serves as a zero-cost alternative for protein sequence embeddings, creating a structural competitive ceiling on EvolutionaryScale's Forge API pricing power for non-generative use cases. | 中 | SI023, SI003 |
| CI010 | The Forge API's post-CZI Biohub operational status and pricing under CZI management are unconfirmed in public sources as of May 2026; the company homepage states it is "joining forces with Biohub" without specifying Forge API continuity. | 高 | SI001, SI013 |
| CI011 | ESM3 was trained with over 10^24 FLOPs—described by EvolutionaryScale as "the most compute ever applied to training a biological model"—on what the company called "one of the highest throughput GPU clusters in the world today." | 高 | SI002, SI018 |
| CI012 | At over 10^24 FLOPs and H100 GPU pricing of approximately $2–5 per GPU-hour, the ESM3 training run is estimated to have cost $10–50 million, making it the dominant one-time capital expenditure in EvolutionaryScale's history. | 低 | SI002, SI018 |
| CI013 | EvolutionaryScale's LinkedIn company profile shows the 11-50 employee size bracket, implying approximately 25-50 full-time employees at its peak operational scale. | 中 | SI025 |
| CI014 | At an estimated 25-50 FTE with a blended all-in cost of $200,000–$300,000 per employee annually (standard for San Francisco AI research teams), EvolutionaryScale's annual personnel burn is estimated at $5–15 million per year. | 低 | SI025 |
| CI015 | Amazon's role as lead Series A investor may have included in-kind AWS cloud compute credits as part of the deal structure, which would reduce EvolutionaryScale's cash infrastructure spend and extend effective runway beyond naive burn-rate estimates. | 低 | SI009, SI017 |
| CI016 | Gross margin for the Forge API inference business depends on whether EvolutionaryScale owned GPU cluster infrastructure (capital-intensive, higher long-run margin) or rented cloud compute (lower capex, COGS-heavy). No gross margin figures were disclosed. | 低 | |
| CI017 | ESM3 was developed with 98 billion parameters, placing it firmly in the frontier model scale class for biological language models; inference costs per query at this parameter scale are substantially higher than smaller protein models. | 高 | SI002, SI018 |
| CI018 | EvolutionaryScale raised $142 million in a Series A round announced on September 26, 2024, led by Amazon and NVIDIA, with co-investment from Lux Capital, Nat Friedman, and Daniel Gross. | 高 | SI001, SI009, SI010, SI017 |
| CI019 | The Series A was reported at a post-money valuation of approximately $1.35 billion, placing EvolutionaryScale among the most highly valued pre-revenue protein AI companies at the time of the round. | 高 | SI009, SI011, SI020 |
| CI020 | NVIDIA joined EvolutionaryScale's seed round (announced June 25, 2024), making NVIDIA both a seed and Series A investor—an unusual dual-round commitment that underscores the strategic importance of ESM3 to NVIDIA's BioNeMo platform. | 高 | SI016, SI017 |
| CI021 | EvolutionaryScale's seed round (dated to late 2023 in these sources, although NVIDIA's participation was only announced on June 25, 2024) was led by Nat Friedman and Daniel Gross, with participation from Lux Capital; the dollar amount raised in the seed was not publicly confirmed in any accessible source. | 中 | SI014, SI015 |
| CI022 | Total capital raised by EvolutionaryScale prior to the CZI Biohub transaction is estimated at approximately $145 million ($142M Series A plus an assumed ~$3M seed; the seed amount is not publicly confirmed). | 中 | SI014, SI009, SI016 |
| CI023 | On November 6, 2025, the EvolutionaryScale team joined CZI Biohub to advance the Frontier AI for Biology Initiative, as announced by Biohub.org and reported by CNBC. | 高 | SI013, SI012, SI001 |
| CI024 | Alex Rives, EvolutionaryScale co-founder and chief scientist, became Head of Science at CZI Biohub following the November 2025 transaction. | 高 | SI013, SI012 |
| CI025 | The CZI Biohub transaction was announced on November 6, 2025, roughly 13 months (under 14) after the $142M Series A was announced on September 26, 2024, leaving only a narrow window for commercial revenue ramp before the standalone entity effectively ended. | 高 | SI009, SI013 |
| CI026 | No Form D filings for EvolutionaryScale appear in SEC EDGAR under any of the following search terms: "EvolutionaryScale," "Evolutionary Scale," "Evolutionary Scale Inc," or by key person "Alexander Rives" — across four separate EDGAR full-text and company browse searches. | 高 | SI005, SI006, SI007, SI008 |
| CI027 | Private companies raising capital under an SEC Regulation D exemption are required to file Form D within 15 days of the first sale of securities. The absence of a Form D in EDGAR for EvolutionaryScale's $142M Series A therefore points to one of three explanations: a compliance gap, a filing under an undiscovered legal entity name, or reliance on the statutory Section 4(a)(2) private-offering exemption, which carries no Form D requirement. | 高 | SI005, SI006 |
| CI028 | The financial terms of the November 2025 CZI Biohub transaction were not disclosed in any public source reviewed, including Biohub.org, CNBC, EvolutionaryScale's homepage, or SEC EDGAR. | 高 | SI012, SI013, SI001 |
| CI029 | Xaira Therapeutics raised $1 billion at its founding in 2024 for full-stack AI drug discovery, representing the largest single-round raise in protein AI; EvolutionaryScale's $142M Series A, at a $1.35B valuation, is roughly one-seventh of Xaira's initial capital base. | 中 | SI014, SI015 |
| CI030 | Profluent Bio raised approximately $44 million across its financing rounds for protein design AI with a narrower commercial scope than EvolutionaryScale, demonstrating that the protein AI market can support smaller, more focused capital deployments alongside frontier-scale foundation models. | 中 | SI014 |
| CI031 | Generate:Biomedicines raised over $700 million across Series A through C for full-stack AI protein therapeutics, targeting drug revenue rather than API monetization—a fundamentally different business model and capital structure from EvolutionaryScale's foundation-model API approach. | 中 | SI014, SI015 |
| CI032 | EvolutionaryScale's ~$3–6M capital raised per employee (based on ~$145M total and ~25–50 FTE) is substantially higher than the per-employee capital of Profluent and Cradle, reflecting the compute intensity of frontier biological foundation model training rather than scaled commercial deployment. | 低 | SI014, SI025 |
| CI033 | Crunchbase incorrectly labels EvolutionaryScale's $142M Series A as a "seed investment round" in its AI-generated summary, illustrating the unreliability of AI-generated private-market data summaries; the actual round type is confirmed as Series A by CNBC, Axios, NVIDIA, and MIT Technology Review. | 高 | SI014, SI009 |
| CI034 | The investor return profile for EvolutionaryScale's $142M Series A participants (Amazon, NVIDIA, Lux Capital, Nat Friedman, Daniel Gross) is not determinable from public sources following the CZI Biohub transaction, as no deal terms or investor distribution amounts were disclosed. | 低 | SI012, SI013 |
| CI035 | EvolutionaryScale as an independent commercial entity effectively ceased to operate following the November 2025 CZI Biohub transaction; the company homepage confirms the entity is "joining forces with Biohub" without a separate commercial continuation announcement. | 高 | SI001, SI013, SI012 |
| CI036 | All five planned financial information gaps—actual revenue, confirmed burn rate, CZI transaction terms, Form D filings, and enterprise customer count—remain unresolved in public sources as of May 2026 and require direct access to CZI Biohub documentation or historical EvolutionaryScale internal records. | 高 | SI005, SI001, SI014 |
| CI037 | The CZI Biohub is a non-profit initiative of the Chan Zuckerberg Initiative whose Frontier AI for Biology Initiative absorbs EvolutionaryScale's team and models under a philanthropic, non-commercial mandate—a fundamental change from the VC-backed commercial API business model. | 高 | SI013, SI012 |
| CI038 | The $142M Series A at $1.35B valuation for a pre-revenue, sub-50-employee foundation model company represents a significant premium ascribed entirely to the scientific moat and strategic optionality of ESM3/ESM Cambrian rather than demonstrated commercial revenue or customer traction. | 中 | SI009, SI019, SI002 |
| CE001 | EvolutionaryScale's product portfolio consists of two model families: ESM3 (multimodal generative protein LM in 1.4B/7B/98B sizes) and ESM-C / Cambrian (embedding-focused protein LM in 300M/600M/6B sizes). | 高 | SE001, SE002, SE003 |
| CE002 | ESM3-small-2024-08 has 1.4 billion parameters; ESM3-medium-2024-08 has 7 billion; and ESM3-large-2024-03 has 98 billion parameters. | 高 | SE001, SE009, SE017 |
| CE003 | ESMC-300M uses 30 transformer layers with hidden width 960; ESMC-600M uses 36 layers with width 1152; ESMC-6B uses 80 layers with width 2560. | 中 | SE002 |
| CE004 | Open weights for ESM3-small-2024-08 and ESMC-300M/ESMC-600M are available on HuggingFace under the Cambrian Non-Commercial License Agreement, which prohibits commercial use. | 高 | SE001, SE014, SE009 |
| CE005 | EvolutionaryScale's open weights for ESM3-small were first released in June 2024 concurrent with the ESM3 launch; ESMC-300M and ESMC-600M open weights were released in December 2024. | 高 | SE001, SE002 |
| CE006 | ESM3's flagship proof of concept is esmGFP, a novel functional fluorescent protein designed with only 58% sequence identity to the nearest known natural GFP — approximately equivalent to 500 million years of evolutionary distance. | 高 | SE001, SE005, SE006 |
| CE007 | The Forge API (forge.evolutionaryscale.ai) provides programmatic access to ESM3 and ESMC models through a Python SDK (pip install esm) with synchronous and asynchronous inference and a batch executor; the API was opened to public beta in January 2025. | 高 | SE001, SE004, SE009 |
| CE008 | Amazon Web Services and NVIDIA are EvolutionaryScale's primary commercial deployment partners: ESMC-6B is deployed on AWS SageMaker JumpStart, and NVIDIA is integrating ESM-C into BioNeMo NIM. | 高 | SE017, SE018, SE019 |
| CE009 | ESM3 uses a multitrack transformer architecture with three separate discrete token tracks: amino acid sequence tokens, VQVAE-encoded structure tokens (representing 3D backbone coordinates), and function keyword tokens (GO annotations). | 高 | SE001, SE005, SE006 |
| CE010 | ESM3-large (98B parameters) was trained on 2.78 billion proteins, 771 billion unique tokens, using 1.07×10²⁴ floating-point operations on the Andromeda cluster. | 高 | SE001, SE005, SE017 |
| CE011 | ESM3 employs a vector quantized variational autoencoder (VQVAE) to encode 3D protein backbone coordinates as discrete structural tokens, enabling the transformer to natively generate structure as tokens rather than as continuous coordinates. | 高 | SE001, SE006 |
| CE012 | ESM3 is pre-trained using a masked language modeling (MLM) objective applied jointly across all three tracks (sequence, structure, function), enabling the model to infer any track from the others. | 高 | SE001, SE005 |
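| CE013 | Preference-based alignment fine-tuning, analogous to reinforcement learning from human feedback (RLHF) in language models but driven by computed structure-quality scores rather than human raters, was applied to ESM3-large to improve how well its generations satisfy protein design prompts. | 中 | SE001 |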
| CE014 | ESM-C uses a Pre-Layer Normalization transformer architecture with rotary positional embeddings (RoPE), SwiGLU feed-forward activations, and masked language modeling pre-training. | 中 | SE002 |
| CE015 | ESMC training compute: ESMC-300M was trained on 1.26×10²² FLOPs; ESMC-600M on 2.17×10²² FLOPs; ESMC-6B on 2.37×10²³ FLOPs. | 中 | SE002 |
| CE016 | ESM-C was trained on three protein sequence databases: UniRef (83 million sequence clusters), MGnify (372 million), and JGI metagenomics (2 billion clusters), all clustered at 70% sequence identity. | 中 | SE002 |
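| CE017 | The DeepEP repository under EvolutionaryScale's GitHub organization is an open-source CUDA/NCCL implementation of Mixture-of-Experts expert-parallel communication optimized for H800 GPUs, with 1,253 GitHub stars as of the research date. DeepEP was originally published by DeepSeek, so the repository is most likely a fork rather than original EvolutionaryScale work, and the star count should not be read as traction for an EvolutionaryScale product. | 中 | SE011, SE010 |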
| CE018 | NVIDIA reports that ESM3-large uses approximately 25× more FLOPs and 60× more training data than its predecessor, ESM2 (Meta AI), and was trained on NVIDIA H100 GPUs via the Andromeda HPC cluster. | 中 | SE017 |
| CE019 | The GitHub ESM repository (github.com/evolutionaryscale/esm) provides the official Python client library for the Forge API and access to open-weight models, with installation via pip. | 中 | SE009, SE012 |
| CE020 | The HuggingFace model card for esm3-sm-open-v1 showed 3,105 downloads in the prior 30 days and 291 likes as of the research access date. | 中 | SE013, SE014 |
| CE021 | The ESMC-300M model card showed 6,320 downloads on HuggingFace; the ESMC-600M model card showed 1,490 downloads, as of the research access date. | 中 | SE013, SE015 |
| CE022 | ESMC-6B is available via the Forge API for academic users and via AWS SageMaker JumpStart for commercial deployments; the SageMaker deployment uses a CloudFormation stack documented in the esm-sagemaker GitHub repository, with setup time of 15-25 minutes. | 中 | SE002, SE009, SE019 |
| CE023 | EvolutionaryScale's GitHub organization (github.com/evolutionaryscale) hosts nine public repositories including esm (flagship), DeepEP (1,253 stars), an NCCL fork, a Hugging Face transformers fork, a Mamba implementation, and esm-sagemaker. | 中 | SE010, SE011, SE009 |
| CE024 | No SEC Form D filing for EvolutionaryScale has been located as of May 2026. Because Form D is a notice of an exempt (unregistered) offering, its absence is a disclosure gap and does not by itself confirm anything about the company's private status or registration history. | 中 | SE021 |
| CE025 | NVIDIA announced a partnership with EvolutionaryScale to integrate ESM3 into the BioNeMo NIM platform for GPU-optimized inference, and participated in EvolutionaryScale's seed investment. | 高 | SE016, SE017, SE018 |
| CE026 | Hacker News search results show ten or more community discussion threads covering EvolutionaryScale and ESM3, including "Show HN: ESM C" and multiple threads on the Science publication and initial ESM3 launch, indicating meaningful developer community engagement. | 中 | SE022 |
| CE027 | The ESM3 Science paper (Hayes et al., January 2025, Vol 387, Issue 6736, pp. 850-858, DOI 10.1126/science.ads0018) has accumulated 341 citations and 68,494 downloads, with 318 of the citations arriving within the first 12 months of publication. | 高 | SE005, SE026 |
| CE028 | The ESM3 preprint on bioRxiv (submitted July 2024, DOI 10.1101/2024.07.01.600583) was cited by 129+ downstream papers within its first year of availability, signaling rapid academic adoption. | 中 | SE006, SE008 |
| CE029 | esmGFP carries 96 mutations out of 229 total amino acid positions, achieving 58% sequence identity to the nearest known natural GFP and representing a protein in a region of sequence space separated from known fluorescent proteins by approximately 500 million years of evolutionary divergence. | 高 | SE001, SE005, SE006 |
| CE030 | ESM3 designed esmGFP by jointly optimizing across sequence, structure, and function tracks, using the multitrack generative capability to explore protein sequence space beyond the reach of natural evolution or previous directed-evolution methods. | 高 | SE001, SE005 |
| CE031 | The 58% sequence identity between esmGFP and the nearest natural GFP is comparable to the evolutionary separation between corals and jellyfish, which sit in different classes of the same phylum (Cnidaria). | 中 | SE001, SE006 |
| CE032 | EvolutionaryScale filed patents covering esmGFP and related protein design methods, as stated in the ESM3 bioRxiv preprint. | 中 | SE006 |
| CE033 | ESM3 competes in the protein AI landscape against AlphaFold3 (structure prediction, DeepMind, May 2024), Chai-1 (protein complex structure, Chai Discovery), and ESM2 (sequence LM, Meta AI); AlphaFold3 and Chai-1 focus on structure prediction and ESM2 on sequence representation, so none is primarily a generative protein design model. | 中 | SE027, SE028, SE017 |
| CE034 | EvolutionaryScale raised a $142 million Series A in September 2024 led by Lux Capital, with participation from Amazon and NVIDIA, following an earlier seed investment from NVIDIA. | 中 | SE024, SE025, SE020 |
| CE035 | In November 2025, EvolutionaryScale's team joined CZI Biohub as part of its Frontier AI & Biology initiative; co-founder and chief scientist Alex Rives was appointed head of science at Biohub. | 中 | SE023 |
| CE036 | Open-weight ESM3 and ESMC models are distributed under the Cambrian Non-Commercial License Agreement, which restricts use to non-commercial research; commercial customers must access models through the Forge API or AWS SageMaker. | 高 | SE001, SE014 |
| CE037 | EvolutionaryScale employs a dual-access commercial model: open-weight non-commercial access for research and community adoption, and Forge API / SageMaker commercial access with undisclosed pricing. | 中 | SE001, SE002, SE004, SE019 |
| CE038 | EvolutionaryScale is described as a public benefit company (PBC) in CZI Biohub's November 2025 announcement, consistent with its stated mission of advancing biology through responsible AI. | 中 | SE023 |
| CE039 | An independent bioRxiv preprint (December 2024) found that ESM3's binding prediction accuracy deteriorates when distinct per-variant relaxed protein structures are used as inputs, compared to a single consistent structural backbone, a finding the authors describe as the 'More Structure, Less Accuracy' paradox. | 中 | SE007 |
| CE040 | EvolutionaryScale has not publicly disclosed commercial pricing for the Forge API, customer names or contract counts, or revenue metrics as of May 2026. | 高 | SE003, SE004, SE021 |
| CU001 | The biohub/esm3-sm-open-v1 model on HuggingFace had approximately 3,110 downloads and 291 likes as of May 2026, reflecting academic uptake of the ESM3 open-weight model. | 中 | SU006 |
| CU002 | The biohub/esmc-300m-2024-12 model on HuggingFace had approximately 6,320 downloads and 30 likes as of May 2026. | 中 | SU006 |
| CU003 | The biohub/esmc-600m-2024-12 model on HuggingFace had approximately 1,490 downloads and 32 likes as of May 2026. | 中 | SU006 |
| CU004 | Total ESM-C family HuggingFace downloads across 300M and 600M open models sum to approximately 7,810 as of May 2026. | 中 | SU006 |
| CU005 | A Semantic Scholar API search for ESM3 and EvolutionaryScale returned 32 papers building on the ESM3 framework as of May 2026. | 中 | SU009 |
| CU006 | A bioRxiv search for 'evolutionaryscale ESM3' returned 129 preprint results as of May 2026, indicating broad academic interest in ESM3. | 中 | SU010 |
| CU007 | ESM-C models are available for commercial deployment on Amazon SageMaker under the Cambrian Inference Clickthrough License Agreement, enabling broad commercial use by enterprise customers. | 高 | SU003, SU004 |
| CU008 | ESM-C 6B is available for academic use via the Forge API and for commercial use via Amazon SageMaker, as stated in the ESM Cambrian launch blog post. | 中 | SU003 |
| CU009 | AWS SageMaker deployment of ESM-C requires admin-level AWS account access, subscription via the Marketplace, and uses CloudFormation to deploy a dedicated GPU endpoint in 15–25 minutes billed to the customer's AWS account. | 中 | SU004 |
| CU010 | NVIDIA BioNeMo was listed as an upcoming integration channel for ESM-C models as of December 2024; the live status of the integration could not be confirmed as of May 2026. | 中 | SU003, SU007 |
| CU011 | Adaptyv Bio, a protein engineering company based at Biopole Life Science Campus in Lausanne, Switzerland, has been confirmed as a named ESM ecosystem partner. | 中 | SU008 |
| CU012 | EvolutionaryScale opened the Forge API public beta in January 2025, offering scientists in academia and industry a free limited-time preview of ESM3 and ESM-C models. | 高 | SU001, SU002 |
| CU013 | The EvolutionaryScale GitHub organization includes an 'esm-partner' repository explicitly labeled 'Repository for partner collaborations,' indicating a formal partner pipeline. | 中 | SU005 |
| CU014 | EvolutionaryScale raised $142M Series A in September 2024 from Amazon (AWS) and NVIDIA as strategic investors, with Lux Capital, Nat Friedman, and Daniel Gross also participating. | 高 | SU011, SU012 |
| CU015 | NVIDIA participated in EvolutionaryScale's seed investment round, as confirmed by a dedicated NVIDIA Newsroom press release, establishing the NVIDIA–EvolutionaryScale relationship before the Series A. | 中 | SU016 |
| CU016 | Lux Capital co-led or participated in EvolutionaryScale's Series A round, as confirmed by a Lux Capital blog post announcing the investment. | 中 | SU017 |
| CU017 | No named pharmaceutical company (such as Pfizer, Eli Lilly, Novartis, or Roche) has been publicly disclosed as a paying enterprise customer of EvolutionaryScale as of May 2026. | 中 | SU014, SU018 |
| CU018 | Generate:Biomedicines announced a multi-billion-dollar collaboration with Amgen, representing a commercial benchmark that EvolutionaryScale has not publicly matched as of May 2026. | 中 | SU019 |
| CU019 | Isomorphic Labs signed collaboration agreements with Eli Lilly and Novartis totaling over $3 billion in potential milestone value, creating a commercial proof standard EvolutionaryScale has not yet demonstrated. | 中 | SU021 |
| CU020 | ESM3 was published in Science on January 16, 2025 (DOI: 10.1126/science.ads0018), providing peer-reviewed academic validation that anchors downstream commercial trust. | 高 | SU015, SU002 |
| CU021 | MegSite (nucleic acid binding residue prediction), ProteinReasoner (multi-modal protein LM with chain-of-thought), and iNClassSec-ESM (non-classical secreted protein discovery) are among the named downstream academic applications built on the ESM3 framework. | 中 | SU009 |
| CU022 | EvolutionaryScale has disclosed no net revenue retention (NRR), gross revenue retention (GRR), customer count, or customer satisfaction metrics as of May 2026; throughout its independent period the company remained in a pre-commercial-scale API beta phase. | 中 | SU014, SU018 |
| CU023 | The ESM3 open-weight model (1.4B parameters) is licensed for non-commercial use only; commercial access to all model scales (including 7B and 98B ESM3) requires Forge API tokens or AWS SageMaker subscriptions. | 高 | SU004, SU001 |
| CU024 | Biosecurity organizations including NTI and the Center for AI Safety have documented ongoing biosecurity concerns about dual-use capabilities of protein design AI, which may constrain the addressable commercial market for open-weight frontier protein language models. | 中 | SU028, SU029 |
| CU025 | BioNTech and InstaDeep fine-tuned an ESM language model (predecessor generation) on COVID spike proteins to create a variant early-warning system that flagged all 16 WHO variants of concern before official designation, demonstrating prior named corporate production use of the ESM family. | 中 | SU002 |
| CU026 | The global drug discovery market is valued at approximately $71.89 billion in 2025, growing at a CAGR of 9.20% through 2034, providing a large total addressable market for AI infrastructure vendors like EvolutionaryScale. | 中 | SU020 |
| CU027 | EvolutionaryScale's company website does not display customer logos, named case studies, testimonials, or enterprise customer success content as of May 2026. | 中 | SU001 |
| CU028 | ESM-C models on HuggingFace were updated as recently as two days before the research cache date (approximately May 2026), indicating active model maintenance and development velocity. | 中 | SU006 |
| CU029 | The GitHub repository evolutionaryscale/esm is the primary open-source distribution channel for ESM model weights, code, and API client libraries including the Forge and SageMaker SDKs. | 中 | SU004 |
| CU030 | The ESM3 open-weight model (1.4B parameters) was released on June 25, 2024 under a non-commercial license as stated in the ESM3 launch blog post. | 高 | SU002, SU004 |
| CU031 | The ESM Cambrian (ESM-C) model family was released on December 4, 2024 at three scales (300M, 600M, 6B), with open weights for the two smaller models and gated commercial access for the 6B model. | 中 | SU003 |
| CU032 | NVIDIA BioNeMo platform explicitly targets drug discovery, molecular design, virtual screening, and protein binder design use cases, which directly overlap with ESM3/ESM-C's primary commercial applications. | 中 | SU007 |
| CU033 | AWS SageMaker listing of ESM-C models creates a cloud-based commercial deployment channel for enterprise users who can subscribe and deploy without requiring direct Forge API account creation. | 中 | SU004, SU025 |
| CU034 | EvolutionaryScale's open-weight model strategy creates a top-of-funnel adoption mechanism where academic users build familiarity that can convert to commercial API usage, following a pattern analogous to open-source AI infrastructure companies. | 中 | SU001, SU004 |
| CU035 | ESM2, the predecessor to ESM3, with up to 15B parameters and freely available open weights, represents a no-cost substitution option for protein representation tasks that reduces commercial willingness-to-pay for ESM-C paid access. | 中 | SU004 |
| CU036 | Amazon's strategic investment in EvolutionaryScale's Series A creates a structural incentive for preferential channel placement in AWS SageMaker JumpStart and AWS HealthOmics, giving EvolutionaryScale access to AWS's enterprise life sciences customer base. | 中 | SU011, SU004 |
| CU037 | Semantic Scholar papers building on ESM3 were published as recently as July–August 2025 (ProteinReasoner: July 25, 2025; MegSite: August 31, 2025), indicating ongoing downstream academic use at least 13 months after ESM3's release. | 中 | SU009 |
| CU038 | EvolutionaryScale is incorporated as a public benefit company (PBC), a legal structure that creates mission constraints that could limit commercialization of some high-profit but ethically questionable protein design applications. | 中 | SU001 |
| CR001 | ESM3 generated esmGFP, a functional fluorescent protein with only 58% sequence identity to the nearest known natural fluorescent protein, a distance the authors equate to roughly 500 million years of natural evolution, demonstrating the model's capability to design genuinely novel proteins far from existing sequence space. | 高 | SR014, SR028 |
| CR002 | EvolutionaryScale's canonical Responsible Development Framework URL (evolutionaryscale.ai/blog/responsible-development) returned a 404 error on 2026-05-18, so the framework's public documentation could not be reviewed at the access date. | 中 | SR014 |
| CR003 | US Executive Order 14110 (October 30, 2023) explicitly directed that developers of dual-use foundation models address security risks 'with respect to biotechnology, cybersecurity, critical infrastructure, and other national security dangers.' The order was rescinded on January 20, 2025, so it no longer creates a live federal obligation. | 中 | SR001 |
| CR004 | A 2023 MIT study (arXiv 2306.03809) showed that general-purpose LLM chatbots could, in one hour, suggest four pandemic pathogen candidates, synthesis routes, DNA suppliers with lax screening, and troubleshooting protocols to non-scientists, indicating LLMs broadly lower biosecurity barriers. | 中 | SR004 |
| CR005 | No independent third-party biosecurity audit of EvolutionaryScale's Forge API guardrails has been publicly disclosed as of May 2026, making it impossible to verify the effectiveness of the company's self-regulatory biosecurity measures. | 中 | SR014, SR015 |
| CR006 | The Biological Weapons Convention (BWC), effective since 1975 with 189 states party as of May 2025, contains no AI-specific language and has no formal verification mechanism, leaving AI-designed protein risks unaddressed by existing international law. | 中 | SR006 |
| CR007 | The Center for AI Safety's 2023 statement (co-signed by Hinton and Bengio) identifies mitigating the risk of extinction from AI as a global priority on par with pandemics and nuclear war, with biological weapons specifically cited as a concern. | 中 | SR005 |
| CR008 | EvolutionaryScale's ESM Cambrian (Dec 2024) launch blog states that 'ESM C was reviewed by a committee of scientific experts who concluded that the benefits of releasing the models greatly outweigh any potential risks,' but the committee composition and evaluation criteria are not publicly disclosed. | 中 | SR015 |
| CR009 | The Asilomar Conference on Recombinant DNA (1975) established the precedent of voluntary self-regulatory frameworks for biotechnology, which EvolutionaryScale explicitly cites as inspiration for its Responsible Development Framework, but Asilomar's effectiveness in the longer term depended on subsequent binding FDA regulations. | 中 | SR014, SR027 |
| CR010 | The NTI biosecurity program identifies AI-biotech convergence as introducing 'risks of accidental misuse and deliberate exploitation, which could result in a biological catastrophe with grave consequences,' framing regulatory action as urgent. | 中 | SR007 |
| CR011 | Chai-1 (Apache 2.0, free for commercial use, released September 2024) achieves 0.849 Cα LDDT on CASP15 monomer prediction, outperforming ESM3-98B's 0.801, with 77% PoseBusters success vs AlphaFold3's 76%, making it the only freely available commercial-use model at or above ESM3 structure-prediction accuracy. | 中 | SR023 |
| CR012 | The AlphaFold Protein Structure Database (Google DeepMind/EMBL-EBI) provides over 200 million protein structure predictions freely under CC BY 4.0, updated as of March 2026 to include protein complex structures, directly covering a major Forge API use case at no cost. | 中 | SR011 |
| CR013 | ESM3-98B's training consumed 1×10²⁴ FLOPs on what the company described at launch as 'one of the highest throughput GPU clusters in the world,' a compute bill that recurs as an operational risk whenever inference is scaled or a future model is trained. | 高 | SR014, SR028 |
| CR014 | Meta's facebookresearch/esm repository states it 'contains code and pre-trained weights for Transformer protein language models from the Meta Fundamental AI Research Protein Team (FAIR),' and ESM3's bioRxiv preprint confirms ESM3 was developed by founders who were FAIR employees, raising IP provenance questions about the ESM2-to-ESM3 transition. | 中 | SR010, SR028 |
| CR015 | RFdiffusion (Baker Lab, Nature 2023) is freely available for protein backbone generation, binder design, symmetric oligomer design, and active-site scaffolding — core use cases also addressed by ESM3 — and is distributed with permissive licensing from the University of Washington's Institute for Protein Design. | 中 | SR009 |
| CR016 | OpenFold (Apache 2.0, AQ Laboratory) provides a trainable reproduction of AlphaFold2 that organisations can fine-tune on proprietary data, enabling pharmaceutical companies to build internal capabilities that reduce dependence on Forge API subscription. | 中 | SR012 |
| CR017 | ESM3 uses a discrete token representation of protein structure that tokenises 3D protein backbone into a sequence alphabet, an architectural innovation first published in the ESM3 Science paper (January 2025) and bioRxiv preprint (July 2024), with patents filed. | 高 | SR028, SR016 |
| CR018 | AlphaFold 2's prediction accuracy at CASP14 was 'insufficient for a third of its predictions' per Wikipedia's AlphaFold article, indicating that even state-of-the-art protein AI models have non-trivial failure rates — a parallel risk for ESM3-generated sequences in drug-discovery applications. | 中 | SR022 |
| CR019 | Amazon (AWS) and NVIDIA are simultaneously Series A investors in EvolutionaryScale and operators of competing bio-AI model distribution platforms (SageMaker JumpStart and BioNeMo respectively), creating a structural investor-competitor conflict. | 高 | SR017, SR013, SR026 |
| CR020 | NVIDIA BioNeMo is described as 'the development platform for AI-driven biology and drug discovery,' a direct competitor to EvolutionaryScale's Forge API, while NVIDIA is simultaneously an investor and hardware supplier to EvolutionaryScale. | 中 | SR013 |
| CR021 | ESM Cambrian (300M and 600M parameter models) are released as open-weight models for academic and commercial use, with ESM-C 6B available on Forge for academic use and on AWS SageMaker for commercial use, meaning EvolutionaryScale deliberately makes its representation models freely available to drive adoption of Forge API. | 高 | SR015, SR016 |
| CR022 | Meta's ESM2 model is available under the MIT license via the facebookresearch/esm repository, making it freely usable for commercial applications without royalty obligations — this creates a baseline that limits the premium a customer would pay for ESM3 API access for embedding-only use cases. | 中 | SR010 |
| CR023 | Chai-1's technical report demonstrates that multimer structure prediction without MSA at AlphaFold-Multimer quality level (69.8% DockQ acceptable vs 67.7%) is achievable under Apache 2.0 without API fees, representing a direct commercial threat to Forge API's structure-prediction use cases. | 中 | SR023 |
| CR024 | Meta's ESM Metagenomic Atlas blog (November 2022) confirms that ESMFold (based on ESM2) provides structure predictions up to 60x faster than the prior state-of-the-art, illustrating that Meta's FAIR team (EvolutionaryScale's founding employer) retains independent protein AI capabilities that could re-enter the competitive landscape. | 中 | SR024 |
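| CR025 | The EU AI Act (Regulation (EU) 2024/1689), published 12 July 2024 and entering full enforcement August 2026, lays down harmonised rules for AI systems, potentially subjecting general-purpose AI models with large training compute (>10²⁵ FLOPs) to systemic risk obligations including third-party audits. ESM3's reported ~10²⁴ training FLOPs sit about one order of magnitude below that threshold, so the systemic-risk tier would be more relevant to a larger successor model than to ESM3 itself. | 高 | SR020, SR021 |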
| CR026 | The EU AI Act's full enforcement provisions take effect August 2026, meaning the operator of the ESM models (EvolutionaryScale, or CZI Biohub after the November 2025 transaction) will need to assess EU compliance, including conformity assessments, transparency obligations, and potentially human oversight for high-risk applications, within the current planning horizon. | 高 | SR020, SR021 |
| CR027 | The NIST AI Risk Management Framework Generative AI Profile (NIST AI 600-1, published July 2024) provides voluntary guidance for organisations developing generative AI, increasingly referenced in government procurement requirements, creating de-facto compliance pressure for Forge API customers in the public sector. | 中 | SR002 |
| CR028 | FDA's AI/ML-enabled medical devices framework (SaMD) and 2024 action plan govern AI used in clinical diagnosis and treatment decision support but do not yet specifically regulate AI protein design tools used in pre-clinical drug discovery — a regulatory gap that may be filled if ESM3-based designs progress toward IND submissions. | 中 | SR003 |
| CR029 | No public BIS (Bureau of Industry and Security) final rule specifically governing export of protein language model weights or API access has been identified as of May 2026, but CSET has highlighted advancing US biotechnology governance as urgent for AI biosecurity, suggesting rulemaking activity is directionally likely. | 中 | SR008, SR001 |
| CR030 | The Biological Weapons Convention contains no AI-specific language or verification regime, so the primary international legal framework against bioweapons currently creates no binding compliance obligations for EvolutionaryScale's deployment of protein language models. | 中 | SR006 |
| CR031 | Industry self-regulatory AI safety commitments (Anthropic's Responsible Scaling Policy, OpenAI safety commitments) set voluntary precedents for biosafety evaluation at capability thresholds, but EvolutionaryScale's Responsible Development Framework does not publicly specify comparable quantitative triggers or third-party verification requirements. | 中 | SR029, SR030, SR015 |
| CR032 | EvolutionaryScale has raised approximately $145M total (seed plus $142M Series A, September 2024) at a $1.35B post-money valuation, with Amazon and NVIDIA as lead investors; no subsequent funding round has been publicly disclosed as of May 2026. | 高 | SR017, SR019 |
| CR033 | EvolutionaryScale has disclosed no public revenue or ARR metrics; at a $1.35B valuation with $145M raised and a pre-revenue or early-revenue profile, the implied revenue multiple significantly exceeds typical Series A SaaS multiples, creating down-round risk if commercial adoption is slower than investor expectations. | 中 | SR017 |
| CR034 | Amazon (AWS) is simultaneously a lead Series A investor, primary compute provider (AWS EC2 GPU instances), distribution channel (SageMaker JumpStart), and operator of a competing bio-AI discovery platform — a four-way conflict of interest with no public disclosure of ring-fencing arrangements. | 中 | SR017, SR026 |
| CR035 | NVIDIA is simultaneously a lead Series A investor, primary GPU hardware supplier, BioNeMo platform operator (including ESM model distribution), and a developer of competing bio-AI capabilities, a four-way structural conflict comparable to Amazon's. | 中 | SR017, SR013 |
| CR036 | ESM Cambrian commercial use is available via AWS SageMaker, meaning Amazon earns transaction fees on Forge-equivalent commercial access to EvolutionaryScale's models — an arrangement that benefits Amazon's cloud revenue while potentially constraining EvolutionaryScale's direct-to-customer pricing power. | 中 | SR015, SR026 |
| CR037 | ESM3-98B training at 1×10²⁴ FLOPs represents one of the most computationally intensive biological model training runs recorded; the GPU compute costs for ongoing model development, API inference at commercial scale, and future ESM4 training represent a significant and growing operating expense with no public disclosure of unit economics. | 中 | SR014, SR013 |
| CR038 | All four named EvolutionaryScale founders (Alexander Rives, Tom Sercu, Zeming Lin, Salvatore Candido) are alumni of Meta AI's FAIR protein team, representing a single-employer concentration in the founding team with no disclosed external scientific advisory board. | 高 | SR028, SR014 |
| CR039 | The ESM3 bioRxiv preprint author list names Thomas Hayes, Roshan Rao, Halil Akin, Nicholas Sofroniew, Deniz Oktay, Zeming Lin, Robert Verkuil, Tom Sercu, Salvatore Candido, and Alexander Rives among the core team, all identified as EvolutionaryScale, PBC employees, indicating technical concentration in the founding team. | 中 | SR028 |
| CR040 | Alexander Rives, CEO, is the original ESM model creator and lead author of the 2021 PNAS paper that introduced ESM-1b; Tom Sercu and Zeming Lin are the primary technical architects of ESM3 and ESM Cambrian respectively. Departure of any of the three would represent a material scientific knowledge risk. | 中 | SR010, SR028 |
| CR041 | No succession plan, key-person insurance, or CEO independence structure has been publicly disclosed by EvolutionaryScale, leaving investors with no visible mitigation for key-person departure risk at the $1.35B valuation level. | 低 | |
| CR042 | The facebookresearch/esm GitHub repository states it 'contains code and pre-trained weights' under Meta's terms; ESM3 was developed by founders who worked at Meta FAIR and built ESM2, creating a plausible IP provenance chain where Meta could assert rights over ESM3 commercial weights as derivative works. | 中 | SR010, SR024 |
| CR043 | The ESM3 bioRxiv preprint competing interest statement discloses 'patents have been filed related to aspects of this work' but does not specify patent numbers, claims, status, jurisdiction, or relationship to Meta's prior art, leaving investors and customers unable to assess the durability of EvolutionaryScale's IP position. | 中 | SR028 |
| CR044 | EvolutionaryScale is incorporated as a Public Benefit Corporation (PBC), which under Delaware law obliges the board to balance shareholder interests with a stated public benefit purpose, potentially constraining purely commercial decisions about model access pricing or API gating in ways that may conflict with investor return expectations. | 中 | SR014, SR028 |
| CR045 | No litigation, regulatory complaint, enforcement action, or customer dispute records involving EvolutionaryScale, PBC have been identified in publicly accessible sources as of May 2026, indicating a clean legal record at this early stage. | 中 | SR017, SR014 |
| CV001 | EvolutionaryScale raised $142M in its Series A on September 26, 2024. | 高 | SV001, SV002, SV029 |
| CV002 | EvolutionaryScale's September 2024 Series A valued the company at approximately $1.35B post-money. | 高 | SV001, SV028, SV029 |
| CV003 | Amazon (AWS) and NVIDIA co-led EvolutionaryScale's Series A; Lux Capital, Nat Friedman, and Daniel Gross participated. | 高 | SV001, SV002 |
| CV004 | EvolutionaryScale has raised approximately $145M in total funding including seed capital as of May 2026. | 中 | SV001, SV028 |
| CV005 | EvolutionaryScale has not publicly disclosed any revenue, ARR, Forge customer count, or gross margin as of May 2026. | 高 | SV002, SV028 |
| CV006 | The ESM3 paper was published in Science on January 16, 2025, documenting the generation of a novel fluorescent protein (esmGFP) whose distance from known sequences the authors describe as equivalent to simulating 500 million years of evolution. | 高 | SV005, SV003 |
| CV007 | ESM3 was trained on 2.78 billion proteins and 771 billion tokens with over 1×10^24 FLOPs, described as the most compute ever applied to training a biological model. | 高 | SV003, SV005 |
| CV008 | The Forge commercial API platform provides fee-based access to ESM3 and ESM Cambrian models for pharmaceutical and biotech R&D users. | 中 | SV002, SV004 |
| CV009 | In addition to its own Forge API, EvolutionaryScale distributes its models via AWS SageMaker JumpStart and NVIDIA BioNeMo, providing direct access to global pharma R&D cloud infrastructure. | 高 | SV026, SV002 |
| CV010 | The ESM Cambrian esmc-600m-2024-12 model shows roughly 6,320 downloads on HuggingFace as of May 2026, indicating active developer community adoption. | 中 | SV033, SV004 |
| CV011 | Absci (NASDAQ:ABSI) had a market capitalization of approximately $800M as of May 2026. | 高 | SV006, SV009 |
| CV012 | Absci reported revenue of $2.8M for FY2025 (down from $4.5M in FY2024) and a net loss of $115.2M for FY2025. | 高 | SV009, SV006 |
| CV013 | Recursion Pharmaceuticals (NASDAQ:RXRX) had a market capitalization of approximately $1.555B as of May 2026. | 高 | SV007, SV010 |
| CV014 | Recursion reported Q1 2026 revenue of $6.47M, which fell short of analyst expectations, with cash of $665.2M providing runway into early 2028. | 高 | SV007, SV010 |
| CV015 | Recursion had an accumulated deficit of $2.1 billion as of December 31, 2025, reflecting the capital intensity of AI drug discovery platform development. | 高 | SV010, SV007 |
| CV016 | Schrödinger (NASDAQ:SDGR) had a market capitalization of approximately $893M as of May 2026, with a 52-week high of $27.63 and low of $10.94. | 高 | SV008, SV013 |
| CV017 | Generate Biomedicines has raised approximately $700M in total disclosed funding and operates a generative biology platform targeting protein therapeutics. | 中 | SV021, SV028 |
| CV018 | Profluent has raised approximately $44M in disclosed funding and introduced OpenCRISPR, described as the first AI-designed gene editor. | 中 | SV019, SV030 |
| CV019 | Cradle.bio has raised approximately $73M in total disclosed funding and serves top biopharma and industrial bio R&D teams for protein optimization. | 中 | SV020, SV030 |
| CV020 | Xaira Therapeutics launched in April 2024 with $1B in Series A funding, the largest-ever AI drug discovery Series A at the time. | 中 | SV025, SV032 |
| CV021 | Isomorphic Labs (Alphabet-backed) is developing AI drug discovery with Lilly and Novartis collaborations reportedly worth $3B+ in combined headline deal value. | 中 | SV023, SV029 |
| CV022 | AlphaFold 3 was released by Google DeepMind on May 8, 2024, predicting structure and interactions of all biomolecules; AlphaFold Server provides free access to 3M+ researchers across 190+ countries for non-commercial research. | 高 | SV027, SV032 |
| CV023 | The ESM2 predecessor protein language model is available as open-source software from Meta's facebookresearch/esm GitHub repository, providing free sequence-embedding capability comparable to lower-capability ESM3 tiers. | 高 | SV003, SV033 |
| CV024 | The global AI in drug discovery market is estimated at $2.35B in 2025, projected to grow to $13.77B by 2033 at a CAGR of 24.8%. | 中 | SV032, SV016 |
| CV025 | The global protein engineering market is estimated at $5.09B in 2025, projected to grow to $23.59B by 2035 at a CAGR of 16.57%. | 中 | SV015, SV017 |
| CV026 | US VC deal value reached $74.6B across 2,859 deals in Q4 2024, the highest since Q2 2022, driven primarily by AI investment including five companies raising $4B+ rounds. | 高 | SV014, SV032 |
| CV027 | KPMG's Venture Pulse Q4 2024 warned that VC investors are becoming more discerning as to who the winners may be in the AI space and will favor companies with credible commercial models over AI-wrapper businesses. | 中 | SV014 |
| CV028 | No EvolutionaryScale Form D securities filing was identified in SEC EDGAR's full-text search database as of May 2026, consistent with private status and possible Regulation D without full disclosure. | 中 | SV031, SV011 |
| CV029 | All four co-founders of EvolutionaryScale (Rives, Sercu, Lin, Candido) joined from Meta AI FAIR, representing correlated key-person concentration risk. | 高 | SV002, SV028 |
| CV030 | EvolutionaryScale's $1.35B post-money valuation on a $142M pre-revenue Series A implies a post-money-to-raised multiple of approximately 9.5x, substantially above historical norms for pre-revenue biotech Series A rounds. | 中 | SV001, SV028, SV029 |
| CV031 | The bull case for EvolutionaryScale is $3–5B, contingent on Forge achieving $50–100M ARR by 2027 through multi-pharma contracts and AWS/NVIDIA channel scale. | 低 | SV001, SV032, SV028 |
| CV032 | The base case for EvolutionaryScale is $1.5–2.5B, reflecting a modest step-up from Series A entry if commercial ramp is slow ($10–25M ARR) and AWS/NVIDIA partnerships drive most revenue. | 中 | SV001, SV014, SV028 |
| CV033 | The bear case for EvolutionaryScale is $400–800M, reflecting commoditization risk from open-source ESM2 and free AlphaFold 3, possible key-person departure, or acqui-hire by Amazon at a down-round price. | 中 | SV027, SV023, SV014 |
| CV034 | Amazon (AWS) and NVIDIA's co-investment creates a structural distribution advantage: both partners have a commercial incentive to route pharma API traffic through Forge via their respective cloud platforms. | 中 | SV026, SV002, SV009 |
| CV035 | Insilico Medicine completed an HKEX IPO raising approximately $293M in late 2025 (SEHK:3696), with a prior Series E valuation of ~$2.3B, providing a precedent for AI drug discovery company public market exits. | 中 | SV022, SV032 |
| CV036 | EvolutionaryScale's founders created the original ESM protein language model family at Meta AI FAIR, establishing unique domain authority and an institutional knowledge base not replicable at competing protein AI startups. | 高 | SV003, SV005, SV002 |
| CV037 | EvolutionaryScale's responsible development framework acknowledges dual-use biosecurity risks of ESM3; no public disclosure of specific customer screening protocols or DURC compliance procedures has been made. | 中 | SV002, SV003 |
| CV038 | Recursion's FY2025 10-K disclosed that three partners represented 95% of total partner program revenue, illustrating extreme customer concentration risk inherent in AI drug discovery platform businesses. | 高 | SV010, SV007 |
| CV039 | The drug discovery market is estimated at $71.89B in 2025, growing to $158.74B by 2034 at a 9.2% CAGR, representing the broader TAM for AI tools improving pharmaceutical R&D efficiency. | 中 | SV016, SV032 |
| CV040 | Recursion (RXRX) had a 52-week share price range of $2.80–$7.18 and Schrödinger (SDGR) a range of $10.94–$27.63 as of May 2026, evidencing high share-price volatility and multiple-compression risk in public AI drug discovery comps. | 高 | SV007, SV008 |
| CV041 | Schrödinger (SDGR) most recently filed a 10-K on February 25, 2026 (for FY2025), confirming ongoing SEC reporting status and a current market cap of approximately $893M. | 高 | SV013, SV008 |
| CV042 | ESM3 was trained on biological data spanning diverse environments including the Amazon rainforest, hydrothermal vents, and soil microbiomes, representing one of the most comprehensive biological training datasets compiled by any private company. | 中 | SV003, SV005 |
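
Several rows above quote figures that can be recomputed from other figures in the same table: the market CAGRs in CV024, CV025 and CV039, the post-money-to-raised multiple in CV030, and the gap between ESM3's training compute (CV007) and the 10²⁵ FLOP threshold cited in CR025. The Python sketch below reruns that arithmetic as a consistency check. The helper names (`cagr`, `check_markets`) and the 0.5 percentage-point tolerance are illustrative assumptions, not part of any cited source, and each forecast horizon is assumed to be the simple difference between the two years named in the claim.

```python
# Consistency-check sketch for arithmetic quoted in the claims table.
# Inputs are copied from the rows named in the comments; all function and
# variable names are hypothetical and chosen for illustration only.

def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate, returned as a fraction (0.10 == 10%)."""
    return (end / start) ** (1 / years) - 1

# (claim id, start value in $B, end value in $B, start year, end year, stated CAGR in %)
MARKET_CLAIMS = [
    ("CV024", 2.35, 13.77, 2025, 2033, 24.8),     # AI in drug discovery
    ("CV025", 5.09, 23.59, 2025, 2035, 16.57),    # protein engineering
    ("CV039", 71.89, 158.74, 2025, 2034, 9.2),    # drug discovery overall
]

def check_markets(tolerance_pp: float = 0.5) -> dict:
    """Map each claim id to (implied CAGR in %, within-tolerance flag)."""
    results = {}
    for claim_id, start, end, y0, y1, stated in MARKET_CLAIMS:
        implied = cagr(start, end, y1 - y0) * 100
        results[claim_id] = (round(implied, 2), abs(implied - stated) <= tolerance_pp)
    return results

# Funding figures in $M (CV001, CV002, CV004).
POST_MONEY_M = 1350.0
SERIES_A_M = 142.0
TOTAL_RAISED_M = 145.0

multiple_on_series_a = POST_MONEY_M / SERIES_A_M     # basis used in CV030
multiple_on_total = POST_MONEY_M / TOTAL_RAISED_M    # alternative basis

# Training compute (CV007: "over 1x10^24 FLOPs") vs. the threshold in CR025.
ESM3_TRAINING_FLOPS = 1e24
EU_SYSTEMIC_RISK_FLOPS = 1e25
below_threshold = ESM3_TRAINING_FLOPS < EU_SYSTEMIC_RISK_FLOPS

if __name__ == "__main__":
    for claim_id, (implied, ok) in check_markets().items():
        print(f"{claim_id}: implied CAGR {implied}% (consistent: {ok})")
    print(f"Post-money / Series A:     {multiple_on_series_a:.1f}x")
    print(f"Post-money / total raised: {multiple_on_total:.1f}x")
    print(f"ESM3 compute below 1e25 FLOP threshold: {below_threshold}")
```

Under these assumptions the three market CAGRs recompute to within about 0.1 percentage point of the stated values, and the multiple comes out at roughly 9.5x on the Series A alone or roughly 9.3x on total capital raised. The quoted training compute sits about an order of magnitude below the 10²⁵ FLOP figure in CR025.
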
| ID | Publisher | Title | Citation |
|---|---|---|---|
| SO001 | EvolutionaryScale | EvolutionaryScale Official Homepage | We are building the foundation for a new era of programmable biology — from foundational models for protein sequences to tools that let scientists design and understand proteins. |
| SO002 | EvolutionaryScale | ESM3 Release Blog Post | We are releasing ESM3, a generative multimodal model for protein design. ESM3 reasons over the sequence, structure, and function of proteins. |
| SO003 | EvolutionaryScale | ESM Cambrian Launch Blog Post | We are releasing ESM Cambrian, a new family of protein language models available in 300M, 600M, and 6B parameter sizes. |
| SO004 | Crunchbase | EvolutionaryScale on Crunchbase | EvolutionaryScale raised a total of $142M in funding over 2 rounds. Their latest funding was raised on Sep 26, 2024 from a Series A round. |
| SO005 | Hugging Face | EvolutionaryScale Organization on Hugging Face | EvolutionaryScale organization on Hugging Face; hosts ESM3 and ESM Cambrian model weights and model cards. |
| SO006 | EvolutionaryScale | EvolutionaryScale GitHub Organization | EvolutionaryScale on GitHub: 9 repositories including esm, DeepEP, nccl, and transformers forks. |
| SO007 | EvolutionaryScale / CZ Biohub | ESM Repository on GitHub | ESM: Evolutionary Scale Modeling — official model weights and inference code; repository redirected to biohub organization following acquisition. |
| SO008 | EvolutionaryScale | DeepEP Repository on GitHub | DeepEP: An efficient expert-parallel communication library optimized for mixture-of-experts models and inference. |
| SO009 | Science (AAAS) | Simulating 500 million years of evolution with a language model | We describe ESM3, a generative multimodal model that reasons over the sequences, structures, and functions of proteins. ESM3 was found to generate a new fluorescent protein distant from known sequences. |
| SO010 | bioRxiv (Cold Spring Harbor Laboratory) | ESM3: Simulating 500 million years of evolution with a language model (preprint) | ESM3: Simulating 500 million years of evolution — BioRxiv preprint doi: 10.1101/2024.07.01.600583; authors include Rives, Sercu, Candido, Lin. |
| SO011 | U.S. Securities and Exchange Commission | SEC EDGAR Full-Text Search: Form D filings for EvolutionaryScale 2024-2026 | 0 results found for EvolutionaryScale in Form D filings from January 2024 through May 2026. |
| SO012 | U.S. Securities and Exchange Commission | SEC EDGAR Company Search: EvolutionaryScale Form D | No companies found matching evolutionaryscale for Form D filings in SEC EDGAR. |
| SO013 | LinkedIn | EvolutionaryScale on LinkedIn | EvolutionaryScale; Company size: 11-50 employees; Industry: Biotechnology Research; team has joined CZ Biohub. |
| SO014 | CZ Biohub / Chan Zuckerberg Initiative | CZ Biohub: Frontier AI for Biology Initiative | We are thrilled to welcome the EvolutionaryScale team to the CZ Biohub Network. Alex Rives will serve as Head of Science at CZI, working to advance open biological science. |
| SO015 | Bloomberg | EvolutionaryScale Raises $142M from Amazon, NVIDIA | EvolutionaryScale Inc. raised $142 million from Amazon.com Inc. and Nvidia Corp. in a new financing round. Full article paywalled; detailed terms and investor rights not accessible for diligence. |
| SO016 | Reuters | EvolutionaryScale raises $142M for AI protein design | Reuters article confirmed as broken or inaccessible (401 JS-only response); content unavailable. |
| SO017 | NVIDIA | NVIDIA Blog: EvolutionaryScale ESM3 on BioNeMo and H100 NIM | EvolutionaryScale and NVIDIA partner to deploy ESM3 as a NIM microservice on H100 infrastructure. Tom Sercu, co-founder and VP of engineering, described the partnership. NVIDIA participated in both the seed and Series A rounds. |
| SO018 | Hacker News (Algolia API) | Hacker News Stories About EvolutionaryScale | Top HN stories include: ESM3 Simulating 500 million years of evolution (2024, ~500 points); EvolutionaryScale raises $142M Series A; EvolutionaryScale Acquired by CZI (Nov 2025, story 45838940). |
| SO019 | NVIDIA | NVIDIA NGC Catalog: ESM3 by EvolutionaryScale | ESM3 listed in NVIDIA NGC catalog under Clara team; page rendered as JS-only SPA; existence confirmed but detailed content not accessible. |
| SO020 | Semantic Scholar (Allen Institute for AI) | Semantic Scholar API: ESM3 / EvolutionaryScale paper search | Semantic Scholar API returns multiple papers related to protein language models and ESM3; provides citation count proxy and publication network for ESM family research. |
| SO021 | Wikimedia Foundation | Wikipedia: EvolutionaryScale (page not found) | Wikipedia page for EvolutionaryScale does not exist; URL returns HTTP 404 Not Found. No Wikipedia article has been created for the company as of May 2026. |
| SO022 | NVIDIA | NVIDIA Clara BioNeMo Platform | NVIDIA BioNeMo is a cloud platform for generative AI drug discovery; features ESM3 integration for protein sequence and structure generation. |
| SO023 | NVIDIA | NVIDIA News: NVIDIA Joins Seed Investment in EvolutionaryScale | NVIDIA News URL for seed investment announcement returns news archive page rather than the specific article; original press release content is inaccessible. |
| SO024 | EvolutionaryScale | EvolutionaryScale Forge API Platform | Forge API platform is a JavaScript-rendered SPA; no textual content accessible; existence confirmed but operational status post-acquisition is unknown. |
| SO025 | GlobeNewswire (expected: EvolutionaryScale) | GlobeNewswire: EvolutionaryScale Series A press release (expected) | URL returned content for a different company (Banzai International press release); EvolutionaryScale Series A press release was not accessible at this URL. |
| SO026 | Hugging Face / EvolutionaryScale | HuggingFace: ESM3-sm-open-v1 model card | ESM3-sm-open-v1 on HuggingFace: 3,110+ downloads; open model for non-commercial academic use; model card describes sequence, structure, and function inputs. |
| SO027 | Axios | Axios: EvolutionaryScale Series A funding protein AI | Axios article on EvolutionaryScale Series A was rate-limited during fetch; content not retrieved; URL confirms coverage of the funding round. |
| SM001 | MarketsandMarkets | Protein Engineering Market Size, Share & Trends Analysis Report — MarketsandMarkets | The protein engineering market size is projected to grow from USD 2.2 billion in 2019 to USD 3.9 billion by 2024, at a CAGR of 12.4%. |
| SM002 | Precedence Research | Protein Engineering Market Size, Growth & Forecast 2025–2035 — Precedence Research | The global protein engineering market size was estimated at USD 5.09 billion in 2025 and is expected to reach around USD 23.59 billion by 2035, growing at a CAGR of 16.57%. |
| SM003 | Allied Market Research | Protein Engineering Market by Type, Application, and Region — Allied Market Research | The global protein engineering market size was valued at $2.2 billion in 2022, and is projected to reach $7.7 billion by 2032, growing at a CAGR of 13.2% from 2023 to 2032. |
| SM004 | Grand View Research | Protein Engineering Market Size, Share & Trends Analysis — Grand View Research | The global protein engineering market size was valued at USD 2.60 billion in 2023 and is expected to grow at a CAGR of 16.24% from 2024 to 2030. |
| SM005 | Grand View Research | Artificial Intelligence In Drug Discovery Market — Grand View Research | The global artificial intelligence in drug discovery market was valued at USD 2.35 billion in 2025 and is expected to reach USD 13.77 billion by 2033 at a CAGR of 24.8%. |
| SM006 | Precedence Research | Drug Discovery Market Size, Share & Trends 2025–2034 — Precedence Research | The global drug discovery market size was estimated at USD 71.89 billion in 2025 and is expected to reach around USD 158.74 billion by 2034, growing at a CAGR of 9.2%. |
| SM007 | Google DeepMind | AlphaFold: Protein Structure Database — Google DeepMind | The AlphaFold Protein Structure Database provides open access to over 200 million protein structure predictions covering nearly all known proteins. |
| SM008 | NVIDIA | NVIDIA AI for Healthcare and Life Sciences — NVIDIA BioNeMo | NVIDIA BioNeMo is the development platform for AI-driven biology and drug discovery. 2x faster biofoundation model training. 6x faster model inference. |
| SM009 | Amazon Web Services | Amazon SageMaker JumpStart — AWS | |
| SM010 | U.S. Food and Drug Administration (FDA) | Artificial Intelligence in Drug Development — FDA | FDA has received over 500 AI/ML-enabled drug development submissions since 2016. The CDER AI Council was established in 2024. |
| SM011 | EvolutionaryScale | Simulating 500 million years of evolution with a language model — ESM3 Blog | ESM3 is a frontier multimodal generative model for biology. We are releasing ESM3 as open for academic and non-commercial use. For commercial access to ESM3, we are launching the Forge API. |
| SM012 | EvolutionaryScale | ESM Cambrian: Building the Frontier of Protein Language Models — Blog | ESM C is now available on AWS SageMaker JumpStart and NVIDIA BioNeMo. ESM C is released under the MIT license for any use, including commercial applications. |
| SM013 | Science (AAAS) | Simulating 500 million years of evolution with a language model — Science | ESM3 is a frontier multimodal generative model for biology that reasons over the sequence, structure, and function of proteins simultaneously, trained on sequences of 2.78 billion proteins. |
| SM014 | bioRxiv (Cold Spring Harbor Laboratory) | Search results: evolutionaryscale ESM3 — bioRxiv | |
| SM015 | bioRxiv (Cold Spring Harbor Laboratory) | Simulating 500 million years of evolution with a language model — bioRxiv preprint | We have developed ESM3, a frontier multimodal generative model for biology trained at the scale of evolution. |
| SM016 | IQVIA | IQVIA — Healthcare and Life Science Analytics | |
| SM017 | EvolutionaryScale (GitHub) | evolutionaryscale/esm — GitHub repository | |
| SM018 | Hugging Face | evolutionaryscale — Hugging Face organization page | esm3-sm-open-v1: 3,110 downloads. esmc-600m-2024-12: 6,320 downloads. |
| SM019 | Amazon Web Services | AWS for Health — Genomics and Life Sciences | |
| SM020 | Statista | Pharmaceutical industry research and development expenditure worldwide 2008–2024 | |
| SM021 | Fortune Business Insights | Artificial Intelligence In Drug Discovery Market Size & Forecast | |
| SM022 | Crunchbase | EvolutionaryScale — Crunchbase company profile | Total funding: $142M. Most recent funding: Series A. |
| SM023 | National Human Genome Research Institute (NHGRI) | DNA Sequencing Costs: Data — National Human Genome Research Institute | The cost per raw megabase of DNA sequence dropped dramatically from ~$10,000 in 2001 to less than $0.01 by 2023, reaching approximately $100 per genome. |
| SM024 | Lux Capital | EvolutionaryScale — Lux Capital portfolio | ESM3 is the first generative AI model for biology that simultaneously reasons over the sequence, structure, and function of proteins. |
| SM025 | PyPI | esm — Python Package Index | This repository contains flagship protein models for EvolutionaryScale, as well as access to the API. ESM3 is our flagship multimodal protein generative model. |
| SP001 | Profluent Bio | Profluent — AI-Designed Proteins | The world's first AI-designed gene editor, demonstrating authorship in action. |
| SP002 | Generate Biomedicines | Generate Biomedicines — Generative Biology | "42,000 proteins generated, built, and tested – and we're just getting started." |
| SP003 | Generate Biomedicines | The Generate Platform | "GB-0895 has the potential to shift treatment from monthly to just twice per year." |
| SP004 | AbSci Corporation | AbSci — Unlocking Novel Biology with AI | "De novo design of biologics; Multi-parametric lead optimization; Data to train, AI to create, and wet lab to validate with 6 week cycle times" |
| SP005 | Cradle | Cradle — AI Protein Engineering Platform | "Teams that use Cradle report 2-12x faster development timelines." |
| SP006 | Google DeepMind | AlphaFold — AI for Protein Structure | "Demis Hassabis and John Jumper are co-awarded the Nobel Prize in Chemistry for their work on AlphaFold, alongside David Baker for his work on computational protein design." |
| SP007 | Adaptyv Bio | Adaptyv Bio — Cloud Lab for Protein Designers | |
| SP008 | Isomorphic Labs | Isomorphic Labs — Reimagining Drug Discovery with AI | "Isomorphic Labs is here to advance human health by building on and beyond the Nobel-winning AlphaFold system." |
| SP009 | Chai Discovery | Chai Discovery — Drug-Like Antibody Design | "Drug-like antibody design against challenging targets with atomic precision" |
| SP010 | Institute for Protein Design (University of Washington) | Institute for Protein Design — We Create New Proteins | "We create new proteins that solve challenges in medicine, technology, and sustainability." |
| SP011 | Recursion Pharmaceuticals | Recursion — Pioneering AI Drug Discovery | "Over 50 petabytes spanning phenomics, transcriptomics, proteomics, ADME, and de-identified patient data." |
| SP012 | Recursion Pharmaceuticals | Recursion Drug Discovery Pipeline | |
| SP013 | Schrödinger | Schrödinger — Physics-Based Software Platform for Molecular Discovery | "Built upon more than 30 years of R&D, our industry-leading computational platform is transforming the way therapeutics and materials are discovered." |
| SP014 | Schrödinger | Schrödinger Computational Platform for Molecular Discovery & Design | |
| SP015 | Inceptive | Inceptive — Foundation Models of Life, for Life | "We build end-to-end foundation models that learn to design molecules directly from diverse observations of life. We specialize in sequence-based medicines like mRNA, siRNA, ASOs, and peptides." |
| SP016 | Iambic Therapeutics | Iambic Therapeutics — Better Technology for Better Medicines | "IAM1363 for HER2: Highly selective, brain-penetrant inhibitor for HER2-driven cancers that has shown anti-tumor activity, safety and tolerability in Phase 1b studies" |
| SP017 | Meta AI (Facebook Research) | GitHub: facebookresearch/esm — Evolutionary Scale Modeling (esm) | "This repository contains code and pre-trained weights for Transformer protein language models from the Meta Fundamental AI Research Protein Team (FAIR), including our state-of-the-art ESM-2 and ESMFold." |
| SP018 | Meta AI (AI at Meta) | facebook/esm2_t33_650M_UR50D · Hugging Face | License:mit |
| SP019 | OpenFold Consortium | OpenFold Consortium — Open Ecosystem for AI Biology | "Our goal is to develop an open ecosystem of accelerated AI for Biology tools in order to catalyze innovation, starting with state-of-the-art and permissively licensed protein structure prediction training and inference pipelines and models." |
| SP020 | Meta AI | ESM Metagenomic Atlas: The First View of the 'Dark Matter' of the Protein Universe | "We found that using a language model of protein sequences greatly accelerates the speed of structure prediction (up to 60x)." |
| SP021 | Nature | De novo design of protein structure and function with RFdiffusion | |
| SP022 | Science (AAAS) | Simulating 500 million years of evolution with a language model | |
| SP023 | EvolutionaryScale | EvolutionaryScale on HuggingFace — ESM3 and ESMC Model Families | |
| SP024 | EMBL-EBI / Google DeepMind | AlphaFold Protein Structure Database | "AlphaFold DB provides open access to over 200 million protein structure predictions to accelerate scientific research." |
| SP025 | EvolutionaryScale | ESM3 — A Frontier Language Model for Biology (EvolutionaryScale Blog) | "ESM3 represents a milestone model in the ESM family—the first created by our team at EvolutionaryScale, an order of magnitude larger than our previous model ESM2, and natively multimodal and generative." |
| SP026 | U.S. Securities and Exchange Commission (EDGAR) | Absci Corp (ABSI) 10-K Filing — FY2025 (Period: 2025-12-31; Filed: 2026-03-24) | |
| SP027 | Insilico Medicine | Insilico Medicine — Generative AI Software for Drug Discovery | |
| SP028 | Xaira Therapeutics | Xaira Therapeutics — AI Drug Discovery | "We are building predictive and agentic AI models across the complete spectrum of the drug discovery and development process." |
| SP029 | Wikipedia | AlphaFold — Wikipedia | |
| SI001 | EvolutionaryScale | EvolutionaryScale homepage — joining forces with Biohub | |
| SI002 | EvolutionaryScale | ESM3: A new paradigm for protein language models — ESM3 release blog | |
| SI003 | EvolutionaryScale | ESM Cambrian blog — commercial model release January 2025 | |
| SI004 | EvolutionaryScale | Forge API product page — commercial protein AI API | |
| SI005 | U.S. Securities and Exchange Commission | SEC EDGAR full-text search — Form D filings for 'EvolutionaryScale' (0 results) | |
| SI006 | U.S. Securities and Exchange Commission | SEC EDGAR full-text search — Form D filings for 'Evolutionary Scale' (0 results) | |
| SI007 | U.S. Securities and Exchange Commission | SEC EDGAR company browse — Form D filings for 'evolutionary scale' (0 results) | |
| SI008 | U.S. Securities and Exchange Commission | SEC EDGAR company browse — Form D filings for 'evolutionaryscale' (0 results) | |
| SI009 | CNBC | EvolutionaryScale raises $142 million from Amazon, Nvidia for protein AI | |
| SI010 | Axios | EvolutionaryScale Series A funding — protein AI $142 million Amazon NVIDIA | |
| SI011 | MIT Technology Review | EvolutionaryScale raises $142 million for protein AI from Amazon, Nvidia | |
| SI012 | CNBC | Chan Zuckerberg Initiative Biohub joins with EvolutionaryScale team | |
| SI013 | CZI Biohub | Frontier AI for Biology Initiative — EvolutionaryScale team joins Biohub | |
| SI014 | Crunchbase | EvolutionaryScale company profile — funding, investors, products | |
| SI015 | PitchBook | EvolutionaryScale company profile — funding history | |
| SI016 | NVIDIA | NVIDIA joins seed investment in EvolutionaryScale | |
| SI017 | NVIDIA | NVIDIA partners with EvolutionaryScale — ESM3 on BioNeMo | |
| SI018 | NVIDIA | EvolutionaryScale debuts with ESM3 generative AI model on BioNeMo and H100 | |
| SI019 | NVIDIA NGC Catalog | ESM3 model on NVIDIA NGC Catalog — Clara / BioNeMo resource | |
| SI020 | Lux Capital | Lux Capital — EvolutionaryScale Series A portfolio announcement | |
| SI021 | GitHub | EvolutionaryScale GitHub organization — repos and activity | |
| SI022 | GitHub | evolutionaryscale/esm — ESM model repository | |
| SI023 | Hugging Face | EvolutionaryScale/esm3-sm-open-v1 — open-weight ESM3 on HuggingFace | |
| SI024 | Bloomberg | EvolutionaryScale raises $142 million from Amazon, Nvidia (Bloomberg; access blocked) | |
| SI025 | Wikipedia | EvolutionaryScale — Wikipedia article | |
| SI026 | Hacker News (Algolia API) | Hacker News search — EvolutionaryScale funding Series A discussions | |
| SE001 | EvolutionaryScale | Simulating 500 million years of evolution with a language model | ESM3 is a generative model that reasons over the sequence, structure and function of proteins simultaneously. We trained ESM3 on an enormous scale: 771B tokens and 1.07×10^24 FLOPs. |
| SE002 | EvolutionaryScale | ESM Cambrian: New foundational protein language models | ESM Cambrian (ESMC) introduces a new family of protein language models with 300M, 600M, and 6B parameter sizes. |
| SE003 | EvolutionaryScale | EvolutionaryScale — Homepage | |
| SE004 | EvolutionaryScale | Forge — EvolutionaryScale API Platform | |
| SE005 | American Association for the Advancement of Science (AAAS) | Simulating 500 million years of evolution with a language model (Science, Vol 387) | Hayes et al. Science Vol 387 Issue 6736 pp. 850-858 (January 16, 2025); DOI 10.1126/science.ads0018 |
| SE006 | bioRxiv / Cold Spring Harbor Laboratory | Simulating 500M years of evolution with a language model (ESM3 preprint) | esmGFP is 58% sequence identical to the nearest natural GFP and has 96 mutations out of 229 total amino acid positions. |
| SE007 | bioRxiv / Cold Spring Harbor Laboratory | More Structure, Less Accuracy: ESM3's Binding Prediction Paradox | When distinct relaxed mutant structures are used per variant (rather than a single consistent backbone), ESM3's binding prediction performance deteriorates — a counter-intuitive result suggesting that more structural information can reduce accuracy. |
| SE008 | bioRxiv / Cold Spring Harbor Laboratory | bioRxiv search — EvolutionaryScale ESM3 citing papers | |
| SE009 | EvolutionaryScale | GitHub — evolutionaryscale/esm (official Python client and model weights) | |
| SE010 | EvolutionaryScale | GitHub — evolutionaryscale organization page | |
| SE011 | EvolutionaryScale | GitHub — evolutionaryscale/DeepEP (MoE Expert Parallelism library) | |
| SE012 | GitHub / EvolutionaryScale | GitHub API — evolutionaryscale/esm repository metadata | |
| SE013 | HuggingFace / EvolutionaryScale | HuggingFace — EvolutionaryScale organization page | |
| SE014 | HuggingFace / EvolutionaryScale | HuggingFace model card — esm3-sm-open-v1 | 3,105 downloads in the last month; 291 likes; license: Cambrian Non-Commercial License Agreement |
| SE015 | HuggingFace / EvolutionaryScale | HuggingFace model card — esmc-600m-2024-12 | |
| SE016 | NVIDIA | NVIDIA Clara BioNeMo — Platform for generative AI in drug discovery | |
| SE017 | NVIDIA | EvolutionaryScale Debuts With ESM3 Generative AI Model for Protein Design | ESM3 was trained using the Andromeda cluster, which uses NVIDIA H100 GPUs and NVIDIA Quantum-2 InfiniBand networking. ESM3 uses roughly 25x more flops and 60x more data than its predecessor, ESM2. |
| SE018 | NVIDIA | NVIDIA NGC Catalog — ESM3 (Clara resource) | |
| SE019 | Amazon Web Services | Amazon SageMaker JumpStart — Foundation models and ML solutions | |
| SE020 | Crunchbase | EvolutionaryScale — Crunchbase organization profile | |
| SE021 | U.S. Securities and Exchange Commission (SEC) | SEC EDGAR EFTS — Form D search for EvolutionaryScale | |
| SE022 | Hacker News (Algolia) | Hacker News search — EvolutionaryScale community discussion | |
| SE023 | CZI Biohub | Biohub launches initiative combining frontier AI & frontier biology | The team at EvolutionaryScale, a frontier AI research lab and public benefit company that has created groundbreaking, large-scale AI systems for the life sciences, will join Biohub to help advance this initiative. Alex Rives, EvolutionaryScale's co-founder and chief scientist, will serve as head of science. |
| SE024 | Lux Capital | EvolutionaryScale Series A — Lux Capital investment update | |
| SE025 | Axios | EvolutionaryScale raises $142 million Series A — Axios Pro Health Tech | |
| SE026 | Semantic Scholar / Allen Institute for AI | Semantic Scholar paper search — ESM3 EvolutionaryScale citations | |
| SE027 | DeepMind / Google | AlphaFold — DeepMind protein structure prediction | |
| SE028 | Chai Discovery | Chai Discovery — protein complex structure prediction | |
| SE029 | Semantic Scholar / Allen Institute for AI | Semantic Scholar paper search — ESM3 protein language model evaluation benchmark limitation | |
| SU001 | EvolutionaryScale | EvolutionaryScale — Company Homepage | ESM3 is a family of models in three sizes: small, medium, and large, available through our API and our partner's platforms. |
| SU002 | EvolutionaryScale | ESM3: A Frontier Language Model for Biology (Blog) | We're opening our API for biological intelligence, now in public beta, allowing scientists in academia and industry a free limited time preview of the capabilities of some of our models through Forge. |
| SU003 | EvolutionaryScale | ESM Cambrian: Revealing the Mysteries of Proteins with Unsupervised Learning (Blog) | ESM C 6B is available on Forge for academic use, and AWS Sagemaker for commercial use. ESM C will also be available on NVIDIA BioNemo soon. |
| SU004 | EvolutionaryScale (GitHub) | evolutionaryscale/esm — GitHub Repository (README) | ESM C models are also available on Amazon SageMaker under the Cambrian Inference Clickthrough License Agreement. Under this license agreement, models are available for broad use for commercial entities. |
| SU005 | EvolutionaryScale (GitHub) | evolutionaryscale — GitHub Organization Page | esm-partner — Repository for partner collaborations |
| SU006 | HuggingFace | EvolutionaryScale — HuggingFace Organization Page | biohub/esm3-sm-open-v1 Updated Jan 29, 2025 • 3.11k • 291 ... biohub/esmc-300m-2024-12 Updated 2 days ago • 6.32k • 30 ... biohub/esmc-600m-2024-12 Updated 2 days ago • 1.49k • 32 |
| SU007 | NVIDIA | NVIDIA BioNeMo — AI Platforms for Healthcare and Life Sciences | BioNeMo — NVIDIA BioNeMo™ is the development platform for AI-driven biology and drug discovery. Use Cases: biofoundation model building, molecular design, virtual screening, protein structure prediction, protein binder design |
| SU008 | Adaptyv Bio | Adaptyv Bio — Company Website | |
| SU009 | Semantic Scholar (Allen Institute for AI) | Semantic Scholar API — ESM3 EvolutionaryScale Downstream Paper Search | total: 32 |
| SU010 | bioRxiv (Cold Spring Harbor Laboratory) | bioRxiv Search — evolutionaryscale ESM3 Preprint Results | 129 Results for term 'evolutionaryscale ESM3' |
| SU011 | BusinessWire (Berkshire Hathaway) | EvolutionaryScale Raises $142M Series A to Advance Protein Language Models | |
| SU012 | CNBC | EvolutionaryScale Raises $142 Million, Protein AI Amazon NVIDIA | |
| SU013 | GlobeNewsWire | EvolutionaryScale Raises $142M Series A | |
| SU014 | Crunchbase | EvolutionaryScale — Crunchbase Organization Profile | EvolutionaryScale secured $142 million in a seed investment round. The funding was backed by Amazon and Nvidia and is intended for the development of protein-generating AI. |
| SU015 | Science (AAAS) | Simulating 500 million years of evolution with a language model | |
| SU016 | NVIDIA Newsroom | NVIDIA Joins Seed Investment in EvolutionaryScale | |
| SU017 | Lux Capital | EvolutionaryScale Series A — Lux Capital Blog | |
| SU018 | Wikipedia | EvolutionaryScale — Wikipedia | |
| SU019 | Wikipedia | Generate Biomedicines — Wikipedia | |
| SU020 | Precedence Research | Drug Discovery Market Size, Share & Trends 2025–2034 | The global drug discovery market size is valued at USD 71.89 billion in 2025 and is predicted to increase from USD 78.51 billion in 2026 to approximately USD 158.74 billion by 2034, expanding at a CAGR of 9.20% from 2025 to 2034. |
| SU021 | Isomorphic Labs | Isomorphic Labs — Company Homepage | |
| SU022 | Generate Biomedicines | Generate Biomedicines — Company Homepage | 42,000 proteins generated, built, and tested |
| SU023 | Axios | EvolutionaryScale Series A Funding — Axios | |
| SU024 | TechCrunch | EvolutionaryScale — TechCrunch Tag Page | |
| SU025 | Amazon Web Services | EvolutionaryScale ESM-C — AWS Marketplace Product Listing | |
| SU026 | Amazon Web Services | Revolutionizing Drug Discovery with AI: A Spotlight on EvolutionaryScale (AWS Industries Blog) | |
| SU027 | NVIDIA Developer | EvolutionaryScale ESM3 on NVIDIA (Developer Blog) | |
| SU028 | Nuclear Threat Initiative (NTI) | NTI Biosecurity — Program Overview | |
| SU029 | Center for AI Safety | Statement on AI Risk — safe.ai | |
| SR001 | U.S. Government Publishing Office / Federal Register | Executive Order 14110: Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence | addressing AI systems' most pressing security risks — including with respect to biotechnology, cybersecurity, critical infrastructure, and other national security dangers |
| SR002 | National Institute of Standards and Technology (NIST) | AI Risk Management Framework (AI RMF) — ITL AI Program | NIST released NIST-AI-600-1, Artificial Intelligence Risk Management Framework: Generative Artificial Intelligence Profile |
| SR003 | U.S. Food and Drug Administration | Artificial Intelligence-Enabled Medical Devices | |
| SR004 | arXiv / MIT (Sandbrink, Shulman) | Can large language models democratize access to dual-use biotechnology? | In one hour, the chatbots suggested four potential pandemic pathogens, explained how they can be generated from synthetic DNA using reverse genetics, supplied the names of DNA synthesis companies unlikely to screen orders, identified detailed protocols |
| SR005 | Center for AI Safety (CAIS) | Statement on AI Risk | Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war. |
| SR006 | Wikipedia | Biological Weapons Convention | As of May 2025, 189 states have become party to the treaty. The convention's effectiveness has been limited due to insufficient institutional support and the absence of any formal verification regime to monitor compliance. |
| SR007 | Nuclear Threat Initiative (NTI) | Biosecurity — NTI | These technologies also introduce risks of accidental misuse and deliberate exploitation, which could result in a biological catastrophe with grave consequences. |
| SR008 | Center for Security and Emerging Technology (CSET, Georgetown) | Biosecurity and Innovation in the Age of AI: Safeguarding the Future of U.S. Biotechnology | |
| SR009 | Nature (Baker Lab / University of Washington) | De novo design of protein structure and function with RFdiffusion | RFdiffusion enables the design of diverse functional proteins from simple molecular specifications |
| SR010 | Meta AI (facebookresearch) | facebookresearch/esm: Evolutionary Scale Modeling — Pretrained language models for proteins | This repository contains code and pre-trained weights for Transformer protein language models from the Meta Fundamental AI Research Protein Team (FAIR) |
| SR011 | EMBL-EBI / Google DeepMind | AlphaFold Protein Structure Database | AlphaFold DB provides open access to over 200 million protein structure predictions to accelerate scientific research |
| SR012 | AQ Laboratory (Columbia/UCSF) | aqlaboratory/openfold: Trainable, memory-efficient PyTorch reproduction of AlphaFold 2 | |
| SR013 | NVIDIA | NVIDIA BioNeMo — AI Development Platform for Biology and Drug Discovery | NVIDIA BioNeMo is the development platform for AI-driven biology and drug discovery |
| SR014 | EvolutionaryScale | ESM3: A New Era for Protein Design (Launch Blog — Responsible Development) | We have created a Responsible Development Framework to guide our work towards our mission with transparency and clarity |
| SR015 | EvolutionaryScale | ESM Cambrian: Representation learning for protein language models | ESM C was reviewed by a committee of scientific experts who concluded that the benefits of releasing the models greatly outweigh any potential risks |
| SR016 | EvolutionaryScale | evolutionaryscale/esm — ESM protein models and Forge API access | |
| SR017 | Crunchbase | EvolutionaryScale — Crunchbase Company Profile | |
| SR018 | HuggingFace | EvolutionaryScale — HuggingFace Organization Page | |
| SR019 | Bloomberg | EvolutionaryScale Raises $142 Million for Protein AI | |
| SR020 | EUR-Lex (European Union) | Regulation (EU) 2024/1689 — The Artificial Intelligence Act | laying down harmonised rules on artificial intelligence ... to promote the uptake of human centric and trustworthy artificial intelligence |
| SR021 | artificialintelligenceact.eu | The Act Texts — EU Artificial Intelligence Act | |
| SR022 | Wikipedia | AlphaFold | AlphaFold 2's results at CASP14 were described as 'astounding' and 'transformational'. As of November 2025, the paper had been cited nearly 43,000 times. |
| SR023 | Chai Discovery | Introducing Chai-1: A Multi-Modal Foundation Model for Molecular Structure Prediction | We tested Chai-1 across a large number of benchmarks, and found that the model achieves a 77% success rate on the PoseBusters benchmark (vs. 76% by AlphaFold3), as well as an Cα LDDT of 0.849 on the CASP15 protein monomer structure prediction set (vs. 0.801 by ESM3-98B) |
| SR024 | Meta AI | ESM Metagenomic Atlas: The first view of the 'dark matter' of the protein universe | |
| SR025 | Johns Hopkins Center for Health Security | Center for Health Security — Mission and Research Focus | We advance policies and practice addressing diverse challenges, including... the potential for biological accidents or intentional threats |
| SR026 | Amazon Web Services | Amazon SageMaker JumpStart | |
| SR027 | Wikipedia | Asilomar Conference on Recombinant DNA | A group of about 140 professionals participated in the conference to draw up voluntary guidelines to ensure the safety of recombinant DNA technology |
| SR028 | bioRxiv / EvolutionaryScale | Simulating 500 million years of evolution with a language model (Preprint) | Authors are employees of EvolutionaryScale, PBC. Patents have been filed related to aspects of this work. |
| SR029 | OpenAI | Safety and Responsibility — OpenAI | |
| SR030 | Anthropic | Responsible Scaling Policy | |
| SR031 | Wikipedia | AI Safety | |
| SV001 | Crunchbase | EvolutionaryScale — Funding, Investors, and Overview | EvolutionaryScale secured $142 million in a seed investment round. The funding was backed by Amazon and Nvidia and is intended for the development of protein-generating AI. |
| SV002 | EvolutionaryScale | EvolutionaryScale — Company Homepage | |
| SV003 | EvolutionaryScale | ESM3: A frontier language model for biology — ESM3 Release Blog | trained with over 1x10^24 FLOPS and 98B parameters |
| SV004 | EvolutionaryScale | ESM Cambrian — EvolutionaryScale Blog | |
| SV005 | Science (AAAS) | Simulating 500 million years of evolution with a language model | Simulating 500 million years of evolution with a language model |
| SV006 | Yahoo Finance | Absci Corporation (ABSI) Stock Price, News, Quote & History | Market Cap (intraday) 799.793M |
| SV007 | Yahoo Finance | Recursion Pharmaceuticals (RXRX) Stock Price, News, Quote & History | Market Cap (intraday) 1.555B |
| SV008 | Yahoo Finance | Schrodinger, Inc. (SDGR) Stock Price, News, Quote & History | Market Cap (intraday) 892.913M |
| SV009 | Absci Corporation (SEC Filing) | Absci Corporation Annual Report on Form 10-K for FY2025 (absi-20251231) | Revenue was $2.8 million for the year ended December 31, 2025 compared to $4.5 million for the year ended December 31, 2024. |
| SV010 | Recursion Pharmaceuticals (SEC Filing) | Recursion Pharmaceuticals Annual Report on Form 10-K for FY2025 (rxrx-20251231) | We had an accumulated deficit of $2.1 billion as of December 31, 2025. |
| SV011 | SEC EDGAR | EDGAR Filing Search — Absci Corp 10-K filings | |
| SV012 | SEC EDGAR | EDGAR Filing Search — Recursion Pharmaceuticals 10-K filings | |
| SV013 | SEC EDGAR | EDGAR Filing Search — Schrodinger Inc 10-K filings | |
| SV014 | KPMG Private Enterprise | Venture Pulse Q4 2024 — US Venture Capital Trends | people are now starting to become more discerning as to who the winners may be in the AI space — the companies with credible business models, creating highly disruptive solutions, as opposed to others who have put AI wrappers on existing solutions |
| SV015 | Precedence Research | Protein Engineering Market Size to Hit USD 23.59 Billion By 2035 | The global protein engineering market size was estimated at USD 5.09 billion in 2025 and is predicted to increase from USD 5.95 billion in 2026 to approximately USD 23.59 billion by 2035, expanding at a CAGR of 16.57% from 2026 to 2035. |
| SV016 | Precedence Research | Drug Discovery Market Size, Share, and Trends 2025–2034 | The global drug discovery market size is valued at USD 71.89 billion in 2025 and is predicted to increase from USD 78.51 billion in 2026 to approximately USD 158.74 billion by 2034. |
| SV017 | MarketsandMarkets | Protein Engineering Market — Global Forecast | |
| SV018 | Absci | Absci — AI Biologics Drug Creation Platform | |
| SV019 | Profluent | Profluent — AI Protein Design | |
| SV020 | Cradle | Cradle — AI Protein Engineering Platform | |
| SV021 | Generate Biomedicines | Generate Biomedicines — Generative Biology Platform | |
| SV022 | Insilico Medicine | Insilico Medicine — Generative AI Drug Discovery | |
| SV023 | Isomorphic Labs | Isomorphic Labs — Reimagining Drug Discovery with AI | |
| SV024 | Recursion | Recursion — Pioneering AI Drug Discovery | |
| SV025 | Xaira Therapeutics | Xaira Therapeutics — Company Homepage | |
| SV026 | NVIDIA | NVIDIA Clara BioNeMo — AI Drug Discovery and Protein Language Models | |
| SV027 | Google DeepMind | AlphaFold — Predicting Protein Structure and Interactions | AlphaFold 3 and AlphaFold Server are launched — Google DeepMind and Isomorphic Labs introduce AlphaFold 3, which predicts the structure and interactions of all of life's molecules. |
| SV028 | PitchBook | EvolutionaryScale — PitchBook Funding Profile | |
| SV029 | Bloomberg | EvolutionaryScale Raises $142 Million From Amazon, Nvidia | |
| SV030 | Hacker News (Algolia Search API) | Hacker News — EvolutionaryScale funding Series A discussions | |
| SV031 | SEC EDGAR (Full-Text Search) | EDGAR Full-Text Search — EvolutionaryScale Form D | hits total value 0 |
| SV032 | Grand View Research | Artificial Intelligence In Drug Discovery Market Report, 2033 | The global artificial intelligence in drug discovery market size was estimated at USD 2.35 billion in 2025 and is projected to reach USD 13.77 billion by 2033, growing at a CAGR of 24.8% from 2026 to 2033. |
| SV033 | Hugging Face | EvolutionaryScale Organization — Hugging Face Model Hub | |