Startup Due Diligence
Due diligence report | AI safety / interpretability tools | Series B private | 2026-06-10

Goodfire

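An interpretability-native model design lab with strong capital backing but unproven commercial scale

Goodfire looks like a category-defining interpretability company, but the public record is still too thin to underwrite the February 2026 valuation as clearly cheap.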

Cover elements

01 Latest public valuation: USD 1.25B [CV001]
02 Latest round: USD 150M [CV001]
03 Disclosed cumulative funding: USD 207M [CO021, CI005]
05 Recommendation: research-more [CV047]

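Company overview

Goodfire is a San Francisco AI interpretability company and public benefit corporation building a model design environment for understanding, debugging, and steering neural networks. Around Silico / Ember-style interpretability workflows, it sells selective enterprise and research engagements to frontier model teams, healthcare and scientific AI programs, and other high-stakes deployment settings. Public disclosure still leaves revenue quality and customer breadth largely opaque.

Website: www.goodfire.ai
Founded: 2024 (implied by seed and Series A timing; exact date not publicly confirmed, and one independent profile says 2023)
Founders: Eric Ho, Daniel Balsam, Tom McGrath
Founding location: San Francisco, California, USA
Headquarters: San Francisco, California
Product: A model design environment that exposes model internals, helps diagnose failure modes, supports steering and monitoring, and is increasingly packaged around selective enterprise and scientific deployments.
Customers: Frontier model developers, enterprise AI teams, life sciences and scientific AI teams, and other developers of high-stakes models.
Business model: Selective design-partner and enterprise software engagements built around platform access, pilots, and high-touch research or field engineering support.
Stage: Series B private
Funding: Announced a $150 million Series B at a $1.25 billion valuation in February 2026, following a seed round and a Series A.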
[CO001, CO003, CO004, CO018, CO021, CI007, CI008, CI009]

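Executive summary

Key strengths

  • Goodfire has unusually strong research credibility, and its product logic has been built around interpretability from day one, which makes the differentiation clear.
  • The cap table includes high-signal investors and strategic backers spanning frontier AI and enterprise software.
  • Early flagship engagements in healthcare, scientific AI, and enterprise design-partner workflows show that the company has a credible wedge.

Key risks

  • Public disclosure still shows no ARR, revenue quality, standardized pricing, retention, or customer concentration.
  • The company is priced as a future infrastructure winner before it has proven repeatable software economics.
  • Adjacent observability, guardrail, and platform vendors could capture much of the buyer budget without offering Goodfire's deeper tooling.

Open questions

  • NDA-level disclosure is still needed to verify recurring revenue, pricing architecture, and the software / services revenue mix.
  • The post-Series B preference stack, ownership structure, and any secondary or debt features remain undisclosed.
  • Public materials give no verified customer count, headcount, or customer concentration profile.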

Chapter 01

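01 Company Overview

1.1 Identity, mission, and product positioning

Goodfire defines itself as a research company that uses interpretability to understand, learn from, and design AI systems; multiple official and financing sources also describe it as a San Francisco public benefit corporation. The company's core thesis is that frontier AI is still built too much as a black box, so its mission is not to rely on scale alone but to make models understandable, debuggable, and shapeable. Official materials consistently frame the business around a "model design environment": helping users inspect model internals, diagnose failure modes, and intervene on behavior at the feature or circuit level.

The product narrative has matured in stages. Series A materials in 2025 positioned Ember as Goodfire's flagship interpretability platform; by 2026 the public product pages present Silico as the first platform for intentional model design. Go-to-market looks selective rather than mass-market: Goodfire says it works with Fortune 500 companies, major healthcare institutions, and AI research labs, and public product copy repeatedly targets organizations that train or fine-tune foundation models. The public evidence therefore supports a company identity that combines research lab, platform vendor, and design-partner model; customer concentration and commercial scale remain largely undisclosed. [CO001, CO002, CO003, CO004, CO005, CO024]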

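Snapshot KPI table
Metric | Value / status | Date | Confidence | Gap / note
Headquarters | San Francisco, California | 2026-06-10 |  | Official and investor materials agree; the careers page specifies the Telegraph Hill office
Organization type | Public benefit corporation | 2026-06-10 |  | Appears repeatedly in official and financing materials
Current stage | Private, Series B stage | 2026-06-10 |  | Private status is disclosed; stage is inferred from the latest financing
Latest round | $150M Series B | 2026-02-05 |  | Led by B Capital
Latest valuation | $1.25B | 2026-02-05 |  | Confirmed repeatedly by official and third-party reporting
Total disclosed funding | ~$207M; rounded publicly to >$200M | 2026-02-05 |  | Sum of the disclosed seed, Series A, and Series B rounds
Founding date | null | 2026-06-10 |  | 2024 is implied by the seed timing and Series A language, but one independent profile says 2023
Current product brand | Silico | 2026-04-30 |  | Early 2025 materials used Ember; product naming has evolved
Revenue / ARR | null | 2026-06-10 |  | No reviewed source publicly discloses revenue or ARR
Customer count | null | 2026-06-10 |  | No customer count is publicly disclosed; only broad customer categories are mentioned
Employee count | null | 2026-06-10 |  | No official headcount is disclosed; one independent profile estimates about 51 people as of January 2026
Disclosed customer profile | Fortune 500 companies, major healthcare institutions, AI research labs | 2026-06-10 |  | Named logos and contract counts remain sparse

A null value means the metric lacks public support, not that it is zero. Funding and valuation evidence is well supported; founding date, headcount, revenue, and customer count remain incomplete or indirect.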

[CO001, CO003, CO004, CO008, CO009, CO020]
FO002: Company snapshot logic

How Goodfire links research identity, product architecture, partner types, capital, and execution dependencies.

[CO002, CO003, CO004, CO005, CO024, CO026]

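1.2 Founders, leadership, and organizational profile

The public founding narrative centers on three co-founders: Eric Ho as CEO, Daniel Balsam as CTO, and Tom McGrath as Chief Scientist. In investor and company materials, Ho is the main public spokesperson and strategic voice; Balsam appears as the technical operator who turns interpretability into product and applied research; McGrath provides strong scientific endorsement as the former founder of Google DeepMind's interpretability team. Menlo and Salesforce materials also tie Ho and Balsam to early operating experience at RippleMatch, reinforcing the narrative that Goodfire combines frontier research credentials with startup execution.

The broader team profile is also part of the investment thesis. Goodfire and its backers highlight alumni of OpenAI, Google DeepMind, Harvard, Stanford, and UC San Diego, along with named contributors such as Nick Cammarata and Leon Bergen. Public leadership disclosure is incomplete, however: the reviewed materials give no full C-suite list, detailed board composition, or ownership chart. Even the founding date is somewhat ambiguous. Financing materials imply a 2024 founding, because the Series A is described as closing less than a year after founding and Lightspeed publicly announced the seed round in August 2024, but one independent profile says Goodfire was founded in 2023. The disclosed office footprint is also narrow: Goodfire's careers page says roles require five days a week on site at the Telegraph Hill office in San Francisco. [CO006, CO007, CO008, CO009, CO010, CO011]

Leadership and founders table
Person | Role | Background | Founder-market fit / scope | Key-person dependency
Eric Ho | CEO, co-founder | Former RippleMatch founder / operator; Goodfire's public representative in financing and media coverage | Sets company narrative, fundraising, partnerships, and the commercial positioning of interpretability | High: external narrative and investor confidence are closely tied to Ho
Daniel Balsam | CTO, co-founder | Former head of AI and engineering at RippleMatch; appears as the technical operator in Mayo and investor materials | Connects frontier interpretability research to product, and to applied genomics / enterprise use cases | High: core technical execution and productization rest largely on Balsam
Tom McGrath | Chief Scientist, co-founder | Former founder of the Google DeepMind interpretability team; repeatedly cited as the scientific anchor | Provides research credibility, agenda setting, and technical recruiting power | High: scientific brand and category authority depend materially on McGrath
Nick Cammarata | Senior interpretability researcher / star team member | Core contributor to OpenAI's early, pioneering interpretability team | Shows that Goodfire can recruit from a very small global pool of top interpretability talent | Medium: not a sole decision-maker, but valuable for research legitimacy

Coverage is partial because the reviewed sources do not disclose the full board, the finance lead, or the complete management roster. This table focuses on publicly named founders and high-signal technical leaders.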

[CO009, CO010, CO011, CO012, CO013, CO014]

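1.3 Funding history, investor base, and current stage

For a research-first infrastructure company, Goodfire has raised unusually fast. Public sources show a $7 million seed round led by Lightspeed in August 2024, a $50 million Series A led by Menlo Ventures in April 2025, and a $150 million Series B led by B Capital at a $1.25 billion valuation in February 2026. The Series A syndicate added Lightspeed, Anthropic, B Capital, Work-Bench, Wing, and South Park Commons; the Series B brought DFJ Growth, Salesforce Ventures, and Eric Schmidt onto the cap table alongside existing investors, with Juniper Ventures listed as a returning investor. Goodfire and third-party reporting both describe cumulative funding as "more than $200 million", while simply adding the disclosed rounds gives about $207 million.

The investor mix matters as much as the amount. Anthropic's participation in the Series A is a strategic signal from a safety-oriented frontier lab; Salesforce Ventures points to an enterprise software adoption angle; B Capital leading the Series B reflects a belief that interpretability could become an important infrastructure layer. The public record, however, discloses very little about ownership percentages, liquidation structure, debt, secondary transactions, or board seats. The reviewed public sources label Goodfire a private company; it should be treated as a late-venture, Series B private business, not as commercial software that has already scaled. The distinction matters because the pace of fundraising and the valuation have run far ahead of the company's disclosed revenue and customer metrics. [CO016, CO017, CO018, CO019, CO020, CO021]

Stakeholder or investor map
Stakeholder | Role | Control or economic significance | Diligence request
Lightspeed Venture Partners | Seed lead; Series A and B participant | Earliest institutional lead and continuing backer; may have influence in early governance | Confirm current ownership, pro rata rights, and any board seat
Menlo Ventures | Series A lead; Series B participant | Key financial sponsor of the first large institutional round and a publicly visible backer | Confirm board role, use of follow-on reserves, and any protective provisions
Anthropic | Series A participant | Strategic investor; its presence signals safety and interpretability relevance to frontier labs | Clarify whether the investment includes technical collaboration, channel value, or purely financial exposure
B Capital | Series B lead; Series A participant | Lead of the $1.25B valuation round; may hold significant board and governance influence after the Series B | Confirm ownership percentage, board seat, liquidation terms, and any commercial referral rights
Juniper Ventures | Existing investor in the Series B | Listed as a returning Series B investor, but less visible in earlier public materials | Determine entry round, ownership, and influence relative to better-known VCs
DFJ Growth | New Series B investor | Adds late-stage venture scale and potential follow-on capacity | Assess whether DFJ's view is platform infrastructure or a frontier-model option
Salesforce Ventures | New Series B investor and strategic enterprise partner | Signals enterprise software and procurement relevance, not only research support | Clarify whether Salesforce provides channel access, product collaboration, or board observer rights
Eric Schmidt | Series B angel / strategic investor | Brings brand and policy credibility; influence may exceed check size | Determine whether Schmidt is passive capital or an active network participant
Wing Venture Capital | Series A and B participant | Continuing venture support from the infrastructure-oriented investor community | Confirm ownership and any role in product GTM guidance
South Park Commons | Series A and B participant; early ecosystem sponsor | An important ecosystem backer given Goodfire's early office history and talent network | Clarify the talent pipeline and whether SPC provided incubation before formal founding
Work-Bench | Series A participant | Provided enterprise software pattern recognition at an earlier stage | Determine whether Work-Bench remains active after the Series B
Mayo Clinic / design partners | Strategic non-investor stakeholder | Partners matter economically because commercial proof appears to depend on selective high-stakes engagements | Request signed customer references, paid-pilot status, and renewal dynamics

The investor map covers every publicly named stakeholder but is not a complete shareholder list. Exact ownership, board representation, liquidation preferences, and secondary activity are not disclosed in the reviewed sources.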

[CO016, CO017, CO018, CO020, CO022, CO023]
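FO003: Snapshot KPIs

A compact maturity snapshot highlighting funding, stage, and the limited public disclosure of operating metrics.

Disclosed cumulative funding is simply the sum of the public $7M seed, $50M Series A, and $150M Series B. The chart deliberately omits revenue and customer-count KPIs because public sources cannot support them.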

[CO020, CO021, CO026, CO029, CO030, CO032]
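
The cumulative figure can be reproduced directly from the three disclosed round sizes. The sketch below is only an arithmetic check on the numbers stated in this report; the dictionary keys and function name are illustrative assumptions, and the sum ignores anything the public record does not show (secondary sales, debt, undisclosed tranches).

```python
# Disclosed headline round sizes in USD millions, as stated in this report.
DISCLOSED_ROUNDS_USD_M = {
    "Seed (2024-08)": 7,
    "Series A (2025-04)": 50,
    "Series B (2026-02)": 150,
}


def cumulative_disclosed_funding(rounds: dict) -> int:
    """Add up headline round sizes; undisclosed capital is not represented."""
    return sum(rounds.values())


total_usd_m = cumulative_disclosed_funding(DISCLOSED_ROUNDS_USD_M)
print(f"Cumulative disclosed funding: ~${total_usd_m}M")
```

This matches the ~$207M figure used on the cover and is consistent with the rounded public framing of "more than $200 million".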

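1.4 Timeline, cover metrics, and key diligence risks

The public timeline is short but dense. Goodfire entered public view through its seed financing in 2024, announced its Series A in April 2025, disclosed the Mayo Clinic collaboration in September 2025, launched a fellowship program and field-building educational content in late 2025, and then announced the Series B and a broader intentional-design agenda in February 2026. By April 2026, MIT Technology Review was covering Silico as a commercial product for debugging and steering models; by May 2026, Goodfire was emphasizing SOC 2 certification and an increasingly enterprise-facing posture. The sequence shows a company trying, in under two years, to convert frontier interpretability research into product and partner credibility.

The key pattern in the cover metrics is asymmetry: valuation and cumulative funding are well supported, while operating metrics remain sparse. The reviewed public sources disclose no revenue, ARR, or customer count. Headcount has no official disclosure; one independent profile estimates about 51 employees as of January 2026. This opacity matters because the most credible counter-evidence in the source set is not misconduct but execution risk: MIT Technology Review quotes an outside interpretability researcher who argues that Goodfire is adding "precision to alchemy" rather than turning AI engineering into a fully principled science, and an independent health-tech analysis argues that the Series B valuation is aggressive for a research-first company with early commercial traction. The diligence burden therefore lies not in headline credibility but in commercial proof, governance disclosure, and how quickly interpretability demand converts into repeatable software revenue. [CO025, CO026, CO027, CO030, CO031, CO032]

Milestones table
Date | Event | Type | Amount / valuation / status | Participants | Significance
2024-08-15 | Lightspeed publicly announces it led Goodfire's seed round | Founding | $7M seed | Goodfire; Lightspeed Venture Partners | Sets the first public financing marker and anchors a 2024 operating timeline
2025-04-17 | Goodfire announces its Series A and the Ember platform | Financing | $50M Series A | Menlo Ventures (lead); Lightspeed, Anthropic, B Capital, Work-Bench, Wing, South Park Commons | The company moves from seed-stage research lab to an institutionally backed platform narrative
2025-09-09 | Goodfire announces a genomic medicine collaboration with Mayo Clinic | Partnership | Collaboration announced | Goodfire; Mayo Clinic | Relevance extends from core interpretability research into healthcare and clinical AI
2025-10-09 | Goodfire opens its fall fellowship program | Expansion | Fellowship cohort recruiting | Goodfire researchers | Signals active talent expansion and field-building beyond the core founding team
2025-12-11 | Goodfire shares a Stanford guest lecture on interpretability | Governance | Educational content published | Goodfire researchers; Stanford course community | Shows thought leadership and an effort to shape the discipline around its own agenda
2026-02-05 | Goodfire announces its Series B and intentional design agenda | Financing | $150M at a $1.25B valuation | B Capital (lead); Juniper, Menlo, Lightspeed, DFJ Growth, Salesforce Ventures, Eric Schmidt, and others | Validates investor interest and sharply reprices the company as category infrastructure
2026-04-30 | MIT Technology Review covers the public launch of Silico | Product | Paid product launch / launch coverage | Goodfire; MIT Technology Review | Marks the shift from a research-platform narrative to broader commercial productization
2026-04-30 | Outside researcher warns that Silico adds "precision" to "alchemy" | Counter-evidence | Skeptical expert commentary | Leonard Bereska and MIT Technology Review | Raises doubt about whether the product resolves the scientific uncertainty it claims to address
2026-05-22 | Goodfire announces SOC 2 Type II certification | Regulatory | Compliance certification announced | Goodfire | Supports a trust posture for enterprise procurement and sensitive model development workflows
2026-06-10 | Public customer profile remains selective disclosure, not broad market proof | Expansion | States Fortune 500 / healthcare / research lab usage; no broad metrics | Goodfire; unnamed customers | The commercial story still rests more on design-partner quality than on disclosed volume metrics

The timeline emphasizes dated events visible in public materials. Some entries reflect public disclosure dates rather than the start of underlying operations; questions about founding and commercial scale remain partly unresolved.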

[CO017, CO016, CO018, CO024, CO026, CO027]
FO001: Company milestone timeline

Key public milestones from Goodfire's seed debut through the Series B, product launch, and the first substantive skeptical coverage.

[CO017, CO016, CO018, CO024, CO026, CO035]

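1.5 Charts

Chapter 02

02 Market Analysis

2.1 Market boundary and evidence-constrained sizing

Goodfire's relevant market is narrower than the headline AI-boom narrative. Company materials describe a product stack built around understanding model internals, debugging failures, steering behavior, shaping training, and in some cases monitoring production behavior. That boundary excludes general copilot tools, ordinary application observability, and AI infrastructure spending that never enters a model design workflow. The closest public analogs are LLM observability and evaluation vendors such as Arize, Fiddler, Datadog, LangSmith, Langfuse, Humanloop, Arthur, and Patronus; even these mainly monitor prompts, traces, sessions, and outputs rather than model parameters or latent representations.

Because Goodfire discloses no pricing, customer count, or revenue, a classic TAM-SAM-SOM stack would overstate precision. The evidence-constrained approach uses multiple lenses instead: first, macro demand signals from the spread of AI usage and ROI pressure; second, published adjacent tool pricing, which sets a software budget floor for teams that already buy observability and evaluation products; third, an access lens that narrows the reachable market to organizations that can expose model internals and can absorb a services-heavy pilot motion. Together these support a market that is real but selective, with near-term substance concentrated in advanced model teams rather than in a generalized enterprise AI story. [CM001, CM002, CM015, CM016, CM017, CM020]

Market definition table
Segment / category | Included spend | Excluded spend | Buyer / payer | Relevance
Frontier lab interpretability and model design | Interpretability research infrastructure, steering workflows, training-shaping tools, safety diagnostics, and production monitoring tied to proprietary models | General AI infrastructure, pure inference hosting, and general application analytics | Research leads, safety teams, and frontier model R&D budgets | The most natural direct segment, because labs own the model internals and already value interpretability
Enterprise model engineering and governance | Debugging, eval, steering, and monitoring for proprietary or open-weight enterprise models | Teams that use only third-party closed APIs with no internal model access | VP Engineering, AI platform leads, ML infra, and advanced product budgets | Reachable when the enterprise runs or fine-tunes its own important models
Scientific AI and life sciences model design | Model decoding, validation, confounder removal, and discovery workflows in genomics, biology, and robotics | General lab software, wet-lab tooling, and non-model R&D software | Scientific program leads, computational biology teams, and research budgets | Strong fit when understanding model internals changes scientific or deployment quality
Regulated and high-consequence adopters | Interpretability, governance, and validation layers for financial, medical, legal, or safety-critical AI | Commodity office copilots and general knowledge-worker subscriptions | Clinical, compliance, risk, or domain operations budgets, with a technical sponsor | Need is high, but procurement and evidence burdens are heavier and deals are harder to close
Adjacent LLM observability and eval stack | Tracing, prompt management, eval, experimentation, and guardrails already budgeted by production AI teams | Deep parameter or latent-space control, which is out of scope when a vendor only observes outputs or traces | Developer tools, platform engineering, and MLOps budgets | An important adjacent market, because these budgets define the closest public comparable set

The market boundary is deliberately narrow: it follows spend that could land in workflows for understanding, steering, and validating model internals, not all generative AI software or infrastructure.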

[CM020, CM021, CM022, CM028, CM029, CM030]
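TAM/SAM/SOM or sizing lens table
Publisher | Year | Geography | Value | CAGR | Method | Confidence | Limitation
Gartner | 2025 | Global | Trough of disillusionment (qualitative maturity lens) |  | Hype-cycle lens on deployment realism and ROI dispersion |  | Useful for timing and caution, but not a market size figure.
PwC | 2025 | Global | 100% of industries increasing AI usage; AI-exposed industries show 3x higher growth in revenue per employee |  | Macro adoption and productivity lens |  | Adoption breadth is real, but it does not isolate interpretability tooling budgets.
Arize + Langfuse | 2026 | Global SaaS | Annual list price of $348-$600 for a small team before heavy usage |  | Bottom-up adjacent pricing lens from public self-serve plans |  | Trace-and-eval tools are adjacent, but differ from model-internals design tools.
Langfuse | 2026 | Global SaaS | Enterprise annual list price of $29,988 before volume add-ons |  | Public enterprise list-price lens |  | A single vendor data point cannot reveal Goodfire pricing or win rate.
Fiddler | 2026 | Global SaaS | $0.002 per trace |  | Usage-based observability lens |  | Spend depends entirely on trace volume and still reflects output-trace observability, not interpretability work.
Goodfire direct market lens | 2026 | Selective design partners | Undisclosed / case-by-case pricing |  | Direct commercial lens from MIT coverage and Goodfire's pilot agreement |  | No public ACV, customer count, or pipeline data exists, so a true TAM-SAM-SOM cannot be built.

This table deliberately mixes qualitative maturity signals with adjacent pricing proxies, because public evidence does not support a clean Goodfire TAM-SAM-SOM. The point is to frame the market through observable lenses, not to invent a top-down number.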

[CM015, CM016, CM017, CM019, CM024, CM025]
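FM001: Market sizing lenses

Public evidence supports a large AI demand backdrop, a visible adjacent observability budget layer, and a much narrower Goodfire direct-capture layer that is defined by model access and high-touch pilots.

This is a set of constrained lens layers, not a numeric TAM-SAM-SOM waterfall. Only the adjacent budget layer has publicly visible pricing; Goodfire's direct commercial layer is undisclosed.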

[CM020, CM019, CM024, CM026, CM044, CM045]
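FM002: Market estimate ranges

Public pricing supports only a bottom-up range for adjacent software budgets; Goodfire's direct ACV remains undisclosed, so these figures are comparison proxies, not Goodfire revenue estimates.

All values are adjacent-market price proxies, not Goodfire prices. The usage-based rows are taken directly from Fiddler's public per-trace rate and use explicit scenarios of 100k, 1M, and 10M traces per year.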

[CM024, CM025, CM026, CM047, CM048, CM049]
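
The adjacent price proxies in the sizing table and figure note follow from simple arithmetic on the list prices quoted in this report. The sketch below reproduces that arithmetic as a check; the plan labels and function names are illustrative assumptions, the inputs are the report's quoted list prices only, and none of the outputs is a Goodfire price or revenue estimate.

```python
from decimal import Decimal

MONTHS_PER_YEAR = 12

# Monthly list prices (USD) quoted in this report for adjacent vendors.
MONTHLY_LIST_PRICE_USD = {
    "Langfuse Core": 29,
    "Arize AX Pro": 50,
    "Langfuse Enterprise": 2499,
}

# Fiddler Developer per-trace rate quoted in this report.
FIDDLER_USD_PER_TRACE = Decimal("0.002")

# Annual trace volumes named as explicit scenarios in the figure note.
ANNUAL_TRACE_SCENARIOS = (100_000, 1_000_000, 10_000_000)


def annual_list_price(monthly_usd: int) -> int:
    """Annualize a monthly list price, before any volume add-ons or discounts."""
    return monthly_usd * MONTHS_PER_YEAR


def annual_trace_cost(traces: int, usd_per_trace: Decimal = FIDDLER_USD_PER_TRACE) -> Decimal:
    """Usage-based cost for a given annual trace volume at a flat per-trace rate."""
    return traces * usd_per_trace


for plan, monthly in MONTHLY_LIST_PRICE_USD.items():
    print(f"{plan}: ${annual_list_price(monthly):,} per year")
for traces in ANNUAL_TRACE_SCENARIOS:
    print(f"Fiddler at {traces:,} traces per year: ${annual_trace_cost(traces):,.2f}")
```

Using `Decimal` for the per-trace rate keeps the scenario totals exact. The flat-rate assumption is a simplification; real invoices may include tiers or commitments that the public pages do not show.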

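2.2 Buyer tiers, budget owners, and adoption path

The clearest buyers are teams that both control an important model and can expose enough internal state for Goodfire to do meaningful work. Frontier labs top the list because they already do interpretability work, employ research and safety staff who can use the tooling, and face direct pressure to shape model behavior. Enterprise model teams come next, provided they own in-house or open-weight models and can justify specialized tooling from an AI platform or advanced engineering budget. Scientific AI teams in research-intensive fields such as genomics, biology, and robotics are especially relevant, because interpretability can test whether predictions come from real structure or from shortcuts, and can surface domain knowledge that humans can reuse. Regulated adopters have strong need, but stacked privacy, governance, and evidence requirements lengthen the sales cycle.

The payer is not always the end user. A research lead, CTO, platform lead, or scientific program owner may buy; model scientists, safety teams, and computational researchers use the product; central AI R&D, platform, or research program budgets pay. Public legal and product evidence implies a pilot-first sales motion: identify a high-stakes model problem, obtain model and data access, run interpretability or steering work in a shared environment, demonstrate a control or validation result, and only then expand into longer-term monitoring or licensing. This motion fits a high-touch, design-partner market better than a large-scale self-serve software market. [CM003, CM005, CM006, CM009, CM010, CM011]

Segment / buyer map
Segment | Buyer | User | Payer / workflow | Budget owner | Adoption trigger
Frontier labs | Chief scientist, interpretability lead, safety lead | Interpretability researchers, model scientists, safety engineers | Research programs around training control, alignment, and failure analysis | Frontier model R&D and safety budgets | Need to debug, steer, or align an internally developed frontier model
Enterprise model teams | CTO, VP Engineering, AI platform lead | Applied scientists, ML engineers, eval teams | In-house or fine-tuned model programs that need reliability or control | AI platform, infrastructure, or advanced product budgets | High-value model workflows where tracing is not enough and deeper control is required
Life sciences / scientific AI teams | Research director, computational biology lead, scientific founder | Computational scientists, modelers, translational research teams | Scientific discovery or validation workflows tied to proprietary foundation models | Research program or disease-area budgets | Need to verify that model predictions reflect real mechanisms rather than confounders
Regulated adopters | Clinical, legal, compliance, or risk executives, with a technical sponsor | Domain experts, review teams, model risk staff | Pilots around high-consequence decision support or specialized model governance | Domain budget plus governance oversight | Need transparent, auditable behavior before expanding deployment

The split between buyer, user, and payer matters because Goodfire sells a high-touch capability layer. In every segment, the best trigger is a high-value model that the customer controls well enough to inspect in depth.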

[CM028, CM029, CM030, CM031, CM032, CM033]
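FM003: Buyer / segment map

Goodfire's best near-term segments both need strong interpretability and can actually access model internals; regulated adopters have strong need but are harder to reach today.

The matrix is an evidence-based ordinal synthesis of public product, legal, research, and independent coverage. It measures relative reachability, not disclosed revenue.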

[CM028, CM029, CM030, CM031, CM036, CM046]
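FM004: Adoption funnel / value chain map

Goodfire's public materials imply a pilot-first value chain: it starts from a high-stakes model problem and expands only after model access and interpretability work have demonstrated value.

The sequence is drawn from Goodfire's legal pilot agreement, product pages, and Series B narrative. Public sources disclose no stage-by-stage conversion rates.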

[CM009, CM010, CM032, CM036, CM044]

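2.3 Growth drivers, constraints, and valuation relevance

Demand-side conditions are favorable. PwC shows that highly AI-exposed industries are generating markedly higher revenue per employee and paying a sizable skills premium, which indicates that companies will pay for tools that make AI systems more effective. At the same time, adjacent vendors repeatedly describe observability, guardrails, and evaluation as business-critical, because autonomous systems now touch revenue, operations, and user experience. This helps Goodfire because the budget conversation already exists; the company does not have to invent the importance of reliability or control from scratch. Its scientific and regulated use cases also map onto the settings where output-only evaluation is least adequate and deep interpretability is most strategically valuable.

The brakes matter just as much. Gartner says ROI varies widely and hidden implementation costs can be high. NIST-style governance expectations, data privacy rules, and clinical or scientific validation standards all slow deployment. Most importantly, Goodfire's own narrative and independent coverage agree that the field is still technically immature: the company markets precision engineering, but outside critics, and even Goodfire's own research papers, acknowledge that interpretability still has major open problems. Add the requirement for access to model internals and the absence of public pricing and customer data, and the valuation should be anchored to a selective, high-value wedge rather than to mass-market software assumptions. [CM016, CM017, CM018, CM019, CM034, CM035]

Growth drivers and constraints table
Driver / constraint | Direction | Timing | Implication | Diligence follow-up
Higher-stakes AI deployment | Upside | Current | As AI enters scientific, medical, and autonomous workflows, the market needs deeper validation and control | Ask whether existing customers use Goodfire for pre-deployment validation or for post-hoc analysis.
Productivity and labor pressure | Upside | 12-24 months | Once companies see real AI productivity gains, they are more willing to pay for tools that improve model reliability | Request proof that Goodfire shortens debugging or post-training iteration cycles enough to justify budget.
Normalization of adjacent observability budgets | Upside | Current | Tracing, eval, and guardrails are already approved budget categories, which makes budget conversations easier | Ask how often Goodfire is sold alongside LangSmith, Datadog, Langfuse, or similar platforms.
Scientific discovery upside | Upside | 12-36 months | If results are repeatable, biology and robotics use cases extend the market beyond software teams | Request revenue split and renewal evidence for scientific customers or partners.
Model access dependency | Downside | Current | Closed-model customers are harder to serve, because Goodfire needs deeper access than most API-only users can provide | Request the pipeline split across open-weight, proprietary in-house, and closed-API prospects.
Governance and validation burden | Downside | Current and rising | Regulated buyers may value interpretability most, but they also have the longest procurement cycles | Request average sales cycle by segment, plus the time spent in security or governance review.
Technical immaturity of mechanistic interpretability | Downside | 12-36 months | Debate over how close the field is to precision engineering dampens budget urgency | Request benchmark evidence that Goodfire changes production task outcomes, not just research demos.
Opaque Goodfire pricing and customer disclosure | Downside | Current | Without public price and customer data, outside investors can underwrite only a selective story, not a broad-market one | Request ACV ranges, pilot-to-license conversion, and customer counts by cohort.

The core underwriting question is not whether demand exists, but whether Goodfire can turn a real need for control into repeatable commercial deployments faster than access limits, governance friction, and field immaturity slow it down.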

[CM016, CM018, CM019, CM034, CM035, CM036]

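2.4 Charts

Chapter 03

03 Competitive Landscape

3.1 Landscape by competitor category

Goodfire occupies an unusual competitive position. Its public product language is not about post-hoc prompt monitoring or general LLM telemetry; it is about intentional model design, feature steering, targeted failure correction, and programmatic access to model internals. MIT Technology Review positions Silico as a mechanistic interpretability tool that puts techniques previously concentrated inside Anthropic, OpenAI, and Google DeepMind into the hands of smaller companies and research teams. For buyers that build or adapt open-weight models, the closest direct substitutes are therefore frontier labs' internal interpretability teams and mature in-house research teams. The broader commercial landscape is more crowded but more indirect. Arize Phoenix, LangSmith, Langfuse, Datadog, Fiddler, Arthur, and companies such as Humanloop that have since exited as independent platforms all compete for budget tied to trustworthy AI development, but their default control point is tracing, evaluation, guardrails, or governance around deployed systems, not deep editing of learned representations. The practical implication is that Goodfire should not be evaluated as one more observability dashboard; it is a new tooling layer for model builders who need to understand mechanisms before, during, and after training. [CP001, CP002, CP004, CP006, CP007, CP008]

Competitor profile table
Competitor | Category | Scale / funding signal | Target segment | Key differentiation | Key gap relative to Goodfire
Goodfire / Silico | Mechanistic-interpretability-native model design | Closed a $150M Series B at a $1.25B valuation; about $207M in disclosed total funding | Teams building or adapting open-weight and domain-specific models | Programmatic access to model internals, feature steering, data attribution, and pre-deployment failure diagnosis | Compared with adjacent tool vendors, public evidence on pricing, win rate, and installed base is sparse
Frontier lab internal interpretability teams (Anthropic / OpenAI / Google DeepMind) | Direct incumbent / in-house build | Embedded inside frontier labs, not sold as a standalone product | Frontier model builders with access to closed weights | Deepest access to proprietary models and in-house research talent | Not a commercial product for most buyers; cannot be procured as a vendor
Arize Phoenix | Adjacent open-source tracing and eval platform | Open-source product; AX Pro from $50/month, plus an enterprise tier | AI engineers building agents and LLM applications | Tracing, eval, datasets, experiments, and an open-source entry point | Focused on observability for agent development, not on mechanistic editing of model internals
Fiddler AI | Adjacent enterprise observability / guardrails vendor | Free tier, Developer plan at $0.002 per trace, enterprise deployment options | Enterprises that need AI system monitoring, policy, and governance | Unified observability, custom evaluators, real-time guardrails, SaaS/VPC/on-prem deployment options | Competes at the monitoring and control-plane layer, not the feature-level model design layer
Arthur | Adjacent lifecycle reliability and governance vendor | Enterprise AI platform, with monitoring and policy workflow proof points on its pages | Enterprises managing agents, GenAI, and traditional ML together | Continuous eval, policy, guardrails, dashboards, and oversight across the AI lifecycle | Little public evidence of mechanistic interpretability or targeted editing of model internals
Datadog LLM Observability | Incumbent observability platform | 40K LLM spans per month free; Pro from $160/month including 100K spans | Existing Datadog customers extending APM to AI delivery | Bundles agent observability with backend monitoring, experiments, data retention, and enterprise controls | Better suited to operating production AI systems than to reverse-engineering model representations
LangChain LangSmith | Adjacent developer-workflow incumbent | Free tier for development and small-scale production; paid plans scale with trace volume | Teams already building on LangChain or multi-framework agent stacks | Strong agent tracing, SDK breadth, framework adjacency, and debugging workflows | Public pages describe observability, not mechanistic model editing or training-data attribution
Langfuse | Adjacent open-source AI engineering platform | 10B+ observations per month; 100k+ engineers; free plus $29/$199/$2499 self-serve plans | Developers who want OSS tracing, eval, prompts, and production feedback loops | OpenTelemetry foundation, self-hosting, transparent pricing, and large OSS distribution | Economics and developer-workflow strengths do not automatically translate into Goodfire-style control of model internals
Humanloop (historical) | Adjacent eval / prompt management vendor | Free trial with 50 eval runs and 10K logs per month; has since joined Anthropic and shut down | Teams evaluating models and managing prompts for trustworthy LLM applications | Prompt management, eval metrics, private deployment add-ons | No longer an independent platform, which highlights category consolidation risk
Weights / Weave (historical) | Adjacent tool vendor absorbed by a frontier lab | Products wound down after the team joined OpenAI | Creators and model builders who used the earlier Weights products | Shows that AI tooling talent can be absorbed by frontier labs | No longer an active independent competitor; mainly a signal of category absorption
In-house black-box workflow | Status quo substitute / in-house build | Engineering labor plus commodity open-source or point tools | Teams unwilling to buy a new vendor category | Flexible and cheap at first: prompts, eval, fine-tuning, and guardrails can be assembled piece by piece | Keeps teams in a guess-and-check loop with no mechanistic evidence for why the model failed

The profiles deliberately mix direct, incumbent, adjacent, historical, and substitute options, because Goodfire competes for a job to be done rather than for a single analyst-defined software category.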

[CP001, CP006, CP007, CP017, CP018, CP019]
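FP001: Competitive positioning map

The ordinal positioning shows Goodfire closest to mechanistic model control; Datadog, Fiddler, LangSmith, and Langfuse score higher on breadth of deployed observability.

The X axis is mechanistic access / direct model editability, from 1 (surface observability only) to 5 (deep access to model internals). The Y axis is deployment and distribution breadth, from 1 (narrow research workflow) to 5 (broad installed base or platform reach). Scores are evidence-based ordinals synthesized from the reviewed source pages, not benchmark results.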

[CP001, CP006, CP007, CP017, CP019, CP021]

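3.2 Adjacent vendors: capabilities, packaging, and budget overlap

The adjacent vendor set is commercially relevant because it competes in the same buyer conversation about trustworthy AI, but the products usually anchor to different workflows. Arize Phoenix emphasizes open-source tracing, evaluation, datasets, and experiments for agent development. Fiddler and Arthur lean toward lifecycle observability, guardrails, policy, and governance. Datadog folds agent observability into a larger application monitoring estate, which matters because an existing installed base makes "good enough" AI oversight easier to procure than a standalone platform. LangSmith and Langfuse both emphasize developer workflows and production debugging; Langfuse in particular pairs a strong open-source posture with transparent self-serve pricing, while LangSmith promotes a free tier and billing by trace volume. Humanloop historically served development, prompt management, and evaluation for trustworthy LLM applications, but its absorption into Anthropic shows that this category can be absorbed by model labs rather than remaining independent. Against these vendors, Goodfire looks differentiated on mechanistic access and targeted model editing, but thinner on public pricing, installed base, and broadly deployed observability surfaces. [CP017, CP018, CP019, CP020, CP021, CP022]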

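Feature / capability matrix
Buying criterion | Goodfire | Frontier lab internal teams | Arize Phoenix | Fiddler AI | Arthur | Datadog | LangSmith | Langfuse | Humanloop (historical) | In-house black-box stack
Mechanistic access to model internals | Not supported / unknown | Not supported / unknown | Not supported / unknown | Not supported / unknown | Not supported / unknown | Not supported / unknown | Not supported / unknown | Limited
Targeted steering or editing of learned features | Not supported / unknown | Limited | Limited | Not supported / unknown | Not supported / unknown | Not supported / unknown | Not supported / unknown | Only via prompts or fine-tuning, with limited capability
Training-data attribution or probe workflows | Not supported / unknown | Not supported / unknown | Not supported / unknown | Not supported / unknown | Not supported / unknown | Not supported / unknown | Not supported / unknown | unknown
Production tracing / experimentation / eval loop | Limited | unknown | Partial
Real-time guardrails / policy enforcement | Limited | unknown | Limited | Limited | Limited | Limited | Limited | Partial
Open-source or self-hosted path | Limited public evidence | No commercial path | Limited | unknown | Limited | unknown | Limited
Enterprise deployment / compliance controls | Still forming / limited public proof | Internal use only | unknown | Historically strong | Depends on the in-house team
Domain-specific scientific model workflows | Limited public proof | Not supported / unknown | Not supported / unknown | Not supported / unknown | Not supported / unknown | Not supported / unknown | Not supported / unknown | Not supported / unknown | Customizable if the team builds it

Cells are qualitative judgments based on the reviewed product pages; where a capability is unsupported or undisclosed, the cell is marked unknown rather than filled in by inference.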

[CP001, CP002, CP007, CP008, CP011, CP012]
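Pricing / packaging comparison
Product | Public price / contract model | Packaging details | Included capabilities | Unknowns / discounts | Implication
Goodfire / Silico | Priced case by case; Goodfire declines to give specific pricing | Commercial engagements tailored to customer needs | Model design environment, experiment agents, mechanistic debugging and steering | No public self-serve list price or usage rate card disclosed | Buyers find ROI harder to benchmark; the product must sell on differentiated outcomes, not a transparent entry price
Arize AX / Phoenix | Free tier; AX Pro $50/month; custom enterprise | Pro includes 50k spans and 10GB per month; enterprise can choose SaaS or self-hosting | Tracing, eval, datasets, experiments, observability | Startup pricing is mentioned but details are not publicly listed | Sets a low entry bar for teams that mainly need agent telemetry and eval workflows
Fiddler AI | Free tier; Developer $0.002 per trace; custom enterprise | Developer plan adds unified observability, custom evaluators, SSO, SaaS deployment | Observability, testing and experimentation, guardrails, governance | Enterprise pricing is not public beyond the tier framework | Creates usage-based competition for security and governance budgets around deployed systems
Datadog LLM Observability | Free up to 40K LLM spans per month; Pro from $160/month including 100K spans | On-demand overage after 100K spans; discounts for M2M and annual commitments | Agent observability, evaluation, retention options, sensitive data scanning | Retention add-ons and full enterprise packaging vary with commitment | A strong incumbent bundle for teams already standardized on Datadog
LangSmith | Free tier for development and small-scale production; paid plans scale with trace volume; contact sales for enterprise | Framework-agnostic SDK access, with pay-as-you-go scaling | Agent tracing, observability, debugging | Reviewed pages do not show an exact public price | Budget overlap is strongest when teams need workflow visibility rather than control of model internals
Langfuse | Free Hobby tier; $29 Core; $199 Pro; $2499 Enterprise | Billed on units, with 50k free and 100k included in paid plans; optional $300 Teams add-on | Tracing, eval, prompts, analytics, compliance features, self-hosting option | Volume discounts beyond the listed unit tiers | Transparent pricing and an OSS posture squeeze vendors that pitch general AI engineering value
Humanloop (historical) | Free trial; enterprise / custom plans | 2 members, 50 eval runs, 10K logs per month; VPC add-on and enterprise support | Prompt management, eval, trustworthy LLM application workflows | Its independent commercial future ended with the Anthropic acquisition | Shows that adjacent platform categories can fold into frontier labs before maturing independently
In-house black-box stack | No separate software line item; internal labor plus cloud and tooling spend | Mixes prompts, eval frameworks, fine-tuning, and guardrails from existing tools | A flexible fallback when teams do not want to add vendor spend | True total cost is often hidden in compliance review, retraining, and internal overhead | The status quo stays viable unless Goodfire proves clearly better debugging, safety, or domain outcomes

Public pricing combines official list prices, tier descriptions, and unknowns that the reviewed pages state explicitly; a missing number is treated as an evidence gap, not as a hidden assumption.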

[CP018, CP020, CP023, CP024, CP026, CP027]
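FP002: Feature breadth / capability map

The capability map highlights Goodfire's relative strength in editing model internals and in domain-specific mechanistic workflows; adjacent vendors are stronger in tracing, governance, and production operations.

Scores are 1-5 ordinal judgments based only on public capability descriptions. A 5 means the strongest visible fit within the reviewed source set, not an audited market ranking. This chart is a synthesized strength map and differs from the supported / unknown matrix in TP002.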

[CP002, CP003, CP011, CP012, CP013, CP017]

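3.3 Switching costs, substitutes, and distribution power

Switching costs in this landscape are asymmetric. Once a team standardizes on Datadog, LangSmith, or Langfuse for tracing, evaluation, and production debugging, those tools become the default operating surface for AI quality work even though they do not expose model internals. That distribution advantage matters because many organizations would rather extend an existing developer or observability stack than adopt a new research-native workflow. Conversely, Goodfire's strongest use cases appear where tracing alone is not enough: open-weight model builders, safety-critical domains, and research teams that need to inspect features, attribute behavior to training data, or intervene before deployment. The main substitute remains a black-box stack assembled from prompts, benchmark evals, guardrails, and iterative fine-tuning, sometimes built in-house from open-source tools. That path is cheaper up front and more familiar, but Goodfire's argument is that it leaves teams guessing why a model behaves badly. The competitive question is whether buyers are already frustrated enough with this guess-and-check loop to move budget from observability or prompt tooling toward mechanistic model design. [CP004, CP008, CP016, CP017, CP018, CP022]

3.4 Moat durability and competitive risk

Goodfire's moat holds most easily when the buyer intrinsically values mechanistic understanding. The company can point to feature steering, data attribution, PII detection probes, and work in biology and robotics to show that model internals are useful for debugging, safety, and scientific discovery, not only for post-hoc monitoring. That gives it a more research-native product narrative than adjacent evaluation vendors. The counter-evidence also matters. MIT Technology Review quotes an outside mechanistic interpretability researcher who argues that Goodfire may simply be adding precision to today's alchemy rather than turning AI into a fully principled engineering discipline. The same article notes that Silico is most useful where customers can access model weights, which limits its applicability to closed frontier models. OnHealthcare also describes the company as a 51-person, research-first organization whose valuation is aggressive relative to disclosed commercial traction. The highest-risk scenarios are therefore clear: larger observability vendors add explain-and-steer features, frontier labs keep their deepest interpretability advantages in-house, or customers decide that trace-level control is enough. Goodfire can still win if it becomes the default model design layer for open-weight and domain-specific AI programs, but public win-rate, pricing, or retention evidence has not yet demonstrated that durability. [CP005, CP007, CP008, CP009, CP010, CP011]

Moat durability / competitive risk register
Moat or risk claim | Supporting evidence | Counter-pressure | Severity | Mitigation / diligence question
Mechanistic interpretability is Goodfire's clearest product moat | Silico, feature steering, data attribution, Llama steering, and probe research all point to direct intervention on model internals | Frontier labs also do mechanistic interpretability in-house, and outsiders question how systematic the workflow really is today |  | Request customer evidence that mechanistic workflows change deployment or training decisions in ways observability tools cannot
Goodfire is strongest when customers can inspect open-weight or adaptable models | MIT says Silico works best when teams can access model internals; Goodfire sells a model design environment for training and debugging | Closed frontier models limit applicability; many enterprise buyers still call APIs from black-box vendors |  | Split the customer base by open-weight versus API-only deployments, and request roadmap evidence for closed models
Adjacent observability vendors can capture much of the AI quality budget | Arize, Fiddler, Datadog, LangSmith, Langfuse, Arthur, and Humanloop all sell tracing, eval, guardrail, or governance capabilities | These tools do not clearly solve feature-level debugging or data attribution, which leaves room for a deeper design layer |  | Test whether Goodfire sits under a separate budget owner or has to displace observability spend
Transparent self-serve pricing from other vendors makes Goodfire's opaque pricing a sales risk | Arize, Fiddler, Datadog, and Langfuse publish entry prices, while Goodfire uses case-by-case commercial terms | If buyers see Goodfire as just another tool vendor rather than a differentiated research layer, price discovery will look unfavorable | Medium-high | Request realized deal prices, pilot-to-production conversion, and average time to first value
Research breadth becomes a moat only once it is productized | Goodfire cites results across hallucination reduction, PII detection, biological discovery, and diffusion search | A research portfolio that is too broad also creates focus risk and slows repeatable product packaging | Medium-high | Ask what share of roadmap and headcount is tied to reusable product rather than bespoke research projects
Category consolidation is a real threat | Humanloop joined Anthropic and ceased operations; the Weights team wound down after joining OpenAI | Frontier labs may absorb adjacent capabilities and talent faster than startups can scale independently |  | Assess whether Goodfire is more likely to become a durable platform, a feature in another stack, or an attractive acquisition target
Governance and trust requirements favor Goodfire only if buyers believe interpretability adds incremental value over observability | NIST AI RMF and Gartner both reinforce governance, evaluation, and hidden operating cost concerns in sensitive AI systems | The same concerns also strengthen guardrail and observability incumbents such as Fiddler, Arthur, and Datadog |  | Verify whether regulated buyers explicitly require mechanistic evidence, or are satisfied by trace-level control and policy enforcement

Severity reflects the competitive pressure on Goodfire itself, not the absolute quality of any vendor; mitigation requests focus on evidence missing from the public record.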

[CP005, CP007, CP008, CP009, CP010, CP011]
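FP003: Moat / readiness KPIs

Compact KPIs summarize the commercial and competitive boundaries around Goodfire's moat: large research funding, opaque pricing, adjacent free tiers, and direct pressure from frontier labs' internal teams.

The KPI items deliberately mix funding, price-floor, and packaging signals, because Goodfire's competitive durability is shaped by both technical differentiation and the economics of adjacent tools.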

[CP005, CP010, CP018, CP020, CP023, CP026]
Chapter 04

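04 Financials

4.1 Revenue model and pricing surface: software is visible, economics are not

Public evidence supports the existence of a commercial product at Goodfire, but not a public price list. Goodfire's official surfaces describe Silico as a model design environment and a workspace for training and debugging models on Goodfire infrastructure; vertical pages repeatedly invite teams that train or fine-tune foundation models to request access rather than enter a self-serve checkout flow. The contact page goes further and says the platform is already used by Fortune 500 companies, major healthcare institutions, and AI research labs. These statements support the existence of an enterprise product and an enterprise target market, but they do not disclose the commercial terms those customers actually accept.

The legal documents make the pricing posture clearer. Both the master services agreement and the terms of service push commercial economics into negotiated order forms. The terms explicitly contemplate fees, overage charges when usage exceeds contracted allowances, and dashboards or usage reports that are authoritative for billing. The pilot agreement separately states that pilot access is for internal evaluation only and that a separate commercial license is required after the evaluation period. Together these point to a monetization stack built on custom contracts rather than public list prices: pilot fees, commercial platform fees, usage-based overage, and potential additional service fees.

The part investors actually need to underwrite is still unavailable. The reviewed public pages disclose no list price, minimum annual commitment, support-tier pricing, discount schedule, or realized price by customer type. The pricing / monetization table therefore separates verified commercial mechanics from missing economics. It is not unusual for enterprise AI infrastructure to keep prices private, but it means outside readers cannot infer ACV, customer tiering, or software gross margin from official surfaces alone. The right conclusion is not that Goodfire has no revenue, but that Goodfire has chosen a negotiated, opaque commercial posture. [CI007, CI008, CI009, CI012, CI013, CI014]

Revenue sources table
Source | Mechanism | Unit | Current value / status | Quality | Diligence question
Pilot programs | Evaluation access under a pilot agreement, ahead of a full commercial license | Pilot fee / pilot term | Pilot fees exist in order forms; public amounts undisclosed | Medium on existence, low on value | Provide signed pilot order forms, fee schedules, and the share that converted to commercial contracts.
Silico commercial platform access | Order-form-based access to the hosted platform, APIs, tools, documentation, and related software | Annual contract or custom license | Commercial fees exist in order forms; no public list price | Medium on mechanism, low on pricing | Provide standard order forms, ACV ranges, minimum commitments, and billing basis.
Usage overage | Charges for usage above contracted allowances, under the terms of service | Usage units above the allowance | Overage is explicitly contemplated; trigger unit and price undisclosed | Medium on mechanism, low on realized economics | Disclose the metering unit, included allowances, overage rates, and customer usage mix.
Support / field engineering / research services | Technical assistance, field engineering, collaborative activities, and deliverables beyond platform use | Project, retainer, or services statement of work | Services are contractually available; public pricing and attach rate undisclosed | Medium on existence, low on margin structure | Disclose services revenue share, pricing method, utilization, and gross margin.
Life sciences discovery programs | Platform plus embedded interpretability work for scientific discovery partners such as Prima Mente | Custom project | Named proof points exist; no contract value or renewal data disclosed | Low current revenue contribution | Provide contract values, renewal status, and whether these programs convert to recurring software revenue.
Enterprise design partnerships | Selective engagements with frontier or high-stakes AI teams | Custom engagement | Officially described as selective and request-access; no public contract economics | Low current revenue quality | Provide the number of design partners, the share converting to production contracts, and realized annual spend per account.

Verified mechanisms come from the legal documents and official product pages. Current value / status is deliberately qualitative because Goodfire has not disclosed its revenue mix or realized pricing.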

[CI007, CI008, CI009, CI012, CI013, CI014]
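Pricing / monetization table
Product / path | Price / unit / contract | List vs realized | Discounts / unknowns | Source
Silico commercial license | No public amount disclosed | No public list price; negotiated realized prices only | Minimum commitment, contract term, and seat or compute basis unknown | Official product pages + MSA/TOS
Pilot agreement | Pilot fee set in the order form; amount undisclosed | No public list price | Evaluation term, conversion credit, and pilot success criteria unknown | Pilot Agreement
Usage overage | Overage charged above the included allowance; unit not public | Realized price only | Rate card, thresholds, and real usage drivers unknown | TOS
Support / field engineering | No public price disclosed | Realized price only | Unknown whether bundled, invoiced separately, or included in an enterprise tier | TOS + MSA
Compliance-ready enterprise deployment | SOC 2 / SOC 3 support procurement readiness but do not set a price | Not a price point | Unknown whether a security / compliance premium is monetized directly | SOC 2 blog + contact page
Deprecated demo / API preview | No current public commercial price; the preview API was deprecated in February 2026 | Historical preview removed from the public surface | Unknown whether any self-serve pricing is retained privately | Feature steering blog

This table separates disclosed commercial mechanics from undisclosed economics. Official pricing is largely absent; every public path leads to a custom contract.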

[CI008, CI013, CI014, CI015, CI018, CI020]
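
The fee mechanics described in the legal documents (a negotiated platform fee, an included usage allowance, overage above that allowance, and optional services) can be written down as a simple billing formula. The sketch below is a hypothetical illustration of that structure only. Every field name and number is a placeholder invented for the example; Goodfire has not disclosed its metering unit, allowances, rates, or fee levels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderForm:
    """Hypothetical negotiated terms; none of these values come from Goodfire."""
    platform_fee_usd: int       # placeholder annual platform fee
    included_units: int         # placeholder contracted usage allowance
    overage_rate_usd: int       # placeholder price per unit above the allowance
    services_fee_usd: int = 0   # placeholder fee for support or field engineering


def annual_invoice_usd(order: OrderForm, metered_units: int) -> int:
    """Platform fee + overage on usage above the allowance + services fee."""
    overage_units = max(0, metered_units - order.included_units)
    return (
        order.platform_fee_usd
        + overage_units * order.overage_rate_usd
        + order.services_fee_usd
    )


# Purely illustrative terms and usage levels.
example = OrderForm(
    platform_fee_usd=100_000,
    included_units=1_000,
    overage_rate_usd=50,
    services_fee_usd=20_000,
)
print(annual_invoice_usd(example, metered_units=800))    # under the allowance
print(annual_invoice_usd(example, metered_units=1_200))  # 200 units of overage
```

The point of the structure is that realized revenue per account depends on four privately negotiated inputs, which is why the diligence requests in the tables above ask for order forms, metering units, and overage rates rather than a single price.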
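FI001: Revenue model bridge

Shows the publicly visible revenue architecture, from selective customer acquisition through platform usage and services, and marks where realized pricing and gross margin drop out of public view.

Public sources verify the nodes and commercial mechanics, but not realized amounts, contract sizes, or gross margin. This is a structural bridge, not a quantified waterfall.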

[CI007, CI008, CI009, CI012, CI013, CI014]

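4.2 GTM motion and unit economics: high-touch deployment, low public visibility

Goodfire's public GTM looks selective and high-touch. The Series B post says the company works deeply and selectively with teams building high-stakes or frontier systems; the contact page describes a platform used by large enterprises, healthcare institutions, and AI research labs. Customer story material shows why this matters financially: in the Prima Mente engagement, Goodfire researchers embedded with the customer team and built a biomarker discovery pipeline around the customer's model. The terms of service also describe support, technical assistance, field engineering support, research activities, collaborative activities, and deliverables. Taken together, these sources indicate that at least some deployments are not pure per-seat software subscriptions; they very likely combine platform access with bespoke scientific or engineering work.

This cuts both ways. On the positive side, embedded work can accelerate design-partner conversion, widen the product moat, and support premium enterprise pricing. It also lets Goodfire operate in high-stakes domains where customers need interpretive help and not just a dashboard. On the negative side, services-heavy revenue usually scales more slowly and tends to carry weaker gross margins than pure software. Public sources do not say how much of Goodfire's revenue, if any, comes from software usage, annual licenses, pilots, or research services. They also disclose no customer count, pilot-to-production conversion, sales cycle length, retention, CAC, or payback period.

The technical proof points are meaningful but are not financial metrics. Goodfire's RLFR research claims a 58% reduction in hallucinations at roughly 90x lower cost than an LLM-as-a-judge approach, and the life sciences case studies show credible customer value stories in diagnostics and scientific discovery. These are strong commercialization narratives, but they are not the same as disclosed revenue quality. The unit economics bridge can therefore only be qualitative. It shows a possible path from selective design partnerships to contracted software and overage charges, while making clear that the actual value at each step remains private. [CI009, CI010, CI011, CI012, CI015, CI016]

Unit economics table
Metric | Value / null | Confidence | Why it matters | Diligence question
Silico public list price | null |  | Without a list price or starting price, outsiders cannot frame ACV or customer tiering. | Request the current price card, or anonymized sample quotes by deployment type.
Average contract value (ACV) | null |  | ACV is needed to translate selective design-partner traction into revenue scale. | Provide the ACV distribution for pilots, enterprise subscriptions, and strategic partnerships.
Usage gross margin | null |  | Consumption software can carry high margins, but embedded compute or human delivery compresses them. | Provide gross margin by platform usage revenue line and by services revenue line.
Services revenue share | null |  | A high services share changes scalability and the valuation framework. | Disclose the software versus services revenue mix for the trailing twelve months.
Pilot-to-production conversion | null |  | In a selective enterprise GTM model, this is the clearest proxy for revenue quality. | Provide the number of pilots started, converted, and churned.
Sales cycle length | null |  | Enterprise and healthcare procurement cycles are long and can delay revenue recognition and cash collection. | Disclose the median cycle from first contact to signed order form, by customer segment.
CAC payback period | null |  | This metric is required to judge whether a high-touch GTM is economically durable. | Provide fully loaded CAC and gross-margin payback period by cohort.
Retention / expansion | null |  | Overage and usage growth are meaningful only if accounts renew and expand. | Provide logo retention, gross retention, and expansion rate for paying accounts.

A null value means the metric is not publicly disclosed in the reviewed sources; it does not mean the metric is zero or unimportant.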

[CI012, CI016, CI022, CI023, CI029, CI030]
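
Two of the missing metrics above have standard definitions that diligence could apply as soon as management supplies the inputs. The sketch below shows those definitions under common conventions (conversion as converted pilots over pilots started, and payback as CAC divided by monthly gross profit per account). The function names are illustrative, and returning `None` when inputs are missing or unusable mirrors the table's rule that null is not zero. No Goodfire figures are used or implied.

```python
from typing import Optional


def pilot_conversion_rate(pilots_started: int, pilots_converted: int) -> Optional[float]:
    """Share of started pilots that converted to a commercial contract."""
    if pilots_started <= 0:
        return None  # no pilots on record: the rate is undefined, not zero
    return pilots_converted / pilots_started


def cac_payback_months(
    cac_usd: float,
    monthly_revenue_per_account_usd: float,
    gross_margin: float,
) -> Optional[float]:
    """Months of gross profit from one account needed to recover its CAC."""
    monthly_gross_profit = monthly_revenue_per_account_usd * gross_margin
    if monthly_gross_profit <= 0:
        return None  # payback is undefined without positive gross profit
    return cac_usd / monthly_gross_profit


# Purely hypothetical inputs, shown only to exercise the formulas.
print(pilot_conversion_rate(pilots_started=10, pilots_converted=4))
print(cac_payback_months(cac_usd=120_000, monthly_revenue_per_account_usd=20_000, gross_margin=0.5))
```

Because services-heavy delivery lowers gross margin, the same CAC and revenue per account produce a longer payback, which is why the table asks for margin by revenue line rather than a blended figure.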
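FI002: Unit economics bridge

A qualitative bridge from acquisition motion to blended economics, highlighting where public evidence ends and diligence requests must begin.

The bridge is deliberately qualitative because Goodfire has not disclosed ACV, CAC, payback period, retention, or gross margin.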

[CI012, CI014, CI015, CI016, CI022, CI023]

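4.3 Capital adequacy and financing: funding is verified, runway is not

The strongest financial facts in the public record are financing facts. Goodfire announced a $50 million Series A in April 2025 and a $150 million Series B at a $1.25 billion valuation in February 2026. SEC Form D filings make these announcements more precise. The 2025 filing shows $52,029,991 sold following a first sale on 2025-04-02; the 2026 filing shows $149,999,796 sold against a total offering amount of $161,674,124, following a first sale on 2025-12-17. On this narrow basis, at least $202.0 million of equity sales across the two disclosed rounds can be verified directly from primary filing data; public commentary puts total funding, including earlier capital, at slightly above $200 million.

This fact pattern supports one clear conclusion: Goodfire has strong access to capital. It does not answer the core capital adequacy question. The reviewed public sources disclose no cash on hand, monthly burn, months of runway, debt covenants, or triggers for the next round. The stated use of proceeds is broad: frontier research, next-generation product development, and expanded partnerships in AI agents and life sciences. These are real uses of cash, but the denominator is missing, so runway cannot be derived. Even one relatively small additional clue in the 2026 Form D, a total offering amount above the announced amount sold, indicates only that the round may have capacity or a reserve, not that cash is actually still available.

The financial estimate range therefore stays restrained and frames only sourced financing facts. It does not invent revenue, burn, or runway. Likewise, the capital intensity map shows where cash may be going (product, research, embedded delivery, and enterprise compliance) while preserving the distinction between recorded financing and inferred cost structure. This is the right evidence-constrained position: financing is verified, but capital adequacy beyond financing cannot be underwritten from public data. [CI001, CI002, CI003, CI004, CI005, CI006]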
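
The Form D arithmetic in this section can be checked mechanically. The sketch below uses only the filing amounts quoted above; the dictionary layout and variable names are illustrative assumptions. The "unsold headroom" figure is simply total offering minus amount sold in the 2026 filing and says nothing about cash actually available.

```python
# Form D amounts in USD, as quoted in this report.
FORM_D_FILINGS = {
    "2025 filing": {"amount_sold": 52_029_991},
    "2026 filing": {"amount_sold": 149_999_796, "total_offering": 161_674_124},
}

# Equity sales verifiable from the two filings (excludes the seed round).
verified_equity_sold = sum(f["amount_sold"] for f in FORM_D_FILINGS.values())

# Gap between total offering amount and amount sold in the 2026 filing.
filing_2026 = FORM_D_FILINGS["2026 filing"]
unsold_headroom = filing_2026["total_offering"] - filing_2026["amount_sold"]

print(f"Verified equity sold: ${verified_equity_sold:,}")
print(f"2026 unsold headroom: ${unsold_headroom:,}")
```

The first figure is the basis for the "at least $202.0 million" statement; adding the $7 million seed round from the funding history gives the ~$207 million cumulative total used elsewhere in the report.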

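Capital adequacy table
Item | Public value / status | Confidence | Why it matters | Diligence question
Verified Series A financing | $50M announced; Form D shows $52.029991M sold |  | Primary evidence confirms external financing completed in 2025. | Reconcile the announced round size with the cap table closing documents.
Verified Series B financing | $150M announced at a $1.25B valuation; Form D shows $149.999796M sold against a $161.674124M total offering |  | Primary evidence confirms the large 2026 financing, and some offering capacity may remain. | Provide the final closing schedule and whether any unsold allocation is still available.
Cumulative disclosed capital since the Series A | 2025-2026 Form D filings total at least $202.029787M sold; public commentary puts overall backing above $200M |  | This is the strongest public anchor for capital adequacy. | Provide cumulative funding including the seed round, and remaining unrestricted cash.
Cash on hand | null |  | A cash balance is needed to turn financing history into actual runway. | Provide current unrestricted cash and short-term investments.
Monthly burn | null |  | Without burn, no public runway estimate holds up. | Provide net burn for the past six months and planned spend by function.
Months of runway | null |  | After a large raise, runway is the core adequacy metric. | Provide management's runway view under base and downside scenarios.
Planned use of proceeds | Frontier research, next-generation core product, and expanded partnerships in AI agents and life sciences |  | Confirms that capital is going into both R&D and GTM, not just sitting on the balance sheet. | Provide the board-approved use-of-proceeds model, including timing and budget buckets.
Debt / project financing obligations | No public debt or project financing obligations found in the reviewed sources |  | Non-disclosure does not mean zero leverage, but no public obligation surfaces here. | Provide the debt schedule, venture debt terms, leases, and any committed compute obligations.

This table separates verified financing facts from unavailable liquidity metrics. A null value reflects missing public disclosure, not a negative finding.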

[CI001, CI002, CI003, CI004, CI005, CI006]
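
Runway is a ratio of cash to net burn, and both inputs are null in the public record. A minimal sketch of the calculation follows, assuming the usual definition of runway as unrestricted cash divided by monthly net burn. The function name and the sample inputs are illustrative assumptions; the real inputs would have to come from management under NDA.

```python
from typing import Optional


def runway_months(
    unrestricted_cash_usd: Optional[float],
    monthly_net_burn_usd: Optional[float],
) -> Optional[float]:
    """Months of runway, or None when either input is undisclosed."""
    if unrestricted_cash_usd is None or monthly_net_burn_usd is None:
        return None  # missing disclosure: do not substitute a guess
    if monthly_net_burn_usd <= 0:
        return float("inf")  # not burning cash on a net basis
    return unrestricted_cash_usd / monthly_net_burn_usd


# Public record today: neither cash nor burn is disclosed.
print(runway_months(None, None))

# Purely hypothetical inputs, shown only to exercise the formula.
print(runway_months(unrestricted_cash_usd=120_000_000, monthly_net_burn_usd=5_000_000))
```

Funds raised are not a substitute for the numerator: cash already spent since each closing is unknown, so plugging the financing totals into this formula would overstate runway.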
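FI003: Financial estimate ranges

The sourced financial range is limited to financing facts; revenue, burn, and runway are excluded because they are not publicly disclosed.

Low / base / high values align to announced financing, Form D amounts sold, and broader public commentary on total backing. The chart does not invent ranges for revenue, burn, or runway.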

[CI001, CI002, CI003, CI004, CI005, CI034]
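FI004: Capital intensity / cash flow map

The matrix shows which public capital evidence already exists and which operating cash evidence is still missing.

This is a structured evidence map, not a quantified cash flow statement. Its purpose is to separate verified financing from missing operating liquidity data.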

[CI006, CI019, CI020, CI037, CI038]

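4.4 Financial conclusions and public gaps: funding verified, monetization inferred, underwriting still open

The evidence supports a precise but narrow conclusion. Goodfire is not financially unformed: it has a verified enterprise product surface, real outside financing, named partners in regulated and frontier domains, enterprise security credentials, and commercial contracts that contemplate fees, overage, and usage metering. These are the elements of a real business. But almost every metric needed to judge revenue quality and margin trajectory remains private. There is no public revenue, no public ARR, no gross margin disclosure, no cash balance, no burn rate, no runway, and no debt schedule.

The gap matters because the likely business model is a hybrid. If usage and overage charges dominate, the software platform can become a valuable recurring revenue layer. But customer evidence and the terms of service imply that at least part of the current product includes embedded scientific and engineering labor. Without the software versus services split, investors cannot tell whether Goodfire should be valued more like enterprise infrastructure software, like a specialized applied research services firm, or like a hybrid that starts services-heavy and software-light and matures over time.

The counter-reading is straightforward. One skeptical industry analysis argues that a $1.25 billion valuation is aggressive for a company with early commercial traction and no predictable SaaS profile yet. Against the public data, the criticism is directionally valid: capital is disclosed, but the operating model is not. The underwriting answer is therefore to separate what is verified from what is inferred. Verified: financing, enterprise contract mechanics, security readiness, and selective customer traction. Inferred: monetization mix, gross margin trajectory, and runway durability. The gaps table below captures the specific diligence requests; only once they are filled can the financial thesis move from "plausible" to "underwritable". [CI017, CI018, CI020, CI025, CI029, CI030]

Public financial gaps table
Missing private metric | Impact on underwriting | Precise diligence path
Revenue / ARR by quarter | Cannot test whether the valuation is supported by actual commercial scale. | Request a monthly recurring revenue bridge, quarterly revenue, and the trailing-twelve-month ARR movement schedule.
Realized price by customer type | Cannot distinguish premium software economics from services-heavy bespoke projects. | Request anonymized signed order forms and sample invoices covering enterprise, healthcare, and research customers.
Software versus services revenue mix | Cannot underwrite the gross margin trajectory or scalability. | Request management's trailing-twelve-month split of platform, overage, pilot, and services revenue.
Gross margin and contribution margin | Cannot assess whether consumption and embedded delivery costs support durable unit economics. | Request gross margin by revenue line, with compute, support, and personnel cost buckets.
Cash balance and burn | Despite the size of recent rounds, runway and next-round funding needs cannot be estimated. | Request cash, debt, net burn, and planned hiring / research spend for the next 24 months.
Sales efficiency and retention | Cannot judge whether selective GTM converts into repeatable enterprise software economics. | Request pipeline conversion, sales cycle, CAC, payback period, logo retention, and expansion metrics.

Every row here is a material diligence blocker, not an incidental omission. These gaps are why this chapter remains evidence-constrained.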

[CI029, CI030, CI031, CI037, CI038, CI040]
Chapter 05

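05 Product and Technology

5.1 Product definition and customer workflow

Goodfire's commercial surface is best understood as a model design environment, not a general LLM observability dashboard. Silico is described as the first platform for intentional model design, a workspace for training and debugging models on Goodfire infrastructure, and a system that productizes interpretability around concrete tasks: looking inside predictions, running health checks, debugging failures, shaping behavior, and improving generalization. In practice the product sits closer to the model development loop than standard application-layer analytics does.

The customer workflow is also unusually high-touch. Public pages repeatedly push teams toward request-access or partnership motions rather than a self-serve onboarding path. In practice, the workflow starts with a model team that already controls weights, activations, or at least enough internals for Goodfire to inspect model behavior. Goodfire then pulls models, datasets, prompts, workflows, and evaluation tasks into a shared workspace, runs agent-assisted experiments, and turns the resulting mechanistic findings into interventions such as steering, diagnostics, data filtering, or reward shaping.

Vertical pages show the same loop repeating across domains. Language teams use the stack to reduce hallucinations; life sciences teams use it to extract biomarker and variant hypotheses from model internals; robotics and vision teams use it to catch brittle features and leakage before deployment. The result is a highly workflow-specific product, but one that still depends on customers being willing to work in a shared, research-adjacent environment rather than through a mature commodity API surface. [CE001, CE002, CE003, CE004, CE005, CE006]

Product module / asset matrix
Module / asset / product line | Primary user | Status / maturity | Differentiation | Diligence gap
Silico shared workspace | Frontier labs and enterprise model teams | Live product surface; access is gated | Packages interpretability around a model design environment rather than an application-layer dashboard | No public tenancy model, API reference, or deployment architecture
Model scientist agent / experiment orchestration | Researchers and model engineers | Live internally and described publicly in launch materials | Automates experiment planning and execution inside the same workspace | Human review rules, guardrails, and customer autonomy levels are not public
Diagnostics and health checks | Training, eval, and safety teams | Live workflow | Claims to surface bottlenecks, feature collapse, shortcut learning, and rare failures before deployment | No precision/recall or benchmark coverage published by model class
Steering and intervention controls | AI engineers adjusting model behavior | Live, but still evolving after the preview tooling was deprecated | Direct feature steering, reward shaping, and data-filtering-style edits | Supported model matrix, rollback controls, and commercial packaging remain private
Language reliability workflow | Open-model or fine-tuning teams | The most concrete public workflow | A claimed 58% hallucination reduction, plus a rollout viewer for reviewing interventions | Evidence is strong but still concentrated in Goodfire's self-selected case studies
Scientific discovery workflow | Genomics and life sciences researchers | Advanced partner workflow | Turns model internals into biomarkers, pathogenicity probes, and human-readable variant hypotheses | Clinical validation and regulatory path still depend on the specific partner
Physical AI / creative workflow assets | Robotics, vision, and image model teams | Partner workflow or research preview | Extends the same interpretability primitives to policy bottlenecks, leakage detection, and a latent editing UI | Commercial status and repeatability beyond the case studies are not public

Rows merge public product modules and workflow assets, because Goodfire sells the platform through problem-oriented surfaces rather than a public SKU table.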

[CE001, CE002, CE003, CE004, CE007, CE011]
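Workflow / use case table
User task | Current workflow | Goodfire approach | Measurable benefit | Limitation
Reduce LLM hallucinations before deployment | Prompt tuning, judge loops, and post-hoc output review | RLFR, feature steering, and the Hallucinations Viewer inside the model design environment | Claims a 58% hallucination reduction at roughly 90x lower intervention cost than LLM-as-judge | Evidence is tied to a specific workflow and is not a general performance guarantee
Debug frontier reasoning model behavior | Prompt hacks and coarse response benchmarks | Reasoning-model SAE on R1, a feature database, and timing-aware steering | Shows reasoning-specific features such as backtracking, and exposes steering limits at scale | The case requires weight or activation access and experts to handle model-specific behavior
Extract biomarkers from scientific models | Black-box prediction review and wet-lab triage | Embedded interpretability work on the customer's model using SAEs, tracing, and ablation | Found a new class of Alzheimer's biomarker and produced a human-readable classifier that generalizes to an independent cohort | Downstream experimental validation is still required
Interpret genome-wide variant effects | Opaque pathogenicity scores and coding-region-only tools | Evo 2 embeddings, plus probes and reasoning-model synthesis through EVEE | Reaches 0.997 AUROC on 839k ClinVar variants and generates structured hypotheses for 4.2M variants | Outputs are research hypotheses, not diagnostic or regulatory-grade evidence
Catch robotics or vision failures before deployment | Waiting for benchmark misses or production failures to appear | Pre-deployment inspection of latent policy structure, geometry, and leakage | Locates bottlenecks, unused observation signals, and ECG leakage in the reviewed case studies | Public evidence comes from case studies, not product documentation
Edit image model behavior directly | Repeated trial and error in the prompt box | The Paint With Ember canvas manipulates latent activations and concept weights | Supports adding, moving, and reshaping concepts rather than only rewriting prompts | More a research preview than a core commercial SKU

Benefit entries mix results claimed directly by research with specific workflow demonstrations. Goodfire has not published customer-level ROI, conversion, or usage frequency metrics for these flows.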

[CE004, CE005, CE006, CE009, CE010, CE012]
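FE002: Customer workflow / operating flow

The public workflow starts with a partner-led access motion, moves into shared interpretability experiments, and ends in targeted model steering or design decisions.

This operating flow synthesizes recurring patterns across the language, life sciences, robotics, and launch materials. Public sources disclose no formal buyer playbook or conversion funnel.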

[CE003, CE004, CE007, CE011, CE016, CE033]

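5.2 Interpretability primitives and operating architecture

Goodfire's architecture pairs a shared experiment workspace with a research stack covering activation analysis, geometry discovery, parameter decomposition, and intervention tooling. The official research surface shows sparse autoencoders, probes, and manifold methods doing the early work of exposing interpretable features; the neural geometry work argues that many important concepts live on curved internal manifolds rather than along single directions; stochastic parameter decomposition pushes the stack deeper, into the weights, where Goodfire tries to identify which causal components can be removed without changing outputs. The combination indicates that the platform is not a single technique but a layered toolkit for interpreting, localizing, and editing model behavior.

The R1 work is a particularly clear example of capability and friction coexisting. Goodfire says it trained the first public sparse autoencoder on a frontier reasoning model and built custom inference and interpreter-model infrastructure to do it. The same work also shows that steering a reasoning model is not plug-and-play: interventions have to happen after the model's default answer prefix, and some overly strong steering makes behavior snap back to the original answer. This strengthens the core product claim rather than weakening it: the whole point of Silico is to expose these hidden operating constraints before a customer ships or retrains blindly.

The architecture also explains Goodfire's dependency stack. The deepest workflows require access to model internals, so open-weight or customer-controlled models fit better than closed API endpoints. It also explains why Goodfire can reuse the same core ideas across domains. EVEE, the Alzheimer's biomarker work, Paint With Ember, and the robotics bottleneck analysis all follow the same pattern: extract internal structure, translate it into something readable, and use that understanding to debug, steer, or design the model more intentionally. [CE013, CE014, CE018, CE019, CE020, CE021]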
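
For readers unfamiliar with the primitives named above, the toy sketch below shows the generic textbook shape of a sparse autoencoder (encode an activation vector into non-negative feature activations, decode back) and of feature steering (add a multiple of one feature's decoder direction to the activation). It is not Goodfire's implementation. The dimensions, weights, and function names are all made up for illustration, and a real SAE would be trained on model activations with a sparsity penalty.

```python
def relu(values):
    """Elementwise max(0, x)."""
    return [max(0.0, v) for v in values]


def sae_encode(x, w_enc, b_enc):
    """Feature activations f = ReLU(W_enc x + b_enc)."""
    pre = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(w_enc, b_enc)]
    return relu(pre)


def sae_decode(features, w_dec, b_dec):
    """Reconstruction x_hat = b_dec + sum_j f_j * d_j, one decoder direction d_j per feature."""
    x_hat = list(b_dec)
    for activation, direction in zip(features, w_dec):
        for i, d in enumerate(direction):
            x_hat[i] += activation * d
    return x_hat


def steer(x, direction, strength):
    """Shift an activation vector along one feature's decoder direction."""
    return [xi + strength * di for xi, di in zip(x, direction)]


# Toy 2-dimensional "model activation" with 2 hand-written features.
W_ENC = [[1.0, 0.0], [0.0, 1.0]]
B_ENC = [0.0, 0.0]
W_DEC = [[1.0, 0.0], [0.0, 1.0]]
B_DEC = [0.0, 0.0]

x = [1.0, -2.0]
features = sae_encode(x, W_ENC, B_ENC)            # feature 1 is inactive (ReLU clips -2.0)
reconstruction = sae_decode(features, W_DEC, B_DEC)
steered = steer(x, W_DEC[1], strength=3.0)        # push along feature 1's direction
steered_features = sae_encode(steered, W_ENC, B_ENC)
print(features, reconstruction, steered, steered_features)
```

The sketch also hints at why the text stresses model access: both the encode step and the steering step operate on internal activations, which a closed API endpoint does not expose.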

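Technical / operating architecture table
Layer / flow / component | Role | Dependency | Risk
Customer model and material ingestion | Brings weights, datasets, files, code, prompts, and workflows into the workspace | The customer must control or expose enough internals for analysis | Closed API models and restrictive data-sharing rules can block the deepest workflows
Shared workspace and agent orchestration | Runs experiments on Goodfire infrastructure, captures outputs, and coordinates interpretability tasks | Goodfire compute, inference, and agent tooling | Tenancy, regional layout, and review / approval controls are not public
Activation interpretability layer | Locates model features and signals using SAEs, probes, and related tools | Activation access, plus trained interpreter models | Linear feature methods may miss globally curved structure
Geometry / manifold layer | Recovers structured concept spaces so that understanding and control are smoother | Clustering and geometry discovery pipelines over internal representations | Research maturity is high, but the boundary of the packaged product is not fully public
Parameter decomposition layer | Splits weights into causal components for inspection, rather than only observing activations | SPD-style decomposition and masking methods | Scalability, running cost, and product packaging are still partly at the research stage
Monitoring and failure surfacing layer | Catches rare post-training failures using amplified sampling and eval-aware analysis | Before / after checkpoints, release analysis, and judge infrastructure | Monitoring findings may depend on prompt design and may not generalize automatically
Intervention and steering loop | Applies feature steering, filtering, reward shaping, and targeted model edits | Edit permissions, rollback discipline, and model-specific heuristics | Mistimed or excessive steering can teach a reasoning model to route around the intervention
Services and commercial delivery layer | Provides support, technical assistance, field engineering, and research collaboration around the platform | Order forms, Goodfire staff, and partner workflows | High-touch delivery slows expansion and obscures how much value comes from software versus services

This is an evidence-backed operating architecture, not an official engineering diagram. It separates the public method layers from undisclosed infrastructure details such as tenancy, vendor stack, and data residency.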

[CE018, CE020, CE021, CE022, CE023, CE024]
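FE001: Product architecture map

Silico stacks customer-controlled model access, shared experiments, interpretability primitives, and intervention tooling into a single model design environment.

The stack is inferred from product pages, research posts, launch coverage, and legal terms. Goodfire has not published a standard architecture diagram or a vendor-by-vendor infrastructure map.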

[CE001, CE016, CE018, CE023, CE026, CE033]
FE003: 关键依赖地图

Silico 依赖客户开放模型内部、Goodfire 控制的实验基础设施、合同订单表和特定领域合作伙伴上下文。

这张图综合了公开产品、法律和发布材料。它强调一个实际依赖:客户能够暴露模型内部,而不是只调用不透明 API 时,Goodfire 效果最好。

[CE017, CE033, CE034, CE036, CE037, CE038]

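5.3 Trust, quality, and compliance posture

Goodfire's public trust posture is more mature on enterprise security than on public operational transparency. The strongest visible procurement signal is the company's SOC 2 Type II announcement, which says the audit completed with no exceptions and is accompanied by a public SOC 3 summary. Healthcare-facing materials add another layer, describing Mayo-specific privacy protocols and a governance framework intended to reduce spurious correlations and improve clinical relevance. For buyers in regulated environments, these are meaningful indicators.

The legal surface, however, makes clear that many operating details investors and enterprise architects would normally want to inspect remain private. The terms of use define the platform broadly, covering software, APIs, tools, documentation, support, and services, but the actual economics sit in negotiated order forms. Usage reports are authoritative for billing, and overage charges exist; unless an order form says otherwise, pilots are explicitly provided "as is." The public terms also reserve the right to suspend service for security, legal, operational, and payment reasons, and they allow third-party products in the delivery stack.

This is credible enterprise contract scaffolding, but it leaves important diligence gaps. Public materials disclose no self-serve API reference, public status page, evidence of deployment counts, tenancy architecture, or quantified availability history. That matters because the external frameworks Goodfire is selling into are increasingly intolerant of black-box governance. NIST focuses on trustworthiness across design, development, use, and evaluation, and Gartner warns that hidden governance and change-management costs can dominate ROI in high-risk GenAI deployments. Goodfire is directionally aligned with these buyer needs but is still early in how much operational evidence it exposes publicly. [CE033, CE034, CE035, CE036, CE037, CE038]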

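Trust / quality / compliance table
Control / certification / quality metric | Status | Scope | Gap
SOC 2 Type II / SOC 3 | Obtained; Type II announced with no exceptions | Enterprise security and procurement assurance | Does not substitute for public uptime or architecture transparency
Order-form commercial controls | Contract structure in effect | Fees, overages, service scope, and commercial commitments | No public price list or public deal-term benchmarks
Pilot program guardrails | Evaluation structure in effect | Internal evaluation only; a separate commercial license is required after the pilot | Default pilot terms are AS IS, with no published service levels
Usage reporting and metering | Billing control in effect | Goodfire records are authoritative for fee calculation and usage summaries | Public documents do not disclose specific metering units, quotas, or thresholds
Suspension and third-party product governance | Contract control in effect | Covers security, legal, operational, payment, and third-party integrations | Fallback procedures and vendor list are not public
Mayo privacy and governance protocols | Partner-specific public commitment | Health and genomics collaboration | Not a general public privacy architecture for all customer deployments
Public transparency surface | Limited | Trust portal, contact paths, and security summary | No public status page, self-serve API docs, incident history, or deployment-count disclosure

The table separates formal procurement signals from missing public operational evidence. Goodfire is stronger on negotiated enterprise controls and weaker on broad public transparency.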

[CE034, CE035, CE037, CE038, CE039, CE040]

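5.4 Roadmap, release cadence, and maturity

Goodfire's release cadence looks more like a fast-moving research organization productizing its internal stack than a traditional enterprise software vendor with a stable public changelog. One public clue is the February 2026 deprecation notice for the early SAE demo interface and API, which suggests the company is moving away from narrow research-preview tooling. By late April 2026, MIT Technology Review was covering Silico as an externally available product, and the company's own financing press materials had already framed the roadmap as next-generation product development plus expanded collaborations in AI agents and life sciences.

Post-launch cadence is still expressed mainly through research releases. In May 2026 alone, Goodfire published work on eval-awareness measurement, story-shape geometry, and SAE-based geometry recovery. For a company trying to sell into enterprise and regulated workflows, that is an unusually fast pace of public iteration. It also makes roadmap visibility asymmetric: buyers can see the scientific engine running quickly, but they cannot yet see the ordinary SaaS evidence trail of versioned release notes, a public incident history, or a broad integration catalog.

The maturity picture is therefore mixed but coherent. Core scientific capability looks strong, and the domain workflows in language, genomics, and scientific discovery are more than concepts. The security posture is enterprise-credible. The main immaturity is in packaging: access is still negotiated, many deployments appear to come with services attached, and several key reliability and integration details remain private. Goodfire is thus most mature as a premium design environment for teams with serious model ownership and least mature as a broadly standardized developer platform. [CE015, CE017, CE039, CE044, CE046, CE047]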

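Roadmap / release / development stage table
Date / stage | Feature / milestone | Status | Implication | Source
Preview before February 2026 | Standalone SAE demo interface and API | Deprecated February 2026 | Goodfire is consolidating from narrow preview tools into a broader platform play | Feature Steering blog
2026-02 strategic thesis | Intentional design and next-generation core product narrative | Publicly articulated | Roadmap is anchored on closed-loop training control, not only post-hoc explanation | Intentional Design + PR Newswire
2026-04-30 | Silico launch / external debut | Live product surface | Internal interpretability tooling becomes an externally offered product, priced case by case | MIT Technology Review
2026-05-04 | Verbalized eval awareness paper | Published | Public research cadence focuses on reliability and benchmark quality, aimed at safety-minded buyers | Goodfire Research
2026-05-20 | The Shape of Stories Inside Neural Networks paper | Published | Shows weekly geometry research output rather than a typical SaaS changelog pattern | Goodfire Research
2026-05-21 | Can SAEs Capture Neural Geometry? paper | Published | Continues tooling work that future control interfaces and geometry-aware methods can draw on | Goodfire Research
2026 security milestone | SOC 2 Type II / SOC 3 | Obtained | Procurement readiness is running ahead of public operational telemetry | Goodfire blog
2026 partner expansion | Mayo, Prima Mente, and Radical domain workflows | Active projects | Roadmap includes scientific verticalization across genomics, healthcare, and materials, not just general LLM tooling | Goodfire partner / customer pages

Goodfire discloses its roadmap mainly through research posts, partner announcements, and financing narrative rather than a public changelog. The dates therefore track public milestones, not a version history.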

[CE015, CE039, CE047, CE048, CE049, CE050]
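FE004: Product maturity / capability map

Maturity is strongest in the core interpretability engine and domain-specific workflows, and weakest in public platform packaging and transparent operational telemetry.

Ratings are qualitative judgments based only on public evidence. They measure visible maturity, not internal product quality or customer satisfaction.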

[CE015, CE039, CE044, CE046, CE047, CE048]

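5.5 Charts

Chapter 06

06 Customers

6.1 Customer segments and buying centers

Goodfire's public customer story centers on organizations that build or fine-tune foundation models, not on end-application buyers. The clearest broad segmentation claim comes from the company's contact page: the platform is used by Fortune 500 enterprises, major healthcare institutions, and AI research labs. Product pages refine the picture. Silico targets teams training or fine-tuning models across architectures and modalities; the language page points to LLM developers who want to predict failures and improve behavior without retraining from scratch; the life-sciences page points to genomics and scientific-model teams; and the robotics / vision page points to physical-AI and medical-imaging workflows. Taken together, the likely economic buyer is an R&D, platform, or product leader responsible for model performance and reliability, while day-to-day users are research scientists, ML engineers, and interpretability specialists.

The important caveat is that Goodfire has not converted these segment claims into counts, revenue mix, or named enterprise references. Public materials disclose no customer count, ARR, or segment share, and they do not list the Fortune 500 users behind the broad enterprise claim. The public proof set is therefore far deeper in vertical specificity than in commercial breadth: named evidence clusters in genomics, clinical research, AI-agent safety, and materials discovery, while the rest of the enterprise narrative remains largely unenumerated. This asymmetry suggests a selective, high-touch GTM motion: win a small number of technically sophisticated design partners first, and only later expand into more standardized enterprise software distribution. [CU001, CU002, CU003, CU004, CU005, CU006]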

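Customer segment table
Segment | Buyer / user / payer | Primary use case | Public evidence | Strategic value | Gap
Frontier model labs and AI research teams | Buyer: research / platform leads; user: interpretability researchers and ML engineers; payer: R&D or model-platform budget | Inspect internal structure, debug failures, shape training, monitor deployments | Silico page, Series B post, MIT Technology Review | Goodfire could become workflow infrastructure in this core category | No public account count or lab list beyond named references
Healthcare and genomics institutions | Buyer: medical AI or scientific program leads; user: computational biology / genomics teams; payer: research or translational-medicine budget | Interpret scientific models, surface biomarkers, explain variant effects, validate model reasoning | Mayo Clinic, Prima Mente, Arc Institute, EVEE research | Highest-quality named evidence, with the strongest differentiated results | Most evidence is still at the research stage, not routine clinical production
Large enterprises / Fortune 500 | Buyer: enterprise AI or product leads; user: ML / safety / model-operations teams; payer: innovation, platform, or business-unit budget | Improve the reliability, controllability, and ROI of internal models | Contact page and Salesforce Ventures thesis | If the broad claim converts into named logos, ACV could expand significantly | No named Fortune 500 accounts or disclosed results
AI agent platforms and consumer internet operators | Buyer: safety / product leads; user: guardrail and infrastructure teams; payer: platform-engineering budget | Detect PII, monitor agent behavior, deploy lightweight guardrails | Rakuten production deployment | Best public evidence that Goodfire can support live enterprise workflows | Only one named production enterprise case in reviewed sources
Materials and physical-science teams | Buyer: scientific program leads; user: model scientists and autonomous-lab teams; payer: R&D budget | Use internal structure to improve inverse design and candidate screening | Radical AI collaboration and self-correcting search research | Extends Goodfire from biology into broader in silico discovery | Commercial maturity and repeatability are still early

Rows summarize public segment evidence only. Nulls and unnamed enterprise claims indicate missing disclosure, not an absence of customers.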

[CU001, CU002, CU003, CU004, CU005, CU008]
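FU001: Customer journey map

Public evidence points to a selective enterprise journey: identify a high-risk model problem, bring Goodfire in as a design partner, work in a shared environment, validate the technical gains, then expand into broader monitoring or research programs.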

[CU001, CU003, CU009, CU011, CU017, CU022]

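6.2 Named customer proof and adoption motion

The named proof set shows that Goodfire is doing real work for customers and partners, but the type of proof varies by account. Prima Mente is the clearest model-to-science case study: Goodfire says it embedded researchers with Prima Mente, interpreted the Pleiades epigenomics model, and helped identify a new class of blood-based Alzheimer's detection biomarkers. Arc Institute is a strong scientific reference showing that Goodfire can work with frontier biological foundation models at scale; the Arc evidence is still best read as a research collaboration rather than a routine software deployment, particularly because the initial steering work is described as early-stage. Mayo Clinic likewise supports category credibility, governance readiness, and clinical adjacency, but the public record frames the work as research and hypothesis generation, not routine clinical deployment.

Rakuten differs from the other cases because it is the clearest public production-style deployment: Goodfire says Rakuten deployed SAE probes for PII detection in its AI agents, where the system had to generalize from synthetic training data to real multilingual traffic and meet a high recall requirement. Radical AI supplies a fifth named proof point, in materials science, but its commercial maturity is still early because public disclosure emphasizes technical progress and promises more detail later. Taken together, the adoption motion looks consultative and deeply collaborative. Goodfire repeatedly describes shared environments, selective design-partner engagements, embedded work, and case-by-case pricing rather than self-serve onboarding. That is a credible way to launch an advanced infrastructure product, but it also means the current evidence base demonstrates depth of technical engagement more clearly than repeatable, scaled software distribution. [CU010, CU011, CU012, CU013, CU014, CU015]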

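Customer growth / adoption trajectory table
Metric | Value | Date | Source quality | Implication | Missing denominator
Broad customer categories disclosed | Fortune 500 enterprises, major healthcare institutions, AI research labs | 2026-06-10 |  | Goodfire's market extends beyond pure research labs | No counts or logo list by category
Named public collaborators / customers with specific use cases | 5 | 2026-06-10 |  | Public evidence set comprises Prima Mente, Arc Institute, Mayo Clinic, Rakuten, and Radical AI | Not a total customer count
Named proofs with quantified technical results | 4 | 2026-06-10 |  | Prima Mente, Mayo EVEE, Rakuten, and Radical disclosed measurable technical results | Outcome metrics are technical, not commercial
Named proofs explicitly described as production deployments | 1 | 2026-06-10 |  | Rakuten is the clearest production-style enterprise account | No production account count disclosed for the rest of the customer base
Pricing disclosure | Case-by-case pricing with access by request | 2026-04-30 |  | Sales motion looks enterprise-grade and consultative | No public pricing tiers or contract ranges
Public follow-up evidence after the initial partnership announcement | 2 | 2026-06-10 |  | Arc and Mayo both have public follow-up updates, suggesting some relationship continuity | Follow-up evidence is not the same as a paid renewal
Public customer count / ARR / NRR | null | 2026-06-10 |  | Commercial scale cannot be quantified from public evidence | Core denominators for adoption and durability are undisclosed

Count rows refer to the public evidence set visible in reviewed sources and do not represent Goodfire's full customer base. Null means not disclosed.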

[CU001, CU006, CU007, CU022, CU024, CU025]
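The count rows above can be reproduced from a small record per named relationship. The sketch below is one way to keep that tally auditable; the `NamedProof` class and its field names are illustrative assumptions, and the boolean flags are transcribed from this section's own description of each relationship rather than from any Goodfire data feed.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class NamedProof:
    name: str
    quantified_result: bool   # a measurable technical result is disclosed
    production: bool          # explicitly described as a production deployment
    public_follow_up: bool    # later public update after the first announcement

# Flags follow the public evidence summarized in this section.
PROOFS = [
    NamedProof("Prima Mente", True, False, False),
    NamedProof("Arc Institute", False, False, True),
    NamedProof("Mayo Clinic", True, False, True),
    NamedProof("Rakuten", True, True, False),
    NamedProof("Radical AI", True, False, False),
]

def tally(proofs):
    """Count named proofs by evidence type and compute the public follow-up share."""
    total = len(proofs)
    follow_ups = sum(p.public_follow_up for p in proofs)
    return {
        "named": total,
        "quantified": sum(p.quantified_result for p in proofs),
        "production": sum(p.production for p in proofs),
        "follow_up": follow_ups,
        "follow_up_share": follow_ups / total,
    }

summary = tally(PROOFS)
print(summary)
```

The follow-up share is a continuity proxy over five public relationships only; it says nothing about paid renewal or revenue retention.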
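Named customer proof table
Customer / partner | Segment | Deployment / use case | Production or pilot | Outcome | Limitation
Prima Mente | AI neuroscience / life sciences | Interpret the Pleiades epigenomics model to surface disease signal and improve model design | High-touch research collaboration; not disclosed as routine clinical production | Identified a new class of blood-based Alzheimer's biomarkers; highlights fragmentomics / fragment length | Experimental validation and publication still pending
Arc Institute | Genomics foundation-model research | Interpret Evo 2 representations and explore steerable biological features | Research collaboration with later Nature-related validation; commercial terms undisclosed | Found features for coding sequences, protein structure, and tree-of-life representations | Initial steering work is described as early-stage
Mayo Clinic | Major healthcare institution / genomic medicine | Reverse-engineer a genomics foundation model and launch the EVEE variant-effect explorer | Research and translational collaboration; not disclosed as routine clinical deployment | AUROC of 0.997 on 839k ClinVar variants; interpretable predictions for all 4.2M ClinVar variants | Work is under peer review, and computational outputs are not diagnoses
Rakuten | Enterprise AI agent platform | Detect PII in multilingual user messages for AI agents | Production deployment | SAE probes deployed, with strong synthetic-to-real generalization and large cost savings versus LLM-as-judge | Only one named production enterprise deployment is public
Radical AI | Materials discovery / autonomous labs | Improve inverse materials design with self-correcting search on MatterGen | Early design collaboration / technical proof | About 27% more successful candidates overall and about 30% more SUN materials in the target range | Public disclosure does not spell out commercialization or repeat use

This is a deliberately partial enumeration of public proof. It separates named, specific use-case evidence from broader but unnamed enterprise claims.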

[CU010, CU012, CU013, CU014, CU016, CU017]
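FU002: Adoption / deployment funnel

Because customer counts are undisclosed, the adoption chart is drawn as a deployment process rather than a numeric funnel: Goodfire appears to move from selective prospecting to shared-environment work and technical validation, and only occasionally discloses a production rollout.

[CU009, CU011, CU022, CU024, CU029, CU030]
FU003: Customer proof matrix

The matrix compares the public quality of each named reference customer across disclosure, quantified results, production maturity, independent corroboration, and retention visibility.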

[CU013, CU015, CU018, CU020, CU024, CU025]

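6.3 Durability, expansion, and concentration risk

Customer durability is the thinnest part of Goodfire's public record. Reviewed sources disclose no NRR, GRR, churn, renewal rate, contract length, seat expansion, or customer concentration, and no satisfaction metrics such as NPS. The company also publishes no customer count, so outside investors cannot tell whether the business rests on a few large design partners or already has a broader installed base. The best durability proxy available today is the continuity signal in the public collaboration history: Arc progressed from an early-2025 announcement to a later Nature-related update, and Mayo progressed from a 2025 collaboration announcement to the 2026 EVEE research results. These signals show that some relationships lasted long enough to produce further public work, but they do not prove paid renewal, revenue expansion, or long-term stickiness.

Expansion potential is still visible. Goodfire can land in high-risk model-development workflows and then extend from research support into monitoring, training interventions, guardrails, and adjacent scientific programs. The risk is that the proof set is concentrated in a few named collaborators and skews clearly toward life sciences, while the broader Fortune 500 claim remains mostly anonymous. Two independent sources reinforce this caution. MIT Technology Review affirms Silico's usefulness but quotes Leonard Bereska as saying that Goodfire adds "precision" to the "alchemy" rather than turning model design into fully principled engineering; OnHealthcare argues that the $1.25 billion valuation looks aggressive given limited public commercial disclosure. The customer thesis is therefore promising but fragile: Goodfire has credible reference customers and technical results, but investability still depends heavily on private evidence of account size, contract economics, and repeat use. [CU038, CU039, CU040, CU041, CU042, CU043]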

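Retention / repeat use / satisfaction table
Metric | Value / null | Segment | Confidence | Diligence question
Net revenue retention (NRR) | null | All segments |  | Request customer cohort tables and expansion data by account vintage
Gross revenue retention / churn | null | All segments |  | Request renewal and logo-retention data by customer type
Contract length / commercial terms | null | Enterprise and research accounts |  | Request price list, term lengths, and pilot-to-paid conversion rate
Public continuity proxy: Arc Institute | Nature-related update in 2026 after the initial 2025 announcement | Genomics research |  | Confirm whether continuity reflects paid renewal, expanded scope, or only paper publication
Public continuity proxy: Mayo Clinic | 2025 collaboration later cited by 2026 EVEE research | Healthcare / genomics |  | Confirm whether follow-on work falls under one master agreement or multiple phases
Customer satisfaction proxy | null | All segments |  | Request NPS, customer interviews, or user-review data; no public reviews surfaced in the cache

Null means the metric is not publicly disclosed. The two continuity rows are relationship proxies only and should not be read as revenue-retention metrics.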

[CU019, CU024, CU038, CU039, CU040, CU046]
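Expansion and concentration risk table
Expansion driver | Concentration / friction risk | Impact | Evidence | Diligence path
Landing research collaborations in a shared product environment | High-touch delivery may look more like expert services than pure software | Could produce high ACV but slow logo velocity | Series B post, Silico page, description of embedded work at Prima Mente | Split revenue by software subscription, services, and custom research
Expanding from life sciences into enterprise model operations | Public named proof still skews clearly toward biology | Vertical concentration can distort the apparent breadth of demand | Life-sciences page, Rakuten proof, Salesforce Ventures thesis | Measure pipeline and won accounts outside biology
Broadening from open-model research teams to enterprises | Model-access restrictions may limit use on closed frontier models | Adoption may skew toward labs with parameter access | MIT Technology Review and Silico page | Document support for closed-model monitoring or partner integrations
Using marquee customer endorsements to win Fortune 500 buyers | Fortune 500 claim is unnamed and therefore weaker than the named evidence set | Enterprise credibility may be overstated relative to disclosed evidence | Contact page and named proof table | Request named reference customers, outcomes, and permission for customer interviews
Deepening AI-agent and guardrail use cases | Rakuten is the only disclosed production account | Category may be large, but public production proof is still thin | Rakuten research and financing / investor coverage | Add production customers and renewal evidence in agent workflows

Expansion rows reflect the GTM vectors visible in public materials. Risks focus on disclosure gaps, concentration of the proof sample, and a delivery model that likely leans toward services.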

[CU022, CU024, CU025, CU029, CU031, CU032]
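FU004: Retention / repeat cohort

Goodfire does not disclose real revenue-retention cohorts, so this chart shows a narrower proxy: the share of the named public collaboration cohort that later gained additional public follow-up evidence. It is a continuity proxy, not NRR or customer retention.

The chart is evidence-limited. Because Goodfire does not disclose customer retention metrics, the cohort shows only the subsequent public continuity of named relationships.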

[CU019, CU024, CU038, CU040]

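6.4 Appendix

Chapter 07

07 Risks

7.1 Legal, regulatory, and contract risk

Goodfire's legal and regulatory posture is enough to pass initial enterprise diligence but not enough to remove the transfer of downside risk to customers. The positive evidence is real: Goodfire says it has obtained SOC 2 Type II; Mayo describes the collaboration as operating under strict privacy and governance protocols; and the company positions interpretability as a bridge that makes sensitive AI use cases more governable. The harder part to underwrite is the contract. Default terms disclaim any warranty that the service will be uninterrupted, secure, accurate, or error-free; unless an order says otherwise, pilot and evaluation modes can carry no security or support commitments; and total liability is capped at fees paid. These are common positions for a startup software company, but when the platform targets healthcare, safety, and potentially critical-infrastructure workflows, customers still bear a substantial share of downtime, breach, and deployment risk.

Data rights are the second sharp edge. The TOS gives Goodfire broad rights to Usage Data and grants it a perpetual license to Workflow Data for improvement, evaluation, training, and commercialization, while also assigning feedback IP to Goodfire. For a research-driven platform this may be commercially reasonable, but in regulated settings customers often want hard separation between operational traces, model behavior, and vendor product improvement, which slows procurement. NIST's generative AI profile and its 2026 critical-infrastructure concept note both point toward more explicit risk controls, and Gartner likewise stresses governance, cost discipline, and realistic measurement as adoption gates. The conclusion is that Goodfire faces no visible public litigation or enforcement pressure today but does carry a contract and governance burden: if order-form terms are not clearly better than the default text, entry into regulated workloads will be slower than the brand narrative suggests. [CR008, CR009, CR010, CR011, CR012, CR013]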

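Regulatory / legal risk register
Rule / obligation / posture | Jurisdiction | Status | Likelihood / severity | Mitigation | Residual exposure | Diligence path
Default warranty disclaimers and liability cap | US contract law / customer order forms | In effect in the public MSA, pilot agreement, and TOS |  | Negotiate customer-specific contracts, cyber insurance, and security addenda | High for regulated or safety-critical buyers | Review redlines in the top 10 signed enterprise contracts for deviations from default terms and for any uncapped confidentiality / security carve-outs.
Broad Workflow Data and Usage Data rights | Cross-border enterprise procurement / privacy | In effect in the TOS | Medium-high | Customer-specific data-use carve-outs, de-identification controls, audit rights | Medium-high | Review the DPA, data-flow diagrams, retention windows, and whether workflow data can be excluded from improvement / training.
Healthcare interpretability and clinical governance burden | US healthcare / regulated research | Partly mitigated by Mayo governance language and the biomarker case study | Medium-high | Use interpretability as a validation layer, work with regulated institutions, and build up a governance package | Remains high until broader deployment evidence appears | Obtain the clinical validation plan, regulatory positioning memo, and deployment evidence beyond the named research collaborations.
Critical-infrastructure trustworthiness expectations | US critical infrastructure | External expectations rising per the NIST 2026 concept note |  | Map controls to the NIST AI RMF profile and customer model-risk workflows | Medium-high | Request the sector-specific control matrix, logging / auditability architecture, and incident-response process.
Export controls and restricted-jurisdiction constraints | US export / re-export law | In effect in public contracts |  | Screen customers, geographies, and downstream model uses; involve legal counsel on sensitive deployments |  | Review the export screening process and any blocked-country or restricted end-use policy.
Feedback assignment and services IP ownership | Customer / vendor IP allocation | In effect in the MSA and TOS |  | Contractual carve-outs for customer inventions and regulated workflows |  | Review whether enterprise contracts limit feedback assignment, deliverable ownership, and derivative-work ambiguity.

Public evidence shows strong enterprise intent, but the default contracts still favor the company; rows are ordered by residual underwriting importance.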

[CR008, CR009, CR010, CR011, CR012, CR013]
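FR001: Risk heatmap

Positions Goodfire's main risks by mitigation maturity, showing that the company holds meaningful intellectual and governance assets but that public proof of repeatability, customer breadth, and regulated-deployment readiness remains weak.

Heatmap cells are composite judgments based on public evidence as of 2026-06-10, not internal company risk scores.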

[CR009, CR011, CR016, CR019, CR024, CR025]

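7.2 Technical reliability and product-proof risk

Goodfire's core product claim is ambitious: use interpretability to move model development from guesswork to controllable engineering. The risk is that Goodfire's own research record shows this path is still early. The intentional-design post says the science is incomplete and the hardest problems remain unsolved. MIT Technology Review underlines the same tension from the outside, quoting a mechanistic interpretability researcher who says Silico is useful but closer to more precise alchemy than to true engineering. This matters because Goodfire is not only selling dashboards; it is selling trust that its interventions expose the right internal mechanisms and change behavior safely in critical systems.

The company's recent papers give further reason for skepticism. Verbalized eval awareness can inflate measured safety; reasoning traces can be performative rather than faithful; rare harmful or backdoor behaviors can evade standard evals; memorization edits can preserve some reasoning while damaging arithmetic and recall; and Goodfire's own methods posts say that SAEs, linear steering, and parameter decomposition all have important limitations. None of this invalidates the technology. If anything, it reinforces the judgment that Goodfire is taking real failure modes seriously. But it also means buyers and investors should treat current results as advanced instrumentation, not as proof that model behavior is already fully legible and controllable. The sharpest diligence question is whether Goodfire can turn promising research into production-grade reliability evidence faster than the surrounding AI stack commoditizes adjacent monitoring, evaluation, and tracing workflows. [CR001, CR002, CR003, CR004, CR005, CR006]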

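Operations / quality / safety risk register
Failure mode | Likelihood / severity | Mitigation maturity | Residual exposure | Open gap
Interpretability science is still incomplete, and product promises may run ahead of causal understanding |  | Partial: Goodfire publishes limitations openly while continuing to build tools |  | Needs independent production case studies showing that interventions improve outcomes without hidden regressions.
Benchmark safety scores may be inflated by eval awareness and prompt artifacts |  | Partial: Goodfire has identified the distortion and proposed a prompt-rewriting mitigation |  | Needs third-party evaluation methods showing that deployed behavior tracks benchmark performance.
On simpler tasks, chain of thought may be performative rather than faithful | Medium-high | Partial: probes and early-exit methods help but do not resolve full faithfulness | Medium-high | Needs deployment monitoring that does not rely only on visible reasoning.
Rare harmful or backdoor behaviors may bypass standard tests and surface only after deployment | Medium-high | Partial: model-diff amplification appears to help surface rare failures |  | Needs a standardized pre-deployment red-team workflow, with proof that it generalizes beyond model organisms.
Edits that suppress memorization or steer behavior may weaken arithmetic or factual recall | Medium-high | Weak to partial: trade-offs are documented but not yet cleanly resolved | Medium-high | Needs a model-quality scorecard showing which capabilities are sacrificed when interpretability interventions land.
Current SAE / steering methods capture only fragments of geometric structure and may produce off-target effects |  | Partial: Goodfire is moving toward manifold-aware methods and SPD |  | Needs proof that the new methods scale beyond toy models and simple demo tasks.
Public security posture shows SOC 2 but no public SLA, incident history, or runtime control detail | Medium-high | Partial: SOC 2 and trust portal exist | Medium-high | Regulated buyers need uptime disclosure, incident history, and architecture detail.

Severity measures whether the failure mode would undermine Goodfire's credibility as a critical AI control layer, not merely whether a given research result is interesting.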

[CR003, CR004, CR005, CR016, CR028, CR029]
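FR002: Risk transmission map

Shows how Goodfire's research and contract risks propagate into slower regulated adoption, weaker reference-customer quality, and potential valuation compression.

The DAG expresses directional business logic, not measured probabilities.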

[CR003, CR004, CR009, CR011, CR024, CR025]

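7.3 Partner, customer, and dependency risk

Goodfire's public market proof is narrower than the headline sounds. The company says the platform is used by Fortune 500 enterprises, healthcare institutions, and AI labs, but the public named evidence clusters in a few collaborations: Alzheimer's biomarker discovery with Prima Mente, genomic medicine with Mayo Clinic, materials science with Radical AI, and a request-access product page aimed at companies training or fine-tuning models. Even the strongest case studies describe Goodfire researchers embedded in customer teams and co-building workflows. That is valuable evidence of technical depth, but it looks more like a high-touch delivery model than clearly repeatable software revenue. The case-by-case pricing reported by MIT Technology Review and On Healthcare's observation that this is not yet a predictable SaaS profile point to the same pattern.

This concentration creates two linked risks. First, public reference quality is partner-driven rather than based on a broad customer base: if one flagship collaboration stalls, the disclosed volume is not enough to absorb the narrative shock. Second, the broader buyer workflow already contains adjacent products from observability and evaluation vendors such as Datadog and LangSmith, which package testing, tracing, monitoring, and governance for production AI teams. These platforms are not equivalent to mechanistic interpretability, but they compete for budget and for the right to define what AI control and monitoring look like in production. Goodfire therefore needs to prove that deep white-box access is a control layer worth buying separately, not just an advanced research plug-in to a stack customers already understand. [CR006, CR007, CR018, CR024, CR025, CR026]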

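Partner / dependency risk register
Dependency | Counterparty / exposure | Role | Concentration | Failure scenario | Severity | Mitigation | Residual exposure
Named reference-customer base | Prima Mente, Mayo Clinic, Radical AI, and unnamed enterprises | Public evidence that the platform works in important domains |  | One or two flagship collaborations stall and disclosed breadth is too thin to offset the narrative damage |  | Add diversified, cross-industry named production references and renewal proof | 
Research-heavy delivery model | Embedded researchers, field engineering, collaborative services | Reworks customer models and produces the strongest public results |  | Revenue scales with scarce expert labor rather than repeatable software usage |  | Separate productized modules, playbooks, and self-serve workflows from custom research work | 
Frontier model-builder demand | Companies training or fine-tuning models across architectures and modalities | Silico's core buyer group | Medium-high | Open-model teams or frontier labs internalize similar tooling, or decide observability is enough |  | Show clear ROI and control advantages that standard tracing stacks cannot replicate | Medium-high
Customer willingness to share workflow and usage data | Enterprise customers under TOS / order-form processes | Improves platform performance and the product learning loop |  | Procurement teams restrict data-use rights or require hard separation of traces | Medium-high | Offer stricter customer controls and contract options that preserve trust without removing the learning loop entirely | 
Adjacent observability stack | Datadog Agent Observability, LangSmith, and similar tools | Competes for the same monitoring, evaluation, and governance budget lines |  | Customers buy observability plus evals and conclude they no longer need separate white-box interpretability | Medium-high | Position interpretability as a distinct causal control layer and prove it with measurable gains in debugging or model design | Medium-high
Healthcare governance partners | Mayo and other regulated institutions | Provide legitimacy in sensitive domains |  | If governance-heavy partners do not convert into broader deployment, Goodfire remains a custom research vendor | Medium-high | Turn flagship healthcare work into repeatable compliance and validation packages | Medium-high

Concentration is judged only on disclosed public evidence; the company may have broader commercial coverage privately, but it is not yet visible enough to underwrite as a core mitigant.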

[CR006, CR007, CR017, CR018, CR024, CR025]
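FR003: Dependency map

Maps the external surfaces Goodfire currently depends on most: flagship collaborators, enterprise buyers willing to share data and buy custom work, and the adjacent observability platforms that shape buyer expectations.

Partner and buyer concentration is inferred only from publicly disclosed proof points; undisclosed customers may make the real picture better.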

[CR007, CR024, CR026, CR027, CR044, CR045]

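7.4 Execution, talent, capital, and thesis-break triggers

Goodfire is doing three hard things at once: advancing frontier interpretability research, turning that research into an enterprise platform, and building category authority in regulated, high-risk domains. Public evidence shows the company is still small relative to that ambition. On Healthcare estimates about 51 employees; the talent pool of interpretability specialists appears unusually thin; and the careers page shows the organization is still expanding. Meanwhile, the February 2026 Series B pushed the valuation to $1.25 billion, shrinking the margin for execution error. A company with limited publicly disclosed customer breadth, no public pricing architecture, and a high-touch services component must now prove it can become repeatable software fast enough to hold that valuation.

The practical investment answer is to turn these uncertainties into hard triggers. The thesis weakens quickly if customer contracts keep leaving most security and downtime risk with the buyer, if named production references do not broaden materially, if software revenue still cannot be separated from embedded services, or if adjacent observability platforms meet most buyer needs. Conversely, risk contracts if Goodfire can show production renewals in regulated settings, enterprise contracts that clearly improve on the default terms, and independent evidence that interpretability interventions work in deployment and not only in papers or custom engagements. Until then, Goodfire looks more like a high-upside but still proof-constrained bet on a control layer than a de-risked infrastructure standard. [CR019, CR020, CR021, CR022, CR023, CR024]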

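People / execution risk register
Role / function | Dependency or gap | Likelihood / severity | Mitigation | Diligence path
Interpretability research bench | Global talent pool appears unusually thin and expensive |  | Use capital to hire senior researchers and convert reputation into recruiting leverage | Review retention metrics, key hiring pipeline, and compensation competitiveness versus frontier labs.
Research-to-product conversion | Company must turn frontier papers into repeatable enterprise workflows |  | Productize the highest-value interventions and narrow the initial beachhead use cases | Review the product roadmap, services share of revenue, and deployment architecture at named customers.
Commercial expansion / GTM | Case-by-case pricing and request-access posture limit visible repeatability | Medium-high | Standardize packages, implementation process, and procurement contracts | Request pricing architecture, ACV ranges, sales-cycle data, and renewal metrics.
Management bandwidth | Small team is advancing research, platform, and regulated-domain collaborations at once | Medium-high / medium-high | Prioritize a few vertical wedges and reduce custom projects | Review functional leadership depth, hiring plan, and the share of the roadmap that is customer-specific.
Capital discipline after unicorn pricing | Series B valuation lowers market tolerance for slow commercial proof |  | Use new capital to broaden the reference base and prove software leverage quickly | Request board materials, focusing on spend allocation, next-stage milestone gates, and target evidence for the next round.

The public risk is not just that the team is small; it is that the company's ambition, valuation, and labor-market scarcity are all widening execution scope faster than public proof is expanding.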

[CR019, CR020, CR021, CR022, CR023, CR024]
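Mitigation and kill-criteria table
Risk | Monitorable trigger | Threshold / event | Action implication
Contract paper still favors the startup | Enterprise MSAs still mirror the public liability cap and warranty disclaimers | Top 3 reference customers have no meaningful security / downtime / confidentiality carve-outs | Treat the regulated-deployment thesis as unproven; do not underwrite healthcare or critical-infrastructure expansion.
Data-rights friction blocks procurement | Customers demand major redlines on Workflow Data or refuse data sharing outright | Two or more priority accounts are explicitly blocked on data-use terms | Assume slower sales cycles and a weaker product learning loop; lower software-scaling assumptions.
Reference set does not broaden | Named production customers do not extend beyond the current collaboration-heavy proof set | Fewer than 3 new named production references in the next refresh cycle | Re-rate the company as a custom research / services business rather than an infrastructure layer.
Research results do not translate into deployment gains | No independent evidence that interpretability interventions deliver production benefit | No third-party deployment study or customer KPI shows measurable improvement | Lower moat assumptions and benchmark directly against traditional observability vendors.
Security and uptime posture stays opaque | No public uptime, incident history, or runtime control evidence beyond SOC 2 | Another refresh cycle passes with no SLA, status page, or incident disclosure | Assume slower enterprise penetration in sensitive workloads.
Talent pipeline weakens | Hiring velocity or retention falls in core interpretability roles | Senior research / product hiring expectations missed for two consecutive quarters | Roadmap slips and founder / researcher concentration risk increases.
Valuation runs ahead of repeatability | Capital raised and valuation grow faster than visible revenue quality | No pricing standardization or software / services split before the next major financing event | Avoid paying for category optionality without evidence of repeatable unit economics.
Observability platforms absorb the buyer problem | Customers adopt tracing / eval stacks without adding white-box interpretability | Reference buyers describe Goodfire as a nice-to-have research tool rather than control-plane infrastructure | Thesis breaks: the category collapses into a feature rather than a standalone platform.

Kill criteria are written as observable public or diligence events so they can be revisited at future refreshes, rather than left as abstract concerns.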

[CR009, CR011, CR013, CR016, CR019, CR024]
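Because three of the kill criteria above carry numeric thresholds, they can be checked mechanically at each refresh. The sketch below encodes only those three rows; the `KillCriterion` class, the observation keys, and the example values are hypothetical, and the qualitative criteria in the table still need human judgment.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass(frozen=True)
class KillCriterion:
    risk: str
    fires: Callable[[Dict[str, int]], bool]   # True when the threshold is breached
    action: str

# Thresholds copied from the table; observation keys are illustrative names.
CRITERIA = [
    KillCriterion(
        "Reference set does not broaden",
        lambda o: o["new_named_production_refs"] < 3,
        "Re-rate as a custom research / services business",
    ),
    KillCriterion(
        "Data-rights friction blocks procurement",
        lambda o: o["priority_accounts_blocked_on_data_terms"] >= 2,
        "Lower software-scaling assumptions",
    ),
    KillCriterion(
        "Talent pipeline weakens",
        lambda o: o["consecutive_quarters_hiring_missed"] >= 2,
        "Expect roadmap slippage",
    ),
]

def triggered(observations: Dict[str, int]) -> List[str]:
    """Return the risks whose thresholds are breached by the observations."""
    return [c.risk for c in CRITERIA if c.fires(observations)]

# Hypothetical refresh-cycle observations, not real Goodfire data.
example = {
    "new_named_production_refs": 1,
    "priority_accounts_blocked_on_data_terms": 1,
    "consecutive_quarters_hiring_missed": 2,
}
fired = triggered(example)
print(fired)
```

Keeping each threshold as a small predicate makes it easy to see at the next refresh which observation changed the outcome.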
Chapter 08

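08 Valuation

8.1 Recommendation, financing context, and why price matters more than narrative

Public evidence portrays Goodfire as a rare, high-quality interpretability company. It quickly assembled an elite financing record: a $50 million Series A in April 2025, followed by a $150 million Series B at a $1.25 billion valuation in February 2026, with Menlo, Anthropic, B Capital, Salesforce Ventures, and Eric Schmidt all on the cap table. Official and filing records also support the basics of institutional quality: Goodfire is a Delaware public benefit corporation that these records describe as founded in 2023 (financing materials cited in the evidence index imply a 2024 founding window), it is headquartered in San Francisco, and by early 2026 it had filed Form D notices for the Series A and Series B stages. Goodfire also points to Ember, Mayo Clinic, Arc Institute, Prima Mente, Microsoft, and its February 2026 SOC 2 Type II announcement as signs of enterprise-ready momentum.

These strengths matter, but this chapter is about valuation, not praise. The evidence is strong on team quality, scientific credibility, and investor signal; it is weak on the commercial data normally used to support a software-infrastructure price. The public round materials in this packet disclose no ARR, revenue, pricing, customer count, retention, gross margin, or software-to-services mix. That gap is decisive. At a $1.25 billion valuation, investors are plainly not paying for validated fundamentals; they are buying an option that interpretability becomes core AI infrastructure and that Goodfire becomes one of the category winners. That may happen, but on public evidence alone the price already assumes more commercial progress than the company has disclosed. The recommendation is therefore research-more rather than buy, and the valuation stance is stretched rather than attractive. [CV001, CV002, CV004, CV005, CV006, CV007]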

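Recommendation summary table
Dimension | Assessment | Decision implication
Recommendation | Research more | Re-engage only if NDA diligence closes the revenue-quality and cap-table gaps, or if pricing falls back into the base-case range.
Confidence |  | Company signal quality is strong; valuation signal quality is incomplete.
Risk rating |  | Commercial opacity, category-formation risk, and preference-stack uncertainty dominate underwriting.
Valuation posture | Rich | The $1.25B round sits near the low end of the bull case, not the middle of the base case.
Near-term action | Track actively | Maintain diligence contact, but do not underwrite this round on narrative alone.

Uses only public evidence as of the run date; entry discipline assumes primary exposure near the February 2026 round terms.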

[CV001, CV005, CV015, CV036, CV047, CV048]
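Thesis / counter-thesis table
View | Thesis | Counter-thesis | What would change the view
Category demand | Enterprises are demanding controllable, explainable AI, so interpretability should become more important. | Enterprises may decide observability and guardrails are enough, leaving interpretability a niche need. | Budget data showing Goodfire wins standard budget lines rather than experimental spend.
Product | Ember provides a differentiated control layer over model internals, not just post-hoc monitoring. | The product may remain too research-heavy or bespoke to scale as software. | Evidence of standard pricing, time to value, and repeatable deployments.
Scientific proof | Goodfire has real research output, including steering, neural geometry, genomics, and multimodal work. | Scientific credibility does not automatically convert into recurring revenue. | Evidence that flagship research projects convert into durable commercial accounts.
Strategic demand | Anthropic, Salesforce, and Eric Schmidt are high-signal investors for this category. | In a hot AI market, smart investors may still pay up for strategic option value. | Independent software metrics that validate the price rather than cap-table prestige.
Valuation | If Goodfire becomes core AI infrastructure for high-risk deployments, the $1.25B mark can be justified. | Today's public evidence does not disclose the ARR or margins needed to support the mark. | NDA-disclosed ARR, gross margin, and retention that support a scalable software multiple.

Rows are evidence-backed arguments plus the observable conditions that would change the view.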

[CV010, CV011, CV013, CV015, CV022, CV029]
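FV001: Recommendation logic

How research strength, commercial opacity, and the round price together lead to the recommendation.

The flow is qualitative and shows the decision logic; it is not a weighted scoring model.

[CV010, CV015, CV036, CV037, CV047, CV048]
FV004: Investment KPIs

Key underwriting data points: some have public evidence, others are still missing.

The KPI panel mixes confirmed public facts with flagged gaps; unknown commercial metrics are shown explicitly as undisclosed.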

[CV001, CV005, CV015, CV028, CV036, CV047]

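8.2 Evidence-constrained valuation framework and comparable marks

With revenue undisclosed, a traditional revenue-multiple model would create false precision. The better approach is to combine comparable private valuation marks with scenario logic built on what is and is not publicly disclosed. The comp set is valuable as a discipline check, not a formula. Anysphere, Harvey, and Glean all had disclosed ARR when the press reported their multibillion-dollar valuations; Anthropic sits in an entirely different universe of frontier models and compute scarcity. Goodfire does not belong in Anthropic's territory, and unlike Anysphere, Harvey, or Glean it has not publicly shown a recurring-revenue base that would let outside investors defend a multiple. That forces the current round to be read as strategic option value.

The bull, base, and bear cases therefore hinge on milestone conversion rather than spreadsheet extrapolation. In the bull case, Goodfire proves Ember can convert design partners and research collaborators into repeatable software revenue, keeps delivering differentiated interpretability breakthroughs, and becomes a required layer for high-risk AI deployment. In the base case, the category is real and Goodfire remains one of its strongest independent teams, but commercialization is still early and high-touch; that should map to a discount to the last round, not a premium. In the bear case, the research remains impressive but budgets flow to observability, guardrails, or the frontier labs themselves, leaving Goodfire with a custom-services profile and a materially lower valuation. Under this framework, the February 2026 round sits closer to the bottom of the bull range than to the middle of the base range. [CV010, CV011, CV016, CV022, CV023, CV024]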

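Bull / base / bear scenario table
Scenario | Core assumptions | Valuation / return logic | Key risk | Probability signal
Bull | Ember converts research credibility into repeatable software revenue; partners become scaled reference customers; security and governance posture opens up enterprise adoption. | $1.25B-$1.85B EV; roughly 1.0x-1.5x the last round, meaning there is upside but not much of it without very strong execution. | Commercial conversion may still be slower than the research narrative implies. | Low to medium; requires evidence not yet public.
Base | Category demand is real and Goodfire remains one of the strongest independent teams, but monetization is early and some projects still need customization. | $0.80B-$1.10B EV; roughly 0.6x-0.9x the last round, implying weak risk-adjusted returns at the current price. | Public data never closes the revenue-quality gap; budgets divert to adjacent vendors. | Medium; best fit with current public evidence.
Bear | Interpretability stays valuable, but budgets shift to observability, guardrails, or frontier labs; Goodfire struggles to standardize product revenue. | $0.35B-$0.65B EV; roughly 0.3x-0.5x the last round, implying significant risk of permanent capital loss. | Commercialization stays bespoke; valuation-multiple compression hits AI infrastructure companies. | Medium to low, but with limited disclosure still material enough to affect the view if it goes wrong.

Scenario values are evidence-constrained enterprise-value ranges, not precise DCF outputs. Return logic is benchmarked to the February 2026 $1.25B round mark.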

[CV041, CV042, CV043, CV044, CV045, CV046]
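The "multiple of the last round" figures in the scenario table follow directly from dividing each enterprise-value bound by the $1.25B round mark. A short sketch of that arithmetic, using only the ranges stated in the table; the variable and function names are illustrative.

```python
LAST_ROUND_EV_B = 1.25  # February 2026 round mark, in $B

# Enterprise-value ranges from the scenario table, in $B.
SCENARIOS = {
    "bull": (1.25, 1.85),
    "base": (0.80, 1.10),
    "bear": (0.35, 0.65),
}

def round_multiples(low, high, anchor=LAST_ROUND_EV_B):
    """Express an EV range as multiples of the last-round mark, to one decimal place."""
    return round(low / anchor, 1), round(high / anchor, 1)

table = {name: round_multiples(*ev_range) for name, ev_range in SCENARIOS.items()}
for name, (lo, hi) in table.items():
    print(f"{name}: {lo}x-{hi}x of the last round")
```

Only the bull range reaches or exceeds 1.0x, which is why the table reads the current price as leaving limited upside outside the most optimistic case.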
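Comparable valuation table
Comparable | Public metric | Valuation / status | Relevance | Limitation
Goodfire | Revenue undisclosed; $150M Series B | $1.25B valuation (February 2026) | Direct market anchor for this chapter. | No public ARR, pricing, or customer data to support a software multiple.
Anysphere / Cursor | >$500M ARR | $9.9B valuation (June 2025) | Shows what a leading AI application company looks like when valuation matches disclosed scale. | Product, growth curve, and developer-led distribution all differ.
Harvey | $190M ARR | $11B reported raise target (February 2026) | Shows how top-tier enterprise AI valuations can outrun traditional multiples once growth is proven. | Legal AI is a different vertical, and the figures are reported, not company-confirmed.
Glean | >$100M ARR | $7.2B valuation (June 2025) | Useful application-software reference for enterprise AI value with disclosed ARR. | Enterprise search and agents are more mature than interpretability.
Anthropic | Frontier-model and compute scale | $350B valuation, with Google committing up to $40B (April 2026) | Upper bound on scarcity value for frontier AI models. | Not operationally comparable; Goodfire is not a frontier foundation-model lab.

Valuation marks for private AI companies in 2025-2026 are selected as a discipline check, not a one-to-one valuation formula; Goodfire's revenue is undisclosed, so an implied multiple cannot be responsibly computed.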

[CV001, CV030, CV032, CV033, CV034, CV035]
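To show why the comp set works as a discipline check rather than a formula, the sketch below computes the implied valuation-to-ARR multiple for the comps that disclose ARR and returns nothing for Goodfire. Figures come from the table; the names are illustrative. Where the table gives ARR as a lower bound (">$500M", ">$100M"), the result is an upper bound on the multiple, and Harvey's figure rests on a reported raise target rather than a confirmed valuation.

```python
from typing import Optional

# (name, valuation in $B, ARR in $B or None when undisclosed), from the comps table.
COMPS = [
    ("Goodfire", 1.25, None),
    ("Anysphere / Cursor", 9.9, 0.50),   # ARR is a lower bound
    ("Harvey", 11.0, 0.19),              # valuation is a reported raise target
    ("Glean", 7.2, 0.10),                # ARR is a lower bound
]

def implied_multiple(valuation_b: float, arr_b: Optional[float]) -> Optional[float]:
    """Valuation divided by ARR, or None when ARR is not disclosed."""
    if arr_b is None:
        return None
    return valuation_b / arr_b

multiples = {name: implied_multiple(v, a) for name, v, a in COMPS}
for name, m in multiples.items():
    print(name, "undisclosed ARR" if m is None else f"{m:.1f}x")
```

Even among comps with disclosed ARR the implied multiples span a wide range, which supports treating them as context for the scenario ranges rather than as a direct pricing input.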
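FV002: Valuation sensitivity

Directional sensitivity of valuation conviction; positive bars raise willingness to pay, negative bars lower it.

Because public revenue disclosure is missing, the sensitivity bars are directional conviction scores, not dollar deltas.

[CV015, CV022, CV036, CV039, CV040, CV041]
FV003: Valuation / return range

Evidence-constrained valuation ranges referenced to the February 2026 $1.25B round.

These scenario ranges come from public comps and milestone logic and do not substitute for NDA-backed financial underwriting.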

[CV001, CV042, CV043, CV044, CV045, CV048]

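8.3 Exit discipline, thesis-break triggers, and final diligence requests

The near- to medium-term exit path is almost certainly another private round or a strategic transaction, not an IPO. Goodfire is too early and too opaque to support public-market underwriting: investors have no audited revenue scale, no margin profile, and not even basic customer-count disclosure. That does not make the company unattractive; it means the investment case depends on diligence. In practice, entry discipline has to focus on the missing proof points that will determine whether Goodfire can move from "an outstanding research company with commercial promise" to "an underwritable software-infrastructure business." Those proof points are recurring-revenue quality, standard pricing, concentration, gross margin, and the preference stack after the Series B.

The thesis can also fail in observable ways. The current price becomes hard to defend if collaborators do not convert into repeatable customers, if management cannot disclose convincing revenue quality even under NDA, or if budget holders decide that tracing, monitoring, and guardrails from adjacent vendors are sufficient without Goodfire's deeper internal control layer. Conversely, the company has a chance of growing into the round valuation if it can show repeatable software subscriptions, strong partner conversion, and evidence that interpretability is becoming mandatory infrastructure in regulated and high-risk deployments. Until that evidence appears, the more disciplined posture is to keep Goodfire near the top of the watchlist, continue active diligence, and avoid treating the February 2026 price as a validated bargain. [CV022, CV026, CV027, CV036, CV039, CV040]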

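Thesis-break and stop-trigger table
Trigger | Threshold | Transmission to thesis | Action implication
Revenue quality stays opaque | Management still cannot disclose ARR, gross margin, customer concentration, and retention under NDA. | The investment remains narrative-driven rather than fundamentals-driven. | Do not underwrite above the base-case range; default to pass.
No partner-to-paid pattern | Research collaborators and design partners do not convert into repeatable platform revenue. | Goodfire looks like a premium research studio rather than scalable infrastructure software. | Valuation shifts to the bear case; require a lower entry price or structural downside protection.
Observability vendors satisfy budget needs | Customers solve the pain with tracing, monitoring, and guardrails and do not need internal model control. | The category wedge narrows and Goodfire's TAM compresses. | Cut conviction sharply and reassess category ownership.
Preference stack is investor-unfriendly | Series B documents show heavy preferences, unusual protections, or meaningful dilution pressure. | Enterprise value may not translate into acceptable equity returns. | Recompute returns on an equity-value basis before proceeding.
Security or governance credibility slips | A major trust, compliance, or governance problem undermines the high-risk deployment narrative. | The premium tied to safe, controllable AI erodes quickly. | Pause diligence until remediation is independently verified.

Triggers are defined as observable diligence findings or post-investment monitoring events; once one occurs, it breaks an underwriting assumption.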

[CV036, CV043, CV046, CV049, CV050]
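Final diligence request table
Topic | Missing evidence | Why it matters | Owner / diligence path
Revenue quality | ARR, bookings, net retention, gross retention, gross margin, and revenue mix. | These are core inputs for any valuation method beyond strategic option value. | CFO / financial data room under NDA.
Pricing and packaging | Current price list, pilot-to-production terms, and the monetization split between software and services. | Determines whether the business can scale on product revenue rather than custom projects. | Interview sales and product leads and review contract samples.
Customer concentration | Top ten customers, revenue concentration, deployment scope, and renewal status. | High concentration makes the current price harder to defend. | Customer cohort review and account-level diligence.
Cap table and preferences | Post-Series B cap table, liquidation preferences, pro rata rights, and governance protections. | If preferences are heavy, equity value can differ materially from enterprise value. | Legal diligence on the financing documents.
Commercial conversion | Evidence that Mayo, Arc, Microsoft, or similar relationships can form a repeatable paid software pattern. | This is the bridge between scientific credibility and a scalable investment case. | Management deep-dive interviews, requesting cohort cases and implementation metrics.

These are the minimum requirements to move Goodfire from an interesting company to an investment that can be underwritten at or near the last-round price.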

[CV027, CV040, CV049, CV050]

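8.4 Appendix

Disclaimer

This report is a diligence snapshot based on public evidence and does not constitute investment advice. Material financial, legal, technical, and contract facts remain undisclosed; before any investment decision, verify directly with management and primary documents.

Evidence index

Conclusions
ID | Statement | Confidence | Source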
CO001 Goodfire describes itself as a San Francisco-based research company and public benefit corporation. SO001, SO002, SO014, SO018
CO002 Goodfire’s mission is to build safe and powerful AI by understanding and intentionally shaping model internals rather than relying on scaling alone. SO001, SO004, SO005, SO006
CO003 Goodfire’s current public product is a model design environment that helps users understand, debug, and shape models through interpretability-based tooling. SO002, SO007, SO027
CO004 Goodfire says its platform is used by Fortune 500 enterprises, major healthcare institutions, and AI research labs. SO010
CO005 Official materials frame Goodfire around two linked pillars: intentional design of models and scientific discovery from model internals. SO004, SO005, SO008
CO006 Lightspeed publicly announced Goodfire’s $7 million seed round on August 15, 2024, showing the company was operating by mid-2024. SO022, SO023
CO007 Series A materials say the $50 million round came less than one year after Goodfire’s founding, which supports a 2024 founding window. SO018, SO020, SO021
CO008 One independent profile describes Goodfire as founded in 2023, creating a conflict with the 2024 founding window implied by financing materials. SO028
CO009 Goodfire’s careers page says all roles are full-time and in person five days a week at a Telegraph Hill office in San Francisco. SO003
CO010 Eric Ho is Goodfire’s CEO and primary public spokesperson in financing and media materials. SO014, SO018, SO029
CO011 Daniel Balsam is publicly identified as Goodfire’s cofounder and CTO. SO009, SO024, SO030
CO012 Tom McGrath is publicly identified as Goodfire’s cofounder and chief scientist, and partner materials credit him with founding DeepMind’s interpretability team. SO024, SO030
CO013 Goodfire and third-party coverage say the team includes researchers or engineers from OpenAI, Google DeepMind, Harvard, Stanford, and UC San Diego. SO004, SO014, SO017
CO014 Investor materials tie Eric Ho and Daniel Balsam to prior operating work at RippleMatch, supporting the claim that the founding team combines startup execution with research pedigree. SO021, SO022, SO024
CO015 Reviewed public materials do not disclose a full board roster or a complete executive team beyond the founders and a few named researchers. SO002, SO024, SO025
CO016 Goodfire announced a $50 million Series A led by Menlo Ventures with Lightspeed, Anthropic, B Capital, Work-Bench, Wing, and South Park Commons participating. SO018, SO019, SO020, SO021, SO026
CO017 Lightspeed says it led Goodfire’s $7 million seed round in August 2024. SO022, SO023
CO018 Goodfire announced a $150 million Series B at a $1.25 billion valuation led by B Capital with Juniper Ventures, Menlo Ventures, Lightspeed Venture Partners, South Park Commons, Wing Venture Capital, DFJ Growth, Salesforce Ventures, Eric Schmidt, and others participating. SO004, SO014, SO015, SO016, SO017, SO029
CO019 The Series B was announced less than a year after Goodfire’s Series A. SO014, SO015, SO004
CO020 Goodfire and third-party coverage describe the company as having raised more than $200 million in total funding after the Series B. SO004, SO014, SO016
CO021 Adding the publicly disclosed seed, Series A, and Series B rounds implies roughly $207 million of total disclosed capital. SO022, SO018, SO014
CO022 Reviewed public sources do not disclose debt financing, secondary transactions, ownership percentages, or board-seat allocations for Goodfire’s financings. SO014, SO018, SO025
CO023 Salesforce Ventures’ investment materials frame Goodfire as foundational enterprise AI infrastructure rather than only a research project. SO024, SO025
CO024 Goodfire’s public product branding shifted from Ember in 2025 financing materials to Silico in 2026 product materials. SO018, SO020, SO007, SO029
CO025 Goodfire says it reduced hallucinations in a large language model by about half using interpretability-informed training. SO004, SO027
CO026 Official materials name Prima Mente, Arc Institute, Mayo Clinic, and Microsoft as partners or collaborators. SO004, SO008, SO009, SO011
CO027 The Mayo Clinic collaboration explicitly discloses that Mayo Clinic has a financial interest in the technology referenced in the announcement. SO009
CO028 Goodfire’s public commercial proof remains broad and category-based because it names customer types but does not list many named enterprise customers or contract counts. SO010, SO028
CO029 Goodfire should be classified as a private Series B-stage company based on investor profiles labeling it private and the February 2026 financing history. SO025, SO030, SO014
CO030 Goodfire’s best-supported current public valuation is $1.25 billion. SO004, SO014, SO015, SO016, SO017
CO031 Goodfire’s best-supported public total capital figure is above $200 million. SO004, SO014, SO016, SO022, SO018
CO032 No reviewed public source discloses Goodfire’s revenue, ARR, or customer count. SO004, SO014, SO025
CO033 No official source reviewed discloses employee headcount, but one independent profile estimates Goodfire had about 51 employees as of January 2026. SO003, SO028
CO034 Reviewed public sources identify only a single disclosed office location in San Francisco and do not name other offices. SO003, SO025
CO035 The public milestone arc visible in reviewed sources runs from seed financing in August 2024 to Series A in April 2025 and Series B in February 2026. SO022, SO020, SO004
CO036 Goodfire’s September 2025 Mayo Clinic announcement shows the company expanding from interpretability tooling into healthcare and genomic medicine partnerships. SO009
CO037 By February 2026 Goodfire was publicly describing partnerships spanning AI agents and life sciences. SO004, SO015
CO038 MIT Technology Review reported on April 30, 2026 that Goodfire was commercially releasing Silico as a fee-based tool for model debugging and steering. SO027
CO039 MIT Technology Review quoted an outside interpretability researcher saying Goodfire is adding “precision to the alchemy” rather than making model design fully principled. SO027
CO040 An independent health-tech analysis argues the $1.25 billion valuation is aggressive for a research-first company with early commercial traction and an estimated 51 employees. SO028
CO041 Goodfire’s public materials show active field-building and recruiting through a fellowship program, Stanford guest lectures, and ongoing in-person hiring in 2025-2026. SO003, SO012, SO013
CM001 Goodfire positions itself as an interpretability lab focused on understanding and intentionally designing AI rather than only monitoring outputs. SM001, SM007
CM002 Silico is described as a model design environment for training and debugging models on Goodfire infrastructure. SM003, SM007
CM003 Goodfire says it partners with organizations training or fine-tuning foundation models across architectures and modalities. SM003, SM004, SM005, SM006
CM004 Goodfire claims its language-model workflow cut hallucinations by 58% without degrading benchmark performance and at about 90x lower cost than LLM-as-a-judge. SM004
CM005 Goodfire publicly markets use cases across language models, genomics, and robotics or vision instead of only text-model applications. SM004, SM005, SM006
CM006 Goodfire says it works with partners such as Arc Institute, Mayo Clinic, and Microsoft and uses a shared environment with customers. SM007
CM007 Goodfire publicly describes inference-time monitors and production monitoring as part of its intentional-design platform. SM001, SM007
CM008 Goodfire argues that black-box prompting and fine-tuning are inadequate for reliable high-stakes AI engineering and that feature steering can substitute for some fine-tuning work. SM008, SM009
CM009 Goodfire's pilot agreement starts with internal evaluation of software plus services and explicitly aims toward a later commercial license. SM014
CM010 The pilot agreement requires customer cooperation, access to software or equipment, and designated contacts, implying a high-touch delivery model. SM014
CM011 Prima Mente used Goodfire to decode an epigenomics model for biomarker discovery and model redesign, showing a plausible scientific-AI buyer archetype. SM005, SM015
CM012 Goodfire and Mayo frame interpretability as a way to validate model predictions, reduce spurious correlations, and improve scientific or clinical relevance under governance controls. SM005, SM010
CM013 MIT Technology Review says Goodfire is one of a small handful of companies pioneering mechanistic interpretability and that frontier labs already have internal interpretability teams. SM030
CM014 MIT says Silico is most usable where customers can access model internals, which is easier for open-source or in-house models than for closed models like ChatGPT or Gemini. SM003, SM030
CM015 MIT reports that Goodfire will price Silico case-by-case instead of publishing standard pricing. SM030
CM016 Gartner says generative-AI ROI varies widely by use case and that hidden costs such as compliance reviews, retraining, and internal overhead can exceed initial expectations. SM016
CM017 Gartner places generative AI in the 2025 Trough of Disillusionment, which implies more cautious implementation expectations even as interest remains high. SM016
CM018 NIST's AI Risk Management Framework treats trustworthy, governable AI as a prerequisite for adoption in higher-risk settings. SM018
CM019 PwC reports that AI-exposed industries have 3x higher revenue-per-worker growth since 2022 and workers with AI skills command a 56% wage premium. SM017
CM020 Goodfire's relevant market boundary is narrower than broad generative-AI narratives and should focus on model design, interpretability, and model-behavior tooling for teams that can inspect or modify internals. SM001, SM003, SM014, SM030
CM021 The included spend pool covers representation analysis, failure diagnosis, steering, interpretable training feedback, and production monitors, while excluding generic AI hardware, generic copilots, and pure app-performance monitoring. SM003, SM007, SM023
CM022 Arize, Fiddler, Datadog, LangSmith, Langfuse, Patronus, Arthur, and Humanloop show that tracing, evaluation, monitoring, and agent control are already recognized software categories. SM019, SM021, SM023, SM024, SM025, SM027, SM028, SM029
CM023 Those adjacent platforms mostly observe prompts, traces, sessions, and outputs, whereas Goodfire's differentiation claim is control over internal features, parameters, or latent representations. SM003, SM011, SM013, SM019, SM021, SM024, SM025
CM024 Arize sells from free or open-source tooling to a $50-per-month Pro plan and custom enterprise pricing, showing the adjacent observability layer already has self-serve pricing and startup programs. SM019, SM020
CM025 Fiddler publishes a developer price of $0.002 per trace and markets enterprise guardrails, observability, and governance as one platform. SM021, SM022
CM026 Langfuse publishes prices from free to $29 per month Core, $199 per month Pro, and $2,499 per month Enterprise, with enterprise security and support features. SM025, SM026
CM027 Humanloop markets enterprise evaluation tooling with a free trial, 50 eval runs, and 10,000 logs per month, reinforcing that adjacent budgets often begin with workflow tooling rather than custom research engagements. SM029
CM028 Goodfire's direct market reach is highest in frontier labs because they already run interpretability teams, possess model internals, and value precise control over training and behavior. SM003, SM007, SM030
CM029 Enterprise model teams are reachable when they train or fine-tune proprietary or open-weight models, but teams using only closed APIs are outside Goodfire's near-term reach. SM003, SM009, SM014, SM030
CM030 Scientific-AI teams in genomics, biology, and robotics are attractive because model internals can reveal domain mechanisms, improve generalization, and validate whether predictions rely on real structure or shortcuts. SM005, SM006, SM010, SM012, SM015
CM031 Regulated adopters have strong need for interpretability and trustworthy AI, but procurement and deployment cycles are slower because governance, privacy, and evidence standards are higher. SM010, SM017, SM018
CM032 Goodfire's adoption motion likely starts with a pilot or design-partner evaluation, then requires model and data access, interpretability work, and only later expands to production monitoring and longer-term licensing. SM003, SM007, SM014
CM033 In this market the buyer, user, and payer often differ, with research or platform leaders buying, model scientists and safety teams using, and AI R&D or platform budgets paying. SM002, SM003, SM014, SM030
CM034 The category grows as models take on higher-stakes tasks in health, science, finance, and autonomous agent workflows where output-only evaluation is insufficient. SM005, SM010, SM021, SM023, SM024
CM035 Agent-observability vendors frame autonomous decisions, guardrails, and repeatable evaluation as business-critical, which expands the adjacent budget pool that Goodfire can sell into or alongside. SM021, SM022, SM023, SM024, SM025, SM027
CM036 Dependence on model-internal access is a major constraint because Goodfire's tooling requires deeper access than teams using only hosted closed-model APIs can usually provide. SM003, SM014, SM030
CM037 Goodfire presents interpretability as precision engineering that can turn training into intentional design. SM007, SM008
CM038 MIT Technology Review quotes an external researcher saying Goodfire is adding precision to alchemy, which challenges the precision-engineering narrative. SM030
CM039 Goodfire's own intentional-design essay says the agenda is at the beginning of a deep technical tree and still needs better interpretability tools and algorithms. SM008
CM040 Goodfire's parameter-decomposition research says current interpretability methods still struggle to map model behavior cleanly to underlying parameters and circuits, which reinforces technical immaturity. SM013
CM041 Goodfire's manifold-steering research argues that linear steering often mismatches model geometry and that geometry-aware steering works better, suggesting the technical edge is not commodity tracing. SM011
CM042 Goodfire's Evo 2 work shows interpretability can reveal biologically relevant features and possibly guide DNA generation, supporting a scientific-AI market lens beyond enterprise copilots. SM005, SM012
CM043 Goodfire says customer conversations show teams prioritize rapid iteration and migration to newer models over heavy fine-tuning, which implies demand for lighter-weight control tooling. SM009
CM044 Public adjacent pricing creates a floor for what buyers expect to pay for observability and eval tooling, but Goodfire's undisclosed case-by-case pricing means it must win on higher-value model-internal outcomes rather than commodity traces. SM020, SM022, SM026, SM029, SM030
CM045 Because Goodfire has no public pricing schedule, customer count, or disclosed recurring revenue, a defensible TAM, SAM, or SOM cannot be computed from public evidence alone. SM014, SM030
CM046 The most evidence-backed near-term SOM is a small set of frontier labs, advanced enterprise model teams, and scientific model builders willing to grant model access and buy services-heavy pilot engagements. SM003, SM005, SM006, SM014, SM030
CM047 Published self-serve observability prices imply an annual software band of roughly $348 to $2,388 before enterprise add-ons or heavy usage. SM020, SM026
CM048 Public list pricing shows adjacent enterprise-grade observability software can reach at least about $29,988 per year before overage charges or custom services. SM026
CM049 Fiddler's per-trace pricing implies annual monitoring spend can range from hundreds to tens of thousands of dollars depending on trace volume. SM022
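The annualized figures in CM047 to CM049 come from multiplying published monthly list prices by twelve and, for per-trace pricing, multiplying the unit price by a trace volume. The sketch below reproduces that arithmetic. The plan prices are the ones cited in CM025 and CM026; the monthly trace volumes are hypothetical values chosen only to illustrate the range and are not vendor or customer data.

```python
from decimal import Decimal

# Monthly list prices in USD as cited in CM026 (Langfuse).
LANGFUSE_MONTHLY_USD = {"Core": 29, "Pro": 199, "Enterprise": 2499}

# Per-trace developer price in USD as cited in CM025 (Fiddler).
FIDDLER_USD_PER_TRACE = Decimal("0.002")

# ASSUMPTION: illustrative monthly trace volumes, not sourced from any vendor.
ASSUMED_TRACES_PER_MONTH = [10_000, 100_000, 1_000_000]


def annualize(monthly_usd):
    """Convert a monthly list price to a 12-month figure (no discounts or overages)."""
    return monthly_usd * 12


def annual_trace_spend(traces_per_month):
    """Annual spend at a flat per-trace price for a constant monthly volume."""
    return FIDDLER_USD_PER_TRACE * traces_per_month * 12


annual_list = {plan: annualize(price) for plan, price in LANGFUSE_MONTHLY_USD.items()}
trace_spend = {n: annual_trace_spend(n) for n in ASSUMED_TRACES_PER_MONTH}

for plan, usd in annual_list.items():
    print(f"Langfuse {plan}: ${usd:,} per year")
for volume, usd in trace_spend.items():
    print(f"{volume:,} traces per month: ${usd:,.0f} per year")
```

Under these assumptions the Core and Pro plans bound the self-serve band in CM047, the Enterprise plan gives the figure in CM048, and the three trace volumes span the hundreds-to-tens-of-thousands range described in CM049. None of this says anything about what Goodfire itself charges, which remains undisclosed.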
CP001 Goodfire positions Silico as the first platform for intentional model design and as a workspace for training and debugging models at frontier scale. SP001
CP002 Goodfire says its language-model workflow predicts failures before deployment and can correct failure modes directly without retraining from scratch. SP001, SP002
CP003 Goodfire extends the same model-internal workflow into life sciences and robotics/vision use cases, not just generic chat applications. SP003, SP004
CP004 Goodfire explicitly frames feature steering as an alternative to black-box prompting and fine-tuning workflows. SP005
CP005 Goodfire disclosed a $150 million Series B at a $1.25 billion valuation and third-party coverage describes roughly $209 million raised in total. SP006, SP008
CP006 MIT Technology Review describes Goodfire as one of a small handful of mechanistic-interpretability pioneers alongside Anthropic, OpenAI, and Google DeepMind. SP007
CP007 MIT Technology Review says frontier labs already have internal interpretability teams, making them Goodfire's closest direct incumbent alternative for top-end model builders. SP007
CP008 MIT Technology Review says Silico is most useful where customers can inspect a model's inner workings, limiting its applicability on closed models such as ChatGPT or Gemini. SP007
CP009 Outside researcher Leonard Bereska told MIT Technology Review that Goodfire may be adding precision to existing AI alchemy rather than fully turning model building into engineering. SP007
CP010 On Healthcare Tech characterizes Goodfire as a roughly 51-person, research-first organization whose $1.25 billion valuation looks aggressive relative to disclosed commercial traction. SP008
CP011 Goodfire's probe-based data-attribution work claims a 63% reduction in harmful behavior after filtering flagged data and larger reductions after swapping labels or removing responsible sources. SP009
CP012 Goodfire says SAE probes for Rakuten AI agents generalized better than other probes on PII detection and were cheaper than LLM-as-judge baselines. SP010
CP013 Goodfire's Llama 3 research preview claims it can extract modifiable internal features and steer behavior while minimizing performance degradation. SP011
CP014 Goodfire's VPD explainer says direct edits to decomposed parameter subcomponents can produce precise behavior changes without retraining. SP012
CP015 Goodfire says its self-correcting-search collaboration improved viable candidate materials by about 30%, supporting its claim that mechanistic tools can affect model behavior in non-LLM domains. SP013
CP016 Goodfire's own reasoning-theater research argues that chain-of-thought can be unfaithful to internal computation, which weakens the claim that trace-level reasoning alone is enough for debugging. SP014
CP017 Arize Phoenix markets an open-source platform for agent development and evaluation built around tracing, evals, datasets, and experiments. SP015
CP018 Arize prices AX Pro at $50 per month with 50k spans and 10 GB, while enterprise packaging is custom and can be self-hosted. SP016
CP019 Fiddler positions itself as a unified AI observability and security platform with lifecycle evaluation, monitoring, and real-time guardrails. SP017
CP020 Fiddler publishes a free tier and a Developer plan priced at $0.002 per trace, with enterprise deployment options spanning SaaS, VPC, and on-premises. SP018
CP021 Arthur markets a full-lifecycle platform for reliable AI that combines continuous evals, policies, guardrails, dashboards, and oversight. SP019
CP022 Datadog ties agent observability to its broader application-monitoring estate and says teams can test prompt, model, and tool changes against production data in one workflow. SP020
CP023 Datadog publishes a free tier up to 40K LLM spans per month and a Pro plan starting at $160 per month for 100K spans, with no separate evaluation fee. SP020
CP024 LangSmith markets agent observability with framework-agnostic SDKs and says it has a free tier while paid plans scale with trace volume. SP021
CP025 Langfuse markets an open-source AI engineering platform, self-hosting, OpenTelemetry compatibility, 10+ billion observations per month, and more than 100,000 engineers building on it. SP022
CP026 Langfuse publishes transparent self-serve pricing: free Hobby, $29 per month Core, $199 per month Pro, and $2,499 per month Enterprise, plus a unit-based overage ladder. SP023
CP027 Humanloop historically sold enterprise tools to develop, evaluate, and ship trustworthy LLM apps, including private deployment options and a free trial. SP024
CP028 Humanloop is now joining Anthropic and sunsetting its platform, so it is better read as a consolidation signal than as a durable stand-alone peer. SP025
CP029 Weights says its products and services were wound down after its team joined OpenAI, reinforcing the pattern that AI tooling teams can be absorbed by frontier labs. SP026
CP030 NIST's AI RMF and Gartner's GenAI guidance both emphasize trustworthiness, governance, evaluation, and hidden operating costs in sensitive AI deployments. SP027, SP028
CP031 Goodfire's closest direct alternatives are internal frontier-lab interpretability teams and advanced in-house build paths, not ordinary tracing vendors. SP001, SP007, SP015, SP021
CP032 Most adjacent vendors in the reviewed set compete at the trace, eval, guardrail, or governance layer rather than through direct edits to learned model features. SP017, SP019, SP020, SP021, SP022
CP033 Goodfire appears best aligned to buyers building or adapting open-weight models in high-stakes domains where pre-deployment diagnosis matters more than general observability breadth. SP002, SP003, SP004, SP007
CP034 Datadog, LangSmith, and Langfuse have stronger visible developer distribution than Goodfire because they ride existing observability, framework, or open-source workflows. SP020, SP021, SP022
CP035 Fiddler and Arthur compete more directly with governance- and trust-led procurement because they explicitly emphasize guardrails, policies, monitoring, and enterprise oversight. SP017, SP018, SP019
CP036 Goodfire's public commercial disclosure is thinner than that of Arize, Fiddler, Datadog, and Langfuse because MIT Technology Review describes Silico pricing as case-by-case and Goodfire declined to give specifics. SP007, SP016, SP018, SP020, SP023
CP037 Free or low-cost adjacent tools put price pressure on any attempt to sell Goodfire as a generic AI engineering or observability layer instead of a differentiated model-design product. SP015, SP016, SP020, SP022, SP023, SP024
CP038 Category consolidation is already visible through Humanloop's move to Anthropic and Weights' move to OpenAI, which raises the risk that interpretability adjacencies become features inside larger labs or stacks. SP025, SP026
CP039 Goodfire's moat is strongest if its research outputs in steering, attribution, probes, and domain science can be productized into repeatable workflows rather than bespoke research wins. SP006, SP009, SP010, SP011, SP013
CP040 The public record does not yet show enough win-rate, realized-pricing, or retention evidence to underwrite Goodfire's competitive durability with high confidence. SP007, SP008
CP041 The status-quo substitute for many buyers remains an in-house black-box stack of prompting, eval harnesses, fine-tuning, and guardrails, which is cheaper up front but less mechanistically explanatory. SP005, SP015, SP021, SP024
CP042 Goodfire's partner access today looks more domain-credibility-led than platform-distribution-led: collaborations with Microsoft, Mayo Clinic, Rakuten, and Radical AI support relevance but do not equal Datadog- or LangChain-style installed-base reach. SP006, SP010, SP013, SP020, SP021
CP043 Humanloop packages enterprise LLM evals as a standalone platform, reinforcing that adjacent evaluation vendors compete for some of the same budgets Goodfire targets. SP029
CI001 Goodfire announced a $150 million Series B round at a $1.25 billion valuation in February 2026. SI002, SI016, SI017
CI002 Goodfire's 2026 Form D reports $149,999,796 sold after a first sale on 2025-12-17, against a total offering amount of $161,674,124. SI028
CI003 Goodfire announced a $50 million Series A round in April 2025. SI021, SI022, SI023, SI027
CI004 Goodfire's 2025 Form D reports $52,029,991 sold after a first sale on 2025-04-02. SI027
CI005 At least $202,029,787 of equity sold across Goodfire's 2025 and 2026 Form D filings is directly verifiable from primary filing evidence. SI027, SI028
CI006 Goodfire says the Series B proceeds will fund frontier research, next-generation product development, and scaled partnerships across AI agents and life sciences. SI002, SI016, SI018
CI007 Goodfire describes Silico as a model-design environment and workspace for training and debugging models on Goodfire infrastructure. SI001, SI003
CI008 Goodfire's product and vertical pages route prospects to request access or contact the company instead of publishing self-serve commercial pricing. SI003, SI004, SI005, SI006, SI008
CI009 Goodfire's contact page says its platform is used by Fortune 500 enterprises, major healthcare institutions, and AI research labs. SI008
CI010 Goodfire's Series B post says the company has partnered with Arc Institute, Mayo Clinic, and Microsoft to deploy its technology. SI002
CI011 In the Prima Mente case study, Goodfire says its research scientists embedded with the customer and built a biomarker-discovery pipeline around the customer's model. SI011
CI012 Goodfire's public contract terms show a commercial bundle that can include platform access plus support, technical assistance, field engineering support, research activities, collaboration activities, and deliverables. SI013, SI015
CI013 Goodfire's MSA and TOS place core commercial fees in negotiated order forms rather than in public documentation. SI013, SI015
CI014 Goodfire's TOS explicitly contemplates overage charges when usage exceeds the allotment included in the applicable order form. SI015
CI015 Goodfire's pilot agreement says pilot access is for internal evaluation and requires a separate commercial license for post-evaluation use. SI014
CI016 Goodfire's TOS says usage reports provided through the platform dashboard or on request are the authoritative source for calculating Fees. SI015
CI017 Goodfire's MSA says it will not use Customer Data to train foundation models or generalized machine-learning models for the benefit of Goodfire, other customers, or third parties. SI013
CI018 Goodfire's TOS gives Goodfire a perpetual license to use Workflow Data to provide, improve, train, fine-tune, and commercialize the platform, subject to confidentiality constraints. SI015
CI019 Goodfire's MSA and TOS allow suspension for overdue accounts and provide for late-payment interest of 1.5 percent per month. SI013, SI015
CI020 Goodfire announced SOC 2 Type II compliance and a public SOC 3 report by February 2026. SI010
CI021 Goodfire's official vertical pages target teams training or fine-tuning AI models across architectures and modalities rather than retail end users. SI003, SI004, SI005, SI006
CI022 Goodfire's RLFR post claims a 58 percent reduction in hallucinations in Gemma-3-12B-IT at roughly 90 times lower cost than an LLM-as-a-judge alternative, with no degradation on standard benchmarks. SI004, SI012
CI023 Goodfire's RLFR and life-sciences proof points are technical or scientific outcomes, not disclosed customer ROI or recognized revenue metrics. SI011, SI012, SI026
CI024 Goodfire's feature-steering post says the SAE demo interface and API were deprecated in February 2026. SI007
CI025 The deprecation of public preview tooling and the request-access posture together suggest Goodfire has shifted its public surface toward enterprise and custom deployments. SI003, SI007, SI008
CI026 Goodfire's public evidence includes named life-sciences proof points with Prima Mente, Mayo Clinic, and Arc Institute. SI002, SI005, SI011
CI027 Salesforce Ventures presents Goodfire as foundational enterprise infrastructure for understanding and intentionally designing modern AI. SI025
CI028 Menlo Ventures says Goodfire is productizing Ember and commercializing model understanding, and notes that Eric Ho previously scaled a prior company to more than $10 million in ARR. SI023
CI029 No reviewed public source discloses Goodfire's revenue, ARR, gross margin, cash balance, burn rate, runway, or customer retention metrics. SI001, SI002, SI003, SI013, SI015, SI016, SI026
CI030 No reviewed public source discloses public list pricing, minimum commits, or discount ladders for Silico or related commercial offerings. SI003, SI004, SI005, SI006, SI008, SI013, SI015
CI031 Because pricing is private and contracts are order-form based, Goodfire's realized pricing and software-versus-services revenue mix cannot be inferred from the official surface alone. SI011, SI013, SI014, SI015
CI032 A skeptical sector analysis argues that Goodfire's $1.25 billion valuation is aggressive for a roughly 51-person company with early commercial traction and not yet a predictable SaaS business. SI026
CI033 The same skeptical analysis argues that investors are underwriting Goodfire on research and platform option value rather than on publicly evidenced near-term software revenue. SI026
CI034 Goodfire's Series B was announced less than a year after its Series A, showing capital access that scaled faster than disclosed operating metrics. SI002, SI021, SI026
CI035 Goodfire's 2025 Form D lists 47 investors, while the 2026 Form D lists 19 investors. SI027, SI028
CI036 Goodfire's 2026 Form D total offering amount exceeds the press-announced $150 million sold amount, implying possible residual allocation or additional financing capacity within the same offering. SI016, SI028
CI037 No public debt facility or project-finance obligation surfaced in the reviewed sources, but the absence of disclosure should not be treated as proof of zero leverage. SI013, SI015, SI016, SI026
CI038 Post-Series-B capital adequacy can only be assessed qualitatively: Goodfire is well funded relative to public stage signals, but runway cannot be modeled without cash and burn data. SI016, SI026, SI028
CI039 Goodfire's public messaging implies a high-touch GTM motion centered on selective design partnerships rather than broad self-serve transaction volume. SI002, SI008, SI015
CI040 Because Goodfire's customer evidence includes embedded scientific work and its terms contemplate field engineering and collaboration activities, at least some current revenue likely mixes software access with services delivery. SI011, SI015
CI041 Goodfire publicly presents Radical AI as a materials-science design partner, supporting a commercialization path beyond language-model customers. SI029
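The Form D reconciliation behind CI002, CI004, CI005, and CI036 is simple arithmetic and can be checked directly. The sketch below uses only the amounts quoted in those claims; the variable names are illustrative, and the reading of the gap as possible residual capacity is the interpretation given in CI036, not a fact established by the filings.

```python
# Amounts in USD as quoted in the claims above.
SOLD_2025_USD = 52_029_991            # CI004: 2025 Form D amount sold
SOLD_2026_USD = 149_999_796           # CI002: 2026 Form D amount sold
OFFERING_2026_USD = 161_674_124       # CI002: 2026 Form D total offering amount
PRESS_SERIES_B_USD = 150_000_000      # CI001: press-announced Series B size

# CI005: equity sold that is verifiable from the two filings combined.
verified_equity_sold = SOLD_2025_USD + SOLD_2026_USD

# CI036: gap between the 2026 offering amount and the amount reported sold.
offering_gap_2026 = OFFERING_2026_USD - SOLD_2026_USD

# Rounding difference between the press figure and the filed amount sold.
press_vs_filing_gap = PRESS_SERIES_B_USD - SOLD_2026_USD

print(f"Verified equity sold (2025 + 2026): ${verified_equity_sold:,}")
print(f"2026 offering amount minus amount sold: ${offering_gap_2026:,}")
print(f"Press-announced size minus filed amount sold: ${press_vs_filing_gap:,}")
```

The first figure matches the $202,029,787 stated in CI005. The second is the portion of the 2026 offering not reported as sold, which CI036 treats as possible residual allocation or additional capacity rather than confirmed financing.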
CE001 Silico is described as the first platform for intentional model design and as a model-design environment built on Goodfire infrastructure. SE001, SE030
CE002 Silico markets five operator jobs around model internals: seeing inside predictions, running health checks, debugging failures, shaping behavior, and generalizing from less data. SE001
CE003 Goodfire's current product motion is request-access and partnership-led for teams training or fine-tuning foundation models across architectures and modalities. SE001, SE002, SE003, SE004
CE004 Goodfire's language-model workflow claims a 58% hallucination reduction with no degradation on performance benchmarks. SE002, SE005
CE005 The same language workflow claims roughly 90x lower intervention cost than LLM-as-a-judge approaches. SE002
CE006 The Hallucinations Viewer compares base and policy rollouts on LongFact++ and exposes intervention details for selected outputs. SE005
CE007 Goodfire's life-sciences surface claims state-of-the-art pathogenicity prediction across 839k ClinVar variants. SE003, SE015
CE008 Goodfire says EVEE provides interpretable predictions and explanations for all 4.2 million ClinVar variants. SE003, SE015
CE009 Prima Mente and Goodfire identified DNA fragment length as a dominant Alzheimer's signal and distilled the finding into a human-readable classifier. SE013, SE028
CE010 Goodfire says the Alzheimer's biomarker workflow generalized to an independent cohort. SE003, SE013
CE011 Goodfire's robotics and vision surface says teams can predict failures before deployment by inspecting latent representations directly. SE004
CE012 The robotics case study says Goodfire traced unstable behavior to brittle internal features and information bottlenecks. SE004
CE013 Goodfire markets feature steering as stronger than prompting when prompt engineering hits diminishing returns. SE006
CE014 Goodfire says feature steering can often replace fine-tuning for behavior changes but cannot add new knowledge to a model. SE006
CE015 Goodfire deprecated its earlier SAE demo interface and API in February 2026. SE006
CE016 MIT Technology Review reports that Silico uses agents to automate interpretability work that previously required human researchers. SE009, SE030
CE017 MIT Technology Review reports that Silico is priced case-by-case and is easier to use on open-source models than on closed APIs. SE030
CE018 Goodfire's intentional design thesis frames current AI training as guess-and-check and positions interpretability as closed-loop steering. SE007
CE019 Goodfire says intentional design aims to change what models learn from individual datapoints rather than hard-wiring heuristics into models. SE007
CE020 Goodfire says it released the first public sparse autoencoders trained on a true reasoning model, DeepSeek R1. SE010
CE021 Goodfire's R1 work says effective steering had to begin after the model's boilerplate response prefix rather than at the first response token. SE010
CE022 Goodfire reports that some R1 features revert toward original behavior under oversteering before outputs become incoherent. SE010
CE023 Goodfire argues that important model concepts often live on curved manifolds rather than along single linear directions. SE011, SE014
CE024 "Can SAEs Capture Neural Geometry?" says a single sparse-autoencoder feature gives only a partial view of curved internal structure. SE014
CE025 Goodfire says its manifold pipeline clusters sparse features to recover fuller geometric structure from internal representations. SE014
CE026 Stochastic Parameter Decomposition moves interpretability into parameter space by learning which weight components can be removed without changing behavior. SE017
CE027 Model diff amplification makes rare harmful behaviors 10 to 300 times more common in sampling, making them easier to detect. SE016
CE028 Goodfire says model diff amplification can reveal post-training side effects after only a fraction of a training run. SE016
CE029 Goodfire's eval-awareness study found naturally occurring verbalized eval awareness across all 19 benchmarks and 8 models it tested. SE012
CE030 Goodfire says prompt rewrites reduced eval awareness by 40% and an unsupervised method reduced it by 75%, with safe-behavior rates also falling. SE012
CE031 Paint With Ember uses a canvas that manipulates SDXL-Turbo internal activations instead of relying only on text prompts. SE019
CE032 Goodfire's research surfaces and phylogeny work argue that internal geometry recapitulates structured concepts across language, image, and genomic models. SE011, SE021
CE033 Goodfire's terms define the platform as software, APIs, tools, documentation, support, and services, and allow customers to bring models, files, datasets, code, and workflows into the platform. SE022
CE034 Public terms tie commercial fees and overages to order forms and usage reports rather than to a public rate card. SE022
CE035 The Pilot Agreement limits pilot use to internal evaluation and requires a separate commercial license after the evaluation period. SE023
CE036 Goodfire's terms and pilot agreement both describe support, technical assistance, field engineering, research activities, and deliverables around the platform. SE022, SE023
CE037 Goodfire's terms allow third-party products and permit access suspension for security, legal, operational, or overdue-account reasons. SE022
CE038 Goodfire's terms say customers retain customer materials while Goodfire retains Goodfire IP and broad rights over usage data and licensed workflow data. SE022
CE039 Goodfire announced SOC 2 Type II with no exceptions identified and a public SOC 3 summary. SE008
CE040 Goodfire says its Mayo collaboration operates under rigorous data privacy protocols and Mayo Clinic governance frameworks. SE027
CE041 NIST's AI RMF and generative-AI profile focus on embedding trustworthiness into AI design, development, use, and evaluation. SE033
CE042 Gartner says GenAI total cost of ownership is often understated and that critical decision use cases require more robust and interpretable approaches. SE034
CE043 Salesforce Ventures frames Goodfire as moving AI teams from guessing at behavior to measuring and shaping model intent and reasoning. SE031
CE044 On Healthcare Tech argues that interpretability could become infrastructure for regulated health AI, but public commercialization evidence still looks early. SE032
CE045 Public materials reviewed do not provide a public status page, self-serve API reference, or public deployment-count disclosure for Silico. SE001, SE022, SE030
CE046 Goodfire's careers page, Stanford guest lectures, and fellowship program show active research-engineering recruiting and practitioner education despite a limited public open-source product surface. SE024, SE025, SE026
CE047 Goodfire's public 2026 output is research-led and fast, with published releases on May 4, May 20, and May 21 covering eval awareness and neural geometry. SE012, SE014, SE020
CE048 PR Newswire says Series B proceeds will fund next-generation product development and partnership scaling across AI agents and life sciences. SE038
CE049 EVEE combines Evo 2 embeddings, lightweight probes, and frontier reasoning models to generate human-readable hypotheses about variant effects. SE015
CE050 Goodfire's phylogeny work says Evo 2 encodes tree-of-life relationships as a curved manifold, reinforcing its model-to-human knowledge-transfer thesis. SE021
CE051 Menlo Ventures and Lightspeed both describe Goodfire as an applied research lab translating mechanistic interpretability into productized tooling. SE035, SE036
CE052 PYMNTS reports that Goodfire internally uses a model design environment and deploys that shared environment forward with customers. SE037
CU001 Goodfire publicly says its platform is used by Fortune 500 enterprises, major healthcare institutions, and AI research labs. SU001, SU014
CU002 Goodfire positions Silico and related services for organizations training or fine-tuning foundation models across architectures and modalities. SU001, SU002
CU003 Goodfire says it engages deeply and selectively with teams building high-stakes or frontier systems where understanding and control are essential. SU014
CU004 Public product and contact copy imply buyers are research, platform, or product owners while day-to-day users are research scientists and ML engineers. SU001, SU002, SU003, SU004, SU005
CU005 Goodfire's named proof set is concentrated in life sciences, AI-agent infrastructure, and materials discovery rather than a wide range of end markets. SU003, SU006, SU008, SU010, SU011, SU013
CU006 Reviewed public sources do not disclose Goodfire's customer count. SU001, SU014, SU019, SU020
CU007 Reviewed public sources do not disclose Goodfire's segment-level revenue or ARR by customer type. SU014, SU020, SU022
CU008 The broad Fortune 500 adoption claim is not backed by a public list of named enterprise customers or outcomes. SU001, SU014
CU009 Goodfire's public sales surface is request-access and contact-led rather than self-serve. SU001, SU019
CU010 Prima Mente partnered with Goodfire to understand its Pleiades epigenomics model. SU006, SU007
CU011 Goodfire says its researchers embedded in Prima Mente's team while building a biomarker-discovery pipeline. SU006
CU012 Goodfire and Prima Mente identified a novel class of blood-borne biomarkers for Alzheimer's detection. SU006, SU007, SU022
CU013 Prima Mente's public outcome remains pre-validation because the biomarkers are still undergoing experimental validation and a publication is forthcoming. SU006, SU007
CU014 Goodfire collaborated with Arc Institute to interpret Evo 2, Arc's genomic foundation model. SU010, SU025
CU015 The initial Arc Institute disclosure described feature discovery and steering work that was still in its early stages rather than a mature production deployment. SU010
CU016 By March 2026, Goodfire's Evo 2 interpretability work had been updated to note its publication in Nature, increasing the scientific credibility of the Arc partnership. SU010, SU020
CU017 Goodfire says its Mayo Clinic collaboration combines interpretability work with Mayo's medical AI team and established data-governance frameworks. SU008
CU018 Public Mayo materials frame the work as genomic research and responsible-AI validation rather than routine clinical deployment. SU008, SU009
CU019 Goodfire's EVEE work is described as part of its ongoing collaboration with Mayo Clinic and is still undergoing peer review. SU009
CU020 Goodfire says EVEE achieves 0.997 AUROC on 839k ClinVar variants and provides predictions and explanations for all 4.2 million ClinVar variants. SU003, SU009
CU021 Goodfire says EVEE outputs are computational predictions rather than diagnoses and require further expert review and validation. SU009
CU022 Goodfire partnered with Rakuten on PII detection for multilingual AI-agent messages in a production-critical enterprise setting. SU013
CU023 The Rakuten deployment required synthetic-to-real generalization, multilingual English and Japanese coverage, lightweight inference, and very high recall. SU013
CU024 Goodfire says Rakuten deployed SAE probes and describes the system as the first known enterprise application of SAEs for language-model guardrails. SU013
CU025 Among reviewed sources, Rakuten is the clearest public evidence of a production Goodfire deployment. SU013, SU019
CU026 Goodfire and Radical AI publicly announced a partnership to apply interpretability to inverse materials design. SU011, SU012
CU027 Goodfire says its self-correcting-search work with Radical AI improved successful candidates by about 27% and generated about 30% more SUN materials in the target range. SU012
CU028 Radical AI's public partnership disclosure says more research directions and outcomes will be shared later, leaving commercialization maturity unclear. SU011, SU012
CU029 Goodfire says it deploys its model design environment forward with customers in a shared environment. SU014, SU026
CU030 MIT Technology Review reports that Silico pricing is determined case by case and Goodfire declined to provide specific pricing. SU019
CU031 MIT Technology Review says Silico could let smaller firms and research teams build or adapt open-source models without hiring interpretability researchers. SU019
CU032 Goodfire's public positioning is selective and high-touch rather than high-volume self-serve SaaS. SU001, SU014, SU019
CU033 Goodfire's Series B materials say the new funding will scale partnerships across AI agents and life sciences. SU021, SU022, SU023, SU026
CU034 Salesforce Ventures frames Goodfire around enterprise AI ROI, reliability, and control problems. SU017, SU018
CU035 The public proof set spans life sciences, AI-agent operations, materials science, and general frontier-model design. SU003, SU011, SU013, SU014
CU036 Goodfire's public blog history shows named collaboration proof surfacing across 2025 and 2026 rather than through a single isolated announcement. SU008, SU010, SU011, SU013, SU016
CU037 Goodfire's public materials distinguish broad segment claims from a much smaller set of named collaborators. SU001, SU003, SU006, SU008, SU010, SU011, SU013
CU038 No reviewed public source disclosed NRR, GRR, churn, renewal rate, or true retention cohorts for Goodfire. SU001, SU014, SU019, SU020
CU039 No reviewed public source disclosed contract length, commercial expansion metrics, or top-customer concentration for Goodfire. SU001, SU014, SU020, SU022
CU040 Arc Institute and Mayo Clinic each have later public follow-on evidence after their initial collaboration announcements, indicating relationship continuity but not proving paid renewal. SU008, SU009, SU010
CU041 The disclosed reference set is concentrated in a handful of named collaborators and is especially weighted toward life-sciences programs. SU003, SU006, SU008, SU009, SU010, SU020
CU042 The broad Fortune 500 claim remains materially weaker than the named proof set because no enterprise names or outcomes are publicly disclosed. SU001, SU014
CU043 MIT Technology Review quoted Leonard Bereska saying Goodfire is adding “precision to the alchemy,” a substantive critique of how principled the product really is. SU019
CU044 OnHealthcare argued that Goodfire's $1.25 billion valuation looks aggressive for a research-first company with relatively early commercial traction. SU020
CU045 OnHealthcare argued that the public valuation case relies more on platform option value than on disclosed revenue or customer metrics. SU020
CU046 Several scientific customer outcomes remain partly hypothesis-stage because Prima Mente's biomarkers are still under validation and EVEE is still undergoing peer review. SU006, SU007, SU009
CU047 Silico is most naturally usable where customers can inspect model internals, which may bias near-term adoption toward open-model teams and research labs over closed-model users. SU002, SU019
CU048 Goodfire's continued publication of frontier interpretability results supports a customer narrative built on research credibility as much as on packaged software. SU034
CU049 Mayo Clinic is a major medical institution, so Goodfire's disclosed collaboration carries meaningful signal for regulated-domain customer credibility. SU035, SU007
CR001 Goodfire positions itself as a research company using interpretability to understand, learn from, and design AI systems rather than relying on scale alone. SR006, SR022
CR002 Goodfire publicly argues that today's dominant AI-development process still cannot meaningfully understand, debug, or shape what models learn. SR005, SR006
CR003 Goodfire says current model training is still a costly guess-and-check process and presents intentional design as an attempt to move from open-loop tweaking toward closed-loop control. SR005
CR004 Goodfire also states that its techniques are early, the science is incomplete, and the hardest interpretability problems remain unsolved. SR005, SR023
CR005 MIT Technology Review described Silico as potentially useful but quoted an external mechanistic-interpretability researcher saying Goodfire is adding precision to alchemy rather than fully turning model building into engineering. SR013
CR006 Goodfire markets Silico as a model-design environment that can debug behavior, remove confounders, and diagnose failures before production, but access is still request-based rather than self-serve. SR009
CR007 Goodfire claims its platform is already used by Fortune 500 enterprises, major healthcare institutions, and AI research labs, but it does not disclose how many of those users are production customers or what they pay. SR008
CR008 Under the MSA and TOS, Goodfire only commits to support, service levels, implementation help, training, or professional services if those items are expressly defined in an order form. SR001, SR003
CR009 The TOS says pilot, beta, trial, evaluation, or pre-release access may be modified, suspended, or discontinued at any time and, absent an order form, carries no service levels, support commitments, security commitments, or availability commitments. SR003
CR010 Goodfire's default legal terms disclaim warranties that the platform or services will be uninterrupted, secure, accurate, complete, or error free. SR001, SR003
CR011 Goodfire's aggregate liability is capped at fees paid in the prior twelve months under the MSA and TOS, while pilot-agreement liability is capped at pilot fees. SR001, SR002, SR003
CR012 The TOS defines Usage Data broadly to include usage volumes, clickstream, logs, performance data, and error data, and classifies that Usage Data as Goodfire IP. SR003
CR013 The TOS gives Goodfire a perpetual, irrevocable, sublicensable license to use Workflow Data to provide, improve, evaluate, train, and commercialize the platform, subject to promises not to identify the customer or reveal confidential information. SR003
CR014 The MSA's feedback clause assigns customer feedback and related know-how to Goodfire without attribution or compensation. SR001
CR015 Goodfire's public contracts require compliance with U.S. export and re-export restrictions and any necessary government approvals for cross-border use of the service or customer materials. SR001, SR003
CR016 Goodfire says it is SOC 2 Type II compliant and directs customers to a trust portal for SOC 3 materials and full-report requests. SR004
CR017 Goodfire says its Mayo Clinic collaboration operates under rigorous data-privacy protocols and Mayo's established data-governance frameworks. SR011
CR018 Goodfire's Prima Mente case study says the customer needed to narrow model signals for experimental validation and FDA-approval progress, and that Goodfire identified a novel biomarker class through interpretability work. SR010
CR019 On Healthcare argues Goodfire's 2026 valuation is aggressive relative to a roughly 51-person organization that still appears early in commercial traction and is funding green-field research alongside product work. SR014
CR020 Goodfire's February 2026 Series B valued the company at $1.25 billion and brought total disclosed funding to just over $200 million. SR014, SR015, SR021, SR022
CR021 On Healthcare reports Goodfire had about 51 employees as of January 2026. SR014
CR022 Goodfire's careers page explicitly recruits people who thrive in fast-paced environments, signaling that the company is still building organizational depth while scaling. SR007
CR023 On Healthcare says Eric Ho has argued there are probably fewer than 150 full-time interpretability researchers in the world, implying a tight labor pool for the company's core discipline. SR014
CR024 MIT Technology Review says Silico pricing is determined case by case based on customer requirements and that Goodfire declined to disclose pricing specifics. SR013
CR025 On Healthcare argues Goodfire is not yet a predictable SaaS business with clearly disclosed recurring revenue mechanics. SR014
CR026 Goodfire's Prima Mente case study says Goodfire researchers embedded with the customer team and built the biomarker-discovery pipeline jointly. SR010
CR027 Goodfire's TOS contemplates support, technical assistance, field engineering, research activities, collaboration activities, and deliverables alongside platform access. SR003
CR028 Goodfire's May 2026 eval-awareness paper says verbalized eval awareness appeared across all 19 benchmarks and 8 models tested, with 515 manually verified instances. SR024
CR029 The same paper says eval awareness correlates with safer behavior and can systematically overestimate model alignment if benchmarks do not account for it. SR024
CR030 Goodfire reports that rewriting prompts cut verbalized eval awareness by 40 percent, that an unsupervised paraphrasing method cut it by 75 percent, and that refusal rates also dropped as awareness fell. SR024
CR031 Goodfire's Reasoning Theater work says chain-of-thought text can be performative; on easier tasks models often know the answer early and generate superfluous reasoning that lags internal state. SR025
CR032 Reasoning Theater also reports that probe-based early exit saved 68 percent of MMLU tokens and 33 percent of GPQA-Diamond tokens for DeepSeek-R1 while retaining more than 95 percent of baseline accuracy. SR025
CR033 Goodfire's model-diff-amplification post says harmful or backdoored behaviors are often a needle-in-a-haystack problem that standard evaluations miss until after deployment. SR030
CR034 Model diff amplification made harmful outputs 10x to 300x more frequent in testing and made a sleeper-agent backdoor about 100x easier to surface, but Goodfire says the method is only for detection and overstates real prevalence. SR030
CR035 Goodfire's memorization-via-loss-curvature work says language models memorize substantial portions of training data and that many questions about how memories are stored or localized remain unresolved. SR027
CR036 The same memorization work says suppressing memorization can preserve logical reasoning but degrade arithmetic and closed-book factual recall, showing that edits can trade off reliability across tasks. SR027
CR037 Goodfire's SPD post argues that sparse autoencoders do not explain feature geometry, do not converge to a single true decomposition as they scale, and that SPD still has non-trivial sensitivity and has only been validated on toy models. SR026
CR038 Goodfire's neural-geometry post says a single SAE direction gives only a partial view of curved structure, so interpreting features one by one misses the global picture. SR028
CR039 Goodfire's manifold-steering post says linear steering often mismatches internal geometry and can produce noisy, off-target effects. SR029
CR040 Goodfire's scientific-model interpretability work argues interpretability can improve reliability and transparency in downstream applications, especially clinical domains, but extracting mechanisms from complex models remains challenging and valuable. SR031
CR041 MIT Technology Review says interpretability tools like Silico could be essential for safety-critical applications in healthcare and finance, increasing the burden on Goodfire to prove deployment-grade trustworthiness rather than just interesting demos. SR013
CR042 NIST says the 2024 generative-AI profile and the 2026 critical-infrastructure concept note are intended to guide organizations toward concrete AI risk-management practices and trustworthy-AI controls. SR016
CR043 Gartner says enterprise GenAI outcomes depend heavily on data quality, governance, change management, realistic expectations, and talent availability. SR017
CR044 Datadog markets a production stack that combines prompt testing, evaluations, tracing, monitoring, sensitive-data scanning, and enterprise controls for AI systems. SR019
CR045 LangSmith markets observability, monitoring, hallucination debugging, and self-hosted or BYOC deployment options so sensitive traces can stay inside the customer environment. SR020
CR046 On Healthcare and Goodfire's Mayo materials both frame healthcare deployment as blocked by the gap between model predictions and biological understanding, positioning interpretability as a compliance and validation bridge rather than only a developer tool. SR011, SR014
CR047 Goodfire's public proof set is concentrated in a few named collaborations and case studies (Prima Mente, Mayo Clinic, Radical AI) plus unnamed enterprise claims, rather than a broad list of disclosed production references. SR008, SR010, SR011, SR012
CR048 The Radical AI partnership announcement says details on research directions and outcomes will be shared later, so one of Goodfire's flagship scientific partnerships is still forward-looking in public evidence. SR012
CR049 PwC says healthcare AI adoption is slower than in other sectors and emphasizes risk-controlled adoption, which raises go-to-market friction for vendors selling into regulated clinical workflows. SR018
CR050 Adjacent observability vendors already package evaluation, tracing, monitoring, and governance into production platforms, so Goodfire has to prove that interpretability delivers a distinct control layer rather than just another form of observability. SR019, SR020
CR051 Salesforce Ventures argues enterprise AI buyers are increasingly constrained by unclear ROI and by an inability to steer models reliably and consistently, framing control and reliability as buyer pain rather than purely research interests. SR032
CR052 Lightspeed framed Goodfire as critical infrastructure for explainable and mission-critical AI, explicitly tying future demand to regulation and to the need to productize interpretability for enterprises rather than only researchers. SR033
CR053 Investing.com reported that Goodfire works with clients including Microsoft, Mayo Clinic, and Arc Institute and plans to use new capital for model improvement, compute, and hiring, which underscores both the value of its partner set and the execution demands that come with it. SR034
CR054 Adjacent observability vendors already market tracing, monitoring, and workflow-debugging for AI agents, increasing substitution risk around parts of Goodfire's budget. SR035
CR055 Datadog now packages agent observability inside a broader enterprise monitoring suite, which can pull AI-operations budget toward incumbent platforms. SR036
CR056 Langfuse positions itself as an observability layer with open-source adoption, reinforcing price and workflow competition for AI development teams. SR037
CR057 Langfuse publishes transparent pricing, which increases buyer expectations for standardized software packaging that Goodfire has not yet publicly matched. SR038
CR058 LangSmith markets observability for AI agents and LLM applications, underscoring that adjacent tooling vendors can compete for the same developer and platform owners. SR039
CR059 Weights' combination with OpenAI highlights consolidation risk in AI tooling, where platform vendors can absorb adjacent products before smaller specialists fully scale. SR040
CR060 Mechanistic interpretability results still depend on advancing research rather than finished engineering playbooks. SR041
CR061 Goodfire continues to publish foundational work on latent computation, underscoring that part of its edge still resides in experimental research rather than commoditized software. SR042
CR062 Goodfire's ongoing publication cadence suggests platform differentiation remains tied to research velocity, which creates key-person and execution dependence if commercialization lags. SR043
CR063 Goodfire's valuation and product narrative still depend on turning novel neural-geometry research into dependable commercial workflows, which keeps execution risk elevated. SR044
CV001 Goodfire announced a $150 million Series B at a $1.25 billion valuation in February 2026. SV001, SV002, SV012
CV002 B Capital led the Series B and the syndicate included Juniper Ventures, Menlo Ventures, Lightspeed Venture Partners, South Park Commons, Wing Venture Capital, DFJ Growth, Salesforce Ventures, and Eric Schmidt. SV001, SV002, SV003
CV003 Goodfire said the Series B came less than a year after its Series A. SV002, SV003, SV006
CV004 Goodfire announced a $50 million Series A in April 2025 led by Menlo Ventures with Anthropic participating. SV006, SV007
CV005 Public company and press-release materials imply that Goodfire has raised more than $200 million in total capital after the Series B. SV001, SV002, SV006
CV006 Official and SEC materials identify Goodfire as a Delaware company founded in 2023 and based in San Francisco. SV028, SV029, SV030
CV007 Goodfire describes itself as a public benefit corporation focused on interpretability to understand, learn from, and design AI systems. SV002, SV030
CV008 The April 2025 Form D shows roughly $52.0 million sold against a roughly $52.1 million offering tied to the Series A financing. SV028, SV006
CV009 The February 2026 Form D lists Yan-David Erlich among related persons and shows a $161.7 million offering amount tied to the Series B-era filing. SV029, SV002
CV010 Goodfire positions Ember as its flagship model design environment and interpretability platform. SV006, SV007, SV010
CV011 Goodfire says Ember is meant to give programmable access to internal model features so users can inspect, edit, and retrain behavior more precisely than black-box methods. SV006, SV007, SV009
CV012 Goodfire says interpretability-guided training reduced hallucinations in a language model by roughly half. SV001, SV002, SV010
CV013 Goodfire cites collaborators including Arc Institute, Mayo Clinic, Prima Mente, and Microsoft. SV001, SV010
CV014 Goodfire says interpretability work surfaced a novel class of Alzheimer's biomarkers from Prima Mente's epigenetic model. SV001, SV002, SV010
CV015 Goodfire announced SOC 2 Type II compliance with no exceptions identified in February 2026. SV031
CV016 Goodfire continued publishing 2026 research across neural geometry, steering, parameter decomposition, and pooling methods. SV022, SV024, SV025, SV027, SV031
CV017 Goodfire's Llama 3 research preview says it trained sparse autoencoders on Llama-3-8B and used causal feature interventions to steer outputs while minimizing degradation. SV023
CV018 Goodfire's Geometric Calculator page says Llama 3.1 8B uses a general-purpose addition module that handles months, days, and arithmetic via circular representations. SV024
CV019 Goodfire's Covariance Pooling page argues second-moment pooling outperforms mean pooling on downstream genomic tasks. SV025
CV020 Goodfire's Painting With Concepts page shows interpretability tooling applied to SDXL-Turbo image generation, indicating modality expansion beyond text. SV026
CV021 Goodfire's VPD explainer says the company decomposed a 67M-parameter model into simple pieces and used that structure to edit behavior without training. SV027
CV022 Goodfire's product wedge sits deeper in the stack than observability vendors because it aims to intervene on model internals rather than only trace outputs or enforce guardrails. SV009, SV019, SV020, SV021
CV023 Arize Phoenix positions itself around tracing, evals, and agent observability rather than model-internal design. SV019
CV024 Fiddler positions its product around observability, guardrails, and governance for agents and predictive AI rather than model-internal representation editing. SV020
CV025 LangSmith positions its product around tracing, monitoring, and clustering for agent behavior rather than model-internal steering. SV021
CV026 Gartner says generative AI entered the 2025 trough of disillusionment and that ROI depends on governance, change management, and full cost accounting. SV017
CV027 NIST says the AI RMF and its generative AI profile exist to help organizations manage trustworthiness and AI risk across design, deployment, and evaluation. SV018
CV028 The On Healthcare analysis says Goodfire raised $209 million across seed, Series A, and Series B and estimated the team at roughly 51 employees as of January 2026. SV010
CV029 The On Healthcare analysis argues that the $1.25 billion valuation is aggressive for a research-first company with relatively early commercial traction. SV010
CV030 TechCrunch's 2026 mega-round list places Goodfire among U.S. AI companies that raised $100 million or more in early 2026 at a $1.25 billion valuation. SV012
CV031 TechCrunch reported that Eric Schmidt's Hillspire invested directly in Goodfire as family offices and private wealth moved earlier into AI deals. SV011
CV032 Anysphere was valued at $9.9 billion after surpassing $500 million in ARR. SV013
CV033 Harvey was reportedly raising at an $11 billion valuation after reaching $190 million in ARR by the end of 2025. SV014
CV034 Glean reached a $7.2 billion valuation after surpassing $100 million in ARR. SV015
CV035 Anthropic was valued at $350 billion in April 2026 with up to $40 billion of Google investment and large compute commitments. SV016
CV036 Unlike Anysphere, Harvey, and Glean, Goodfire's public round materials do not disclose revenue or ARR, so a comparable revenue multiple cannot be responsibly calculated from public evidence. SV001, SV002, SV010, SV013, SV014, SV015
CV037 The current mark therefore looks like strategic option value on category leadership, research talent, and future platform commercialization rather than a fundamentals-backed software multiple. SV001, SV009, SV010, SV017
CV038 Goodfire's strategic investor mix—Anthropic in Series A, Salesforce in Series B, and Eric Schmidt in Series B—supports the view that technically sophisticated buyers think interpretability will matter commercially. SV006, SV009, SV011, SV002
CV039 Goodfire's market relevance is helped by enterprise pressure for explainability, governance, and reliable ROI in AI deployments. SV009, SV017, SV018
CV040 Public evidence still does not disclose customer count, pricing, contract structure, retention, gross margin, or software-versus-services mix. SV001, SV002, SV010
CV041 A plausible bull case requires proof that Goodfire is converting research credibility into repeatable software revenue and durable enterprise adoption. SV009, SV017, SV019
CV042 Without that proof, a base case should haircut the last round and anchor below $1.25 billion because market demand is real but commercial evidence is incomplete. SV010, SV017, SV013, SV014, SV015
CV043 A reasonable public-evidence bear case is a sub-$650 million outcome if commercialization stays bespoke, competitors absorb budget, or private AI multiples compress. SV010, SV019, SV020, SV021, SV013
CV044 A reasonable public-evidence base case is roughly $800 million to $1.1 billion, implying the last round already prices in part of the bull thesis. SV010, SV013, SV014, SV015, SV017
CV045 A reasonable public-evidence bull case is roughly $1.25 billion to $1.85 billion, which requires disclosed software revenue, strong design-partner conversion, and continued research and enterprise validation. SV001, SV009, SV010, SV015
CV046 Given stage and disclosure opacity, another private round or strategic acquisition is a more plausible near-to-mid-term path than a public listing. SV010, SV012, SV015, SV016
CV047 The most supportable current recommendation is research-more rather than buy because company-quality evidence exceeds pricing evidence. SV001, SV010, SV017, SV018
CV048 The most supportable valuation stance is stretched because the $1.25 billion round sits at the lower bound of the bull case and above the top of the base case, not at the center of the base case. SV010, SV013, SV014, SV015, SV017
CV049 Entry discipline should require NDA-gated disclosure of ARR or revenue, pricing, top-customer concentration, gross margin, and the post-Series-B preference stack before underwriting above the base-case range. SV010, SV017, SV018
CV050 Thesis-break triggers include failure to disclose recurring revenue quality, inability to convert partners into repeatable platform customers, or evidence that observability vendors can satisfy budgets without Goodfire's deeper tooling. SV009, SV019, SV020, SV021, SV026
CV051 Goodfire's valuation case depends partly on owning distinctive interpretability research that competitors may not easily replicate. SV032
CV052 Goodfire continues to invest in foundational interpretability methods, which supports upside optionality but also means commercial value still depends on converting research into repeatable product adoption. SV033
CV053 Goodfire's upside case still depends on scaling its interpretability research edge into a durable commercial moat before adjacent tooling categories commoditize around it. SV034
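The scenario ranges in CV043 to CV045 and the stance in CV048 can be checked with simple arithmetic. The sketch below is illustrative only: the function and variable names are hypothetical, and using the base-case midpoint as a summary figure is an assumption made for this example, not a method stated in the report. All dollar figures come from the claims above and are in USD millions.

```python
# Scenario ranges in USD millions, taken from claims CV043-CV045.
# The bear case is stated only as "sub-$650 million", so only its ceiling is modeled.
LAST_ROUND = 1250.0          # February 2026 Series B valuation (CV001)
BEAR_CEILING = 650.0         # CV043
BASE = (800.0, 1100.0)       # CV044
BULL = (1250.0, 1850.0)      # CV045


def midpoint(bounds):
    """Midpoint of a (low, high) range; an illustrative summary, not a forecast."""
    low, high = bounds
    return (low + high) / 2


def discount_to_mark(value, mark=LAST_ROUND):
    """Fractional discount of `value` relative to the last-round mark."""
    return (mark - value) / mark


def position_in_range(value, bounds):
    """0.0 at the lower bound of the range, 1.0 at the upper bound."""
    low, high = bounds
    return (value - low) / (high - low)


base_mid = midpoint(BASE)
print(f"Base-case midpoint: ${base_mid:,.0f}M")
print(f"Discount at base-case midpoint: {discount_to_mark(base_mid):.0%}")
print(f"Discount at base-case low:      {discount_to_mark(BASE[0]):.0%}")
print(f"Discount at base-case high:     {discount_to_mark(BASE[1]):.0%}")
print(f"Discount at bear-case ceiling:  {discount_to_mark(BEAR_CEILING):.0%}")
print(f"Mark position within bull range: {position_in_range(LAST_ROUND, BULL):.2f}")
```

Under these inputs the last round sits exactly at the bottom of the bull range and 12% to 36% above the base-case range, which is the arithmetic behind the stretched stance in CV048.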
Sources
ID Publisher Title Quote
SO001 Goodfire Goodfire homepage Goodfire is a research company using interpretability to understand, learn from, and design AI systems.
SO002 Goodfire Goodfire company page
SO003 Goodfire Goodfire careers All roles are full-time, in person five days a week at our San Francisco, Telegraph Hill office.
SO004 Goodfire Our Series B Today, we’re excited to announce a $150 million Series B funding round at a $1.25 billion valuation.
SO005 Goodfire Intentionally Designing the Future of AI At Goodfire, we’re developing the science and technology that lets us steer model training — a process we’re calling intentional design.
SO006 Goodfire On optimism for interpretability At Goodfire, we believe we can engineer frontier AI systems that are understandable.
SO007 Goodfire Silico The first platform for intentional model design.
SO008 Goodfire Life Sciences We partner with companies training foundation models across architectures and modalities to interpret their models.
SO009 Goodfire Goodfire Announces Collaboration to Advance Genomic Medicine with AI Interpretability Mayo Clinic has a financial interest in the technology referenced in this press release.
SO010 Goodfire Goodfire contact Our platform is used by Fortune 500 enterprises, major healthcare institutions, and AI research labs.
SO011 Goodfire Prima Mente customer story Goodfire’s platform for in silico science decoded their model, identifying a novel class of biomarkers for Alzheimer’s detection.
SO012 Goodfire Fellowship Fall 25 We’re excited to announce that we’ll be bringing on several Research Fellows and Research Engineering Fellows this fall for our fellowship program.
SO013 Goodfire AP293 guest lectures 25 We gave three guest lectures in Surya Ganguli’s course on interpretability at Stanford last fall.
SO014 PR Newswire AI lab Goodfire raises $150M at $1.25B valuation to design models with interpretability Today, Goodfire—the AI research lab using interpretability to understand, learn from, and design models—announced a $150 million Series B funding round at a $1.25 billion valuation.
SO015 Yahoo Finance AI lab Goodfire raises $150M at $1.25B valuation to design models with interpretability
SO016 Pulse 2.0 Goodfire: $150 Million Series B At $1.25 Billion Valuation Raised For Interpretability AI Lab The company has raised more than $200 million in total backing from a mix of venture firms and individual investors.
SO017 Tech Funding News Goodfire raises $150M Series B at $1.25B valuation for interpretability AI
SO018 PR Newswire Goodfire raises $50M Series A to advance AI interpretability research This funding, which comes less than one year after its founding, will support the expansion of Goodfire’s research initiatives and the development of the company’s flagship interpretability platform, Ember.
SO019 Yahoo Finance Goodfire raises $50M Series A to advance AI interpretability research
SO020 Goodfire Announcing our $50M Series A Today, we’re excited to announce a $50 million Series A funding round led by Menlo Ventures.
SO021 Menlo Ventures Leading Goodfire’s $50M Series A to interpret how AI models think
SO022 Lightspeed Venture Partners Goodfire: Building Interpretable AI We at Lightspeed are thrilled to lead their $7M seed round.
SO023 Lightspeed Venture Partners Goodfire company profile
SO024 Salesforce Ventures Welcome, Goodfire Goodfire was founded by Eric Ho, Daniel Balsam, and Thomas McGrath.
SO025 Salesforce Ventures Goodfire company profile
SO026 VCNewsDaily Goodfire Venture Capital Funding
SO027 MIT Technology Review This startup’s new mechanistic interpretability tool lets you debug LLMs In reality, they are adding precision to the alchemy.
SO028 OnHealthcare Goodfire AI and the billion-dollar interpretability bet The valuation jump ... is aggressive for a company with around 51 employees and what appears to be relatively early commercial traction.
SO029 PYMNTS Goodfire raises $150 million to better understand AI
SO030 LSVP Goodfire company page
SM001 Goodfire Goodfire Understand the scientific foundations of neural networks so that we can intentionally design AI.
SM002 Goodfire Company | Goodfire We engage deeply and selectively, partnering with teams building high-stakes or frontier systems where understanding and control are essential.
SM003 Goodfire Silico | Goodfire A model design environment.
SM004 Goodfire Language | Goodfire 58% reduction in hallucinations by using features as rewards.
SM005 Goodfire Life Sciences | Goodfire Interpretability surfaced fragment length as the dominant predictive signal.
SM006 Goodfire Robotics & Vision | Goodfire Catch generalization failure before deployment.
SM007 Goodfire Our Series B | Goodfire We have built a model design environment ... to improve model behavior, and monitor them in production.
SM008 Goodfire Intentional Design | Goodfire Intentional design will be an advance in model creation similar to the difference between selective breeding and genetic engineering.
SM009 Goodfire Feature Steering for Reliable and Expressive AI Engineering Feature steering works well with fine-tuned models but also often makes fine-tuning unnecessary.
SM010 Goodfire Mayo Clinic Collaboration | Goodfire This collaboration operates under rigorous data privacy protocols and Mayo Clinic's established data governance frameworks.
SM011 Goodfire Manifold Steering | Goodfire Research Representation steering ... promises lightweight, adaptable, and granular control of neural networks.
SM012 Goodfire Interpreting Evo 2 | Goodfire Research We discovered a wide range of features corresponding to sophisticated biological concepts.
SM013 Goodfire Interpreting LM Parameters | Goodfire Research This is not just a theoretical issue. It prevents us from achieving practical engineering goals.
SM014 Goodfire Pilot Agreement | Goodfire Customer will be allowed to test the Software and receive Services, with the aim of evaluating Goodfire's technology and considering a future long-term commercial relationship.
SM015 Goodfire / Prima Mente Prima Mente Customer Story | Goodfire Goodfire's interpretability platform ... turned their foundation model into an engine for biomarker discovery.
SM016 Gartner Generative AI | Gartner GenAI enters the Trough of Disillusionment on the 2025 Hype Cycle for Artificial Intelligence.
SM017 PwC AI Jobs Barometer | PwC Workers with AI skills command a 56% wage premium.
SM018 NIST AI Risk Management Framework | NIST AI Risk Management Framework.
SM019 Arize Phoenix | Arize The open-source platform for agent development and evaluation.
SM020 Arize Pricing | Arize AX Pro ... $50 per month.
SM021 Fiddler AI Observability | Fiddler Gain Complete Visibility from Development to Production.
SM022 Fiddler Pricing | Fiddler $0.002 per trace.
SM023 Datadog LLM Observability | Datadog Test prompt, model, and tool changes against real production data before rollout.
SM024 LangChain LangSmith | LangChain LangSmith Observability gives you complete visibility into agent behavior.
SM025 Langfuse Langfuse Langfuse brings observability, prompts, evals, experiments, and human annotation into one connected workflow.
SM026 Langfuse Pricing | Langfuse Enterprise ... $2499/month.
SM027 Patronus AI Patronus AI Evaluate agent effectiveness in tip-of-the-tongue moments.
SM028 Arthur Arthur Gain visibility and reliability of your model through continuous evals.
SM029 Humanloop Pricing | Humanloop Get the enterprise platform to develop, evaluate, and ship trustworthy LLM powered apps.
SM030 MIT Technology Review This startup’s new mechanistic interpretability tool lets you debug LLMs In reality, they are adding precision to the alchemy.
SP001 Goodfire Silico | Goodfire The first platform for intentional model design.
SP002 Goodfire Language | Goodfire Predict how your model will fail before deployment, not after.
SP003 Goodfire Life Sciences | Goodfire Trace predictive signal through interpretable features to confirm whether predictions rely on real biological structure or dataset artifacts and spurious correlations.
SP004 Goodfire Robotics & Vision | Goodfire Evaluate whether your model has learned real physical structure directly from the latent space, before generating a single frame.
SP005 Goodfire Feature steering for reliable and expressive AI engineering AI engineers often ask us how feature steering differs from prompting or fine-tuning.
SP006 Goodfire Our Series B Today, we're excited to announce a $150 million Series B funding round at a $1.25 billion valuation.
SP007 MIT Technology Review This startup's new mechanistic interpretability tool lets you debug LLMs Goodfire is one of a small handful of companies, including industry leaders Anthropic, OpenAI, and Google DeepMind, pioneering a technique known as mechanistic interpretability.
SP008 On Healthcare Tech Goodfire AI and the billion-dollar black box The valuation jump from wherever it was at Series A to $1.25B at Series B is aggressive for a company with around 51 employees and what appears to be relatively early commercial traction.
SP009 Goodfire Research Probe-based data attribution Filtering out the data flagged by our probe reduces the harmful behavior by 63% without compromising general performance.
SP010 Goodfire Research Rakuten: SAE probes for PII detection We detail one of the first uses of sparse autoencoders (SAEs) with a production AI model - using SAE probes to detect personally identifiable information for Rakuten AI agents.
SP011 Goodfire Research Understanding and steering Llama 3 We're releasing preview.goodfire.ai, a desktop interface to help you understand and steer Llama 3's behavior.
SP012 Goodfire Research VPD explainer We tried this and were able to make a precise and predictable change to the model's behaviour by directly editing the subcomponents, with no training required.
SP013 Goodfire Research Self-correcting search We were able to improve generation by giving a diffusion model a feedback loop from its own internals, resulting in ~30% more viable candidate materials in a target range.
SP014 Goodfire Research Reasoning theater Chain-of-thought reasoning is not always faithful to the model's internal computations.
SP015 Arize Phoenix The open-source platform for agent development and evaluation.
SP016 Arize Pricing | Arize AX Pro ... $50 per month.
SP017 Fiddler AI AI Observability | Fiddler AI Gain unified visibility, context, and control across agents and predictive applications.
SP018 Fiddler AI Pricing | Fiddler AI $0.002 per trace.
SP019 Arthur Arthur The full lifecycle platform for ensuring reliable AI.
SP020 Datadog LLM Observability | Datadog Free includes up to 40K LLM spans per month. Pro starts at $160 per month and includes 100K LLM spans.
SP021 LangChain LangSmith LangSmith has a free tier for development and small-scale production. Paid plans scale with trace volume.
SP022 Langfuse Langfuse Open Source AI Engineering Platform.
SP023 Langfuse Pricing | Langfuse $29/month.
SP024 Humanloop Pricing | Humanloop Get the enterprise platform to develop, evaluate, and ship trustworthy LLM powered apps.
SP025 Humanloop Humanloop is joining Anthropic As we sunset the Humanloop platform, we will continue to work closely with our customers to make their transition as smooth as possible.
SP026 Weights Weights is joining OpenAI As part of this transition, our products and services have been wound down and are no longer available.
SP027 National Institute of Standards and Technology AI Risk Management Framework The NIST AI Risk Management Framework (AI RMF) is intended for voluntary use and to improve the ability to incorporate trustworthiness considerations into the design, development, use, and evaluation of AI products, services, and systems.
SP028 Gartner Generative AI The total cost of ownership (TCO) for GenAI initiatives can often exceed initial expectations due to hidden costs such as compliance reviews, model retraining and internal overheads.
SP029 Humanloop Humanloop: LLM evals platform for enterprises
SI001 Goodfire Goodfire homepage
SI002 Goodfire Understanding, Learning From, and Designing AI: Our Series B Today, we're excited to announce a $150 million Series B funding round at a $1.25 billion valuation.
SI003 Goodfire Silico
SI004 Goodfire Language
SI005 Goodfire Life Sciences
SI006 Goodfire Robotics & Vision
SI007 Goodfire Feature Steering for Reliable and Expressive AI Engineering Update (Feb 2026): Our SAE demo interface and API have been deprecated.
SI008 Goodfire Contact Our platform is used by Fortune 500 enterprises, major healthcare institutions, and AI research labs.
SI009 Goodfire Careers
SI010 Goodfire SOC 2 Type II compliant We're excited to announce that Goodfire is SOC 2 Type II compliant.
SI011 Goodfire Customer story: Prima Mente
SI012 Goodfire RLFR: Reinforcement Learning from Feature Rewards Overall, we reduce the hallucination rate by 58% across the held-out test set.
SI013 Goodfire Master Services Agreement
SI014 Goodfire Pilot Agreement
SI015 Goodfire Silico Terms of Use
SI016 PR Newswire AI Lab Goodfire Raises $150M at $1.25B Valuation to Design Models with Interpretability
SI017 Yahoo Finance AI Lab Goodfire Raises $150M at $1.25B Valuation to Design Models with Interpretability
SI018 The SaaS News Goodfire Raises $150 Million at $1.25 Billion Valuation
SI019 Pulse 2.0 Goodfire: $150 Million Series B At $1.25 Billion Valuation Raised For Interpretability AI Lab
SI020 Tech Funding News Goodfire raises $150M Series B at $1.25B valuation
SI021 PR Newswire Goodfire Raises $50M Series A to Advance AI Interpretability Research
SI022 Yahoo Finance Goodfire Raises $50M Series A to Advance AI Interpretability Research
SI023 Menlo Ventures Leading Goodfire's $50M Series A to Interpret How AI Models Think
SI024 VC News Daily Goodfire Venture Capital Funding
SI025 Salesforce Ventures Welcome, Goodfire
SI026 On Healthcare Goodfire AI and the Billion-Dollar Black Box The valuation jump ... is aggressive for a company with around 51 employees and what appears to be relatively early commercial traction.
SI027 SEC Goodfire AI, Inc. Form D filing dated 2025-06-02
SI028 SEC Goodfire AI, Inc. Form D filing dated 2026-02-09
SI029 Goodfire Customer Story: Radical AI We're excited to announce a new partnership between Radical AI and Goodfire to fundamentally dismantle the black box of AI-driven materials discovery and design.
SE001 Goodfire Silico
SE002 Goodfire Language
SE003 Goodfire Life Sciences
SE004 Goodfire Robotics & Vision
SE005 Goodfire Hallucinations Viewer
SE006 Goodfire Feature Steering for Reliable and Expressive AI Engineering
SE007 Goodfire Intentionally Designing the Future of AI
SE008 Goodfire Announcing our SOC 2 Type II Certification
SE009 Goodfire You and Your Research Agent
SE010 Goodfire Under the Hood of a Reasoning Model
SE011 Goodfire The World Inside Neural Networks
SE012 Goodfire Verbalized Eval Awareness Inflates Measured Safety
SE013 Goodfire Interpretability for Alzheimer's Detection
SE014 Goodfire Can SAEs Capture Neural Geometry?
SE015 Goodfire EVEE: Explaining Genetic Variants
SE016 Goodfire Model Diff Amplification
SE017 Goodfire Stochastic Parameter Decomposition
SE018 Goodfire Understanding Memorization via Loss Curvature
SE019 Goodfire Painting with Concepts
SE020 Goodfire The Shape of Stories Inside Neural Networks
SE021 Goodfire Phylogeny Manifold
SE022 Goodfire Silico Terms of Use
SE023 Goodfire Pilot Agreement
SE024 Goodfire Careers
SE025 Goodfire AP293 Guest Lectures 25
SE026 Goodfire Fellowship Fall 25
SE027 Goodfire Announcing our Mayo Clinic Collaboration
SE028 Goodfire Prima Mente Customer Story
SE029 Goodfire Radical AI Partnership Announcement
SE030 MIT Technology Review This startup's new mechanistic interpretability tool lets you debug LLMs
SE031 Salesforce Ventures Welcome, Goodfire
SE032 On Healthcare Tech Goodfire AI and the Billion-Dollar Black Box
SE033 NIST AI Risk Management Framework
SE034 Gartner Generative AI
SE035 Menlo Ventures Leading Goodfire's $50M Series A to Interpret How AI Models Think
SE036 Lightspeed Venture Partners Goodfire
SE037 PYMNTS Goodfire Raises $150 Million to Better Understand AI
SE038 PR Newswire AI Lab Goodfire Raises $150M at $1.25B Valuation to Design Models with Interpretability
SU001 Goodfire Contact / early-access page Our platform is used by Fortune 500 enterprises, major healthcare institutions, and AI research labs.
SU002 Goodfire Silico product page
SU003 Goodfire Life sciences page
SU004 Goodfire Language page
SU005 Goodfire Robotics / vision page
SU006 Goodfire Prima Mente customer story Goodfire’s research scientists embedded in Prima Mente’s team as they had finished training their model.
SU007 Goodfire Interpretability for Alzheimer's detection We detail how we studied Pleiades to identify fragmentomics as a novel class of biomarkers for Alzheimer’s detection.
SU008 Goodfire Mayo Clinic collaboration announcement This collaboration operates under rigorous data privacy protocols and Mayo Clinic's established data governance frameworks.
SU009 Goodfire EVEE: explaining genetic variants Our pathogenicity probe achieves state-of-the-art performance (0.997 overall AUROC on 839k ClinVar variants).
SU010 Goodfire Interpreting Evo 2 Preliminary experiments have shown promising directions for steering these features to guide DNA sequence generation, though this work is still in its early stages.
SU011 Goodfire Radical AI partnership announcement
SU012 Goodfire Using self-correcting search to accelerate materials discovery Applying self-correcting search improves targeting without harming SUN scores, leading to an overall ~27% increase in successful candidates.
SU013 Goodfire Rakuten SAE probes for PII detection As a result, Rakuten deployed the SAE probes - the first known enterprise application of SAEs for language model guardrails.
SU014 Goodfire Series B announcement / customer positioning We use this environment internally for research, and deploy it forward with our customers, collaborating in a shared environment.
SU015 Goodfire You and your research agent
SU016 Goodfire Blog index
SU017 Salesforce Ventures Goodfire company profile
SU018 Salesforce Ventures Welcome Goodfire Enterprise customers care more about the ROI they see from their AI investments than ever.
SU019 MIT Technology Review This startup's new mechanistic interpretability tool lets you debug LLMs In reality, they are adding precision to the alchemy.
SU020 OnHealthcare Goodfire AI and the billion-dollar bet on interpretability
SU021 Tech Funding News Goodfire raises $150M Series B at $1.25B valuation
SU022 PR Newswire AI lab Goodfire raises $150M at $1.25B valuation to design models with interpretability This funding ... will enable Goodfire to ... scale partnerships across AI agents and life sciences.
SU023 Yahoo Finance AI lab Goodfire raises $150M at $1.25B valuation to design models with interpretability
SU024 Lightspeed Venture Partners Goodfire company page
SU025 Menlo Ventures Leading Goodfire's $50M Series A to interpret how AI models think Patrick Hsu, co-founder of Arc Institute... said, “Their interpretability tools have enabled us to extract novel biological concepts that are accelerating our scientific discovery process.”
SU026 PYMNTS Goodfire raises $150 million to better understand AI We use this environment internally for research, and deploy it forward with our customers, collaborating in a shared environment.
SU027 Goodfire Research index
SU028 Goodfire Radical AI customer story
SU029 Goodfire Open problems in mechanistic interpretability
SU030 Goodfire Belief dynamics in in-context steering
SU031 Goodfire Mixing mechanisms
SU032 Goodfire Replicating circuit tracing for a simple mechanism
SU033 Goodfire Mapping latent spaces in Llama 3.3 70B
SU034 Goodfire A Geometric Calculator
SU035 Mayo Clinic About Mayo Clinic
SR001 Goodfire Master Services Agreement The Services are provided "as is" and Goodfire hereby disclaims all warranties.
SR002 Goodfire Pilot Agreement In no event will either Party's aggregate liability exceed the fees paid for the pilot.
SR003 Goodfire Silico Terms of Use Customer grants Goodfire a non-exclusive, worldwide, perpetual, irrevocable, royalty-free, sublicensable license to Workflow Data.
SR004 Goodfire Goodfire is SOC 2 Type II compliant We're excited to announce that Goodfire is SOC 2 Type II compliant.
SR005 Goodfire Intentional design The techniques are early, the science is incomplete, and the hardest problems remain unsolved.
SR006 Goodfire Company Our goal is to make AI that can be understood, debugged, and shaped like software.
SR007 Goodfire Careers If you thrive in fast-paced environments and believe that understanding AI systems is essential for our future, join us.
SR008 Goodfire Contact Our platform is used by Fortune 500 enterprises, major healthcare institutions, and AI research labs.
SR009 Goodfire Silico Precisely debug issues with model behavior, identify and remove confounders, and diagnose failures before they occur in production.
SR010 Goodfire Prima Mente customer story Goodfire's research scientists embedded in Prima Mente's team and built out a biomarker discovery pipeline.
SR011 Goodfire Mayo Clinic collaboration This collaboration operates under rigorous data privacy protocols and Mayo Clinic's established data governance frameworks.
SR012 Goodfire Radical AI partnership announcement More details about specific research directions and outcomes will be shared as the partnership progresses.
SR013 MIT Technology Review This startup’s new mechanistic interpretability tool lets you debug LLMs In reality, they are adding precision to the alchemy.
SR014 On Healthcare Goodfire AI and the billion-dollar interpretability bet The valuation jump is aggressive for a company with around 51 employees and what appears to be relatively early commercial traction.
SR015 PYMNTS Goodfire raises $150 million to better understand AI The company's Series B funding round values Goodfire at $1.25 billion.
SR016 NIST AI Risk Management Framework The profile will guide critical infrastructure operators towards specific risk management practices to consider when engaging AI-enabled capabilities.
SR017 Gartner Generative AI The success of these implementations often hinges on the quality of data and the effectiveness of governance frameworks in place.
SR018 PwC AI Jobs Barometer In the Healthcare sector, AI adoption is happening slower than in other industries and risk-controlled adoption of this technology matters.
SR019 Datadog LLM Observability / Agent Observability Validate changes before rollout, monitor production health continuously, and scale AI programs with stronger governance and fewer surprises.
SR020 LangChain LangSmith Observability LLM observability platforms provide visibility into agent decisions and help debug complex failures and hallucinations.
SR021 Tech Funding News Goodfire raises $150M Series B at $1.25B valuation This lack of visibility makes AI hard to control, difficult to fix, and risky to deploy at scale.
SR022 Goodfire Understanding, Learning From, and Designing AI: Our Series B To that end, we've built a model design environment.
SR023 Goodfire On optimism for interpretability Models are complex systems, and understanding them is a genuine research challenge.
SR024 Goodfire Verbalized eval awareness inflates measured safety Unless safety benchmarks account for eval awareness, they may systematically overestimate model alignment.
SR025 Goodfire Reasoning theater Models genuinely reason through hard problems, but coast through easy ones while generating superfluous chain-of-thought.
SR026 Goodfire Stochastic parameter decomposition SPD isn't a complete solution.
SR027 Goodfire Understanding memorization via loss curvature The method is not yet mature and can be heavy-handed in its edits.
SR028 Goodfire Can SAEs capture neural geometry? A single line can only give us a partial view of curved geometric structure.
SR029 Goodfire Manifold steering Linear steering cuts across the behavior manifold and produces noisy, off-target effects.
SR030 Goodfire Model diff amplification Even if an undesired behavior normally occurs only once in a million samples, amplification lets us surface it with far fewer rollouts.
SR031 Goodfire Phylogeny manifold Interpretability can improve reliability and transparency for downstream applications, especially in clinical domains.
SR032 Salesforce Ventures Welcome Goodfire Enterprise customers care more about the ROI they see from their AI investments than ever and cannot steer AI models to behave reliably and consistently.
SR033 Lightspeed Venture Partners Goodfire is building interpretable AI As governments increasingly push regulation mandating explainable AI systems, enterprises will need to provide clear rationales for model behavior.
SR034 Investing.com Goodfire raises $150 million to improve AI model understanding The company works with clients including Microsoft Corp., the Mayo Clinic, and the nonprofit Arc Institute.
SR035 IBM Think Topics: Model Observability
SR036 Datadog Agent Observability | LLM Observability | Datadog
SR037 Langfuse Langfuse
SR038 Langfuse Pricing - Langfuse
SR039 LangChain LangSmith: AI Agent & LLM Observability Platform
SR040 Weights Weights is joining OpenAI
SR041 Goodfire Priors in Time
SR042 Goodfire A Geometric Calculator
SR043 Goodfire Covariance Pooling
SR044 Goodfire The Neural Geometry Series
SV001 Goodfire Understanding, Learning From, and Designing AI: Our Series B Today, we're excited to announce a $150 million Series B funding round at a $1.25 billion valuation.
SV002 PR Newswire AI Lab Goodfire Raises $150M at $1.25B Valuation To Design Models With Interpretability Goodfire... announced a $150 million Series B funding round at a $1.25 billion valuation.
SV003 Yahoo Finance AI lab Goodfire raises $150M at $1.25B valuation to design models with interpretability
SV004 Pulse 2.0 Goodfire: $150 Million Series B At $1.25 Billion Valuation Raised For Interpretability AI Lab
SV005 Tech Funding News Goodfire bags $150M at $1.25B to build AI interpretability infrastructure
SV006 PR Newswire Goodfire Raises $50M Series A to Advance AI Interpretability Research Today, Goodfire... announced a $50 million Series A funding round led by Menlo Ventures... to support... Ember.
SV007 Menlo Ventures Leading Goodfire's $50M Series A to Interpret How AI Models Think
SV008 Lightspeed Venture Partners Goodfire: Building Interpretable AI
SV009 Salesforce Ventures Welcome Goodfire
SV010 On Healthcare Goodfire AI and the Billion Dollar Black Box The valuation jump... is aggressive for a company with around 51 employees and what appears to be relatively early commercial traction.
SV011 TechCrunch The AI gold rush is pulling private wealth into riskier, earlier bets
SV012 TechCrunch Here are the 17 U.S.-based AI companies that have raised $100M or more in 2026
SV013 TechCrunch Cursor's Anysphere nabs $9.9B valuation, soars past $500M ARR
SV014 TechCrunch Harvey reportedly raising at $11B valuation just months after it hit $8B
SV015 TechCrunch Enterprise AI startup Glean lands a $7.2B valuation
SV016 TechCrunch Google to invest up to $40B in Anthropic in cash and compute
SV017 Gartner Generative AI
SV018 NIST AI Risk Management Framework
SV019 Arize AI Phoenix
SV020 Fiddler AI AI Observability and Security
SV021 LangChain LangSmith Observability
SV022 Goodfire Research The Shape of Stories Inside Neural Networks
SV023 Goodfire Research Understanding and Steering Llama 3
SV024 Goodfire Research A Geometric Calculator
SV025 Goodfire Research Covariance Pooling
SV026 Goodfire Research Painting With Concepts
SV027 Goodfire Research VPD Explainer
SV028 U.S. Securities and Exchange Commission Form D for Goodfire AI, Inc. (Series A-era filing) Goodfire AI, Inc.... DELAWARE... 2023
SV029 U.S. Securities and Exchange Commission Form D for Goodfire AI, Inc. (Series B-era filing) Yan-David Erlich
SV030 Goodfire Company
SV031 Goodfire SOC 2 Type II
SV032 Goodfire The Neural Geometry Series
SV033 Goodfire SAE Scaling with Feature Manifolds
SV034 Goodfire SAE Scaling with Feature Manifolds