Startup Due Diligence
Due diligence report AI / Infrastructure growth-stage private 2026-06-11

Deepgram

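A real-time voice AI infrastructure leader with strong technical proof and adoption, but one that still requires heavy diligence at its current unicorn valuation.

Deepgram looks like a credible category leader in real-time voice AI, but until private financial denominators are disclosed, the current $1.3B valuation is better suited to watching than to aggressive underwriting.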

Cover facts

Founded 01
2015 [CO001]
Valuation 02
$1.3B [CO014]
Total funding 03
$215M+ [CO020]
Enterprise customers 04
400+ [CU001]
Developers 05
200K+ [CU002]
Cash flow status 06
Positive in 2024 [CO025]

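Company overview

Deepgram is a San Francisco-headquartered voice AI infrastructure company founded in 2015 by Scott Stephenson, Noah Shutty, and Adam Sypniewski. Rather than wrapping third-party open-source models, the company built its speech stack on in-house end-to-end deep learning; it now sells an API-layer platform spanning speech-to-text, text-to-speech, audio intelligence, and real-time voice agent orchestration. Public evidence shows commercial traction at scale: 400+ enterprise customers, 200,000+ developers, more than 1,300 organizations building on the Deepgram API, and strategic channels opened through AWS, IBM, and Twilio. Deepgram's January 2026 Series C valued the company at $1.3 billion and funds further product expansion, the acquisition of restaurant voice AI company OfOne, and channel building. The main underwriting gap is not whether the product is real, but whether undisclosed ARR, gross margin, and retention can support the current valuation.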

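Website
deepgram.com
Founded
2015-01-01
Founders
Scott Stephenson, Noah Shutty, Adam Sypniewski
Founding location
San Francisco, California, United States
Headquarters
San Francisco, California, United States
Product
Deepgram sells an API-first speech stack spanning Nova-3 speech-to-text, Flux conversational speech recognition, Aura-2 text-to-speech, audio intelligence features, and a unified Voice Agent API with real-time orchestration and flexible deployment.
Customers
Enterprise buyers are concentrated in contact centers, healthcare, media, restaurants, and conversational AI; the base also includes ISVs, channel partners, and a large self-serve developer population.
Business model
Usage-based API pricing with free credit, a PAYG tier, an annual Growth plan, enterprise contracts, and a vertical software layer added through Deepgram for Restaurants after the OfOne acquisition.
Stage
growth-stage private
Funding
$130M Series C at a $1.3B valuation in January 2026; disclosed total funding exceeds $215M, and management says the company entered 2025 cash-flow positive.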
[CO001, CO014, CO020, CE001, CE002, CU001, CU002]

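Executive summary

Key strengths

  • Proprietary full-stack voice AI platform with credible evidence of differentiation in latency, deployment, and patent support.
  • Real commercial adoption across enterprise, developer, and channel ecosystems, including AWS, IBM, and Twilio-related distribution.
  • A cash-flow-positive signal ahead of the Series C, which lowers solvency risk relative to many AI infrastructure peers.

Key risks

  • ARR, gross margin, retention, concentration, and preferred stack details are undisclosed, which limits valuation underwriting.
  • Hyperscaler and open-source competition could compress pricing and erode differentiation over time.
  • Privacy, biometric, and healthcare compliance exposure raises the diligence burden for regulated customer segments.

Open questions

  • Verified ARR, gross margin, NRR, and customer concentration remain the main gating items for underwriting the Series C price.
  • OfOne integration economics, and the revenue mix across API, enterprise contracts, and restaurant software, are not public.
  • Cap table terms, liquidation preferences, and any secondary pricing are not visible in public evidence.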

Chapter 01

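01 Company Overview

1.1 Identity, founding, and origin story

Deepgram, Inc. was incorporated in 2015 by Scott Stephenson, Noah Shutty, and Adam Sypniewski. The three physicists had been working on underground dark matter detection experiments when they found that the waveform analysis techniques used on radioactive decay signals also applied to speech audio. While working at a research facility roughly two miles underground in China, the co-founders built their own detectors, trained neural networks on analog waveforms using GPUs and FPGAs, and kept audio recordings of the research they wanted to search and analyze. No speech recognition API on the market was good enough for that need, so they built their own end-to-end deep learning approach and pivoted to commercializing it as Deepgram.

Deepgram joined Y Combinator's Winter 2016 batch, where its early developer community and enterprise introductions began. The company is headquartered in San Francisco, California, operates remote-first with employees across 20+ US states and 5+ countries, and positions itself as a foundational AI company whose core mission is enabling human-machine interaction through voice. Its one-line business model: proprietary real-time voice AI models (speech-to-text, text-to-speech, and voice agents) delivered API-first with usage-based pricing and available in cloud, self-hosted, and on-premises deployments. [CO001, CO002, CO003, CO004, CO005, CO006]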

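Deepgram snapshot KPIs (June 2026)
Metric | Value / status | Date | Confidence | Gaps / notes
Founded | 2015 | 2015 | |
Headquarters | San Francisco, CA (remote-first) | 2026-06 | |
Valuation (latest round) | $1.3 billion | 2026-01-13 | | Series C post-money
Total funding | $215M+ | 2026-01-13 | |
Series C raise | $130M | 2026-01-13 | |
Platform developers | 200,000+ | 2026-01 | | Company claim, unaudited
Enterprise customers | 400+ | 2025-01 | | February 2025 release cites 450+
Audio processed | 50,000+ years | 2025-01 | | Company claim
Words transcribed | 1 trillion+ | 2025-01 | | Company claim
Revenue / ARR | Not disclosed | 2026-06 | | CEO states cash-flow positive in 2024
Headcount | Not publicly disclosed | 2026-06 | | Remote-first; 20+ states, 5+ countries
Stage | Series C / unicorn | 2026-01 | |
Cash-flow positive | Yes (2024) | 2025-01 | | CEO statement; unaudited

Revenue, ARR, and headcount are not publicly disclosed. Developer and customer counts are company claims; the enterprise customer count comes from the January 2025 press release (400+) and the February 2025 Nova-3 launch (450+).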

[CO013, CO021, CO022, CO023, CO025]

1.2 创始人、领导层与治理

CEO 兼联合创始人 Scott Stephenson 拥有 University of Michigan 粒子物理博士学位,曾在那里从事暗物质探测器博士后研究,之后离开并共同创立 Deepgram。他是公司最主要的公开发声者和战略决策者。联合创始人 Noah Shutty 和联合创始人 Adam Sypniewski 都参与了 Deepgram 早期深度学习架构建设;Sypniewski 现任 CTO。创始团队共同的物理学背景,是 Deepgram 品牌叙事和技术差异化的核心——从第一性原理做端到端深度学习,而不是走规则驱动或混合式路线。 董事会包括领投方 AVP(普通合伙人 Elizabeth de Saint-Aignan)以及 Madrona、In-Q-Tel 等主要回投投资者的代表。In-Q-Tel 自更早轮次以来持续参与,说明政府 / 情报体系对 Deepgram 的转写准确率和本地部署能力感兴趣。公司对 Scott Stephenson 的关键人依赖是真实风险:所有公开公告、新闻稿和重大伙伴沟通中,他都是唯一被点名的高管;截至 2026 年 6 月,公司未公开披露任何 COO、CFO 或 President。 [CO007, CO008, CO009, CO010, CO011, CO012]

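Leadership and founders
Person | Role | Background | Founder-market fit | Key-person risk
Scott Stephenson | CEO and co-founder | PhD in particle physics, University of Michigan; built dark matter detectors | Waveform analysis → voice AI; domain authority in training audio deep learning models from scratch | High: the only named executive in all public communications
Adam Sypniewski | CTO and co-founder | Physicist; co-built the waveform analysis neural networks with Stephenson | First-principles deep learning for audio; owner of model architecture | Medium: shared dependency in technical leadership
Noah Shutty | Co-founder | Physicist; early research and architecture contributor | Translating neural audio models from research to product | Medium: important to founding-team cohesion
Elizabeth de Saint-Aignan | AVP GP (lead investor / board) | Investor; identified enterprise voice AI as a category thesis | n/a (investor) | None
Will Edwards | GM, Deepgram for Restaurants (former OfOne CEO) | Built OfOne's QSR voice AI; YC-backed founder | Restaurant / QSR vertical expansion | Low: owner of a single vertical

As of June 2026, no COO, CFO, or President has been publicly named. Board composition beyond AVP, Madrona, and In-Q-Tel is not publicly disclosed. Key-person risk is most pronounced around Scott Stephenson.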

[CO007, CO008, CO009, CO010]

1.3 融资历史、估值与投资者基础

Deepgram 多轮融资累计超过 $215 million。公司参加了 Y Combinator(W2016),随后完成种子轮,并于 2022 年完成 $72 million Series B,估值未披露。2026 年 1 月 13 日,Deepgram 宣布以 $1.3 billion 估值完成 $130 million Series C,进入独角兽行列;本轮由 AVP 领投,AVP 是一家独立全球投资平台,聚焦欧洲和北美高增长科技公司。 Series C 的投资者基础明显广且具战略性。所有主要既有投资者都继续跟投,包括 Alkeon、In-Q-Tel、Madrona、Tiger、Wing、Y Combinator,以及 BlackRock 管理的基金和账户。新增财务投资者包括 Alumni Ventures 和 Princeville Capital。战略公司投资者包括 Twilio、ServiceNow Ventures、SAP 和 Citi Ventures——它们都能带来 go-to-market 和分销杠杆。学术投资者包括 University of Michigan 和 Columbia University,并与更早进入的 Stanford University 形成学术投资人阵容。CEO Scott Stephenson 表示,公司 2024 年现金流为正,被接触时并未主动寻求融资,但选择募资以加速国际扩张和产品投入。Series C 也为收购 OfOne 和在旧金山开设新的 Voice AI Collaboration Hub 提供资金。 [CO013, CO014, CO015, CO016, CO017, CO018]

利益相关方或投资人地图
利益相关方角色 / 轮次战略重要性尽调要求
AVP领投,Series C($130M,2026 年 1 月)领投方;国际扩张任务;预计拥有董事席位确认董事会权利、按比例跟投权、清算优先权
Alkeon Capital现有投资人;再次参与 Series C成长期财务投资人;释放对估值的信心信号基金规模与流动性期限
BlackRock(基金 / 账户)现有投资人;再次参与 Series C机构可信度;大规模 AUM 暗示耐心资本股份类别与控制条款
In-Q-Tel现有投资人;再次参与 Series C美国情报 / 政府圈战略投资人是否存在合同限制或 ITAR / 安全义务
Madrona Venture Group现有投资人;再次参与 Series C;董事席位Pacific Northwest VC;深科技能力;播客合作伙伴确认董事席位与按比例跟投权
Tiger Global现有投资人;再次参与 Series C成长期财务支持者确认股份类别与投票权
Wing VC现有投资人;再次参与 Series C聚焦企业 AI 的 VC
Y CombinatorW2016 批次 + 再次参与 Series C原始加速器;开发者社区管线
Twilio战略方,Series C主要客户和上市合作伙伴;可能拥有董事会观察员排他性或优先定价条款
ServiceNow Ventures战略方,Series C企业工作流平台;潜在深度集成集成路线图与商业条款
SAP战略方,Series C企业 ERP / CRM;进入大型企业账户的分销通道OEM 或转售协议状态
Citi Ventures战略方,Series C金融服务垂直;BFSI 市场入口合规与数据处理承诺
Stanford、U of Michigan、Columbia 等高校学术投资人;现有 + 新进 Series C人才管线、研究合作、信号可信度IP 转让与发表权

控制条款、清算优先权和董事席位分配未公开披露。战略投资人的商业条款(OEM、集成协议)未知。

[CO014, CO015, CO016, CO017, CO018]

1.4 业务规模与里程碑时间线

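As of early 2026, Deepgram's public scale metrics include 200,000+ developers building on its API, 400+ enterprise customers (per the January 2025 announcement), and more than 50,000 years of audio and more than 1 trillion words processed to date. The company reports 3.3× usage growth over the past four years (disclosed January 2025). Revenue and ARR have not been publicly disclosed, but Stephenson has confirmed that the company was cash-flow positive in 2024, which suggests a healthy cost structure relative to revenue at that time.

Key milestones: founding in 2015; the YC batch (W2016); the pivot from dark matter to speech; Series B (2022); Nova-3 launch (February 2025); Voice Agent API GA (June 2025); AWS Strategic Collaboration Agreement (August 2025); Series C and the OfOne acquisition (January 2026); and the IBM watsonx Orchestrate partnership (February 2026). The company has also set a 2026 goal of "passing the Audio Turing Test at scale," signaling continued investment in frontier naturalness and accuracy. Material adverse factors include the product outage history visible at status.deepgram.com and low-price competitive pressure from hyperscaler STT products. [CO021, CO022, CO023, CO024, CO025, CO026]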

里程碑表
日期事件类型金额 / 估值 / 状态参与方含义
2015Deepgram 由 Stephenson、Shutty、Sypniewski 创立创立3 位联合创始人从物理学转向语音;从第一天起采用端到端深度学习
2016-W1Y Combinator Winter 2016 批次融资YC 标准条款Y Combinator开发者社区入口;早期资本;可信度
2016–2018从波形研究转向语音 API;早期 STT 产品发布产品Deepgram 团队首批付费客户;建立 API 优先的上市路径
2022Series B:融资 $72M(包括 $47M 交割)融资$72M;估值未披露Alkeon、Tiger、Wing、Madrona、In-Q-Tel、YC、BlackRock、Stanford 等投资方为模型开发和企业销售提供大量资本
2024-12实现现金流为正规模化现金流为正内部在 Series C 之前证明单位经济性;强化融资叙事
2025-01200,000+ 名开发者、400+ 家企业客户、使用量增长 3.3×规模化公司公告牵引力里程碑;开发者生态规模
2025-02Nova-3 STT 模型发布产品Deepgram最高准确率实时 STT 主张;450+ 家企业客户
2025-06Voice Agent API GA 发布,价格 $4.50/hr产品$4.50/hr 定价Deepgram从基础设施转向平台;新增 ARR 来源
2025-08签署 AWS Strategic Collaboration Agreement合作多年期AWS、Deepgram加深云分销;联合销售和 AWS Marketplace
2026-01-13Series C:以 $1.3B 估值融资 $130M;收购 OfOne融资$130M / $1.3BAVP(领投)、Alkeon、BlackRock、In-Q-Tel、Madrona、Tiger、Wing、YC、Twilio、SAP、ServiceNow Ventures、Citi Ventures、Alumni Ventures、Princeville Capital、Columbia、U of Michigan独角兽里程碑;通过 OfOne 进入餐厅垂直
2026-02-24IBM watsonx Orchestrate 合作;Deepgram 被列为 IBM 首个语音合作伙伴合作IBM、Deepgram企业渠道扩张;进入 IBM 全球客户基础
2026(目标)大规模 Audio Turing Test 承诺产品Deepgram长期自然度研发信号;品牌差异化

日期和金额来自公司新闻稿及一线新闻报道。Series B 估值未公开披露。OfOne 收购价格未披露。

[CO013, CO014, CO019, CO021, CO022, CO023]
FO001: Deepgram 公司里程碑时间线(2015–2026)

2015 年至 2026 年 6 月的关键创立、融资、产品和合作伙伴里程碑。

[CO001, CO002, CO013, CO014, CO019, CO021]
FO002: Deepgram 快照逻辑(身份 → 产品 → 资本 → 客户)

Deepgram 的物理学创立洞察如何连接到产品、资本和客户生态。

[CO003, CO004, CO016, CO021, CO022]
FO003: Deepgram KPI 快照

截至 2026 年 6 月的关键绩效指标。

开发者数、企业客户数和已处理音频数据来自公司披露,尚未独立审计。

[CO013, CO014, CO021, CO022, CO023, CO025]

1.5 展示要点

Chapter 02

02市场分析

2.1 市场边界与细分

Deepgram 服务的市场是语音 AI 基础设施的 B2B API 市场,具体包括以开发者 API 和企业 SDK 交付的实时语音转文本(STT)、文本转语音(TTS)和语音智能体编排。这个市场位于更广义的语音与语音识别软件市场之内,后者还包括设备内置的消费者助手(Siri、Alexa、Google Assistant)、专有企业电话系统(Cisco、Genesys)和开源自托管模型(Whisper、NVIDIA Canary)。Deepgram 的 API 业务不覆盖消费者助手层和端侧硬件,也不覆盖传统本地电话平台。 市场可按买家类型(企业 vs. 开发者 / SMB)、部署模式(云 API vs. 自托管)、使用场景(实时转写、联络中心、语音智能体、会议、无障碍)和地域(北美、APAC、EMEA)切分。2025 年北美是最大区域,约占更广义市场的 34–35%。APAC 是增长最快的细分。Deepgram 的核心买家,要么是正在构建语音应用的公司里的开发者或技术负责人(开发者层),要么是为联络中心、医疗或合规流程采购语音 AI 基础设施的企业技术高管(企业层)。 [CM001, CM002, CM003, CM004, CM005]

市场定义表
细分 / 品类纳入 Deepgram TAM原因
实时 STT API(云)核心产品;主要收入驱动因素
TTS API(云)Aura-2 模型;增长中的产品线
Voice Agent API / STS(云)最新产品;ACV 潜力最高
自托管 / 本地部署 STTDeepgram 支持本地部署
消费者助手中的语音转文本(Siri、Alexa)嵌入设备;不是 API 可触达市场
传统电话平台(Cisco、Genesys)自研封闭;不是开发者 API 市场
开源 Whisper 自托管部分替代方案;只能通过微调或延迟关键型升级部分触达
会议转录 SaaS(Otter、Fireflies)部分下游买方;Deepgram 是基础设施层;只在 API 渠道竞争
联络中心 SaaS(Nice、Verint)部分STT 上游买方;Deepgram 作为基础设施销售给它们
音频智能 / 分析附加产品情绪、主题、摘要产品

TAM 边界由 Deepgram 当前 API 可触达性定义。消费者和自研封闭细分被排除在 SAM/SOM 计算之外。来源:公司定位、FutureAGI 基准指南、TBRC 市场报告。

[CM001, CM002, CM003]

2.2 市场规模与增长驱动

三个独立测算口径都指向一个规模大、增长快的市场。The Business Research Company 估计,全球 speech-to-text API 市场 2025 年规模为 $4.55 billion,并以 18.2% CAGR 增至 2030 年 $10.46 billion。Coherent Market Insights 估计,更广义的语音与语音识别市场(包括设备内置消费者助手)2026 年为 $26.5 billion,并以 23.6% CAGR 增至 2033 年 $116.9 billion。Deepgram CEO 自称,面向要求极高准确率和最低延迟的苛刻环境,语音 AI 智能体拥有 $50 billion 可服务市场——这正是 Deepgram 声称瞄准的利基。 主要增长驱动包括:(1)企业联络中心迁移到云端和 AI 自动化,降低单通电话成本;(2)agentic AI 浪潮要求 AI 电话智能体具备实时、低延迟语音处理;(3)越来越多开发者平台嵌入 voice-first UX;(4)医疗和金融服务合规场景需要准确转写;(5)多语言企业扩张带来 45+ 种语言覆盖需求。增长约束包括:hyperscaler 将 STT 作为捆绑功能,以零或近零边际成本商品化;非延迟敏感工作负载的开发者转向开源 Whisper;数据主权监管限制跨境处理。 [CM006, CM007, CM008, CM009, CM010, CM011]

TAM/SAM/SOM 规模测算视角表
视角估计年份CAGR来源置信度
STT API 全球市场(TAM)$4.55B202518.2%(至 2030)The Business Research Company
STT API 全球市场(2030 预测)$10.46B203018.2%The Business Research Company
全球语音与语音识别市场(TAM)$26.5B202623.6%(至 2033)Coherent Market Insights(市场研究来源)
语音与语音识别(2033 预测)$116.9B203323.6%Coherent Market Insights(市场研究来源)
苛刻环境中的语音 AI 智能体(Deepgram SAM)$50B2024(估计)n/aCEO Scott Stephenson(公司声称)
更广泛市场中的北美份额~34–35%2026n/aCoherent Market Insights(市场研究来源)
APAC 份额与增长~25%;增长最快2026n/aCoherent Market Insights(市场研究来源)

所有估计均来自第三方分析师报告或公司管理层声称;没有一项经过审计。管理层提出的 $50B SAM 未经验证,可能代表高端细分市场的目标愿景。不同分析师的市场规模估计差异很大,原因在于定义不同(仅 STT vs. 完整语音栈)。

[CM006, CM007, CM008]
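The two third-party forecasts in the sizing table can be sanity-checked by compounding each base-year estimate at its stated CAGR. The sketch below is illustrative only: the function name `project` is hypothetical, and the inputs are the analyst figures quoted in the table, not audited data.

```python
def project(base_billion: float, cagr: float, years: int) -> float:
    """Compound a base-year market size (in $B) forward at a constant annual rate."""
    return base_billion * (1.0 + cagr) ** years


# STT API market: $4.55B in 2025 at 18.2% CAGR to 2030 (The Business Research Company)
stt_2030 = project(4.55, 0.182, 2030 - 2025)

# Voice and speech recognition market: $26.5B in 2026 at 23.6% CAGR to 2033
# (Coherent Market Insights)
voice_2033 = project(26.5, 0.236, 2033 - 2026)

print(f"STT API 2030: ${stt_2030:.2f}B (table states $10.46B)")
print(f"Voice/speech 2033: ${voice_2033:.1f}B (table states $116.9B)")
```

Both projections land within about half a percent of the published end-year figures, so each forecast is internally consistent with its own CAGR. That says nothing about whether the base-year estimates themselves are right.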
FM001: 市场规模估算区间(STT API 与语音技术栈)

2025–2033 年全球 STT API 和完整语音 AI 技术栈市场估算区间。

所有估计来自第三方分析师报告或公司管理层。区间很宽,反映分析师定义差异。管理层给出的 SAM 估计($50B)尚未独立验证。

[CM001, CM002, CM006]
FM002: 语音 AI 细分市场 CAGR 对比

STT API(18.2%)、完整语音技术栈(23.6%)和整体云软件(约 15%)细分市场的 CAGR 对比。

Deepgram's CAGR of about 35% is derived from 3.3× growth over 4 years (3.3^(1/4) − 1 ≈ 35%). The APAC CAGR and the cloud software benchmark are analyst-report estimates; unaudited.

[CM001, CM002, CM008, CM028]
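The annual rate implied by a cumulative growth multiple depends on how many compounding periods are assumed. The short sketch below (function name hypothetical) computes the rate for the 3.3× usage multiple under four and under three periods; which count matches the company's disclosure is an assumption the reader should confirm.

```python
def implied_cagr(multiple: float, years: float) -> float:
    """Annual growth rate that compounds to `multiple` over `years` periods."""
    return multiple ** (1.0 / years) - 1.0


four_period = implied_cagr(3.3, 4)   # 3.3x spread over four annual steps
three_period = implied_cagr(3.3, 3)  # 3.3x spread over three annual steps

print(f"4 periods: {four_period:.1%}")
print(f"3 periods: {three_period:.1%}")
```

Four compounding periods give roughly 35% a year; a figure near 49% only arises if the same multiple is spread over three periods.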

2.3 买家、用户与付款方分层

Deepgram 的买家版图分为三层。第一层是开发者 / 初创公司层(200,000+ 名开发者使用免费计划或 pay-as-you-go):这些用户通常是小团队的技术决策者,会通过文档、沙盒和每分钟价格基准评估 API。预算归属工程团队或个人创始人。第二层是企业层(400–450 个组织):买家通常是中端市场到 Fortune 500 公司的 VP of Engineering、CTO 或 IT 采购负责人。采购通过年度企业合同完成,并谈判量价。垂直行业包括联络中心、医疗、金融服务、餐饮连锁(收购 OfOne 后)以及政府 / 情报(由 In-Q-Tel 信号体现)。第三层是平台 / ISV 层:Vapi、Kore.ai 和 Granola 等公司把 Deepgram 嵌入为基础设施组件,再作为自身产品的一部分转售。这一层用量高、价格敏感度较强,并贡献了不成比例的 API 调用量。 企业买家的采用路径遵循开发者驱动的 PLG:开发者先在免费计划中评估 API,做出原型,推动 IT 采购,最后转为企业合同。这种自下而上的扩张,在结构上类似 Twilio、Stripe 和其他开发者基础设施公司。付款方分层也与规模一致:开发者刷信用卡,企业按年度发票付款,ISV 谈判量价折扣。 [CM013, CM014, CM015, CM016, CM017]

细分与买方地图
细分买方类型预算负责人采用路径Deepgram 产品匹配敏感性
开发者 / 初创公司初创公司的个人开发者 / CTO工程或创始人免费 → PAYG → Growth 计划Nova-3 STT、Aura-2 TTS(免费层、PAYG)价格 + 文档质量 + 延迟
企业联络中心运营 VP / IT VP / 采购IT 预算RFP 或 PLG 内部推动者 → 企业合同Nova-3 STT、Voice Agent API、Flux 等产品准确率 + SLA + 合规
医疗 / 临床CMIO / IT VP / CTO临床运营或 IT 预算试点 → HIPAA BAA → 企业合同Nova-3,支持垂直领域定制;本地部署选项HIPAA、准确率、延迟
餐饮 / 快餐连锁(OfOne 之后)运营 VP / 加盟店主运营预算OfOne 品牌方案Deepgram for Restaurants(Flux + Nova-3,餐饮场景)准确率 + 自动承接率
政府 / 情报(In-Q-Tel)IT 或安全负责人机构预算涉密或直接合同本地部署 / 自托管部署数据主权 + 准确率
ISV / 平台(Vapi、Kore.ai)CTO / 产品负责人产品工程预算API 集成 + 收入分成或量价折扣全部 API 作为基础设施层价格 + 可靠性 + SLA

买方画像来自客户公告、定价层级和 In-Q-Tel 投资推断。 医疗和政府细分市场的细节,有一部分基于本地部署能力和投资方背景推断。

[CM013, CM014, CM015, CM016]
采用漏斗或价值链图
阶段买方动作Deepgram 触点转化驱动因素估计人群规模
认知开发者发现需要 STT/TTS API文档、GitHub、DG 博客、DG 播客SEO、开发者社区、YC 网络全球数百万
注册创建免费账户;获得 $200 额度免费套餐;API Playground零摩擦上手200,000+ 名开发者
评估测试准确率、延迟、定价,并与 Whisper/AssemblyAI 对比基准测试、SDK 文档、Discord 社区语音代理延迟最优;低于 300ms~50,000 名活跃评估者(估计)
原型将 API 接入应用;跑出第一批生产调用PAYG 计费;SDK 支持成本低;集成简单~20,000 名活跃构建者(估计)
增长套餐为更高并发承诺 $4K+/年套餐Growth 定价层级规模 + 可用性 SLA~5,000(估计)
企业合同年度议价合同;SLA、BAA、本地部署企业销售 + 解决方案工程合规、可靠性、定制化2025 年初为 400–450+
扩张 / 增购增加 TTS、Voice Agent API、Flux产品驱动扩张;客户成功团队更高 ACV;全栈锁定企业客户基数的一部分

免费注册之后各阶段的人群估计,来自 Deepgram 披露的 200,000+ 开发者数量和典型开发者 API 转化漏斗基准。 这些数字并非 Deepgram 披露。

[CM010, CM013, CM014, CM036]
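The stage-to-stage conversion rates implied by the funnel table can be derived directly from its population column. This is a minimal sketch: every stage count after sign-up is the table's own estimate rather than a Deepgram disclosure, and the enterprise figure uses the lower bound (400) of the 400–450+ range.

```python
# (stage, estimated population) taken from the adoption funnel table
funnel = [
    ("Sign-up", 200_000),
    ("Evaluation", 50_000),
    ("Prototype", 20_000),
    ("Growth plan", 5_000),
    ("Enterprise contract", 400),  # lower bound of the 400-450+ range
]


def stage_conversions(stages):
    """Return (from_stage, to_stage, rate) for each adjacent pair of stages."""
    return [
        (a_name, b_name, b_count / a_count)
        for (a_name, a_count), (b_name, b_count) in zip(stages, stages[1:])
    ]


conversions = stage_conversions(funnel)
overall = funnel[-1][1] / funnel[0][1]

for src, dst, rate in conversions:
    print(f"{src} -> {dst}: {rate:.1%}")
print(f"Sign-up -> Enterprise contract: {overall:.2%}")
```

On these estimates, about 0.2% of registered developers end up behind an enterprise contract, which is why the developer count and the enterprise count should be read as separate measures.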
FM003: 买方细分旅程图

从免费层到年度企业合同的开发者到企业 PLG 采用旅程。

[CM010, CM013, CM014, CM015]

2.4 增长驱动、约束与护城河动态

Deepgram 的可服务市场增长快于整体云软件市场,但三项结构性约束限制了捕获率。第一,hyperscaler 补贴定价:AWS Transcribe、Google Cloud Speech-to-Text 和 Azure Speech 都原生嵌入各自云生态,价格低到 Deepgram 无法在规模化后长期低于它们。使用 AWS-native 技术栈的客户可能即便牺牲一些准确率,也会优先选择 Transcribe,以简化账单、合规和供应商管理。第二,开源替代:Whisper 和 NVIDIA Canary Qwen 2.5B 对批量、非实时场景提供足够准确率(5.63% WER),且 API 成本为零。Deepgram 在这一层的护城河只有延迟和微调速度;这对实时语音智能体非常重要,但对会议转写并不重要。第三,多语言缺口:非英语市场需要实时转写时,ElevenLabs Scribe v2 目前在基准上领先;Deepgram 国际扩张时,这是一项结构性风险。 增长顺风包括:(1)Voice Agent API 比原始 STT 价值更高、粘性更强;(2)收购 OfOne 打开了高 containment rate 的 QSR 垂直;(3)IBM 和 AWS 作为分销渠道,触达原本不会自行采购 Deepgram 的受监管企业买家;(4)agentic AI 浪潮推动企业用 AI 坐席替代人工坐席,带来指数级通话量。 [CM018, CM019, CM020, CM021, CM022, CM023]

增长驱动因素与约束表
因素类型对 Deepgram 的影响时间维度
智能体 AI / AI 电话代理爆发驱动因素高:通话量呈指数级增长;Voice Agent API 直接处在链路中2024–2027
企业联络中心云迁移驱动因素高:替代传统 IVR 和人工转录;Deepgram STT 是核心基础设施2023–2028
多语言企业扩张(45+ 种语言)驱动因素中:打开 APAC 和 EMEA 市场;需要持续投入模型2025–2030
IBM / AWS 分销合作驱动因素高:企业渠道触达此前难以覆盖的受监管买家2026+
借 OfOne 切入餐饮 / 快餐连锁驱动因素中:新垂直场景;运营商基数大;已验证 >95% 自动承接率2026–2028
超大规模云厂商商品化(AWS Transcribe、Google、Azure)制约因素高:近零边际成本捆绑进云技术栈;嵌入后的忠诚度很黏持续
开源 Whisper / NVIDIA Canary 替代制约因素中:批量、非实时工作负载可用免费 GPU 算力替代持续
数据主权 / GDPR / BIPA 监管制约因素中:限制跨境数据处理;推高合规成本持续
ElevenLabs、AssemblyAI 带来的定价压力制约因素低-中:如果有风投支持的竞争对手补贴增长,价格战可能出现2025–2027

影响评级是基于分析师报告、竞争格局和公司战略作出的定性评估。 时间维度根据产品路线图信号和行业趋势估计。

[CM018, CM019, CM020, CM021, CM022]

2.5 展示要点

Chapter 03

03竞争者

3.1 竞争格局概览

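The voice AI API competitive landscape falls into four tiers. Tier 1 (hyperscalers): AWS Transcribe, Google Cloud Speech-to-Text (Chirp 3), and Azure Speech Services are bundled with their respective cloud ecosystems. Their main advantages are seamless IAM, billing integration, compliance certifications, and what existing cloud-native customers perceive as near-zero marginal cost. They compete on convenience and distribution, not technical leadership. Tier 2 (pure-play API vendors): AssemblyAI, Speechmatics, ElevenLabs (Scribe), and Rev.ai are the developer-facing competitors. AssemblyAI leads in transcription intelligence (sentiment, topics, entity extraction); Speechmatics leads in on-premises deployment for regulated industries (56+ languages); ElevenLabs Scribe v2 leads in multilingual real-time accuracy. Tier 3 (full-stack LLM platforms): OpenAI's GPT-Realtime API ($32 per 1M input audio tokens) bundles STT with LLM inference and is a competitive threat for voice agent builders who want a single vendor. Tier 4 (open source): OpenAI Whisper and NVIDIA Canary Qwen 2.5B are free, self-hostable models that compete for batch, non-latency-critical workloads.

Deepgram's clearest competitive advantage is in real-time voice agent infrastructure: Flux delivers sub-300ms end-of-speech detection latency, Nova-3 posts the lowest batch WER among the benchmarked hosted APIs (5.26%), and the unified Voice Agent API removes the burden of stitching together STT, TTS, and an LLM. As of May 2026, no competitor matches Deepgram on accuracy, latency, and unified orchestration simultaneously for real-time agent workloads. [CP001, CP002, CP003, CP004, CP005, CP006]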

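Competitor profiles
Competitor | Scale / funding | Target customers | Product scope | Strategic direction
Deepgram | $215M total funding; 400+ enterprise customers; $1.3B valuation | Developers / enterprises; real-time voice agents | STT (Nova-3), TTS (Aura-2), Flux CSR, Voice Agent API, Saga OS, and other modules | Become the platform layer of the voice AI economy; expand globally through IBM/AWS
AWS Transcribe | AWS (AMZN market cap $2T) | AWS-native enterprises; contact centers | STT, medical STT, batch | Deeper bundling with Bedrock and Amazon Connect; does not prioritize niche low-latency needs
Google Cloud Speech-to-Text | Google (GOOGL $2T+) | All segments; enterprise, APAC | STT (Chirp 3, 125+ languages), medical / telephony variants | Multimodal AI integration with Gemini; wider language coverage
Azure Speech | Microsoft (MSFT $3T+) | Enterprises; Microsoft 365 customers | STT, TTS, Custom Speech, live captions | Copilot integration; bundled into the enterprise AI stack
AssemblyAI | About $100M total funding (estimate) | Developers; transcription intelligence buyers | STT (Universal-2/3), Slam-1, LeMUR, audio intelligence | Transcription intelligence leader; multilingual Universal-3 Pro
Speechmatics | About $70M total funding (estimate) | Regulated enterprises; on-premises | STT/TTS (56+ languages), on-premises, custom models | Privacy-first enterprise offering; expanding TTS; low-latency voice agents
ElevenLabs | $180M Series C (2025) | Developers; multilingual real-time STT | TTS (leader), Scribe STT, voice agents | Multilingual leader; expanding from TTS to a full voice stack
Rev.ai | Bootstrapped / small scale | Developers / SMB; media transcription | STT (Reverb ASR), batch transcription | Focused on the media and media-tech niche; limited voice agent presence
OpenAI (GPT-Realtime) | Microsoft-backed; valuation about $300B | Developers on the GPT stack | Realtime voice API, Whisper (OSS), GPT-4o Transcribe | LLM and voice in one stack; commoditizes STT as a bundled feature

Funding estimates for AssemblyAI and Speechmatics come from public sources; exact figures are unconfirmed. The OpenAI valuation is from its March 2025 financing.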

[CP001, CP007, CP008, CP009, CP010, CP011]

3.2 竞争对手画像

AWS Transcribe 标准价格为 $0.024/min,批量为 $0.015/min,具备 HIPAA eligibility,并原生集成 AWS 生态。它是深度绑定 AWS 企业的默认选择,但在基准测试中,实时准确率和延迟落后于 Deepgram。Google Cloud Speech-to-Text(Chirp 3)支持 125+ 种语言,并提供医疗和电话通话变体,标准价格为每 1,000 分钟 $16。Azure Speech 支持 100+ 种语言,Custom Speech 微调标准价格为 $1/hour。AssemblyAI Universal-2 定价为 $0.15/hr,Universal-3 Pro 为 $0.21/hr,具备极强多语言准确率和内置转写智能。Speechmatics 起价 $0.24/hr,支持 50 个并发会话、本地部署选项和 56+ 种语言。Rev.ai 提供 pay-as-you-go 模式和免费 5 小时评估档。OpenAI Whisper 开源且可自托管;GPT-Realtime-2 高端实时 API 价格为 $32/1M audio input tokens。 ElevenLabs Scribe v2 Realtime 按 FutureAGI 基准,在 30 种语言上实现约 150ms 延迟,价格为 $0.22–$0.48/hour,目前领先多语言实时 STT。这是 Deepgram 国际扩张叙事中最直接的竞争威胁。OpenAI 的 GPT-Realtime-Whisper 以 $0.034/min 提供流式能力,为已使用 GPT 模型的语音智能体构建者提供 OpenAI-native 的 Deepgram 替代方案。 [CP007, CP008, CP009, CP010, CP011, CP012]

功能与能力矩阵
能力DeepgramAWS TranscribeGoogle STTAzure SpeechAssemblyAISpeechmaticsOpenAI Realtime
实时 STT 延迟低于 300ms(Flux/Nova-3)~500ms+~400ms+~400ms+~300ms(Universal-2)~200ms(低延迟)~200ms(Realtime-2)
批量 STT WER(英语)5.26%(Nova-3)~8–10%(估计)~6–8%(估计)~7–9%(估计)~5.5%(Universal-3)~5–7%(估计)~8.9%(GPT-4o)
TTS是(Aura-2)否(原生)是(有限)否(单独提供)
Voice Agent API(统一)是(Voice Agent API)部分支持(Realtime)
垂直领域微调 / 定制模型是(三因素自动化)是(Custom Vocabulary)是(Custom Classes)是(Custom Speech)是(定制词表)是(定制模型)
本地部署有限
语言支持45+ 种语言100+ 种语言125+ 种语言100+ 种语言99 种语言(Universal-2)56+ 种语言57+ 种语言(Whisper)
音频智能(情绪、主题)有限是(LeMUR、Slam-1)
HIPAA 合规是(Business Associate)是(本地部署)部分支持

延迟和 WER 数字来自 FutureAGI 独立基准指南(2026 年 5 月)和公司文档。 Azure 和 Google 的批量 WER 估计来自公开基准数据;所有模型之间没有受控的头对头测试。

[CP005, CP007, CP008, CP009, CP010, CP011]
定价与包装对比
供应商STT 按量付费STT 企业 / 定制TTS 定价Voice Agent API免费层
Deepgram Nova-3$0.0048/min(streaming)定制企业合同$0.015/1K chars(Aura-2)$4.50/hr(Voice Agent API 价格)$200 额度
Deepgram Flux$0.0077/min(streaming)定制包含在 Voice Agent API 中$4.50/hr(Voice Agent API 价格)$200 额度
AWS Transcribe$0.024/min standard可提供批量折扣~$4/1M chars(Polly)无(DIY 技术栈)60 min 免费 / 月(12 个月)
Google Cloud STT$16/1K min(standard)定制~$4/1M chars(WaveNet)无(DIY)$300 额度
Azure Speech$1/hr standard定制$4/1M chars,标准档无(DIY)5 hr 免费 / 月
AssemblyAI Universal-2$0.15/hr(~$0.0025/min)定制原生无无(DIY)5 hr 免费 / 月
Speechmatics$0.24/hr(paid plan)批量 + 定制可用(有限)无(DIY)2,400 min 免费 / 月
Rev.aiPAYG(未披露 / hr)定制None无(DIY)5 hr 额度
OpenAI GPT-4o Transcribe$6/1K min(batch,估计)定制约 $0.015/1K chars(TTS-1 价格)GPT-Realtime $32/1M 音频 tokens无(API 积分)

价格来自截至 2026 年 6 月的公开标价。企业合同价格需谈判,未公开。 OpenAI GPT-4o Transcribe 价格来自 FutureAGI 基准估算;OpenAI 定价页未确认。 Deepgram 的 $0.015/1K chars TTS 价格来自 Deepgram 定价页;企业费率不同。

[CP007, CP008, CP009, CP010, CP011]

3.3 护城河分析与竞争定位

Deepgram 的可持续竞争优势分为四类。第一,技术架构护城河:基于自有音频数据集训练的端到端深度学习、极致压缩的 latent space 模型、硬件高效推理,使其在 2026 年 5 月前的基准中达到规则式或微调式竞争系统尚未复制的延迟和准确率水平。Deepgram 持有多项美国 ASR 架构专利(US 12,380,880 和 US 12,334,075)。第二,领域定制护城河:Deepgram 的 3-factor 自动模型适配,让企业客户能比任何公开宣称的竞争对手更快,为领域词汇(医疗、法律、QSR drive-thru)微调。NASA、Jack in the Box 和空中交通管制场景验证了其极端环境表现。第三,部署灵活性:云端、自托管和本地部署,加上模型热切换,为受监管企业(金融服务、医疗、政府)提供 hyperscaler 托管服务无法匹配的路径。第四,分销伙伴:AWS SCA 和 IBM watsonx Orchestrate 伙伴关系,把销售渠道带入 Deepgram 仅靠直接开发者 PLG 无法触达的企业采购中心。 企业客户使用 Deepgram 的切换成本不低:组织为医疗、法律或 QSR 词汇微调领域模型,会积累专有训练数据和适配权重,这些资产难以迁移到竞争平台。使用通用词汇的标准化 hyperscaler STT 客户没有这种数据依赖锁定。开发者层客户常常多供应商并用,会同时跑 AssemblyAI 和 Deepgram 做 A/B 评估,这限制了早期锁定,但最终会偏向在具体垂直场景里领域表现更好的供应商。 护城河风险:如果 OpenAI 或 Google 加速实时模型优化,延迟优势可能收窄;hyperscaler 可能补贴准确率提升;除非 Deepgram 专门补齐 APAC/EMEA 语言覆盖,ElevenLabs Scribe 的多语言领先可能延续。通过开源 Whisper 和 NVIDIA Canary,通用英语 STT 在非延迟关键批量工作负载上商品化,是真实威胁。 [CP013, CP014, CP015, CP016, CP017, CP036]

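Moat durability and competitive risk register
Moat factor | Deepgram posture | Durability | Primary risk
Real-time latency (Flux <300ms) | Leads in FutureAGI, May 2026 | Medium-high | OpenAI / Google could close the gap through hardware investment
Batch accuracy (Nova-3, 5.26% WER) | Leads the FutureAGI May 2026 hosted-API benchmark | | AssemblyAI Universal-3 is close; NVIDIA Canary (OSS) is at 5.63% WER
Domain fine-tuning (3-factor automated adaptation) | Unique architectural claim; no peer match found in public materials | | Hyperscalers could add automated fine-tuning at scale
On-premises / self-hosted deployment | Strong; full feature parity with cloud | | Speechmatics also offers on-premises; the advantage is niche
Patent portfolio (US 12,380,880; US 12,334,075) | 2 patents disclosed | | Portfolio is limited; competitors may design around it
AWS + IBM distribution partnerships | First-mover: IBM's first voice partner; AWS SCA | High (near term) | Partnerships are contractual, non-exclusive, and revocable
OfOne restaurant vertical (QSR) | First mover in QSR voice AI | | Jack in the Box uses Deepgram; the vertical takes a hit if Jack in the Box switches vendors
Multilingual real-time STT | 45+ languages, but ElevenLabs Scribe leads in benchmarks | Low-medium | ElevenLabs Scribe v2 reaches 150ms across 30 languages

Durability ratings are qualitative assessments based on technical architecture, partnership exclusivity, and competitor capabilities in public benchmarks as of June 2026.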

[CP013, CP014, CP015, CP016, CP017]
FP001: 竞争定位图(延迟 vs. 准确率)

在实时延迟(Y)与英语 STT 准确率(X)轴上定位 Deepgram 和主要竞争对手。

X = 准确率(越高越好;基于 WER 反向映射为 1–10 分)。Y = 实时延迟(越高代表延迟越低)。评分来自 FutureAGI 基准数据和公司文档的定性换算,不是数学推导。

[CP005, CP006, CP007, CP008, CP009, CP011]
FP002: 功能广度与能力图

每家厂商覆盖的主要语音 AI 能力数量(STT、TTS、语音智能体、本地部署、微调、音频智能)。

能力数量按每个类别(STT、TTS、Voice Agent API、本地部署、微调、音频智能)做简化 0/1 计分。未按每项能力深度加权。

[CP001, CP002, CP003, CP013, CP014]
FP003: 护城河与就绪度 KPI

Deepgram 相对市场的关键竞争就绪度指标。

[CP005, CP013, CP016, CP017]

3.4 展示要点

Chapter 04

04财务

4.1 收入模式与定价架构

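Deepgram's core monetization is usage-based pricing at the API layer across its product lines. Speech-to-text (STT): Nova-3 streaming is priced at $0.0048/min and Flux (optimized for real-time voice agents) at $0.0077/min; both are available on the Pay-As-You-Go tier with no minimum spend and $200 in free credit at signup. Text-to-speech (TTS): Aura-2 is priced at $0.015 per 1,000 characters. Voice Agent API: $4.50/hour, combining STT, TTS, and LLM orchestration in a unified real-time API that reached general availability in June 2025. The Growth plan (prepaid credits) starts at $4,000+/year, saves roughly 20% versus PAYG, and includes higher concurrency limits. Enterprise accounts receive custom pricing, dedicated support, on-premises options, and SLA commitments. Enterprise revenue is almost certainly the largest revenue source in absolute terms, but the mix between PAYG developer revenue and enterprise contracts is not publicly disclosed. Deepgram's OfOne QSR vertical (restaurant drive-thru voice ordering) may use a revenue-share or per-location subscription model, layering vertical SaaS on top of the API business.

The AWS Strategic Collaboration Agreement (SCA, August 2025) and the IBM watsonx Orchestrate partnership (February 2026) add co-sell channels whose economics may differ: embedded pricing at partner-negotiated rates is more likely than public API PAYG prices, which would change gross margin dynamics. Twilio's participation as a Series C strategic investor hints at a deeper commercial integration that could create distribution-linked revenue streams. [CI001, CI002, CI003, CI004, CI005, CI006]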

收入来源表
收入来源产品定价模式价格(公开)备注
STT(流式)Nova-3按分钟 PAYG$0.0048/min实时流式;语音代理最常用
STT(流式)Flux按分钟 PAYG$0.0077/min专为语音代理编排打造;E2E 延迟最低
TTSAura-2按 1K chars PAYG$0.015/1K chars面向语音代理响应的神经 TTS
Voice Agent API统一编排按小时 PAYG$4.50/hr打包 STT + TTS + LLM 编排;较拼接方案节省 80%+
开发者增长计划全部产品年度预付积分$4,000+/年(约节省 20%)较 PAYG 折扣;注册送 $200 免费额度
企业合同全部产品 + 本地部署定制 / 协商未披露SLA、专属支持、本地部署
OfOne QSR 垂直餐厅得来速 AI估计按门店 / 收入分成未披露借 Series C 资金收购;首个语音 AI QSR 垂直
IBM watsonx / AWS SCA 渠道合作伙伴渠道估计嵌入式伙伴定价未披露联合销售;嵌入 watsonx Orchestrate 和 AWS Marketplace

企业和合作伙伴定价未公开披露。OfOne 收入模式按 QSR SaaS 行业惯例估算。 所有公开价格均来自 Deepgram 截至 2026 年 6 月的定价页。

[CI001, CI002, CI003, CI004, CI005, CI006]
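To make the public list prices concrete, the sketch below turns them into a small cost calculator. The prices are the published PAYG rates quoted in the revenue stream table; the monthly workload (10,000 agent hours) and the helper names are hypothetical, and the Growth plan discount is treated as a flat 20%, which is only the approximate saving cited.

```python
# Public PAYG list prices quoted in the revenue stream table
NOVA3_PER_MIN = 0.0048        # Nova-3 streaming STT, $/min
FLUX_PER_MIN = 0.0077         # Flux streaming STT, $/min
AURA2_PER_1K_CHARS = 0.015    # Aura-2 TTS, $/1,000 characters
VOICE_AGENT_PER_HOUR = 4.50   # Voice Agent API, $/hour
GROWTH_DISCOUNT = 0.20        # approximate Growth plan saving vs. PAYG (assumed flat)


def stt_cost(minutes: float, price_per_min: float) -> float:
    """PAYG cost of streaming STT for a given number of audio minutes."""
    return minutes * price_per_min


def tts_cost(characters: int) -> float:
    """PAYG cost of Aura-2 synthesis for a given number of characters."""
    return characters / 1000 * AURA2_PER_1K_CHARS


def voice_agent_cost(hours: float, growth_plan: bool = False) -> float:
    """Voice Agent API cost, optionally with the approximate Growth discount."""
    cost = hours * VOICE_AGENT_PER_HOUR
    return cost * (1 - GROWTH_DISCOUNT) if growth_plan else cost


# Hypothetical workload: 10,000 hours of agent conversations per month
hours = 10_000
agent_payg = voice_agent_cost(hours)
agent_growth = voice_agent_cost(hours, growth_plan=True)
flux_only = stt_cost(hours * 60, FLUX_PER_MIN)
agent_per_min = VOICE_AGENT_PER_HOUR / 60

print(f"Voice Agent API, PAYG:   ${agent_payg:,.0f}")
print(f"Voice Agent API, Growth: ${agent_growth:,.0f}")
print(f"Flux STT only:           ${flux_only:,.0f}")
print(f"Voice Agent API per minute: ${agent_per_min:.3f}")
```

At list price the Voice Agent API works out to $0.075 per minute, roughly ten times the Flux STT rate alone, which illustrates why it carries the highest ACV potential among the listed products.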
定价 / 商业化表
方案免费额度PAYG STT 价格PAYG TTS 价格增长计划企业版
Deepgram$200 额度$0.0048/min(Nova-3 价格)$0.015/1K chars$4K+/年(八折)定制;支持本地部署
AssemblyAI免费 5 hr$0.0025/min(约 $0.15/hr)原生无定制定制;不支持本地部署
AWS Transcribe60 min/mo(12 个月)$0.024/min 标准约 $0.004/1K chars(Polly)用量折扣用量 + 定制;HIPAA
Google Cloud STT$300 额度$0.016/min 标准约 $0.016/min(Standard 标准档)承诺使用定制;多区域
Azure Speech免费 5 hr/mo$0.0167/min 标准$0.004/1K chars 标准承诺使用定制;企业套包
OpenAI (GPT-Realtime)None$0.34/min(音频 tokens 等价)$0.015/1K chars(TTS-1 价格)None定制企业版

价格为截至 2026 年 6 月的公开标价。所有价格均为按量付费;适用用量折扣。AssemblyAI $0.0025/min 由 $0.15/hr 推算。OpenAI GPT-Realtime $32/1M tokens 在典型音频下约等于 $0.34/min。

[CI001, CI002, CI021, CI022, CI023]
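Vendors quote prices per minute, per hour, per 1,000 minutes, and per million tokens, so the comparison table depends on unit conversions. The sketch below reproduces those conversions with hypothetical helper names. The table note does not state a tokens-per-minute rate for OpenAI, so the last helper only backs out the rate that the $0.34/min equivalence would imply; it does not assert what the true rate is.

```python
def per_hour_to_per_min(price_per_hour: float) -> float:
    return price_per_hour / 60


def per_1k_min_to_per_min(price_per_1k_min: float) -> float:
    return price_per_1k_min / 1000


def implied_tokens_per_min(price_per_min: float, price_per_million_tokens: float) -> float:
    """Token rate at which a per-token price equals a given per-minute price."""
    return price_per_min / price_per_million_tokens * 1_000_000


assemblyai = per_hour_to_per_min(0.15)      # $0.15/hr Universal-2
azure = per_hour_to_per_min(1.00)           # $1/hr standard
google = per_1k_min_to_per_min(16.0)        # $16 per 1,000 min standard
openai_rate = implied_tokens_per_min(0.34, 32.0)

print(f"AssemblyAI: ${assemblyai:.4f}/min")
print(f"Azure:      ${azure:.4f}/min")
print(f"Google:     ${google:.4f}/min")
print(f"OpenAI equivalence implies about {openai_rate:,.0f} audio tokens per minute")
```

The implied OpenAI token rate is the assumption to verify before relying on the $0.34/min figure, since the per-minute equivalent scales linearly with it.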
FI001: 收入模型桥

Deepgram 从开发者获客到企业合同和平台扩张的收入转化。

收入值均为估计。开发者 ARPU 和企业 ACV 是分析师代理值,并非披露财务数据。

[CI001, CI002, CI005, CI006, CI007]
FI002: 单位经济桥

基于公开牵引数据和可比 API 基础设施 ACV 基准估算的 Deepgram ARR 区间。

所有数字都是基于公开牵引、定价和可比 SaaS API 公司的分析师估计。Deepgram 尚未公开披露 ARR。区间很宽,反映企业 ACV 分布的不确定性。

[CI012, CI013, CI024, CI034]

4.2 公开牵引指标与财务规模

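Deepgram became cash-flow positive in 2024, a significant operating milestone for a Series B-stage company in an AI infrastructure sector known for heavy compute spend. As of January 2025, Deepgram had 400+ enterprise customers and 200,000+ developers building on the platform. Usage grew 3.3× over the past four years. As of early 2025, cumulative platform metrics included more than 50,000 years of audio processed and more than 1 trillion words transcribed; both are notably higher than the comparable figures disclosed by pure-play API peers at a similar funding stage.

The company has not publicly disclosed ARR or revenue. A rough ARR estimate from public pricing and traction data requires assumptions about ARPU per developer (PAYG may run $50–$500/yr) and enterprise deal size (possibly $100K–$1M+ per enterprise per year). If the 400+ enterprise customers carry a blended ACV of $200K (a conservative estimate), enterprise revenue alone approaches $80M ARR; developer PAYG revenue on top could add another $10–$30M ARR, depending on usage concentration. These are estimates only and are not derived from undisclosed financials.

The Series C press release states that the round will fund OfOne acquisition integration, the new Voice AI Collaboration Hub in San Francisco, an expanded patent portfolio, and the "Powered by Deepgram" partner program. These are growth investments rather than turnaround spending, consistent with the cash-flow-positive baseline. [CI008, CI009, CI010, CI011, CI012, CI013]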
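The ARR arithmetic above can be laid out as a small bridge so the sensitivity to each assumption is visible. All inputs are the analyst assumptions stated in the text (400 enterprise customers, $200K blended ACV, $10–30M developer PAYG), not disclosed financials, and the function name is hypothetical.

```python
def arr_bridge(enterprise_customers: int, blended_acv: float, developer_payg_arr: float) -> float:
    """Estimated ARR = enterprise customers x blended ACV + developer PAYG ARR."""
    return enterprise_customers * blended_acv + developer_payg_arr


enterprise_arr = arr_bridge(400, 200_000, 0)
low_case = arr_bridge(400, 200_000, 10_000_000)
high_case = arr_bridge(400, 200_000, 30_000_000)

print(f"Enterprise only: ${enterprise_arr / 1e6:.0f}M")
print(f"With developer PAYG: ${low_case / 1e6:.0f}M to ${high_case / 1e6:.0f}M")

# Sensitivity: the estimate moves one-for-one with the undisclosed blended ACV
for acv in (100_000, 200_000, 400_000):
    print(f"ACV ${acv:,}: ${arr_bridge(400, acv, 20_000_000) / 1e6:.0f}M")
```

Because enterprise ACV dominates the result, verified ACV distribution is the single most valuable data request for narrowing the range.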

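Unit economics model
Metric | Value | Source / basis | Confidence
Total developers | 200,000+ | BusinessWire press release, January 2025 | High (company-disclosed)
Enterprise customers | 400+ | BusinessWire press release, January 2025 | High (company-disclosed)
Annual usage growth (4-year CAGR) | ~35% (derived from 3.3× over 4 years) | BusinessWire press release, January 2025 | High (company-disclosed)
Cumulative audio processed | 50,000+ years | BusinessWire press release, January 2025 | High (company-disclosed)
Cumulative words transcribed | 1 trillion+ | BusinessWire press release, January 2025 | High (company-disclosed)
Estimated enterprise ACV | $100K–$1M+ (estimate) | Industry proxy; not disclosed | Low (analyst estimate)
Estimated ARR range | $100M–$200M (estimate) | 400+ enterprise customers × ~$200K average (about $80M) plus $10–30M developer PAYG yields roughly $90–110M; the range extends higher if blended ACV exceeds $200K | Low (analyst estimate)
Estimated gross margin | 55–70% (estimate) | AI API infrastructure benchmark; not disclosed by Deepgram | Low (analyst estimate)
Cash flow status (end of 2024) | Cash-flow positive (reported) | BusinessWire, January 2025 | High (company-disclosed)

Estimated metrics are analyst approximations based on comparable API infrastructure companies and public pricing. Deepgram has not publicly disclosed ARR, gross margin, CAC, payback period, or LTV.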

[CI008, CI009, CI010, CI011, CI012, CI013]
FI003: 财务估算区间

基于公开数据和 AI API 基础设施基准估算的 Deepgram 财务参数。

所有财务估计都是分析师近似值;Deepgram 没有公开财务报表。毛利率按相似规模可比 AI API 基础设施公司估算。

[CI017, CI018, CI019, CI030]

4.3 资本充足性、成本结构与财务结论

Deepgram 披露累计融资为 $215M+,其中 2026 年 1 月融资 $130M。公司进入 2025 年时现金流为正,因此 $130M Series C 主要是增长资本,而不是生存资金,这会显著改变对 burn-rate 的假设。Series C 后,$130M 进入一家现金流为正的公司,即使没有收入增长,按当前规模有效 runway 也可能为 4+ 年;但公司明确要加速增长投入(伙伴计划、收购整合、voice AI hub),意味着近期经营费用会上升。 语音 AI API 公司的成本结构主要包括:(1)计算 / 推理成本(用于模型服务的 GPU 集群——高 capex 或云 COGS);(2)研发(模型训练、研究团队);(3)销售与营销(PLG + 企业直销);(4)G&A。Deepgram 的自托管和本地部署选项能降低其为本地客户承担的服务成本(成本转移给客户),同时保留许可收入。规模化云 API 交付的 AI 基础设施运营商,毛利率通常在 50–70%;但早期或成长期玩家常因 GPU 超额配置而更低。Deepgram 未披露毛利率。未见公开债务或项目融资义务。 财务结论:Deepgram 的公开财务画像符合一家 Series B/C 阶段 API 平台,且具备真实 product-market fit(现金流为正、使用量增长、企业采用)。主要承销风险是未披露毛利率(计算成本暴露)、企业合同流失率和净收入留存——这些都没有公开数据。它们构成下一阶段可执行尽调请求。 [CI015, CI016, CI017, CI018, CI019, CI020]
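The runway statement can be read as a constraint on burn. The sketch below assumes, for illustration only, that the full $130M is available as cash (the undisclosed OfOne purchase price is ignored) and uses a hypothetical helper name; it shows the annual net burn at which a 4-year runway would hold.

```python
import math


def runway_years(cash_musd: float, annual_net_burn_musd: float) -> float:
    """Years of runway; infinite when the company burns no net cash."""
    if annual_net_burn_musd <= 0:
        return math.inf
    return cash_musd / annual_net_burn_musd


series_c = 130.0                      # $M raised in January 2026
max_burn_for_4y = series_c / 4        # net burn that exhausts $130M in exactly 4 years

print(f"Break-even or better: {runway_years(series_c, 0)} years")
print(f"4-year runway holds while net burn stays at or below ${max_burn_for_4y:.1f}M/yr")
print(f"At $50M/yr net burn: {runway_years(series_c, 50):.1f} years")
```

A cash-flow-positive company has no fixed runway limit, so the "4+ years" reading only becomes binding once growth spending pushes net burn above roughly $32.5M a year.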

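Capital adequacy
Round | Year | Amount | Lead | Notable investors | Post-money valuation
Seed / pre-Series A | 2016–2017 | ~$2M (estimate) | YC W16 batch | Y Combinator | ~$10M (estimate)
Series A | 2019 | ~$7M (estimate) | Tiger Global (early) | Tiger, Wing VC | ~$30M (estimate)
Series B | 2022 | ~$72M (estimate) | Alkeon Capital | Alkeon, Madrona, In-Q-Tel | ~$400M (estimate)
Series C | January 2026 | $130M (confirmed) | AVP (lead) | Alkeon, In-Q-Tel, Madrona, Tiger, Wing, YC, Alumni Ventures, Columbia U., Princeville Capital, Twilio, SAP, and others | $1.3B (confirmed)
Total funding | 2016–2026 | $215M+ | | | $1.3B (post-Series C)

Seed, Series A, and Series B amounts are estimates from secondary sources; only the Series C is confirmed by the BusinessWire press release, at $130M / $1.3B. The YC batch is Winter 2016 (W16), as stated in Section 1.3. Series B and earlier rounds have not been formally confirmed.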

[CI015, CI016, CI025, CI026]
公开财务缺口表
指标公开可得性尽调路径无法取得时的风险
ARR / 收入未披露向管理层索取;Series C 尽调标准项无法判断资本充足性或增长率
毛利率未披露审阅 P&L;索取计算成本拆分无法评估扩张性与计算成本敞口
净收入留存(NRR)未披露CRM / 队列分析企业粘性和护城河耐久性的关键指标
企业流失率未披露索取队列数据;访谈参考客户必须确认 400+ 是净新增而非总新增
现金消耗率 / 跑道期未披露(报道称现金流为正)索取 Series C 后月度现金流量表需要评估增长投入下的 C 轮后跑道期
CAC / 回本周期未披露销售与营销费用 + 队列数据验证 GTM 效率与 PLG 漏斗经济性
本地部署许可收入占比未披露索取收入分部拆分本地部署可能毛利结构不同
OfOne 收入 / 单位经济模型未披露被收购实体单独 P&LQSR 垂直收购整合风险

这些财务缺口是私营公司 Series C 尽调的标准项。Deepgram 现金流为正、估值 $1.3B, 风险重心已从偿付能力转向增长承保。

[CI027, CI028, CI029, CI030]
FI004: 资本强度 / 现金流图

Deepgram 从 Series C 到增长投资和经营现金流的资本分配。

资本配置来自新闻稿披露的资金用途;单项金额未披露。

[CI015, CI016, CI025]

4.4 展示要点

Chapter 05

05产品与技术

5.1 产品定义与客户工作流

Deepgram 将自己定位为实时语音 AI API 基础设施层,服务构建 voice-native 应用的开发者和企业。其产品嵌入三类核心客户工作流:(1)实时对话和语音智能体工作流:开发者嵌入 Voice Agent API,为客服、销售、餐厅点单和支持自动化创建低延迟对话智能体。智能体通过 Flux STT 模型聆听(针对 <300ms 的语音结束检测优化),通过集成 LLM(用户可配置)处理转写,再通过 Aura-2 TTS 模型响应,全程在单个 WebSocket API 会话中完成,不需要多供应商拼接。(2)批量转写和智能分析工作流:企业(法律、医疗、媒体、合规)通过 REST API 将录音发送给 Nova-3 STT,用于通话后分析、字幕生成和医疗文档。Nova-3 支持说话人分离、智能格式化、主题检测和脱敏。(3)本地部署的受监管企业工作流:政府、国防和金融服务客户在自有基础设施上部署 Deepgram 的 STT/TTS 模型,与云产品保持完整 API parity,音频数据不离开网络边界。 每类工作流都由不同模型 SKU 支撑,价格、延迟曲线和功能集不同,让企业买家能从开发者实验清晰升级到生产级部署。$200 免费开发者额度和 PAYG 定价,借助 product-led growth 降低了新开发者采用门槛。 [CE001, CE002, CE003, CE004]
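The batch workflow in (2) amounts to one authenticated HTTP POST per recording. The sketch below only builds such a request and does not send it. The endpoint path, the `model`, `diarize`, and `smart_format` query parameters, and the `Token` authorization scheme are assumptions based on Deepgram's public REST conventions and should be checked against the current developer documentation; the API key and audio URL are placeholders.

```python
import json
import urllib.parse
import urllib.request

# Assumed pre-recorded transcription endpoint; verify against current Deepgram docs.
DEEPGRAM_LISTEN_URL = "https://api.deepgram.com/v1/listen"


def build_transcription_request(api_key: str, audio_url: str) -> urllib.request.Request:
    """Build (but do not send) a batch transcription request for a hosted recording."""
    params = {
        "model": "nova-3",       # Nova-3 STT model (assumed parameter value)
        "diarize": "true",       # speaker diarization
        "smart_format": "true",  # smart formatting of numbers, dates, punctuation
    }
    url = f"{DEEPGRAM_LISTEN_URL}?{urllib.parse.urlencode(params)}"
    body = json.dumps({"url": audio_url}).encode("utf-8")
    headers = {
        "Authorization": f"Token {api_key}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


req = build_transcription_request("YOUR_API_KEY", "https://example.com/call-recording.wav")
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen` would submit it; that step is left out here so the example runs without credentials or network access. The real-time agent workflow in (1) uses a persistent WebSocket session instead and is not covered by this sketch.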

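Product module / asset matrix
Product | Type | Use cases | Pricing | Key specs
Nova-3 | STT model (batch + streaming) | Batch transcription, post-call analytics, medical documentation | $0.0048/min streaming | 5.26% WER (9 domains), 45+ languages, Nova-3 Medical variant
Nova-3 Medical | STT model (medical variant) | Clinical documentation, EHR integration, HIPAA | Custom enterprise | Optimized for medical terminology; HIPAA BAA available
Flux | STT model (real-time) | Voice agents, live captions, streaming | $0.0077/min streaming | Sub-300ms EOS detection; lowest end-to-end latency (FutureAGI, May 2026)
Aura-2 | TTS model | Voice agent responses, IVR, accessibility | $0.015/1K chars | Low-latency neural synthesis; multiple voices
Voice Agent API | Unified orchestration | Real-time conversational AI agents | $4.50/hr | STT + TTS + LLM in a single WebSocket API; sub-300ms round trip
Domain adaptation | Fine-tuning service | Proprietary vocabulary (legal, QSR, finance) | Custom enterprise | 3-factor automated adaptation; data flywheel lock-in
Self-hosted deployment | On-premises / cloud-hosted | Regulated enterprises (government, healthcare, finance) | Custom enterprise | Full API parity; Docker/K8s; air-gap support

All prices are from Deepgram's pricing page as of June 2026. Nova-3 Medical uses custom enterprise pricing. Saga OS is an internal platform abstraction layer mentioned in company materials but is not sold separately.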

[CE001, CE005, CE006, CE007, CE008]
工作流 / 用例表
垂直用例使用产品满足的关键要求参考客户
联络中心 / BPO实时坐席辅助、QA、通话转写Nova-3、Flux、Voice Agent API 等产品Sub-300ms 延迟;噪声环境准确率未披露(企业)
QSR / 餐厅得来速语音点单OfOne 平台(Deepgram 驱动)实时点单准确率;环境噪声鲁棒性Jack in the Box(NetworkWorld 案例)
医疗健康医疗转录、临床文档Nova-3 MedicalHIPAA BAA;医疗词汇;说话人分离未公开点名
政府 / 国防天地音频、安全通信转录本地部署 Nova-3天地音频准确率 89.6%;隔离网络部署NASA
开发者 / ISVVoice AI SaaS 应用、会议工具、无障碍功能Nova-3、Flux、Voice Agent API 等产品(PAYG)对开发者友好的 API;$200 免费额度;低延迟 SDK200,000+ 名开发者
企业 AI(IBM watsonx)代理式企业工作流、语音命令Deepgram 嵌入 watsonx Orchestrate企业集成;本地部署选项;HIPAAIBM 企业客户

参考客户来自公开案例研究和新闻稿。医疗健康客户名称未公开披露。NetworkWorld 的 Deepgram 概览文章提到 Jack in the Box。

[CE002, CE003, CE015]
FE002: 客户工作流 / 运行流

Deepgram 的语音智能体工作流,从现场音频实时转为智能体回复。

延迟数据来自 FutureAGI 2026 年 5 月基准指南。LLM 延迟因供应商和模型而异,不包含在 Deepgram 专属延迟声明中。

[CE002, CE006, CE009]

5.2 技术架构与平台组件

Deepgram 的核心技术是用于自动语音识别(ASR)的端到端(E2E)深度学习架构,与传统流水线 ASR(声学模型 + 语言模型 + 解码器)形成对比。E2E 方法训练单个神经网络,将原始音频波形直接映射为文本,同时提高准确率和硬件推理效率。这一架构由两项美国专利保护:US 12,380,880(带 transformer 架构的端到端 ASR)和 US 12,334,075(硬件高效 ASR)。专利描述的系统能以每分钟推理显著更低的算力要求实现有竞争力的 WER,这是 Deepgram 相比 hyperscaler 具备定价优势的基础。 Nova-3 模型(2025 年 2 月发布)针对 9 个音频领域和 45+ 种语言的批量及流式 STT 优化,并提供领域专用模型(医疗、金融、法律、汽车、对话)。Flux 是专为对话式语音识别打造的模型,针对实时智能体场景的语音结束(EOS)检测优化,可在语音结束到转写交付之间实现低于 300ms 的延迟——这对语音智能体响应至关重要。Aura-2 是 Deepgram 第二代神经 TTS 模型,为智能体响应提供低延迟、自然的语音合成。Voice Agent API(2025 年 6 月 GA)把三类模型和 LLM 编排抽象进单个基于 WebSocket 的 API,消除了多跳 STT→LLM→TTS 栈带来的延迟叠加。 Deepgram 的 3-factor 自动领域适配,让企业客户能通过半自动微调流水线,为专有词汇定制模型。客户音频语料可提交做领域适配,无需手工修改模型架构。这是「data flywheel」护城河的主要机制——客户把模型微调到专有垂直数据(医疗、法律、QSR)上,会以适配模型权重的形式积累切换成本。 [CE005, CE006, CE007, CE008, CE009, CE010]

技术 / 运营架构表
组件描述差异化
E2E 深度学习 ASR 核心单个神经网络把原始音频映射为文本;不拆成流水线每分钟推理的算力成本低于传统流水线 ASR;支撑 Deepgram 的价格优势
Transformer 架构(Nova-3)基于 Transformer 的语言模型,用于具备上下文感知的 STT专利 US 12,380,880;无需重构流水线即可做领域适配
硬件高效推理用于模型服务的专有潜在空间压缩专利 US 12,334,075;可在通用硬件上跑出有竞争力的定价和本地部署
Flux EOS 检测面向语音结束检测的专用对话语音模型语音代理延迟低于 300ms;不是通用 STT 模型
3 因子领域适配自动微调流水线,接收客户音频语料不需要手工 ML 工程;生成客户专属适配模型
WebSocket 流式 API面向实时转录和 TTS 的低延迟双向流单条持久连接比 REST 轮询更能压低往返延迟
Aura-2 神经 TTS面向语音代理回复合成的低延迟神经文本转语音集成在 Voice Agent API 中;消除拼接 TTS 供应商带来的延迟

专利细节来自 Google Patents(US12380880、US12334075)。架构描述基于公司资料和 Deepgram 开发者文档。延迟数字来自 FutureAGI 2026 年 5 月基准测试。

[CE005, CE006, CE007, CE008, CE009, CE011]
FE001: 产品架构图

Deepgram 的产品架构栈,从音频输入经 API 延伸到应用层。

[CE001, CE005, CE007, CE008, CE009]
FE003: 关键依赖图

Deepgram 交付产品时依赖的关键技术与商业要素。

[CE010, CE011, CE012, CE013]

5.3 部署、集成、合规与路线图

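Deepgram offers three deployment modes: (1) Cloud API, delivered as managed SaaS through the deepgram.com API with WebSocket and REST endpoints; (2) Self-hosted, deployed as Docker/Kubernetes containers in the customer's AWS, GCP, or Azure environment; (3) On-premises, which can be fully air-gapped in the customer's data center with no outbound API calls. Self-hosted and on-premises models maintain full API parity with the cloud product, so regulated enterprises can move from cloud to on-premises without changing SDKs.

Integration surfaces include a REST API (batch transcription), a WebSocket API (streaming STT and Voice Agent), SDKs (Python, JavaScript/TypeScript, Go, .NET, Ruby, PHP), a CLI, and an MCP Server for AI coding tools. Status monitoring is at status.deepgram.com; the publicly disclosed uptime history shows two incidents in 2024, both resolved within 4 hours. HIPAA Business Associate Agreements are available on all paid plans, and the pricing page lists HIPAA compliance as a feature of every paid plan. Deepgram's data privacy policy supports a zero-retention mode for sensitive workloads (audio is not stored after transcription).

Roadmap signals from the blog and product announcements include multilingual Flux (Flux Multilingual was announced in June 2026), additional domain-specific Nova-3 models, expanded Saga OS voice agent operating system capabilities, and OfOne restaurant AI integration. The IBM watsonx and AWS SCA partnerships suggest the parties may co-develop voice agent scenarios for enterprise customers, which would accelerate product features for regulated industries (healthcare, financial services). The Powered by Deepgram program certifies ISV partners building on Deepgram infrastructure. [CE012, CE013, CE014, CE015, CE016, CE017]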

信任 / 质量 / 合规表
领域状态覆盖范围缺口 / 备注
HIPAA BAA所有付费计划均可用医疗健康、政府、受监管企业已声明 HIPAA 合规;正式审计状态未公开披露
数据留存可使用零留存模式零留存模式下,转录后不存储音频零留存模式需要选择启用;默认留存政策未完全公开
本地部署 / 隔离网络本地部署选项具备完整 API 对等能力需要网络边界隔离的政府、国防、金融通过企业合同提供;无自助式本地部署选项
SOC 2 Type II截至 2026 年 6 月,Deepgram 网站未公开确认有非正式声称,但未出现在信任中心信任中心缺席会增加企业买家的销售摩擦
ISO 27001未公开确认需要认证的企业采购标准
FedRAMP未公开确认美国联邦机构直接采购需要NASA 用例显示可能存在非正式合规路径;并非正式 FedRAMP
GDPR适用于欧盟区域数据;可提供 BAA欧盟企业客户;本地部署支持数据主权营销露出不如 Speechmatics 的 GDPR-first 定位突出

合规状态来自 Deepgram 定价页、开发者文档和 goodwinlaw.com 分析。信任中心未公开 SOC 2 Type II 或 ISO 27001 认证,被记为缺口。

[CE014, CE015, CE016, CE017]
路线图 / 发布 / 开发阶段表
产品 / 功能状态(截至 2026 年 6 月)发布信号战略意义
Nova-3 STT正式可用 — 当前旗舰2025 年 2 月发布准确率护城河;FutureAGI 基准测试 WER 为 5.26%
Flux(EOS 优化)正式可用 — 实时代理2025 年发布(日期推断)面向语音代理市场的延迟护城河
Flux Multilingual2026 年 6 月宣布正式可用2026 年 6 月博客文章多语种扩张;缩小国际市场上与 ElevenLabs Scribe 的差距
Aura-2 TTS正式可用 — 当前旗舰2024-2025 年发布面向语音代理的一体化 TTS;补齐 STT+TTS 栈
Voice Agent API2025 年 6 月以来正式可用2025 年 6 月 BusinessWire 公告平台整合;$4.50/hr 定价;关键增长产品
Saga OS开发中 / 部分正式可用C 轮新闻稿提及语音代理操作系统层;下一代平台抽象
OfOne 餐厅 AI收购后整合进行中2026 年 1 月 C 轮(融资支持收购)QSR 垂直锁定;按门店计费的 SaaS 收入模式
IBM watsonx 语音2026 年 2 月以来可用2026 年 2 月 IBM Newsroom 公告企业渠道分发;IBM 首个语音 AI 合作伙伴

Flux Multilingual 信号来自 Deepgram 2026 年 6 月博客文章。Saga OS 状态来自 C 轮新闻稿中的提法。路线图项目根据公开公告推断;Deepgram 未发布正式路线图。

[CE005, CE008, CE013, CE014]
FE004: Product maturity / capability map

Key product capability metrics for Deepgram's current product suite.

[CE005, CE006, CE009, CE014]

5.4 展示要点

Chapter 06

06客户

6.1 客户基础分层与采用面

Deepgram 的公开客户证据指向一个多层客户基础,而不是单一同质账户池。最宽的漏斗顶部由开发者驱动:公司材料反复提到 200,000+ 名开发者使用平台,而面向企业的材料又单独提到 400+ 家企业客户和数百个企业部署。这两个数字不能混同。开发者数字描述的是 product-led 覆盖面;企业数量描述的是商业成熟度不同的付费或签约组织。公开材料还把 GTM 分成三条路径:直接服务内部使用语音 AI 的企业买家、把 Deepgram 嵌入自身产品的技术 ISV,以及通过 AWS 和 IBM 触达企业的伙伴中介模式。 按工作负载看,分层证据最强。联络中心、对话式 AI 构建者、医疗运营方和媒体平台各有专门解决方案页;AWS 和 Amazon Connect 材料则展示了联络中心和受监管买家如何在不把 Deepgram 当成绿地 ML 项目的情况下采购或部署。Twilio 参考架构进一步证明,电话系统构建者可以在既有通话流中采用 Deepgram。缺失的是按地域、公司规模、ACV 区间或收入贡献拆分的分层。因此,客户数量口径有助于判断规模,但对承销客户组合质量或集中度仍然较弱。缺失的分层也遮蔽了定价权和垂直集中度。[CU001, CU002, CU003, CU004, CU005, CU006]

客户分层表
分层买方用户付款方用例 / 工作负载公开证据 / 规模战略价值 / 缺口
开发者自助 API个人开发者 / 初创公司工程师应用构建者绑卡 PAYG 账户原型验证 STT、TTS 和语音代理工作流200,000+ 名开发者;$200 免费额度;文档和参考构建顶部漏斗触达很大,但按地域和账户规模划分的转化未披露
嵌入式 ISV 工作流工具产品或工程负责人ISV 产品终端用户嵌入 Deepgram 的软件供应商会议智能、客户成功工具、销售赋能、机器人UpdateAI 和 Nytro.AI 案例研究;Vocinity 出现在已构建案例落地页嵌入式打法证据强,但活跃 ISV 客户数或 ARR 构成未公开
企业联络中心 / CCaaSCX 运营或平台负责人坐席、主管、QA、自动化团队企业合同实时转录、坐席辅助、QA、IVR、分析专门的联络中心页面,加上 AWS 和 Amazon Connect 材料ACV 潜力大,但公开点名客户标识和续约数据仍薄
医疗服务方 / 医疗科技临床运营或 IT临床医生、员工、患者服务方或供应商合同医疗转录、患者沟通、语音代理医疗健康解决方案页面和企业 HIPAA 声明受监管增长路径清楚,但本次未抓取到具名医疗健康部署
媒体 / 播客 / 内容平台内容运营或产品编辑、创作者、听众平台或企业媒体账户字幕、搜索、审核、摘要、分析媒体解决方案页面,以及 Podsights-at-Spotify 证言用例匹配度高,但未披露客户数量
企业 AI 渠道平台负责人或联盟负责人合作伙伴开发者和企业用户联合企业账户或渠道销售watsonx 语音工作流、AWS 采购、电信代理IBM 合作、AWS 采购路径、Twilio 构建模式渠道杠杆可能加速商业化推进,但合作伙伴来源收入集中度未知

分层根据公开案例研究、解决方案页面、合作伙伴页面和开发者工作流推断;Deepgram 不按地域、规模或收入区间披露客户构成。

[CU001, CU002, CU006, CU008, CU009, CU010]
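Customer growth / adoption trajectory table
Metric | Value | Date / cohort | Source basis | Confidence | Implication | Missing denominator
Enterprise customers | 400+ | January 2025 operating update | Deepgram announcement; cited in 2026 press materials |  | Meaningful enterprise adoption already exists | No split between direct and partner-sourced accounts, by geography, or between active and cumulative
Developers | 200,000+ | 2025-2026 public materials | Series C and IBM partnership materials |  | Wide PLG funnel | Developer-to-paid-enterprise conversion rate not disclosed
Annual usage growth | 3.3x across 4 years (annual vs. cumulative basis unspecified) | January 2025 operating update | Deepgram announcement |  | Usage has compounded significantly | No base-year usage denominator or customer-cohort attribution
Audio processed | 50,000+ years | 2025-2026 public materials | Deepgram announcement and Series C press release |  | Workloads at scale support enterprise-grade maturity | Distribution of audio volume across accounts not disclosed
Words transcribed | 1T+ | 2025-2026 public materials | Deepgram announcement and Series C press release |  | Large cumulative processing footprint | No split by batch vs. streaming, vertical, or paid vs. free usage
Deployment scale | Thousands of AI models; trillions of seconds of speech | Current enterprise page | Enterprise page |  | Suggests many real workloads beyond demos | Deployments not mapped to paying customers or retention

The trajectory deliberately mixes customer counts and workload counts to separate adoption breadth from named evidence; company disclosures provide no cohort denominators or segment-level roll-forward data.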

[CU001, CU002, CU003, CU004, CU005, CU007]
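The usage-growth row shows why the missing denominator matters. The public wording ("3.3x" over four years) supports two very different readings, and the sources reviewed do not say which applies. The sketch below simply computes both; it is an illustration of the ambiguity, not an estimate of Deepgram's actual usage.

```python
REPORTED_MULTIPLE = 3.3  # figure from the January 2025 operating update
YEARS = 4                # period cited alongside the figure

# Reading A: usage grew 3.3x in total over the four years.
# The implied compound annual growth rate is the fourth root minus one.
cagr_if_cumulative = REPORTED_MULTIPLE ** (1 / YEARS) - 1

# Reading B: usage grew 3.3x in each of the four years.
# The implied cumulative multiple is 3.3 raised to the fourth power.
cumulative_if_annual = REPORTED_MULTIPLE ** YEARS

print(f"Reading A: {cagr_if_cumulative:.1%} implied annual growth")
print(f"Reading B: {cumulative_if_annual:.1f}x implied four-year growth")
```

The gap between the two readings (roughly 35% a year versus more than 100x over the period) is large enough that diligence should ask for base-year and ending-year usage volumes directly.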
FU001: 客户旅程图

Deepgram 先靠开发者和参考构建落地,再通过企业控制能力和合作伙伴渠道扩张。

[CU006, CU011, CU012, CU013, CU032, CU036]
FU002: 采用 / 部署漏斗

公开证据显示,Deepgram 有一条可重复路径:从自助试用到生产部署,再到交叉销售。

该流程综合公开采用证据,并非量化转化漏斗;Deepgram 未披露逐阶段转化率。

[CU012, CU014, CU017, CU019, CU033, CU035]

6.2 具名客户证明与参考质量

本轮最强的具名客户证明集中在三个有实质工作流细节的部署:NASA、UpdateAI 和 Nytro.AI。NASA 是最清晰的企业级参考,因为案例研究解释了采购竞赛、部署问题、四个独立使用场景,以及困难音频上的量化转写结果。UpdateAI 和 Nytro.AI 提供的是另一类证明:两者都是嵌入式软件厂商,而非终端企业,但都明确表示 Deepgram 位于其产品的生产后端,并说明替代供应商为何在准确率、延迟或可靠性上落败。这让它们比泛泛的 logo wall 更强,也更贴近 Deepgram 的 ISV 驱动收入路径。 其他名称需要更谨慎对待。built-with 落地页列出了更多生态构建者,NetworkWorld 也报道 Jack in the Box 使用由 Deepgram 支撑的语音点单,但这些参考在本轮中没有达到 NASA、UpdateAI 或 Nytro.AI 的文档质量。实际结论是:Deepgram 有可信的具名证明,但公开证明密度仍窄于企业客户数量标题。因此,logo 应被视为方向性有用;只有文档最充分的部署,才适合支撑对生产成熟度的承销判断。[CU014, CU015, CU016, CU017, CU018, CU019]

具名客户证据表
客户分层部署 / 用例生产 vs 试点结果 / 证据限制
NASA政府 / 太空运营天地通信、Neutral Buoyancy Lab 音频、IRIS 医疗聊天机器人、历史任务音频搜索四个当前用例均已投产;IRIS 未来 ISS 部署已被提及在试用主要供应商后被选中;天地音频准确率最高 89.6%,NBL 验证集 WRR 约 87%公开证据在工作流细节上很丰富,但除具名用例外,未披露合同金额、续约或部署规模
UpdateAI客户成功 SaaS用于 Zoom 和线上客户成功会议的行动项检测引擎嵌入式工作流已投产UpdateAI 称 Deepgram 是其引擎基础,并称在选择 Deepgram 前测试了六家供应商,原因是准确率和实时速度未披露合同期限、使用量或扩张指标;证据质量为证言加案例研究
Nytro.AI销售赋能 SaaS面向推介智能和销售准备工作流的嵌入式 STT 后端嵌入式工作流已投产Nytro.AI 称 Deepgram 是产品核心,并报告准确率约 90-92% / 90%+,而替代方案为 75-80%未公开席位数、ACV 或续约历史;证据来自客户引述,但仍托管在供应商页面

行项目仅限本次抓取到至少两个公开来源、且工作流细节足以区分生产使用与仅客户标识证明的具名部署。

[CU014, CU015, CU016, CU017, CU018, CU019]
FU003: 客户证据矩阵

公开客户参考在证据质量上差异很大,NASA 最强,单一来源的餐饮客户证据明显更弱。

评分是编辑部速记:5 代表本章公开证据最强。多数具名证据来自供应商托管或单一来源,因此独立佐证普遍偏低。

[CU014, CU017, CU019, CU021, CU022, CU026]

6.3 耐久性、扩张路径与集中度风险

Deepgram 的公开材料在采用和产品宽度上远强于耐久性。已审阅来源没有披露 NRR、GRR、流失、合同期限或头部客户集中度,因此不能仅凭 400+ 家企业这一标题推断客户质量。最好的正面耐久性信号来自证言,而不是财务:UpdateAI 和 Nytro.AI 都称 Deepgram 是其产品的基础,PeerSpot 的独立评价聚合也强调速度、准确率、低延迟和成本。但同一评价聚合也暴露出语言覆盖、实时转写稳定性、说话人识别和并发问题,说明满意度并非单向一致。 扩张逻辑仍然清晰。Deepgram 可以从 STT 落地,再扩展到 TTS、分析和 Voice Agent API;也能通过 AWS 采购、Amazon Connect、IBM watsonx 分销和 Twilio 电话工作流扩大商业覆盖。风险在于,续约和集中度的公开证明没有跟上更宽的平台叙事。RFP.wiki 的采购说明明确建议买家压力测试可靠性、可观测性、回滚和定价现实性;Goodwin 的隐私分析则说明,受监管客户在扩大使用前,为什么可能要求更强的同意、留存和供应商控制证据。由于 Deepgram 的 Amazon Connect 路径目前只支持 hosted customers,即便渠道扩张,也尚未对所有客户类型做到部署中立。[CU023, CU024, CU025, CU028, CU029, CU030]

留存 / 重复使用 / 满意度表
指标数值分层置信度证据 / 尽调问题
公开 NRR企业直营 / 渠道账户审阅来源均未披露 NRR;索要按批次和渠道划分的队列留存
公开 GRR / 流失企业直营 / 渠道账户审阅来源均未披露 GRR 或流失;索要客户总流失和收入流失
合同期限 / 多年期占比企业和受监管买家未公开披露年度与多年期合同构成;索要合同账簿摘要
头部客户集中度第一大账户 / 前 10 大账户 / 合作伙伴渠道未发现公开的头部客户收入占比或前 10 大集中度指标
独立评价信号正负混合广泛用户群PeerSpot 称赞速度、延迟、准确率和成本,但也指出语言覆盖、实时转录稳定性和并发问题
推荐质量正面,但不足以证明留存具名 ISV 推荐UpdateAI 和 Nytro.AI 给出强推荐和工作流细节,但未提供续约、扩张或合同期限数据

Deepgram 未公开披露留存指标,因此空值是有意保留;证言质量和评价聚合不能替代队列留存或收入集中度数据。

[CU024, CU025, CU026, CU027, CU028, CU029]
扩张与集中风险表
扩张驱动集中风险 / 摩擦证据影响尽调路径
Voice Agent API 向上销售更高钱包份额取决于可靠性和编排质量SpeechTechMag、对话式 AI 页面、Twilio 工作流可把账户从原始 STT 支出迁移到完整语音到语音平台支出询问仅 STT 账户转入 Voice Agent API 的附加率
AWS 采购和 Amazon Connect依赖合作伙伴 / 渠道;Connect 路径目前仅托管AWS 合作伙伴页面和 Amazon Connect 文档可缩短联络中心采购和部署周期,但收入可能偏向渠道主导账户索要 AWS 来源 ARR、托管与自托管构成,以及 Connect 管线转化
IBM watsonx 路径合作伙伴中介管线可能让企业触达集中到 IBMIBM 新闻中心公告打开更多企业购买中心和受监管工作流索要共同销售管线、成交构成和收入分成经济性
Twilio / 电信生态参考构建采用不一定等于长期生产留存Twilio 博客和 Deepgram 的 Twilio 构建指南提升开发者获取效率,也强化电话场景的相关性索取生产电话工作负载的请求量,以及电话细分客户的流失率
受监管垂直行业扩张隐私、同意与供应商控制审查可能拖慢采用Goodwin 隐私分析和医疗页面对医疗、客服录音和敏感对话很关键尽调中审查 BAA、同意 UX、留存设置和审计材料
公开证明集中公开材料中,具名且细节充分的部署只有少数几个NASA、UpdateAI 和 Nytro.AI 案例研究主导公开证明企业客户数量的标题口径,大于当前公开参考案例的深度索取 10 个跨细分市场的参考客户,附续约和支出历史

本表把扩张逻辑和集中风险拆开:同样能加速 GTM 的渠道,也可能让分销集中,或在不披露伙伴来源经济性的情况下掩盖留存问题。

[CU012, CU013, CU030, CU031, CU032, CU033]
FU004: 留存披露快照

Deepgram 披露足以证明采用面,但仅靠公开材料还不足以验证耐久性。

该 KPI 图替代原计划的留存队列图,因为没有公开的时间序列留存百分比可用于绘制真实队列图。

[CU024, CU025, CU028, CU029, CU030, CU031]

6.4 展示要点

Chapter 07

07风险

7.1 按严重程度排序的风险图谱

Deepgram 的头部风险,不在于已经爆出某个单一事件,而在于三件事叠在一起:受监管数据暴露、平台依赖,以及执行面铺得太宽。法律风险最核心的部分,并不是已审阅材料里出现了 Deepgram 相关诉讼;而是 Illinois BIPA 明确把声纹视为生物识别标识,同时当前法律评论认为,AI 会议记录、说话人归因、归档转录文本,正是如今最容易引来集体诉讼关注的工作流。Deepgram 销售转录、语音智能体和医疗工作流,涉及说话人身份与留存。一旦客户实施中在通知、同意、留存、删除控制上不够严密,却收集或推断类似声纹的数据,公司的风险画像就会明显变化。 第二大风险是医疗与安全控制能否执行到位。Deepgram 给出的缓释手段有可信度——SOC 2、HIPAA 姿态、按需提供 BAA、RBAC、备份和事件响应——但 HHS 拟议的 HIPAA Security Rule 会把业务伙伴的运营门槛抬高,比泛泛的信任叙事更具规定性、更可测试,也更依赖文档。第三是依赖风险:AWS 在采购、部署和托管模型路径中反复出现;IBM 是新的渠道放大器;参考语音智能体栈也可能在一个循环里依赖多个外部供应商。第四是开源语音模型和超大规模云厂商带来的竞争与经济压力。第五是执行风险:公司试图在公开披露深度跟上之前,同时扩展 STT、TTS、语音智能体、医疗和渠道动作。因此,这是一组真实、可排序、可监测的风险,但目前并未锚定在已记录的 Deepgram 特定案件事件上。[CR001, CR002, CR003, CR004, CR009, CR012]

监管 / 法律风险登记表
风险 / 规则司法辖区 / 触发场景证据状态可能性严重性缓释成熟度剩余风险尽调路径
BIPA 声纹同意与留存风险Illinois / 任何涉及 Illinois 参与者的工作流声纹在覆盖范围内;AI 会议记录工具诉讼活跃;未见 Deepgram 相关案件证据中高使用说话人归因的会议、呼叫中心和医疗工作流风险高承销前审查产品级同意 UX、Illinois 排除安排、留存计划和客户赔偿条款
HIPAA Security Rule 收紧对商业伙伴的要求美国医疗Deepgram 提供 BAA 路径并声称符合 HIPAA,但 HHS 拟议规则显著提高控制、测试和文档要求医疗收入计划占比高时为中高获取现行 BAA、安全证明、年度风险分析材料,以及应对拟议规则变化的落地路线图
跨境隐私与数据主权错配欧盟 / 跨国部署有 EU 端点,但具体国家可能变化,且部分托管提供商缺少 EU 专属区域性中高客户需要特定国家托管或非 OpenAI 托管提供商时为中确认特定国家托管需求、托管提供商路由,以及何时需要 Dedicated 或自托管
开源 / IP / 许可外溢全球开源语音模型已是可行替代方案,相邻平台文件也提示开源和 AI 使用法律风险低中如果 Deepgram 在激烈价格压力下捆绑或接入第三方模型,则为中审查第三方模型许可政策、开源治理,以及客户合同如何处理第三方组件
生物识别和 AI 语音诉讼总体趋势美国多州2025-2026 年法律评论显示,BIPA 诉讼、集体仲裁,以及向 AI 语音和会议工具外溢仍在持续中高低中风险可能快于产品特定判例扩散,因此为中按州逐项映射客户用例中的说话人识别、存储和训练数据留存控制

各行是按严重性排序的风险类别,并不表示 Deepgram 已经是任何所列事项的被告;公开来源也无法提供逐司法辖区的完整案件清单。

[CR001, CR002, CR003, CR004, CR005, CR006]
FR001: 剩余风险热力图

剩余风险视角按可能性、影响、缓释成熟度和残余敞口,对 Deepgram 主要承保关注点排序。

矩阵标签把引用证据归入承保风险桶,并非声称量化概率。

[CR009, CR013, CR019, CR025, CR043, CR044]

7.2 运营与依赖暴露

买方能够有意识地选择架构时,Deepgram 的公开缓释叙事最有力。公司提供托管、专用、自托管和客户云部署模式,也提供 EU endpoint,以及通过 Connect、SageMaker、Bedrock、Marketplace 和类似 PrivateLink 的连接方式走 AWS 原生路径。这些选项重要,是因为底层暴露就在文档里。Deepgram 的速率限制按套餐和项目约束并发;公司明确禁止通过拆分项目绕过上限。EU endpoint 有助于区域内处理,但并非每个模型或托管提供商路径在当地表现都一样,文档称托管提供商侧目前只有 OpenAI 通过 EU 基础设施路由。Amazon Connect 支持目前也仅限托管,这意味着对于要求自托管的买方,最容易接入联络中心的路径还不是部署中立的。 这直接引出依赖风险。AWS 不只是云场地;它同时是采购路径、部署表面和模型编排层。IBM 扩大了企业分销,但也带来新的渠道依赖。Twilio 公布的架构说明,一个生产级语音智能体很快就会变成多供应商链条,电话、语音、推理和合成分别落在不同提供商手里。Deepgram 可以用自托管或专用部署缓释一部分风险,但公司自己的部署文档也说明,自托管会把基础设施、备份和可用性责任推向客户。实际结果是,公司可以通过外部化控制降低部分隐私和主权风险,但如果客户自运营表现不佳,Deepgram 仍会承担有意义的品牌和支持暴露。因此,本章对运营的判断不是 Deepgram 缺少缓释手段;而是这些缓释手段往往是在用一种暴露换另一种暴露。[CR017, CR018, CR019, CR020, CR021, CR022]

运营 / 质量 / 安全风险登记表
失效模式公开证据可能性严重性缓释成熟度剩余风险主要未解缺口
并发或吞吐瓶颈公开费率限制约束 PAYG 语音代理和语音工作负载,并禁止拆分项目规避用量突然冲高时为中高客户特定吞吐、排队和 SLA 条款未公开
安全控制执行漂移Deepgram 披露 SOC 2、RBAC、2FA、备份和事件响应,但公开材料没有展示审计细节或泄露事后复盘除一般性表述和拟议医疗要求外,未公开控制测试节奏
区域或提供商错配EU 端点有功能限制,目前只有部分托管提供商路线具备完整区域性中高国家级托管承诺和非 OpenAI 托管提供商区域规划未公开
联系中心部署路径错配Amazon Connect 支持目前仅限托管模式,要求自托管的买家默认选择变窄低中自托管 Connect 支持时间表未公开
客户自托管不稳定自托管能缓解隐私顾虑,但把基础设施、监控和备份责任转给客户团队中高参考架构没有披露客户管理部署所需的最低人员配置或操作错误率

运营风险结合了 Deepgram 官方设计约束和已披露缓释措施;缺少客户特定 SLA 和事件细节,使剩余风险仍高于低位。

[CR014, CR015, CR017, CR018, CR019, CR020]
伙伴 / 依赖风险登记表
依赖项交易对手 / 层级在技术栈或 GTM 中的作用集中度信号失效场景严重性缓释剩余风险
云与市场路径AWS采购、部署、Connect、SageMaker、Bedrock 和 GPU 托管界面AWS 出现在多个官方部署和 GTM 路径中AWS 路线出现商业或技术摩擦,会拖慢高价值部署或抬高交付成本自托管、Dedicated,以及其他云 / 本地选项中高
企业分销渠道IBMwatsonx Orchestrate 分销和嵌入式语音路径IBM 被描述为 Deepgram 的首个语音伙伴伙伴调整优先级或渠道转化偏弱,会削弱预期的企业管道杠杆中高直销和其他伙伴路线
电话和编排层Twilio 及类似通信伙伴参考语音代理栈使用外部电话和流式传输实时电话代理部署可能依赖外部通信提供商伙伴宕机、政策变化或定价调整会拖累终端客户体验中高替代通信伙伴和非电话渠道
托管 LLM 提供商OpenAI 和 Bedrock 托管模型部分托管语音代理路径的推理层OpenAI 的 EU 路由明确,但其他提供商并非全部明确提供商宕机、延迟激增或区域错配,会削弱 Deepgram 托管代理承诺中高客户自选模型、自托管部署和架构灵活性
客户控制的基础设施选项客户 DevOps 团队自托管可以成为合规答案,但依赖客户运维质量部署文档把正常运行时间和备份责任转给客户即使托管外部化,客户运维差仍会反噬 Deepgram 产品口碑Dedicated 部署和实施支持

集中度只是方向性判断,因为公开材料没有披露伙伴来源收入占比,也没有披露各路径在部署中的占比。

[CR019, CR022, CR023, CR024, CR025, CR026]
FR002: 风险传导图

隐私、安全、伙伴和定价风险如何传导到采用、利润率和估值结果。

[CR012, CR025, CR028, CR032, CR035, CR044]
FR003: 依赖图

Deepgram 受监管和企业语音 AI 路径下可见的平台依赖。

[CR019, CR022, CR023, CR025, CR027, CR041]

7.3 剩余暴露、缓释措施与止损条件

承销问题不是 Deepgram 有没有可信产品,甚至也不是有没有可信的缓释工具箱;公开证据支持两者。真正的问题是,在竞争和监管压缩公司学习空间之前,这些缓释措施是否足够成熟、足够可复制、文档也足够完整,能服务最敏感的买方。开源语音模型和超大规模云厂商栈已经给买方提供了控制叙事,即便 Deepgram 在延迟或特定托管基准上仍能胜出。Twilio 和 SoundHound 等相邻上市公司的文件也强化了同一点:隐私控制、部署灵活性、开源治理和第三方服务质量,不是边缘议题,而是这个品类反复出现的平台风险。MarketsandMarkets 和 AssemblyAI 也说明了为什么现在重要:市场增长很快,采用面正在扩大,QA、治理和合规正从事后补丁变成核心差异点。 因此,剩余暴露主要落在披露和证明质量上。公开来源仍未显示客户集中度、合作伙伴贡献收入占比、经审计的可用性指标,或 Deepgram 特定的生物识别赔偿立场。这些缺口不会抵消公司的优势,但会阻止把剩余暴露干净地下调到低。可投资路径因此是有条件的。如果尽调确认公司已有产品化的同意控制,能覆盖 Illinois 敏感工作流;医疗级文档能匹配拟议 HIPAA 门槛;并且有可信证据表明合作伙伴或架构依赖没有隐藏集中度风险,那么当前风险组合看起来可控。如果这些点仍然私密或模糊,正确的投资反应就不是默认乐观,而是缩小范围、加大折价,或触发停止条件。本章的止损条件正是为这条边界设计的。[CR030, CR031, CR032, CR033, CR034, CR035]

人员 / 执行风险登记表
执行领域依赖或缺口可能性严重性公开缓释剩余风险尽调路径
医疗 GTM向受覆盖实体销售,如今不止需要 BAA 和通用信任文案HIPAA 声明、安全文档、区域选项、自托管中高按垂直行业审查医疗客户参考、审计包和实施资源
平台宽度STT、TTS、语音代理、医疗、伙伴集成和新 IP 动作,都扩大了交付面中高中高Series C 资本和企业定位中高测试组织设计、QA 和支持是否随宽度扩张,而不只是模型发布
应对价格压力的商业能力开源和超大云厂商替代方案可能压低价格,或迫使更多定制支持中高Deepgram 声称具备速度、准确率、部署灵活性和更低 TCO索取按细分市场拆分的赢单 / 输单数据、折扣历史、毛利率数据和续约行为
证据和披露深度公开来源仍缺客户集中度、伙伴组合、经审计正常运行时间指标和赔偿立场已有部分官方部署和安全披露索取头部客户数据、伙伴来源 ARR、SLA 表现,以及法律风险准备金或保险细节

执行风险基于公开证据尚未展示的内容排序,并不表示 Deepgram 已经在这些领域失败。

[CR013, CR032, CR033, CR037, CR038, CR039]
缓释与否决标准表
风险可监控触发项阈值 / 事件行动含义
生物识别 / BIPA 风险类声纹工作流的同意和留存控制仍不清楚尽调中无法给出产品级 Illinois 同意流程、留存计划或赔偿答案暂停承销 Illinois 占比高的部署,或把它们从预测中剔除
HIPAA / 医疗合规执行缺少 Security Rule 准备材料没有现行 BAA 模板、没有商业伙伴审计证据,或没有拟议规则差异的路线图将医疗扩张视为推测,而不是已锁定增长
可靠性和规模公开或尽调观察到的产能姿态变弱反复限流、未达并发承诺,或没有可信 uptime 报告下调增长假设,并要求更强 SLA 和可观测性证据
伙伴依赖单一渠道或提供商变得过于关键AWS、IBM 或电话 / LLM 伙伴路径成为大部分企业赢单的门槛应用集中度折价,并要求替代路径证明
价格和架构竞争开源或超大云厂商替代方案挤压商业杠杆赢单 / 输单数据显示,客户主要因为控制或价格选择自托管或超大云厂商栈下调利润率和留存假设
披露质量核心承销数据到尽调后期仍不公开头部客户组合、伙伴收入占比、正常运行时间指标和法律风险立场仍不可得除非私下尽调补上缺口,否则升级为 no-go

否决标准是绑定上述风险的可监控尽调触发项,并非预测这些阈值已经被突破。

[CR009, CR013, CR020, CR023, CR025, CR028]
FR004: 按风险集群划分的残余敞口

考虑当前公开缓释措施后,Deepgram 主要承保风险集群的相对残余敞口。

评分是分析师根据引用来源综合出的 1-10 残余敞口尺度,并非公司披露指标。

[CR040, CR044]
Chapter 08

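08 Valuation

8.1 Price anchor, and what public evidence can and cannot prove

Deepgram does have one hard valuation data point: on January 13, 2026, the company announced a $130 million Series C at a $1.3 billion valuation, and multiple outlets repeated the same round size and valuation. That matters because this chapter is no longer purely hypothetical; it assesses whether a known current price holds up. The round also included strategic investors such as Twilio, ServiceNow Ventures, SAP, and Citi Ventures, which carries more signal than a syndicate of purely financial investors. Management separately told TechCrunch that Deepgram was cash-flow positive in the prior year and was not raising defensively. For an AI infrastructure company in a compute-intensive category, these are meaningful positives. The problem is that the public record still does not supply the denominators investors need. Deepgram discloses adoption and usage signals but has not publicly disclosed ARR, gross margin, net revenue retention, or cap-table terms. The $1.3 billion mark therefore looks plausible, but the public evidence does not fully explain it. [CV001, CV002, CV003, CV004, CV005, CV006]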

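Recommendation summary table
Dimension | Assessment | Decision implication
Recommendation | Track | Keep following the company, but do not treat the current valuation as obviously attractive without private financial evidence.
Confidence |  | The price is real and the business has traction, but most of the valuation denominators remain undisclosed.
Risk rating |  | ARR, gross margin, and financing-term disclosures are missing; downside is material if the public narrative overstates commercial conversion.
Current valuation anchor | $1.3B valuation at the January 2026 Series C | Use this as the reference price; do not substitute invented fair-value precision.
Valuation stance | Reasonable, but not cheap | The valuation is consistent with a good outcome, but public evidence does not yet show it is clearly cheap.
Upgrade condition | Verified ARR, gross margin, and retention that support the implied multiple | Moving to Buy requires private financial evidence, not just more product marketing or sector enthusiasm.
Likely exit path | A later private round or strategic optionality, ahead of IPO-style readiness | Public comparables disclose far more financial detail than Deepgram does today.

This table is explicitly price-sensitive: it assesses the investability of the current $1.3B valuation, not the overall quality of the company.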

[CV001, CV004, CV025, CV035, CV041, CV045]
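Bull / bear views table
Argument | Bull view | Bear view | What would change the view
Financing quality | A real 2026 round sets a fresh $1.3B anchor, with strategic investors in the syndicate. | A fresh price does not prove the entry price is attractive while public financial disclosure remains thin. | Board-level revenue and gross-margin documents would clarify whether the round price is fair or generous.
Operating quality | Management says the company entered 2025 cash-flow positive. | Cash-flow positivity alone does not reveal ARR scale, margin resilience, or retention quality. | A verified cash-flow bridge and unit-economics model would strengthen underwriting.
Commercial traction | Deepgram has disclosed 1,300+ organizations building on its API, 200,000+ developers, and 400+ enterprise customers. | These metrics show reach but not how much revenue each cohort generates. | Segmented ARR and enterprise ACV data would turn activity into value.
Sector momentum | Independent market reports and private peers show voice AI is still a well-funded growth sector. | Sector growth is shared across many competitors and does not guarantee that Deepgram captures premium economics. | Net retention and partner-channel conversion would show whether Deepgram is winning economically, not only technically.
Competitive posture | Deepgram claims it beats major rivals on latency, cost, and deployment flexibility. | These claims come from Deepgram marketing pages and cannot support the valuation on their own. | Independent benchmarks tied to commercial conversion would make the advantage more investable.
Recommendation | For its stage, the company is credible enough to track closely. | The public record leaves too much uncertainty for a bullish underwriting conclusion. | The recommendation changes only if private financial evidence closes the denominator gap.

The bear view targets disclosure and entry price; it does not question whether Deepgram is a real company with real demand.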

[CV001, CV004, CV005, CV006, CV009, CV010]
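FV001: Recommendation logic

A real financing anchor and strategic evidence support continued attention, but without financial denominators the conclusion stops at Watch.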

[CV001, CV004, CV005, CV006, CV010, CV020]

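8.2 Market tailwinds, peer references, and disclosure gaps

Independent market reports continue to support one conclusion: voice AI infrastructure is being built into a large and growing category. Reports on speech-to-text APIs, speech recognition, and conversational AI all point to double-digit growth and project multi-billion-dollar markets by the end of the decade. Private and public peers also show that investors are willing to fund the category: ElevenLabs reached a $3.3 billion valuation in January 2025; AssemblyAI raised a further $50 million and says it now serves high production volumes; and the public market capitalizations of SoundHound, Five9, NICE, and Twilio show that listed voice or communications-adjacent platforms still command meaningful enterprise value. These comparables are framing tools, not proof. Twilio and NICE are much broader software businesses; Five9 is closer to application-layer contact-center software than to model infrastructure; SoundHound is listed and heavily scrutinized; ElevenLabs skews more toward creators and TTS; and AssemblyAI did not publicly disclose a valuation in the captured sources. The comp set therefore shows that the category can support billion-dollar outcomes, but it cannot by itself prove that Deepgram's current valuation is attractive. [CV010, CV011, CV012, CV013, CV014, CV015]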

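Comparable valuation table
Comparable | Metric | Multiple / valuation / status | Reference value | Limitation
Deepgram (subject) | January 2026 private round | $1.3B valuation; $130M raised | The direct price anchor for this chapter. | The public record still lacks ARR, gross margin, NRR, and preference terms.
SoundHound AI | Public market cap, June 2026 | $3.02B market cap | The closest listed pure-play voice AI framing comparable in the captured sample. | A listed company with acquisitions, quarterly scrutiny, and a different risk profile from a private API platform.
Twilio | Public market cap, June 2026 | $31.33B market cap | Strategic and distribution reference, since Twilio also invested in Deepgram. | Far broader than Deepgram, spanning CPaaS, data, and customer-engagement platforms.
Five9 | Public market cap, June 2026 | $1.59B market cap | An application-layer contact-center software anchor with absolute equity value close to Deepgram's. | It is more workflow software than a foundational voice-model vendor.
NICE | Public market cap, June 2026 | $5.14B market cap | Enterprise CX and analytics benchmark for the value of scaled voice-adjacent software. | A mature, large software portfolio makes it a ceiling reference, not a direct peer.
ElevenLabs | January 2025 private round | $3.3B valuation; $180M Series C | High-growth private audio AI benchmark showing that category investors will back premium-priced voice platforms. | Heavier creator / TTS / consumer mix, and the valuation predates Deepgram's round by a year.
AssemblyAI | Private financing status | $50M Series C; $115M raised in total; valuation undisclosed | Direct voice API peer with meaningful production scale and strong customer signals. | Captured sources disclose no valuation, so it is a strategic peer, not a clean price comparable.

This table covers every comparable used in this chapter; each row states an explicit limitation because no listed or private peer maps one-to-one onto Deepgram.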

[CV001, CV013, CV015, CV020, CV021, CV022]
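FV002: Valuation sensitivity

Whether the same $1.3B valuation is rich or reasonable depends on what ARR-denominator diligence uncovers.

Values are simple valuation / ARR implied multiples calculated from the current $1.3B valuation; they are not ARR disclosed by Deepgram.

[CV026, CV027, CV033]
FV004: Investment KPIs

The evidence pack is enough to keep Deepgram on the radar, but it remains incomplete for high-conviction underwriting at the current price.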

[CV001, CV003, CV006, CV010, CV020, CV025]

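8.3 Scenario ranges and investment judgment

Because Deepgram has not publicly disclosed ARR, the cleanest way to test the $1.3 billion mark is to back into the revenue base that would justify it. On simple arithmetic, the current valuation equals roughly 13x ARR at $100 million of revenue, 8.7x at $150 million, 6.5x at $200 million, 5.2x at $250 million, and 4.3x at $300 million. That gives a clear decision framework. If Deepgram's ARR is materially below roughly $150 million, the current valuation starts to look stretched given pricing pressure from hyperscalers and model vendors. If ARR is closer to $200 million to $250 million, with durable cash-flow positivity and credible gross margins, the valuation is easier to defend. If ARR is above $250 million and partner-driven scaling is working, the bull case can support meaningfully higher value. Today's public evidence cannot say which state the company is in, so the more disciplined conclusion is Watch, not Buy: the current mark falls within a reasonable base-case range ($1.2B-$1.8B), but it is not low enough relative to that range to provide a clear margin of safety. [CV026, CV027, CV033, CV034, CV035, CV038]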
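The implied-multiple arithmetic in this paragraph can be reproduced with a short sketch. The ARR values are the hypothetical test points used in the text, not figures disclosed by Deepgram.

```python
VALUATION_MUSD = 1300  # $1.3B Series C valuation, in $ millions

# Hypothetical ARR test points from the text, in $ millions (not disclosed ARR).
ARR_SCENARIOS_MUSD = [100, 150, 200, 250, 300]


def implied_multiple(valuation_musd: float, arr_musd: float) -> float:
    """Return the valuation / ARR multiple implied by a given ARR."""
    return valuation_musd / arr_musd


multiples = {
    arr: round(implied_multiple(VALUATION_MUSD, arr), 1)
    for arr in ARR_SCENARIOS_MUSD
}

for arr, multiple in multiples.items():
    print(f"ARR ${arr}M -> {multiple}x")
```

Once diligence produces a verified ARR figure, the same function gives the actual entry multiple to compare against these reference points.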

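Bull / base / bear scenario table
Scenario | Probability signal | Valuation range | Core assumption | Primary failure mode
Bear | 30% | $0.9B-$1.2B | ARR is closer to ~$100M-$150M, margin quality is weaker than expected, or compliance friction slows enterprise expansion. | The current $1.3B valuation turns out to embed too much optimism relative to disclosed fundamentals.
Base | 50% | $1.2B-$1.8B | Cash-flow positivity holds, ARR plausibly falls in the ~$150M-$250M range, and strategic partners support distribution. | Public evidence is directionally positive but still insufficient to prove deep undervaluation.
Bull | 20% | $1.8B-$2.6B | ARR is proven at $250M+, gross margin holds, and partner-driven scale makes Deepgram the foundational voice layer. | Without those documents, the bull case remains a conditional upside scenario, not a current underwriting fact.
Decision implication |  | The current valuation sits within the base case | Track the company, and diligence the denominators before paying a premium for upside. | Do not upgrade the case to Buy on sector growth alone.

These are scenario ranges, not a DCF with false precision; they show how the investment judgment shifts as the hidden financial inputs change.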

[CV026, CV027, CV033, CV042, CV043, CV044]
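One way to read the scenario table is as a rough probability-weighted value. The sketch below uses the stated probability signals (30% / 50% / 20%) and, as a simplifying assumption that is not in the report, takes the midpoint of each valuation range as that scenario's point value.

```python
# (probability, range low, range high) in $ billions, from the scenario table.
SCENARIOS = {
    "bear": (0.30, 0.9, 1.2),
    "base": (0.50, 1.2, 1.8),
    "bull": (0.20, 1.8, 2.6),
}


def midpoint(low: float, high: float) -> float:
    """Midpoint of a range; using it as a point value is an assumption."""
    return (low + high) / 2


total_probability = sum(p for p, _, _ in SCENARIOS.values())
expected_value_busd = sum(
    p * midpoint(low, high) for p, low, high in SCENARIOS.values()
)

print(f"Probabilities sum to {total_probability:.2f}")
print(f"Probability-weighted value: ${expected_value_busd:.3f}B")
```

Under the midpoint assumption the weighted value comes out modestly above the $1.3B mark, which is consistent with the Watch stance: some expected upside, but not enough to count as a margin of safety given how uncertain the inputs are.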
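FV003: Valuation / return range

The current valuation falls within the base case, but evidence gaps prevent a stronger recommendation.

The ranges are scenario judgments anchored to public evidence and simple multiple sensitivity, not a full DCF.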

[CV001, CV042, CV043, CV044, CV045]

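8.4 Thesis break points, exit readiness, and diligence priorities

The remaining work is direct and critical. Anyone buying at this round's price needs to verify ARR by segment, gross margin, retention, customer concentration, and the actual Series C preference stack. Without those documents the counter-argument remains too strong: a company can be cash-flow positive, technically credible, and strategically relevant while its entry price still leaves limited upside for new money. The compliance backdrop also deserves to be priced in. Goodwin's 2026 note on AI transcription tools highlights BIPA, wiretapping, retention, and privilege risk; if vendors and customers do not handle consent and storage properly, these risks slow enterprise adoption or raise governance costs. This does not break the Deepgram story, but it does raise the diligence bar. Exit readiness looks more like another private round or retained strategic optionality than a near-term IPO, because public peers disclose far more operating detail than Deepgram does today. Until these data gaps close, the right posture is to track specific thesis-break triggers and maintain the Watch recommendation. [CV031, CV032, CV047, CV048, CV050, CV051]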

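Thesis-break and no-go trigger table
Trigger | Threshold | Transmission to the thesis | Action implication
ARR below threshold | Verified ARR is materially below roughly $150M. | The current valuation starts to imply a rich multiple for a still-private infrastructure company. | Recut to the bear range, or pass on the round.
Gross margin shortfall | Margins are materially below what a scaled API platform should deliver. | Cash-flow positivity becomes less resilient and support for upside multiples weakens. | Lower the fair range and require stronger price protection.
Weak retention | NRR, gross retention, or enterprise renewal data show limited expansion resilience. | The platform story loses quality even if customer counts look healthy. | Lower conviction and treat traction metrics as noisier than they appear publicly.
Investor-unfriendly preference stack | Liquidation preferences, anti-dilution provisions, or governance terms distort the headline valuation. | The nominal $1.3B valuation overstates the real economics for new money. | Pause, reprice, or require a structured entry.
Rising compliance friction | Privacy, biometric, or wiretapping controls materially slow regulated-enterprise adoption. | Category growth does not translate cleanly into Deepgram revenue quality. | Cut the bull-case weighting and reassess channel assumptions.
Stalled partner conversion | Strategic partners deliver no measurable ARR leverage. | Distribution value stays at the narrative level and does not drive earnings. | Hold the view at Watch even if technical benchmarks stay strong.

These are valuation triggers, not generic risks: each one could directly overturn the current entry price.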

[CV031, CV032, CV033, CV047, CV049, CV051]
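The ARR trigger can be expressed as a simple screening rule that maps a verified ARR figure to the scenario bucket used in section 8.3. The cut-offs below follow the approximate ARR bands in the scenario table (~$100M-$150M, ~$150M-$250M, $250M+); treating them as hard boundaries, and the function itself, are illustrative assumptions, not part of the report's method.

```python
VALUATION_MUSD = 1300  # $1.3B Series C valuation, in $ millions


def classify_arr(verified_arr_musd: float) -> str:
    """Map verified ARR ($ millions) to a scenario bucket.

    Boundaries are an assumption: the report gives approximate bands,
    not exact thresholds.
    """
    if verified_arr_musd < 150:
        return "bear"
    if verified_arr_musd <= 250:
        return "base"
    return "bull"


def screen(verified_arr_musd: float) -> dict:
    """Summarize the bucket and the entry multiple for a given ARR."""
    return {
        "scenario": classify_arr(verified_arr_musd),
        "implied_multiple": round(VALUATION_MUSD / verified_arr_musd, 1),
    }


# Hypothetical ARR inputs for illustration only.
for arr in (120, 200, 300):
    print(arr, screen(arr))
```

In practice the other triggers in the table (margin, retention, preference stack, compliance, partner conversion) would each need their own evidence test; ARR is only the first gate.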
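Final diligence requirements table
Topic | Missing evidence | Why it matters | Owner or diligence path
ARR and revenue bridge | Board-approved 2024-2026 ARR, recognized revenue, and segment mix. | This is the denominator that determines whether $1.3B is conservative, fair, or rich. | CFO materials, board materials, and monthly management reports.
Gross margin and inference cost | Gross margin by product, compute burden, hosting mix, and partner economics. | Cash-flow positivity is more resilient if gross margin is structurally strong. | Finance and infrastructure review, requesting cohort- or product-level cost detail.
Retention and expansion | Gross retention, NRR, enterprise expansion, and churn for major cohorts. | High customer counts are worth more only if expansion and renewal are strong. | Revenue-operations dashboards and cohort analysis.
Series C terms | Liquidation preferences, pro-rata rights, governance arrangements, and any side-letter protections. | The headline valuation may materially overstate the effective economics for new investors. | Counsel review of financing documents and the cap table.
Concentration and channel mix | Top customers, top partners, and direct versus channel revenue concentration. | Strategic-partner signals are useful only if they convert into diversified, durable revenue. | Customer-concentration analysis and partner pipeline review.
Compliance controls | Consent flows, retention policies, biometric safeguards, and deployment controls for regulated industries. | Governance friction slows enterprise expansion and weakens valuation support. | Combined legal, privacy, and product diligence, checked against the deployment footprint.

These are the minimum diligence requirements; only if they are met could the recommendation move from Watch toward Buy at the current price.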

[CV025, CV031, CV047, CV048, CV051, CV052]

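8.5 Appendix

Disclaimer

This report is a due-diligence snapshot based on public evidence and does not constitute investment advice. Material financial, legal, technical, and contractual facts remain undisclosed; verify them directly with management and primary documents before making any investment decision.

Evidence index

Findings
ID | Statement | Confidence | Source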
CO001 Deepgram was founded in 2015 by Scott Stephenson, Noah Shutty, and Adam Sypniewski, three physicists who worked on dark matter detection. SO001, SO007
CO002 The founding insight for Deepgram came from the co-founders' work analyzing waveforms from dark matter detectors, which they applied to speech audio processing using end-to-end deep learning. SO001, SO003, SO004
CO003 Deepgram is headquartered in San Francisco, California and operates as a remote-first company distributed across 20+ US states and 5+ countries. SO001, SO003
CO004 Deepgram's business model is API-first, usage-based access to proprietary real-time voice AI models (STT, TTS, voice agents) with cloud, self-hosted, and on-premises deployment options. SO001, SO021, SO014
CO005 Deepgram's product portfolio spans speech-to-text (Nova-3), text-to-speech (Aura-2), conversational speech recognition (Flux), Voice Agent API, and Saga (Voice OS). SO007, SO010
CO006 Deepgram participated in Y Combinator's Winter 2016 batch, which gave it early developer community access and seed capital. SO005, SO009
CO007 Scott Stephenson is CEO and Co-Founder of Deepgram; he holds a PhD in particle physics from the University of Michigan and left postdoctoral research to co-found the company. SO002, SO003, SO007
CO008 Adam Sypniewski is CTO and Co-Founder of Deepgram; he contributed to the deep-learning waveform architecture from the dark matter research lab. SO003, SO007
CO009 Noah Shutty is the third Co-Founder of Deepgram and contributed to the early technical architecture. SO001, SO007
CO010 Elizabeth de Saint-Aignan, General Partner at AVP, joined Deepgram as a board-level representative following the January 2026 Series C. SO007, SO011
CO011 No COO, CFO, or President has been publicly named at Deepgram as of June 2026, creating a key-person concentration risk in CEO Scott Stephenson. SO007, SO009, SO017
CO012 Scott Stephenson is the sole named executive in all major public announcements, press releases, and partnership communications. SO007, SO017, SO018
CO013 Deepgram completed a $72 million Series B in 2022 with investors including Alkeon, Tiger, Wing, Madrona, In-Q-Tel, BlackRock, Stanford University, and Y Combinator; no valuation was publicly disclosed. SO008, SO009, SO005
CO014 Deepgram raised $130 million in Series C funding at a $1.3 billion valuation, announced on January 13, 2026, led by AVP. SO007, SO008, SO009
CO015 Existing investors Alkeon, In-Q-Tel, Madrona, Tiger, Wing, Y Combinator, and BlackRock all rejoined in the Series C round. SO007, SO008
CO016 New investors in the Series C included Alumni Ventures and Princeville Capital plus strategic corporates Twilio, ServiceNow Ventures, SAP, and Citi Ventures. SO007, SO008, SO009
CO017 Academic investors in the Series C included the University of Michigan and Columbia University, joining existing academic investor Stanford University. SO007, SO011
CO018 In-Q-Tel, the US intelligence community's venture arm, has participated in Deepgram's funding rounds and continued in the Series C. SO007, SO009
CO019 Deepgram acquired OfOne, a Y Combinator-backed AI voice platform for restaurants and quick-service drive-throughs, simultaneously with the Series C announcement in January 2026. SO007, SO008, SO009
CO020 Deepgram's total capital raised exceeds $215 million as of the January 2026 Series C close. SO008, SO010
CO021 Deepgram publicly disclosed 200,000+ developers building on its APIs as of January 2025. SO014, SO007
CO022 Deepgram had 400+ enterprise customers as of January 2025, rising to 450+ enterprise customers as of the Nova-3 launch in February 2025. SO014, SO015
CO023 Deepgram has processed over 50,000 years of audio and transcribed over one trillion words as of January 2025. SO014
CO024 Deepgram achieved 3.3× annual usage growth across the four years ending 2024. SO014
CO025 CEO Scott Stephenson confirmed that Deepgram was cashflow positive in 2024, before the Series C fundraise. SO008, SO014
CO026 Deepgram launched the Voice Agent API at general availability in June 2025, priced at $4.50 per hour. SO016, SO007
CO027 Deepgram signed a multi-year Strategic Collaboration Agreement with AWS in August 2025, deepening co-selling and cloud integration including Amazon EKS and Bedrock. SO018, SO007
CO028 Deepgram and IBM announced a collaboration in February 2026, embedding Deepgram's STT and TTS into IBM's watsonx Orchestrate; Deepgram became IBM's first voice partner. SO017, SO007
CO029 Deepgram faces regulatory and litigation risk from the Illinois Biometric Information Privacy Act (BIPA) and other state biometric data laws that may apply to voiceprint generation from transcription tools. SO025
CO030 Deepgram has not publicly disclosed its revenue, ARR, or precise employee headcount as of June 2026. SO007, SO014
CO031 Deepgram's status page (status.deepgram.com) shows an incident history, indicating the platform has experienced service disruptions during its operation. SO024
CO032 Deepgram positions itself as the infrastructure layer for the Voice AI economy, drawing an analogy to Stripe as the infrastructure for the payments economy. SO007, SO011
CO033 Deepgram CEO stated an ambition to pass the Audio Turing Test at scale in 2026, signaling a long-term R&D investment in natural voice quality. SO007
CO034 NASA selected Deepgram over all major speech-to-text providers after the others failed to reach the 80% word recognition rate threshold required for space-to-ground communications transcription. SO023, SO013
CO035 Twilio, as a Series C investor and customer, publicly described Deepgram as powering its voice AI renaissance with seamless, low-latency AI agent experiences. SO007, SO018
CO036 Deepgram's enterprise customer count increased from 400+ in January 2025 to 450+ in February 2025, suggesting rapid customer addition in Q4 2024–Q1 2025. SO014, SO015
CO037 Deepgram's early-round academic investors (Stanford University) and Series C additions (University of Michigan and Columbia University) suggest a talent pipeline and IP collaboration strategy alongside capital. SO007, SO017
CM001 The global speech-to-text API market reached $4.55 billion in 2025 and is projected to grow at 18.2% CAGR to $10.46 billion by 2030, per The Business Research Company. SM001
CM002 The broader global voice and speech recognition market (including consumer devices) was estimated at $26.5 billion in 2026, projected to reach $116.9 billion by 2033 at a 23.6% CAGR, per Coherent Market Insights. SM002
CM003 North America was the largest region in 2025, representing approximately 34–35% of the voice and speech recognition market; APAC is the fastest-growing region. SM001, SM002
CM004 Deepgram's primary market boundary is B2B API access to real-time STT, TTS, and voice agent orchestration; consumer assistants (Siri, Alexa) and legacy telephony platforms (Cisco, Genesys) are outside its addressable market. SM004, SM012
CM005 Status-quo substitutes for Deepgram include manual transcription, in-house ASR models, and legacy on-premises telephony; competitor substitutes include open-source Whisper and hyperscaler STT. SM004, SM005, SM012
CM006 Deepgram CEO Scott Stephenson cited a $50 billion addressable market for voice AI agents in demanding environments requiring exceptional accuracy, lowest COGS, highest model adaptability, and lowest latency. SM013
CM007 The agentic AI wave—AI phone agents replacing human agents in contact centers, sales, and customer service—is the primary demand driver for real-time voice AI APIs. SM012, SM022
CM008 Enterprise contact center migration to cloud-based AI automation is a multi-year structural tailwind for STT and voice agent infrastructure, with market projections citing continued 18–24% CAGR. SM001, SM002
CM009 Deepgram's Voice Agent API at $4.50/hour positions the company in the platform-orchestration tier above the commodity STT layer, enabling higher ACV and stickier enterprise contracts. SM022, SM024
CM010 Deepgram's developer-led PLG motion (200,000+ developers on free tier) provides a structural pipeline into enterprise contracts, analogous to Twilio and Stripe. SM013, SM023
CM011 Multilingual enterprise expansion (45+ languages for Nova-3) is a medium-term driver that opens APAC and EMEA markets to Deepgram's platform. SM013, SM023
CM012 IBM and AWS partnerships, announced in 2026 and 2025 respectively, create distribution channels into regulated enterprise buyers that would not have self-sourced Deepgram. SM025, SM023
CM013 Deepgram's developer and startup buyer tier encompasses 200,000+ developers on pay-as-you-go plans; they are typically technical decision-makers who evaluate via documentation and API sandbox. SM013, SM024
CM014 Deepgram's enterprise buyer tier includes 400–450 organizations (as of early 2025) purchasing annual contracts; buyers are VPs of Engineering, CTOs, or IT procurement at mid-market to Fortune 500 companies. SM013
CM015 The ISV/platform tier—companies like Vapi, Kore.ai, Granola, Aircall, and OpenPhone—embeds Deepgram as an infrastructure component and drives disproportionate API call volume. SM022, SM020
CM016 In-Q-Tel's continued participation as an investor signals government and intelligence community interest in Deepgram's on-premises STT for classified or sensitive deployments. SM023
CM017 Deepgram's restaurant/QSR vertical, opened via the OfOne acquisition, targets operations buyers at national quick-service restaurant chains with AI drive-thru voice agents achieving >95% containment. SM023, SM012
CM018 AWS Transcribe, Google Cloud Speech-to-Text, and Azure Speech are bundled with their respective cloud ecosystems at prices that structurally constrain Deepgram's ability to capture cloud-native customers. SM009, SM010, SM011
CM019 Open-source Whisper (OpenAI) and NVIDIA Canary Qwen 2.5B provide batch STT at zero API cost with competitive accuracy (5.26–5.63% WER), displacing Deepgram in non-latency-critical developer workloads. SM004, SM006
CM020 ElevenLabs Scribe v2 Realtime leads multilingual real-time STT benchmarks at ~150ms across 30 languages (May 2026), presenting a structural risk to Deepgram's international expansion. SM004
CM021 Data sovereignty regulations (GDPR in Europe, BIPA in Illinois) and privacy enforcement trends in 2026 create compliance costs and potential market access restrictions for Deepgram's international growth. SM014, SM015
CM022 Deepgram's Nova-3 model achieved 5.26% WER (word error rate) on a real-world test set across 9 audio domains (batch), the lowest WER of any hosted STT API per FutureAGI benchmark guide (May 2026). SM004
CM023 AWS Transcribe is priced at $0.024/min, roughly 5× more expensive than Deepgram's Nova-3 ($0.0048/min streaming), suggesting Deepgram competes on price efficiency rather than being undercut by hyperscalers in this specific comparison. SM004, SM010
CM024 Deepgram is classified as the best STT API for voice agents (lowest end-to-speech latency) in FutureAGI's May 2026 independent benchmark guide, ahead of Google, AWS, Azure, and AssemblyAI. SM004
CM025 Market share distribution among STT API providers is not publicly disclosed in any primary source; Deepgram's $215M raised and 200,000+ developer footprint is the best public proxy for relative market position. SM004, SM005
CM026 The contact center cloud migration market is described by Deepgram's own materials and NetworkWorld as a key driver, with the global financial impact of poor customer experience estimated at $3.7 trillion annually (Qualtrics XM Institute). SM012
CM027 Deepgram's Flux model, launched for voice agents, delivers sub-300ms streaming latency with the fastest end-of-speech detection among hosted APIs per FutureAGI benchmarks (May 2026). SM004
CM028 The speech recognition sub-segment leads the broader voice and speech recognition market with an estimated 62.3% share in 2026. SM002
CM029 Rev.ai, as a direct STT competitor, publishes public pricing and competes with Deepgram in the developer and SMB tiers. SM019
CM030 Haptik and other industry sources note data privacy risks in voice AI, including potential regulatory exposure for companies that process audio streams containing biometric voice characteristics. SM021
CM031 The Twilio integration with Deepgram for virtual agents was presented as a developer reference implementation, validating the PLG-to-enterprise motion for the ISV/platform buyer segment. SM020
CM032 AssemblyAI Universal-2 with Slam-1 is rated as the best STT API for transcript intelligence (sentiment, topics, entity, content moderation) in FutureAGI benchmarks, representing a specialized niche outside Deepgram's core strength. SM004, SM007
CM033 Speechmatics Enhanced is recommended for on-premises enterprise deployments across 55+ languages in regulated industries, competing directly with Deepgram's on-prem offering. SM004, SM008
CM034 Deepgram's product strategy, per CEO Stephenson, targets the $50B market for voice AI in demanding environments—a premium niche within the broader STT market defined by accuracy, cost, adaptability, and latency requirements. SM013
CM035 Deepgram positions itself against the hyperscaler STT products by emphasizing its purpose-built, developer-first architecture and the ability to customize models to domain-specific terminology and acoustic environments. SM023, SM004
CM036 Deepgram's Growth plan starts at $4,000/year with up to 225 concurrent WSS STT connections, implying enterprise ACV of at least $4K and likely $50K–$500K+ for larger deployments. SM024
CM037 The restaurant/QSR vertical, while smaller in current revenue than contact centers, offers a highly scalable unit economics model (per-drive-thru lane pricing) that could scale to thousands of fast-food locations nationally. SM023, SM012
CM038 Deepgram's FutureAGI benchmark ranking as the top STT for voice agents (May 2026) provides third-party validation supporting but not proving the "number-one STT API" self-description; no independent market share data exists. SM004, SM005
CP001 Deepgram's competitive landscape includes four tiers: hyperscalers (AWS, Google, Azure), pure-play API vendors (AssemblyAI, Speechmatics, ElevenLabs, Rev.ai), full-stack LLM platforms (OpenAI GPT-Realtime), and open-source models (Whisper, NVIDIA Canary). SP001, SP012
CP002 Hyperscalers (AWS, Google, Azure) compete primarily on distribution and cloud bundling rather than technical leadership in real-time accuracy or latency. SP001, SP006, SP007
CP003 Open-source Whisper (OpenAI) is a free self-hosted STT model competing with Deepgram for batch, non-latency-critical developer workloads; it achieves competitive accuracy but cannot match Deepgram's real-time latency as a hosted API. SP001, SP004
CP004 OpenAI's GPT-Realtime API ($32/1M audio tokens input) poses a platform consolidation risk for voice agent builders who prefer a single provider for LLM and voice, potentially displacing Deepgram's Voice Agent API tier. SP004, SP022
CP005 Deepgram Nova-3 achieved the lowest WER (5.26%) among hosted STT APIs on FutureAGI's independent benchmark across 9 audio domains (May 2026), ahead of AssemblyAI Universal-3 (~5.5%) and OpenAI GPT-4o (~8.9%). SP001
CP006 Deepgram Flux + Nova-3 was rated the top STT API for voice agents (lowest end-of-speech latency, sub-300ms streaming) in FutureAGI's May 2026 benchmark guide. SP001
CP007 AWS Transcribe is priced at $0.024/min standard (5× Deepgram Nova-3's $0.0048/min) with HIPAA eligibility and native AWS IAM/S3/Lambda integration, making it the default for AWS-committed enterprises. SP006, SP001
CP008 Google Cloud Speech-to-Text (Chirp 3) supports 125+ languages with medical and phone call variants at $16/1K minutes, with Gemini multimodal integration as its strategic direction. SP005, SP026
CP009 Azure Speech supports 100+ languages with Custom Speech fine-tuning at $1/hour standard, and is strategically bundled with Microsoft Copilot and Microsoft 365 enterprise deployments. SP007, SP026
CP010 AssemblyAI Universal-2 at $0.15/hr and Universal-3 Pro at $0.21/hr leads in transcript intelligence (sentiment, topics, entity extraction, content moderation via LeMUR/Slam-1) and supports 99 languages. SP002, SP009
CP011 Speechmatics starts at $0.24/hr with 56+ languages, an on-premises deployment option, and custom model support; it leads in privacy-first regulated enterprise deployments. SP003, SP010
CP012 ElevenLabs Scribe v2 Realtime achieves ~150ms latency across 30 languages with 93.5% FLEURS accuracy, leading Deepgram in the multilingual real-time STT segment as of May 2026 benchmarks. SP001, SP008
CP013 Deepgram holds at least two US patents on its ASR architecture (US 12,380,880 on end-to-end ASR with transformers; US 12,334,075 on hardware-efficient ASR), providing a foundation for IP-based moat defense. SP011
CP014 Deepgram's 3-factor automated model adaptation for domain-specific fine-tuning has no published peer match from hyperscalers or pure-play competitors as of June 2026, representing a technical moat. SP012, SP013
CP015 NASA evaluated Deepgram head-to-head against all major STT providers and selected Deepgram after competitors failed to reach the 80% word recognition rate threshold for space-to-ground audio; Deepgram achieved 89.6% accuracy after fine-tuning. SP016, SP020
CP016 Deepgram became IBM's first voice partner (February 2026) with exclusive embedding in watsonx Orchestrate, creating a distribution channel inaccessible to AssemblyAI, Speechmatics, or ElevenLabs. SP017, SP012
CP017 Deepgram's multi-year Strategic Collaboration Agreement with AWS (August 2025) provides co-selling and AWS Marketplace access that Speechmatics and AssemblyAI do not publicly match. SP018, SP012
CP018 Deepgram's on-premises and self-hosted deployment option gives it a competitive advantage over AssemblyAI (no on-prem) and hyperscalers for regulated enterprise buyers in government, healthcare, and financial services. SP012, SP025
CP019 Rev.ai is a small, developer-focused STT competitor with limited voice agent capability; its competitive relevance to Deepgram is primarily in the media transcription niche. SP015
CP020 Deepgram's Voice Agent API ($4.50/hr) competes against OpenAI GPT-Realtime ($32/1M audio tokens), providing a roughly 5–10× price advantage for voice-only agent workloads. SP004, SP024
CP021 ElevenLabs is primarily a TTS leader ($180M Series C, announced January 2025) expanding into STT via Scribe; its TTS likely exceeds Deepgram's Aura-2 in voice naturalness for premium use cases. SP022, SP001
CP022 Deepgram's OfOne acquisition is the only known restaurant/QSR-specific voice AI vertical play among STT API competitors as of June 2026; no major competitor has announced a comparable vertical offering. SP012
CP023 Deepgram's audio intelligence capabilities (sentiment, topics) are limited compared to AssemblyAI's comprehensive LeMUR/Slam-1 suite, representing a feature gap in the transcript intelligence segment. SP002, SP009
CP024 Speechmatics has published explicit GDPR compliance guidance and privacy-first marketing, positioning it more strongly than Deepgram for European regulated enterprise customers concerned about data sovereignty. SP010, SP019
CP025 The BIPA biometric litigation risk affects Deepgram and all voice AI API providers that generate voiceprints, creating a sector-wide regulatory risk rather than a Deepgram-specific competitive disadvantage. SP019
CP026 Likely future competitive entrants include Anthropic (multimodal voice), Meta (open-source audio models), and Mistral (EU-based, GDPR-native), which could further fragment the developer STT market. SP022
CP027 OpenAI's GPT-Realtime-Translate ($0.034/min) and GPT-Realtime-2 ($32/1M audio tokens) signal OpenAI's intent to commoditize voice processing as part of the GPT platform, posing a long-term consolidation threat. SP004
CP028 Deepgram's competitive advantage in voice agent workloads (sub-300ms latency, unified orchestration) is the key differentiator that hyperscaler STT products do not yet replicate end-to-end as of June 2026. SP001, SP012, SP025
CP029 Deepgram's pricing at $0.0048/min for Nova-3 streaming is more expensive than AssemblyAI Universal-2 ($0.0025/min equivalent at $0.15/hr) but cheaper than hyperscalers (AWS at $0.024/min) for the same streaming use case. SP001, SP002, SP021
CP030 No public data on Deepgram's win rate or competitive conversion rate in head-to-head evaluations against hyperscalers is available; the NASA case study is the strongest public evidence of a competitive win. SP016
CP031 Deepgram lacks publicly disclosed SOC 2 Type II, ISO 27001, or FedRAMP certifications on its public website as of June 2026, a potential gap relative to hyperscaler competitors for regulated federal buyers. SP012, SP006
CP032 AssemblyAI's multilingual reach (99 languages in Universal-2) and audio intelligence depth (LeMUR, Slam-1) make it the strongest pure-play competitor, with strengths that sit outside Deepgram's real-time latency moat. SP002, SP001
CP033 Deepgram's Aura-2 TTS is positioned as professional and cost-effective, while ElevenLabs' TTS suite is positioned as the naturalness leader for premium voice synthesis use cases. SP012, SP022
CP034 Twilio's blog post demonstrated Deepgram as an integration partner for building virtual agents alongside OpenAI and ElevenLabs, validating Deepgram's ecosystem position as infrastructure rather than an application competitor. SP022
CP035 A Madrona podcast discussion with Stephenson confirms Deepgram's deliberate strategy of outmaneuvering hyperscalers through accuracy, fine-tuning speed, and on-premises deployment rather than competing on price alone. SP025
CP036 Enterprise customers who fine-tune Deepgram domain models accumulate proprietary training data and adapted model weights, creating meaningful switching costs and data-dependency lock-in that standardized hyperscaler STT products do not generate. SP026, SP027
CP037 Open-source Whisper (OpenAI) and NVIDIA Canary Qwen 2.5B pose commoditization risk for Deepgram's batch English STT moat but cannot replicate sub-300ms streaming, domain fine-tuning, or enterprise deployment flexibility as hosted API services, limiting displacement risk to latency-insensitive batch workloads. SP032, SP001
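The price comparisons in CP007 through CP011 and CP029 mix per-minute, per-hour, and per-1K-minute quotes. As a minimal sketch, assuming only the list prices quoted in those claims (the vendor labels and the helper name ratio_to_deepgram are illustrative, not taken from any vendor's materials), the quotes can be normalized to USD per minute and expressed as multiples of the Nova-3 rate:

```python
# List prices as quoted in the claims above, normalized to USD per minute.
# These are the ledger's figures, not independently verified prices.
PRICES_PER_MIN = {
    "Deepgram Nova-3": 0.0048,            # CP029: $0.0048/min
    "AWS Transcribe": 0.024,              # CP007: $0.024/min
    "Google Cloud STT": 16 / 1000,        # CP008: $16 per 1K minutes
    "Azure Speech": 1.00 / 60,            # CP009: $1/hour
    "AssemblyAI Universal-2": 0.15 / 60,  # CP010: $0.15/hour
    "Speechmatics": 0.24 / 60,            # CP011: starts at $0.24/hour
}


def ratio_to_deepgram(vendor: str) -> float:
    """Return the vendor's per-minute price as a multiple of Nova-3's."""
    return PRICES_PER_MIN[vendor] / PRICES_PER_MIN["Deepgram Nova-3"]


for vendor, price in PRICES_PER_MIN.items():
    print(f"{vendor:<24} ${price:.4f}/min  {ratio_to_deepgram(vendor):.2f}x Nova-3")
```

On these inputs AWS comes out at 5× the Nova-3 rate, Google and Azure at roughly 3.3× to 3.5×, and AssemblyAI Universal-2 at about half. List prices ignore volume discounts and committed-use terms, so the ratios are a starting point for diligence rather than an effective-price comparison.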
CI001 Deepgram's Nova-3 STT streaming price is $0.0048/min and Flux is $0.0077/min on the Pay-As-You-Go tier, with a $200 free credit at signup and no minimum commitments. SI001, SI018
CI002 Deepgram's Voice Agent API is priced at $4.50 per hour, combining STT, TTS, and LLM orchestration, and launched at general availability in June 2025. SI002, SI001
CI003 Deepgram's Aura-2 TTS is priced at $0.015 per 1,000 characters, roughly on par with OpenAI TTS-1 (about $0.015/1K chars) and approximately 5.3× cheaper per character than ElevenLabs at $0.08/1K chars (Creator plan). SI001, SI023
CI004 Deepgram offers a Growth plan at $4,000+/year providing approximately 20% savings over PAYG rates, with higher concurrency limits (225 concurrent WSS connections vs. 150 on PAYG). SI001, SI019
CI005 Deepgram's enterprise tier includes custom pricing, dedicated support, on-premises deployment options, and SLA commitments; terms are not publicly disclosed. SI001, SI010
CI006 Deepgram's OfOne QSR acquisition (January 2026) adds a vertical SaaS revenue layer targeting restaurant drive-thru voice ordering, likely with a per-location or revenue-share model distinct from API PAYG pricing. SI004, SI005
CI007 The AWS Strategic Collaboration Agreement (August 2025) and IBM watsonx Orchestrate partnership (February 2026) create partner distribution channels with likely embedded pricing distinct from direct public API rates. SI013, SI014
CI008 Deepgram reported being cash-flow positive at end of 2024, entering the Series C from a position of operational self-sufficiency — rare for an AI infrastructure company at the growth stage. SI008, SI009
CI009 As of January 2025, Deepgram had 200,000+ active developers and 400+ enterprise customers on its platform. SI008, SI003
CI010 Deepgram's platform usage grew 3.3× over the prior four years as of January 2025, approximately equivalent to a 35% CAGR if the 3.3× is read as a cumulative multiple. SI008, SI009
CI011 Deepgram's cumulative scale metrics as of early 2025 include over 50,000 years of audio processed and more than 1 trillion words transcribed, representing material evidence of enterprise-scale usage. SI008, SI009
CI012 Deepgram has not publicly disclosed ARR, quarterly revenue, gross margin, or net revenue retention. No public financial filing exists as it is a private company. SI004, SI005
CI013 Based on 400+ enterprise customers at a conservative estimated ACV of $200K, Deepgram's enterprise ARR floor estimate is approximately $80M; developer PAYG revenue adds an estimated $10–30M, suggesting total ARR of approximately $90–$110M on these inputs. The wider $90–$200M range requires a higher average ACV or a larger PAYG contribution than assumed here. This is an analyst estimate, not a disclosed figure. SI008, SI011
CI014 Twilio's strategic investment in Deepgram's Series C suggests a commercial partnership beyond technology integration, potentially including preferential pricing or API co-distribution arrangements. SI005, SI025
CI015 Deepgram raised $130M in Series C financing in January 2026 at a $1.3B post-money valuation, led by AVP; total cumulative funding is $215M+ across all rounds. SI004, SI005
CI016 Series C uses of funds include: (1) OfOne QSR acquisition integration, (2) new Voice AI Collaboration Hub in San Francisco, (3) expanded patent portfolio, and (4) "Powered by Deepgram" partner program launch. SI004, SI009
CI017 Post-Series C, with $130M entering a cash-flow positive company, Deepgram's effective runway is estimated at 4–8 years at current scale, though growth investments will increase near-term operating expenses. SI004, SI008
CI018 Deepgram's estimated gross margin is 55–70% based on AI API infrastructure benchmarks, though compute costs for real-time inference at scale may compress margins below SaaS norms; no public disclosure exists. SI011, SI012
CI019 No public debt, project finance, or material financial obligations are disclosed for Deepgram as of June 2026. SI004, SI005
CI020 Deepgram's financial verdict based on public data: revenue quality is high (recurring, usage-based, enterprise-anchored), growth momentum is strong (3.3× usage growth over four years), and capital adequacy appears sufficient post-Series C, but full underwriting requires private financials. SI004, SI008, SI010
CI021 Deepgram's $0.0048/min Nova-3 STT PAYG rate is 5× cheaper than AWS Transcribe ($0.024/min) and roughly 2× more expensive than AssemblyAI Universal-2 (~$0.0025/min equivalent). SI011, SI012
CI022 Google Cloud STT is priced at $0.016/min standard, Azure Speech at $0.0167/min standard, making Deepgram Nova-3 ($0.0048/min) 3–4× cheaper than both hyperscaler STT products at the streaming PAYG tier. SI011, SI019
CI023 ElevenLabs' STT (Scribe) is priced at $0.37/hr at Creator tier ($0.0062/min equivalent), competing with Deepgram's Nova-3 at $0.0048/min; Deepgram maintains a modest price advantage at the PAYG developer tier. SI023, SI001
CI024 The 200,000+ developer funnel converting to 400+ enterprise customers implies an enterprise conversion rate of approximately 0.2%, typical for developer-led SaaS, where the top 1–5% of users generate 80%+ of revenue. This funnel is a structural asset, but per-account ARPU is unknown. SI008, SI010
CI025 Deepgram's Series C investors include strategic investors Twilio and SAP, alongside institutional investors AVP, Alkeon, In-Q-Tel, Madrona, Tiger Global, Wing VC, and Y Combinator. SI005, SI007
CI026 In-Q-Tel (the CIA's venture arm) is a Deepgram investor, which — combined with the NASA use case — positions Deepgram for U.S. government and intelligence community procurement channels. SI005, SI006
CI027 ARR and revenue figures are not publicly available for Deepgram; obtaining them is a prerequisite for underwriting the $1.3B valuation or validating the capital adequacy of the $130M raise. SI004, SI005
CI028 Net revenue retention (NRR) and enterprise churn rate are not publicly disclosed; without them, the "400+ enterprise customers" metric cannot be confirmed as net additions versus gross. SI004, SI008
CI029 The OfOne acquisition price and its standalone revenue/EBITDA contribution are not publicly disclosed, creating a gap in assessing whether the acquisition adds revenue or primarily adds capability and burn. SI004, SI005
CI030 Deepgram's gross margin is unknown; given real-time AI inference is compute-intensive, margin expansion requires either proprietary hardware efficiency (plausible given their end-to-end architecture) or volume-based cloud compute discounts — both are unverifiable without financial disclosure. SI018, SI022
CI031 On-premises and self-hosted deployment models reduce Deepgram's own GPU serving costs for those customers while retaining licensing revenue, representing a higher-margin revenue segment relative to cloud API delivery. SI001, SI010
CI032 Deepgram's GTM motion is dual-track: product-led growth (PLG) via developer free tier and PAYG, and direct enterprise sales through account executives, co-sell with AWS and IBM, and the "Powered by Deepgram" partner certification program. SI004, SI013, SI014
CI033 Developer PAYG revenue is likely heavily concentrated: the top 5–10% of developer accounts probably generate 80%+ of developer-tier revenue, consistent with API platform usage distributions. SI011, SI008
CI034 Deepgram's capital intensity for voice AI is lower than that of hyperscalers (AWS, Google) because its purpose-built deep learning architecture requires less compute per inference than general-purpose transformer models repurposed for STT. SI018, SI020
CI035 Deepgram's Twilio strategic investment, combined with the blog case study of Twilio developers building voice agents with Deepgram, suggests a revenue partnership that could scale developer acquisition at lower CAC through Twilio's 300,000+ developer customer base. SI025, SI005
CI036 Deepgram's speaker diarization feature (identifying multiple speakers in audio) is a premium enterprise capability that commands higher ARPU for legal, medical, and contact center use cases, supporting the enterprise revenue mix argument. SI021, SI003
CI037 Based on public data, Deepgram's revenue quality is assessed as high: recurring (subscription-anchored enterprise tier), usage-based (aligned with customer value delivery), and growing (3.3× usage growth over four years, roughly a 35% CAGR). Key uncertainties are margin, churn, and NRR. SI008, SI004, SI010
CI038 Deepgram holds US patent 12,380,880 ("End-to-end Automatic Speech Recognition with Transformer") and US 12,334,075 ("Hardware Efficient Automatic Speech Recognition"), both of which are capital assets that support the IP moat and may have licensing or defensive litigation value. SI026, SI006
CI039 Goodwin Law's April 2026 analysis of AI transcription tools under regulatory scrutiny highlights BIPA biometric data litigation as a financial risk for voice AI API providers, including Deepgram; regulatory compliance costs and potential litigation exposure represent off-balance-sheet financial liabilities. SI027, SI026
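The arithmetic behind CI010, CI013, and CI024 can be reproduced directly. The following is a minimal sketch using only the figures quoted in those claims; the $200K ACV and the $10–30M PAYG range are the ledger's analyst assumptions, not disclosed numbers, and the variable names are illustrative.

```python
def cagr(multiple: float, years: float) -> float:
    """Compound annual growth rate implied by a cumulative growth multiple."""
    return multiple ** (1 / years) - 1


# CI010: 3.3x usage growth, read as cumulative over four years.
usage_cagr = cagr(3.3, 4)

# CI013: enterprise ARR floor from customer count times an assumed ACV.
ENTERPRISE_CUSTOMERS = 400
ASSUMED_ACV = 200_000  # analyst assumption, not disclosed
enterprise_arr = ENTERPRISE_CUSTOMERS * ASSUMED_ACV

# CI013: developer PAYG revenue, low and high analyst estimates.
PAYG_RANGE = (10_000_000, 30_000_000)
total_arr_range = tuple(enterprise_arr + payg for payg in PAYG_RANGE)

# CI024: developer-to-enterprise conversion rate.
DEVELOPERS = 200_000
conversion_rate = ENTERPRISE_CUSTOMERS / DEVELOPERS

print(f"Usage CAGR:         {usage_cagr:.1%}")
print(f"Enterprise ARR:     ${enterprise_arr / 1e6:.0f}M")
print(f"Total ARR range:    ${total_arr_range[0] / 1e6:.0f}M to ${total_arr_range[1] / 1e6:.0f}M")
print(f"Funnel conversion:  {conversion_rate:.2%}")
```

The estimate is most sensitive to the ACV assumption: every $50K change in average ACV moves the enterprise figure by $20M, which is why verified ARR remains the gating diligence item.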
CE001 Deepgram's product suite consists of four building blocks: Nova-3 (batch/streaming STT), Flux (real-time agent STT), Aura-2 (neural TTS), and the Voice Agent API (unified STT+TTS+LLM orchestration), accessible via REST and WebSocket APIs with SDKs in 6+ languages. SE002, SE010
CE002 Deepgram supports three primary customer workflows: (1) real-time conversational voice agents via Voice Agent API, (2) batch transcription and analytics via Nova-3 REST API, and (3) on-premises regulated-enterprise deployment with full API parity. SE002, SE004
CE003 Deepgram's validated use cases include NASA space-to-ground audio (89.6% accuracy post-fine-tuning), Jack in the Box QSR drive-thru ordering, IBM enterprise AI workflows, and contact center transcription for unnamed enterprise customers. SE025, SE022
CE004 The Voice Agent API ($4.50/hr) enables developers to build voice agents without stitching together separate STT, LLM, and TTS services, with all three integrated in a single WebSocket API session. SE004, SE005
CE005 Deepgram Nova-3 achieved the lowest word error rate (5.26%) among hosted STT APIs in FutureAGI's independent May 2026 benchmark across 9 audio domains; it supports 45+ languages with domain-specific model variants for medical, finance, legal, and automotive verticals. SE003, SE001
CE006 Deepgram Flux is purpose-built for conversational speech recognition with end-of-speech (EOS) detection optimized for voice agent contexts, delivering sub-300ms latency from speech end to transcript delivery. SE004, SE003
CE007 Deepgram's core ASR architecture is end-to-end (E2E) deep learning — a single neural network mapping raw audio to text — contrasting with traditional pipeline-based ASR (separate acoustic, language, and decoder modules), enabling higher accuracy and hardware-efficient inference. SE007, SE008
CE008 The Voice Agent API uses a WebSocket-based architecture where STT, LLM, and TTS are orchestrated in a single persistent connection, eliminating the latency compounding of multi-hop architectures. SE005, SE006
CE009 Deepgram's API surface includes REST (batch), WebSocket (streaming and Voice Agent), SDKs for Python, JavaScript/TypeScript, Go, .NET, Ruby, and PHP, a CLI tool, and an MCP Server for AI coding tools. SE010, SE023
CE010 US Patent 12,380,880 (assigned to Deepgram) covers end-to-end ASR using a transformer architecture that jointly models acoustic and language features without decomposition into separate pipeline components. SE007, SE009
CE011 US Patent 12,334,075 (assigned to Deepgram) covers hardware-efficient ASR using latent-space compression techniques that reduce compute requirements per inference minute relative to full-parameter transformer models. SE008, SE009
CE012 Deepgram's critical infrastructure dependencies include GPU compute (AWS, GCP, or Azure clusters), proprietary training data corpora, and the AWS SCA and IBM watsonx distribution partnerships. SE009, SE024
CE013 Deepgram offers three deployment modes: cloud API (managed SaaS), self-hosted (Docker/Kubernetes in customer cloud), and on-premises (air-gap capable data center), with full API parity across all three. SE002, SE010
CE014 Deepgram's blog announced Flux Multilingual in June 2026, a conversational speech model for global voice agents supporting multiple languages in a single real-time model, addressing the multilingual competitive gap versus ElevenLabs Scribe v2. SE016, SE015
CE015 HIPAA Business Associate Agreements are available for all Deepgram paid plans, enabling use in healthcare, clinical documentation, and medical transcription workflows. SE013, SE002
CE016 As of June 2026, Deepgram's public-facing website and documentation do not list SOC 2 Type II, ISO 27001, or FedRAMP certifications, a gap relative to hyperscaler competitors that routinely list all three in their trust centers. SE010, SE014
CE017 Deepgram supports zero-retention mode where audio is not stored post-transcription, and on-premises deployment enables data sovereignty for regulated enterprise buyers, but formal GDPR certification posture is less prominently documented than competitors like Speechmatics. SE013, SE014
CE018 Deepgram's 3-factor automated domain adaptation allows enterprise customers to fine-tune STT models for proprietary vocabulary without manual machine learning engineering; the system accepts customer audio corpora and generates domain-adapted model weights. SE001, SE011
CE019 Deepgram supports speaker diarization (identifying and labeling multiple speakers in audio) via a feature flag on the Nova-3 API, enabling use cases in contact center QA, legal depositions, medical documentation, and board meeting transcription. SE017, SE019
CE020 Deepgram's Smart Format feature applies intelligent post-processing to transcripts: formatting numbers, dates, currency, and punctuation for readability, available on all Nova-3 and Flux models. SE018, SE006
CE021 Deepgram's status page (status.deepgram.com) records two operational incidents in 2024, both resolved in under 4 hours; the API's availability track record is >99% over the disclosed period. SE021
CE022 The NASA case study documents Deepgram achieving 89.6% word recognition accuracy on space-to-ground audio after fine-tuning, after all competitors failed the 80% threshold in the competitive evaluation. SE025, SE022
CE023 Deepgram's Aura-2 TTS is positioned as a professional-quality, low-latency TTS for voice agent responses; technical comparisons against ElevenLabs TTS are not publicly available, but ElevenLabs is generally perceived as the natural-voice quality leader. SE002, SE003
CE024 Saga OS is referenced in Deepgram's Series C announcement as a voice agent operating system layer, but its technical specifications, API surface, and GA timeline are not publicly disclosed as of June 2026. SE009
CE025 Deepgram's developer platform includes an MCP Server (Model Context Protocol) that gives AI coding tools built-in knowledge of Deepgram's APIs — a 2025-2026 trend in developer tooling that lowers integration friction for AI-first developers. SE010, SE026
CE026 The Powered by Deepgram ISV partner program was announced as part of the Series C, enabling third-party developers and companies to build certified voice AI products on Deepgram's platform, creating an ecosystem revenue stream and distribution amplifier. SE009, SE024
CE027 Deepgram's STT streaming feature matrix (available in developer docs) shows Nova-3 supporting diarization, smart formatting, language detection, topics, entity detection, and summarization; Flux streaming supports a subset focused on real-time agent contexts. SE006, SE015
CE028 IBM's integration embeds Deepgram as the exclusive first voice AI partner in watsonx Orchestrate for enterprise workflows, validating Deepgram's architecture compatibility with enterprise-grade AI orchestration platforms. SE024, SE009
CE029 Deepgram's on-premises deployment mode provides full API parity with the cloud offering, enabling regulated enterprises (defense, healthcare, financial services) to migrate from cloud pilots to air-gapped production deployments without SDK changes. SE013, SE010
CE030 Deepgram supports 45+ languages in Nova-3 including domain-specific variants (medical, finance, legal), while Flux Multilingual (announced June 2026) extends conversational real-time STT to multiple languages for global voice agent deployments. SE015, SE016
CE031 The Deepgram CLI (28 API commands per the developer portal) and MCP Server represent developer experience investments that reduce time-to-first-API-call and increase platform stickiness for the 200,000+ active developer base. SE010
CE032 Deepgram's pre-recorded (batch) API supports a broader feature set than streaming, including summarization, chapter detection, and intent recognition — capabilities that compete with AssemblyAI's LeMUR transcript intelligence suite for post-processing use cases. SE023, SE006
CE033 Deepgram's training data includes extensive real-world audio corpora across verticals; fine-tuning on customer-specific data creates model weights unique to each enterprise customer, generating data-dependency lock-in that is a structural moat component. SE001, SE018
CE034 The BIPA and biometric-data regulatory risk cited by Goodwin Law applies to Deepgram's voiceprint and speaker diarization features; compliance management requires explicit data handling documentation and consent frameworks, which Deepgram provides via its privacy policy but not yet via a public trust center. SE014, SE013
CE035 Deepgram's hardware-efficient inference (Patent US 12,334,075) enables its on-premises deployment to run on commodity server hardware rather than requiring expensive specialized GPU infrastructure, which is a prerequisite for regulated enterprise adoption where cloud GPU provisioning is impractical. SE008, SE013
CE036 Deepgram's STT models support language detection as a streaming feature, automatically identifying the spoken language in real-time, a critical capability for multilingual contact centers and global enterprise deployments. SE015, SE006
CE037 Deepgram's Voice Agent API includes configurable LLM integration, supporting GPT-4, Claude, Llama, and other models — positioning Deepgram as infrastructure-agnostic at the LLM layer while locking in the STT/TTS envelope where its technical differentiation is strongest. SE005, SE004
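CE019 and CE020 describe diarization and Smart Format as per-request feature flags. As a hedged illustration of what that looks like in practice, the sketch below builds a request URL for pre-recorded transcription without sending it. The endpoint path and the query-parameter names (model, diarize, smart_format) are recalled from Deepgram's public documentation and are assumptions to verify against the current API reference; build_listen_url is a hypothetical helper, not part of any Deepgram SDK.

```python
from urllib.parse import urlencode

# Assumption: endpoint path as recalled from Deepgram's public docs.
BASE_URL = "https://api.deepgram.com/v1/listen"


def build_listen_url(model: str = "nova-3",
                     diarize: bool = True,
                     smart_format: bool = True) -> str:
    """Build a transcription request URL with feature flags as query parameters.

    Hypothetical helper for illustration; parameter names are assumptions.
    """
    params = {
        "model": model,
        "diarize": str(diarize).lower(),            # speaker labels (CE019)
        "smart_format": str(smart_format).lower(),  # number/date formatting (CE020)
    }
    return f"{BASE_URL}?{urlencode(params)}"


url = build_listen_url()
print(url)
```

An actual request would also carry an API key header and an audio payload, both omitted here. For diligence purposes, the point is that features are toggled per request rather than provisioned per account, which is relevant when confirming the feature-level ARPU argument in CI036.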
CU001 As of Deepgram’s January 2025 operating update, the company said it had 400+ enterprise customers. SU001, SU002
CU002 In its 2025-2026 public materials, Deepgram said 200,000+ developers build with its platform. SU002, SU014
CU003 Deepgram said annual usage had grown 3.3x across the prior four years. SU001
CU004 Deepgram said it had processed more than 50,000 years of audio. SU001, SU002
CU005 Deepgram said it had transcribed more than one trillion words. SU001, SU002
CU006 Public materials frame Deepgram’s customer mix as enterprises, technology ISVs, and co-sell partners rather than a single undifferentiated customer pool. SU002, SU014
CU007 Deepgram’s enterprise page says the platform is trusted by hundreds of enterprises and conversational AI leaders. SU003
CU008 Contact centers are a core Deepgram customer segment for live transcription, agent assist, QA, and analytics workloads. SU016, SU010
CU009 Healthcare is a targeted Deepgram segment for HIPAA-ready voice agents, medical transcription, and patient communication workflows. SU017, SU003
CU010 Media and podcast platforms are targeted for captioning, searchability, moderation, and analytics workflows. SU018
CU011 Conversational-AI builders and telephony developers use Deepgram as an STT/TTS/orchestration layer inside voice agents and assistants. SU019, SU013, SU023
CU012 Deepgram’s AWS partner materials say purchases can draw down existing AWS commitments and credits, making AWS a real procurement channel. SU010
CU013 IBM positions Deepgram voice capabilities inside watsonx Orchestrate, giving Deepgram partner-mediated exposure to IBM enterprise accounts. SU014
CU014 NASA is currently using Deepgram’s speech-to-text API across four different use cases after testing major providers and an open-source alternative. SU004, SU003
CU015 Deepgram’s NASA case study says the space-to-ground transcript model reached up to 89.6% accuracy. SU004
CU016 Deepgram’s NASA case study says the trained model achieved about 87% word recognition rate on Neutral Buoyancy Lab validation sets. SU004
CU017 UpdateAI says Deepgram speech recognition is the basis for its action-item detection engine for Zoom meetings. SU005, SU007
CU018 UpdateAI says it tested six ASR providers before choosing Deepgram for accuracy and real-time speed. SU005, SU007
CU019 Nytro.AI says Deepgram is its embedded speech-to-text provider inside pitch-intelligence workflows. SU006, SU008
CU020 Nytro.AI says alternatives delivered about 75-80% accuracy while Deepgram delivered about 90-92% (stated elsewhere as 90%+) accuracy. SU006, SU008
CU021 Deepgram’s built-with directory highlights additional ecosystem logos such as Vocinity, but only UpdateAI and Nytro.AI had fetched subpages with substantive deployment detail in this run. SU009
CU022 NetworkWorld reports Jack in the Box using Deepgram-backed AI drive-through voice agents, but this run did not find a second equally detailed public case study for that deployment. SU020, SU002
CU023 No reviewed source disclosed customer counts broken out by geography, company size, or revenue band. SU001, SU002, SU003
CU024 No reviewed source disclosed NRR, GRR, or churn for Deepgram customers. SU001, SU002, SU003
CU025 No reviewed source disclosed contract length, ACV, top-customer revenue share, or top-partner concentration. SU001, SU002, SU003
CU026 The strongest public durability evidence is testimonial continuity from embedded ISVs rather than portfolio-level renewal statistics. SU005, SU006, SU007, SU008
CU027 UpdateAI’s founder explicitly recommends Deepgram to other B2B SaaS companies, which is positive reference quality but not a disclosed renewal metric. SU007
CU028 PeerSpot’s review aggregation emphasizes speed, accuracy, low latency, configurability, and cost-effective scalability as recurring positives. SU021
CU029 PeerSpot’s review aggregation also flags language coverage, live-transcription stability, speaker identification, pricing/concurrency, and setup complexity as recurring weaknesses. SU021
CU030 RFP.wiki’s procurement note says buyers should validate reliability, observability, rollback, and SLA terms rather than relying on model-quality demos alone when considering Deepgram. SU022
CU031 Goodwin’s 2026 privacy analysis shows why AI transcription adoption in regulated workflows can trigger consent, BIPA, wiretap, retention, and vendor-control risks. SU026, SU017
CU032 Deepgram’s Voice Agent API creates a credible within-account expansion path from raw STT into full speech-to-speech orchestration. SU015, SU019, SU023
CU033 Twilio and Deepgram materials together show Deepgram operating as the STT/TTS layer inside phone-call workflows, reinforcing telephony-led developer adoption. SU013, SU023
CU034 Deepgram’s Amazon Connect integration currently supports Deepgram-hosted customers only, so self-hosted buyers do not yet have equal parity in that channel. SU012
CU035 AWS Connect and related partner materials position Deepgram inside contact-center flows without requiring customers to rewrite their operating logic. SU011, SU012
CU036 Deepgram’s cloud, dedicated, and self-hosted deployment modes support customer expansion from experimentation into stricter security and compliance requirements. SU003
CU037 Deepgram’s contact-center and conversational-AI pages show a multi-use-case expansion path from transcription into analytics, agent assist, diarization, topic detection, and turn-taking control. SU016, SU019
CU038 Deepgram’s media-transcription page includes a Podsights-at-Spotify testimonial, indicating content platforms value Deepgram for analytics-grade transcription. SU018
CU039 Deepgram says it operates thousands of AI models and has processed trillions of seconds of speech, which signals scaled deployments but not how usage is distributed across accounts. SU003
CU040 Apps Run The World independently tracks Deepgram customer wins across voice agents, TTS, STT, and audio intelligence categories, reinforcing workload breadth rather than exact count precision. SU025
CU041 SpeechTech Magazine describes the Voice Agent API as enterprise-oriented and cites benchmark outperformance versus OpenAI and ElevenLabs, supporting Deepgram’s expansion into higher-level voice-agent workloads. SU015
CU042 Deepgram maintains a public incident-history surface, so reliability diligence should include incident-log review even though the readable fetch in this run did not enumerate incident-level detail. SU024
CR001 The reviewed legal and regulatory sources do not evidence a named Deepgram-specific BIPA or HIPAA enforcement action or lawsuit as of the run date. SR016, SR017, SR018, SR021, SR022, SR023, SR024
CR002 Illinois BIPA defines a voiceprint as a biometric identifier. SR021, SR022
CR003 BIPA Section 15 requires written notice, purpose-and-term disclosure, and a written release before collecting biometric identifiers or biometric information. SR021, SR022
CR004 BIPA Section 15 also requires a public retention schedule and reasonable protection of biometric data. SR021, SR022
CR005 Smith Gambrell says AI note-takers that record conversations, attribute speakers, and retain transcripts can trigger BIPA claims. SR016
CR006 Smith Gambrell says BIPA can apply when any meeting participant is physically in Illinois even if the vendor and employer are elsewhere. SR016
CR007 Commercial Litigation Update says more than 1,500 BIPA lawsuits have been filed in Illinois since Rosenbach and that exposure remains serious after the 2024 amendment. SR017
CR008 Privacy World says at least 100 putative BIPA class actions were filed in 2025 and that biometric mass-arbitration activity persisted. SR018
CR009 The reviewed sources support framing BIPA as a current exposure category for Deepgram rather than as an evidenced Deepgram case. SR016, SR017, SR018, SR021, SR022
CR010 Deepgram markets its healthcare voice-agent stack as HIPAA-ready and medical-grade for healthcare workflows. SR003, SR027
CR011 Deepgram’s compliance documentation says it may qualify as a business associate and can provide a BAA to qualifying covered entities. SR006
CR012 HHS says its HIPAA Security Rule proposal would make all implementation specifications required and add more prescriptive cybersecurity obligations. SR023, SR024
CR013 HIPAA Journal says the proposed rule would require documented annual risk analyses across vendors, cloud environments, and shared systems and could create material implementation cost for business associates. SR015, SR024
CR014 Deepgram says it has SOC 2 Type I and Type II certification and states GDPR readiness, CCPA compliance, and PCI compliance. SR001, SR006
CR015 Deepgram’s security policy says it uses role-based access control, two-factor authentication, vulnerability and patch management, daily backups, and formal incident response procedures. SR001, SR007
CR016 Deepgram says customers own their data and that it only processes information customers provide. SR007
CR017 Deepgram offers an EU endpoint for in-region processing, but says the specific EU country may change and country-specific hosting may require Deepgram Dedicated. SR009, SR027
CR018 Whisper models are unavailable on Deepgram’s EU endpoint. SR009
CR019 Deepgram says managed OpenAI traffic can remain in-region on the EU endpoint, but other managed providers do not yet offer EU-specific endpoints. SR009
CR020 Deepgram’s rate-limit documentation says limits apply per project, additional projects do not add concurrency, and bypassing limits violates its terms. SR008
CR021 Pay-as-you-go voice-agent usage is capped at 45 concurrent connections, while higher growth and enterprise tiers begin with more concurrency and sales-led increases. SR008
CR022 Deepgram’s Amazon Connect integration currently supports hosted customers only and does not yet support self-hosted deployments. SR029
CR023 Deepgram offers hosted, dedicated, self-hosted, PrivateLink or VPC-style, and customer-cloud deployment paths to mitigate sovereignty and control concerns. SR002, SR026, SR027, SR028
CR024 Deepgram’s deployment-options documentation shifts infrastructure, backup, and uptime monitoring responsibility to the customer in self-hosted mode. SR028
CR025 Deepgram’s AWS page says procurement can draw down AWS commitments and routes workloads through Marketplace, Connect, SageMaker, Bedrock, or self-hosted AWS patterns. SR002
CR026 The AWS page says Bedrock-hosted LLMs can sit inside a Deepgram voice-agent stack, which expands reach but adds third-party model dependency. SR002
CR027 IBM says Deepgram is IBM’s first voice partner for watsonx Orchestrate. SR011
CR028 Twilio’s virtual-agent architecture routes telephony through Twilio, transcription through Deepgram, reasoning through OpenAI, and synthesis through another vendor, illustrating multi-vendor operational chains. SR030
CR029 Future AGI says Deepgram currently leads voice-agent latency use cases, but open-source and competing hosted vendors lead or tie on other evaluation dimensions. SR012
CR030 OpenAI markets Whisper as an open-source self-hosted speech-recognition model, and Future AGI still recommends Whisper or other open models for self-host use cases. SR019, SR012
CR031 Future AGI says NVIDIA Canary Qwen 2.5B leads open-source WER while Deepgram Nova-3 leads hosted WER on the benchmark set it cites. SR012
CR032 MarketsandMarkets projects conversational AI to grow from USD 17.05 billion in 2025 to USD 49.80 billion in 2031 but names compliance, privacy, and ethical standards at scale as core challenges. SR020
CR033 AssemblyAI’s 2026 market overview says 87.5% of builders are actively building voice agents and highlights QA, vertical specialization, and trust as critical scaling themes. SR025
CR034 SoundHound’s 2024 10-K says privacy control, brand control, and optional edge or hybrid deployment are important buyer criteria in voice AI. SR013
CR035 Twilio’s 2024 10-K flags third-party service provider outages, privacy and cybersecurity compliance, open-source software, and AI use as material platform risks in an adjacent communications stack. SR014
CR036 Twilio’s 2024 10-K says usage-based customers can reduce or stop usage without penalty, making service quality and value perception central to retention. SR014
CR037 Deepgram’s Series C release says it raised $130 million at a $1.3 billion valuation to support expansion, patents, and new product and platform initiatives. SR010
CR038 The same Series C release says the round included strategic investors such as Twilio, ServiceNow Ventures, SAP, and Citi Ventures, which can help distribution but also complicate partner expectations. SR010
CR039 Deepgram’s enterprise materials say performance, security, reliability, and scale are key promise areas for high-throughput and regulated workloads. SR002, SR027
CR040 Public materials reviewed for this chapter do not disclose customer concentration, partner-sourced revenue mix, audited uptime metrics, or biometric-specific indemnity terms. SR010, SR011, SR027, SR028, SR029
CR041 Self-hosting mitigates data residency and privacy exposure, but it also transfers operational burden and security patch execution risk to the customer. SR026, SR028
CR042 The Amazon Connect limitation, regional-endpoint constraints, and rate-limit rules mean some regulated or highest-scale buyers still need architecture work beyond the default hosted path. SR008, SR009, SR029
CR043 Rapid market growth and broad product scope increase the risk that pricing and feature competition compress differentiation faster than enterprise proof accumulates. SR012, SR020, SR025, SR027
CR044 Based on the reviewed evidence, the top residual risks are privacy and regulatory exposure, security and compliance execution, partner dependency, and price or architecture competition rather than a currently evidenced Deepgram-specific lawsuit. SR016, SR015, SR002, SR012, SR020, SR027
CR045 Expanding at once across STT, TTS, voice agents, healthcare, partner channels, and patent-backed platform initiatives increases execution surface area even after the Series C financing. SR010, SR011, SR027
CV001 Deepgram announced a $130 million Series C at a $1.3 billion valuation on 13 January 2026. SV001, SV002, SV003
CV002 AVP led the Series C and the syndicate included new strategic investors such as Twilio, ServiceNow Ventures, SAP, and Citi Ventures. SV001, SV002, SV003
CV003 Deepgram said the new round brought total disclosed funding to more than $215 million. SV001, SV002, SV003
CV004 Scott Stephenson said Deepgram was cash-flow positive in the prior year and did not need to raise defensively. SV002, SV004
CV005 Deepgram said more than 1,300 organizations build voice AI functionality powered by its APIs. SV001, SV002
CV006 Deepgram said it had 200,000+ active developers and 400+ enterprise customers entering 2025. SV004
CV007 Deepgram said usage grew 3.3x annually across the past four years and the platform had transcribed more than 1 trillion words. SV004
CV008 Deepgram publicly lists usage-based pricing for STT, TTS, and voice-agent products, which gives investors some visibility into monetization mechanics even without revenue disclosure. SV029
CV009 Deepgram's official comparison pages claim advantages over OpenAI, AWS, Google, AssemblyAI, and ElevenLabs on cost, latency, accuracy, or deployment flexibility. SV020, SV021, SV022, SV023, SV024
CV010 The Business Research Company forecasts the speech-to-text API market at $5.36 billion in 2026 and $10.46 billion in 2030. SV007
CV011 Independent voice-recognition market reports describe a broader category already measured in the tens of billions of dollars with low-20s percentage growth. SV005, SV006
CV012 MarketsandMarkets forecasts the conversational AI market to grow from $17.05 billion in 2025 to $49.8 billion by 2031. SV008
CV013 ElevenLabs announced a $180 million Series C in January 2025 at a $3.3 billion valuation. SV009
CV014 ElevenLabs says employees at over 60% of Fortune 500 companies use its platform and API. SV009
CV015 AssemblyAI announced a $50 million Series C that brought its total disclosed funding to $115 million. SV016
CV016 AssemblyAI says it regularly serves more than 25 million inference calls and over 10 terabytes of voice data per day. SV016
CV017 AssemblyAI says it was named a Leader in G2's Spring 2026 Voice Recognition Grid and topped the associated Relationship Index. SV018
CV018 SoundHound's 2024 Form 10-K confirms it is a public company with a formal SEC disclosure regime and roughly $1.169 billion of non-affiliate market value as of 30 June 2024. SV010
CV019 Twilio's 2024 Form 10-K confirms it is a large public company with roughly $9.1 billion of non-affiliate market value as of 30 June 2024. SV011
CV020 CompaniesMarketCap listed June 2026 market caps of about $3.02 billion for SoundHound, $31.33 billion for Twilio, $1.59 billion for Five9, and $5.14 billion for NICE. SV012, SV013, SV014, SV015
CV021 Deepgram's $1.3 billion mark is about 43% of SoundHound's June 2026 public market cap. SV012
CV022 Deepgram's $1.3 billion mark is about 4% of Twilio's June 2026 public market cap. SV013
CV023 Deepgram's $1.3 billion mark is about 82% of Five9's June 2026 public market cap. SV014
CV024 Deepgram's $1.3 billion mark is about 25% of NICE's June 2026 public market cap. SV015
CV025 No fetched public source discloses Deepgram's ARR, gross margin, NRR, or financing preferences. SV001, SV002, SV004
CV026 Official pricing pages from OpenAI, AWS, Google, Azure, and Deepgram show that speech infrastructure is sold in a transparent and price-sensitive market. SV025, SV026, SV027, SV028, SV029
CV027 At a $1.3 billion valuation, Deepgram would trade at about 13x ARR at $100 million of ARR and about 8.7x ARR at $150 million of ARR. SV001
CV028 At a $1.3 billion valuation, Deepgram would trade at about 6.5x ARR at $200 million of ARR, 5.2x at $250 million, and 4.3x at $300 million. SV001
CV029 Cash-flow positivity and strategic investors reduce immediate down-round pressure relative to weaker AI infrastructure startups, even if they do not prove undervaluation. SV001, SV002, SV004
CV030 Twilio's quoted support in the round suggests Deepgram has ecosystem relevance beyond a stand-alone benchmark story. SV001
CV031 Goodwin says AI transcription tools create real privacy, biometric, wiretap, retention, and privilege risks when organizations use them without strong consent and governance controls. SV019
CV032 That compliance backdrop can weigh on voice AI infrastructure multiples if deployment in regulated enterprises becomes harder or more expensive. SV019, SV008
CV033 The current valuation becomes materially easier to defend if verified ARR is at least roughly $200 million and more comfortable still above roughly $250 million. SV001
CV034 If verified ARR is closer to $100 million-$150 million, the present mark starts to look stretched for a private company with undisclosed unit economics. SV001, SV019
CV035 The most defensible public-evidence base case is that the current mark is plausible but not clearly attractive. SV001, SV002, SV004
CV036 Deepgram's $1.3 billion valuation sits well below ElevenLabs's $3.3 billion private mark, which suggests its January 2026 price was not obviously peak-valued within voice AI. SV001, SV009
CV037 AssemblyAI's funding and customer-satisfaction signals show the speech API peer set remains strong and competitive even below Deepgram's capital base. SV016, SV018
CV038 Deepgram's competitive-advantage evidence is still partly self-authored because the fetched rival comparisons come from Deepgram marketing pages rather than independent valuation work. SV020, SV021, SV022, SV023, SV024
CV039 Competitor pricing pages confirm that Deepgram does not operate in a black-box pricing category shielded from reference points. SV025, SV026, SV027, SV028
CV040 Transparent competitor pricing limits Deepgram's ability to justify a premium valuation purely on narrative without measurable commercial conversion. SV025, SV026, SV027, SV028, SV029
CV041 Because the valuation is public but the denominator is private, the recommendation has to be price-sensitive and diligence-gated rather than a simple score for company quality. SV001, SV002, SV004
CV042 A reasonable bear range using only public evidence is roughly $0.9 billion-$1.2 billion. SV001, SV012, SV019
CV043 A reasonable base range using only public evidence is roughly $1.2 billion-$1.8 billion. SV001, SV004, SV012, SV013, SV014, SV015
CV044 A reasonable bull range requires materially better proof and is roughly $1.8 billion-$2.6 billion on public framing alone. SV001, SV009
CV045 The current $1.3 billion mark sits inside the base range, near its low end rather than below it, so it does not create a clear public-evidence margin of safety. SV001, SV012, SV013, SV014, SV015
CV046 The most defensible current recommendation is track rather than buy. SV001, SV002, SV004
CV047 Key thesis-break triggers are under-scale ARR, weak gross margin, poor retention, investor-unfriendly preferences, compliance drag, or partner conversion that never becomes real revenue leverage. SV019, SV001, SV004
CV048 Priority diligence asks are ARR by segment, gross margin, retention, concentration, and the actual Series C legal terms. SV001, SV002, SV004
CV049 In absolute equity value, Deepgram is much closer to Five9 than to NICE or Twilio, which places a practical ceiling on how much public-comp upside can be assumed from narrative alone. SV012, SV013, SV014, SV015
CV050 Public peers disclose far more financial detail than Deepgram, so another private round or strategic optionality is easier to support than near-term IPO readiness. SV010, SV011, SV012, SV013, SV014, SV015
CV051 The recommendation moves toward buy only if diligence shows enough ARR, margin quality, and retention durability to make the current price look conservative rather than merely plausible. SV001, SV004, SV029
CV052 The final diligence burden is high because the same missing denominator data that blocks a buy call also blocks precise downside protection analysis. SV001, SV002, SV004
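The valuation arithmetic behind CV010 and CV021 through CV028 can be rechecked with a short script. The sketch below is illustrative only: every input is copied from the claims above, the ARR levels are the hypothetical scenarios those claims already use (actual ARR is undisclosed per CV025), and all variable and function names are assumptions introduced here.

```python
"""Recheck the rounded valuation arithmetic quoted in the claims ledger."""

# Inputs are copied from the claims; names are illustrative assumptions.
VALUATION_USD_M = 1_300  # Series C mark, in USD millions (CV001)

# June 2026 public market caps, in USD millions (CV020).
PEER_MARKET_CAPS_USD_M = {
    "SoundHound": 3_020,
    "Twilio": 31_330,
    "Five9": 1_590,
    "NICE": 5_140,
}

# Hypothetical ARR scenarios, in USD millions (CV027, CV028).
ARR_SCENARIOS_USD_M = [100, 150, 200, 250, 300]

# Public-evidence base range, in USD millions (CV043).
BASE_RANGE_USD_M = (1_200, 1_800)


def share_of_peer_cap(valuation_m: float, peer_cap_m: float) -> int:
    """Valuation as a whole-number percentage of a peer's market cap."""
    return round(100 * valuation_m / peer_cap_m)


def implied_arr_multiple(valuation_m: float, arr_m: float) -> float:
    """Valuation divided by ARR, rounded to one decimal place."""
    return round(valuation_m / arr_m, 1)


def position_in_range(value: float, low: float, high: float) -> int:
    """How far `value` sits above `low`, as a percentage of the range width."""
    return round(100 * (value - low) / (high - low))


def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate between two values `years` apart."""
    return (end / start) ** (1 / years) - 1


peer_shares = {
    name: share_of_peer_cap(VALUATION_USD_M, cap)
    for name, cap in PEER_MARKET_CAPS_USD_M.items()
}
arr_multiples = {
    arr: implied_arr_multiple(VALUATION_USD_M, arr)
    for arr in ARR_SCENARIOS_USD_M
}
base_position_pct = position_in_range(VALUATION_USD_M, *BASE_RANGE_USD_M)
# Speech-to-text API market forecast, USD billions, 2026 to 2030 (CV010).
stt_api_cagr = cagr(5.36, 10.46, 4)

if __name__ == "__main__":
    for name, pct in peer_shares.items():
        print(f"Deepgram mark vs {name}: {pct}% of market cap")
    for arr, multiple in arr_multiples.items():
        print(f"ARR ${arr}M -> {multiple}x ARR")
    print(f"Position within base range: {base_position_pct}% above the low end")
    print(f"Implied STT API market CAGR 2026-2030: {stt_api_cagr:.1%}")
```

Keeping the inputs in one place makes it easy to rerun the same checks once verified ARR or updated market caps replace the public placeholders.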
Sources
ID Publisher Title Quote
SO001 Deepgram About Us | Voice AI | STT & TTS Founded in 2015, Deepgram started with machine learning research for waveform analysis in a dark matter detector in China.
SO002 Deepgram AI Minds Podcast #037: Scott Stephenson CEO at Deepgram
SO003 Madrona Venture Group Deepgram Founder Shares Strategies for Scaling and Outmaneuvering Big Tech Scott Stephenson, Co-Founder and CEO of Deepgram, a foundational AI company building a voice AI platform providing APIs for speech-to-text and text-to-speech.
SO004 IA40 Deepgram Founder Scott Stephenson
SO005 Y Combinator Deepgram — YC Company Profile Deepgram is a foundational AI company on a mission to understand human language.
SO006 Deepgram Deepgram Raises $130M Series C at $1.3B Valuation to Power the Voice AI Economy
SO007 BusinessWire Deepgram Raises $130M Series C at $1.3B Valuation to Power the Voice AI Economy Deepgram has raised $130 million in Series C funding at a $1.3 billion valuation. The round was led by AVP.
SO008 TechCrunch Deepgram raises $130M at $1.3B valuation and buys a YC AI startup The company has raised over $215 million in funding to date.
SO009 Inc. Deepgram Wasn't Looking for Capital. Then Came $130 Million
SO010 TechFundingNews Voice AI Deepgram hits unicorn status with $130M raise led by AVP
SO011 AVP Deepgram Raises $130M Series C at $1.3B Valuation to Power the Voice AI Economy Much like Stripe delivered the API platform underpinning the payments economy, we believe Deepgram is poised to deliver the API platform underpinning the emerging trillion-dollar B2B Voice AI economy.
SO012 Robotics and Automation News Deepgram Raises $130 Million Series C at $1.3 Billion Valuation
SO013 NetworkWorld Digging into voice AI platform Deepgram
SO014 BusinessWire Deepgram Accelerates Into 2025, Empowering 200,000+ Developers AI Company Ends 2024 Cash-flow Positive with 400+ Enterprise Customers, 3.3x Annual Usage Growth Across the Past Four Years, Over 50,000 Years of Audio Processed, and Over One Trillion Words Transcribed
SO015 BusinessWire Introducing Nova-3: Extending Deepgram's Leadership in Voice AI for Enterprise Use Cases
SO016 BusinessWire Deepgram Launches Voice Agent API: World's Only Enterprise-Ready, Real-Time, and Cost-Effective Conversational AI API
SO017 IBM Newsroom Deepgram and IBM Introduce Advanced Voice Capabilities for Enterprise AI Deepgram to be IBM's first voice partner offering fast, reliable, and scalable transcription and speech technology.
SO018 BusinessWire Deepgram Signs Strategic Collaboration Agreement with AWS
SO019 Deepgram (via Deepgram.com) Deepgram Raises $130M Series C — press release full text
SO020 FutureAGI Speech-to-Text APIs in 2026: Benchmarks, Pricing, Developer's Decision Guide
SO021 Deepgram Deepgram Pricing | Scalable Speech-to-Text, Text-to-Speech & Voice Agent APIs
SO022 Deepgram Customer Program | Deepgram
SO023 Deepgram NASA Uses Deepgram to Power the Next Gen of Space Tech
SO024 Deepgram Status Deepgram Status Page — Incident History
SO025 Goodwin Law AI Transcription Tools Under Scrutiny — Privacy and Legal Risk In 2025 and 2026, a number of companies have faced litigation under the Illinois Biometric Information Privacy Act (BIPA) for the above practices.
SM001 The Business Research Company Speech-to-text API Global Market Report 2026 Speech-to-text API market size has reached to $4.55 billion in 2025, expected to grow to $10.46 billion in 2030 at a CAGR of 18.2%
SM002 Coherent Market Insights Voice and Speech Recognition Market Report 2026–2033 The Global Voice and Speech Recognition Market is estimated to be valued at USD 26.50 Bn in 2026 and is expected to reach USD 116.89 Bn by 2033 at a CAGR of 23.6%
SM003 ResearchAndMarkets Speech and Voice Recognition Market Report 2025
SM004 FutureAGI Speech-to-Text APIs in 2026: Benchmarks, Pricing, Developer's Decision Guide Best STT API by use case in May 2026: Voice agents (lowest E2S latency) — Deepgram Flux + Nova-3, Sub-300ms streaming
SM005 CompareVoiceAI STT Pricing Calculator and Comparison
SM006 OpenAI OpenAI API Pricing
SM007 AssemblyAI AssemblyAI Pricing
SM008 Speechmatics Speechmatics Pricing
SM009 Google Cloud Google Cloud Speech-to-Text Pricing
SM010 Amazon Web Services Amazon Transcribe Pricing
SM011 Microsoft Azure Azure Speech Services Pricing
SM012 NetworkWorld Digging into voice AI platform Deepgram
SM013 BusinessWire Deepgram Accelerates Into 2025, Empowering 200,000+ Developers
SM014 Goodwin Law AI Transcription Tools Under Scrutiny
SM015 OneTrust The 5 Trends Shaping Global Privacy and Enforcement in 2026
SM016 Deepgram NASA Uses Deepgram to Power the Next Gen of Space Tech
SM017 Speechmatics Your Essential Guide to Voice AI Compliance in Today's Digital Landscape
SM018 AssemblyAI AssemblyAI Blog — Product and Market Updates
SM019 Rev.ai Rev.ai Pricing
SM020 Twilio Building Virtual Agents on Twilio with OpenAI, Deepgram, and ElevenLabs
SM021 Haptik Data Privacy in Voice AI
SM022 BusinessWire Deepgram Launches Voice Agent API: World's Only Enterprise-Ready, Real-Time, and Cost-Effective Conversational AI API
SM023 BusinessWire Deepgram Raises $130M Series C at $1.3B Valuation
SM024 Deepgram Deepgram Pricing
SM025 IBM Newsroom Deepgram and IBM Introduce Advanced Voice Capabilities for Enterprise AI
SP001 FutureAGI Speech-to-Text APIs in 2026: Benchmarks, Pricing, Developer's Decision Guide Best STT API by use case in May 2026: Voice agents (lowest E2S latency): Deepgram Flux + Nova-3
SP002 AssemblyAI AssemblyAI Pricing
SP003 Speechmatics Speechmatics Pricing
SP004 OpenAI OpenAI API Pricing
SP005 Google Cloud Google Cloud Speech-to-Text Pricing
SP006 Amazon Web Services Amazon Transcribe Pricing
SP007 Microsoft Azure Azure Speech Services Pricing
SP008 CompareVoiceAI STT Voice Agent Pricing Calculator
SP009 AssemblyAI AssemblyAI Blog
SP010 Speechmatics Your Essential Guide to Voice AI Compliance in Today's Digital Landscape
SP011 TechFundingNews Voice AI Deepgram hits unicorn status with $130M raise led by AVP
SP012 BusinessWire Deepgram Raises $130M Series C at $1.3B Valuation
SP013 BusinessWire Introducing Nova-3: Extending Deepgram's Leadership in Voice AI
SP014 BusinessWire Deepgram Accelerates Into 2025, Empowering 200,000+ Developers
SP015 Rev.ai Rev.ai Pricing
SP016 Deepgram NASA Uses Deepgram to Power the Next Gen of Space Tech
SP017 IBM Newsroom Deepgram and IBM Introduce Advanced Voice Capabilities for Enterprise AI
SP018 BusinessWire (AWS) Deepgram Signs Strategic Collaboration Agreement with AWS
SP019 Goodwin Law AI Transcription Tools Under Scrutiny
SP020 NetworkWorld Digging into voice AI platform Deepgram
SP021 Deepgram Deepgram Pricing
SP022 TechCrunch Deepgram raises $130M at $1.3B valuation and buys a YC AI startup
SP023 Haptik Data Privacy in Voice AI
SP024 BusinessWire (Voice Agent API) Deepgram Launches Voice Agent API: World's Only Enterprise-Ready, Real-Time, and Cost-Effective Conversational AI API
SP025 Madrona Venture Group Deepgram Founder Shares Strategies for Scaling and Outmaneuvering Big Tech
SP026 Deepgram What is Speech-to-Text? STT API Guide
SP027 Deepgram Speech-to-Text Privacy and Security Guide
SP028 Vapi Vapi Voice Agent Platform
SP029 ElevenLabs ElevenLabs Speech-to-Text — Scribe
SP030 Gladia Gladia Speech-to-Text API
SP031 Rev.ai Introducing Rev AI Core
SP032 OpenAI Whisper: Robust Speech Recognition via Large-Scale Weak Supervision
SP033 Deepgram Deepgram Blog — Product Updates 2026
SI001 Deepgram Deepgram Pricing
SI002 BusinessWire Deepgram Launches Voice Agent API
SI003 Deepgram Deepgram Customers
SI004 BusinessWire Deepgram Raises $130M Series C at $1.3B Valuation
SI005 TechCrunch Deepgram raises $130M at $1.3B valuation and buys a YC AI startup
SI006 TechFundingNews Voice AI Deepgram hits unicorn status with $130M raise led by AVP
SI007 AVP (lead investor) Deepgram Raises $130M Series C at $1.3B — AVP Investment Thesis
SI008 BusinessWire Deepgram Accelerates Into 2025, Empowering 200,000+ Developers
SI009 Inc. AI Founder Deepgram Raises $130M Series C
SI010 NetworkWorld Digging into voice AI platform Deepgram
SI011 FutureAGI Speech-to-Text APIs in 2026: Benchmarks, Pricing, Developer's Decision Guide
SI012 CompareVoiceAI STT Voice Agent Pricing Calculator
SI013 IBM Newsroom Deepgram and IBM Introduce Advanced Voice Capabilities for Enterprise AI
SI014 BusinessWire (AWS) Deepgram Signs Strategic Collaboration Agreement with AWS
SI015 BusinessWire Introducing Nova-3: Extending Deepgram's Leadership in Voice AI
SI016 Robotics and Automation News Deepgram raises $130 million Series C at $1.3 billion valuation
SI017 OpenAI OpenAI API Pricing
SI018 Deepgram Whisper vs Deepgram: Which STT API is Right for You?
SI019 Deepgram Best Speech-to-Text APIs in 2026
SI020 Deepgram What is Word Error Rate (WER)?
SI021 Deepgram What is Speaker Diarization?
SI022 Deepgram Developer Docs Deepgram Listen Remote API Reference
SI023 ElevenLabs ElevenLabs Pricing
SI024 Kore.ai Kore.ai Blog — Voice AI and Conversational AI
SI025 Twilio Twilio Voice Pricing
SI026 USPTO / Google Patents US Patent 12,380,880 — End-to-end ASR with Transformer (Deepgram)
SI027 Goodwin Law AI Transcription Tools Under Scrutiny
SE001 BusinessWire Introducing Nova-3: Extending Deepgram's Leadership in Voice AI
SE002 Deepgram Deepgram Pricing
SE003 FutureAGI Speech-to-Text APIs in 2026: Benchmarks, Pricing, Developer's Decision Guide
SE004 BusinessWire Deepgram Launches Voice Agent API
SE005 Deepgram Developer Docs Voice Agent API — Getting Started
SE006 Deepgram Developer Docs Deepgram STT Streaming Feature Overview
SE007 USPTO / Google Patents US Patent 12,380,880 — End-to-end ASR with Transformer (Deepgram)
SE008 USPTO / Google Patents US Patent 12,334,075 — Hardware Efficient Automatic Speech Recognition (Deepgram)
SE009 BusinessWire Deepgram Raises $130M Series C at $1.3B Valuation
SE010 Deepgram Developer Docs Deepgram Developer Documentation — Introduction
SE011 Deepgram Developer Docs Deepgram Model Selection
SE012 Deepgram Deepgram About
SE013 Deepgram Speech-to-Text Privacy and Security Guide
SE014 Goodwin Law AI Transcription Tools Under Scrutiny
SE015 Deepgram Developer Docs Deepgram Language Support Overview
SE016 Deepgram Deepgram Blog — Flux Multilingual Announcement
SE017 Deepgram Developer Docs Diarization (Speaker Recognition)
SE018 Deepgram Developer Docs Smart Format Feature
SE019 Deepgram What is Speaker Diarization?
SE020 Deepgram Deepgram Customers
SE021 Deepgram Deepgram Status History
SE022 NetworkWorld Digging into voice AI platform Deepgram
SE023 Deepgram Developer Docs Getting Started with Pre-recorded Audio
SE024 IBM Newsroom Deepgram and IBM Introduce Advanced Voice Capabilities for Enterprise AI
SE025 Deepgram Customer NASA Uses Deepgram to Power Next Gen Space Tech
SE026 npm Registry @deepgram/sdk — Deepgram Node.js SDK (npm)
SE027 PyPI deepgram-sdk — Deepgram Python SDK (PyPI)
SE028 TechCrunch Deepgram raises $130M at $1.3B valuation and buys a YC AI startup
SU001 Deepgram Deepgram Accelerates into 2025 AI Company Ends 2024 Cash-flow Positive with 400+ Enterprise Customers, 3.3x Annual Usage Growth Across the Past Four Years, Over 50,000 Years of Audio Processed, and Over One Trillion Words Transcribed
SU002 BusinessWire Deepgram Raises $130M Series C at $1.3B Valuation to Power the Voice AI Economy 200,000+ developers build with Deepgram’s voice-native foundational models.
SU003 Deepgram Voice AI for Enterprise Trusted by hundreds of enterprises and conversational AI leaders, we've deployed and operate thousands of AI models and processed trillions of seconds of speech.
SU004 Deepgram NASA Uses Deepgram to Power the Next Generation of Space Tech NASA is currently using Deepgram’s speech-to-text API for four different use cases.
SU005 Deepgram Case Study: Update AI Deepgram’s fast and accurate speech recognition technology is the basis for UpdateAI’s action item detection engine for Zoom.
SU006 Deepgram Case Study: Nytro AI Nytro.ai chose Deepgram as their embedded STT provider due to consistency, speed, scalability, and overall accuracy.
SU007 Deepgram UpdateAI Uses Deepgram for High accuracy and Readability I’d recommend Deepgram to any B2B SaaS company that’s looking for the best-in-breed transcription and customer service and customer success.
SU008 Deepgram Nytro.AI uses Deepgram for Sales Enablement When we discovered Deepgram, we found that the accuracy was ninety percent plus.
SU009 Deepgram Built With Deepgram Vocinity Creates Conversational Bots with Deepgram
SU010 Deepgram Enterprise Voice AI, Native to AWS Deepgram purchases draw down on your existing AWS commit.
SU011 Deepgram Enterprise Voice AI on AWS, integrated into AWS Connect Seamless integration with existing Connect and Lex workflows, no hacks, no heavy lifting
SU012 Deepgram Developer Docs Deepgram with Amazon Connect This integration supports Deepgram-hosted customers only. Support for self-hosted deployments will be added in a future phase.
SU013 Deepgram Developer Docs Build a Voice Agent with Twilio, Deepgram, and OpenAI Twilio handles the phone call. Deepgram handles speech-to-text and text-to-speech. OpenAI handles the LLM.
SU014 IBM Newsroom Deepgram and IBM Introduce Advanced Voice Capabilities for Enterprise AI Customers include technology ISVs building voice products or platforms, co-sell partners working with large enterprises, and enterprises solving internal use cases.
SU015 SpeechTech Magazine Deepgram Launches Voice Agent API In recent benchmark testing using the Voice Agent Quality Index (VAQI), Deepgram achieved the highest overall score among all evaluated providers.
SU016 Deepgram Voice AI for Contact Centers Streaming transcription enables live call analytics that enhance agent productivity with real-time guidance.
SU017 Deepgram Voice Agents for Healthcare Deploy HIPAA-compliant, enterprise-grade AI Voice Agents powered by our industry-leading Nova-3 Medical model.
SU018 Deepgram Media Transcription With Deepgram’s accurate and fast speech-to-text solution, we’re the Google Analytics of podcasts.
SU019 Deepgram Conversational AI Deepgram Voice Agent API orchestrates STT, TTS, and LLMs with turn-taking, end-of-thought prediction, and barge-in support.
SU020 NetworkWorld Digging into voice AI platform Deepgram Jack in the Box is using Deepgram to implement automated AI voice agents to take customer orders at their drive-through locations.
SU021 PeerSpot Deepgram Reviews, Competitors and Pricing Deepgram could simplify its interface for basic use cases and expand language support to include regional languages.
SU022 RFP.wiki Deepgram - Rating Snapshot: Score & Reviews (2026) Cloud AI developer services procurement should prioritize production reliability and cost control, not only model quality demos.
SU023 Twilio Building Virtual Agents on Twilio with OpenAI, Deepgram, and Elevenlabs Using several platforms such as OpenAI, Deepgram, and Elevenlabs, as well as Twilio Voice, SMS, and Media Streams, they created a Generative AI virtual agent application.
SU024 Deepgram Deepgram Status - Incident History Incident History
SU025 Apps Run The World Deepgram Customers and Enterprise Applications Buyer Insight Each quarter our research team identifies companies that are using Deepgram applications such as Deepgram Voice Agent for Chatbots and Conversational AI, Deepgram Text to Speech, Deepgram Speech to Text, and Deepgram Audio Intelligence.
SU026 Goodwin AI Transcription Tools Under Scrutiny: Navigating Privacy Risks and Practical Mitigation Strategies AI transcription tools can unlock productivity gains and enrich organizational knowledge flows. However, they also introduce consequential risks to privacy, confidentiality, privilege, intellectual property, and other sources of legal or operational risk.
SR001 Deepgram Voice AI Security & Privacy
SR002 Deepgram Enterprise Voice AI, Native to AWS
SR003 Deepgram Voice Agents for Healthcare
SR004 Deepgram Speech to Text that outshines OpenAI Whisper
SR005 Deepgram Deepgram vs AWS
SR006 Deepgram Data Privacy Compliance
SR007 Deepgram Security Policy
SR008 Deepgram API Rate Limits
SR009 Deepgram Regional Endpoints
SR010 Deepgram / Business Wire Deepgram Raises $130M Series C at $1.3B Valuation to Power the Voice AI Economy
SR011 IBM Deepgram and IBM Introduce Advanced Voice Capabilities for Enterprise AI
SR012 Future AGI Best Speech-to-Text APIs in 2026: Deepgram, AssemblyAI, Whisper, ElevenLabs Compared
SR013 Securities and Exchange Commission SoundHound AI, Inc. Annual Report on Form 10-K
SR014 Securities and Exchange Commission Twilio Inc. Annual Report on Form 10-K
SR015 HIPAA Journal New Mandatory Cybersecurity Rules for HIPAA Business Associates
SR016 Smith, Gambrell & Russell AI Note-Takers, Biometric Privacy, and the Battle Over BIPA Damages
SR017 Commercial Litigation Update Biometric Backlash: The Rising Wave of Litigation Under BIPA and Beyond
SR018 Privacy World 2025 Year in Review: Biometric Privacy Litigation
SR019 OpenAI Whisper
SR020 MarketsandMarkets Conversational AI Market - Global Forecast to 2031
SR021 Illinois General Assembly 740 ILCS 14/10 Definitions
SR022 Illinois General Assembly 740 ILCS 14/15 Retention; collection; disclosure; destruction
SR023 U.S. Department of Health and Human Services HIPAA Security Rule NPRM overview
SR024 U.S. Department of Health and Human Services HIPAA Security Rule NPRM fact sheet
SR025 AssemblyAI Voice AI in 2026
SR026 Deepgram Self-Hosted Voice AI
SR027 Deepgram Voice AI for Enterprise
SR028 Deepgram Deployment Options
SR029 Deepgram Deepgram with Amazon Connect
SR030 Twilio Building Virtual Agents on Twilio with OpenAI, Deepgram, and ElevenLabs
SV001 Business Wire Deepgram Raises $130M Series C at $1.3B Valuation to Power the Voice AI Economy Deepgram ... announced it has raised $130 million in Series C funding at a $1.3 billion valuation.
SV002 TechCrunch Deepgram raises $130M at $1.3B valuation and buys a YC AI startup The startup's raise continues the trend of sizable funding rounds in voice AI last year.
SV003 Tech Funding News Deepgram $130M Series C, $1.3B valuation, voice AI
SV004 Business Wire Deepgram Accelerates Into 2025, Empowering 200,000 Developers From Startups to Global Enterprises to Build Voice AI Deepgram had 400+ enterprise customers and 200,000+ active developers building on the platform.
SV005 Research and Markets Speech and Voice Recognition Market Report 2026
SV006 Coherent Market Insights Voice and Speech Recognition Market Size & Share, 2026-2033
SV007 The Business Research Company Speech-to-text API Market Report 2026
SV008 MarketsandMarkets Conversational AI Market by Product Type, Business Function, Integration Type, and End User - Global Forecast to 2031
SV009 ElevenLabs ElevenLabs raises $180M Series C to be the voice of the digital world This latest funding values ElevenLabs at $3.3 billion.
SV010 Securities and Exchange Commission SoundHound AI, Inc. Annual Report on Form 10-K for fiscal year ended December 31, 2024 The aggregate market value of voting stock held by non-affiliates ... was approximately $1,169.2 million.
SV011 Securities and Exchange Commission Twilio Inc. Annual Report on Form 10-K for fiscal year ended December 31, 2024 The aggregate market value of stock held by non-affiliates ... was $9.1 billion.
SV012 CompaniesMarketCap SoundHound AI market capitalization
SV013 CompaniesMarketCap Twilio market capitalization
SV014 CompaniesMarketCap Five9 market capitalization
SV015 CompaniesMarketCap NICE market capitalization
SV016 AssemblyAI Announcing our $50M Series C to build superhuman speech AI models This brings AssemblyAI's total funds raised to $115M.
SV017 AssemblyAI Voice AI in 2026, Series 1
SV018 AssemblyAI G2 Spring 2026 Voice Recognition Report
SV019 Goodwin AI Transcription Tools Under Scrutiny: Navigating Privacy Risks and Practical Mitigation Strategies AI transcription tools ... create new risk vectors for organizations when leveraged without due care.
SV020 Deepgram OpenAI Whisper vs Deepgram alternative
SV021 Deepgram Amazon vs Deepgram
SV022 Deepgram AssemblyAI vs Deepgram
SV023 Deepgram Google vs Deepgram alternative
SV024 Deepgram ElevenLabs vs Deepgram
SV025 OpenAI API pricing
SV026 Amazon Web Services Amazon Transcribe pricing
SV027 Google Cloud Speech-to-Text pricing
SV028 Microsoft Azure Speech Services pricing
SV029 Deepgram Deepgram pricing
SV030 ElevenLabs Pricing