尽调报告 Generative AI infrastructure / inference cloud late-stage private 2026-05-16

Together AI

开放模型推理云，技术护城河可信、企业牵引已见规模，定价接近 Series B 水位

Together AI 的推理云产品和牵引力可信，但 Series B 估值要靠多年 ARR 放量才能支撑强退出。

封面要素

最新披露估值（Series B 2024） 01

3.3 USD B [CV001]

种子轮 / A 轮 / B 轮累计融资 02

500 USD M (approximate, per press) [CV001, CV002]

媒体报道的收入运行率区间 03

130-200 USD M ARR (per The Information, unverified) [CV040]

已具名企业与创业公司客户 04

9 case studies + GTC 2025 cohort [CV012]

开发者注册数（公司口径） 05

100000 developers [CU001]

公司概况

Together AI 是生成式 AI 云平台，覆盖 200+ 个开放和定制模型，提供无服务器与专用推理、微调和训练；底层由 FlashAttention、ThunderKittens 与 Together Inference Engine v2 支撑。公司把可防御的技术研究底座、 Salesforce + NVIDIA 渠道，以及开源社区入口组合到一起。

官网: www.together.ai
成立时间: 2022-06-01
创始人: Vipul Ved Prakash, Ce Zhang, Tri Dao, Percy Liang
创立地点: San Francisco, California, USA
总部: San Francisco, California
产品: Together AI 销售无服务器推理（按 token 计费）、专用端点（预留 GPU 容量）、微调（LoRA + full）、批量推理、 embeddings、视觉、音频和图像 API，覆盖 200+ 个开放与定制模型目录，整体兼容 OpenAI。
客户: 开发者（自助式）、AI 原生创业公司（Pika、Cartesia、Arcee、Nous Research）、企业 SaaS（Salesforce、 Zoom）、医疗健康（Adaption）、学术机构（Washington University），以及 NVIDIA GTC 2025 Pioneers 队列。
商业模式: 按用量计费的无服务器推理 + 承诺专用容量 + 微调 + 企业合同；Salesforce Ventures 联合销售和 Startup Accelerator 强化直销。
阶段: late-stage private
融资情况: 私有融资；Series A $102.5M（Nov 2023，Kleiner Perkins 领投）和 Series B $305M（Mar 2024，Salesforce Ventures 领投，CNBC / Bloomberg / Fast Company 报道投后约 $3.3B）；投资方包括 NVIDIA、Coatue、Lux Capital、Prosperity7、General Catalyst。

执行摘要

主要优势

技术护城河由 FlashAttention（Tri Dao）、ThunderKittens（Stanford HazyResearch）、Together Inference Engine v2 和 Mixture-of-Agents 产品化共同支撑。
核心渠道伙伴（Salesforce Ventures 联合销售、NVIDIA GTC 2025 Pioneers、Startup Accelerator）叠加 200+ 开放模型目录，覆盖企业和开发者。
已记录的企业与创业公司验证基础覆盖 Salesforce、Zoom、Pika、Cartesia、Arcee、Nous Research、Washington University 和 Adaption healthcare。

主要风险

超大规模云厂商捆绑推理服务（AWS Bedrock、GCP Vertex、Azure OpenAI）可能在 2026-2027 年压低定价 30-50%。
GPU、网络和技术栈集中依赖 NVIDIA；若 Blackwell 分配收紧，收入爬坡会被卡住。
到 2027 年，生成式 AI 监管边界（EU AI Act、BIS 出口管制、FTC 调查）和版权诉讼先例（NYT、Authors Guild、Getty）都会继续扩大。

未决问题

准确 ARR、NRR / GRR、前 10 大客户集中度、GPU 承诺支出和运营费用拆分（R&D / S&M / G&A）均未披露。
runDate 时 CFO 和 CRO 是否到位未获公开确认。
除公开状态页外，SLA 百分比、事故历史、渗透测试频率和泄露预案均未披露。
主权渠道姿态（Prosperity7-adjacent）以及版权先例收紧下的 OSS 托管政策，需要管理层披露。

01公司概况

1.1 身份、总部与产品框架

Together AI 将自己定位为「AI 加速云」，为开源和定制的大语言、图像、音频与视觉模型提供训练、微调和推理。公司主体 Together Computer Inc. 总部位于 California San Francisco，在 Menlo Park 设有卫星办公室， Zurich 还有研究人员；招聘页和联系方式共同确认了这些地点，也显示公司仍在基础设施、kernel、GPU、应用 ML 与收入岗位上积极招人。公司由四位与 Stanford、Princeton、ETH Zürich 和更广泛开源 LLM 研究社区深度相关的联合创始人于 27 June 2022 注册成立。它的身份建立在三根支柱上：面向 AI 工作负载定制的超大规模 GPU 云，开源研究线（RedPajama、OpenChatKit、StripedHyena、 FlashAttention、Mixture-of-Agents），以及与 OpenAI 和 Anthropic 竞争、但为开放模型定价的自助式推理和微调 API。公司强调客户可以保留权重、控制数据驻留，并在需要时使用专用集群；这是它与封闭 API 竞争对手的核心差异。[CO001, CO002, CO003, CO004, CO005]

KPI 快照表
指标	数值 / 状态	日期	置信度	缺口或尽调问题
投后估值	$3.3B	2024-07-09	高	确认 2026 年老股交易或新一轮融资
累计新股融资	≈$533M 已披露	2024-07	高	核查 2024 年 7 月后是否有延长轮
年化收入	≈$100M（第三方报道）	2024-07	中	无审计文件；要求管理层提供数字
员工数	>150（招聘网站推导）	2026-05	中	无监管文件；要求 HR 花名册
GPU 规模	>20,000 NVIDIA Hopper 级	2024-07	中	确认 Blackwell 增量和利用率
客户数	100,000+ 开发者（公司声称）	2024	低	区分付费与免费；核实 NRR
总部	San Francisco, CA	2026-05	高	—
成立日期	27 June 2022	2022	高	—

数值混合公司披露（高）、第三方报道（中）和推导数据（低）；付费客户数和 ARR 未经审计，必须向管理层核实。

[CO019, CO020, CO021, CO022, CO023, CO024]

FO002: 公司快照逻辑

身份、产品、资本和客户如何连接。

[CO001, CO003, CO005, CO017, CO020, CO021]

1.2 创始人、领导层与治理

CEO Vipul Ved Prakash 此前是 Topsy 联合创始人 / CTO（2013 年被 Apple 以约 $200M 收购），也是 Cloudmark 早期负责人，因此兼具消费者规模 ML 与基础设施运营经验。CTO Ce Zhang 是 ETH Zürich 终身教授，并领导 Together 在分布式训练系统和以数据为中心的 ML 研究。首席科学家 Chris Ré 是获得 MacArthur 奖的 Stanford 教授，Snorkel 以及多条 FlashAttention / Hyena 工作线都出自其团队；Stanford CRFM 主任 Percy Liang 是联合创始人兼顾问。领导层已补上收入负责人、GPU 基础设施负责人、推理工程负责人，以及常驻 Zurich 的研究负责人；董事会包括 Coatue、Kleiner Perkins、NEA 和 Lux 的投资合伙人。关键人依赖集中在 Prakash 的商业执行上，也集中在创始研究三人组带来的技术可信度上；考虑到开源飞轮贡献了 Together 很大一部分漏斗顶端，这一点尤其重要。[CO006, CO007, CO008, CO009, CO010, CO011]

管理层与创始人表
人物	职位	背景	创始人-市场匹配	关键人依赖
Vipul Ved Prakash	联合创始人、CEO	曾任 Topsy 联合创始人 / CTO（2013 年被 Apple 收购），Cloudmark 联合创始人	连续基础设施 / 消费 ML 创始人，有运营退出经历	高——唯一 CEO 和主要商业门面
Ce Zhang	联合创始人、CTO	ETH Zürich 终身教授；分布式训练与数据中心化 ML 研究负责人	深厚系统 / ML 研究可信度	高——唯一 CTO；连接研究与工程
Chris Ré	联合创始人、首席科学家	MacArthur Fellow；Stanford CS；Snorkel 联合创始人；FlashAttention/Hyena 谱系	撰写或指导大部分开源 IP	高——锚定研究品牌
Percy Liang	联合创始人	Stanford CRFM 主任；HELM 基准负责人	设定研究议程和学术可信度	中——顾问性质，不全职运营
Tri Dao	首席科学家（研究）	FlashAttention 作者；Princeton CS 教职	推理内核权威	高——推动内核性能领先
收入负责人	销售领导（公开列示职位）	企业 SaaS 背景	企业扩张必需	中——已招聘多名销售
GPU 基础设施负责人	集群工程	过往超大规模云厂商经验（招聘网站）	SLA 与成本关键岗位	中——正在积极招聘

创始人履历与官方 About 页面和 Wikipedia 交叉核实；非创始人高管来自截至 runDate 的招聘信息和公开 LinkedIn 足迹。

[CO006, CO007, CO008, CO009, CO010, CO011]

1.3 融资历史、资本结构与估值

Together AI 在 May 2023 完成 $20M 种子轮，由 Lux Capital 领投，Factory、SciFi、Long Journey 以及 Scott Banister、Jakob Uszkoreit、Aravind Srinivas 等个人投资人参与。Nov 2023 又完成 $102.5M Series A，由 Kleiner Perkins 领投，NVIDIA、Emergence、NEA、Prosperity7 和 Greycroft 参投。Mar 2024 公司据报道以 $1.25B 估值追加约 $106M；随后在 July 2024 完成 $305M Series B，由 Salesforce Ventures 和 Coatue 领投，投后估值 $3.3B，Lakestar、NVIDIA 和更多战略投资方也参与。由此计算，在任何 2025/2026 延伸轮之前，已披露的新股融资累计约 $533M；截至报告运行日，EDGAR 上没有公开 S-1 申报或注册发行文件。投资人组合——主权相关资本（Prosperity7）、战略 GPU 供应商（NVIDIA）、定义品类的云客户（Salesforce Ventures）与一线财务投资人（Coatue/KP/Lux）——并不常见，说明 Together 正被摆成开放模型市场中立、多方参与的底层骨架。[CO012, CO013, CO014, CO015, CO016, CO017]

利益相关方 / 投资方图谱
利益相关方	角色	轮次	控制 / 经济重要性	尽调问题
Salesforce Ventures	领投 B 轮（2024）	B	向 Salesforce 生态做战略分发	确认是否有商业承诺或收入分成
Coatue	联合领投 B 轮	B	公私募交叉投资信号	确认按比例跟投意愿
Kleiner Perkins	领投 A 轮	种子 / A / B	董事席位；合伙人 Bucky Moore	确认董事会构成
NVIDIA	战略投资方	A/B	H100/H200/B200 供给分配	量化供给承诺和定价
Lux Capital	领投种子轮	种子 / A	最早机构投资方	确认董事会观察员权利
Emergence Capital	A 轮	A	企业 SaaS 网络	—
Prosperity7 (Aramco)	A 轮	A	主权资本色彩；中东市场进入	确认是否有主权云承诺
投资方：NEA, Greycroft, SciFi, Factory, Long Journey, Definition, Long Journey	联合投资方	种子 / A / B	轮次支持	—
创始人与员工	普通股	—	据 A 轮新闻稿，持股保留 >25%	确认 B 轮后股权结构表

股权结构表数字来自融资事件新闻稿；截至 runDate，未披露老股出售。

[CO012, CO013, CO014, CO015, CO016, CO017]

FO001: 公司里程碑时间线

从创立到 Series B，以及旗舰研究发布。

[CO014, CO015, CO016, CO017, CO021, CO022]

1.4 规模、封面指标与里程碑

公开规模指标仍不完整。公司称其在多个区域运营超过 20,000 块 NVIDIA Hopper 级 GPU，公开路线图也提到 Blackwell 推出；并称通过 Together API 服务「hundreds of thousands」开发者。但公司尚未披露经审计 ARR、毛利率、付费开发者数或净收入留存率。 CNBC 报道 Series B 前后年化收入节奏为 $100M；Bloomberg 提到三位数增长，但没有给出具体数值。报道中的员工数超过全球 150 人，仍在招聘 kernel、网络、ML 和销售岗位。里程碑时间线包括成立（June 2022）、种子轮（May 2023）、RedPajama 1T 数据集（April 2023）、OpenChatKit（March 2023）、Series A（November 2023）、FlashAttention-3（July 2024）、 $3.3B 估值 Series B（July 2024），以及 StripedHyena / Mixture-of-Agents 研究（late 2023–2024）。截至报告运行日，未见反向诉讼、裁员或监管行动报道；但核心封面指标（毛利率、ARR 确认、客户集中度）仍未披露，已体现在快照 KPI 表中。[CO019, CO020, CO021, CO022, CO023, CO024]

里程碑表
日期	事件	类型	金额 / 估值 / 状态	参与方	含义
2022-06-27	Together Computer Inc. 注册成立	创立	存续	Prakash, Zhang, Ré, Liang	主体身份确立
2023-03-10	OpenChatKit 发布	产品	已发布	Together + LAION + Ontocord	开源指令微调基线
2023-04-17	RedPajama 1T 数据集发布	产品	已发布	Together + EleutherAI + LAION	基础开源数据集（1T tokens）
2023-05-15	$20M 种子轮公布	融资	已交割	Lux + Factory + SciFi	机构启动资本
2023-11-29	$102.5M A 轮	融资	已交割，估值未披露	Kleiner Perkins（领投）、NVIDIA、NEA、Emergence	扩张和 H100 建设
2024-03-13	据报道，以 $1.25B 估值进行过渡融资	融资	据报道	现有投资方	周期中估值抬升
2024-07-09	$305M B 轮，投后 $3.3B	融资	已交割	Salesforce Ventures + Coatue（联合领投）、NVIDIA、Lakestar	估值跃升 3x；转向企业市场
2024-07-11	FlashAttention-3 论文与博客	产品	已发布	Dao 等	领先 H100 推理内核
2024-09	Together Inference Engine 2.0	产品	已发布	Together 工程团队	延迟 / 吞吐领先主张
2023-12	StripedHyena-Nous-7B	产品	已发布	Together + Nous Research	非注意力长上下文架构
2024-06	Mixture-of-Agents 论文	产品	已发布	Together 研究团队	智能体 LLM 技术
2024-Q4	Dedicated Endpoints 正式可用（GA）	产品	已发布	Together 工程团队	企业推理产品

截至 runDate，未报道负面事件（诉讼、裁员、监管行动）；缺乏负面事件本身也是尽调发现，仍待背景调查验证。

[CO019, CO020, CO021, CO022, CO023, CO024]

FO003: 快照 KPI

投委会可用的成熟度、牵引力与资本快照。

[CO019, CO021, CO022, CO024, CO032]

1.5 图表要点

Chapter 02

02市场分析

2.1 市场边界与邻近领域

Together AI 位于现代云栈的 AI 计算和推理平台层——夹在超大规模云厂商 GPU IaaS（AWS、GCP、Azure）、专门 GPU 云（CoreWeave、 Lambda、Crusoe）、推理 API 提供商（Replicate、Fireworks、Groq、Modal）和封闭 API 模型实验室（OpenAI、Anthropic）之间。我们承销的市场，是用于运行、微调和服务开放权重或客户自有基础模型的支出，加上 AI 工作负载使用的专用与无服务器 GPU 容量。该市场不包括通用云计算、传统 ML 平台（Sagemaker 仅训练、经典 scikit pipeline），也不包括不托管客户权重的封闭专有模型 API。邻近领域包括 MLOps 工具（Weights & Biases、Anyscale）、向量数据库，以及 AI 安全 / 可观测性厂商。现状替代方案是自建 Kubernetes-on-GPU 集群，以及从 OpenAI / Anthropic 租用封闭 API；两者都用灵活性换取价格和运营简单度。我们也明确排除按席位计费的 AI copilots（Copilot、Cursor），因为需求单位是终端用户席位而不是推理 token；它们位于 Together 之上的应用层，采购的是 token 级推理，并不替代它。[CM001, CM002, CM003, CM004, CM005, CM006]

市场定义表
细分	纳入支出	排除支出	购买方 / 付款方	与 Together 的相关性
开放权重模型推理（API）	Llama/Mistral/Qwen/DeepSeek 上按 token 计费的无服务器推理	封闭 API token（OpenAI/Anthropic）	开发者 + CTO	核心 SOM
专用 GPU 容量	预留 H100/H200/B200 端点	通用云计算	平台团队	直接扩张 ARR
微调 + 定制模型托管	LoRA、完整微调、定制 checkpoint 托管	内部 Kubernetes 训练	ML 工程负责人	高毛利附加
批量推理 + 训练	百万级 token 批处理任务、预训练运行	封闭训练专用平台	研究负责人	增长切入点
主权 / 区域集群	区域内专用容量	公共区域多租户	政府 / 受监管 CIO	差异化赛道
MLOps + 可观测性	日志、评测、微调任务	BI / 分析	MLOps 负责人	相邻业务，非核心
封闭 API 模型租用	OpenAI/Anthropic API 支出	—	应用开发者	替代 / 压力

边界锚定在客户拥有权重，以及 GPU 支撑的计算作为计费单元；排除通用云和纯封闭 API。

[CM001, CM002, CM003, CM004, CM005, CM006]

2.2 TAM/SAM/SOM 与测算口径

多个分析机构口径指向 2024 AI 基础设施 TAM 为 $40–60B，2028 前 CAGR 为 30–50%（Gartner、IDC、McKinsey）。在这个范围内，与 Together 最相关的推理和专用 AI 计算 SAM，到 2026 约为 $8–15B；测算来自三角交叉：超大规模云厂商 AI 收入披露（外推 AWS Bedrock-equivalent 收入年化 $26B）、Series B 报道称推理是增长最快的产品线，以及 Together 约 $100M ARR 代理值意味着其在早期 SAM 中只占个位数份额。SOM（Together 可触达、近期可拿下的支出）大约为 $1–3B，重点是 AI 原生创业公司、模型实验室，以及 Together 已有明确关系的 Salesforce + 主权云渠道。测算受两点限制：超大规模云厂商没有拆分公开数据；许多已发表估计把训练 capex 和推理运行率混在一起。[CM007, CM008, CM009, CM010, CM011, CM012]

TAM/SAM/SOM 或规模测算视角表
发布方	年份	地域	数值	CAGR	方法论	置信度	限制
Gartner	2024	全球	$40–60B AI 基础设施 TAM	30–50%	自上而下调研超大规模云厂商 + 企业 AI 支出	中	汇总训练 + 推理；未拆分
IDC（二手引用）	2024	全球	$50B AI 基础设施，2024	35%	硬件 + 云预测	低	间接引用
McKinsey AI 支出报告	2024	全球	$50–100B 2027 年 AI 基础设施	40%	情景分析	低	区间宽；假设不清
三角测算 SAM（本报告）	2026	全球	$8–15B 推理 + 专用容量 SAM	—	自下而上，基于 CNBC ARR + 超大规模云厂商披露	中	依赖超大规模云厂商季报，来源单一
三角测算 SOM（本报告）	2026	全球	Together 可触达 $1–3B	—	渠道 + Together $100M ARR	中	估算不确定性高
NVIDIA 财报（数据中心）	2025-Q1	全球	>$30B/季度数据中心收入	>50%	公开文件	高	包含训练资本开支销售，并非纯推理

TAM/SAM/SOM 均设边界；保留区间，因为没有单一公开来源能清晰拆分推理支出。

[CM007, CM008, CM009, CM010, CM011, CM012]

FM001: 市场规模测算视角

Together 可服务 AI 计算的 TAM/SAM/SOM。

[CM007, CM008, CM011, CM012, CM036]

FM002: 市场估算区间

2026 年推理 SAM 估算。

[CM009, CM010]

2.3 买方、用户与付款方分层

三类核心买方驱动 Together 需求。（1）AI 原生创业公司和模型实验室：技术创始人或 CTO 为 FlashAttention 级推理延迟、专用 H100/H200 访问和开放权重灵活性选择 Together；这类客户通常先用信用卡自助购买，再升级到企业合同。（2）Fortune-500 企业内部的平台团队和应用 ML 小组：预算所有者是评估多模型策略的 CIO/CTO，采购门槛围绕数据驻留、SOC 2 和 BAA 支持；Salesforce Ventures 共同领投 Series B 为这一分层背书。（3）政府、研究和主权云客户：Prosperity7（Aramco）及类似主权相关 LP 暗示中东 / APAC 角度，Together 也把专用区域集群定位为差异点。用户（开发者、ML 工程师、研究人员）往往不同于付款方（财务、采购、IT），这会拉长企业周期，但落地后改善 NRR。[CM014, CM015, CM016, CM017, CM018, CM019]

细分市场 / 买方图谱
细分市场	买方	用户	付款方	工作流	预算负责人	采用触发点
AI 原生初创公司	CTO	ML 工程师	创始人 / CFO	自助式 API + LoRA	CTO	需要开放权重 + 专用 GPU
F500 平台团队	CIO	应用 ML	IT 采购	RFP + 专用端点	CIO	多模型策略 + BAA
主权云	部长 / CIO	政府 ML	财政部门	区域内专用容量	政府	数据驻留要求
模型实验室	创始人	研究员	创始人	预留训练 + 推理	创始人	超大规模云厂商 GPU 稀缺
独立开发者	本人	本人	本人	按 token 计价的 API	本人	免费层 + 价格持平
Salesforce 生态 ISV	产品 VP	工程团队	产品 P&L	嵌入式 GenAI	产品 VP	Salesforce Ventures 渠道

买方 / 用户 / 付款方拆开看，才能区分信用卡自助采用和企业采购门槛。

[CM014, CM015, CM016, CM017, CM018, CM019]

FM003: 买方 / 细分市场地图

各细分市场的采用成熟度。

[CM014, CM015, CM016, CM017, CM018, CM019]

2.4 增长驱动与约束

顺风包括：开放权重模型持续扩散（Llama 3/4、Mistral、DeepSeek、Qwen）、超大规模云厂商 GPU 稀缺、FinOps 压力要求降低按 token 计费的封闭 API 支出，以及智能体 AI 浪潮把每个用户的 token 用量放大。逆风包括：NVIDIA 供给分配偏向超大规模云厂商、主权数据规则拖慢跨境推理、新数据中心遭遇能源 / 许可瓶颈，以及 Groq、Fireworks、Cerebras 在推理层带来的价格竞争压力。采用时点风险包括企业采购摩擦、超大规模云厂商可能把 OSS 推理层商品化（AWS Bedrock 开放模型、GCP Vertex Model Garden），以及训练与推理组合的经济性波动。Together 的定位依赖推理 kernel 持续领先一代（FlashAttention 3/4、ThunderKittens），同时扩展到能锁定企业支出的预留 / 专用 SKU。每个驱动和约束最终都回到 IC 的二元问题：推理 SAM 能否再以 35%+ 复合增长三年，还是超大规模云厂商商品化会把增长前置成一年抢地盘？我们的基准情景假设 2027 前 CAGR 可持续在 30–40%，2026 起竞争强度扩大；在这种状态下，Together 的开源飞轮和专用容量差异化能产出最强 IRR。[CM021, CM022, CM023, CM024, CM025, CM026]

增长驱动因素与约束表
驱动因素 / 约束	方向	时间	影响	尽调问题
开放权重模型扩散	+	2024-2027	支撑 SAM >35% CAGR 增长	跟踪 Llama 4/5、DeepSeek、Qwen 发布节奏
NVIDIA Hopper/Blackwell 稀缺	+	2024-2026	推高 Together 预留容量溢价	量化 Together 与 NVIDIA 的分配协议
封闭 API 价格压力（OpenAI 降价）	-	持续	挤压每 token 利润率	跟踪 Together 与 OpenAI 的价格持平度
超大规模云厂商开放模型商品化	-	2025-2027	侵蚀纯推理 SAM	关注 AWS Bedrock 与 Vertex Model Garden 扩张
主权数据驻留规则	+/-	2025+	形成区域护城河，但限制跨境 ARR	确认 Together 区域内集群
能源 / 许可瓶颈	-	2026-2028	拖慢容量扩张	确认 Together 数据中心合同
智能体工作负载放大 token 用量	+	2025+	增加单用户推理量	跟踪 MoA + 智能体 SDK 采用
FinOps 推动 OSS 推理	+	2025+	相对封闭 API，给 Together 带来顺风	调研企业 FinOps 策略

驱动因素来自多份分析师报告和合作伙伴声明；约束则由供应链报道和超大规模云厂商公告交叉验证。

[CM021, CM022, CM023, CM024, CM025, CM026]

FM004: 采用漏斗或价值链图

从发现到扩张的路径。

[CM020, CM021, CM022, CM033]

2.5 图表要点

Chapter 03

03竞争格局

3.1 竞争格局分层

Together 在五个相互重叠的战场竞争。（1）超大规模云厂商开放模型产品——AWS Bedrock 和 Google Vertex Model Garden 托管 Together 也提供的 Llama / Mistral checkpoint，并打包企业合同和 IAM。（2）专门 GPU 云——CoreWeave、Lambda Labs 和 TensorWave 争夺原始 GPU-hour 与预留容量；它们通常缺少 Together 叠加的推理 SaaS 层。（3）推理 API 同行——Fireworks、Replicate、Modal 和 Anyscale 在按 token 计费的无服务器层提供近似直接替代；Fireworks 最常被称为 Together 最接近的直接对手。（4）定制硅推理厂商——Groq （LPU）、Cerebras（wafer-scale）和 SambaNova 以模型覆盖为代价，在延迟和每 token 价格上竞争。（5）封闭 API 模型实验室——OpenAI 和 Anthropic 是那些愿意放弃权重可迁移性的买方替代方案。现状替代是自建 Kubernetes-on-GPU，用灵活性换运营负担；内部自建最常见于前沿实验室和 FAANG。竞争集合异常宽，是因为 Together 位于计算、模型托管与开发者体验的交叉点；每个战场都让 Together 面对不同成本结构（capex 重的 GPU 云 vs OpEx 轻的 API 提供商）、不同分销权力（超大规模云采购 vs 开发者自助），以及不同退出动态（GPU 云整合 vs API 同行商品化），下文会分别承销。[CP001, CP002, CP003, CP004, CP005, CP006]

竞争对手画像表
竞争对手	类别	规模 / 融资	目标细分市场	差异化	限制
AWS Bedrock	超大规模云厂商开放模型	>$80B AWS 收入	企业	IAM、合规、捆绑	每 token 溢价，模型新增更慢
GCP Vertex Model Garden	超大规模云厂商开放模型	~$30B GCP 收入	企业	Gemini + 开放模型	开放权重深度较弱
CoreWeave	专用 GPU 云	>$8B 融资；2025 上市	AI 实验室、超大规模云厂商卸载需求	最大的非超大规模云厂商 GPU 集群	没有推理 SaaS 层
Lambda Labs	GPU 云	$320M Series C	研究人员、初创公司	按需 H100/H200	集群小于 CoreWeave
Fireworks AI	推理 API 同类	>$77M 融资	开发者、初创公司	OpenAI 兼容 API	OSS 研究影响力较小
Replicate	推理 API 同类	>$40M 融资	独立开发者	社区模型、低摩擦	冷启动延迟
Modal	无服务器基础设施	>$80M 融资	ML 工程师	Python 原生无服务器	模型广度较弱
Anyscale	基于 Ray 的平台	>$250M 融资	ML 工程师	Ray + LLM 工具	OSS 平台税
Groq	定制芯片	>$1B 融资	延迟敏感型开发者	LPU 推理速度	模型覆盖有限
Cerebras	定制芯片	>$1B 融资；已提交 IPO	前沿客户	晶圆级芯片	单次部署成本高
OpenAI / Anthropic（替代）	封闭 API	>$30B / $10B 融资	企业 + 开发者	前沿封闭模型	权重不可迁移
TensorWave	AMD GPU 云	种子轮阶段	成本敏感型开发者	MI300X 容量	规模有限

融资和规模数字来自公开新闻稿与 Crunchbase 摘要；部分私募融资轮次依赖第三方报道。

[CP001, CP002, CP003, CP004, CP005, CP006]

3.2 能力与功能比较

按能力轴看，Together 领先于 FlashAttention-3/4 kernel 性能、开放权重模型广度（Llama、Mistral、DeepSeek、Qwen、定制 checkpoint）和专用端点灵活性。超大规模云厂商领先于企业合规广度（BAA、FedRAMP、区域驻留）和打包身份 / 计费。Groq 在支持模型的原始单流延迟上领先，但模型覆盖落后。Fireworks 在无服务器开放模型 API 上与 Together 接近，但 OSS 研究可见度更低。价格比较显示，Together 无服务器费率集中在 OpenAI 平价区间附近（7–70B 模型输入 token 约 $0.20–$0.90/M），批量折扣最高 50%；CoreWeave / Lambda 在原始 GPU-hour 上更便宜，但要求客户自己做 DevOps；AWS Bedrock 则在底层计算之上加收按 token 溢价。下方功能矩阵把不支持的单元格标为 unknown，而不是猜测。矩阵显示：Together 赢在开放权重广度和 kernel 性能，超大规模云厂商赢在合规和 IAM，定制硅厂商以模型覆盖为代价赢在延迟；没有单一厂商能同时主导四个最常被引用的采购标准。我们还注意到，在这个集合里，Together 是仅有的两家既提供 OpenAI 兼容 chat completions 端点、又暴露 fine-tune 和 batch SKU 的厂商之一，能显著缩短从封闭 API 迁出的买方迁移时间。[CP009, CP010, CP011, CP012, CP013, CP014]

功能 / 能力矩阵
采购标准	Together	Bedrock	GCP Vertex	Fireworks	Groq	CoreWeave
开放权重模型广度	高	中	中	高	低	n/a
FlashAttention 级内核性能	高	unknown	unknown	高	中	n/a
专用端点 / 预留	是	是（预置）	是	是	是	是（裸资源）
微调 API	是	部分	是	是	否	否
批量推理 SKU	是	部分	是	部分	否	否
合规（SOC2/HIPAA/FedRAMP）	SOC2；通过 BAA 支持 HIPAA	完整	完整	SOC2	unknown	SOC2
主权 / 区域集群	可用	完整	完整	有限	unknown	完整
OpenAI 兼容 API	是	否	否	是	是	否
每 token 标价透明度	高	中	中	高	高	n/a
多模态（视觉 / 音频 / 图像）	是	部分	是	部分	否	n/a

公共文档未披露的单元格标为 “unknown”；功能不在竞争对手 SKU 范围内的单元格标为 “n/a”。

[CP009, CP010, CP011, CP012, CP013, CP014]

定价 / 包装对比
供应商	SKU	价格 / 单位	折扣	备注
Together	无服务器 Llama-70B	$0.88/M tokens	—	OpenAI 价格持平区间
Together	批量推理	较无服务器低 50%	批量	2025 更新
Together	专用端点	定制	预留	由销售报价
Fireworks	无服务器 Llama-70B	$0.90/M tokens	—	类似价格持平
Replicate	按秒计费	不一	—	按 GPU 秒计费
AWS Bedrock	Llama 3 70B	输出价格：$0.99/M output tokens	量大	预置预留选项
GCP Vertex	Llama 3 70B	$0.99/M	量大	类似 Bedrock
Groq	Llama 3 70B	$0.59/M	—	延迟溢价
CoreWeave	GPU 小时	$2–4/H100-hr	预留	客户管理技术栈
Lambda	GPU 小时	$2.79/H100-hr	按需	客户管理技术栈

每 token 价格反映 runDate 时供应商网站公开标价；企业交易的实际价格未披露。

[CP016, CP017, CP018, CP019, CP020]

FP001: 竞争定位图

开放权重广度与企业合规成熟度。

[CP001, CP009, CP011, CP012, CP013]

FP002: 功能广度 / 能力图

按竞争对手比较能力强度。

[CP010, CP014, CP015, CP018, CP029]

3.3 护城河耐久度与竞争风险

Together 可防御的护城河包括：（a）FlashAttention 研究脉络和 kernel 迭代速度（Tri Dao + Chris Ré），（b）开源社区引力（RedPajama、StripedHyena、MoA），以及（c）NVIDIA + Salesforce + 主权资本结构，帮助锁定 GPU 供应和企业分销。切换成本处于中等水平：客户可以通过 API 翻译在 Together / Fireworks / Bedrock 多栖；但 Together 上的专用端点合同和微调模型 artefact 会提高粘性。分销权力偏向超大规模云厂商——它们掌握企业采购和身份——但 Together 的中立性与开放权重承诺构成反定位差异。竞争者反向证据包括：CoreWeave 2024 IPO 文件与 Lambda 增长信号显示 IaaS 层有显著资本优势；Groq 和 Cerebras 各自融资超过 $1B 且估值更高；Bedrock 2025 年扩大 Llama 支持，压缩 Together 在商品化工作负载上的溢价。商品化风险真实存在，但受 Together 的研究速度和专用容量合同约束。净判断是：到 2027 年，护城河在专用和高性能分层仍然耐用；商品化无服务器层会越来越受到超大规模云厂商挤压，延迟敏感层会受定制硅厂商压力。公司能否守住 kernel 与架构领先，是护城河假设的门槛变量，因此也是技术尽调清单上的首要事项。[CP021, CP022, CP023, CP024, CP025, CP026]

护城河耐久性 / 竞争风险登记表
护城河主张	威胁	严重性	缓释措施 / 尽调问题
FlashAttention 研究脉络	开源成果向竞争对手扩散	中	跟踪 Together 的专利 / IP 布局
开源社区吸引力	Mistral/HF 的竞争性开源项目	中	量化 Together 在 GH/HF 上的长期牵引力
NVIDIA 供给协同	NVIDIA 向超大规模云厂商倾斜	高	记录 Together 与 NVIDIA 协议细节
Salesforce / 企业渠道	Salesforce 自建 AI 基础设施	中	确认 Salesforce 商业承诺
主权资本 + 区域集群	主权客户直接转向本地云	中	梳理 Together 区域数据中心布局
专用端点粘性	Bedrock 预置吞吐能力追平	高	跟踪 Bedrock 开放模型价格动作
开放权重中立性	企业想要封闭 API 的简单体验	中	调研企业多模型策略
推理引擎性能领先	专用芯片（Groq/Cerebras）跳跃式赶超	高	在相同模型上对比测试 Together 与 Groq

按竞争替代敞口和资本强度排序护城河；每行都有一个具体尽调问题。

[CP021, CP022, CP023, CP024, CP025, CP026]

FP003: 护城河 / 准备度 KPI

紧凑版竞争耐久性总结。

[CP021, CP022, CP023, CP024, CP025, CP026]

3.4 图表要点

Chapter 04

04财务情况

4.1 融资历史与资本结构

Together AI 已在四个公开宣布轮次中筹集约 $533M 已披露新股资本。$20M 种子轮（May 2023）由 Lux Capital 领投，Factory、SciFi、 Long Journey 以及知名个人投资人（Scott Banister、Jakob Uszkoreit、Aravind Srinivas）参与。Nov 2023 的 $102.5M Series A 由 Kleiner Perkins 领投，NVIDIA、Emergence、NEA、Prosperity7 和 Greycroft 参与。Mar 2024 公司据报道以 $1.25B 估值追加约 $106M（有时称为 Series A2），随后在 July 2024 完成 $305M Series B，由 Salesforce Ventures 和 Coatue 领投，投后估值 $3.3B，Lakestar、NVIDIA 及多家战略方参与。runDate 时，SEC EDGAR 上没有 Together Computer Inc. 的 S-1、S-3 或注册发行；公开市场上也未确认老股交易或 2026 延伸轮。因此资本结构仍是纯风险投资，且有战略锚点（NVIDIA 对应 GPU 供应，Salesforce Ventures 对应企业分销，Prosperity7 对应主权可选项）；按领投信号看，董事会控制权在 KP、 Coatue 与 Lux 之间分散，但股权结构表本身未公开。累计稀释未披露；外界普遍报道创始人在 Series B 后仍保留有意义股权，但公开记录没有精确比例，必须向管理层验证。[CI001, CI002, CI003, CI004, CI005, CI006]

资本充足性表
资本要素	数值	日期	公开状态	尽调问题
累计新股融资	~$533M	2024-07	已披露（轮次层面）	—
账面现金	未披露	—	缺失	索取现金头寸
月度烧钱	未披露（隐含约 $15-25M）	2024-25	缺失	索取实际烧钱速度
现金跑道月数	未披露（隐含可能 18-30 个月）	2025	缺失	索取现金跑道计划
计划资金用途	未披露	—	缺失	索取资本开支计划
下一轮融资触发点	未披露	—	缺失	索取里程碑
债务 / 项目融资	未披露	—	缺失	索取融资授信条款
供应商融资（NVIDIA）	未披露	—	缺失	确认是否存在设备融资
Series B 估值	投后 $3.3B	2024-07-09	已披露	—
最新老股成交价	未披露	—	缺失	Pitchbook / Information 传闻

资本要素混合了已披露融资轮金额和未披露的前瞻性财务要素。

[CI007, CI023, CI026, CI027, CI030, CI031]

FI004: 资本密集度 / 现金流图

将资本开支和经营性现金流映射到融资轮次。

现金余额和下一轮融资触发点未披露；箭头仅表示方向，不表示规模。

[CI007, CI026, CI027, CI030]

4.2 收入、定价与报道规模

Together 尚未提交财务报表。CNBC 在 July 2024 Series B 前后报道其年化收入节奏为 $100M；Bloomberg 提到「triple-digit growth」； Fast Company 和 VentureBeat 复述了这些数字，但没有独立验证。The Information 另有关于 2025 收入轨迹的付费墙报道；PitchBook 将公司列为后期风险投资公司，但没有确认 2025 跟投。公开定价页按 token 披露价格，7–70B 开放模型大约 $0.20–$0.90/M tokens，并记录 50% 批量推理折扣；定制专用端点价格需通过销售报价。SKU 包括无服务器、专用 / 预留端点、微调、批量和 embeddings；视觉 / 音频 / 图像 SKU 另行记录。runDate 时，公司没有公开 ARR、分部拆分、客户集中度、NRR 或毛利率。Forrester 和 IDC 的市场框架说明 Together 是数十亿美元生成式 AI 推理 TAM 中的成长阶段进入者，但两家分析机构都未把 Together 列为前三厂商。管理层承认 Salesforce Ventures 联合销售带动企业管线加速，但没有量化。公司自称动能、第三方媒体轶事和缺少经审计披露这三者组合，符合私营成长阶段 SaaS，但也在实际成交价 vs 标价、组合和毛利率上制造重大尽调风险。 GTM 动作以漏斗顶端自助开发者注册为主，随后通过 Salesforce Ventures 和 NVIDIA 渠道转介做伙伴驱动的企业扩张；销售周期、CAC 和回本周期未披露，但参考可比推理 API 厂商披露，企业专用合同可推断为 60-120 天。[CI009, CI010, CI011, CI012, CI013, CI014]

收入来源表
SKU	定价依据	公开价格基准	折扣杠杆	尽调缺口
无服务器推理	每百万 tokens	$0.20–$0.90/M（7–70B 开放模型）	用量 / 承诺用量	实际成交价与标价差异未披露
批量推理	每百万 tokens	较无服务器折扣 50%	批量 SLA 窗口	2025 年博客更新已确认
专用端点	定制 / 预留	销售报价	期限承诺	未公开标价
微调 API	按训练任务	定价页报价	用量	文档公开，但未披露毛利率
向量嵌入 API	每百万 tokens	按模型公开	用量	—
视觉 / 图像 / 音频 API	按请求 / 按 token	按模型公开	—	收入组合未拆分
企业合同	年度 / 承诺	未披露	战略折扣	关键尽调问题

各定价行混合了公开标价（高置信度）和推断的企业交易做法（低置信度）；不同 SKU 之间的收入结构未披露，必须向公司索取。

[CI009, CI010, CI011, CI012, CI013, CI014]

定价 / 变现表
定价维度	公开基准	标价与实际成交	折扣 / 未知项	来源
Llama-70B 按 token 定价	$0.88/M 无服务器	仅标价	用量折扣	定价页
批量 SLA 折扣	-50%，较无服务器	仅标价	批量窗口	2025 年批量推理博客
专用端点	定制 / 按小时	实际成交价未披露	期限承诺	博客 + 销售报价
微调任务	按训练 token	仅标价	用量	微调文档页
向量嵌入	每百万 tokens	仅标价	用量	向量嵌入文档页
企业合同金额	未披露	实际成交未披露	战略折扣	向管理层索取
联合销售返利（Salesforce）	未披露	实际成交未披露	合作伙伴经济条款	Salesforce Ventures 联合销售
主权云溢价	未披露	实际成交未披露	区域性	Prosperity7 战略方

标价可以公开核验；企业合同的实际成交价未披露，必须索取。

[CI012, CI013, CI014, CI015, CI016]

FI001: 收入模型桥接图

客户活动如何转化为 Together 收入和毛利。

毛利连线仅作示意；实际利润率未披露。

[CI012, CI013, CI014, CI015, CI024, CI025]

4.3 单位经济、资本充足性与缺口

Together 的公开画像只能支持粗略单位经济估计。成本端，GPU-hour COGS 随 NVIDIA capex 扩张；CoreWeave 的 S-1 披露（有用可比）显示，GPU 云在预留交易上的毛利率为 60–70%，按需交易更低。按 Together 标价看，无服务器每 token 毛利率可能在 40–60%，专用端点更高；但实际毛利率取决于利用率和未公开的预留容量合同。现金端，已融资 $533M，对应截至 2024 估计 $300–$500M 现金消耗（与超大规模 GPU 建设和 150+ 员工数一致），说明现金跑道可延伸到 2026，但没有数值被确认。资本充足性取决于 Together 是延长 Series B 还是申请 IPO； Figma 和 CoreWeave 的 2025 IPO 先例说明 AI 基础设施发行人的公开市场窗口已打开，而 Navan 的 S-1 流程是更接近的成长 SaaS 可比。缺口很重大：ARR 确认、按 SKU 毛利率、前 10 大客户集中度、净美元留存、已签约 vs 未签约收入、现金跑道月数、债务或供应商融资，以及任何主权云承诺。这些缺口推动了单位经济和资本充足性表中的尽调要求，也为每个未披露原始指标形成重大证据缺口；在没有管理层披露前，最有信息量的外部信号是 Together 的公开招聘状态、定价页修订，以及任何 2026 老股市场传闻，尽调结束前都应跟踪。以这一规模的消耗型 SaaS 看，营运资本不太可能成为约束；更大的现金摆动项是 GPU capex 相对收入爬坡的节奏，它决定下一轮触发时间。结论是：收入质量和增长表象强但未验证；毛利路径可信但未经审计；资本强度高但有 NVIDIA 对齐支撑；首要尽调阻断点，是 public-financial-gaps 表中列出的全套私有财务原始指标。[CI019, CI020, CI021, CI022, CI023, CI024]

单位经济表
指标	数值 / 空值	置信度	重要性	尽调问题
无服务器推理毛利率	40–60%（推断）	低	长期利润率路径	索取实际混合毛利率
专用端点毛利率	60–75%（推断）	低	预留容量客户 LTV	索取专用端点毛利率拆分
批量推理毛利率	35–55%（推断）	低	50% 折扣后的批量毛利率	确认批量利用率
CAC 回本周期	null	低	销售效率	按客群索取回本月数
魔数	null	低	销售产能	索取魔数
净留存率（NRR）	null	低	扩张代理指标	按队列索取 NRR
总留存率	null	低	流失代理指标	索取总留存率
2024 年隐含烧钱速度	$300–$500M（推断）	低	现金充足性	索取 24 个月计划
GPU 集群利用率	null	低	利用率驱动毛利率	按 SKU 索取利用率
SBC 比率	null	低	真实利润率	索取 SBC 明细表

所有数值都是推断区间或空值；每个空值都配有具体尽调请求。

[CI019, CI020, CI021, CI022, CI023, CI024]

公开财务缺口表
项目	公开状态	重要性	尽调问题
审计收入（ARR）	未披露	验证第三方 $100M 数据	索取管理层 ARR 与增长材料
按 SKU 拆分毛利率	未披露	支撑长期投资逻辑	按 SKU 索取 COGS 拆分
净金额留存率	未披露	粘性代理指标	按队列索取 NDR
前 10 大客户集中度	未披露	收入集中风险	索取匿名化前 10 大客户
已签约收入（RPO）	未披露	未来收入可见性	索取已签约 / 未签约拆分
现金与现金跑道	未披露	资本充足性	索取现金头寸与 24 个月计划
债务 / 供应商融资	未披露	资本结构	如有，索取融资授信条款
创始人持股	未披露	利益一致性、稀释	索取股权结构表
NRR 与总留存率	未披露	扩张与流失	索取总 / 净留存率
股权激励费用	未披露	真实利润率与披露利润率	索取 SBC 明细表
企业实际成交价	未披露	真实利润率与标价	索取三份样本合同

所有项目都会影响投资判断，且截至 runDate 均未公开；本章依赖第三方信号，也需要向管理层索取材料来补齐缺口。

[CI019, CI020, CI021, CI022, CI023, CI024]

FI002: 单位经济性桥接图

在披露值缺失时，拆解每 token 单位经济性的输入项。

所有定量节点都是推算区间或空值；本图仅作定性桥接。

[CI012, CI016, CI019, CI020, CI024, CI025]

FI003: 财务估计区间

有来源支持的收入、烧钱速度、现金跑道和利润率边界。

区间仅作示意；下限取最保守的公开数据点，上限按最激进公开数据点的 2x 估算。

[CI009, CI024, CI025, CI026, CI027]

4.4 图表要点

Chapter 05

05产品与技术

5.1 产品界面、模块与 SKU

Together AI 对外提供一个统一平台，包含无服务器推理、专用端点、微调、批量推理、embeddings，以及按模态划分的 API（视觉、音频、图像）。产品界面记录在 docs.together.ai，并在 chat-completions 层面兼容 OpenAI，因此从封闭 API 迁移并不复杂。模型目录覆盖 200+ 个开放模型，包括 Llama 3/4、Mistral、Mixtral、Qwen、DeepSeek、StripedHyena 和定制微调 checkpoint；公开模型和 SKU 资料确认了按 token 与按请求计费界面。专用端点为延迟敏感工作负载提供 H100/H200/B200 GPU 预留容量，需通过销售报价。微调 API 支持在多数支持的模型家族上运行 LoRA 和全参数训练任务。批量推理相对无服务器最高可打 50% 折扣，并有记录的 SLA 窗口。SDK 提供 Python 与 TypeScript，其他运行时可用原始 HTTP；限速文档区分免费、付费和企业层。下方完整产品模块 / 资产矩阵列出每个模块、主要用户、成熟度状态、差异化，以及买方在签长期合同前应追问的缺口。模块排序遵循买方典型采用路径：先用无服务器试验，再用专用端点和微调进生产，最后用批量和 embeddings 扩展工作流。[CE001, CE002, CE003, CE004, CE005, CE006]

产品模块 / 资产矩阵
模块	用户	状态 / 成熟度	差异化	尽调缺口
无服务器推理 API	开发者、创业公司	正式可用	200+ 开放模型上的 OpenAI 兼容聊天补全	SLA 百分比未公开
专用端点	企业	正式可用	预留 H100/H200/B200 算力，提供 BAA	标价未公开
微调 API	ML 工程师	正式可用	Llama/Mistral/Qwen 支持 LoRA + 全参数微调	训练成本透明度
批量推理	ML 工程师	正式可用（2025 更新）	较无服务器折扣 50%	实际批量利用率未披露
向量嵌入 API	开发者	正式可用	多个开放向量嵌入模型	按模型跟踪留存
视觉 / 图像 / 音频 API	多模态开发者	正式可用	Llama-Vision、图像生成、音频转写	区域可用性地图
推理引擎：Together Inference Engine（TIE v1/v2）	内部 / 高阶用户	正式可用	FA-3/4 + TK + 推测解码	引擎版本 SLA 差异
Mixture-of-Agents	研究人员、高阶开发者	测试版	集成推理提升质量	相较单模型的成本溢价
模型商店	所有用户	正式可用	200+ 开放权重 + 自定义权重	目录更新节奏
SDK（Python、TS、HTTP）	开发者	正式可用	OpenAI 兼容 + 原生	SDK 发布节奏

成熟度按公开文档状态判断；标为测试版或限量开放的单元格反映 runDate 时文档中的明确说法。

[CE001, CE002, CE003, CE004, CE005, CE006]

工作流 / 用例表
用户任务	当前工作流	Together 方案	可衡量收益	限制
试用开放模型	本地 llama.cpp 或 HF Spaces	无服务器 API 调用	零基础设施，兼容 OpenAI	规模化后的成本
从封闭 API 迁移生产负载	OpenAI SDK	将基础 URL 换成 Together	同一套 SDK，开放权重	功能对齐的边界
微调一个 Llama 变体	自建 GPU 集群	微调 API + 运行任务	不需要 DevOps	训练步骤可见度有限
支撑低延迟应用	自托管 vLLM	专用端点	预留容量，BAA	更高承诺用量
运行夜间批量摘要	自托管批处理	批量推理 SKU	比无服务器方案便宜 50%	批处理 SLA 窗口
构建智能体	LangChain + 封闭 API	函数调用 + JSON 模式 + 结构化输出	开放权重 + 工具使用	工具调用模式仍在演进
生成向量嵌入	本地 HF 向量嵌入模型	向量嵌入 API	托管、可扩展	重新索引成本
多模态（视觉）	自托管 Llama-Vision	视觉 API	托管视觉调用	图像尺寸限制
研究集成方案	论文复现代码	MoA API	开箱即用的集成推理	单次查询成本更高
运行受监管工作负载	本地部署 GPU	专用端点 + BAA	专用端点支持 HIPAA	尚无 FedRAMP

工作流各行来自文档快速入门和客户案例研究；限制项来自文档中的明确提示或已知缺口。

[CE001, CE002, CE003, CE004, CE005, CE006]

FE001: 产品架构图

Together AI 产品栈从 API 到 GPU 底座的分层。

[CE001, CE011, CE012, CE013, CE014, CE015]

5.2 架构、依赖与运营模型

Together 的架构把应用 API（chat、completions、embeddings、fine-tune、batch）叠在模型注册表与推理编排器之上，后者在多区域 NVIDIA Hopper / Blackwell 机群上调度 GPU pod。推理引擎（Together Inference Engine v1/v2）封装 FlashAttention-3 与 FlashAttention-4 attention kernel、ThunderKittens kernel 框架，以及 speculative-decoding / Medusa decoder，用来实现已发表的吞吐和延迟主张。Mixture-of-Agents（MoA）研究使支持模型能够通过 ensemble 推理获得更高质量补全。模型存储依托 HuggingFace 和 Together 自有 registry；权重可迁移是其明确设计原则。关键依赖包括 NVIDIA GPU 供应（Hopper/Blackwell）、数据中心共址伙伴、HuggingFace 模型工件目录，以及用于微调工件的 AWS S3 / 等价存储。运营模型把 kernel / 推理工程团队（Tri Dao、HazyResearch 脉络）、平台 / SRE 团队（2025 起由 Alon Gavrielov 领导的基础设施组织）和研究部门（Chris Ré、Percy Liang）拆开。架构通过一张流程图呈现（客户请求到 GPU pod 再到响应），并用关键依赖 DAG 暴露单一供应商集中。可靠性证据包括状态页、已发布限速文档，以及 GTC 2025 和 AI Native Conference 2025 上发布的模型路线图。缺口包括公开 SLA 百分比、精确多区域地图（哪些区域、哪些提供商），以及单一事实源路线图；这些均被标为证据缺口。[CE011, CE012, CE013, CE014, CE015, CE016]

技术 / 运营架构表
层 / 组件	作用	关键依赖	风险
API 网关	接收兼容 OpenAI 的 HTTP 请求	鉴权 + 限流基础设施	DDoS、限流校准失误
模型注册表	将模型 ID 解析到权重	HuggingFace + 内部存储	权重变动、许可证更新
推理调度器	把请求调度到 GPU pod	GPU 池、Kube / 编排器	热点拥堵、队列深度
推理引擎：Together Inference Engine v2	内核优化的模型执行	FA-3/4、ThunderKittens、推测解码	引擎 bug、新模型回归
GPU 池（Hopper / Blackwell）	算力底座	NVIDIA 供给、托管机房合作伙伴	供给冲击、断电
微调训练器	LoRA / 全参数训练任务	GPU 池 + 对象存储	任务失败成本
批处理队列	调度批量推理	GPU 低峰窗口	若撞上高峰，可能违反 SLA
向量嵌入服务	嵌入文本 / 图像	向量嵌入模型注册表	模型弃用
视觉 / 音频路径	多模态推理	独立模型栈	模态特定 bug
可观测性 / 状态	SLA 监控	status.together.ai 动态源	仍缺公开 SLA
信任 / 合规	SOC 2 + HIPAA 控制	审计节奏	FedRAMP 尚未正式可用
存储（微调产物）	持久化已训练模型	S3 等价存储	丢失 / 泄露场景

架构层基于已披露的产品表面；各层深度从博客和研究论文推断，可能并不穷尽。

[CE011, CE012, CE013, CE014, CE015, CE016]

FE002: 客户工作流 / 运营流程

客户请求经过 Together 平台并返回补全的路径。

[CE011, CE012, CE013, CE014, CE015, CE016]

FE003: 关键依赖图

Together 依赖的供应商、平台和合作伙伴。

[CE014, CE018, CE019, CE020, CE021, CE022]

5.3 信任、安全、合规与路线图

Together 发布的信任中心提到 SOC 2 Type II 认证、专用端点可提供 HIPAA business associate agreement（BAA），以及标准数据处理条款。 runDate 时，FedRAMP 和类似美国联邦认证尚未列出；区域驻留通过专用集群提供，但公开地图不完整。安全控制覆盖内容审核、function-calling JSON 校验、structured-output JSON mode，以及按模型给出的安全指南。路线图从博客和 AI Native Conference 文章中梳理，包括 Blackwell （B200）容量爬坡、批量推理 SKU 扩展、更多微调家族、多模态（vision+audio）覆盖，以及 Mixture-of-Agents 产品化。差异化建立在四点：（a）kernel 级性能领先（FA-3/4、TK），（b）开放权重模型覆盖广，（c）无服务器 / 专用 / 批量 SKU 之间足够灵活，（d）研究与工程双文化，且深接 Stanford / Princeton / ETH 脉络。公开开发者信号——GitHub repo 活跃度、PyPI 下载轨迹、HuggingFace model hub 存在感和 Hacker News 讨论参与——确认社区活跃，但规模还未匹配 OpenAI 或 Hugging Face 本身。相比超大规模云产品，Together 的差异在开放权重中立性和 kernel 性能上最明显，在企业合规广度上最不明显。下方信任 / 合规与路线图表按当前状态、范围和缺口汇总每项控制与里程碑；标为 unknown 的单元格表示缺少公开披露，而不是底层能力不存在。[CE023, CE024, CE025, CE026, CE027, CE028]

信任 / 质量 / 合规表
控制 / 认证	状态	范围	缺口
SOC 2 Type II	已鉴证	平台	需要最新鉴证日期
HIPAA / BAA	可用	专用端点	不覆盖无服务器层级
GDPR / DPA	可用	欧盟客户	具体区域驻留
FedRAMP	尚未	美国联邦	路线图时间未确认
ISO 27001	未确认	—	状态不确定
数据驻留 / 区域集群	部分可用	欧盟、美国	公开区域地图有限
内容审核 / 安全	有文档	API 层	各模型行为不同
函数调用 / JSON 模式	正式可用	API	工具使用模式仍在演进
结构化输出	正式可用	API	—
审计日志	有文档	企业	默认未启用
自定义模型权重隐私	有文档	专用端点	需要合同审查
漏洞赏金 / 负责任披露	已发布	平台	—

控制项已用 trust.together.ai 页面、博客文章和公开文档交叉核验；标为「未确认」的单元格表示公开披露缺失，并不等于底层控制不存在。

[CE023, CE024, CE025, CE026, CE027, CE028]

路线图 / 发布 / 开发阶段表
日期 / 阶段	功能 / 里程碑	状态	影响	来源
2024-07	FlashAttention-3	正式可用	Hopper 上的内核领先性	arXiv 2407.08608
2024-10	ThunderKittens	正式可用	内核框架	Together 博客
2024-11	Startup Accelerator	已推出	GTM 渠道	Together 博客
2025-03	GTC 2025 Pioneers	活动	客户 + NVIDIA 曝光	Together 博客
2025-04	Alon Gavrielov 出任 VP Infra	已聘任	运营规模	Together 博客
2025-05	Adaption 合作	已推出	医疗健康工作流	Together 博客
2025-06	AI Native Conference	活动	研究 + 产品发布	Together 博客
2025-08	FlashAttention-4	正式可用	下一代内核	Together 博客
2025-09	批量推理 API 更新	正式可用	50% 折扣 + SLA	Together 博客
2026-Q1	Blackwell (B200) 上线	计划中	容量与价格	从文档推断
2026	MoA 产品化扩展	计划中	质量层级	AI Native Conference
2026	多模态扩展	计划中	视觉 + 音频覆盖	Together 博客

runDate 之后的路线图事项均明确标为计划中；来源包括博客文章和会议公告。

[CE033, CE034, CE035, CE036, CE037]

FE004: 产品成熟度 / 能力图

各产品模块的成熟度评分。

[CE001, CE002, CE003, CE004, CE005, CE006]

5.4 图表要点

Chapter 06

06客户情况

6.1 客户分层与采用入口

Together AI 的客户基础按买方 / 用户角色和部署强度分层。漏斗顶端是使用无服务器推理做原型或低量生产的自助开发者：按公司披露，自 GA 以来已有超过 100,000 名开发者使用该平台。其下是具名创业客户——Pika（视频）、Arcee（开源合并）、Nous Research（社区模型）、 Cartesia（语音）——它们通过无服务器和专用端点组合运行生产工作负载。企业层由 Salesforce（通过 Salesforce Ventures 联合销售和客户案例提及）、 Zoom（客户案例）和 Washington University（研究部署）支撑；NVIDIA GTC 2025 Pioneers 项目又浮现出一批客户，包括医疗健康、机器人和开发者工具公司。 Startup Accelerator（2024-11 启动）是面向早期 AI 创业公司的明确漏斗，提供额度、技术支持和 GTM 放大。地域组合偏北美， EU 通过专用集群增长；垂直组合覆盖开发者工具、内容 / 媒体（视频、语音、图像）、企业 SaaS、医疗健康和学术。付款方 / 用户 / 买方拆分随层级变化：自助层里开发者既是买方也是用户；企业层买方通常是 CTO/CIO 或平台工程负责人，用户则是应用团队。下方客户分层、采用轨迹和具名客户证明表，记录每一行的证据质量与留存、集中度剩余缺口。[CU001, CU002, CU003, CU004, CU005, CU006]

客户分群表
分群	买方 / 用户 / 付费方	用例	规模	收入 / 战略价值	缺口
自助式开发者	开发者 = 买方 + 用户	原型开发、低量生产	100,000+ 开发者（公司声称）	长尾收入 + 漏斗	付费与免费未拆分
AI 原生初创公司	CTO / 创始人	生产推理	Pika、Cartesia、Nous、Arcee 有文档记录	高战略价值	未披露收入数值
企业 SaaS	CIO / 平台工程	嵌入式 AI 功能	Salesforce、Zoom	较大战略价值	未披露合同规模
医疗健康	CIO / 临床负责人	受监管工作流（BAA）	Adaption（2025 年推出）	战略性	生产状态待定
高校 / 研究	PI / IT 负责人	科研计算	Washington University	品牌价值	支出规模未披露
开发者工具	创始人 / CTO	嵌入式推理	GTC 2025 批次	管线	未列明批次成员
主权 / 政府	采购	主权云	与 Prosperity7 对齐（暗示）	战略可选性	无公开证据
开源社区	维护者	OSS 模型服务	HuggingFace 镜像集成	品牌 + 社区	主动与被动使用未拆分

分群行混合了具名案例研究和推断类别；收入区间数值不可得。

[CU001, CU002, CU003, CU004, CU005, CU006]

客户增长 / 采用进展表
指标	值	日期	来源	置信度	含义	缺失分母
使用平台的开发者	100,000+	2024	Together 博客	低	漏斗顶端规模	付费 / 免费拆分
具名客户案例研究	已发布 7+ 个	2024-25	Together 博客	高	真实生产使用	总客户数
GTC 2025 客户队列	约 12 家先锋客户	2025-03	Together 博客 + NVIDIA	中	企业销售管线	单客户 ACV
Startup Accelerator 参与者	未披露 N	2024-11 起	Together 博客	低	管线杠杆	队列规模
Adaption 医疗合作伙伴	1（已启动）	2025	Together 博客	中	受监管行业切入	生产状态
HuggingFace 集成用户	未披露	2024-25	HF 博客	低	开源社区拉动	活跃开发者
G2 评价	样本数很小	2025	G2	低	独立证据	样本量太低，缺乏代表性
Trustpilot 评价	样本数很小	2025	Trustpilot	低	独立证据	样本量太低，缺乏代表性

采用进展行混合了公司自称数据（低置信度）和第三方报道数字；缺失的分母已逐项列出。

[CU001, CU002, CU011, CU012, CU013, CU014]

FU001: 客户旅程图

从自助开发者到企业扩张的路径。

[CU001, CU002, CU003, CU004, CU005, CU006]

FU002: 采用 / 部署漏斗

开发者到企业客户的逐阶段转化。

认知、活跃付费和多年期合同数量都是示意占位；只有注册和具名数量有来源支持。

[CU001, CU002, CU011, CU012, CU013, CU014]

6.2 具名客户证明与耐久性

具名客户证明覆盖七个公开案例（Salesforce、Zoom、Pika、Arcee、Nous Research、Cartesia、Washington University），加上 GTC 2025 Pioneers 队列和 Adaption 医疗健康合作。每个案例都记录了客户工作流、使用模型和定性结果；量化结果（吞吐、延迟、成本、ROI）在部分部署中有记录，但并不全面。最常被引用的结果是 FlashAttention 带来的延迟降低（Pika、Cartesia）、相对封闭 API 的成本降低（Arcee、Nous），以及集成深度（Salesforce、Zoom）。生产 vs 试点方面，Salesforce、Zoom、Pika、Cartesia 明确为生产；Adaption 被描述为启动中的合作，而非已确认生产部署。反向与耐久信号混合：G2 和 Trustpilot 评论数很少，限制了独立留存代理；Reddit 和 Hacker News 讨论偶尔提到无服务器层延迟或冷启动问题；未见公开客户流失公告或终止客户报告。下方客户证明矩阵按证据质量、结果具体性、留存可见度和生产成熟度标注每个具名客户。留存与重复使用原始指标（NRR、GRR、总留存）均未披露，本章把该缺口列为重大证据缺口，并附具体尽调要求。引用质量和新鲜度以 2024-2025 案例（Salesforce、Zoom、Pika）最佳；更早且 2026 未更新的案例较弱。[CU012, CU013, CU014, CU015, CU016, CU017]

具名客户证据表
客户	客群	部署 / 用例	生产 / 试点	结果	限制
Salesforce	企业 SaaS	联合销售 + 嵌入式推理	生产	集成深度 + Series B 轮领投	合同金额未披露
Zoom	企业 SaaS	AI 功能推理	生产	延迟改善	具体指标未公开
Pika	初创公司（视频）	视频模型服务	生产	靠 FA 级内核降低延迟	成本收益仅定性
Cartesia	初创公司（语音）	语音模型服务	生产	专用部署吞吐	定价未披露
Arcee	初创公司（开源）	模型合并 + 推理	生产	相比闭源 API 的成本优势	用量未披露
Nous Research	开源社区	社区模型托管	生产	开放权重中立性	收入结构未披露
Washington University	学术机构	科研算力	生产	科研吞吐	支出规模未披露
Adaption	医疗	受监管工作流	启动中	进入医疗	生产状态待定
GTC 2025 Pioneers 队列	企业混合客群	多种用例	生产	NVIDIA + Together 联合	队列名单未完整列出

各行只列有案例研究或新闻证据的公开具名客户；未公开的具名客户（如有）不在本表内。

[CU012, CU013, CU014, CU015, CU016, CU017]

留存 / 重复使用 / 满意度表
指标	值 / null	客群	置信度	尽调请求
NRR	null	企业	低	请求按队列拆分 NRR
GRR	null	企业	低	请求按队列拆分总留存率
Logo 流失	null	企业	低	请求具名账户流失名单
活跃开发者（付费）	null	自助	低	请求付费开发者数量
复购率	null	自助	低	请求队列复购率
G2 平均评分	样本数很小	自助	低	样本数太小，不能外推
Trustpilot 平均评分	样本数很小	自助	低	样本数太小，不能外推
Reddit/HN 情绪	褒贬不一到偏正面	社区	低	汇总定性扫描
具名客户续约	null	企业	低	通过客户访谈确认
专用端点续约率	null	企业	低	请求续约队列

所有留存基础指标均为 null，并配有具体尽调请求。

[CU022, CU023, CU024, CU025, CU026]

FU003: 客户验证矩阵

按具名客户拆解证据质量；每行围绕单个客户展开证据维度，补充「具名客户证明」表。

[CU012, CU013, CU014, CU015, CU016, CU017]

6.3 扩张、集中度与反向信号

扩张代理大多是定性信号。Salesforce Ventures 联合销售关系是首要企业扩张杠杆；市场把 Salesforce Ventures 领投 Series B 解读为多年渠道承诺。 NVIDIA GTC 2025 Pioneers 和 Startup Accelerator 则增加品牌与管线。HuggingFace 合作把模型 hub 中的开发者导入 Together。没有管理层披露，就无法精确限定集中度风险；但公开客户组合偏 AI 原生创业公司和开发者工具公司，而不是少数超大型企业合同，说明漏斗顶端比 OpenAI 式锚定客户模型更分散。专用层记录了渠道与采购摩擦：企业销售周期需要销售介入、定制 MSA 和安全审查，收入确认前会增加 60-120 天。反向信号包括零散 Reddit 和 Hacker News 讨论提到无服务器层延迟、冷启动或偶发可靠性事件；公司维护公开状态页，但不发布 SLA 百分比。runDate 前未见公开诉讼、丢失客户报道或具名账户流失。下方扩张与集中度表记录每个扩张驱动、集中度风险、影响幅度，以及关闭剩余缺口所需的精确尽调路径；本章留存表把所有未披露原始指标视为尽调要求，而不是断言无法溯源的数字。整体看，客户证据基础符合一个成长阶段推理平台：它在强自助开发者飞轮之上，正在建立真实企业牵引。[CU027, CU028, CU029, CU030, CU031, CU032]

扩张与集中风险表
扩张驱动	集中风险	影响	尽调路径
Salesforce Ventures 联合销售	企业订单过度集中于 Salesforce 渠道	高	量化来自 Salesforce 的管线占比
NVIDIA GTC Pioneers	NVIDIA 转介绍集中	中	量化 GTC 来源 ACV
Startup Accelerator	长尾稀释风险	低	跟踪队列收入转化
HuggingFace 合作	漏斗依赖 HF	中	确认交叉推广条款
自助开发者增长	长尾流失风险	低	按月跟踪队列留存
Adaption 医疗切入	单一具名合作伙伴风险	中	跟踪后续医疗客户拿单
主权 / Prosperity7 渠道	若落地，存在主权客户集中风险	中	确认管线承诺
开源社区	品牌依赖 OSS 拉动	低	跟踪 GH/HF/PyPI 信号稳定性
前 10 大客户集中	若未披露则影响重大	高	请求匿名化前 10 大客户数据
地域集中	北美占比高	中	请求区域收入拆分

在缺少客户收入拆分披露时，扩张驱动和集中风险只能做定性排序。

[CU027, CU028, CU029, CU030, CU031, CU032]

FU004: 留存 / 复用队列

时间序列留存占位图，暂用行业常见 PLG SaaS 代理值；Together 披露前，所有数字仅作示意。

所有留存单元格都是行业基准示意值（PLG SaaS / 推理）；Together 尚未披露实际队列留存。

[CU022, CU023, CU024, CU025, CU026]

6.4 图表要点

Chapter 07

07风险

7.1 监管与法律风险面

Together AI 面对的生成式 AI 监管边界，与所有在美国和欧洲运营的基础模型平台相同。在美国，FTC 于 2024 年启动对生成式 AI 投资与伙伴关系的 6(b) 研究，并表示会广泛审查云与 AI 关系的反垄断问题；Biden / Trump-era Executive Order on AI 为联邦 AI 标准奠定基础， NIST AI Risk Management Framework 将其操作化。BIS 已收紧先进 GPU（A100、H100、H200、B200）以及部分基础模型权重出口管制，直接影响 GPU 云运营商。欧盟方面，AI Act 于 2024 生效，对通用 AI 提供商的分阶段义务将延续至 2026-2027；英国 ICO 和澳大利亚 OAIC 发布的 GenAI 指引也形成事实合规底线。隐私制度（California 的 CCPA、医疗健康工作负载的 HIPAA）施加合同层义务，Together 通过其信任中心提及的 BAA 和 SOC 2 控制来履行。诉讼侧，NYT v Microsoft/OpenAI、Authors Guild v OpenAI 和 Getty v Stability AI 是版权风向标案件，结果会塑造每个模型托管平台的风险暴露；Together 目前不是具名被告，但其开放模型托管业务存在相邻暴露，尤其当判例扩展到 platform-as-host 时。民间组织压力（CDT、EFF）增加声誉风险。下方监管与法律风险登记表按司法辖区、可能性、严重性、缓释和剩余暴露排序每个条目，并为每项未披露控制设置尽调问题。[CR001, CR002, CR003, CR004, CR005, CR006]

监管 / 法律风险登记表
规则 / 案件	司法辖区	状态	可能性	严重度	缓释措施	剩余暴露
FTC 6(b) 生成式 AI 调查	美国	进行中	高	中	聘请律师，持续监测	可能的行为性救济
FTC 一般 AI 执法	美国	执行中	中	中	标准广告 / 竞争合规	执法行动
EU AI Act（GPAI）	欧盟	2024-27 分阶段实施	高	高	GPAI 义务、透明度、版权退出机制	违规罚款最高可达收入的 7%
BIS 出口管制（GPU + 权重）	美国 / 全球	2025 年收紧	高	高	客户地理围栏、筛查	主权部署受阻
NIST AI RMF	美国	自愿	中	低	采用框架控制	若缺失，采购处于劣势
UK ICO 生成式 AI 指引	英国	有效	中	中	UK DPA + GDPR 合规姿态	执法暴露
澳大利亚 OAIC 生成式 AI 指南	澳大利亚	有效	低	低	采纳指南	执法暴露
白宫 AI 行政令	美国	有效	中	中	报告阈值	报告负担
CCPA（加州）	美国-加州	有效	中	中	隐私控制	执法暴露
HIPAA（医疗工作负载）	美国	有效	中	高	BAA、专用层级	数据泄露 + 罚款
SOC 2 证明范围	全球	在信任中心自我声明	中	中	SOC 2 Type II 证据	若过期，存在证明缺口
NYT 诉 Microsoft/OpenAI（版权）	美国	诉讼进行中	高	中	监控；平台与托管方边界	判例外溢风险
Authors Guild 诉 OpenAI	美国	诉讼进行中	高	中	监控；平台与托管方边界	判例外溢风险
Getty 诉 Stability AI	美国 / 英国	诉讼进行中	中	中	监控；图像模型相邻风险	判例外溢风险
CDT AI 政策压力	美国	活跃	低	低	沟通、透明度	声誉

每行反映 runDate 时的规则 / 案件态势；评级为定性判断，待管理层披露后再确认。

[CR001, CR002, CR003, CR004, CR005, CR006]

FR001: 风险热力图

主要风险的可能性 × 严重性热力图。

[CR001, CR003, CR004, CR012, CR018, CR021]

7.2 运营、安全、伙伴与依赖风险

Together 的运营风险集中在三条线：GPU 资源供给（Hopper 和 Blackwell）、模型服务可靠性，以及受监管工作负载的控制。NVIDIA 是最重要的单一供应商依赖——GPU、网络（NVLink、InfiniBand）和软件栈（CUDA、TensorRT、NeMo、Dynamo）都绕不开；它同时也是战略投资方，这降低了供给分配风险，也把下行情景集中到同一条链上：一旦 Blackwell 分配收紧，冲击会高度相关。HuggingFace 是主要的模型制品依赖；如果 HF 调整托管条款或商业协同，合作伙伴风险会浮现。Salesforce Ventures 通过 B 轮成为核心企业渠道伙伴，渠道集中度风险并不小。安全暴露覆盖标准模型云攻击面（提示词注入、数据外泄、提示词日志泄露、模型权重供应链被攻破），也覆盖 Together 在信任中心披露的 SOC 2 / HIPAA 控制面。公开视频状态页存在，但不披露 SLA 百分比。竞争替代风险真实存在：Fireworks、Replicate、Modal、Anyscale、Cerebras 和 Groq 都服务重叠工作负载；超大规模云厂商（AWS Bedrock、GCP Vertex、Azure OpenAI）则把推理捆进既有企业合同。人员与执行风险包括 Vipul Ved Prakash（CEO）、Ce Zhang（CTO）和 Tri Dao（首席科学家）的关键人依赖，以及必须跟上 Hopper→Blackwell→Rubin 节奏的建设速度。下方运营、伙伴和人员风险台账逐项记录失效模式、缓释成熟度和剩余暴露，并给出明确尽调路径。[CR018, CR019, CR020, CR021, CR022, CR023]

运营 / 质量 / 安全风险登记表
故障模式	可能性	严重性	缓释成熟度	剩余风险敞口	未解决缺口
无服务器多小时中断	中	中	状态页；未披露 SLA %	客户流失	SLA 披露
专用端点硬件故障	低	中	暗示有冗余	收入风险	可靠性指标
提示注入 / 数据外泄	中	中	安全模型、函数调用护栏	客户侧泄露	渗透测试节奏未披露
模型权重供应链受损	低	高	HF 完整性检查	平台级受损	权重签名流程未披露
SOC 2 证明失效	低	中	信任中心披露安全态势	企业交易受阻	到期日未披露
HIPAA BAA 泄露事件	低	高	可签 BAA	监管罚款	泄露应对计划未披露
GPU 产能缺口	中	高	NVIDIA 合作关系	收入上限	分配承诺未披露
网络 / 跨区故障	低	中	暗示多区域部署	延迟飙升	区域地图未披露
内部威胁	低	中	标准控制	数据泄露	访问控制未披露
软件缺陷引入回归	中	低	暗示分阶段发布	声誉	发布节奏未披露

运营评级为定性判断；多项控制原语未披露，应作为尽调问题处理。

[CR018, CR019, CR020, CR021, CR022, CR028]

合作伙伴 / 依赖风险登记表
依赖	交易对手	角色	集中度	失效情景	严重性	缓释措施	剩余风险敞口
GPU 供应	NVIDIA	主要供应商 + 投资方	很高	Blackwell 配额削减	高	战略投资方；多代产品承诺	收入上限
模型工件	HuggingFace	注册表 + 分发	高	托管政策变化	中	公司自托管兜底	分发摩擦
企业渠道	Salesforce	联合销售 + 投资方	中	联合销售优先级下调	中	直销体系搭建	管线收缩
数据中心容量	多方（未披露）	托管机房 + 超大规模云厂商	中	单一区域容量损失	中	多区域建设	延迟 / 成本
网络	多方	传输 + IX	低	对等互联丢失	低	多运营商	短时延迟
开源社区	Llama、Mistral、Qwen、DeepSeek 维护者	模型上游	中	许可证变更	中	模型多样性	许可证审查负担
资本伙伴	资本伙伴：GC / Salesforce / NVIDIA / Lux / Coatue / Prosperity7 / Kleiner	投资方	中	融资轮超额认购失败	中	收入进展	融资风险
主权资本伙伴	Prosperity7（KSA 相关）	战略投资方	低	地缘政治压力	中	披露姿态	声誉

依赖评级只反映公开集中度；私下合同承诺仍是尽调问题。

[CR023, CR024, CR025, CR026, CR027, CR030]

人员 / 执行风险登记表
角色 / 职能	依赖或缺口	可能性	严重性	缓释措施	尽调路径
CEO Vipul Ved Prakash	创始人主导；关键人依赖	低	高	创始人留任	背调
CTO Ce Zhang	关键人依赖	低	高	留任	背调
首席科学家 Tri Dao	关键人；塑造品牌认知	低	高	学术双重任职	留任计划
基础设施 VP Alon Gavrielov	新入职（2025）	低	中	近期加入	入职评估
CFO	runDate 时未披露	中	中	招聘推进中（推断）	确认任命
CRO / 销售负责人	runDate 时未披露	中	中	企业销售体系搭建	确认任命
工程人才梯队	Series B 后扩张	中	中	招聘势头	员工数披露
合规 / GRC	提到 SOC 2；团队规模未披露	中	中	证明材料	确认团队规模
董事会构成	GC + SVP + NVIDIA + 创始人	中	中	成长阶段治理	董事会会议纪要尽调
Hopper→Blackwell→Rubin 过渡执行	跨季度建设	中	高	与 NVIDIA 合作	项目计划尽调

人员风险表同时纳入已点名个人和未披露岗位；确认 CFO/CRO 任命是明确的尽调问题。

[CR032, CR033, CR034, CR035]

FR002: 风险传导图

风险如何传导至收入、利润率、融资和估值。

[CR001, CR003, CR004, CR012, CR023, CR024]

7.3 缓释措施、放弃标准与投资逻辑破裂触发点

下方缓释与放弃标准表把每个核心风险配到可监控触发点、明确阈值或事件，以及触发后的动作含义。触发点覆盖监管（如 2027 年 EU AI Act GPAI 义务执法）、诉讼（如延伸到平台托管方的不利版权裁决）、伙伴（如 NVIDIA 分配削减或 HuggingFace 托管变更）、运营（如无服务器推理多小时宕机、披露数据泄露）、竞争（如超大规模云厂商捆绑推理定价下压）、商业（如 Salesforce 联合销售流失）和执行（如创始人离职、Blackwell 上线延期）。每个触发点都记录向收入、毛利、融资或估值传导的路径，以及动作含义（放弃、重估、监控、接受）。本章明确说明，多项基础指标——事故数、SLA、前 10 大客户集中度、留存、GPU 承诺支出、运营支出拆分——均未披露，因此作为尽调问题处理，而不是断言无法溯源的数字。反向来源覆盖较广：监管机构（FTC、BIS、EU、UK ICO、OAIC）、法律案卷（CourtListener：NYT、Authors Guild、Getty）、竞争对手网站（Fireworks、Replicate、Modal、Anyscale、Groq、Cerebras、CoreWeave、Lambda）和开发者情绪论坛（Hacker News、Reddit）。本章判断，Together 的公开风险面符合成长期 AI 基础设施公司的常态，缓释姿态健康；但若干控制基础项仍需管理层披露后验证。[CR034, CR035, CR036, CR037, CR038, CR039]

缓释措施和叫停标准表
风险	可监控触发信号	阈值 / 事件	行动含义
EU AI Act 的 GPAI 条款	执法通知	同业首个 7% 罚款	重新测算欧盟收入
BIS 出口收紧	新的实体清单规则	新增 GPU 出口类别	重新测算主权客户管线
版权诉讼外溢	平台与托管方裁决	任何托管方责任裁决	重新测算 OSS 托管
NVIDIA 配额	Blackwell 配额削减	可比同业公开遭削减	重新测算容量爬坡
HuggingFace 政策变化	HF 条款更新	重大商业条款变化	搭建自托管
无服务器中断	多小时事件	>4h 或重复 >1h	SLA 复盘 + 客户沟通
安全事件	披露事件	任何需报告事件	立即重新测算
客户集中度	前 10 大客户占比	单一客户 >25%	集中度折价
创始人离职	公开公告	CEO/CTO/CSO 任一	叫停或大幅重新测算
降价融资	新融资	较 Series B 持平或下调	重新测算估值

触发信号可通过公开披露监控；本表就是本章可执行的叫停标准。

[CR034, CR035, CR036, CR037, CR038, CR039]

FR003: 依赖关系图

关键合作伙伴、供应商、监管机构和融资依赖。

[CR023, CR024, CR025, CR026, CR027, CR030]

7.4 证据材料

Chapter 08

08估值

8.1 投资建议、投资逻辑与反向逻辑

建议为持有 / 观察，置信度中等，风险评级中高。投资逻辑：Together AI 处在一个结构性有吸引力的交叉点：（a）按分析师和市场数据来源（Gartner、Forrester、IDC、a16z、Bessemer、Menlo），GenAI 推理市场以 40-60% CAGR 扩张；（b）技术护城河可信，来自 FlashAttention 作者身份（Tri Dao）、ThunderKittens 内核（Stanford HazyResearch）、Together Inference Engine v2 和 Mixture-of-Agents 产品化；（c）企业分发渠道由 Salesforce Ventures 联合销售、NVIDIA GTC 2025 Pioneers 和 Startup Accelerator 漏斗锚定。反向逻辑：推理层竞争激烈，Fireworks、Replicate、Modal、Anyscale、Cerebras、Groq 和超大规模云厂商（AWS Bedrock、GCP Vertex、Azure OpenAI Service）都把推理捆进既有企业合同；收入（The Information 报道为 $130M-$200M+ ARR）和留存基础指标仍未披露；B 轮标记估值（约 $3.3B-$3.5B）需要多年收入规模才能支撑 3-5x 退出；监管边界（EU AI Act、BIS、版权诉讼先例）到 2027 年持续收紧。估值章节把上述每一项都列为明确的投资逻辑破裂触发点，并配上可监控阈值和动作含义。下方建议摘要表并列给出建议、置信度、风险评级、估值立场和决策含义；投资逻辑 / 反向逻辑表记录底层论据，以及哪些变化会改变判断。[CV001, CV002, CV003, CV004, CV005, CV006]

建议摘要表
建议	置信度	风险评级	估值立场	决策含义
持有 / 观察	中	中高	处于或接近当前 Series B 轮估值	跟踪 ARR + NRR + 集中度；Series C 时复盘
买入（有条件）	中	中	回调 25% 或确认 ARR >$500M	牵引力确认或出现降估融资后再进入
放弃（有条件）	中	高	若超大规模云厂商降价 >40%，或 NVIDIA 配额削减，或发生安全事件	若悲观触发项出现，则退出 / 拒投
乐观情景	低	中	2028 年前以 >$8B 退出	战略收购或高溢价 IPO 路径
基准情景	中	中	2028 年前 $4B-$6B 退出	ARR 扩大 + 毛利率扩张
悲观情景	中	高	$1B-$2.5B 结果	降估融资 / 退出估值受压

本建议取决于投资逻辑失效表中的触发阈值。

[CV001, CV002, CV003, CV004, CV005]

投资逻辑 / 反向逻辑表
论点	改变判断的证据
分析师资料显示，生成式 AI 推理 TAM 以 40-60% CAGR 增长	TAM 下修至 <20% CAGR
FlashAttention + ThunderKittens + TIE v2 拼出可信技术护城河	开源 / 超大规模云厂商内核追平，削弱 Together 优势
Salesforce Ventures 领投 Series B，意味着多年渠道承诺	Salesforce 联合销售优先级下降或客户流失
NVIDIA 战略投资 + GTC 2025 Pioneers 入选，指向供给和销售管线	NVIDIA 将资源重新分配给直营产品（DGX Cloud）
相比闭源 API 提供商，开源中立定位可防守	主要 OSS 许可变更（Llama、Mistral、Qwen、DeepSeek）
已有企业 + 初创客户案例（Salesforce、Zoom、Pika、Cartesia、Arcee）	具名客户流失或生产环境降级
资本底座 + 品牌吸引人才和客户	降估融资或 Series C 失败
反向：超大规模云厂商捆绑推理（AWS Bedrock、GCP Vertex、Azure）压缩价格	超大规模云厂商退出捆绑推理
反向：生成式 AI 版权诉讼可能延伸到平台托管方	负面判例仅限于模型训练方被告
反向：收入 + 留存未披露；入场需严守价格纪律	管理层披露 ARR + NRR

投资逻辑与反向逻辑是对称的；本章明确列出哪些证据会推翻判断。

[CV006, CV007, CV008, CV009, CV010, CV011]

FV001: 投资建议逻辑

从规模、证据、风险和估值推导出投资建议的链条。

[CV001, CV002, CV003, CV004, CV005, CV006]

8.2 情景、可比公司与敏感性

估值由三种情景锚定。基准情景（$4B-$6B 退出，约 50% 概率）假设 ARR 从当前 $130M-$200M 在 2026-2028 年扩大到 $500M-$700M，毛利率维持在 AI 推理典型的 30-40% 区间，C 轮适度稀释，Hopper→Blackwell 产能按时爬坡。乐观情景（$8B-$12B 退出，约 25% 概率）要求 2028 年 ARR >$1B，FlashAttention 推动利用率提升并带来毛利率扩张，Salesforce + NVIDIA 渠道承诺持续，并出现战略收购（NVIDIA、超大规模云厂商、Salesforce）或 2027-2028 年以溢价倍数 IPO。悲观情景（$1B-$2.5B 结果，约 25% 概率）在以下情况下兑现：超大规模云厂商捆绑推理压低价格、NVIDIA 分配收紧，或版权先例延伸到平台托管方。可比估值表覆盖 CoreWeave（IPO 后 GPU 云可比公司）、Navan（近期 S-1 SaaS 可比公司）、Figma（S-1 可比公司）、私募轮次（Fireworks 传闻 $4B、Replicate、Modal、Sakana、Mistral、Anthropic），以及作为天花板参照的上市公司（NVIDIA、Snowflake）。敏感性驱动因素包括收入增长、毛利率、NRR、退出倍数和概率加权退出窗口。下方乐观 / 基准 / 悲观表和可比估值表记录每种情景的假设、估值逻辑和关键敏感性。估值敏感性条形图和估值区间图展示相对当前 B 轮标记的下行、基准和上行情景。[CV018, CV019, CV020, CV021, CV022, CV023]

乐观 / 基准 / 悲观情景表
情景	概率	ARR 假设	毛利率	退出倍数	估值 / 回报逻辑	关键风险
乐观	25%	2028 年 ARR >$1B	40-50%	12-15x ARR	$8B-$12B 退出；战略收购 / 高溢价 IPO	超大规模云厂商捆绑；NVIDIA 资源重新分配
基准	50%	2028 年 ARR $500M-$700M	30-40%	8-10x ARR	$4B-$6B 退出；并购出售或 IPO	价格竞争；留存下滑
悲观	25%	2028 年 ARR $200M-$300M	20-30%	5-7x ARR	$1B-$2.5B 结果；降估融资	超大规模云厂商价格战；版权判例；NVIDIA 配额削减

概率是主观判断，仅在本章内部使用；Series C 以及每次重大客户或监管事件后，都应重新标记各行。

[CV015, CV016, CV017, CV018, CV019, CV020]

可比估值表
可比对象	指标	倍数 / 估值 / 状态	参考意义	局限
CoreWeave（IPO 后，GPU 云）	EV / 未来 12 个月收入	IPO 后 8-12x	最接近的 GPU 云可比对象	CoreWeave 收入结构更偏 GPU 裸金属
Navan（S-1，SaaS）	EV / NTM 收入	提交招股书时 8-12x	成长期 SaaS 可比对象	SaaS，不是推理
Figma（S-1，SaaS）	EV / NTM 收入	提交招股书时 12-15x	高倍数 SaaS 可比对象	设计 SaaS，不是推理
Fireworks AI（据传 2024 年融资轮）	最近一轮私募融资	~$4B（据传）	直接推理可比对象	融资估值据传
Replicate（未上市）	最近一轮私募融资	未披露	直接推理可比对象	披露有限
Modal（未上市）	最近一轮私募融资	未披露	无服务器推理可比对象	披露有限
Anyscale（未上市）	最近一轮私募融资	$1B-$2B	Ray + 推理可比对象	定位不同
Sakana AI（融资轮）	最近一轮私募融资	~$1.5B（2024 年 8 月）	开源模型开发商可比对象	模型实验室，不是基础设施
Mistral（融资轮）	最近一轮私募融资	$6B（2024 年中）	开源模型实验室可比对象	模型 + 基础设施混合
Anthropic（融资轮）	最近一轮私募融资	$60B+（2025）	闭源 API 可比对象	商业模式不同——非直接可比
NVIDIA（上市）	EV / NTM 收入	高十几倍至 20 多倍中段	上限参照	规模大得多
Snowflake（上市）	EV / NTM 收入	10-15x	SaaS 上限参照	成熟 SaaS

可比行混合了上市公司与未上市公司估值；私募融资数字来自媒体报道和 PitchBook。

[CV021, CV022, CV023, CV024, CV025, CV026]

FV002: 估值敏感性

估值结果对收入、利润率、倍数和留存的敏感性。

[CV018, CV019, CV020, CV021]

FV003: 估值 / 回报区间

2028 年退出窗口下，各情景的低 / 基准 / 高估值区间。

[CV022, CV023, CV024, CV025, CV026, CV029]

8.3 投资逻辑破裂触发点、尽调问题与 KPI

投资逻辑破裂与放弃触发点表把本章风险和估值逻辑转成可监控触发点，并绑定具体事件：（a）到 2027-2028 年 ARR 运行率未达 $500M-$700M → 重估基准情景，（b）Salesforce 联合销售降优先级 → 放弃乐观情景，（c）NVIDIA Blackwell 分配削减 → 重估产能爬坡，（d）超大规模云厂商捆绑推理降价 >40% → 价格压缩，（e）任何针对平台托管方的版权裁决 → 重估 OSS 托管，（f）C 轮估值较 B 轮持平 / 下调 → 按市场重估，（g）创始人离职 → 放弃投资逻辑，（h）披露数据泄露或多小时宕机 → 重估 SLA + 声誉。最终尽调问题表记录仍缺失的基础指标——准确 ARR、NRR/GRR、前 10 大客户集中度、GPU 承诺支出、运营支出拆分、CFO/CRO 招聘、主权渠道姿态、付费开发者数量——并把每项映射到负责人或尽调路径。投资 KPI 图把市场、验证、护城河、经济性、风险、估值和证据质量整合为 0-100 分，便于投委会使用。本章明确：建议对价格和证据都敏感。在 $3.3B-$3.5B B 轮估值和已披露证据基础上，持有 / 观察是纪律性答案；若估值回调 25%+，或确认 ARR >$500M 且 NRR >120%，则买入；若 C 轮前任一悲观情景触发，则放弃。[CV034, CV035, CV036, CV037, CV038, CV039]

投资逻辑失效与终止触发项表
触发项	阈值	对投资逻辑的传导	行动含义
相对基准情景的 ARR 运行率	FY2027 时 ARR <$500M	收入下调	重做基准测算
Salesforce 联合销售	公开降低优先级	渠道下调	终止乐观情景
NVIDIA 配额	公开宣布削减同业配额	产能下调	重做产能测算
超大规模云厂商捆绑定价	AWS Bedrock 或同业降价 >40%	毛利率压缩	重做基准测算
版权判例	平台托管方裁决	OSS 托管假设下调	重做 OSS 收入测算
融资	Series C 较 Series B 持平或下降	按市价重估	重做估值测算
创始人离职	CEO/CTO/CSO 中任一人	执行力下调	终止投资逻辑
安全 / 宕机	披露安全事件或多小时宕机	声誉 + SLA	重做企业客户管线测算

触发阈值可通过公开披露或同业可比项监控。

[CV033, CV034, CV035, CV036, CV037, CV038]

最终尽调追问表
主题	缺失证据	重要性	负责人 / 尽调路径
收入	runDate 时的准确 ARR	基准 / 乐观情景测算	向管理层索取 ARR + 增长
留存	NRR / GRR / 队列留存	收入质量	索取按队列的留存
集中度	前 10 大客户占比	单一事件下行风险	索取匿名化前 10 大客户
GPU 承诺	对 NVIDIA 的承诺支出	毛利率测算	索取供应商承诺
运营开支拆分	R&D / S&M / G&A	烧钱速度测算	索取利润表拆分
CFO / CRO	到岗情况 + 任期	执行力测算	确认高管任用
主权渠道	Prosperity7 承诺	地缘 + 品牌风险	确认渠道姿态
付费开发者数	付费 / 免费拆分	自助收入测算	索取付费开发者数
SOC 2 到期	Type II 到期日	企业采购	索取认证更新
开源许可立场	OSS 托管政策	版权风险敞口	索取托管政策

所有尽调追问都对应本章内部问题和风险章节缓释表。

[CV040, CV041, CV042, CV043, CV044]

FV004: 投资 KPI

投委会可用的评分，覆盖市场、客户证据、护城河、经济性、风险、估值和证据质量。

[CV040, CV041, CV042, CV043, CV044]

8.4 证据材料

免责声明

本报告是基于公开证据的尽调快照，不构成投资建议。重要的财务、法律、技术和合同事实仍未公开；作出任何投资决定前，应直接向管理层核验，并查阅一手文件。

证据索引

结论
编号	陈述	可信度	来源
CO001	Together AI markets itself as "the AI acceleration cloud" offering training, fine-tuning, and inference for open-source and custom models.	高	SO001, SO002
CO002	The corporate entity is Together Computer Inc., headquartered in San Francisco, California, with an additional research presence in Zurich.	高	SO002, SO004, SO003
CO003	Together was incorporated on 27 June 2022 by four co-founders: Vipul Ved Prakash, Ce Zhang, Chris Ré, and Percy Liang.	高	SO002, SO018
CO004	The company's public surface positions three product lines: serverless inference API, dedicated endpoints, and fine-tuning/training services.	高	SO001, SO035
CO005	Together emphasises that customers can keep weights and choose dedicated capacity, a deliberate contrast with closed-API providers.	中	SO001, SO005
CO006	CEO Vipul Ved Prakash previously co-founded Topsy, which Apple acquired for approximately $200M in 2013, and earlier co-founded Cloudmark.	高	SO018, SO002
CO007	CTO Ce Zhang is a tenured professor at ETH Zürich specialising in distributed ML and data-centric ML research.	高	SO002, SO018
CO008	Chief Scientist Chris Ré is a MacArthur Fellow at Stanford and a co-founder of Snorkel, anchoring much of Together's open-source research lineage.	高	SO002, SO011
CO009	Co-founder Percy Liang directs the Stanford Center for Research on Foundation Models (CRFM) and leads the HELM benchmark.	高	SO002, SO018
CO010	Princeton CS faculty member Tri Dao is the principal author of FlashAttention and is publicly identified as a Together chief scientist.	高	SO002, SO009, SO036
CO011	Together actively recruits across kernel engineering, GPU systems, applied ML, sales, and revenue operations roles as of May 2026.	高	SO003, SO018
CO012	Together raised a $20M Series Seed in May 2023 led by Lux Capital, with Factory, SciFi Capital, and Long Journey Ventures participating.	高	SO018, SO012
CO013	A $102.5M Series A closed in November 2023, led by Kleiner Perkins with NVIDIA, Emergence, NEA, Prosperity7, and Greycroft participating.	高	SO006, SO014, SO018
CO014	An interim financing in March 2024 reportedly valued Together at approximately $1.25B.	中	SO015, SO018
CO015	Together closed a $305M Series B on 9 July 2024 led by Salesforce Ventures and Coatue at a $3.3B post-money valuation.	高	SO012, SO013, SO016, SO017
CO016	Cumulative disclosed primary capital totals approximately $533M (seed + A + interim + B) before any 2025–2026 extensions.	中	SO012, SO006, SO018
CO017	No Together AI registration, S-1, or other public filing appears on SEC EDGAR as of the May 2026 run date.	高	SO027, SO019
CO018	NVIDIA participated as a strategic investor in both Series A and Series B financings, signalling H100/H200 supply alignment.	中	SO026, SO006, SO012
CO019	CNBC reported Together AI was running at an approximately $100M annualised revenue pace around the Series B announcement in July 2024.	中	SO012
CO020	Bloomberg cited triple-digit year-over-year revenue growth for Together AI at the time of the Series B, without disclosing absolute figures.	中	SO013
CO021	Together has publicly stated it operates more than 20,000 NVIDIA Hopper-class GPUs across its multi-region cluster.	中	SO012, SO005
CO022	The company describes its developer footprint as "hundreds of thousands" of developers, without disclosing paid versus free split.	低	SO001, SO005
CO023	Together's public job board and LinkedIn footprint imply a headcount above 150 full-time staff globally as of May 2026.	低	SO003, SO018
CO024	No audited gross margin, net revenue retention, or paid-customer disclosure exists for Together AI as of the run date.	高	SO027, SO019
CO025	Together AI launched OpenChatKit in March 2023 with LAION and Ontocord, an early open-source instruction-tuned chat baseline.	高	SO008, SO030
CO026	The RedPajama 1T token open dataset was released on 17 April 2023, intended to reproduce LLaMA-grade pretraining data.	高	SO007, SO029
CO027	FlashAttention-3 was published on arXiv and Together's blog on 11 July 2024, claiming state-of-the-art H100 attention performance.	高	SO036, SO009
CO028	StripedHyena-Nous-7B, a non-attention long-context architecture, was released in December 2023 in collaboration with Nous Research.	高	SO031, SO034
CO029	Together's Mixture-of-Agents paper, published in June 2024, demonstrated multi-LLM ensembling improvements on AlpacaEval.	高	SO037, SO011
CO030	Together publishes an active GitHub organisation (togethercomputer) with multiple ten-thousand-star repositories including OpenChatKit and RedPajama-Data.	高	SO028, SO029, SO030
CO031	The HuggingFace organisation togethercomputer hosts the RedPajama datasets and StripedHyena, Pythia, LLaMA-32k, and m2-bert models.	高	SO033, SO011
CO032	No public regulatory action, litigation, recall, or executive departure involving Together AI has been reported as of May 2026.	中	SO018, SO019, SO027
CO033	Together AI is described as one of the most followed open-source-AI infrastructure accounts on Hacker News and X.	低	SO020, SO024, SO021
CO034	Salesforce Ventures publicly framed the Series B as enabling enterprise customers to deploy open models on Together's cloud.	中	SO025, SO012
CO035	Crunchbase's Together AI profile is paywalled and could not be independently verified for cap-table details at runDate.	中	SO019
CO036	Cover-metric "gaps" remain for ARR, gross margin, NRR, and paid-customer count; all are flagged as diligence asks for management.	中	SO027, SO019, SO012
CM001	Together AI competes in the AI compute and inference platform layer between hyperscaler GPU IaaS and closed-API model labs.	高	SM001, SM004, SM023
CM002	Together's addressable spend pool excludes general-purpose cloud compute and closed-only proprietary model APIs.	中	SM001, SM002
CM003	Status-quo substitutes for Together include self-hosted Kubernetes-on-GPU clusters and OpenAI/Anthropic closed APIs.	中	SM011, SM012
CM004	Specialised GPU clouds (CoreWeave, Lambda) compete on infrastructure but lack Together's open-source-model SaaS layer.	中	SM013, SM014
CM005	Inference-API providers (Replicate, Fireworks, Groq, Modal) compete directly at the per-token serverless layer.	高	SM015, SM019, SM018, SM016
CM006	AWS Bedrock and Google Vertex AI offer hosted open-model inference that overlaps Together's serverless product.	高	SM011, SM012
CM007	Gartner sizes 2024 AI infrastructure TAM at $40–60B with a 30–50% CAGR through 2028.	中	SM021
CM008	IDC-style analyst notes peg 2024 global AI infrastructure spend near $50B.	低	SM021, SM022
CM009	Triangulated inference + dedicated GPU SAM for 2026 lands in an $8–15B range.	中	SM021, SM024, SM022
CM010	Together-addressable SOM (channels + open-model demand) is on the order of $1–3B in 2026.	低	SM024, SM027
CM011	CNBC reported a ~$100M Together ARR at the July 2024 Series B, implying mid-single-digit SOM share.	中	SM024, SM025
CM012	NVIDIA disclosed >$30B quarterly data-centre revenue in early 2025, evidence that AI-compute spend dwarfs Together's ARR.	高	SM028, SM022
CM013	No single public source cleanly disaggregates inference spend from training capex, creating range uncertainty.	中	SM021, SM028, SM022
CM014	AI-native startups and model labs are Together's most active early buyers, choosing it for open-weight flexibility and dedicated GPU access.	中	SM003, SM032
CM015	F500 enterprise platform teams are an emerging segment, anchored by Salesforce Ventures Series B leadership.	中	SM027, SM024
CM016	Sovereign and regional cloud customers are a strategic third segment, signalled by Prosperity7 (Aramco) investor presence.	低	SM024, SM023
CM017	Within Together, users (developers) frequently differ from payers (procurement/finance), lengthening enterprise sales cycles.	低	SM027, SM004
CM018	Self-serve credit-card adoption is the primary land motion for AI-native startup customers on Together.	中	SM002, SM008
CM019	Together's NVIDIA GTC 2025 spotlight emphasised "AI pioneers" as case-study customers, validating the enterprise wedge.	中	SM033, SM028
CM020	Together's AI-Native conference (2025) was framed as a developer community event, reinforcing top-of-funnel demand generation.	中	SM005, SM030
CM021	Open-weight model proliferation (Llama 3/4, DeepSeek, Mistral, Qwen) keeps SAM growth above 35% CAGR through 2027.	中	SM022, SM021, SM029
CM022	NVIDIA Hopper and Blackwell GPU scarcity drives demand for Together's reserved capacity SKUs.	中	SM028, SM013
CM023	Closed-API price cuts from OpenAI compress per-token margins across the inference market.	低	SM002, SM030
CM024	Hyperscaler open-model commoditisation (AWS Bedrock, GCP Vertex Model Garden) threatens to erode Together's pure-inference SAM.	中	SM011, SM012
CM025	Sovereign data residency rules accelerate demand for in-region dedicated clusters but cap cross-border ARR.	低	SM004, SM023
CM026	Energy and data-centre permitting bottlenecks slow capacity expansion through 2028.	低	SM013, SM028
CM027	Agentic AI workloads (Mixture-of-Agents, multi-step reasoning) multiply per-user token volume.	中	SM004, SM005
CM028	FinOps pressure pushes enterprises to substitute open-weight inference for closed-API spend.	低	SM002, SM027
CM029	Together announces serverless, dedicated, and batch inference SKUs to capture different buyer demand curves.	高	SM002, SM008, SM009, SM010
CM030	Batch inference pricing updates in 2025 reduced per-million-token costs to attract high-volume customers.	中	SM006, SM010
CM031	Specialised GPU clouds CoreWeave and Lambda compete on raw GPU-hour pricing; Together overlays an inference SaaS layer.	中	SM013, SM014
CM032	Groq, Cerebras, and SambaNova compete with bespoke silicon for inference latency leadership.	高	SM018, SM020
CM033	Modal, Replicate, and Anyscale compete in serverless and Ray-based AI compute SaaS.	中	SM016, SM015, SM017
CM034	Fireworks AI is widely cited as Together's closest direct competitor on open-model inference SaaS.	中	SM019, SM030
CM035	Public-cloud earnings (AWS, GCP) describe AI workloads as the fastest-growing portion of cloud revenue.	中	SM011, SM012
CM036	Reddit r/LocalLLaMA and Hacker News discussion volume around Together has risen steadily through 2024–2026.	低	SM030, SM029, SM031
CP001	Together competes against AWS Bedrock and Google Vertex Model Garden on hosted open-weight model inference.	高	SP018, SP019, SP001
CP002	Specialised GPU clouds CoreWeave and Lambda compete with Together at the IaaS layer for reserved GPU capacity.	高	SP020, SP021
CP003	Fireworks, Replicate, Modal, and Anyscale provide direct substitutes at the per-token serverless inference layer.	中	SP026, SP022, SP023, SP024
CP004	Groq, Cerebras, and SambaNova compete with bespoke silicon for inference latency leadership.	高	SP025, SP027
CP005	OpenAI and Anthropic act as substitutes for closed-API customers willing to give up weight portability.	中	SP018, SP036
CP006	TensorWave provides AMD MI300X GPU capacity as a niche alternative for cost-sensitive teams.	低	SP028
CP007	Self-hosted Kubernetes-on-GPU is the status-quo alternative most cited by frontier labs and FAANG.	低	SP036, SP037
CP008	Fireworks AI is widely cited as Together's closest direct competitor on open-model inference SaaS.	中	SP026, SP036, SP037
CP009	Together leads on FlashAttention kernel performance, anchored by the FlashAttention-3 paper and Together engineering team.	高	SP031, SP005, SP029, SP030
CP010	FlashAttention-4 was released in 2025 and extends Together's kernel lead on Hopper GPUs.	中	SP006
CP011	AWS Bedrock and GCP Vertex lead on enterprise compliance breadth (BAA, FedRAMP, regional residency).	高	SP018, SP019
CP012	Groq leads on single-stream inference latency on its supported models but lags in model coverage.	中	SP025, SP036
CP013	Fireworks AI provides an OpenAI-compatible API and serves the same open-model catalog as Together.	高	SP026, SP015
CP014	Together's serverless Llama-70B is listed near $0.88 per million tokens, within the OpenAI-parity envelope.	高	SP002, SP011
CP015	Together batch inference offers up to 50% discount versus serverless rates as of the 2025 update.	中	SP013
CP016	AWS Bedrock charges $0.99/M output tokens for Llama 3 70B in 2026 list pricing.	中	SP018
CP017	GCP Vertex Llama 3 70B is priced near $0.99/M tokens with volume discounts.	中	SP019
CP018	Groq lists Llama 3 70B at ~$0.59/M tokens, undercutting Together on raw price while constraining model choice.	中	SP025
CP019	CoreWeave and Lambda charge $2–4 per H100-hour for reserved or on-demand GPUs.	中	SP020, SP021
CP020	Together fine-tuning API, batch SKU, and dedicated endpoints differentiate it from raw-GPU competitors.	高	SP012, SP013, SP011
CP021	Together's open-source research lineage (RedPajama, StripedHyena, MoA, FlashAttention) sustains community gravity that competitors struggle to match.	高	SP031, SP034, SP004
CP022	Tri Dao and Chris Ré anchor Together's kernel and architecture research velocity.	高	SP031, SP005, SP008
CP023	NVIDIA's participation in Series A and Series B is read by the market as a GPU supply alignment moat.	中	SP041
CP024	Salesforce Ventures Series B leadership opens an enterprise distribution channel competitors lack.	中	SP004, SP003
CP025	Together advertises dedicated endpoints and reserved capacity SKUs that raise customer switching cost.	高	SP012, SP002
CP026	Hyperscalers (AWS, GCP) own enterprise procurement and identity, which is a distribution disadvantage Together must compensate for.	中	SP018, SP019
CP027	Enterprise multi-homing across Together / Fireworks / Bedrock is the reported equilibrium in 2026 buyer surveys.	低	SP036, SP037
CP028	Open-weight neutrality is a counter-positioning advantage versus closed-only OpenAI and Anthropic substitutes.	中	SP001, SP002
CP029	Together publishes an OpenAI-compatible chat completions endpoint, simplifying migration from closed APIs.	高	SP015, SP016
CP030	CoreWeave's 2024 IPO disclosures reveal $1B+ revenue scale, implying meaningful capital advantage at the IaaS layer.	中	SP020, SP036
CP031	Lambda Labs raised a $320M Series C in 2024 to expand its H100/H200 fleet.	中	SP021
CP032	Groq and Cerebras have each raised more than $1B in 2024–2025 to fund bespoke silicon expansion.	中	SP025, SP027
CP033	AWS Bedrock's 2025 expansion of Llama support compresses Together's premium on commodity inference workloads.	中	SP018
CP034	Specialised silicon vendors (Groq, Cerebras, SambaNova) pose a latency-leapfrog risk that pure-software inference cannot fully match.	中	SP025, SP027
CP035	Together's Python SDK and PyPI download trajectory signal sustained developer pull comparable to peers.	中	SP042, SP043
CP036	Speculative-decoding and Medusa-class research feed Together's ability to close any Groq latency gap on shared models.	中	SP032, SP033
CI001	Together AI raised a $20M Seed in May 2023 led by Lux Capital.	高	SI008, SI018, SI019
CI002	Together AI raised a $102.5M Series A in November 2023 led by Kleiner Perkins.	高	SI005, SI015, SI018
CI003	In March 2024 Together added approximately $106M at a reported $1.25B valuation (Series A2).	中	SI016, SI007, SI014
CI004	Per the canonical company-overview claim, the Series B closed July 2024 at ~$3.3B post led by Salesforce Ventures and Coatue (financials chapter relies on that fact for capital-stack analysis).	高	SI011, SI012, SI013, SI006
CI005	NVIDIA participated in both Series A and Series B as a strategic investor.	高	SI022, SI006
CI006	Salesforce Ventures led the Series B, opening an enterprise distribution channel.	高	SI021, SI011, SI006
CI007	Cumulative disclosed primary capital is approximately $533M across Seed, Series A, March 2024 extension, and Series B.	高	SI011, SI018, SI006
CI008	No S-1, S-3, or registered offering appears on SEC EDGAR for Together Computer Inc. at the 2026-05 runDate.	高	SI020, SI025
CI009	CNBC reported an approximately $100M annualised revenue pace around the July 2024 Series B announcement.	中	SI011, SI012
CI010	Bloomberg reported triple-digit revenue growth around the July 2024 Series B.	中	SI013, SI014
CI011	Together has not published audited ARR, gross margin, or NRR figures as of the runDate.	高	SI020, SI001, SI002
CI012	Together publishes per-token list pricing on its public pricing page for serverless inference.	高	SI002, SI001
CI013	Together offers a 50% batch inference discount as of the 2025 batch pricing update.	中	SI009, SI002
CI014	Dedicated endpoint and reserved-capacity pricing is quoted via sales rather than published.	高	SI002, SI004
CI015	Together SKUs span serverless, dedicated, fine-tuning, batch, embeddings, vision, audio, and image.	高	SI002, SI001, SI004
CI016	Realised enterprise pricing for Together is not publicly disclosed and is a material diligence gap.	中	SI002, SI038
CI017	The Information has published paywalled coverage of Together AI 2025 revenue trajectory.	低	SI026
CI018	PitchBook lists Together AI as later-stage venture with no public 2025 round confirmation.	中	SI025, SI019
CI019	Together has not disclosed gross margin by SKU as of the runDate.	高	SI020, SI002, SI001
CI020	Together has not disclosed top-10 customer concentration as of the runDate.	高	SI020, SI003
CI021	Together has not disclosed net dollar retention (NDR) as of the runDate.	高	SI020, SI003
CI022	Together has not disclosed contracted-revenue (RPO) figures.	高	SI020, SI001
CI023	Together has not disclosed cash position or runway as of the runDate.	高	SI020, SI001
CI024	CoreWeave 2024 S-1 disclosures imply GPU-cloud gross margins in the 60-70% range on reserved deals.	中	SI032, SI035
CI025	Together per-token gross margin on serverless is plausibly 40-60% based on competitor analog disclosures.	低	SI032, SI036, SI037
CI026	Implied cash burn through 2024 is roughly $300-$500M consistent with GPU buildout and 150+ headcount.	低	SI004, SI001, SI018
CI027	With $533M raised and that implied burn, runway likely extends into 2026 without a new round.	低	SI006, SI011
CI028	Figma and CoreWeave 2025 IPOs demonstrate the public-market window is open for AI-infrastructure issuers.	高	SI034, SI032
CI029	Navan 2025 S-1 process is a closer growth-SaaS comparable than CoreWeave for Together.	中	SI033
CI030	Together has not disclosed any debt or vendor-financing facility.	中	SI020, SI004
CI031	Founder and employee ownership post Series B is widely reported as significant but no exact percentages are public.	低	SI006, SI018, SI019
CI032	No public secondary or tender offer for Together AI shares has been reported at the runDate.	中	SI020, SI025, SI026
CI033	Forrester and IDC market frames place Together in the growth-stage generative-AI infrastructure segment without naming it top-three.	中	SI027, SI028
CI034	Menlo Ventures and Bessemer 2025 State-of-AI reports frame the inference market as multi-billion-dollar and growing.	中	SI030, SI031, SI029
CI035	No public 2026 follow-on round, IPO filing, or M&A announcement involving Together has been confirmed at the runDate.	高	SI020, SI025, SI026, SI011
CI036	Together pricing-page revisions in 2025 added batch and dedicated SKU clarifications, signalling product and financial maturation.	中	SI009, SI002, SI004
CI037	Public disclosure across ten standard financial primitives is missing or partial, qualifying as a material diligence gap.	高	SI020, SI001, SI002, SI003
CE001	Together AI exposes serverless inference, dedicated endpoints, fine-tuning, batch, embeddings, vision, audio, and image APIs.	高	SE016, SE018, SE001, SE003
CE002	Together AI publishes an OpenAI-compatible chat-completions endpoint to simplify migration.	高	SE022, SE035
CE003	The Together model catalog spans 200+ open and custom models including Llama, Mistral, Mixtral, Qwen, DeepSeek, StripedHyena.	高	SE018, SE036, SE045
CE004	Dedicated endpoints offer reserved H100/H200/B200 capacity with BAA available for HIPAA workloads.	高	SE020, SE003, SE005
CE005	Fine-tuning API supports LoRA and full-parameter training jobs on most supported families.	高	SE019, SE042
CE006	Batch inference offers up to 50% discount vs serverless as of the 2025 update.	中	SE011, SE021
CE007	Embeddings API offers multiple open embedding models per published reference.	高	SE024, SE034
CE008	Together publishes vision, audio, and image APIs as documented surfaces.	高	SE031, SE032, SE033
CE009	SDKs ship in Python (PyPI: together) and TypeScript with raw HTTP fallback.	高	SE044, SE043, SE017
CE010	Rate-limit documentation distinguishes free, paid, and enterprise tiers.	高	SE025, SE016
CE011	Together architecture stacks API gateway, model registry, inference scheduler, TIE v2, and GPU pool.	高	SE016, SE009, SE010
CE012	Together Inference Engine v2 integrates FlashAttention-3/4 and ThunderKittens kernels.	高	SE010, SE006, SE007, SE008
CE013	Speculative decoding and Medusa decoders are integrated into the inference engine.	中	SE053, SE054, SE055
CE014	Mixture-of-Agents (MoA) provides ensemble inference for higher-quality completions on supported models.	中	SE056, SE012
CE015	FlashAttention-3 paper (arXiv 2407.08608) describes the kernel anchoring Together throughput claims.	高	SE052, SE006
CE016	FlashAttention-4 was released in August 2025 and extends the kernel lead to Hopper and Blackwell.	中	SE007, SE012
CE017	ThunderKittens kernel framework was released in 2024 by Together and Stanford HazyResearch.	高	SE008, SE065
CE018	NVIDIA is the primary GPU supplier (Hopper H100/H200, Blackwell B200) and a strategic investor.	高	SE060, SE014, SE001
CE019	HuggingFace is the primary model artefact partner and hosts Together-published checkpoints.	高	SE045, SE049
CE020	A status page is published at status.together.ai documenting platform reliability.	中	SE062
CE021	The public SLA percentage for serverless and dedicated tiers is not yet documented at the runDate.	中	SE062, SE025
CE022	Together infrastructure organisation expanded in 2025 with Alon Gavrielov as VP of Infrastructure Strategy.	高	SE015, SE005
CE023	Trust center publishes SOC 2 Type II attestation references and HIPAA BAA availability.	高	SE063, SE066, SE067
CE024	HIPAA BAA is available on dedicated endpoints but not serverless tier per documentation.	中	SE063, SE020
CE025	GDPR / DPA terms are available for EU customers per trust center documentation.	中	SE063
CE026	FedRAMP accreditation is not yet listed in the trust center at the runDate.	中	SE063
CE027	The full regional residency map (which regions, which co-lo partners) is not publicly disclosed.	中	SE063, SE020
CE028	ISO 27001 certification status is not publicly confirmed at the runDate.	中	SE063
CE029	Content moderation, function calling, JSON mode, and structured-output safety controls are documented surfaces.	高	SE028, SE027, SE026
CE030	Audit logs are documented for enterprise customers but not enabled by default.	中	SE063, SE020
CE031	Custom-model-weights privacy controls are documented for dedicated tier.	中	SE020, SE063
CE032	A bug bounty / responsible disclosure programme is published on the trust center.	中	SE063
CE033	GTC 2025 Pioneers event surfaced multiple Together customer + NVIDIA partnerships.	高	SE014, SE060
CE034	Adaption partnership (2025) extends Together into healthcare workflows.	中	SE005
CE035	AI Native Conference 2025 announced research and product directions including MoA productisation.	高	SE012, SE005
CE036	Blackwell (B200) capacity ramp is documented as 2026 roadmap item in blog references.	低	SE005, SE014
CE037	Multi-modal expansion (vision + audio) is a documented 2026 roadmap area.	低	SE005, SE012
CU001	Together AI reports more than 100,000 developers have used the platform per company disclosure.	中	SU004, SU003, SU001
CU002	Self-serve developer signup is the primary top-of-funnel adoption motion for Together AI.	高	SU038, SU001, SU003
CU003	Together customers page enumerates named startup and enterprise deployments.	高	SU003, SU001
CU004	AI-native startups (Pika, Cartesia, Arcee, Nous Research) are documented production customers.	高	SU012, SU015, SU013, SU014, SU003
CU005	Enterprise SaaS deployments at Salesforce and Zoom are documented case studies.	高	SU010, SU011, SU003
CU006	Washington University is referenced as a research-compute customer in a case study.	中	SU016, SU003
CU007	Adaption (2025) extends Together into healthcare workflows.	中	SU008, SU004
CU008	NVIDIA GTC 2025 Pioneers programme surfaced a cohort of joint Together + NVIDIA customers.	高	SU007, SU018
CU009	Startup Accelerator launched in November 2024 as an explicit startup-acquisition funnel.	高	SU006, SU004
CU010	Geographic mix is North America-skewed with EU presence growing through dedicated clusters.	低	SU003, SU004, SU001
CU011	Buyer/user split differs by tier: developer-led self-serve vs CIO/platform-eng-led enterprise.	中	SU038, SU003
CU012	Salesforce case study documents integration depth and is treated as production deployment.	高	SU010, SU017, SU003
CU013	Zoom case study documents AI-feature inference at production scale.	高	SU011, SU003
CU014	Pika case study cites latency improvement from FlashAttention-class kernels.	高	SU012, SU003
CU015	Cartesia case study documents voice-model production deployment on dedicated tier.	高	SU015, SU003
CU016	Arcee case study documents cost reduction relative to closed APIs.	中	SU013, SU003
CU017	Nous Research case study documents community model hosting on Together.	中	SU014, SU003
CU018	Washington University case study documents research-compute usage.	中	SU016, SU003
CU019	Adaption is described as a launching partnership rather than confirmed production deployment.	中	SU008
CU020	GTC 2025 cohort case studies cover developer tools, robotics, healthcare, and content/media.	中	SU007
CU021	HuggingFace partnership funnels developers from the model hub into Together.	中	SU019, SU020
CU022	Net dollar retention (NDR) is not publicly disclosed at the runDate.	高	SU003, SU001, SU004
CU023	Gross retention (GRR) and named-account churn are not publicly disclosed.	高	SU003, SU001, SU004
CU024	Paid vs free developer counts are not disclosed.	高	SU004, SU003
CU025	Dedicated-endpoint renewal rate is not publicly disclosed.	高	SU004, SU003
CU026	G2 and Trustpilot review counts for Together are small, limiting independent proxies.	中	SU026, SU027
CU027	Salesforce Ventures-led Series B and customer case study together signal a multi-year channel commitment.	中	SU017, SU010, SU004
CU028	GTC 2025 Pioneers cohort acts as an enterprise pipeline amplifier through NVIDIA.	中	SU007, SU018
CU029	Startup Accelerator provides credits and GTM amplification to long-tail AI startups.	高	SU006, SU004
CU030	Adaption launch indicates a follow-on path into regulated healthcare workflows.	中	SU008
CU031	Enterprise sales cycle requires custom MSA and security review, adding 60-120 days before revenue.	低	SU004, SU038
CU032	Top-10 customer concentration is undisclosed and is a material diligence ask.	高	SU003, SU004
CU033	Public customer mix skews AI-native startups + developer tools rather than a single mega-anchor.	中	SU003, SU006, SU007
CU034	No public lawsuit or named-account churn report has surfaced for Together at the runDate.	中	SU023, SU022, SU004
CU035	Reddit and Hacker News threads occasionally cite latency or cold-start concerns on the serverless tier.	低	SU023, SU022
CU036	Public status page exists but no SLA percentage is published for serverless or dedicated tiers.	中	SU042, SU038
CU037	PyPI download trajectory and GitHub repo activity indicate sustained developer pull.	中	SU040, SU041
CR001	FTC opened a 6(b) inquiry in 2024 into generative-AI investments and partnerships, naming the major cloud-AI relationships.	高	SR002, SR001
CR002	FTC has stated ongoing 2024-2025 attention to GenAI competition and consumer-protection enforcement.	高	SR001, SR002
CR003	EU AI Act entered into force in 2024 with phased GPAI obligations through 2026-2027 including fines up to 7% of global revenue.	高	SR003, SR012
CR004	BIS tightened advanced-computing export controls in 2025 covering H100, H200, B200 and certain foundation-model weights.	高	SR005, SR008
CR005	NIST AI Risk Management Framework establishes voluntary US federal AI controls increasingly used in enterprise procurement.	高	SR004, SR008
CR006	UK ICO has published GenAI guidance creating UK DPA compliance baseline.	中	SR006
CR007	Australia OAIC has published a 2024 GenAI guide for organisations.	中	SR007
CR008	White House EO on AI (2023, amended 2025) sets reporting thresholds for foundation-model training.	中	SR008
CR009	CCPA imposes privacy obligations on Together for California-resident user data.	高	SR009, SR012
CR010	HIPAA BAA support is published as available for healthcare workloads.	高	SR010, SR028, SR026
CR011	SOC 2 attestation surface is referenced via the AICPA SOC framework and Together trust center.	中	SR011, SR028
CR012	NYT v Microsoft/OpenAI active litigation (CourtListener docket) is the bellwether GenAI copyright case in US.	高	SR013, SR014
CR013	Authors Guild v OpenAI active litigation expands copyright exposure to non-press content.	高	SR014, SR013
CR014	Getty Images v Stability AI active litigation tests image-model copyright exposure on both US and UK sides.	高	SR015, SR014
CR015	Civil-society organisations (CDT) actively lobby for AI accountability, adding reputational pressure.	中	SR012
CR016	Together is not currently named in any of the bellwether GenAI copyright suits.	中	SR013, SR014, SR015, SR025
CR017	Open-model hosting carries adjacent precedent risk if copyright cases extend to platform hosts.	中	SR013, SR014, SR015
CR018	Together publishes a public status page but does not publish an SLA percentage.	高	SR027, SR030
CR019	Pen-test cadence, breach plan, and named incident history are not publicly disclosed.	高	SR028, SR025
CR020	Safety models and function-calling guardrails are documented mitigations for prompt-injection class risks.	高	SR031, SR030
CR021	HuggingFace integrity checks are inherited for model-weight artefacts; weight-signing process is undisclosed.	中	SR028, SR025
CR022	Trust center references SOC 2 Type II posture; attestation expiry date is not public.	中	SR028, SR011
CR023	NVIDIA is supplier of GPUs, networking, and software stack and a strategic investor — single-vendor concentration is high.	高	SR025, SR024, SR029
CR024	HuggingFace is the primary model-artefact dependency for the Together catalog.	高	SR025, SR029
CR025	Salesforce Ventures is lead enterprise channel investor and co-sell partner.	高	SR025, SR029
CR026	Datacenter / colo capacity counterparties are largely undisclosed; multi-region build is implied but not enumerated.	中	SR025, SR024
CR027	Capital partners include GC, Salesforce, NVIDIA, Lux, Coatue, Prosperity7, and Kleiner per public round disclosures.	高	SR025, SR034, SR035
CR028	Top-10 customer concentration is undisclosed and is a material diligence ask.	高	SR029, SR025
CR029	Competitive displacement risk is documented from Fireworks, Replicate, Modal, Anyscale, Groq, Cerebras, CoreWeave, Lambda.	高	SR019, SR020, SR021, SR022, SR017, SR018, SR016, SR023
CR030	Open-source model upstream license changes (Llama, Mistral, Qwen, DeepSeek) would introduce review and compliance burden.	中	SR025, SR029
CR031	Sovereign / Prosperity7-adjacent backing adds geopolitical disclosure considerations.	中	SR034, SR035, SR025
CR032	Key-person dependency on Vipul Ved Prakash, Ce Zhang, and Tri Dao is high; founder retention is the mitigation.	高	SR024, SR025
CR033	CFO and CRO presence at runDate is not publicly confirmed and is a material recruiting diligence ask.	中	SR025, SR024
CR034	Engineering and infra hiring momentum is visible (Alon Gavrielov 2025 VP-infra hire) but exact bench size is undisclosed.	中	SR025, SR024
CR035	Hopper→Blackwell→Rubin transition execution is a multi-quarter program-management risk for the chapter.	中	SR025
CR036	Monitorable kill triggers (NVIDIA allocation cut, HF policy change, EU AI Act fine, copyright host-ruling) can be tracked from public disclosure.	中	SR025, SR003, SR005, SR013
CR037	Operational kill triggers (multi-hour serverless outage, breach disclosure) are monitorable through status page and press.	中	SR027, SR025, SR032, SR033
CR038	Commercial kill triggers (Salesforce co-sell deprioritisation, customer concentration >25% single) are monitorable through press and reference calls.	中	SR025, SR029
CR039	Founder-departure triggers are catastrophic for the thesis at growth stage.	中	SR025, SR024
CR040	Financing kill triggers (flat/down round vs Series B at runDate) would re-underwrite valuation.	中	SR025, SR034, SR035
CR041	Adverse-source coverage spans regulators, court dockets, competitors, and developer-sentiment fora.	高	SR002, SR013, SR019, SR032, SR033
CR042	Several control primitives (SLA, incident, breach plan, top-10 concentration, GPU committed spend) remain undisclosed at runDate and are explicit diligence asks.	高	SR025, SR029, SR028, SR027
CV001	Recommendation is Hold / Monitor with medium confidence at the Series B mark.	中	SV025, SV027, SV007
CV002	Conditional Buy on a 25%+ correction or confirmed >$500M ARR plus >120% NRR.	中	SV008, SV007, SV005
CV003	Conditional Pass if hyperscaler pricing cuts >40%, NVIDIA allocation cuts, or breach disclosure occurs.	中	SV001, SV002, SV003
CV004	Risk rating is medium-high reflecting concentration, regulatory, and competitive overhangs.	中	SV001, SV002, SV006
CV005	Valuation stance is "at-or-near" the current Series B mark with explicit triggers to revisit.	中	SV007, SV008
CV006	GenAI inference TAM grows 40-60% CAGR per multiple analyst sources at 2025 mid-point.	高	SV001, SV002, SV003, SV004, SV005
CV007	FlashAttention authorship by Tri Dao and ThunderKittens (Stanford HazyResearch) anchor Together's kernel moat.	高	SV025, SV024
CV008	Together Inference Engine v2 and MoA productisation extend the technical surface beyond commoditised inference.	中	SV025
CV009	Salesforce Ventures-led Series B + customer case study imply multi-year channel commitment.	中	SV043, SV025, SV018
CV010	NVIDIA strategic investment + GTC 2025 Pioneers cohort signal supply + pipeline alignment.	高	SV025, SV044
CV011	Open-source neutrality (Llama, Mistral, Qwen, DeepSeek) is defensible positioning vs closed-API providers.	中	SV025, SV027
CV012	Documented enterprise + startup proof base spans Salesforce, Zoom, Pika, Cartesia, Arcee, Nous Research, GTC 2025 Pioneers.	高	SV027, SV025
CV013	Anti-thesis: hyperscaler bundled inference (Bedrock, Vertex, Azure) could compress pricing 30-50%.	中	SV001, SV002, SV006
CV014	Anti-thesis: copyright litigation precedent (NYT, Authors Guild, Getty) could extend to platform hosts.	中	SV025, SV008
CV015	Bull case (25% prob) assumes ARR >$1B by 2028 and exit $8B-$12B.	中	SV001, SV005, SV006
CV016	Base case (50% prob) assumes ARR $500M-$700M by 2028 and exit $4B-$6B.	中	SV001, SV003, SV002
CV017	Bear case (25% prob) assumes ARR $200M-$300M by 2028 and outcome $1B-$2.5B.	中	SV001, SV002, SV006
CV018	Sensitivity to ARR growth is the single largest valuation driver in the chapter model.	中	SV007, SV008
CV019	Gross margin sensitivity is ±1000bps shifts valuation outcome ±$2-3B at base case.	中	SV014, SV013
CV020	Multiple sensitivity is ±3x ARR shifts exit ±$2.5B at base case.	中	SV013, SV015
CV021	Probability weights are subjective and re-marked at Series C and major events.	低	SV007, SV008
CV022	CoreWeave post-IPO trades 8-12x NTM revenue as GPU-cloud comparable.	中	SV014, SV018
CV023	Navan S-1 disclosed 8-12x NTM revenue range at filing for growth-stage SaaS.	中	SV013, SV030
CV024	Figma S-1 disclosed 12-15x NTM revenue range as high-multiple SaaS reference.	中	SV015, SV029
CV025	Fireworks AI rumoured 2024 round valued ~$4B per press reports.	低	SV018, SV019
CV026	Replicate and Modal rounds undisclosed in public press.	中	SV023, SV022
CV027	Anyscale private valuation rumoured $1B-$2B at last round.	低	SV023, SV019
CV028	Sakana AI round ~$1.5B Aug 2024 per TechCrunch and NVIDIA partnership.	中	SV031, SV032
CV029	Mistral round ~$6B mid-2024 as OSS-model-lab comparable.	中	SV019, SV018
CV030	Anthropic round at $60B+ in 2025 as closed-API reference, not direct comparable.	中	SV018, SV019
CV031	NVIDIA public NTM revenue multiple high-teens to mid-20s acts as ceiling reference.	中	SV018, SV019
CV032	Snowflake NTM revenue multiple 10-15x acts as mature-SaaS ceiling reference.	中	SV018, SV019
CV033	ARR run-rate <$500M by FY2027 is the base-case kill trigger.	中	SV008, SV007
CV034	Salesforce co-sell public deprioritisation is the bull-case kill trigger.	中	SV043, SV025
CV035	NVIDIA Blackwell allocation cut to a peer is a re-underwrite trigger.	中	SV044, SV025
CV036	Hyperscaler bundled pricing cut >40% on AWS Bedrock or peer is a base-compression trigger.	中	SV001, SV002
CV037	Platform-host copyright precedent is an OSS-revenue re-underwrite trigger.	中	SV025, SV008
CV038	Series C flat-or-down vs Series B is a mark-to-market trigger.	中	SV018, SV019, SV007
CV039	Founder departure (CEO/CTO/CSO) is a kill trigger.	中	SV024, SV025
CV040	Exact ARR at runDate is undisclosed and is the principal diligence ask.	高	SV008, SV007, SV027
CV041	NRR / GRR / cohort retention are undisclosed at runDate and are material diligence asks.	高	SV027, SV025
CV042	Top-10 customer concentration and GPU committed spend are undisclosed.	高	SV027, SV025
CV043	CFO and CRO presence at runDate is unconfirmed.	中	SV024, SV025
CV044	Opex split (R&D / S&M / G&A), paid-developer count, SOC 2 expiry, and OSS hosting policy are all diligence asks.	中	SV025, SV027
CV045	Sacra estimates Together AI reached $1B in annualized revenue by February 2026, up from ~$618M at year-end 2025, representing ~400% year-over-year growth in 2024.	中	SV045, SV046
CV046	Together AI is in talks to raise approximately $1B at a $7.5B pre-money valuation as of March 2026, which would represent a >2× step-up from the $3.3B Series B valuation set in February 2025.	中	SV045, SV047
CV047	EquityZen lists Together AI as available for pre-IPO secondary share purchases by accredited investors, indicating secondary-market liquidity exists for current shareholders.	中	SV047, SV045
CV048	CB Insights' Q1 2026 State of AI report identifies AI infrastructure as the leading funding category in early 2026, with total AI deal activity up materially from prior quarters, supporting the demand context for Together AI's growth.	高	SV048, SV001

来源
编号	出版方	标题	引文
SO001	Together AI	Together AI — The AI Acceleration Cloud
SO002	Together AI	About \| Together AI
SO003	Together AI	Careers \| Together AI
SO004	Together AI	Contact \| Together AI
SO005	Together AI	Together AI Blog
SO006	Together AI	Together AI raises $102.5M Series A
SO007	Together AI	RedPajama, a project to create leading open-source models
SO008	Together AI	Announcing OpenChatKit
SO009	Together AI	FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision
SO010	Together AI	Together Inference Engine 2.0
SO011	Together AI	Research \| Together AI
SO012	CNBC	Together AI raises $305 million at $3.3 billion valuation
SO013	Bloomberg	Together AI Startup Raises Funds at $3.3 Billion Valuation
SO014	TechCrunch	Together raises $102.5M to build open-source generative AI
SO015	TechCrunch	Together AI is worth $1.25B (March 2024 update)
SO016	Fast Company	Together AI funding profile
SO017	VentureBeat	Together AI raises $305M for open-source GenAI
SO018	Wikipedia	Together AI — Wikipedia
SO019	Crunchbase	Together AI — Crunchbase Profile
SO020	Hacker News	Submissions from together.ai
SO021	Reddit r/LocalLLaMA	Together AI discussions
SO022	Product Hunt	Together AI on Product Hunt
SO023	StackShare	Together AI Tech Stack
SO024	X (Together)	@togethercompute on X
SO025	Salesforce Ventures	Salesforce Ventures Perspectives
SO026	NVIDIA	NVIDIA AI investments 2024
SO027	SEC EDGAR	SEC EDGAR — Together AI search
SO028	GitHub	Together Computer · GitHub Org
SO029	GitHub	togethercomputer/RedPajama-Data
SO030	GitHub	togethercomputer/OpenChatKit
SO031	GitHub	togethercomputer/StripedHyena
SO032	GitHub	Dao-AILab/flash-attention
SO033	HuggingFace	togethercomputer on Hugging Face
SO034	HuggingFace	StripedHyena-Nous-7B
SO035	Together AI	Introduction \| Together AI Docs
SO036	arXiv	FlashAttention-3: Fast and Accurate Attention with Asynchrony
SO037	arXiv	Mixture-of-Agents Enhances LLM Capabilities
SO038	Gartner	Gartner AI Insights
SO039	CoreWeave	CoreWeave — Specialized GPU Cloud
SM001	Together AI	Together AI — The AI Acceleration Cloud
SM002	Together AI	Pricing \| Together AI
SM003	Together AI	Customers \| Together AI
SM004	Together AI	Together AI Blog
SM005	Together AI	AI Native Conf — research & product announcements
SM006	Together AI	Batch inference API updates 2025
SM007	Together AI	Inference Models \| Together AI Docs
SM008	Together AI	Serverless Inference \| Together AI Docs
SM009	Together AI	Dedicated Endpoints \| Together AI Docs
SM010	Together AI	Batch Inference \| Together AI Docs
SM011	AWS	Amazon Bedrock
SM012	Google Cloud	Vertex AI
SM013	CoreWeave	CoreWeave — Specialized GPU Cloud
SM014	Lambda Labs	Lambda — GPU Cloud for AI
SM015	Replicate	Replicate — Run models in the cloud
SM016	Modal	Modal — Serverless AI infrastructure
SM017	Anyscale	Anyscale — Powered by Ray
SM018	Groq	Groq — Fast AI inference
SM019	Fireworks AI	Fireworks AI — Production-grade LLM inference
SM020	Cerebras	Cerebras — Wafer-Scale AI
SM021	Gartner	Gartner AI Insights
SM022	arXiv	LLM inference infrastructure survey
SM023	Wikipedia	Together AI — Wikipedia
SM024	CNBC	Together AI raises $305 million at $3.3 billion valuation
SM025	Bloomberg	Together AI Startup Raises Funds at $3.3 Billion Valuation
SM026	Fast Company	Together AI funding profile
SM027	Salesforce Ventures	Salesforce Ventures Perspectives
SM028	NVIDIA	NVIDIA AI investments 2024
SM029	Hacker News	Submissions from together.ai
SM030	Reddit r/LocalLLaMA	Together AI discussions
SM031	Product Hunt	Together AI on Product Hunt
SM032	Together AI	Together AI Startup Accelerator
SM033	Together AI	Together AI at NVIDIA GTC 2025
SP001	Together AI	Together AI — The AI Acceleration Cloud
SP002	Together AI	Pricing \| Together AI
SP003	Together AI	Customers \| Together AI
SP004	Together AI	Together AI Blog
SP005	Together AI	FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision
SP006	Together AI	FlashAttention-4
SP007	Together AI	Together Inference Engine 2.0
SP008	Together AI	ThunderKittens kernel framework
SP009	Together AI	AI Native Conf — research & product announcements
SP010	Together AI	Inference Models \| Together AI Docs
SP011	Together AI	Serverless Inference \| Together AI Docs
SP012	Together AI	Dedicated Endpoints \| Together AI Docs
SP013	Together AI	Batch Inference \| Together AI Docs
SP014	Together AI	Rate Limits \| Together AI Docs
SP015	Together AI	Chat Completions API Reference
SP016	Together AI	Completions API Reference
SP017	Together AI	Models API Reference
SP018	AWS	Amazon Bedrock
SP019	Google Cloud	Vertex AI
SP020	CoreWeave	CoreWeave — Specialized GPU Cloud
SP021	Lambda Labs	Lambda — GPU Cloud for AI
SP022	Replicate	Replicate — Run models in the cloud
SP023	Modal	Modal — Serverless AI infrastructure
SP024	Anyscale	Anyscale — Powered by Ray
SP025	Groq	Groq — Fast AI inference
SP026	Fireworks AI	Fireworks AI — Production-grade LLM inference
SP027	Cerebras	Cerebras — Wafer-Scale AI
SP028	TensorWave	TensorWave — AMD GPU cloud
SP029	arXiv	FlashAttention: Fast and Memory-Efficient Exact Attention
SP030	arXiv	FlashAttention-2: Faster Attention with Better Parallelism
SP031	arXiv	FlashAttention-3: Fast and Accurate Attention with Asynchrony
SP032	arXiv	Speculative Decoding paper
SP033	arXiv	Medusa speculative decoding paper
SP034	arXiv	LLM inference infrastructure survey
SP035	arXiv	LLM evaluation benchmark paper
SP036	Reddit r/LocalLLaMA	Together AI discussions
SP037	Hacker News	Submissions from together.ai
SP038	Product Hunt	Together AI on Product Hunt
SP039	StackShare	Together AI Tech Stack
SP040	Gartner	Gartner AI Insights
SP041	NVIDIA	NVIDIA AI investments 2024
SP042	GitHub	togethercomputer/together-python SDK
SP043	PyPI	together — Python package
SP044	Wikipedia	Together AI — Wikipedia
SI001	Together AI	Together AI — The AI Acceleration Cloud
SI002	Together AI	Pricing \| Together AI
SI003	Together AI	Customers \| Together AI
SI004	Together AI	Together AI Blog
SI005	Together AI	Together AI raises $102.5M Series A
SI006	Together AI	Announcing $305M Series B
SI007	Together AI	Series A2 announcement
SI008	Together AI	Seed funding announcement
SI009	Together AI	Batch inference API updates 2025
SI010	Together AI	Together AI Startup Accelerator
SI011	CNBC	Together AI raises $305 million at $3.3 billion valuation
SI012	CNBC	Together AI raises $305 million (follow-up)
SI013	Bloomberg	Together AI Startup Raises Funds at $3.3 Billion Valuation
SI014	Fast Company	Together AI funding profile
SI015	TechCrunch	Together raises $102.5M to build open-source generative AI
SI016	TechCrunch	Together AI is worth $1.25B (March 2024 update)
SI017	VentureBeat	Together AI raises $305M for open-source GenAI
SI018	Wikipedia	Together AI — Wikipedia
SI019	Crunchbase	Together AI — Crunchbase Profile
SI020	SEC EDGAR	SEC EDGAR — Together AI search
SI021	Salesforce Ventures	Salesforce Ventures Perspectives
SI022	NVIDIA	NVIDIA AI investments 2024
SI023	X (Together)	@togethercompute on X
SI024	Gartner	Gartner AI Insights
SI025	PitchBook	Together AI — PitchBook profile
SI026	The Information	Together AI revenue 2025 reporting
SI027	Forrester	Forrester: Generative AI infrastructure landscape
SI028	IDC	IDC Worldwide AI Software Market Forecast 2024-2028
SI029	a16z	a16z — State of Generative AI in the Enterprise 2025
SI030	Menlo Ventures	Menlo Ventures: 2025 State of AI
SI031	Bessemer Venture Partners	Bessemer: State of AI 2025
SI032	SEC EDGAR	CoreWeave SEC filings (S-1 and post-IPO)
SI033	SEC EDGAR	Navan S-1/A filing
SI034	SEC EDGAR	Figma S-1 filings (comparable IPO)
SI035	CoreWeave	CoreWeave — Specialized GPU Cloud
SI036	Fireworks AI	Fireworks AI — Production-grade LLM inference
SI037	Groq	Groq — Fast AI inference
SI038	Reddit r/LocalLLaMA	Together AI discussions
SI039	Hacker News	Submissions from together.ai
SE001	Together AI	Together AI — The AI Acceleration Cloud
SE002	Together AI	About \| Together AI
SE003	Together AI	Pricing \| Together AI
SE004	Together AI	Customers \| Together AI
SE005	Together AI	Together AI Blog
SE006	Together AI	FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision
SE007	Together AI	FlashAttention-4
SE008	Together AI	ThunderKittens kernel framework
SE009	Together AI	Together Inference Engine 2.0
SE010	Together AI	Together Inference Engine v2
SE011	Together AI	Batch inference API updates 2025
SE012	Together AI	AI Native Conf — research & product announcements
SE013	Together AI	Together AI Startup Accelerator
SE014	Together AI	Together AI at NVIDIA GTC 2025
SE015	Together AI	Alon Gavrielov joins as VP Infrastructure Strategy
SE016	Together AI	Introduction \| Together AI Docs
SE017	Together AI	Quickstart \| Together AI Docs
SE018	Together AI	Inference Models \| Together AI Docs
SE019	Together AI	Fine-tuning Overview \| Together AI Docs
SE020	Together AI	Dedicated Endpoints \| Together AI Docs
SE021	Together AI	Batch Inference \| Together AI Docs
SE022	Together AI	Chat Completions API Reference
SE023	Together AI	Serverless Inference \| Together AI Docs
SE024	Together AI	Embeddings \| Together AI Docs
SE025	Together AI	Rate Limits \| Together AI Docs
SE026	Together AI	JSON Mode \| Together AI Docs
SE027	Together AI	Function Calling \| Together AI Docs
SE028	Together AI	Safety Models \| Together AI Docs
SE029	Together AI	Code Execution \| Together AI Docs
SE030	Together AI	LLMs Overview \| Together AI Docs
SE031	Together AI	Vision Models Overview \| Together AI Docs
SE032	Together AI	Audio Models Overview \| Together AI Docs
SE033	Together AI	Image Models Overview \| Together AI Docs
SE034	Together AI	Embeddings API Reference
SE035	Together AI	Completions API Reference
SE036	Together AI	Models API Reference
SE037	GitHub	Together Computer · GitHub Org
SE038	GitHub	togethercomputer/RedPajama-Data
SE039	GitHub	togethercomputer/OpenChatKit
SE040	GitHub	Dao-AILab/flash-attention
SE041	GitHub	togethercomputer/StripedHyena
SE042	GitHub	togethercomputer/Llama-2-7B-32K-Instruct
SE043	GitHub	togethercomputer/together-python SDK
SE044	PyPI	together — Python package
SE045	HuggingFace	togethercomputer on Hugging Face
SE046	HuggingFace	StripedHyena-Nous-7B
SE047	HuggingFace	Evo-1-131k-base
SE048	HuggingFace	RedPajama-Data-1T Dataset
SE049	HuggingFace	HuggingFace x Together AI partnership
SE050	arXiv	FlashAttention: Fast and Memory-Efficient Exact Attention
SE051	arXiv	FlashAttention-2: Faster Attention with Better Parallelism
SE052	arXiv	FlashAttention-3: Fast and Accurate Attention with Asynchrony
SE053	arXiv	Speculative Decoding paper
SE054	arXiv	Speculative decoding follow-up
SE055	arXiv	Medusa speculative decoding paper
SE056	arXiv	Mixture-of-Agents Enhances LLM Capabilities
SE057	arXiv	LLM inference infrastructure survey
SE058	arXiv	LLM evaluation benchmark paper
SE059	arXiv	Sheared LLaMA paper
SE060	NVIDIA	NVIDIA AI investments 2024
SE061	AWS	Amazon Bedrock
SE062	Together AI	Together AI status page
SE063	Together AI	Together AI trust center
SE064	Tri Dao	Tri Dao personal site (Together CSO)
SE065	Stanford HazyResearch	Stanford HazyResearch lab (Chris Ré)
SE066	AICPA	SOC 2 reporting framework
SE067	HHS	HIPAA sample BAA provisions
SE068	Hacker News	Submissions from together.ai
SE069	Reddit r/LocalLLaMA	Together AI discussions
SE070	Product Hunt	Together AI on Product Hunt
SE071	StackShare	Together AI Tech Stack
SU001	Together AI	Together AI — The AI Acceleration Cloud
SU002	Together AI	About \| Together AI
SU003	Together AI	Customers \| Together AI
SU004	Together AI	Together AI Blog
SU005	Together AI	Pricing \| Together AI
SU006	Together AI	Together AI Startup Accelerator
SU007	Together AI	Together AI at NVIDIA GTC 2025
SU008	Together AI	Together AI x Adaption partnership
SU009	Together AI	AI Native Conf — research & product announcements
SU010	Together AI	Salesforce customer case study
SU011	Together AI	Zoom customer case study
SU012	Together AI	Pika customer case study
SU013	Together AI	Arcee customer case study
SU014	Together AI	Nous Research customer case study
SU015	Together AI	Cartesia customer case study
SU016	Together AI	Washington University customer case study
SU017	Salesforce Ventures	Salesforce Ventures Perspectives
SU018	NVIDIA	NVIDIA AI investments 2024
SU019	HuggingFace	HuggingFace x Together AI partnership
SU020	HuggingFace	togethercomputer on Hugging Face
SU021	Together AI	Together AI Blog (apex)
SU022	Reddit r/LocalLLaMA	Together AI discussions
SU023	Hacker News	Submissions from together.ai
SU024	Product Hunt	Together AI on Product Hunt
SU025	StackShare	Together AI Tech Stack
SU026	G2	Together AI — G2 reviews
SU027	Trustpilot	Together AI — Trustpilot reviews
SU028	Wikipedia	Together AI — Wikipedia
SU029	Crunchbase	Together AI — Crunchbase Profile
SU030	Fireworks AI	Fireworks AI — Production-grade LLM inference
SU031	Replicate	Replicate — Run models in the cloud
SU032	CNBC	Together AI raises $305 million at $3.3 billion valuation
SU033	Bloomberg	Together AI Startup Raises Funds at $3.3 Billion Valuation
SU034	Fast Company	Together AI funding profile
SU035	TechCrunch	Together AI is worth $1.25B (March 2024 update)
SU036	VentureBeat	Together AI raises $305M for open-source GenAI
SU037	Gartner	Gartner AI Insights
SU038	Together AI	Introduction \| Together AI Docs
SU039	Together AI	Inference Models \| Together AI Docs
SU040	PyPI	together — Python package
SU041	GitHub	togethercomputer/together-python SDK
SU042	Together AI	Together AI status page
SR001	FTC	FTC: AI Companies — Uphold Your Privacy & Confidentiality Commitments
SR002	FTC	FTC launches inquiry into generative AI investments & partnerships
SR003	EUR-Lex	EU Regulation 2024/1689 (AI Act)
SR004	NIST	AI Risk Management Framework
SR005	US BIS	BIS export controls on advanced computing & foundation models
SR006	UK ICO	UK Information Commissioner — Our work on AI
SR007	OAIC (Australia)	OAIC guidance on privacy and AI products
SR008	The White House	Executive Order 14110 on Safe, Secure AI
SR009	CA Attorney General	California Consumer Privacy Act guidance
SR010	HHS	HIPAA sample BAA provisions
SR011	AICPA	SOC 2 reporting framework
SR012	Center for Democracy & Technology	CDT — AI policy & governance
SR013	CourtListener	NYT v Microsoft / OpenAI docket
SR014	CourtListener	Authors Guild v OpenAI docket
SR015	CourtListener	Getty Images v Stability AI docket
SR016	CoreWeave	CoreWeave — Specialized GPU Cloud
SR017	Groq	Groq — Fast AI inference
SR018	Cerebras	Cerebras — Wafer-Scale AI
SR019	Fireworks AI	Fireworks AI — Production-grade LLM inference
SR020	Replicate	Replicate — Run models in the cloud
SR021	Modal	Modal — Serverless AI infrastructure
SR022	Anyscale	Anyscale — Powered by Ray
SR023	Lambda Labs	Lambda — GPU Cloud for AI
SR024	Together AI	Together AI — The AI Acceleration Cloud
SR025	Together AI	Together AI Blog
SR026	Together AI	Pricing \| Together AI
SR027	Together AI	Together AI status page
SR028	Together AI	Together AI trust center
SR029	Together AI	Customers \| Together AI
SR030	Together AI	Introduction \| Together AI Docs
SR031	Together AI	Safety Models \| Together AI Docs
SR032	Hacker News	Submissions from together.ai
SR033	Reddit r/LocalLLaMA	Together AI discussions
SR034	CNBC	Together AI raises $305 million at $3.3 billion valuation
SR035	Bloomberg	Together AI Startup Raises Funds at $3.3 Billion Valuation
SR036	VentureBeat	Together AI raises $305M for open-source GenAI
SR037	Fast Company	Together AI funding profile
SR038	Wikipedia	Together AI — Wikipedia
SV001	Gartner	Gartner AI Insights
SV002	Forrester	Forrester: Generative AI infrastructure landscape
SV003	IDC	IDC Worldwide AI Software Market Forecast 2024-2028
SV004	a16z	a16z — State of Generative AI in the Enterprise 2025
SV005	Bessemer Venture Partners	Bessemer: State of AI 2025
SV006	Menlo Ventures	Menlo Ventures: 2025 State of AI
SV007	PitchBook	Together AI — PitchBook profile
SV008	The Information	Together AI revenue 2025 reporting
SV009	Meritech Capital	Meritech SaaS comps table
SV010	PwC	PwC Global AI Study — Sizing the prize
SV011	Y Combinator	Y Combinator — Generative AI companies directory
SV012	SEC EDGAR	SEC EDGAR — Together AI search
SV013	SEC EDGAR	Navan S-1/A filing
SV014	SEC EDGAR	CoreWeave SEC filings (S-1 and post-IPO)
SV015	SEC EDGAR	Figma S-1 filings (comparable IPO)
SV016	SEC EDGAR	Snowflake 10-K filings (public SaaS comp)
SV017	SEC EDGAR	MongoDB 10-K filings (public infra comp)
SV018	CNBC	Together AI raises $305 million at $3.3 billion valuation
SV019	Bloomberg	Together AI Startup Raises Funds at $3.3 Billion Valuation
SV020	VentureBeat	Together AI raises $305M for open-source GenAI
SV021	Fast Company	Together AI funding profile
SV022	Wikipedia	Together AI — Wikipedia
SV023	Crunchbase	Together AI — Crunchbase Profile
SV024	Together AI	Together AI — The AI Acceleration Cloud
SV025	Together AI	Together AI Blog
SV026	Together AI	Pricing \| Together AI
SV027	Together AI	Customers \| Together AI
SV028	Together AI	About \| Together AI
SV029	CNBC	Figma starts trading on NYSE after IPO
SV030	CNBC	Navan files for IPO
SV031	TechCrunch	Sakana AI $135M Series B at $2.65B
SV032	NVIDIA	NVIDIA + Sakana AI partnership
SV033	CoreWeave	CoreWeave — Specialized GPU Cloud
SV034	Groq	Groq — Fast AI inference
SV035	Cerebras	Cerebras — Wafer-Scale AI
SV036	Fireworks AI	Fireworks AI — Production-grade LLM inference
SV037	Replicate	Replicate — Run models in the cloud
SV038	Modal	Modal — Serverless AI infrastructure
SV039	Anyscale	Anyscale — Powered by Ray
SV040	Lambda Labs	Lambda — GPU Cloud for AI
SV041	Hacker News	Submissions from together.ai
SV042	Reddit r/LocalLLaMA	Together AI discussions
SV043	Salesforce Ventures	Salesforce Ventures Perspectives
SV044	NVIDIA	NVIDIA AI investments 2024
SV045	Sacra	Together AI revenue, valuation & funding — Sacra analysis	Sacra estimates that Together AI hit $1B in annualized revenue in February 2026, up from ~$618M at the end of 2025, off growing demand for generative AI applications and the need, particularly among startups, for developer tooling used to train, fine-tune, and deploy AI models.
SV046	ARR.club	Together AI ARR milestones and revenue growth
SV047	EquityZen	Invest In Together AI Stock — Pre-IPO shares profile
SV048	CB Insights	State of AI Q1 2026 Report