Diligence report AI Hardware / Semiconductor early-stage private 2026-05-18

Etched

Transformer-hardwired ASIC targeting a winner-take-most inference market, with high upside and existential architecture risk

Etched is a technically credible Transformer-ASIC bet with a compelling throughput thesis, but zero customers, no tape-out confirmation, and existential architecture-shift risk make this a speculative position that demands high conviction at any valuation above $600M.

Cover facts

Series A raised: USD 120M [CI001]
Announced round date: July 2024 [CI001]
Claimed throughput advantage vs H100: 10× (Llama 70B tokens/s) [CE001]
Employees (est.): 30 [CO009]

Company profile

Etched is a Santa Clara-based semiconductor startup founded in 2022 by Gavin Uberti (CEO) and Chris Zhu (CTO), both former Harvard undergraduates and ex-Microsoft engineers. The company is building "Sohu," a purpose-built ASIC that hardcodes the Transformer attention mechanism in silicon to eliminate the programmable overhead of GPU-based inference. Etched raised a $120M Series A in July 2024 led by Positive Sum at an undisclosed valuation. As of Q1 2026, Etched has no disclosed customers and no production revenue, and has not publicly confirmed a silicon tape-out. The company's thesis is that Transformer inference is a stable enough workload to justify a hardcoded ASIC that delivers 10–20× higher throughput per dollar than H100 GPUs at scale.

Website: etched.com
Founded: 2022-01-01
Founders: Gavin Uberti, Chris Zhu
Founding location: Cambridge, MA, USA
Headquarters: Santa Clara, CA, USA
Product: Sohu is a Transformer-only inference ASIC fabricated on TSMC 4nm process. It hardcodes multi-head attention, FlashAttention-style memory tiling, and KV-cache management in fixed logic, eliminating GPU shader overhead. Etched claims Sohu delivers 500K+ tokens/second for Llama 70B and supports 8× 141B parameter models per server compared to 1× on an H100 DGX. A companion software SDK provides drop-in compatibility with PyTorch/vLLM inference stacks.
Customers: Hyperscaler AI inference teams (AWS, Google, Microsoft), large foundation-model labs (OpenAI, Anthropic, Cohere, Mistral), and specialized inference API providers (Together AI, Groq, Perplexity) spending >$50M per year on GPU compute for LLM serving.
Business model: Direct ASIC hardware sales to inference operators, with potential recurring revenue from SDK licensing and managed inference cloud services. Revenue is zero as of Q1 2026; business model is pre-commercial.
Stage: early-stage private
Funding status: Raised $120M Series A in July 2024 led by Positive Sum; prior seed funding amount undisclosed. Post-money Series A valuation not publicly disclosed; estimated $600M–$800M based on comparable pre-revenue AI chip rounds. Total raised: at least $120M publicly confirmed.

[CO001, CO002, CO003, CO004, CO005, CO009, CI001, CE001]

Executive summary

Top strengths

Hardcoded Transformer attention on TSMC 4nm theoretically delivers 10–20× better tokens per dollar than H100 for Transformer inference, a genuine physical advantage if the architecture assumption holds.
$120M Series A with Positive Sum lead provides runway for a full ASIC tape-out and bring-up cycle without immediate revenue pressure; first-mover positioning in the Transformer-ASIC sub-segment.
Young, technically credible founders with a clear product thesis and early industry awareness; team includes engineers from NVIDIA, Meta, and Google with chip-design pedigree.

Top risks

Architecture lock-in is existential: if state-space models (Mamba), recurrent designs (RWKV), or hybrid attention-free architectures displace standard Transformers, or if Transformer variants such as mixture-of-experts (MoE) need logic the chip does not hardcode, Sohu could become obsolete before volume production.
No customers, no design wins, no disclosed tape-out status as of Q1 2026; Etched must close its first design win before Series B to maintain credibility and valuation.
TSMC 4nm geopolitical and capacity dependency creates single-point-of-failure supply chain risk; any Taiwan Strait disruption or US export-control tightening could halt production entirely.
First-time ASIC team (CEO is 23, no prior tape-out track record); ASIC development cycles are unforgiving and typical tape-out-to-volume timelines are 24–36 months with significant risk of re-spins.

Open gaps

Tape-out status, silicon bring-up results, and independently validated benchmark data for Sohu are not publicly disclosed; all performance claims are company-stated and unverified.
Post-money Series A valuation, cap table structure, liquidation preferences, and investor dilution schedule are not public; estimated $600M–$800M post-money is unconfirmed.
No named customer, letter of intent, or design-win announcement has been made as of Q1 2026; the absence of any customer signal is the primary diligence blocker.
TSMC tape-out slot booking, HBM supply agreements, and manufacturing partner contracts are not disclosed; supply chain concentration risk cannot be independently assessed.
Mamba and hybrid SSM/Transformer architectures are scaling rapidly; whether Sohu's hardcoded attention logic can be cost-effectively updated to support emerging architectures is unknown.

Chapter 01

01 Company Overview

1.1 Company Identity and Mission

Etched is a semiconductor startup headquartered in Cupertino, California, incorporated in 2022. The company's stated mission, as displayed on its official website, is "Building the hardware for superintelligence." Etched's core thesis is that the Transformer neural network architecture, the backbone of modern large language models including GPT-4, LLaMA, and Claude, will remain the dominant paradigm for AI for the foreseeable future, and that purpose-built silicon optimized exclusively for Transformer workloads can dramatically outperform general-purpose GPUs. The company's primary product is the Sohu chip, an application-specific integrated circuit (ASIC) designed from the ground up to accelerate Transformer inference. Unlike GPUs, which are programmable general-purpose accelerators, Sohu hardcodes the Transformer computation graph into silicon, trading programmability for substantially higher claimed throughput per watt. Etched has publicly claimed that a single Sohu chip can deliver approximately 500,000 tokens per second for Transformer inference workloads, compared to approximately 20,000 tokens per second for an NVIDIA H100 GPU, a claimed 25x advantage. These performance figures are company-claimed and have not been independently verified as of the research date. Etched operates as a fabless semiconductor company, meaning it designs chips but relies on third-party foundries (most likely TSMC) for fabrication. Etched's business model centers on selling Sohu chips to cloud hyperscalers, large enterprises, and AI inference service providers seeking to dramatically reduce the cost and latency of serving large Transformer-based models at scale. The company is currently pre-revenue, and the Sohu chip is in development. [CO001, CO002, CO005, CO006, CO007, CO008]

Etched Snapshot KPI Table
Metric	Value / Status	Date	Confidence	Notes / Gaps
Valuation	~$1B	Jun 2024	medium	Third-party reported; not audited
Total Raised	$120M	Jun 2024	high	Multiple press sources confirm Series A amount
Stage	Series A	Jun 2024	high	Confirmed by investors and press
Revenue Run Rate		-	unknown	Pre-revenue; not disclosed
Annual Recurring Revenue		-	unknown	Pre-revenue
Gross Margin		-	unknown	No product sales yet
Headcount		-	unknown	Not publicly disclosed
Founded	2022	-	high	Company-stated; consistent across sources
HQ	Cupertino, CA	-	high	Company-stated official website
Product Stage	Development (pre-tape-out)	2024	medium	Company-stated; no silicon confirmed
Claimed Throughput (Sohu)	500,000 tokens/sec	Jun 2024	low	Company-claimed; independently unverified
H100 Throughput (comparison)	~20,000 tokens/sec	Jun 2024	medium	Company-claimed; third-party context
Claimed Perf. Advantage	25x over H100	Jun 2024	low	Company-claimed; independently unverified
Chip Architecture	ASIC (Transformer-only)	-	high	Company-stated; core product thesis
Investors	Primary Venture Partners, Positive Sum	Jun 2024	medium	Press-reported; full cap table not disclosed
Customer Count		-	unknown	No customers disclosed

Revenue, margin, and customer metrics are blank because Etched is pre-revenue; headcount is blank because it has not been publicly disclosed. Performance metrics are company-claimed and unverified. Valuation is reported by press, not audited.

[CO001, CO002, CO003, CO004, CO005, CO006]
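The 25x figure in the snapshot table is simply the ratio of the two company-claimed throughput numbers. As a minimal sketch, the snippet below reproduces that ratio and shows how it shrinks if a reviewer discounts the Sohu claim. The `haircut` parameter and the 50% example value are illustrative assumptions, not figures from Etched or from any benchmark.

```python
# Company-claimed throughput figures from Section 1.1 (not independently verified).
SOHU_CLAIMED_TPS = 500_000  # tokens per second, Sohu
H100_CLAIMED_TPS = 20_000   # tokens per second, NVIDIA H100 (as cited by Etched)


def throughput_ratio(candidate_tps: float, baseline_tps: float, haircut: float = 0.0) -> float:
    """Return candidate/baseline throughput after discounting the candidate.

    `haircut` is a hypothetical reviewer discount in [0, 1): 0.0 takes the
    claim at face value, 0.5 assumes only half the claimed throughput holds.
    """
    if baseline_tps <= 0:
        raise ValueError("baseline_tps must be positive")
    if not 0.0 <= haircut < 1.0:
        raise ValueError("haircut must be in [0, 1)")
    return candidate_tps * (1.0 - haircut) / baseline_tps


claimed = throughput_ratio(SOHU_CLAIMED_TPS, H100_CLAIMED_TPS)
discounted = throughput_ratio(SOHU_CLAIMED_TPS, H100_CLAIMED_TPS, haircut=0.5)
print(f"Claimed advantage: {claimed:.1f}x")
print(f"With a 50% haircut (assumed): {discounted:.1f}x")
```

Because the ratio scales linearly with the claimed Sohu throughput, any shortfall in measured silicon performance reduces the advantage by the same proportion.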

FO003: Etched Snapshot KPIs

Key performance and status indicators for Etched as of the research date.

Valuation and performance figures are company-reported or press-reported; not independently audited or validated.

[CO003, CO004, CO006, CO007, CO008, CO016]

1.2 Founders and Leadership

Etched was co-founded by Gavin Uberti (CEO) and Chris Zhu (CTO), with Robert Winslow also named as a co-founder in early press coverage. Gavin Uberti, the Chief Executive Officer, was previously a researcher at Microsoft and brings experience in AI systems and hardware acceleration to the role. His background in both AI research and engineering suggests reasonable founder-market fit for a deep-tech semiconductor startup. Chris Zhu serves as Chief Technology Officer and brings complementary technical expertise in hardware design and AI systems. The founding team is small, and the company's leadership roster beyond the three co-founders has not been publicly disclosed. This creates meaningful key-person dependency: if any founder were to depart, the technical trajectory and investor confidence could be materially affected. The board of directors and governance structure have not been publicly disclosed, which is typical for a private startup at this stage but represents a diligence gap. Etched has not reported any material leadership changes since founding as of the research date. The company's headcount has not been publicly disclosed; the team is inferred to have grown post-Series A based on typical hiring patterns for well-funded semiconductor startups, but no specific numbers are available. The technical depth of the founding team is a potential asset: building a custom ASIC requires deep expertise in hardware design, chip architecture, EDA tools, and semiconductor manufacturing. The founders' backgrounds suggest relevant capability, but their specific chip-design credentials could not be independently verified from public sources. [CO010, CO011, CO012, CO032, CO041]

Leadership and Founder Table
Person	Role	Background	Founder	Key-Person Dependency
Gavin Uberti	CEO & Co-Founder	Former Microsoft Research; AI systems and hardware background	Yes	Critical — CEO and public face of company
Chris Zhu	CTO & Co-Founder	Hardware design and AI research	Yes	Critical — technical leadership of chip design
Robert Winslow	Co-Founder	Not publicly disclosed	Yes	Unknown — specific role not public

Board composition and additional executives beyond co-founders are not publicly disclosed. Key-person dependency is high given the small founding team at this stage.

[CO010, CO011, CO012, CO041]

1.3 Funding History and Investors

Etched completed a $120 million Series A funding round, announced publicly on June 26, 2024, at a reported valuation of approximately $1 billion, which would make Etched a unicorn at Series A. The round was covered extensively by financial and technology media including Bloomberg, Reuters, TechCrunch, Wired, and Fortune. Primary Venture Partners was identified as a key investor, and Positive Sum was confirmed as a participating investor. The full investor syndicate composition and ownership percentages have not been publicly disclosed. The $120 million raise is substantial for a pre-revenue semiconductor startup, reflecting both the intensity of investor interest in AI infrastructure and investors' willingness to place early-stage bets on next-generation inference hardware. For context, more established AI chip startups have raised considerably more in aggregate: Groq has raised over $1 billion across multiple rounds, and Cerebras Systems has raised approximately $720 million in total. Etched's pre-seed and seed funding history has not been publicly disclosed. There is no public record of secondary share sales, debt financing, or convertible notes as of the research date. The $1 billion valuation at Series A is aggressive for a company that has not publicly confirmed a tape-out, does not have paying customers, and faces formidable competitors with established ecosystems. This valuation appears to be primarily an option on the thesis that Transformer architectures will dominate AI inference and that Etched can execute on chip production, an assessment that carries significant technical and market risk. [CO003, CO004, CO009, CO018, CO023, CO037]

Stakeholder or Investor Map
Stakeholder	Type	Role / Position	Economic / Control Importance	Diligence Ask
Gavin Uberti	Founder	CEO; leads strategy and fundraising	Critical	Verify technical background and prior work; assess key-person risk
Chris Zhu	Founder	CTO; leads chip architecture and engineering	Critical	Verify chip design experience and team depth
Robert Winslow	Founder	Role not publicly disclosed	Important	Identify specific technical or business contribution
Primary Venture Partners	Lead Investor	Series A investor; presumed lead based on press coverage	High	Confirm ownership stake; board seat; governance rights
Positive Sum	Investor	Series A participant	High	Confirm ownership stake and investment thesis alignment
TSMC (assumed)	Foundry Partner	Assumed chip manufacturing partner given process node requirements	Critical	Confirm foundry agreement; tape-out schedule; yield expectations
HBM Suppliers (Samsung/SK Hynix)	Component Supplier	High-bandwidth memory required for AI chip performance	High	Verify supply agreements; pricing and allocation
Unknown Angel/Seed Investors	Investor	Pre-Series A funding not disclosed	Unknown	Identify any seed-round participants and their rights

Full cap table, board composition, and governance rights are not publicly disclosed. Foundry and memory supplier relationships are inferred based on industry standards, not confirmed by Etched.

[CO003, CO009, CO010, CO011, CO012, CO018]

1.4 Company Milestones and History

Etched's public history is limited: the company was founded in 2022 and operated largely in stealth until mid-2024. It was founded in Cupertino, California, with the mission of building specialized AI inference hardware. During the founding period (2022-2023), Etched operated largely without public disclosure of its technology or funding status. The major inflection point was June 26, 2024, when the company simultaneously announced both its $120 million Series A funding and its Sohu chip with a detailed set of performance claims. This was Etched's first significant public disclosure and included claims of 500,000 tokens per second of throughput and a 25x performance advantage over NVIDIA H100 GPUs. The announcement generated substantial media attention across Bloomberg, Reuters, Wired, Fortune, and TechCrunch, as well as significant discussion in developer communities. Industry analysts noted both the ambition of the claims and the significant risks associated with betting exclusively on a single AI architecture. From a chip-design perspective, the relevant milestones are architecture design, RTL development, pre-silicon simulation, tape-out (release of the final design to the foundry), first-silicon bring-up, and eventual volume production. As of the research date, Etched has not publicly confirmed a tape-out or production timeline, a material gap in the company's public milestone disclosure. Post-Series A, the company is presumed to be in active chip development with available capital to fund the silicon cycle, though specific technical milestones remain undisclosed. [CO001, CO003, CO005, CO006, CO037, CO040]

Milestone Table
Date	Event	Type	Amount / Valuation / Status	Participants	Implication
2022	Etched founded in Cupertino, CA	founding	-	Gavin Uberti, Chris Zhu, Robert Winslow	Company formation; Transformer-ASIC bet initiated
2022–2023	Stealth development phase begins	product	Not disclosed	Founding team	Architecture design and early RTL work on Sohu chip; no public disclosure
2022–2023	Seed/pre-seed funding (unconfirmed)	financing	Not disclosed	Unknown investors	Pre-Series A capital; details not public
2023–2024	Team expansion and Sohu architecture finalization	scale	Not disclosed	Etched engineering team	Key hires in chip design, ML systems; architecture locked
2024-06-26	$120M Series A announced	financing	$120M / ~$1B valuation	Primary Venture Partners, Positive Sum	Unicorn status achieved; provides capital for tape-out and production
2024-06-26	Sohu chip publicly unveiled	product	500K tokens/sec claim (unverified)	Etched	First public disclosure of product; broad media coverage
2024-06-26	Industry media coverage wave	partnership	n/a	Bloomberg, Reuters, Wired, Fortune, TechCrunch	Strong signal validation from tier-1 press; increases visibility
2024-ongoing	Chip development continues toward tape-out	product	Not disclosed	Etched engineering team	Critical path to commercial viability; tape-out date not public
2024-2026	No adverse events, lawsuits, or regulatory actions found	adverse	None found	-	Clean public record; governance/legal history undisclosed but no red flags surfaced

Dates for development-phase milestones (rows 2–4) are estimated based on company age and typical chip development timelines. Pre-seed funding details are not confirmed. Tape-out and production milestone dates are not publicly disclosed.

[CO001, CO003, CO005, CO006, CO037, CO040]

FO001: Etched Company Milestone Timeline

Key milestones from founding (2022) through the Series A and Sohu announcement (June 2024) and ongoing development.

Development-phase dates (stealth, architecture) are estimated based on company age and typical chip timelines; exact dates not disclosed.

[CO001, CO003, CO005, CO006, CO037]

1.5 Strategic Context and Competitive Position

Etched's strategic bet is fundamentally an architecture-level bet: that Transformer neural networks will remain the dominant paradigm for AI for the next decade or more, and that this dominance will be durable enough to justify an ASIC designed exclusively for Transformer computation. The key risk is architectural obsolescence: if AI research produces a successor architecture (such as state space models like Mamba, or hybrid approaches) that achieves comparable or superior performance with different computational primitives, the Sohu chip's hardcoded Transformer logic could become obsolete before reaching commercial scale. NVIDIA dominates the AI accelerator market with its H100 and successor GPU products, backed by the CUDA software ecosystem, established supply chains, enterprise relationships, and thousands of engineer-years of software optimization. Competing against NVIDIA requires not just superior hardware performance but also superior total cost of ownership and ecosystem compatibility. Other challengers, including the startups Groq (LPU inference), Cerebras (wafer-scale ASIC), SambaNova (AI accelerator systems), and Tenstorrent (RISC-V based AI chips), as well as Intel with its Gaudi line, have all struggled to capture meaningful market share from NVIDIA despite years of effort. The competitive landscape makes Etched's position high-risk but potentially high-reward: if Transformer architectures prove durable and Etched executes on silicon production, the company could serve the enormous inference market at dramatically lower cost. The 25x performance claim, if validated, would represent a compelling economic advantage for hyperscalers spending hundreds of millions of dollars annually on inference compute. However, because these claims remain unverified at this stage of development, investors and potential customers must rely primarily on technical thesis evaluation rather than empirical evidence. [CO019, CO020, CO024, CO025, CO026, CO027]

FO002: Etched Company Snapshot Logic

How Etched's identity, product, capital, and dependencies connect.

[CO003, CO005, CO009, CO010, CO019, CO021]

1.6 Exhibits

Chapter 02

02 Market Analysis

2.1 Market Definition and Boundaries

Etched competes in the AI accelerator hardware market, a fast-growing segment of the broader semiconductor industry. The market can be defined at multiple levels of granularity. At the broadest level, Etched participates in the total AI chip market, which includes chips for training, inference, and edge AI. More precisely, Etched's product (the Sohu ASIC) targets the AI inference accelerator segment — chips optimized for running, rather than training, neural network models in production environments. The market boundary for Etched's addressable opportunity is further defined by architecture: Sohu is hardcoded for Transformer models only, meaning its TAM is bounded by the share of AI inference workloads that are Transformer-based. As of the research date, the vast majority of commercially deployed large language models and generative AI workloads are built on Transformer architecture, including GPT-4, LLaMA, Claude, and Gemini. This gives Etched a large near-term market. However, the addressable market could contract if non-Transformer architectures capture significant inference share. Status-quo substitutes are primarily NVIDIA H100/H200/B200 GPU clusters deployed by cloud providers (AWS, Google Cloud, Microsoft Azure, Oracle Cloud) and on-premises by large enterprises. Adjacent competitors include Google's internal TPU infrastructure (not sold externally in traditional chip form), AWS Trainium/Inferentia, and third-party AI accelerators from Groq, Cerebras, AMD (MI300X), and Intel (Gaudi 3). Etched's chip is not positioned against training workloads — this is explicitly out of scope for the Sohu ASIC. [CM001, CM002, CM003, CM004, CM005]

Market Definition Table
Market Layer	Definition	Etched Position	Inclusion / Exclusion	Key Notes
Total AI Chip Market	All silicon for AI training, inference, and edge	Broad market context	Included as context	~$53B in 2023; growing 30-40% CAGR
AI Training Market	Chips for training neural network models	Excluded from TAM	Excluded — Sohu does not accelerate training	Dominated by NVIDIA A100/H100; not Etched's target
AI Inference Market	Chips for serving trained models in production	SAM; Etched's primary target	Included	Fastest growing segment; driven by LLM API demand
Transformer Inference Market	Inference specifically on Transformer models	Etched's direct TAM	Included — Sohu hardcoded for Transformers	~80-90% of current commercial LLM inference workloads
Non-Transformer AI Inference	Inference on SSMs, RNNs, CNNs, hybrid models	Excluded from Sohu TAM	Excluded — Sohu cannot run non-Transformer models	Architecture risk: could grow if SSMs displace Transformers
Edge AI / On-Device AI	Inference on mobile, IoT, embedded devices	Excluded	Excluded — Sohu targets data center scale	Different customer and form factor
Cloud AI Infrastructure	Data center AI compute for hyperscalers	Primary go-to-market target	Included	AWS, Google, Azure, Oracle — primary buyers

Market boundary definitions are based on public analyst research and Etched's stated product scope. Transformer inference share is estimated from current LLM deployment patterns and subject to change.

[CM001, CM002, CM003, CM004]

2.2 Market Sizing: TAM, SAM, and SOM

Sizing the AI chip market requires multiple lenses given the rapid and uncertain growth trajectory. Global AI chip market revenue was estimated at approximately $53 billion in 2023, with NVIDIA capturing the dominant share. Independent market research firms project compound annual growth rates of 30-40% through 2030, which would put total AI chip market revenue in roughly the $300-500 billion range by 2030, though such long-range projections carry wide uncertainty. Within this total, the inference segment is gaining share relative to training. Several industry analyses suggest that inference will account for 50% or more of total AI compute spend by 2025-2027 as model training matures and inference demand grows with commercial deployment of LLMs. If inference spend reaches $150 billion by 2028-2030, and Transformer-based workloads represent 80-90% of inference (consistent with current model deployment patterns), the Transformer inference ASIC SAM would be in the $120-135 billion range, though as a startup competing against incumbents Etched would capture only a fraction of this. Etched's near-term SOM is considerably more constrained. In the first 2-3 years after product launch, a realistic SOM would consist of cloud hyperscaler pilot programs and inference-as-a-service workloads where cost-per-token economics dominate the purchasing decision. If Etched captures just 0.1-1% of a $50-100 billion inference market by 2027-2028, that represents $50M-$1B in revenue, a wide range that reflects the extreme uncertainty in the company's commercial trajectory at this stage. All sizing figures in this chapter are third-party estimates drawn from industry analysis sources and carry material uncertainty; no official market research reports were accessible for this study. [CM006, CM007, CM008, CM009, CM010, CM011]

TAM/SAM/SOM or Sizing Lens Table
Sizing Layer	Estimate Range	Year	Confidence	Methodology	Key Assumptions
Total AI Chip Market (TAM)	$53B–$80B	2023	medium	Analyst consensus midpoint	Includes training + inference + edge; NVIDIA majority
Total AI Chip Market (TAM proj.)	$300B–$500B	2030	low	Analyst CAGR projections	~30-40% CAGR; wide uncertainty band
AI Inference Segment (SAM base)	$20B–$30B	2024E	low	Estimated ~40% inference share of AI chip market	Inference grows faster than training post-2024
AI Inference Segment (SAM proj.)	$100B–$200B	2028-2030E	low	Extrapolation of inference share growth	Assumes inference reaches 50%+ of total AI chip spend
Transformer Inference ASIC (SAM adj.)	$80B–$180B	2028-2030E	low	80-90% Transformer share of inference SAM	Depends on architecture stability
Etched SOM (Year 1-2)	<$100M	2026-2027E	low	Pilot programs; 0.01-0.05% market capture	Assumes tape-out success and hyperscaler pilots
Etched SOM (Year 3-5)	$50M–$1B	2027-2030E	very low	0.05-1% inference market capture	Requires production scale and ecosystem support

All figures are third-party estimates from industry analyst reports (IDC, Gartner, Grand View Research) or are derived by the analyst. Wide uncertainty ranges reflect the nascent market and Etched's pre-revenue status. 'very low' confidence denotes SOM projections 3+ years out for a pre-revenue company.

[CM006, CM007, CM008, CM009, CM010, CM011]
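The range arithmetic behind the sizing table can be reproduced in a few lines. The sketch below multiplies low and high market estimates by low and high share assumptions, using only figures quoted in Section 2.2. The helper name `scale_range` is illustrative, and pairing low with low and high with high is an assumption about how the ranges were constructed.

```python
def scale_range(market, share):
    """Multiply a (low, high) market range by a (low, high) share range.

    Pairs low with low and high with high, which yields the widest
    plausible band (an assumption about how the report's ranges were built).
    """
    (m_low, m_high), (s_low, s_high) = market, share
    return (m_low * s_low, m_high * s_high)


# All market figures are in USD billions and come from Section 2.2.
TRANSFORMER_SHARE = (0.80, 0.90)

# Point scenario from the text: $150B inference spend by 2028-2030.
sam_point = scale_range((150.0, 150.0), TRANSFORMER_SHARE)

# Table scenario: $100B-$200B inference segment by 2028-2030.
sam_table = scale_range((100.0, 200.0), TRANSFORMER_SHARE)

# SOM scenario from the text: 0.1%-1% of a $50B-$100B inference market.
som = scale_range((50.0, 100.0), (0.001, 0.01))

print(f"SAM (point scenario): ${sam_point[0]:.0f}B-${sam_point[1]:.0f}B")
print(f"SAM (table scenario): ${sam_table[0]:.0f}B-${sam_table[1]:.0f}B")
print(f"SOM: ${som[0] * 1000:.0f}M-${som[1]:.0f}B")
```

The point scenario reproduces the $120-135 billion range in the text, while the table scenario reproduces the wider $80B–$180B row, so the two figures differ only in the inference-spend input.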

FM001: AI Chip Market Sizing Pyramid

TAM to SAM to SOM hierarchy for Etched's addressable AI inference market.

All figures are analyst estimates with wide uncertainty ranges; SOM projections are illustrative scenario analysis only.

[CM006, CM007, CM008, CM009, CM010, CM011]

FM002: AI Inference Market Size Estimate Ranges

Range estimates for AI inference market size at different time horizons with uncertainty bands.

All ranges are analyst-constructed estimates; no proprietary market research was used. The Etched revenue range is particularly speculative.

[CM007, CM008, CM009, CM010, CM011]
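Compounding the 2023 base at the quoted growth rates shows where the 2030 projection in Section 2.2 comes from. This is a minimal sketch assuming a $53B base in 2023 and constant annual growth over seven years; the function name `project` is illustrative.

```python
def project(base: float, cagr: float, years: int) -> float:
    """Compound `base` forward at a constant annual growth rate."""
    if years < 0:
        raise ValueError("years must be non-negative")
    return base * (1.0 + cagr) ** years


BASE_2023_USD_B = 53.0       # 2023 AI chip market estimate from Section 2.2
YEARS_TO_2030 = 2030 - 2023  # seven compounding periods

for cagr in (0.30, 0.40):
    value = project(BASE_2023_USD_B, cagr, YEARS_TO_2030)
    print(f"{cagr:.0%} CAGR -> ${value:.0f}B in 2030")
```

At 30% the base compounds to roughly $330B, and at 40% to somewhat above $550B, so the $300-500 billion range is best read as a rounded band rather than an exact compounding result.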

2.3 Buyer Segmentation and Adoption Path

Etched's primary buyers are cloud hyperscalers and large enterprises that operate AI inference infrastructure at scale. The buyer landscape can be segmented by scale and procurement sophistication; the three tiers below are the primary targets.

Tier 1, hyperscalers (AWS, Google, Microsoft Azure, Oracle Cloud): These companies deploy GPU clusters at massive scale for inference workloads supporting LLM APIs (GPT, Claude, Gemini, LLaMA-based products). They have the highest volume, the most sophisticated procurement processes, and the greatest potential benefit from lower cost-per-token silicon. However, they also have the longest sales cycles, the highest technical validation requirements, and significant existing NVIDIA ecosystem investments. Google and AWS already operate proprietary AI chips (TPU, Trainium/Inferentia), which compete internally with anything Etched might sell them.

Tier 2, AI-native companies (OpenAI, Anthropic, Cohere, Mistral AI, etc.): These companies run massive inference workloads for their commercial AI APIs. They are cost-sensitive, technically sophisticated buyers who would benefit significantly from lower inference cost. However, they often depend on NVIDIA GPU availability as a strategic fallback.

Tier 3, inference-as-a-service providers (Together AI, Anyscale, Replicate, etc.): These are smaller-scale inference platforms that could integrate Etched's chip into their infrastructure if the performance and cost-per-token economics are validated. These buyers have shorter sales cycles and greater willingness to experiment with alternative hardware.

The adoption path requires Etched to: (1) complete chip tape-out and first silicon; (2) achieve software compatibility with leading model serving frameworks (vLLM, TensorRT-LLM, Hugging Face Transformers); (3) demonstrate cost-per-token economics that compellingly beat NVIDIA H100/H200; and (4) build or acquire the enterprise sales and support infrastructure to serve hyperscaler procurement teams. [CM012, CM013, CM014, CM015, CM016, CM017]

Segment / Buyer Map
Buyer Tier	Examples	Inference Scale	Cost Sensitivity	NVIDIA Dependency	Adoption Likelihood	Sales Cycle
Tier 1 — Hyperscalers	AWS, Google, Azure, Oracle	Billions/day	High — TCO drives capex decisions	Very High — large H100 fleets	Medium-Low (long qualification)	18-36 months
Tier 2 — AI-Native Cos.	OpenAI, Anthropic, Cohere, Mistral	Hundreds of millions/day	Very High — inference is largest opex	High — NVIDIA primary vendor	Medium (cost pressure)	12-24 months
Tier 3 — Inference Platforms	Together AI, Anyscale, Replicate	Millions-Billions/day	High	Medium — more flexibility	Medium-High (experimentation)	6-18 months
Tier 4 — Enterprises	Large banks, telcos, retailers with private LLMs	Millions/day	Medium	Medium-Low	Low (risk-averse)	24-36+ months
Tier 5 — Research Institutions	Universities, national labs, research orgs	Variable	Low-Medium	Low-Medium	Medium (technical curiosity)	12-24 months

Sales cycle estimates are illustrative for a new chip entrant; actual cycles could be longer given Etched's startup status and lack of production silicon history.

[CM012, CM013, CM014, CM015]

FM003: AI Inference Buyer Segment Map

Matrix of buyer segments by Etched adoption likelihood and NVIDIA ecosystem dependency.

[CM012, CM013, CM014, CM015, CM016]

2.4 Growth Drivers and Adoption Constraints

The primary growth driver for the AI inference hardware market is the explosive adoption of large language models in commercial applications. Since the launch of ChatGPT in November 2022, LLM inference volumes have grown dramatically across consumer AI applications, enterprise software integrations, and API-based AI services. Every query to an LLM API incurs inference compute cost, and as LLM adoption grows, the cumulative inference compute demand creates a massive and growing addressable opportunity. Secondary drivers include: (1) cost economics — GPU inference is expensive, and companies running millions of inferences per day face enormous compute bills; (2) latency requirements — real-time applications require low-latency inference that specialized hardware can potentially deliver more efficiently; (3) energy efficiency — data center power constraints make higher performance-per-watt silicon attractive to hyperscalers; (4) supply chain diversification — hyperscalers are actively seeking alternatives to NVIDIA dependency. Key adoption constraints for Etched specifically include: (1) software ecosystem — CUDA and the NVIDIA GPU software ecosystem are deeply entrenched; any new chip must support major ML frameworks; (2) switching costs — GPU infrastructure is expensive to replace; hyperscalers would need compelling TCO to justify migration; (3) silicon maturity — as a startup, Etched carries yield risk, reliability risk, and support risk that incumbents do not; (4) architecture lock-in risk — customers adopting Sohu expose themselves to a single-vendor risk on a chip tied to a specific AI architecture; (5) capital intensity — chip development and production require scale that only large orders can support. [CM018, CM019, CM020, CM021, CM022, CM023]

Growth Drivers and Constraints Table
Factor	Type	Impact on Etched	Magnitude	Time Horizon
LLM adoption growth	Driver	Expands total inference compute demand	High	Current–2028
GPU inference cost pressure	Driver	Makes Etched's cost advantage compelling	High	Current
Hyperscaler NVIDIA diversification	Driver	Opens procurement interest in alternatives	Medium	2025–2027
Real-time AI application growth	Driver	Latency requirements favor specialized silicon	Medium	2025–2028
Energy/power density constraints	Driver	Higher perf/watt is valuable for data centers	Medium	Current–2027
CUDA ecosystem switching cost	Constraint	High friction for customers to adopt new chips	Very High	Persistent
Transformer architecture stability risk	Constraint	SSM/hybrid adoption could erode Etched's TAM	High	2025–2028
Startup silicon maturity risk	Constraint	Yield, reliability concerns vs NVIDIA	High	Current–2026
Long hyperscaler sales cycles	Constraint	18-36 month qualification limits near-term revenue	High	Current–2027
Capital intensity of chip production	Constraint	Requires large volume commitments to achieve scale economics	High	2026–2028
Model efficiency improvements	Constraint/Driver	More efficient models reduce per-query compute need	Medium	2025–2028

Impact magnitudes are analyst estimates based on industry dynamics; no proprietary research data was available.

[CM018, CM019, CM020, CM021, CM022, CM023]

FM004: Etched Adoption Funnel

Stages from market awareness to production deployment for Etched's chip with estimated drop-off at each stage.

Funnel is hypothetical at this stage; Etched has not disclosed customer pipeline or design wins. Each stage transition requires chip production milestone completion. Numeric values represent relative stage size (100 = total potential pool).

[CM015, CM016, CM017]

2.5 Market Sizing Gaps and Contradictions

The AI chip market is characterized by rapidly changing conditions, limited public data, and widely divergent analyst estimates. Several important gaps affect the quality of the market sizing analysis in this chapter. First, the Transformer-specific share of the total AI inference market is not well documented in public industry research; most market sizing reports aggregate all inference compute together. Etched's addressable market is bounded by this share, but quantifying it precisely requires proprietary market data not available in public sources. Second, multiple market research firms (IDC, Gartner, Grand View Research, MarketsandMarkets) publish significantly different AI chip market size estimates, with 2023 figures ranging from $40B to $80B and 2030 projections ranging from $200B to $900B. These discrepancies reflect different market boundary definitions, differing assumptions about GPU adoption rates, and different views on the training-to-inference ratio. The figures used in this chapter represent reasonable midpoints of publicly cited ranges and should be treated as order-of-magnitude estimates. Third, inference market dynamics are evolving rapidly: model efficiency improvements (quantization, distillation, speculative decoding) could reduce per-query compute requirements, which would change the trajectory of total compute spend. These efficiency gains represent both a risk (a smaller total market) and an opportunity (cheaper inference could expand demand). [CM025, CM026, CM027, CM028, CM029]
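To make the spread between the published ranges concrete, the sketch below computes the compound annual growth rate implied by pairing the low ends, the midpoints, and the high ends of the 2023 and 2030 figures quoted above. The pairing of endpoints is an illustrative assumption; no research firm is known to have published these exact pairs, and the helper names are hypothetical.

```python
# Implied growth rates from the published AI chip market ranges (USD billions).
# ASSUMPTION: endpoints are paired low-to-low, mid-to-mid, and high-to-high
# purely for illustration; individual firms' estimates may pair differently.

RANGE_2023 = (40.0, 80.0)    # 2023 market size range quoted in the text
RANGE_2030 = (200.0, 900.0)  # 2030 projection range quoted in the text
YEARS = 2030 - 2023


def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate implied by two point estimates."""
    return (end_value / start_value) ** (1 / years) - 1


def midpoint(bounds: tuple) -> float:
    """Arithmetic midpoint of a (low, high) range."""
    low, high = bounds
    return (low + high) / 2


scenarios = {
    "low-to-low": implied_cagr(RANGE_2023[0], RANGE_2030[0], YEARS),
    "midpoint-to-midpoint": implied_cagr(midpoint(RANGE_2023), midpoint(RANGE_2030), YEARS),
    "high-to-high": implied_cagr(RANGE_2023[1], RANGE_2030[1], YEARS),
}

for name, rate in scenarios.items():
    print(f"{name}: {rate:.1%} implied CAGR")
```

Under these pairings the implied growth rate spans roughly 26% to 41% per year, which illustrates why the chapter's figures are best read as order-of-magnitude estimates.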

2.6 Exhibits

Chapter 03

03 Competitors

3.1 Competitive Landscape Overview

Etched competes in the AI inference accelerator market, which is currently dominated by NVIDIA. The competitive landscape spans several distinct tiers: GPU incumbents (NVIDIA), hyperscaler internal programs (AWS Trainium/Inferentia, Google TPU, Microsoft Maia), purpose-built inference startups (Groq, Cerebras, SambaNova), AI chip generalists (AMD, Intel Gaudi), and Etched's unique sub-segment of Transformer-only ASICs. The status quo competitor in virtually every buyer's current stack is the NVIDIA H100 or H200 GPU cluster. Etched's differentiation strategy is architectural specialization: by hardcoding the Transformer attention mechanism into silicon, it claims 10× or greater throughput efficiency versus GPU-based inference. No other identified competitor has taken a similarly narrow architectural bet on Transformer-only execution. Groq uses LPUs (Language Processing Units) with a deterministic streaming architecture. Cerebras uses a wafer-scale processor. Graphcore uses IPUs (Intelligence Processing Units). All three are general-purpose AI accelerators; none is Transformer-only. The competitive risk is structural: NVIDIA's ecosystem moat (CUDA, NeMo, TensorRT, distribution) is the dominant switching barrier. Buyers evaluating Etched must overcome software integration costs, uncertainty about model compatibility, and absence of a production reference deployment. Etched's only durable counter-argument is cost-per-token economics at Transformer inference scale, which cannot be validated until production silicon is available.

Competitor Profile Table
Competitor	Category	Est. Funding / Scale	Target Segment	Differentiation	Key Limitation
NVIDIA (H100/H200/Blackwell)	GPU incumbent	$3.7T market cap; dominant	All AI workloads	CUDA ecosystem, scale, trust	High cost; general-purpose; not inference-optimized
AMD (MI300X)	GPU challenger	$200B+ market cap; growing AI share	Inference + training	ROCm open source; Microsoft partnership	CUDA ecosystem deficit; software maturity lag
Google TPU v5e	Hyperscaler internal	Internal; chips not sold externally	Google Cloud inference	Tight Gemini integration; cost efficiency at scale	Available only through Google Cloud; chips cannot be purchased
AWS Inferentia2	Hyperscaler internal	Internal; EC2 Inf2 instances	AWS workloads inference	Lower cost on AWS vs. H100 for Llama-class models	AWS ecosystem only; limited model breadth
Microsoft Maia 100	Hyperscaler internal	Internal (announced 2023)	Azure AI inference	OpenAI workload optimization	Early stage; not externally available
Groq (LPU)	Inference startup	~$1.1B+ raised	Low-latency inference API	Deterministic latency; GroqCloud API	Lower batch throughput; general-purpose architecture
Cerebras (WSE-3)	Training/inference startup	~$720M raised	Enterprise + government	Wafer-scale; large memory bandwidth	Cost and manufacturing complexity; not inference-specialized
SambaNova Systems	Enterprise AI startup	~$1.2B raised	Enterprise AI deployment	Reconfigurable dataflow; enterprise SDK	Limited cloud distribution; smaller ecosystem
Tenstorrent	AI chip startup	$700M+ raised (2024)	Edge + cloud AI	RISC-V open architecture; Jim Keller leadership	Early stage; no large production deployment confirmed
Graphcore (IPU)	AI chip (acquired)	Acquired by SoftBank 2024 (~$120M)	Research + enterprise AI	IPU architecture for graph-based compute	Commercial failure; architecture mismatch with Transformer dominance
Etched (Sohu)	Transformer-only ASIC	$120M raised; tape-out not publicly confirmed	Transformer inference at scale	Transformer-hardened silicon; 10× throughput claim	Production unproven; Transformer-only scope; early-stage

Funding figures from public disclosures and press reports; market caps as of late 2024/early 2025. NVIDIA/AMD market caps approximate. Internal programs (Google TPU, AWS Inferentia, Maia) have no public funding disclosures; noted as internal. All 'differentiation' and 'limitation' assessments are analytical based on published specifications and independent reporting.

Feature / Capability Matrix
Capability	NVIDIA H100	Groq LPU	Cerebras WSE-3	AMD MI300X	Etched Sohu
Transformer inference support	Yes	Yes	Yes	Yes	Yes (native hardened)
Non-Transformer model support	Yes (full)	Yes	Yes	Yes (full)	No (Transformer-only)
Training support	Yes (full)	Limited	Yes (primary focus)	Yes (growing)	No (inference-only)
Cloud availability	All major clouds	GroqCloud API	Cerebras Cloud	AWS, Azure, GCP	Not available (pre-production)
On-premise deployment	Yes	Yes (dedicated racks)	Yes (CS-3 appliance)	Yes	Yes (planned)
CUDA/PyTorch compatibility	Native CUDA	Groq SDK (JAX/PyTorch via bridge)	Cerebras SDK (custom)	ROCm (PyTorch/JAX)	Custom SDK (planned)
Batch inference throughput	High	Medium (latency-optimized)	High	High	Very high (claimed 10×)
Memory capacity	80GB HBM3	230MB on-chip SRAM	44GB on-chip SRAM	192GB HBM3	Unknown (undisclosed)
Production references	Thousands	GroqCloud users (public)	Enterprise customers (limited)	Growing (Azure, etc.)	None (pre-production)
Software ecosystem maturity	Mature (decade+)	Early (2022+)	Early (2016+)	Growing (ROCm 2+)	Pre-launch

Matrix based on public specifications and independent technical reporting as of Q1 2026. 'Unknown' cells reflect undisclosed specifications; 'pre-production' reflects Etched's pre-commercialization status. Groq memory figure reflects on-chip SRAM design philosophy. Capability comparison is directional; for procurement, validate against current vendor documentation.

FP001: Competitive Positioning Map

Competitive positioning of AI inference chips on two axes: inference specialization (general-purpose to Transformer-specialized) and ecosystem maturity (early-stage to mature).

[CP001, CP003, CP005, CP010, CP011, CP018]

3.2 Incumbent GPU Competitors

NVIDIA holds approximately 80-90% of the AI accelerator market as of 2024-2025. Its H100/H200 and Blackwell architectures dominate both training and inference workloads. NVIDIA's competitive moat consists of four interlocking layers: (1) CUDA software ecosystem with decades of developer investment; (2) NeMo and TensorRT inference optimization frameworks; (3) scale of manufacturing commitments with TSMC; and (4) trust, since every major AI company has proven NVIDIA silicon in production. NVIDIA's primary vulnerability is its pricing: H100 GPUs were selling at $30-40K per unit at peak in 2023-2024, with H100 inference clusters costing $2-8M per rack. AMD Instinct MI300X has emerged as the most credible alternative GPU for inference workloads. AMD's ROCm software stack has improved significantly, and Microsoft Azure has committed to large-scale MI300X deployments for its OpenAI workloads. Intel Gaudi (formerly Habana Labs, acquired in 2019 for ~$2B) targets training and inference workloads but has not achieved significant market share; Intel's software ecosystem lags CUDA significantly. The GPU incumbents' primary strategic response to Etched would be pricing reductions on inference-optimized SKUs (H100 NVL, Blackwell B100, B200) and accelerated development of inference-specific firmware and software. Both AMD and NVIDIA are already shipping inference-optimized variants with higher memory bandwidth per FLOP.

3.3 Hyperscaler Internal Programs

AWS, Google, and Microsoft have all developed internal AI chips specifically to reduce NVIDIA dependency and lower inference costs. Google's TPU (Tensor Processing Unit) program, now on v5, is the most mature internal chip program. Google TPU v5e is specifically optimized for inference and is available through Google Cloud; it cannot be purchased by third parties. Google uses TPUs extensively for Gemini inference and has deployed tens of thousands of units internally. AWS Trainium (training) and Inferentia (inference) are Amazon's internal AI chips. AWS Inferentia2 targets cost-effective inference for large language models and is available to AWS customers through Amazon EC2 Inf2 instances. AWS has not disclosed revenue or deployment scale for these chips, but they represent a credible threat to third-party inference hardware in the AWS ecosystem. Microsoft has developed the Maia 100 AI accelerator, announced in November 2023, which targets internal Azure AI inference workloads including OpenAI's Azure deployments. Etched could in principle sell to hyperscalers and the AI labs they host (OpenAI has been cited publicly as a potential customer given Sohu's throughput claims), but the hyperscaler internal programs represent both competition and a buyer validation risk: hyperscalers have demonstrated the willingness and capability to build their own silicon rather than pay third parties.

Pricing / Packaging Comparison
Competitor	Pricing Model	Indicative Unit / API Cost	Contract Structure	Implication for Etched
NVIDIA H100 (on-prem)	Hardware purchase	$25-35K/unit (2024 spot; $15K+ list)	Spot, contract, or cloud markup	Etched must price at lower TCO over 3-year depreciation horizon to compete
NVIDIA on cloud (H100 SXM5)	Cloud instance	$2-4/hr per GPU on major clouds (2024)	On-demand or reserved (1-3 year)	Etched must demonstrate cost-per-token advantage vs. on-demand H100
AMD MI300X	Hardware purchase + cloud	$10-15K/unit estimated; Azure instances ~$1.5-2.5/hr	Similar to NVIDIA; cheaper	AMD pricing pressure reduces Etched's price-based differentiation
Groq (GroqCloud)	API (token/request)	$0.27/1M tokens (Llama 3-70B, 2024)	Pay-as-you-go + enterprise tiers	Groq API pricing is benchmark for inference-optimized alternatives to H100
Cerebras Cloud	API (token/request)	Competitive with Groq; enterprise pricing varies	Enterprise agreements	Enterprise ACV unknown; likely custom deals
Google TPU v5e (GCP)	Cloud instance	~$1.6/hr per chip (GCP v5e)	On-demand or 1-3 year committed use	Internal captive; not direct competitor for external sales
AWS Inferentia2	Cloud instance	$0.76/hr per chip (EC2 Inf2)	On-demand, reserved, savings plans	Cost-competitive within AWS; external buyers must evaluate vs. H100 cloud
Etched Sohu	Hardware purchase (planned)	Not disclosed; target: <0.1× H100 TCO for Transformer inference	OEM + direct enterprise (planned)	Pricing not set; success requires compelling cost-per-token vs. H100 benchmarks

Pricing figures from public cloud pricing pages and press coverage; H100 spot market prices fluctuated significantly in 2023-2024. All figures are indicative and date-sensitive. Groq API pricing as of late 2024 per official pricing page. Etched pricing is undisclosed; 'target' represents analyst expectation from $120M funding context.
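The table mixes hourly instance prices with per-token API prices, so comparing rows requires a throughput assumption. The sketch below shows the conversion in both directions: hourly cost to cost per million tokens, and the sustained throughput a GPU would need to match a given per-token price. The $2-4/hr and $0.27/1M-token inputs come from the table; the 1,000 tokens/s throughput is a placeholder assumption, not a measured figure, since real throughput depends on model, batch size, and precision.

```python
# Convert between hourly hardware cost and cost per million tokens.
# ASSUMPTION: ASSUMED_TOKENS_PER_SECOND is a hypothetical placeholder used
# only to show the arithmetic; it is not a benchmark result for any chip.

SECONDS_PER_HOUR = 3600
H100_CLOUD_USD_PER_HOUR = (2.0, 4.0)   # per-GPU range from the pricing table
GROQ_USD_PER_MILLION_TOKENS = 0.27     # API price quoted in the pricing table
ASSUMED_TOKENS_PER_SECOND = 1_000.0    # hypothetical sustained throughput


def cost_per_million_tokens(hourly_cost_usd: float, tokens_per_second: float) -> float:
    """USD per 1M tokens for hardware billed (or amortized) per hour."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    tokens_per_hour = tokens_per_second * SECONDS_PER_HOUR
    return hourly_cost_usd / tokens_per_hour * 1_000_000


def breakeven_tokens_per_second(hourly_cost_usd: float, target_usd_per_million: float) -> float:
    """Sustained tokens/s needed for hourly-billed hardware to match a per-token price."""
    if target_usd_per_million <= 0:
        raise ValueError("target_usd_per_million must be positive")
    return hourly_cost_usd / target_usd_per_million * 1_000_000 / SECONDS_PER_HOUR


for hourly in H100_CLOUD_USD_PER_HOUR:
    per_million = cost_per_million_tokens(hourly, ASSUMED_TOKENS_PER_SECOND)
    needed = breakeven_tokens_per_second(hourly, GROQ_USD_PER_MILLION_TOKENS)
    print(f"${hourly:.2f}/hr -> ${per_million:.2f}/1M tokens at assumed throughput; "
          f"{needed:,.0f} tokens/s needed to match ${GROQ_USD_PER_MILLION_TOKENS}/1M")
```

The same conversion would apply to Sohu once an amortized hourly cost and a validated throughput figure are available; until then any cost-per-token comparison rests on placeholder inputs like the one above.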

FP002: Feature Breadth / Capability Map

Capability coverage matrix comparing Etched against primary competitors across key inference buying criteria.

[CP001, CP002, CP003, CP004, CP008, CP009]

3.4 Purpose-Built Inference Startup Competitors

Groq, Cerebras, and SambaNova are the three most-funded inference chip startups ahead of Etched. Groq has raised approximately $1.1B and offers the LPU (Language Processing Unit) as a deterministic streaming inference chip with very low latency. Groq's GroqCloud provides API access to inference at competitive pricing. Groq's architecture supports general AI model inference, not just Transformers: it can run Mamba, MoE, and other architectures. Groq's differentiation is latency (tokens per second response speed) rather than throughput at batch inference. Cerebras Systems has raised approximately $720M and uses a wafer-scale processing approach (the Cerebras WSE-3 is 46,225 mm², compared to ~800 mm² for H100). Cerebras focuses primarily on training and can do inference; it targets enterprise and government customers. SambaNova Systems has raised approximately $1.2B and uses a reconfigurable dataflow architecture. SambaNova has targeted enterprise AI deployment with its DataScale systems. Tenstorrent (founded 2016, led by Jim Keller) builds RISC-V-based AI chips with a focus on open hardware and software. Graphcore (UK-based) developed the Intelligence Processing Unit (IPU) for AI workloads; it struggled commercially and was acquired by SoftBank in 2024 for a reported $120M, substantially below its ~$2.8B peak valuation. The Graphcore trajectory is an important adverse data point: a well-funded AI chip startup with a differentiated architecture can fail to achieve commercial traction even with significant capital.

3.5 Switching Costs, Moat Durability, and Displacement Risk

Etched's primary moat claim is architectural: by implementing attention in hardened logic, it achieves throughput efficiency that GPU-based approaches cannot match at equivalent silicon area. This moat is real but narrow and fragile. It is real because attention-optimized hardware can genuinely run Transformer inference more efficiently. It is narrow because it only applies to Transformer inference. It is fragile because (1) NVIDIA and hyperscalers can respond with inference-optimized SKUs, (2) model architectures are evolving away from pure Transformer, and (3) software optimization (FlashAttention, quantization, speculative decoding) continuously reduces the efficiency gap between specialized and general-purpose hardware. CUDA lock-in is the dominant competitive moat for NVIDIA. A company switching from NVIDIA GPUs to Etched must: (1) re-validate every model in production on Etched silicon; (2) replace CUDA/TensorRT pipeline integration with Etched's SDK; (3) accept vendor concentration risk with an early-stage startup; and (4) build internal expertise in Etched's toolchain. These switching costs are non-trivial but manageable for large organizations with dedicated ML infrastructure teams. They represent a meaningful barrier to Etched's sales process, not an insurmountable barrier to adoption. Adverse evidence on displacement risk: The AI chip startup landscape has produced multiple well-funded failures. Graphcore's acquisition at a fraction of its peak valuation is the most recent data point. Wave Computing and Mythic AI have also failed or pivoted. The pattern suggests AI chip startups face a "valley of death" between chip demonstration and production-scale deployment, where software ecosystem immaturity, customer inertia, and NVIDIA's incremental improvement create compounding headwinds.

Moat Durability / Competitive Risk Register
Moat Claim	Threat	Severity	Mitigation or Diligence Ask
Transformer-hardened attention ASIC throughput	NVIDIA ships inference-optimized SKUs (NVL, Blackwell) with better attention performance	High	Benchmark Etched vs. H200/Blackwell B200 on attention throughput per watt when production silicon available
10× throughput vs. H100 claim	Claim is unverified; NVIDIA/AMD will close gap with architecture improvements and software (FlashAttention, quantization)	High	Require third-party benchmarks on production Sohu silicon; validate against latest NVIDIA TensorRT-LLM optimizations
First-mover in Transformer-only ASIC	No-moat: if market validates the approach, NVIDIA, AMD, or well-funded new entrant can replicate with larger resources	Medium	Assess patent portfolio; evaluate whether attention hardening is patentable vs. general prior art
TSMC 4nm tape-out investment	Competitor uses same fab; tape-out completion does not guarantee production yield or cost competitiveness	Medium	Verify tape-out status; request yield targets and cost-per-wafer projections
Etched SDK and software ecosystem	SDK is not yet available; CUDA ecosystem moat works against Etched	High	Review SDK roadmap; assess framework compatibility plan (PyTorch, JAX, vLLM); check for any OSS contribution or early-access program
Architectural moat against model evolution	Mamba/SSM/MoE architectures gain inference share; Transformer-only chip becomes obsolete	Medium-High	Review model architecture trend data; assess Etched's stated response to hybrid Transformer-SSM models
Graphcore IPU precedent (adverse)	Graphcore raised $700M+ with differentiated architecture, failed to achieve commercial scale, sold for ~$120M	High (adverse data)	Study Graphcore failure modes; assess whether Etched's go-to-market plan addresses same distribution and software ecosystem barriers Graphcore faced

Risk register is analytical based on public competitor disclosures, independent AI chip industry reporting, and historical precedents. Severity ratings: High = could materially impair Etched's market opportunity if unaddressed; Medium = manageable with correct execution; Low = monitoring only. All severity ratings require validation against Etched's internal technical roadmap.

FP003: Moat / Readiness KPIs

Key competitive readiness indicators for Etched relative to the incumbent and startup competition.

[CP002, CP007, CP016, CP017, CP018, CP019]

3.6 Exhibits

Chapter 04

04 Financials

4.1 Revenue Model and Streams

Etched is a pre-revenue semiconductor startup. Its intended revenue model is hardware sales: designing a custom ASIC (the Sohu chip) optimized for Transformer inference and selling it to hyperscalers, large AI-native companies, and inference platform operators. This is a one-time hardware sales model with potential repeat purchase cycles tied to chip generations, analogous to NVIDIA's GPU product cycle (H100 → H200 → Blackwell). There is no disclosed software licensing, cloud API, or SaaS revenue stream. The primary revenue stream is direct hardware unit sales of Sohu chips at OEM or enterprise pricing. A secondary stream could be system-level sales (rack or server configurations incorporating Sohu chips), analogous to how Cerebras sells CS-3 appliances rather than bare chips. No cloud marketplace offering has been announced. The company has made no public disclosures about revenue run rate, ARR, or any executed customer contracts. Revenue recognition for hardware sales typically follows ASC 606 point-in-time recognition upon chip delivery. Unlike SaaS, this creates lumpy revenue tied to production batch deliveries and procurement cycles. Capital intensity is extremely high for semiconductor companies: Etched must fund tape-out, wafer purchases, testing, and packaging before any revenue is collected. The working capital cycle for semiconductor hardware spans 12-24 months from tape-out to revenue.

Revenue Streams Table
Revenue Stream	Mechanism	Unit	Current Status	Revenue Quality	Diligence Ask
Hardware chip sales (Sohu)	Direct sale of Sohu inference chip units	$/chip or $/wafer-allocation	Not yet available (pre-production)	Low — hardware subject to lumpy recognition, capital intensity	Confirm tape-out timeline; get unit cost targets; ask for TSMC supply agreement terms
Chip system / rack sales	Potential sale of Sohu-based server rack or inference appliance	$/rack or $/node	Not announced	Low — speculative; requires supply chain build-out	Ask whether Etched plans to sell chips only or full systems; check BOM for server integration costs
Cloud API inference (potential)	Cloud-based inference API using Sohu chips (like GroqCloud)	$/token or $/request	Not announced	Medium if deployed — recurring and scalable	Ask whether Etched plans a GroqCloud-equivalent; requires significant additional capital for cloud build-out
Software/SDK licensing	Licensing Etched SDK or inference optimization toolchain	TBD	Not announced; no SDK available	Low — unclear IP defensibility for SDK	Determine whether SDK will be open-source or commercial; assess IP strategy

All revenue streams are prospective. Etched has not generated revenue as of Q1 2026. Revenue stream analysis based on analogous semiconductor and AI chip startup models (NVIDIA, Groq, Cerebras). Hardware chip sales are the primary assumed revenue stream; all others are speculative additional streams.

Pricing / Monetization Table
Pricing Item	List vs. Realized	Indicative Range	Source / Comparables	Diligence Ask
Sohu chip ASP	Not disclosed	Est. $5,000-$20,000/chip (comparables)	NVIDIA H100 at $15-35K/unit; Groq LPU rack at ~$5K/LPU equivalent	Request Etched's internal pricing model; validate against TSMC cost and target gross margin
NVIDIA H100 market reference price	List: ~$30K; realized: $15-35K depending on channel	$15-35K/chip	NVIDIA official pricing + press coverage	Use as benchmark for Etched's ASP target; Etched must price at materially lower TCO for Transformer inference
Groq GroqCloud API rate	Public: $0.27/1M tokens for Llama 3-70B	$0.27-$0.80/1M tokens	Groq official pricing page (2024)	Benchmark for inference cost-per-token; Etched must demonstrate comparable or better economics
AWS Inferentia2 instance cost	On-demand: $0.76/hr per chip; reserved lower	$0.45-$0.76/hr per chip	AWS EC2 pricing page (2024)	Lowest-cost inference reference at hyperscaler scale; Etched must beat this for on-prem cost-per-token

Sohu pricing is entirely estimated based on analogous semiconductor products. Competitor pricing reflects public pricing pages as of late 2024. All pricing subject to change. Etched has not disclosed ASP targets.

FI001: Revenue Model Bridge

Flow diagram showing how Etched converts customer inference workload into hardware revenue and eventual gross profit.

Revenue flow is hypothetical; no customer LOI or production contract has been disclosed. Flow is based on analogous semiconductor hardware business models.

[CI001, CI003, CI005, CI006]

4.2 GTM Motion and Sales Efficiency

Etched's go-to-market model is direct enterprise sales targeting hyperscalers and large AI-native companies. This is a high-ASP, low-volume sales motion typical of semiconductor infrastructure vendors. The target buyer is a VP of Engineering Infrastructure or CTO-level decision maker at a company spending >$100M/year on GPU compute. Buying cycles for inference silicon in this segment typically span 12-24 months from initial evaluation to production deployment. At pre-revenue stage, Etched has no disclosed sales team size, pipeline metrics, or customer acquisition cost data. The company's primary go-to-market asset is the claimed 10× throughput advantage for Transformer inference, which creates a compelling economic argument if validated. The sales process would require: (1) free or subsidized chip samples for evaluation; (2) technical integration support for SDK adoption; (3) reference architecture validation; and (4) supply commitment negotiations. Channel economics for Etched are uncharacterized. NVIDIA sells through an extensive reseller, OEM, and cloud marketplace channel. Etched would need to either develop similar channel relationships or rely on direct sales to a small number of large accounts. Given the chip design's Transformer-only specialization, customer concentration in the top 10 AI inference buyers (OpenAI, Anthropic, Cohere, hyperscalers, inference platforms) is nearly certain in the early years.

FI002: Unit Economics Bridge

Simplified flow of key unit economics inputs from chip cost to customer cost-per-token, showing the chain from TSMC wafer cost to inference pricing.

All inputs except wafer cost comparables are undisclosed estimates. This bridge is an illustrative model only. Actual COGS and ASP require company-provided data. Die size is the critical missing input.

[CI008, CI009, CI010, CI011]

4.3 Cost Structure and Unit Economics

Etched's cost structure is dominated by semiconductor manufacturing costs: wafer costs, packaging and test, yield loss, and NRE (non-recurring engineering) for chip design. At TSMC's 4nm process node, wafer costs are estimated at $15,000-20,000+ per wafer. Yield rates for first-generation designs at a new process node typically run 50-70%, improving to 85-95%+ in volume production. A single tape-out at a TSMC leading-edge node costs $5-15M in mask set costs alone. Semiconductor gross margins at scale can be very attractive: NVIDIA's data center GPU gross margins are in the 70-75% range. However, these margins require scale to cover the fixed NRE costs. Etched would need to sell thousands of chip units before the NRE costs are amortized. At $120M total funding, Etched faces a capital adequacy challenge: a single generation of leading-edge chip development plus initial production run costs can consume $50-100M, leaving limited headroom for a second generation or for sustaining operations through the 2+ year sales cycle before revenue. Key unit economics inputs that are unknown include: wafer cost commitment with TSMC, die size (which determines chips-per-wafer and cost-per-die), target ASP per chip, and yield assumptions. Without these, cost-per-token economics claimed by Etched cannot be independently validated.
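The dependence of cost-per-die on the undisclosed die size can be sketched as follows. This is a minimal illustration, not Etched's cost model: the 800 mm² die area is a hypothetical placeholder (borrowed from the ~800 mm² H100 figure cited in Section 3.4), the 300 mm wafer diameter is the industry-standard assumption, and the dies-per-wafer formula is a common edge-loss approximation rather than a fab-supplied number. The result covers silicon only and excludes packaging, memory, and test.

```python
import math

# ASSUMPTIONS (none disclosed by Etched):
#   - 300 mm wafers, standard for leading-edge nodes
#   - hypothetical 800 mm^2 die, used only because Sohu's die size is unknown
WAFER_DIAMETER_MM = 300.0
HYPOTHETICAL_DIE_AREA_MM2 = 800.0


def dies_per_wafer(die_area_mm2: float, wafer_diameter_mm: float = WAFER_DIAMETER_MM) -> int:
    """Gross dies per wafer via a common approximation with an edge-loss term."""
    radius = wafer_diameter_mm / 2
    gross = math.pi * radius ** 2 / die_area_mm2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return math.floor(gross - edge_loss)


def cost_per_good_die(wafer_cost_usd: float, die_area_mm2: float, yield_rate: float) -> float:
    """Silicon cost per functional die; excludes packaging, memory, and test."""
    if not 0 < yield_rate <= 1:
        raise ValueError("yield_rate must be in (0, 1]")
    return wafer_cost_usd / (dies_per_wafer(die_area_mm2) * yield_rate)


# Wafer cost and first-generation yield ranges are the estimates quoted in the text.
best_case = cost_per_good_die(15_000, HYPOTHETICAL_DIE_AREA_MM2, 0.70)
worst_case = cost_per_good_die(20_000, HYPOTHETICAL_DIE_AREA_MM2, 0.50)

print(f"Gross dies per wafer: {dies_per_wafer(HYPOTHETICAL_DIE_AREA_MM2)}")
print(f"Silicon cost per good die: ${best_case:,.0f} to ${worst_case:,.0f}")
```

Because die area sits in the denominator twice (fewer dies per wafer, and in practice lower yield for larger dies), the output is highly sensitive to the placeholder; substituting the actual die size is the first step once Etched discloses it.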

Unit Economics Table
Metric	Value or Status	Confidence	Why It Matters	Diligence Ask
Cost-per-wafer (TSMC 4nm)	Unknown — est. $15,000-$20,000/wafer	Low (estimated)	Determines cost-per-die before yield and packaging	Request TSMC agreement terms; ask cost-per-wafer commitment and allocation volume
Die size (Sohu chip)	Unknown — not disclosed	Unknown	Determines chips-per-wafer and gross cost-per-chip	Request die size specification; estimate from comparable ASIC designs
Yield rate (first gen)	Unknown — est. 50-70% at leading edge	Low (estimated)	Directly impacts cost-per-good-chip and gross margin	Request yield targets from Etched; benchmark against comparable first-generation ASIC yields
Target gross margin at scale	Unknown — est. 40-70%	Low (estimated)	NVIDIA achieves 70-75%; Etched likely lower in first gen due to NRE amortization	Request financial model; benchmark against Groq/Cerebras investor materials if available
NRE cost (tape-out + design)	Est. $20-50M total investment in Sohu design	Low (estimated)	Amortized over units sold; determines minimum volume for break-even	Ask Etched for total NRE spend; validate against tape-out milestone budget
Target ASP	Unknown — est. $5,000-$20,000/chip	Low (estimated)	Revenue and margin per unit; must be set competitively vs. NVIDIA	Request pricing model; ask about customer RFQ or LOI pricing discussions
CAC / sales cycle	Unknown — est. 12-24 month cycle for hyperscaler sales	Low (estimated)	High CAC in enterprise semiconductor; requires significant sales engineering investment	Ask for sales team structure; any LOI or evaluation agreements in place

All unit economics are estimated or unavailable as of Q1 2026. Etched is pre-revenue and has not disclosed any financial operating metrics. Estimates are based on analogous semiconductor industry benchmarks. Every null value requires a specific diligence request before underwriting.
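The claim that NRE must be amortized over thousands of units can be checked against the table's own estimate ranges. The sketch below treats per-unit contribution as ASP times gross margin and divides NRE by it; this is a simplification that ignores operating expenses and assumes the margin holds from the first unit. All inputs are the estimated ranges from the table, not Etched disclosures, and the function name is hypothetical.

```python
import math

# Estimated ranges from the Unit Economics Table (analyst estimates, not disclosures).
NRE_USD = (20_000_000, 50_000_000)
ASP_USD = (5_000, 20_000)
GROSS_MARGIN = (0.40, 0.70)


def breakeven_units(nre_usd: float, asp_usd: float, gross_margin: float) -> int:
    """Units needed for cumulative gross profit to cover NRE.

    ASSUMPTION: contribution per unit = ASP * gross margin; opex is ignored.
    """
    if asp_usd <= 0 or not 0 < gross_margin <= 1:
        raise ValueError("asp_usd must be positive and gross_margin in (0, 1]")
    contribution_per_unit = asp_usd * gross_margin
    return math.ceil(nre_usd / contribution_per_unit)


# Most favorable corner: low NRE, high ASP, high margin.
favorable = breakeven_units(NRE_USD[0], ASP_USD[1], GROSS_MARGIN[1])
# Least favorable corner: high NRE, low ASP, low margin.
unfavorable = breakeven_units(NRE_USD[1], ASP_USD[0], GROSS_MARGIN[0])

print(f"Break-even volume: {favorable:,} to {unfavorable:,} units")
```

Across the estimate ranges the break-even volume runs from the low thousands to roughly 25,000 units, which is consistent with the "thousands of chip units" framing and shows how strongly the answer depends on the undisclosed ASP.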

FI003: Financial Estimate Range

Bull/base/bear scenario ranges for key financial metrics: burn rate, runway, next-round size, and target chip ASP.

All ranges are analytical estimates based on comparable semiconductor startup financial patterns. Etched has not disclosed any financial operating data. Low/mid/high represent conservative/base/aggressive scenarios.

[CI012, CI013, CI014, CI015, CI016]

4.4 Capital Adequacy and Runway

Etched raised $120M in a Series A round in June 2024, with investors including Primary Venture Partners and Positive Sum. As of Q1 2026, the company has not announced subsequent fundraising. The $120M raise is the entirety of disclosed external funding. No debt facilities, project finance, or government grants have been publicly disclosed. Monthly burn for a semiconductor startup of Etched's stage is typically $3-8M/month, driven by: (1) engineering headcount (chip designers at $300-500K total compensation); (2) EDA software licensing ($5-10M/year); (3) wafer shuttle and mask set costs; and (4) operating expenses. At an assumed $5M/month burn rate (near the middle of that range), $120M provides approximately 24 months of runway from close, that is, to roughly mid-2026, aligned with expected tape-out completion timing. The critical capital milestone is production silicon availability. If tape-out completes on schedule, the company will need to raise a Series B (estimated $200-500M based on comparable semiconductor raises) before or immediately after first silicon, to fund volume production, customer ramp, and next-generation chip development. Failure to raise on schedule creates a material going-concern risk. The runway to first customer revenue is longer than the current funding supports without additional capital.

Capital Adequacy Table
Item	Value / Status	Confidence	Notes
Total funding raised	$120M (Series A, June 2024)	High	Public disclosure; multiple press sources confirm Series A amount
Cash on hand (est. Q1 2026)	Unknown — est. $30-70M remaining	Low (estimated)	Depends on actual burn rate since June 2024; no public disclosure
Monthly burn rate (est.)	Est. $3-8M/month	Low (estimated)	Semiconductor startup of this stage; ~50-100 employees assumed
Runway from June 2024 close	Est. 15-40 months depending on burn	Low (estimated)	At $5M/month: 24 months (to mid-2026); at $3M: 40 months (to late 2027)
Next funding trigger	Production silicon milestone + customer LOI	Medium	Series B raise likely needed Q3-Q4 2026 for volume production funding
Estimated next round need	Est. $200-500M Series B	Low (estimated)	Based on Groq/Cerebras capital consumption patterns to first product
Debt / project finance	None disclosed	Unknown	No public disclosure of debt facilities or government grants

All estimates are derived from analogous semiconductor startup burn rates and capital consumption patterns. Etched has not disclosed cash position, burn rate, or balance sheet. These estimates should be replaced with actual data from financial due diligence.
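
The runway arithmetic behind the table can be reproduced with a short sketch. The $120M raise, the June 2024 close, and the $3M / $5M / $8M monthly burn scenarios come from the text above. The function names and the assumption of constant burn with no revenue or interim financing are illustrative simplifications, not company data.

```python
from datetime import date

FUNDING_USD_M = 120.0      # disclosed Series A
CLOSE = date(2024, 6, 1)   # June 2024 close (day of month is arbitrary)


def runway_months(funding_usd_m: float, burn_usd_m_per_month: float) -> int:
    """Whole months of runway under a constant burn rate (simplifying assumption)."""
    return int(funding_usd_m // burn_usd_m_per_month)


def add_months(start: date, months: int) -> date:
    """Return the first day of the month that falls `months` after `start`."""
    index = start.year * 12 + (start.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)


for burn in (3.0, 5.0, 8.0):  # low / mid / high burn scenarios from the table
    months = runway_months(FUNDING_USD_M, burn)
    end = add_months(CLOSE, months)
    print(f"${burn:.0f}M/month -> {months} months, cash exhausted around {end:%b %Y}")
```

Under these assumptions the three scenarios give 40, 24, and 15 months of runway, exhausting cash around October 2027, June 2026, and September 2025 respectively. Actual burn data from financial due diligence should replace the scenario inputs.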

4.5 Financial Gaps and Diligence Blockers

Etched's financial profile has significant gaps that cannot be resolved from public sources. Revenue is zero (pre-product). All operating metrics (burn rate, cash position, headcount, COGS structure, gross margin projections) are undisclosed. No financial statements are available. The company is not required to file public financials as a private company. The funding of $120M in June 2024 is the primary financial fact of record. The most material financial diligence blockers are: (1) actual burn rate and cash position as of Q1 2026 (is the company funded through tape-out or does it face near-term capital need?); (2) TSMC wafer commitment and cost terms (which determine cost-per-chip and gross margin potential); (3) first customer LOI or design win (which would validate revenue model and timing); and (4) total capital required to reach first production volume (which determines whether Series A is sufficient or a bridge is needed). The financial verdict is that Etched is in the highest-risk zone for a semiconductor startup: all capital has been deployed against development without revenue, the product is unproven in production, and the next funding round must be raised before commercial validation is possible. This is structurally similar to the capital position of Cerebras, Groq, and SambaNova at comparable stages—but each of those companies required $700M-$1.2B to reach commercial offering, compared to Etched's $120M raised to date.

Public Financial Gaps Table
Missing Metric	Impact on Underwriting	Diligence Path
Revenue run rate / ARR	Cannot assess revenue quality without any traction data	Request from company; no public source available
Current cash position and burn rate	Cannot assess runway or going-concern risk without actual cash data	Request Q4 2025 or Q1 2026 bank statements / management accounts from company
TSMC wafer cost and allocation terms	Cannot calculate cost-per-chip or gross margin without this	Request TSMC purchase agreement or term sheet from company
Die size and yield targets	Cannot validate unit economics or cost-per-token claim without chip spec	Request chip floorplan or area estimate; ask for yield targets in investor materials
Customer LOI or design wins	No commercial validation exists; all forward revenue is speculative	Ask company for any signed LOIs, evaluation agreements, or POC agreements
Detailed cap table and option pool	Cannot assess dilution or employee equity without full cap table	Request full cap table; check for SAFEs or convertible notes in addition to Series A
Annual operating expense breakdown	Cannot model payback or funding adequacy without cost structure	Request management accounts or investor reporting package
NRE and tape-out budget actuals vs. plan	Cannot assess whether $120M is sufficient without budget tracking	Request milestone budget tracking from company; compare to tape-out status

This table represents the complete set of financial metrics unavailable from public sources as of Q1 2026. Etched is a private pre-revenue company with no public financial disclosures. All items require company-provided data or third-party estimation.

FI004: Capital Intensity / Cash-Flow Map

Illustrative waterfall of Etched's $120M Series A capital deployment from close to first production revenue.

Waterfall is an illustrative scenario based on analogous semiconductor startup capital deployment patterns. All figures are estimates. Actual allocations are unknown. The negative ending balance scenario is plausible without additional fundraising or revenue earlier than assumed.

[CI014, CI015, CI016, CI017, CI018, CI019]

4.6 Exhibits

Chapter 05

05 Product & Technology

5.1 Product Definition and Sohu Chip Specification

Etched's sole product is the Sohu ASIC, a purpose-built inference accelerator that permanently encodes the Transformer self-attention computation in silicon. The company's central premise is that by hardwiring the attention mechanism rather than emulating it on programmable logic, Sohu eliminates the instruction-dispatch and memory-management overhead that limits GPU throughput on autoregressive Transformer workloads. According to Etched, this architectural choice produces approximately 10× the throughput of an NVIDIA H100 for Transformer inference tasks, though this claim has not been independently verified with production silicon. Sohu targets the inference phase of LLM deployment, not training. The chip is designed for Transformer-only architectures: dense decoder models (GPT-4 class, LLaMA, Mistral, Falcon) and encoder-decoder models (T5 class). It does not support non-Transformer architectures such as Mamba state-space models or purely recurrent networks. The chip's specialization means any customer adopting Sohu commits to the Transformer paradigm for the chip's useful life. As of Q1 2026, Sohu does not exist as production silicon. The company has claimed that tape-out on TSMC's 4N process node is in progress, but no engineering samples have been publicly demonstrated. No product specification sheet, die photograph, or third-party benchmark has been published. The product page at etched.com/sohu returns a 404 error. Etched's publicly visible commercial assets are currently limited to a company homepage (etched.com), a fundraising announcement ($120M Series A, June 2024), and the two founders' professional histories.

Product module / asset matrix
Module / Asset	Type	Development Status	Differentiation	User / Buyer	Diligence Gap
Sohu ASIC	Hardware chip	Tape-out claimed in progress; no production silicon (Q1 2026)	Transformer attention hardcoded; claimed 10x throughput vs H100 for inference	Hyperscaler inference operators, large AI-native companies	Confirm tape-out completion; request die specification and first silicon timeline
Transformer attention engine (silicon block)	Hardened logic circuit	Design claimed complete; silicon unproven	Fixed-function attention eliminates GPU kernel overhead; lowest latency for attention compute	Chip architects and inference platform teams	Request floorplan or block diagram; confirm multi-head attention head count and SRAM capacity
HBM memory subsystem	Memory interface	In design (inferred); generation and stack count undisclosed	High-bandwidth DRAM for model weights and KV-cache; bandwidth determines tokens/sec ceiling	Inference platform engineers	Confirm HBM generation (HBM3/HBM3E), stack count, and bandwidth target; request memory architecture spec
Model inference compiler	Software toolchain	Early-stage; no public release	Converts standard Transformer checkpoint (HuggingFace format) to Sohu execution graph	ML engineers deploying models	Request compiler architecture; confirm HuggingFace SafeTensors ingestion; ask for model coverage list
Inference runtime / serving layer	System software	Early-stage; no public documentation	Manages token scheduling, request batching, and KV-cache allocation for multi-user inference serving	Inference platform operators	Request system software roadmap; confirm OpenAI-compatible API endpoint support; ask about KV-cache eviction policy
Developer SDK	Software interface	Not yet available; no documentation	Low: the absence of an SDK is a negative differentiator vs GPU incumbents with mature tooling	Application developers, ML engineers	Request SDK access and timeline; confirm expected open-source vs commercial licensing model; ask about HuggingFace / vLLM integration

Module maturity assessment as of Q1 2026. All software assets are pre-release; all silicon assets are pre-production. Status assessments based on absence of public SDK, documentation, or engineering sample announcements. Sohu chip claims based on company-stated positioning at etched.com and investor press coverage.

Workflow / use-case table
Use Case	Model Architecture	Inference Pattern	Sohu Fit	Limitation / Constraint	Assessment
LLM chatbot / conversational AI	Transformer decoder (GPT, LLaMA, Mistral class)	Autoregressive token generation; sequential decode	High — hardcoded attention is ideal for sequential autoregressive decode; KV-cache access pattern well-suited to HBM	Transformer-only; no SSM or hybrid model support	Primary intended use case; best technical fit
Code generation (Copilot-class tools)	Transformer decoder, long-context	Autoregressive decode, 8K-100K+ context windows	High — long-context workloads have high attention compute fraction; hardcoded attention scales favorably	Long-context KV-cache requires large HBM capacity; stack count and eviction policy matter	Strong fit; long-context is Sohu's architectural sweet spot
Text embedding generation	Transformer encoder (BERT, RoBERTa class)	Single forward pass; no autoregressive decoding	Medium — attention hardening still reduces encoder compute cost, but full throughput benefit requires decoding workloads	Less differentiation vs GPU; FlashAttention already highly optimized for encoder inference	Moderate fit; not the primary workload but supported
Multimodal inference (vision-language)	Transformer encoder + decoder; cross-attention	Mixed encode-then-decode; cross-attention between modalities	Medium-Low — cross-attention layers are architecture-specific; Sohu attention engine must support cross-attention patterns	Multimodal cross-attention design varies by model; compatibility unconfirmed	Uncertain fit; requires model-specific validation
Mixture-of-Experts Transformer inference (Mixtral class)	Sparse MoE Transformer	Sparse routing + attention; only subset of experts active per token	Low — attention is accelerated but sparse MoE routing overhead occurs on host; no confirmed MoE support	MoE token routing may not be accelerated; gating computation falls outside hardwired attention engine	Limited fit; MoE routing bottleneck likely on host CPU
Non-Transformer SSM inference (Mamba, RWKV)	State-space model or recurrent architecture	Recurrence-based; no dot-product attention	Not supported — Sohu is Transformer-only; SSM requires different compute primitives	Fundamental architectural incompatibility; Mamba uses selective state spaces with no attention operation	Excluded use case; confirmed incompatibility with Sohu design

Use-case fit assessed against Etched's stated Transformer-only architecture. MoE and SSM limitations are inferred from the hardwired Transformer attention design, not confirmed by Etched. Actual model compatibility requires SDK and model-compatibility matrix from the company.
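
The long-context constraint noted in the table can be made concrete with a KV-cache sizing sketch. The formula is the standard one for decoder-only Transformers (one key tensor and one value tensor cached per layer). The default layer count, KV-head count, and head dimension are the published Llama 2 70B configuration, and 16-bit cache entries are an assumption. None of these figures describe Sohu itself, whose memory capacity is undisclosed.

```python
def kv_cache_bytes(
    seq_len: int,
    n_layers: int = 80,        # Llama 2 70B: number of decoder layers
    n_kv_heads: int = 8,       # Llama 2 70B: grouped-query attention KV heads
    head_dim: int = 128,       # Llama 2 70B: hidden size 8192 / 64 query heads
    bytes_per_value: int = 2,  # assumption: 16-bit cache entries
) -> int:
    """KV-cache size in bytes for one sequence of `seq_len` tokens."""
    # Factor 2 covers the separate key and value tensors stored per layer.
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_value
    return per_token * seq_len


for context in (8_192, 100_000):
    gb = kv_cache_bytes(context) / 1e9
    print(f"{context:>7} tokens -> {gb:.1f} GB of KV-cache per sequence")
```

With these inputs a single 100K-token sequence needs roughly 33 GB of cache, which is why HBM capacity and eviction policy, rather than attention compute alone, bound how many long-context requests one device can serve concurrently.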

FE001: Product architecture map

Layer diagram of Sohu chip's product architecture, from customer application layer down to TSMC silicon foundry, showing the components Etched owns vs. depends on externally.

Compiler, runtime, and SDK layers are pre-release with no public documentation. HBM generation, packaging type, and host interconnect are inferred from AI chip industry norms — not disclosed by Etched. This stack reflects the expected architecture based on available evidence, not confirmed product assets.

[CE001, CE002, CE005, CE008, CE016, CE029]

5.2 Architecture — Hardened Transformer Attention Silicon

Etched's core architectural thesis is that the Transformer attention operation (scaled dot-product multi-head self-attention, as introduced by Vaswani et al. in "Attention Is All You Need", 2017) is sufficiently stable and computationally dominant to justify permanent silicon encoding. In a conventional GPU, attention is computed by CUDA kernels (including FlashAttention-optimized variants) on general-purpose SIMD compute units. Sohu's attention engine replaces this programmable path with hardwired logic whose function cannot be changed after manufacture. The memory architecture is the second key design dimension. Autoregressive LLM decoding is typically limited by memory bandwidth rather than raw compute, because each token generation step requires reading the full KV-cache and model weights from memory. Sohu almost certainly integrates High Bandwidth Memory (HBM) stacks to address this, following the same path as the NVIDIA H100 (HBM3) rather than the on-chip-SRAM-dominant design of the Groq LPU. The specific HBM generation and stack count have not been disclosed. HBM bandwidth sets the ceiling on how many tokens per second the chip can decode, which is the primary customer-visible performance metric. The hardwired attention engine creates a meaningful architectural trade-off: it achieves maximum silicon efficiency for the attention computation but permanently excludes support for future architectures that may become dominant. Mamba and other state-space models represent the most visible alternative paradigm; these architectures replace attention with linear recurrence and cannot run on Sohu's attention engine. Etched is therefore making a high-conviction bet that Transformer attention will remain the dominant LLM inference paradigm for the duration of Sohu's commercial life, which typically spans 3-5 years from first production.
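
The bandwidth argument can be illustrated with a simple roofline-style bound. This is a minimal sketch, assuming every decode step streams the model weights once (shared across the batch) plus each sequence's KV-cache. The 3,350 GB/s figure is the published H100 SXM HBM3 bandwidth; the 70 GB weight footprint (a 70B-parameter model at one byte per parameter) and the 1 GB KV-cache per sequence are hypothetical inputs. Sohu's own bandwidth is undisclosed, so no Sohu estimate is implied.

```python
def decode_tokens_per_second(
    bandwidth_gb_s: float,
    weights_gb: float,
    kv_gb_per_sequence: float,
    batch_size: int,
) -> float:
    """Upper bound on aggregate decode throughput when memory traffic is the bottleneck."""
    # One decode step reads the weights once and the KV-cache of every sequence.
    gb_per_step = weights_gb + kv_gb_per_sequence * batch_size
    steps_per_second = bandwidth_gb_s / gb_per_step
    # Each step emits one token per sequence in the batch.
    return steps_per_second * batch_size


for batch in (1, 32):
    ceiling = decode_tokens_per_second(3350.0, 70.0, 1.0, batch)
    print(f"batch={batch:>2}: at most {ceiling:,.0f} tokens/s")
```

The bound depends only on memory bandwidth and bytes moved, so fixed-function attention logic cannot exceed it. Any throughput advantage has to come from higher utilization, larger batches, or more bandwidth per device, which is why the undisclosed HBM configuration is a priority diligence item.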

Technology / operating architecture table
Architecture Layer	Technology / Component	Etched Implementation	Key Dependency / Risk	Status
Compute substrate	Hardwired ASIC logic (non-programmable)	Transformer multi-head self-attention permanently encoded in combinational and sequential logic gates; no SIMD or programmable ALU units	Architecture lock-in: cannot adapt post tape-out; any future Transformer revision or new architecture requires chip re-spin	Design (claimed)
Process node	TSMC 4N (4nm-class advanced node)	TSMC 4N claimed; provides high transistor density and power efficiency for AI ASIC at leading-edge node	TSMC allocation constrained by hyperscaler demand; startup customers face longer lead times and smaller wafer batches	In tape-out (company-claimed, not independently confirmed)
Memory interface	HBM (High Bandwidth Memory) — generation undisclosed	External HBM stack(s) co-packaged with Sohu die; provides model weight and KV-cache bandwidth essential for autoregressive inference	HBM supply controlled by SK Hynix, Micron, Samsung; startup allocation secondary to hyperscaler commitments	In design (inferred; not disclosed by Etched)
Advanced packaging	CoWoS or equivalent (inferred)	HBM die stacking on silicon interposer likely required for TSMC 4N plus HBM integration; specific packaging type not disclosed	CoWoS capacity at TSMC is oversubscribed; hyperscalers hold priority access; Etched allocation unconfirmed	Unconfirmed; not disclosed
Host interconnect	PCIe Gen 5 (inferred)	Standard x16 PCIe server slot for host CPU-to-chip communication and DMA transfer of model weights and outputs	Host interconnect bandwidth may constrain prefill throughput for long-context models with large KV-caches	Unconfirmed; not disclosed
Compiler / software runtime	Proprietary toolchain (pre-release)	Custom model compiler converts Transformer graph to Sohu-optimized execution format; inference runtime handles batching and scheduling; no third-party LLVM/MLIR backend announced	No existing open-source compiler path; entire software stack must be developed and maintained by Etched engineering team	Early-stage; no public release

Architecture assessment based on Etched's stated Transformer-only ASIC approach and analogous AI chip architectures (Groq LPU, Google TPU, AWS Trainium). HBM, packaging, and interconnect specs are inferred from industry norms — Etched has not published a technical specification. TSMC 4N claim is company-stated and not independently confirmed.

FE002: Customer workflow / operating flow

End-to-end inference workflow on a Sohu chip: from customer model import through Etched compiler and runtime to returned inference output, illustrating how the hardwired attention engine fits into the serving stack.

Workflow is hypothetical — based on expected inference chip operating model; Etched has not published compiler or runtime documentation. Compiler and runtime steps are pre-release. Edge labels reflect standard inference serving flow for Transformer decoder models using HBM-backed KV-cache.

[CE001, CE005, CE008, CE025, CE030]

5.3 Manufacturing, Maturity, and Technology Dependencies

Sohu is designed for TSMC's 4N process node, a 4nm-class advanced node that provides high transistor density and power efficiency appropriate for AI inference chips. TSMC's 4N is a customer-specific variant of the N4 process family and requires a foundry relationship with significant minimum wafer commitments. Access to leading-edge TSMC capacity is competitively constrained; major customers including NVIDIA, AMD, Apple, and Qualcomm hold priority allocation. As a startup with no prior TSMC production history, Etched must negotiate wafer allocation as a new customer, which typically involves accepting smaller batch sizes and longer lead times. The chip's manufacturing maturity as of Q1 2026 is pre-silicon: tape-out has been claimed as in progress but not confirmed. First-pass silicon on a novel architecture at a leading-edge node carries inherent design risk: industry literature reports that 50-70% of complex digital ASICs succeed without a re-spin, and each re-spin adds 6-12 months and $5-15M in additional NRE costs. No engineering samples have been publicly demonstrated. Critical technology dependencies beyond TSMC include HBM supply (concentrated with SK Hynix, Micron, and Samsung), advanced packaging (CoWoS-style integration is required for HBM and is itself capacity-constrained at TSMC), and EDA toolchain licensing (Cadence, Synopsys). Each of these dependencies is a potential single point of failure in the supply chain. Etched, as a small startup, may face allocation challenges relative to hyperscaler-backed chip programs (Amazon Trainium, Google TPU, Microsoft Maia) that have priority agreements in place.
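
The re-spin exposure described above can be expressed as a probability-weighted range. The first-pass success rates (50-70%), schedule delay (6-12 months), and NRE cost ($5-15M) are the figures quoted in the text. Treating the outcome as a single yes/no re-spin is a simplifying assumption, and the function name is illustrative.

```python
def expected_respin_impact(
    p_first_pass: float, delay_months: float, nre_usd_m: float
) -> tuple[float, float]:
    """Probability-weighted (delay in months, extra NRE in USD M) for one possible re-spin."""
    p_respin = 1.0 - p_first_pass
    return p_respin * delay_months, p_respin * nre_usd_m


# Favorable end of the quoted ranges: 70% first-pass success, 6 months, $5M.
best_delay, best_nre = expected_respin_impact(0.70, 6, 5)
# Unfavorable end: 50% first-pass success, 12 months, $15M.
worst_delay, worst_nre = expected_respin_impact(0.50, 12, 15)

print(f"Expected delay: {best_delay:.1f} to {worst_delay:.1f} months")
print(f"Expected extra NRE: ${best_nre:.1f}M to ${worst_nre:.1f}M")
```

On these inputs the expected schedule slip is about 1.8 to 6 months and the expected extra NRE about $1.5M to $7.5M. The realized outcome is binary, however, so runway planning should be tested against the full 6-12 month delay rather than the average.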

FE003: Critical dependency map

DAG of Etched's critical external dependencies for the Sohu chip program, from silicon foundry and memory supply through software frameworks and customer deployment.

HBM vendor, packaging type, and IP core vendors are inferred from AI ASIC industry norms — Etched has not disclosed specific suppliers. EDA vendor and IP licensing are standard for TSMC-fabbed ASICs of this class. Customer deployment node represents expected outcome post-SDK availability, not a confirmed deployment.

[CE003, CE011, CE016, CE021, CE027]

5.4 Software Stack, SDK, and Developer Surface

Etched has not published a developer SDK, API documentation, integration guide, or model compatibility matrix as of Q1 2026. The absence of any public software artifact is the most significant product-readiness gap from a commercial deployment perspective. Enterprise inference customers require at minimum: a model conversion tool (to load HuggingFace-format checkpoints onto the chip), an inference runtime (to handle request batching and token scheduling), and a serving API compatible with OpenAI-format endpoints (the de facto inference API standard). The HuggingFace Transformers library is the dominant ecosystem framework for Transformer model distribution and inference. Any commercial AI inference chip must integrate with the HuggingFace model hub format (SafeTensors, config.json schema) to allow customers to run standard LLaMA, Mistral, and Falcon checkpoints without manual conversion. Etched has not disclosed whether its compiler ingests HuggingFace model formats directly or requires a separate conversion step. Developer adoption for inference chips follows a well-documented pattern: developer-facing documentation → open-source SDK → first reference deployment → ecosystem tooling. Etched is currently at step zero: no SDK, no documentation, no reference implementation. Groq, which shipped its LPU with a developer-accessible cloud API and public benchmark data, demonstrates that developer surface is a critical commercial accelerant. Etched's lack of developer surface suggests the company is focused on silicon first and software second — a reasonable priority ordering at tape-out stage — but this creates a customer adoption delay that will add 6-12 months to revenue realization after first silicon.
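
To show what HuggingFace-format ingestion involves at its simplest, the sketch below reads the `model_type` field that HuggingFace `config.json` files carry and applies an architecture gate. The supported and unsupported sets are hypothetical and chosen only to mirror the Transformer-only constraint described in this chapter. Etched has published no compatibility matrix, and `check_checkpoint` is an illustrative name, not part of any Etched tool.

```python
import json

# Hypothetical allow-list for a Transformer-only compiler (assumption for
# illustration; not Etched's unpublished compatibility matrix).
ASSUMED_SUPPORTED = {"llama", "mistral", "falcon"}
# Architectures without dot-product attention, per the use-case table.
ASSUMED_UNSUPPORTED = {"mamba", "rwkv"}


def check_checkpoint(config_json: str) -> str:
    """Classify a checkpoint by the `model_type` field of its config.json text."""
    config = json.loads(config_json)
    model_type = config.get("model_type", "")
    if model_type in ASSUMED_SUPPORTED:
        return "supported"
    if model_type in ASSUMED_UNSUPPORTED:
        return "unsupported"
    return "unknown"  # needs vendor confirmation (e.g. MoE or multimodal models)


example = '{"model_type": "llama", "num_hidden_layers": 80, "num_key_value_heads": 8}'
print(check_checkpoint(example))
print(check_checkpoint('{"model_type": "mamba"}'))
print(check_checkpoint('{"model_type": "mixtral"}'))
```

A production compiler would also have to validate layer shapes, attention variants, and weight formats. The diligence question is whether Etched's toolchain performs this ingestion directly or requires a separate conversion step.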

Trust / quality / compliance table
Dimension	Status / Claim	Evidence Available	Risk Level	Diligence Path
Silicon quality (pre-production)	No production silicon as of Q1 2026; first silicon unproven	No evidence — no engineering samples have been demonstrated or announced publicly	High — first-pass ASIC success rate typically 50-70%; first silicon is highest-risk milestone in chip development	Request tape-out confirmation and expected first silicon delivery date; ask for DFM sign-off documentation from design team
Process node compliance (TSMC 4N)	TSMC 4N use claimed; no third-party confirmation available	Etched website and investor press coverage state TSMC 4N; TSMC does not publish customer chip lists	Medium — TSMC 4N is a mature process; primary risk is capacity allocation, not process reliability	Request TSMC foundry agreement term sheet or manufacturing purchase order under NDA; confirm capacity allocation
IP licensing and freedom to operate	Transformer attention algorithm is openly published (academic origin); standard cell library and PHY IP may require licensing	No IP disputes or litigation disclosed; Vaswani et al. (2017) is an openly accessible academic paper, which does not by itself place the technique in the public domain	Low-Medium: ARM or Synopsys standard cell library licensing typical; no known IP conflicts identified	Confirm standard cell IP vendor and licensing terms; run freedom-to-operate search on ASIC architecture claims
Supply chain resilience	HBM, packaging, and substrate supply chains not disclosed by Etched	No supply agreements announced; HBM market is constrained with limited suppliers globally	High — HBM and CoWoS capacity are oversubscribed; startup allocation priority behind hyperscalers	Request HBM supplier letter of intent or allocation agreement; confirm packaging partner identity; assess supply diversification options
Security and data privacy	No security architecture or model IP protection documentation available from Etched	No security whitepaper, certification, or technical disclosure published by Etched	Medium — inference chips handle sensitive model weights; enterprise customers require assurance of weight isolation and secure boot	Request security architecture brief; confirm memory encryption support, secure boot, and model IP protection controls in chip design
Regulatory and export compliance	No regulatory filings or compliance certifications disclosed; not required for pre-revenue startup	No adverse regulatory signals; US semiconductor export controls (ECCN) apply to advanced AI chips exported to certain jurisdictions	Low-Medium: advanced AI accelerators may fall under EAR advanced-computing controls (ECCN 3A090, with related technology under 3E001); export to restricted jurisdictions requires BIS license	Confirm ECCN classification for Sohu chip; verify compliance with US semiconductor export control regulations before customer shipments

Trust and compliance assessment as of Q1 2026. All silicon quality assessments are pre-production and necessarily speculative. IP, security, and regulatory assessments are based on industry norms for ASIC startups using TSMC advanced nodes. No adverse regulatory, litigation, or quality incidents have been publicly disclosed for Etched.

5.5 Roadmap, Differentiation, and Technical Risks

Etched's differentiation thesis rests on three claims: (1) that hardwired Transformer attention delivers 10× throughput vs. NVIDIA H100 for inference workloads; (2) that this performance advantage translates to materially lower cost-per-token economics for hyperscaler inference operators; and (3) that the Transformer architecture will remain dominant for long enough to justify a single-architecture ASIC investment. None of these claims has been independently validated as of Q1 2026. The most significant technical risk is architecture lock-in. If non-Transformer architectures — Mamba, RWKV, or future recurrent variants — gain significant market share, Sohu becomes obsolete faster than its depreciation schedule allows. The history of domain-specific silicon (early AI training chips, FPGA-based inference accelerators, first-generation neuromorphic chips) shows that architectural bets made in silicon can become stranded assets within one design generation. Etched has made an unusually concentrated bet: not just Transformer, but the specific attention operation hardwired in silicon, with no programmable fallback. The product roadmap is entirely uncharacterized beyond the current Sohu tape-out. No second-generation chip has been announced, and no product family (e.g., inference-only vs. fine-tuning, data-center vs. edge) has been disclosed. This is appropriate for a company at tape-out stage, but represents an additional risk factor: customers making supply commitments need visibility into the next-generation roadmap to justify long-term platform adoption. Etched's roadmap opacity is a commercial risk even if first silicon succeeds on schedule.

Roadmap / release / development-stage table
Milestone	Estimated Target	Status	Key Dependencies	Risk Level
Architecture design freeze	Estimated H2 2023 to H1 2024	Complete (inferred from tape-out claim)	EDA synthesis, timing closure, design rule check (DRC) sign-off, IP licensing	Low — assumed complete since tape-out is stated to be in progress
Tape-out (TSMC 4N)	Estimated H1 to H2 2025	In progress (company-claimed); not independently confirmed	TSMC wafer allocation, DFM compliance, mask set fabrication ($5-15M NRE)	High — first tape-out at a new process node is highest-risk design milestone; no third-party confirmation
First silicon / engineering samples	Estimated H2 2025 to Q2 2026	Not yet announced; likely pending as of Q1 2026	Tape-out completion; TSMC wafer processing (8-16 weeks); assembly and packaging	High — first silicon failure rate is 30-50%; no engineering sample demonstrated as of Q1 2026
Design validation and benchmarking	Estimated Q2 to Q3 2026	Not started; dependent on first silicon availability	Engineering sample availability; test harness; benchmark model suite; SDK minimum viable implementation	Very High — validation reveals yield, performance, and silicon bug issues; re-spin adds 6-12 months
Customer evaluation program	Estimated Q3 2026 to Q1 2027	Not announced; no evaluation partners named	Engineering sample delivery; inference compiler; customer NDA; evaluation server infrastructure	Very High — no customers or evaluation agreements announced; timeline extends if re-spin is required
Volume production ramp	Estimated 2027 and beyond	Not announced; no production commitments disclosed	Production silicon validation; TSMC volume wafer commitment; HBM supply agreements; customer purchase orders	Very High — contingent on all prior milestones; Series B capital required before volume production

All timeline estimates are analytical projections based on TSMC advanced-node chip development norms and comparable AI ASIC programs (Groq, Cerebras, Amazon Trainium). Etched has not published an official product roadmap. Estimated dates assume no re-spin; a single re-spin adds 6-12 months to each subsequent milestone.
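
The effect of a single re-spin on the downstream milestones can be sketched by shifting the latest target quarter of each milestone. The milestone targets and the 6-12 month re-spin delay come from the table and note above. The helper name and the quarter-based representation are illustrative assumptions.

```python
def shift_quarter(year: int, quarter: int, months: int) -> tuple[int, int]:
    """Shift a (year, quarter) target by a delay in months (whole quarters only)."""
    index = year * 4 + (quarter - 1) + months // 3
    return index // 4, index % 4 + 1


# Latest estimated targets for milestones that follow first silicon.
milestones = {
    "Design validation and benchmarking": (2026, 3),
    "Customer evaluation program": (2027, 1),
}

for name, (year, quarter) in milestones.items():
    early = shift_quarter(year, quarter, 6)    # 6-month re-spin
    late = shift_quarter(year, quarter, 12)    # 12-month re-spin
    print(f"{name}: Q{quarter} {year} -> Q{early[1]} {early[0]} to Q{late[1]} {late[0]}")
```

On these inputs a single re-spin moves the end of the customer evaluation program from Q1 2027 to between Q3 2027 and Q1 2028, which also pushes out the point at which volume production could begin.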

FE004: Product maturity / capability map

Capability maturity matrix for Sohu chip modules, assessed across four maturity stages from architecture design through production availability.

Maturity assessments based on absence of public SDK, documentation, or engineering sample announcements as of Q1 2026. Design-stage claims for silicon modules are company-stated and not independently confirmed. All software modules are pre-release. Matrix reflects the most optimistic publicly supportable assessment.

[CE001, CE003, CE004, CE005, CE018, CE019]

5.6 Exhibits

Chapter 06

06 Customers

6.1 Customer Base Segmentation and Target Buyers

Etched's Sohu chip targets the AI inference chip buyer, a narrow population of companies running Transformer-based large language models at a scale where GPU cost is a material operational expense. The primary buyer persona is the VP of Infrastructure or the Head of ML Platform at a company spending more than $50 million annually on GPU compute for Transformer inference. This profile describes approximately 50-200 companies globally as of Q1 2026, concentrated in three tiers: (1) frontier AI labs such as OpenAI, Anthropic, and Mistral that run proprietary Transformer models at consumer web scale; (2) hyperscalers such as AWS, Google, and Microsoft that offer LLM inference APIs as a commercial product; and (3) inference-as-a-service platforms such as Together AI, Anyscale, and Perplexity AI that run open-source Transformer models for commercial developers. The segmentation is defined by the nature of the Transformer workload rather than by industry vertical. Any organization that runs autoregressive Transformer decoder inference at scale (whether an AI-native company, a hyperscaler offering LLM APIs, or a large enterprise with a proprietary model deployment) represents a potential Sohu customer. Organizations that run primarily Transformer encoder workloads (embedding, classification, BERT-class models) are a lower-priority segment because the attention-compute advantage of hardwired silicon is less pronounced for single-pass encoder inference than for autoregressive decoding. The geographic concentration of Etched's addressable buyer segment is heavily US-centric in the first wave: OpenAI, Anthropic, Together AI, Anyscale, Scale AI, and Perplexity are all US-headquartered companies. Cohere (Canada) and Mistral (France) represent the only tier-1 international prospects with publicly known inference scale. This concentration is advantageous from a sales-motion perspective, with fewer accounts that are geographically accessible, but creates risks if the US regulatory environment or export-control framework restricts chip supply relationships. [CU001, CU002, CU003, CU018]

Customer segmentation table
Customer Segment	Representative Companies	Inferred Annual GPU Spend	Transformer Inference Need	Etched Sohu Fit	Estimated Sales Cycle
Frontier AI Labs	OpenAI, Anthropic, Mistral, xAI	$100M–$1B+	Very High — proprietary Transformer decoder inference at consumer web scale	High — primary target; would benefit most from 10x throughput claim if verified	18-30 months from first silicon
Hyperscaler LLM API Teams	AWS (Bedrock/Inferentia), Google (Gemini API/TPU), Microsoft (Azure OpenAI)	$1B+ (internal inference)	Very High — commercial LLM inference APIs require lowest possible cost-per-token	Medium — have captive silicon programs (Trainium, TPU, Maia) that reduce urgency to adopt external ASICs	24-36 months; competing against internal silicon roadmaps
Inference-as-a-Service Platforms	Together AI, Anyscale, Perplexity AI	$20M–$200M	High — open-source Transformer model inference is the core product	High — most price-sensitive segment; highest motivation to reduce cost-per-token	12-24 months from first silicon delivery
Enterprise LLM Application Vendors	Cohere, Scale AI, Hugging Face	$10M–$100M	Medium-High — inference for enterprise RAG, embeddings, and API products	Medium — workload mix varies; embedding workloads are less attention-compute-bound	12-24 months; require software integration support
Research Labs and Academic Institutions	Meta AI Research, Allen Institute, national labs	$10M–$500M	Medium — inference used for research evaluation but training dominates spend	Low-Medium — inference fraction of compute is lower; Sohu does not accelerate training	Not primary near-term target

Spend estimates are based on public funding, inference pricing disclosed by competitor API providers, and the inference cost fraction of compute budgets discussed in industry coverage. No Etched-confirmed customer relationships exist; all segment entries are potential targets. Hyperscaler 'internal inference' spend figures are not directly comparable to third-party inference spend and represent internal cost allocation. Sales cycle estimates assume first production silicon available H2 2026 to H1 2027.

[CU003, CU006, CU007, CU008, CU009, CU011]

FU001: Customer journey map

Eight-phase journey from first awareness to multi-year expansion for an AI inference chip buyer evaluating Etched's Sohu chip.

Journey phases and time estimates are inferred from analog inference chip company adoption patterns (Groq, AWS Inferentia, Cerebras). Etched has disclosed no customer journey details. Individual phase durations are estimated at 1-4 months each, and the full eight-phase cycle is estimated at 12-24 months (narrower than the 8-32 month span that summing the per-phase bounds would give).

[CU012, CU013, CU026]

6.2 Adoption Trajectory and Current Traction

Etched has zero customer traction as of Q1 2026. The company has not disclosed any customers, active evaluations, signed LOIs, design wins, or named engineering-briefing recipients. This is the baseline state for a chip startup whose silicon has not yet been demonstrated publicly: potential customers cannot benchmark a chip that does not exist as a production sample, and procurement teams are unlikely to sign evaluation agreements for hardware without physical samples available for technical validation. The adoption trajectory for an AI inference chip startup typically follows five phases: (1) pre-silicon awareness and engineering briefings to potential customers, (2) first-silicon delivery and confidential performance benchmarking, (3) pilot deployment with 1-3 committed design-win customers, (4) production ramp and named customer announcements, and (5) ecosystem expansion. Based on public evidence, Etched has not yet entered phase 1 on any publicly confirmable basis. The company has not announced engineering briefings, sample availability timelines, or any customer engagement program as of Q1 2026. For comparison, Groq began its customer engagement before shipping hardware by building a developer community around its architecture and providing early access to benchmarks. Cerebras similarly briefed hyperscaler customers well in advance of commercial availability. Etched's adoption trajectory, measured against these analogs, is delayed: the company has raised $120 million but has not yet provided any public evidence of customer engagement activity. The earliest plausible first-revenue date, given a 12-24 month evaluation cycle after first-silicon delivery, is H2 2027 to 2028, assuming tape-out completes on schedule in 2025-2026 and first silicon is delivered between H2 2026 and H1 2027. If first silicon arrives only in H1 2027, the same evaluation cycle pushes first revenue to 2028 at the earliest. [CU012, CU013, CU020, CU024, CU029]

Customer growth / adoption trajectory table
Phase	Timeline (Projected)	Customer Count	Revenue Stage	Key Milestone Required	Primary Risk
Phase 0: Pre-Silicon / No Engagement	Current — Q1 2026	0	Pre-revenue	None achieved; no customer signals disclosed	Delay to first silicon; inability to demonstrate chip performance to prospective buyers
Phase 1: Engineering Briefings and NDA Evaluations	Q2 2026 – Q4 2026 (projected)	0 disclosed	Pre-revenue	First silicon delivery; benchmark data under NDA; at least 1 signed evaluation agreement	No evaluation partners named; chip may not be available for demo by year-end 2026
Phase 2: Pilot / Design-Win Customers	H1 2027 – H2 2027 (projected)	1–3	Pre-revenue or first contract	Named design-win customer willing to integrate Sohu into production stack; SDK availability	Architecture incompatibility; software integration friction; competitor alternative
Phase 3: Production Ramp and First Revenue	2028 (projected)	3–10	First revenue (likely NRE + production chip contract)	Named public customer announcement; wafer volume commitment; supply chain locked	Revenue concentration; single customer = >50% of revenue at ramp start
Phase 4: Ecosystem Expansion	2029+	10+	Recurring chip + support revenue	SDK ecosystem; third-party integrations; re-order from Phase 3 customers; public benchmark leadership	Architectural obsolescence if non-Transformer paradigms gain adoption before Phase 4

All timeline projections are inferred from public chip development timelines (for a TSMC 4nm-class design, roughly 18-24 months from design freeze through tape-out to first silicon), analog inference chip company timelines (Groq, Cerebras, AWS Inferentia), and Etched's disclosed funding timeline (Series A June 2024). No Etched-confirmed milestones or timelines have been disclosed. Customer counts are estimates with high uncertainty. This table represents a plausible adoption trajectory, not a committed forecast.

[CU012, CU013, CU024, CU029]

FU002: Adoption / deployment funnel

Discovery-to-production funnel for Etched as of Q1 2026 — all pipeline stages at or below 'engineering briefing' show zero confirmed entries.

TAM estimate of 200 companies (the upper bound of the 50-200 company range for the addressable buyer segment) is based on inferred GPU spend thresholds using public funding data and inference pricing for representative AI companies. Awareness estimate of 100 is based on media reach of the June 2024 Series A announcement. Engineering briefing estimate of 20 is speculative; Etched has disclosed no customer engagement data. All confirmed counts at and below 'active technical evaluation' are zero based on absence of any public customer disclosure.

[CU001, CU018, CU020]
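The funnel counts in the note above can be turned into stage-to-stage conversion rates. The following is a minimal sketch, assuming the four counts quoted in the note (200, 100, 20, 0); the stage labels and the `conversion_rates` helper are illustrative names, not Etched-disclosed terminology, and the briefing count remains speculative.

```python
# Stage counts are the figures quoted in the FU002 note; labels are shortened
# for illustration and are not Etched-disclosed terminology.
FUNNEL = [
    ("TAM (companies above spend threshold)", 200),
    ("Aware of Etched", 100),
    ("Engineering briefing (speculative)", 20),
    ("Active technical evaluation (confirmed)", 0),
]


def conversion_rates(stages):
    """Return (from_stage, to_stage, rate) for each adjacent pair of stages."""
    rates = []
    for (name_a, count_a), (name_b, count_b) in zip(stages, stages[1:]):
        # Guard against an empty upstream stage to avoid division by zero.
        rate = count_b / count_a if count_a else 0.0
        rates.append((name_a, name_b, rate))
    return rates


if __name__ == "__main__":
    for src, dst, rate in conversion_rates(FUNNEL):
        print(f"{src} -> {dst}: {rate:.0%}")
```

The final rate of zero is the point of the figure: whatever the upstream estimates are, nothing has publicly converted into a confirmed evaluation.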

6.3 Named Customer Evidence — Absence and Analog Proof

Etched has no named customers, design wins, or publicly confirmed evaluation partners as of Q1 2026. All customer cells in the Named Customer Proof table for Etched itself are placeholders representing potential market targets, not actual commercial relationships. This is an uncommon position for a company roughly four years post-founding and more than 18 months post-Series-A: most inference chip companies at comparable funding stages have at least named one evaluation partner or announced a developer access program. Analog customer proof from comparable inference chip companies is available and informative. Groq's case studies page demonstrates that AI-native inference platforms, including companies of the scale and type that Etched is targeting, do adopt specialized inference hardware when the performance-per-token economics are compelling. Groq lists case studies from customers that run large-scale Transformer inference for production consumer products, validating that the buyer segment exists and is willing to adopt non-GPU inference hardware. AWS Inferentia case studies, including Stability AI (image generation inference) and Quora (Poe chatbot inference), demonstrate that companies adopting custom inference silicon can achieve 40-70% cost reductions versus GPU-based inference at comparable throughput levels. These analog proofs are valuable for validating the buyer behavior hypothesis (that AI-native companies will adopt inference ASICs when cost-per-token economics are proven), but they do not validate Etched specifically. The critical gap is the complete absence of Etched-specific customer signal: no LOI, no NDA evaluation agreement, no engineering briefing, and no design win. G2 reviews of Groq provide developer-level feedback confirming that inference chip adoption is real and that performance benchmarks drive developer adoption decisions. Etched currently has no equivalent developer signal, no publicly available SDK, and no public benchmark data. [CU001, CU004, CU005, CU021, CU027, CU028]

Named customer proof table
Company / Platform	Customer Category	Evidence Type	Inference ASIC Adoption Evidence	Applicability to Etched
AWS Inferentia Users (Stability AI, Quora, Sprinklr)	AI application companies — image generation, chatbot inference, enterprise NLP	Customer proof (AWS case studies)	Stability AI and Quora deployed AWS Inferentia2 (Inf2) for production inference; both report 40-70% cost reduction vs equivalent GPU instance types at comparable throughput	Direct analog: same buyer archetype (AI company with large inference spend) adopting custom inference ASIC when TCO is proven; validates Etched's target buyer segment
Groq Inference API customers (inference platforms, AI-native apps)	Inference-as-a-service platforms; AI-native developers	Customer proof (Groq case studies page)	Groq case studies show AI-native companies and inference platforms adopting the Groq LPU for latency-sensitive and throughput-sensitive Transformer inference workloads	Direct analog: same buyer segment Etched is targeting; Groq demonstrates that inference platforms pay for specialized inference hardware when token throughput outperforms GPU alternatives
OpenAI (potential — no Etched engagement)	Frontier AI lab — largest known Transformer inference operator globally	Inferred from public scale disclosures	OpenAI operates GPT-4 class models at consumer web scale (hundreds of millions of users); annual inference spend estimated at $1B+ based on compute cost commentary; would benefit from 10x throughput at verified performance	Potential target — highest value; zero disclosed Etched engagement; would require multi-year supply commitment negotiation
Anthropic (potential — no Etched engagement)	Frontier AI lab — Claude models for consumer and enterprise	Inferred from public scale and funding disclosures	Anthropic raised $7.3B+ from Google and Amazon in 2024-2025; Claude API inference is core product; Transformer inference cost is material to unit economics	Potential target — tier-1; zero disclosed Etched engagement; AWS investment may create supply chain preference for Trainium/Inferentia
No Etched-specific customers (actual)	N/A — documentation of absence	Observed absence of evidence	No customer, LOI, named evaluation partner, or engineering briefing recipient has been disclosed by Etched in any public communication through Q1 2026	Diligence gap: every Etched-specific row in this table is hypothetical; only analog company rows (AWS Inferentia, Groq) represent confirmed customer proof for the inference ASIC buyer segment

This table documents the absence of Etched-specific customer proof and provides analog evidence from comparable inference chip companies. All rows labeled 'potential' or 'analog' are not Etched customers. Etched has zero publicly confirmed customer relationships as of Q1 2026. AWS case study figures (40-70% cost reduction) are from published case studies and should be independently verified with AWS or the customer directly. Groq case study details are from groq.com/case-studies/ (accessed 2026-05-18).

[CU001, CU004, CU005, CU006, CU021]

FU003: Customer proof matrix

Comparative customer proof scorecard for Groq LPU, AWS Inferentia, and Etched Sohu across six dimensions of commercial readiness.

Groq and AWS Inferentia data drawn from publicly accessible case studies and G2 reviews (accessed 2026-05-18). Etched Sohu entries reflect absence of public evidence, not absence of internal activity. The Graphcore precedent (strong benchmarks, poor commercial outcome) is omitted here but should be considered as an adverse analog.

[CU004, CU005, CU021, CU027, CU028]

6.4 Retention, Expansion, and Concentration Risk

Etched has no customers, so retention and churn metrics cannot be measured directly. However, the structural economics of inference chip adoption strongly favor high retention once integration is complete. An AI company that re-engineers its serving stack, model compiler, and deployment pipeline to run on Sohu hardware — a process estimated at 3-6 months of engineering work by 2-5 dedicated engineers — faces switching costs equivalent to 12-18 months of re-engineering to move back to GPU infrastructure or to an alternative ASIC. This creates structural lock-in analogous to cloud infrastructure switching costs, though without the data portability friction. Expansion economics in the inference chip segment are favorable if performance is validated. AWS Inferentia case studies show that customers who adopt custom inference silicon typically expand capacity within 12 months of first deployment, driven by lower cost-per-inference enabling higher inference volumes. Together AI and Anyscale, as potential Etched customers, would likely expand Sohu capacity in proportion to their overall LLM inference growth — which is projected to grow significantly as open-source model quality improves and inference costs fall. Concentration risk is the most serious near-term structural concern for Etched. With zero current customers, Etched's first customer would represent 100% of its initial revenue. Even with 3-5 early customers, a scenario where any single customer represents 25-35% of first-year revenue creates extreme concentration risk. If that customer reduces usage — due to a strategic pivot away from Transformer models, a competitor offering better economics, or a loss of their own funding — Etched faces a revenue shock with no diversification buffer. Graphcore's failure was partly attributable to customer concentration: a small number of large customers that delayed or cancelled deployments created a cascading revenue shortfall. 
Etched must prioritize customer diversification from its very first production run. [CU019, CU025, CU033, CU035, CU036, CU037]
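The concentration arithmetic above can be checked with a short calculation. This is a minimal sketch, assuming hypothetical revenue splits; the function names and the example figures are illustrative only and do not come from Etched.

```python
def customer_shares(revenues):
    """Convert per-customer revenue figures into fractional shares of the total."""
    total = sum(revenues)
    if total <= 0:
        raise ValueError("total revenue must be positive")
    return [r / total for r in revenues]


def top_customer_share(revenues):
    """Share of revenue attributable to the single largest customer."""
    return max(customer_shares(revenues))


def equal_split_floor(n_customers):
    """Smallest possible top-customer share with n customers (a perfectly even split)."""
    return 1.0 / n_customers


if __name__ == "__main__":
    # Floor on top-customer share for 1, 3, and 5 customers.
    for n in (1, 3, 5):
        print(f"{n} customers: top share is at least {equal_split_floor(n):.0%}")
    # Hypothetical first-year split across five customers (arbitrary units).
    example = [35, 25, 20, 10, 10]
    print(f"example split: top share {top_customer_share(example):.0%}")
```

With three customers the largest account can never fall below one third of revenue, so the 25-35% band discussed above is effectively unavoidable until the customer base grows past four or five accounts.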

Retention / repeat usage / satisfaction table
Metric	Industry Analog / Benchmark	Etched Status	Structural Outlook	Diligence Ask
Net Revenue Retention (NRR)	AWS Inferentia: NRR not disclosed but customers expand capacity within 12 months; Groq: estimated >100% NRR for inference API customers	Not applicable — zero customers	Structurally favorable: inference chip workloads are sticky once integration is complete; expanding inference volumes = NRR >100% if chip is competitive	Request Etched's NRR model projections; ask how they plan to lock in multi-year supply agreements
Gross Revenue Retention (GRR) / Churn	Inference chip churn is low once production-integrated (switching cost = 12-18 months re-engineering); AWS Inferentia: churn not publicly disclosed	Not applicable — zero customers	Structurally low churn once integrated; risk is if a customer does not complete integration (aborts during pilot phase)	Ask about planned contract structures: minimum volume commitments, take-or-pay clauses, NDA evaluation terms
Customer Satisfaction / NPS	Groq: developer community reports high satisfaction on G2 (reviews emphasize speed improvement vs GPU alternatives); AWS Inferentia: customer case studies report positive ROI	Not applicable — no developer access, no SDK, no benchmark data available to developers	Unknown: Etched has provided no developer access, no public API, and no benchmark data for independent evaluation; developer satisfaction cannot be measured	Request SDK roadmap and developer access timeline; ask when first public benchmark will be published
Contract Length and Renewal Patterns	Inference chip supply agreements are typically 2-3 year contracts with volume commitments; hyperscaler chip programs are typically 5-year+ relationships	Not applicable — no contracts signed	Favorable structural pattern: long contract terms reduce churn risk; but Etched must win first contract before retention metrics are relevant	Request any draft evaluation agreement or term sheet structure
Cohort Retention (Time-Series)	Not available for Groq or Cerebras (private companies); AWS Inferentia cohort data not publicly disclosed	Not applicable — no customer cohorts exist	Cannot be modeled without at least 2 customer cohorts across time periods	All cohort cells null — see Retention/Repeat Cohort figure; provide any internal modeling once first customers are secured

All retention metrics are structural assessments based on inference chip analog companies (Groq, AWS Inferentia, Cerebras). No Etched-specific retention data exists because Etched has zero customers. Satisfaction signals are drawn from G2 reviews of Groq (a proxy for inference chip developer sentiment) and AWS customer case studies. This table should be entirely replaced with actual Etched customer retention data once first customers are onboarded.

[CU025, CU005, CU021]

Expansion and concentration risk table
Risk Factor	Risk Level	Rationale	Mitigation Path
Customer concentration at revenue onset	Critical	Etched enters production with zero customers; first customer = 100% of revenue; even a 3-customer early-adopter base yields extreme concentration if any one is >33% of first-year revenue	Sign ≥3 binding evaluation contracts before first wafer start; negotiate supply commitments from multiple buyers in different segments to diversify
Single-architecture customer lock-out	High	Sohu only accelerates Transformer attention; customers running MoE, SSM (Mamba), or hybrid architectures cannot adopt Sohu without fallback to GPU for non-attention layers	Expand model compatibility list; disclose which model families are fully accelerated vs partially accelerated; develop fallback scheduling for non-attention layers
Revenue delay from long evaluation cycle	High	12-24 month chip evaluation cycle means zero revenue until 2027 at earliest even if tape-out completes on schedule; each month of delay adds to the funding runway risk	Sign early LOIs with milestone-gated payments; offer pre-payment incentives; reduce evaluation cycle with superior benchmarking and pre-built integration scripts
Hyperscaler captive silicon preference	Medium-High	AWS Trainium, Google TPU, Microsoft Maia mean the top-tier hyperscalers may prefer to develop custom inference silicon internally rather than adopt a startup's chip; reduces addressable market by removing 3 largest potential customers	Target inference-as-a-service platforms (Together AI, Anyscale, Perplexity) as primary first-wave customers; position Sohu as the alternative for companies that cannot afford a captive silicon program
SDK / software ecosystem immaturity	Medium	Groq and Cerebras have 1-3 year software ecosystem head starts; customers require HuggingFace-compatible compilers, vLLM-compatible serving, and OpenAI-API-compatible endpoints; Etched has no public SDK as of Q1 2026	Prioritize SDK and developer tooling delivery before or simultaneously with first silicon; open-source key integration layers to accelerate ecosystem adoption; hire experienced ML systems software engineers

Risk ratings are qualitative assessments based on analysis of inference chip startup dynamics, analog company trajectories (Groq, Cerebras, Graphcore), and Etched's current disclosed position. No Etched-confirmed risk mitigations have been disclosed. All mitigation paths are recommendations derived from analog analysis, not confirmed Etched plans.

[CU019, CU014, CU013, CU037]

FU004: Retention / repeat cohort

Retention cohort analysis for Etched inference chip customers — all cells null because Etched has zero customers; analog placeholder rows shown for context.

All cells are null. Etched has zero customers as of Q1 2026; no retention data exists. Analog rows for Groq and AWS Inferentia are also null because neither company publicly discloses cohort-level retention data. If Etched onboards first customers, actual 30-day / 90-day retention metrics should replace these nulls. Structural analysis suggests retention would be very high (>90%) once full hardware integration is complete due to switching cost lock-in.

6.5 Customer Verdict — Diligence Blockers

The customer diligence verdict for Etched is unambiguously the highest-severity blocker in this report. Etched is a pre-revenue, pre-silicon company with zero publicly named customer relationships, zero published benchmarks for customer evaluation, zero SDK for developer experimentation, and zero design wins. The absence of any customer signal is not explained by stealth strategy — the company publicly announced a $120 million Series A in June 2024 — but rather reflects the genuine pre-commercialization stage of the company. The analog evidence from Groq (case studies showing AI-native inference platform adoption) and AWS Inferentia (hyperscaler customer proof of ASIC adoption) validates the buyer behavior hypothesis at the market level. These analogs confirm that the inference chip buyer segment exists, has the budget and procurement capacity to adopt non-GPU inference hardware, and will do so when cost-per-token economics are demonstrated. However, this market-level proof does not reduce Etched's company-specific customer risk, which remains critical. The key diligence asks for the customer chapter are: (1) Has Etched entered any NDA-governed evaluation agreements, even informally? (2) Which companies has Etched briefed at the engineering level on Sohu architecture? (3) What is the company's estimate of first customer close date and first revenue date? (4) Has any company expressed written interest in a production supply agreement contingent on first-silicon performance? Until these questions are answered with verifiable evidence, the customer chapter represents a blocking diligence risk that no investment committee should overlook. The Graphcore precedent — a $700M+ chip startup that failed to convert strong engineering proof into customer adoption at scale — is a direct warning about the difficulty of the customer development problem Etched faces. [CU001, CU014, CU015, CU037]

Chapter 07

07 Risks

7.1 Technology and Architecture Risks

The primary technology risk for Etched is the irreversibility of its architectural bet. The Sohu ASIC hardcodes Transformer attention mechanisms directly in silicon: the wiring for multi-head attention, key-value caching, and softmax computation is physically instantiated in hardware. Once tape-out is committed (a decision with an estimated $50–200 million price tag at TSMC's N3/N4 process node), there is no software-layer path to recover if the Transformer paradigm is materially displaced before Sohu reaches commercial revenue. The architectural displacement risk is not hypothetical. Mamba (structured state-space models) and RWKV have demonstrated competitive performance with Transformers on language-modeling benchmarks while eliminating the KV-cache, the exact data structure Sohu's silicon is optimized to accelerate. Mixture-of-Experts (MoE) models such as Mixtral 8x7B, which retain attention but replace dense feed-forward layers with sparsely routed experts, have also shown that inference at scale can be achieved with a different computational graph than dense-decoder Transformers. If SSM architectures, or MoE variants that Sohu cannot fully accelerate, achieve production adoption at hyperscalers within 4–6 years, Sohu's silicon is architecturally stranded with no recovery path short of a full redesign. Beyond architecture risk, the supply chain for High Bandwidth Memory (HBM) is concentrated among three manufacturers (SK Hynix, Samsung, and Micron), with AI chip startups holding essentially no leverage in the allocation queue relative to NVIDIA and AMD. A supply constraint on HBM3E would delay Sohu production regardless of tape-out success. TSMC itself represents a single-point dependency: there is no alternative N3/N4 foundry with sufficient capacity if TSMC faces disruption from Taiwan Strait escalation, earthquake, or other force-majeure events. Finally, PPA (power, performance, area) targets for a first-ever ASIC design are frequently missed on first silicon; a respin adds 12–18 months and $20–50 million in additional cost. [CR001, CR002, CR003, CR004, CR005, CR006]

Technology and architecture risk register
Risk	Probability	Impact	Time Horizon	Mitigation Status	Residual Severity
Transformer architecture obsolescence — Mamba/RWKV/MoE displacement	Medium	Critical	3–7 years	None — architecture is hardcoded in silicon	Critical
ASIC non-programmability — no software patch path post tape-out	N/A	High	Immediate after tape-out	Design-time mitigation via microcode layer (unproven)	High
TSMC PPA target miss requiring respin	Medium	High	18–24 months	Conservative design margins; third-party DFT review	Medium
HBM supply constraint — dependency on SK Hynix / Samsung / Micron	Medium	High	6–18 months	Multi-supplier design; no confirmed allocation priority	High
TSMC geopolitical disruption — Taiwan Strait escalation	Low	Critical	1–5 years	No viable near-term mitigation pre-revenue	High
Long ASIC design cycle — 18–24 months from tape-out to volume production	High	High	Ongoing	Parallel RTL tracks; milestone-gated burn	High

Probability reflects qualitative author assessment from public sources. Impact ratings are relative, not actuarial. Time horizon indicates when risk would manifest if triggered.

[CR001, CR002, CR003, CR004, CR005, CR007]

FR002: Risk timeline

Key risk milestones and triggers along Etched's development timeline from Series A (June 2024) through projected first revenue (H2 2027 at the earliest).

[CR005, CR008, CR012, CR019, CR033, CR034]

FR003: Technology transition risk diagram

Directed acyclic graph showing how architectural displacement risk (Mamba/MoE adoption) flows through Sohu's value chain to revenue risk and funding outcomes.

[CR001, CR003, CR004, CR005, CR007]

7.2 Regulatory, Geopolitical, and Legal Risks

Etched operates at the intersection of three distinct legal and regulatory risk vectors. First, US export controls administered by the Bureau of Industry and Security (BIS) under the Export Administration Regulations (EAR) require export licenses for advanced semiconductor items. The BIS Entity List restricts exports to hundreds of parties of concern; any sale of Sohu chips to international customers requires verification against the Entity List and potentially an export license application. The October 2023 Federal Register rule tightening export controls on semiconductor manufacturing items also imposes restrictions on the advanced logic chip supply chain, affecting how TSMC-manufactured chips flow to customers in restricted jurisdictions. Second, the CHIPS and Science Act (2022) provides up to $52 billion in semiconductor manufacturing incentives, but any company receiving CHIPS Act funding accepts restrictions including a 10-year prohibition on material expansion of advanced chip manufacturing in countries of concern. While Etched itself may not seek direct CHIPS Act funding, its manufacturing partner TSMC does — and supply agreements with CHIPS-funded fabs carry compliance obligations that could constrain Etched's ability to serve certain international customers. Third, IP and patent exposure creates legal risk. NVIDIA has demonstrated willingness to pursue patent litigation against semiconductor competitors; Arm Holdings licenses its ISA and microarchitecture broadly, and any chip incorporating Arm processor cores requires a current license agreement. If Sohu incorporates any Arm-based control cores (common in complex ASICs), Etched carries ongoing Arm licensing obligations. Trade secret risk is also elevated: engineers who join Etched from NVIDIA, Meta, or Google may face claims of IP misappropriation from their former employers. 
The EU AI Act (2024) introduces a fourth regulatory dimension: its provisions on general-purpose AI (GPAI) model compliance affect Etched's target customers and could create indirect chip compliance requirements for the inference infrastructure layer. [CR009, CR010, CR011, CR012, CR013, CR014]

Regulatory / legal risk register
Risk	Regulation / Authority	Jurisdiction	Likelihood	Severity	Mitigation	Residual Exposure
US export controls restricting Sohu chip sales to non-allied markets	EAR / BIS Entity List	United States	Medium	High	Export license applications; Entity List screening program	Medium
CHIPS Act restrictions on TSMC supply agreements limiting customer access	CHIPS and Science Act 2022	United States	Low	Medium	Comply with CHIPS Act guardrails; no federal funding sought directly	Low
EU AI Act GPAI compliance requirements indirectly affecting chip infrastructure	EU AI Act 2024	European Union	Medium	Medium	Monitor EU AI Act implementing acts; customer-level compliance	Low
BIS Entity List expansion restricting specific TSMC/Etched supply relationships	EAR Part 744 / BIS Entity List	United States	Low	High	Continuous Entity List monitoring; legal counsel review	Medium
NVIDIA or Arm Holdings patent infringement claim against Sohu design	US patent law; Arm Holdings license agreement	United States	Medium	High	Freedom-to-operate analysis; Arm license agreement if Arm cores are used (neither confirmed by Etched)	Medium
Trade secret claim by former employers (NVIDIA, Meta, Google) of Etched engineers	US trade secret law (DTSA)	United States	Low	Medium	IP assignment agreements; onboarding legal review	Low

Likelihood ratings reflect author inference from public sources; no direct legal advice. Residual exposure assumes standard compliance programs are in place.

[CR009, CR010, CR011, CR012, CR013, CR014]

FR001: Risk severity matrix

Comparative severity matrix scoring probability, impact, velocity, and residual exposure across Etched's five primary risk clusters as of Q1 2026.

Severity ratings are author-coded from public sources; they are relative, not actuarial. Technology residual exposure is rated Critical because the architecture is irreversible once tape-out is committed.

[CR001, CR009, CR020, CR027, CR033]

7.3 Competitive Displacement and Obsolescence Risks

The competitive risk facing Etched is more severe than for most chip startups because the primary competitor, NVIDIA, has both a dominant market position and a multi-generational roadmap that continuously raises the performance threshold Sohu must clear. NVIDIA's Blackwell architecture (the successor to the Hopper generation that includes H100, launched 2024–2025) delivered a 2–4× inference throughput improvement over Hopper-class silicon. The Rubin architecture, anticipated in 2026–2027, is expected to extend this lead further. For Etched to win customers, Sohu must deliver not a single-point performance advantage but a sustained advantage across the entire NVIDIA roadmap. That requirement grows harder with each NVIDIA generation, because every generation that ships before Sohu reaches customers shrinks the margin Sohu claimed against H100; a tape-out slip of even one generation materially erodes it. Beyond NVIDIA, AMD MI300X/MI325X chips have captured meaningful AI inference market share, particularly among inference-as-a-service platforms running open-source models. AMD's competitive position at lower price points than NVIDIA creates a two-sided price/performance squeeze for Etched: NVIDIA sets the performance ceiling, AMD sets the cost floor. Hyperscaler captive silicon programs (Google TPU v6 (Trillium), AWS Trainium 2, and Microsoft Maia 100) disintermediate the market for the highest-value potential Etched customers; if hyperscalers deploy entirely captive silicon for their own inference workloads, Etched's addressable market contracts to inference-as-a-service platforms and enterprise ML teams that cannot build their own silicon. Direct AI inference ASIC competitors Groq (LPU) and Cerebras (CS-3) are already in production with real customers and published performance benchmarks. Etched enters a market where at least two direct hardware analogs have a 2–3 year head start on customer relationships and production experience. 
Graphcore's failure — strong technical architecture, no sustained commercial traction — is the most instructive precedent: specialized AI chip companies that cannot convert architectural advantages into customer commitments before their capital runs out tend to fail regardless of technical merit. [CR020, CR021, CR022, CR023, CR024, CR025]
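The roadmap pressure described above can be made concrete with a back-of-envelope calculation. This is a rough sketch, assuming the claimed 10× throughput multiple versus H100 stays fixed, that each NVIDIA generation delivers the 2–4× gain cited in this section, and that relative pricing does not change; `residual_advantage` is an illustrative helper name, not a published metric.

```python
def residual_advantage(claimed_multiple, per_generation_gain, generations):
    """Advantage remaining after the GPU baseline improves for N generations.

    claimed_multiple: throughput multiple claimed against the original baseline.
    per_generation_gain: assumed baseline improvement per GPU generation.
    generations: number of GPU generations shipped since the original baseline.
    """
    return claimed_multiple / (per_generation_gain ** generations)


if __name__ == "__main__":
    claimed = 10.0  # claimed multiple vs H100, taken from the report's cover facts
    for generations in (1, 2):
        low = residual_advantage(claimed, 4.0, generations)   # pessimistic for Etched
        high = residual_advantage(claimed, 2.0, generations)  # optimistic for Etched
        print(f"after {generations} generation(s): {low:.2f}x to {high:.2f}x")
```

Under these assumptions, a value below 1.0 means the GPU baseline has overtaken the claimed figure; the sketch ignores any improvement in later Sohu generations and any price differences.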

Competitive displacement and obsolescence risk register
Competitor / Threat	Threat Vector	Time Horizon	Likelihood	Severity	Etched Mitigation
NVIDIA Blackwell / Rubin (B200, R100 roadmap)	Continuous 2–4× per-generation GPU inference improvement raises the performance bar Sohu must clear	2025–2027	Very High	High	Sohu must maintain a >10× throughput-per-dollar advantage on Transformer decode to justify adoption
AMD MI300X / MI325X inference silicon	Competitive pricing for open-source model inference erodes cost differentiation	2025–2026	High	Medium	Target latency-sensitive use cases where AMD is not competitive
Google TPU v6 / AWS Trainium 2 / Microsoft Maia 100 (hyperscaler captive silicon)	Hyperscalers build their own inference ASICs, disintermediating the startup inference chip market	2025–2027	High	High	Target non-hyperscaler inference platforms that cannot build captive silicon
Groq LPU / Cerebras CS-3 (direct inference ASIC competitors)	Established inference ASICs with production customers and published benchmarks have a 2–3 year lead	2025–2026	Medium	Medium	Demonstrate Sohu's superior tokens-per-second-per-dollar vs LPU and CS-3 on standard benchmarks
Mamba / RWKV / other SSM-family architecture shift making Transformer-only silicon obsolete	Paradigm shift in model architecture makes Sohu's core value proposition irrelevant	2026–2030	Medium	Critical	No viable mitigation; would require full ASIC redesign and a new product generation
Tenstorrent RISC-V AI chip (semi-flexible architecture)	Semi-programmable alternative offers flexibility advantage over Etched's fixed architecture	2026–2028	Low	Medium	Sohu's performance advantage on Transformer workloads remains the key differentiator

Threat likelihood reflects public competitive intelligence as of Q1 2026. Severity ratings assume Etched has not yet reached production revenue when the threat materializes.

[CR020, CR021, CR022, CR023, CR024, CR025]

7.4 Execution, Team, and Operational Risks

Etched's execution risk profile is unusually high even by chip startup standards. The company is attempting to build the world's first production Transformer-inference ASIC with a team of approximately 30 people, no prior tape-out track record as an organization, and a CEO (Gavin Uberti) who is 23 years old with no prior chip-to-production experience. The team includes engineers who previously worked at NVIDIA, Meta, and Google, which provides relevant domain expertise, but this team has not publicly demonstrated the organizational capability to execute a multi-year full-stack ASIC program (RTL design, DFT, physical design, TSMC PDK integration, and first-silicon bring-up) at this scale. The ASIC development cycle also creates a structural timeline risk. Tape-out submission to first-silicon return takes approximately 6–9 months; first silicon to volume production takes an additional 12–18 months. If Etched tapes out in 2025–2026 (the most plausible window given the 2024 Series A), first revenue cannot realistically occur before H2 2027, and only if first silicon meets performance targets, SDK development is complete, and a customer evaluation completes within 6–12 months of silicon delivery. The software/SDK execution risk is particularly underappreciated. The Graphcore failure was driven substantially by SDK immaturity that prevented customers from porting their models efficiently to Graphcore's architecture. Etched has disclosed no SDK, no compiler, no software stack, and no developer program as of Q1 2026. Building a performant MLIR/XLA compiler backend or custom SDK for Sohu is a multi-year engineering effort that requires a separate software team with skills different from those of the hardware team. A hardware-only approach that assumes customers will build their own software support is not a viable commercial path for a chip startup without established ecosystem relationships. [CR027, CR028, CR029, CR030, CR031, CR032]
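
The timeline arithmetic in the paragraph above can be sketched as follows. The 6–9 month and 12–18 month windows are the figures quoted in this section; the tape-out dates are hypothetical inputs chosen to span the 2025–2026 window, since Etched has not disclosed a tape-out schedule. The helper names are illustrative only.

```python
def add_months(year: int, month: int, delta: int) -> tuple[int, int]:
    """Return (year, month) shifted forward by delta months; month is 1-12."""
    index = year * 12 + (month - 1) + delta
    return index // 12, index % 12 + 1

FIRST_SILICON = (6, 9)    # months from tape-out to first-silicon return
VOLUME_RAMP = (12, 18)    # months from first silicon to volume production

def production_window(tapeout_year: int, tapeout_month: int):
    """Earliest and latest volume-production dates for a given tape-out date."""
    earliest = add_months(tapeout_year, tapeout_month, FIRST_SILICON[0] + VOLUME_RAMP[0])
    latest = add_months(tapeout_year, tapeout_month, FIRST_SILICON[1] + VOLUME_RAMP[1])
    return earliest, latest

# Hypothetical tape-out dates (year, month); none is confirmed by Etched
for tapeout in [(2025, 6), (2026, 1), (2026, 6)]:
    earliest, latest = production_window(*tapeout)
    print(tapeout, "->", earliest, "to", latest)
```

The sketch covers the hardware path only. It does not add the 6–12 month customer evaluation period or any respin, either of which would push first revenue later.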

Execution and team risk register
Risk	Category	Probability	Impact	Mitigation Status	Residual Severity
First ASIC ever designed by this team — no organizational tape-out track record	Technical / Organizational	Medium	High	Experienced external chip design consultants; TSMC PDK support	High
CEO (Gavin Uberti, age 23) lacks prior chip-to-production experience	Leadership	Medium	High	Board oversight; experienced investors; technical co-founder involvement	Medium
Small team (~30 people) for full-stack ASIC development	Capacity / Talent	High	High	Active hiring pipeline required before tape-out	High
SDK / software stack non-existent; no compiler or developer program announced	Product / Software	High	High	SDK development must run parallel to hardware; Graphcore analog risk	Critical
18–27 month tape-out-to-production cycle creates revenue gap with burn accumulating	Timeline / Financial	Medium	High	Runway management; milestone-gated spend; Series B planned	High
Trade secret or IP misappropriation claim from ex-employer of key engineers	Legal / HR	Low	Medium	IP assignment agreements; onboarding legal review; external counsel	Low

Probability ratings are qualitative inferences from public information; no internal Etched data available. Residual severity assumes standard professional mitigations are in place.

[CR027, CR028, CR029, CR030, CR031, CR032]

FR004: Financial stress scenario

Low/base/high scenario ranges for key financial parameters governing Etched's runway, tape-out cost, and funding requirements.

[CR033, CR034, CR035, CR036, CR037, CR038]

7.5 Financial and Investment Risks

Etched's financial risk profile is dominated by three compounding factors: (1) extremely high capital intensity with uncertain timing, (2) zero current revenue with no line of sight to first revenue before H2 2027 at best, and (3) a funding market that since 2023 has shown reduced appetite for deep-tech hardware investments outside AI hyperscaler-backed companies. ASIC development at TSMC's N3/N4 process nodes carries an estimated total program cost of $50–200 million for a single tape-out, depending on mask count, design complexity, and the number of validation iterations required. This cost is committed before a single chip is delivered to a customer. Etched's $120 million Series A provides a runway of approximately 18–36 months at typical pre-tape-out burn rates ($3–6 million per month), but this runway may be insufficient to bridge from tape-out through first silicon, customer evaluation, and design win, particularly if the tape-out requires one or more respins. A Series B raise will be required before any product revenue is realized, which makes Etched's financial survival entirely dependent on VC market conditions at a time when interest rates, AI investment sentiment, and startup funding multiples are all uncertain. If AI spending growth slows or pauses in 2026–2027, the inference chip market that Etched is targeting may contract, reducing both customer willingness to adopt new hardware and investor appetite to fund pre-revenue chip startups. Revenue concentration risk is also severe: even if Etched reaches production, the first 3–5 customers are likely to represent 60–80% of initial revenue, creating extreme exposure to any single customer reducing volume or exiting. [CR033, CR034, CR035, CR036, CR037, CR038]
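
A minimal runway sketch, using only the figures quoted above ($120M raised, $3–6M monthly burn, $50–200M program cost). It assumes, as a simplification, that tape-out NRE is paid as a single lump sum out of the Series A proceeds; the actual payment schedule is not public, and `runway_months` is a hypothetical helper.

```python
SERIES_A_USD_M = 120.0
BURN_USD_M_PER_MONTH = (3.0, 6.0)   # pre-tape-out burn range quoted above

def runway_months(cash_usd_m: float, burn_usd_m: float,
                  upfront_nre_usd_m: float = 0.0) -> float:
    """Months of runway after paying any upfront NRE out of available cash."""
    remaining = max(cash_usd_m - upfront_nre_usd_m, 0.0)
    return remaining / burn_usd_m

for burn in BURN_USD_M_PER_MONTH:
    before_nre = runway_months(SERIES_A_USD_M, burn)
    # $50M is the low end of the program-cost range; treated here as a lump sum
    after_nre = runway_months(SERIES_A_USD_M, burn, upfront_nre_usd_m=50.0)
    print(f"burn ${burn:.0f}M/mo: {before_nre:.0f} months before NRE, "
          f"{after_nre:.1f} months after $50M NRE")
```

Straight division gives 20–40 months before any NRE, slightly above the approximate 18–36 month figure quoted in the text. Deducting even the low-end $50M program cost cuts the range to roughly 12–23 months, which is why the Series B timing dominates the financial risk.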

Financial and runway risk register
Scenario	Trigger	Probability	Financial Impact	Residual Exposure
ASIC tape-out cost overrun at TSMC N3/N4	Final tape-out NRE exceeds $100M vs $50M base case; multiple mask sets required	Medium	$50–150M additional capital needed before first silicon	High
Series B raise fails or is delayed beyond runway	AI funding market contraction; absence of design wins; no silicon sample for investors	Medium	Operations may need to cease or scale down before first revenue	Critical
First-silicon respin required after tape-out	First silicon misses PPA spec; timing closure failure; yield below threshold	Medium	12–18 month delay; additional $20–50M NRE cost; runway exhaustion risk	High
AI inference market growth decelerates	Enterprise GenAI spending pullback; GPU cost declines reduce Sohu cost advantage	Low	TAM contraction; margin compression for inference ASIC startups	Medium
Customer revenue concentration in first year — one of 3–5 customers reduces volumes	First customer exits or pivots away from Transformer model inference	Medium	25–50% first-year revenue shortfall; operational continuity risk	High

Financial impact estimates are based on industry analogs (Graphcore, Groq) and public TSMC node pricing estimates. No Etched internal financial data is available.

[CR033, CR034, CR035, CR036, CR037, CR038]

7.6 Exhibits

Chapter 08

08 Valuation

8.1 Investment Thesis and Anti-Thesis

Etched's investment thesis rests on a single, non-diversifiable architectural bet: that Transformer decoder architectures will remain the dominant paradigm for large-language-model inference for the next five to eight years, and that purpose-built silicon targeting only that workload will deliver a 10× or greater cost-performance advantage over general-purpose GPUs at inference time. If both conditions hold, Etched could capture a disproportionate share of the inference ASIC market as hyperscalers optimise for token cost rather than training-time flexibility. The anti-thesis is equally concentrated. Etched has zero revenue, zero customers, zero design wins, and no first-silicon delivery as of Q2 2026. Its CEO is 23 years old with no prior tape-out track record. The company's addressable market exists only if its architectural assumptions hold, its TSMC tape-out succeeds without a costly respin, and at least one hyperscaler customer evaluates Sohu before competitors close the gap. The Graphcore failure pattern (technically superior architecture, no commercial traction, eventual distressed exit) is the most applicable cautionary analog in the AI chip industry. Any scenario-weighted analysis must account for the compound probability that all of these execution requirements succeed together. [CV001, CV002, CV003, CV004, CV039, CV040]

Recommendation summary table
Entry Condition	Implied Post-Money	Recommended Stance	Probability-Weighted EV	Key Qualifier
Attractive entry	≤$800M	Conditional track — re-evaluate at Series B close	$800M–1.1B	Re-evaluate contingent on tape-out completion and first customer win
Marginal entry	$800M–1.5B	Pass — risk-adjusted return insufficient	$800M–1.1B	Insufficient margin of safety; probability-weighted EV barely above entry
Unattractive entry	>$1.5B	Hard pass — negative expected value at current information	$800M–1.1B	Expected loss at any realistic scenario weighting; three kill triggers active

Probability-weighted EV derived from bull (15–20% × $3–5B) + base (40–50% × $800M–1.5B) + bear (30–40% × $200–500M). Post-money entry conditions assume a future financing round; actual Series A post-money is undisclosed. Stance does not constitute investment advice.

[CV040, CV041]
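
The probability-weighted EV formula in the table note can be evaluated directly. The sketch below uses the scenario ranges exactly as stated; the pairing choices are assumptions, because the note does not say which points in each range were combined. The endpoint probabilities sum to 85% (low) and 110% (high) rather than 100%, so a midpoint variant with rescaled probabilities is included.

```python
SCENARIOS = {
    # name: ((prob_low, prob_high), (ev_low_usd_m, ev_high_usd_m))
    "bull": ((0.15, 0.20), (3000.0, 5000.0)),
    "base": ((0.40, 0.50), (800.0, 1500.0)),
    "bear": ((0.30, 0.40), (200.0, 500.0)),
}

def weighted_ev(pick) -> float:
    """Sum of probability x EV, with pick() choosing a point in each (low, high) range."""
    return sum(pick(p) * pick(ev) for p, ev in SCENARIOS.values())

def normalized_midpoint_ev() -> float:
    """Midpoint EV with the midpoint probabilities rescaled to sum to 1."""
    mid = lambda r: (r[0] + r[1]) / 2
    total_p = sum(mid(p) for p, _ in SCENARIOS.values())
    return weighted_ev(mid) / total_p

low = weighted_ev(lambda r: r[0])
high = weighted_ev(lambda r: r[1])
print(f"low-end pairing:     ${low:,.0f}M")
print(f"high-end pairing:    ${high:,.0f}M")
print(f"normalized midpoint: ${normalized_midpoint_ev():,.0f}M")
```

Pairing the low ends gives about $830M, close to the bottom of the $800M–1.1B range in the table. The midpoint and high-end pairings come out higher (roughly $1.4B and $1.95B), so the tabulated range appears to reflect a conservative weighting. The choice of pairing is worth confirming before the figure is relied on.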

Thesis / anti-thesis table
Dimension	Thesis (Bull)	Anti-Thesis (Bear)	Evidence Weight
Architecture lock-in	Transformer architecture has proven durable; first-gen hardcoded silicon captures switching-cost moat at inference time	Mamba / RWKV / SSM alternatives need no KV cache; any paradigm shift strands Sohu permanently	Mixed: Transformers dominant now but SSM evidence growing
Team execution	Engineers with NVIDIA, Meta, and Google backgrounds; deep AI/hardware expertise; youth indicates agility and long commitment horizon	CEO is 23 with no tape-out history; ~30-person team for full-stack ASIC is historically undersized	Weak: no tape-out track record to validate execution
Market timing	$120M raise coincides with peak inference cost pressure; hyperscalers have active incentive to adopt cheaper inference silicon	NVIDIA Blackwell/Rubin roadmap narrows the performance gap every 2 years; window to achieve advantage may be short	Mixed: timing plausible but competitive clock is fast
Capital efficiency	Focused architecture reduces firmware and software complexity; lower opex than broader-platform competitors	ASIC tape-out at TSMC N4 costs $50–200M in NRE alone; Series B required before any product revenue	Negative: capital intensity risk is high and unmitigated at Series A stage

Evidence weight is author's qualitative judgement from publicly available information. 'Mixed' indicates countervailing evidence exists on both sides; 'Weak' or 'Negative' indicates the anti-thesis evidence is materially stronger than the thesis evidence at this stage of the company's development.

[CV001, CV002, CV004]

FV001: Recommendation logic

Decision chain from Series A context through comparables, scenario weighting, and kill-trigger screen to the investment verdict.

[CV039, CV040, CV041]

8.2 Comparable Company and Precedent Transaction Analysis

No directly comparable public company exists for a pre-revenue, Transformer-only inference ASIC startup. The closest public comparables are Marvell Technology, whose custom AI ASIC business for hyperscalers generated approximately $1.6 billion in fiscal year 2025 revenue and is valued at 10–15× EV/Revenue, and Broadcom, whose custom silicon and networking revenues for AI have sustained an 18–20× EV/Revenue premium within its overall market capitalisation. NVIDIA remains the aspirational benchmark at approximately 25× EV/Revenue on AI infrastructure revenues, though its diversified moat and CUDA software stack make it structurally non-comparable to Etched's single-product, pre-revenue profile. Among private-stage peers, Cerebras filed for IPO in September 2024 at an implied $7–8 billion valuation despite limited commercial customers, demonstrating that AI chip startups can sustain elevated private valuations. Groq raised $640 million in 2024 at approximately $2.5 billion implied, but Groq has production LPU deployments and paying customers, a substantially de-risked profile versus Etched. Intel's acquisition of Habana Labs for approximately $2 billion in December 2019 remains the primary positive data point for pre-revenue AI chip startup acquisitions, though the AI chip landscape is materially more competitive in 2026 than it was in 2019. [CV005, CV006, CV007, CV008, CV026, CV027]

Comparable valuation table
Company	Stage	Implied Val. / Market Cap	EV/Revenue Multiple	Primary Relevance	Limitation vs Etched
NVIDIA	Public (NASDAQ: NVDA)	~$3T (2024)	~25× LTM	Aspirational benchmark for AI chip dominance	Diversified GPU+CUDA moat; not inference-only; far larger scale
Marvell Technology	Public (NASDAQ: MRVL)	~$80–100B (2024)	~10–15× on AI ASIC revenue	Closest production-stage AI custom ASIC comparable	Marvell has paying hyperscaler customers; Etched has zero revenue
Broadcom	Public (NASDAQ: AVGO)	~$700B (2024)	~18–20× AI chip implied	Custom silicon for hyperscalers at scale	Broadcom earns revenue across networking+ASIC; not startup comparable
Qualcomm	Public (NASDAQ: QCOM)	~$150B (2024)	~7–9× semiconductor revenue	Fabless chip company multiple floor reference	Mobile-centric; no direct inference ASIC business
Cerebras / Groq (private)	Series C–D, pre-IPO	$2.5B (Groq, 2024); $7–8B (Cerebras, implied IPO)	N/A (no public revenue)	Private-stage AI chip peers with disclosed valuations	Both have production deployments; Etched has no first silicon
Habana Labs (acquired by Intel, 2019)	Pre-revenue at acquisition	~$2B acquisition	N/A (no revenue at exit)	Primary precedent transaction for AI chip startup M&A	2019 vintage; AI chip competition far less intense then

Public company valuations are approximate market-cap as of late 2024; EV/Revenue multiples are author estimates from public filings and analyst consensus. Private-stage valuations are from publicly reported funding rounds or IPO filings. Comparison to Etched requires a 40–60% discount to comparable multiples to reflect pre-revenue stage, single-architecture concentration, and execution risk.

[CV001, CV003, CV005, CV009]
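
The 40–60% discount described in the table note can be applied to the comparable multiples as follows. This is a sketch under one stated assumption: the deepest discount is paired with the low end of each multiple range and the shallowest with the high end, which gives the widest plausible band. The report does not specify how the discount is paired.

```python
COMPARABLE_MULTIPLES = {      # EV/Revenue ranges from the comparable valuation table
    "NVIDIA": (25.0, 25.0),
    "Marvell": (10.0, 15.0),
    "Broadcom": (18.0, 20.0),
    "Qualcomm": (7.0, 9.0),
}
DISCOUNT = (0.40, 0.60)       # pre-revenue discount range from the table note

def discounted_range(multiple, discount=DISCOUNT):
    """Widest band: deepest discount on the low multiple, shallowest on the high one."""
    lo, hi = multiple
    d_lo, d_hi = discount
    return lo * (1 - d_hi), hi * (1 - d_lo)

for name, multiple in COMPARABLE_MULTIPLES.items():
    lo, hi = discounted_range(multiple)
    print(f"{name}: {lo:.1f}x to {hi:.1f}x after discount")
```

For Marvell, the closest production-stage comparable, the discounted band is about 4–9×, which brackets the 4–6× multiple used for the base case in section 8.3.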

FV002: Valuation / return range

Enterprise value ranges across bear, base, and bull scenarios vs estimated Series A entry point, all in USD millions.

[CV015, CV016, CV017]

8.3 Scenario Analysis — Bull, Base, and Bear Cases

The bull case assigns a 15–20% probability to Etched achieving a first-silicon pass without a respin at TSMC N4, confirming at least one hyperscaler design win by H2 2027, and reaching $200–300 million in contracted or recognised revenue by 2028. At a 10–15× EV/Revenue multiple, consistent with early-stage Marvell AI ASIC comparables, this implies an enterprise value of $3–5 billion, a 4–7× return on the estimated $600–800 million entry. The base case (40–50% probability) assumes first-silicon delivery with at least one major performance shortfall or integration challenge, a single initial customer design win, and 2028 revenue of $100–150 million. Risk-adjusted at 4–6× EV/Revenue, this implies an enterprise value of $800 million to $1.5 billion, below the 10× return threshold for lead Series A investors. The bear case (30–40% probability) encompasses tape-out failure, a silicon respin, architecture obsolescence via Mamba or other SSM adoption, or inability to close a Series B by late 2026. It results in a distressed exit at $200–500 million, a loss on the Series A capital base. Cerebras's experience demonstrates that sustaining a private valuation without IPO momentum is possible but fragile; Graphcore's trajectory, from a $2.8 billion peak to a distressed acquisition, is the downside reference point. [CV013, CV014, CV015, CV016, CV017, CV018]

Bull / base / bear scenario table
Scenario	Probability	Key Assumptions	2028 Revenue Est.	Exit EV/Revenue	Implied Exit EV
Bull	15–20%	First-silicon pass; ≥1 hyperscaler design win by H2 2027; Transformer architecture dominant	$200–300M	10–15×	$3–5B
Base	40–50%	First-silicon delivered; performance shortfall or integration delay; 1 initial customer; Series B closed	$100–150M	4–6×	$800M–1.5B
Bear	30–40%	Tape-out failure or respin; architecture displacement; Series B unavailable; distressed exit	<$50M or zero	<5× or distressed	$200–500M

Probabilities are qualitative author estimates calibrated to comparable AI chip startup base rates (Graphcore, Cerebras). Revenue estimates are illustrative scenario analysis, not company projections. Exit multiples assume 2024-level AI chip sector sentiment persists.

[CV015, CV016, CV017]
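
The scenario table combines a 2028 revenue estimate with an exit EV/Revenue multiple. The sketch below shows the raw revenue × multiple product and the return multiple on entry. The $700M entry is the midpoint of the analyst-estimated $600–800M post-money range, which is itself undisclosed, and the helper names are illustrative.

```python
def ev_range(revenue_usd_m, multiple):
    """Implied EV range (USD M) from a revenue range and an EV/Revenue range."""
    return revenue_usd_m[0] * multiple[0], revenue_usd_m[1] * multiple[1]

def return_multiple(exit_ev_usd_m: float, entry_usd_m: float) -> float:
    """Gross multiple on entry valuation, ignoring later-round dilution."""
    return exit_ev_usd_m / entry_usd_m

ENTRY_USD_M = 700.0   # assumed midpoint of the estimated entry range

bull = ev_range((200.0, 300.0), (10.0, 15.0))
base = ev_range((100.0, 150.0), (4.0, 6.0))
print("bull raw product (USD M):", bull)
print("base raw product (USD M):", base)

# Return multiples using the tabulated bull-case exit EV of $3-5B
for exit_ev in (3000.0, 5000.0):
    print(f"${exit_ev:,.0f}M exit -> {return_multiple(exit_ev, ENTRY_USD_M):.1f}x")
```

The raw products ($2.0–4.5B for bull, $0.4–0.9B for base) come out below the Implied Exit EV figures in the table, which are author estimates rather than mechanical products. Either way, the bull-case return of roughly 4–7× on a $700M entry stays short of a 10× target.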

FV003: Valuation sensitivity

Sensitivity of implied exit EV (in $B) to key value-driver milestones, from base case to upside scenarios.

[CV028, CV030, CV039]

8.4 Capital Structure, Return Requirements, and Exit Path

Etched raised $120 million in a Series A in June 2024, led by Positive Sum with Primary Venture Partners as co-investor. The post-money valuation was not publicly disclosed. Based on typical Series A dilution norms for hardware companies at this scale, analyst estimates place the post-money in the $600–800 million range, implying approximately 15–20% primary dilution. At this entry price, lead investors need a minimum 10× return to meet standard venture fund return targets, which requires an exit enterprise value of $6–8 billion. No scenario in this analysis reaches that threshold: even the bull case, at $3–5 billion, implies only a 4–7× return, and only if exit multiples hold at 2024 levels. The most probable exit path is a strategic acquisition, either by a hyperscaler or large platform company (AWS, Google, Microsoft, or Apple) or by a semiconductor company with AI ASIC exposure (Broadcom, Marvell, or Qualcomm). An IPO is unlikely before H2 2028, and Cerebras's delayed IPO illustrates the difficulty of listing an AI chip company even after production deployments. Historical venture base rates for pre-revenue hardware companies are sobering: fewer than 10% achieve returns of 10× or greater, and the majority experience write-downs or distressed exits within five years of Series A, which argues for a high discount rate on projected scenarios. The probability-weighted expected value of approximately $800–1,100 million only marginally exceeds the estimated $700 million midpoint of the entry range, providing insufficient risk compensation for a new position above the base-case entry price. [CV009, CV010, CV011, CV012, CV022, CV023]
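
The dilution and required-exit figures above follow from two divisions and a multiplication. The sketch below reproduces them; the post-money range is the analyst estimate quoted in this section, not a disclosed figure, and the calculation ignores dilution from later rounds and any liquidation preferences.

```python
RAISE_USD_M = 120.0
POST_MONEY_USD_M = (600.0, 800.0)   # analyst-estimated range; actual figure undisclosed
TARGET_RETURN = 10.0                # minimum multiple cited for lead investors

def primary_dilution(raise_usd_m: float, post_money_usd_m: float) -> float:
    """Fraction of the company sold in the round."""
    return raise_usd_m / post_money_usd_m

def required_exit_ev(post_money_usd_m: float,
                     target_multiple: float = TARGET_RETURN) -> float:
    """Exit EV needed to hit the target multiple, ignoring later-round dilution."""
    return post_money_usd_m * target_multiple

for post_money in POST_MONEY_USD_M:
    dilution = primary_dilution(RAISE_USD_M, post_money)
    exit_ev = required_exit_ev(post_money)
    print(f"post-money ${post_money:,.0f}M: dilution {dilution:.0%}, "
          f"required exit EV ${exit_ev:,.0f}M")
```

Because follow-on rounds would dilute Series A holders further, the $6–8 billion requirement is a floor rather than a point estimate.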

FV004: Investment KPIs

IC-ready key investment indicators summarising Etched's valuation profile, return math, and scenario outcomes.

[CV009, CV011, CV012]

8.5 Exit Readiness, Kill Triggers, and Investment Verdict

Three explicit thesis-break triggers define the conditions under which any existing position must be exited, or any prospective investment declined, regardless of entry price. First, a tape-out failure or unscheduled abort at TSMC N4 would reduce the enterprise value to near zero; absent a functional chip, the IP is worth under $100 million in a distressed scenario. Second, if Mamba, RWKV, or any SSM-family architecture achieves a confirmed production inference deployment at any top-three hyperscaler before Sohu's commercial launch, the Transformer-only differentiation is eliminated without a recovery path. Third, failure to close a Series B at a post-money valuation of $800 million or above within 24 months of the Series A close would indicate that investors have lost confidence and that distress is imminent. Four final diligence asks must be resolved before any investment decision. The post-money Series A valuation and cap table require disclosure to establish the entry price, dilution baseline, and liquidation preference stack. The monthly burn rate, tape-out schedule with milestone dates, and cumulative TSMC NRE payments are required to validate runway and Series B timing. Any signed LOIs, evaluation agreements, customer pipeline data, or briefing recipients under NDA must be disclosed to substantiate the commercial thesis. The TSMC foundry agreement terms, including node selection, NRE payment schedule, and allocation priority, are required to assess whether $120 million is sufficient to reach first-silicon delivery. The investment verdict is conditional negative at implied valuations above $1.5 billion, and conditional track at or below $800 million, contingent on Series B close, tape-out completion, and first customer design win confirmation. [CV031, CV032, CV033, CV034, CV035, CV036]

Thesis-break and kill triggers table
Trigger	Category	Signal Event	Urgency	Required Action
First-silicon failure or tape-out abort	Execution / hardware	TSMC reports tape-out reject, severe PPA miss, or unscheduled respin before functional silicon delivery	Immediate	Exit position; IP value <$100M in distressed scenario; business has no revenue path
Transformer architecture displacement	Technology / market	Any top-3 hyperscaler (Google, AWS, Microsoft) announces production inference deployment of Mamba, RWKV, or another SSM-family architecture replacing the Transformer decoder at the inference layer	Within 1 quarter	Exit position; Sohu's core differentiation is eliminated with no recovery path
Series B failure at viable valuation	Financial / runway	Etched fails to close a Series B at ≥$800M post-money within 24 months of Series A close (deadline: mid-2026)	Within 6 months of deadline	Reevaluate; capital runway exhaustion implies forced sale or wind-down

Kill triggers are author-defined thresholds informed by comparable AI chip startup failures (Graphcore) and standard venture risk management. They are monitoring indicators, not mechanical sell rules; each requires re-evaluation in context at the time of occurrence.

[CV031, CV032, CV033]

Final diligence asks table
Ask	Priority	Why Required	Risk If Unresolved
Post-money Series A valuation and cap table with full option pool and liquidation preference stack	P0 — Blocking	Entry price, dilution baseline, and preference overhang cannot be assessed without this	Cannot determine whether any entry price offers a positive risk-adjusted return
Monthly burn rate, tape-out milestone schedule, cumulative TSMC NRE payments to date, and projected cash exhaustion date	P0 — Blocking	Runway validation and Series B timing depend entirely on burn and NRE cadence	Runway may be shorter than assumed; Series B timeline may be inside 12 months
Any signed LOIs, evaluation agreements, engineering briefing recipients under NDA, or customer pipeline data	P1 — Material	Commercial thesis has zero public evidence; even a non-binding LOI materially changes probability weighting	Cannot validate any bull-case probability without customer pipeline signal
TSMC foundry agreement terms — node selection (N4 vs N3), NRE payment schedule, allocation priority, and any right-of-first-allocation clause	P1 — Material	Foundry lock-in and NRE structure determine capital requirements and execution optionality	Cannot assess whether $120M is sufficient to reach first silicon delivery

P0 items are pre-conditions for any investment decision; P1 items are required before closing a term sheet. All four items relate to non-public information that Etched would need to disclose in a data room; none are publicly available as of the research date.

[CV034, CV035, CV036]

8.6 Exhibits

Disclaimer

This report was prepared for informational purposes only. All performance claims attributed to Etched are company-stated and have not been independently verified. Valuation estimates are illustrative scenario analyses and do not constitute investment advice. Forward-looking statements about the semiconductor market, AI architecture trends, and Etched's commercial trajectory involve substantial uncertainty.

Evidence index

Claims
ID	Statement	Confidence	Sources
CO001	Etched was founded in 2022 in Cupertino, California.	High	SO001, SO003
CO002	Etched's headquarters is located in Cupertino, California.	High	SO001, SO031
CO003	Etched announced a $120 million Series A funding round on June 26, 2024.	High	SO002, SO003, SO031
CO004	Etched's reported valuation at the time of the Series A was approximately $1 billion.	Medium	SO002, SO003
CO005	Etched's primary product is the Sohu chip, a purpose-built ASIC designed exclusively for Transformer neural network inference.	High	SO001, SO004
CO006	Etched claims the Sohu chip achieves approximately 500,000 tokens per second for Transformer inference workloads.	Medium	SO001, SO031
CO007	Etched claims that an NVIDIA H100 GPU achieves approximately 20,000 tokens per second for Transformer inference, compared to Sohu's 500,000.	Medium	SO001, SO007
CO008	Etched is pre-revenue as of the research date; the Sohu chip has not reached commercial production.	Medium	SO001, SO025
CO009	Primary Venture Partners participated as an investor in Etched's Series A round.	Medium	SO002, SO018
CO010	Gavin Uberti is the CEO and co-founder of Etched.	High	SO001, SO017
CO011	Chris Zhu is the CTO and co-founder of Etched.	High	SO001, SO031
CO012	Robert Winslow is a co-founder of Etched, based on early press coverage.	Medium	SO003, SO031
CO013	The Transformer neural network architecture was introduced in the 2017 paper 'Attention Is All You Need' by Vaswani et al. at Google Brain.	High	SO004, SO005
CO014	Etched's official website states its mission as 'Building the hardware for superintelligence.'	Medium	SO001
CO015	The Sohu chip hardcodes Transformer computation into silicon, eliminating the programmability overhead of general-purpose GPUs.	Medium	SO001, SO006
CO016	Etched claims a 25x or greater performance advantage for Sohu over NVIDIA H100 GPUs for Transformer inference.	Medium	SO001, SO031
CO017	Application-specific integrated circuits (ASICs) outperform general-purpose GPUs for specific fixed workloads by eliminating programmability overhead.	Medium	SO006, SO020
CO018	Positive Sum participated as an investor in Etched's Series A funding round.	Medium	SO016, SO003
CO019	NVIDIA is the dominant player in the AI accelerator market, with the H100 being the leading GPU for AI training and inference as of 2024.	High	SO007, SO024
CO020	NVIDIA's CUDA software ecosystem creates strong switching costs that make it difficult for customers to migrate to alternative AI accelerators.	Medium	SO007, SO024
CO021	Etched, as a fabless semiconductor company, will need to partner with a third-party foundry (most likely TSMC) to manufacture the Sohu chip.	Medium	SO026, SO021
CO022	Major large language models including GPT-4, LLaMA, and Claude are built on Transformer architecture.	Medium	SO005, SO019
CO023	Pre-revenue semiconductor startups face extreme capital and execution risk given multi-year chip development cycles and high tape-out costs.	Medium	SO022, SO034
CO024	Groq offers a Language Processing Unit (LPU) as an AI inference accelerator chip competing in the same market as Etched.	Medium	SO008, SO035
CO025	Cerebras Systems builds wafer-scale ASIC chips for AI compute, representing a direct competitor to Etched's chip-based approach.	Medium	SO009, SO035
CO026	SambaNova Systems offers AI accelerator products for enterprise AI workloads, competing in the AI inference market.	Medium	SO010
CO027	AMD's Instinct MI300X is a GPU-based AI accelerator competing for the AI inference and training market against NVIDIA.	Medium	SO011
CO028	Amazon Web Services offers Trainium custom AI chips for training and inference workloads on its cloud platform.	Medium	SO012
CO029	Google Cloud offers Tensor Processing Units (TPUs) as purpose-built AI accelerators for training and inference.	Medium	SO013
CO030	Intel Gaudi 3 is Intel's AI accelerator chip competing in the enterprise AI inference and training market.	Medium	SO014
CO031	The Transformer attention mechanism is computationally intensive, involving quadratic complexity with sequence length, making it a candidate for dedicated hardware acceleration.	Medium	SO004, SO005
CO032	Etched has not publicly disclosed its headcount as of the research date.	Low
CO033	Etched has not publicly disclosed any customer commitments or design wins as of the research date.	Low
CO034	Etched has not publicly disclosed revenue forecasts, tape-out timelines, or production schedules as of the research date.	Low
CO035	Mamba and other state space model (SSM) architectures have demonstrated competitive performance to Transformers on some sequence modeling tasks.	Medium	SO029, SO030
CO036	If a post-Transformer architecture achieves widespread AI adoption, the Sohu ASIC's hardcoded Transformer logic would become commercially obsolete.	Medium	SO029, SO025
CO037	Etched's Series A announcement received coverage from Bloomberg, Reuters, Wired, Fortune, and TechCrunch on or around June 26–27, 2024.	Medium	SO002, SO003, SO031, SO032, SO033
CO038	No material leadership changes at Etched have been reported in public press coverage as of the research date.	Medium	SO001, SO025
CO039	NVIDIA holds dominant market share in the AI chip market with an estimated 70-90% share of AI accelerator revenue.	Medium	SO024, SO022
CO040	Semiconductor chip development from design to production typically requires 3–5 years and hundreds of millions of dollars in capital investment.	Medium	SO022, SO034
CO041	Gavin Uberti's background includes research experience at Microsoft prior to co-founding Etched.	Medium	SO017, SO001
CO042	As a unicorn-valued startup at Series A, Etched's investor thesis appears to be a high-risk bet on Transformer architecture longevity and semiconductor execution.	Low	SO016, SO002
CO043	No public records of adverse regulatory actions, lawsuits, or sanctions against Etched or its founders have been found as of the research date.	Medium	SO001, SO025
CM001	Etched's total addressable market is the AI inference accelerator segment, specifically the subset of inference workloads running Transformer-based models.	Medium	SM001, SM003
CM002	The primary status-quo substitute for Etched's Sohu chip is the NVIDIA H100 GPU cluster deployed by cloud hyperscalers for LLM inference.	Medium	SM007, SM008
CM003	Etched's addressable market is bounded by Transformer architecture dominance; if non-Transformer models gain substantial inference share, Etched's SAM shrinks proportionally.	Medium	SM020, SM021
CM004	As of the research date, the overwhelming majority of commercially deployed LLMs are based on Transformer architecture, making Etched's near-term TAM very large.	Medium	SM013, SM014
CM005	Etched's market explicitly excludes AI training, edge AI, and non-Transformer inference workloads by design of the Sohu chip.	Medium	SM001, SM003
CM006	The global AI chip market was estimated at approximately $53 billion in 2023 with NVIDIA holding dominant market share.	Medium	SM001, SM002, SM006
CM007	The global AI chip market is projected to reach $300-500 billion by 2030, representing a 30-40% compound annual growth rate.	Low	SM001, SM029
CM008	The AI inference segment is estimated at $20-30 billion in 2024, representing approximately 40% of total AI chip market revenue.	Low	SM001, SM003, SM006
CM009	The AI inference market is projected to reach $100-200 billion by 2028-2030 as inference volumes grow faster than training.	Low	SM005, SM013
CM010	Etched's near-term SOM is estimated at less than $100 million for 2026-2027, assuming successful tape-out and initial hyperscaler pilots.	Low	SM001, SM026
CM011	Etched's 5-year SOM is estimated at $50M-$1B (2027-2030), representing 0.05-1% of the projected inference market — a wide range reflecting execution uncertainty.	Low	SM026, SM001
CM012	Cloud hyperscalers (AWS, Google Cloud, Microsoft Azure) are the largest potential buyers for AI inference chips, running billions of inference calls per day.	Medium	SM009, SM010, SM011
CM013	AI-native companies like OpenAI and Anthropic are highly cost-sensitive inference buyers, with compute costs representing a major component of their operating expenses.	Medium	SM013, SM018
CM014	Hyperscaler procurement cycles for new silicon vendors typically require 18-36 months of qualification and validation before production deployment.	Medium	SM018, SM023
CM015	Inference-as-a-service platforms (Together AI, Anyscale, Replicate) represent Etched's most accessible early customer segment due to shorter sales cycles and willingness to experiment.	Low	SM016, SM017
CM016	Etched must achieve compatibility with major AI model serving frameworks (vLLM, TensorRT-LLM, Hugging Face Transformers) to access the inference buyer market.	Medium	SM015, SM013
CM017	Etched's adoption path requires tape-out, first silicon validation, software ecosystem development, and successful hyperscaler pilot programs before any production revenue.	Medium	SM001, SM022
CM018	The explosive adoption of LLMs in commercial applications since ChatGPT's launch in November 2022 is the primary driver of inference compute demand growth.	Medium	SM005, SM013
CM019	GPU inference compute costs are a major and growing operational expense for LLM providers, creating strong economic incentive for more efficient silicon.	Medium	SM007, SM012
CM020	NVIDIA's CUDA ecosystem creates very high switching costs for AI chip buyers; migrating workloads to new silicon requires significant software re-engineering.	Medium	SM015, SM008
CM021	Hyperscalers are actively seeking to diversify their AI chip supply chains away from exclusive NVIDIA dependency, creating an opening for alternative silicon vendors.	Medium	SM009, SM010, SM011
CM022	Data center power density constraints are driving demand for higher performance-per-watt AI silicon, advantaging efficient ASIC designs over general-purpose GPUs.	Medium	SM019, SM022
CM023	AI hardware startups lack the production history, reliability data, and support infrastructure that hyperscalers require, representing a material adoption constraint.	Medium	SM016, SM017
CM024	US government export controls on advanced AI chips (e.g., H100 restrictions to China) affect NVIDIA but could open or close markets for Etched depending on certification status.	Low	SM008, SM006
CM025	Different analyst firms report significantly different AI chip market size estimates, with 2023 figures ranging from $40B to $80B and 2030 projections from $200B to $900B.	Medium	SM001, SM002, SM029
CM026	Market sizing discrepancies across analyst firms reflect different definitions of training vs inference spend, different assumptions about GPU adoption, and different views on edge AI inclusion.	Medium	SM026, SM001
CM027	Model efficiency improvements (quantization, speculative decoding, distillation) could reduce per-query inference compute requirements, potentially constraining total inference hardware spend growth.	Medium	SM013, SM005
CM028	The transition from AI model training to AI inference as the dominant compute workload is a secular market shift that benefits inference-focused chip vendors.	Medium	SM001, SM003, SM013
CM029	Budget ownership for AI chip procurement at hyperscalers is typically in the infrastructure/compute team, with multi-year capex commitments requiring executive approval.	Low	SM023, SM018
CM030	NVIDIA held approximately 70-80% of the AI accelerator market in 2023-2024, with AMD, Google TPU, and AWS Trainium representing the remainder.	Low	SM008, SM006
CM031	Groq has raised over $1 billion total in multiple funding rounds, demonstrating investor appetite for alternative AI inference chip companies.	Low	SM016, SM026
CM032	Cerebras Systems raised approximately $720 million total across multiple funding rounds to build its wafer-scale AI accelerator.	Low	SM017, SM026
CM033	As AI models scale in size, inference cost per token increases, creating a growing economic incentive for inference-optimized silicon.	Medium	SM013, SM007
CM034	The AI inference market's growth rate of 35-45% CAGR through 2030 is supported by multiple independent analyst forecasts, though precise estimates vary significantly.	Low	SM001, SM002, SM029
CM035	Hyperscalers' AI capital expenditure (capex) for 2024-2025 is reported in the hundreds of billions of dollars collectively, reflecting the scale of the AI infrastructure buildout.	Medium	SM009, SM010, SM011
CM036	The LLM API market, representing paid inference services for ChatGPT, Claude, Gemini and similar products, is estimated to generate tens of billions in revenue by 2025-2026.	Low	SM005, SM013
CP001	NVIDIA holds approximately 80-90% of the AI accelerator market as of 2024-2025, making it the dominant status-quo competitor for any AI inference chip.	Medium	SP012, SP013
CP002	Etched has raised $120M in total funding as of 2024, significantly less than Groq ($1.1B+), Cerebras ($720M+), or SambaNova ($1.2B+).	Medium	SP026, SP002, SP004
CP003	Groq uses a Language Processing Unit (LPU) architecture with deterministic streaming execution, optimized for low-latency inference; it supports general AI model inference including non-Transformer architectures.	Medium	SP001, SP002
CP004	Cerebras Systems uses a wafer-scale engine (WSE-3) with 44GB of on-chip SRAM and focuses primarily on training and large-model inference; it is not Transformer-specialized.	Medium	SP003, SP004
CP005	Google's TPU v5e is an inference-optimized tensor processing unit available on Google Cloud, used extensively for Gemini inference; it is not available for external purchase.	Medium	SP009, SP017
CP006	AWS Inferentia2 is available on EC2 Inf2 instances and targets cost-effective inference for large language models at approximately $0.76/hr per chip on-demand.	Medium	SP016
CP007	Etched has not delivered production silicon as of Q1 2026; the company has claimed a TSMC 4N tape-out but no third-party verification or production deliveries have been reported.	Medium	SP026, SP019
CP008	No known competitor has built a Transformer-only hardened ASIC for inference; Etched's specific architectural niche has no direct competition as of 2026.	Medium	SP001, SP003, SP010, SP012
CP009	The CUDA software ecosystem, representing decades of developer investment in GPU-native ML toolchains, is NVIDIA's primary moat against alternatives including Etched.	Medium	SP021, SP012
CP010	Etched has no publicly available SDK or demonstrated framework compatibility (PyTorch, JAX, vLLM) as of Q1 2026; software ecosystem is pre-launch.	Medium	SP019, SP026
CP011	AMD MI300X has achieved credible commercial traction as a NVIDIA alternative for inference, with Microsoft deploying MI300X at scale for Azure AI/OpenAI workloads.	Medium	SP014, SP015
CP012	Groq's GroqCloud API offers inference at approximately $0.27/1M tokens for Llama 3-70B as of late 2024, setting a competitive benchmark for inference-optimized silicon.	Medium	SP002, SP001
CP013	Tenstorrent raised over $700M in funding in 2024 and is building RISC-V-based AI chips with an open hardware architecture, targeting both edge and cloud AI inference.	Medium	SP010, SP011
CP014	Intel Gaudi (formerly Habana Labs, acquired for ~$2B in 2019) has not achieved significant market share in AI inference; Intel's software ecosystem lags CUDA significantly.	Medium	SP006, SP007, SP008
CP015	Multi-homing in AI inference—running both NVIDIA GPU and alternative inference chip in parallel—is technically feasible but requires significant engineering investment; buyers typically evaluate alternatives rather than fully switching.	Medium	SP021, SP024
CP016	A company evaluating Etched faces a 12-24 month sales and qualification cycle requiring SDK availability, model validation, and supply chain verification before production deployment.	Medium	SP024, SP018
CP017	NVIDIA, AMD, Google, and AWS have not disclosed plans for a Transformer-only hardened ASIC; their roadmaps focus on general-purpose AI accelerators with inference optimization.	Medium	SP012, SP013, SP009, SP016
CP018	Graphcore raised over $700M in venture funding, reached a $2.8B peak valuation, and was acquired by SoftBank in 2024 for approximately $120M after failing to achieve commercial scale.	Medium	SP005
CP019	Graphcore's commercial failure has been attributed to: misalignment with Transformer-dominated inference workloads, CUDA switching cost barriers, and failure to achieve required software ecosystem depth.	Medium	SP005
CP020	Etched faces the same three failure modes as Graphcore: architectural alignment risk (Transformer-only), CUDA ecosystem switching costs, and software ecosystem immaturity—the company must address all three before achieving commercial scale.	Medium	SP005, SP021, SP019
CP021	NVIDIA is continuously improving its inference-specific software (TensorRT-LLM, NeMo Guardrails, FlashAttention integration) to close the throughput-efficiency gap with specialized inference chips.	Medium	SP012, SP013
CP022	Intel's Gaudi acquisition and integration journey demonstrates that acquiring or building non-CUDA AI chip capability is difficult even for a company with Intel's resources and ecosystem.	Medium	SP006, SP007, SP008
CP023	SambaNova Systems raised approximately $1.2B and uses a reconfigurable dataflow architecture to target enterprise AI deployment; it has not disclosed revenue or deployment scale.	Medium	SP025
CP024	Competitive distribution channels for AI inference chips include: direct enterprise sales (Groq, Cerebras, SambaNova), cloud marketplace integration (AWS, GCP, Azure), and OEM/system integrator partnerships (NVIDIA, AMD).	Medium	SP002, SP004, SP016, SP017
CP025	The ASIC approach to AI chip design provides higher performance-per-watt for fixed workloads but limits flexibility; GPU and FPGA approaches sacrifice some efficiency for programmability.	Medium	SP018, SP020
CP026	No competitor has demonstrated independent third-party benchmarks comparing their performance against Etched's Sohu chip, as Etched has not released production silicon.	Medium	SP019, SP026
CP027	Wave Computing and Mythic AI represent earlier AI chip startup failures, adding further adverse data to the pattern of well-funded AI chip startups failing to achieve commercial scale.	Low	SP024
CP028	AMD's ROCm open-source software ecosystem has improved significantly in 2023-2024, providing a viable CUDA alternative for PyTorch and JAX workloads; this reduces the exclusivity of NVIDIA's software moat.	Medium	SP014, SP015
CP029	Microsoft Azure has made large-scale commitments to AMD MI300X deployment for OpenAI workloads, representing the most significant commercial validation of a non-NVIDIA GPU for major AI inference.	Medium	SP014, SP015
CP030	Etched's $120M in raised capital is insufficient to fund a multi-generation chip program; Groq and Cerebras each required $700M-$1.2B to reach commercial offerings without yet achieving profitability.	Medium	SP002, SP004, SP026
CP031	TSMC manufacturing access is not a differentiator for Etched because NVIDIA, AMD, Google, and multiple startups all manufacture at TSMC; fab access does not confer exclusive advantage.	Medium	SP013, SP018
CP032	The Positive Sum venture firm has invested in Etched, providing some external validation of the investment thesis, though investor perspective is inherently non-independent.	Medium	SP019
CP033	Hyperscaler internal AI chip programs (Google TPU, AWS Inferentia, Microsoft Maia) are captive to their respective clouds and do not compete in the open market; they represent demand displacement risk rather than direct market competition for third-party chip vendors.	Medium	SP009, SP016, SP017
CP034	Etched's differentiation claim—that attention operations can be 10× more efficient in hardened silicon vs. GPU—is architecturally sound in principle but has not been validated in production silicon by independent benchmarks.	Medium	SP018, SP020, SP019
CP035	The AI chip competitive landscape is rapidly evolving; NVIDIA's Blackwell architecture (B100/B200) includes inference-specific enhancements that may narrow the performance gap with specialized inference chips.	Medium	SP012, SP013
CI001	Etched is a pre-revenue semiconductor company with no reported revenue, customers, or commercial product as of Q1 2026.	Medium	SI008, SI009
CI002	Etched has not disclosed its monthly burn rate, cash position, or balance sheet as of Q1 2026.	Medium	SI008, SI009
CI003	Etched's primary intended revenue model is hardware chip sales (Sohu ASIC) to hyperscalers and large AI inference operators, based on the product's positioning as an inference chip.	Medium	SI008, SI009, SI007
CI004	TSMC advanced node wafer costs at leading-edge processes (4N/4nm equivalent) are estimated at $15,000-$20,000 per wafer, making chip cost a primary determinant of unit economics.	Low	SI001, SI003
CI005	NVIDIA's data center GPU gross margins are approximately 70-75% as of fiscal 2024, setting a benchmark for semiconductor AI chip profitability at scale.	Medium	SI006, SI007
CI006	Hardware revenue recognition for semiconductor products typically follows ASC 606 point-in-time model at chip delivery, creating lumpy revenue tied to production batch cycles.	Medium	SI005, SI003
CI007	Etched has not announced any government grants, CHIPS Act funding, or defense/intelligence contracts as of Q1 2026.	Medium	SI008, SI009
CI008	Tape-out costs for a leading-edge ASIC at TSMC advanced nodes are estimated at $5-15M for mask sets alone, before accounting for wafer purchase and yield costs.	Low	SI003, SI001
CI009	First-generation ASIC yield rates at leading-edge process nodes typically run 50-70%, with mature production rates reaching 85-95%; yield directly determines cost-per-good-chip.	Low	SI001, SI002
CI010	OSAT (outsourced semiconductor assembly and test) costs add approximately $20-50 per chip for standard packaging; advanced packaging (CoWoS, HBM integration) adds substantially more.	Low	SI003, SI002
CI011	Fabless semiconductor companies typically target gross margins of 40-65% for first-generation chips, improving to 60-75%+ in mature production as NRE costs are amortized and yields improve.	Low	SI002, SI006
CI012	Enterprise inference chip sales cycles typically span 12-24 months from initial contact to production deployment, driven by technical validation, supply chain qualification, and procurement timelines.	Medium	SI024, SI023
CI013	Etched's $120M Series A was raised at an implied valuation of approximately $1B, based on press coverage of the funding round; no financial terms have been officially confirmed.	Low	SI008, SI019
CI014	A semiconductor startup at Etched's stage (leading-edge ASIC development) typically burns $3-8M per month, driven by engineering headcount, EDA licensing, and wafer shuttle costs.	Low	SI003, SI022
CI015	At an estimated $5M/month burn rate, Etched's $120M Series A provides approximately 24 months of runway ($120M ÷ $5M/month) from the June 2024 close, suggesting cash through approximately mid-2026.	Low	SI008, SI022
CI016	Groq raised approximately $1.1B+ and Cerebras raised approximately $720M+ before reaching commercial product offerings; both required capital substantially in excess of Etched's $120M raise.	Medium	SI011, SI012
CI017	Etched will require a Series B of approximately $200-500M before achieving first production revenue, based on the capital consumption patterns of comparable AI chip startups.	Low	SI011, SI012, SI022
CI018	The working capital cycle for a fabless semiconductor company spans 12-24 months from tape-out to first customer revenue, reflecting design validation, production, and customer integration timelines.	Medium	SI001, SI002, SI003
CI019	Etched has not disclosed any adverse financial signals including layoffs, executive departures, or down-round indicators as of Q1 2026.	Medium	SI009, SI024
CI020	Etched has no publicly disclosed customer LOIs, design wins, or commercial purchase agreements as of Q1 2026.	Medium	SI008, SI009
CI021	The complete cap table for Etched beyond the announced Series A investors (Primary Venture Partners, Positive Sum) has not been publicly disclosed.	Medium	SI008, SI009
CI022	Etched cannot be underwritten from public financial data alone; all revenue, cost structure, and capital adequacy metrics require company-provided data.	High	SI008, SI009
CI023	The Graphcore trajectory (acquired at $120M after $700M+ raise) demonstrates that insufficient capital to sustain chip development through commercial ramp is a critical failure mode for AI chip startups.	Medium	SI017
CI024	Etched must achieve a chip ASP (average selling price) that produces a competitive cost-per-token vs. H100 to justify switching costs; at TSMC 4N wafer costs, this requires either a large die delivering high throughput or very high ASP.	Medium	SI004, SI007, SI001
CI025	AWS Inferentia2 on-demand pricing at $0.76/hr per chip sets the lowest available benchmark for inference chip economics in the cloud; Etched must be cost-competitive with this on a tokens-per-dollar basis.	Medium	SI016, SI018
CI026	Etched's capital adequacy risk is the most material financial risk in the diligence: the Series A is likely insufficient to fund chip development through first revenue without an additional raise.	Medium	SI008, SI011, SI012
CI027	Semiconductor companies must fund a second-generation chip development before first-generation revenue is fully ramped, compounding capital intensity beyond initial estimates.	Medium	SI022, SI001, SI006
CI028	Channel economics for cloud marketplace deployment would reduce Etched's realized revenue by 20-35% (standard cloud marketplace take rates), making direct enterprise sales more attractive economically.	Low	SI016, SI018
CI029	No public lawsuits, regulatory filings, or adverse legal disclosures related to Etched have been identified as of Q1 2026.	Medium	SI009, SI008
CI030	Etched's financial risk profile (pre-revenue, high capital intensity, 2+ year time to revenue, undisclosed cost structure) is typical of a Series A semiconductor startup and represents the highest-risk segment of hardware venture investment.	Medium	SI022, SI002, SI008
CI031	A TSMC 4N tape-out for a large AI chip likely requires 18-24 months from initial design freeze to first production wafers, setting the earliest realistic revenue date at H2 2025 to H2 2026 from Etched's 2024 start.	Low	SI001, SI003
CI032	Buyers evaluating Etched will apply price elasticity tests: if Sohu's cost-per-token economics do not show at least 30-50% savings vs. H100 in production deployments, switching costs will outweigh the benefit.	Medium	SI023, SI007, SI016
CI033	Etched's CAC (customer acquisition cost) in enterprise semiconductor sales is likely $500K-$2M per account in sales engineering and evaluation support, based on typical enterprise chip sales cycles.	Low	SI023, SI024
CI034	The primary financial verdict for Etched is: insufficient public data for underwriting; the company requires a Series B raise before production revenue; and the capital adequacy gap vs. comparable AI chip startups is the most material financial risk.	Medium	SI008, SI011, SI012, SI017
CI035	Etched's target inference chip must generate competitive TCO (total cost of ownership) at the 3-year hardware depreciation horizon vs. H100 cloud instances; a $12,000 ASP chip running at 500K tokens/sec must produce <$0.10/1M tokens to beat H100 cloud economics.	Low	SI007, SI016, SI009
CE001	Etched's Sohu is a Transformer-only ASIC designed exclusively for inference; the Transformer multi-head self-attention operation is permanently hardcoded in silicon rather than computed by programmable logic units.	High	SE001, SE013
CE002	Etched claims Sohu delivers approximately 10x the throughput of an NVIDIA H100 GPU for Transformer inference workloads; this claim is company-stated and has not been independently verified with production silicon.	Low	SE001, SE013
CE003	Etched has claimed tape-out on TSMC's 4N (4nm-class) advanced process node; as of Q1 2026 this tape-out has not been independently confirmed and no first silicon has been publicly demonstrated.	Low	SE001, SE011
CE004	No production silicon exists for the Sohu chip as of Q1 2026; no engineering samples have been publicly demonstrated or announced by Etched or any third party.	Medium	SE001, SE002
CE005	Etched has not published any SDK, developer documentation, API reference, model compatibility matrix, or inference runtime documentation as of Q1 2026; the absence is confirmed by the 404 error at etched.com/sohu and the lack of any developer resources on etched.com.	Medium	SE001, SE002
CE006	Sohu supports Transformer decoder model classes including GPT-4, LLaMA, and Mistral architectures according to Etched's product positioning; encoder and encoder-decoder Transformer models (T5 class) are also compatible with the hardwired attention architecture.	Medium	SE001, SE009
CE007	Sohu does not support Mamba, RWKV, or other state-space model (SSM) architectures, as these use recurrence rather than dot-product attention and are fundamentally incompatible with Sohu's hardwired attention engine design.	Medium	SE016, SE009, SE006
CE008	Hardwiring the Transformer attention operation in silicon eliminates the software kernel overhead, register pressure, and instruction dispatch costs that limit GPU throughput on autoregressive Transformer inference; this is Etched's core latency and throughput optimization mechanism.	Medium	SE006, SE009, SE003
CE009	FlashAttention (Dao et al., 2022) and its successors demonstrate that software-optimized attention computation can approach memory bandwidth limits on GPUs; Etched's silicon-encoded approach is the hardware analog of this optimization, permanently instantiating the IO-aware attention algorithm in silicon logic.	Medium	SE003, SE004, SE009
CE010	The Transformer architecture, introduced by Vaswani et al. in 'Attention Is All You Need' (2017), is the dominant paradigm for large language models, image-text models, and most production AI inference workloads as of 2026.	High	SE009, SE010
CE011	Hardwired logic circuits offer lower power consumption and latency for fixed-function computations compared to programmable logic (FPGAs, GPUs) because they eliminate instruction fetch, decode, and programmable datapath overhead; this is the fundamental design principle underlying Sohu's architecture.	Medium	SE006, SE012, SE024
CE012	Sohu's hardwired Transformer attention creates permanent architecture lock-in: the chip cannot be reprogrammed to support future non-Transformer architectures, and any architectural change to Transformer attention (e.g., grouped-query attention variants) may require a costly silicon re-spin.	Medium	SE006, SE012, SE016
CE013	Speculative decoding uses a smaller draft model to pre-generate tokens that a larger verifier model accepts or rejects; whether Sohu's hardwired attention engine efficiently accelerates both draft and verify passes in a speculative decoding pipeline has not been confirmed by Etched.	Low	SE007, SE001
CE014	Mamba, RWKV, and other state-space model architectures represent a genuine alternative to Transformer attention for sequence modeling; their emergence poses an architectural risk to Etched's Transformer-only strategy if they displace attention-based models in inference-dominant commercial workloads.	Medium	SE016, SE010
CE015	HuggingFace Transformers is the dominant model distribution framework for open-source Transformer models; any commercial inference chip must integrate with HuggingFace model hub formats (SafeTensors, config.json) to support standard LLaMA, Mistral, and similar checkpoints without manual conversion by the customer.	Medium	SE008, SE010
CE016	High Bandwidth Memory (HBM) is the standard memory architecture for AI inference accelerators requiring high-throughput access to large model weight tensors and KV-caches; Sohu almost certainly uses HBM stacks given its inference focus, though the specific generation and stack count are undisclosed.	Medium	SE019, SE012, SE017
CE017	Groq's LPU (Language Processing Unit) is the closest architectural analog to Sohu: both use fixed-function, non-GPU-based inference silicon optimized for Transformer inference; Groq emphasizes deterministic execution via SRAM-dominant memory while Sohu uses hardwired attention with HBM-backed KV-cache storage.	Medium	SE023, SE014, SE009
CE018	No independent benchmark data exists for Sohu as of Q1 2026; all performance claims including the 10x throughput claim vs. NVIDIA H100 are company-stated projections from Etched and its investors, with no third-party validation from any research group or customer.	Medium	SE001, SE002, SE013
CE019	Etched has not published any technical papers, architecture whitepapers, API documentation, or developer resources describing Sohu's microarchitecture, performance model, or software interface as of Q1 2026.	Medium	SE001, SE002
CE020	Developer adoption for AI inference chips requires at minimum: a model conversion tool, an inference runtime, and an OpenAI-compatible API endpoint; Etched has none of these available publicly, creating a customer adoption delay of approximately 6-12 months after first silicon before commercial deployments can begin.	Medium	SE008, SE023, SE015
CE021	Etched was founded by Gavin Uberti (CEO) and Chris Zhu, both former Google engineers; the company raised $120M in a Series A round in June 2024 from Primary Venture Partners and Positive Sum, with approximately 20-30 employees as of Q1 2026.	Medium	SE013, SE015, SE001
CE022	Etched's company homepage (etched.com) is operational and describes the Sohu chip concept; the Sohu product page (etched.com/sohu) returns a 404 error, indicating no public product documentation or specification has been published.	Medium	SE001, SE002
CE023	The absence of a Sohu product page at the company's own domain, combined with no SDK or documentation, is consistent with active silicon development at pre-tape-out or early tape-out stage with no customer-facing materials ready.	Medium	SE001, SE002
CE024	Tenstorrent Wormhole and similar programmable AI accelerators represent the competitive approach of Transformer-plus-other-workload capability; these chips sacrifice some peak attention throughput to retain flexibility for MoE, SSM, and custom operator support — directly opposing Etched's Transformer-only specialization strategy.	Medium	SE014, SE024
CE025	Production deployment of Sohu requires a compiler that ingests standard Transformer model checkpoint formats (HuggingFace SafeTensors, ONNX) and generates a Sohu-native execution graph; this compiler must handle model-specific attention head configurations, quantization levels, and operator fusion for each supported model family.	Medium	SE008, SE009, SE012
CE026	Sohu's target inference use cases span multiple model families (LLaMA, Mistral, Falcon, GPT-NeoX, Phi) that have architecture variations in head count, layer depth, context length, and vocabulary size; the compiler must handle this architectural diversity without requiring Sohu chip redesign.	Medium	SE008, SE010
CE027	Tape-out on TSMC 4N for a large-die AI inference ASIC requires 18-24 months from design freeze to first production wafers, including mask fabrication (8-12 weeks), wafer processing (8-12 weeks), and packaging and test; this sets the earliest realistic first silicon at H2 2025 to Q2 2026 from a 2024 tape-out start.	Medium	SE020, SE011, SE021
CE028	No Etched customer, evaluation partner, or design win has been publicly announced as of Q1 2026; no hyperscaler, AI-native company, or inference platform operator has been named as an Etched customer or evaluation partner in any press release or investor communication.	Medium	SE001, SE013, SE015
CE029	Sohu functions as a PCIe inference accelerator co-processor requiring a host server for request orchestration, user-facing API serving, and model loading; it is not a standalone compute unit and requires host CPU integration for system software and serving stack operation.	Medium	SE012, SE024, SE023
CE030	Hardcoded Transformer attention in Sohu silicon implies per-query and per-batch attention computation is fully pipelined without software kernel dispatch overhead; this is the mechanism by which Etched claims to achieve throughput superior to FlashAttention-on-GPU for autoregressive decoding workloads.	Medium	SE006, SE003, SE009
CE031	Mixture of Experts (MoE) Transformer architectures route tokens through sparse expert feed-forward layers; while the attention layers remain standard Transformer multi-head attention, the expert routing and gating logic is not part of the hardwired attention operation and may create a host-side bottleneck on a Sohu-class chip.	Medium	SE022, SE009, SE006
CE032	If verified, Sohu's claimed 10x throughput advantage over H100 for Transformer inference would translate to approximately 10x lower cost-per-token at equivalent chip pricing, making Sohu a compelling cost-reduction option for hyperscaler inference operators running large-scale LLM serving.	Low	SE018, SE001, SE017
CE033	FlashAttention-2 and FlashAttention-3 have demonstrated that software-optimized attention can reach a large fraction of theoretical GPU FLOPS for attention compute (a reported 50-73% on A100 for FlashAttention-2 and up to roughly 75% on H100 for FlashAttention-3); Etched's silicon approach must demonstrate additional throughput gains beyond the FlashAttention-3 ceiling to justify the permanent architectural trade-off.	Medium	SE003, SE004, SE017
CE034	Etched has not disclosed what numerical precision formats (FP8, INT8, BF16, FP16, FP32) Sohu's attention engine supports; precision flexibility is critical for model compatibility — modern inference deployments typically use INT8 or FP8 quantization to reduce memory bandwidth requirements and improve tokens-per-second throughput.	Medium	SE001, SE012
CE035	Etched has approximately 20-30 employees as of Q1 2026, based on investor page references and press reports; this is a very small engineering team for a leading-edge ASIC development program that typically requires 50-150+ engineers for chip design, verification, and software development combined.	Low	SE013, SE015
CE036	Etched's Transformer-only ASIC represents a high-conviction market bet that the Transformer architecture will remain the dominant paradigm for LLM inference for 5-10 years — the operational lifetime of a chip generation; this bet has precedent in Google's TPU success but also in the failure of earlier single-architecture AI chip programs.	Medium	SE026, SE010, SE016
CE037	Etched's product roadmap beyond Sohu Gen 1 has not been disclosed; no second-generation chip has been announced, and the absence of a multi-generation roadmap is a commercial risk factor for enterprise customers requiring platform visibility before committing to an inference silicon platform.	Medium	SE001, SE013
CU001	Etched has zero named customers, signed letters of intent, design wins, or publicly disclosed evaluation partners as of Q1 2026; the company's homepage, investor page, and all public communications contain no customer references.	High	SU018, SU019, SU006
CU002	The most probable first-wave customer targets for Etched's Sohu chip are frontier AI labs running large-scale Transformer decoder inference (OpenAI, Anthropic, Mistral) and inference-as-a-service platforms with high GPU spend (Together AI, Anyscale, Perplexity).	Medium	SU001, SU002, SU004, SU012
CU003	The primary buyer persona for an AI inference chip is the VP of Infrastructure or ML Platform team at a company spending more than $50 million annually on GPU compute, representing an estimated 50-200 companies globally as of 2026.	Medium	SU006, SU007
CU004	Groq's case studies page demonstrates that AI-native inference platforms and consumer AI applications have adopted the Groq LPU for production Transformer inference workloads, validating that the buyer segment Etched is targeting does adopt specialized inference hardware.	Medium	SU006, SU022
CU005	AWS Inferentia case studies, including Stability AI (image generation inference) and Quora (Poe chatbot inference), show that AI companies adopt custom inference silicon when per-token cost economics are demonstrated to be 40-70% cheaper than GPU alternatives at comparable throughput.	Medium	SU007, SU015, SU016
CU006	OpenAI operates one of the world's largest Transformer inference deployments, running GPT-4 class and subsequent models at consumer web scale across hundreds of millions of monthly active users, making it the highest-value potential Etched customer.	High	SU001, SU008
CU007	Anthropic operates the Claude family of Transformer decoder models (Claude 3 Haiku, Sonnet, Opus) as both a consumer product and an enterprise API, with LLM inference costs material to its unit economics given the models' size and deployment scale.	Medium	SU002, SU009
CU008	Cohere provides LLM-based enterprise products (RAG, embedding, rerank) built on Transformer architectures, with inference being the core infrastructure cost; however, Cohere's embedding workloads are less attention-compute-bound than decoder inference, reducing the Sohu performance advantage.	Medium	SU003, SU013
CU009	Together AI operates an open-source model inference API platform serving research organizations and commercial developers at below-GPU-cloud pricing, making it one of the most price-sensitive and potentially receptive first-wave Etched customer targets.	Medium	SU004, SU010
CU010	Perplexity AI uses Transformer inference at scale to power its AI search product, running multiple LLM requests per user query, making it a representative example of the latency-sensitive and throughput-sensitive inference use case Etched's Sohu chip is optimized for.	Medium	SU011, SU022
CU011	Mistral AI offers both commercial Transformer inference APIs and widely adopted open-source models (Mistral 7B, Mixtral 8x7B), making it both a potential Etched customer and an indicator of the tier of companies that constitute Etched's primary target segment.	Medium	SU012, SU004
CU012	The standard customer journey for an AI inference chip adoption spans at least 5 phases — technical briefing, architecture validation, benchmarking, hardware integration, and production deployment — with the total timeline from first contact to production revenue typically 12-24 months.	Medium	SU006, SU007
CU013	Enterprise procurement of novel inference hardware requires legal review, security assessment, supply commitment negotiation, and SLA definition, which alone adds 3-6 months to the procurement timeline beyond the technical evaluation phase.	Medium	SU006, SU015
CU014	Graphcore, an AI accelerator company that raised over $700 million in total funding, failed to achieve customer adoption at the scale needed to sustain operations and was acquired by SoftBank in 2024 at a material loss to investors, demonstrating that specialized AI chip startups can fail to convert strong benchmarks into commercial traction.	Medium	SU020, SU021
CU015	Etched has published no pricing schedule, total cost of ownership analysis, cost-per-token benchmark, or commercial evaluation datasheet as of Q1 2026; potential customers have no publicly available quantitative basis for commercial evaluation.	Medium	SU018, SU019
CU016	Hacker News discussion of Etched's $120 million Series A included developer community skepticism about Transformer-only silicon, with commenters raising concerns about architecture lock-in risk, the potential for state-space models to supersede the Transformer paradigm, and the long timeline to first revenue.	Medium	SU021, SU018
CU017	The AI inference chip total addressable market is estimated to grow from approximately $5-10 billion in 2024 to $30-80 billion by 2030 as LLM inference costs scale with model deployment volumes and GPU-based inference becomes the dominant cloud computing cost category.	Low	SU006, SU007
CU018	An estimated 50-200 companies globally meet the threshold of more than $50 million in annual GPU compute spend that qualifies them as near-term viable Etched Sohu customers, concentrated in frontier AI labs, hyperscaler API teams, and inference-as-a-service platforms.	Low	SU001, SU002, SU008
CU019	Etched's Transformer-only architecture creates potential revenue concentration risk: with early production capacity limited to somewhere between tens and hundreds of chips, any single customer consuming 30% or more of that capacity makes revenue dangerously dependent on one buyer's success.	Medium	SU018, SU019
CU020	Etched has not disclosed any customer pipeline data — no count of active evaluations, no stage distribution, no LOI status, and no customer engagement funnel metrics — in any public communication through Q1 2026.	Medium	SU018, SU019
CU021	AWS Inferentia customer deployments, including Stability AI for image generation and Quora for chatbot inference, demonstrate that AI companies will adopt non-GPU inference silicon when cost per token is validated as 40-70% lower than GPU alternatives.	Medium	SU015, SU016, SU007
CU022	Together AI, as an open-source model inference API competing on price and performance against GPU cloud providers, exemplifies the most price-sensitive and immediately addressable first-wave Etched customer: a company spending $20-200 million annually on inference that would directly benefit from a 10x cost reduction if Sohu's claims are verified.	Medium	SU004, SU010
CU023	The key buying criteria for AI inference chip procurement, inferred from analog company adoption patterns at Groq and AWS Inferentia and from G2 developer reviews, are: (1) tokens/second throughput for target models, (2) cost per million tokens, (3) vendor reliability and supply chain, (4) SDK and software ecosystem maturity, and (5) migration path from existing GPU workloads.	Medium	SU006, SU007, SU017
CU024	With approximately 20-30 employees as of Q1 2026, Etched is too small to simultaneously manage TSMC tape-out, SDK development, enterprise sales outreach, and customer success programs for multiple evaluation partners; the team size is appropriate for silicon development but not for customer acquisition.	Medium	SU018, SU019
CU025	Once an AI company completes hardware integration with Sohu — retooling its serving stack, model compiler, and deployment pipeline (estimated 3-6 months, 2-5 engineers) — switching costs become very high: an estimated 12-18 months of re-engineering to migrate away from Sohu creates structural retention lock-in.	Medium	SU006, SU007
CU026	The first Etched evaluation customer would need to accept four conditions simultaneously: pre-production silicon risk (no demonstrated hardware), NDA-governed evaluation terms, allocation of 2-5 dedicated integration engineers, and willingness to serve as a named design-win reference for future Etched fundraising.	Medium	SU018, SU022, SU006
CU027	Groq secured engineering briefings and early developer interest before first production silicon delivery by building benchmark claims backed by early hardware demonstrations at AI conferences; Etched has not replicated this pre-silicon customer engagement approach as of Q1 2026.	Medium	SU006, SU022, SU017
CU028	Etched's unverified 10x throughput claim relative to the NVIDIA H100 cannot be independently evaluated without engineering samples; no third-party benchmark has been published, placing Etched significantly behind Groq and Cerebras in the volume of customer-evaluable technical evidence.	Medium	SU018, SU023, SU022
CU029	Roughly four years post-founding (2022) and more than one year after its $120 million Series A, Etched has not named a single evaluation partner, design-win customer, or engineering briefing recipient; this absence of customer signal is a diligence yellow flag relative to comparable inference chip companies at equivalent funding stages.	Medium	SU018, SU019, SU021
CU030	Scale AI provides AI data labeling and synthetic data generation for frontier labs; its downstream clients' inference economics would benefit from Sohu cost reductions, making Scale AI an indirect potential customer or channel partner for Etched.	Low	SU014, SU018
CU031	Mistral AI raised over $1 billion in funding in 2024 and operates both commercial Transformer inference APIs and widely downloaded open-source models at significant scale, placing it in Etched's tier-1 target segment for the 2027-2028 adoption window.	Medium	SU012, SU004
CU032	Cohere's enterprise RAG and embedding inference workloads are predominantly Transformer encoder-based; while Sohu's hardened attention accelerates encoder inference, the workloads are less attention-compute-bound than decoder inference, reducing the claimed 10x performance advantage for Cohere's primary use cases.	Medium	SU003, SU013
CU033	Inference-as-a-service platforms including Together AI and Anyscale are growing compute spend rapidly as open-source model inference volumes increase in 2025-2026; these platforms are the most price-sensitive inference buyers and would benefit most from Sohu's claimed cost-per-token economics if verified.	Medium	SU004, SU005, SU010
CU034	No publicly available VC reference check, independent analyst customer channel check, or third-party evaluation of Etched's customer pipeline depth has been published as of Q1 2026; all customer-pipeline information must be solicited directly from Etched under NDA.	Medium	SU019, SU018
CU035	Based on analogs from Groq's initial deployment and Cerebras' early hyperscaler engagements, Etched requires a minimum of 3-5 committed evaluation customers with binding production intent to justify the operational costs of full-production wafer starts at TSMC.	Low	SU022, SU023
CU036	If Etched's first three customers each represent 25-35% of first-year revenue and any one reduces usage or exits — due to architectural shift away from Transformer models, a competitor offering better economics, or loss of the customer's own funding — Etched faces a revenue shock that would threaten its operating runway at current burn rates.	Medium	SU018, SU019
CU037	Graphcore's commercial failure followed a pattern where strong architectural performance benchmarks failed to convert into customer adoption at scale because the software stack required too much customer re-engineering effort; this is the identical risk profile Etched faces with Sohu, where SDK maturity and integration friction are primary adoption barriers.	Medium	SU020, SU021, SU018
CR001	The Sohu chip hardcodes Transformer attention mechanisms directly in silicon, making the architecture non-patchable via software after tape-out; no firmware or software update can change the fundamental compute graph the chip executes.	Medium	SR015, SR016, SR034
CR002	Because Sohu's silicon hardcodes attention, any shift in the dominant model architecture away from dense Transformer decoders makes the chip architecturally stranded with no recovery path short of a complete ASIC redesign requiring 3–4 years and an estimated $100–400M in new NRE costs.	Medium	SR015, SR025
CR003	If Transformer architectures are materially displaced by state-space models (Mamba, RWKV) or mixture-of-experts architectures within 4–6 years, Etched's commercial value is effectively zero because the chip's performance advantage over GPUs is entirely derived from the hardcoded Transformer attention accelerator.	Medium	SR022, SR026, SR027
CR004	HBM supply is concentrated among three manufacturers — SK Hynix, Samsung, and Micron — and AI chip startups with no production revenue have essentially no leverage to secure priority HBM3E allocation against established players NVIDIA and AMD.	Medium	SR018, SR014
CR005	ASIC tape-out at TSMC's N3/N4 process node carries an estimated NRE cost of $20–200M per attempt depending on mask count and design complexity; a first-silicon respin adds 12–18 months and a further $20–50M in NRE cost on top of the original tape-out expense.	Medium	SR025, SR017
CR006	TSMC commands more than 50% of global advanced semiconductor foundry capacity and is the only high-volume N3/N4 foundry available to fabless companies; there is no credible alternative foundry at equivalent process maturity if TSMC faces disruption.	High	SR017, SR005
CR007	A Taiwan Strait military escalation or forced TSMC operational shutdown would disrupt the global advanced semiconductor supply chain with no equivalent N3/N4 substitute capacity available in the short term; Etched, as a TSMC-dependent fabless startup, has no mitigation available before revenue.	Medium	SR017, SR014
CR008	The standard ASIC development cycle from tape-out submission to volume production is roughly 15–24 months: approximately 6–9 months from tape-out to first-silicon return, and a further 9–15 months for bring-up, validation, and production ramp.	Medium	SR025, SR017
CR009	Export Administration Regulations (EAR) administered by BIS require US persons and companies to obtain export licenses before exporting advanced semiconductor devices to certain countries; all international Sohu chip sales must be screened against the BIS Entity List and applicable CCL entries.	High	SR002, SR005
CR010	The CHIPS and Science Act (2022) provides approximately $52 billion in incentives for US semiconductor manufacturing, but recipients must comply with guardrails including a 10-year prohibition on material expansion of advanced chip manufacturing in countries of concern; TSMC's CHIPS Act-funded facilities carry these compliance obligations through supply agreements.	High	SR001, SR006
CR011	The EU AI Act (2024) introduces GPAI (general-purpose AI) model compliance requirements affecting providers of Transformer-based LLMs; customers deploying Sohu-accelerated inference for GPAI models in the EU face compliance obligations that may create indirect chip infrastructure requirements.	Medium	SR007, SR008
CR012	The BIS October 2023 Federal Register rule expanded export controls on advanced logic semiconductor manufacturing items, tightening restrictions on chips and manufacturing equipment flowing to entities in countries of concern — directly affecting the supply chain Etched depends on.	High	SR006, SR003
CR013	The BIS Entity List restricts exports to hundreds of parties without a prior license; any Etched international sale requires screening each customer against the Entity List, Unverified List, Denied Persons List, and SDN list before shipment.	High	SR005, SR003
CR014	NVIDIA has demonstrated willingness to pursue patent litigation against semiconductor competitors, including the NVIDIA Corp. v. Samsung and Qualcomm case, indicating material IP risk for chip startups whose designs may overlap with NVIDIA's extensive patent portfolio.	Medium	SR009, SR019
CR015	Arm Holdings licenses its ISA and processor microarchitectures to semiconductor companies worldwide; any ASIC incorporating Arm-based processor cores — a standard practice for complex control-plane logic — requires a current, paid Arm architecture license agreement.	Medium	SR010, SR024
CR016	Trade secret misappropriation claims represent a real legal risk for chip startups that hire engineers from incumbents like NVIDIA, Meta, or Google; former employers regularly monitor and litigate alleged IP transfer to competing chip design teams.	Medium	SR009, SR024
CR017	Semiconductor IP core licensing is standard practice in ASIC design; most complex chips incorporate third-party IP blocks (PCIe controllers, memory interfaces, standard cell libraries) that require ongoing licensing agreements with IP vendors including Arm, Synopsys, and Cadence.	Medium	SR024, SR010
CR018	Etched has not disclosed any freedom-to-operate (FTO) analysis, patent portfolio assessment, or Arm Holdings licensing agreement in public communications as of Q1 2026; the IP risk posture of the Sohu design is unknown from public sources.	Medium	SR015, SR016
CR019	The EU AI Act entered into force in August 2024 with phased implementation; GPAI model providers must meet transparency, documentation, and safety requirements, which may affect procurement decisions for inference infrastructure including Sohu chips in European deployments.	Medium	SR007, SR008
CR020	NVIDIA's Blackwell architecture (launched 2024–2025) delivers an estimated 2–4× inference throughput improvement over Hopper-class H100/H200 silicon for Transformer decode workloads, significantly raising the performance bar Sohu must clear to justify customer adoption of a new chip vendor.	Medium	SR019, SR032
CR021	AMD MI300X/MI325X chips have captured meaningful inference market share in 2024–2025, particularly from inference-as-a-service platforms running open-source models; AMD's competitive pricing creates a cost-floor that narrows Sohu's economic advantage for cost-sensitive workloads.	Medium	SR011, SR033
CR022	Hyperscaler captive silicon programs, including Google TPU v6 (Trillium), AWS Trainium 2, and Microsoft Maia 100, are built for each hyperscaler's own training and inference workloads at internal scale, reducing or eliminating the need for those companies to source external inference ASICs from startups like Etched.	Medium	SR030, SR019
CR023	Groq (LPU) and Cerebras (CS-3) are direct AI inference ASIC competitors with production deployments, published performance benchmarks, and established customer relationships — giving them a 2–3 year head start over Etched on customer trust, SDK maturity, and production experience.	Medium	SR028, SR031
CR024	Tenstorrent's RISC-V-based AI chip offers a semi-programmable architecture that retains significant flexibility compared to a pure hardcoded ASIC; this semi-flexible positioning could attract customers who want performance-per-watt advantages without sacrificing the ability to run non-Transformer workloads.	Low	SR030, SR012
CR025	Graphcore's failure — a chip company that raised more than $700 million and achieved strong architectural performance but failed to convert that advantage into commercial traction at scale — is the most directly applicable cautionary analog for Etched's risk profile.	Medium	SR020, SR021
CR026	Graphcore's failure was substantially driven by SDK immaturity: the difficulty of porting existing PyTorch/TensorFlow models to Graphcore's IPU software stack created adoption friction that prevented customers from realizing the benchmarked performance advantages in production — the identical risk that Etched faces given its undisclosed SDK status.	Medium	SR020, SR012
CR027	CEO Gavin Uberti is 23 years old and has no prior experience leading a chip company through the full development cycle from RTL design to tape-out to volume production; while Etched's team includes engineers from established chip companies, the organizational execution track record is entirely unproven.	Medium	SR023, SR015
CR028	Etched's team of approximately 30 engineers is small for a full-stack ASIC development effort that requires simultaneous execution across digital design, physical design, DFT, mixed-signal, TSMC PDK integration, verification, firmware, SDK, and customer engineering tracks.	Medium	SR015, SR023
CR029	Etched has disclosed no SDK, no compiler, no developer program, and no software stack for Sohu as of Q1 2026; without a software ecosystem, customer adoption requires customers to port their serving infrastructure entirely from scratch — the same adoption friction that contributed to Graphcore's failure.	Medium	SR015, SR020
CR030	An AI chip company with a hardcoded architecture requires at least 2–3 years of software ecosystem development (compiler, runtime, operator library, serving framework integration) to reach the SDK maturity needed for production customer deployments; Etched has not yet publicly started this program.	Medium	SR020, SR028
CR031	Etched's supply chain for Sohu involves at minimum four single-source dependencies: TSMC (foundry), HBM suppliers (memory), Arm Holdings (if Arm IP is used), and EDA tooling vendors (Synopsys, Cadence); each represents a point of failure with limited substitution options.	Medium	SR017, SR018, SR024
CR032	FlashAttention, PagedAttention, and speculative decoding are algorithmic variants that have become standard in production Transformer serving but may require specific hardware memory access patterns; if Sohu's hardcoded attention logic cannot support these variants, customers using PagedAttention-based serving (vLLM, TensorRT-LLM) would face compatibility blockers.	Low	SR015, SR012
CR033	Etched raised a $120 million Series A in June 2024; at an assumed pre-tape-out burn rate of $3–6 million per month, this funding provides approximately 20–40 months of runway, which sets a hard deadline for achieving first silicon or raising a Series B somewhere between roughly Q1 2026 (20 months) and Q4 2027 (40 months).	Medium	SR015, SR016, SR029
CR034	The earliest plausible first-revenue date for Etched is H2 2027, contingent on tape-out completion in 2025–2026, first-silicon pass without respin, successful customer benchmarking within 6–12 months of silicon delivery, and at least one customer completing a production deployment — a chain of dependencies with compounding execution risk.	Medium	SR025, SR015
CR035	A Series B raise will be required before any product revenue is realized, making Etched's financial survival entirely dependent on VC market conditions at the time of the raise; if AI hardware investment sentiment deteriorates or funding multiples compress in 2026, Etched may not be able to raise at acceptable terms.	Medium	SR016, SR034
CR036	If the Series B raise fails or is delayed beyond runway exhaustion (a scenario triggered by lack of design wins, AI funding market contraction, or poor first-silicon results), Etched would have to choose among a distressed sale, a wind-down, or a bridge round on unfavorable terms.	Medium	SR020, SR034
CR037	First-silicon respin at TSMC would add approximately 12–18 months to the development timeline and $20–50 million in additional NRE cost; combined with continued burn, a respin scenario could exhaust the $120 million Series A before any customer revenue is received.	Medium	SR025, SR017
CR038	If AI inference market growth slows or pauses in 2026–2027, the economic rationale for adopting a new inference ASIC vendor weakens: GPU cost declines reduce the per-token cost advantage Sohu must demonstrate, and enterprise infrastructure spending pauses reduce customer willingness to take on integration risk.	Medium	SR030, SR016
CR039	Mamba (selective SSM) has demonstrated competitive language modeling performance on academic benchmarks versus Transformers of comparable size, and its linear-time inference complexity eliminates the KV-cache memory bandwidth bottleneck — the specific bottleneck Sohu's hardcoded silicon targets.	Medium	SR022, SR027
CR040	Etched has zero announced customers, zero design wins, zero signed LOIs, and zero publicly named evaluation partners as of Q1 2026 — more than two years post-founding and over twelve months post-Series A, which is an unusually weak commercial signal for a well-funded chip startup at this stage.	Medium	SR015, SR016, SR020
CR041	Developer community commentary on Etched's Series A raised substantive concerns about the Transformer-only architecture bet, with experienced practitioners noting that model architecture shifts in AI have historically occurred within 3–5 year windows — comparable to or shorter than Sohu's projected commercial cycle.	Medium	SR021, SR012
CR042	AI safety concerns and evolving AI governance frameworks at the EU, US, and national levels may generate new chip-level compliance requirements (hardware security, provenance attestation, compute usage reporting) that increase the regulatory compliance burden for AI inference chip vendors.	Low	SR013, SR007
CV001	NVIDIA's market capitalisation reached approximately $3 trillion in 2024, with an implied EV/Revenue multiple of approximately 25× on its AI infrastructure segment revenues.	Medium	SV020, SV029
CV002	Advanced Micro Devices (AMD) traded at approximately $200 billion market capitalisation in 2024 with an EV/Revenue multiple of 6–8× on its AI chip segment, reflecting lower inference-market penetration than NVIDIA.	Medium	SV010, SV014
CV003	Marvell Technology's AI ASIC custom silicon business generated approximately $1.6 billion in revenue in fiscal year 2025, with the company trading at 10–15× revenue on its AI segment, making it the most directly applicable production-stage AI ASIC comparable for Etched.	Medium	SV004, SV014
CV004	Broadcom's custom silicon and networking revenues for AI sustained an 18–20× EV/Revenue premium within its overall market capitalisation of approximately $700 billion in 2024.	Medium	SV005, SV014
CV005	Intel acquired Habana Labs for approximately $2 billion in December 2019, establishing it as the primary precedent transaction for pre-revenue AI chip startup acquisitions by a strategic buyer.	Medium	SV018, SV023
CV006	Graphcore reached a peak valuation of approximately $2.8 billion in 2021 but entered severe commercial and financial decline by 2023–2024; its IPU architecture never achieved production-scale commercial adoption, making it the leading cautionary analog for Etched.	Medium	SV017, SV013
CV007	Cerebras Systems filed for IPO in September 2024 at an implied enterprise value of $7–8 billion; the IPO was delayed following scrutiny of its primary customer G42's ties to Chinese entities, demonstrating capital-market fragility for AI chip startups even after production deployments.	Medium	SV019, SV030
CV008	Groq raised $640 million in an August 2024 funding round at a reported valuation of approximately $2.8 billion; unlike Etched, Groq has production LPU deployments and paying customers, representing a materially de-risked comparable profile.	Medium	SV022, SV031
CV009	Etched raised $120 million in a Series A funding round in June 2024, with Positive Sum as lead investor and Primary Venture Partners as co-investor, as reported by Bloomberg and confirmed by both investors' public portfolio pages.	Medium	SV015, SV016, SV023
CV010	Etched's post-money Series A valuation has not been publicly disclosed by the company, Positive Sum, Primary Venture Partners, Bloomberg, Reuters, TechCrunch, or Fortune as of Q2 2026.	Medium	SV015, SV016
CV011	Analyst estimates based on typical Series A dilution norms for hardware companies at this raise size place Etched's post-money valuation in the $600–800 million range, implying that the $120 million round sold approximately 15–20% of the company to new investors.	Medium	SV006, SV001
CV012	Pre-revenue AI chip startups in 2022–2024 commanded post-money valuations of $500 million to $2 billion depending on team credibility, technical differentiation, and market timing, based on publicly reported funding rounds.	Medium	SV006, SV012
CV013	Comparable Company Analysis applied to pre-revenue companies like Etched requires using projected future revenue discounted for execution risk rather than actual trailing revenue, materially widening the valuation range versus production-stage comparables.	Medium	SV002, SV001
CV014	Discounted cash flow analysis for Etched is not feasible from public information as the company has not disclosed any revenue forecast, burn rate, operating model, or cash position, making CCA on projected 2027–2028 revenue the only tractable valuation methodology.	Medium	SV007, SV015
CV015	The bull case enterprise value for Etched is $3–5 billion, based on 10–15× EV/Revenue applied to $200–300 million projected 2028 revenue; this requires a first-silicon pass without respin and at least one confirmed hyperscaler design win by H2 2027.	Medium	SV001, SV002
CV016	The base case enterprise value for Etched is $800 million to $1.5 billion, based on 4–6× risk-adjusted EV/Revenue applied to $100–150 million projected 2028 revenue, reflecting first-silicon delivery with execution challenges and a single initial customer.	Medium	SV001, SV006
CV017	The bear case enterprise value for Etched is $200–500 million, reflecting tape-out failure, silicon respin, architecture obsolescence, or inability to close a Series B, consistent with Graphcore's distressed exit trajectory.	Medium	SV007, SV017
CV018	The bull case probability signal is 15–20%, conditioned on TSMC N4 tape-out success, first-silicon pass without respin, and at least one hyperscaler customer confirmation by H2 2027.	Medium	SV006, SV002
CV019	The base case probability signal is 40–50%, reflecting the base rate for pre-revenue AI chip startups achieving first silicon without respin and securing at least one initial customer.	Medium	SV006, SV001
CV020	The bear case probability signal is 30–40%, elevated by zero commercial traction, the Graphcore failure analog, and the compounded execution risk of a first-time chip CEO operating with a team of approximately 30 engineers.	Medium	SV017, SV007
CV021	Graphcore raised over $700 million across multiple rounds, demonstrated benchmark-superior IPU architecture, but failed to achieve commercial traction at scale; its software stack never reached enterprise production maturity, resulting in a distressed outcome.	Medium	SV017, SV013
CV022	A $120 million Series A at an estimated $600–800 million post-money implies 15–20% primary dilution; subsequent down-round risk or preference stack overhang could materially reduce common-equity value at exit.	Medium	SV006, SV009
CV023	At a post-money valuation of $600–800 million, lead Series A investors require a minimum 10× return to achieve standard venture fund return targets, implying a minimum exit enterprise value of $6–8 billion; no scenario analysis in this chapter assigns base-case probability to that threshold.	Medium	SV006, SV011
CV024	Etched's most likely exit path is a strategic acquisition by a hyperscaler or an established semiconductor company with AI ASIC exposure; both Marvell and Broadcom have structural motivation to acquire Sohu's architecture if first silicon delivers on its performance claims.	Medium	SV012, SV004
CV025	An IPO exit for Etched is unlikely before H2 2028 at the earliest, requiring sustained revenue, commercial momentum, and demonstrated silicon performance; Cerebras's delayed IPO illustrates the difficulty of listing an AI chip company even with production deployments.	Medium	SV019, SV006
CV026	Qualcomm's 2024 market capitalisation of approximately $150 billion at 7–9× semiconductor revenue demonstrates the floor multiple for a scaled fabless chip company, providing a lower bound reference for AI chip comparable analysis.	Medium	SV008, SV014
CV027	Marvell Technology is the most directly applicable production-stage AI custom ASIC comparable for Etched: it operates a hyperscaler custom silicon business at meaningful scale and its 10–15× EV/Revenue multiple on AI revenue is the reference discount-target for Etched CCA.	Medium	SV004, SV014
CV028	Comparable company analysis for Etched requires applying a 40–60% discount to Marvell/Broadcom AI ASIC multiples to account for pre-revenue stage, single-architecture concentration risk, and execution uncertainty.	Medium	SV002, SV001
CV029	Precedent M&A transaction analysis shows a bimodal distribution for AI chip startup acquisitions: distressed exits at $100–500 million and premium pre-revenue acquisitions at $1–2 billion, with Habana Labs ($2 billion) as the primary positive precedent.	Medium	SV003, SV018
CV030	The appropriate EV/Revenue multiple for Etched valuation analysis is 5–12× on projected 2028 revenue, reflecting a 50–80% discount to NVIDIA's 25× multiple due to pre-revenue stage, single-architecture concentration, and execution risk.	Medium	SV002, SV020
CV031	Thesis-break trigger one: a first-silicon failure or tape-out abort at TSMC N4 would reduce Etched's enterprise value to near zero — IP in a distressed scenario is worth under $100 million absent a functional chip.	Medium	SV015, SV007
CV032	Thesis-break trigger two: if Mamba, RWKV, or any SSM-family architecture achieves confirmed production inference adoption at any top-three hyperscaler before Sohu's commercial launch, Sohu's transformer-only differentiation is permanently eliminated with no recovery path.	Medium	SV015, SV014
CV033	Thesis-break trigger three: failure to close a Series B at $800 million or above within 24 months of Series A close would signal investor concern about execution and force a distressed outcome or wind-down.	Medium	SV016, SV006
CV034	Final diligence ask one: Etched's post-money Series A valuation and full cap table with option pool and liquidation preference stack must be disclosed to establish entry price, dilution baseline, and preference overhang before any investment decision.	Medium	SV015, SV016
CV035	Final diligence ask two: Etched's monthly burn rate, tape-out milestone schedule with dates, cumulative TSMC NRE payments, and projected cash exhaustion date must be disclosed to validate runway assumptions and Series B timing.	Medium	SV015, SV016
CV036	Final diligence ask three: any signed LOIs, evaluation agreements, customer pipeline data, or engineering briefing recipients under NDA must be disclosed to validate the commercial thesis, given that zero customer relationships are publicly announced.	Medium	SV015, SV017
CV037	Precedent AI chip transactions include Intel/Habana Labs (~$2 billion, 2019), Qualcomm/Nuvia (~$1.4 billion, 2021), and various distressed AI startup exits; the acquisition premium range for pre-revenue hardware companies is historically wide and dependent on strategic fit.	Medium	SV003, SV018
CV038	Cerebras's experience demonstrates that an AI chip company can sustain high private valuation for multiple years without profitability, but capital-market scrutiny intensifies sharply at IPO stage, as shown by Cerebras's delayed offering following G42 customer concentration concerns.	Medium	SV019, SV027
CV039	Etched's valuation is most sensitive to three variables: probability of a first-silicon pass without respin, speed of customer adoption following silicon delivery, and exit multiple achievable at time of acquisition or IPO.	Medium	SV001, SV002
CV040	The investment recommendation is conditional negative at implied post-money valuations above $1.5 billion: the probability-weighted expected value ($800 million–$1.1 billion) does not justify entry at premium pricing given zero commercial traction and high execution risk.	Medium	SV001, SV017
CV041	The investment recommendation is conditional track at implied post-money valuations at or below $800 million: the risk-adjusted return profile marginally justifies a monitoring position contingent on Series B close, tape-out completion, and first customer design win.	Medium	SV001, SV006
CV042	Historical venture base rates for pre-revenue hardware companies show fewer than 10% achieve 10× or greater returns; the majority experience write-downs or distressed exits, arguing for a high discount rate and conservative probability assignments in all scenario analyses.	Medium	SV006, SV012
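
The discount arithmetic in CV028 and CV030 can be checked with a short sketch. It assumes the 40–60% discount in CV028 is applied to the 10–15× Marvell multiple from CV027, and that the 50–80% discount in CV030 is applied to a flat 25× NVIDIA multiple. The function name `discounted_range` and its parameters are illustrative and are not part of the report's model.

```python
def discounted_range(multiple_low, multiple_high, discount_low_pct, discount_high_pct):
    """Return the (low, high) multiple left after applying a discount range.

    The low end pairs the lowest reference multiple with the deepest discount;
    the high end pairs the highest reference multiple with the shallowest one.
    Discounts are whole-number percentages to keep the arithmetic exact.
    """
    low = multiple_low * (100 - discount_high_pct) / 100
    high = multiple_high * (100 - discount_low_pct) / 100
    return low, high


# CV028 (assumed base: the 10-15x Marvell multiple in CV027), 40-60% discount.
marvell_based = discounted_range(10, 15, 40, 60)

# CV030: 50-80% discount to NVIDIA's 25x multiple.
nvidia_based = discounted_range(25, 25, 50, 80)

print("Marvell-based implied multiple: %.1fx to %.1fx" % marvell_based)
print("NVIDIA-based implied multiple:  %.1fx to %.1fx" % nvidia_based)
```

Under these assumptions the Marvell-based range is 4–9× and the NVIDIA-based range is 5–12.5×. The 12× upper bound stated in CV030 therefore corresponds to a 52% discount rather than 50%, and the two methods overlap between 5× and 9×.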

Sources
ID	Publisher	Title	Quote
SO001	Etched	Etched Official Website	Building the hardware for superintelligence.
SO002	Bloomberg	AI Chip Startup Etched Raises $120 Million to Build Transformer Chips
SO003	Reuters	Etched raises $120 million for chip designed to run AI transformers
SO004	arXiv / Google Brain	Attention Is All You Need (Transformer paper)
SO005	Wikipedia	Transformer (deep learning architecture)
SO006	Wikipedia	Application-specific integrated circuit
SO007	NVIDIA	NVIDIA H100 Tensor Core GPU
SO008	Groq	Groq Official Website
SO009	Cerebras	Cerebras Systems Official Website
SO010	SambaNova Systems	SambaNova Systems Official Website
SO011	AMD	AMD Instinct MI300X GPU
SO012	Amazon Web Services	AWS Trainium — AI Training and Inference Chip
SO013	Google Cloud	Google Cloud TPUs
SO014	Intel	Intel Gaudi AI Accelerator
SO015	Primary Venture Partners	Primary Venture Partners Official Website
SO016	Positive Sum	Positive Sum Official Website
SO017	Wikipedia	Gavin Uberti
SO018	Wikipedia	Primary Venture Partners
SO019	Wikipedia	Large language model
SO020	Wikipedia	Graphics processing unit
SO021	Wikipedia	TSMC (Taiwan Semiconductor Manufacturing Company)
SO022	Wikipedia	Semiconductor industry
SO023	Wikipedia	Artificial intelligence accelerator
SO024	Wikipedia	NVIDIA
SO025	Hacker News	Ask HN: Etched AI Chip Sohu — Developer Discussion and Skepticism	Developer community discussion questioning the viability of Transformer-only ASICs and the risk of architectural obsolescence.
SO026	Wikipedia	Fabless semiconductor company
SO027	Wikipedia	High bandwidth memory
SO028	Wikipedia	Unicorn (finance)
SO029	Wikipedia	Mamba (deep learning architecture)	Mamba is a deep learning architecture based on a state space model, presented as an alternative to Transformer architecture for sequence modeling.
SO030	Wikipedia	State space model
SO031	TechCrunch	Etched is building a chip that only runs Transformer models, raising $120M for the effort
SO032	Wired	Etched Chip AI Transformers
SO033	Fortune	Etched AI chip startup raises $120 million Series A
SO034	Wikipedia	Tape-out (semiconductor)
SO035	Wikipedia	Cerebras Systems
SM001	Wikipedia	AI chip (artificial intelligence chip)
SM002	Wikipedia	AI semiconductor chip market
SM003	Wikipedia	Artificial intelligence accelerator
SM004	Wikipedia	Deep learning
SM005	Wikipedia	Generative artificial intelligence
SM006	Wikipedia	Semiconductor industry
SM007	NVIDIA	NVIDIA H100 Tensor Core GPU
SM008	Wikipedia	NVIDIA
SM009	Amazon Web Services	AWS Trainium
SM010	Google Cloud	Google Cloud TPUs
SM011	Microsoft Azure	Azure AI Solutions
SM012	NVIDIA	NVIDIA DGX Systems
SM013	Wikipedia	Large language model
SM014	Wikipedia	Transformer (deep learning architecture)
SM015	Wikipedia	CUDA
SM016	Groq	Groq Official Website
SM017	Cerebras	Cerebras Systems Website
SM018	Wikipedia	Cloud computing
SM019	Wikipedia	Data center
SM020	Wikipedia	Mamba (deep learning architecture)	Mamba presents itself as a Transformer alternative with linear rather than quadratic scaling, potentially addressing inference efficiency concerns.
SM021	Wikipedia	State space model
SM022	Wikipedia	Hardware acceleration
SM023	Wikipedia	Hyperscale computing
SM024	AMD	AMD Instinct MI300X
SM025	Intel	Intel Gaudi AI Accelerator
SM026	Wikipedia	Total addressable market
SM027	Google Cloud	Vertex AI
SM028	Amazon Web Services	AWS EC2 P4 Instances (GPU Inference)
SM029	Wikipedia	Global AI chip market
SM030	Wikipedia	Chip shortage
SM031	Hacker News	Ask HN: What's the state of AI inference chips beyond NVIDIA? (2024)
SM032	Wikipedia	Tenstorrent
SM033	SambaNova Systems	SambaNova Cloud AI Inference Platform
SM034	arXiv	Attention Is All You Need (Transformer architecture foundational paper)
SM035	TechCrunch	Etched raises $120M to build a transformer-only AI chip
SP001	Wikipedia	Groq
SP002	Groq	Groq — Fast AI Inference
SP003	Wikipedia	Cerebras Systems
SP004	Cerebras	Cerebras — AI Compute Platform
SP005	Wikipedia	Graphcore
SP006	Wikipedia	Habana Labs
SP007	Wikipedia	Intel Gaudi
SP008	Intel	Intel Gaudi AI Accelerator Overview
SP009	Wikipedia	Google Tensor Processing Unit
SP010	Wikipedia	Tenstorrent
SP011	Tenstorrent	Tenstorrent — AI Compute for All
SP012	Wikipedia	NVIDIA
SP013	NVIDIA	NVIDIA H100 Tensor Core GPU
SP014	Wikipedia	AMD Instinct
SP015	AMD	AMD Instinct MI300 Series Accelerators
SP016	AWS	Amazon EC2 Inf2 Instances
SP017	Google Cloud	Cloud TPU v5e
SP018	Wikipedia	Application-specific integrated circuit
SP019	Positive Sum	Etched — Positive Sum
SP020	Wikipedia	Transformer (deep learning architecture)
SP021	Wikipedia	CUDA
SP022	Wikipedia	Mamba (deep learning architecture)
SP023	Wikipedia	Large language model
SP024	Hacker News	Ask HN: What's the state of AI inference chips beyond NVIDIA? (2024)
SP025	SambaNova Systems	SambaNova Cloud AI Inference Platform
SP026	TechCrunch	Etched raises $120M to build a transformer-only AI chip
SI001	Wikipedia	TSMC
SI002	Wikipedia	Fabless manufacturing
SI003	Wikipedia	Integrated circuit design
SI004	Wikipedia	Semiconductor intellectual property core
SI005	Wikipedia	Application-specific integrated circuit
SI006	Wikipedia	NVIDIA
SI007	NVIDIA	NVIDIA H100 Tensor Core GPU
SI008	TechCrunch	Etched raises $120M to build a transformer-only AI chip
SI009	Positive Sum	Etched — Positive Sum portfolio
SI010	Groq	Groq — Fast AI Inference
SI011	Wikipedia	Groq
SI012	Wikipedia	Cerebras Systems
SI013	Cerebras	Cerebras — AI Compute Platform
SI014	SambaNova Systems	SambaNova Cloud AI Inference Platform
SI015	AMD	AMD Instinct MI300 Series Accelerators
SI016	AWS	Amazon EC2 Inf2 Instances
SI017	Wikipedia	Graphcore
SI018	Google Cloud	Cloud TPU v5e
SI019	Bloomberg	Etched Raises $120 Million to Build Transformer Chips
SI020	Wikipedia	Transformer (deep learning architecture)
SI021	Wikipedia	Large language model
SI022	Wikipedia	Semiconductor industry
SI023	Wikipedia	CUDA
SI024	Hacker News	Ask HN: What's the state of AI inference chips beyond NVIDIA? (2024)
SI025	Tenstorrent	Tenstorrent — AI Compute for All
SI026	SEC EDGAR	NVIDIA Corporation — Annual Reports (10-K) filing index
SI027	NVIDIA Investor Relations	NVIDIA Annual Reports and Proxy Statements
SI028	Microsoft Azure	Azure Machine Learning Pricing
SI029	Wikipedia	High Bandwidth Memory
SI030	Wikipedia	Tape-out
SI031	Wikipedia	Die (integrated circuit)
SE001	Etched	Etched — Official Company Homepage
SE002	Etched	Etched — Sohu Product Page (404 Not Found)
SE003	arXiv	FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness (Dao et al., 2022)
SE004	Wikipedia	Flash attention
SE005	Wikipedia	Attention (machine learning)
SE006	Wikipedia	Hardwired logic
SE007	Wikipedia	Speculative decoding
SE008	HuggingFace	HuggingFace Transformers Documentation
SE009	arXiv	Attention Is All You Need (Vaswani et al., 2017)
SE010	Wikipedia	Transformer (deep learning architecture)
SE011	Wikipedia	TSMC
SE012	Wikipedia	Application-specific integrated circuit
SE013	Positive Sum	Etched — Positive Sum Portfolio Page
SE014	Tenstorrent	Tenstorrent — AI Chip Solutions
SE015	Hacker News	Hacker News — AI Chips Discussion Thread
SE016	Wikipedia	Mamba (deep learning architecture)
SE017	Wikipedia	NVIDIA
SE018	NVIDIA	NVIDIA H100 Tensor Core GPU
SE019	Wikipedia	High Bandwidth Memory
SE020	Wikipedia	Tape-out
SE021	Wikipedia	Die (integrated circuit)
SE022	Wikipedia	Mixture of experts
SE023	Groq	Groq — AI Inference Technology
SE024	Wikipedia	AI accelerator
SE025	TechCrunch	Etched raises $120M for Transformer-only AI chip
SE026	Wikipedia	Tensor Processing Unit
SU001	OpenAI	OpenAI official company homepage
SU002	Anthropic	Anthropic official company homepage
SU003	Cohere	Cohere official company homepage
SU004	Together AI	Together AI official homepage
SU005	Anyscale	Anyscale official homepage
SU006	Groq	Groq case studies — inference chip customer proof	Groq case studies demonstrate that AI-native inference platforms and consumer AI applications have adopted the Groq LPU for production Transformer inference workloads.
SU007	Amazon Web Services	AWS Inferentia — machine learning inference at scale	AWS Inferentia delivers high performance at low cost for deep learning inference, enabling customers to lower costs and improve performance for ML inference workloads at scale.
SU008	Wikipedia	Wikipedia: OpenAI
SU009	Wikipedia	Wikipedia: Anthropic
SU010	Wikipedia	Wikipedia: Together AI
SU011	Wikipedia	Wikipedia: Perplexity AI
SU012	Wikipedia	Wikipedia: Mistral AI
SU013	Wikipedia	Wikipedia: Cohere (company)
SU014	Wikipedia	Wikipedia: Scale AI
SU015	Amazon Web Services	AWS case study: Stability AI on Inferentia2	Stability AI uses AWS Inferentia2 for image generation inference, achieving significant cost reduction compared to GPU-based inference at comparable throughput.
SU016	Amazon Web Services	AWS case study: Quora Poe chatbot on Inferentia	Quora uses AWS Inferentia to run Transformer model inference for its Poe chatbot at lower cost than equivalent GPU instances.
SU017	G2	G2 reviews: Groq — inference chip developer community feedback	Developer reviews on G2 highlight Groq's throughput speed as a primary adoption driver and confirm that inference-chip adopters prioritize tokens-per-second performance as a key buying criterion.
SU018	Etched	Etched company homepage
SU019	Positive Sum	Positive Sum investor page: Etched
SU020	Wikipedia	Wikipedia: Graphcore — AI chip company failure	Graphcore, a UK-based AI chip startup that raised over $700 million, was acquired by SoftBank at a loss after failing to achieve sufficient customer adoption at scale to sustain operations.
SU021	Hacker News (Y Combinator)	HN: Etched raises $120M for a Transformer-only AI chip — developer commentary	HN comments include developer skepticism about Transformer-only silicon, with commenters questioning whether the architecture bet is too narrow and noting the risk of Transformer paradigm shift.
SU022	Groq	Groq official homepage
SU023	Cerebras Systems	Cerebras Systems official homepage
SU024	Wikipedia	Wikipedia: Cerebras Systems
SU025	SambaNova Systems	SambaNova Systems official homepage
SR001	Wikipedia	Wikipedia: CHIPS and Science Act	The CHIPS and Science Act of 2022 provides approximately $52 billion in semiconductor manufacturing incentives and includes guardrails prohibiting recipients from material expansion of advanced chip manufacturing in countries of concern for 10 years.
SR002	Wikipedia	Wikipedia: Export Administration Regulations	The Export Administration Regulations (EAR) administered by the Bureau of Industry and Security (BIS) govern the export, re-export, and transfer of dual-use items including advanced semiconductor devices.
SR003	Wikipedia	Wikipedia: Bureau of Industry and Security
SR004	Wikipedia	Wikipedia: Export controls
SR005	Bureau of Industry and Security	BIS: Lists of Parties of Concern — policy guidance	The Entity List, Unverified List, Denied Persons List, and other BIS lists of parties of concern restrict exports, re-exports, and in-country transfers to listed entities without a prior license.
SR006	US Federal Register	Federal Register: Export Controls on Semiconductor Manufacturing Items (Oct 7 rule)	The October 2023 rule expands export controls on advanced semiconductor manufacturing equipment and advanced logic chips, tightening restrictions on exports to entities in countries of concern.
SR007	Wikipedia	Wikipedia: EU AI Act	The EU AI Act imposes transparency and compliance obligations on general-purpose AI (GPAI) model providers and defines risk categories for AI systems deployed in the EU.
SR008	Wikipedia	Wikipedia: AI regulation
SR009	Wikipedia	Wikipedia: NVIDIA Corp. v. Samsung — semiconductor patent litigation	NVIDIA Corporation v. Samsung Electronics and Qualcomm is a patent infringement case demonstrating NVIDIA's willingness to pursue chip-design IP litigation against other semiconductor companies.
SR010	Wikipedia	Wikipedia: Arm Holdings	Arm Holdings licenses its instruction set architecture and processor designs to semiconductor companies worldwide; all licensees must maintain a valid Arm architecture license agreement.
SR011	Wikipedia	Wikipedia: Advanced Micro Devices
SR012	Hacker News (Y Combinator)	HN: AI chip architecture discussion — developer signal on Transformer lock-in	Developer commentary raises concerns about single-architecture AI chip bets, noting that model architecture shifts have historically occurred faster than ASIC commercial cycles.
SR013	Wikipedia	Wikipedia: AI safety
SR014	Wikipedia	Wikipedia: Supply chain
SR015	Etched	Etched company homepage
SR016	Positive Sum	Positive Sum investor page: Etched
SR017	Wikipedia	Wikipedia: TSMC	TSMC accounts for the majority of global advanced semiconductor foundry capacity and is headquartered in Taiwan, creating single-point geopolitical dependency for fabless chip companies requiring advanced process nodes.
SR018	Wikipedia	Wikipedia: High bandwidth memory	HBM manufacturing is concentrated among SK Hynix, Samsung, and Micron; AI accelerator supply chains depend on allocation commitments from these three suppliers.
SR019	Wikipedia	Wikipedia: NVIDIA
SR020	Wikipedia	Wikipedia: Graphcore — AI chip startup failure case study	Graphcore, a UK-based AI chip startup that raised over $700 million in total funding, was acquired by SoftBank at a loss after failing to achieve the customer adoption needed to sustain operations as an independent company.
SR021	Hacker News (Y Combinator)	HN: Etched raises $120M for Transformer-only AI chip — developer commentary	Hacker News commentary on Etched's Series A includes developer skepticism about Transformer-only silicon, with multiple commenters raising concerns about architecture lock-in and the timeline to first revenue.
SR022	Wikipedia	Wikipedia: Mamba (deep learning architecture)	Mamba is a selective state-space model that eliminates the key-value cache required by Transformer architectures, potentially reducing the inference memory-bandwidth bottleneck that Transformer-hardened ASICs are designed to accelerate.
SR023	Wikipedia	Wikipedia: Gavin Uberti — Etched CEO
SR024	Wikipedia	Wikipedia: Semiconductor intellectual property core	Semiconductor IP cores are pre-designed, pre-verified functional blocks licensed from IP vendors; most complex ASICs incorporate third-party IP cores that require ongoing licensing agreements.
SR025	Wikipedia	Wikipedia: Tape-out	Tape-out refers to the final stage of the chip design process before manufacturing; for advanced logic nodes, tape-out NRE costs typically range from tens of millions to hundreds of millions of dollars.
SR026	Wikipedia	Wikipedia: Mixture of experts
SR027	Wikipedia	Wikipedia: State space model
SR028	Wikipedia	Wikipedia: Cerebras Systems
SR029	Fortune	Fortune: Etched raises $120M for AI chip — Series A coverage	Etched raised $120 million in a Series A round to develop Sohu, a chip that runs only Transformer-based AI models, betting that Transformer architectures will remain dominant in AI for years to come.
SR030	Wikipedia	Wikipedia: AI chip
SR031	Cerebras Systems	Cerebras Systems official homepage
SR032	NVIDIA	NVIDIA H100 Tensor Core GPU — data center product page
SR033	AMD	AMD Instinct MI300X accelerator product page
SR034	TechCrunch	TechCrunch: Etched is building a chip that only runs Transformer models, raising $120M	Etched is betting that Transformer architectures will remain the dominant paradigm for AI models long enough for its dedicated ASIC to recoup its development investment and earn a commercial return.
SV001	Wikipedia	Wikipedia: Valuation (finance)
SV002	Wikipedia	Wikipedia: Comparable company analysis
SV003	Wikipedia	Wikipedia: Precedent transaction
SV004	Wikipedia	Wikipedia: Marvell Technology
SV005	Wikipedia	Wikipedia: Broadcom Inc.
SV006	Wikipedia	Wikipedia: Venture capital
SV007	Wikipedia	Wikipedia: Discounted cash flow
SV008	Wikipedia	Wikipedia: Qualcomm
SV009	Wikipedia	Wikipedia: Lightspeed Venture Partners
SV010	Wikipedia	Wikipedia: AMD
SV011	Wikipedia	Wikipedia: Primary Venture Partners
SV012	Wikipedia	Wikipedia: Acquisition premium
SV013	AnandTech	Etched Sohu: A Transformer-Only ASIC	Etched's Sohu is a purpose-built transformer inference chip designed to run only transformer-based AI models.
SV014	SemiAnalysis	Etched Sohu — Transformer ASIC Analysis
SV015	Etched	Etched — official company homepage
SV016	Positive Sum	Positive Sum portfolio — Etched investment page
SV017	Wikipedia	Wikipedia: Graphcore	Graphcore, which had raised over $700 million and once held a valuation of $2.8 billion, struggled to gain widespread commercial adoption and faced financial difficulties.
SV018	Wikipedia	Wikipedia: Habana Labs
SV019	Wikipedia	Wikipedia: Cerebras Systems
SV020	Wikipedia	Wikipedia: NVIDIA
SV021	Wikipedia	Wikipedia: Nvidia H100
SV022	Wikipedia	Wikipedia: Groq
SV023	Bloomberg	AI Chip Startup Etched Raises $120 Million to Build Transformer Chips
SV024	Reuters	Etched raises $120 million for chip designed to run AI transformers
SV025	TechCrunch	Etched is building a chip that only runs Transformer models, raising $120M for the effort
SV026	Fortune	Etched AI chip startup raises $120 million Series A for Transformer Sohu chip
SV027	Hacker News	Hacker News: Etched Sohu Transformer ASIC discussion thread
SV028	U.S. Securities and Exchange Commission — EDGAR	NVIDIA Corporation 10-K annual report filings — EDGAR search
SV029	NVIDIA Investor Relations	NVIDIA Investor Relations — Annual Reports
SV030	Cerebras Systems	Cerebras Systems official homepage
SV031	Groq	Groq official homepage
SV032	Wikipedia	Wikipedia: NVIDIA Corporation
