SAN FRANCISCO, January 15, 2026 – Nvidia has invested $150 million in Baseten, a San Francisco-based startup specializing in AI inference infrastructure, the company announced Wednesday, as part of a broader push to expand its ecosystem in the fast-growing market for running trained AI models at scale.

The funding, structured as a strategic investment rather than a full acquisition, gives Nvidia a minority stake in Baseten and access to its platform for optimizing inference workloads.

Baseten provides cloud-based tools that allow developers to deploy, monitor, and scale AI models efficiently, with features including automatic hardware selection, cost optimization, and real-time performance monitoring.

Baseten co-founder and CEO Phillip Howes said the partnership will combine Nvidia’s hardware expertise with Baseten’s software stack to deliver faster and more cost-effective inference for customers.

“This investment accelerates our ability to serve enterprises running large-scale production AI,” Howes stated in a blog post.

The deal aligns with Nvidia’s strategy of supporting a diverse inference ecosystem rather than relying solely on its own chips and software.

Inference, the phase where trained models generate outputs, now accounts for the majority of AI compute demand as adoption shifts from training to deployment. Nvidia dominates the training market but faces growing competition in inference from startups and cloud providers offering specialized solutions.

Baseten, founded in 2019, had raised more than $60 million before this round from investors including Greylock Partners and Lightspeed Venture Partners. The company serves clients across finance, healthcare, and retail, focusing on production-grade reliability and cost control.

Nvidia’s investment follows similar strategic moves, including partnerships with CoreWeave and Lambda Labs, and recent licensing agreements with other inference specialists.

Nvidia has emphasized inference as a major growth driver, with CEO Jensen Huang stating in recent earnings calls that the segment will surpass training in long-term compute spending.

The funding comes amid rising demand for efficient inference solutions as enterprises deploy large language models at scale. Baseten plans to use the capital to expand its engineering team and enhance support for Nvidia’s latest architectures.