Inferencing AI Models

AI inference costs dropped up to 10x on Nvidia's Blackwell — but hardware is only half the equation

New deployment data from four inference providers shows where the savings actually come from — and what teams should evaluate ...

Network World

Nvidia claims 10x cost savings with open-source inference models

Nvidia noted that cost per token went from 20 cents on the older Hopper platform to 10 cents on Blackwell. Moving to Blackwell’s native low-precision NVFP4 format further reduced the cost to just 5 ...

90% Cheaper AI? Microsoft-Backed Chip Startup Says It’s Possible Today

AI is expensive. This Microsoft-backed chip startup says it can generate AI answers 90% cheaper ... and it's going to get even better over time ...

The $20 Billion Bet On Inference: What Every AI Infrastructure Team Needs To Get Right

Every ChatGPT query, every AI agent action, every generated video is based on inference. Training a model is a one-time ...

2d on MSN

AI inference startup Modal Labs in talks to raise at $2.5B valuation, sources say

General Catalyst is in talks to lead the round for the four-year-old startup, according to our sources.

2d on MSN

Chinese AI startup Zhipu releases new flagship model GLM-5

BEIJING, Feb 11 (Reuters) - China's Zhipu AI released its latest artificial intelligence model on Wednesday, joining a wave ...

The Manila Times

Cloudera Unveils Next Phase of AI Inferencing and Unified Data Access Capabilities

Enabling faster, more accurate enterprise AI and analytics across multi-cloud, edge, and data center environments ...

Seeking Alpha

Nebius launches Token Factory to enable AI inference for open-source models

Nebius (NBIS) has released the Nebius Token Factory, a production inference platform that enables artificial intelligence companies and enterprises to deploy and optimize open-source and custom AI ...

VentureBeat

Together AI's ATLAS adaptive speculator delivers 400% inference speedup by learning from workloads in real-time

Enterprises expanding AI deployments are hitting an invisible performance wall. The culprit? Static speculators that can't keep up with shifting workloads. Speculators are smaller AI models that work ...

10d on MSN

OpenAI ditches Nvidia for faster AI inference chips, threatening chipmaker's dominance

Nvidia remains dominant in chips for training large AI models, while inference has become a new front in the competition.

MIT Technology Review

Realizing value with AI inference at scale and in production

As organizations enter the next phase of AI maturity, IT leaders must step up to help turn promising pilots into scalable, trusted systems. In partnership with HPE. Training an AI model to predict ...

SiliconANGLE

Lenovo launches new ThinkSystem servers dedicated to AI inference

Lenovo Group Ltd. is pushing to become the workhorse of the artificial intelligence industry after unveiling a slate of new, enterprise-grade server systems specifically for AI inference workloads.
