According to monitoring by 1M AI News, AI inference infrastructure company Fireworks AI has released a preview of Fireworks Training, expanding from a pure inference platform into an integrated platform for both training and deployment. Fireworks AI was founded by Lin Qiao, a former Meta engineer who helped build PyTorch; the company is currently valued at $4 billion and processes 15 trillion tokens daily.

The platform offers three tiers:

1. Training Agent: designed for product teams without ML infrastructure. Teams describe a task and upload data, and the agent handles the entire process from training to deployment. It currently supports only LoRA.
2. Managed Training: aimed at ML engineers, supporting SFT, DPO, and reinforcement learning fine-tuning, including full-parameter training.
3. Training API: targeted at research teams, allowing customization of loss functions and training loops. It supports algorithms such as GRPO and DAPO, with full-parameter training at scales ranging from Qwen3 8B on a single node to the trillion-parameter Kimi K2.5 on 64 NVIDIA B200s.

Several of Fireworks AI's production inference customers, including the AI coding tool Cursor as well as Vercel and Genspark, have already completed frontier reinforcement learning training on the platform. Vercel trained an automatic error-correction model for its code generation product v0, reaching a 93% error-free code generation rate versus 62% with Sonnet 3.5, and cutting end-to-end latency by 40x compared with the closed-source model it used previously. Genspark fine-tuned the trillion-parameter open-source model Kimi K2 with reinforcement learning to build a deep research agent, increasing tool usage by 33% and cutting costs by 50%. Cursor ran distributed reinforcement learning training for Composer 2 across 3 to 4 clusters worldwide (the model currently ranks first on CursorBench), with training and production inference sharing the same GPU pool.
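For readers unfamiliar with LoRA, the technique the Training Agent tier supports, here is a minimal sketch of the low-rank adaptation idea itself. This is the generic LoRA formulation (a frozen weight matrix W plus a trainable low-rank update scaled by alpha/r), not Fireworks' implementation; all dimensions and values below are made up for illustration.

```python
# Minimal LoRA sketch: effective weight W_eff = W + (alpha / r) * B @ A,
# where W is frozen and only the small matrices A and B are trained.
# Generic illustration only, not Fireworks' actual training code.
import random

def matmul(X, Y):
    """Plain nested-list matrix multiply."""
    inner, cols = len(Y), len(Y[0])
    return [[sum(X[i][k] * Y[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(len(X))]

def lora_delta(B, A, alpha, r):
    """Low-rank update delta_W = (alpha / r) * B @ A."""
    scale = alpha / r
    return [[scale * x for x in row] for row in matmul(B, A)]

d_out, d_in, r = 4, 4, 2
# Frozen base weight (stands in for a pretrained layer).
W = [[random.gauss(0, 0.02) for _ in range(d_in)] for _ in range(d_out)]
# Trainable adapters: A is randomly initialized, B starts at zero,
# so the adapted model is identical to the base model before training.
A = [[random.gauss(0, 0.02) for _ in range(d_in)] for _ in range(r)]
B = [[0.0] * r for _ in range(d_out)]

delta = lora_delta(B, A, alpha=16, r=r)
W_eff = [[w + dw for w, dw in zip(w_row, d_row)] for w_row, d_row in zip(W, delta)]
# With B initialized to zero, delta is all zeros and W_eff equals W exactly.
```

Because only A and B (2 * r * d values instead of d_out * d_in) receive gradients, LoRA keeps fine-tuning cheap enough to automate end to end, which is presumably why it is the first method offered in the agent tier.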
Fireworks AI emphasizes that its core technological differentiation lies in numerical consistency between training and inference. MoE (Mixture of Experts) models are numerically more fragile than dense models: small changes in hidden states can flip expert routing, and the resulting errors cascade and amplify. Fireworks has published the KL divergence between training and inference for all supported models; every value is below 0.01.
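The published metric can be illustrated generically: take the training engine's and the inference engine's next-token distributions at the same position as P and Q and compute KL(P || Q). The logits below are invented for illustration (a tiny perturbation standing in for numerical drift between engines); only the formula corresponds to the metric described in the article.

```python
# Generic KL-divergence check between two softmax distributions,
# illustrating the kind of training/inference consistency metric described.
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(P || Q) = sum_i p_i * log(p_i / q_i)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical logits for the same token position from a training engine
# and an inference engine; the second set is slightly perturbed to mimic
# small numerical differences between the two stacks.
train_logits = [2.0, 1.0, 0.5, -1.0]
infer_logits = [2.01, 0.99, 0.51, -1.02]

p = softmax(train_logits)
q = softmax(infer_logits)
print(kl_divergence(p, q))  # a small value, well below the 0.01 threshold
```

For an MoE model the stakes are higher than this toy example suggests: the perturbed logits feed a router, so a drift that barely moves a dense model's output can select a different expert entirely, which is why a tight KL bound between the two engines is a meaningful guarantee.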