Rent NVIDIA B300 GPUs on Demand from $2.45/hr
288GB HBM3e Blackwell Ultra with 15 PFLOPS dense FP4, built for trillion-parameter training.
You can rent an NVIDIA B300 Blackwell Ultra GPU on Spheron from $2.45 per GPU-hour on dedicated instances (99.99% SLA, non-interruptible), with spot pricing cheaper still. Billing is per minute with no long-term contracts, and B300 instances deploy as part of GB300 NVL72 rack systems or HGX B300 8-way nodes. Each GPU ships with 288GB HBM3e (50% more than the B200), NVLink 5 at 1.8 TB/s, 5th-gen Tensor Cores with an enhanced FP4 Transformer Engine, and materially higher throughput than the B200 across every precision format. It is built for 200B+ parameter training, ultra-long-context inference (1M+ tokens), trillion-parameter-scale MoE models, and multi-modal foundation models. The B300 is the pick when the B200's 192GB isn't enough.
Technical specifications
Pricing comparison
| Provider | Price/hr | Savings |
|---|---|---|
| Spheron (your price) | $2.45/hr | - |
| Nebius | $6.10/hr | 2.5x more expensive |
| CoreWeave | Contact sales | - |
| AWS (p6-b300) | $17.80/hr | 7.3x more expensive |
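The savings multipliers follow directly from the listed hourly rates; a quick shell check (using the prices from the table above) confirms the rounding:

```shell
# Sanity-check the savings multipliers against the listed rates
nebius=$(awk 'BEGIN { printf "%.1f", 6.10 / 2.45 }')
aws=$(awk 'BEGIN { printf "%.1f", 17.80 / 2.45 }')
echo "Nebius is ${nebius}x the Spheron rate; AWS is ${aws}x"
```
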
Need More B300 Than What's Listed?
Reserved Capacity
Commit to a duration, lock in availability and better rates
Custom Clusters
8 to 512+ GPUs, specific hardware, InfiniBand configs on request
Supplier Matchmaking
Spheron sources from its certified data center network, negotiates pricing, handles setup
Need more B300 capacity? Tell us your requirements and we'll source it from our certified data center network.
Typical turnaround: 24–48 hours
When to pick the B300
Pick B300 if
You're training or serving 200B+ parameter models and B200's 192GB HBM3e isn't enough. 288GB lets you fit larger dense models on a single GPU, keep longer context windows (1M+ tokens), or reduce tensor-parallel splits on fixed model sizes. Also the pick for GB300 NVL72 rack-scale deployments where all 72 GPUs address unified memory.
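A weights-only fit check makes the 288GB headroom concrete. This is a back-of-envelope sketch assuming a 200B-parameter dense model served in-place; KV cache and activations add overhead on top of the weights, so real headroom is tighter:

```shell
# Weights-only fit check for a dense model on one 288GB B300
params_b=200   # model size in billions of parameters (example)
hbm_gb=288     # B300 HBM3e capacity
for fmt in "BF16 2" "FP8 1"; do
  set -- $fmt                  # $1 = format name, $2 = bytes per parameter
  gb=$(( params_b * $2 ))      # 1B params x 1 byte = 1 GB
  [ "$gb" -le "$hbm_gb" ] && verdict="fits" || verdict="needs tensor parallelism"
  echo "$1 weights: ~${gb} GB -> $verdict"
done
```

The same model that needs a 2-way tensor-parallel split in BF16 fits on a single B300 once quantized to FP8, which is the "reduce tensor-parallel splits" effect in practice.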
Pick B200 instead if
Your model fits comfortably in 192GB and you want the cheapest Blackwell rate. B200 is widely available, cheaper per hour, and matches B300 on FP4 Transformer Engine capability. Best for most 70B-200B workloads.
Pick H200 instead if
You don't need Blackwell FP4 and want proven Hopper with 141GB HBM3e. H200 is significantly cheaper per hour and has been production-hardened for over a year, a safer pick when Blackwell software tuning isn't worth the premium.
Pick GB300 NVL72 instead if
You need rack-scale training for trillion-parameter frontier models. GB300 NVL72 connects 72 B300 GPUs over NVLink into a unified 20+ TB memory domain — the only architecture that handles models too large for any single 8-way node.
Ideal use cases
Frontier Model Training
Train the most advanced frontier AI models at scale with 288GB memory per GPU and class-leading memory bandwidth. Handle the largest MoE and dense transformer architectures without memory constraints.
Ultra-High-Throughput LLM Inference
Serve the world's largest language models at production scale with massive memory capacity and superior compute density, minimizing cost per token across all precision formats.
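As a rough illustration of cost per token, the sketch below combines the $2.45/hr dedicated rate with a hypothetical throughput figure; 10,000 tokens/s is an assumption for illustration, not a measured B300 number:

```shell
# Back-of-envelope serving cost (throughput is an assumed figure, not measured)
price_hr=2.45      # dedicated B300 rate per GPU-hour
tok_per_sec=10000  # hypothetical aggregate throughput on one GPU
cost_per_m=$(awk -v p="$price_hr" -v t="$tok_per_sec" \
  'BEGIN { printf "%.3f", p / (t * 3600 / 1e6) }')
echo "~\$${cost_per_m} per 1M tokens"
```

Halving the hourly rate or doubling throughput each halves the cost per token, which is why both the spot discount and FP4 throughput gains matter for serving economics.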
Generative AI & Creative Workloads
Power next-generation generative AI with massive VRAM headroom for high-resolution video, 3D, and complex multi-modal generation pipelines, all within a single GPU.
AI Research & Architecture Exploration
Give researchers the memory and compute needed to explore novel architectures, scaling laws, and experimental approaches without hardware bottlenecks.
Performance benchmarks
Train a 400B+ MoE model on 8x B300 HGX
288GB per GPU on an 8-way HGX B300 node gives you 2.3TB of HBM3e across NVLink, enough to train a 400B+ MoE or pre-train a large dense model with aggressive batch sizes.
```shell
# SSH into your HGX B300 node
ssh ubuntu@<instance-ip>

# NVIDIA NeMo Framework ships Blackwell-optimized containers
docker run --gpus all --rm -it \
  nvcr.io/nvidia/nemo:25.04 bash

# Inside the container, launch FP8 pre-training with FSDP
torchrun --nproc_per_node=8 \
  examples/nlp/language_modeling/megatron_gpt_pretraining.py \
  model.mcore_gpt=True \
  model.transformer_engine=True \
  model.fp8=hybrid \
  model.tensor_model_parallel_size=4 \
  model.pipeline_model_parallel_size=2 \
  trainer.devices=8
```

For FP4 pre-training, pass `model.fp4=True` (requires Transformer Engine 2.0+ and Blackwell kernels). FP4 roughly doubles effective throughput vs FP8 on compatible layers.
NVLink Ultra Configuration
B300 GPUs are built on NVLink Ultra technology, delivering 1.8 TB/s bidirectional bandwidth per GPU. Combined with 288GB of HBM3e memory per card, B300 clusters enable near-linear scaling for the most data-intensive distributed training workloads, including trillion-parameter models with long-context requirements.
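To get a feel for what 1.8 TB/s means for distributed training, here is a back-of-envelope ring all-reduce estimate; the gradient payload size and the 2x traffic factor are simplifying assumptions, and real collectives overlap with compute:

```shell
# Rough gradient-sync estimate over NVLink 5 (illustrative numbers only)
grad_gb=400      # BF16 gradients for a ~200B-parameter model
bw_gbps=1800     # 1.8 TB/s per-GPU NVLink bandwidth
# a ring all-reduce moves roughly 2x the payload per GPU
ms=$(( grad_gb * 2 * 1000 / bw_gbps ))
echo "Approx per-step gradient all-reduce: ~${ms} ms"
```

Sub-second synchronization of hundreds of gigabytes of gradients is what keeps scaling near-linear as node counts grow.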
Need a custom multi-node cluster or reserved capacity? Talk to us about topology, regions, and committed pricing.
B300 vs alternatives
CDNA 5 vs Blackwell Ultra architecture, LLM inference projections, ROCm vs CUDA maturity, and GPU cloud pricing for teams weighing AMD's MI400 series as an alternative to B300.
Where B300 fits in NVIDIA's generational stack, how Blackwell Ultra compares to Hopper, and what changes with Rubin on the horizon. Useful context before committing to multi-year infrastructure.
Related resources
NVIDIA B300 (Blackwell Ultra): Complete Guide to Specs and Pricing
Everything you need to know about B300 specs, pricing, architecture, and when the upgrade from B200 is worth it.
GPU Requirements Cheat Sheet 2026
Find the right GPU for every major open-source AI model, including B300-class workload recommendations.
GPU Cloud Benchmarks 2026
Real performance and pricing data across every major GPU cloud provider, including next-gen Blackwell GPUs.