Name: NVIDIA H100 GPU Rental
Brand: NVIDIA
Availability: InStock

Question 1

How much does it cost to rent an H100 GPU?

Accepted Answer

On Spheron the H100 starts at $1.43 per GPU per hour, the lowest live marketplace rate. There is no minimum commitment and billing is per minute, so a one-hour test costs exactly one hour of usage. For comparison, AWS p5 H100 instances run around $3.90/hr per GPU on demand after the mid-2025 price cut, Google Cloud A3-high is about $3.00/hr, and Azure ND H100 v5 sits near $7/hr.

Question 2

What is the cheapest way to rent an H100?

Accepted Answer

Spot instances on Spheron are the cheapest path, often 50 to 70 percent below the dedicated rate. Spot and dedicated are both on-demand tiers with per-minute billing. The trade-off is that spot instances can be reclaimed when demand spikes, so checkpoint your training state every 15 to 30 minutes and reserve spot for fault-tolerant workloads. For long uninterrupted runs, stay on dedicated (99.99% SLA, non-interruptible).

Question 3

Can I rent an H100 by the hour?

Accepted Answer

Yes. Spheron bills per minute with no minimum. You can rent an H100 for a single hour to benchmark a workload, or keep it running for months. There are no contracts, reserved-instance lock-ins, or commit fees on dedicated or spot.
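To make per-minute billing concrete, here is a minimal sketch of the arithmetic. The function name `rental_cost` is hypothetical, the default rate is the $1.43 starting price quoted above, and rounding to the cent is an assumption about how an invoice would be presented, not a statement of how Spheron actually rounds.

```python
def rental_cost(minutes: int, rate_per_gpu_hour: float = 1.43, gpus: int = 1) -> float:
    """Estimate the USD cost of a rental billed per minute.

    Assumes the hourly rate is prorated evenly across 60 minutes and that
    the total is rounded to the nearest cent (an illustrative assumption).
    """
    if minutes < 0 or gpus < 1:
        raise ValueError("minutes must be >= 0 and gpus must be >= 1")
    return round(minutes * gpus * rate_per_gpu_hour / 60, 2)


# A one-hour benchmark on a single GPU at the starting rate.
one_hour = rental_cost(60)  # 1.43

# A two-hour run on a full 8-GPU node at the same rate.
node_run = rental_cost(120, gpus=8)  # 22.88
```

Actual rates vary by tier and region, so treat the default as a lower bound rather than a quote.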

Question 4

How fast can I deploy an H100 instance?

Accepted Answer

Most H100 instances are live in under 2 minutes. The hardware is pre-warmed, so provisioning is closer to a container start than a hyperscaler VM boot. If you have a Docker image ready, you can be running training within 2 minutes of clicking deploy.

Question 5

Is the H100 worth it over the A100?

Accepted Answer

If you are doing FP8 inference, training models above 30B parameters, or running anything that benefits from the Transformer Engine, the H100 is the right choice. It is roughly 3 to 4x faster than the A100 on those workloads. For sub-30B training and most inference at scale, the A100 80GB delivers a lower cost per token.

Question 6

Do you support multi-node H100 clusters with InfiniBand?

Accepted Answer

Yes. Spheron supports up to 8x H100 per node with NVLink, and bare-metal clusters up to 80x H100 across 10 nodes connected by 400 Gb/s InfiniBand. Every cluster is tested with PyTorch DDP, DeepSpeed ZeRO-3, and Megatron-LM. For larger configurations, contact us.

Question 7

What deep learning frameworks come pre-installed?

Accepted Answer

PyTorch, TensorFlow, JAX, and the major serving stacks (vLLM, TensorRT-LLM, SGLang, Triton). Containers ship with CUDA 12.4+, cuDNN, NCCL, and the standard NVIDIA AI Enterprise libraries. You can also bring your own Docker image.

Question 8

What regions are H100s available in?

Accepted Answer

H100 capacity is online across North America, Europe, and Asia, sourced from data center partners. Specific availability shifts with demand. The dashboard shows current capacity per region in real time.

Question 9

What is the difference between H100 SXM and H100 PCIe?

Accepted Answer

SXM5 is the higher-power variant (700W) with NVLink connectivity between GPUs in a node, which matters for multi-GPU training. PCIe is air-cooled, lower power (350W), and easier to mix with existing servers. Spheron offers both. Pick SXM for distributed training, PCIe for single-GPU inference.

Question 10

How long does it take to fine-tune Llama 3 70B on an H100?

Accepted Answer

A QLoRA fine-tune of Llama 3 70B on a 50k-sample dataset takes roughly 8 to 12 hours on a single H100. Full-parameter fine-tunes need 4 to 8 H100s and run in the 24 to 48 hour range depending on dataset size and sequence length. We have a fine-tuning case study with the exact numbers.
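The durations above translate into a rough budget once you multiply by an hourly rate. The sketch below does that arithmetic, assuming the $1.43 starting rate applies to every GPU-hour, which makes the result a floor rather than a quote. The function name `job_cost_range` is hypothetical.

```python
def job_cost_range(gpus: tuple[int, int], hours: tuple[int, int],
                   rate_per_gpu_hour: float = 1.43) -> tuple[float, float]:
    """Return (low, high) USD cost for a job given GPU-count and duration ranges.

    The low end pairs the fewest GPUs with the shortest run, and the high
    end pairs the most GPUs with the longest run.
    """
    low = gpus[0] * hours[0] * rate_per_gpu_hour
    high = gpus[1] * hours[1] * rate_per_gpu_hour
    return round(low, 2), round(high, 2)


# QLoRA on a single H100, 8 to 12 hours.
qlora = job_cost_range(gpus=(1, 1), hours=(8, 12))          # (11.44, 17.16)

# Full-parameter fine-tune on 4 to 8 H100s, 24 to 48 hours.
full_finetune = job_cost_range(gpus=(4, 8), hours=(24, 48))  # (137.28, 549.12)
```

The wide range for the full-parameter case comes from varying both GPU count and run length at once. In practice more GPUs usually shorten the run, so real costs tend to sit inside the range rather than at its edges.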

Question 11

Can I run H100 workloads on spot instances safely?

Accepted Answer

Yes, with checkpointing. Spot instances on Spheron are reclaimed when demand rises, so unsaved state is at risk. The safe pattern is: checkpoint every 15 to 30 minutes to a persistent volume, restart from the latest checkpoint on preemption, and reserve spot for training, batch jobs, and fault-tolerant inference. Use dedicated instances (99.99% SLA) for production serving.
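The checkpoint-and-resume pattern described above can be sketched with the standard library alone. This is an illustration under assumptions, not Spheron's tooling: `save_checkpoint`, `load_checkpoint`, and `run` are hypothetical names, the state is a small JSON dictionary standing in for model and optimizer state, and `path` is assumed to point at a persistent volume that survives preemption.

```python
import json
import os
import tempfile
import time


def save_checkpoint(path, state):
    """Write state atomically so a preemption mid-write cannot
    corrupt the last good checkpoint."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    # os.replace swaps the file in a single step on the same filesystem.
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return the saved state, or None if no checkpoint exists yet."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return None


def run(path, total_steps, step_fn, interval_s=15 * 60, clock=time.monotonic):
    """Resume from the latest checkpoint and save again every interval_s seconds."""
    state = load_checkpoint(path) or {"step": 0}
    last_save = clock()
    while state["step"] < total_steps:
        step_fn(state)  # one unit of training work (hypothetical callback)
        state["step"] += 1
        if clock() - last_save >= interval_s:
            save_checkpoint(path, state)
            last_save = clock()
    save_checkpoint(path, state)  # final checkpoint at completion
    return state
```

After a preemption, starting the same `run` call on a new instance picks up from the last saved step, so at most one interval of work is lost. In a real training job you would swap the JSON dictionary for your framework's own save and load calls.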

Question 12

Does the H100 support FP8 training and inference?

Accepted Answer

Yes. The Transformer Engine on Hopper supports FP8 mixed precision out of the box. In practice this is a roughly 1.7x speedup over FP16 with no measurable accuracy loss for most LLM workloads, plus halved memory pressure on activations. vLLM, TensorRT-LLM, and PyTorch all support FP8 paths.

Question 13

Do you offer enterprise SLAs and dedicated support for H100 deployments?

Accepted Answer

For 100+ GPU deployments and production-critical workloads, Spheron offers dedicated Slack or Discord support, sourcing assistance for capacity, and SLA-backed instances. Smaller deployments are self-serve through the dashboard.

Question 14

How does pricing on Spheron compare to AWS, GCP, and Azure?

Accepted Answer

For the same H100 hardware, Spheron is meaningfully cheaper than AWS p5, Azure ND H100, and GCP A3 on-demand. As of April 2026, hyperscaler on-demand H100 pricing runs roughly $3.00 per GPU per hour on GCP A3-high, about $3.90 on AWS p5, and about $7 on Azure ND H100 v5. Spheron starts at $1.43 per GPU per hour. Same chip, different pricing model.

Question 15

Is it better to rent or buy an NVIDIA H100?

Accepted Answer

For most teams, renting wins. An H100 costs roughly $25,000 to $40,000 to buy, plus a host server, power, and cooling, and it depreciates as newer GPUs ship. Renting on Spheron starts at $1.43 per GPU per hour with per-minute billing and no commitment, so you only pay while a job runs and you can move to H200 or B200 as soon as they suit your workload. Buy only if you keep an H100 near 24/7 utilization for more than a year and already have the data-center overhead to support it.
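A simple break-even calculation shows where the rent-versus-buy line falls. The sketch below is illustrative only: `break_even_hours` is a hypothetical helper, the purchase prices and rental rate are the figures quoted above, and `overhead_per_hour` (power, cooling, hosting for an owned card) defaults to zero because the source gives no number for it.

```python
HOURS_PER_YEAR = 24 * 365


def break_even_hours(purchase_price: float, rental_rate: float = 1.43,
                     overhead_per_hour: float = 0.0) -> float:
    """Hours of use at which owning a GPU costs the same as renting it.

    Each hour of ownership saves the rental rate but still incurs the
    running overhead, so only the difference pays down the purchase price.
    """
    saving_per_hour = rental_rate - overhead_per_hour
    if saving_per_hour <= 0:
        return float("inf")  # owning never catches up with renting
    return purchase_price / saving_per_hour


for price in (25_000, 40_000):
    years = break_even_hours(price) / HOURS_PER_YEAR
    print(f"${price:,}: {years:.1f} years of 24/7 use to break even")
# $25,000: 2.0 years of 24/7 use to break even
# $40,000: 3.2 years of 24/7 use to break even
```

These figures assume the starting rental rate and no ownership overhead. A higher rental rate shortens the break-even period, while any non-zero overhead lengthens it.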

Provider	Price per GPU-hour	Cost relative to Spheron
Spheron (your price)	$1.43/hr	-
Lambda Labs	$2.99/hr	2.1x more expensive
Google Cloud	$3.00/hr	2.1x more expensive
Nebius	$3.08/hr	2.2x more expensive
RunPod	$3.29/hr	2.3x more expensive
Latitude.sh	$3.37/hr	2.4x more expensive
AWS	$3.90/hr	2.7x more expensive
CoreWeave	$6.16/hr	4.3x more expensive
Azure	$6.98/hr	4.9x more expensive
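The multipliers in the table are each provider's hourly price divided by the Spheron rate, rounded to one decimal place. The sketch below reproduces them from the listed prices. The names `SPHERON_RATE`, `PROVIDER_RATES`, and `cost_multiplier` are hypothetical, and the prices are copied from the table rather than fetched live.

```python
SPHERON_RATE = 1.43  # USD per GPU-hour, from the table above

# Per-GPU hourly prices as listed in the table.
PROVIDER_RATES = {
    "Lambda Labs": 2.99,
    "Google Cloud": 3.00,
    "Nebius": 3.08,
    "RunPod": 3.29,
    "Latitude.sh": 3.37,
    "AWS": 3.90,
    "CoreWeave": 6.16,
    "Azure": 6.98,
}


def cost_multiplier(price: float, baseline: float = SPHERON_RATE) -> float:
    """How many times more expensive a price is than the baseline rate."""
    return round(price / baseline, 1)


multipliers = {name: cost_multiplier(rate) for name, rate in PROVIDER_RATES.items()}
for name, factor in multipliers.items():
    print(f"{name}: {factor}x more expensive")
```

To compare a whole job instead of a single hour, multiply each rate by your expected GPU-hours before taking the ratio. The multiplier itself does not change.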

NVIDIA H100 GPU: 80GB HBM3 Specs, Pricing & Rental. Rent H100 GPU from $1.43/hr

NVIDIA H100 specifications

NVIDIA H100 pricing

Need More H100 Than What's Listed?

When to pick the H100

Pick the H100 if

Pick the A100 instead if

Pick the H200 instead if

Pick the B200 instead if

NVIDIA H100 use cases

LLM training and fine-tuning

Production LLM inference

Diffusion and video generation

HPC and scientific computing

NVIDIA H100 benchmarks

Launch vLLM on an H100 in under 2 minutes

InfiniBand for multi-node H100 training

H100 vs alternatives

NVIDIA H100 guides and resources

NVIDIA H100 vs H200: benchmarks and when to upgrade

Running 10 concurrent fine-tuning jobs on bare-metal H100s

Building a sub-200ms RAG pipeline on bare-metal H100s

vLLM production deployment in 2026

Best NVIDIA GPUs for LLMs

GPU cost optimization playbook

NVIDIA H100 Release Date and Cloud Availability

H100 VRAM and Memory Bandwidth: 80GB HBM3 at 3.35 TB/s

NVIDIA H100 FAQ

NVIDIA H100 alternatives and related GPUs

H200

B200

A100