← All docs
Getting started
Your first sessionThe workspaceSSH access
Core concepts
Agent rolesExperimentsResearch plansCanvasWorkflows
Integrations
GitHubWebhooksNotifications
Platform
Billing and plansAgent memoryMulti-chatKeyboard shortcuts
Reference
GPU catalogAgent toolsAPI referenceSecurityLimits and quotas
← Docs
Reference

GPU catalog

Available GPUs

| GPU | VRAM | $/hr | Best for |

|---|---|---|---|

| RTX 4090 | 24 GB | Variable | Fine-tuning up to 13B, inference, prototyping |

| RTX 5090 | 32 GB | Variable | Larger fine-tunes, extended context, vision models |

| A6000 | 48 GB | Variable | 30B+ fine-tuning, large batch training, multi-model |

| A100 | 80 GB | Variable | Pre-training, large-scale fine-tuning, multi-GPU |

| H100 | 80 GB | Variable | Maximum throughput, FP8 training, fastest iteration |

| H200 | 141 GB | Variable | Largest models in a single GPU, 70B+ fine-tuning |

Availability

GPUs are provisioned on demand. Availability varies:

  • RTX 4090: almost always available, typically <30 second provision
  • RTX 5090: good availability, usually <1 minute
  • A6000: good availability, usually <1 minute
  • A100: moderate availability, can take 1 to 3 minutes during peak hours
  • H100: limited, may require a short wait during peak hours (US business hours)
  • H200: most limited, best availability during off-peak hours

If a GPU isn't available, Meshia queues your request and provisions it as soon as one opens up. You're not charged during the queue wait.

Recommendations by use case

Fine-tuning 7B to 13B models (LoRA/QLoRA): RTX 4090. Plenty of VRAM for LoRA adapters, fast enough for quick iteration. Best price-to-performance for this workload.

Fine-tuning 30B+ models: A6000 or A100. You need the VRAM. A100 is faster but costs more. Use A6000 for overnight runs where speed matters less.

Pre-training or full fine-tuning: A100 or H100. These workloads are compute-bound. The H100's FP8 support and higher memory bandwidth make a real difference for large training runs.

Inference and prototyping: RTX 4090. Cheapest option, plenty fast for interactive use. Start here and scale up if you need more VRAM.

Running 70B+ models: H200. The 141 GB VRAM lets you fit models that won't fit on anything else in a single GPU.

CUDA and driver versions

All GPUs run CUDA 12.4+ with the latest stable NVIDIA drivers. PyTorch, JAX, and TensorFlow are pre-installed in the pod environment. The agent handles versions. If your code needs a specific one, just ask.

← Keyboard shortcutsAgent tools →