/ Hardware / HYDRA
HYDRA: a 4×RTX 3090 inference rig that runs Gemma-4-31B at 76.9 tok/s
Built on the used market for ~$6,600 capex. Serves a production agentic workload while doubling as the reference benchmark target on this site. Below is the bill of materials with current Amazon and Newegg links.
▸ HYDRA · photo placeholder
replace with rig photo
Capex
~$6,600
Power
~1,500 W under load · ~120 W idle
Performance
76.9 tok/s decode · Gemma-4-31B-AWQ · single-stream
Serves
Maple agentic workflows + Ollama Watch bench rig
Bill of materials
| Component | Part | Buy |
|---|---|---|
| GPU | NVIDIA RTX 3090 24GB (used market) Used market only. Look for Founders Edition or EVGA FTW3 for cleaner thermals. | Amazon |
| NVLink | PNY NVLink 3-slot bridge (RTXA6000NVLINK3S-KIT) Removed from production. Single-stream decode does not benefit from NVLink — empirically 4% slower with than without on TP=4. | Amazon |
| Motherboard | ASRock X299 Sage / WS X299 PRO 7 PCIe x16 slots so all four 3090s run at x8 minimum. Avoid consumer Z690/Z790. | Amazon |
| CPU | Intel Xeon W-2295 (or i9-10980XE) CPU barely matters for inference but you need the lanes. | used market |
| RAM | 128 GB DDR4 ECC (4×32 GB) ECC because the rig runs unattended. | used market |
| PSU | Corsair AX1600i + secondary 1200W (dual-PSU) 4× 3090 will pull ~1500W under inference; dual-PSU keeps each unit out of derate range. | Amazon |
| Cooling | Noctua NH-U14S TR4-SP3 + 8× NF-A12x25 GPU stack runs hot when packed; case airflow matters more than CPU cooler. | used market |
| Storage | Samsung 980 Pro 2TB NVMe Model weights cache + OS. | Amazon |
| Software | NVIDIA driver 580 + CUDA 13.0 + vLLM 0.17 Stack as of v15. Pin versions until benched. | used market |
Not ready to commit $6,600?
Rent the equivalent on cloud GPUs first
Test your workload on a single A100 or H100 before deciding whether a 4×3090 build pays back. Both platforms below offer per-second billing.