llm·monitor

/ Hardware / HYDRA

HYDRA: a 4×RTX 3090 inference rig that runs Gemma-4-31B at 76.9 tok/s

Built mostly on the used market. Serves a production agentic workload while doubling as the reference benchmark target on this site. Bill of materials below — each row links to the marketplace where we actually sourced that part.

HYDRA: 6U open-frame chassis, front-corner view showing three intake fans, dual PSU sockets, side carry handle, and HYDRA label, sitting on a desk
HYDRA · 6U open-frame chassis · 4× RTX 3090 · dual 1600 W PSUs
Build cost
~$6,200
Power
~800 W under load · ~170 W idle
measured at the wall
Performance
76.9 tok/s decode · Gemma-4-31B-AWQ · single-stream
Serves
Maple agentic workflows + LLM Monitor bench rig

Bill of materials

Component Part Price Buy
GPU
NVIDIA RTX 3090 24GB
Used market only. Look for Founders Edition or EVGA FTW3 for cleaner thermals. Each card runs at PCIe 3.0 x16 in this build (see motherboard note).
~$890 each used
Find on eBay →
Motherboard
Asus Pro WS X299 Sage II (LGA 2066)
CEB workstation board. PLX chips on the board feed all four PCIe slots at full x16 PCIe 3.0 to each GPU — no bandwidth split. The reason this specific board was chosen.
~$475
Find on eBay →
CPU
Intel Core i9-7920X (LGA 2066, 12-core / 24-thread)
Skylake-X. Socket LGA 2066 (Socket R4). 44 PCIe 3.0 lanes. CPU compute barely affects single-stream inference — the platform is chosen for the lane count.
~$185 used
Find on eBay →
RAM
16 GB DDR4 NON-ECC stick
Eight matched 16 GB sticks for 128 GB total — full quad-channel population on the X299 platform. ~$224 across the eight slots.
~$28 each
Find on eBay →
PSU
1600W 80+ Titanium PSU (dual)
Dual-PSU rig. Even though measured load is ~800 W at the wall, dual 1600 W units keep each PSU well below its peak-efficiency knee and tolerate a single-PSU failure. Headroom is cheap insurance on used parts.
~$320 each
Find on eBay →
Case
6U open-frame mining/AI chassis (dual-PSU support)
Open-frame 6U chassis sized for 4–6 full-length GPUs with built-in dual-PSU mounting. The same form factor crypto rigs use; for inference duty it gives you the GPU spacing and airflow that a standard ATX case can't provide.
~$200
Find on eBay →
Storage
Samsung NVMe SSD 1TB
Model weight cache + OS. Bumping to 2 TB worth considering once you cache more than two large GGUFs.
~$85
Find on eBay →
Cooling
Thermalright 240mm AIO
CPU loop. The 7920X is happy under any 240mm class AIO at inference loads.
~$75
Find on eBay →
Cooling
Noctua NF-F12 / NF-A12 industrialPPC 120mm
Case + GPU stack airflow. Industrial PPC variant for the static pressure needed across packed 3090s.
~$30 each
Find on eBay →
Risers
PCIe Gen5 60mm risers
Lets the GPU stack physically clear in an open-frame layout. Gen5-rated only because Gen4/Gen3 ones drop signal at 4090 / 3090 power loads on long runs.
~$45 each
Amazon
Cabling
DisplayPort 1.4 extension cable 3ft (8K-rated)
One per GPU. Lets you mount the rig in a way where the cards face into the case but the DP outputs route cleanly to a switch or breakout panel.
~$12 each
Amazon
Misc
DisplayPort headless ghost dongles (3-pack)
EDID emulators that spoof a connected monitor on each GPU. Required if you ever want full clock-rate behaviour on cards running headless — some workloads (and some driver versions) downclock GPUs that have no display attached.
~$14 per 3-pack
Amazon
Software
NVIDIA driver 580 + CUDA 13.0 + vLLM 0.17
Stack as of v15. Pin versions until benched.
free

eBay rows pull a daily median across the cheapest Buy-It-Now listings sampled. Amazon rows show our actual paid price at time of build — Amazon's pricing is stable enough that a daily probe isn't worth the API overhead. Click through to verify current pricing before purchasing.

In the flesh

A few photos from the build and testing phases. Front to back, lit up running, then back through the build.

Eagle-eyed readers will notice an RTX 3090 Ti in the testing photos. The Ti was a stand-in during bring-up; HYDRA's production lineup is four 3090s, which is what the bill of materials and the benchmark numbers reflect.

HYDRA front panel: three 120 mm intake fans above two PSU mounts labeled POWER-1 and POWER-2, with a HYDRA sticker top-right
Front panel. Three 120 mm intake fans pull cool air across the GPU stack. Below them, two 1600 W PSUs mount end-on with a "HYDRA" tag top-right — POWER-1 and POWER-2 carry the load split between the cards.
HYDRA rear panel: Thermalright AIO radiator with two fans mounted to it and the motherboard I/O panel visible below
Rear panel. The Thermalright AIO radiator handles CPU heat through its own dedicated loop, with the motherboard I/O (USB, audio, networking) tucked underneath. GPU airflow exhausts through the open vents on either side.
Top-down view of HYDRA during a testing run, with four GPUs (three 3090s and a 3090 Ti) RGB-illuminated, AIO radiator and Noctua fan rail visible
During testing, looking down. Four GPUs in a TP=4 layout — at the time, three 3090s and a 3090 Ti standing in for the fourth. AIO radiator visible up top, Noctua industrialPPCs at the bottom. The Ti was swapped for a fourth 3090 before HYDRA went into production.
Empty 6U open-frame mining/AI chassis from above, showing the GPU bay layout with two top-mounted exhaust fans
The 6U chassis fresh out of the box, exhaust fans mounted, before any GPUs went in. Same form factor crypto miners use; for inference duty it gives the GPU spacing and airflow a standard ATX case can't.
HYDRA mid-build with one GPU installed and a Maltese dog standing in front of it
Mid-build, first GPU in. Project supervisor pictured.
Not ready to commit $6,200?

Rent the equivalent on cloud GPUs first

Test your workload on a single A100 or H100 before deciding whether a 4×3090 build pays back. Both platforms below offer per-second billing.