Strix Halo AI Toolboxes

Toolboxes for GenAI on AMD Ryzen AI MAX+

Containerized environments for LLMs, Image Generation, and Fine-tuning.

The Project

In August 2025, I got my hands on a Strix Halo machine. I needed to run local inference for some Cyber Security work where Cloud LLMs were not an option.

I quickly realized the software ecosystem wasn't ready: many things simply didn't work. So I started digging, learning, and fixing things, and I shared my findings in a video. People found it useful.

Thanks to support from the Strix Halo Home Lab community, Framework, and AMD, I've continued to maintain these "Toolboxes" so others can reproduce this setup and run AI workloads on Strix Halo hardware.

// WHOAMI

Donato Capitella

Software Engineer and Ethical Hacker. I enjoy understanding systems by breaking them down and documenting the process.

▶ YouTube Channel 🔗 LinkedIn Profile 📝 LLM Chronicles

What is Strix Halo?

// RYZEN AI MAX+
AMD Ryzen AI MAX

"Strix Halo" (Ryzen AI MAX+) is AMD's high-performance mobile processor platform. Its key feature for AI workloads is Unified Memory, allowing the iGPU to access up to 128GB of system RAM, significantly increasing the model size capacity compared to traditional consumer GPUs.

Architecture Zen 5 + RDNA 3.5
GPU ID gfx1151
Max Unified Memory 128 GB
-> Official Product Page

Active Toolboxes

// MAINTAINED CONTAINERS

These are containerized environments built on Toolbx (Docker/Podman). This approach allows you to easily get the specific runtime needed for Strix Halo, keep the host system clean, and instantly switch between different ROCm or software versions without dependency conflicts.
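Assuming Toolbx is installed on the host, a typical workflow looks roughly like this. The image name below is a placeholder for illustration, not the actual image published by these repos; use the one documented in the toolbox repo you want.

```shell
# Create a named toolbox from a container image
# (ghcr.io/example/llama-rocm-toolbox is a placeholder image name)
toolbox create llama-rocm --image ghcr.io/example/llama-rocm-toolbox:latest

# Enter the container; your home directory and devices are shared with it
toolbox enter llama-rocm

# Remove it when you want to switch to a different ROCm or software version
toolbox rm llama-rocm
```

Because each toolbox is an isolated container, you can keep several side by side (one per ROCm version) and switch between them without touching the host.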

Llama.cpp Logo

Llama.cpp Toolboxes

Setup for LLM inference. Supports clustering via RDMA and Vulkan/ROCm backends.

View Repo ->
ComfyUI Logo

ComfyUI Toolboxes

Environment for Image & Video generation. Validated for Flux, Wan 2.2, HunyuanVideo, and Qwen.

View Repo ->
vLLM Logo

vLLM Toolboxes

Serving server setup. Includes custom RCCL patches for high-speed clustering.

View Repo ->
PyTorch Logo

LLM Fine-tuning

Training environment. QLoRA and Full Fine-Tuning support for Gemma 3, Qwen 3, and generic models.

View Repo ->

Tutorials & Guides

// YOUTUBE VIDEOS

Host Config

// TUNED FOR PERFORMANCE

This is the configuration I use on my Framework Desktop to maintain and benchmark all toolboxes.

Framework Desktop

My Rig - Sent to me by Framework

System Specifications

Model Framework Desktop
CPU Ryzen AI MAX+ 395 "Strix Halo"
Total RAM 128 GB DDR5
OS Fedora 43 (Linux 6.18.5)

Why Custom Kernel Parameters?

Many guides suggest statically partitioning memory between the CPU and iGPU (e.g., locking 32GB for video memory). However, static partitioning wastes RAM. With dynamic unified memory, the GPU can claim nearly all system RAM (up to ~124GB) on demand, while leaving it free for the CPU whenever the GPU isn't using it.

root@strix-halo:~
# Add these to GRUB_CMDLINE_LINUX in /etc/default/grub
$ sudo vim /etc/default/grub
GRUB_CMDLINE_LINUX="... iommu=pt amdgpu.gttsize=126976 ttm.pages_limit=32505856"
$ sudo grub2-mkconfig -o /boot/grub2/grub.cfg
How each parameter enables Unified Memory:

iommu=pt - Pass-Through Mode: bypasses IOMMU translation for the GPU, reducing overhead when accessing user-space pages.
amdgpu.gttsize=126976 - GTT (Graphics Translation Table) Size: sets the maximum unified memory addressable by the GPU to ~124GB (126976 MB), overriding the default driver limit.
ttm.pages_limit=32505856 - Pinned Memory Limit: allows the TTM (Translation Table Manager) to pin up to ~124GB of pages in high-speed system RAM, ensuring the GPU has direct access without swapping.
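As a sanity check, the two limits describe the same budget: gttsize is given in MiB and pages_limit in 4 KiB pages, and both work out to exactly 124 GiB:

```shell
# amdgpu.gttsize is in MiB: 126976 / 1024 = 124 GiB
echo $((126976 / 1024))

# ttm.pages_limit is in 4 KiB pages: 32505856 * 4096 bytes = 124 GiB
echo $((32505856 * 4096 / 1024 / 1024 / 1024))
```

After rebooting, you can cross-check the effective GTT size at runtime via the amdgpu sysfs attribute (reported in bytes; the card index may differ on your system): `cat /sys/class/drm/card0/device/mem_info_gtt_total`.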

Power & Performance Tuning

Following the tuned documentation, we apply the accelerator-performance profile for maximum throughput.

root@strix-halo:~
$ sudo dnf install tuned
$ sudo systemctl enable --now tuned
$ tuned-adm list | grep accelerator
- accelerator-performance - Throughput performance based tuning with disabled higher latency STOP states
$ sudo tuned-adm profile accelerator-performance
$ tuned-adm active
Current active profile: accelerator-performance

Join the Community

Connect with other Strix Halo owners, share benchmarks, and get help.