
🖼️ Image Generation Runtimes

Run FLUX, Stable Diffusion, and all LoRA models locally

🎯

ComfyUI

A powerful node-based GUI for image generation. Drag-and-drop workflows, with support for most common model formats.

🐧 Linux 🍎 macOS 🪟 Windows
git clone https://github.com/comfyanonymous/ComfyUI && cd ComfyUI && pip install -r requirements.txt
🖌️

Stable Diffusion WebUI (Auto1111)

Classic web interface for Stable Diffusion. Huge extension ecosystem, simple LoRA management.

🐧 Linux 🍎 macOS 🪟 Windows
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui && cd stable-diffusion-webui && ./webui.sh
On Windows, run webui-user.bat instead of ./webui.sh.
⚒️

SD WebUI Forge

Optimized fork of Auto1111 with lower VRAM usage and FLUX support built in. The project reports speed-ups of roughly 30-60% on low-VRAM GPUs, with smaller gains on high-end cards.

🐧 Linux 🍎 macOS 🪟 Windows
git clone https://github.com/lllyasviel/stable-diffusion-webui-forge && cd stable-diffusion-webui-forge && ./webui.sh
On Windows, run webui-user.bat instead of ./webui.sh.
🐍

Diffusers (Python)

Hugging Face's Python library for programmatic control, scripting, and batch processing. Supports most popular diffusion models, including Stable Diffusion, SDXL, and FLUX.

🐧 Linux 🍎 macOS 🪟 Windows
pip install diffusers transformers torch accelerate safetensors
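As a rough illustration of the "programmatic control" mentioned above, the following is a minimal sketch of generating one image with Diffusers. The function name `generate_image`, the default model ID, the step count, and the output path are illustrative assumptions, not settings recommended by this page; the first run downloads several gigabytes of weights.

```python
def generate_image(prompt: str,
                   model_id: str = "stabilityai/stable-diffusion-xl-base-1.0",
                   out_path: str = "output.png") -> str:
    """Generate one image with Diffusers, save it, and return the output path."""
    # Imported lazily so this file can be loaded without the heavy dependencies.
    import torch
    from diffusers import DiffusionPipeline

    # Use half precision on an NVIDIA GPU, full precision on CPU.
    use_cuda = torch.cuda.is_available()
    device = "cuda" if use_cuda else "cpu"
    dtype = torch.float16 if use_cuda else torch.float32

    # Download (on first use) and load the pipeline for the chosen model.
    pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=dtype)
    pipe = pipe.to(device)

    # Run the pipeline; 30 steps is an assumed, illustrative value.
    image = pipe(prompt=prompt, num_inference_steps=30).images[0]
    image.save(out_path)
    return out_path


# Example call (assumes the packages from the pip command above are installed):
# generate_image("a lighthouse at sunset, oil painting")
```

Because the prompt and model ID are plain function arguments, batch processing is a matter of calling the function in a loop over prompts.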
🎨

InvokeAI

Professional creative AI tool — unified canvas, layer support, inpainting, outpainting.

🐧 Linux 🍎 macOS 🪟 Windows
pip install invokeai && invokeai-web
🍎

Draw Things

Native macOS/iOS app — optimized for Apple Silicon, Metal GPU acceleration.

🍎 macOS 📱 iOS
Available on Mac App Store — FREE


🧠 LLM Runtimes (Text Generation)

Run large language models locally — chat, code, reasoning, all offline

🦙

Ollama

One of the simplest ways to run LLMs locally. One command downloads and runs a model from the Ollama library. Built-in API server.

🐧 Linux 🍎 macOS 🪟 Windows
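curl -fsSL https://ollama.com/install.sh | sh && ollama pull llama3.1
The install script is for Linux; on macOS and Windows, download the installer from ollama.com, then run ollama pull llama3.1.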
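The built-in API server can be called from any HTTP client. The sketch below uses only the Python standard library and assumes Ollama is running on its default local address (port 11434) with the llama3.1 model already pulled; the helper names `build_request` and `ask` are made up for this example.

```python
import json
import urllib.request

# Default address of a locally running Ollama server (assumed unchanged).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "llama3.1") -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str, model: str = "llama3.1") -> str:
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=120) as resp:
        return json.loads(resp.read())["response"]


# Example call (requires a running Ollama server):
# print(ask("Explain GGUF in one sentence."))
```

Setting `"stream": False` returns the whole answer as one JSON object, which keeps the example short; streaming responses arrive as one JSON object per line instead.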

vLLM

High-throughput LLM serving engine with PagedAttention and continuous batching.

🐧 Linux 🍎 macOS (experimental, build from source)
pip install vllm && vllm serve meta-llama/Llama-3.1-8B-Instruct
🔧

llama.cpp

CPU/GPU inference for GGUF models. Runs on a wide range of hardware, including machines without a GPU.

🐧 Linux 🍎 macOS 🪟 Windows
git clone https://github.com/ggerganov/llama.cpp && cd llama.cpp && cmake -B build && cmake --build build --config Release -j
🖥️

LM Studio

Beautiful desktop GUI — discover, download, and run LLMs with zero config. ChatGPT-like interface.

🐧 Linux 🍎 macOS 🪟 Windows
Download from lmstudio.ai — FREE for personal use
💬

Jan

Open-source ChatGPT alternative — runs 100% offline, clean UI, plugin system.

🐧 Linux 🍎 macOS 🪟 Windows
Download from jan.ai — FREE & open source
🍏

MLX LM (Apple Silicon)

Apple's ML framework, optimized for Apple silicon (M-series) chips, with unified memory for large models.

🍎 macOS (Apple Silicon)
pip install mlx-lm && mlx_lm.generate --model mlx-community/Meta-Llama-3.1-8B-Instruct-4bit --prompt "Hello"


🖥️ 100TB Server Setup Guide

A step-by-step guide to setting up your servers as a large local AI model library

Step 1: System Preparation (Ubuntu/Debian)

sudo apt update && sudo apt install -y git git-lfs python3 python3-pip nvidia-driver-535 nvidia-cuda-toolkit
git lfs install

Step 2: Create Model Storage

sudo mkdir -p /mnt/100tb/models/{foundation,loras,characters,editing,llms} && sudo chown -R $USER:$USER /mnt/100tb/models
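The same layout can be created and checked from Python, which is convenient when the setup is scripted. This is a minimal sketch: the category names come from the command above, while the function names `create_model_tree` and `free_space_tb` are hypothetical helpers, and the process is assumed to have write access to the chosen root.

```python
import shutil
from pathlib import Path

# Sub-directories used by the mkdir command in this step.
CATEGORIES = ["foundation", "loras", "characters", "editing", "llms"]


def create_model_tree(root) -> list:
    """Create one directory per category under root; safe to run repeatedly."""
    root = Path(root)
    created = []
    for name in CATEGORIES:
        path = root / name
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created


def free_space_tb(path) -> float:
    """Return free space on the filesystem holding path, in terabytes (1e12 bytes)."""
    return shutil.disk_usage(path).free / 1e12


# Example call (assumes /mnt/100tb is mounted and writable by the current user):
# create_model_tree("/mnt/100tb/models")
# print(f"{free_space_tb('/mnt/100tb/models'):.1f} TB free")
```

Checking free space before a bulk download is a cheap way to confirm that the large volume is actually mounted at the expected path.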

Step 3: Install AI Runtimes

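pip install torch torchvision diffusers transformers accelerate safetensors huggingface_hub vllm invokeai
Note: vllm and invokeai each pin specific torch versions; if pip reports a conflict, install them in separate virtual environments.
curl -fsSL https://ollama.com/install.sh | sh
sudo mkdir -p /opt/comfyui && sudo chown $USER:$USER /opt/comfyui && git clone https://github.com/comfyanonymous/ComfyUI /opt/comfyui && cd /opt/comfyui && pip install -r requirements.txt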

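Step 4: Download Models

cd /mnt/100tb/models/foundation && git clone https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0 && git clone https://huggingface.co/black-forest-labs/FLUX.1-dev
Note: stable-diffusion-xl-base-1.0 downloads without a login. FLUX.1-dev is a gated repository: accept its license on Hugging Face and authenticate with an access token before cloning.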
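As an alternative to git clone, the huggingface_hub package installed in Step 3 can fetch a repository without keeping a second copy of the files in Git history. The sketch below assumes the storage layout from Step 2; `local_dir_for` and `download` are illustrative helper names, and the optional `token` argument is only needed for repositories that require authentication.

```python
from pathlib import Path

# Target directory from Step 2 (assumed to exist and be writable).
MODELS_ROOT = Path("/mnt/100tb/models/foundation")


def local_dir_for(repo_id: str, root=MODELS_ROOT) -> Path:
    """Map 'org/name' to '<root>/name', the folder name git clone would create."""
    return Path(root) / repo_id.split("/")[-1]


def download(repo_id: str, root=MODELS_ROOT, token=None) -> Path:
    """Download every file of a Hugging Face repository into its local folder."""
    # Imported lazily so the path helper works without the package installed.
    from huggingface_hub import snapshot_download

    target = local_dir_for(repo_id, root)
    snapshot_download(repo_id=repo_id, local_dir=str(target), token=token)
    return target


# Example call (downloads many gigabytes):
# download("stabilityai/stable-diffusion-xl-base-1.0")
```

Running the same call again after an interruption picks up where it left off instead of starting over, which helps with multi-gigabyte checkpoints.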

💡 Pro tip: Visit the AI Models Hub and click "Copy Bulk Script" to get ALL 60+ model download commands at once!


💻 Hardware Requirements

What you need for different model sizes

🟢

Entry Level

GPU: 8GB VRAM (RTX 3060 Ti/4060)
RAM: 16GB
Storage: 500GB SSD

✅ Runs: SDXL, small LoRAs, 7B LLMs (4-bit quantized), Ollama

🟡

Mid Range

GPU: 16-24GB VRAM (RTX 4080/4090)
RAM: 32-64GB
Storage: 2TB NVMe

✅ Runs: FLUX.1, SD3, all LoRAs, quantized LLMs up to about 30B

🔴

Power User / Server

GPU: 48-80GB VRAM (RTX A6000, A100, H100)
RAM: 128GB+
Storage: 100TB (your setup!)

✅ Runs: nearly everything on a single GPU, including quantized 70B LLMs and concurrent serving. The very largest models still need multiple GPUs.
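The LLM sizes in these tiers follow from simple arithmetic: weight memory is roughly the parameter count times the bytes per parameter. The sketch below is a back-of-the-envelope estimate only; the 1.2 overhead factor for KV cache and activations is an assumption chosen for illustration, and real usage varies with context length and runtime.

```python
def weight_memory_gb(params_billion: float, bits_per_param: int) -> float:
    """Memory needed for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_param / 8


def estimated_vram_gb(params_billion: float, bits_per_param: int,
                      overhead: float = 1.2) -> float:
    """Weights plus an assumed 20% allowance for KV cache and activations."""
    return weight_memory_gb(params_billion, bits_per_param) * overhead


def fits(params_billion: float, bits_per_param: int, vram_gb: float) -> bool:
    """True if the rough estimate fits in the given amount of VRAM."""
    return estimated_vram_gb(params_billion, bits_per_param) <= vram_gb


# Weights only: a 7B model at 4 bits per parameter needs 7 * 4 / 8 = 3.5 GB.
for size, bits, vram in [(7, 4, 8), (7, 16, 8), (30, 4, 24), (70, 16, 80)]:
    verdict = "fits" if fits(size, bits, vram) else "does not fit"
    print(f"{size}B at {bits}-bit on {vram}GB: {verdict}")
```

Under these assumptions a 4-bit 7B model fits in 8GB and a 4-bit 30B model fits in 24GB, while a 70B model at 16-bit precision exceeds a single 80GB card.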