Run Your Own ChatGPT with Open WebUI and Ollama

ChatGPT is great until you realize every prompt you send is stored on someone else's servers, your conversations feed into training data, and your API bill grows with every request.

Open WebUI + Ollama gives you the same experience — a polished chat interface with multiple AI models — running entirely on your own server. No API costs, no data leaving your network, no rate limits.

Open WebUI is the interface (129k+ stars on GitHub). Ollama is the engine that runs the models. Together, they deploy in 5 minutes with Docker Compose, and the result feels like a real product — not a side project.

This guide walks you through the complete setup on a Linux server.

What you'll need

  • A Linux server (Ubuntu 22.04/24.04, Debian 12, or similar)
  • At least 8 GB RAM for running 7-8B parameter models on CPU
  • 4+ CPU cores recommended
  • 20 GB storage minimum (models are 4-8 GB each)
  • Docker Engine 25+ with the docker compose plugin

For CPU inference with small models (Llama 3.1 8B, Mistral 7B), a VPS with 4 cores and 16 GB RAM works well. Responses are slower than on a GPU but perfectly usable for personal use or a small team. For faster inference, a dedicated server with a GPU (RTX 4090 or A100) makes a huge difference.
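Before installing anything, it's worth a quick look at what the server actually has. A small pre-flight sketch (nproc, free, and df are standard on the distributions listed above):

```shell
# Pre-flight check: cores, total RAM, and free disk on the root filesystem
echo "CPU cores : $(nproc)"
echo "RAM       : $(free -h | awk '/^Mem:/ {print $2}')"
echo "Free disk : $(df -h / | awk 'NR==2 {print $4}')"
```

Compare the output against the requirements above before moving on.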

Step 1 — Install Docker

If Docker isn't already installed:

bash

sudo apt update
sudo apt install -y ca-certificates curl gnupg
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg

echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

sudo apt update
sudo apt install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

sudo usermod -aG docker $USER

Log out and back in, then verify:

bash

docker compose version

Step 2 — Create the Docker Compose file

Create a directory and set up the compose file:

bash

mkdir -p /opt/openwebui && cd /opt/openwebui

Create docker-compose.yml:

yaml

services:
  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    volumes:
      - ollama_data:/root/.ollama
    restart: unless-stopped
    tty: true

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    volumes:
      - openwebui_data:/app/backend/data
    depends_on:
      - ollama
    ports:
      - "3000:8080"
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
      - WEBUI_SECRET_KEY=your_secret_key_here
    extra_hosts:
      - host.docker.internal:host-gateway
    restart: unless-stopped

volumes:
  ollama_data:
  openwebui_data:

Generate a secret key:

bash

openssl rand -hex 32

Replace your_secret_key_here with the generated value.

Step 3 — Start the stack

bash

docker compose up -d

Watch the logs:

bash

docker compose logs -f open-webui

Wait until you see the server is ready on port 8080. Then open:

http://your-server-ip:3000

The first user to register becomes the admin. Create your account immediately.

Step 4 — Download your first model

Open WebUI lets you pull models directly from the interface, but the fastest way is the command line:

bash

docker exec -it ollama ollama pull llama3.1:8b

This downloads the Llama 3.1 8B model (around 4.7 GB). Other popular models to try:

bash

# Fast and capable general model
docker exec -it ollama ollama pull mistral:7b

# Google's compact model
docker exec -it ollama ollama pull gemma2:9b

# Coding-focused model
docker exec -it ollama ollama pull qwen2.5-coder:7b

# Small and fast for quick tasks
docker exec -it ollama ollama pull phi4-mini

Each model takes 3-8 GB of disk space. You can download as many as your storage allows — Ollama loads and unloads them from memory as needed. To see what's installed at any time, run docker exec -it ollama ollama list.
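If you plan to pull several models, it pays to check the budget against free disk space first. A quick sketch (the per-model sizes in GB are rough assumptions based on the 3-8 GB figure above, not exact download sizes):

```shell
# Rough disk budget for the five models above (sizes in GB are estimates)
need_gb=$(( 5 + 4 + 6 + 5 + 3 ))   # llama3.1:8b, mistral, gemma2, qwen2.5-coder, phi4-mini
have_gb=$(df -BG --output=avail / | tail -n 1 | tr -dc '0-9')
echo "Need roughly ${need_gb} GB; ${have_gb} GB free on /"
```

Note that the models land in the ollama_data volume, which normally lives under /var/lib/docker — on most single-disk servers that is the root filesystem checked here.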

Step 5 — Start chatting

Go back to the Open WebUI interface. Select a model from the dropdown at the top, type a message, and you have your own private ChatGPT.

Features you get out of the box:

  • Multiple models — switch between Llama, Mistral, Gemma, and others in one click
  • Conversation history — all chats saved locally on your server
  • Multi-user support — create accounts for your team, each with their own chat history
  • RAG (document chat) — upload PDFs and ask questions about their content
  • System prompts — customize how each model behaves
  • Model presets — save temperature, context length, and other settings per model
  • Dark mode — of course

Step 6 — Set up HTTPS for remote access

If you want to access your AI from outside your network:

bash

sudo apt install -y nginx certbot python3-certbot-nginx

Create /etc/nginx/sites-available/openwebui:

nginx

server {
    listen 80;
    server_name ai.yourdomain.com;

    client_max_body_size 100M;

    location / {
        proxy_pass http://127.0.0.1:3000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # WebSocket support (needed for streaming responses)
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}

Enable and get SSL:

bash

sudo ln -s /etc/nginx/sites-available/openwebui /etc/nginx/sites-enabled/
sudo nginx -t
sudo systemctl reload nginx
sudo certbot --nginx -d ai.yourdomain.com

The WebSocket headers are important — without them, streaming responses (where text appears word by word) won't work through the proxy.
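One optional addition, if long replies get cut off: Nginx's default proxy_read_timeout is 60 seconds, which a slow CPU generation can exceed. Raising it inside the location / block keeps long streams alive (the 300 s value is just a suggestion):

```nginx
        # Inside the location / block — allow long-running generations
        proxy_read_timeout 300s;
        proxy_send_timeout 300s;
```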

Step 7 — Connect cloud APIs (optional)

Open WebUI isn't limited to local models; you can also connect it to cloud APIs.

In the admin panel, go to Settings → Connections and add your API keys for:

  • OpenAI — GPT-4o, GPT-4 Turbo
  • Anthropic — Claude
  • Any OpenAI-compatible API — Groq, Together AI, Fireworks, etc.

This turns Open WebUI into a unified interface for all your AI models — local and cloud — in one place. Use local models for daily tasks (free, private) and switch to cloud models when you need maximum capability.
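Under the hood, every provider in that last category speaks the same request format, which is why one "OpenAI-compatible" setting covers them all. A sketch of the shape (the endpoint, key, and model name are placeholders, not real values):

```shell
# The chat request body all OpenAI-compatible providers accept
payload='{"model": "example-model", "messages": [{"role": "user", "content": "Hello"}]}'
echo "$payload"

# Sending it would look like this (requires a real endpoint and API key):
# curl https://api.example.com/v1/chat/completions \
#   -H "Authorization: Bearer $API_KEY" \
#   -H "Content-Type: application/json" \
#   -d "$payload"
```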

How much RAM do you actually need?

The biggest question with self-hosted LLMs is always RAM. Here's a practical guide:

Model                Parameters   RAM needed (CPU)   Quality
Phi-4 Mini           3.8B         4 GB               Good for quick tasks
Mistral 7B           7B           8 GB               Strong general use
Llama 3.1 8B         8B           8 GB               Excellent all-rounder
Gemma 2 9B           9B           10 GB              Google's best compact model
Qwen 2.5 Coder 7B    7B           8 GB               Best for coding tasks
Llama 3.1 70B        70B          48 GB+             Near GPT-4 quality (needs GPU)

For CPU inference, 16 GB RAM gives you comfortable headroom for 7-8B models. Responses take 2-5 seconds to start — slower than ChatGPT, but completely private and free.

With a GPU (RTX 4090, 24 GB VRAM), responses are near-instant and you can run models up to 30B parameters comfortably.
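The table's CPU figures roughly track a rule of thumb you can apply to any model (an approximation, not an official Ollama formula): at the default Q4 quantization, the weights alone take about 0.6 GB per billion parameters, and the table's numbers add headroom for the KV cache and the OS.

```shell
# Approximate weight size at Q4 quantization: ~0.6 GB per billion parameters
q4_weights_gb() { echo $(( $1 * 6 / 10 )); }

echo "8B  model: ~$(q4_weights_gb 8) GB of weights"    # table budgets 8 GB total
echo "70B model: ~$(q4_weights_gb 70) GB of weights"   # table budgets 48 GB+
```

This is why a 16 GB machine handles 7-9B models comfortably but a 70B model needs a different class of hardware.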

Updating

Open WebUI and Ollama update frequently. To upgrade:

bash

cd /opt/openwebui
docker compose pull
docker compose up -d
docker image prune -f

Your conversations, users, and settings are preserved in the Docker volumes.

Troubleshooting

"Model not found" error: The model isn't downloaded yet. Run docker exec -it ollama ollama pull model_name.

Slow responses: That's CPU inference. It's normal for 7B models to take 2-5 seconds per response on CPU. For faster results, use a smaller model (Phi-4 Mini) or add a GPU.

Out of memory: The model is too large for your RAM. Stick to 7-8B models on 16 GB RAM. Ollama uses quantized models (Q4) by default to reduce memory usage.

WebSocket errors through proxy: Make sure your Nginx config includes the Upgrade and Connection headers for WebSocket support.

Can't register: By default, the first user is admin. If registration is disabled, the admin can create accounts in Settings → Admin → Users.
