Audience. This tutorial assumes you already have a basic understanding of Podman and its architecture, as well as basic networking. It also assumes Podman is already installed on Linux, whether on bare metal (my setup), in a virtual machine, or via the Windows Subsystem for Linux (WSL). Finally, it assumes you already understand the system requirements, risks, and limitations of running AI on your local machine.
Since the release of NetworkChuck’s tutorial on May 3, 2024, Ollama for Windows has become available, but these steps are NOT compatible with Ollama for Windows.
Inspiration. In 2023, I was amazed by what could be created with ChatGPT. Being able to generate a 10-page research paper on Renaissance Art in 5 minutes was mind-blowing. However, the real fun in ChatGPT came from entering prompts that generated humorous but still company-appropriate images of people who don’t exist.
Then I realized my curiosity was directing me to use AI in ways that would be incompatible with the policies of my former organization. Many organizations forbid the use of ChatGPT for personal purposes, in addition to stating clearly that all data generated on company equipment automatically belongs to the company. I needed a new personal solution, completely disconnected from a professional context, that would let me generate more experimental results without any of that data being collected by either my employer or OpenAI.
So I was excited to find that NetworkChuck, a popular IT expert on YouTube, had uploaded a tutorial on hosting all your AI locally. Even with a couple of tweaks needed to make it work on FreeBSD, it was exciting to run local Large Language Models (LLMs) on my own PC, under my own control.
A year later, I took this basic local AI setup and built upon it with enhanced security and minimized risk, using Podman. As someone just starting to use Podman in my personal homelab, I put together this tutorial after teaching myself.
The goal. On Linux, we’re taking the Ollama + WebUI structure outlined by NetworkChuck and moving it into Podman containers hosted by a non-admin (non-root) user. We’ll also have them run at boot as rootless Systemd services so they’re always running. This setup offers these key security advantages:
- Daemonless. There is no persistent background daemon performing tasks and services without user intervention, so no privileged process is running constantly, which means less attack surface
- Rootless. The Podman architecture is rootless by default, so the impact of an attack is reduced because nothing runs with root privileges. If an attacker happens to gain access to the non-admin account, it is much harder to take over the host than with a standard host Ollama + WebUI install
- Control. The Podman setup allows for greater control over security measures with SELinux, leaving room for future hardening tweaks and techniques
With this setup, we avoid the potential security risks of a non-containerized Ollama installed with the official install.sh script from ollama.com. You can confirm your account is actually running Podman rootless with the quick check below.
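If you want to double-check that your non-admin account really is running Podman rootless, podman info reports this. The Go-template field path below is my best guess at where the flag lives in the info output; if it errors, just read the security section of the full podman info output instead:
podman info --format '{{.Host.Security.Rootless}}'
It should print true for a rootless setup.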
Requirements. Many users have different preferences for what hardware is best for local AI. It depends on whether you want basic prompting or advanced image generation with image diffusion. The minimum setup, based on my homelab, is at least 32GB of DDR4 RAM, 12GB of GDDR6 VRAM, and an 8-core x86 CPU.
Podman is a widely available container platform for most Linux distributions, including ones for WSL.
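If Podman is not installed yet, it is usually just a package-manager install away; the commands below assume a Fedora-family or Debian/Ubuntu-family system, so adjust for your own distribution.
For Fedora, RHEL, or CentOS Stream:
sudo dnf install podman
For Debian or Ubuntu:
sudo apt install podman
Then confirm the install with podman --version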
Step-By-Step Tutorial
Step 1, Ollama. Within our non-admin account, we’re going to pull the container image for Ollama and run it with one of the commands below. Each command defines the name of the container, a flag that replaces an existing container with a new one, a flag that tells Podman to restart the container if it exits, the GPU devices to pass through (where applicable), the TCP port to listen on, a named volume mounted at the container’s /root/.ollama data directory, and the image:
For AMD:
podman run -d \
--name ollama \
--replace \
--restart=always \
--device /dev/kfd \
--device /dev/dri \
-p 11434:11434 \
-v ollama:/root/.ollama \
docker.io/ollama/ollama:rocm
For Nvidia:
podman run -d \
--name ollama \
--replace \
--restart=always \
--device nvidia.com/gpu=all \
-p 11434:11434 \
-v ollama:/root/.ollama \
docker.io/ollama/ollama
For CPU-only:
podman run -d \
--name ollama \
--replace \
--restart=always \
-e OLLAMA_NO_CUDA=1 \
-p 11434:11434 \
-v ollama:/root/.ollama \
docker.io/ollama/ollama
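A note for Nvidia users: the --device nvidia.com/gpu=all syntax relies on the Container Device Interface (CDI). If the Nvidia command above complains about an unknown device, you likely still need to generate a CDI spec with the NVIDIA Container Toolkit; this is my understanding of the usual command, so check the toolkit documentation for your distribution:
sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml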
Now let’s verify that the Podman Ollama container is working with curl localhost:11434
This should print “Ollama is running” in the console.
Or typing http://localhost:11434 in a web browser on your host machine should print the same message. If Ollama is not working, make sure the port is not being blocked by your firewall settings.
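One way to confirm that something is actually listening on the port before blaming the firewall is ss; this is purely a diagnostic sketch:
ss -tlnp | grep 11434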
Now we can get a shell inside our Podman Ollama container with
podman exec -it ollama /bin/bash and interact with Ollama as usual.
You will need to already have some LLMs downloaded for the next step to work.
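If you don’t have any models yet, you can pull one through the running container. The model name here (llama3) is only an example; substitute whichever model from the Ollama library you actually want:
podman exec -it ollama ollama pull llama3
podman exec -it ollama ollama list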
Step 2, WebUI. Now for the easier part: let’s pull the container for WebUI, which will automatically connect to the Ollama container. The --network=host flag tells Podman to share the host’s network, so WebUI is reachable on localhost:
podman run -d \
--name open-webui \
--restart=always \
--network=host \
-v open-webui:/app/backend/data \
-e OLLAMA_BASE_URL=http://127.0.0.1:11434 \
ghcr.io/open-webui/open-webui:main
Now curl localhost:8080 should print out the HTML code in the console, or typing the address in the browser should open WebUI!
Now verify that both the ollama and open-webui containers are running with podman ps
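If either container is missing from that list, checking its logs is a sensible first step:
podman logs ollama
podman logs open-webui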
We should now have the same working WebUI experience, but running in containers. Follow the standard process for creating a new Admin account.
Step 3, Systemd. We now have our setup working, but let’s create Systemd services within our rootless account so both containers run at boot. Let’s make our directory:
mkdir -p ~/.config/systemd/user
Now we generate our service files:
podman generate systemd --new --name ollama --files
podman generate systemd --new --name open-webui --files
Now move the systemd files to the directory, still within our rootless account:
mv container-ollama.service ~/.config/systemd/user/
mv container-open-webui.service ~/.config/systemd/user/
Finally, enable the systemd services at boot so the Podman containers are always running:
systemctl --user enable container-ollama.service
systemctl --user enable container-open-webui.service
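If you’d rather not wait for a reboot, you can also start the services right away; a minimal sketch, after reloading the user systemd manager so it picks up the new unit files:
systemctl --user daemon-reload
systemctl --user start container-ollama.service container-open-webui.service
systemctl --user status container-ollama.service container-open-webui.service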
loginctl enable-linger [nonroot user]
Conclusion. We now have a basic setup with two containers running at boot and isolated from the main system. This is designed to involve only the basics of Podman, so it should be a great introductory guide to Podman + AI.