OpenStack End-to-End Setup -- Kolla-Ansible, OVN Networking, Ceph Storage, and Multi-Tenancy
Standalone Infrastructure | Component: OpenStack 2024.2 (Dalmatian) via Kolla-Ansible | Audience: Private Cloud Architects, Senior Infrastructure Engineers
OpenStack is the most widely deployed open source private cloud platform running in production at scale. It's also the most complex infrastructure platform in this series by a significant margin, and it's worth saying that upfront. A properly designed OpenStack deployment can run hundreds of thousands of VMs across dozens of sites. Getting there requires making good decisions about control plane topology, networking architecture, storage backend selection, and the multi-tenancy model before you run a single command.
This article covers a production-quality OpenStack deployment using Kolla-Ansible, which is the current recommended deployment method for OpenStack. PackStack is deprecated and no longer suitable for production. The OpenStack release covered here is 2024.2 (Dalmatian), the most recent stable release at time of writing. The article covers control plane architecture, Kolla-Ansible deployment, networking with OVN, storage with Ceph via Cinder, VM and image management, multi-tenancy with projects and quotas, and OpenShift on OpenStack considerations.
1. Hardware Requirements and Pre-Install Decisions
Node Roles
An OpenStack deployment has distinct node roles. In a lab you can run everything on one machine. In production, these roles run on separate servers for scale and blast radius isolation.
| Role | Services | Minimum Specs | Notes |
|---|---|---|---|
| Controller | Keystone, Glance, Nova API, Neutron Server, Cinder API, Horizon, RabbitMQ, MariaDB, Memcached, HAProxy | 3 nodes (HA), 8 vCPU, 32 GB RAM, 100 GB OS disk | Three controllers minimum for HA. RabbitMQ and MariaDB require three nodes for quorum. HAProxy balances API traffic across them. |
| Compute | Nova Compute, Neutron OVN agent, Ceph client | 2+ nodes, 16+ cores, 64+ GB RAM | Add compute nodes to scale VM capacity. No maximum. Each compute node runs the hypervisor (KVM) and Nova Compute daemon. |
| Network (Edge) | OVN Northbound/Southbound DBs, OVN Gateway chassis | 2+ nodes, 4 cores, 16 GB RAM | Handles North South routing between tenant networks and the physical network. Can run on controller nodes in smaller deployments. |
| Storage (Ceph) | Ceph MON, MGR, OSD | 3+ nodes, 4 cores, 16 GB RAM, dedicated SSDs per OSD | Separate Ceph cluster recommended for production. Ceph provides Cinder block storage and Glance image storage. |
Network Requirements
OpenStack requires multiple networks. Three is the bare minimum; the four below are the practical floor for a Ceph-backed deployment. In a proper production deployment, five or more isolated networks reduce the blast radius of any single network failure and keep different traffic types from interfering with each other.
- Management/API network: All OpenStack API calls, service-to-service communication, RabbitMQ, MariaDB replication. All controller, compute, and storage nodes need this network.
- Tunnel network: Carries the OVN Geneve overlay traffic between compute nodes for tenant VM-to-VM communication. Dedicated 10 GbE or better. Keep it separate from management.
- Provider/external network: Connects OVN gateway to the physical network for floating IPs and North South routing. Requires a NIC on gateway nodes that connects directly to the external network without a Linux bridge getting in the way.
- Storage network: Ceph OSD-to-OSD replication and compute-to-Ceph client traffic. 10 GbE minimum, 25 GbE preferred.
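On Ubuntu hosts, the interface-to-network mapping is usually expressed in netplan before Kolla-Ansible runs. A sketch for one node, assuming `eno1` through `eno4` carry the four networks above (interface names, addresses, and the netplan filename are illustrative):

```yaml
# /etc/netplan/01-openstack.yaml (illustrative interface names and addressing)
network:
  version: 2
  ethernets:
    eno1:                      # management/API network
      addresses: [10.0.100.11/24]
      routes:
        - to: default
          via: 10.0.100.1
      nameservers:
        addresses: [10.0.100.2]
    eno2:                      # provider/external - no IP; OVN attaches it to the provider bridge
      dhcp4: false
    eno3:                      # tunnel network (Geneve overlay)
      addresses: [10.0.101.11/24]
      mtu: 9000                # jumbo frames leave room for encapsulation overhead
    eno4:                      # storage network (Ceph client traffic)
      addresses: [10.0.102.11/24]
      mtu: 9000
```

The key constraint is the one noted above: the provider/external interface stays unaddressed so OVN can bridge it directly.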
2. Control Plane Overview: The Core Services
Before deploying, understand what each service does. Most OpenStack operational questions come down to knowing which service owns which function and where to look when something breaks.
| Service | What It Does | Where Failures Show Up |
|---|---|---|
| Keystone | Identity and authentication. Issues tokens for all API calls. Every other service validates tokens against Keystone. | Login failures, "401 Unauthorized" on any API call, service-to-service authentication errors. |
| Nova | Compute management. Schedules VMs to compute nodes, manages VM lifecycle (start, stop, migrate), talks to the hypervisor via libvirt. | VM launch failures, migration errors, compute capacity reporting issues. |
| Neutron | Networking. Manages virtual networks, routers, floating IPs, security groups, and the OVN integration that enforces them on compute nodes. | VM network connectivity failures, floating IP failures, security group enforcement issues. |
| Cinder | Block storage. Manages volume lifecycle: create, attach, detach, snapshot, resize. Talks to the storage backend (Ceph RBD, LVM, or commercial arrays). | Volume attach failures, snapshot failures, storage quota errors. |
| Glance | Image service. Stores and retrieves VM images (qcow2, raw, ISO). Nova pulls images from Glance when launching a VM. | VM launch failures reporting "image not found"; slow VM launches when large images sit on slow backends. |
| Horizon | Web dashboard. A thin UI layer over the OpenStack APIs. Everything in Horizon can also be done via CLI or direct API calls. | Horizon failures usually indicate a Keystone token issue or memcached cache corruption. The APIs still work when Horizon doesn't. |
| RabbitMQ | Message queue. Services communicate with each other by sending messages via RabbitMQ. A RabbitMQ failure stops inter-service communication. | Slow operations, delayed VM state changes, eventual complete stoppage of most operations. |
| MariaDB (Galera) | Database for all services except Swift. Runs as a three-node Galera cluster in HA deployments. | Any service failure. When MariaDB is down, no writes succeed anywhere. |
3. Deploying with Kolla-Ansible
Kolla-Ansible is the recommended production deployment tool for OpenStack. It deploys OpenStack services as Docker containers managed by Ansible. Every service runs in its own container with a consistent configuration across all nodes. Updates are container image swaps rather than package upgrades, which significantly simplifies the upgrade path.
PackStack is no longer recommended for production. It installs OpenStack services directly on the host OS via RPM packages, which creates complex dependency chains and makes upgrades risky. If you're evaluating PackStack for a lab, that's fine. Don't use it for anything you intend to operate long-term.
Install Kolla-Ansible from PyPI pinned to the versioned release rather than pulling from the stable branch tip via git. The stable branch can contain unreleased commits that haven't completed the full test cycle. Pinning to a PyPI release gives you a tested, reproducible deployment baseline. Kolla-Ansible 18.x maps to OpenStack 2024.2 (Dalmatian).
```shell
# Deploy node: a separate machine or the first controller
# that orchestrates the Kolla-Ansible playbooks

# Install in a Python virtualenv to avoid dependency conflicts
python3 -m venv /opt/kolla-venv
source /opt/kolla-venv/bin/activate

# Install Kolla-Ansible for OpenStack 2024.2 (Dalmatian)
# Pin to the PyPI release rather than the stable branch tip to avoid
# unreleased commits landing in your deployment
pip install 'kolla-ansible==18.4.0'

# Create the Kolla config directory
mkdir -p /etc/kolla
chown $USER:$USER /etc/kolla

# Copy the example configuration files
cp -r $(python3 -c "import kolla_ansible; print(kolla_ansible.__path__[0])")/etc_examples/kolla/* /etc/kolla/

# Copy the multinode inventory template
cp $(python3 -c "import kolla_ansible; print(kolla_ansible.__path__[0])")/ansible/inventory/multinode .

# Install Ansible dependencies
kolla-ansible install-deps
```
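The copied multinode inventory is then edited to map hosts to roles. A trimmed sketch, assuming the hostnames below (the real template contains many more child groups that can be left at their defaults):

```ini
# multinode inventory (trimmed; hostnames and addresses are illustrative)
[control]
ctl01 ansible_host=10.0.100.11
ctl02 ansible_host=10.0.100.12
ctl03 ansible_host=10.0.100.13

[network]
# OVN gateway chassis; co-located with the controllers in this sketch
ctl0[1:3]

[compute]
cmp01 ansible_host=10.0.100.21
cmp02 ansible_host=10.0.100.22

[monitoring]
ctl01

[storage]
# populate only if Kolla manages storage services on these hosts

[deployment]
localhost ansible_connection=local
```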
```yaml
# /etc/kolla/globals.yml - critical settings to configure before deployment

# OpenStack release
kolla_base_distro: "ubuntu"
openstack_release: "2024.2"

# Networking: set to the management interface on controller/compute nodes
network_interface: "eno1"

# The interface handed to OVN for the provider/external network.
# It must carry no IP address; Kolla attaches it to the provider bridge.
neutron_external_interface: "eno2"

# Optional: dedicate an interface to the Geneve overlay
# (defaults to network_interface when unset)
# tunnel_interface: "eno3"

# The VIP used to access the OpenStack APIs (load-balanced across controllers)
kolla_internal_vip_address: "10.0.100.50"

# Enable High Availability (requires 3 controller nodes)
enable_haproxy: "yes"
enable_keepalived: "yes"

# Networking backend: OVN is the current recommended backend
neutron_plugin_agent: "ovn"

# Storage: use Ceph for Cinder and Glance
enable_cinder: "yes"
cinder_backend_ceph: "yes"
glance_backend_ceph: "yes"
ceph_glance_pool_name: "glance"
ceph_cinder_pool_name: "cinder-volumes"

# Enable Horizon dashboard
enable_horizon: "yes"

# TLS for the API endpoints (recommended for production)
kolla_enable_tls_internal: "yes"
kolla_enable_tls_external: "yes"
```
```shell
source /opt/kolla-venv/bin/activate

# Generate cryptographic passwords for all services
kolla-genpwd

# Bootstrap the target servers (installs Docker, sets sysctl, configures NTP)
kolla-ansible -i multinode bootstrap-servers

# Run prechecks - fix everything flagged before proceeding
kolla-ansible -i multinode prechecks

# Deploy OpenStack
kolla-ansible -i multinode deploy

# After deployment, generate the admin credentials file
kolla-ansible -i multinode post-deploy

# Source the admin credentials
source /etc/kolla/admin-openrc.sh

# Verify the deployment
openstack endpoint list
openstack compute service list
openstack network agent list
```
4. Networking with OVN
OVN (Open Virtual Network) is the current standard networking backend for OpenStack Neutron, replacing the older OVS based ML2 plugin. OVN implements distributed routing: every compute node participates in routing decisions rather than routing all traffic through a centralized network node. This eliminates the network node as a single point of failure and scales routing throughput with the number of compute nodes.
Core Networking Concepts
- Provider networks: Directly connected to the physical network infrastructure. VMs on a provider network get IPs from an external DHCP server or from fixed IPs assigned via Nova/Neutron. Provider networks bypass the overlay entirely. Use provider networks when VMs need direct layer 2 access to the physical network.
- Tenant (self-service) networks: Private networks created by tenants using Geneve overlay encapsulation. VMs on tenant networks are isolated from each other and from the physical network unless a router and floating IP are configured. Each tenant creates their own network topology independently of other tenants.
- Routers: Neutron routers connect tenant networks to provider networks for North South traffic. With OVN, routing is distributed: the router is implemented on each compute node that has VMs connected to it, rather than on a dedicated network node.
- Floating IPs: Public IP addresses allocated from a provider network pool and associated with a specific VM's private tenant network IP. Incoming traffic to the floating IP is DNAT'd to the private IP. The VM itself only sees its private tenant network IP.
- Security groups: Stateful firewall rules applied per virtual port (VM NIC). OVN implements security groups in the OVS flow tables on each compute node. Rules apply to ingress and egress traffic independently. The default behavior is deny all ingress, allow all egress unless you override it.
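One practical consequence of the Geneve overlay: encapsulation headers eat into every frame. With ML2/OVN on a standard 1500-byte underlay, instances on tenant networks are advertised a 1442-byte MTU, reflecting 58 bytes of outer IPv4, UDP, and Geneve headers plus OVN's metadata option. A quick sketch of the arithmetic; the jumbo-frame figure assumes your tunnel network supports a 9000-byte MTU:

```shell
# Geneve overhead on an IPv4 underlay as accounted by ML2/OVN:
# 20 (outer IPv4) + 8 (UDP) + 30 (Geneve header + OVN metadata option) = 58 bytes
GENEVE_OVERHEAD=58

# Standard 1500-byte underlay: instances see a reduced MTU
UNDERLAY_MTU=1500
TENANT_MTU=$((UNDERLAY_MTU - GENEVE_OVERHEAD))
echo "tenant MTU on ${UNDERLAY_MTU}-byte underlay: ${TENANT_MTU}"   # 1442

# A jumbo-frame tunnel network keeps the tenant MTU comfortably above 1500
JUMBO_MTU=9000
JUMBO_TENANT_MTU=$((JUMBO_MTU - GENEVE_OVERHEAD))
echo "tenant MTU on ${JUMBO_MTU}-byte underlay: ${JUMBO_TENANT_MTU}"
```

This is why the tunnel network above is a good candidate for jumbo frames: it spares guests from path-MTU surprises.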
```shell
source /etc/kolla/admin-openrc.sh   # or source your project credentials

# Create a tenant private network
openstack network create tenant-net-01
openstack subnet create \
  --network tenant-net-01 \
  --subnet-range 192.168.100.0/24 \
  --gateway 192.168.100.1 \
  --dns-nameserver 8.8.8.8 \
  tenant-subnet-01

# Create a router and attach the tenant network
openstack router create tenant-router-01
openstack router add subnet tenant-router-01 tenant-subnet-01

# Set the router's external gateway to the provider network
openstack router set \
  --external-gateway public \
  tenant-router-01

# Allocate a floating IP from the public network pool and capture its address
FIP=$(openstack floating ip create public -f value -c floating_ip_address)

# Launch a VM
openstack server create \
  --image ubuntu-22.04 \
  --flavor m1.small \
  --network tenant-net-01 \
  --key-name my-keypair \
  test-vm-01

# Associate the floating IP with the VM
openstack server add floating ip test-vm-01 "$FIP"
```
5. Storage: Cinder and Ceph Integration
Cinder provides block storage volumes that attach to VMs like virtual hard disks. The underlying storage backend determines the performance, reliability, and features available. Ceph RBD is the recommended backend for production OpenStack deployments: it provides thin provisioning, snapshots, cloning, and multipath access without a dedicated storage appliance.
Connecting Cinder to an Existing Ceph Cluster
```shell
# On your existing Ceph cluster, create pools and users for OpenStack
ceph osd pool create cinder-volumes 64 64
ceph osd pool create glance 32 32
ceph osd pool create nova-vms 64 64   # for Nova ephemeral storage if needed

# Tag the pools with the rbd application (required by recent Ceph releases)
ceph osd pool application enable cinder-volumes rbd
ceph osd pool application enable glance rbd
ceph osd pool application enable nova-vms rbd

# Create a Ceph user for Cinder
ceph auth get-or-create client.cinder \
  mon 'profile rbd' \
  osd 'profile rbd pool=cinder-volumes, profile rbd pool=nova-vms, profile rbd-read-only pool=glance' \
  -o /etc/ceph/ceph.client.cinder.keyring

# Create a Ceph user for Glance
ceph auth get-or-create client.glance \
  mon 'profile rbd' \
  osd 'profile rbd pool=glance' \
  -o /etc/ceph/ceph.client.glance.keyring

# Copy the keyring and ceph.conf to the Kolla config directories
mkdir -p /etc/kolla/config/cinder/cinder-volume
mkdir -p /etc/kolla/config/glance
cp /etc/ceph/ceph.conf /etc/kolla/config/cinder/cinder-volume/
cp /etc/ceph/ceph.client.cinder.keyring /etc/kolla/config/cinder/cinder-volume/
cp /etc/ceph/ceph.conf /etc/kolla/config/glance/
cp /etc/ceph/ceph.client.glance.keyring /etc/kolla/config/glance/
```
Volume Types
Cinder volume types map to different storage backends or configurations. You create volume types as an admin and users select them when creating volumes. A common pattern: a "performance" volume type that maps to an all-flash Ceph pool, and a "standard" volume type that maps to a hybrid pool. Users choose the right tier for their workload. Quota enforcement applies per volume type, so you can limit how much premium storage any project can consume.
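Under Kolla, extra backend definitions go into a cinder.conf overlay in /etc/kolla/config, which Kolla merges into the rendered configuration. A sketch of the two-tier pattern, assuming a second all-flash pool named `cinder-volumes-flash` exists on the Ceph cluster (pool and backend names are illustrative):

```ini
# /etc/kolla/config/cinder/cinder-volume.conf (merged into cinder.conf by Kolla)
[DEFAULT]
enabled_backends = rbd-standard,rbd-performance

[rbd-standard]
volume_driver = cinder.volume.drivers.rbd.RBDDriver
volume_backend_name = rbd-standard
rbd_pool = cinder-volumes
rbd_user = cinder
rbd_ceph_conf = /etc/ceph/ceph.conf

[rbd-performance]
volume_driver = cinder.volume.drivers.rbd.RBDDriver
volume_backend_name = rbd-performance
rbd_pool = cinder-volumes-flash      # hypothetical all-flash pool
rbd_user = cinder
rbd_ceph_conf = /etc/ceph/ceph.conf
```

Each volume type is then tied to a backend via its `volume_backend_name` property, e.g. `openstack volume type create --property volume_backend_name=rbd-performance performance`.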
6. VM Management via Horizon and CLI
Horizon is the web dashboard and the right interface for operators doing infrequent tasks or showing the environment to stakeholders. The OpenStack CLI and direct API calls are the right tools for anything you do more than once. Horizon is a thin wrapper over the APIs with limited filtering and batch operation support. Everything Horizon can do, the CLI can do faster with better output formatting.
```shell
source /etc/kolla/admin-openrc.sh

# List all servers across all projects (admin)
openstack server list --all-projects

# Get detailed information about a specific VM
openstack server show test-vm-01

# Live migrate a VM to a specific compute node
openstack server migrate --live-migration --host compute-node-02 test-vm-01

# Create a volume snapshot
openstack volume snapshot create \
  --volume my-data-volume \
  --force \
  my-data-snapshot-$(date +%Y%m%d)

# Create a VM image from a server (shut the server down first for a consistent image)
openstack server image create \
  --name "web-server-template-v1" \
  test-vm-01

# Show quotas for a project
openstack quota show my-project

# Update quotas for a project
openstack quota set \
  --cores 100 \
  --ram 204800 \
  --instances 50 \
  --volumes 100 \
  --gigabytes 10000 \
  my-project
```
7. Multi-Tenancy: Projects, Users, and Quotas
Multi-tenancy in OpenStack is built around projects (historically called tenants). A project is an isolated namespace with its own network topology, VMs, volumes, images, and quota limits. Users belong to one or more projects with role based access control determining what they can do within each project.
Role Hierarchy
- admin: Cloud administrator. Can see and manage all projects, resources, and quotas; sets policies, adds compute nodes, and manages Ceph backends.
- member: Standard project user. Can create and manage VMs, volumes, and networks within their assigned project. Can't see other projects' resources.
- reader: Read-only access within a project. Can view resource states but can't create or modify anything.
```shell
source /etc/kolla/admin-openrc.sh

# Create a new project
openstack project create \
  --description "Development team project" \
  --enable \
  dev-team-project

# Create a user and assign to the project
openstack user create \
  --project dev-team-project \
  --password "SecurePassword123!" \
  --enable \
  dev-user-01

openstack role add \
  --project dev-team-project \
  --user dev-user-01 \
  member

# Set reasonable quotas for the project
openstack quota set \
  --cores 40 \
  --ram 81920 \
  --instances 20 \
  --volumes 50 \
  --gigabytes 5000 \
  --floating-ips 10 \
  --security-groups 20 \
  dev-team-project

# Verify the quotas
openstack quota show dev-team-project
```
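With the user created, they need project-scoped credentials for the CLI. A minimal openrc sketch using the standard `OS_*` environment variables (the VIP, region, and password values are illustrative):

```shell
# dev-openrc.sh - project-scoped credentials for dev-user-01 (illustrative values)
export OS_AUTH_URL=https://10.0.100.50:5000/v3   # Keystone behind the internal VIP
export OS_IDENTITY_API_VERSION=3
export OS_PROJECT_DOMAIN_NAME=Default
export OS_USER_DOMAIN_NAME=Default
export OS_PROJECT_NAME=dev-team-project
export OS_USERNAME=dev-user-01
export OS_PASSWORD='SecurePassword123!'          # better: prompt with 'read -s' instead
export OS_REGION_NAME=RegionOne
```

After `source dev-openrc.sh`, commands like `openstack server list` operate within dev-team-project only, scoped by the member role assigned above.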
8. OpenShift on OpenStack
Running OpenShift on OpenStack is a supported and common pattern at organizations that have invested in OpenStack for their private cloud and want to run OpenShift workloads on the same infrastructure. OpenShift treats OpenStack as a cloud provider, using Nova for VM provisioning, Cinder for persistent volumes, Neutron for VM networking, and Octavia for load balancers.
Prerequisites
- Octavia load balancer service must be deployed in OpenStack. OpenShift uses Octavia to provision load balancers for the API server and for OpenShift Route and Service objects. Deploying OpenShift on OpenStack without Octavia requires manual load balancer configuration that's significantly more complex to maintain.
- The OpenStack project that will host OpenShift needs quotas sized for the cluster: typically 30+ cores, 96+ GB RAM, and 1 TB+ of Cinder storage for the initial three-master, three-worker deployment. Quotas that are too small cause cryptic installation failures late in the IPI deployment process.
- Floating IPs must be available. OpenShift IPI on OpenStack allocates floating IPs for the API VIP and the Ingress VIP automatically. Have at least two floating IPs pre-allocated or ensure the quota allows their creation.
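Because undersized quotas fail late and cryptically, it's worth tallying the installer's footprint against the project quota before starting. A rough sketch using illustrative flavor sizes (4 vCPU/16 GB masters, 2 vCPU/8 GB workers) and assuming the temporary bootstrap VM matches the master flavor:

```shell
# Flavor sizes assumed for this sketch (vCPU / GB RAM)
MASTER_CPU=4;  MASTER_RAM=16   # master flavor
WORKER_CPU=2;  WORKER_RAM=8    # worker flavor
BOOT_CPU=4;    BOOT_RAM=16     # bootstrap VM, deleted after install completes

MASTERS=3
WORKERS=3

# Peak consumption during installation (bootstrap coexists with the masters)
CORES=$((MASTERS * MASTER_CPU + WORKERS * WORKER_CPU + BOOT_CPU))
RAM=$((MASTERS * MASTER_RAM + WORKERS * WORKER_RAM + BOOT_RAM))

echo "minimum cores during install: ${CORES}"    # 22
echo "minimum RAM during install:   ${RAM} GB"   # 88
```

The 30+ core / 96+ GB guidance above adds headroom on top of this floor for day-2 scaling and larger flavors.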
IPI Installation
OpenShift IPI (Installer Provisioned Infrastructure) on OpenStack reads your clouds.yaml file for OpenStack credentials and handles all VM provisioning automatically. The install-config.yaml specifies the platform as openstack with the cloud name, the external network for floating IPs, the compute flavor for master and worker nodes, and the number of replicas. The openshift-install program then creates the VMs, configures networking, and bootstraps the cluster. You don't manually create VMs or configure Nova.
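The clouds.yaml entry the installer reads might look like this (endpoint, project, and credential values are illustrative):

```yaml
# ~/.config/openstack/clouds.yaml (illustrative values)
clouds:
  myopenstack:
    auth:
      auth_url: https://10.0.100.50:5000/v3
      username: ocp-installer
      password: "..."
      project_name: ocp-project
      user_domain_name: Default
      project_domain_name: Default
    region_name: RegionOne
```

The cloud name here is what install-config.yaml references in its `platform.openstack.cloud` field.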
```yaml
apiVersion: v1
baseDomain: yourdomain.local
metadata:
  name: ocp-cluster-01
platform:
  openstack:
    cloud: myopenstack          # matches entry in clouds.yaml
    externalNetwork: public     # the provider network with floating IPs
    defaultMachinePlatform:
      type: m1.xlarge           # flavor: 4 vCPU, 16 GB RAM minimum for masters
pullSecret: '{"auths": ...}'    # from Red Hat pull secret page
sshKey: |
  ssh-rsa AAAA... # your SSH public key
controlPlane:
  name: master
  replicas: 3
  platform:
    openstack:
      type: m1.xlarge
compute:
- name: worker
  replicas: 3
  platform:
    openstack:
      type: m1.large            # 2 vCPU, 8 GB RAM minimum per worker
```
Key Takeaways
- PackStack is deprecated. Don't use it for production. Kolla-Ansible is the current recommended deployment tool. It deploys OpenStack services as Docker containers, making upgrades container image swaps rather than package dependency nightmares.
- Run kolla-ansible prechecks and fix every failure before running deploy. A partially failed deployment requires manual cleanup. Prechecks are not optional.
- OVN is the current standard networking backend, replacing the older OVS ML2 plugin. OVN implements distributed routing on every compute node, eliminating the network node as a single point of failure.
- Three controllers are the production minimum for HA. RabbitMQ and MariaDB Galera both require three nodes for quorum. A two-controller deployment has no quorum on these critical services and isn't truly HA.
- Ceph RBD is the recommended Cinder and Glance backend for production. It provides thin provisioning, snapshots, cloning, and multi-node redundancy without a dedicated storage appliance. Size Ceph pools before running kolla-ansible deploy and configure the keyring files in /etc/kolla/config before deployment.
- Horizon is a thin wrapper over the APIs. Use the OpenStack CLI for any operation you do more than once. Horizon has limited filtering, no bulk operations, and adds latency. The CLI is faster for every operational task.
- OpenShift IPI on OpenStack requires Octavia. Without it, load balancer provisioning for the API server and ingress controller fails late in the installation, which is one of the most painful places for an IPI install to fail. Deploy Octavia in OpenStack before starting the OpenShift installation.