Veeam v13: SOSAPI Deep Dive -- How Smart Object Storage Integration Actually Works
Veeam v13 Series | Component: VBR v13 | Audience: Enterprise Architects, Hands-on Sysadmins, Storage Engineers
Most people who use S3 object storage with Veeam don't know that two completely different levels of integration exist. There's S3 Compatible, which is what you get when you point VBR at any standard S3 bucket. And there's S3 Integrated, which is what you get when the storage platform has implemented Veeam's Smart Object Storage API. The difference isn't cosmetic. It changes what VBR knows about the storage, how it places data, how it scales across nodes, and how much operational visibility you have into what's actually happening.
SOSAPI is one of those features where understanding how it actually works under the hood makes you a meaningfully better architect. It's not complicated once you see the mechanism. This article covers exactly that: what the API is, how it works at the file level, what each capability does for you in practice, which vendors have implemented it, and what you lose when you're working with storage that doesn't support it.
1. The Problem SOSAPI Solves
S3 is a great protocol for storing and retrieving objects. It wasn't designed to tell backup software how full a storage system is, which node in a cluster has capacity available, or whether the platform is healthy. From VBR's perspective, a standard S3 bucket is essentially a black box. You can write to it and read from it, but you can't ask it questions. VBR has to make capacity decisions blind. It has to distribute data across nodes without knowing which nodes have capacity. It shows "unlimited" in the repository list because it has no idea what the actual storage size is.
That's fine when the S3 target is a cloud provider like AWS or Wasabi with effectively unlimited capacity. It becomes a real operational problem when the S3 target is an on-premises appliance with finite capacity and multiple nodes that need to be balanced. You don't find out you're out of space until a backup job fails. You don't know which node is taking all the load until you look at your storage system's own monitoring. VBR is flying blind.
SOSAPI fixes this by giving on-premises object storage vendors a way to publish information about themselves inside the bucket, using the S3 API itself. No new ports. No plugins. No agents. VBR reads it like it reads any S3 object. The storage maintains the files. VBR polls them. They talk.
2. How It Actually Works: The XML Files
This is the part that's genuinely clever about SOSAPI. Instead of adding new API endpoints or requiring a sidecar service, Veeam extended the S3 interaction by using ordinary S3 GET operations against a specific folder and a set of XML files that the storage vendor maintains in the bucket root.
When a SOSAPI capable storage platform creates a bucket, it creates a hidden folder at the bucket root with a fixed name: .system-d26a9498-cb7c-4a87-a44a-8ae204f5ba6c. That GUID is part of the SOSAPI specification and is the same across all vendors. Inside that folder, two core XML files live:
- system.xml declares what the platform is and which SOSAPI capabilities it supports. VBR reads this file first to understand what the storage is capable of.
- capacity.xml contains the current total capacity, used space, and available space in bytes. The storage system updates this file continuously as data is written and deleted.
VBR polls these files every 4 minutes. On each cycle it issues a standard S3 GET against capacity.xml and updates its own capacity tracking from the result. That's the entire mechanism. From the S3 protocol's perspective, it's just a file read. From VBR's perspective, it's a live feed of storage health.
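To make the mechanism concrete, here is a minimal Python sketch of one polling cycle. It is not Veeam's code. The `poll_once` helper and the `get_object` callable are hypothetical stand-ins for whatever S3 client you use, and the dictionary plays the role of the bucket. The only values taken from this article are the folder name, the two file names, and the 4 minute interval.

```python
from typing import Callable

# Fixed folder name quoted in this article; the same across SOSAPI vendors.
SOSAPI_FOLDER = ".system-d26a9498-cb7c-4a87-a44a-8ae204f5ba6c"
POLL_INTERVAL_SECONDS = 4 * 60  # the 4 minute cycle described above


def sosapi_key(filename: str) -> str:
    """Return the object key of a SOSAPI file under the hidden folder."""
    return f"{SOSAPI_FOLDER}/{filename}"


def poll_once(get_object: Callable[[str], bytes]) -> dict[str, bytes]:
    """Read both core files with plain GET calls and return the raw bodies.

    `get_object` is a hypothetical hook: any function that takes an object
    key and returns the object body (for example a thin wrapper around an
    S3 client's GetObject call).
    """
    return {
        name: get_object(sosapi_key(name))
        for name in ("system.xml", "capacity.xml")
    }


# Stand-in for a real bucket so the sketch runs without network access.
fake_bucket = {
    sosapi_key("system.xml"): b"<SystemInfo/>",
    sosapi_key("capacity.xml"): b"<CapacityInfo/>",
}

snapshot = poll_once(fake_bucket.__getitem__)
print(sorted(snapshot))  # ['capacity.xml', 'system.xml']
```

A real poller would call `poll_once` in a loop and sleep for `POLL_INTERVAL_SECONDS` between cycles. The point of the sketch is that nothing beyond object reads is involved.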
What system.xml Looks Like
Here's a representative system.xml showing a storage platform declaring its capabilities:
<?xml version="1.0" encoding="UTF-8" ?>
<SystemInfo>
  <ProtocolVersion>"1.0"</ProtocolVersion>
  <ModelName>"Ootbi v4.0"</ModelName>
  <ProtocolCapabilities>
    <CapacityInfo>true</CapacityInfo>
    <UploadSessions>false</UploadSessions>
    <IAMSTS>false</IAMSTS>
    <SmartEntity>true</SmartEntity>
  </ProtocolCapabilities>
</SystemInfo>
Each capability is declared true or false independently. A platform can support CapacityInfo without supporting SmartEntity. VBR enables only the capabilities the platform declares. You don't configure anything in VBR to activate them. When VBR reads system.xml and sees CapacityInfo is true, it starts reading capacity.xml. When it sees SmartEntity is true, it activates the Smart Entity routing workflow. It's automatic.
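If you want to inspect a system.xml yourself, the following Python sketch parses the sample shown earlier in this section and lists the capabilities that are switched on. The `parse_system_xml` function and the shape of its return value are my own illustration, not part of the SOSAPI specification. It assumes the element names match the sample and that the literal double quotes around the version and model values should be stripped.

```python
import xml.etree.ElementTree as ET

# The representative system.xml from this article, as raw bytes.
SYSTEM_XML = b"""<?xml version="1.0" encoding="UTF-8" ?>
<SystemInfo>
  <ProtocolVersion>"1.0"</ProtocolVersion>
  <ModelName>"Ootbi v4.0"</ModelName>
  <ProtocolCapabilities>
    <CapacityInfo>true</CapacityInfo>
    <UploadSessions>false</UploadSessions>
    <IAMSTS>false</IAMSTS>
    <SmartEntity>true</SmartEntity>
  </ProtocolCapabilities>
</SystemInfo>"""


def parse_system_xml(body: bytes) -> dict:
    """Parse a system.xml body into a plain dictionary (illustrative only)."""
    root = ET.fromstring(body)
    capabilities = {}
    caps = root.find("ProtocolCapabilities")
    if caps is not None:
        for element in caps:
            # Each capability is an independent true/false flag.
            text = (element.text or "").strip().lower()
            capabilities[element.tag] = text == "true"
    return {
        # The sample wraps these values in literal double quotes.
        "protocol_version": (root.findtext("ProtocolVersion") or "").strip().strip('"'),
        "model_name": (root.findtext("ModelName") or "").strip().strip('"'),
        "capabilities": capabilities,
    }


info = parse_system_xml(SYSTEM_XML)
enabled = sorted(name for name, on in info["capabilities"].items() if on)
print(info["model_name"], enabled)  # Ootbi v4.0 ['CapacityInfo', 'SmartEntity']
```

Anything not declared true is treated as off, which mirrors the behavior described above: VBR only enables what the platform advertises.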
What capacity.xml Looks Like
<?xml version="1.0" encoding="UTF-8" ?>
<CapacityInfo>
  <Capacity>192000000000000</Capacity>
  <Available>147200000000000</Available>
  <Used>44800000000000</Used>
</CapacityInfo>
All values are in bytes. The storage platform is responsible for keeping this accurate. VBR trusts the values and uses them for job scheduling, repository selection in a SOBR, and the capacity display in the console. If the platform stops updating capacity.xml, VBR is working with stale data. That's a vendor quality issue, not a Veeam configuration issue, but it's worth knowing.
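As a quick illustration of what a consumer of capacity.xml does with those numbers, here is a Python sketch that parses the sample values and turns them into something readable. The `Capacity` class and `parse_capacity_xml` function are hypothetical names for this example, and the conversion uses decimal terabytes (10^12 bytes), which is an assumption about how you'd want to display the figures.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# The sample capacity.xml from this article, as raw bytes.
CAPACITY_XML = b"""<?xml version="1.0" encoding="UTF-8" ?>
<CapacityInfo>
  <Capacity>192000000000000</Capacity>
  <Available>147200000000000</Available>
  <Used>44800000000000</Used>
</CapacityInfo>"""

TB = 10**12  # decimal terabyte; an assumption for display purposes


@dataclass(frozen=True)
class Capacity:
    total: int      # bytes
    available: int  # bytes
    used: int       # bytes

    @property
    def percent_used(self) -> float:
        """Used space as a percentage of total (0.0 if total is zero)."""
        return 100.0 * self.used / self.total if self.total else 0.0


def parse_capacity_xml(body: bytes) -> Capacity:
    """Parse a capacity.xml body; raises ValueError on missing or non-numeric fields."""
    root = ET.fromstring(body)
    values = {}
    for tag in ("Capacity", "Available", "Used"):
        text = root.findtext(tag)
        if text is None:
            raise ValueError(f"capacity.xml is missing <{tag}>")
        values[tag] = int(text.strip())
    return Capacity(values["Capacity"], values["Available"], values["Used"])


cap = parse_capacity_xml(CAPACITY_XML)
print(f"{cap.total / TB:.1f} TB total, {cap.available / TB:.1f} TB free, "
      f"{cap.percent_used:.1f}% used")
# 192.0 TB total, 147.2 TB free, 23.3% used
```

In the sample, available plus used equals total (147.2 TB + 44.8 TB = 192 TB). A platform that publishes numbers that don't add up is worth a second look on the storage side.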
3. CapacityInfo: What VBR Does With Capacity Data
Once VBR has real capacity data from capacity.xml, it uses it in three concrete ways that change how your backup environment behaves.
Console Visibility
In the Backup Repositories list, a standard S3 Compatible repository shows "Unlimited" in the capacity column because VBR has no data. An S3 Integrated repository shows the actual total capacity and free space, updated every 4 minutes. When you're creating or editing a backup job and selecting this repository, you can see whether there's space for the new job's expected footprint before you commit. With standard S3 you're guessing.
SOBR Extent Selection
This is where CapacityInfo has the biggest practical impact in production environments. When you're using multiple SOSAPI repositories as extents in the performance tier of a Scale-Out Backup Repository (SOBR), VBR uses the CapacityInfo from each extent to decide where to place incoming backup data. It doesn't have to distribute blindly. It knows that Extent A has 40 TB free and Extent B has 8 TB free, so it routes the large job to Extent A. Without CapacityInfo, VBR uses a round robin or fill and move approach that can leave you with uneven utilization and surprise capacity failures on the fullest extent.
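The Extent A and Extent B example can be written down as a toy selection rule. To be clear, this is not Veeam's placement algorithm, which this article doesn't specify. It's a sketch of the general idea of capacity aware placement: `pick_extent` is a hypothetical function that drops extents that can't fit the job and then prefers the one with the most free space.

```python
from typing import Optional

TB = 10**12  # decimal terabyte


def pick_extent(free_bytes_by_extent: dict[str, int], job_size_bytes: int) -> Optional[str]:
    """Return the extent with the most free space that can fit the job.

    Returns None when no extent has enough room, which is the case a
    capacity blind scheduler only discovers when the job fails.
    """
    candidates = {
        name: free
        for name, free in free_bytes_by_extent.items()
        if free >= job_size_bytes
    }
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


# Free space per extent, as it might be read from each extent's capacity.xml.
extents = {"Extent A": 40 * TB, "Extent B": 8 * TB}

print(pick_extent(extents, 10 * TB))  # Extent A
print(pick_extent(extents, 5 * TB))   # Extent A (both fit, A has more room)
print(pick_extent(extents, 50 * TB))  # None
```

The last call is the interesting one: with real capacity data the shortfall is visible before any data moves.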
Immutability Lock Period Enforcement
In v13, when you use the retention policy immutability mode (which ties the Object Lock window to your backup job's retention period), VBR reads the CapacityInfo to factor current available space into its retention decisions. It needs accurate capacity data to make good decisions about when restore points can be pruned after their immutability expires. Without it, these decisions are approximations.
4. SmartEntity: How Load Balancing Works
SmartEntity is the SOSAPI capability that matters most for multi-node on-premises object storage clusters. It's a two-way conversation between VBR and the storage platform that happens before each backup stream is sent.
The Sequence Step by Step
- VBR announces the entity. Before sending backup data, VBR tells the storage platform what it's about to send and approximately how much data is involved. The "entity" is the backup object: a VM name, a physical server name, a NAS file share. VBR includes the entity identifier and the expected data size in the SmartEntity request.
- The storage platform assigns a node. The storage evaluates its cluster state and responds with the specific node IP or endpoint address that should receive this data stream. It considers current load, available capacity per node, and potentially network interface throughput.
- VBR writes directly to that node. The backup data bypasses the vIP or load balancer entirely and goes straight to the assigned node. No extra network hop. No intermediary. The data lands where the storage said to put it.
- The storage updates capacity.xml. After the write, the platform updates capacity.xml so the next CapacityInfo poll reflects the new state.
In a three node cluster running six simultaneous backup jobs, the practical result is that the streams are spread across all three nodes, ideally two per node, instead of all six going to whichever node happens to be behind the vIP. You get better aggregate throughput, more even utilization, and no single node becomes a bottleneck.
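The sequence above can be simulated from the storage side with a few lines of Python. This is a toy model under loud assumptions: the real SmartEntity exchange format isn't shown in this article, each vendor decides how to weigh load and capacity, and `assign_node` is a hypothetical function that simply picks the node with the fewest active streams. It only demonstrates why six jobs against three nodes can end up two per node.

```python
from collections import Counter


def assign_node(active_streams: Counter, nodes: list[str]) -> str:
    """Pick the node with the fewest active streams and record the new stream.

    Ties are broken by node name so the result is deterministic. A real
    platform would also weigh per node capacity and network throughput.
    """
    node = min(nodes, key=lambda name: (active_streams[name], name))
    active_streams[node] += 1
    return node


nodes = ["node-1", "node-2", "node-3"]  # hypothetical node names
streams = Counter()

# Six backup entities announced one after another.
assignments = [assign_node(streams, nodes) for _ in range(6)]

print(assignments)
# ['node-1', 'node-2', 'node-3', 'node-1', 'node-2', 'node-3']
print(dict(streams))
# {'node-1': 2, 'node-2': 2, 'node-3': 2}
```

Compare that with a single vIP, where the equivalent of `assign_node` never runs and the distribution is whatever the network front end happens to do.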
SmartEntity and Load Balancers Don't Mix
This is an important constraint. SmartEntity works by returning a specific node IP to VBR. If your Ootbi cluster sits behind a load balancer, VBR sends the data to the node IP that SmartEntity specified, but the load balancer intercepts it and may redirect to a different node. That defeats the entire purpose of SmartEntity routing. Object First's architecture explicitly says don't put a load balancer in front of Ootbi if you want SmartEntity to work correctly. The same applies to any SOSAPI platform with SmartEntity.
SmartEntity and Per Object Bucket Mode
Veeam v13 introduced per object bucket mode, where VBR uses a separate bucket per machine rather than a single shared bucket. SmartEntity and per object bucket mode are incompatible. SmartEntity routing relies on a single endpoint receiving the SmartEntity announcement and returning a node assignment. Per object bucket mode changes how the backup chain is organized in ways that conflict with how SmartEntity expects to route data. If you're on a large environment considering per object bucket mode for namespace reasons, you give up SmartEntity. For most environments that's not the right trade, but it's your call to make.
5. SOSAPI Capabilities Reference
Not all SOSAPI capabilities are mandatory for a vendor to implement. A platform can ship SOSAPI support with only CapacityInfo enabled and add SmartEntity later. Here's what each capability does:
| Capability | What It Does | Required? |
|---|---|---|
| CapacityInfo | Exposes total, used, and free capacity to VBR via capacity.xml. Enables capacity display in console, SOBR extent selection, and retention decisions. | Effectively yes. Without it the integration has minimal value. |
| SmartEntity | Enables VBR to announce incoming backup entities so the storage can assign specific node endpoints. Enables direct data path routing in multi-node clusters. | No. Single node deployments don't need it. |
| UploadSessions | Allows the storage to tell VBR whether it's ready to accept uploads. VBR can pause writes if the platform signals it needs recovery time. | No. Not all SOSAPI vendors implement this. |
| IAMSTS | Enables the storage platform to handle AWS IAM Security Token Service requests, allowing temporary credentials. | No. Vendor dependent. |
6. Who Has Implemented SOSAPI
SOSAPI was introduced in Veeam v12. At launch the inaugural partners were Scality (with their Artesca platform) and Object First (Ootbi). Cloudian also shipped SOSAPI support at v12 launch. Since then the list has grown as other on-premises object storage vendors have implemented the integration.
The authoritative list is on the Veeam Ready Object Storage program page, which shows which vendors have validated their SOSAPI implementation with Veeam. Any vendor claiming SOSAPI support without being on the Veeam Ready list hasn't been through the compatibility validation process. That's worth checking before you architect around SmartEntity for a vendor you haven't deployed before.
Cloud providers like AWS S3, Wasabi, and Backblaze B2 do not implement SOSAPI, so those buckets never show as S3 Integrated in VBR. Azure Blob is outside the picture entirely, because it uses its own API rather than S3. That's expected and fine. SOSAPI is specifically for on-premises object storage where capacity visibility and multi-node routing actually matter. A cloud bucket with effectively unlimited capacity and its own internal load balancing doesn't need either of those things.
7. What You Lose Without SOSAPI
It's worth being direct about the tradeoffs when you're working with standard S3 Compatible storage instead of an S3 Integrated platform.
| Capability | S3 Compatible (no SOSAPI) | S3 Integrated (SOSAPI) |
|---|---|---|
| Capacity visibility in VBR console | Shows "Unlimited." You don't know how full it is until a job fails. | Shows actual total and free space, updated every 4 minutes. |
| SOBR extent placement | Round robin or fill and move. No capacity awareness. | Capacity aware placement routes jobs to extents with space. |
| Multi-node load balancing | All traffic goes through the endpoint or vIP. You manage distribution yourself. | SmartEntity routes each backup stream to the optimal node automatically. |
| Platform health signaling | None. VBR doesn't know if the platform is degraded. | Storage can signal via UploadSessions if it needs VBR to pause writes. |
| Capacity aware retention decisions | VBR makes retention decisions without storage context. | Retention policy immutability mode works with accurate capacity data. |
None of this means S3 Compatible storage doesn't work. It works fine. But you're giving up a layer of intelligence that changes how much operational attention your storage requires. With SOSAPI you have fewer surprises, better utilization, and more visibility. Without it, you're managing the storage system separately from VBR and reconciling the two manually.
8. Diagnosing SOSAPI Status in Your Environment
If you've added an object storage repository and you're not sure whether SOSAPI is active, here's how to confirm it.
Visual Indicators in VBR
- Blue bucket icon + "S3-integrated" type: SOSAPI is active. VBR is reading system.xml and capacity.xml. Capacity is displayed in the console.
- Green bucket icon + "S3-compatible" type: SOSAPI is not active. There are three possible causes: the platform doesn't support it, the hidden .system folder doesn't exist in the bucket, or versioning isn't enabled (for platforms where versioning is a prerequisite, like Ootbi).
Checking the Bucket Directly
If you expect SOSAPI to be active but you're seeing the green icon, connect to the bucket with an S3 browser (Cyberduck, S3 Browser, rclone) and look for the hidden folder at the bucket root. If the .system-d26a9498-cb7c-4a87-a44a-8ae204f5ba6c folder doesn't exist, the storage platform hasn't created it. If it exists but the files inside are stale (the modified timestamp on capacity.xml hasn't changed in hours), the platform has stopped updating them. Both are storage platform configuration or health issues that you'd resolve on the storage side, not in VBR.
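If you'd rather script that check than click through an S3 browser, the logic is small. The sketch below is a hedged example, not a Veeam tool: `diagnose` is a hypothetical function, its input is a plain mapping of object keys to last modified timestamps (which you'd fill from whatever bucket listing your S3 client returns), and the one hour staleness threshold is my assumption, not a documented value.

```python
from datetime import datetime, timedelta, timezone

SOSAPI_FOLDER = ".system-d26a9498-cb7c-4a87-a44a-8ae204f5ba6c"
STALE_AFTER = timedelta(hours=1)  # assumed threshold; tune for your platform


def diagnose(last_modified_by_key: dict[str, datetime], now: datetime) -> str:
    """Classify the SOSAPI state of a bucket from a listing of its keys.

    Returns one of: 'missing-folder', 'missing-capacity-xml', 'stale', 'ok'.
    """
    prefix = SOSAPI_FOLDER + "/"
    if not any(key.startswith(prefix) for key in last_modified_by_key):
        return "missing-folder"
    modified = last_modified_by_key.get(prefix + "capacity.xml")
    if modified is None:
        return "missing-capacity-xml"
    if now - modified > STALE_AFTER:
        return "stale"
    return "ok"


now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
fresh = {
    SOSAPI_FOLDER + "/system.xml": now - timedelta(days=30),
    SOSAPI_FOLDER + "/capacity.xml": now - timedelta(minutes=3),
}
stale = {**fresh, SOSAPI_FOLDER + "/capacity.xml": now - timedelta(hours=6)}

print(diagnose(fresh, now))                    # ok
print(diagnose(stale, now))                    # stale
print(diagnose({"backups/vm01.blk": now}, now))  # missing-folder
```

Either non-ok result points at the storage platform rather than at VBR, which matches the troubleshooting advice above.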
Rescanning the Repository
If you've fixed a SOSAPI issue on the storage side (enabled versioning, confirmed the .system folder exists), VBR won't automatically upgrade the integration from S3 Compatible to S3 Integrated. Right click the repository in Backup Repositories and select Rescan. VBR re-reads the bucket root and picks up the SOSAPI files if they're now present. The icon and type update immediately after a successful rescan.
Key Takeaways
- SOSAPI isn't a separate protocol. It's Veeam's extension of the S3 API using ordinary file reads against XML files in a hidden folder at the bucket root. No new ports, no plugins, no agents required.
- Two XML files do the work: system.xml declares what capabilities the platform supports. capacity.xml contains total, used, and available space in bytes, updated continuously by the storage platform. VBR polls capacity.xml every 4 minutes.
- CapacityInfo gives VBR real storage visibility: actual capacity in the console, capacity aware SOBR extent selection, and better retention decisions. Without it, VBR shows "Unlimited" and makes placement decisions blind.
- SmartEntity is VBR announcing to the storage what it's about to send and the storage responding with the specific node endpoint to send it to. Backup data bypasses the vIP and goes directly to the assigned node. Multi-node clusters get even load distribution automatically.
- SmartEntity and load balancers don't work together. SmartEntity routing tells VBR to send data to a specific node. A load balancer in front of the cluster intercepts that and can redirect to a different node, defeating the purpose.
- SmartEntity and per object bucket mode (v13) are mutually exclusive. If you need per object buckets for namespace reasons, you give up SmartEntity routing. Pick one.
- S3 Compatible (green icon) versus S3 Integrated (blue icon) is the tell. Green means VBR is flying blind. Blue means SOSAPI is active. If you expect blue and see green, either versioning isn't enabled on the bucket, or the .system folder doesn't exist at the bucket root.
- Rescanning the repository is how you trigger a SOSAPI upgrade from S3 Compatible to S3 Integrated after fixing an issue on the storage side. VBR won't pick it up automatically.