
CSI Addon

The CSI Addon ships the platform-curated Azure Blob CSI driver. Enabling it on a cluster gets you a working azureblob-default StorageClass — NFSv3-backed, ReadWriteMany-capable — wired up with a Crossplane-provisioned UserAssignedIdentity and the role grants the driver needs to mint storage accounts on demand.

For background on what an Installable / Installation actually does at the platform layer, see Installation. The rest of this page assumes you've read that.

Upstream

This addon installs and wires up the open-source kubernetes-sigs/blob-csi-driver — that project owns the actual mount path, supported StorageClass parameters, and the driver pods. The platform's value-add is the Crossplane-provisioned WorkloadIdentity + role grants, the curated azureblob-default StorageClass, and the per-cluster Installation wiring. Driver-level questions (parameters, supported features, bug reports) belong upstream.

What Lands on the Cluster

When the addon is enabled, ArgoCD reconciles a Helm release that produces:

Resource | Purpose
WorkloadIdentity (Crossplane claim) | Provisions a UAMI + Federated Credential + a kube-system/blob-csi-driver ServiceAccount annotated for Workload Identity.
EnvironmentConfig (synthetic) | Stamps an env-config the IdentityRoleAssignment composition's selectors pick up; this lets us scope role grants at the cluster RG without a sibling resource claim.
IdentityRoleAssignment × 2 | Grants the UAMI Storage Account Contributor (account create/manage + listKeys for blobfuse) and Network Contributor (subnet Microsoft.Storage service-endpoint patch at NFSv3 account creation) on the cluster RG.
StorageClass azureblob-default | Provisioner blob.csi.azure.com; NFSv3, RWX, Standard_LRS / StorageV2, allowSharedKeyAccess: false, reclaimPolicy: Retain, volumeBindingMode: Immediate.
Upstream blob-csi-driver | Controller Deployment + node DaemonSet + RBAC + CSIDriver from kubernetes-sigs/blob-csi-driver.

The driver pods bind to the chart-provisioned ServiceAccount, so the Workload Identity webhook injects the federated token automatically.
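As a rough illustration of the azureblob-default row in the table above, the rendered StorageClass might look something like the sketch below. It only uses values stated on this page (provisioner, protocol: nfs, skuName, reclaimPolicy, volumeBindingMode); the exact manifest the chart renders is an assumption, and settings such as allowSharedKeyAccess and the StorageV2 account kind are left out because their parameter keys are not given here.

```yaml
# Sketch only: assumed shape of the rendered default StorageClass.
# The chart's real output may carry extra parameters, labels, or annotations.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: azureblob-default
provisioner: blob.csi.azure.com
parameters:
  protocol: nfs          # NFSv3, which is what makes the class RWX-capable
  skuName: Standard_LRS  # chart default SKU
reclaimPolicy: Retain
volumeBindingMode: Immediate
```

Compare this against kubectl get sc azureblob-default -o yaml on a real cluster rather than treating it as authoritative.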

Add the Installation

Cluster enablement happens in the org's .platform repo. Create installations/p6m-csi.yaml:

apiVersion: p6m.dev/v1alpha1
kind: Installation
metadata:
  name: p6m-csi
spec:
  cd:
    enabled: true
    autoPromote: true
  installableRef:
    kind: Installable
    name: p6m-csi
    namespace: installables
  destinations:
    - clusterRef:
        name: <your-cluster-name>
      overrides:
        source:
          template: |
            azure:
              blobDriver:
                enabled: true

Replace <your-cluster-name> with the cluster's clusterRef.name (find it via grep clusterRef installations/*.yaml in the same repo). Push and merge.

The active config is enabled: true only — chart defaults apply. To deviate (different SKU, additional StorageClasses, etc.), add the relevant keys under azure.blobDriver per the Settings Reference below.

To enable on multiple clusters in the same org, add additional destinations[].clusterRef.name entries. Each destination can have its own overrides.
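A minimal sketch of what two destinations with different overrides could look like is shown below. It is only the destinations fragment of the Installation spec; the cluster names are hypothetical, and the nesting follows the destinations[].overrides.source.template path used elsewhere on this page.

```yaml
# Fragment of spec in installations/p6m-csi.yaml.
# example-dev-cluster and example-prd-cluster are hypothetical names.
destinations:
  - clusterRef:
      name: example-dev-cluster
    overrides:
      source:
        template: |
          azure:
            blobDriver:
              enabled: true        # chart defaults only
  - clusterRef:
      name: example-prd-cluster
    overrides:
      source:
        template: |
          azure:
            blobDriver:
              enabled: true
              storageClass:
                skuName: Premium_LRS   # per-destination deviation
```

Keeping each destination's template self-contained makes it easy to see what a given cluster deviates on.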

Settings Reference

All settings live under azure.blobDriver. Set them via destinations[].overrides.source.template in the Installation.

The "Override snippet" column shows the minimal addition needed under blobDriver: in the template: block — paste it directly into the install YAML you saw above.

Upstream

The settings below are the platform-curated subset. The upstream kubernetes-sigs/blob-csi-driver supports many additional StorageClass parameters (for example NFS-specific mount options, network endpoint type, soft-delete retention) — see upstream's driver-parameters.md. If you need one of those, mention it on the rollout ticket so the chart can be extended; don't edit the rendered StorageClass directly on the cluster.

azure.blobDriver

Setting | Override snippet | Notes

enabled (bool)
default: false
    template: |
      azure:
        blobDriver:
          enabled: true
Master toggle. When false (chart default), nothing renders. The Installation template above sets it to true.

azure.blobDriver.storageClass

The default StorageClass the addon renders. Always rendered when blobDriver.enabled: true.

Setting | Override snippet | Notes

enabled (bool)
default: true
    template: |
      azure:
        blobDriver:
          # ... snip ...
          storageClass:
            enabled: false
Set to false to suppress the default SC (e.g. if you only want additionalStorageClasses[]).

name (string)
default: azureblob-default
    template: |
      azure:
        blobDriver:
          # ... snip ...
          storageClass:
            name: azureblob-shared
The StorageClass name. Stable across SKU changes: it does not include the SKU slug.

mode (string)
default: ReadWriteMany
    template: |
      azure:
        blobDriver:
          # ... snip ...
          storageClass:
            mode: ReadWriteOnce
ReadWriteMany → protocol: nfs (NFSv3) on the SC. ReadWriteOnce → omit protocol, falling back to the upstream BlobFuse default. RWO is unsafe for multi-pod Deployments.

reclaimPolicy (string)
default: Retain
    template: |
      azure:
        blobDriver:
          # ... snip ...
          storageClass:
            reclaimPolicy: Delete
Retain keeps the underlying Azure storage account when the PVC is deleted (manual cleanup needed). Delete is also available but discouraged on shared volumes.

skuName (string)
default: Standard_LRS
    template: |
      azure:
        blobDriver:
          # ... snip ...
          storageClass:
            skuName: Premium_LRS
⚠️ Performance tier is immutable post-creation: Standard_* ↔ Premium_* cannot be changed on an existing storage account. Pick upfront. See Gotchas.

resourceGroup (string)
default: azure.resourceGroup
    template: |
      azure:
        blobDriver:
          # ... snip ...
          storageClass:
            resourceGroup: my-storage-rg
Override the storage account RG; useful if storage and cluster live in different RGs.
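Since each snippet in this table shows a single field, it may help to see several combined. The fragment below is a sketch of one destination's overrides block that changes the default StorageClass name and SKU while leaving the other fields at their chart defaults; the values are illustrative only.

```yaml
# Sketch: several storageClass overrides combined in one template block.
overrides:
  source:
    template: |
      azure:
        blobDriver:
          enabled: true
          storageClass:
            name: azureblob-shared   # replaces azureblob-default
            skuName: Premium_LRS     # tier cannot be changed later
            # mode and reclaimPolicy are unset, so the defaults
            # (ReadWriteMany, Retain) apply
```

Because the performance tier is fixed once a storage account exists, settle skuName before the first PVC is created against this class.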

azure.blobDriver.additionalStorageClasses[]

A list — each entry renders a sibling StorageClass. Unset fields inherit from storageClass. The snippets below all start with the list parent and show one entry; combine fields into the same entry as needed.

Setting | Override snippet | Notes

enabled (bool)
    template: |
      azure:
        blobDriver:
          # ... snip ...
          additionalStorageClasses:
            - enabled: true
Per-entry on/off. Required for the entry to render.

name (string)
    template: |
      azure:
        blobDriver:
          # ... snip ...
          additionalStorageClasses:
            - enabled: true
              name: azureblob-archive
Auto-derives to azureblob-{sku-slug} (e.g. azureblob-standard-lrs for the inherited default Standard_LRS) when unset.

mode (string)
    template: |
      azure:
        blobDriver:
          # ... snip ...
          additionalStorageClasses:
            - enabled: true
              mode: ReadWriteOnce
Inherits from storageClass.mode if unset.

reclaimPolicy (string)
    template: |
      azure:
        blobDriver:
          # ... snip ...
          additionalStorageClasses:
            - enabled: true
              reclaimPolicy: Delete
Inherits from storageClass.reclaimPolicy if unset.

skuName (string)
    template: |
      azure:
        blobDriver:
          # ... snip ...
          additionalStorageClasses:
            - enabled: true
              skuName: Premium_LRS
Inherits from storageClass.skuName if unset.

resourceGroup (string)
    template: |
      azure:
        blobDriver:
          # ... snip ...
          additionalStorageClasses:
            - enabled: true
              resourceGroup: archive-rg
Inherits from storageClass.resourceGroup, then azure.resourceGroup.
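To make the inheritance and naming rules above concrete, here is a sketch with two entries in the list. The first sets several fields in one entry; the second sets only skuName and relies on inheritance for the rest. The derived name in the comment is extrapolated from the azureblob-standard-lrs example and is an assumption, not confirmed chart output.

```yaml
# Sketch: two sibling StorageClasses alongside the default one.
template: |
  azure:
    blobDriver:
      enabled: true
      additionalStorageClasses:
        - enabled: true
          name: azureblob-archive
          reclaimPolicy: Delete
          resourceGroup: archive-rg
          # mode and skuName unset: inherited from storageClass
        - enabled: true
          skuName: Premium_LRS
          # name unset: presumably derives to azureblob-premium-lrs
          # (assumed slug form: lowercased, underscores to hyphens)
```

After sync, kubectl get sc on the target cluster shows the names that were actually rendered.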

Required at install-time

The Installable template surfaces the cluster's azure.subscriptionId and azure.resourceGroup from cluster context — you don't set these on the Installation. Set them on the cluster's PlatformCluster if missing.

Find Available StorageClasses on the Cluster

After ArgoCD syncs the Installation, the SC takes a moment to land. Confirm with:

kubectl --context=<your-cluster> get sc

You should see azureblob-default. Inspect specifics:

kubectl --context=<your-cluster> get sc azureblob-default -o yaml

For wider context (other StorageClass categories, picking the right one for your workload), see Storage.

Verification

After merge, expect the following on the target cluster:

# ArgoCD
kubectl --context=<your-cluster> -n argocd get applications.argoproj.io | grep p6m-csi
# Driver pods
kubectl --context=<your-cluster> -n kube-system get pods -l app.kubernetes.io/name=azureBlobCsiDriver
# Crossplane claims (all should reach READY=True)
kubectl --context=<your-cluster> -n kube-system get workloadidentity,identityroleassignment
# StorageClass
kubectl --context=<your-cluster> get storageclass azureblob-default

For an end-to-end smoke test, see the Cloud Storage page — the example there (3-replica Deployment mounting an NFSv3 RWX volume) is the canonical "does this thing actually work" check.
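For a quick provisioning check ahead of that full smoke test, a minimal claim against the default class could look like the sketch below. The claim name, namespace, and size are hypothetical; the rest is the standard PersistentVolumeClaim API.

```yaml
# Sketch: minimal RWX claim against the addon's default StorageClass.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: blob-smoke-test      # hypothetical name
  namespace: default         # hypothetical namespace
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: azureblob-default
  resources:
    requests:
      storage: 10Gi          # arbitrary size for the check
```

Because the class uses volumeBindingMode: Immediate and reclaimPolicy: Retain, applying this provisions a storage account right away and deleting the claim does not remove it, so clean up the account afterwards.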

Upstream

Driver pod logs, snapshot support, volume expansion behavior, and advanced troubleshooting are all upstream territory. For a pod that won't mount its volume, start with kubectl logs -n kube-system -l app.kubernetes.io/name=azureBlobCsiDriver; the message format and remediation flow come from kubernetes-sigs/blob-csi-driver.

Gotchas

  • skuName performance tier is immutable. You can flip replication letters within a tier (LRS ↔ GRS) on an existing storage account, but Standard_* ↔ Premium_* requires re-provisioning and migrating data. Pick the right tier upfront; most workloads land on Standard_LRS. (Azure docs: performance tiers.)
  • reclaimPolicy: Retain means storage survives kubectl delete pvc. The auto-provisioned nfs<random> storage account stays in the cluster RG and continues billing until you az storage account delete it manually. Useful for safety; surprising if you assume PVC deletion = teardown.
  • volumeBindingMode: Immediate — the storage account is provisioned as soon as the PVC is created, even if no pod is ever scheduled against it. Combined with reclaimPolicy: Retain this means orphan storage accounts can pile up on the Azure side if PVCs are created speculatively. Plan to clean up unused PVCs (and their underlying storage accounts) periodically.
  • VNet topology assumption. The chart provisions Network Contributor at the same RG as azure.resourceGroup. If the cluster's VNet lives in a different RG, the driver's subnet patch will fail with AuthorizationFailed at first NFSv3 account creation. Verify the VNet RG matches before enabling.
  • Other clusters in the same org are not enabled by default. Each destinations[].clusterRef.name entry is opt-in. Promote dev → stg → prd by adding entries explicitly.
  • BlobFuse RWX is unsafe. If you set mode: ReadWriteOnce and run more than one replica that needs the volume, you'll see file corruption — this is an upstream constraint, not something the platform can paper over. See upstream's limitations.md. RWX cases must use the default NFSv3 mode.