Fixing Crossplane 'Composite Resource Failed' XRD Schema Violations: A Production Debugging Guide
Impact Level: HIGH | Downtime Risk: HIGH | Time to Fix: 10–30 mins
TL;DR
- What broke: Crossplane's XRD OpenAPI v3 schema rejected a composite resource or claim because a field type, required constraint, or enum value mismatched between the XRD
spec.versions[].schema.openAPIV3Schemaand the actual resource manifest submitted. - How to fix it: Align the offending field's type, add missing
requiredentries, or correct thex-kubernetes-preserve-unknown-fieldsflag in the XRD schema. Re-apply the XRD and re-trigger the claim. - Shortcut: Use our Client-Side Sandbox above to auto-refactor your failing XRD/Composition — it diffs your schema against the claim and generates corrected YAML locally in your browser.
The Incident (What Does the Error Mean?)
You'll see this surface in kubectl describe composite <name> or in the Crossplane provider pod logs:
Event: Warning CannotApplyComposite composite/my-xr-abc123
cannot apply composite resource: admission webhook denied the request:
spec.parameters.retentionDays: Invalid value: "string": spec.parameters.retentionDays in body must be of type integer
or from the claim:
Error from server (BadRequest): error when applying patch:
.spec.parameters.region: Required value
.spec.parameters.storageClass: Unsupported value: "COLD": supported values: "STANDARD", "NEARLINE", "ARCHIVE"
Immediate consequence: The composite resource object is created in etcd but stays in Synced: False / Ready: False. Every managed resource downstream — your RDS instance, GKE cluster, S3 bucket — never gets provisioned. The Composition never runs. Your application deployment is blocked entirely.
The Attack Vector / Blast Radius
This isn't a security exploit vector, but the blast radius in a platform engineering context is severe:
- All claims referencing this XRD version are dead. One bad schema change in a shared XRD breaks every tenant claim across every namespace.
- Silent queue buildup: Crossplane will retry the reconcile loop. If the XRD is broken at the CRD validation layer, the controller logs fill with noise and masks real failures in adjacent resources.
- Version lock risk: If you've already served
v1alpha1to teams and need to fix the schema, you cannot mutatespec.versions[].schemain-place without a version bump — forcing av1alpha2migration or a destructive delete-recreate of the XRD, which drops all existing composite resources. - CI/CD pipeline stall: Any GitOps reconciliation (Flux, ArgoCD) will continuously fail and potentially trigger alert fatigue, burying the real error.
The most dangerous pattern: a platform team ships a schema fix to v1alpha1 that changes a field from type: string to type: integer without a version bump. Existing claims silently break on next sync.
How to Fix It
Root Cause Checklist
Before touching anything, run:
# Get the exact validation error from the composite object
kubectl describe composite.example.org <composite-name> -n <namespace>
# Check XRD schema for the offending version
kubectl get xrd xpostgresqlinstances.example.org -o jsonpath='{.spec.versions[0].schema.openAPIV3Schema}' | jq .
# Validate claim against live schema
kubectl apply --dry-run=server -f my-claim.yaml
Basic Fix: Correct the Field Type Mismatch
Scenario: XRD declares retentionDays as integer, but the claim passes a string.
# claim.yaml
apiVersion: example.org/v1alpha1
kind: XPostgreSQLInstance
metadata:
name: my-db-claim
spec:
parameters:
- retentionDays: "7" # WRONG: quoted string fails integer schema
+ retentionDays: 7 # CORRECT: unquoted integer
region: us-east-1
Basic Fix: Add Missing Required Field in XRD
# xrd.yaml - spec.versions[0].schema.openAPIV3Schema.properties.spec.properties.parameters
properties:
region:
type: string
storageClass:
type: string
enum:
- STANDARD
- NEARLINE
- ARCHIVE
required:
- region
- storageClass # was missing, causing silent nil pointer in Composition patches
Enterprise Best Practice: Schema Versioning + Structural Validation
Never mutate an existing served XRD version schema. Use proper version promotion:
# xrd.yaml
apiVersion: apiextensions.crossplane.io/v1
kind: CompositeResourceDefinition
metadata:
name: xpostgresqlinstances.example.org
spec:
versions:
- - name: v1alpha1
- served: true
- referenceable: true
- schema:
- openAPIV3Schema:
- properties:
- spec:
- properties:
- parameters:
- properties:
- retentionDays:
- type: string # WRONG TYPE - breaking existing claims
+ - name: v1alpha1
+ served: true
+ referenceable: false # demote old version, stop new claims
+ schema:
+ openAPIV3Schema:
+ properties:
+ spec:
+ properties:
+ parameters:
+ properties:
+ retentionDays:
+ type: string
+ - name: v1alpha2
+ served: true
+ referenceable: true # new claims use corrected version
+ schema:
+ openAPIV3Schema:
+ type: object
+ properties:
+ spec:
+ type: object
+ properties:
+ parameters:
+ type: object
+ required:
+ - region
+ - retentionDays
+ properties:
+ retentionDays:
+ type: integer # CORRECT TYPE
+ minimum: 1
+ maximum: 365
+ region:
+ type: string
+ enum:
+ - us-east-1
+ - eu-west-1
Critical: After updating the XRD, force-reconcile stuck composites:
# Annotate to trigger re-reconcile without delete
kubectl annotate composite.example.org <name> \
crossplane.io/paused=false --overwrite
# If composite is truly stuck, check for orphaned managed resources before deleting
kubectl get managed -l crossplane.io/composite=<composite-name>
💡 Tired of pasting proprietary configs into ChatGPT? Generic AI tools log your company's ARNs, DB strings, and private keys. StackEngine is a zero-backend, pure Client-Side WASM utility. Drop your failing config into the sandbox above. We redact your secrets locally in the browser and auto-generate the refactored code using your own API key.
Prevention in CI/CD
1. Dry-Run Validation in Pull Requests
# .github/workflows/crossplane-validate.yaml
- name: Validate XRD and Claims (server-side dry-run)
run: |
kubectl apply --dry-run=server -f infrastructure/xrds/
kubectl apply --dry-run=server -f infrastructure/claims/
2. Crossplane crossplane beta validate (v1.14+)
# Validates compositions and XRDs offline without a live cluster
crossplane beta validate --cache-dir ~/.crossplane/cache \
xrd.yaml composition.yaml claim.yaml
3. Conftest / OPA Policy: Enforce Required Fields in XRD Schema
# policy/xrd_required_fields.rego
package crossplane.xrd
deny[msg] {
input.kind == "CompositeResourceDefinition"
version := input.spec.versions[_]
version.referenceable == true
params := version.schema.openAPIV3Schema.properties.spec.properties.parameters
count(params.required) == 0
msg := sprintf("XRD version %v has no required fields in parameters schema", [version.name])
}
conftest test xrd.yaml --policy policy/
4. Kubeval / kubeconform in Pre-Commit
kubeconform -schema-location default \
-schema-location 'https://raw.githubusercontent.com/crossplane/crossplane/main/cluster/crds/{{.Group}}/{{.ResourceKind}}_{{.ResourceVersion}}.json' \
infrastructure/claims/*.yaml
5. Argo CD / Flux: Block Promotion Without Schema Validation
Add a pre-sync Job in your ArgoCD Application that runs crossplane beta validate before any XRD change reaches production. Gate on exit code — fail the sync if validation returns non-zero.