Architecture

mc-operator is a Kubernetes Operator built with .NET 10 and KubeOps 10. It manages MinecraftServer and MinecraftServerCluster custom resources through continuous reconciliation loops, ensuring the actual cluster state matches the desired state expressed in the CRD specs.

The operator uses the API group mc-operator.dhv.sh. This domain is owned and maintained under the dhv.sh umbrella.

mc-operator/
├── .github/
│   ├── dependabot.yml          # Automated dependency updates
│   └── workflows/
│       ├── ci.yml              # Build + test on push/PR
│       ├── release-image.yml   # Publish container image on tag
│       └── release-chart.yml   # Package and publish Helm chart on tag
├── charts/
│   └── mc-operator/            # Helm chart (OCI-published to GHCR)
├── docs/                       # Astro Starlight documentation site
├── examples/                   # Example MinecraftServer manifests
├── manifests/
│   ├── crd/                    # CustomResourceDefinition YAML (MinecraftServer + MinecraftServerCluster)
│   ├── rbac/                   # ClusterRole + ClusterRoleBinding
│   └── operator/               # Deployment, Service, webhook configs (Kustomize)
└── src/
    ├── McOperator/             # Main operator application
    └── McOperator.Tests/       # Unit tests (TUnit)

The MinecraftServer CRD (mc-operator.dhv.sh/v1alpha1) is the operator’s public API. Every field in the spec maps directly to a Kubernetes resource or container configuration. The CRD is fully documented in the CRD reference.

MinecraftServerController implements IEntityController<MinecraftServer>. On each reconcile it:

  1. Sets the status phase to Provisioning
  2. Reconciles the ConfigMap (audit-visible server.properties)
  3. Reconciles the Service (ClusterIP/NodePort/LoadBalancer)
  4. Pre-pulls the new image and server jar when spec.prePull: true (see Pre-pull Jobs below)
  5. Reconciles the StatefulSet (the actual server workload)
  6. Reads StatefulSet status to determine the current phase
  7. Updates the status subresource with endpoint and PVC info

All child resources carry owner references that point at the parent MinecraftServer, so they are automatically garbage-collected when it is deleted.
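
To make the garbage-collection mechanism concrete, here is a minimal sketch of how an owner reference could be attached to a child manifest. It is written in Python for brevity rather than in the operator's own C#, and the helper names `owner_reference` and `with_owner` are hypothetical; only the `ownerReferences` field layout comes from Kubernetes itself.

```python
def owner_reference(owner: dict) -> dict:
    """Build a metadata.ownerReferences entry pointing at the parent resource."""
    return {
        "apiVersion": owner["apiVersion"],
        "kind": owner["kind"],
        "name": owner["metadata"]["name"],
        "uid": owner["metadata"]["uid"],
        "controller": True,          # marks the parent as the managing controller
        "blockOwnerDeletion": True,  # under foreground deletion, the owner waits for this child
    }


def with_owner(child: dict, owner: dict) -> dict:
    """Return a copy of the child manifest with the owner reference attached."""
    metadata = dict(child.get("metadata", {}))
    metadata["ownerReferences"] = [owner_reference(owner)]
    return {**child, "metadata": metadata}
```

Because the reference is keyed on the parent's `uid`, a recreated MinecraftServer with the same name does not adopt children left over from its predecessor.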

Validating webhook (MinecraftServerValidationWebhook): Rejects invalid specs at admission time, before resources are created or updated. It validates:

  • EULA acceptance
  • Minecraft version not blank
  • JVM memory values parseable and maxMemory >= initialMemory
  • NodePort value only provided for NodePort service type
  • NodePort in the 30000–32767 range
  • Storage size a valid Kubernetes resource quantity
  • Mount path absolute
  • Immutable fields not changed on update (storage.enabled, storage.size, storage.storageClassName)
  • Server port and view distance in range
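
A few of the rules above can be sketched as a plain function that collects error messages. This is an illustration in Python, not the webhook's actual code: the field names `acceptEula`, `version`, `jvm`, `service` and `storage.mountPath`, the `/data` default, and the accepted memory format (JVM-style values such as `512M` or `2G`) are all assumptions. The storage-quantity, immutability, and port/view-distance checks are left out for brevity.

```python
import re

# Assumption: JVM-style sizes such as "512M" or "2G"; the real parser may accept more.
_MEMORY = re.compile(r"^(\d+)([KMG])$", re.IGNORECASE)
_UNIT = {"K": 1024, "M": 1024**2, "G": 1024**3}


def parse_memory(value: str) -> int:
    """Convert a JVM-style memory string to bytes, raising ValueError if malformed."""
    match = _MEMORY.match(value.strip())
    if not match:
        raise ValueError(f"unparseable memory value: {value!r}")
    return int(match.group(1)) * _UNIT[match.group(2).upper()]


def validate(spec: dict) -> list[str]:
    """Return a list of human-readable errors; an empty list means the spec is admitted."""
    errors = []
    if not spec.get("acceptEula"):
        errors.append("EULA must be accepted")
    if not spec.get("version", "").strip():
        errors.append("version must not be blank")
    try:
        jvm = spec.get("jvm", {})
        if parse_memory(jvm["maxMemory"]) < parse_memory(jvm["initialMemory"]):
            errors.append("maxMemory must be >= initialMemory")
    except (KeyError, ValueError) as exc:
        errors.append(f"invalid JVM memory: {exc}")
    service = spec.get("service", {})
    node_port = service.get("nodePort")
    if node_port is not None:
        if service.get("type") != "NodePort":
            errors.append("nodePort requires service type NodePort")
        elif not 30000 <= node_port <= 32767:
            errors.append("nodePort must be in 30000-32767")
    if not spec.get("storage", {}).get("mountPath", "/data").startswith("/"):
        errors.append("mountPath must be absolute")
    return errors
```

Collecting every violation before rejecting lets a user fix a manifest in one pass instead of resubmitting once per error.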

Mutating webhook (MinecraftServerMutationWebhook): Normalizes specs on create/update:

  • Applies a default MOTD based on server name and type if not set
  • Normalizes levelType to uppercase
  • Trims whitespace from string fields
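
The three normalizations can be illustrated with a small pure function. This is a Python sketch under stated assumptions: the field names `motd` and `type`, the fallback type `Vanilla`, and the wording of the default MOTD are hypothetical, and only top-level string fields are trimmed.

```python
def normalize(name: str, spec: dict) -> dict:
    """Return a normalized copy of the spec; the input is left untouched."""
    # Trim whitespace from top-level string fields (nested fields omitted for brevity).
    result = {k: v.strip() if isinstance(v, str) else v for k, v in spec.items()}
    if result.get("levelType"):
        result["levelType"] = result["levelType"].upper()
    if not result.get("motd"):
        # Hypothetical default wording; the operator's actual MOTD text may differ.
        server_type = result.get("type", "Vanilla")
        result["motd"] = f"{name} ({server_type})"
    return result
```

Trimming happens first so that a whitespace-only MOTD is treated as unset and receives the default.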

MinecraftServerFinalizer runs during deletion. When spec.storage.deleteWithServer: true, it:

  1. Identifies the PVC created by the StatefulSet (data-<name>-0)
  2. Deletes it explicitly (PVCs from VolumeClaimTemplates are not owned by the MinecraftServer, so they won’t be garbage-collected automatically)

When deleteWithServer: false (the default), the finalizer runs but intentionally does nothing — the PVC is retained.
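
The finalizer's decision reduces to a single check plus a name computation. The Python sketch below is illustrative only; `pvcs_to_delete` is a hypothetical helper, and it assumes the StatefulSet shares the MinecraftServer's name and that the volume claim template is called `data`.

```python
def pvcs_to_delete(server: dict) -> list[str]:
    """Names of PVCs the finalizer should delete; empty when data is retained."""
    storage = server["spec"].get("storage", {})
    if not storage.get("deleteWithServer", False):  # default: retain world data
        return []
    # StatefulSet PVCs are named <template>-<statefulset>-<ordinal>; the
    # template is assumed to be called "data", giving data-<name>-0.
    return [f"data-{server['metadata']['name']}-0"]
```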

Minecraft is a stateful singleton. StatefulSets provide:

  • Stable pod identity (server-name-0): deterministic PVC naming
  • VolumeClaimTemplates: native pod-to-PVC relationship management
  • Ordered updates: prevents concurrent pod replacement that could corrupt state
  • persistentVolumeClaimRetentionPolicy with whenScaled: Retain: PVC not deleted on scale-down

Rather than maintaining separate images per distribution, the operator sets the TYPE and VERSION environment variables on the itzg/minecraft-server image. This image is the community standard, handles all distributions, and is actively maintained.

Users can override with spec.image for full control (e.g. custom images with pre-installed plugins).
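
As a rough sketch of this design, the container definition could be derived from the spec as follows. The Python below is illustrative: the spec field names `type` and `version` are assumptions, image tag handling is omitted, and `container_for` is a hypothetical helper.

```python
DEFAULT_IMAGE = "itzg/minecraft-server"  # tag handling omitted in this sketch


def container_for(spec: dict) -> dict:
    """Build the image and environment for the server container."""
    return {
        # spec.image, when set, takes precedence over the default image.
        "image": spec.get("image") or DEFAULT_IMAGE,
        "env": [
            {"name": "TYPE", "value": spec["type"]},
            {"name": "VERSION", "value": spec["version"]},
        ],
    }
```

Switching distribution or version therefore changes only environment variables, not the image reference, unless spec.image is set.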

PVCs use ReadWriteOnce access mode by default. When spec.prePull: true is set, the access mode is switched to ReadWriteMany so the running server pod and a short-lived pre-pull Job can both mount the same volume simultaneously during a version upgrade. ReadWriteMany support depends on the storage backend: many CSI drivers (NFS, Longhorn, Rook-Ceph) provide it, but block-storage-backed volumes are often limited to ReadWriteOnce, so confirm that your StorageClass supports ReadWriteMany before enabling pre-pull.

PVCs are retained by default on deletion (deleteWithServer: false). World data is irreplaceable; the operational cost of a retained PVC is negligible compared to the risk of accidental deletion.

Storage fields (enabled, size, storageClassName) are immutable after creation, enforced by the validating webhook. This matches StatefulSet semantics: volumeClaimTemplates cannot be modified once the StatefulSet has been created.

The API is at v1alpha1 even though the software is a working v1. This signals that the spec may evolve before it stabilizes. The graduation path is: v1alpha1 → v1beta1 → v1.

Reconcile(server):
  1. status.phase = Provisioning
  2. ConfigMap: get-or-create / update
  3. Service: get-or-create / update
  4. Pre-pull: if spec.prePull is true and image changed on an active server →
       create Job (mounts PVC, pulls image + downloads jar)
       requeue 30s until Job complete, then proceed
  5. StatefulSet: get-or-create / update
  6. Read StatefulSet.status.readyReplicas
       - If replicas==0 → phase=Paused
       - If readyReplicas==1 → phase=Running
       - Else → phase=Provisioning
  7. Update status (phase, endpoint, PVC info, conditions)
  8. Requeue after 5m (drift detection)

On error, the controller requeues after 30 seconds.
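
The phase mapping in step 6 and the two requeue intervals can be expressed as a small, testable sketch. It is written in Python for illustration; `server_phase` and `reconcile` are hypothetical names, and `run_steps` stands in for steps 1 to 7.

```python
from datetime import timedelta

DRIFT_REQUEUE = timedelta(minutes=5)   # step 8: periodic drift detection
ERROR_REQUEUE = timedelta(seconds=30)  # retry delay after a failed reconcile


def server_phase(desired_replicas: int, ready_replicas: int) -> str:
    """Map StatefulSet replica counts to the MinecraftServer status phase (step 6)."""
    if desired_replicas == 0:
        return "Paused"
    if ready_replicas == 1:
        return "Running"
    return "Provisioning"


def reconcile(run_steps) -> timedelta:
    """Run the reconcile steps and return how long to wait before the next pass."""
    try:
        run_steps()
    except Exception:
        return ERROR_REQUEUE
    return DRIFT_REQUEUE
```

Checking the paused case first matters: a paused server has zero ready replicas and would otherwise be reported as Provisioning.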

When spec.prePull: true is set and a MinecraftServer version or image is changed, the operator creates a short-lived batch/v1 Job (<server-name>-prepull) before updating the StatefulSet. This minimises downtime by ensuring the new image layers and server jar are already present when the rolling update begins. Pre-pull is disabled by default.

The Job runs in one of two modes:

| Mode | Condition | Behaviour |
| --- | --- | --- |
| Jar-download | Default itzg image + storage.enabled: true | Mounts the data PVC and runs the itzg /start script with a fake java stub; the startup scripts download the server jar to the PVC, then the stub exits 0 |
| OCI-only | Custom spec.image or storage.enabled: false | Runs sh -c "exit 0" to force OCI layers onto the node’s image cache; no jar download |

The Job targets the specific node where the server pod is running (spec.nodeName), so cached image layers are available on exactly the right node. SKIP_SERVER_PROPERTIES=true prevents the startup scripts from overwriting config files that the live server is actively using.

Pre-pull is skipped when spec.prePull is false (the default), status.currentImage is unset (fresh server), the desired image matches the current image, or spec.replicas == 0 (paused server). If the Job fails, the upgrade proceeds anyway — pre-pull failure is a warning, not a blocker.
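
The skip conditions and the mode selection can be summarized in two small predicates. This Python sketch is illustrative, not the operator's code: the function names are hypothetical, and the defaults used for `spec.replicas` (1) and `storage.enabled` (false) when a field is absent are assumptions.

```python
def should_pre_pull(spec: dict, status: dict, desired_image: str) -> bool:
    """Apply the four skip conditions; True means a pre-pull Job is warranted."""
    if not spec.get("prePull", False):
        return False  # disabled (the default)
    current = status.get("currentImage")
    if not current:
        return False  # fresh server, nothing to upgrade from
    if current == desired_image:
        return False  # no image change
    if spec.get("replicas", 1) == 0:
        return False  # paused server
    return True


def pre_pull_mode(spec: dict) -> str:
    """Choose between the jar-download and OCI-only Job modes."""
    uses_default_image = not spec.get("image")
    storage_enabled = spec.get("storage", {}).get("enabled", False)
    return "jar-download" if uses_default_image and storage_enabled else "oci-only"
```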

The MinecraftServerCluster CRD (mc-operator.dhv.sh/v1alpha1) manages a fleet of backend MinecraftServer instances behind a Velocity proxy. It is fully documented in the CRD reference.

MinecraftServerClusterController implements IEntityController<MinecraftServerCluster>. On each reconcile it:

  1. Determines the desired server count from the scaling configuration
  2. Reconciles backend MinecraftServer instances (create, update, or delete to match the desired count)
  3. Builds the server address list for the Velocity configuration
  4. Reconciles the proxy ConfigMap (velocity.toml + forwarding.secret)
  5. Reconciles the proxy Service (the player-facing endpoint)
  6. Reconciles the proxy Deployment (the Velocity proxy workload)
  7. Updates the status subresource with backend server status, proxy endpoint, and conditions

This order ensures backend servers exist before the proxy is configured to route to them.

The Velocity proxy runs as a Kubernetes Deployment, not a StatefulSet. This is because the proxy is stateless — it maintains only in-memory player sessions and can be replaced at any time without data loss. A Deployment provides:

  • Rolling restarts without complex orchestration
  • Standard horizontal scaling (though the operator currently deploys 1 replica)
  • No need for PVCs or stable pod identities

When backend servers are created or removed, the cluster controller regenerates the velocity.toml configuration in a ConfigMap. The [servers] section lists all backend server addresses (using Kubernetes internal DNS), and its try list controls the order in which the proxy tries servers when a player connects.

The proxy Deployment mounts this ConfigMap as a volume. Configuration changes trigger a pod restart via annotation-based rollout.
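
A minimal sketch of both ideas, rendering the servers section and deriving a rollout trigger from the rendered text, might look like this. It is Python for illustration only: the address form `<name>.<namespace>.svc.cluster.local`, the default port 25565, and the annotation key `mc-operator.dhv.sh/config-hash` are assumptions, not the operator's confirmed behaviour.

```python
import hashlib
import json


def render_servers_section(namespace: str, server_names: list[str], port: int = 25565) -> str:
    """Render the [servers] table of velocity.toml for the given backends."""
    lines = ["[servers]"]
    for name in server_names:
        # Assumed address form: the backend Service's cluster-internal DNS name.
        address = f"{name}.{namespace}.svc.cluster.local:{port}"
        lines.append(f"{name} = {json.dumps(address)}")
    # json.dumps yields double-quoted strings, which are also valid TOML here.
    lines.append(f"try = {json.dumps(server_names)}")
    return "\n".join(lines) + "\n"


def rollout_annotation(config_text: str) -> dict:
    """Pod-template annotation whose value changes whenever the config changes."""
    digest = hashlib.sha256(config_text.encode("utf-8")).hexdigest()
    # Hypothetical annotation key, chosen for this sketch only.
    return {"mc-operator.dhv.sh/config-hash": digest}
```

Placing a content hash on the pod template means the Deployment rolls only when the configuration actually changes, since an unchanged hash leaves the template identical.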

When the player forwarding mode is Modern or BungeeGuard, the operator generates a random forwarding secret (UUID) and stores it in the proxy ConfigMap as forwarding.secret. The same ConfigMap is mounted into the proxy pod. Backend servers are configured with onlineMode: false so that Velocity handles player authentication.
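
Generating the secret is straightforward; the sketch below also keeps an existing value so that repeated reconciles do not rotate it. That reuse behaviour is an assumption of this Python illustration, as is the helper name `ensure_forwarding_secret`.

```python
import uuid


def ensure_forwarding_secret(configmap_data: dict) -> dict:
    """Return ConfigMap data with a forwarding.secret, generating one only if absent."""
    data = dict(configmap_data)
    if not data.get("forwarding.secret"):
        data["forwarding.secret"] = str(uuid.uuid4())  # random (version 4) UUID
    return data
```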

Reconcile(cluster):
  1. desiredCount = scaling.mode == Static ? scaling.replicas : scaling.minReplicas
  2. Reconcile backend MinecraftServers (scale up/down, update template)
  3. Build server address list from live MinecraftServer instances
  4. ConfigMap: velocity.toml + forwarding.secret → get-or-create / update
  5. Service: proxy service → get-or-create / update
  6. Deployment: Velocity proxy → get-or-create / update
  7. Read backend server phases + proxy Deployment status
       - If all servers Running + proxy ready → phase=Running
       - If some servers not ready → phase=Degraded
       - Else → phase=Provisioning
  8. Update status (phase, endpoint, server statuses, conditions)
  9. Requeue after 5m (drift detection)

On error, the cluster controller requeues after 30 seconds.
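
Steps 1 and 7 of the cluster loop can be sketched as two pure functions. This Python illustration reads the phase rules literally, treating any backend that is not Running as a reason to report Degraded; the operator's exact precedence may differ, and both function names are hypothetical.

```python
def desired_count(scaling: dict) -> int:
    """Step 1: Static mode uses replicas; otherwise start from minReplicas."""
    if scaling.get("mode") == "Static":
        return scaling["replicas"]
    return scaling["minReplicas"]


def cluster_phase(server_phases: list[str], proxy_ready: bool) -> str:
    """Step 7, read literally: any backend not Running counts as Degraded."""
    all_running = all(phase == "Running" for phase in server_phases)
    if all_running and proxy_ready:
        return "Running"
    if not all_running:
        return "Degraded"
    return "Provisioning"  # backends are up but the proxy is not ready yet
```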

Validating webhook (MinecraftServerClusterValidationWebhook): Validates the template (same rules as MinecraftServer), scaling configuration (replicas ≥ 1 for Static, minReplicas ≤ maxReplicas for Dynamic, policy required for Dynamic), and proxy settings (port range, NodePort rules).
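
The scaling rules can be sketched in the same error-collecting style. The Python below is illustrative; the field names mirror those used in the reconciliation outline, while `policy` as a field name and the rejection of unknown modes are assumptions.

```python
def validate_scaling(scaling: dict) -> list[str]:
    """Return scaling-related validation errors; an empty list means valid."""
    errors = []
    mode = scaling.get("mode")
    if mode == "Static":
        if scaling.get("replicas", 0) < 1:
            errors.append("replicas must be >= 1 for Static scaling")
    elif mode == "Dynamic":
        if scaling.get("minReplicas", 0) > scaling.get("maxReplicas", 0):
            errors.append("minReplicas must be <= maxReplicas")
        if not scaling.get("policy"):
            errors.append("policy is required for Dynamic scaling")
    else:
        # Assumption of this sketch: modes other than Static/Dynamic are rejected.
        errors.append(f"unknown scaling mode: {mode!r}")
    return errors
```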

Mutating webhook (MinecraftServerClusterMutationWebhook): Applies default MOTD, normalizes levelType, and trims whitespace — the same normalization as the MinecraftServer webhook, applied to the cluster template and proxy spec.