Scaling Responsibly with Carbon-Aware KEDA

Sustainability is no longer just a slide in the annual report; it's code. With Kubernetes and KEDA (Kubernetes Event-driven Autoscaling), we can now make our infrastructure responsive to the physical reality of the energy grid.

This tutorial shows you how to configure a Carbon-Aware Scaler to pause heavy AI training jobs when the energy grid is "dirty" (high carbon intensity) and scale up when it is "clean" (renewables are active).

The Concept: Follow the Wind

Grid carbon intensity is measured in grams of CO2 per kilowatt-hour (gCO2/kWh). This value can change substantially over the course of a day as sun and wind availability changes.

Goal: Run delay-tolerant batch jobs only when intensity is below 200 gCO2/kWh.
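The goal above amounts to a simple filter over intensity readings. Here is a minimal Python sketch of that idea; the function name `clean_hours` and the sample readings are hypothetical and only illustrate the 200 gCO2/kWh cutoff, they are not real grid data.

```python
from typing import List

CLEAN_THRESHOLD = 200.0  # gCO2/kWh, the cutoff used in this tutorial


def clean_hours(hourly_intensity: List[float],
                threshold: float = CLEAN_THRESHOLD) -> List[int]:
    """Return the indices (hours) whose intensity is strictly below the threshold."""
    return [hour for hour, value in enumerate(hourly_intensity) if value < threshold]


# Hypothetical readings for six consecutive hours (illustrative only).
sample = [310.0, 240.0, 199.0, 150.0, 200.0, 280.0]
print(clean_hours(sample))  # [2, 3]
```

Only hours 2 and 3 qualify here, because a reading of exactly 200 is not strictly below the cutoff.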

The KEDA Configuration

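We will use a carbon-intensity trigger in a KEDA ScaledObject. Make sure KEDA is installed on your cluster, and confirm that your KEDA version or an add-on actually provides this trigger type. As far as we know, it is not among the scalers bundled with upstream KEDA, so treat the field names below as illustrative.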

YAML

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: green-ai-training-job
  namespace: ml-ops
spec:
  scaleTargetRef:
    name: model-trainer-worker
  minReplicaCount: 0 # Pause completely if grid is dirty
  maxReplicaCount: 20
  triggers:
    - type: carbon-intensity
      metadata:
        # Grid region whose carbon intensity is queried
        region: "us-west-2" # Oregon (hydro-heavy region)
        # Threshold in gCO2/kWh: the limit above which we scale down
        emissionThreshold: "200"
        # Handler: how to react when the threshold is exceeded ('scaleDown' or 'pause')
        handler: "scaleDown"

How It Works

Data Ingestion: KEDA polls a carbon data provider (such as WattTime or ElectricityMaps) for the grid that serves the us-west-2 region.
Evaluation: The current intensity is compared with emissionThreshold (200 gCO2/kWh).
- If the current intensity is 150 gCO2/kWh (clean): KEDA scales the model-trainer-worker deployment up to maxReplicaCount (20).
- If the current intensity rises to 300 gCO2/kWh (dirty, for example because the wind has dropped): KEDA scales the deployment down to minReplicaCount (0).
Impact: Your workloads effectively "hibernate" during dirty hours and "sprint" during clean hours.
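The evaluation step above can be sketched as a small Python function. This is not KEDA's own code: `desired_replicas` is a hypothetical name, and its defaults simply mirror the ScaledObject in this tutorial. It also assumes that a reading exactly at the threshold counts as dirty, which the configuration above does not specify.

```python
def desired_replicas(intensity: float,
                     threshold: float = 200.0,
                     min_replicas: int = 0,
                     max_replicas: int = 20) -> int:
    """Pick a replica count from the current grid intensity (gCO2/kWh).

    Assumption: intensity strictly below the threshold is "clean";
    anything at or above it is treated as "dirty".
    """
    if intensity < threshold:
        return max_replicas  # clean grid: run at full scale
    return min_replicas      # dirty grid: scale down (to zero by default)


print(desired_replicas(150.0))  # 20
print(desired_replicas(300.0))  # 0
```

A real scaler would usually scale in steps or add a cooldown rather than jump between the two extremes. This sketch only shows the threshold comparison.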

Benefits for the Platform Engineer

Cost Savings: Often, low-carbon hours correlate with off-peak electricity pricing.
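ESG Compliance: The scaling history gives you tangible, auditable evidence that your organization is acting to reduce electricity-related emissions (Scope 2 if you run your own data centers; for workloads in a public cloud, these are generally reported under Scope 3).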
Automation: No manual intervention required. The grid dictates the schedule.
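To make the emissions benefit concrete, here is a rough Python sketch of the underlying arithmetic: emissions equal energy used multiplied by grid intensity. The function names and the 100 kWh job size are hypothetical; the 300 and 150 gCO2/kWh figures are the dirty and clean examples used earlier in this tutorial.

```python
def emissions_grams(energy_kwh: float, intensity_g_per_kwh: float) -> float:
    """Emissions in grams of CO2 for a given energy use and grid intensity."""
    return energy_kwh * intensity_g_per_kwh


def avoided_grams(energy_kwh: float,
                  dirty_intensity: float,
                  clean_intensity: float) -> float:
    """Grams of CO2 avoided by running the same job at the cleaner intensity."""
    return (emissions_grams(energy_kwh, dirty_intensity)
            - emissions_grams(energy_kwh, clean_intensity))


# Hypothetical 100 kWh training run moved from a 300 g hour to a 150 g hour.
print(avoided_grams(100.0, 300.0, 150.0))  # 15000.0
```

This is an estimate rather than an audit figure. Actual reporting depends on the accounting method your organization uses.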

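Pro Tip: Use this for delay-tolerant workloads like retraining models, generating embeddings, or nightly reporting. Because scaling to zero terminates the worker pods, make sure these jobs checkpoint their progress so they can resume when the grid is clean again. Do not use this for customer-facing APIs where latency is critical.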
