Scaling Responsibly with Carbon-Aware KEDA
Learn how to make your Kubernetes cluster carbon-aware. This guide shows you how to use KEDA to pause heavy workloads when the energy grid is dirty and run them when renewable energy is abundant, which can reduce both emissions and cost.

Sustainability is no longer just a slide in the annual report; it's code. With Kubernetes and KEDA (Kubernetes Event-driven Autoscaling), we can now make our infrastructure responsive to the physical reality of the energy grid.

This tutorial shows you how to configure a Carbon-Aware Scaler to pause heavy AI training jobs when the energy grid is "dirty" (high carbon intensity) and scale up when it is "clean" (renewables are active).

The Concept: Follow the Wind

Grid carbon intensity is measured in grams of CO2 emitted per kilowatt-hour of electricity (gCO2/kWh). The value changes throughout the day as solar and wind output rises and falls.

Goal: Run delay-tolerant batch jobs only when intensity is below 200 gCO2/kWh.
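To make the goal concrete, here is a minimal Python sketch that picks out the hours in which a delay-tolerant job would be allowed to run. The hourly readings are hypothetical sample values, not real grid data, and the function name `clean_hours` is ours. The only figure taken from the text is the 200 gCO2/kWh threshold.

```python
from typing import List

# Threshold taken from the goal stated above (gCO2/kWh).
CLEAN_THRESHOLD = 200.0


def clean_hours(hourly_intensity: List[float],
                threshold: float = CLEAN_THRESHOLD) -> List[int]:
    """Return the hour indices whose intensity is strictly below the threshold."""
    return [hour for hour, value in enumerate(hourly_intensity)
            if value < threshold]


# Hypothetical readings for six consecutive hours (gCO2/kWh).
sample = [310.0, 260.0, 190.0, 150.0, 200.0, 240.0]
print(clean_hours(sample))  # [2, 3]
```

A reading of exactly 200 is treated as dirty here because the goal says "below 200". Whether a given scaler treats the boundary the same way is something to check against its documentation.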

The KEDA Configuration

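We will use a KEDA trigger of type carbon-intensity. Ensure you have KEDA installed on your cluster, and check that your installation actually provides this trigger type. Upstream KEDA's built-in scaler catalog may not include one, in which case the trigger has to come from an external scaler or a carbon-aware add-on. Treat the field names below as illustrative of that setup.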

YAML

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: green-ai-training-job
  namespace: ml-ops
spec:
  scaleTargetRef:
    name: model-trainer-worker
  minReplicaCount: 0 # Pause completely if grid is dirty
  maxReplicaCount: 20
  triggers:
    - type: carbon-intensity
      metadata:
        # Region whose grid carbon intensity is polled
        region: "us-west-2" # Oregon (hydro-heavy region)
        # Threshold in gCO2/kWh: the limit above which we scale down
        emissionThreshold: "200"
        # Handler: how to react above the threshold, 'scaleDown' or 'pause'
        handler: "scaleDown"

How It Works

  1. Data Ingestion: KEDA polls a carbon data provider (like WattTime or ElectricityMaps) for the us-west-2 region.

  2. Evaluation:

    • If the current intensity is 150 gCO2/kWh (clean): KEDA scales the model-trainer-worker deployment up to maxReplicaCount (20).

    • If the current intensity hits 300 gCO2/kWh (dirty, for example because the wind has dropped): KEDA scales the deployment down to minReplicaCount (0).

  3. Impact: Your workloads effectively "hibernate" during dirty hours and "sprint" during clean hours.
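The three steps above can be modeled with a short Python sketch. It is a simplified illustration of the described behavior, not KEDA's actual control loop. The `ScalingPolicy` class and both function names are hypothetical, the polled readings are made-up values, and the replica counts and threshold mirror the YAML above.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class ScalingPolicy:
    """Values mirrored from the ScaledObject above."""
    min_replicas: int = 0
    max_replicas: int = 20
    emission_threshold: float = 200.0  # gCO2/kWh


def desired_replicas(intensity: float,
                     policy: ScalingPolicy = ScalingPolicy()) -> int:
    """Evaluation step: clean grid -> max replicas, otherwise min replicas."""
    if intensity < policy.emission_threshold:
        return policy.max_replicas
    return policy.min_replicas


def scaling_events(readings: Iterable[float],
                   policy: ScalingPolicy = ScalingPolicy()) -> List[Tuple[int, int]]:
    """Walk through polled readings and record (poll index, replicas)
    each time the target replica count changes."""
    events: List[Tuple[int, int]] = []
    current = None
    for tick, intensity in enumerate(readings):
        target = desired_replicas(intensity, policy)
        if target != current:
            events.append((tick, target))
            current = target
    return events


# Hypothetical sequence of polled intensities (gCO2/kWh).
print(scaling_events([150.0, 160.0, 300.0, 310.0, 180.0]))
# [(0, 20), (2, 0), (4, 20)]
```

The output shows the "sprint" and "hibernate" pattern: the workload runs at 20 replicas for the first two polls, drops to 0 while the grid is dirty, and resumes when intensity falls again. A real deployment would also need the job to tolerate being stopped mid-run.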

Benefits for the Platform Engineer

  • Cost Savings: Often, low-carbon hours correlate with off-peak electricity pricing.

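  • ESG Compliance: Scaling events logged alongside grid intensity data give you an auditable record to support reporting on reduced electricity-related emissions (Scope 2 for infrastructure you power yourself; typically Scope 3 when the cluster runs on a public cloud).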

  • Automation: No manual intervention required. The grid dictates the schedule.
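As a rough illustration of the emissions benefit, operational emissions can be estimated as energy consumed multiplied by grid carbon intensity. The sketch below applies that to the two intensities used earlier (300 and 150 gCO2/kWh). The 500 kWh job size is a hypothetical figure chosen only for the arithmetic, and real emissions reporting follows your organization's accounting method rather than this simplified estimate.

```python
def emissions_kg(energy_kwh: float, intensity_g_per_kwh: float) -> float:
    """Estimate emissions in kilograms of CO2: kWh * (g/kWh) / 1000."""
    return energy_kwh * intensity_g_per_kwh / 1000.0


JOB_ENERGY_KWH = 500.0  # hypothetical energy use of one training run

dirty = emissions_kg(JOB_ENERGY_KWH, 300.0)  # run during a dirty hour
clean = emissions_kg(JOB_ENERGY_KWH, 150.0)  # run during a clean hour

print(dirty, clean, dirty - clean)  # 150.0 75.0 75.0
```

Under these assumptions, shifting the same job from a 300 gCO2/kWh window to a 150 gCO2/kWh window halves its estimated emissions, from 150 kg to 75 kg of CO2.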

Pro Tip: Use this for delay-tolerant workloads like retraining models, generating embeddings, or nightly reporting. Do not use this for customer-facing APIs where latency is critical.
