Practical Examples of Using Open Policy Agent (OPA) for Cloud Cost Control
Ready to get hands-on with cost control? This article provides practical, copy-paste-ready examples of Open Policy Agent (OPA) policies. Learn how to write rules in Rego to enforce budgets, require tags, and restrict expensive resources in both Kubernetes and Terraform.

Open Policy Agent (OPA) has become the de facto standard for policy as code, providing a unified framework to enforce rules across the cloud-native stack. While often associated with security and compliance, OPA is equally well suited to FinOps: because it evaluates any JSON input, the same engine that checks security rules can check cost rules automatically. By writing policies in its declarative language, Rego, you can create guardrails that prevent budget overruns and enforce cost-saving best practices.

How OPA Works for Cost Control

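OPA works by decoupling policy logic from your application or CI/CD pipeline. It evaluates a JSON input against policies written in Rego to make a decision (e.g., allow or deny). The snippets in this article use the classic deny[msg] { ... } rule form; OPA 1.0 and later expect deny contains msg if { ... } unless run in v0 compatibility mode (the --v0-compatible flag). For cost control, this process typically involves: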

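  1. Generating Input Data: For Kubernetes, the API server sends the admission controller an AdmissionReview request, a JSON document that contains the manifest of the resource being created or updated. For Terraform, you generate a plan (terraform plan -out=tfplan) and convert it to JSON (terraform show -json tfplan).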

  2. Adding Cost Data: A cost estimation tool like Infracost or Scalr's native integration analyzes the IaC plan and injects cost data (e.g., proposed_monthly_cost) into the JSON input for OPA.

  3. Evaluating Policies: OPA evaluates this combined JSON data against your Rego policies to check for violations.

  4. Enforcing Decisions: If a policy is violated, OPA returns a deny message, and the admission controller (for Kubernetes) or CI/CD pipeline (for Terraform) blocks the deployment.
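The four steps above can be sketched as a single Rego file with built-in tests. This is a minimal illustration, not a production policy: the package name, the monthly_budget value, and the mock_input helper are hypothetical, and the mock input only imitates what steps 1 and 2 would produce. It uses import rego.v1 so that it parses on current OPA releases.

```rego
package example.cost_pipeline

import rego.v1

# Hypothetical budget for this sketch; real pipelines often load it from data.
monthly_budget := 500

# Step 3: evaluate the combined input against the policy.
deny contains msg if {
    cost := input.tfrun.cost_estimate.proposed_monthly_cost
    cost > monthly_budget
    msg := sprintf("Proposed monthly cost $%v exceeds the $%v budget.", [cost, monthly_budget])
}

# Step 4: the caller blocks the deployment whenever deny is non-empty.
allow if count(deny) == 0

# Steps 1 and 2 are stood in for by a mock input: plan JSON plus injected cost data.
mock_input(cost) := {
    "plan": {"resource_changes": []},
    "tfrun": {"cost_estimate": {"proposed_monthly_cost": cost}},
}

test_over_budget_is_blocked if {
    inp := mock_input(750)
    not allow with input as inp
    count(deny) == 1 with input as inp
}

test_within_budget_is_allowed if {
    inp := mock_input(120)
    allow with input as inp
}
```

Saving the block as a .rego file and running opa test against it exercises both cases without a cluster or a real Terraform plan.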

OPA Policy Examples for Kubernetes

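In Kubernetes, OPA is often deployed as an admission controller, either as a plain OPA validating webhook or through Gatekeeper. This allows you to enforce policies every time a resource is created or updated. The examples below read the object from input.request.object, which is the shape a plain OPA webhook receives. Gatekeeper instead expects a rule named violation inside a ConstraintTemplate and exposes the object under input.review.object, so the rules need that small adaptation before Gatekeeper can run them.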

1. Enforce Mandatory Cost Allocation Labels

This policy ensures every Deployment has a cost-center label.

Code snippet

package kubernetes.validating.cost_control

deny[msg] {
    input.request.object.kind == "Deployment"
    not input.request.object.metadata.labels["cost-center"]
    msg := "Deployments must have a 'cost-center' label for cost allocation."
}

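How it works: This policy checks if the incoming object is a Deployment and if the cost-center label is missing from its metadata. Because not succeeds whenever the reference is undefined, the rule also fires when the object has no labels at all. If the label is not present, the rule generates a deny message, and the admission controller rejects the request.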
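For teams running Gatekeeper, the same check might look like the sketch below. Gatekeeper evaluates a rule named violation and passes the object under input.review.object, with constraint parameters under input.parameters. The package name and the labels parameter are hypothetical, and in practice this Rego would be embedded in a ConstraintTemplate, where the accepted syntax version depends on the Gatekeeper release.

```rego
package k8srequiredcostlabel

import rego.v1

# Hypothetical parameter name; falls back to the label used in this article.
required_labels := object.get(input, ["parameters", "labels"], ["cost-center"])

# Gatekeeper collects every object produced by the "violation" rule.
violation contains {"msg": msg} if {
    input.review.object.kind == "Deployment"
    some label in required_labels
    not input.review.object.metadata.labels[label]
    msg := sprintf("Deployments must have a '%v' label for cost allocation.", [label])
}

test_missing_label_is_reported if {
    review := {"review": {"object": {"kind": "Deployment", "metadata": {"name": "api"}}}}
    count(violation) == 1 with input as review
}

test_labelled_deployment_passes if {
    review := {"review": {"object": {
        "kind": "Deployment",
        "metadata": {"labels": {"cost-center": "platform"}},
    }}}
    count(violation) == 0 with input as review
}
```

Reading the label list from parameters lets one template back several constraints, for example one requiring cost-center and another requiring both cost-center and team.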

2. Restrict Expensive Node Selectors in Non-Production

This policy blocks Pods in the development namespace from selecting GPU nodes.

Code snippet

package kubernetes.validating.cost_control

deny[msg] {
    input.request.object.kind == "Pod"
    input.request.namespace == "development"
    input.request.object.spec.nodeSelector["gpu"] == "true"
    msg := "GPU nodes are not allowed in the 'development' namespace."
}

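How it works: This policy triggers if a Pod is being created in the development namespace and its nodeSelector requests a GPU-enabled node. The gpu: "true" label is an example; match whatever label your GPU node pools actually carry. Covering other non-production namespaces means adding them to the namespace check.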
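One way to cover several non-production namespaces at once is to test membership in a set, as in this sketch. The namespace names and the package name are assumptions chosen for illustration, and the block uses import rego.v1 syntax.

```rego
package kubernetes.validating.cost_control_gpu

import rego.v1

# Hypothetical namespace names; adjust to your cluster's conventions.
non_production_namespaces := {"development", "staging", "qa"}

deny contains msg if {
    input.request.object.kind == "Pod"
    input.request.namespace in non_production_namespaces
    input.request.object.spec.nodeSelector.gpu == "true"
    msg := sprintf("GPU nodes are not allowed in the '%v' namespace.", [input.request.namespace])
}

test_gpu_pod_in_staging_is_denied if {
    req := {"request": {
        "namespace": "staging",
        "object": {"kind": "Pod", "spec": {"nodeSelector": {"gpu": "true"}}},
    }}
    count(deny) == 1 with input as req
}

test_gpu_pod_in_production_is_allowed if {
    req := {"request": {
        "namespace": "production",
        "object": {"kind": "Pod", "spec": {"nodeSelector": {"gpu": "true"}}},
    }}
    count(deny) == 0 with input as req
}
```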

3. Enforce Resource Limits to Prevent Runaway Costs

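This policy denies Pods in which any container lacks a memory limit.

Code snippet

package kubernetes.validating.cost_control

deny[msg] {
    # Only Pods are checked here; workload kinds such as Deployments
    # nest their containers under spec.template.spec instead
    input.request.object.kind == "Pod"
    container := input.request.object.spec.containers[_]
    not container.resources.limits.memory
    msg := sprintf("Container '%v' must have memory limits defined.", [container.name])
}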

How it works: The policy iterates through each container in a pod's specification (containers[_]). If any container is found without resources.limits.memory defined, it generates a deny message.
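The same iteration pattern extends to CPU limits and to init containers. The following is a sketch under assumptions: the package name, the required_limits set, and the all_containers helper are illustrative names, not part of the original policy.

```rego
package kubernetes.validating.cost_control_limits

import rego.v1

# Limits every container is expected to declare (an assumption of this sketch).
required_limits := {"cpu", "memory"}

# Regular and init containers are collected into one set and checked alike.
all_containers contains c if some c in input.request.object.spec.containers

all_containers contains c if some c in input.request.object.spec.initContainers

deny contains msg if {
    input.request.object.kind == "Pod"
    some container in all_containers
    some limit in required_limits
    not container.resources.limits[limit]
    msg := sprintf("Container '%v' must have a %v limit defined.", [container.name, limit])
}

test_missing_cpu_limit_is_denied if {
    req := {"request": {"object": {"kind": "Pod", "spec": {"containers": [{
        "name": "web",
        "resources": {"limits": {"memory": "256Mi"}},
    }]}}}}
    deny == {"Container 'web' must have a cpu limit defined."} with input as req
}

test_init_container_is_checked if {
    req := {"request": {"object": {"kind": "Pod", "spec": {
        "containers": [{"name": "web", "resources": {"limits": {"cpu": "500m", "memory": "256Mi"}}}],
        "initContainers": [{"name": "migrate"}],
    }}}}
    count(deny) == 2 with input as req
}
```

Producing one message per missing limit, rather than one per container, tells the author of the manifest exactly which field to add.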

OPA Policy Examples for Terraform

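In a CI/CD workflow for Terraform, OPA evaluates a JSON file generated from the plan and cost estimation output. The examples below assume the cost estimate sits under input.tfrun.cost_estimate and the plan JSON under input.plan; the top-level keys depend on how your tool or pipeline assembles the file. Raw terraform show -json output, for instance, has resource_changes at the top level.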

1. Block Deployments Exceeding a Cost Increase Threshold

This policy blocks any pull request that would increase the monthly infrastructure cost by more than $100.

Code snippet

package terraform.cost_control

deny[msg] {
    input.tfrun.cost_estimate.delta_monthly_cost > 100
    msg := sprintf("Monthly cost increase of $%v exceeds the $100 limit.", [input.tfrun.cost_estimate.delta_monthly_cost])
}

How it works: This policy directly accesses the delta_monthly_cost field provided by a tool like Scalr. If the value is greater than 100, the policy fails, and the CI/CD pipeline can be configured to block the merge.
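A hard-coded 100 is easy to read but awkward to change per team or environment. The sketch below reads the limit from external data and falls back to 100 when none is loaded. The data path data.cost_limits.max_monthly_increase and the package name are hypothetical. The tests also pin down the boundary behavior: an increase of exactly 100 passes, and so does a cost reduction.

```rego
package terraform.cost_control_threshold

import rego.v1

# Used when no external limit is loaded.
default max_increase := 100

# Hypothetical data path; load it with a data file or bundle.
max_increase := data.cost_limits.max_monthly_increase

deny contains msg if {
    delta := input.tfrun.cost_estimate.delta_monthly_cost
    delta > max_increase
    msg := sprintf("Monthly cost increase of $%v exceeds the $%v limit.", [delta, max_increase])
}

test_exact_limit_is_allowed if {
    inp := {"tfrun": {"cost_estimate": {"delta_monthly_cost": 100}}}
    count(deny) == 0 with input as inp
}

test_cost_reduction_is_allowed if {
    inp := {"tfrun": {"cost_estimate": {"delta_monthly_cost": -40}}}
    count(deny) == 0 with input as inp
}

test_stricter_limit_from_data if {
    inp := {"tfrun": {"cost_estimate": {"delta_monthly_cost": 75}}}
    count(deny) == 1 with input as inp with data.cost_limits.max_monthly_increase as 50
}
```

One caveat applies to any numeric comparison of this kind: if the estimator emits costs as strings, convert them with to_number first, because Rego orders every string above every number and a string cost would always exceed the limit.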

2. Restrict Expensive Instance Types

This policy prevents the use of large, expensive AWS EC2 instance types.

Code snippet

package terraform.cost_control

# List of disallowed expensive instance types
disallowed_types := {"m5.8xlarge", "c5.12xlarge", "p3.2xlarge"}

# Plan actions that leave a new or modified instance behind
modifying_actions := {"create", "update"}

deny[msg] {
    # Find any aws_instance resource being created or updated
    resource_change := input.plan.resource_changes[_]
    resource_change.type == "aws_instance"
    modifying_actions[resource_change.change.actions[_]]

    # Check if the instance type is in the disallowed list
    instance_type := resource_change.change.after.instance_type
    disallowed_types[instance_type]

    msg := sprintf("Resource '%v' uses disallowed expensive instance type '%v'.", [resource_change.address, instance_type])
}

How it works: The policy iterates through all resource_changes in the Terraform plan JSON. It identifies any aws_instance being created or updated (a replacement appears as a delete plus a create, so it is covered too) and checks if its instance_type is present in the disallowed_types set.
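An explicit list has to be extended for every instance family. As an alternative sketch, the rule below matches on the size suffix instead, so any family at those sizes is caught. The suffix set and the package name are assumptions; the leading dot in each suffix keeps ".8xlarge" from also matching a size such as "18xlarge".

```rego
package terraform.cost_control_sizes

import rego.v1

# Hypothetical size suffixes treated as too large.
disallowed_sizes := {".8xlarge", ".12xlarge", ".16xlarge", ".24xlarge"}

deny contains msg if {
    some rc in input.plan.resource_changes
    rc.type == "aws_instance"
    some action in rc.change.actions
    action in {"create", "update"}
    instance_type := rc.change.after.instance_type
    some size in disallowed_sizes
    endswith(instance_type, size)
    msg := sprintf("Resource '%v' uses disallowed instance size '%v'.", [rc.address, instance_type])
}

test_only_the_large_instance_is_denied if {
    plan := {"plan": {"resource_changes": [
        {
            "address": "aws_instance.big",
            "type": "aws_instance",
            "change": {"actions": ["create"], "after": {"instance_type": "c5.12xlarge"}},
        },
        {
            "address": "aws_instance.small",
            "type": "aws_instance",
            "change": {"actions": ["create"], "after": {"instance_type": "t3.micro"}},
        },
    ]}}
    deny == {"Resource 'aws_instance.big' uses disallowed instance size 'c5.12xlarge'."} with input as plan
}

test_unchanged_instance_is_ignored if {
    plan := {"plan": {"resource_changes": [{
        "address": "aws_instance.legacy",
        "type": "aws_instance",
        "change": {"actions": ["no-op"], "after": {"instance_type": "m5.24xlarge"}},
    }]}}
    count(deny) == 0 with input as plan
}
```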

3. Enforce Data Archiving with S3 Lifecycle Policies

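This policy requires every new S3 bucket to define a lifecycle rule, the mechanism used to transition old data to a cheaper storage class. It checks only that a rule exists, not what the rule does.

Code snippet

package terraform.cost_control

deny[msg] {
    resource_change := input.plan.resource_changes[_]
    resource_change.type == "aws_s3_bucket"
    resource_change.change.actions[_] == "create"
    # Treat a missing attribute and an empty list alike: an empty list is
    # still a defined value, so a bare "not" would let it through
    rules := object.get(resource_change.change.after, "lifecycle_rule", [])
    count(rules) == 0
    msg := sprintf("S3 bucket '%v' must have a lifecycle_rule defined for cost optimization.", [resource_change.address])
}

How it works: This policy checks every new aws_s3_bucket resource to ensure that at least one lifecycle_rule block is present in its configuration. Note that AWS provider v4 and later deprecate the inline lifecycle_rule block in favor of the separate aws_s3_bucket_lifecycle_configuration resource, so on those versions the check has to look for that resource instead.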
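With AWS provider v4 and later, lifecycle rules usually live in a separate aws_s3_bucket_lifecycle_configuration resource rather than inline on the bucket. A rough sketch of the equivalent check follows. It relies on the configuration section of the plan JSON, where each expression lists the addresses it references, and it makes simplifying assumptions: only root-module resources are inspected, and buckets created with count or for_each are not handled. The package name is hypothetical.

```rego
package terraform.cost_control_s3

import rego.v1

# Addresses referenced by the bucket argument of any lifecycle configuration.
buckets_with_lifecycle contains ref if {
    some res in input.plan.configuration.root_module.resources
    res.type == "aws_s3_bucket_lifecycle_configuration"
    some ref in res.expressions.bucket.references
}

deny contains msg if {
    some rc in input.plan.resource_changes
    rc.type == "aws_s3_bucket"
    "create" in rc.change.actions
    not rc.address in buckets_with_lifecycle
    msg := sprintf("S3 bucket '%v' has no aws_s3_bucket_lifecycle_configuration referencing it.", [rc.address])
}

test_bucket_without_lifecycle_is_denied if {
    plan := {"plan": {
        "resource_changes": [
            {"address": "aws_s3_bucket.logs", "type": "aws_s3_bucket", "change": {"actions": ["create"]}},
            {"address": "aws_s3_bucket.tmp", "type": "aws_s3_bucket", "change": {"actions": ["create"]}},
        ],
        "configuration": {"root_module": {"resources": [{
            "address": "aws_s3_bucket_lifecycle_configuration.logs",
            "type": "aws_s3_bucket_lifecycle_configuration",
            "expressions": {"bucket": {"references": ["aws_s3_bucket.logs.id", "aws_s3_bucket.logs"]}},
        }]}},
    }}
    deny == {"S3 bucket 'aws_s3_bucket.tmp' has no aws_s3_bucket_lifecycle_configuration referencing it."} with input as plan
}
```

Matching on references rather than on the planned bucket attribute is deliberate: for a bucket that does not exist yet, the attribute value is typically unknown at plan time, while the reference is always recorded in the configuration.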

Conclusion

Open Policy Agent, when combined with cost estimation data, provides a powerful, flexible, and open-source framework for implementing FinOps as Code. The Rego language allows you to write nuanced policies that go far beyond simple budget alerting. By codifying cost controls for both Kubernetes and Terraform, organizations can build an automated governance system that enforces financial best practices and prevents costly mistakes.
