דלגו לתוכן

Terraform Authentication and Impersonation

פורסם בתאריך 23 בדצמבר 2024

תוכן זה אינו זמין עדיין בשפה שלך.

In this lesson we will learn how to manage permissions properly in our team. We will focus on a few key guidelines and best practices:

  • Groups - we will strive to avoid giving personal permissions to users, instead we will focus on placing users in groups and giving permissions to the group.
  • Minimal Permissions - Each of our groups will get the minimum permissions required to perform their tasks.
  • Impersonation - We will use service accounts and impersonation to create our cloud resources
  • Short lived token - We will impersonate service accounts using short lived generated tokens, this will be defined in our project so the developer will not have to manually impersonate the service account.
  • No basic roles - We will avoid using basic roles and instead use predefined roles or custom roles.

Everything will be defined in our Terragrunt Tofu/Terraform project, so a user who wants to create resource will automatically impersonate the proper service account without needing to manual impersonation. There are a few advantages to dealing with permissions properly:

  • Damage control - user has certain restrictions so there mistakes and damage will be limited.
  • Security - security breach can have limited damage.
  • No Cinderella complex - Cinderella complex is when we have a single devops user that is in charge of opening all the gcp resources for the entire organization, (like Cinderalla step sisters asking: do this Cinderella, do that Cinderella), there is also a management problem when we have a single user that is in charge of all the resources, if that user leaves the organization, we will have a problem.

Let’s learn how to manage properly permissions in a Terragrunt project.

Principals

In GCP we give permissions to principals in our case we will look at a smaller window in the principal list and we will focus on these 3 principals:

  • User
  • Group
  • Service Account

We will give permissions to these principals and they will be able to perform certain actions in GCP. Often we don’t have to give permissions to a user, we simply place that user in a group and give permissions to the group. In this example we will create 3 groups:

  • Admin - The admin group can create other groups and assign permissions to them.
  • DevOps - The devops group will have access to the common resources, they will also have access to all the environments: prod and non-prod.
  • Developers - The developers group will have access to the non-prod environments only.

We will create a folder in: iac/gcp/tofu/iam inside this folder we will create a folder called groups where we will define the different groups. Until now when we ran our IAC code we created resource in GCP using the google provider, the google provider requires us to authenticate with GCP, what we did is run the command gcloud auth application-default login authenticated in the browser with our user and then (if our user has enough permissions) we created resourced in GCP using our user.

Things are going to change using security best practices, if a user will try to create cloud resources using his own identity he will fail, no user in the will be able to do that. A user will be part of a group, when a user in in a group he will inherit any permissions that the group has. So if we give permissions for the group to created cloud resources, technically I can create the resources using my own identity, but this is not the case, even if team members will be placed in a group they will not be able to create cloud resources. The only principals that will have the permissions to run our terraform code successfully are dedicated IAC service accounts (notice that I’m using plural, not one service account which I saw many organizations do - give god like permissions to a single service account is not recommended). So if the google provider needs us to authenticate how do we authenticate as a service account? We use something called Impersonation.

Impersonation

Impersonation is when an authenticated principal, such as user or another service account, authenticate as a different service account to gain the service account’s permissions. Impersonation in GCP is similar to AWS AssumeRole and Azure On-Behalf-Of flow. Using the impersonation tricks we will create several service accounts with different gcp resource creation privilages, we can then give impersonation permissions to our groups to impersonate certain service accounts according to the permissions we want to grant the team memebers in the group.

So a user is can not run our terraform code, he is placed in a group so he is getting more permissions but even with those permissions he can’t directly run the terraform code, but the group might contains permissions to impersonate certain service accounts that are dedicated to IAC, and then using impersonation the user can create the cloud resources. Now that sounds really complicated right? Does that mean every time I want to run terraform code I need to remember which service account is allowed to run that code? Does it mean I need to start generating tokens to impersonate the service account? No, and it is actually simpler than you think. In the code itself we will define in each folder which service account should be impersonated to and the impersonation will happen automatically.

Why do we need impersonation

If we are using impersonation it means the GCP user is not directly creating the resources, instead the user is impersonating a service account that has the proper permissions to create the resources and then the service account will create those resources. This is a good practice because:

  • Enhanced security - we will follow the principle of least privilage, every user will get the minimum amount of permissions required. If that user wants to create resources he will have to generate a short lived token to impersonate the service account which will require the user to authenticate. If the user account is compromised the attacker will not be able to create resources in GCP.
  • Better CI Integration - Usually you assign service account to the CI and you can easily add impersonation to the CI by assigning him to the right group
  • Better audit logs and breadcrumbs - you can easily track resource generation which will help you manage vulnerabilities and security breaches.
  • Better permissions management - It’s easier to understand the teams you have and which permissions each group has, it allows you for better management, better organization, better control on which resources can be created by who, which gives you more control on your cloud and the amount of resources that can be generated by the cloud.
  • Better damage control - if a user account is compromised you can easily find out which service account goes berserk and understand which group is compromised and easily sandbox the problem and revoke permission on that service account.

Service, Permissions, Roles, Predefined roles, Basic Roles

Before we start coding let’s cover some of the terminology here

Service

In GCP a service is a resource that is managed by GCP, for example a GKE cluster, a GCS bucket, a Pub/Sub topic, a BigQuery dataset, etc. A service has a name, for example: Kubernetes Engine, it has an api: container.googleapis.com, and it has a service permission prefix: container (the prefix of the api). You can use find the api by the service name here

Permissions

Based on the service that we want to give certain permission to we can have the list of permissions that are associated with that service. You can find the permissions here and you find the relavant ones by searching using the service permission prefix. For example if I want to give permissions on Kubernetes Engine I will find the api container.googleapis.com and I will search using the permission prefix: container. And looking at the first of the list we have the container.apiServices.create and on the right you have the list of roles that include this permissions.

Roles

Roles are a group of permissions, you can create custom roles, predefined roles, or use basic roles (which are not recommended).

Predefined roles

Predefined roles are roles that are created and maintained by Google, they are roles that are created for a specific service and they give you a logical permission on that service. In our course we will use predefined roles only. The best place to search for the proper predefined role is the IAM roles page.

Basic roles

Basic roles are highly permissive roles that existed prior to the introduction of IAM, we recommend to avoid using these roles and instead use the predefined roles or custom roles.

Custom roles

Custom roles are roles that you can create, you can create a role with a specific set of permissions and assign that role to a user, group, or service account.

Root Terragrunt

At the moment our root Terragrunt looks like this:

...
generate "provider" {
path = "provider.tf"
if_exists = "overwrite"
contents = <<EOF
provider "google" {
region = "${local.region}"
billing_project = "${local.billing_project}"
user_project_override = true
}
EOF
}

What we are doing is defining the google provider without any special authentication configurations, which means every GCP resource we create now will be created by our user. Things are going to change, what we would like to do is enforce service account impersonation. So we will change this logic to this:

locals {
sa_vars = try(yamldecode(file(find_in_parent_folders("sa.yaml"))), {"email": ""})
...
}
generate "provider" {
path = "provider.tf"
if_exists = "overwrite"
contents = <<EOF
# create condition based on the existence of the service account
%{ if length(local.sa_vars.email) > 0 }
provider "google" {
alias = "impersonation"
user_project_override = true
scopes = [
"https://www.googleapis.com/auth/cloud-platform",
"https://www.googleapis.com/auth/userinfo.email",
]
}
#receive short-lived access token
data "google_service_account_access_token" "default" {
provider = google.impersonation
target_service_account = "${local.sa_vars.email}"
scopes = ["cloud-platform", "userinfo-email"]
lifetime = "3600s"
}
%{ endif }
provider "google" {
region = "${local.region}"
billing_project = "${local.billing_project}"
user_project_override = true
%{ if length(local.sa_vars.email) > 0 }
access_token = data.google_service_account_access_token.default.access_token
%{ endif }
}
provider "google-beta" {
region = "${local.region}"
user_project_override = true
billing_project = "${local.billing_project}"
%{ if length(local.sa_vars.email) > 0 }
access_token = data.google_service_account_access_token.default.access_token
%{ endif }
}
EOF
}

The logic here is a bit complex so let’s try and break it down:

  • sa_vars is a local variable that will try to find the first sa.yaml file in the parent folders, if it finds it it will decode the yaml file and extract the email of the service account. the sa.yaml file represents the service account that we want to impersonate to. this means every section can define different service account that is in charge of creating things in that section.
  • We are creating a first google provider with alias = "impersonation", this provider will authenticate using our user
  • Using a data block we are creating a short lived token for the service account that we want to impersonate to.
  • Using that token we can now create google and google-beta providers that will authenticate using the service account token.

Now every place we want impersonation we can simply add a file called sa.yaml:

sa.yaml
email: some-service-account-email@that-we-will-impersonate-to.com

If the file sa.yaml is not found, there is a fallback to simply use the user permissions without impersonation.

The logic here is the engine behind how we are going to deal with authentication and permissions in our Terragrunt project.

IAM Project

All the service accounts that can create cloud resources and that we can impersonate them, will belong to a new project we are going to create academeez-k8s-iam. We will also place that project in an iam folder. Let’s create the project and folder:

Creating the folder:

iac/gcp/tofu/iam/folder/terragrunt.hcl
include "root" {
path = find_in_parent_folders()
}
include "folder" {
path = "${dirname(find_in_parent_folders())}/_env/folder.hcl"
}
# since this folder is under the root folder we will use dependency to get the parent folder
dependency "root_folder" {
config_path = "../../common/folders/root"
}
inputs = {
parent = dependency.root_folder.outputs.id
names = ["iam"]
}

We already created a folder in previous lessons, we are simply using the same logic to create a new folder called iam.

Now let’s create the project:

iac/gcp/tofu/iam/project/terragrunt.hcl
include "root" {
path = find_in_parent_folders()
}
include "project" {
path = "${dirname(find_in_parent_folders())}/_env/project.hcl"
}
# Grab the iam folder
dependency "iam_folder" {
config_path = "../folder"
}
inputs = {
folder_id = dependency.iam_folder.outputs.id
name = "academeez-k8s-iam"
}

We are using the _env/project.hcl file to create the project named academeez-k8s-iam We are also grabbing the folder we just created to parent that folder.

Note that until now we did not place any sa.yaml file so the project and folder will be created with our user permissions, without impersonation.

Admin Service Account

Those gcp

Basic roles

Basic roles are highly permissive roles that existed prior to the introduction of IAM. We recommend to avoid using these roles and instead use the predefined roles (or fine tuned and limited custom roles).

Predefined roles

Predefined roles will focus on a certain GCP service and give you some sort of logical permission on that service. These roles are created and maintained by Google. In this course we will learn to use the Predefined roles to give limited permissions based on service.

The GCP IAM docs have a search for predefined roles which we will use

How to find the right predefined role

First step is to figure what is the service we want to give permissions to, each service has a service permission prefix. For example if we want to give a principal permission to create a K8S cluster than then to find the service permission prefix

If we need a permission and we want to find the roles that include that permission we can goto gcp IAM permissions reference