Working across multiple GCP projects for development, testing and production increases the risk of accidentally performing the right operation in the wrong environment. This can of course have disastrous results. In addition, using Terraform (which, in our opinion, is the best way to manage infrastructure) can quickly multiply the damage.
And while the large, extensible set of roles and permissions might make it easier to implement a more or less static principle of least privilege (POLP), most of the time all users, including administrators, only need read access to their GCP resources. Permanent write access does little to protect against accidental destructive operations. In my own experience, it often ends with an (often infrequently audited) list of people having the owner role.
Worth noting is that the owner role, like the other basic roles, doesn't support conditions, so granting temporary access using IAM conditions is not an option.
We have found that a very efficient way to minimize the risk of screwing things up is to implement a variant of the sudo pattern.
We prefer the following approach:
This has turned out to save the day many times.
Make sure there is always someone with the role of Organization Administrator in your organization - don’t get locked out.
A nice way to implement sudo is using shell scripts, Workflows, Python and Cloud Scheduler. Engineers sudo using the shell script, preferably also unsudo when they don’t need elevated access anymore, and in case they forget, a scheduled workflow runs periodically to remove the owner role from everyone in the project.
A sudo shell script could look like below. Key points are:
#!/bin/bash

# Get the active account
get_active_account() {
  gcloud auth list --filter=status:ACTIVE --format="value(account)"
}

# List all projects the user can see
list_gcp_projects() {
  gcloud projects list --format="value(projectId)"
}

# Set the user as an owner for the selected project
set_owner() {
  local project_id=$1
  local account=$2
  # Only report success if the binding was actually added
  gcloud projects add-iam-policy-binding "$project_id" \
    --member="user:${account}" --role="roles/owner" --no-user-output-enabled || exit 1
  echo "You've been set as an owner for project: $project_id"
}

main() {
  local account
  account=$(get_active_account)

  # If a project ID is provided as an argument, use it directly
  if [[ -n "$1" ]]; then
    set_owner "$1" "$account"
    exit 0
  fi

  local projects
  projects=$(list_gcp_projects)
  echo "Select a project from the list:"
  # $projects is left unquoted on purpose so each project ID becomes a menu entry
  select project_id in $projects; do
    if [[ -z "$project_id" ]]; then
      echo "Invalid option. Exiting..."
      exit 1
    fi
    set_owner "$project_id" "$account"
    break
  done
}

main "$@"
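The script above only covers the sudo direction. For the manual unsudo step, the policy edit itself is small. The following is a minimal sketch, assuming the policy has the usual `bindings` structure returned by `getIamPolicy`; the function name `drop_owner` and the sample accounts are made up for illustration and are not taken from the repo.

```python
import copy

OWNER_ROLE = "roles/owner"


def drop_owner(policy: dict, account: str) -> dict:
    """Return a copy of `policy` in which user `account` no longer holds roles/owner."""
    member = f"user:{account}"
    updated = copy.deepcopy(policy)  # leave the caller's policy untouched
    bindings = []
    for binding in updated.get("bindings", []):
        if binding["role"] == OWNER_ROLE:
            binding["members"] = [m for m in binding["members"] if m != member]
            if not binding["members"]:
                continue  # drop the owner binding once it has no members left
        bindings.append(binding)
    updated["bindings"] = bindings
    return updated


# Hypothetical sample policy; the etag value is a placeholder.
policy = {
    "etag": "placeholder-etag",
    "bindings": [
        {"role": "roles/owner",
         "members": ["user:alice@example.com", "user:bob@example.com"]},
        {"role": "roles/viewer", "members": ["user:alice@example.com"]},
    ],
}

print(drop_owner(policy, "alice@example.com")["bindings"])
```

From the shell, `gcloud projects remove-iam-policy-binding` with the same `--member` and `--role` arguments as in the sudo script should achieve the same thing in one call.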
An unsudo workflow would first call a function to make sure every owner is also an IAM admin, and then call a second function to remove the owner privileges of those users. Note that if GCP groups are used for IAM admin rights and the organization is set up properly, the first step could be skipped.
- initialize:
    assign:
      - project: ${var.project_id}
      - add_iam_admin_role_function_url: ${google_cloudfunctions2_function.add_iam_admin_role.service_config[0].uri}
      - remove_owner_role_function_url: ${google_cloudfunctions2_function.remove_owner_role.service_config[0].uri}
- add_iam_admin_role:
    call: http.get
    args:
      url: ${google_cloudfunctions2_function.add_iam_admin_role.service_config[0].uri}
      auth:
        type: OIDC
        audience: ${google_cloudfunctions2_function.add_iam_admin_role.service_config[0].uri}
    result: add_iam_admin_role_result
- remove_owner_role:
    call: http.get
    args:
      url: ${google_cloudfunctions2_function.remove_owner_role.service_config[0].uri}
      auth:
        type: OIDC
        audience: ${google_cloudfunctions2_function.remove_owner_role.service_config[0].uri}
    result: remove_owner_role_result
- final:
    return: "Workflow completed"
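The add_iam_admin_role function called in the first step is not shown in this post. As a rough sketch of the policy change it presumably makes, assuming it operates on the same kind of policy dictionary as the Cloud Function code further down (the name `grant_iam_admin_to_owners` and the sample data are hypothetical):

```python
import copy

OWNER_ROLE = "roles/owner"
IAM_ADMIN_ROLE = "roles/resourcemanager.projectIamAdmin"


def grant_iam_admin_to_owners(policy: dict) -> dict:
    """Return a copy of `policy` where every owner is also a project IAM admin."""
    updated = copy.deepcopy(policy)
    bindings = updated.setdefault("bindings", [])
    owners = [m for b in bindings if b["role"] == OWNER_ROLE for m in b["members"]]
    if not owners:
        return updated  # nothing to do, avoid creating an empty binding

    # Reuse the unconditional IAM admin binding if there is one, else create it.
    admin_binding = next(
        (b for b in bindings
         if b["role"] == IAM_ADMIN_ROLE and "condition" not in b),
        None,
    )
    if admin_binding is None:
        admin_binding = {"role": IAM_ADMIN_ROLE, "members": []}
        bindings.append(admin_binding)

    for member in owners:
        if member not in admin_binding["members"]:
            admin_binding["members"].append(member)
    return updated


# Hypothetical sample policy for illustration only.
sample = {
    "bindings": [
        {"role": OWNER_ROLE,
         "members": ["user:alice@example.com", "user:bob@example.com"]},
        {"role": IAM_ADMIN_ROLE, "members": ["user:bob@example.com"]},
    ]
}

print(grant_iam_admin_to_owners(sample)["bindings"][1]["members"])
```

Running this step before the removal means that engineers who lose the owner role can still grant it to themselves again with the sudo script.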
The actual Cloud Function that removes users from sudo would do something like:
from googleapiclient import discovery
from google.auth import default


def remove_owner_role(request):
    """HTTP entry point: removes the owner role from users in the current project."""
    credentials, project_id = default()
    if not project_id:
        return "Error: Project id not found", 500
    service = create_service()
    print("Project: " + project_id)
    if not remove_owner(service, project_id):
        return "Failed to remove owner role.", 500
    return "Successfully removed owner role.", 200


def create_service():
    """Provides a service using application default credentials."""
    return discovery.build('cloudresourcemanager', 'v1')


def get_policy(crm_service, project_id, version=3):
    """Gets IAM policy for a project."""
    policy = (
        crm_service.projects()
        .getIamPolicy(
            resource=project_id,
            body={"options": {"requestedPolicyVersion": version}},
        )
        .execute()
    )
    return policy


def set_policy(crm_service, project_id, policy):
    """Sets IAM policy for a project."""
    crm_service.projects().setIamPolicy(resource=project_id, body={"policy": policy}).execute()


def has_iam_admin_role(crm_service, project_id, user_email):
    """Checks if the specified user has the IAM admin role."""
    policy = get_policy(crm_service, project_id)
    iam_admin_binding = next((b for b in policy["bindings"] if b["role"] == "roles/resourcemanager.projectIamAdmin"), None)
    # Check if the user is in the IAM admin members list
    if iam_admin_binding and f"user:{user_email}" in iam_admin_binding["members"]:
        return True
    return False


def remove_owner(crm_service, project_id):
    """Removes the owner role."""
    policy = get_policy(crm_service, project_id)
    owner_binding = next((b for b in policy["bindings"] if b["role"] == "roles/owner"), None)
    if owner_binding:
        members_to_remove = []  # List to store members who will have their owner role removed
        for member in owner_binding["members"]:
            user_email = member.split(":")[1]  # Extract email from the "user:email" format
            if has_iam_admin_role(crm_service, project_id, user_email):
                members_to_remove.append(member)
            else:
                # Abort without changing anything so nobody gets locked out
                print(f"User {user_email} does not have IAM admin role. Refusing to remove owner role.")
                return False
        # Remove the members from the owner role and print their emails
        for member in members_to_remove:
            user_email = member.split(":")[1]
            print(f"Removing owner role for user: {user_email}")
            owner_binding["members"].remove(member)
        set_policy(crm_service, project_id, policy)
        return True
    return False


if __name__ == "__main__":
    # Simulate a call to the primary function
    result, status_code = remove_owner_role(None)
    print(f"Result: {result}, Status Code: {status_code}")
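The function above extracts the email with `member.split(":")[1]`, which assumes every owner is listed as `user:email`. IAM policies can also contain other member formats, such as `serviceAccount:`, `group:`, `deleted:user:...?uid=...` and `allUsers`. If you want the cleanup to recognise those and, for example, only touch human users, a small parser might help. This is a sketch only; `parse_member` and `human_owner_emails` are hypothetical helpers, not part of the repo.

```python
from typing import List, Tuple


def parse_member(member: str) -> Tuple[str, str]:
    """Split an IAM member string into (type, identifier)."""
    if member.startswith("deleted:"):
        # e.g. "deleted:user:alice@example.com?uid=123456789"
        _, kind, rest = member.split(":", 2)
        return f"deleted:{kind}", rest.split("?", 1)[0]
    kind, sep, identifier = member.partition(":")
    if not sep:
        # "allUsers" and "allAuthenticatedUsers" carry no identifier
        return kind, ""
    return kind, identifier


def human_owner_emails(policy: dict) -> List[str]:
    """Return the emails of all `user:` members holding roles/owner."""
    emails = []
    for binding in policy.get("bindings", []):
        if binding["role"] != "roles/owner":
            continue
        for member in binding["members"]:
            kind, identifier = parse_member(member)
            if kind == "user":
                emails.append(identifier)
    return emails


# Hypothetical sample policy for illustration only.
sample_policy = {
    "bindings": [
        {"role": "roles/owner",
         "members": [
             "user:alice@example.com",
             "serviceAccount:terraform@my-project.iam.gserviceaccount.com",
             "deleted:user:carol@example.com?uid=123456789",
         ]},
    ]
}

print(human_owner_emails(sample_policy))
```

Whether service accounts should keep the owner role is a separate decision; the sketch only shows how to tell the member types apart.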
A complete git repo containing source code, terraform module, cloud function implementation, etc. can be found here.
If you have any questions about what we do or if you think we can help in any way, please reach out on X or LinkedIn. We would love to hear your thoughts on what we are doing.