IAM: What happens when you assume a role?
One of the many items on the list of AWS security best practices is to use roles to grant limited access to certain resources for a limited period of time. When you start out, the process of using roles often isn’t straightforward and you end up in a situation where you’re told to “just execute these commands and it will work” (reminds me of git). In this article we’re going to take a look behind the curtain and we’ll learn what’s going on when you assume a role.
Basics
Before we dive into the process, let’s get some terminology out of the way. You’ll most likely be familiar with IAM Users. These are an example of both identities and principals in AWS. An identity is used to identify and group entities within Identity and Access Management. Principals on the other hand can take action in an AWS account and make calls to APIs. These are closely related, which is why these terms are often used interchangeably. There is however a difference:
An IAM group is not a principal and as such can’t take action in an AWS account. We can see in this diagram that only users and roles are both identities and principals. These are what we’re going to focus on. The other principals work the same way. Principals take care of the authentication in an API call, which is the first step that needs to happen. The entity must prove that they are who they claim to be.
The next step is authorization, which is covered by policies in IAM. A policy allows or denies a set of actions to a principal on certain resources. These policies come in two main varieties: identity-based and resource-based policies.
Identity-based policies can be attached to any identity, while resource-based policies belong to resources. They’re very similar in what they do, but there are a few key differences:
- Perspective: An identity-based policy answers the question “Which API calls can this identity perform on which resources?” whereas a resource-based policy answers the question “Which identities can perform which actions on me?”.
- Syntax: Resource-based policies have an additional mandatory element called Principal that is used to - you guessed it - define which principal(s) the statement refers to.
Example of an identity-based policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": "s3:PutObject",
      "Resource": "*",
      "Effect": "Allow"
    }
  ]
}
Example of a resource-based policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": "s3:PutObject",
      "Principal": {
        "AWS": "arn:aws:iam::<account-id>:root"
      },
      "Resource": "arn:aws:s3:::mybucket/*",
      "Effect": "Allow"
    }
  ]
}
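To get a feel for how such a statement is matched against a request, here is a toy evaluator in Python. It is a deliberately simplified sketch and not how IAM is implemented: it only looks at Effect, Action and Resource, ignores Principal, Condition, NotAction and friends, and uses shell-style wildcards as an approximation. The function names `statement_matches` and `is_allowed` are made up for this example.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    """Action and Resource may be a single string or a list of strings."""
    return [value] if isinstance(value, str) else list(value)


def statement_matches(statement, action, resource):
    """Return True if the statement covers the requested action and resource."""
    action_ok = any(
        # Action names are compared case-insensitively.
        fnmatchcase(action.lower(), pattern.lower())
        for pattern in _as_list(statement["Action"])
    )
    resource_ok = any(
        fnmatchcase(resource, pattern)
        for pattern in _as_list(statement["Resource"])
    )
    return action_ok and resource_ok


def is_allowed(policy, action, resource):
    """Explicit Deny beats Allow; no matching statement is an implicit deny."""
    matching = [
        s for s in policy["Statement"] if statement_matches(s, action, resource)
    ]
    if any(s["Effect"] == "Deny" for s in matching):
        return False
    return any(s["Effect"] == "Allow" for s in matching)


identity_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Action": "s3:PutObject", "Resource": "*", "Effect": "Allow"},
    ],
}

print(is_allowed(identity_policy, "s3:PutObject", "arn:aws:s3:::mybucket/report.csv"))
print(is_allowed(identity_policy, "s3:GetObject", "arn:aws:s3:::mybucket/report.csv"))
```

The first call prints True because the statement allows s3:PutObject on every resource. The second prints False because nothing in the policy mentions s3:GetObject, so the request falls through to the implicit deny.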
Now that we’ve covered the mechanisms for authentication (principals) and authorization (policies), let’s take a look at roles. A role is both a principal and identity in AWS and has the primary purpose of granting temporary permissions to perform API-calls in an account. In order to use a role, it has to be assumed.
Each role has a trust relationship which determines the entities that can assume the role. It also has a set of permissions that define which privileges entities get after they assume the role. That has been a lot about permissions but trust me, it’s critical to understand.
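As an illustration, the two halves of a role could look roughly like this. The sketch below builds both documents as Python dictionaries and prints them as JSON. The helper names, the account ID 111122223333, the user alice and the bucket name are placeholders invented for this example.

```python
import json


def build_trust_policy(principal_arn):
    """Trust relationship: who is allowed to call sts:AssumeRole on the role.

    Note that a trust policy has a Principal but no Resource element,
    because the role it is attached to is the resource.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def build_permissions_policy(bucket_name):
    """Permissions: what a caller may do after assuming the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "s3:PutObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }


# Placeholder values, not taken from a real account.
trust_policy = build_trust_policy("arn:aws:iam::111122223333:user/alice")
permissions_policy = build_permissions_policy("mybucket")

print(json.dumps(trust_policy, indent=2))
print(json.dumps(permissions_policy, indent=2))
```

The trust policy is a resource-based policy attached to the role itself, while the permissions policy is an ordinary identity-based policy.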
Assuming a Role
To assume a role, we use the Security Token Service (STS), which gives us temporary credentials for the role. Why would we need separate credentials? The credentials you get when you assume a role let you act as the role until they expire. They’re separate from your original credentials, so you can easily use both at the same time for different API calls. Temporary credentials also look different from long-term credentials: their access key IDs start with ASIA instead of AKIA, and they come with an additional session token.
The API call we need to make in order to assume the role is the sts:AssumeRole action. Here we need to specify the ARN of the role we want to assume as well as a session name. The session name will be visible in CloudTrail and is part of what makes it transparent who assumed a role. Optionally, we can also specify how long the credentials should be valid. The upper limit is 12 hours, and each role’s maximum session duration setting (one hour by default) caps what you can request.
In order for this to work, the principal that assumes the role needs the sts:AssumeRole permission for said role in its identity policy, and the principal needs to be listed in the trust relationship of the role. If either of them is missing, the call generally fails (the exception is a trust policy that names the calling user or role in the same account directly, which is sufficient on its own). When the permissions are set up correctly, the STS response contains a credentials object with four pieces of information:
- Access Key Id
- Secret Access Key
- Security Token
- Expiration - a timestamp that tells you when the credentials expire
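In Python this could look like the following minimal sketch. It assumes boto3, where `sts_client` would be `boto3.client("sts")`. The helper name `assume_role`, the role ARN and the session name in the comments are placeholders for this example.

```python
def assume_role(sts_client, role_arn, session_name, duration_seconds=3600):
    """Call sts:AssumeRole and return (client_kwargs, expiration).

    client_kwargs holds the three values needed to build an API client,
    expiration is the timestamp at which the credentials stop working.
    """
    response = sts_client.assume_role(
        RoleArn=role_arn,
        RoleSessionName=session_name,  # shows up in CloudTrail
        DurationSeconds=duration_seconds,
    )
    credentials = response["Credentials"]
    client_kwargs = {
        "aws_access_key_id": credentials["AccessKeyId"],
        "aws_secret_access_key": credentials["SecretAccessKey"],
        "aws_session_token": credentials["SessionToken"],
    }
    return client_kwargs, credentials["Expiration"]


# Usage with boto3 (needs valid AWS credentials, so it is left commented out):
#
# import boto3
# client_kwargs, expires_at = assume_role(
#     boto3.client("sts"),
#     "arn:aws:iam::111122223333:role/demo-role",  # placeholder ARN
#     "demo-session",
# )
# s3 = boto3.client("s3", **client_kwargs)  # this client acts as the role
```

Passing the STS client in as a parameter keeps the helper easy to test with a stand-in object and makes it obvious which identity is doing the assuming.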
You can then use the first three pieces of information to instantiate your API client and make API calls with the permission of the role. Now we know all the pieces of the puzzle, let’s visualize the full picture:
- A user has a set of credentials that allows them to call sts:AssumeRole. They make the API call to STS and sign the request with their long-term security credentials.
- STS receives the API call. Before it is processed, IAM checks whether the user’s identity-based permissions allow this API call (2a). Then STS checks whether the trust relationship of the role also allows the principal to assume it (2b).
- Only if both of those checks succeed are temporary security credentials returned to the client.
- The user can then use the temporary security credentials to make an API call (e.g. to S3) and will have the permissions that are assigned to the role.
To me the process seemed a bit confusing when I first encountered it, but it makes a lot of sense once you understand what’s going on. Now we’re going to take a look at some common issues when assuming roles or using temporary credentials.
Troubleshooting
Access Denied when assuming a role
An error occurred (AccessDenied) when calling the AssumeRole operation: User: arn:aws:... is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::<account>:role/<rolename>
This is probably the most common error you’ll encounter and there are a few things to check:
- Make sure the ARN of the role is correct.
- Make sure the identity that calls sts:AssumeRole has the permission in its policies.
- Make sure the trust relationship of the role references the entity that calls assume role.
- Make sure there is no explicit deny for the operation in either of the two policies.
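For the first item on that list, a quick structural check of the role ARN can catch typos such as a misplaced colon. The following is a rough sketch: the regular expression is my approximation of the documented ARN layout rather than an official validator, and the name `parse_role_arn` is made up for this example.

```python
import re

# arn:<partition>:iam::<12-digit account id>:role/<optional path/><role name>
# The IAM region field is always empty, hence the double colon.
ROLE_ARN_PATTERN = re.compile(
    r"^arn:(?P<partition>aws[a-z-]*):iam::(?P<account>\d{12}):role/(?P<name>[\w+=,.@/-]+)$"
)


def parse_role_arn(arn):
    """Split a role ARN into its parts or raise ValueError if it looks wrong."""
    match = ROLE_ARN_PATTERN.match(arn)
    if match is None:
        raise ValueError(f"not a valid IAM role ARN: {arn!r}")
    return match.groupdict()


# Placeholder ARN, not a real account.
print(parse_role_arn("arn:aws:iam::111122223333:role/demo-role"))

try:
    # Colons in the wrong place: the account id sits where the region belongs.
    parse_role_arn("arn:aws:iam:111122223333::role/demo-role")
except ValueError as error:
    print(error)
```

A check like this only tells you that the ARN is well-formed. It can’t tell you whether the role exists or whether you’re allowed to assume it.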
Expired Credentials
An error occurred (ExpiredToken) when calling the ListBuckets operation: The provided token has expired.
This is the error you see when the temporary credentials you got back from assuming a role have expired. The fix is fairly easy: assume the role again and use the new temporary credentials.
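If you’d rather not wait for the error, you can check the expiration timestamp and assume the role again shortly before it is reached. Here is a minimal sketch, assuming a boto3-style STS client whose response carries a timezone-aware Expiration. The class name `RoleCredentials` and the five-minute margin are arbitrary choices for this example.

```python
import datetime


class RoleCredentials:
    """Caches temporary credentials and re-assumes the role before they expire."""

    def __init__(self, sts_client, role_arn, session_name, refresh_margin_seconds=300):
        self._sts = sts_client
        self._role_arn = role_arn
        self._session_name = session_name
        self._margin = datetime.timedelta(seconds=refresh_margin_seconds)
        self._credentials = None

    def _needs_refresh(self, now):
        if self._credentials is None:
            return True  # nothing cached yet
        # Refresh a little early so in-flight requests don't hit ExpiredToken.
        return now >= self._credentials["Expiration"] - self._margin

    def get(self, now=None):
        """Return valid credentials, calling sts:AssumeRole only when needed."""
        if now is None:
            now = datetime.datetime.now(datetime.timezone.utc)
        if self._needs_refresh(now):
            response = self._sts.assume_role(
                RoleArn=self._role_arn,
                RoleSessionName=self._session_name,
            )
            self._credentials = response["Credentials"]
        return self._credentials
```

The optional `now` parameter exists only to make the refresh logic testable without waiting for real credentials to expire.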
Access Denied when using a role
When using a role, you sometimes encounter Access Denied errors that shouldn’t be there. There are two main sources of errors here:
- The permissions aren’t actually there, which you can diagnose by taking a look at the policies attached to the role.
- You’re not actually using the role, which you can diagnose with the aws sts get-caller-identity command. It gives you the ARN of the entity that’s making the API call, and you don’t need special permissions for it (see the API docs).
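The same check works from code. The sketch below assumes a boto3-style STS client (`boto3.client("sts")`) and relies on the fact that the ARN of an assumed role has the form arn:aws:sts::<account>:assumed-role/<rolename>/<session-name>. The function name `describe_caller` and the shape of its return value are made up for this example.

```python
def describe_caller(sts_client):
    """Report which identity the current credentials belong to."""
    identity = sts_client.get_caller_identity()
    arn = identity["Arn"]
    # An ARN has six colon-separated fields; the last one names the resource.
    resource = arn.split(":", 5)[5]
    kind, _, rest = resource.partition("/")
    if kind == "assumed-role":
        role_name, _, session_name = rest.partition("/")
        return {
            "type": "assumed-role",
            "role": role_name,
            "session": session_name,
            "arn": arn,
        }
    # Anything else, e.g. "user/<name>" for an IAM user.
    return {"type": kind, "name": rest, "arn": arn}


# Usage with boto3 (needs valid AWS credentials, so it is left commented out):
#
# import boto3
# print(describe_caller(boto3.client("sts")))
```

If the result says "user" where you expected "assumed-role", your API client is still using your long-term credentials rather than the temporary ones.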
Summary
In this post we took a look at what goes on behind the scenes when you assume a role in AWS. We looked at identities, principals and the different kinds of policies that can be used. After covering the workflow to assume a role, we also looked at some common errors and how to address them.
Hopefully this has been helpful to you and I look forward to your questions, concerns or any feedback you may have. If you want to get in touch, take a look at my bio - there are a few ways to do that.
— Maurice