AWS/EKS Setup¶
To remain inherently safe, Karrots does not acquire or store account secrets. On EKS, Karrots relies on the AWS CLI tool aws
to perform all of the Terraform operations that require secrets and authorization. To make this work you will need to first do a little aws
setup.
Prerequisites¶
Before you run Karrots you need an AWS organization, project, and billing account. Your user account must then have admin privileges in that organization. You can find organization setup information here: https://docs.aws.amazon.com/organizations/.
AWS-CLI¶
The Terraform scripts that Karrots runs use your local aws
setup to handle validation and authorization. You can find aws
installation instructions here: https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2-mac.html. (On Mac we recommend Homebrew, which Amazon's official install docs don't mention even though Homebrew supports the AWS CLI as a first-class install: https://formulae.brew.sh/formula/awscli.)
JQ¶
You will need to install jq
so that karrots
can extract values from the AWS-CLI results. For installation instructions visit: https://stedolan.github.io/jq/download/.
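As a minimal sketch of the kind of extraction jq performs on AWS-CLI output, the helper below reads JSON from standard input and prints one field. The function name is hypothetical, and this is not necessarily the query that karrots itself runs.

```shell
# Hypothetical helper: print the "Account" field of the JSON document
# that `aws sts get-caller-identity --output json` writes to stdout.
account_id_from_identity() {
  jq -r '.Account'
}

# Usage (requires configured AWS credentials):
#   aws sts get-caller-identity --output json | account_id_from_identity
```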
AWS Setup¶
You will probably need to perform the next few steps as the AWS root user. Sign in to the AWS console as the root user of the organization that you want to install karrots on. Generally, an organization should only use its root user to set up sub-organizations, accounts, roles, groups, and users. You can find more information here: orgs_manage_accounts_access-as-root.
To access the karrots
user once it is set up, follow "option 2" in these instructions: organizations-member-account-access.
AWS Organization¶
Create an AWS organizational unit called karrots
to contain your karrots
account: https://console.aws.amazon.com/organizations/home.
AWS Karrots Account¶
Create an AWS account called karrots
within your karrots
AWS organizational unit (this is where your karrots-worker
users will go): https://console.aws.amazon.com/organizations/home?#/accounts/add/create. Amazon doesn't let you use duplicate emails for accounts, so you might need to create an email like karrots-admin@organization
first. Hang onto the account ID once created — you will need it to limit policy scope in a later step.
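If you lose track of the account ID, one way to look it up again from the command line is sketched below. It assumes the default AWS-CLI profile is allowed to call the Organizations API (typically credentials from the organization's management account); the function name is hypothetical.

```shell
# Hypothetical helper: print the ID of the member account named "karrots".
# Assumes the active credentials may call organizations:ListAccounts.
karrots_account_id() {
  aws organizations list-accounts \
    --query "Accounts[?Name=='karrots'].Id" \
    --output text
}

# Usage:
#   karrots_account_id
```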
It gets a little tricky here, but we do it this way for good reason: to make sure the karrots-worker
service worker account is well isolated from the rest of your AWS accounts and organizations. In the event of a breach against a laptop or host that runs karrots-worker
the attacker will not be able to expand beyond the confines of your karrots-worker
IAM permissions.
When you created the new karrots
account, AWS automatically created a root user karrots
for that account. You should have received a welcome message to the email address of this root user. If you did not receive this email then you won't be able to proceed (you will need to debug this first). AWS creates the new karrots
account root user with an unknown 64-byte random password. Receiving the welcome email means you can request a password reset so that you can gain control of this root user. You can find complete instructions here: orgs_manage_accounts_access-as-root (under the section: "To request a new password for the root user of the member account."). N.B.: It is best practice to only use an account's root user for managing users, groups and policies.
The AWS instructions to recover the new karrots
account root user ask you to first log out of your current user so you can return to the AWS console login. Since we only need to issue a few commands as the karrots
account user, it's probably easier to simply open a private browser window and start a new login session: https://console.aws.amazon.com/.
The first thing you should do when you log in as the karrots
account root user is add an MFA device (Titan Key, Google Authenticator, etc.): https://console.aws.amazon.com/iam/home#/security_credentials.
AWS Karrots Policies¶
Once you're able to log in as the karrots
account root user, you need to create a set of karrots-worker
IAM policies in that account: https://console.aws.amazon.com/iam/home#/policies. Use the JSON edit option to create each policy. The JSON files below contain the policies, but you must first substitute <KARROTS_MAIN_ACCOUNT_ID>
with your actual karrots
account ID in each file before you use it. Each of the policies should have the same name as their JSON file, e.g. karrots-worker-autoscale
.
- jenkins-agent.json
- jenkins-controller.json
- karrots-worker-autoscale.json
- karrots-worker-ec2.json
- karrots-worker-peering.json
- karrots-worker-eks.json
- karrots-worker-elb.json
- karrots-worker-iam.json
- karrots-worker-kms.json
- karrots-worker-route53.json
- karrots-worker-sts.json: replace <YOUR_ROOT_ACCOUNT_ID> in this file with the account ID of your organization's root user, not the karrots account you just created.
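The placeholder substitution can also be scripted. The sketch below is one possible approach, not part of karrots: the function names are hypothetical, it only handles the <KARROTS_MAIN_ACCOUNT_ID> placeholder (karrots-worker-sts.json needs its second placeholder replaced separately), and create_policy assumes your active AWS-CLI credentials may create IAM policies in the karrots account.

```shell
# Hypothetical helper: print a policy file with the account-ID placeholder
# replaced. $1 = path to the policy JSON, $2 = karrots account ID.
render_policy() {
  sed "s/<KARROTS_MAIN_ACCOUNT_ID>/$2/g" "$1"
}

# Hypothetical helper: create one IAM policy named after its JSON file.
# $1 = path to the policy JSON, $2 = karrots account ID.
create_policy() {
  name=$(basename "$1" .json)
  aws iam create-policy \
    --policy-name "$name" \
    --policy-document "$(render_policy "$1" "$2")"
}

# Usage:
#   create_policy karrots-worker-eks.json 123456789012
```

Calling render_policy on its own lets you review the substituted JSON before pasting it into the console editor.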
AWS Karrots Worker Group¶
While still logged into the karrots
account root user session, you need to create a karrots-worker
IAM group in that account: https://console.aws.amazon.com/iam/home#/groups. Now attach all of the karrots policies, except jenkins-agent
and jenkins-controller
, to the group. (The easiest way to do this is to filter by the word karrots
and then select them all before adding.)
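For reference, a command-line equivalent of the attach step might look like the sketch below. It assumes the policies were created with the same names as the JSON files listed earlier, and that the active credentials may manage IAM in the karrots account; the function name is hypothetical.

```shell
# Hypothetical helper: attach the karrots-worker policies to the
# karrots-worker group. $1 = karrots account ID.
# Customer-managed policy ARNs have the form
# arn:aws:iam::<account-id>:policy/<name>.
attach_worker_policies() {
  for name in autoscale ec2 peering eks elb iam kms route53 sts; do
    aws iam attach-group-policy \
      --group-name karrots-worker \
      --policy-arn "arn:aws:iam::$1:policy/karrots-worker-$name"
  done
}

# Usage (after `aws iam create-group --group-name karrots-worker`):
#   attach_worker_policies 123456789012
```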
AWS Karrots Worker IAM User¶
The following is a suggestion, but you can set this up many ways. The goal in our setup is to make sure that any karrots-worker
has the least privileges needed to do its work and no more. Adding any of your regular IAM users to the karrots-worker
group potentially gives the karrots
binary more permissions than it needs when it runs. At the same time, since many use cases for karrots
involve automation, we need to generate API keys to allow karrots
to gain permission to do its work. These API keys are the weak link in your security chain since they're "out in the world" where someone could compromise them. For this reason we don't want to create a single API key that's shared by every user or process that needs to invoke karrots
. We suggest that you create a karrots-worker-<USER_NAME>
for every user or process that needs an API key to invoke karrots
. If someone compromises an API key you can delete that API key and deal with the damage on the single user or process. If a user with a karrots
API key is no longer part of your system, you can easily delete that karrots-worker-x
IAM user.
While still logged into the karrots
account root user session, you need to create a karrots-worker-<USER_NAME>
IAM user in that account: https://console.aws.amazon.com/iam/home#/users. Set the user name to karrots-worker-<USER_NAME>
and select the Programmatic Access option. Next add the user to the karrots-worker
group you created above.
When you complete user creation, you should copy, and securely store, the access key id
and secret access key
. These are the keys you will use when you configure your local AWS-CLI profile to run karrots
. N.B.: protect these keys! Do not share them through insecure means such as email, Slack, MSFT Teams, paper/pencil, etc. The best way to manage these keys is with a secrets manager such as 1Password or LastPass. Most secrets managers have a shared vaults feature that allows you to share and manage secrets in a secure way; this is the best way to share these new API keys with someone.
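The per-user pattern described above could be scripted roughly as follows. This is a sketch under assumptions: the function names are hypothetical, and the active AWS-CLI credentials must be allowed to manage IAM users in the karrots account.

```shell
# Hypothetical helper: create a per-user worker identity with an access key.
# $1 = the person or process name used as the <USER_NAME> suffix.
# The last command prints the AccessKeyId and SecretAccessKey once;
# store them securely.
create_worker_user() {
  user="karrots-worker-$1"
  aws iam create-user --user-name "$user" &&
    aws iam add-user-to-group --user-name "$user" --group-name karrots-worker &&
    aws iam create-access-key --user-name "$user"
}

# Hypothetical helper: revoke one compromised key without touching other users.
# $1 = <USER_NAME> suffix, $2 = access key ID to delete.
revoke_worker_key() {
  aws iam delete-access-key --user-name "karrots-worker-$1" --access-key-id "$2"
}

# Usage:
#   create_worker_user alice
#   revoke_worker_key alice AKIAEXAMPLEKEYID
```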
Delegated Route53 DNS¶
The setup we now have prevents karrots
from modifying Route53 DNS Zones in your main account. This is good for security, but prevents karrots
from creating a routable DNS name to the new cluster. (This also prevents Ambassador from being able to use the ACME host challenge to request a Let's Encrypt TLS certificate.) To get around this problem we will allow karrots
one small avenue into your main account using AWS Assume Role. To make this work we need to go back to our main account and create a role that allows karrots
to add NS records to the main Route53 resolver. You can see an example of this process here: https://hackernoon.com/terraform-with-aws-assume-role-21567505ea98.
N.B.: perform the following steps in the AWS account that contains your organization's primary DNS resolver.
Create Delegate Policy¶
First we need to create a karrots-root-dns
IAM policy in the account that holds your primary DNS resolver: https://console.aws.amazon.com/iam/home#/policies. Use the JSON edit option to create the policy.
Create Delegate Role¶
While logged into the account that holds your primary DNS resolver, create a new role karrots-root-dns
: https://console.aws.amazon.com/iam/home?#/roles. In the first step choose the Another AWS account option and enter the account number of your karrots
account. In the next step attach the karrots-root-dns
policy. In the final step name the role karrots-root-dns
. The ARN for the new role should be: arn:aws:iam::KARROTS_ACCT_ID:role/karrots-root-dns
. Because karrots
will build the delegate ARN when it runs, it's important that the role's ARN matches this pattern.
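To check that the role name matches the expected pattern, a sketch like the one below builds the ARN from an account ID and tries to assume the role. The function names and the session name are hypothetical, the check assumes a working karrots AWS-CLI profile, and how karrots itself builds the ARN may differ.

```shell
# Build the delegate role ARN from an account ID, following the
# arn:aws:iam::<ACCOUNT_ID>:role/karrots-root-dns pattern. $1 = account ID.
delegate_role_arn() {
  printf 'arn:aws:iam::%s:role/karrots-root-dns\n' "$1"
}

# Hypothetical check: try to assume the delegate role with the karrots
# profile. The session name is an arbitrary label chosen for this sketch.
check_delegate_role() {
  aws sts assume-role \
    --profile karrots \
    --role-arn "$(delegate_role_arn "$1")" \
    --role-session-name karrots-dns-check
}

# Usage:
#   check_delegate_role 123456789012
```

A successful call returns temporary credentials; an AccessDenied error suggests the role name, trust relationship, or sts policy needs another look.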
CloudTrail¶
You can easily monitor every action taken by karrots
in AWS CloudTrail event history: https://us-west-1.console.aws.amazon.com/cloudtrail/home. (You need to first create at least one CloudTrail analyzer.) In case of a potential breach, you can filter events by API key to see what actions that API key took during the potential breach period. (It takes about 30 minutes for events to show up in the CloudTrail console.)
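The same filter is available from the command line. The sketch below is an illustration only: the function name is hypothetical, and the region is an assumption taken from the console URL in this section.

```shell
# Hypothetical helper: list CloudTrail events made with one access key.
# $1 = access key ID
# $2 = start time, $3 = end time (for example 2024-01-01T00:00:00Z)
events_for_key() {
  aws cloudtrail lookup-events \
    --region us-west-1 \
    --lookup-attributes "AttributeKey=AccessKeyId,AttributeValue=$1" \
    --start-time "$2" \
    --end-time "$3"
}

# Usage:
#   events_for_key AKIAEXAMPLEKEYID 2024-01-01T00:00:00Z 2024-01-02T00:00:00Z
```

Note that lookup-events searches management events from roughly the last 90 days, so older activity needs a trail that delivers logs to S3.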
Setup ECR¶
Create an Amazon ECR repo in the same region that the cluster will run in: https://us-west-1.console.aws.amazon.com/ecr/create-repository. Name it the same as your cluster (e.g. karrots-example-python
), set the visibility to Private
, and make sure all the options are set to Disabled
.
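A command-line version of this step might look like the following sketch. It assumes that "Disabled" corresponds to mutable tags and scan-on-push turned off, and it leaves encryption at the ECR default; the function name is hypothetical. Repositories created through `aws ecr` are private.

```shell
# Hypothetical helper: create a private ECR repository with tag immutability
# and scan-on-push left off. $1 = repository (cluster) name, $2 = region.
create_cluster_repo() {
  aws ecr create-repository \
    --repository-name "$1" \
    --region "$2" \
    --image-tag-mutability MUTABLE \
    --image-scanning-configuration scanOnPush=false
}

# Usage:
#   create_cluster_repo karrots-example-python us-west-1
```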
Setup Karrots-Data Account¶
For security reasons, Karrots operates in its own account isolated from everything else, but to do anything useful with Karrots we need access to data. For organizations with an existing data warehouse you need to take a few steps to allow Karrots service accounts to connect to it via VPC peering and IAM role chaining. For organizations without a data warehouse, you should create one first.
Build a New Data Warehouse AWS Account¶
A data warehouse can be an incredibly complex thing to build, but you can also start simply. If you're just starting out, begin with either an Aurora database (MySQL, PostgreSQL) or Redshift. There's no easy rule for how to decide except maybe your existing scale: if your systems generate logs or transactions in the hundreds or thousands per second, then Redshift is probably a better choice. Otherwise start with Aurora.
For Aurora start here: Setting Up Aurora
For Redshift start here: Redshift Getting Started
Setup the Data Warehouse AWS Account¶
Note
Now switch to your data warehouse AWS account for this section.
Karrots accesses the data warehouse account databases using IAM role-chaining over a VPC peering connection. There are two types of connections we may make this way: one comes from programs run by humans in mostly trusted environments such as JupyterHub (human), and the other comes from automation run in a fully trusted environment (automation). The permissions for both groups are identical; the only reason to differentiate them is if an organization wants to add 2FA/MFA validation to the IAM role that humans use to chain.
Data Warehouse IAM Policies¶
In order to grant the proper permissions to roles that can read and attach to the data warehouse, we need to create policies using the following JSON files:
Data Warehouse IAM Roles¶
Create two account roles, karrots-data-human
and karrots-data-automation
. Attach the karrots-data
policy to both of these roles. When you finish creating the roles you need to edit both of them to use this trust relationship:
Note
For now we won't require the human IAM role to use 2FA/MFA. We can enable this later by setting MultiFactorAuthPresent
to true
in the karrots-data-human
role's trust relationship.
Now create an account role karrots-peering
. Attach the karrots-peering
policy to the role. When you finish creating the role you need to edit it to use this trust relationship:
Add Database IAM Role¶
Now we need to grant the two IAM roles, karrots-data-human
and karrots-data-automation
, read-only access to the target warehouse databases. For an example of how to do this see: https://docs.aws.amazon.com/redshift/latest/mgmt/redshift-iam-authentication-access-control.html.
Modify Database Security Group¶
Now we need to grant inbound access from the Karrots cluster CIDR block to the database. Click through to the database's default security group and, on the inbound rules tab, set the type, protocol, and port range as necessary. The source field will be the default Karrots CIDR block: 10.100.0.0/16
.
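The equivalent inbound rule can be added from the command line, as in the sketch below. The function name is hypothetical, and the port depends on your database (for example, 5439 is the Redshift default and 5432 the PostgreSQL default).

```shell
# Hypothetical helper: allow TCP traffic from the default Karrots CIDR block.
# $1 = database security group ID, $2 = database port.
allow_karrots_cidr() {
  aws ec2 authorize-security-group-ingress \
    --group-id "$1" \
    --protocol tcp \
    --port "$2" \
    --cidr 10.100.0.0/16
}

# Usage:
#   allow_karrots_cidr sg-0123456789abcdef0 5439
```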
Setup VPC Peering¶
When Karrots runs, it will create a VPC peering request to the data warehouse account's main VPC. To accept this request, go to the AWS VPC console and select the VPC Peering menu item. From the main VPC Peering page you should see the request in the pending state. Once you approve the request, Karrots users, e.g. JupyterHub users and Python apps, will be able to access the data warehouse.
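If you prefer the command line to the console, the pending request can be found and accepted roughly as follows. This sketch assumes the active AWS-CLI credentials and region belong to the data warehouse account; the function names are hypothetical.

```shell
# List peering requests waiting for acceptance in the current account/region.
pending_peering_ids() {
  aws ec2 describe-vpc-peering-connections \
    --filters Name=status-code,Values=pending-acceptance \
    --query 'VpcPeeringConnections[].VpcPeeringConnectionId' \
    --output text
}

# Accept one request. $1 = peering connection ID (pcx-...).
accept_peering() {
  aws ec2 accept-vpc-peering-connection --vpc-peering-connection-id "$1"
}

# Usage:
#   pending_peering_ids
#   accept_peering pcx-0123456789abcdef0
```

Check that the requester account and CIDR block in the pending request match your Karrots account before accepting it.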
Setup Steps for Machines that Will Run Karrots¶
Once you have a valid organization and project, you need to perform a few steps to set up the aws
CLI on the local machine so that Karrots can use it to provide authorization for certain Terraform operations. (Karrots never stores secrets.)
aws configure --profile karrots
In setting up the profile, use the information from the karrots-worker
user you created above. (If the karrots
user already exists, then you will need to get the access information from your team.)
Note
You must set Default output format [None]
to json
so that karrots
can find and extract key values.
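If you skipped the output prompt, or want to script the setting, something like the sketch below should work. The function names are hypothetical; the commands assume the karrots profile already holds valid access keys.

```shell
# Force JSON output for the karrots profile without the interactive prompt.
set_json_output() {
  aws configure set output json --profile karrots
}

# Confirm which identity the karrots profile resolves to. The returned Arn
# should end in user/karrots-worker-<USER_NAME>.
check_karrots_profile() {
  aws sts get-caller-identity --profile karrots
}

# Usage:
#   set_json_output
#   check_karrots_profile
```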
By default karrots
will use the karrots
AWS-CLI profile. If you want to use another profile, then set that up in the karrots.yaml
config file before you run the karrots cluster-create
command.
Once a cluster is running, you can then set up a kubectl
kubeconfig
file. (More details here: https://docs.aws.amazon.com/eks/latest/userguide/create-kubeconfig.html.)
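As a minimal sketch of that step, assuming the karrots profile can describe the cluster, the AWS CLI can write the kubeconfig entry for you. The function name is hypothetical, and the cluster name and region in the usage line are placeholders.

```shell
# Hypothetical helper: add or update the kubeconfig entry for a cluster.
# $1 = cluster name, $2 = region.
write_kubeconfig() {
  aws eks update-kubeconfig --name "$1" --region "$2" --profile karrots
}

# Usage:
#   write_kubeconfig karrots-example-python us-west-1
#   kubectl get nodes
```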
Security Posture Note¶
There are several ways to manage how we grant permissions to Karrots so it can do its work. Our preferred way is the one described above, where we create a karrots
user with least privilege and give the access keys to people or processes that need to run it.
Another option is to create a karrots
IAM role and add that role to existing person or process IAM users. This alternate option is better if we want to quickly revoke a person or process's access to the elevated permissions needed to run karrots
. The downside of this is that the person or process running karrots
then grants karrots
access to the user account's entire permission set. This means karrots
runs with more permissions than it needs, violating the "principle of least privilege."
The downside to the first option comes when we need to revoke privileges to run karrots
. Privilege revocation in the dedicated karrots
user case happens by invalidating the access keys. If we ever have to do this then, yes, we have to distribute new access keys to every person or process that runs karrots
. The upside is that the long-term security posture is better.
In the end, karrots
has no access to secrets and simply calls out to the aws
CLI tool, which relies on the standard AWS permission setups (e.g. ~/.aws/config
, environment variables, etc.). How you manage the profile/config is entirely up to you.