Developers routinely use GitHub to back up, share, and manage changes to code. GitHub code repositories are usually public, meaning anyone can find and access code that’s been uploaded to the site. And all too often, developers forget to remove sensitive data from their code before putting it on GitHub.
Exposing sensitive private data in public GitHub repositories isn’t a new problem. Malicious hackers actively scan and scrape GitHub for leaked passwords, client IDs, secret keys, and API tokens, to name a few, because they know programmers are prone to such oversights.
But how long does it take for attackers to find data once it’s exposed, and what do they do with it? Comparitech researchers sought to find answers to these questions by setting up a honeypot.
Our researchers created multiple accounts on Amazon Web Services (AWS) and GitHub. They then published user credentials such as AWS IDs and secret keys in public GitHub repositories. Using the AWS CloudTrail service, they then watched and logged attackers who used the credentials to access our AWS servers.
Researchers set up the dummy accounts with programmatic access but no permissions, to prevent attackers from affecting our AWS infrastructure. The user was assigned a policy with full access to the Amazon Elastic Compute Cloud (EC2) service (AmazonEC2FullAccess).
Researchers used the Amazon Athena service to search and query the attack logs by time, event, and IP address.
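To illustrate the kind of summary such log queries produce, here is a minimal Python sketch that groups CloudTrail-style records by event and source IP address. It is not the researchers' actual query. The field names eventTime, eventName, and sourceIPAddress follow the CloudTrail record format, while the sample records, the exposure time, and the summarize helper are made up for illustration.

```python
from collections import Counter
from datetime import datetime, timezone

# Made-up records for illustration only. Real CloudTrail records
# carry many more fields than the three used here.
SAMPLE_EVENTS = [
    {"eventTime": "2020-01-01T12:01:05Z", "eventName": "DescribeInstances",
     "sourceIPAddress": "198.51.100.7"},
    {"eventTime": "2020-01-01T12:02:40Z", "eventName": "GetAccountAuthorizationDetails",
     "sourceIPAddress": "198.51.100.7"},
    {"eventTime": "2020-01-01T12:03:10Z", "eventName": "DescribeInstances",
     "sourceIPAddress": "203.0.113.21"},
]


def parse_time(value):
    """Parse a CloudTrail-style UTC timestamp such as 2020-01-01T12:01:05Z."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def summarize(events, exposed_at):
    """Summarize API calls by event name, unique source IPs, and first-call delay."""
    ordered = sorted(events, key=lambda e: parse_time(e["eventTime"]))
    calls_by_event = Counter(e["eventName"] for e in ordered)
    unique_ips = {e["sourceIPAddress"] for e in ordered}
    first_delay = None
    if ordered:
        # Seconds between the (assumed) exposure time and the first logged call.
        first_delay = (parse_time(ordered[0]["eventTime"]) - exposed_at).total_seconds()
    return {
        "calls_by_event": dict(calls_by_event),
        "unique_ips": len(unique_ips),
        "first_call_delay_seconds": first_delay,
    }


if __name__ == "__main__":
    # Hypothetical moment the key was published.
    exposed_at = parse_time("2020-01-01T12:00:00Z")
    print(summarize(SAMPLE_EVENTS, exposed_at))
```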
Honeypot results: 1 minute to find exposed credentials and launch attacks
It took just one minute for attackers to find and start abusing the exposed AWS secret key. Based on the speed of the attacks, researchers assume that attackers use custom or modified tooling and scripts for such attacks, and most use proxies that allow them to perform each request from a different IP address.
That’s bad news for programmers and developers. Even if a developer quickly realizes their mistake after committing code to GitHub, they might not be able to remove it before attackers get their hands on the exposed credentials.
The first attack we logged abused the exposed credentials to get information about our server infrastructure, users, permissions, groups, roles, and policies using the DescribeInstances and GetAccountAuthorizationDetails API calls. The entire attack took less than four minutes from the moment of exposure.
Next, our researchers decided to give attackers more privileges. They created a new dummy user and assigned it read-only access permissions (AmazonEC2ReadOnlyAccess) and a role that allows the user to register and deregister container instances (AmazonEC2ContainerServiceRole).
Within one minute of exposing the user’s key, the account logged more than 1,000 RunInstances API calls. The attacker tried to run only the largest available instances, an attempt to quickly pile up bills on Amazon and cause monetary losses.
In this case, however, Amazon immediately suspended the user account and contacted us via email. This protection mechanism was automatic.
By examining all of the logs from our honeypot, researchers found that the API calls came from 547 unique IP addresses, which suggests that hackers use proxies to help hide abusive activity. Attack activity decreased with time, with the vast majority of attacks made within one hour of exposure.
How attackers can use data exposed on GitHub
The actions that an attacker can perform using exposed AWS credentials vary widely depending on the permissions assigned to those credentials.
Typically, attackers can get information about running EC2 instances, existing S3 buckets, lists of users and their permissions, and other account information. This can lead to further account compromise and data breaches.
Compromised infrastructure can be used in further cyberattacks, such as botnet attacks. This can lead to your IP addresses being blacklisted.
What to do if you’ve exposed AWS credentials on GitHub
If you’ve inadvertently exposed AWS credentials in a public GitHub repository, chances are you will be attacked very soon.
Researchers recommend taking the following steps immediately:
- Update the root password
- Rotate and delete all access keys
- Check each region for unauthorized AWS usage
- Terminate unauthorized resources
- Immediately remove leaked keys and other credentials
- Revoke and delete those credentials
- Contact AWS support for any further issues
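The step of checking each region for unauthorized usage can be sketched roughly as follows. This is only an illustration, and it assumes the per-region results of the EC2 DescribeInstances call have already been fetched (for example with an AWS SDK) into dictionaries with the usual Reservations / Instances / LaunchTime / State shape. The region data, instance IDs, exposure time, and the find_suspect_instances helper are all hypothetical.

```python
from datetime import datetime, timezone

UTC = timezone.utc

# Hypothetical moment the credentials were pushed to a public repository.
EXPOSED_AT = datetime(2020, 1, 1, 12, 0, tzinfo=UTC)

# Made-up data shaped like DescribeInstances responses, keyed by region.
SAMPLE_RESPONSES = {
    "us-east-1": {"Reservations": [{"Instances": [
        {"InstanceId": "i-0aaaaaaaaaaaaaaaa",
         "LaunchTime": datetime(2019, 12, 1, 9, 0, tzinfo=UTC),
         "State": {"Name": "running"}},
        {"InstanceId": "i-0bbbbbbbbbbbbbbbb",
         "LaunchTime": datetime(2020, 1, 1, 12, 1, tzinfo=UTC),
         "State": {"Name": "running"}},
    ]}]},
    "ap-south-1": {"Reservations": [{"Instances": [
        {"InstanceId": "i-0cccccccccccccccc",
         "LaunchTime": datetime(2020, 1, 1, 12, 2, tzinfo=UTC),
         "State": {"Name": "pending"}},
    ]}]},
    "eu-west-1": {"Reservations": []},
}


def find_suspect_instances(responses_by_region, exposed_at):
    """Return (region, instance_id) pairs launched at or after the exposure time."""
    suspects = []
    for region, response in sorted(responses_by_region.items()):
        for reservation in response.get("Reservations", []):
            for instance in reservation.get("Instances", []):
                launched_after = instance["LaunchTime"] >= exposed_at
                still_exists = instance["State"]["Name"] != "terminated"
                if launched_after and still_exists:
                    suspects.append((region, instance["InstanceId"]))
    return suspects


if __name__ == "__main__":
    for region, instance_id in find_suspect_instances(SAMPLE_RESPONSES, EXPOSED_AT):
        print(f"Review {instance_id} in {region}")
```

A launch time after the exposure is only a hint, not proof of abuse, so each flagged resource still needs a manual review before it is terminated.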
Failing to take these steps can lead to financial losses and potential termination or suspension of your AWS account.
How to prevent credential exposure on GitHub
Both AWS and GitHub provide users with some basic protections against these sorts of attacks. GitHub has a secret scanning feature that checks repositories for exposed credentials using regular expressions. As we saw with our honeypot, Amazon can suspend compromised accounts and alert administrators via phone and email. But as we demonstrated, these mechanisms shouldn’t be relied upon.
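Regular-expression scanning of this kind can also be run locally before code ever reaches GitHub. The sketch below is a simplified illustration, not GitHub's implementation. The patterns are commonly used approximations of the AWS access key ID format (an AKIA or ASIA prefix followed by 16 characters) and of a 40-character secret assigned to aws_secret_access_key. The scan_text helper is hypothetical, and the sample values are the placeholder keys AWS uses in its own documentation.

```python
import re

# Approximate patterns; real scanners use broader and stricter rule sets.
PATTERNS = {
    "access key ID": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    "secret access key": re.compile(
        r"(?i)aws_secret_access_key\s*[=:]\s*[\"']?[A-Za-z0-9/+=]{40}"
    ),
}

# Placeholder credentials from the AWS documentation, not real keys.
SAMPLE = """\
region = us-east-1
aws_access_key_id = AKIAIOSFODNN7EXAMPLE
aws_secret_access_key = wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
"""


def scan_text(text):
    """Return (line_number, kind) for every line that looks like it holds a credential."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for kind, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, kind))
    return findings


if __name__ == "__main__":
    for line_number, kind in scan_text(SAMPLE):
        print(f"line {line_number}: possible AWS {kind}")
```

A check like this could be wired into a pre-commit hook or CI job so that a commit is rejected whenever scan_text returns any findings.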
Here are a few best practices suggested by our researchers to prevent developers from exposing credentials and other sensitive information in public repositories:
- Always scan code before committing it. This can be automated and should be integrated into the development pipeline.
- Do not create an access key for your AWS account’s root user. You cannot restrict the permissions for the root user, so an attacker who obtains that key can control all of the resources in your account, including billing information.
- Use temporary security credentials instead of long-term access keys. Temporary credentials consist of an access key ID and a secret access key, but also include a security token that indicates when those credentials expire. You can specify how long a temporary access key is valid, after which it automatically expires.
- Configure IAM users correctly. If you need to create access keys for programmatic access, then create an IAM user. Only grant that user the permissions they need, then generate an access key for that user.
- If you have multiple applications, use different access keys for each one and rotate them periodically.
- Remove unused keys.
- Configure multi-factor authentication for your most sensitive operations.
- Avoid transferring tokens in clear text.
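As an illustration of granting an IAM user only the permissions it needs, the following sketch builds a narrowly scoped policy document. It assumes a hypothetical application that only reads objects from a single S3 bucket. The bucket name example-app-bucket and the read_only_bucket_policy helper are made up, while the Version, Statement, Effect, Action, and Resource elements are the standard IAM policy fields.

```python
import json


def read_only_bucket_policy(bucket_name):
    """Build a policy that allows listing and reading one bucket and nothing else."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket_name}",
            },
            {
                # Reading applies to the objects inside the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical bucket name used only for this example.
    print(json.dumps(read_only_bucket_policy("example-app-bucket"), indent=2))
```

If a key attached to a policy like this leaks, an attacker can read that one bucket but cannot launch EC2 instances or enumerate users, which limits the damage compared with a broad policy such as AmazonEC2FullAccess.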