
A Guide to Keeping Secrets out of Git Repositories

Published May 31, 2022

Updated Feb 13, 2023

6 min read


If you’ve been a developer for a while, then you hopefully know it is wise to keep secret information such as passwords and encryption keys outside of source control. If you didn’t know that, then surprise! Now you know.

Still, slip-ups happen. A password ends up in a default config file, or someone forgets to add a new config file to “.gitignore”, runs “git add .”, and doesn’t notice the file was committed. There should be protections in place no matter how diligent your programmers are, since nobody is infallible and the peace of mind is well worth it.

How can software know something is a secret?

It can’t know for sure, but it can make an educated guess. Secrets typically fit certain known patterns, or have higher entropy than other strings in your code and configuration files. A good scanner should check for strings that fit these patterns throughout your entire repository’s history, and raise anything suspicious to you.
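To make those two heuristics concrete, here is a minimal shell sketch. It is not how any particular scanner is implemented. The regular expression is the commonly cited shape of an AWS access key ID, and the function names, the 20-character minimum, and the default threshold of 4.0 bits per character are all assumptions picked for illustration.

```shell
#!/bin/sh
# Shannon entropy of a string, in bits per character, computed with awk.
shannon_entropy() {
  printf '%s\n' "$1" | awk '{
    n = length($0)
    if (n == 0) { print "0.00"; exit }
    # Count how often each character occurs.
    for (i = 1; i <= n; i++) count[substr($0, i, 1)]++
    # Sum -p * log2(p) over the distinct characters.
    for (c in count) { p = count[c] / n; h -= p * log(p) / log(2) }
    printf "%.2f\n", h
  }'
}

# Exits with status 0 when the string looks suspicious, 1 otherwise.
looks_like_secret() {
  # Heuristic 1: a known pattern (illustrative: the shape of an AWS access key ID).
  if printf '%s\n' "$1" | grep -Eq 'AKIA[0-9A-Z]{16}'; then
    return 0
  fi
  # Heuristic 2: a long string whose characters are close to random.
  [ "${#1}" -ge 20 ] || return 1
  awk -v h="$(shannon_entropy "$1")" -v t="${ENTROPY_THRESHOLD:-4.0}" \
    'BEGIN { exit ((h + 0 > t + 0) ? 0 : 1) }'
}
```

Calling looks_like_secret "AKIAIOSFODNN7EXAMPLE" succeeds because of the pattern, while an ordinary English sentence fails both checks. A real scanner ships far more patterns and tunes its thresholds to limit false positives.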

Checking for Secrets in CI and CD

When it comes to automatically checking for secrets in your code, you have quite an array of options. To keep this article brief, I am just going to cover a few tools, and which tools you use may depend on your repository host.

GitHub

If you’re using GitHub for your project and your repository is either public or you use GitHub Enterprise Cloud, then GitHub will automatically scan the code you upload for secrets. GitHub’s solution is special because they have partnered with several different companies to allow for automatic revocation of secrets pushed to the repo. See the following excerpt from GitHub’s secret scanner documentation:

When you make a repository public, or push changes to a public repository, GitHub always scans the code for secrets that match partner patterns. If secret scanning detects a potential secret, we notify the service provider who issued the secret. The service provider validates the string and then decides whether they should revoke the secret, issue a new secret, or contact you directly. Their action will depend on the associated risks to you or them.

Well, that’s nice now isn’t it? If you’re curious if the services you use are partnered with GitHub so that their secrets can be scanned for, you can view the full list here. Just keep in mind that this functionality is only available for public repositories and private repositories using GitHub Enterprise Cloud with a “GitHub Advanced Security” license.

GitLab

GitLab, like GitHub, has secret detection as well. GitLab uses Gitleaks for its secret detection, a well-documented tool whose source code is freely available. The capabilities of secret detection in GitLab do vary based on your tier, though.

For example, you will need GitLab Ultimate to view detected secrets in the pipeline and merge request sections. You can still use the scanner in the Free and Premium tiers, but it isn’t nearly as integrated as it is in Ultimate.

Gitleaks

We mentioned that GitLab uses Gitleaks, but you aren’t limited to using it with GitLab! Since Gitleaks is open source, you can use it with other providers such as GitHub, and even run it locally on your own system. It is also very easy to set up, whether as a CI job or on your own machine.

Scanning for Secrets using a CI Job

For GitHub you can simply use this action made by the author of Gitleaks. In this case Gitleaks is helpful if you’re using private repositories on GitHub without GitHub Enterprise Cloud. It is fully configurable with the action as it allows you to specify a custom .gitleaks.toml file. This is optional of course, and the default might work fine for you.
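If you would rather call the Gitleaks CLI yourself from a CI step than use a prebuilt action, the job boils down to something like the sketch below. It assumes Gitleaks v8 is already installed on the runner, that the checkout contains the full history rather than a shallow clone, and that the detect subcommand with its --source, --redact, and --config flags behaves as it does in v8. Newer releases may spell these differently, so check gitleaks --help. The SCANNER and CONFIG_FILE variables and the scan_history name are made up for this example.

```shell
#!/bin/sh
# Sketch of a CI step that scans the repository history with the Gitleaks CLI.
# SCANNER and CONFIG_FILE are hypothetical knobs, not Gitleaks settings.
SCANNER="${SCANNER:-gitleaks}"
CONFIG_FILE="${CONFIG_FILE:-.gitleaks.toml}"

scan_history() {
  # Use the repository's custom rules when present, the built-in defaults otherwise.
  if [ -f "$CONFIG_FILE" ]; then
    "$SCANNER" detect --source . --redact --config "$CONFIG_FILE"
  else
    "$SCANNER" detect --source . --redact
  fi
}
```

Run scan_history as the last command of the job. Gitleaks exits with a non-zero status when it finds something, which is what makes the job fail.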

Checking for Secrets in a Pre-Commit Hook

There are a couple of ways to accomplish setting up the hook. A pre-commit script is available on the Gitleaks GitHub that will run Gitleaks on your staged files before you commit. Your commit will be stopped if any secrets are detected. This script can simply be copied into your .git/hooks/ directory. It does require that Gitleaks is installed and in your $PATH, however.
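For a rough idea of what such a hook does, here is a hand-written sketch. It is not the script from the Gitleaks repository. It assumes Gitleaks v8, where protect --staged scans staged changes; later releases may use a different subcommand. The SCANNER variable and the scan_staged_changes name are hypothetical.

```shell
#!/bin/sh
# Hand-written pre-commit hook sketch (assumes Gitleaks v8 command names).
SCANNER="${SCANNER:-gitleaks}"

scan_staged_changes() {
  # Refuse to commit when the scanner is missing instead of silently skipping the check.
  if ! command -v "$SCANNER" >/dev/null 2>&1; then
    echo "pre-commit: $SCANNER not found in \$PATH, refusing to commit unscanned changes" >&2
    return 1
  fi
  # Scan only what is staged; a non-zero exit status aborts the commit.
  "$SCANNER" protect --staged --redact
}

# Only run the scan when this file is installed as .git/hooks/pre-commit.
case "$0" in
  */pre-commit) scan_staged_changes || exit 1 ;;
esac
```

Saved as .git/hooks/pre-commit and marked executable, this blocks any commit for which the scanner reports a finding.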

The other method involves using the pre-commit utility. It will assist with installing Gitleaks automatically for any developers that clone the repository and it can also assist with installing the hooks for the first time as well. Using the pre-commit tool might make more sense if you want to ensure other linters and checkers run, and you don’t want to have developers juggle installing everything themselves.
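As a sketch of the pre-commit route, the helper below writes a minimal .pre-commit-config.yaml that points at the hook definition published in the Gitleaks repository. The write_precommit_config name is made up, and the pinned rev is only an example tag, so pin whichever release you have actually vetted.

```shell
#!/bin/sh
# Write a minimal pre-commit configuration that runs Gitleaks on every commit.
# The rev below is an example pin, not a recommendation.
write_precommit_config() {
  cat > .pre-commit-config.yaml <<'EOF'
repos:
  - repo: https://github.com/gitleaks/gitleaks
    rev: v8.16.1
    hooks:
      - id: gitleaks
EOF
}
```

After committing that file, each developer runs pre-commit install once per clone, and the tool takes care of fetching Gitleaks and wiring up the hook.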

A Good Code Review Process Goes a Long Way

Although automated tooling for identifying secrets in code works well and you should definitely use it, it’s still good to keep an eye out for secrets when reviewing code. These tools aren’t perfect: they look for known patterns and strings with high entropy, but not all secrets fit these criteria.

Knowing that even with the best scanning tools it’s still possible secrets could sneak through, it’s easier to understand why it is important to also have a good code review process to catch these issues. Also remember that committing secrets isn’t the only thing you should be worried about. You should have others review your work to help mitigate the chances that your changes could introduce new security vulnerabilities in the code as well!

Oh no, there’s already a secret in my Git history!

If you already have secrets in your repository, and they’re pushed to a main branch, not all hope is lost. Before we get into methods of removing the secrets, I have a massive disclaimer I should get out of the way. The only way to truly remove secrets from your repository is to rewrite your git history. This is a destructive operation, and will require developers to re-pull branches and cherry-pick changes from their local branches if applicable.

Did you read the disclaimer? Good, we can discuss methods then. Which method to use depends on how and when the secret made it into the repository.
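To find out which situation you are in, you can ask Git which branches and tags contain a commit that touched the secret. The sketch below uses git log -S (the "pickaxe" search) and git for-each-ref --contains. The refs_containing_secret name is made up, and the function assumes you run it inside the repository.

```shell
#!/bin/sh
# List every branch or tag whose history includes a commit that added or
# removed the given string.
refs_containing_secret() {
  git log --all --format=%H "-S$1" | while read -r commit; do
    git for-each-ref --contains "$commit" --format='%(refname:short)' \
      refs/heads refs/tags refs/remotes
  done | sort -u
}
```

If the output is a single feature branch, the rebase approach is an option. If it lists main or release tags, you are in the harder scenario.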

The Secret is in a Single Branch

If a secret made it into a feature branch by mistake, then you could simply initiate an interactive rebase to remove it and force push that branch to the remote. It should be noted that this is only effective if no other branches are based off of your branch, and if it isn’t tagged.

Let’s say, for example, you push your code to the remote and a CI job identifies a secret after your push. At this point there would be nobody else using your branch and it shouldn’t be tagged, so this is the perfect opportunity to just rewrite the commit that triggered the CI failure. If your commit hash is, let’s say, 09fac8dbfd27bd9b4d23a00eb648aa751789536d, then this is the first command you would have to execute to begin cleaning up your branch’s history:

$ git rebase --interactive 09fac8dbfd27bd9b4d23a00eb648aa751789536d^

Note the caret at the end of the SHA1 commit hash. This is vitally important to include, as we need to rebase onto the commit prior to the one introducing the secret. The gist is that we’re going to go back to that point in time and prevent the secret from ever being added in the first place.

Git will now open your default command-line text editor and ask you how you want to execute your rebase. Find the line referencing the problematic commit and replace pick with edit on that line. After you save the file and quit, you should find yourself at the commit where the secret was introduced. From here you can remove the secret, stage the affected files, and then execute the following commands:

$ git commit --all --amend --no-edit

And if that’s successful:

$ git rebase --continue

After that, your local history should be rewritten to no longer include the secret. If you pushed your branch already, then you’ll need to get these changes pushed to the remote as well. You can do this with a force push using the -f flag like so:

$ git push origin <your_branch_name> -f
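If you want to rehearse these steps before touching a real branch, the sketch below replays them in a throwaway repository. Everything in it is invented for the demo: the file names, the hunter2 password, and the rehearse_rebase function. GIT_SEQUENCE_EDITOR stands in for editing the todo list by hand, and nothing is pushed.

```shell
#!/bin/sh
# Rehearse the rebase-and-amend steps in a temporary repository.
# Prints how many lines of the rewritten history still contain the secret.
rehearse_rebase() (
  set -e
  cd "$(mktemp -d)"
  git init -q . 2>/dev/null
  git config user.name "Demo"
  git config user.email "demo@example.com"

  echo "hello" > README
  git add README
  git commit -q -m "Initial commit"
  printf 'user=app\npassword=hunter2\n' > config.ini
  git add config.ini
  git commit -q -m "Add config"
  bad_commit="$(git rev-parse HEAD)"
  echo "more" >> README
  git commit -q --all -m "Later work"

  # Change "pick" to "edit" on the first todo line, which is the bad commit.
  GIT_SEQUENCE_EDITOR="sed -i.bak '1s/^pick/edit/'" \
    git rebase --interactive "${bad_commit}^" >/dev/null 2>&1

  # Stopped at the bad commit: remove the secret, amend, and carry on.
  printf 'user=app\n' > config.ini
  git commit -q --all --amend --no-edit
  GIT_EDITOR=true git rebase --continue >/dev/null 2>&1

  leaks="$(git log -p | grep -c 'hunter2' || true)"
  echo "$leaks"
)
```

Running rehearse_rebase should print 0 once the rewrite has worked. For the real push, git push --force-with-lease is a more cautious alternative to -f, since it refuses to overwrite remote work you have not fetched yet.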

Now you should be set! At this point, you can get to fixing up your code to pull the secret some other way that doesn’t include config files or strings hard-coded inside of your codebase.
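One common option is to read the secret from the environment at runtime. The helper below is a small sketch of that idea. The require_secret name and the API_TOKEN variable are hypothetical, and how the variable gets set (CI settings, a secrets manager, an untracked local file) is up to you.

```shell
#!/bin/sh
# Print the value of the environment variable named by $1,
# or fail loudly when it is unset or empty.
require_secret() {
  # Indirect expansion: look up the variable whose name was passed in.
  eval "value=\${$1:-}"
  if [ -z "$value" ]; then
    echo "missing required secret: $1" >&2
    return 1
  fi
  printf '%s\n' "$value"
}
```

A script would then use something like token="$(require_secret API_TOKEN)" || exit 1 instead of reading a value checked into the repository.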

The Secrets are in Main Already…

If your secrets are present in many branches, like tagged versions or your main branch, then things get a little more complicated. There’s more than one way to handle this situation, but I am going to cover only one. Just revoke the secret.

In this scenario, you should revoke the secret and issue a new one. If you do this, then it doesn’t matter that the old secret is still in the repository history because it will be entirely useless! How this is done of course depends on what service the secret was issued from.

There is also the added benefit that anyone who has cloned your repo with the secret will be unable to use it. Rewriting history alone doesn’t help if an adversary has already downloaded the secret before you deleted it.

It’s important to note that this isn’t always straightforward if, for example, the secret is used in multiple projects. You will need to ensure the old secret is replaced everywhere before you actually revoke it, or else you may experience downtime.

Conclusion

With scanning tools becoming more accessible, there are fewer and fewer reasons to not use them. Secret scanning is especially important for public repositories, but it is also useful for private repositories where a compromised developer account can access secrets and wreak havoc.


Jamie Kuppens

I’m a software engineer with an interest in web development and some more esoteric things like emulator development.

@Reshurum

