
Gitleaks: Open-Source Secret Scanning for Git Repos in 2026

By Codcompass Team · 7 min read

Current Situation Analysis

Hardcoded credentials in version control remain one of the most persistent and costly failure modes in software engineering. Despite widespread awareness, developers continue to push API keys, tokens, and private keys to remote repositories. The risk is immediate and automated: bots scrape public repositories continuously, and exposed AWS access keys or GitHub tokens are often compromised within minutes of being pushed. The consequence is rarely just embarrassment; it typically manifests as unauthorized resource provisioning, crypto-mining workloads, or data exfiltration.

Many teams treat secret scanning as an afterthought or rely on manual code review, which is statistically ineffective for pattern-based detection. Others adopt commercial solutions that introduce per-seat licensing costs and vendor lock-in, which may be disproportionate for smaller teams or open-source projects. The gap lies in a tool that offers robust, regex-based detection with full history scanning and CI integration without the overhead of enterprise licensing.

Gitleaks fills this gap. It is an open-source scanner written in Go that provides deterministic secret detection using a curated ruleset of over 100 patterns. It supports full repository history analysis, pre-commit hooks, and CI pipelines, outputting results in formats compatible with GitHub Code Scanning. For teams prioritizing cost control, auditability, and integration flexibility, Gitleaks provides a production-grade alternative to paid platforms.

WOW Moment: Key Findings

The following comparison illustrates the operational trade-offs between Gitleaks and a commercial alternative like GitGuardian. The data reflects typical detection coverage, integration depth, and cost structure for a mid-sized engineering organization.

| Approach | Detection Coverage | CI/Pre-commit Support | Enterprise Features | Cost Model |
|---|---|---|---|---|
| Gitleaks | ~80% of known patterns | Full (CLI, GitHub Action, SARIF) | None (self-managed) | Free (MIT license for binary) |
| GitGuardian | ~95% (includes ML-based generic detection) | Full (dashboard, webhooks, SARIF) | Dashboard, auto-revocation, SOC 2 logs, triage UI | Per-developer monthly subscription |

Gitleaks covers the majority of high-value secret patterns out of the box. The remaining gap in commercial tools is primarily composed of machine learning-based detection for unknown token formats, centralized incident management, and automated revocation workflows. For teams with fewer than 20 developers or those operating under strict budget constraints, Gitleaks delivers sufficient coverage when paired with disciplined rotation practices and CI enforcement.

The critical insight is that detection frequency matters more than detection breadth. A scanner that runs on every commit and pre-push prevents leaks before they propagate. Gitleaks enables this workflow at zero licensing cost, provided the team invests in configuration and pipeline integration.

Core Solution

Implementing Gitleaks involves three layers: local development enforcement, CI pipeline integration, and ruleset customization. Each layer addresses a different point in the development lifecycle. A fourth step, historical remediation, handles secrets that are already in the repository.

1. Local Development: Pre-commit Hook

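The most effective defense is catching secrets before they enter the repository. Gitleaks provides a protect command optimized for this use case. With --staged, it scans the changes about to be committed and exits with a non-zero status if a match is found. Releases from v8.19.0 onward deprecate protect and detect in favor of the git, dir, and stdin commands; the examples here pin v8.18.0, where protect and detect are the current commands.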

Implementation:

# Install gitleaks via Homebrew (macOS/Linux)
brew install gitleaks

# Or download the binary directly
# https://github.com/gitleaks/gitleaks/releases

# Verify installation
gitleaks version

Integrate with a pre-commit framework. Using pre-commit (Python-based hook manager):

# .pre-commit-config.yaml
repos:
  - repo: https://github.com/gitleaks/gitleaks
    rev: v8.18.0
    hooks:
      - id: gitleaks
        name: gitleaks-protect
        entry: gitleaks protect --staged --verbose
        language: golang
        pass_filenames: false

Rationale: The --staged flag ensures only changes about to be committed are scanned. This minimizes latency and keeps edits that are not part of the commit out of the results. The pass_filenames: false directive is required because Gitleaks operates on the git diff context, not individual file arguments.
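
To make the scope of --staged concrete, the sketch below mimics the idea in Python: it takes a unified diff of the kind `git diff --staged -U0` prints, keeps only the added lines, and checks them against a single pattern. This illustrates the concept and is not how Gitleaks is implemented; `added_lines`, `find_matches`, the sample diff, and the pattern are hypothetical names and values chosen for this example.

```python
import re

# Hypothetical pattern for illustration only: AWS-style access key IDs.
AWS_KEY_PATTERN = re.compile(r"AKIA[A-Z0-9]{16}")


def added_lines(diff_text):
    """Yield (file, line) pairs for every line added in a unified diff."""
    current_file = None
    for line in diff_text.splitlines():
        if line.startswith("+++ "):
            # "+++ b/path" names the file that receives the additions.
            path = line[4:]
            current_file = path[2:] if path.startswith("b/") else path
        elif line.startswith("+"):
            # Simplification: an added line whose own text begins with "++ "
            # would be mistaken for a file header by the branch above.
            yield current_file, line[1:]


def find_matches(diff_text, pattern=AWS_KEY_PATTERN):
    """Return (file, matched_text) for each match found on an added line."""
    hits = []
    for path, text in added_lines(diff_text):
        for match in pattern.finditer(text):
            hits.append((path, match.group(0)))
    return hits


# Made-up diff: one added line holds a fake key, one is harmless.
SAMPLE_DIFF = """\
diff --git a/config.py b/config.py
--- a/config.py
+++ b/config.py
@@ -1,0 +2 @@
+AWS_KEY = "AKIAABCDEFGHIJKLMNOP"
@@ -5 +6 @@
-DEBUG = True
+DEBUG = False
"""

if __name__ == "__main__":
    for path, secret in find_matches(SAMPLE_DIFF):
        # Print only a prefix so the tool itself does not echo the secret.
        print(f"{path}: possible secret starting with {secret[:4]}")
```

Because only added lines in the staged diff are inspected, removed lines and files outside the commit never reach the pattern check, which is the property the hook relies on.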

2. CI Pipeline: GitHub Actions

For repositories without local hooks, or to enforce scanning across all contributors, CI integration is mandatory. The official GitHub Action wraps the binary and supports SARIF output for native GitHub Code Scanning integration.

Implementation:

# .github/workflows/secret-scan.yml
name: Secret Scanning

on:
  push:
    branches: [ main, develop ]
  pull_request:
    branches: [ main ]

jobs:
  gitleaks:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Run Gitleaks
        uses: gitleaks/gitleaks-action@v2
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          GITLEAKS_LICENSE: ${{ secrets.GITLEAKS_LICENSE }}

Rationale: fetch-depth: 0 is required for full history scanning. Without it, the checkout is shallow and the action only sees the latest commit, defeating the purpose of historical analysis. The official action requires the GITLEAKS_LICENSE secret for repositories owned by GitHub organizations; repositories under personal accounts do not need it. Running the binary directly involves no license key, since the binary itself is MIT-licensed.

Alternative CI (Direct Binary):

If licensing constraints or runner compatibility are concerns, download and run the binary directly:

- name: Download Gitleaks
  run: |
    curl -LO https://github.com/gitleaks/gitleaks/releases/download/v8.18.0/gitleaks_8.18.0_linux_x64.tar.gz
    tar -xzf gitleaks_8.18.0_linux_x64.tar.gz

- name: Run Gitleaks Detect
  run: ./gitleaks detect --source . --report-path gitleaks-report.json --report-format json

This approach remains fully MIT-licensed regardless of repository visibility.
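
A JSON report is easier to act on once it is grouped. The sketch below counts findings per rule and lists the affected files. It assumes the report is a JSON array whose entries carry `RuleID` and `File` fields, as the v8 JSON report does; treat those field names as an assumption for other versions. `summarize_report` and the sample entries are hypothetical and hand-written for this example.

```python
import json
from collections import Counter


def summarize_report(report_json):
    """Count findings per rule and collect the affected file paths."""
    findings = json.loads(report_json)
    per_rule = Counter(f.get("RuleID", "unknown") for f in findings)
    files = sorted({f.get("File", "") for f in findings})
    return per_rule, files


# Hand-written sample shaped like report entries; all values are made up.
SAMPLE_REPORT = json.dumps([
    {"RuleID": "aws-access-token", "File": "config/settings.py", "StartLine": 12},
    {"RuleID": "aws-access-token", "File": "deploy/env.sh", "StartLine": 3},
    {"RuleID": "internal-api-key", "File": "config/settings.py", "StartLine": 40},
])

if __name__ == "__main__":
    # For a real run, read the text of gitleaks-report.json instead.
    per_rule, files = summarize_report(SAMPLE_REPORT)
    for rule, count in per_rule.most_common():
        print(f"{rule}: {count} finding(s)")
    print("Affected files:", ", ".join(files))
```

The summary deliberately omits the matched secret text so that the output is safe to paste into a ticket or a CI log.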

3. Ruleset Customization

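Gitleaks ships with a default ruleset, but internal services often use custom token formats. Adding a rule requires editing a TOML configuration file. Note that a custom configuration file replaces the default ruleset unless it contains an [extend] section with useDefault = true.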

Implementation:

# gitleaks.toml
[[rules]]
id = "internal-api-key"
description = "Internal service API key (32-char hex prefix)"
regex = '''internal_key_[a-f0-9]{32}'''
entropy = 3.5
tags = ["internal", "api-key"]

[rules.allowlist]
regexes = ['''internal_key_00000000000000000000000000000000''']

Rationale: The entropy field filters low-complexity matches. Hex digits carry at most 4 bits of Shannon entropy per character, so a threshold of 3.5 sits below that ceiling while still rejecting repetitive strings. The allowlist section prevents false positives from documentation examples or placeholder values. Custom rules should be committed to the repository to ensure consistent scanning across all environments.
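
A quick way to choose a threshold is to measure a few representative strings. The sketch below computes Shannon entropy in bits per character, on the assumption that this is the measure the entropy field is compared against and that it is taken over the whole matched text. `shannon_entropy` and the sample values are hypothetical and exist only for this example.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """Return the Shannon entropy of text in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Made-up samples: a placeholder and a value that uses every hex digit.
samples = {
    "placeholder": "internal_key_" + "0" * 32,
    "varied hex": "internal_key_" + "0123456789abcdef" * 2,
}

if __name__ == "__main__":
    for label, value in samples.items():
        print(f"{label}: {shannon_entropy(value):.2f} bits/char")
```

Running it against real token samples from a test environment shows how much headroom a given threshold leaves before genuine tokens start being filtered out.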

4. Historical Remediation

Running gitleaks detect on a repository with past leaks will surface findings even after key rotation. The secret remains in the git object store. Rotation mitigates active risk but does not remove the artifact.

Remediation Workflow:

  1. Rotate all exposed credentials immediately.
  2. Use git filter-repo or BFG Repo-Cleaner to rewrite history.
  3. Force-push the cleaned branch.
  4. Notify all collaborators to re-clone or reset their local copies.

Example:

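# Install git-filter-repo
pip install git-filter-repo

# Rewrite history to remove a specific file containing secrets.
# Run this in a fresh clone; filter-repo refuses to run elsewhere unless --force is given.
git filter-repo --invert-paths --path secrets.env

# filter-repo removes the origin remote as a safety measure; add it back first.
# <repository-url> is a placeholder for the real remote URL.
git remote add origin <repository-url>

# Force push (coordinate with team)
git push --force --all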

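Rationale: Git commits are immutable and identified by their content hash, so removing data means rewriting every commit from the first affected one onward. The new hashes diverge from every existing clone. Coordination is mandatory to prevent merge conflicts, data loss, and the old history being pushed back.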
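
When several files have to be removed, it helps to build the rewrite command from a reviewed list rather than typing it by hand. The sketch below only constructs and prints the command; it does not run it. `filter_repo_command` and the example paths are hypothetical, and removing whole files is assumed to be acceptable for the repository in question.

```python
import shlex


def filter_repo_command(paths):
    """Build a `git filter-repo` argv that drops the given paths from history."""
    unique = sorted(set(paths))
    if not unique:
        raise ValueError("no paths given; refusing to build an empty rewrite")
    argv = ["git", "filter-repo", "--invert-paths"]
    for path in unique:
        # --path may be repeated; with --invert-paths each one is removed.
        argv += ["--path", path]
    return argv


if __name__ == "__main__":
    # Example paths only; substitute the files reported by the scan.
    command = filter_repo_command(["secrets.env", "config/prod.env", "secrets.env"])
    print(shlex.join(command))
```

Printing the command instead of executing it leaves a point for review before an irreversible history rewrite.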

Pitfall Guide

1. Running detect Without fetch-depth: 0

Explanation: Shallow clones in CI only include the latest commit. Gitleaks will miss secrets buried in historical commits. Fix: Set fetch-depth: 0 in the actions/checkout step on GitHub Actions; on GitLab CI, the equivalent is the GIT_DEPTH: 0 variable.

2. Relying Solely on Pre-commit Hooks

Explanation: Developers can bypass hooks with --no-verify or use clients that ignore them. Pre-commit is a convenience, not an enforcement mechanism. Fix: Implement CI scanning as the authoritative gate. Pre-commit reduces noise; CI prevents escapes.

3. Ignoring Entropy Thresholds

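Explanation: A regex alone also matches placeholders and repetitive strings, and the resulting noise overwhelms real findings. Entropy filtering removes low-complexity values such as runs of zeros; documented sample values (e.g., AKIAIOSFODNN7EXAMPLE) are not low-entropy and need an allowlist entry instead. Fix: Configure entropy values in custom rules, using the default ruleset's entropy checks as a baseline, and allowlist known placeholders.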

4. Assuming Rotation Removes Secrets from Git

Explanation: Rotating a key stops active abuse but leaves the credential in the repository history. Bots may have already cached the value. Fix: Combine rotation with history rewriting using git filter-repo. Treat rotation as incident response, not remediation.

5. Hardcoding Allowlists in CI

Explanation: Allowing specific patterns in CI configuration creates drift between local and remote scans. Developers may push allowed patterns locally that fail in CI. Fix: Commit the gitleaks.toml configuration to the repository. Ensure all environments use the same ruleset.

6. Overlooking SARIF Integration

Explanation: JSON reports are machine-readable but lack UI integration. Teams may miss findings if they rely on console output. Fix: Enable SARIF output and upload to GitHub Code Scanning. This surfaces findings in the PR diff and repository security tab.

7. Running protect Without --staged

Explanation: Without --staged, protect scans unstaged working-tree changes (the output of git diff) instead of the staged changes. In a pre-commit hook this can miss a secret that is already staged and flag edits that are not part of the commit. Fix: Always use --staged in pre-commit hooks. It limits scope to changes about to be committed.

Production Bundle

Action Checklist

  • Install Gitleaks CLI and verify version compatibility with team tooling
  • Configure .pre-commit-config.yaml with gitleaks protect --staged
  • Add CI workflow with fetch-depth: 0 and SARIF report upload
  • Commit gitleaks.toml with custom rules for internal token formats
  • Set entropy thresholds on custom rules to filter low-complexity matches
  • Document rotation and history rewriting procedures in runbooks
  • Confirm that CI still blocks a secret committed with --no-verify
  • Review SARIF findings in GitHub Code Scanning dashboard weekly

Decision Matrix

| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| < 20 developers, rare leaks | Gitleaks + pre-commit + CI | Sufficient coverage, low overhead | Free (MIT binary) |
| > 50 developers, frequent leaks | GitGuardian or similar | Dashboard, auto-revocation, triage reduce coordination cost | $10–$50/dev/month |
| Open-source project | Gitleaks direct binary | No licensing restrictions, full history scan | Free |
| Compliance-driven (SOC 2, HIPAA) | Commercial scanner with audit logs | Built-in reporting, centralized incident management | Higher tier pricing |
| Custom internal tokens | Gitleaks + custom TOML rules | Regex-based detection is deterministic and auditable | Free |

Configuration Template

# gitleaks.toml
title = "Team Gitleaks configuration"

# A custom config replaces the built-in rules unless it extends them.
[extend]
useDefault = true

# File-size limits are not a config key; pass --max-target-megabytes on the
# command line instead. Entropy is set per rule, as shown below.

# Add custom rules below.

[[rules]]
id = "custom-service-token"
description = "Internal service token (prefix + 40 hex chars)"
regex = '''svc_token_[a-f0-9]{40}'''
# Kept below 4.0, the ceiling for pure hex, so genuine tokens are not filtered out.
entropy = 3.5
tags = ["internal", "service-token"]

[rules.allowlist]
regexes = [
    '''svc_token_0000000000000000000000000000000000000000''',
    '''svc_token_ffffffffffffffffffffffffffffffffffffffff'''
]
paths = [
    '''docs/.*'''
]
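
Before committing a custom rule, it is worth checking the regex and allowlist against a few sample lines. The sketch below does this in Python for the rule in the template above. Gitleaks uses Go's regular expression syntax; these particular patterns are also valid in Python's `re`, which is an assumption that will not hold for every pattern. The path allowlist is not modeled, and `classify` and the sample lines are hypothetical.

```python
import re

# Patterns mirroring the custom-service-token rule and its allowlist.
RULE_REGEX = re.compile(r"svc_token_[a-f0-9]{40}")
ALLOWLIST_REGEXES = [
    re.compile("svc_token_" + "0" * 40),
    re.compile("svc_token_" + "f" * 40),
]


def classify(line):
    """Return 'no match', 'allowlisted', or 'finding' for one line of text."""
    match = RULE_REGEX.search(line)
    if match is None:
        return "no match"
    if any(rx.search(match.group(0)) for rx in ALLOWLIST_REGEXES):
        return "allowlisted"
    return "finding"


# Made-up sample lines covering the three outcomes.
SAMPLES = [
    'token = "svc_token_' + "0" * 40 + '"',
    'token = "svc_token_' + "3f9a" * 10 + '"',
    'token = "svc_token_short"',
]

if __name__ == "__main__":
    for line in SAMPLES:
        print(f"{classify(line):12} {line}")
```

A check like this catches an over-broad allowlist or an off-by-one in the length quantifier before the rule reaches CI.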

Quick Start Guide

  1. Install CLI: Run brew install gitleaks (macOS/Linux) or download the binary from GitHub releases.
  2. Initialize Pre-commit: Add the Gitleaks hook to .pre-commit-config.yaml and run pre-commit install.
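  3. Verify Local Scan: Create a test file with a known pattern (e.g., AKIAIOSFODNN7EXAMPLE), stage it with git add, and run gitleaks protect --staged. Confirm detection. If the documentation key is not reported, the ruleset in use may allowlist it; substitute another dummy value that matches a rule.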
  4. Add CI Workflow: Copy the GitHub Actions template to .github/workflows/secret-scan.yml. Ensure fetch-depth: 0 is set.
  5. Commit and Push: Push changes. Verify the action runs and SARIF results appear in the repository security tab.