Continuous integration

CI/CD integration

This chapter walks you through integrating Semgrep into your GitHub repository as part of your continuous integration (CI) and continuous deployment (CD) pipeline.

We recommend integrating Semgrep with GitHub Actions using the following approach:

  1. Schedule a full Semgrep scan on the main branch with a broad set of Semgrep rules (e.g., p/default).
  2. Run diff-aware scans on pull requests, using a fine-tuned set of rules that yield high-confidence, true-positive results.
  3. Once your Semgrep implementation is mature, configure Semgrep to block the PR pipeline if there are unresolved Semgrep findings.
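
As a minimal sketch of how step 3 could be staged, the fragment below keeps the Semgrep step non-blocking while the rules are still being tuned. It assumes that GitHub Actions' step-level continue-on-error setting is an acceptable rollout mechanism; this is a general GitHub Actions option chosen for illustration, not a Semgrep-specific feature.

```yaml
# Fragment of a job's `steps:` list, not a complete workflow.
steps:
  - uses: actions/checkout@v3
  - run: semgrep ci
    # Assumption: report findings without failing the job during the tuning phase.
    # Remove this line once the ruleset is mature so that findings fail the job.
    continue-on-error: true
    env:
      SEMGREP_RULES: p/default
```

Note that a failing job typically prevents a merge only if it is also marked as a required status check in the repository's branch protection settings.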

Understanding Semgrep CI configuration options

Familiarize yourself with the available environment variables and their default values by reviewing the Configuration reference. The following are key points to note:

  • Semgrep checks for new versions by default, as controlled by the SEMGREP_ENABLE_VERSION_CHECK variable.
  • By default, Semgrep applies a five-minute timeout to each Git command that it runs (SEMGREP_GIT_COMMAND_TIMEOUT).
  • Semgrep scans each file with a 30-second timeout (SEMGREP_TIMEOUT) and, by default, skips a file once three rules have timed out on it (--timeout-threshold).
  • The SEMGREP_RULES environment variable defines the rules used by Semgrep. You can specify multiple rule sources by separating them with a space.
  • By default, the CI process fails if findings are detected but passes if internal errors occur. For more information, see Passing or failing the CI job.
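
The options above can be set explicitly in the env block of the step that runs Semgrep. The fragment below is a sketch only: the numeric values mirror the defaults described above and assume that both timeouts are expressed in seconds, and it assumes that setting SEMGREP_ENABLE_VERSION_CHECK to 0 turns the version check off. Confirm the details in the Configuration reference before relying on them.

```yaml
# Fragment of a job's `steps:` list, not a complete workflow.
steps:
  - run: semgrep ci
    env:
      # Assumption: "0" disables the new-version check, which is on by default.
      SEMGREP_ENABLE_VERSION_CHECK: "0"
      # Timeout for each Git command; 300 assumes seconds (five minutes).
      SEMGREP_GIT_COMMAND_TIMEOUT: "300"
      # Timeout for scanning a file; 30 assumes seconds.
      SEMGREP_TIMEOUT: "30"
      # Multiple rule sources, separated by a space.
      SEMGREP_RULES: p/default p/trailofbits
```

Values are quoted so that YAML passes them to the step as strings rather than numbers.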

GitHub integration steps

Follow these steps to integrate Semgrep with your GitHub repository:

  1. Create a semgrep.yml file in the .github/workflows directory of the repository you want to scan.
  2. Copy the code snippet below into the semgrep.yml file. The workflow defines two jobs:
    • The first job:
      • Runs on a schedule (once per month).
      • Runs when a pull request is merged.
      • Runs when there is a direct push to the main/master branch.
      • Uses the broad p/default Semgrep ruleset.
    • The second job:
      • Runs specifically for pull requests.
      • Uses multiple security-related rules.
# Define the name of this GitHub Actions workflow.
name: Semgrep
on:
  # Run the workflow on pull_request events for diff-aware scanning.
  pull_request: {}
  # Run the workflow on push events to mainline branches to report all findings.
  push:
    branches: ["master", "main"]
  # Schedule the workflow to run periodically using cron syntax.
  schedule:
    - cron: '0 0 1 * *' # Schedule Semgrep to run once per month (at 00:00 on day-of-month 1).
# Define the jobs that run as part of this workflow.
jobs:
  # Define the first job for scheduled scanning and mainline branch scanning.
  semgrep-schedule:
    # Define the conditions for running this job. Run on schedule, push to master/main, or merged PR.
    # Skip any PR created by Dependabot to avoid permission issues.
    if: ((github.event_name == 'schedule' || github.event_name == 'push' || github.event.pull_request.merged == true)
        && github.actor != 'dependabot[bot]')
    # Name this GitHub Actions job.
    name: Semgrep default scan
    # Define the environment in which the job runs.
    runs-on: ubuntu-latest
    container:
      # Use a Docker image with Semgrep pre-installed.
      image: returntocorp/semgrep
    steps:
      # Use the GitHub Actions Checkout step to fetch the project source code.
      - uses: actions/checkout@v3
      # Execute the "semgrep ci" command within the Semgrep Docker container.
      - run: semgrep ci
        env:
          # Set the SEMGREP_RULES environment variable to define which rules Semgrep should use.
          SEMGREP_RULES: p/default # Browse more rulesets - semgrep.dev/explore
  # Define the second job for scanning pull requests.
  semgrep-pr:
    # Define the conditions for running this job. Run only within Pull Requests, excluding Dependabot PRs.
    if: (github.event_name == 'pull_request' && github.actor != 'dependabot[bot]')
    # Name this GitHub Actions job.
    name: Semgrep PR scan
    # Define the environment in which the job runs.
    runs-on: ubuntu-latest
    container:
      # Use a Docker image with Semgrep pre-installed.
      image: returntocorp/semgrep
    steps:
      # Fetch project source with GitHub Actions Checkout.
      - uses: actions/checkout@v3
      # Execute the "semgrep ci" command within the Semgrep Docker container.
      - run: semgrep ci
        env:
          # Set the SEMGREP_RULES environment variable to define which rules Semgrep should use.
          # Use common security-related rulesets for this job.
          SEMGREP_RULES: p/cwe-top-25 p/owasp-top-ten p/r2c-security-audit p/javascript p/trailofbits # more at semgrep.dev/explore

With this configuration, the full codebase is scanned once per month and on every push to the mainline branch, and each pull request is scanned against security-focused rulesets before it is merged.

This content is licensed under a Creative Commons Attribution 4.0 International license.