fix(scan): exclude Python virtual environments from manifest collection + bump Coana CLI to 15.5.10 (1.1.128) #1379

Merged
Martin Torp (mtorp) merged 2 commits into v1.x from martin/exclude-venv-variants-for-manifest-file
Jun 25, 2026


@mtorp Martin Torp (mtorp) commented Jun 25, 2026

Contributor

This PR ships in release 1.1.128 and contains two changes.

1. Exclude Python virtual environments from manifest collection

Problem

When the CLI collects manifest files for a scan (socket scan, socket scan --reach, and socket fix), it walks the target tree recursively and excludes a fixed set of directories (node_modules, .git, .yarn, …) plus anything matched by .gitignore / socket.yml rules. Python virtual environments are not excluded. A venv's lib/.../site-packages tree holds thousands of installed packages, each with its own setup.py, pyproject.toml, requirements.txt, etc. Those get picked up as manifests — bloating scans, polluting results with dependency-of-dependency manifests, and wasting a full-tree walk into a directory that should never be scanned.

Venvs are usually named .venv or venv, but the name is arbitrary. The reliable, name-independent signal is the pyvenv.cfg file that Python's stdlib venv (PEP 405) and virtualenv ≥ 20 always write at the environment root — nothing else creates that file, so there are no false positives.
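
The marker check described above can be sketched as follows. This is a minimal illustration, not the code in glob.mts: the helper name `isPythonVenvRoot` and the temporary directory layout are assumptions made for the example.

```typescript
import * as fs from 'node:fs'
import * as os from 'node:os'
import * as path from 'node:path'

// Hypothetical helper (not the name used in glob.mts): a directory is treated
// as a venv root when it directly contains a pyvenv.cfg entry.
function isPythonVenvRoot(dir: string): boolean {
  return fs.existsSync(path.join(dir, 'pyvenv.cfg'))
}

// Usage: a throwaway tree with one marked and one unmarked directory.
const demoRoot = fs.mkdtempSync(path.join(os.tmpdir(), 'venv-marker-'))
const markedDir = path.join(demoRoot, 'myenv')
const plainDir = path.join(demoRoot, 'venv')
fs.mkdirSync(markedDir)
fs.mkdirSync(plainDir)
fs.writeFileSync(path.join(markedDir, 'pyvenv.cfg'), 'home = /usr/bin\n')

console.log(isPythonVenvRoot(markedDir), isPythonVenvRoot(plainDir)) // true false
```

The directory name plays no part in the decision, which is why `myenv` is detected while a plain `venv` directory is not.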

Change

In src/utils/glob.mts (the single path all three commands funnel through, getPackageFilesForScan → globWithGitIgnore):

  • Add .venv to IGNORED_DIRS for a cheap traversal-prune of the conventional directory name (also catches environments created by virtualenv < 20, which lack a pyvenv.cfg).
  • Detect arbitrarily-named venvs by their pyvenv.cfg marker. Discovery is folded into the existing .gitignore discovery walk, so it adds no extra full-tree traversal. Each detected venv root contributes a <dir>/** ignore pattern to the shared ignores set, which every downstream glob path (fast, filter-streaming, and negated-pattern) already honors.

Bare venv / env are intentionally not name-excluded, to avoid skipping a legitimately-named non-venv directory that holds real manifests; the pyvenv.cfg check covers them correctly (a real venv has the marker; a non-venv directory does not).
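
The two exclusion mechanisms can be sketched together as a single directory walk. This is a standalone approximation under stated assumptions: the real change folds detection into the existing .gitignore discovery walk, the `IGNORED_DIRS` contents below are only the names mentioned in this PR, and `collectVenvIgnorePatterns` is a hypothetical name.

```typescript
import * as fs from 'node:fs'
import * as os from 'node:os'
import * as path from 'node:path'

// Assumed subset of the real IGNORED_DIRS list (names taken from the PR text).
const IGNORED_DIRS = new Set(['node_modules', '.git', '.yarn', '.venv'])

// Walk `root` once, pruning ignored names and collecting a `<dir>/**` pattern
// (relative to root, forward slashes) for each directory holding pyvenv.cfg.
function collectVenvIgnorePatterns(root: string): string[] {
  const patterns: string[] = []
  const pending: string[] = [root]
  while (pending.length > 0) {
    const dir = pending.pop() as string
    for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
      if (!entry.isDirectory() || IGNORED_DIRS.has(entry.name)) continue
      const child = path.join(dir, entry.name)
      if (fs.existsSync(path.join(child, 'pyvenv.cfg'))) {
        // Venv root: record the ignore pattern and do not descend into it.
        const rel = path.relative(root, child).split(path.sep).join('/')
        patterns.push(`${rel}/**`)
        continue
      }
      pending.push(child)
    }
  }
  return patterns.sort()
}

// Usage: .venv is pruned by name, myenv and tools-env by marker, venv is kept.
const demoRoot = fs.mkdtempSync(path.join(os.tmpdir(), 'venv-walk-'))
for (const dir of ['.venv', 'myenv', 'venv', 'services/api/tools-env']) {
  fs.mkdirSync(path.join(demoRoot, dir), { recursive: true })
}
for (const dir of ['.venv', 'myenv', 'services/api/tools-env']) {
  fs.writeFileSync(path.join(demoRoot, dir, 'pyvenv.cfg'), 'home = /usr/bin\n')
}
fs.writeFileSync(path.join(demoRoot, 'venv', 'requirements.txt'), 'requests\n')

const patterns = collectVenvIgnorePatterns(demoRoot)
console.log(patterns) // [ 'myenv/**', 'services/api/tools-env/**' ]
```

In this sketch `.venv` never produces a pattern because the name-based prune skips it before the marker check runs, and the unmarked `venv` directory is traversed like any other directory.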

Out of scope: conda environments (no pyvenv.cfg) and the separate Bazel manifest walker.

Tests

Added cases to src/utils/glob.test.mts:

  • Excludes an arbitrarily-named venv (myenv/) detected via pyvenv.cfg, keeping the project's own root manifest.
  • Excludes .venv by name.
  • Keeps a non-venv directory named venv/ that has no pyvenv.cfg (regression guard against over-exclusion).
  • Confirms the exclusion also prunes through the streaming-filter path that manifest scanning actually uses.
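
The expectations in these cases can be illustrated on a plain list of candidate paths. This is a sketch, not the test code in glob.test.mts: `isUnderIgnoredDir` is a hypothetical helper that understands only the `<dir>/**` pattern shape, whereas the real globbing honors full ignore syntax. The name-pruned `.venv` case is left out because it never reaches pattern matching.

```typescript
// Hypothetical matcher: handles only patterns of the form `<dir>/**`.
function isUnderIgnoredDir(relPath: string, patterns: readonly string[]): boolean {
  return patterns.some(
    // 'myenv/**' -> prefix 'myenv/', so 'myenv2/x' is not matched.
    p => p.endsWith('/**') && relPath.startsWith(p.slice(0, -2)),
  )
}

// Pattern contributed by a detected venv root named myenv/.
const ignores = ['myenv/**']

const candidates = [
  'requirements.txt',
  'myenv/lib/python3.12/site-packages/bar/pyproject.toml',
  'venv/requirements.txt',
]

const kept = candidates.filter(p => !isUnderIgnoredDir(p, ignores))
console.log(kept) // [ 'requirements.txt', 'venv/requirements.txt' ]
```

The root manifest and the manifest inside the unmarked `venv/` directory survive, while everything under the detected venv root is dropped.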

Verification: glob.test.mts 18/18 and path-resolve.test.mts 14/14 pass; pnpm check (lint + tsc) is green.

2. Bump Coana CLI to 15.5.10

Upgrades @coana-tech/cli from 15.5.9 to 15.5.10 (package.json + pnpm-lock.yaml). For details on what's included in this Coana release, see the Coana Changelogs.

…on (1.1.128)

Recursive manifest discovery for `socket scan`, reachability, and
`socket fix` walked into Python virtual environments and collected the
thousands of dependency manifests (setup.py, pyproject.toml,
requirements.txt, …) installed under their site-packages, bloating scans
with packages that are not part of the user's project.

Exclude venvs two ways:
- Add `.venv` to IGNORED_DIRS for a cheap traversal-prune of the
  conventional directory name.
- Detect arbitrarily-named venvs by their `pyvenv.cfg` marker (written at
  the environment root by stdlib `venv` per PEP 405 and by virtualenv >=
  20). Discovery is folded into the existing `.gitignore` discovery walk,
  so it adds no extra full-tree traversal; each venv root contributes a
  `<dir>/**` ignore that all downstream glob paths honor.

Bare `venv`/`env` are intentionally not name-excluded to avoid skipping a
legitimately-named non-venv directory; the pyvenv.cfg check covers them.

Member

LGTM ✅

@mtorp Martin Torp (mtorp) changed the title from "fix(scan): exclude Python virtual environments from manifest collection (1.1.128)" to "fix(scan): exclude Python virtual environments from manifest collection + bump Coana CLI to 15.5.10 (1.1.128)" on Jun 25, 2026
@socket-security-staging


Review the following changes in direct dependencies.

| Diff | Package | Supply Chain Security | Vulnerability | Quality | Maintenance | License |
| --- | --- | --- | --- | --- | --- | --- |
| Added | npm/@coana-tech/cli@15.5.10 | 97 | 100 | 80 | 98 | 100 |

@socket-security


Review the following changes in direct dependencies.

| Diff | Package | Supply Chain Security | Vulnerability | Quality | Maintenance | License |
| --- | --- | --- | --- | --- | --- | --- |
| Added | npm/@coana-tech/cli@15.5.10 | 96 | 100 | 80 | 98 | 100 |

@mtorp Martin Torp (mtorp) merged commit 330612d into v1.x Jun 25, 2026
12 checks passed
@mtorp Martin Torp (mtorp) deleted the martin/exclude-venv-variants-for-manifest-file branch June 25, 2026 11:26