fix(scan): exclude Python virtual environments from manifest collection + bump Coana CLI to 15.5.10 (1.1.128) #1379
Merged
Martin Torp (mtorp) merged 2 commits on Jun 25, 2026
…on (1.1.128)

Recursive manifest discovery for `socket scan`, reachability, and `socket fix` walked into Python virtual environments and collected the thousands of dependency manifests (setup.py, pyproject.toml, requirements.txt, …) installed under their site-packages, bloating scans with packages that are not part of the user's project.

Exclude venvs two ways:

- Add `.venv` to IGNORED_DIRS for a cheap traversal-prune of the conventional directory name.
- Detect arbitrarily-named venvs by their `pyvenv.cfg` marker (written at the environment root by stdlib `venv` per PEP 405 and by virtualenv >= 20). Discovery is folded into the existing `.gitignore` discovery walk, so it adds no extra full-tree traversal; each venv root contributes a `<dir>/**` ignore that all downstream glob paths honor.

Bare `venv`/`env` are intentionally not name-excluded to avoid skipping a legitimately-named non-venv directory; the `pyvenv.cfg` check covers them.
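The two exclusion mechanisms described in the commit message can be sketched roughly as follows. This is a minimal standalone illustration, not the code from `src/utils/glob.mts`: the function name `findVenvIgnorePatterns`, the abbreviated `IGNORED_DIRS` contents, and the choice to stop descending once a marker is found are all assumptions, and the real change piggybacks on the existing `.gitignore` discovery walk rather than running its own.

```typescript
import * as fs from 'node:fs'
import * as os from 'node:os'
import * as path from 'node:path'

// Hypothetical stand-in for the CLI's IGNORED_DIRS (abbreviated).
const IGNORED_DIRS = new Set(['node_modules', '.git', '.yarn', '.venv'])

// Walk `root` once and return a `<dir>/**` ignore pattern for every
// directory that contains a pyvenv.cfg marker at its top level.
function findVenvIgnorePatterns(root: string): string[] {
  const patterns: string[] = []
  const stack: string[] = [root]
  while (stack.length > 0) {
    const dir = stack.pop()!
    for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
      // Name-based prune: never descend into conventionally ignored dirs.
      if (!entry.isDirectory() || IGNORED_DIRS.has(entry.name)) {
        continue
      }
      const child = path.join(dir, entry.name)
      if (fs.existsSync(path.join(child, 'pyvenv.cfg'))) {
        // Marker-based detection: record the venv root as a POSIX-style
        // pattern and (assumption) skip its contents entirely.
        const rel = path.relative(root, child).split(path.sep).join('/')
        patterns.push(`${rel}/**`)
        continue
      }
      stack.push(child)
    }
  }
  return patterns.sort()
}

// Demo tree: one arbitrarily-named venv and one plain `venv` directory.
const demoRoot = fs.mkdtempSync(path.join(os.tmpdir(), 'venv-demo-'))
fs.mkdirSync(path.join(demoRoot, 'myenv', 'lib'), { recursive: true })
fs.writeFileSync(path.join(demoRoot, 'myenv', 'pyvenv.cfg'), 'home = /usr/bin\n')
fs.mkdirSync(path.join(demoRoot, 'venv')) // no marker, so not a venv
console.log(findVenvIgnorePatterns(demoRoot)) // expected: [ 'myenv/**' ]
```

Note that `.venv` never appears in the returned patterns here: it is pruned by name before the marker check is reached, which matches the "cheap traversal-prune" role the commit message gives it.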
Oskar Haarklou Veileborg (BarrensZeppelin)
approved these changes
Jun 25, 2026
Oskar Haarklou Veileborg (BarrensZeppelin)
left a comment
LGTM ✅
Benjamin Barslev Nielsen (barslev)
approved these changes
Jun 25, 2026
This PR ships in release 1.1.128 and contains two changes.
1. Exclude Python virtual environments from manifest collection
Problem
When the CLI collects manifest files for a scan (`socket scan`, `socket scan --reach`, and `socket fix`), it walks the target tree recursively and excludes a fixed set of directories (`node_modules`, `.git`, `.yarn`, …) plus anything matched by `.gitignore`/`socket.yml` rules. Python virtual environments are not excluded. A venv's `lib/.../site-packages` tree holds thousands of installed packages, each with its own `setup.py`, `pyproject.toml`, `requirements.txt`, etc. Those get picked up as manifests, which bloats scans, pollutes results with dependency-of-dependency manifests, and wastes a full-tree walk into a directory that should never be scanned.

Venvs are usually named `.venv` or `venv`, but the name is arbitrary. The reliable, name-independent signal is the `pyvenv.cfg` file that Python's stdlib `venv` (PEP 405) and `virtualenv` ≥ 20 always write at the environment root. Nothing else creates that file, so there are no false positives.

Change
In `src/utils/glob.mts` (the single path all three commands funnel through, `getPackageFilesForScan` → `globWithGitIgnore`):

- Add `.venv` to `IGNORED_DIRS` for a cheap traversal-prune of the conventional directory name (also catches pre-2020 virtualenvs that lack a `pyvenv.cfg`).
- Detect arbitrarily-named venvs by their `pyvenv.cfg` marker. Discovery is folded into the existing `.gitignore` discovery walk, so it adds no extra full-tree traversal. Each detected venv root contributes a `<dir>/**` ignore pattern to the shared `ignores` set, which every downstream glob path (fast, filter-streaming, and negated-pattern) already honors.

Bare `venv`/`env` are intentionally not name-excluded, to avoid skipping a legitimately-named non-venv directory that holds real manifests; the `pyvenv.cfg` check covers them correctly (a real venv has the marker; a non-venv directory does not).

Out of scope: conda environments (no `pyvenv.cfg`) and the separate Bazel manifest walker.

Tests
Added cases to `src/utils/glob.test.mts`:

- Excludes an arbitrarily-named venv (`myenv/`) detected via `pyvenv.cfg`, keeping the project's own root manifest.
- Excludes `.venv` by name.
- Does not exclude a plain `venv/` that has no `pyvenv.cfg` (regression guard against over-exclusion).

Verification: `glob.test.mts` 18/18 and `path-resolve.test.mts` 14/14 pass; `pnpm check` (lint + tsc) is green.

2. Bump Coana CLI to 15.5.10
Upgrades `@coana-tech/cli` from `15.5.9` to `15.5.10` (`package.json` + `pnpm-lock.yaml`). For details on what's included in this Coana release, see the Coana Changelogs.
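Returning to change 1, the effect of a `<dir>/**` ignore on manifest collection can be illustrated with a small filter. This is only a sketch under stated assumptions: `isUnderIgnoredDir` and `collectManifests` are hypothetical names, the matcher handles nothing but patterns of the exact form `<dir>/**`, and the real CLI delegates matching to its existing glob machinery. The sample paths mirror the scenarios covered by the added tests.

```typescript
// Hypothetical matcher: supports only patterns shaped like `<dir>/**`.
function isUnderIgnoredDir(relPath: string, ignorePatterns: string[]): boolean {
  return ignorePatterns.some(pattern => {
    if (!pattern.endsWith('/**')) {
      return false
    }
    // 'myenv/**' -> 'myenv/'; the trailing slash prevents 'myenv2/...'
    // from matching by accident.
    const prefix = pattern.slice(0, -2)
    return relPath.startsWith(prefix)
  })
}

// Keep only the candidate manifests that no ignore pattern covers.
function collectManifests(candidates: string[], ignorePatterns: string[]): string[] {
  return candidates.filter(p => !isUnderIgnoredDir(p, ignorePatterns))
}

const candidates = [
  'pyproject.toml', // the project's own root manifest
  'myenv/lib/python3.12/site-packages/foo/setup.py', // inside a detected venv
  'venv/requirements.txt', // plain directory named venv, no pyvenv.cfg
]

// Only `myenv` was detected via its marker, so only it is ignored.
console.log(collectManifests(candidates, ['myenv/**']))
// expected: [ 'pyproject.toml', 'venv/requirements.txt' ]
```

The plain `venv/` directory keeps its manifest because no pattern was generated for it, which is the over-exclusion case the regression test guards against.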