fix: guard C extension reload in _prefer_module_from_site_packages to prevent process crash#9148
Conversation
… prevent process crash
There was a problem hiding this comment.
Hey - I've found 1 issue, and left some high level feedback:
- The
_has_loaded_c_extensionhelper traverses allsys.moduleskeys on every call; consider short‑circuiting earlier by limiting the search to the module subtree via a precomputed prefix set or caching results to reduce overhead during large dependency trees. - The C extension detection relies purely on
.pyd/.sosuffixes; if you expect nonstandard layouts (e.g., extensions packaged under different filenames or platforms like macOS with.dylib), consider making the detection slightly more flexible or documenting the assumptions in the docstring.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The `_has_loaded_c_extension` helper traverses all `sys.modules` keys on every call; consider short‑circuiting earlier by limiting the search to the module subtree via a precomputed prefix set or caching results to reduce overhead during large dependency trees.
- The C extension detection relies purely on `.pyd`/`.so` suffixes; if you expect nonstandard layouts (e.g., extensions packaged under different filenames or platforms like macOS with `.dylib`), consider making the detection slightly more flexible or documenting the assumptions in the docstring.
## Individual Comments
### Comment 1
<location path="astrbot/core/utils/pip_installer.py" line_range="634-635" />
<code_context>
+ mod = sys.modules.get(key)
+ if mod is None:
+ continue
+ mod_file = getattr(mod, "__file__", "") or ""
+ if os.path.splitext(mod_file)[1].lower() in (".pyd", ".so"):
+ return True
+ return False
</code_context>
<issue_to_address>
**suggestion:** The extension check might miss some platform-specific C-extension filenames.
The current check only handles `.pyd` and `.so`, which works for most CPython setups but can miss valid extension modules (e.g., `.dll` on Windows or custom/embedded builds). To make this more robust, consider either expanding the allowed suffixes (including `.dll`) or using `importlib.machinery.EXTENSION_SUFFIXES` so the logic stays aligned with the interpreter’s supported C-extension types.
Suggested implementation:
```python
import os
import sys
import importlib.machinery
```
```python
def _has_loaded_c_extension(module_name: str) -> bool:
"""检查 sys.modules 中目标模块的依赖子树是否已包含 C 扩展。
遍历已加载的 key(如 'pikepdf'、'pikepdf._core'),
检查其 __file__ 后缀是否为当前解释器支持的 C 扩展后缀
(参见 importlib.machinery.EXTENSION_SUFFIXES)。
"""
# 使用解释器提供的 EXTENSION_SUFFIXES,以覆盖不同平台和嵌入场景下的所有 C 扩展后缀
ext_suffixes = {suffix.lower() for suffix in importlib.machinery.EXTENSION_SUFFIXES}
for key in list(sys.modules.keys()):
if not (key == module_name or key.startswith(f"{module_name}.")):
continue
mod = sys.modules.get(key)
if mod is None:
continue
mod_file = getattr(mod, "__file__", "") or ""
suffix = os.path.splitext(mod_file)[1].lower()
if suffix and suffix in ext_suffixes:
return True
return False
```
1. 如果文件顶部的导入顺序或分组(例如按标准库/第三方/本地模块分组)与上面的 SEARCH 片段不一致,需要相应调整 SEARCH 块,以在现有导入列表中插入 `import importlib.machinery`。
2. 如果该文件中还有其它地方直接假设 C 扩展只为 `.pyd` / `.so`,建议也统一改为使用 `importlib.machinery.EXTENSION_SUFFIXES`,以保持行为一致。
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
There was a problem hiding this comment.
Code Review
This pull request introduces a helper function _has_loaded_c_extension to check if a module's dependency subtree contains loaded C extensions (with .pyd or .so extensions) and uses it to skip preferring modules from site-packages when C extensions are detected. The review feedback suggests making the __file__ attribute check more robust by explicitly verifying it is a string before checking its extension, preventing potential runtime errors.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| mod_file = getattr(mod, "__file__", "") or "" | ||
| if os.path.splitext(mod_file)[1].lower() in (".pyd", ".so"): | ||
| return True |
There was a problem hiding this comment.
为了使 C 扩展检测更加健壮和具有防御性,我们应该显式检查 __file__ 是否为字符串。在某些环境(如命名空间包、冻结模块或测试中的 Mock 模块)中,__file__ 可能是 None 甚至是非字符串类型。使用 isinstance(mod_file, str) 结合 .endswith() 更加安全、简洁,并且可以避免在路径解析过程中可能出现的 TypeError 或 AttributeError。
| mod_file = getattr(mod, "__file__", "") or "" | |
| if os.path.splitext(mod_file)[1].lower() in (".pyd", ".so"): | |
| return True | |
| mod_file = getattr(mod, "__file__", None) | |
| if isinstance(mod_file, str) and mod_file.lower().endswith((".pyd", ".so")): | |
| return True |
There was a problem hiding this comment.
Verified against the live AstrBot process: every module in sys.modules has __file__ as str (24 modules) or None (22 modules) — zero instances of int, Path, or other non-str types. CPython guarantees __file__ is str | None on real modules. Our or "" fallback already converts None to an empty string, which yields a non-matching suffix from splitext. The isinstance check covers a scenario that cannot occur in production module introspection, so we're keeping the simpler or "" guard.
Gemini Code Assist 评审澄清1. 对当前 AstrBot 运行实例中 Sourcery 评审澄清1.后缀检测改用
2.每次遍历全部 当前运行时 3.docstring 未注明后缀假设(不采纳) 现有 docstring 已明确注明仅检测 |
Summary / 概述
Closes #9146
_prefer_module_from_site_packages的sys.modules.pop()+exec_module()对 C 扩展模块(.pyd/.so)不安全——C 扩展的初始化状态存储在进程级 C++ 内存中,pop 无法清理,重新 exec_module 触发重复初始化。nanobind 检测到重复键注册后调用abort(),直接杀死整个 Python 进程。仅影响 Tauri 桌面壳版本(
ASTRBOT_DESKTOP_CLIENT=1),Docker 和源码运行不受影响。复现场景:安装含
img2pdf依赖的插件 →_collect_candidate_modules递归展开传递依赖收集到{img2pdf, pikepdf, ...}→ T₁ 处理img2pdf时内部import pikepdf加载pikepdf._core.pyd→ T₂ 处理pikepdf时 pop 后重新 exec_module → C 扩展重复初始化 →abort()杀死进程。此后每次重启循环崩溃。Root Cause / 根因
sys.modules.pop()只能清理 Python 侧的模块对象,C++ 侧的静态变量、类型注册表、枚举表完全不受影响。这不是 nanobind 的 bug——任何 C 扩展(pybind11、SWIG、Cython、pyo3)都存在同样的问题。Changes / 改动
文件:
astrbot/core/utils/pip_installer.py新增
_has_loaded_c_extension()检测函数(15 行),在_prefer_module_from_site_packages入口加一个守卫(5 行),matched_keys处新增注释(2 行)。总计 +24 行,0 删除。设计要点:
sys.modules而非文件系统——只关注已加载状态,避免误判.pyd(Windows)和.so(Linux/macOS)locked_modules过滤无冲突return False后_ensure_preferred_modules的二次校验中发现模块已从正确路径加载,跳过冲突检测Verification / 验证
img2pdf)pikepdf,T2 时刻)abort()杀死进程__file__=None/ 无__file__属性的模块.so扩展单元测试:
_has_loaded_c_extension10 个场景全部通过。已在 Tauri 桌面版实测验证(安装了之前必然崩溃的插件,正常运行)。Checklist
pip_installer.py一个文件)list(sys.modules.keys())防御迭代中 dict 变更)