[Klaud Cold] Remove disallowed --hf-overrides indexer override from DSV4 ATOM disagg #2038
… launcher
The DeepSeek-V4-Pro conditional set --hf-overrides
'{"use_index_cache":true,"index_topk_freq":4}', skipping the indexer
on 3 of every 4 layers and reducing model architecture FLOPs — not
allowed per PR_REVIEW_CHECKLIST.md. HF_OVERRIDES_ARG stays as an empty
extension point so the expansion sites are unchanged.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Thanks for the contribution! Please reach out to respective companies' CODEOWNER to fill in the latest PR_REVIEW_CHECKLIST.md before pinging core maintainer on Slack for review. In order for the signoff PR check bot to trigger, you must follow the PR_REVIEW_CHECKLIST.md template correctly, including the phrase For PR verification, add the PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. See GitHub's docs on re-running failed jobs.
LGTM — removes a disallowed indexer-skipping override, keeps the empty extension point intact.
Overview
Small mechanical cleanup PR touching two files: benchmarks/multi_node/amd_utils/server_atom.sh (removes a 3-line MODEL_NAME == DeepSeek-V4-Pro conditional that set HF_OVERRIDES_ARG to an --hf-overrides JSON that reduces indexer FLOPs) and perf-changelog.yaml (appends the required entry for dsv4-fp4-mi355x-atom-disagg). Companion to #2037 which did the same for the single-node recipe.
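To make the FLOPs concern concrete, here is a minimal sketch of what "skipping the indexer on 3 of every 4 layers" (as the commit message puts it) amounts to for index_topk_freq=4. The layer count of 8 is hypothetical, and the rule that the indexer runs on layers where the layer index is a multiple of the frequency is an assumption made for illustration; the actual layer selection in the model code is not shown in this PR.

```shell
#!/usr/bin/env bash
# Illustration only. num_layers is a hypothetical value, and the
# "layer % freq == 0" selection rule is an assumption, not the model's code.
num_layers=8
index_topk_freq=4   # value taken from the removed override

ran=0
skipped=0
for ((layer = 0; layer < num_layers; layer++)); do
  if (( layer % index_topk_freq == 0 )); then
    ran=$((ran + 1))          # indexer computed on this layer
  else
    skipped=$((skipped + 1))  # indexer skipped; a cached result would be reused
  fi
done

echo "indexer ran on ${ran} layers, skipped on ${skipped} of ${num_layers}"
```

Under these assumptions the indexer is computed on one quarter of the layers, which is why the override counts as a reduction in model-architecture FLOPs rather than a serving-side tuning knob.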
Security risks
None. This is a shell-launcher config for benchmark runs — no auth, crypto, or user-input paths. The removed override was a performance/FLOPs manipulation, not a security control.
Level of scrutiny
Low. The PR removes a disallowed model-architecture-FLOPs-reducing override to comply with PR_REVIEW_CHECKLIST.md. HF_OVERRIDES_ARG="" is preserved as an empty extension point so the three ${HF_OVERRIDES_ARG} expansion sites (prefill node 0, prefill nodes 1..N, decode nodes) still expand cleanly to nothing. Other DSV4-conditional branches (GPU_MAX_HW_QUEUES/ATOM_CPU_AFFINITY exports, --enable-tbo, AITER_QUICK_REDUCE_QUANTIZATION) are untouched.
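The "expands cleanly to nothing" behavior can be sketched in isolation. This is a minimal sketch, assuming the expansion sites use a plain unquoted ${HF_OVERRIDES_ARG}; how server_atom.sh actually builds its command line is not shown here. The count_args function and the --model/--port flags are hypothetical stand-ins for the real launcher invocation.

```shell
#!/usr/bin/env bash
# count_args is a hypothetical stand-in for the real launcher:
# it only reports how many arguments it received.
count_args() { echo "$#"; }

HF_OVERRIDES_ARG=""

# Unquoted expansion: an empty value undergoes word splitting and
# contributes zero words, so the command line is unchanged.
with_unquoted=$(count_args --model demo ${HF_OVERRIDES_ARG} --port 8000)

# Quoted expansion, for contrast: an empty value is passed through
# as one empty-string argument.
with_quoted=$(count_args --model demo "${HF_OVERRIDES_ARG}" --port 8000)

echo "unquoted: ${with_unquoted} args, quoted: ${with_quoted} args"
```

The contrast with the quoted form is the reason an empty extension point is only safe when the sites expand it unquoted (or splice it into a command string); a quoted empty value would hand the server a stray empty argument.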
Other factors
- full-sweep-fail-fast label applied, so benchmark validation runs.
- The pull/XXX placeholder in the changelog entry matches the documented template in AGENTS.md, not a bug.
- No bugs found by the bug hunting system.
- No prior reviews from me on this PR.
see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=28693609959
Summary
- Remove the DeepSeek-V4-Pro conditional in benchmarks/multi_node/amd_utils/server_atom.sh that set HF_OVERRIDES_ARG="--hf-overrides '{\"use_index_cache\":true,\"index_topk_freq\":4}'". This indexer-skipping override reduces model-architecture FLOPs, which PR_REVIEW_CHECKLIST.md disallows. HF_OVERRIDES_ARG="" remains as an empty extension point so the three expansion sites are untouched.
- Append the required perf-changelog.yaml entry for dsv4-fp4-mi355x-atom-disagg.
- full-sweep-fail-fast label applied for PR validation.

🤖 Generated with Claude Code