[Klaud Cold] Update kimik2.5-fp4-b300-vllm vLLM image to v0.24.0 #2076

Closed
functionstackx wants to merge 1 commit into main from klaud-cold/kimik2.5-fp4-b300-vllm-v0.24.0

@functionstackx

Collaborator

Summary

Update vLLM image from vllm/vllm-openai:v0.22.0 to vllm/vllm-openai:v0.24.0

Recipes touched: kimik2.5-fp4-b300-vllm

Test plan

  • full-sweep-fail-fast sweep passes.

🤖 Generated with Claude Code

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
functionstackx force-pushed the klaud-cold/kimik2.5-fp4-b300-vllm-v0.24.0 branch from d5379c3 to df44c1c on July 4, 2026 21:24
@github-actions

github-actions Bot commented Jul 4, 2026

Contributor

Thanks for the contribution! Please ask the CODEOWNER at the relevant company to fill in the latest PR_REVIEW_CHECKLIST.md before pinging a core maintainer on Slack for review. For the signoff PR check bot to trigger, you must follow the PR_REVIEW_CHECKLIST.md template exactly, including the phrase "As a PR reviewer and CODEOWNER, I have reviewed this and have".

For PR verification, add the full-sweep-enabled or full-sweep-fail-fast label to this PR — the benchmark sweep only runs on labeled PRs.

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. See GitHub's docs on re-running failed jobs


@claude claude Bot left a comment

Contributor

LGTM — straightforward image bump following the established pattern, with a matching perf-changelog entry.

Extended reasoning...

Overview

This PR bumps the vLLM image for the kimik2.5-fp4-b300-vllm recipe from v0.22.0 to v0.24.0 in configs/nvidia-master.yaml and adds a corresponding perf-changelog.yaml entry pointing at PR #2076. The changes are two lines of substance in total.

Security risks

None. This is a container image tag update to an upstream vLLM image; no auth, crypto, permissions, or user-input paths are touched.

Level of scrutiny

Low. This is a mechanical, well-patterned dependency bump for a single benchmark recipe, gated by the full-sweep-fail-fast label so CI will validate the runtime before merge.

Other factors

The bug hunter flagged a nit about a stale "reuses B200 recipe as-is" comment now that the B200/B300 image tags have diverged. It's a documentation-drift nit, not a correctness issue, and doesn't warrant blocking the PR.


kimik2.5-fp4-b300-vllm:
-  image: vllm/vllm-openai:v0.22.0
+  image: vllm/vllm-openai:v0.24.0

🟡 The comment above this block (lines 2900-2902) says the B300 config "reuses the existing Kimi-K2.5 FP4 B200 vLLM recipe as-is until B300-specific tuning is available," but after this bump the B200 config still pins vllm/vllm-openai:v0.22.0 while B300 is on v0.24.0 — so the image field of the recipe has diverged and "as-is" is no longer strictly accurate. Consider a one-line tweak to the comment (e.g. "derived from the B200 recipe; image has since been bumped independently") or bump the B200 image in the same PR.


Extended reasoning...

What the comment says vs. what the code now does. Lines 2900-2902 in configs/nvidia-master.yaml contain an explanatory NOTE above the kimik2.5-fp4-b300-vllm block:

At the time of submission, https://docs.vllm.ai/projects/recipes/en/latest/moonshotai/Kimi-K2.5.html does not have a B300-specific recipe, so this config reuses the existing Kimi-K2.5 FP4 B200 vLLM recipe as-is until B300-specific tuning is available.

After this PR:

  • kimik2.5-fp4-b200-vllm (line 2880): image: vllm/vllm-openai:v0.22.0
  • kimik2.5-fp4-b300-vllm (line 2905): image: vllm/vllm-openai:v0.24.0

Step-by-step check. Diffing the two YAML blocks field-by-field after the bump:

  1. model, model-prefix, precision, framework, runner (differs by design: b200 vs b300), scenarios, and the search-space (tp/ep/concurrency ranges) all still match B200. So the substantive "no B300-specific tuning" claim is still true.
  2. image: B200 = v0.22.0, B300 = v0.24.0. The two recipes now differ in exactly one field — the image tag.

Since image is part of the recipe, "reuses ... as-is" is no longer literally accurate.
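
The field-by-field comparison above can be sketched in a few lines of Python. This is only an illustration: the two dictionaries are hypothetical stand-ins for the YAML blocks, with the `image` and `runner` values taken from this review and every other value a placeholder, and `diff_recipes` is a helper name invented here, not something in the repository.

```python
def diff_recipes(a: dict, b: dict, ignore: frozenset = frozenset()) -> dict:
    """Return {field: (a_value, b_value)} for every field whose values differ."""
    keys = (a.keys() | b.keys()) - ignore
    return {k: (a.get(k), b.get(k)) for k in sorted(keys) if a.get(k) != b.get(k)}


# Hypothetical stand-ins for the two recipe blocks in configs/nvidia-master.yaml.
# Only the image tags and runner names come from the review text.
b200 = {
    "image": "vllm/vllm-openai:v0.22.0",
    "runner": "b200",
    "precision": "fp4",      # assumed from the recipe name
    "framework": "vllm",     # assumed from the recipe name
    "search-space": {"tp": [4, 8], "ep": [1], "concurrency": [4, 64]},  # placeholder
}
b300 = {**b200, "image": "vllm/vllm-openai:v0.24.0", "runner": "b300"}

# `runner` differs by design, so it is excluded from the drift report.
drift = diff_recipes(b200, b300, ignore=frozenset({"runner"}))
print(drift)
# {'image': ('vllm/vllm-openai:v0.22.0', 'vllm/vllm-openai:v0.24.0')}
```

With the real blocks loaded from the config file, an empty result would mean "as-is" still holds literally; a single `image` entry matches the situation described here.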

Why the refutation is worth taking seriously. One verifier argued the comment's essential meaning is about B300-specific tuning — i.e., that no B300-tailored scenario/search-space work has been done — not about the image tag, which is routine dependency maintenance shared with the B200 line. That framing is reasonable, and it's why I'm flagging this as a nit rather than a blocker: the load-bearing claim of the comment (upstream lacks a B300 recipe, so tuning parameters are copied from B200) remains true. The "as-is" phrasing is just slightly loose given the image divergence.

Impact. None on benchmark execution — this is purely comment/documentation drift. The value of flagging it is small (readers of the file may be briefly confused when they see the image tags differ), and the fix is a one-line comment edit or a companion image bump on the B200 recipe.

Suggested fix. Either (a) reword the note to something like "...reuses the B200 recipe's tuning parameters until B300-specific tuning is available (the image tag has been bumped independently)," or (b) also bump the B200 image to v0.24.0 in this PR and add kimik2.5-fp4-b200-vllm to the changelog config-keys. Option (a) is lower-risk since it doesn't require re-running B200 benchmarks.

@functionstackx

Collaborator Author

Canary blocked by cluster-side model staging, not the image bump: /scratch/models/Kimi-K2.5-NVFP4 is permission-denied for the job user on multiple B300 nodes (reproduced on b300-017 and b300-002; vLLM then misparses the unreadable path as an HF repo id). The int4 sibling configs using moonshotai/Kimi-K2.5 load fine on the same cluster with the same v0.24.0 image. Needs a chown/chmod of the NVFP4 model dir on the B300 nodes.
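
A pre-flight check along these lines could surface the permission problem before vLLM starts and falls back to treating the path as an HF repo id. This is a rough sketch, not part of the benchmark tooling: `check_model_dir` is a hypothetical helper, and it assumes the check runs as the same job user that launches vLLM.

```python
import os


def check_model_dir(path: str) -> str:
    """Classify a local model directory as 'ok', 'missing',
    'not-a-directory', or 'unreadable' for the current user."""
    # Note: exists() is also False when a parent directory cannot be
    # traversed, so 'missing' can mean "not visible to this user".
    if not os.path.exists(path):
        return "missing"
    if not os.path.isdir(path):
        return "not-a-directory"
    # Listing a directory needs read permission; opening files inside it
    # needs execute (search) permission.
    if not os.access(path, os.R_OK | os.X_OK):
        return "unreadable"
    return "ok"


# Path taken from the comment above; the result depends on the node and user.
status = check_model_dir("/scratch/models/Kimi-K2.5-NVFP4")
print(f"model dir status: {status}")
```

Anything other than `ok` would point at cluster-side staging rather than the image bump, which is the distinction drawn in this comment.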

@functionstackx

Collaborator Author

Closing: blocked by NVIDIA cluster-side model staging (NVFP4 checkpoint missing/unreadable) or image-pull lock issues. Will re-cut once the cluster staging is fixed.

functionstackx deleted the klaud-cold/kimik2.5-fp4-b300-vllm-v0.24.0 branch on July 4, 2026 23:10