
feat: add reproducible Evaluation + Optimization pipeline #99

Open
Adonis-a233 wants to merge 2 commits into trpc-group:main from Adonis-a233:feat/eval-optimize-loop


@Adonis-a233 Adonis-a233 commented Jun 30, 2026


Summary

Implements the reproducible Evaluation + Optimization closed loop for the issue
"Design and implement a reproducible Evaluation + Optimization pipeline".

Related Issue

Resolves #91

What Changed

  • Added/updated examples/optimization/eval_optimize_loop/
  • Implemented six-stage pipeline:
    1. baseline train/val evaluation
    2. failure attribution
    3. prompt optimization
    4. candidate validation
    5. configurable acceptance gate
    6. JSON/Markdown audit persistence
  • Added fake mode with fake model / fake judge / trace-mode evalsets.
  • Added live mode wiring for AgentOptimizer.optimize and TargetPrompt.add_path.
  • Added prompt snapshots and sample optimization_report.json/.md.
  • Added Yun callback prompt example using TargetPrompt.add_callback.
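
For reviewers who want a mental model of the six stages before opening the example directory, here is a minimal, self-contained sketch of the loop. It is not the code in this PR: every name in it (`Case`, `fake_model`, `fake_judge`, `run_pipeline`, the `min_gain` threshold, the report fields) is a hypothetical stand-in, the "optimizer" is a trivial rule rather than `AgentOptimizer.optimize`, and the gate is assumed to compare validation pass rates only.

```python
import json
from dataclasses import dataclass


@dataclass
class Case:
    """One eval sample (hypothetical shape, not the PR's evalset format)."""
    question: str  # e.g. "1+2"
    expected: str  # e.g. "3"


def fake_model(prompt: str, question: str) -> str:
    """Deterministic fake model: bare digits only if the prompt asks for them."""
    a, b = (int(x) for x in question.split("+"))
    total = a + b
    return str(total) if "digits" in prompt else f"The answer is {total}."


def fake_judge(output: str, expected: str) -> bool:
    """Fake judge: exact match after trimming whitespace."""
    return output.strip() == expected


def evaluate(prompt, cases):
    """Stages 1 and 4: return (pass rate, failed cases) for a prompt."""
    failures = [c for c in cases
                if not fake_judge(fake_model(prompt, c.question), c.expected)]
    return 1 - len(failures) / len(cases), failures


def attribute(prompt, failures):
    """Stage 2: label each failure with a coarse cause."""
    return ["format" if c.expected in fake_model(prompt, c.question)
            else "wrong_answer" for c in failures]


def optimize(prompt, causes):
    """Stage 3: rule-based rewrite standing in for a real optimizer."""
    return prompt + " Reply with digits only." if "format" in causes else prompt


def accept(baseline_val, candidate_val, min_gain):
    """Stage 5: configurable gate on validation improvement."""
    return candidate_val - baseline_val >= min_gain


def run_pipeline(prompt, train, val, min_gain=0.05):
    base_train, failures = evaluate(prompt, train)  # stage 1 (train)
    base_val, _ = evaluate(prompt, val)             # stage 1 (val)
    causes = attribute(prompt, failures)            # stage 2
    candidate = optimize(prompt, causes)            # stage 3
    cand_val, _ = evaluate(candidate, val)          # stage 4
    accepted = accept(base_val, cand_val, min_gain)  # stage 5
    return {
        "baseline": {"train": base_train, "val": base_val},
        "failure_causes": causes,
        "candidate": {"prompt": candidate, "val": cand_val},
        "accepted": accepted,
        "final_prompt": candidate if accepted else prompt,
    }


def to_markdown(report) -> str:
    """Stage 6 (Markdown half): short human-readable audit summary."""
    return "\n".join([
        "# Optimization report",
        f"- baseline val pass rate: {report['baseline']['val']:.2f}",
        f"- candidate val pass rate: {report['candidate']['val']:.2f}",
        f"- accepted: {report['accepted']}",
    ])


if __name__ == "__main__":
    train = [Case("1+2", "3"), Case("2+2", "4")]
    val = [Case("3+4", "7")]
    report = run_pipeline("Add the numbers.", train, val)
    # Stage 6 (JSON half): sorted keys keep the audit file stable across runs.
    print(json.dumps(report, indent=2, sort_keys=True))
    print(to_markdown(report))
```

Because the fake model and fake judge are deterministic, repeated runs over the same train/val cases produce the same report, which is the property the fake mode is meant to provide. The candidate prompt replaces the baseline only when the gate passes; otherwise the report keeps the original prompt as `final_prompt`.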

Validation

Fake mode:

python examples/optimization/eval_optimize_loop/run.py --mode fake


github-actions Bot commented Jun 30, 2026


CLA Assistant Lite bot: All contributors have signed the CLA ✍️ ✅


codecov Bot commented Jun 30, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (main@8080800).

Additional details and impacted files
@@            Coverage Diff             @@
##             main         #99   +/-   ##
==========================================
  Coverage        ?   87.64107%           
==========================================
  Files           ?         433           
  Lines           ?       41557           
  Branches        ?           0           
==========================================
  Hits            ?       36421           
  Misses          ?        5136           
  Partials        ?           0           


@Adonis-a233

Author

I have read the CLA Document and I hereby sign the CLA

Rook1ex added a commit to trpc-group/cla-database that referenced this pull request Jun 30, 2026

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Build a closed loop of automated regression and prompt optimization for Evaluation + Optimization

1 participant