查看“︁让AI Agent不再迷失：规划的艺术”︁的源代码

<blockquote>"没有计划的 Agent 走哪算哪" —— 先列步骤再动手，完成率翻倍。</blockquote>在前面文章中，我们构建了最简单的 Agent 循环。今天我们来解决一个实际问题：'''多步任务中，Agent 会丢失进度'''。

重复做过的事、跳步、跑偏——对话越长越严重。一个 10 步重构可能做完 1-3 步就开始即兴发挥，因为 4-10 步已经被挤出注意力了。

== 一、问题：长对话中的迷失 ==
想象一下这个场景：

你让 Agent 重构一个项目：

# 添加类型注解
# 补充文档字符串
# 添加主函数保护
# 运行测试验证
# 提交更改

Agent 开始工作，读取文件、修改代码、运行测试...但到第 4 步时，上下文已经被工具结果填满，系统提示的影响力逐渐被稀释。Agent "忘记"了还有第 5 步，或者以为第 2 步还没做，又做了一遍。

'''这不是模型的错，是 Harness 的问题。'''

== 二、解决方案：TodoManager ==
给 Agent 一个带状态的待办清单，并强制它定期更新进度。
 <code>+--------+      +-------+      +---------+
 |  User  | ---> |  LLM  | ---> | Tools   |
 | prompt |      |       |      | + todo  |
 +--------+      +---+---+      +----+----+
                     ^                |
                     |   tool_result  |
                     +----------------+
                           |
               +-----------+-----------+
               | TodoManager state     |
               | [ ] task A            |
               | [>] task B  <- doing  |
               | [x] task C            |
               +-----------------------+
                           |
               if rounds_since_todo >= 3:
                 inject <reminder> into tool_result</code>

== 三、核心机制 ==

=== 1. 状态管理 ===
同一时间只允许一个 <code>in_progress</code>：
 <code>class TodoManager:
     def update(self, items: list) -> str:
         validated, in_progress_count = [], 0
         for item in items:
             status = item.get("status", "pending")
             if status == "in_progress":
                 in_progress_count += 1
             validated.append({
                 "id": item["id"],
                 "text": item["text"],
                 "status": status
             })
         if in_progress_count > 1:
             raise ValueError("Only one task can be in_progress")
         self.items = validated
         return self.render()</code>
'''"同时只能有一个 in_progress"''' 强制顺序聚焦。

=== 2. 工具集成 ===
<code>todo</code> 工具和其他工具一样加入调度映射：
 <code>TOOL_HANDLERS = {
     ''# ...基础工具...''
     "todo": lambda **kw: TODO.update(kw["items"]),
 }</code>

=== 3. 提醒机制（Nag Reminder） ===
模型连续 3 轮以上不调用 <code>todo</code> 时，系统主动注入提醒：
 <code>if rounds_since_todo >= 3 and messages:
     last = messages[-1]
     if last["role"] == "user" and isinstance(last.get("content"), list):
         last["content"].insert(0, {
             "type": "text",
             "text": "<reminder>Update your todos.</reminder>",
         })</code>
'''问责压力'''——你不更新计划，系统就追着你问。

== 四、实际效果 ==
对比无规划 vs 有规划的 Agent：
{| class="wikitable"
!场景
!无规划
!有规划
|-
|10步重构
|经常重复或遗漏步骤
|按顺序完成，状态清晰
|-
|多文件编辑
|上下文混乱
|每一步聚焦单一目标
|-
|长对话
|后期偏离主题
|始终围绕待办清单
|}

== 五、完整示例 ==
 <code>''# Agent 首先创建计划''
 todo.update([
     {"id": 1, "text": "读取并分析当前代码", "status": "in_progress"},
     {"id": 2, "text": "添加类型注解", "status": "pending"},
     {"id": 3, "text": "补充文档字符串", "status": "pending"},
     {"id": 4, "text": "添加主函数保护", "status": "pending"},
     {"id": 5, "text": "运行测试验证", "status": "pending"},
 ])
 
 ''# 完成第一步后更新''
 todo.update([
     {"id": 1, "text": "读取并分析当前代码", "status": "completed"},
     {"id": 2, "text": "添加类型注解", "status": "in_progress"},
     ''# ...其他不变''
 ])
 
 ''# 如此继续，直到全部完成''</code>

== 六、试一试 ==
 <code>cd learn-claude-code
 python agents/s03_todo_write.py</code>
试试这些 prompt：

# <code>Refactor the file hello.py: add type hints, docstrings, and a main guard</code>
# <code>Create a Python package with __init__.py, utils.py, and tests/test_utils.py</code>
# <code>Review all Python files and fix any style issues</code>

观察 Agent 如何先创建待办清单，然后一步步执行，定期更新进度。

== 七、设计哲学 ==
'''Harness 层应该提供规划结构，但不替模型画航线。'''

* Harness 提供 Todo 工具和管理器
* 模型自己决定任务是什么、如何排序
* Harness 强制"一次只做一件事"和"定期更新"
* 但不规定具体怎么做

这种'''约束与自由'''的平衡是关键：

* 太松：模型迷失在细节中
* 太紧：模型变成脚本执行器

== 八、进阶思考 ==
TodoManager 是内存中的扁平清单，适合单次会话。对于更复杂的场景，我们需要：

* '''依赖关系'''：任务 B 依赖任务 A
* '''并行执行'''：任务 C 和 D 可以同时进行
* '''持久化'''：跨会话保存任务状态

这将在后续文章中介绍，进化成完整的'''任务系统（Task System）'''。

但即使在那之前，简单的 Todo 已经能大幅提升 Agent 的可靠性。'''先列步骤再动手，完成率翻倍'''。
----''Bash 就够了。真正的 Agent 是宇宙所需要的全部。''