Issue 50: Changes in the system prompt between Claud…

Today's summary

Simon Willison: Anthropic are the only major AI lab to publish the system prompts for their user-facing chat systems. Their system prom…

Simon Willison: Anthropic's published system prompt history for Claude is transformed into a git-based exploration tool, breaking up th…

Simon Willison: Adding a new content type to my blog-to-newsletter tool

Simon Willison: This year’s PyCon US is coming up next month from May 13th to May 19th, with the core conference talks from Friday 15th…

Hugging Face Blog: nvidia/nemotron-ocr-v2 Image-to-Text

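Summary + commentary: Anthropic are the only major AI lab to publish… | Commentary: The most valuable thing about Claude's system prompts being public is that developers can compare product boundaries, alignment strategy, and version-to-version changes in concrete terms.

Summary + commentary: Anthropic's published system prompt history for… | Commentary: Turning the system prompts into a git timeline is practical: it turns "model behavior changes" from a matter of impression into a traceable engineering fact.

Summary + commentary: Adding a new content type to my blog-to-newslet… | Commentary: Small but complete automation cases like this are a useful reference; they show that gains in content workflows often come from minor but reusable tooling changes.

Summary + commentary: This year’s PyCon US is coming up next month fr… | Commentary: PyCon giving AI and security their own tracks shows that both topics have moved from niche interests to the formal agenda of the mainstream developer ecosystem.

Summary + commentary: nvidia/nemotron-ocr-v2 Image-to-Text… | Commentary: What matters in an OCR model is deployment cost and multilingual stability; if both hold up, it goes straight into document-processing pipelines.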

Changes in the system prompt between Claude Opus 4.6 and 4.7

Source: Simon Willison

Tags: #ai_engineering_blogs #core

Author:

Excerpt: Anthropic are the only major AI lab to publish the system prompts for their user-facing chat systems. Their system prompt archive now dates all the way back to Claude 3 … (18th April 2026)

Link: https://simonwillison.net/2026/Apr/18/opus-system-prompt/#atom-everything

Commentary: With the system prompt differences between versions public, developers can finally analyze behavior changes concretely instead of guessing.

Claude system prompts as a git timeline

Source: Simon Willison

Tags: #ai_engineering_blogs #core

Author:

Excerpt: Anthropic's published system prompt history for Claude is transformed into a git-based exploration tool, breaking up the monolithic markdown source into granular files and timestamped commits. By structuring extracted prompts … (18th April 2026) Anthropic publish the system prompts for Claude chat and make that page avai...

Link: https://simonwillison.net/2026/Apr/18/extract-system-prompts/#atom-everything

Commentary: Organizing prompt evolution into a timeline is a smart move: it makes changes in model policy auditable and reviewable after the fact.
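
The excerpt describes splitting one monolithic markdown source into granular files with timestamped commits. A minimal Python sketch of that idea follows; it is not the tool from the post. It assumes each prompt sits under a level-2 heading, and the function names (`slugify`, `split_sections`, `write_sections`, `git_date_env`) are hypothetical. The only git-specific detail used is the pair of standard environment variables `GIT_AUTHOR_DATE` and `GIT_COMMITTER_DATE`, which let a commit carry a chosen timestamp.

```python
import re
from pathlib import Path


def slugify(title: str) -> str:
    """Turn a heading such as 'Claude Opus 4.7' into 'claude-opus-4-7'."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def split_sections(markdown: str) -> list[tuple[str, str]]:
    """Split a markdown document on level-2 headings into (title, body) pairs.

    Assumption: one prompt per '## ' heading; text before the first
    such heading is ignored.
    """
    sections: list[tuple[str, str]] = []
    title, lines = None, []
    for line in markdown.splitlines():
        if line.startswith("## "):
            if title is not None:
                sections.append((title, "\n".join(lines).strip()))
            title, lines = line[3:].strip(), []
        elif title is not None:
            lines.append(line)
    if title is not None:
        sections.append((title, "\n".join(lines).strip()))
    return sections


def write_sections(markdown: str, out_dir: Path) -> list[Path]:
    """Write each section to its own file so git can diff them separately."""
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for title, body in split_sections(markdown):
        path = out_dir / f"{slugify(title)}.md"
        path.write_text(body + "\n", encoding="utf-8")
        paths.append(path)
    return paths


def git_date_env(iso_date: str) -> dict[str, str]:
    """Environment variables that make `git commit` record a chosen date."""
    return {"GIT_AUTHOR_DATE": iso_date, "GIT_COMMITTER_DATE": iso_date}
```

With one file per prompt and one commit per published date, `git log -p` on a single file would show how that prompt changed over time.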

Adding a new content type to my blog-to-newsletter tool

Source: Simon Willison

Tags: #ai_engineering_blogs #core

Author:

Excerpt: Here's an example of a deceptively short prompt that got a lot of work done in a single shot. First, some background.

Link: https://simonwillison.net/guides/agentic-engineering-patterns/adding-a-new-content-type/#atom-everything

Commentary: This kind of small toolchain improvement is down to earth and often does more for content production efficiency than grand narratives do.

Join us at PyCon US 2026 in Long Beach - we have new AI and security tracks this year

Source: Simon Willison

Tags: #ai_engineering_blogs #core

Author:

Excerpt: This year’s PyCon US is coming up next month from May 13th to May 19th, with the core conference talks from Friday 15th to Sunday 17th and tutorial and sprint … (17th April 2026)

Link: https://simonwillison.net/2026/Apr/17/pycon-us-2026/#atom-everything

Commentary: The conference agenda putting AI and security side by side shows they are no longer fringe interests but core topics for the developer community.

Building a Fast Multilingual OCR Model with Synthetic Data

Source: Hugging Face Blog

Tags: #ai_engineering_blogs #core

Author:

Excerpt: nvidia/nemotron-ocr-v2 Image-to-Text

Link: https://huggingface.co/blog/nvidia/nemotron-ocr-v2

Commentary: Synthetic data paired with multilingual OCR is practical, but what decides the outcome is speed, error rate, and robustness to messy real-world data.

Engineering at Anthropic: Inside the team building reliable AI systems

Source: Anthropic Engineering

Tags: #ai_engineering_blogs #core

Author:

Excerpt: Agentic coding benchmarks like SWE-bench and Terminal-Bench are commonly used to compare the software engineering capabilities of frontier models, with top spots on leaderboards often separated by just a few percentage points. These scores are often treated as precise measurements of relative model capability and increa...

Link: https://www.anthropic.com/engineering/infrastructure-noise

Commentary: This piece reads as a reminder that small gaps on leaderboards may not be reliable; what affects delivery is noise control and the evaluation methodology itself.
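
One rough way to see why a gap of a few percentage points may not mean much is to compare it against the spread of repeated runs of the same model. The Python sketch below is a crude heuristic under stated assumptions, not the method from the linked post: it assumes several scores per model from repeated runs are available, and the function names and the threshold `k` are hypothetical choices.

```python
from statistics import mean, stdev


def gap_vs_noise(runs_a: list[float], runs_b: list[float]) -> tuple[float, float]:
    """Return (gap between mean scores, larger run-to-run standard deviation).

    Each list holds scores from repeated runs of one model on the same
    benchmark; at least two runs per model are needed for stdev.
    """
    gap = abs(mean(runs_a) - mean(runs_b))
    noise = max(stdev(runs_a), stdev(runs_b))
    return gap, noise


def gap_is_distinguishable(
    runs_a: list[float], runs_b: list[float], k: float = 2.0
) -> bool:
    """Heuristic: trust the gap only if it exceeds k times the noise.

    k = 2.0 is an arbitrary illustrative threshold, not a calibrated test.
    """
    gap, noise = gap_vs_noise(runs_a, runs_b)
    return gap > k * noise
```

If the check returns False, the ranking between the two models could plausibly flip on a rerun, which is the kind of caution the commentary above points to.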

Scaling Managed Agents: Decoupling the brain from the hands

Source: Anthropic Engineering

Tags: #ai_engineering_blogs #core

Author:

Excerpt: Get started with Claude Managed Agents by following our docs. A running topic on the Engineering Blog is how to build effective agents and design harnesses for long-running work.

Link: https://www.anthropic.com/engineering/managed-agents

Commentary: Decoupling the brain from the hands is the right direction: what enterprises need is not a flashier agent but a more replaceable, more controllable execution layer.

[AINews] Anthropic Claude Opus 4.7 - literally one step better than 4.6 in every dimension

Source: Latent Space

Tags: #ai_engineering_blogs #core

Author:

Excerpt: The new SOTA model asserts its dominance. (Apr 17, 2026) Thursday mornings are for prestige AI launches, and while OpenAI put in a valiant effort with GPT-Rosalind and The New New Codex (w...

Link: https://www.latent.space/p/ainews-anthropic-claude-opus-47-literally

Commentary: A new model launch grabs attention, of course, but the more important question is whether it pushes up developers' default preferences, evaluation baselines, and pricing expectations together.

llm-anthropic 0.25

Source: Simon Willison

Tags: #ai_engineering_blogs #core

Author:

Excerpt: LLM access to models by Anthropic, including the Claude series. This is a beat by Simon Willison, posted on 16th April 2026.

Link: https://simonwillison.net/2026/Apr/16/llm-anthropic/#atom-everything

Commentary: Library-level updates look unremarkable, but they often determine how quickly a new model enters real development workflows, so they are worth tracking.

Qwen3.6-35B-A3B on my laptop drew me a better pelican than Claude Opus 4.7

Source: Simon Willison

Tags: #ai_engineering_blogs #core

Author:

Excerpt: For anyone who has been (inadvisably) taking my pelican riding a bicycle benchmark seriously as a robust way to test models, here are pelicans from this morning’s two big model … (16th April 2026)

Link: https://simonwillison.net/2026/Apr/16/qwen-beats-opus/#atom-everything

Commentary: Lighthearted benchmarks like this are not rigorous, but they are a good reminder that local models are rapidly closing in on cloud flagships in day-to-day experience.