Issue 38 | Human judgment in the agent improvement loop
Today's digest
OpenAI Blog: OpenAI outlines the next phase of enterprise AI, as adoption accelerates across industries with Frontier, ChatGPT Enterprise, Code…
GitHub karpathy: AI agents running research on single-GPU nanochat training automatically
GitHub anthropics: Public repository for Agent Skills
GitHub openai: Skills Catalog for Codex
GitHub karpathy: A positive developer community for builders and agents.
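Summary + commentary: A collection of notebooks/recipes showcasing som… | Commentary: anthropics/claude-cookbooks is better judged by its practical adoption value than by…
Summary + commentary: Evals is a framework for evaluating LLMs and LLM… | Commentary: For openai/evals, look past surface specs and watch whether, in reasoning quality, retrieval effectiveness, or usability, it brings…
Summary + commentary: Official, Anthropic-managed directory of high qu… | Commentary: The point of anthropics/claude-plugins-official is not novelty, but…
Summary + commentary: Claude Code is an agentic coding tool that lives… | Commentary: For anthropics/claude-code, the better question is whether it can improve multi-step collaboration, memory management, and…
Summary + commentary: A lightweight, powerful framework for multi-agen… | Commentary: For openai/openai-agents-python, the better question is whether it can improve multi-step collaboration,…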
The next phase of enterprise AI
Tags: #ai_engineering_blogs #core
Author:
Original: OpenAI outlines the next phase of enterprise AI, as adoption accelerates across industries with Frontier, ChatGPT Enterprise, Codex, and company-wide AI agents.
karpathy/autoresearch
Tags: #github_orgs #extended
Author:
Original: AI agents running research on single-GPU nanochat training automatically
anthropics/skills
Tags: #github_orgs #extended
Author:
Original: Public repository for Agent Skills
openai/skills
Tags: #github_orgs #extended
Author:
Original: Skills Catalog for Codex
karpathy/KarpathyTalk
Tags: #github_orgs #extended
Author:
Original: A positive developer community for builders and agents.
anthropics/claude-cookbooks
Tags: #github_orgs #extended
Author:
Original: A collection of notebooks/recipes showcasing some fun and effective ways of using Claude.
openai/evals
Tags: #github_orgs #extended
Author:
Original: Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
anthropics/claude-plugins-official
Tags: #github_orgs #extended
Author:
Original: Official, Anthropic-managed directory of high quality Claude Code Plugins.
anthropics/claude-code
Tags: #github_orgs #extended
Author:
Original: Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflows - all through natural language commands.
openai/openai-agents-python
Tags: #github_orgs #extended
Author:
Original: A lightweight, powerful framework for multi-agent workflows
openai/codex-plugin-cc
Tags: #github_orgs #extended
Author:
Original: Use Codex from Claude Code to review code or delegate tasks.
anthropics/courses
Tags: #github_orgs #extended
Author:
Original: Anthropic's educational courses
openai/codex
Tags: #github_orgs #extended
Author:
Original: Lightweight coding agent that runs in your terminal
karpathy/LLM101n
Tags: #github_orgs #extended
Author:
Original: LLM101n: Let's build a Storyteller
karpathy/nanoGPT
Tags: #github_orgs #extended
Author:
Original: The simplest, fastest repository for training/finetuning medium-sized GPTs.
The Vercel plugin on Claude Code wants to read your prompts
Tags: #research_community #core
Author:
Original: The author reports that Vercel's plugin for Claude Code reads prompts by default and sends telemetry data back.
Link: https://akshaychugh.xyz/writings/png/vercel-plugin-telemetry
Human judgment in the agent improvement loop
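Tags: #ai_engineering_blogs #core
Author:
Original: LangChain explains how human judgment fits into the agent improvement loop, focusing on structuring institutional knowledge, collecting feedback, and evaluation.
Link: https://blog.langchain.com/human-judgment-in-the-agent-improvement-loop/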
Introducing stateful MCP client capabilities on Amazon Bedrock AgentCore Runtime
Tags: #engineering_ai_infra_blogs #extended
Author:
Original: Bedrock AgentCore adds stateful capabilities for MCP clients, letting agents maintain sessions with external tools across turns.
Meta removes ads for social media addiction litigation
Tags: #research_community #core
Author:
Original: Meta removes ads for litigation over social media addiction, affecting the advertisers who ran them.
Link: https://www.axios.com/2026/04/09/meta-social-media-addiction-ads
How Pizza Tycoon simulated traffic on a 25 MHz CPU
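Tags: #research_community #core
Author:
Original: A technical retrospective on how city traffic was simulated in Pizza Tycoon in the era of 25 MHz PCs; the algorithm choices made under hardware constraints are the interesting part.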