第20周 AI Weekly huggingface/AnyLanguageModel

今日摘要

Simon Willison：GitLab Act 2 There's a lot going on in this announcement from GitLab about the "workforce reduction" and "structural and strategic…

Simon Willison：Release: llm 0.32a2 A bunch of useful stuff in this LLM alpha, but the most important detail is this one: Most reasoning-capable O…

GitHub huggingface：An API-compatible, drop-in replacement for Apple's Foundation Models framework with support for custom language model providers.

GitHub anthropics：A suite of plugins for legal workflows

Simon Willison：TIL: Using LLM in the shebang line of a script Kim_Bruning on Hacker News But seriously, you can put a shebang on an english text…

总结 + 观点：Learning on the Shop floor Tobias Lütke describe…｜中文观点：Learning on the Shop floor 的核心不在新鲜感，而在它是否能提升工…

总结 + 观点：Lightweight Android capture developer tools for…｜中文观点：openai/snap-o 的核心不在新鲜感，而在它是否能提升工程效率、部署稳定性或开发者…

总结 + 观点：Tool: CSP Allow-list Experiment An experiment th…｜中文观点：从 CSP Allow-list Experiment 看，后续更应关注安全事故是否改变企…

总结 + 观点：In preparation for a lightning talk I'm giving a…｜中文观点：Warelay -> OpenClaw 的核心不在新鲜感，而在它是否能提升工程效率、部署稳…

总结 + 观点：VQVAEs, GumbelSoftmaxes and friends｜中文观点：karpathy/deep-vector-quantization 更值得从实际采用价值来…

Thoughts on GitLab's workforce reduction" and "structural and strategic decisions"

来源：Simon Willison

标签：#ai_engineering_blogs #engineering-value

作者：

原文：GitLab Act 2 There's a lot going on in this announcement from GitLab about the "workforce reduction" and "structural and strategic decisions" they are making with respect to the agentic era. They're "planning to reduce the number of countries by up to 30% where we have small teams". One of the most interesting things about GitLab is that they have employees spread across a large number of countries - 18 are listed in their public employee handbook but this post says they are "operating in nearly 60 countries". That handbook used to document their payroll workflows for those countries too - they stopped publishing that in 2023 but the last public version (hooray for version control) remains a fascinating read. Since we don't know which of those 60 countries have small teams, we can't calculate how many countries that 30% applies to. "We're planning to flatten the organization, removing up to three layers of management in some functions so leaders are closer to the work." - this isn't the first announcement of this type I've seen that's trimming management. Coinbase recently announced a much more aggressive version of this: they were "flattening our org structure to 5 layers max below" and "No pure managers: Every leader at Coinbase must also be a strong and active individual contributor. Managers should be like player-coaches". In terms of team structure: "We're re-organizing R&D to create roughly 60 smaller, more empowered teams with end-to-end ownership, nearly doubling the number of independent teams." I've always loved the idea of individual teams that can ship features unblocked by other teams, and it makes sense to me that agentic engineering can increase the capability of such teams. The 37signals public employee handbook used to have a section on working In self-sufficient, independent teams which perfectly captured this for me, I'm sad to see they removed that detail in January 2024! Tucked away towards the bottom: We will be retiring CREDIT as our values framework - that's the values framework described on this page "Collaboration, Results for Customers, Efficiency, Diversity, Inclusion Belonging, Iteration, and Transparency". The new values are "Speed with Quality, Ownership Mindset, Customer Outcomes". The fact that "Diversity" is no longer in there is likely to attract a whole lot of attention, so it's worth noting that a sub-bullet under Customer Outcomes reads "Interpersonal excellence: individuals who are good humans, embrace diversity, inclusion and belonging, assume good intent and treat everyone with respect". Here's the part of their new strategy that most resonated with me: The agentic era multiplies demand for software Software has been the force multiplier behind nearly every business transformation of the last two decades. The constraint was the cost and time of producing and managing it. That constraint is collapsing. As the cost of producing software collapses, demand for it will expand. Last year, the developer platform market used to be measured in tens of dollars per user per month, this year it is hundreds/user/month and headed to thousands. Not only is the value of software for builders increasing, but we believe there will be more software and builders than ever, and we will serve an increasing volume of both That very much encapsulates my own optimistic, Jevons-paradox -inspired hope for how this will all work out. Their opinion on this does need to be taken with a big grain of salt though. GitLab's stock price was ~$52 a year ago and is ~$26 today, and it's plausible that the drop corresponds to uncertainty about GitLab's continued growth as agentic engineering eats its way through their core market. If your entire business depends on software engineering growing as a field and producing larger volumes of more lucrative seats, you have a strong incentive to believe that agents will have that effect! Via Hacker News Tags: 37signals careers ai gitlab coding-agents jevons-paradox agentic-engineering

链接：https://simonwillison.net/2026/May/11/gitlab-act-2/#atom-everything

观点：对 Thoughts on GitLab's workforce reduction" and "structural an...，更该看它能不能改善多步骤协作、记忆管理和稳定交付，而不是只看 demo 效果。

llm 0.32a2

来源：Simon Willison

标签：#ai_engineering_blogs #ecosystem-shift

作者：

原文：Release: llm 0.32a2 A bunch of useful stuff in this LLM alpha, but the most important detail is this one: Most reasoning-capable OpenAI models now use the /v1/responses endpoint instead of /v1/chat/completions This enables interleaved reasoning across tool calls for GPT-5 class models. #1435 This means you can now see the summarized reasoning tokens when you run prompts against an OpenAI model, displayed in a different color to standard error. Use the -R or --hide-reasoning flags if you don't want to see that. Tags: llm projects openai generative-ai annotated-release-notes ai llms

链接：https://simonwillison.net/2026/May/12/llm/#atom-everything

观点：比起表面参数，llm 0.32a2 更需要观察它是否在推理质量、检索效果或可用性上带来真实改进。

huggingface/AnyLanguageModel

来源：GitHub huggingface

标签：#github_orgs #engineering-value

作者：

原文：An API-compatible, drop-in replacement for Apple's Foundation Models framework with support for custom language model providers.

链接：https://github.com/huggingface/AnyLanguageModel

观点：对 huggingface/AnyLanguageModel 来说，更值得判断的是它会不会进入团队默认工具链，而不是短期讨论热度。

anthropics/claude-for-legal

来源：GitHub anthropics

标签：#github_orgs #workflow-impact

作者：

原文：A suite of plugins for legal workflows

链接：https://github.com/anthropics/claude-for-legal

观点：对 anthropics/claude-for-legal，更该看它能不能改善多步骤协作、记忆管理和稳定交付，而不是只看 demo 效果。

Using LLM in the shebang line of a script

来源：Simon Willison

标签：#ai_engineering_blogs #engineering-value

作者：

原文：TIL: Using LLM in the shebang line of a script Kim_Bruning on Hacker News But seriously, you can put a shebang on an english text file now (if you're sufficiently brave) This inspired me to look at patterns for doing exactly that with LLM Here's the simplest, which takes advantage of LLM fragments #!/usr/bin/env -S llm -f Generate an SVG of a pelican riding a bicycle But you can also incorporate tool calls using the -T name_of_tool option: #!/usr/bin/env -S llm -T llm_time -f Write a haiku that mentions the exact current time Or even execute YAML templates directly that define extra tools as Python functions: !/usr/bin/env -S llm -t model gpt-5.4-mini system Use tools to run calculations functions def add(a: int, b: int) - int: return a b def multiply(a: int, b: int) - int: return a b Then: ./calc.sh 'what is 2344 5252 134' --td Which outputs (thanks to that --td tools debug option): Tool call: multiply({'a': 2344, 'b': 5252}) 12310688 Tool call: add({'a': 12310688, 'b': 134}) 12310822 2344 5252 134 **12,310,822** Read the full TIL for a more complex example that uses the Datasette SQL API to answer questions about content on my blog. Tags: llm llm-tool-use llms ai generative-ai

链接：https://simonwillison.net/2026/May/11/llm-shebang/#atom-everything

观点：Using LLM in the shebang line of a script 的核心不在新鲜感，而在它是否能提升工程效率、部署稳定性或开发者工作流。

Learning on the Shop floor

来源：Simon Willison

标签：#ai_engineering_blogs #ecosystem-shift

作者：

原文：Learning on the Shop floor Tobias Lütke describes Shopify's internal coding agent tool, River, which operates entirely in public on their Slack: River does not respond to direct messages. She politely declines and suggests to create a public channel for you and her to start working in. I myself work with river in #tobi_river channel and many followed this pattern. Every conversation is therefore searchable. Anyone at Shopify can jump in. In my own channel, there are over 100 people who, react to threads, add color and add context, pick up the torch, help with the reviews, remind me how rusty I am, and importantly, learn from watching. As so often with German, there is a word for the kind of environment: Lehrwerkstatt Literally: A teaching workshop The whole shop floor is the classroom. You learn by being near the work. Being a constant learner is one of the core values of the firm. Shopify wants to be a Lehrwerkstatt at scale and River has now gotten us closer to this ideal than ever. It’s osmosis learning because it does not require a curriculum, a training plan, or a manager. It just requires everyone's work to be visible to the maximum extent possible. Everyone learns from each other. I'm reminded of how Midjourney spent its first few years with the primary interface being public Discord channels, forcing users to share their prompts and learn from each other's experiments. I continue to believe that the early success of Midjourney was tied to this mechanism, helping to compensate for how weird and finicky text-to-image prompting is. Tags: ai slack generative-ai llms midjourney coding-agents tobias-lutke

链接：https://simonwillison.net/2026/May/11/learning-on-the-shop-floor/#atom-everything

观点：Learning on the Shop floor 的核心不在新鲜感，而在它是否能提升工程效率、部署稳定性或开发者工作流。

openai/snap-o

来源：GitHub openai

标签：#github_orgs #learning-value

作者：

原文：Lightweight Android capture developer tools for macOS

链接：https://github.com/openai/snap-o

观点：openai/snap-o 的核心不在新鲜感，而在它是否能提升工程效率、部署稳定性或开发者工作流。

CSP Allow-list Experiment

来源：Simon Willison

标签：#ai_engineering_blogs #trend-signal

作者：

原文：Tool: CSP Allow-list Experiment An experiment that shows that you can load an app in a CSP-protected sandboxed iframe (see previous note and have a custom fetch() that intercepts CSP errors and passes them up to the parent window... which can then prompt the user to add that domain to an allow-list and then refresh the page. I built this one with GPT-5.5 xhigh running in the Codex desktop app. Tags: content-security-policy iframes security

链接：https://simonwillison.net/2026/May/13/csp-allow/#atom-everything

观点：从 CSP Allow-list Experiment 看，后续更应关注安全事故是否改变企业采购、接入和上线前的合规门槛。

Warelay - OpenClaw

来源：Simon Willison

标签：#ai_engineering_blogs #workflow-impact

作者：

原文：In preparation for a lightning talk I'm giving at PyCon US this afternoon I decided to figure out how many names OpenClaw has actually had since that first commit back in November. Thanks to this first_line_history.py tool code here the answer, according to the Git history of the OpenClaw README, is: Warelay CLAWDIS CLAWDBOT Clawdbot Moltbot OpenClaw Or in detail (the output from the tool): 2025-11-24T11:23:15+01:00 16dfc1a Warelay WhatsApp Relay CLI (Twilio) 2025-11-24T11:41:37+01:00 d4153da Warelay WhatsApp Relay CLI (Twilio) 2025-11-24T17:47:57+01:00 343ef9b warelay WhatsApp Relay CLI (Twilio) 2025-11-25T04:44:10+01:00 14b3c6f warelay WhatsApp Relay CLI 2025-11-25T12:48:40+01:00 4814021 warelay Send, receive, and auto-reply on WhatsApp—Twilio-backed or QR-linked. 2025-11-25T13:50:18+01:00 d51a3e9 warelay - Send, receive, and auto-reply on WhatsApp via Twilio or QR-linked WhatsApp Web; webhook setup in one command 2025-11-25T13:51:13+01:00 4d2a8a8 warelay Send, receive, and auto-reply on WhatsApp—Twilio-backed or QR-linked. 2025-11-25T14:52:43+01:00 1ef7f4d warelay Send, receive, and auto-reply on WhatsApp. 2025-12-03T15:45:32+00:00 a27ee23 CLAWDIS WhatsApp Gateway for AI Agents 2025-12-08T12:43:13+01:00 17fa2f4 CLAWDIS WhatsApp Telegram Gateway for AI Agents 2025-12-19T18:41:17+01:00 7710439 CLAWDIS Personal AI Assistant 2026-01-04T14:32:47+00:00 246adaa CLAWDBOT Personal AI Assistant 2026-01-10T05:14:09+01:00 cdb915d Clawdbot Personal AI Assistant 2026-01-27T13:37:47-05:00 3fe4b25 Moltbot Personal AI Assistant 2026-01-30T03:15:10+01:00 9a71607 OpenClaw Personal AI Assistant Tags: openclaw git tools

链接：https://simonwillison.net/2026/May/16/openclaw-names/#atom-everything

观点：Warelay -> OpenClaw 的核心不在新鲜感，而在它是否能提升工程效率、部署稳定性或开发者工作流。

karpathy/deep-vector-quantization

来源：GitHub karpathy

标签：#github_orgs #learning-value

作者：

原文：VQVAEs, GumbelSoftmaxes and friends

链接：https://github.com/karpathy/deep-vector-quantization

观点：karpathy/deep-vector-quantization 更值得从实际采用价值来判断，而不是只看它有没有制造新的讨论热度。