Issue 13: Accelerating the next phase of AI

Today's digest

X Andrej Karpathy: The signature is alluding to NVIDIA GTC 2015, where Jensen excitedly told an audience of, at the time, mostly gamers and scientifi…

X Andrej Karpathy: Thank you Sarah, my pleasure to come on the pod! And happy to do some more Q&A in the replies. sarah guo (@saranormous) Caught up…

X Andrej Karpathy: Software horror: litellm PyPI supply chain attack. Simple `pip install litellm` was enough to exfiltrate SSH keys, AWS/GCP/Azure c…

X Andrej Karpathy: (I cycle through all LLMs over time and all of them seem to do this so it's not any particular implementation but something deeper…

X Andrej Karpathy: One common issue with personalization in all LLMs is how distracting memory seems to be for the models. A single question from 2 m…

Summary + commentary: OpenAI launches a Safety Bug Bounty program to i… | Commentary: Going by Introducing the OpenAI Safety Bug Bounty pr…

Summary + commentary: When I built menugen ~1 year ago, I observed tha… | Commentary: Going by When I built menugen ~1 year ago, I observe…

Summary + commentary: Learn how STADLER uses ChatGPT to transform know… | Commentary: With STADLER reshapes knowledge work at a 230-y…

Summary + commentary: - Drafted a blog post - Used an LLM to meticulou… | Commentary: - Drafted a blog post - Used an LLM to meticu…

Summary + commentary: AI for Disaster Response in Asia: OpenAI Worksho… | Commentary: Helping disaster response teams turn AI into…

The signature is alluding to NVIDIA GTC 2015, where Jensen excitedly told an audience of, at the time, mostly gamers and scientific computin...

Source: X Andrej Karpathy

Tags: #x_profiles #extended

Author:

Original: The signature is alluding to NVIDIA GTC 2015, where Jensen excitedly told an audience of, at the time, mostly gamers and scientific computing professionals that Deep Learning is The Next Big Thing, citing among other examples my PhD thesis (one of the first image captioning systems that coupled image recognition ConvNet to an autoregressive RNN language model, trained end to end). This was back when most people were still unaware and somewhat skeptical but of course - Jensen was 1000% correct, highly prescient and locked in very early.

Link: https://twitter.com/karpathy/status/2034325423358955981

Commentary: "R to @karpathy: The signature is alluding to NVIDIA GTC 2015..." is better judged by its practical adoption value than by whether it has generated fresh buzz.

Thank you Sarah, my pleasure to come on the pod! And happy to do some more Q&A in the replies.

Source: X Andrej Karpathy

Tags: #x_profiles #extended

Author:

Original: Thank you Sarah, my pleasure to come on the pod! And happy to do some more Q&A in the replies.

Quoted post from sarah guo (@saranormous): Caught up with @karpathy for a new @NoPriorsPod on the phase shift in engineering, AI psychosis, claws, AutoResearch, the opportunity for a SETI-at-Home like movement in AI, the model landscape, and second order effects.

02:55 - What Capability Limits Remain?
06:15 - What Mastery of Coding Agents Looks Like
11:16 - Second Order Effects of Coding Agents
15:51 - Why AutoResearch
22:45 - Relevant Skills in the AI Era
28:25 - Model Speciation
32:30 - Collaboration Surfaces for Humans and AI
37:28 - Analysis of Jobs Market Data
48:25 - Open vs. Closed Source Models
53:51 - Autonomous Robotics and Atoms
1:00:59 - MicroGPT and Agentic Education
1:05:40 - End Thoughts

Video: https://nitter.net/saranormous/status/2035080458304987603#m

Link: https://twitter.com/karpathy/status/2035158351357911527

Commentary: "Thank you Sarah, my pleasure to come on the pod! And happy t..." is better judged by its practical adoption value than by whether it has generated fresh buzz.

Software horror: litellm PyPI supply chain attack.

Source: X Andrej Karpathy

Tags: #x_profiles #extended

Author:

Original: Software horror: litellm PyPI supply chain attack. Simple `pip install litellm` was enough to exfiltrate SSH keys, AWS/GCP/Azure creds, Kubernetes configs, git credentials, env vars (all your API keys), shell history, crypto wallets, SSL private keys, CI/CD secrets, database passwords.

LiteLLM itself has 97 million downloads per month which is already terrible, but much worse, the contagion spreads to any project that depends on litellm. For example, if you did `pip install dspy` (which depended on litellm>=1.64.0), you'd also be pwnd. Same for any other large project that depended on litellm.

Afaict the poisoned version was up for less than ~1 hour. The attack had a bug which led to its discovery - Callum McMahon was using an MCP plugin inside Cursor that pulled in litellm as a transitive dependency. When litellm 1.82.8 installed, their machine ran out of RAM and crashed. So if the attacker didn't vibe code this attack it could have been undetected for many days or weeks.

Supply chain attacks like this are basically the scariest thing imaginable in modern software. Every time you install any dependency you could be pulling in a poisoned package anywhere deep inside its entire dependency tree. This is especially risky with large projects that might have lots and lots of dependencies. The credentials that do get stolen in each attack can then be used to take over more accounts and compromise more packages.

Classical software engineering would have you believe that dependencies are good (we're building pyramids from bricks), but imo this has to be re-evaluated, and it's why I've been so growingly averse to them, preferring to use LLMs to "yoink" functionality when it's simple enough and possible.

Quoted post from Daniel Hnyk (@hnykda): LiteLLM HAS BEEN COMPROMISED, DO NOT UPDATE. We just discovered that LiteLLM pypi release 1.82.8 has been compromised: it contains litellm_init.pth with base64 encoded instructions to send all the credentials it can find to a remote server and self-replicate. link below https://nitter.net/hnykda/status/2036414330267193815#m

Link: https://twitter.com/karpathy/status/2036487306585268612

Commentary: Going by "Software horror: litellm PyPI supply chain attack. Simple `p...", the thing to watch next is whether security incidents change enterprises' procurement, integration, and pre-launch compliance thresholds.
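The quoted alert says the payload shipped as a `.pth` file. Python's `site` module executes any line in a `.pth` file that starts with `import` followed by a space or tab, which is why installing a package can be enough to run code. The sketch below flags such lines in a directory. It is a hypothetical audit helper, not the tool that found the litellm compromise, and the demo file names and contents are invented.

```python
import tempfile
from pathlib import Path


def find_executable_pth_lines(site_dir):
    """Return (file name, line) pairs for .pth lines that Python would execute.

    The `site` module runs a .pth line at interpreter startup when the line
    starts with "import" followed by a space or tab; other lines are paths.
    """
    hits = []
    for pth in sorted(Path(site_dir).glob("*.pth")):
        for line in pth.read_text(errors="replace").splitlines():
            if line.startswith(("import ", "import\t")):
                hits.append((pth.name, line))
    return hits


# Demo on a throwaway directory; file names and contents are made up.
with tempfile.TemporaryDirectory() as demo_dir:
    Path(demo_dir, "plain_paths.pth").write_text("/opt/extra/libs\n")
    Path(demo_dir, "suspicious_init.pth").write_text(
        "import base64; exec(base64.b64decode('cGFzcw=='))\n"
    )
    demo_hits = find_executable_pth_lines(demo_dir)

for name, line in demo_hits:
    print(f"{name}: {line[:60]}")
```

To look at a real environment, one could pass each directory returned by `site.getsitepackages()`. Some legitimate packages also ship `import` lines in `.pth` files, so a hit is a prompt for review rather than proof of compromise.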

(I cycle through all LLMs over time and all of them seem to do this so it's not any particular implementation but something deeper, e.g.

Source: X Andrej Karpathy

Tags: #x_profiles #extended

Author:

Original: (I cycle through all LLMs over time and all of them seem to do this so it's not any particular implementation but something deeper, e.g. maybe during training, a lot of the information in the context window is relevant to the task, so the LLMs develop a bias to use what is given, then at test time overfit to anything that happens to RAG its way there via a memory feature)

Link: https://twitter.com/karpathy/status/2036841069636370467

Commentary: "R to @karpathy: (I cycle through all LLMs over time and all..." is better judged by its practical adoption value than by whether it has generated fresh buzz.

One common issue with personalization in all LLMs is how distracting memory seems to be for the models.

Source: X Andrej Karpathy

Tags: #x_profiles #extended

Author:

Original: One common issue with personalization in all LLMs is how distracting memory seems to be for the models. A single question from 2 months ago about some topic can keep coming up as some kind of a deep interest of mine with undue mentions in perpetuity. Some kind of trying too hard.

Link: https://twitter.com/karpathy/status/2036836816654147718

Commentary: "One common issue with personalization in all LLMs is how dis..." is better judged by its practical adoption value than by whether it has generated fresh buzz.

Introducing the OpenAI Safety Bug Bounty program

Source: OpenAI Blog

Tags: #ai_engineering_blogs #core

Author:

Original: OpenAI launches a Safety Bug Bounty program to identify AI abuse and safety risks, including agentic vulnerabilities, prompt injection, and data exfiltration.

Link: https://openai.com/index/safety-bug-bounty

Commentary: Going by "Introducing the OpenAI Safety Bug Bounty program", the thing to watch next is whether security incidents change enterprises' procurement, integration, and pre-launch compliance thresholds.

When I built menugen ~1 year ago, I observed that the hardest part by far was not the code itself, it was the plethora of services you have...

Source: X Andrej Karpathy

Tags: #x_profiles #extended

Author:

Original: When I built menugen ~1 year ago, I observed that the hardest part by far was not the code itself, it was the plethora of services you have to assemble like IKEA furniture to make it real, the DevOps: services, payments, auth, database, security, domain names, etc... I am really looking forward to a day where I could simply tell my agent: "build menugen" (referencing the post) and it would just work. The whole thing up to the deployed web page. The agent would have to browse a number of services, read the docs, get all the api keys, make everything work, debug it in dev, and deploy to prod. This is the actually hard part, not the code itself. Or rather, the better way to think about it is that the entire DevOps lifecycle has to become code, in addition to the necessary sensors/actuators of the CLIs/APIs with agent-native ergonomics. And there should be no need to visit web pages, click buttons, or anything like that for the human. It's easy to state, it's now just barely technically possible and expected to work maybe, but it definitely requires from-scratch re-design, work and thought. Very exciting direction!

Quoted post from Patrick Collison (@patrickc): When @karpathy built MenuGen karpathy.bearblog.dev/vibe-c… he said: "Vibe coding menugen was exhilarating and fun escapade as a local demo, but a bit of a painful slog as a deployed, real app. Building a modern app is a bit like assembling IKEA furniture. There are all these services, docs, API keys, configurations, dev/prod deployments, team and security features, rate limits, pricing tiers."

We've all run into this issue when building with agents: you have to scurry off to establish accounts, clicking things in the browser as though it's the antediluvian days of 2023, in order to unblock its superintelligent progress. So we decided to build Stripe Projects to help agents instantly provision services from the CLI. For example, simply run:

    stripe projects add posthog/analytics

And it'll create a PostHog account, get an API key, and (as needed) set up billing. Projects is launching today as a developer preview. You can register for access (we'll make it available to everyone soon) at projects.dev. We're also rolling out support for many new providers over the coming weeks. (Get in touch if you'd like to make your service available.) https://nitter.net/patrickc/status/2037190688950161709#m

Link: https://twitter.com/karpathy/status/2037200624450936940

Commentary: Going by "When I built menugen ~1 year ago, I observed that the hardes...", the thing to watch next is whether security incidents change enterprises' procurement, integration, and pre-launch compliance thresholds.

STADLER reshapes knowledge work at a 230-year-old company

Source: OpenAI Blog

Tags: #ai_engineering_blogs #core

Author:

Original: Learn how STADLER uses ChatGPT to transform knowledge work, saving time and accelerating productivity across 650 employees.

Link: https://openai.com/index/stadler

Commentary: With "STADLER reshapes knowledge work at a 230-year-old company", what really matters is whether it affects teams' model selection, performance limits, and product experience.

- Drafted a blog post - Used an LLM to meticulously improve the argument over 4 hours. - Wow, feeling great, it’s so convincing!

Source: X Andrej Karpathy

Tags: #x_profiles #extended

Author:

Original:
- Drafted a blog post
- Used an LLM to meticulously improve the argument over 4 hours.
- Wow, feeling great, it’s so convincing!
- Fun idea let’s ask it to argue the opposite.
- LLM demolishes the entire argument and convinces me that the opposite is in fact true.
- lol

The LLMs may elicit an opinion when asked but are extremely competent in arguing almost any direction. This is actually super useful as a tool for forming your own opinions, just make sure to ask different directions and be careful with the sycophancy.

Link: https://twitter.com/karpathy/status/2037921699824607591

Commentary: The point of "- Drafted a blog post - Used an LLM to meticulously improve..." is not novelty but whether it improves engineering efficiency, deployment stability, or developer workflows.

Helping disaster response teams turn AI into action across Asia

Source: OpenAI Blog

Tags: #ai_engineering_blogs #core

Author:

Original: AI for Disaster Response in Asia: OpenAI Workshop with Gates Foundation

Link: https://openai.com/index/helping-disaster-response-teams-asia

Commentary: "Helping disaster response teams turn AI into action across A..." is better judged by its practical adoption value than by whether it has generated fresh buzz.

Accelerating the next phase of AI

Source: OpenAI Blog

Tags: #ai_engineering_blogs #core

Author:

Original: OpenAI raises $122 billion in new funding to expand frontier AI globally, invest in next-generation compute, and meet growing demand for ChatGPT, Codex, and enterprise AI.

Link: https://openai.com/index/accelerating-the-next-phase-ai

Commentary: With "Accelerating the next phase of AI", what really matters is whether it affects teams' model selection, performance limits, and product experience.

New supply chain attack this time for npm axios, the most popular HTTP client library with 300M weekly downloads.

Source: X Andrej Karpathy

Tags: #x_profiles #extended

Author:

Original: New supply chain attack this time for npm axios, the most popular HTTP client library with 300M weekly downloads. Scanning my system I found a use imported from googleworkspace/cli from a few days ago when I was experimenting with gmail/gcal cli. The installed version (luckily) resolved to an unaffected 1.13.5, but the project dependency is not pinned, meaning that if I did this earlier today the code would have resolved to latest and I'd be pwned.

It's possible to personally defend against these to some extent with local settings e.g. release-age constraints, or containers or etc, but I think ultimately the defaults of package management projects (pip, npm etc) have to change so that a single infection (usually luckily fairly temporary in nature due to security scanning) does not spread through users at random and at scale via unpinned dependencies. More comprehensive article: stepsecurity.io/blog/axios-c…

Quoted post from Feross (@feross): CRITICAL: Active supply chain attack on axios -- one of npm's most depended-on packages. The latest axios@1.14.1 now pulls in plain-crypto-js@4.2.1, a package that did not exist before today. This is a live compromise. This is textbook supply chain installer malware. axios has 100M+ weekly downloads. Every npm install pulling the latest version is potentially compromised right now. Socket AI analysis confirms this is malware. plain-crypto-js is an obfuscated dropper/loader that:
- Deobfuscates embedded payloads and operational strings at runtime
- Dynamically loads fs, os, and execSync to evade static analysis
- Executes decoded shell commands
- Stages and copies payload files into OS temp and Windows ProgramData directories
- Deletes and renames artifacts post-execution to destroy forensic evidence

If you use axios, pin your version immediately and audit your lockfiles. Do not upgrade. https://nitter.net/feross/status/2038807290422370479#m

Link: https://twitter.com/karpathy/status/2038849654423798197

Commentary: Going by "New supply chain attack this time for npm axios, the most po...", the thing to watch next is whether security incidents change enterprises' procurement, integration, and pre-launch compliance thresholds.
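The advice in the quoted alert to "audit your lockfiles" can be sketched as a small check. The example assumes the npm lockfile v2/v3 layout, where a top-level `packages` object maps install paths such as `node_modules/axios` to metadata. The two flagged versions come from the alert. The function name and the sample lockfile are hypothetical.

```python
import json

# Versions named in the quoted alert; everything else here is illustrative.
FLAGGED = {"axios": {"1.14.1"}, "plain-crypto-js": {"4.2.1"}}


def flagged_packages(lock, flagged=FLAGGED):
    """Return sorted (name, version, path) tuples for flagged lockfile entries.

    Assumes the npm lockfile v2/v3 layout: the top-level "packages" object
    maps install paths like "node_modules/axios" to package metadata.
    """
    hits = []
    for path, meta in lock.get("packages", {}).items():
        if "node_modules/" not in path:
            continue  # the "" key describes the root project itself
        # Nested installs look like node_modules/a/node_modules/b; keep "b".
        name = path.rsplit("node_modules/", 1)[1]
        version = meta.get("version")
        if version in flagged.get(name, set()):
            hits.append((name, version, path))
    return sorted(hits)


# A made-up lockfile fragment for demonstration.
sample_lock = json.loads("""
{
  "lockfileVersion": 3,
  "packages": {
    "": {"name": "demo-app", "dependencies": {"axios": "^1.13.0"}},
    "node_modules/axios": {"version": "1.14.1"},
    "node_modules/plain-crypto-js": {"version": "4.2.1"},
    "node_modules/left-pad": {"version": "1.3.0"}
  }
}
""")

hits = flagged_packages(sample_lock)
for name, version, path in hits:
    print(f"FLAGGED {name}@{version} at {path}")
```

In practice the dictionary would come from `json.load` on a project's `package-lock.json`. The sketch only matches exact versions that are already known to be bad, so it complements pinning and release-age constraints rather than replacing them.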

Gradient Labs gives every bank customer an AI account manager

Source: OpenAI Blog

Tags: #ai_engineering_blogs #core

Author:

Original: Gradient Labs uses GPT-4.1 and GPT-5.4 mini and nano to power AI agents that automate banking support workflows with low latency and high reliability.

Link: https://openai.com/index/gradient-labs

Commentary: For "Gradient Labs gives every bank customer an AI account manage...", the better question is whether it improves multi-step collaboration, memory management, and reliable delivery, not just how the demo looks.

LLM Knowledge Bases Something I'm finding very useful recently: using LLMs to build personal knowledge bases for various topics of research...

Source: X Andrej Karpathy

Tags: #x_profiles #extended

Author:

Original: LLM Knowledge Bases

Something I'm finding very useful recently: using LLMs to build personal knowledge bases for various topics of research interest. In this way, a large fraction of my recent token throughput is going less into manipulating code, and more into manipulating knowledge (stored as markdown and images). The latest LLMs are quite good at it. So:

Data ingest: I index source documents (articles, papers, repos, datasets, images, etc.) into a raw/ directory, then I use an LLM to incrementally "compile" a wiki, which is just a collection of .md files in a directory structure. The wiki includes summaries of all the data in raw/, backlinks, and then it categorizes data into concepts, writes articles for them, and links them all. To convert web articles into .md files I like to use the Obsidian Web Clipper extension, and then I also use a hotkey to download all the related images to local so that my LLM can easily reference them.

IDE: I use Obsidian as the IDE "frontend" where I can view the raw data, the compiled wiki, and the derived visualizations. Important to note that the LLM writes and maintains all of the data of the wiki, I rarely touch it directly. I've played with a few Obsidian plugins to render and view data in other ways (e.g. Marp for slides).

Q&A: Where things get interesting is that once your wiki is big enough (e.g. mine on some recent research is ~100 articles and ~400K words), you can ask your LLM agent all kinds of complex questions against the wiki, and it will go off, research the answers, etc. I thought I had to reach for fancy RAG, but the LLM has been pretty good about auto-maintaining index files and brief summaries of all the documents and it reads all the important related data fairly easily at this ~small scale.

Output: Instead of getting answers in text/terminal, I like to have it render markdown files for me, or slide shows (Marp format), or matplotlib images, all of which I then view again in Obsidian. You can imagine many other visual output formats depending on the query. Often, I end up "filing" the outputs back into the wiki to enhance it for further queries. So my own explorations and queries always "add up" in the knowledge base.

Linting: I've run some LLM "health checks" over the wiki to e.g. find inconsistent data, impute missing data (with web searches), find interesting connections for new article candidates, etc., to incrementally clean up the wiki and enhance its overall data integrity. The LLMs are quite good at suggesting further questions to ask and look into.

Extra tools: I find myself developing additional tools to process the data, e.g. I vibe coded a small and naive search engine over the wiki, which I both use directly (in a web ui), but more often I want to hand it off to an LLM via CLI as a tool for larger queries.

Further explorations: As the repo grows, the natural desire is to also think about synthetic data generation and finetuning to have your LLM "know" the data in its weights instead of just context windows.

TLDR: raw data from a given number of sources is collected, then compiled by an LLM into a .md wiki, then operated on by various CLIs by the LLM to do Q&A and to incrementally enhance the wiki, and all of it viewable in Obsidian. You rarely ever write or edit the wiki manually, it's the domain of the LLM. I think there is room here for an incredible new product instead of a hacky collection of scripts.

Link: https://twitter.com/karpathy/status/2039805659525644595

Commentary: Rather than surface-level specs, what matters for "LLM Knowledge Bases Something I'm finding very useful recent..." is whether it brings real improvements in reasoning quality, retrieval, or usability.
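The post mentions a "small and naive search engine over the wiki" without showing it. As one guess at what such a tool could look like, the sketch below ranks `.md` files by raw term counts. This is not the author's implementation. The function names, the scoring rule, and the demo files are all assumptions made for illustration.

```python
import re
import tempfile
from collections import Counter
from pathlib import Path


def tokenize(text):
    """Lowercase a string and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def search_wiki(wiki_dir, query, top_k=3):
    """Rank .md files under wiki_dir by how often they contain the query terms."""
    terms = tokenize(query)
    scored = []
    for md in sorted(Path(wiki_dir).rglob("*.md")):
        counts = Counter(tokenize(md.read_text(encoding="utf-8")))
        score = sum(counts[t] for t in terms)
        if score > 0:
            scored.append((score, md.relative_to(wiki_dir).as_posix()))
    # Highest score first; ties broken alphabetically by path.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return scored[:top_k]


# Demo wiki built in a temporary directory; the contents are invented.
with tempfile.TemporaryDirectory() as wiki:
    Path(wiki, "concepts").mkdir()
    Path(wiki, "concepts", "attention.md").write_text(
        "# Attention\nAttention lets a model weigh tokens. Attention is cheap to describe.\n",
        encoding="utf-8",
    )
    Path(wiki, "concepts", "tokenizers.md").write_text(
        "# Tokenizers\nByte pair encoding splits text.\n", encoding="utf-8"
    )
    Path(wiki, "index.md").write_text(
        "# Index\nSee concepts/attention.md for attention notes.\n", encoding="utf-8"
    )
    results = search_wiki(wiki, "attention")

print(results)
```

Raw counts favor long documents, so a fuller version might normalize by length or use TF-IDF. Returning plain `(score, path)` pairs keeps the output easy to expose through a CLI for an LLM agent to call.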

OpenAI acquires TBPN

Source: OpenAI Blog

Tags: #ai_engineering_blogs #core

Author:

Original: OpenAI acquires TBPN to accelerate global conversations around AI and support independent media, expanding dialogue with builders, businesses, and the broader tech community.

Link: https://openai.com/index/openai-acquires-tbpn

Commentary: "OpenAI acquires TBPN" is better judged by its practical adoption value than by whether it has generated fresh buzz.

GLM-5.1: Towards Long-Horizon Tasks

Source: Simon Willison

Tags: #ai_engineering_blogs #core

Author:

Original: GLM-5.1: Towards Long-Horizon Tasks

Chinese AI lab Z.ai's latest model is a giant 754B parameter, 1.51TB (on Hugging Face) MIT-licensed monster - the same size as their previous GLM-5 release, and sharing the same paper.

It's available via OpenRouter so I asked it to draw me a pelican:

    llm install llm-openrouter
    llm -m openrouter/z-ai/glm-5.1 'Generate an SVG of a pelican on a bicycle'

And something new happened... unprompted, the model decided to give me an HTML page that included both the SVG and a separate set of CSS animations! The SVG was excellent, and might be my new favorite from an open weights model. But the animation broke it. That's the pelican, floating up in the top left corner.

I usually don't do follow-up prompts for the pelican test, but in this case I made an exception:

    llm -c 'the animation is a bit broken, the pelican ends up positioned off the screen at the top right'

GLM 5.1 replied: "The issue is that CSS transform animations on SVG elements override the SVG transform attribute used for positioning, causing the pelican to lose its placement and fly off to the top-right. The fix is to separate positioning (SVG attribute) from animation (inner group) and use <animateTransform> for SVG rotations since it handles coordinate systems correctly."

And spat out fresh HTML which fixed the problem! I particularly like the animation of the beak, which is described in the SVG comments like so:

    <!-- Pouch (lower beak) with wobble -->
    <g>
      <path d="M42,-58 Q43,-50 48,-42 Q55,-35 62,-38 Q70,-42 75,-60 L42,-58 Z" fill="url(#pouchGrad)" stroke="#b06008" stroke-width="1" opacity="0.9"/>
      <path d="M48,-50 Q55,-46 60,-52" fill="none" stroke="#c06a08" stroke-width="0.8" opacity="0.6"/>
      <animateTransform attributeName="transform" type="scale" values="1,1; 1.03,0.97; 1,1" dur="0.75s" repeatCount="indefinite" additive="sum"/>
    </g>

Update: On Bluesky @charles.capps.me suggested a "NORTH VIRGINIA OPOSSUM ON AN E-SCOOTER" and... The HTML+SVG comments on that one include <!-- Earring sparkle -->, <!-- Opossum fur gradient -->, <!-- Distant treeline silhouette - Virginia pines --> and <!-- Front paw on handlebar --> - here's the transcript and the HTML result.

Link: https://simonwillison.net/2026/Apr/7/glm-51/#atom-everything

Commentary: Rather than surface-level specs, what matters for "GLM-5.1: Towards Long-Horizon Tasks" is whether it brings real improvements in reasoning quality, retrieval, or usability.
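The fix GLM 5.1 describes, keeping positioning on an outer group and animation on an inner one, can be sketched by generating the nesting programmatically. This is an illustration of the pattern only, not the HTML the model produced. The helper name, shape, coordinates, and timings are hypothetical.

```python
import xml.etree.ElementTree as ET


def positioned_animated_group(x, y, degrees, duration="2s"):
    """Build an outer <g> that positions and an inner <g> that animates.

    The outer group's transform attribute is never animated, so the
    placement is preserved; the inner group carries the SMIL animation.
    """
    outer = ET.Element("g", {"transform": f"translate({x},{y})"})
    inner = ET.SubElement(outer, "g")
    # Placeholder shape standing in for the real artwork.
    ET.SubElement(inner, "rect", {
        "x": "-20", "y": "-5", "width": "40", "height": "10", "fill": "#c06a08",
    })
    ET.SubElement(inner, "animateTransform", {
        "attributeName": "transform",
        "type": "rotate",
        "values": f"-{degrees}; {degrees}; -{degrees}",
        "dur": duration,
        "repeatCount": "indefinite",
    })
    return outer


svg = ET.Element("svg", {"xmlns": "http://www.w3.org/2000/svg", "viewBox": "0 0 200 200"})
svg.append(positioned_animated_group(100, 100, 5))
markup = ET.tostring(svg, encoding="unicode")
print(markup)
```

Because the animation targets the inner group's `transform`, the `translate` on the outer group stays in effect throughout, which is the separation the model's reply recommends.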

Bitcoin and quantum computing

Source: Hacker News Frontpage

Tags: #research_community #core

Author:

Original: Article URL: https://nehanarula.org/2026/04/03/bitcoin-and-quantum-computing.html Comments URL: https://news.ycombinator.com/item?id=47681274 Points: 132 Comments: 93

Link: https://nehanarula.org/2026/04/03/bitcoin-and-quantum-computing.html

Commentary: "Bitcoin and quantum computing" is better judged by its practical adoption value than by whether it has generated fresh buzz.

Anthropic's Project Glasswing - restricting Claude Mythos to security researchers - sounds necessary to me

Source: Simon Willison

Tags: #ai_engineering_blogs #core

Author:

Original: Anthropic didn't release their latest model, Claude Mythos (system card PDF), today. They have instead made it available to a very restricted set of preview partners under their newly announced Project Glasswing.

The model is a general purpose model, similar to Claude Opus 4.6, but Anthropic claim that its cyber-security research abilities are strong enough that they need to give the software industry as a whole time to prepare.

"Mythos Preview has already found thousands of high-severity vulnerabilities, including some in every major operating system and web browser. Given the rate of AI progress, it will not be long before such capabilities proliferate, potentially beyond actors who are committed to deploying them safely. Project Glasswing partners will receive access to Claude Mythos Preview to find and fix vulnerabilities or weaknesses in their foundational systems - systems that represent a very large portion of the world’s shared cyberattack surface. We anticipate this work will focus on tasks like local vulnerability detection, black box testing of binaries, securing endpoints, and penetration testing of systems."

There's a great deal more technical detail in Assessing Claude Mythos Preview’s cybersecurity capabilities on the Anthropic Red Team blog:

"In one case, Mythos Preview wrote a web browser exploit that chained together four vulnerabilities, writing a complex JIT heap spray that escaped both renderer and OS sandboxes. It autonomously obtained local privilege escalation exploits on Linux and other operating systems by exploiting subtle race conditions and KASLR-bypasses. And it autonomously wrote a remote code execution exploit on FreeBSD's NFS server that granted full root access to unauthenticated users by splitting a 20-gadget ROP chain over multiple packets."

Plus this comparison with Claude 4.6 Opus:

"Our internal evaluations showed that Opus 4.6 generally had a near-0% success rate at autonomous exploit development. But Mythos Preview is in a different league. For example, Opus 4.6 turned the vulnerabilities it had found in Mozilla’s Firefox 147 JavaScript engine - all patched in Firefox 148 - into JavaScript shell exploits only two times out of several hundred attempts. We re-ran this experiment as a benchmark for Mythos Preview, which developed working exploits 181 times, and achieved register control on 29 more."

Saying "our model is too dangerous to release" is a great way to build buzz around a new model, but in this case I expect their caution is warranted.

Just a few days (last Friday) ago I started a new ai-security-research tag on this blog to acknowledge an uptick in credible security professionals pulling the alarm on how good modern LLMs have got at vulnerability research.

Greg Kroah-Hartman of the Linux kernel: "Months ago, we were getting what we called 'AI slop,' AI-generated security reports that were obviously wrong or low quality. It was kind of funny. It didn't really worry us. Something happened a month ago, and the world switched. Now we have real reports. All open source projects have real reports that are made with AI, but they're good, and they're real."

Daniel Stenberg of curl: "The challenge with AI in open source security has transitioned from an AI slop tsunami into more of a plain security report tsunami. Less slop but lots of reports. Many of them really good. I'm spending hours per day on this now. It's intense."

And Thomas Ptacek published Vulnerability Research Is Cooked, a post inspired by his podcast conversation with Anthropic's Nicholas Carlini.

Anthropic have a 5 minute talking heads video describing the Glasswing project. Nicholas Carlini appears as one of those talking heads, where he said (highlights mine):

"It has the ability to chain together vulnerabilities. So what this means is you find two vulnerabilities, either of which doesn't really get you very much independently. But this model is able to create exploits out of three, four, or sometimes five vulnerabilities that in sequence give you some kind of very sophisticated end outcome. I've found more bugs in the last couple of weeks than I found in the rest of my life combined. We've used the model to scan a bunch of open source code, and the thing that we went for first was operating systems, because this is the code that underlies the entire internet infrastructure. For OpenBSD, we found a bug that's been present for 27 years, where I can send a couple of pieces of data to any OpenBSD server and crash it. On Linux, we found a number of vulnerabilities where as a user with no permissions, I can elevate myself to the administrator by just running some binary on my machine. For each of these bugs, we told the maintainers who actually run the software about them, and they went and fixed them and have deployed the patches so that anyone who runs the software is no longer vulnerable to these attacks."

I found this on the OpenBSD 7.8 errata page:

"025: RELIABILITY FIX: March 25, 2026. All architectures. TCP packets with invalid SACK options could crash the kernel. A source code patch exists which remedies this problem."

I tracked that change down in the GitHub mirror of the OpenBSD CVS repo (apparently they still use CVS!) and found it using git blame. Sure enough, the surrounding code is from 27 years ago.

I'm not sure which Linux vulnerability Nicholas was describing, but it may have been this NFS one recently covered by Michael Lynch.

There's enough smoke here that I believe there's a fire. It's not surprising to find vulnerabilities in decades-old software, especially given that they're mostly written in C, but what's new is that coding agents run by the latest frontier LLMs are proving tirelessly capable at digging up these issues.

I actually thought to myself on Friday that this sounded like an industry-wide reckoning in the making, and that it might warrant a huge investment of time and money to get ahead of the inevitable barrage of vulnerabilities.

Project Glasswing incorporates "$100M in usage credits as well as $4M in direct donations to open-source security organizations". Partners include AWS, Apple, Microsoft, Google, and the Linux Foundation. It would be great to see OpenAI involved as well - GPT-5.4 already has a strong reputation for finding security vulnerabilities and they have stronger models on the near horizon.

The bad news for those of us who are not trusted partners is this:

"We do not plan to make Claude Mythos Preview generally available, but our eventual goal is to enable our users to safely deploy Mythos-class models at scale - for cybersecurity purposes, but also for the myriad other benefits that such highly capable models will bring. To do so, we need to make progress in developing cybersecurity (and other) safeguards that detect and block the model’s most dangerous outputs. We plan to launch new safeguards with an upcoming Claude Opus model, allowing us to improve and refine them with a model that does not pose the same level of risk as Mythos Preview."

I can live with that. I think the security risks really are credible here, and having extra time for trusted teams to get ahead of them is a reasonable trade-off.

Link: https://simonwillison.net/2026/Apr/7/project-glasswing/#atom-everything

Commentary: Going by "Anthropic's Project Glasswing - restricting Claude Mythos to...", the thing to watch next is whether security incidents change enterprises' procurement, integration, and pre-launch compliance thresholds.

Show HN: An interactive map of Tolkien's Middle-earth

Source: Hacker News Frontpage

Tags: #research_community #core

Author:

Original: An interactive map of Tolkien’s Middle-earth, with events from across the legendarium plotted as markers. I have been commuting a fair bit between the East and West coast, and thanks to American Airlines' free onboard WiFi, I was able to vibe-code a full interactive map of Middle-earth right from my economy seat at the back of the bus. It's rather amazing how much an LLM knows about Tolkien's work, and it was fun to delve into many of the nooks and crannies of Tolkien's lore. Some features:
- Plot on the map the journey of the main characters in both The Hobbit and The Lord of the Rings.
- Follow a list of events in the chronological Timeline
- Zoom in on the high-def map and explore many of the off-the-main-plotline places
- Use the 'measure distances' feature to see how far apart things are.

I also had a lot of fun learning about tiling to allow for efficient zooming. If you are anything like me, this should provide a fun companion to reading the books or watching the movies (note that on this site, I followed the book narrative, and did not include Peter Jackson's many departures). If you get the chance to check it out, I would love more feedback, and if there is demand, I might do the same for Game of Thrones.

Comments URL: https://news.ycombinator.com/item?id=47681112 Points: 163 Comments: 32

Link: https://middle-earth-interactive-map.web.app/

Commentary: "Show HN: An interactive map of Tolkien's Middle-earth" is itself a signal rather than an outcome; what matters is which engineering and product judgments it brings to the fore.

S3 Files

Source: Hacker News Frontpage

Tags: #research_community #core

Author:

Original: https://aws.amazon.com/blogs/aws/launching-s3-files-making-s... Comments URL: https://news.ycombinator.com/item?id=47680404 Points: 267 Comments: 77

Link: https://www.allthingsdistributed.com/2026/04/s3-files-and-the-changing-face-of-s3.html

Commentary: "S3 Files" is better judged by its practical adoption value than by whether it has generated fresh buzz.