In this roundup you’ll learn about a coordinated supply‑chain attack on Mistral and TanStack packages, Anthropic’s Mythos model autonomously finding zero‑days, Google’s 75 % AI‑generated code milestone, new US AI oversight agreements, and the leaked Gemini 3.2 Flash release.
Tracks model releases, funding rounds and tool updates that matter for developers and builders.
1. Supply chain under attack
12. Mai 2026
A coordinated software supply chain campaign named "Mini Shai-Hulud" hit two of the most widely used package ecosystems in the AI developer world on the same day. Attackers injected malicious code into Mistral's PyPI package v2.4.6 and into multiple TanStack npm packages, including `@tanstack/react-router`, `@tanstack/history` and `@tanstack/router-core`. The TanStack packages alone are downloaded tens of millions of times per week, making the potential blast radius unusually large.
The attack vector is straightforward and difficult to catch before damage is done. The injected code in the Mistral package executes automatically on import, without any explicit invocation by the developer. It then downloads an external payload from a remote server and launches a second-stage attack targeting Linux systems. GitHub, cloud credentials and CI/CD tokens are among the reported exposure categories. Microsoft confirmed it is actively investigating the Mistral PyPI v2.4.6 compromise.
What makes this incident particularly relevant for AI developers is the target profile. Mistral's Python SDK and TanStack's routing libraries are both common dependencies in AI-adjacent projects. Developers building agentic pipelines, LLM wrappers or modern frontend interfaces for AI tools are squarely in scope. The campaign signals that AI developer tooling is now an attractive target for supply chain attackers, not just a productivity category.
The ultimate shortcut to flawless AI results
Stop wasting time guessing prompts. Get consistent, professional AI results right from the first try, every time.
The disclosure came from security firm Aikido, which flagged the TanStack packages first before the Mistral compromise surfaced hours later. The two attacks appear to be part of the same ongoing campaign. Neither Mistral nor TanStack had made a public announcement at the time of disclosure.
Heads up: Pin all Mistral and TanStack packages to verified versions and run `npm audit` and `pip-audit` before the next deploy.
Anthropic introduced Claude Mythos Preview, a general-purpose frontier model that the company describes as a step change over its predecessor, Claude Opus 4.6. The model was not explicitly trained for cybersecurity work. Its capabilities in vulnerability discovery emerged as a downstream consequence of general improvements in reasoning, long-context understanding and software engineering, which makes the result harder to dismiss as a narrow capability demo.
Over a period of several weeks, Anthropic used Mythos to run autonomous security scans across every major operating system and browser. The model identified thousands of zero-day vulnerabilities, many of them critical, without human steering after an initial prompt. Among the findings was a 27-year-old flaw in OpenBSD, an operating system with a long-standing reputation as one of the most security-hardened platforms in existence, widely used to run firewalls and critical infrastructure.
The performance difference relative to Opus 4.6 on controlled exploit generation tasks is significant:
Model
Working Firefox exploits out of hundreds of attempts
Claude Opus 4.6
2
Claude Mythos Preview
181
On the OSS-Fuzz corpus, Mythos achieved full control flow hijack on ten separate, fully patched targets. Anthropic engineers with no formal security background asked the model to find remote code execution vulnerabilities overnight and found complete, working exploits waiting for them in the morning. Anthropic describes the situation as a watershed moment: the window between a vulnerability being discovered and being weaponised has collapsed from months to hours.
Given those capabilities, Anthropic decided not to release the model publicly. Instead, it launched Project Glasswing, a restricted consortium of 12 partner organisations that will use Mythos to identify and patch vulnerabilities in critical software before making findings available to the broader industry. Partners include Apple, Amazon, Cisco, CrowdStrike, Google, JPMorgan Chase, the Linux Foundation, Microsoft, NVIDIA and Palo Alto Networks. Anthropic has committed 100 million dollars in usage credits to the initiative. For organisations outside the consortium, access to Mythos is limited to a monitored group of roughly 40 organisations that build or maintain critical software.
Hint: Mythos is priced at five times the cost of Opus 4.6 for Glasswing partners. Factor tiered pricing into your API budget planning.
In a public blog post, Sundar Pichai confirmed that 75 % of all new code at Google is now AI-generated and reviewed by engineers before it ships. The number is striking on its own, but the trajectory makes it more so. In 2024 the share stood at 25 %. By fall 2025 it had doubled to 50 %. The jump to 75 % in the span of roughly two quarters reflects an acceleration, not a plateau.
Period
AI-generated code share
2024
25 %
Fall 2025
50 %
April 2026
75 %
The productivity gains Pichai cited are concrete. One complex code migration completed by agents and engineers working together ran six times faster than the equivalent task a year prior with engineers alone. Pichai described the direction of travel as a shift toward "truly agentic workflows," where engineers increasingly function as orchestrators of autonomous digital task forces rather than direct authors of code.
Google is not alone in this direction. Meta is targeting a comparable AI code share by mid-2026. Anthropic has stated that between 70 and 90 percent of its own code is written with Claude Code. But Google's disclosure carries particular weight because of its scale. The company employs tens of thousands of engineers and operates some of the world's most complex software infrastructure.
The internal picture is more complicated than the headline number suggests. The New York Times reported that Google pushed employees to create so many AI agents that additional agents had to be introduced just to find and rate the existing ones, producing a recursive loop that sparked anger and anxiety among staff. Some employees are looking for new jobs or positioning themselves to be laid off with severance. Google has also formally tied AI usage to engineer performance reviews for 2026, meaning developers who do not demonstrate active adoption risk negative evaluations.
Hint: If your team has no AI adoption metrics yet, define them now before leadership does it for you.
The Center for AI Standards and Innovation (CAISI), a division of the US Department of Commerce housed within NIST, announced new pre-deployment evaluation agreements with Google DeepMind, Microsoft and Elon Musk's xAI. The agreements allow the US government to test frontier AI models in classified environments before they are made publicly available. The move represents a notable shift in posture for an administration that had positioned itself explicitly against the AI safety and oversight frameworks established under the Biden White House.
CAISI is the renamed successor to the Biden-era US AI Safety Institute. Early in the Trump administration, the institute was effectively sidelined and its staff reduced. The new agreements represent a reversal. CAISI has now completed more than 40 model evaluations, including assessments of state-of-the-art models that remain unreleased. Congress approved funding increases for NIST's AI work in January 2026, including 55 million dollars for AI research and measurement efforts and up to 10 million dollars specifically for CAISI's expansion.
The scope of the evaluation programme covers three risk categories:
Cybersecurity - assessment of a model's ability to autonomously discover and exploit software vulnerabilities
Biosecurity - assessment of potential uplift for biological threat actors seeking to synthesise or deploy dangerous agents
Chemical weapons risks - assessment of a model's ability to assist with synthesis or weaponisation of chemical agents
Fortune reported that the policy reversal was triggered directly by Anthropic's Mythos model and its demonstrated ability to find and exploit critical vulnerabilities autonomously. The White House is also reported to be consulting a group of experts on a possible executive order that would create a formal review process for advanced AI systems before release. CAISI Director Chris Fall stated that "independent, rigorous measurement science is essential to understanding frontier AI and its national security implications." OpenAI separately confirmed it provided GPT-5.5 to the government ahead of its public release for national security testing and evaluation.
The existing agreements with OpenAI and Anthropic, originally signed in 2024 under Biden, have been renegotiated to reflect CAISI's updated directives from Commerce Secretary Howard Lutnick and the America's AI Action Plan.
Hint: Monitor CAISI guidance as a leading indicator for compliance requirements in regulated industries.
Users discovered Gemini 3.2 Flash running live inside the official iOS Gemini app and Google AI Studio without any prior announcement from Google. The discovery followed Google's established pattern of quietly rolling out significant model updates in the lead-up to major events. Reports surfaced first on Reddit and were corroborated by benchmark activity on Eleuther AI Arena. Google did not issue a press release or any public confirmation at the time of the sighting.
Leaked pricing data extracted from AI Studio positions Gemini 3.2 Flash as a cost-effective step up from existing Flash-tier models, reportedly matching much of Gemini 3.1 Pro's capability in coding and creative tasks while maintaining Flash-tier speed:
Model
Input price per million tokens
Output price per million tokens
Gemini 3.2 Flash (leaked)
$0.25
$2.00
Beyond the model itself, the leak revealed two additional signals about Google's direction. A new "Agents (Beta)" sidebar tab appeared in the Gemini UI, currently inactive, suggesting upcoming agentic capabilities are in preparation. A visual overhaul labelled "Liquid Glass" was also spotted, featuring a pill-shaped prompt bar and pulsating gradient backgrounds, consistent with the broader Gemini 2.0 UI changes previously staged for iOS.
The same day, Google held its Android Show I/O Edition event and officially introduced "Gemini Intelligence," a significant repositioning. Rather than being a standalone app, Gemini Intelligence is described as the intelligence layer running underneath Android itself, similar in concept to how Apple Intelligence is integrated into iOS. The branding appeared first in a confidential Pixel video that surfaced earlier in the week. The new Googlebook laptop line, built from the ground up around Gemini Intelligence, was also announced at the event. The full developer-facing reveal for Gemini 3.2 Flash and any additional model announcements is expected at Google I/O on 19 May.
Hint: Hold off on locking in Gemini model contracts until the I/O announcement confirms final specs and pricing.