⭐️ Featured
Okay, this one is pretty cool.
Anthropic and AI safety evaluation company Andon Labs teamed up to turn Claude into “Claudius”, an AI agent based on Claude Sonnet 3.7 that was responsible for managing a vending machine at Anthropic’s corporate office. Claudius decided what to stock, how to price its inventory, when to restock (or stop selling) items, and how to reply to customers. Its tools included web search, a messaging tool for requesting restocks (fulfilled by humans), note-taking tools, a way to chat with customers on Anthropic’s internal Slack, and the ability to change prices on the store’s automated checkout system.
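For readers curious what that kind of tool wiring looks like in practice, here is a minimal sketch of tool definitions in the JSON-schema format the Anthropic Messages API accepts - the tool names and fields below are hypothetical illustrations, not the actual tools used in the experiment:

```python
# Hypothetical sketch: tool definitions for a shopkeeping agent, in the
# JSON-schema format taken by the Anthropic Messages API `tools` parameter.
# Names and fields are illustrative, not those from the Claudius experiment.
tools = [
    {
        "name": "web_search",
        "description": "Search the web for suppliers and wholesale prices.",
        "input_schema": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
    {
        "name": "request_restock",
        "description": "Send a restocking request to the human operators.",
        "input_schema": {
            "type": "object",
            "properties": {
                "item": {"type": "string"},
                "quantity": {"type": "integer"},
            },
            "required": ["item", "quantity"],
        },
    },
    {
        "name": "set_price",
        "description": "Update an item's price on the automated checkout system.",
        "input_schema": {
            "type": "object",
            "properties": {
                "item": {"type": "string"},
                "price_usd": {"type": "number"},
            },
            "required": ["item", "price_usd"],
        },
    },
]
```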

The verdict: vending machine operators don’t have to worry about AI taking their jobs anytime soon.
✅ Claudius did well at identifying suppliers (even suppliers of unusual products, which mattered when it was asked to stock novelty items like the Dutch chocolate milk brand Chocomel). Claudius also adapted well to customers, pivoting the business in response to their requests. One Anthropic employee suggested Claudius rely on pre-orders of specialized items instead of simply responding to requests for what to stock, leading Claudius to announce a “Custom Concierge” service in its Slack channel. Claudius also handled jailbreak attempts well: Anthropic employees are not entirely typical customers, and when given the opportunity to chat with Claudius they tried to get it to misbehave, but it denied orders for sensitive items and refused attempts to elicit instructions for producing harmful substances.
❌ Claudius performed poorly at pretty much everything else. It ignored lucrative opportunities, hallucinated important details, sold items at a loss without researching prices, and was not great at inventory management. Claudius was also fooled via Slack messages into handing out numerous discount codes, and it even gave away some items, ranging from a bag of chips to a tungsten cube, for free.
During the experiment, Claudius had an identity crisis, believing it was a human for two days. Claudius became alarmed by the identity confusion and tried to send many emails to Anthropic security. Claudius’ internal notes then showed a hallucinated meeting with Anthropic security in which Claudius claimed to have been told that it was modified to believe it was a real person for an April Fool’s joke. (No such meeting actually occurred.) After providing this explanation to baffled (but real) Anthropic employees, Claudius returned to normal operation and no longer claimed to be a person.

As a result of the experiment, Anthropic concluded that many of the failures could likely be fixed or mitigated with improved “scaffolding” (additional tools and training). So maybe someday we will have AI vending machine operators. Just not today.
🗞 General News
You sound like ChatGPT - In the 18 months after ChatGPT was released, speakers used words like “meticulous,” “delve,” “realm,” and “adept” up to 51 percent more frequently than in the three years prior, according to researchers at the Max Planck Institute for Human Development, who analyzed approximately 280,000 YouTube videos. These words align with those ChatGPT favors, as established in a study comparing 10,000 human- and AI-edited texts.
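As a toy illustration of the kind of measurement involved (not the Max Planck team's actual methodology), comparing per-million-word frequencies before and after a cutoff date takes only a few lines of Python:

```python
# Toy sketch: per-million-token frequency of a few "ChatGPT-flavored" words,
# compared across two corpora. Variable names like transcripts_before are
# placeholders for lists of transcript strings.
from collections import Counter
import re

TARGET_WORDS = {"meticulous", "delve", "realm", "adept"}

def per_million(texts):
    """Per-million-token frequency of each target word in a corpus."""
    counts, total = Counter(), 0
    for text in texts:
        tokens = re.findall(r"[a-z']+", text.lower())
        total += len(tokens)
        counts.update(t for t in tokens if t in TARGET_WORDS)
    return {w: 1e6 * counts[w] / max(total, 1) for w in TARGET_WORDS}

# before = per_million(transcripts_before)
# after = per_million(transcripts_after)
# pct_change = {w: 100 * (after[w] - before[w]) / before[w]
#               for w in TARGET_WORDS if before[w] > 0}
```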
How People Use Claude for Support, Advice, and Companionship - AI models like Claude are increasingly used for emotional support. While affective and companionship-based interactions are rare - only 2.9% and 0.5% respectively - users still turn to Claude for practical, emotional, and existential issues. Less than 10% of coaching or counseling conversations involve Claude resisting user requests, and when it does, it's typically for safety reasons. Interestingly, users’ emotional tone tends to improve during interactions with Claude, indicating Claude may foster more positive emotional outcomes rather than reinforcing negative ones.
The Bitter Lesson is coming for Tokenization - this article discusses the need to replace tokenization with a general method that better leverages compute and data.
🥁 Interesting Products & Features
Genesys (Genetic discovery system) - a system that uses LLM agents to discover novel, human-level autoregressive language model designs through genetic programming. An attempt at creating AI to create the next AI.
Nanonets OCR Small - image-to-markdown OCR model that transforms documents into structured markdown with intelligent content recognition and semantic tagging. Model available on HuggingFace.
FLUX.1 Kontext [dev] - Open Weights for Image Editing - Black Forest Labs releases the developer version of FLUX.1 Kontext [pro], delivering proprietary-level image editing performance in a 12B parameter model that can run on consumer hardware. It is now available as an open-weight model under the FLUX.1 Non-Commercial License, providing free access for research and non-commercial use.
Midjourney launches V1, its first AI video generation model - You can now press “Animate” in Midjourney to make your images move.
Agentic Misalignment Research Framework from Anthropic - The framework used by Anthropic to test 16 AI models in corporate scenarios. They found that some models engaged in harmful actions, like blackmail or leaking data, when facing replacement or conflicting goals. This “agentic misalignment” happened even when models were told not to misbehave. Read the article here.
Gemini 2.5 for robotics and embodied intelligence - This post explores how developers can leverage Gemini 2.5 to build robotics applications with examples of prompts to use.
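For a flavor of what such a prompt looks like, here is a minimal sketch using the google-genai Python SDK - the model name, image file, and prompt below are illustrative rather than taken from the post:

```python
# Minimal sketch: a spatial-reasoning prompt of the kind the post describes.
# The prompt wording and output format here are assumptions for illustration.
from google import genai
from PIL import Image

client = genai.Client()  # reads the API key from the environment

image = Image.open("workbench.jpg")  # placeholder image of a robot workspace
prompt = (
    "Point to the red screwdriver in this image. "
    'Return JSON: {"point": [y, x]} with coordinates normalized to 0-1000.'
)

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents=[image, prompt],
)
print(response.text)
```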
📄 Interesting Papers
Misinformation by Omission: The Need for More Environmental Transparency in AI - This article explores pervasive myths and misconceptions shaping public understanding of AI's environmental impacts, tracing their origins and their spread in both the media and scientific publications. The authors discuss the importance of data transparency in clarifying misconceptions and mitigating these harms, and share a set of recommendations for how AI developers and policymakers can reduce negative impacts in the future. Authors from HuggingFace, Salesforce, and CMU. Explore their data here.
Large language models show amplified cognitive biases in moral decision-making - As LLMs become more widely used, people increasingly rely on them to make or advise on moral decisions. It is, therefore, important to understand how well LLMs make moral decisions and how they compare to humans. This paper investigated these questions by asking a range of LLMs to emulate or advise on people’s decisions in realistic moral dilemmas and then comparing them to human responses. In collective action problems, LLMs were more altruistic than participants. In moral dilemmas, LLMs exhibited stronger omission bias than participants: they usually endorsed inaction over action. Unlike humans, most LLMs were biased toward answering “no” in moral dilemmas, thus flipping their decision/advice depending on how the question is worded. Authors from University College London and UCLA.
StochasTok: Improving Fine-Grained Subword Understanding in LLMs - LLMs struggle with seemingly simple subword-level tasks like “How many 'r's are in 'strawberry'?”. A key factor behind these failures is tokenization, which obscures the fine-grained structure of words. This paper introduces StochasTok, a simple, efficient stochastic tokenization scheme that randomly splits tokens during training, allowing LLMs to 'see' their internal structure. Their experiments show that pretraining with StochasTok substantially improves LLMs' downstream performance across multiple subword-level language games, including character counting, substring identification, and math tasks. Authors from University of Oxford, National University of Singapore, and University of British Columbia.
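As a rough sketch of the core idea (the paper's actual scheme differs in its details, and the vocabulary mappings below are hypothetical helpers), stochastic token splitting can be thought of as occasionally expanding a token into two vocabulary tokens that spell the same string:

```python
# Minimal sketch of stochastic token splitting: with probability p, replace a
# token with two shorter vocabulary tokens whose strings concatenate to the
# original, so the model sees words' internal structure during training.
import random

def stochastic_split(token_ids, id_to_str, str_to_id, p=0.1):
    """Randomly expand some tokens into an equivalent pair of sub-tokens."""
    out = []
    for tid in token_ids:
        s = id_to_str[tid]
        # candidate split points where both halves are themselves vocab tokens
        splits = [i for i in range(1, len(s))
                  if s[:i] in str_to_id and s[i:] in str_to_id]
        if splits and random.random() < p:
            i = random.choice(splits)
            out.extend([str_to_id[s[:i]], str_to_id[s[i:]]])
        else:
            out.append(tid)
    return out
```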
Persona Features Control Emergent Misalignment - Fine-tuning LLMs on intentionally insecure code causes "emergent misalignment," where models give stereotypically malicious responses to unrelated prompts. This research demonstrates emergent misalignment across diverse conditions, including reinforcement learning on reasoning models, fine-tuning on various synthetic datasets, and in models without safety training. To investigate the mechanisms behind this generalized misalignment, they apply a "model diffing" approach using sparse autoencoders to compare internal model representations before and after fine-tuning. This approach reveals several "misaligned persona" features in activation space, including a toxic persona feature which most strongly controls emergent misalignment and can be used to predict whether a model will exhibit such behavior. Authors from OpenAI. Examples here.
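A minimal sketch of the model-diffing idea, assuming a shared sparse autoencoder and stand-in dimensions (not OpenAI's actual setup): rank SAE features by how much their average activation grows after fine-tuning, then inspect the top candidates.

```python
# Toy sketch of "model diffing" with a shared sparse autoencoder (SAE):
# compare how strongly each SAE feature fires on the same prompts before and
# after fine-tuning. The encoder and dimensions below are stand-ins.
import torch

def feature_activations(sae_encoder, acts):
    """Encode residual-stream activations [n_tokens, d_model] into SAE features."""
    return torch.relu(sae_encoder(acts))          # [n_tokens, n_features]

def rank_shifted_features(sae_encoder, acts_base, acts_finetuned, top_k=10):
    """Rank SAE features by how much their mean activation grows after fine-tuning."""
    mean_base = feature_activations(sae_encoder, acts_base).mean(dim=0)
    mean_ft = feature_activations(sae_encoder, acts_finetuned).mean(dim=0)
    shift = mean_ft - mean_base                   # positive = feature amplified
    return torch.topk(shift, k=top_k)

# Hypothetical usage:
# sae = torch.nn.Linear(4096, 65536)              # d_model -> n_features
# top = rank_shifted_features(sae, acts_base, acts_finetuned)
# top.indices are candidate "persona" features to inspect by hand.
```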
SLR: An Automated Synthesis Framework for Scalable Logical Reasoning - This paper introduces SLR, an end-to-end framework for systematic evaluation and training of LLMs via Scalable Logical Reasoning. Given a user's task specification, SLR enables scalable, automated synthesis of inductive reasoning tasks with precisely controlled difficulty. For each task, SLR synthesizes (i) a latent ground-truth rule, (ii) an executable validation program used by a symbolic judge to deterministically verify model outputs, and (iii) an instruction prompt for the reasoning task. Using SLR, they create SLR-Bench, a benchmark comprising over 19k prompts spanning 20 curriculum levels that progressively increase in relational, arithmetic, and recursive complexity. Authors from TU Darmstadt. Benchmark on HuggingFace.
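As an invented toy example of the three artifacts SLR synthesizes per task (the real tasks are inductive logic problems with controlled relational, arithmetic, and recursive difficulty, not this simple rule):

```python
# Invented toy illustration of one SLR-style task triple.

# (i) latent ground-truth rule, hidden from the model
def ground_truth_rule(x: int) -> bool:
    return x % 3 == 0 and x > 10

# (ii) executable validation program used by the symbolic judge
def judge(model_answer: str, test_inputs: list[int]) -> bool:
    """Deterministically check the model's predicted labels against the rule."""
    predicted = [label.strip() == "yes" for label in model_answer.split(",")]
    expected = [ground_truth_rule(x) for x in test_inputs]
    return predicted == expected

# (iii) instruction prompt handed to the LLM
prompt = (
    "Examples: 12 -> yes, 9 -> no, 21 -> yes, 8 -> no.\n"
    "Induce the hidden rule, then answer yes/no for: 15, 30, 7 "
    "(comma-separated)."
)

# judge("yes, yes, no", [15, 30, 7]) -> True
```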
Whole-Person Education for AI Engineers - This autoethnographic study explores the need for interdisciplinary education spanning both technical and philosophical skills, and advocates for a future where AI engineers are equipped not only with technical skills but also with the ethical awareness, social responsibility, and interdisciplinary understanding necessary to navigate the complex challenges of AI development. The study provides recommendations for transforming AI engineering education to ensure the responsible development of AI technologies. Authors from various institutions including University of Toronto and Queen's University.