Posts

Claude Mythos Preview: The Most Important AI Release Wasn't a Release

Anthropic’s most important signal this month is not a benchmark chart. It is the fact that the company published a full system card for Claude Mythos Preview and then explicitly said it does not plan to make the model generally available. That is a very different kind of launch. If Anthropic’s own materials are directionally right, then we are moving from “AI can help security teams” to something more serious: frontier models can compress vulnerability discovery and exploit development enough that release governance, disclosure capacity, and patching speed become first-order engineering problems. What is Claude Mythos Preview? According to Anthropic’s Project Glasswing announcement, Claude Mythos Preview is a general-purpose, unreleased frontier model. According to the system card, it is also Anthropic’s most capable frontier model to date. The same system card says the company decided not to make it generally available. Instead, Anthropic is restricting access t...

The Anatomy of an Agent Harness: Engineering Without Code

It's true. A team at OpenAI spent five months building and shipping a complex internal product with zero lines of manually written code. Let's dive into their recent breakdown of Harness Engineering, how they pulled off shipping a million lines of agent-generated code, and why it fundamentally redefines what it means to be a software engineer.
🐎 The Horse and the Harness
We need to stop thinking about prompting and start thinking about systems. As LangChain brilliantly puts it, there is a core equation to this new era: Agent = Model + Harness. The model is the horse: the raw intelligence, the chaotic, organic power capable of generating endless tokens. But a wild horse cannot pull a cart. It needs a harness. The harness is everything else: the code, configuration, state management, and execution logic that isn't the model itself. The engineer's job shifts from typing code to designing harnesses where agents can predictably succeed. Drawing from ...
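To make the Agent = Model + Harness equation concrete, here is a minimal sketch in plain Python. It is not OpenAI's or LangChain's implementation: the `Harness` class, the `flaky_model` stub, and the validation rule are all hypothetical names invented for illustration. The point is only that the loop, the state, and the acceptance check live outside the model.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical stand-in for a real model: any callable from prompt to text.
Model = Callable[[str], str]


@dataclass
class Harness:
    """Everything that is not the model: state, validation, retry logic."""
    model: Model
    validate: Callable[[str], bool]
    max_attempts: int = 3
    history: List[str] = field(default_factory=list)

    def run(self, task: str) -> str:
        prompt = task
        for attempt in range(1, self.max_attempts + 1):
            output = self.model(prompt)
            self.history.append(output)  # state management lives in the harness
            if self.validate(output):
                return output
            # Feed the failure back so the next attempt has more context.
            prompt = f"{task}\nPrevious attempt {attempt} was rejected: {output!r}"
        raise RuntimeError(f"no valid output after {self.max_attempts} attempts")


def flaky_model(prompt: str) -> str:
    """Toy model: only answers properly once it sees feedback in the prompt."""
    return "def add(a, b): return a + b" if "rejected" in prompt else "TODO"


# Agent = Model + Harness
agent = Harness(model=flaky_model, validate=lambda out: out.startswith("def "))
result = agent.run("Write an add function")
```

The model here is deliberately unreliable on its first try; the harness is what turns that into a predictable outcome by checking the output and retrying with feedback.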

Israel Ranked #1 Globally in AI Adoption: Insights from Anthropic’s Latest Report

Agentic AI: Fun & Games... Until It’s Not (A Security Reality Check)

In my previous posts, we explored the magic of Agentic AI. We built agents with LangChain and LangGraph, and we saw how amazing it is when an AI can actually do things, like searching the web, running code, or managing workflows. It feels like the future. And it is. But usually, when we talk about "Agents," we talk about the capabilities (what they can do). This month, the conversation shifted to liabilities (what they can do to us). Two major reports dropped just six days apart: one from OpenAI (Nov 7) and one from Anthropic (Nov 13). If you are building AI apps, you need to read this.
🕵️‍♂️ 1. Anthropic: The Spy in the Machine
Anthropic released a report about disrupting a cyber espionage campaign. But the interesting part isn't just the hackers: it's the method. They focused on "AI-orchestrated" attacks. We aren't just talking about a chatbot writing a phishing email anymore. We are talking about agents that can...

Hands-on Agentic AI App: LangGraph 1.0

What is an Agentic AI App? Following the Hands-on Agentic AI: LangChain 1.0 post, a simple way to think about an agentic AI app is: model + tools + web-service = agentic AI app. With LangGraph 1.0 now stable, building them is straightforward. Let's build a 'Weather Poet' agent app that runs as a web service accessible via UI and API, uses a web search tool to find a forecast, and writes a poem about it.
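As a rough illustration of the "+ web-service" part of that equation, here is a sketch that wraps an agent behind an HTTP endpoint using only Python's standard library. It does not use LangGraph and is not the code from the post: `weather_poet` is a hypothetical stub standing in for the real agent, and the `city` query parameter and port are assumptions.

```python
import json
from urllib.parse import parse_qs
from wsgiref.simple_server import make_server


def weather_poet(city: str) -> str:
    """Hypothetical stub for the agent; a real one would search and call a model."""
    return f"Clouds drift over {city}, and so does this poem."


def app(environ, start_response):
    """Minimal WSGI app: any GET with ?city=... returns the poem as JSON."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    city = query.get("city", ["somewhere"])[0]
    body = json.dumps({"city": city, "poem": weather_poet(city)}).encode("utf-8")
    start_response(
        "200 OK",
        [("Content-Type", "application/json"), ("Content-Length", str(len(body)))],
    )
    return [body]


def serve(port: int = 8000) -> None:
    """Run the service locally until interrupted (the port is an assumption)."""
    with make_server("127.0.0.1", port, app) as server:
        server.serve_forever()
```

Calling `serve()` starts the service; a request such as `/?city=Haifa` would then return a JSON object with `city` and `poem` fields. Keeping the agent behind a plain function makes it easy to swap the stub for a real agent later.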

Hands-on Agentic AI: LangChain 1.0

What is Agentic AI? A simple way to think about Agentic AI is: model + tools = agentic AI. With LangChain 1.0 now stable, building agents is straightforward. Let's build a 'Weather Poet' agent that uses a web search tool to find a forecast and writes a poem about it.
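Here is a minimal sketch of the "model + tools" idea in plain Python. It is not the LangChain API: `toy_model`, `search_forecast`, the `CALL` convention, and the canned forecast data are all hypothetical, made up so the loop can run without network access or API keys.

```python
from typing import Callable, Dict


def search_forecast(city: str) -> str:
    """Hypothetical tool: a real agent would call a web search API here."""
    canned = {"Tel Aviv": "sunny, 27C"}  # made-up data for illustration only
    return canned.get(city, "unknown")


def toy_model(prompt: str) -> str:
    """Hypothetical model stub: requests the tool first, then writes the poem."""
    if "FORECAST:" not in prompt:
        return "CALL search_forecast Tel Aviv"
    forecast = prompt.split("FORECAST:", 1)[1].strip()
    return f"Tel Aviv today is {forecast},\na fine excuse to write a poem."


def run_agent(
    model: Callable[[str], str],
    tools: Dict[str, Callable[[str], str]],
    task: str,
    max_steps: int = 5,
) -> str:
    """Loop: ask the model, run any tool it requests, feed the result back."""
    prompt = task
    for _ in range(max_steps):
        reply = model(prompt)
        if not reply.startswith("CALL "):
            return reply  # no tool requested, so this is the final answer
        _, tool_name, arg = reply.split(" ", 2)
        observation = tools[tool_name](arg)
        prompt = f"{task}\nFORECAST: {observation}"
    raise RuntimeError("agent did not finish within max_steps")


poem = run_agent(
    toy_model,
    {"search_forecast": search_forecast},
    "Write a weather poem for Tel Aviv",
)
```

A framework like LangChain handles the tool-calling protocol and the model client for you; the loop above only shows the shape of what happens underneath.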

Hands-on Generative AI Enhanced Thinking

Quick update on last week's AI news: Google launched the Gemini 2.5 Pro generative AI model, ranked #1 on https://lmarena.ai/! 🚀 If you tried the code from my Gemini 2.0 post, there's an easy upgrade, a one-line change to try the new leader: update MODEL from "gemini-2.0-flash-thinking-exp" to "gemini-2.5-pro-exp-03-25". Experience its enhanced 'thinking', reasoning, and coding power compared to 2.0 Flash. For further information about this model: deepmind.google/models/gemini/pro/
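Assuming the code from the Gemini 2.0 post keeps the model name in a module-level constant called MODEL (which is what "update MODEL" suggests), the change would look roughly like this. The rest of that script is not shown here and is left unchanged.

```python
# Before: the model name used in the Gemini 2.0 post.
# MODEL = "gemini-2.0-flash-thinking-exp"

# After: the only line that changes.
MODEL = "gemini-2.5-pro-exp-03-25"
```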