Beyond math and coding: New RL framework helps train LLM agents for complex, real-world tasks

Researchers at the University of Science and Technology of China have developed a new reinforcement learning (RL) framework that helps train large language models (LLMs) for complex agentic tasks beyond well-defined problems such as math and coding. Their framework, Agent-R1, is compatible with popular RL algorithms and shows considerable improvement on reasoning tasks that require…

Supabase hit $5B by turning down million-dollar contracts. Here’s why.

Vibe coding has taken the tech industry by storm, and it’s not just the Lovables and Replits of the world that are winning. The startups building the infrastructure behind them are cashing in too. Supabase, the open-source database platform that’s become the backend of choice for the vibe-coding world, raised $100 million at a $5 billion valuation just months after closing $200 million at $2 billion. But co-founder and CEO…

Anthropic says it solved the long-running AI agent problem with a new multi-session Claude SDK

Agent memory remains a problem that enterprises want to fix, as agents forget some instructions or conversations the longer they run. Anthropic believes it has solved this issue for its Claude Agent SDK, developing a two-fold solution that allows an agent to work across different context windows. “The core challenge of long-running agents is that…
