AI Snake Oil
39 min read
Open-world evaluations for measuring frontier AI capabilities
Introducing CRUX, a new project for evaluating AI on long, messy tasks