٩◔̯◔۶ Web 3.0 slept with my wife

Feeds: RSS, JSON

Simulated Company Shows Most AI Agents Flunk the Job:

The results should comfort people worried about AI replacing them. The best of them, Claude 3.5 Sonnet from Anthropic, only completed 24% of the tasks. Google’s Gemini 2.0 Flash came in second with 11.4%, and OpenAI’s GPT-4o was third with 8.6%.

lol.