Benchmarks Show Matched Capability, Brittle Reasoning

Two artificial intelligence models from competing labs, OpenAI's GPT-5.5 and Anthropic's Mythos Preview, now deliver near-identical offensive cyber performance. That parity comes with consistent reasoning failures that the cyber scores alone do not capture.