Large language model developer Anthropic PBC today rolled out its newest Claude 4 frontier models, starting with Opus 4 and Sonnet 4, which the company said set new standards for coding, advanced ...
SAN FRANCISCO--(BUSINESS WIRE)--Writer, the full-stack generative AI platform for the enterprise, today released its newest large language model (LLM) to power the next generation of AI applications ...
New “AI GYM for Science” dramatically boosts the biological and chemical intelligence of any causal or frontier LLM, delivering up to 10x performance gains on key drug discovery benchmarks and ...
Specialized models trained on narrow, domain-specific data. They're being honed through intensive training until their ...
Oct. 12, 2024 — A research team led by the University of Maryland has been nominated for the Association for Computing Machinery’s Gordon Bell Prize. The team is being recognized for developing a ...
There’s a paradox at the heart of modern AI: The kinds of sophisticated models that companies are using to get real work done and reduce head count aren’t the ones getting all the attention. Ever-more ...
MOUNTAIN VIEW, CA, October 31, 2025 (EZ Newswire) -- Fortytwo research lab today announced benchmarking results for its new AI architecture, known as Swarm Inference. Across key AI ...
In the lead-up to China's Labor Day Golden Week, the country's AI sector is experiencing a flurry of large language model (LLM) upgrades. Baidu and Alibaba have rolled out new flagship models, while ...
Unrelenting, persistent attacks on frontier models make them fail, with the patterns of failure varying by model and developer. Red teaming shows that it’s not the sophisticated, complex attacks that ...
Use the vitals package with ellmer to evaluate and compare the accuracy of LLMs, including writing evals to test local models ...