7 months ago: https://web.archive.org/web/20241210232635/https://openlm.ai/chatbot-arena/ Now: https://web.archive.org/web/20250602092229/https://openlm.ai/chatbot-arena/
You can see that o1-mini, formerly a silver (almost gold) model, is now a middle-of-the-road copper model.
Note that Chatbot Arena scores are relative: it shows people two outputs (without the model names), and they select the one they prefer. The models are then ranked from those preferences. I'm not sure what determines the gold/silver/copper tiers.
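To illustrate how pairwise preferences can turn into a relative score, here is a minimal sketch using a standard Elo update. This is an assumption for illustration only: the pages linked above are not the source of this code, and the model names, starting rating of 1000, and `k` value are all hypothetical.

```python
def expected_score(rating_a, rating_b):
    """Probability that A is preferred over B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(ratings, winner, loser, k=32.0):
    """Shift rating points from the loser to the winner of one blind vote."""
    surprise = 1.0 - expected_score(ratings[winner], ratings[loser])
    delta = k * surprise
    ratings[winner] += delta
    ratings[loser] -= delta


# Hypothetical models, all starting from the same rating (assumption).
ratings = {"model_a": 1000.0, "model_b": 1000.0, "model_c": 1000.0}

# Each vote is (preferred output, other output); names are hidden from voters.
votes = [
    ("model_b", "model_a"),
    ("model_c", "model_a"),
    ("model_c", "model_b"),
]

for winner, loser in votes:
    update_elo(ratings, winner, loser)

# Highest rating first.
ranking = sorted(ratings, key=ratings.get, reverse=True)
print(ranking)
```

Because every point one model gains is taken from another, a rating only says how a model fares against the current field. An older model can slide down the ranking as stronger models join, even though its own outputs have not changed.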