Commit b377e77

update leaderboard for three o3 variants
1 parent ba74df9 commit b377e77

1 file changed: +5 -3 lines changed

docs/leaderboard.md

@@ -6,9 +6,11 @@
 
 | Models | Main Problem Resolve Rate | <span style="color:grey">Subproblem</span> |
 |--------------------------|-------------------------------------|-------------------------------------|
-| 🥇 OpenAI o3-mini | <div align="center">**9.2**</div> | <div align="center" style="color:grey">33.0</div> |
-| 🥈 OpenAI o1-preview | <div align="center">**7.7**</div> | <div align="center" style="color:grey">28.5</div> |
-| 🥉 Deepseek-R1 | <div align="center">**4.6**</div> | <div align="center" style="color:grey">28.5</div> |
+| 🥇 OpenAI o3-mini-low | <div align="center">**10.8**</div> | <div align="center" style="color:grey">33.3</div> |
+| 🥈 OpenAI o3-mini-high | <div align="center">**9.2**</div> | <div align="center" style="color:grey">34.4</div> |
+| 🥉 OpenAI o3-mini-medium | <div align="center">**9.2**</div> | <div align="center" style="color:grey">33.0</div> |
+| OpenAI o1-preview | <div align="center">**7.7**</div> | <div align="center" style="color:grey">28.5</div> |
+| Deepseek-R1 | <div align="center">**4.6**</div> | <div align="center" style="color:grey">28.5</div> |
 | Claude3.5-Sonnet | <div align="center">**4.6**</div> | <div align="center" style="color:grey">26.0</div> |
 | Claude3.5-Sonnet (new) | <div align="center">**4.6**</div> | <div align="center" style="color:grey">25.3</div> |
 | Deepseek-v3 | <div align="center">**3.1**</div> | <div align="center" style="color:grey">23.7</div> |
