bird-bench

Version: parity

BIRD SQL parity subset (150 tasks, seed 42). Original benchmark: https://huggingface.co/datasets/birdsql/bird_sql_dev_20251106. Adapter: https://github.com/laude-institute/harbor/tree/main/adapters/bird-bench.

uvx harbor run -d bird-bench@parity

Tasks (150)

student_club__1423
uvx harbor run -d bird-bench@parity -t student_club__1423
82d1fb0
student_club__1428
uvx harbor run -d bird-bench@parity -t student_club__1428
82d1fb0
student_club__1438
uvx harbor run -d bird-bench@parity -t student_club__1438
82d1fb0
student_club__1445
uvx harbor run -d bird-bench@parity -t student_club__1445
82d1fb0
student_club__1449
uvx harbor run -d bird-bench@parity -t student_club__1449
82d1fb0
student_club__1456
uvx harbor run -d bird-bench@parity -t student_club__1456
82d1fb0
student_club__1457
uvx harbor run -d bird-bench@parity -t student_club__1457
82d1fb0
superhero__738
uvx harbor run -d bird-bench@parity -t superhero__738
82d1fb0
superhero__746
uvx harbor run -d bird-bench@parity -t superhero__746
82d1fb0
superhero__752
uvx harbor run -d bird-bench@parity -t superhero__752
82d1fb0
superhero__764
uvx harbor run -d bird-bench@parity -t superhero__764
82d1fb0
superhero__765
uvx harbor run -d bird-bench@parity -t superhero__765
82d1fb0
superhero__786
uvx harbor run -d bird-bench@parity -t superhero__786
82d1fb0
superhero__799
uvx harbor run -d bird-bench@parity -t superhero__799
82d1fb0
superhero__802
uvx harbor run -d bird-bench@parity -t superhero__802
82d1fb0
superhero__813
uvx harbor run -d bird-bench@parity -t superhero__813
82d1fb0
superhero__814
uvx harbor run -d bird-bench@parity -t superhero__814
82d1fb0
superhero__816
uvx harbor run -d bird-bench@parity -t superhero__816
82d1fb0
thrombosis_prediction__1157
uvx harbor run -d bird-bench@parity -t thrombosis_prediction__1157
82d1fb0
thrombosis_prediction__1162
uvx harbor run -d bird-bench@parity -t thrombosis_prediction__1162
82d1fb0
thrombosis_prediction__1164
uvx harbor run -d bird-bench@parity -t thrombosis_prediction__1164
82d1fb0
thrombosis_prediction__1168
uvx harbor run -d bird-bench@parity -t thrombosis_prediction__1168
82d1fb0
thrombosis_prediction__1171
uvx harbor run -d bird-bench@parity -t thrombosis_prediction__1171
82d1fb0
thrombosis_prediction__1192
uvx harbor run -d bird-bench@parity -t thrombosis_prediction__1192
82d1fb0
thrombosis_prediction__1195
uvx harbor run -d bird-bench@parity -t thrombosis_prediction__1195
82d1fb0
thrombosis_prediction__1202
uvx harbor run -d bird-bench@parity -t thrombosis_prediction__1202
82d1fb0
thrombosis_prediction__1209
uvx harbor run -d bird-bench@parity -t thrombosis_prediction__1209
82d1fb0
thrombosis_prediction__1222
uvx harbor run -d bird-bench@parity -t thrombosis_prediction__1222
82d1fb0
thrombosis_prediction__1232
uvx harbor run -d bird-bench@parity -t thrombosis_prediction__1232
82d1fb0
thrombosis_prediction__1248
uvx harbor run -d bird-bench@parity -t thrombosis_prediction__1248
82d1fb0
thrombosis_prediction__1250
uvx harbor run -d bird-bench@parity -t thrombosis_prediction__1250
82d1fb0
thrombosis_prediction__1254
uvx harbor run -d bird-bench@parity -t thrombosis_prediction__1254
82d1fb0
thrombosis_prediction__1263
uvx harbor run -d bird-bench@parity -t thrombosis_prediction__1263
82d1fb0
thrombosis_prediction__1269
uvx harbor run -d bird-bench@parity -t thrombosis_prediction__1269
82d1fb0
thrombosis_prediction__1272
uvx harbor run -d bird-bench@parity -t thrombosis_prediction__1272
82d1fb0
thrombosis_prediction__1302
uvx harbor run -d bird-bench@parity -t thrombosis_prediction__1302
82d1fb0
thrombosis_prediction__1305
uvx harbor run -d bird-bench@parity -t thrombosis_prediction__1305
82d1fb0
toxicology__211
uvx harbor run -d bird-bench@parity -t toxicology__211
82d1fb0
toxicology__223
uvx harbor run -d bird-bench@parity -t toxicology__223
82d1fb0
toxicology__228
uvx harbor run -d bird-bench@parity -t toxicology__228
82d1fb0
toxicology__229
uvx harbor run -d bird-bench@parity -t toxicology__229
82d1fb0
toxicology__230
uvx harbor run -d bird-bench@parity -t toxicology__230
82d1fb0
toxicology__269
uvx harbor run -d bird-bench@parity -t toxicology__269
82d1fb0
toxicology__272
uvx harbor run -d bird-bench@parity -t toxicology__272
82d1fb0
toxicology__293
uvx harbor run -d bird-bench@parity -t toxicology__293
82d1fb0
toxicology__299
uvx harbor run -d bird-bench@parity -t toxicology__299
82d1fb0
toxicology__307
uvx harbor run -d bird-bench@parity -t toxicology__307
82d1fb0
toxicology__311
uvx harbor run -d bird-bench@parity -t toxicology__311
82d1fb0
toxicology__313
uvx harbor run -d bird-bench@parity -t toxicology__313
82d1fb0
toxicology__316
uvx harbor run -d bird-bench@parity -t toxicology__316
82d1fb0
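The per-task commands in this listing all follow one pattern, so they can be generated rather than copied by hand. The following is a minimal sketch, not part of the adapter: it assumes that the text before the double underscore in a task ID names the source database, and the helper names `task_command` and `group_by_database` are hypothetical. It only builds command strings; it does not invoke `harbor`.

```python
import shlex
from collections import defaultdict

# Dataset reference exactly as it appears in the listing.
DATASET = "bird-bench@parity"


def task_command(task_id: str, dataset: str = DATASET) -> str:
    """Build the single-task run command in the form shown in the listing."""
    return shlex.join(["uvx", "harbor", "run", "-d", dataset, "-t", task_id])


def group_by_database(task_ids):
    """Group task IDs by the text before the last double underscore.

    Assumption: that prefix is the database name (e.g. 'toxicology').
    IDs without a double underscore land under the empty-string key.
    """
    groups = defaultdict(list)
    for task_id in task_ids:
        database, _, _ = task_id.rpartition("__")
        groups[database].append(task_id)
    return dict(groups)


if __name__ == "__main__":
    # A few task IDs copied from the listing.
    sample = [
        "student_club__1423",
        "superhero__738",
        "toxicology__211",
        "toxicology__223",
    ]
    for database, ids in group_by_database(sample).items():
        print(f"# {database}")
        for task_id in ids:
            print(task_command(task_id))
```

Printing the commands instead of executing them keeps the sketch safe to run anywhere; the output can be reviewed and then pasted into a shell.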