bfcl

v1.0

Berkeley Function-Calling Leaderboard: 3,641 function-calling tasks for evaluating LLM tool-use capabilities across the simple, multiple, parallel, and irrelevance categories.

uvx harbor run -d bfcl@1.0
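
To run a handful of tasks rather than the whole dataset, the per-task command from the list below can be wrapped in a small loop. This is a minimal sketch, not part of harbor itself: `run_task` and the `DRY_RUN` variable are hypothetical helpers introduced here, and only the `-d` and `-t` flags shown in this listing are used. Because the listing does not say whether several `-t` flags can be combined, the sketch issues one invocation per task. By default it only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch: build one harbor command per task ID, printing it by default.
# DATASET and the -d / -t flags come from the listing; DRY_RUN and
# run_task are hypothetical names used for illustration only.
set -euo pipefail

DATASET="bfcl@1.0"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = execute them

run_task() {
  local task_id="$1"
  local cmd=(uvx harbor run -d "$DATASET" -t "$task_id")
  if [ "$DRY_RUN" = "1" ]; then
    # Print the command exactly as it would be run.
    printf '%s\n' "${cmd[*]}"
  else
    "${cmd[@]}"
  fi
}

# Task IDs copied from the task list.
for task in \
  bfcl-live-multiple-3-2-0 \
  bfcl-live-multiple-30-10-0 \
  bfcl-live-multiple-31-10-1
do
  run_task "$task"
done
```

Setting `DRY_RUN=0` in the environment before running the script executes the commands instead of printing them.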

Tasks (3641)

bfcl-live-multiple-3-2-0
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-3-2-0
6bedd78
bfcl-live-multiple-30-10-0
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-30-10-0
6bedd78
bfcl-live-multiple-300-130-9
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-300-130-9
6bedd78
bfcl-live-multiple-301-131-0
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-301-131-0
6bedd78
bfcl-live-multiple-302-131-1
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-302-131-1
6bedd78
bfcl-live-multiple-303-131-2
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-303-131-2
6bedd78
bfcl-live-multiple-304-131-3
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-304-131-3
6bedd78
bfcl-live-multiple-305-131-4
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-305-131-4
6bedd78
bfcl-live-multiple-306-131-5
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-306-131-5
6bedd78
bfcl-live-multiple-307-131-6
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-307-131-6
6bedd78
bfcl-live-multiple-308-131-7
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-308-131-7
6bedd78
bfcl-live-multiple-309-131-8
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-309-131-8
6bedd78
bfcl-live-multiple-31-10-1
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-31-10-1
6bedd78
bfcl-live-multiple-310-132-0
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-310-132-0
6bedd78
bfcl-live-multiple-311-132-1
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-311-132-1
6bedd78
bfcl-live-multiple-312-132-2
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-312-132-2
6bedd78
bfcl-live-multiple-313-132-3
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-313-132-3
6bedd78
bfcl-live-multiple-314-132-4
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-314-132-4
6bedd78
bfcl-live-multiple-315-132-5
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-315-132-5
6bedd78
bfcl-live-multiple-316-132-6
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-316-132-6
6bedd78
bfcl-live-multiple-317-132-7
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-317-132-7
6bedd78
bfcl-live-multiple-318-132-8
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-318-132-8
6bedd78
bfcl-live-multiple-319-132-9
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-319-132-9
6bedd78
bfcl-live-multiple-32-10-2
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-32-10-2
6bedd78
bfcl-live-multiple-320-132-10
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-320-132-10
6bedd78
bfcl-live-multiple-321-132-11
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-321-132-11
6bedd78
bfcl-live-multiple-322-132-12
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-322-132-12
6bedd78
bfcl-live-multiple-323-132-13
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-323-132-13
6bedd78
bfcl-live-multiple-324-132-14
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-324-132-14
6bedd78
bfcl-live-multiple-325-132-15
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-325-132-15
6bedd78
bfcl-live-multiple-326-132-16
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-326-132-16
6bedd78
bfcl-live-multiple-327-132-17
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-327-132-17
6bedd78
bfcl-live-multiple-328-132-18
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-328-132-18
6bedd78
bfcl-live-multiple-329-132-19
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-329-132-19
6bedd78
bfcl-live-multiple-33-10-3
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-33-10-3
6bedd78
bfcl-live-multiple-330-132-20
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-330-132-20
6bedd78
bfcl-live-multiple-331-132-21
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-331-132-21
6bedd78
bfcl-live-multiple-332-132-22
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-332-132-22
6bedd78
bfcl-live-multiple-333-132-23
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-333-132-23
6bedd78
bfcl-live-multiple-334-132-24
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-334-132-24
6bedd78
bfcl-live-multiple-335-132-25
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-335-132-25
6bedd78
bfcl-live-multiple-336-133-0
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-336-133-0
6bedd78
bfcl-live-multiple-337-133-1
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-337-133-1
6bedd78
bfcl-live-multiple-338-133-2
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-338-133-2
6bedd78
bfcl-live-multiple-339-133-3
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-339-133-3
6bedd78
bfcl-live-multiple-34-11-0
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-34-11-0
6bedd78
bfcl-live-multiple-340-133-4
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-340-133-4
6bedd78
bfcl-live-multiple-341-133-5
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-341-133-5
6bedd78
bfcl-live-multiple-342-133-6
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-342-133-6
6bedd78
bfcl-live-multiple-343-133-7
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-343-133-7
6bedd78
bfcl-live-multiple-344-133-8
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-344-133-8
6bedd78
bfcl-live-multiple-345-133-9
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-345-133-9
6bedd78
bfcl-live-multiple-346-133-10
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-346-133-10
6bedd78
bfcl-live-multiple-347-133-11
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-347-133-11
6bedd78
bfcl-live-multiple-348-133-12
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-348-133-12
6bedd78
bfcl-live-multiple-349-133-13
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-349-133-13
6bedd78
bfcl-live-multiple-35-11-1
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-35-11-1
6bedd78
bfcl-live-multiple-350-133-14
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-350-133-14
6bedd78
bfcl-live-multiple-351-133-15
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-351-133-15
6bedd78
bfcl-live-multiple-352-133-16
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-352-133-16
6bedd78
bfcl-live-multiple-353-133-17
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-353-133-17
6bedd78
bfcl-live-multiple-354-133-18
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-354-133-18
6bedd78
bfcl-live-multiple-355-134-0
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-355-134-0
6bedd78
bfcl-live-multiple-356-134-1
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-356-134-1
6bedd78
bfcl-live-multiple-357-134-2
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-357-134-2
6bedd78
bfcl-live-multiple-358-134-3
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-358-134-3
6bedd78
bfcl-live-multiple-359-134-4
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-359-134-4
6bedd78
bfcl-live-multiple-36-12-0
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-36-12-0
6bedd78
bfcl-live-multiple-360-134-5
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-360-134-5
6bedd78
bfcl-live-multiple-361-134-6
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-361-134-6
6bedd78
bfcl-live-multiple-362-134-7
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-362-134-7
6bedd78
bfcl-live-multiple-363-134-8
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-363-134-8
6bedd78
bfcl-live-multiple-364-134-9
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-364-134-9
6bedd78
bfcl-live-multiple-365-134-10
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-365-134-10
6bedd78
bfcl-live-multiple-366-134-11
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-366-134-11
6bedd78
bfcl-live-multiple-367-134-12
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-367-134-12
6bedd78
bfcl-live-multiple-368-134-13
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-368-134-13
6bedd78
bfcl-live-multiple-369-134-14
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-369-134-14
6bedd78
bfcl-live-multiple-37-13-0
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-37-13-0
6bedd78
bfcl-live-multiple-370-134-15
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-370-134-15
6bedd78
bfcl-live-multiple-371-134-16
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-371-134-16
6bedd78
bfcl-live-multiple-372-134-17
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-372-134-17
6bedd78
bfcl-live-multiple-373-134-18
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-373-134-18
6bedd78
bfcl-live-multiple-374-134-19
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-374-134-19
6bedd78
bfcl-live-multiple-375-134-20
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-375-134-20
6bedd78
bfcl-live-multiple-376-135-0
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-376-135-0
6bedd78
bfcl-live-multiple-377-135-1
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-377-135-1
6bedd78
bfcl-live-multiple-378-135-2
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-378-135-2
6bedd78
bfcl-live-multiple-379-136-0
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-379-136-0
6bedd78
bfcl-live-multiple-38-14-0
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-38-14-0
6bedd78
bfcl-live-multiple-380-136-1
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-380-136-1
6bedd78
bfcl-live-multiple-381-136-2
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-381-136-2
6bedd78
bfcl-live-multiple-382-137-0
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-382-137-0
6bedd78
bfcl-live-multiple-383-137-1
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-383-137-1
6bedd78
bfcl-live-multiple-384-137-2
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-384-137-2
6bedd78
bfcl-live-multiple-385-137-3
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-385-137-3
6bedd78
bfcl-live-multiple-386-137-4
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-386-137-4
6bedd78
bfcl-live-multiple-387-137-5
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-387-137-5
6bedd78
bfcl-live-multiple-388-137-6
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-388-137-6
6bedd78
bfcl-live-multiple-389-137-7
uvx harbor run -d bfcl@1.0 -t bfcl-live-multiple-389-137-7
6bedd78