Built-in Criteria

Reference for all built-in criterion functions

Rewardkit ships with built-in criteria for common rubrics. Use them from any Python file in your tests directory:

```python
import rewardkit as rk

rk.file_exists("output.txt")
rk.command_succeeds("python main.py", weight=2.0)
```

All criteria accept the optional parameters weight (default 1.0) and isolated (default False) in addition to the ones listed below.

File criteria

| Criterion | Parameters | Description |
| --- | --- | --- |
| file_exists | path | File exists in workspace |
| file_not_exists | path | File does not exist |
| file_contains | path, text | File contains a substring |
| file_contains_regex | path, pattern | File content matches a regex pattern |
| file_matches | path, expected | File content equals expected text (whitespace-stripped) |
| files_equal | path1, path2 | Two files have identical content |
| diff_ratio | path, expected | Similarity ratio between file content and expected text (returns 0.0–1.0) |
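To make the difference between file_matches and diff_ratio concrete, here is a minimal standard-library sketch. It is not rewardkit's implementation: the helper names are hypothetical, the use of difflib for the similarity ratio is an assumption, and "whitespace-stripped" is assumed to mean that leading and trailing whitespace is removed from both sides.

```python
import difflib


def matches_stripped(actual: str, expected: str) -> bool:
    """Exact comparison after trimming surrounding whitespace (assumed semantics)."""
    return actual.strip() == expected.strip()


def similarity_ratio(actual: str, expected: str) -> float:
    """Score in [0.0, 1.0]; difflib is an assumption, not rewardkit's confirmed method."""
    return difflib.SequenceMatcher(None, actual, expected).ratio()


# An exact-match check is all-or-nothing, while a ratio gives partial credit.
print(matches_stripped("done\n", "done"))
print(similarity_ratio("hello world", "hello there"))
```

A pass/fail criterion such as file_matches suits outputs with one correct answer, while a ratio such as diff_ratio suits outputs where near misses should still earn partial reward.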

Command criteria

| Criterion | Parameters | Description |
| --- | --- | --- |
| command_succeeds | cmd, cwd?, timeout? | Command exits with code 0 |
| command_output_contains | cmd, text, cwd?, timeout? | Command stdout contains text |
| command_output_matches | cmd, expected, cwd?, timeout? | Command stdout equals expected (stripped) |
| command_output_matches_regex | cmd, pattern, cwd?, timeout? | Command stdout matches a regex |

Default timeout is 30 seconds. The cwd parameter is relative to the workspace.
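The following sketch shows one way checks like these could be built on the standard library, mainly to illustrate how cwd and timeout interact. The helper names, the workspace argument, the use of shlex.split instead of a shell, and the choice to treat a timeout as a failure are all assumptions made for illustration, not rewardkit's actual behavior.

```python
import shlex
import subprocess
import sys
from pathlib import Path


def run_in_workspace(cmd, workspace=".", cwd=None, timeout=30.0):
    """Run cmd with cwd resolved relative to the workspace; None on timeout or launch error."""
    workdir = Path(workspace) / cwd if cwd else Path(workspace)
    try:
        return subprocess.run(
            shlex.split(cmd),  # assumption: no shell features are needed
            cwd=workdir,
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except (subprocess.TimeoutExpired, OSError):
        return None


def sketch_command_succeeds(cmd, **kwargs):
    """True when the command finished in time with exit code 0."""
    result = run_in_workspace(cmd, **kwargs)
    return result is not None and result.returncode == 0


def sketch_output_contains(cmd, text, **kwargs):
    """True when the command finished in time and stdout contains text."""
    result = run_in_workspace(cmd, **kwargs)
    return result is not None and text in result.stdout


# Use the current interpreter so the example does not depend on PATH.
PY = shlex.quote(sys.executable)
print(sketch_command_succeeds(f"{PY} -c 'print(42)'"))
```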

Data format criteria

| Criterion | Parameters | Description |
| --- | --- | --- |
| json_key_equals | path, key, expected | Top-level JSON key equals a value |
| json_path_equals | path, json_path, expected | Dot-separated path into JSON equals a value |
| csv_cell_equals | path, row, col, expected | CSV cell at row/col equals a value |
| xlsx_cell_equals | path, cell, expected, sheet? | Excel cell equals a value |
| sqlite_query_equals | db_path, query, expected | SQL query result equals a value |

xlsx_cell_equals requires the office extra: uv add harbor-rewardkit[office]

For csv_cell_equals, row numbering depends on the column type. When col is an integer, a raw CSV reader is used and row 0 is the header row. When col is a string (column name), row 0 is the first data row after the header.
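The row-numbering rule is easiest to see on a small file. This sketch reproduces the described behavior with Python's csv module, using csv.reader for integer columns and csv.DictReader for named columns; the helper name is hypothetical and rewardkit's internals may differ.

```python
import csv
import io


def csv_cell(text, row, col):
    """Return the cell at (row, col) following the numbering rule described above."""
    if isinstance(col, int):
        # Raw reader: the header line counts as row 0.
        rows = list(csv.reader(io.StringIO(text)))
    else:
        # Named column: the header is consumed, so row 0 is the first data row.
        rows = list(csv.DictReader(io.StringIO(text)))
    return rows[row][col]


SAMPLE = "name,score\nalice,90\nbob,85\n"

print(csv_cell(SAMPLE, 0, 1))        # header cell: "score"
print(csv_cell(SAMPLE, 0, "score"))  # first data row: "90"
```

Note that the same row index addresses different lines depending on the type of col, so alice's score is at row 1 with col=1 but at row 0 with col="score".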

HTTP criteria

| Criterion | Parameters | Description |
| --- | --- | --- |
| http_status_equals | url, status?, timeout? | HTTP response has the expected status code (default 200) |
| http_response_contains | url, text, timeout? | HTTP response body contains text |

Image criteria

| Criterion | Parameters | Description |
| --- | --- | --- |
| image_similarity | path1, path2 | Pixel-level similarity ratio (returns 0.0–1.0) |
| image_size_equals | path, width, height | Image has the expected dimensions |

Image criteria require the image extra: uv add harbor-rewardkit[image]

Trajectory criteria

These criteria inspect the agent's ATIF trajectory file (default path: /logs/trajectory.json).

| Criterion | Parameters | Description |
| --- | --- | --- |
| trajectory_tool_used | tool_name, min_count?, path? | Agent used a specific tool at least min_count times (default 1) |
| trajectory_tool_not_used | tool_name, path? | Agent did not use a specific tool |
| trajectory_turn_count | max_turns, path? | Penalizes exceeding a turn budget: returns 1.0 at max_turns, then decays linearly to 0.0 at twice max_turns |
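The scoring curve described for trajectory_turn_count can be written as a short formula. The sketch below is a hypothetical helper that matches the two stated points (1.0 at max_turns, 0.0 at double); returning 1.0 below the budget and clamping at 0.0 beyond double the budget are assumptions.

```python
def turn_budget_score(turns: int, max_turns: int) -> float:
    """Linear decay from 1.0 at max_turns to 0.0 at 2 * max_turns."""
    if max_turns <= 0:
        raise ValueError("max_turns must be positive")
    if turns <= max_turns:
        return 1.0  # assumption: no penalty while within the budget
    overshoot = (turns - max_turns) / max_turns
    return max(0.0, 1.0 - overshoot)  # assumption: clamp at 0.0 past double


# With a budget of 10 turns, 15 turns is halfway down the slope.
print(turn_budget_score(15, 10))  # 0.5
```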

Optional extras

| Extra | Criteria | Install |
| --- | --- | --- |
| office | xlsx_cell_equals | uv add harbor-rewardkit[office] |
| image | image_similarity, image_size_equals | uv add harbor-rewardkit[image] |
| all | All of the above | uv add harbor-rewardkit[all] |
