Built-in Criteria
Reference for all built-in criterion functions
Rewardkit ships with built-in criteria for common rubrics. Use them from any Python file in your tests directory:
```python
import rewardkit as rk

rk.file_exists("output.txt")
rk.command_succeeds("python main.py", weight=2.0)
```

All criteria accept optional `weight` (default 1.0) and `isolated` (default false) parameters in addition to the ones listed below.
File criteria
| Criterion | Parameters | Description |
|---|---|---|
| file_exists | path | File exists in the workspace |
| file_not_exists | path | File does not exist |
| file_contains | path, text | File contains a substring |
| file_contains_regex | path, pattern | File content matches a regex pattern |
| file_matches | path, expected | File content equals the expected text (whitespace-stripped) |
| files_equal | path1, path2 | Two files have identical content |
| diff_ratio | path, expected | Similarity ratio between file content and the expected text (returns 0.0–1.0) |
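The table doesn't specify how diff_ratio computes its similarity ratio. A minimal sketch of the documented behavior, assuming difflib-style sequence matching (the function below is a hypothetical re-implementation, not rewardkit's actual code):

```python
import difflib
import tempfile
from pathlib import Path

def diff_ratio_sketch(path: str, expected: str) -> float:
    """Hypothetical re-implementation of diff_ratio: similarity between
    file content and expected text, returned as a 0.0-1.0 ratio."""
    actual = Path(path).read_text()
    return difflib.SequenceMatcher(None, actual.strip(), expected.strip()).ratio()

# Quick demonstration against a throwaway file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write("hello world\n")
score = diff_ratio_sketch(f.name, "hello world")  # identical after stripping
```

Identical content scores 1.0; partially matching content lands somewhere in between, which is what makes diff_ratio useful as a soft alternative to the all-or-nothing file_matches.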
Command criteria
| Criterion | Parameters | Description |
|---|---|---|
| command_succeeds | cmd, cwd?, timeout? | Command exits with code 0 |
| command_output_contains | cmd, text, cwd?, timeout? | Command stdout contains the text |
| command_output_matches | cmd, expected, cwd?, timeout? | Command stdout equals the expected text (stripped) |
| command_output_matches_regex | cmd, pattern, cwd?, timeout? | Command stdout matches a regex |
Default timeout is 30 seconds. The cwd parameter is relative to the workspace.
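The semantics of command_succeeds can be sketched with the standard library; this is a hypothetical re-implementation for illustration, not rewardkit's actual code:

```python
import subprocess
import sys

def command_succeeds_sketch(cmd, cwd=None, timeout=30):
    """Hypothetical re-implementation of command_succeeds: run the
    command in a shell and treat exit code 0 as success."""
    try:
        result = subprocess.run(cmd, shell=True, cwd=cwd,
                                capture_output=True, text=True, timeout=timeout)
        return result.returncode == 0
    except subprocess.TimeoutExpired:
        # A command that exceeds the timeout counts as a failure.
        return False
```

For example, `command_succeeds_sketch(f'"{sys.executable}" --version')` returns True, while a command that exits non-zero or times out returns False.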
Data format criteria
| Criterion | Parameters | Description |
|---|---|---|
| json_key_equals | path, key, expected | Top-level JSON key equals a value |
| json_path_equals | path, json_path, expected | Dot-separated path into JSON equals a value |
| csv_cell_equals | path, row, col, expected | CSV cell at row/col equals a value |
| xlsx_cell_equals | path, cell, expected, sheet? | Excel cell equals a value |
| sqlite_query_equals | db_path, query, expected | SQL query result equals a value |
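The dot-separated path that json_path_equals takes can be sketched as follows; this helper is hypothetical (it only walks nested objects, and rewardkit's actual resolver may handle more):

```python
import json
import tempfile

def json_path_get(path, json_path):
    """Hypothetical helper mirroring json_path_equals: walk a
    dot-separated key path into a parsed JSON document."""
    with open(path) as f:
        value = json.load(f)
    for key in json_path.split("."):
        value = value[key]  # assumes nested objects at every step
    return value

# Demonstration against a throwaway JSON file.
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump({"build": {"status": "ok", "warnings": 0}}, f)
status = json_path_get(f.name, "build.status")
```

Here `"build.status"` resolves to `"ok"`, whereas json_key_equals would only see the top-level `"build"` object.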
xlsx_cell_equals requires the office extra: uv add harbor-rewardkit[office]
For csv_cell_equals, row numbering depends on the column type. When col is
an integer, a raw CSV reader is used and row 0 is the header row. When
col is a string (column name), row 0 is the first data row after the
header.
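The two row-numbering schemes correspond to Python's two CSV readers, which this stdlib-only demonstration makes concrete:

```python
import csv
import io

data = "name,score\nalice,10\nbob,7\n"

# Integer column index: raw csv.reader, so row 0 is the header row.
rows = list(csv.reader(io.StringIO(data)))
by_index = rows[1][1]            # row 1, col 1 -> alice's score, "10"

# String column name: header-aware reading, so row 0 is the first data row.
dict_rows = list(csv.DictReader(io.StringIO(data)))
by_name = dict_rows[0]["score"]  # row 0, col "score" -> also "10"
```

Both lookups reach the same cell: row 1 with an integer column and row 0 with a column name address alice's score.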
HTTP criteria
| Criterion | Parameters | Description |
|---|---|---|
| http_status_equals | url, status?, timeout? | HTTP response has the expected status code (default 200) |
| http_response_contains | url, text, timeout? | HTTP response body contains the text |
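The checks above can be sketched with the standard library; this combined helper is a hypothetical re-implementation for illustration (rewardkit's actual HTTP handling, including redirects and error bodies, is not specified here):

```python
import http.server
import threading
import urllib.error
import urllib.request

def check_http_sketch(url, expected_status=200, contains=None, timeout=30):
    """Hypothetical combination of http_status_equals and
    http_response_contains using only the standard library."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
            return resp.status == expected_status and (contains is None or contains in body)
    except urllib.error.HTTPError as exc:
        # Non-2xx responses arrive as exceptions; the status may still match.
        return exc.code == expected_status and contains is None
    except OSError:
        # Connection refused, DNS failure, timeout: treat as a failed check.
        return False

# Self-check against a throwaway local server.
_server = http.server.ThreadingHTTPServer(
    ("127.0.0.1", 0), http.server.SimpleHTTPRequestHandler)
threading.Thread(target=_server.serve_forever, daemon=True).start()
ok = check_http_sketch(f"http://127.0.0.1:{_server.server_address[1]}/")
_server.shutdown()
```

An unreachable URL simply yields a failed check rather than an exception, which is the behavior you want from a scoring criterion.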
Image criteria
| Criterion | Parameters | Description |
|---|---|---|
| image_similarity | path1, path2 | Pixel-level similarity ratio (returns 0.0–1.0) |
| image_size_equals | path, width, height | Image has the expected dimensions |
Image criteria require the image extra: uv add harbor-rewardkit[image]
Trajectory criteria
These criteria inspect the agent's ATIF trajectory file (default path: /logs/trajectory.json).
| Criterion | Parameters | Description |
|---|---|---|
| trajectory_tool_used | tool_name, min_count?, path? | Agent used a specific tool at least min_count times (default 1) |
| trajectory_tool_not_used | tool_name, path? | Agent did not use a specific tool |
| trajectory_turn_count | max_turns, path? | Penalizes exceeding a turn budget: returns 1.0 at max_turns and decays linearly to 0.0 at double the budget |
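The decay described for trajectory_turn_count can be written out as a formula; this is a sketch of the stated behavior, assuming the score also stays at 1.0 below the budget:

```python
def turn_budget_score(turns: int, max_turns: int) -> float:
    """Hypothetical scoring curve for trajectory_turn_count: 1.0 up to
    max_turns, then a linear decay reaching 0.0 at 2 * max_turns."""
    if turns <= max_turns:
        return 1.0
    return max(0.0, 1.0 - (turns - max_turns) / max_turns)
```

With a budget of 10 turns, an agent that uses 15 turns scores 0.5, and anything at or beyond 20 turns scores 0.0.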
Optional extras
| Extra | Criteria | Install |
|---|---|---|
| office | xlsx_cell_equals | uv add harbor-rewardkit[office] |
| image | image_similarity, image_size_equals | uv add harbor-rewardkit[image] |
| all | All of the above | uv add harbor-rewardkit[all] |