harbor

Tasks

Windows tasks

Create and run Harbor tasks inside Windows containers with Docker Desktop in Windows container mode.

Harbor supports running tasks inside Windows containers (e.g., windowsservercore, nanoserver) on a Windows host with Docker Desktop in Windows container mode.

Prerequisites

  • Windows 10/11 Pro, Enterprise, or Education (or Windows Server 2019+) — Windows Home does not support Hyper-V, which is required for Windows containers
  • Docker Desktop installed and set to Windows containers mode (right-click system tray icon → "Switch to Windows containers...")
  • tar.exe available on the host (ships with Windows 10 1803+ at C:\Windows\System32\tar.exe)

How It Works

Declaring the target OS

Tasks declare their target container OS in task.toml under [environment]:

[environment]
os = "windows"   # or "linux" (default)

The framework reads this field at startup and adapts every OS-conditional behavior accordingly:

  • container-side paths (POSIX vs C: drive),
  • script discovery (.sh vs .bat),
  • file transfer (native docker cp on Linux vs tar-over-exec on Windows — docker cp does not work with running Hyper-V isolated Windows containers, so files are streamed through tar piped over docker exec instead),
  • command execution (bash -c vs cmd /S /C),
  • container keepalive command,
  • environment variables (LOGS_DIR).

Omitting the field is equivalent to os = "linux" and preserves the behavior of every existing Linux task.

Daemon-mode and image-OS validation

Two preflight checks run when the environment starts:

  1. Daemon mode: Harbor calls docker info --format "{{.OSType}}" and compares the result to [environment].os. Mismatch → fail fast with a message instructing you to switch Docker Desktop to the matching container mode. A Windows task on a non-Windows host is also rejected.
  2. Image OS: After build/pull but before docker compose up, Harbor runs docker inspect --format "{{.Os}}" <image> and compares it to [environment].os. Mismatch → fail fast indicating the actual image OS.

These checks replace the previous host-only auto-detection and surface configuration mistakes immediately rather than mid-run.

Container-Side Paths

Linux containers use POSIX paths (/logs, /tests, /solution). Windows containers use drive-prefixed paths:

PurposeLinuxWindows
Logs root/logsC:/logs
Agent logs/logs/agentC:/logs/agent
Verifier logs/logs/verifierC:/logs/verifier
Artifacts/logs/artifactsC:/logs/artifacts
Tests/testsC:/tests
Solution/solutionC:/solution

These are provided by EnvironmentPaths.for_windows() and accessed via environment.env_paths.

Volume Mounts

Docker Compose volume mounts traditionally use a short-form colon-separated syntax:

volumes:
  - C:/host/path:/container/path

This works fine on Linux where paths look like /host/path:/container/path — Docker splits on the first colon to separate source from target. But on Windows, paths include a drive letter with a colon (e.g., C:/host/path), so Docker sees C as the source, /host/path as the target, and /container/path as an invalid extra field. This causes errors like:

invalid volume specification: 'C:/git/harbor/jobs/.../verifier:/logs/verifier:rw'

To fix this, the docker-compose base file uses long-form syntax which avoids colon-based parsing entirely:

volumes:
  - type: bind
    source: ${HOST_VERIFIER_LOGS_PATH}
    target: ${ENV_VERIFIER_LOGS_PATH}

Each part is a separate YAML key, so colons in Windows paths are never ambiguous. This syntax works identically on both Linux and Windows.

Additionally, host paths are converted to forward-slash format (C:/git/harbor/... instead of C:\git\harbor\...) since Docker expects forward slashes regardless of the host OS.

File Transfer (tar-over-exec)

Standard docker cp does not work with running Hyper-V isolated Windows containers. Instead, file transfers use tar piped through docker exec:

  • Upload: tar cf - -C <source> . | docker exec -i <container> tar xf - -C <target>
  • Download: docker exec <container> tar cf - -C <source> . | tar xf - -C <target>

This works with both Hyper-V (default) and process isolation modes.

The container is targeted by an explicit name (a random GUID assigned via container_name in the Windows keepalive compose file), avoiding the need to query docker compose ps for a container ID at runtime.

Command Execution

Commands inside Windows containers are executed via cmd /S /C <command> instead of bash -c <command>. This correctly handles all script types.

When a task has multiple script formats, they are discovered in priority order filtered by [environment].os:

osExtensions tried (in order)
linux (default).sh
windows.bat

Within each list the first matching script wins. Because filtering is OS-driven, a task may safely ship cross-platform scripts (e.g. both solve.sh and solve.bat) — the right one is selected automatically.

For .sh scripts, chmod +x is executed as a separate step with user="root" before the script runs. This ensures scripts are executable even when the task configures a non-root agent or verifier user.

The execution shells used per extension are:

  1. .sh → direct execution (callers run chmod +x as root separately)
  2. .batcmd /c <script>

Users who need PowerShell can call it from within a .bat file (e.g. powershell -ExecutionPolicy Bypass -File script.ps1).

Container Keepalive

Linux containers use sh -c "sleep infinity". Windows containers use cmd /C "ping -t localhost > nul" to keep the container running. While ping -t has minor CPU overhead from ICMP packets, alternatives like timeout /t -1 do not work in non-interactive Docker containers.

Environment Variables

The LOGS_DIR environment variable is automatically set to C:\logs inside Windows containers, allowing test scripts to locate the verifier output directory via %LOGS_DIR% (batch).

Writing Windows Tasks

A minimal Windows task structure:

my-task/
├── environment/
│   └── Dockerfile
├── solution/
│   └── solve.bat
├── tests/
│   └── test.bat
├── instruction.md
└── task.toml

Dockerfile

Any Windows-based Docker image should work. Tested base images:

  • mcr.microsoft.com/windows/servercore:ltsc2022
  • python:3.13-windowsservercore-ltsc2022

Example:

FROM mcr.microsoft.com/windows/servercore:ltsc2022
WORKDIR C:\\app

Note: Windows base images are several GB in size. The first pull will take significantly longer than typical Linux images.

Test Scripts

Test scripts must write a reward value to %LOGS_DIR%\verifier\reward.txt:

Batch (.bat):

@echo off
echo 1 > "%LOGS_DIR%\verifier\reward.txt"

Multi-Step Tasks

Windows multi-step tasks use the same per-step layout as Linux tasks, with Batch entrypoints:

my-task/
├── steps/
│   └── grade/
│       ├── instruction.md
│       ├── solution/
│       │   └── solve.bat
│       └── tests/
│           └── test.bat
├── environment/
│   └── Dockerfile
└── task.toml

For each step, Harbor first looks for steps/<step>/tests/test.bat. If a step does not provide its own test script, Harbor falls back to the shared top-level tests/test.bat. When both shared and step-specific test directories exist, shared tests are uploaded first and step-specific tests are uploaded second, so step files can override shared helpers.

Oracle solutions follow the same rule: a step-local steps/<step>/solution/solve.bat is preferred when that solution directory exists. Otherwise, the oracle falls back to top-level solution/solve.bat.

task.toml

Set [environment].os = "windows" to declare the task as Windows-targeted. Everything else is standard:

version = "1.0"

[verifier]
timeout_sec = 300.0

[agent]
timeout_sec = 300.0

[environment]
os = "windows"
build_timeout_sec = 600.0
cpus = 1
memory_mb = 2048
storage_mb = 10240

Omitting os (or setting it to "linux") keeps the task on Linux containers — this is the default for back-compat.

Sample Tasks

A sample hello-world task is included in examples/tasks/, demonstrating Windows batch script support. It creates a greet script at C:\app\ and verifies its output:

SampleScript TypePath
hello-world-bat.bat (Batch)examples/tasks/hello-world-bat/
hello-multi-step-bat.bat (Batch, multi-step)examples/tasks/hello-multi-step-bat/

Run it with the oracle agent:

uv run harbor run -p examples/tasks/hello-world-bat -a oracle --debug

It has been tested and produces Mean: 1.000 (full score).

Limitations

  • Windows host only: Windows containers require a Windows host with Docker Desktop. They cannot run on Linux hosts.
  • Hyper-V isolation: The default isolation mode is Hyper-V. Process isolation is possible but may trigger Windows Defender Controlled Folder Access blocks.
  • No GPU support: Windows containers in Harbor do not support GPU allocation.
  • No internet isolation: The network_mode: none compose override has not been tested with Windows containers.
  • Performance: Windows base images are several GB (vs. tens of MB for Linux), resulting in longer initial pulls and slower first-time builds due to image size and Windows-specific tooling installation.
  • Only Docker environment: Windows tasks are supported only in the local Docker environment, not in cloud providers (Daytona, Modal, E2B, etc.).
  • Agent compatibility: Only agents with SUPPORTS_WINDOWS = True can run Windows tasks. Currently this is limited to oracle and nop. All other installed agents (terminus, claude-code, codex, aider, etc.) rely on Linux-only tools (bash, apt-get, tmux) and will be rejected with a clear error before setup. Custom agents must set SUPPORTS_WINDOWS = True to opt in.
  • CRLF handling: No automatic CRLF→LF conversion is performed for Windows containers (it is only needed for Linux containers running on Windows hosts).

On this page