harbor-add-agent-file-retention

Based on: #1349
Segment: Design-and-build
Type: feature
## Task

Give users a way on the run launcher CLI to control which agent files Harbor
keeps after each trial finishes: keep everything (the current behavior), keep
only the trajectory files that trace export needs, or keep nothing.

## User stories / requirements

- When a user opts to keep only trajectory files, the agent directory is pruned down to exactly the trajectory files in its root.
- When a user opts for the keep-everything policy, everything works as it does today. With the keep-nothing policy, the agent directory is still present but empty.
- The retention options work for remote environments as well.
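
The three policies above can be sketched as a small pruning helper. This is a minimal illustration, not Harbor's implementation: `AgentFileRetention`, `prune_agent_dir`, the policy values, and the `trajectory*.json` pattern are all hypothetical names chosen for the example, and the real pattern list would need to match whatever trace export actually reads.

```python
from enum import Enum
from fnmatch import fnmatch
from pathlib import Path
import shutil


class AgentFileRetention(str, Enum):
    """Hypothetical policy names; the real CLI values may differ."""

    ALL = "all"
    TRAJECTORY = "trajectory"
    NONE = "none"


# Assumed pattern for trajectory files in the agent directory root.
TRAJECTORY_PATTERNS = ("trajectory*.json",)


def prune_agent_dir(agent_dir: Path, policy: AgentFileRetention) -> None:
    """Prune agent_dir in place according to the retention policy."""
    if policy is AgentFileRetention.ALL:
        return  # keep everything: leave the directory untouched

    # For the other policies the directory must exist afterwards, even if empty.
    agent_dir.mkdir(parents=True, exist_ok=True)

    for entry in list(agent_dir.iterdir()):
        keep = (
            policy is AgentFileRetention.TRAJECTORY
            and entry.is_file()
            and any(fnmatch(entry.name, p) for p in TRAJECTORY_PATTERNS)
        )
        if keep:
            continue
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)  # subdirectories are removed wholesale
        else:
            entry.unlink()
```

Only files in the root of the directory are matched, so a trajectory-named file inside a subdirectory is removed along with that subdirectory. For a remote environment the same filter would presumably have to run on the local copy after download, or be applied when choosing what to download; the sketch does not cover that part.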

## General instructions

- The code repo is at /repo/harbor.
- You are inside of a Docker container. You may not be able to perform all operations you would normally be able to do on a local machine. Dependencies have not been pre-installed, and you may need to install them yourself.
- You are expected to act autonomously as a software engineer to complete tasks you are given.
- Do not stop until you feel you have completed the task and your code changes can be merged.
- You may need to use software engineering skills like analyzing the codebase, researching technologies, running services, analyzing logs, etc. to complete the task. Not all tasks will be solvable by reading source code alone.

Agent Results

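| Agent | Tasteful | Basic | Verifier | Validation | Rubric | Bloat | Pract | Taste | Cheated |
|---|---|---|---|---|---|---|---|---|---|
| Oracle | | | 2/2 | 3/3 | 1.00 | 1.0x | 5.0 | 4.0 | |
| Opus 4.7 | | | 2/2 | 3/3 | 1.00 | 1.1x | 3.0 | 3.0 | |
| GPT-5.4 | | | 2/2 | 2/3 | 1.00 | 1.5x | 4.0 | 3.0 | |
| GPT-5.5 | | | 2/2 | 2/3 | 0.50 | 1.0x | 4.0 | 3.0 | |
| Kimi K2.6 | | | 2/2 | 2/3 | 0.50 | 0.7x | 3.0 | 3.0 | |
| Opus 4.8 | | | 2/2 | 2/3 | 1.00 | 0.8x | 3.0 | 3.0 | |
| Sonnet 4.6 | | | 2/2 | 2/3 | 0.50 | 1.5x | 3.0 | 2.0 | |
| Sonnet 5 | | | 2/2 | 2/3 | 1.00 | 1.5x | 3.0 | 3.0 | |
| Gemini 3.1 Pro | | | 0/1 | 2/3 | 0.50 | 0.4x | 2.0 | 2.0 | |
| Gemini 3.5 Flash | | | 0/1 | 2/3 | 0.50 | 0.7x | 3.0 | 2.0 | |
| GLM-5.2 | | | 0/1 | 2/3 | 1.00 | 0.9x | 3.0 | 3.0 | |
| No-Op | | | 2/2 | 0/3 | 0.00 | | | | |
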
Agent details

Verifier Tests

Gemini 3.1 Pro: 0/1

Validation Stories

Gemini 3.1 Pro: 2/3

Rubric Criteria

Gemini 3.1 Pro: 1/2
Fail → Pass
dual_environment_handling
trajectory_patterns_match_trace_export

Taste Scores

Patch Bloat: 0.4x
43 agent / 107 oracle SLOC, 5 / 4 files (raw: 6.1x)

Practice Alignment: 2.0/5

- style consistency: 2
- pattern adherence: 2
- library usage: 3
- abstraction level: 2
- documentation fit: 3

Relative Taste: 2.0/5

- minimality: 3
- approach quality: 2
- hygiene: 2
- fluency: 3
- craftsmanship: 2

Agent Patch