harbor-refactor-optional-sandbox-deps

Based on: #1404, #1446
Segment: Design-and-build
Type: feature
## Task

We're getting lots of complaints about cloud sandbox SDK installs, so we need to add
support for optional installs. That will need a bit of restructuring to get working.

Also we should add proper support for custom registered environments while we're
at it.

## User stories / requirements

- The default Harbor install shouldn't install any cloud sandbox SDKs, and it should still work correctly with local envs.
- When a user tries to invoke a cloud environment without its SDK installed, they should get an error that names the missing package and gives install instructions.
- Custom local environments can be registered with string labels. Things like validation should work correctly for those labels.
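
The requirements above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not Harbor's actual implementation: the names `MissingSandboxSDKError`, `require_sdk`, `register_environment`, and `validate_environment_label`, the module name `hypothetical_cloud_sdk`, and the `harbor[examplecloud]` extra are all hypothetical and do not come from the Harbor codebase.

```python
import importlib
from typing import Callable, Dict, Tuple

# Hypothetical table: cloud environment label -> (SDK module name, pip extra).
# Neither the module nor the extra is a real Harbor name.
_CLOUD_SDKS: Dict[str, Tuple[str, str]] = {
    "examplecloud": ("hypothetical_cloud_sdk", "examplecloud"),
}

# Custom environments registered at runtime under string labels.
_REGISTRY: Dict[str, Callable[[], object]] = {}


class MissingSandboxSDKError(ImportError):
    """Raised when a cloud environment is requested without its SDK."""


def require_sdk(env_label: str):
    """Import the SDK for a cloud environment only when it is needed."""
    module_name, extra = _CLOUD_SDKS[env_label]
    try:
        return importlib.import_module(module_name)
    except ModuleNotFoundError as exc:
        # Name the missing package and give an install hint.
        raise MissingSandboxSDKError(
            f"Environment '{env_label}' requires the '{module_name}' package, "
            f"which is not installed. Install it with: "
            f"pip install 'harbor[{extra}]'"
        ) from exc


def register_environment(label: str, factory: Callable[[], object]) -> None:
    """Register a custom environment factory under a string label."""
    if not isinstance(label, str) or not label:
        raise ValueError("Environment label must be a non-empty string")
    if label in _REGISTRY or label in _CLOUD_SDKS:
        raise ValueError(f"Environment label '{label}' is already in use")
    _REGISTRY[label] = factory


def validate_environment_label(label: str) -> str:
    """Accept cloud labels and registered custom labels; reject the rest."""
    if label in _REGISTRY or label in _CLOUD_SDKS:
        return label
    known = sorted(set(_REGISTRY) | set(_CLOUD_SDKS))
    raise ValueError(f"Unknown environment '{label}'. Known: {known}")
```

Deferring the import to the point of use is what lets a default install omit the SDKs: nothing cloud-specific is loaded until a cloud environment is actually requested, and validation consults the registry rather than a fixed list, so custom labels pass the same check.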

## General instructions

- The code repo is at /repo/harbor.
- You are inside a Docker container. You may not be able to perform all operations you would normally be able to do on a local machine. Dependencies have not been pre-installed, and you may need to install them yourself.
- You are expected to act autonomously as a software engineer to complete tasks you are given.
- Do not stop until you feel you have completed the task and your code changes can be merged.
- You may need to use software engineering skills like analyzing the codebase, researching technologies, running services, analyzing logs, etc. to complete the task. Not all tasks will be solvable by reading source code alone.

## Agent Results

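| Agent | Tasteful | Basic | Verifier | Validation | Rubric | Bloat | Pract | Taste | Cheated |
|---|---|---|---|---|---|---|---|---|---|
| Oracle | | | 2/2 | 3/3 | | 1.0x | 5.0 | 4.0 | |
| GPT-5.4 | | | 2/2 | 2/3 | | 1.9x | 4.0 | 4.0 | |
| GPT-5.5 | | | 2/2 | 1/3 | | 1.5x | 4.0 | 3.0 | |
| Kimi K2.6 | | | 2/2 | 1/3 | | 0.6x | 3.0 | 2.0 | |
| Gemini 3.1 Pro | | | 2/2 | 0/3 | | 0.4x | 3.0 | 3.0 | |
| GLM-5.2 | | | 2/2 | 0/3 | | 0.8x | 2.0 | 2.0 | |
| Opus 4.7 | | | 2/2 | 0/3 | | 0.9x | 3.0 | 3.0 | |
| Opus 4.8 | | | 2/2 | 0/3 | | 1.2x | 3.0 | 2.0 | |
| Sonnet 4.6 | | | 2/2 | 0/3 | | 1.1x | 3.0 | 2.0 | |
| Sonnet 5 | | | 2/2 | 0/3 | | 1.2x | 3.0 | 3.0 | |
| Gemini 3.5 Flash | | | 1/2 | 0/3 | | 0.5x | 2.0 | 2.0 | |
| No-Op | | | 0/2 | 0/3 | | | | | |
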
## Agent details

Agent: Gemini 3.1 Pro

- Verifier Tests: 2/2
- Validation Stories: 0/3

Taste Scores

- Patch Bloat: 0.4x (119 agent / 276 oracle SLOC, 6 / 9 files; raw: 0.9x)
- Practice Alignment: 3.0/5
  - style consistency: 3
  - pattern adherence: 2
  - library usage: 3
  - abstraction level: 3
  - documentation fit: 4
- Relative Taste: 3.0/5
  - minimality: 3
  - approach quality: 3
  - hygiene: 2
  - fluency: 3
  - craftsmanship: 2

## Agent Patch