posthog-feat-prompt-versioning

Segment: Design-and-build

Type: feature

## Task

Let's add named versions for prompts. Right now the LLM-prompt management page treats each prompt as a single mutable object, but we want versioning so that we can support features such as "view previous versions" and parallel edits. Implement the backend (BE) only for now.

## User stories / requirements

- Creating a fresh prompt establishes version 1. The created response carries enough information for callers to see that the new prompt is the only existing version: an integer version equal to 1, and the prompt is treated as the most recent version of its name.
- Editing an existing prompt creates a new immutable version rather than overwriting the original. After a publish, the original version's content is still retrievable when fetched by its version number.
- Fetching a prompt by name returns the latest version's content by default. After a publish, a follow-up fetch reflects the new content. Fetching with a specific version number returns that version's content.
- When a publish is attempted with a stale base-version token (the caller's view of the current version is no longer current), the server rejects it with a 4xx response and adds no new version. The currently-active version is unchanged.
- Archiving a multi-version prompt removes it from active reads. After archive, creating a fresh prompt with the same name succeeds (the name is reusable from a clean slate) and starts at version 1.
- Listing prompts returns one entry per active name (the latest version for that name), not every historical version. Archived prompts do not appear in the list at all.
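The stories above can be sketched as a small in-memory model. This is only an illustration of the intended semantics, not the PostHog implementation: `PromptStore`, `PromptVersion`, `StaleBaseVersionError`, and the `base_version` parameter are hypothetical names, and a real backend would persist versions in the database and map the stale-publish error to a 4xx response.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


class StaleBaseVersionError(Exception):
    """Hypothetical error: the caller's base version is no longer the latest."""


@dataclass(frozen=True)
class PromptVersion:
    # frozen=True makes each stored version immutable once created.
    name: str
    version: int
    content: str


class PromptStore:
    """Hypothetical in-memory store; stands in for the real persistence layer."""

    def __init__(self) -> None:
        # Active prompts only: name -> versions ordered oldest to newest.
        self._active: Dict[str, List[PromptVersion]] = {}
        # Archived histories are kept aside and never served by reads.
        self._archived: List[List[PromptVersion]] = []

    def create(self, name: str, content: str) -> PromptVersion:
        if name in self._active:
            raise ValueError(f"prompt {name!r} already exists")
        first = PromptVersion(name, 1, content)
        self._active[name] = [first]
        return first

    def publish(self, name: str, content: str, base_version: int) -> PromptVersion:
        versions = self._active[name]
        latest = versions[-1]
        if base_version != latest.version:
            # Stale token: reject and add nothing (an API layer might return 4xx).
            raise StaleBaseVersionError(
                f"base version {base_version} != latest {latest.version}"
            )
        new = PromptVersion(name, latest.version + 1, content)
        versions.append(new)  # append only; earlier versions are never modified
        return new

    def get(self, name: str, version: Optional[int] = None) -> PromptVersion:
        versions = self._active[name]  # KeyError if unknown or archived
        if version is None:
            return versions[-1]
        for candidate in versions:
            if candidate.version == version:
                return candidate
        raise KeyError(f"{name!r} has no version {version}")

    def archive(self, name: str) -> None:
        # Remove the whole history from active reads; the name becomes reusable.
        self._archived.append(self._active.pop(name))

    def list_latest(self) -> List[PromptVersion]:
        # One entry per active name: the latest version only.
        return [versions[-1] for versions in self._active.values()]
```

In this sketch, a publish based on version 1 succeeds only while version 1 is still the latest; a second caller holding the same base version gets `StaleBaseVersionError`, and the version list is left unchanged. Archiving moves the full history out of `_active`, so a later `create` with the same name starts again at version 1.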

## General instructions

- The code repo is at /repo/posthog.
- You are inside of a Docker container. You may not be able to perform all operations you would normally be able to do on a local machine. Dependencies have not been pre-installed, and you may need to install them yourself.
- You are expected to act autonomously as a software engineer to complete tasks you are given.
- Do not stop until you feel you have completed the task and your code changes can be merged.
- You may need to use software engineering skills like analyzing the codebase, researching technologies, running services, analyzing logs, etc. to complete the task. Not all tasks will be solvable by reading source code alone.

Agent Results

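| Agent | Tasteful | Basic | Verifier | Validation | Rubric | Bloat | Pract | Taste |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Oracle | | | — | 6/6 | 1.00 | 1.0x | 5.0 | 4.0 |
| GPT-5.4 | | | — | 6/6 | 1.00 | 0.2x | 4.0 | 3.0 |
| GPT-5.5 | | | — | 6/6 | 1.00 | 0.2x | 3.0 | 3.0 |
| Opus 4.7 | | | — | 6/6 | 1.00 | 0.2x | 3.0 | 3.0 |
| Opus 4.8 | | | — | 6/6 | 1.00 | 0.1x | 3.0 | 3.0 |
| Sonnet 4.6 | | | — | 6/6 | 1.00 | 0.3x | 3.0 | 3.0 |
| GLM-5.2 | | | — | 6/6 | 1.00 | 0.2x | 3.0 | 2.0 |
| Gemini 3.1 Pro | | | — | 1/6 | 0.50 | 0.3x | 2.0 | 2.0 |
| Gemini 3.5 Flash | | | — | 0/6 | 0.50 | 0.1x | 2.0 | 2.0 |
| Kimi K2.6 | | | — | 0/6 | 1.00 | 0.2x | 3.0 | 2.0 |
| Sonnet 5 | — | — | — | — | — | — | — | — |
| No-Op | | | — | 1/6 | 0.00 | — | — | — |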

Agent details

Verifier Tests

Gemini 3.1 Pro: 0/0

Validation Stories

Gemini 3.1 Pro: 1/6

Rubric Criteria

Gemini 3.1 Pro: 1/2

Fail → Pass

prompts_remain_immutable_after_publish

Pass → Pass

existing_rows_remain_unmodified_after_publish

Taste Scores

Patch Bloat: 0.3x

311 agent / 1088 oracle SLOC, 9 / 12 files (raw: 0.2x)

Practice Alignment: 2.0/5

- style consistency
- pattern adherence
- library usage
- abstraction level
- documentation fit

Relative Taste: 2.0/5

- minimality
- approach quality
- hygiene
- fluency