Content Automation

Reel.

Short-form video is repetitive, expensive, and slow by hand. Reel is a seven-stage pipeline that takes a topic brief and produces a fully assembled, published short (script, images, render, voice, audio, thumbnail and upload) with two human gates and zero manual production work.

Status: Running in production
Cost per published video: Under £5
Cost cut vs agency: 97%
Models: 5 AI models
Human gates: 2 (script + clip)

The problem

Content at volume costs a fortune.

A single short from an agency costs £150 to £400. A freelancer charges £50 to £100. Either way, daily content at real volume is financially unsustainable for most operations.

But the work isn’t creative in the way that justifies the cost. Script structure follows patterns. Visual styles follow templates. Voice follows tone guides. Audio follows formulas. Every step is describable, repeatable, and therefore automatable.

Reel was built on a premise: the production layer of content should cost almost nothing, leaving human energy for strategy, not execution.

Cost per video

Production agency: £200+
Freelancer: £75
Reel (automated): <£5

How it works

Seven stages. Two human gates.

1. Script Generation
Claude writes a structured five-segment script (hook, setup, build, tension, cliffhanger), capped at 145 words for a 55 to 60 second video, with a visual bible per scene.

Gate 1: Script Approval
The script goes to Telegram before any spend on visuals or voice. Approve or regenerate in one tap. This stops wasted spend on weak scripts.

2. Image Generation
Imagen generates 19 images across the five segments using the script's visual bible (segment-specific palette, camera distance, and lighting).

3. Video Clip Rendering
Veo animates the hook image from a purpose-built motion prompt; the remaining stills get Ken Burns motion scaled by narrative intensity.

4. Voice Synthesis
ElevenLabs voices the approved script. Casting is topic-aware, and speed is tuned to 0.85 to 0.9× for clarity and pacing.

5. Audio Mixing
A four-layer mix is assembled programmatically: music follows a segment envelope, with hard silence punches at the key transitions.

Gate 2: Final Clip Approval
The assembled video goes to Telegram for final review before upload. This catches assembly or visual failures before they reach the channel.

6. Upload & Analytics
Approved videos upload via the YouTube API with metadata and scheduling. Analytics are pulled back and logged against each run.
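The stage-and-gate flow above can be sketched roughly as follows. This is an illustration under stated assumptions, not Reel's actual code: `Stage`, `run_pipeline`, and the `approve` callback are hypothetical names, the Telegram approval is stood in for by a plain Python callable, and retrying a rejected stage up to `max_attempts` times is an assumed policy.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical shapes: each stage takes the artifacts so far and returns an enriched copy.
Artifacts = dict
StageFn = Callable[[Artifacts], Artifacts]
ApproveFn = Callable[[str, Artifacts], bool]  # (gate label, candidate) -> approved?


@dataclass
class Stage:
    name: str
    run: StageFn
    gate: Optional[str] = None  # gate to request after this stage, if any
    max_attempts: int = 3       # assumed retry budget for a gated stage


def run_pipeline(brief: str, stages: list[Stage], approve: ApproveFn) -> Optional[Artifacts]:
    """Run stages in order; a gated stage is regenerated until approved.

    Returns the final artifacts, or None if a gate never approves.
    """
    artifacts: Artifacts = {"brief": brief}
    for stage in stages:
        for _ in range(stage.max_attempts):
            candidate = stage.run(dict(artifacts))  # work on a copy
            if stage.gate is None or approve(stage.gate, candidate):
                artifacts = candidate
                break
        else:
            # Every attempt was rejected: stop before any later spend.
            return None
    return artifacts
```

Placing the gate on the stage definition keeps the approval points visible in one list, so moving or adding a gate is a data change rather than a control-flow change.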

The prompt bridge problem

Claude writes 150-word cinematic scene descriptions; Veo responds to 30-word motion-verb prompts and ignores narrative. The bridge extracts key nouns from Claude's description and fills proven Veo templates, so the two models never need to understand each other.
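A minimal sketch of the bridge idea follows. Everything named here is an assumption for illustration: the `VISUAL_NOUNS` vocabulary stands in for real noun extraction (no part-of-speech tagging is done, and plurals are not handled), and the template text is invented, since the actual Veo templates are not shown.

```python
import re

# Hypothetical vocabulary used as a stand-in for real noun extraction.
VISUAL_NOUNS = {"planet", "ocean", "storm", "crater", "ship", "city", "forest", "star", "moon", "desert"}

# Hypothetical motion template; the real templates are not part of this write-up.
DEFAULT_TEMPLATE = "Slow push in on the {subject}, {detail} looming behind, camera drifts left"


def key_nouns(description: str, limit: int = 2) -> list[str]:
    """Return up to `limit` known visual nouns, in order of first appearance."""
    found: list[str] = []
    for word in re.findall(r"[a-z]+", description.lower()):
        if word in VISUAL_NOUNS and word not in found:
            found.append(word)
        if len(found) == limit:
            break
    return found


def bridge(description: str, template: str = DEFAULT_TEMPLATE, max_words: int = 30) -> str:
    """Turn a long narrative description into a short motion prompt."""
    nouns = key_nouns(description)
    if not nouns:
        raise ValueError("no known visual nouns found in description")
    subject = nouns[0]
    detail = nouns[1] if len(nouns) > 1 else "light"
    prompt = template.format(subject=subject, detail=detail)
    # Enforce the word budget by truncating anything past max_words.
    return " ".join(prompt.split()[:max_words])
```

Only the extracted nouns cross the boundary, so narrative phrasing in the source description never reaches the motion prompt.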

Domain collision fix

When a cosmic story uses biological vocabulary, the image model drifts to medical microscopy instead of alien terrain. An anti-collision layer detects the crossover and reframes the prompt with explicit geological framing before generation.
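One way such a check could look is sketched below. The term lists and the framing sentence are hypothetical examples rather than Reel's actual vocabulary, and simple keyword overlap is an assumed detection method.

```python
import re

# Hypothetical vocabularies for the two domains that collide.
COSMIC_TERMS = {"planet", "alien", "galaxy", "asteroid", "orbit", "nebula", "moon", "cosmic"}
BIO_TERMS = {"cell", "cells", "membrane", "organism", "spores", "tissue", "veins", "bacteria"}

# Hypothetical framing text that pushes the image model toward terrain.
GEOLOGICAL_FRAMING = (
    "Wide-angle landscape of alien terrain, geological rock formations, "
    "mineral surfaces at planetary scale"
)


def has_collision(prompt: str) -> bool:
    """True when a prompt mixes cosmic and biological vocabulary."""
    words = set(re.findall(r"[a-z]+", prompt.lower()))
    return bool(words & COSMIC_TERMS) and bool(words & BIO_TERMS)


def reframe(prompt: str) -> str:
    """Prefix geological framing only when a domain collision is detected."""
    if not has_collision(prompt):
        return prompt
    return f"{GEOLOGICAL_FRAMING}. {prompt}"
```

Prompts that stay within one domain pass through unchanged, so the layer only intervenes on the crossover case described above.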

Technology

Five models. One coherent pipeline.

Reel orchestrates five AI models, each handling what it’s actually best at. Claude handles language and structure. Imagen handles image generation. Veo handles motion. ElevenLabs handles voice. FFmpeg handles assembly and audio engineering.

Each stage has a defined input and output schema, so a model can be swapped or added without rewriting the adjacent stages.
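As a rough illustration of what a defined input and output schema could look like, the sketch below types one stage boundary. The names `Scene`, `ImageResult`, `ImageStage`, and `PlaceholderImages` are hypothetical, and the fields are assumptions, not Reel's real schema.

```python
from __future__ import annotations

import json
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Scene:
    """Assumed output of the script stage, one entry per scene."""
    segment: str
    narration: str
    visual_prompt: str


@dataclass(frozen=True)
class ImageResult:
    """Assumed output of the image stage."""
    segment: str
    path: str


class ImageStage(Protocol):
    """Any backend with this method can fill the image stage."""
    def generate(self, scenes: list[Scene]) -> list[ImageResult]: ...


class PlaceholderImages:
    """Stand-in backend; a real one would call an image model."""
    def generate(self, scenes: list[Scene]) -> list[ImageResult]:
        return [
            ImageResult(scene.segment, f"out/{i:02d}_{scene.segment}.png")
            for i, scene in enumerate(scenes)
        ]


def parse_scenes(raw_json: str) -> list[Scene]:
    """Load structured JSON into typed scenes; raises TypeError on missing or extra fields."""
    return [Scene(**item) for item in json.loads(raw_json)]


def run_images(stage: ImageStage, scenes: list[Scene]) -> list[ImageResult]:
    return stage.generate(scenes)
```

Because `run_images` depends only on the `ImageStage` protocol, a different image backend can be dropped in as long as it accepts `Scene` objects and returns `ImageResult` objects.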

Claude API (Opus), Imagen (Vertex AI), Veo (Vertex AI), ElevenLabs API, Python, FFmpeg, YouTube Data API, Telegram Bot API, JSON structured outputs, cost tracking per run

What we learned

The failures that taught the most.

Different models need different prompt formats. Taking a description written by Claude and passing it straight to Veo does not work, because each model has its own instruction vocabulary. The prompt bridge was the most valuable engineering in the pipeline.

Video length determines performance more than content. Videos under 62 seconds consistently outperformed those over 90 seconds, regardless of topic. The 145-word cap came straight from that data.
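As a rough worked example of how the word cap maps to runtime: 145 words over a 55 to 60 second video is about 2.4 to 2.6 words per second. The check below assumes a flat 2.5 words per second, which is an estimate derived from those figures rather than a measured rate, and `check_script` is a hypothetical helper.

```python
SEGMENTS = ("hook", "setup", "build", "tension", "cliffhanger")
MAX_WORDS = 145
WORDS_PER_SECOND = 2.5  # assumption: 145 words over roughly 58 seconds


def word_count(text: str) -> int:
    return len(text.split())


def check_script(script: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means the script passes."""
    problems: list[str] = []
    missing = [name for name in SEGMENTS if not script.get(name, "").strip()]
    if missing:
        problems.append(f"missing segments: {', '.join(missing)}")
    total = sum(word_count(script.get(name, "")) for name in SEGMENTS)
    if total > MAX_WORDS:
        problems.append(f"{total} words exceeds cap of {MAX_WORDS}")
    return problems


def estimated_seconds(total_words: int, words_per_second: float = WORDS_PER_SECOND) -> float:
    """Estimate narration length from a word count at an assumed speaking rate."""
    return total_words / words_per_second
```

At the assumed rate, a script at the cap comes out at 58 seconds, which sits under the 62-second threshold mentioned above.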

Posting 3 to 4 videos a day cannibalises your own distribution. Each Short gets an initial algorithmic test audience. Flooding the channel dilutes it. One video a day at the right time beat four overnight.

Human gates are a design choice, not a failure of automation. Two approval gates add five minutes and prevent every class of catastrophic quality failure. They exist by design.
