Runway Gen-4 Prompt Guide: Write Prompts for Cinematic AI Video
Runway Gen-4 works differently from Sora, Kling, or Veo. It generates video from an input image plus a text prompt: the image sets the visual foundation, and the prompt’s main job is to describe motion. Misunderstanding that division of labor is the most common mistake new users make.
This guide covers how to write Gen-4 prompts that produce cinematic results, based on the official Runway documentation and practical testing.
How Gen-4 Prompts Work
Gen-4 is an image-to-video model. You provide a reference image and a text prompt, and the model generates a 5- or 10-second video clip. This is fundamentally different from text-to-video models, where the prompt describes everything from scratch.
Because the image already defines composition, lighting, color palette, subjects, and style, your prompt should focus on one thing: what moves and how.
Reiterating visual details that already exist in the image — “a woman with brown hair wearing a red dress in a modern kitchen” — can actually reduce the quality of the output. Gen-4 gets confused when the prompt contradicts or redundantly describes what’s in the image. Instead, describe the action: “She picks up a coffee mug and takes a sip, steam rising from the cup.”
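The same division of labor applies if you generate through Runway’s developer API instead of the web app. Here’s a minimal sketch using the official runwayml Python SDK; the model ID, parameter names, and status values reflect my reading of the public API, so verify them against the current API reference before relying on this.

```python
import time
from runwayml import RunwayML

client = RunwayML()  # reads RUNWAYML_API_SECRET from the environment

task = client.image_to_video.create(
    model="gen4_turbo",
    prompt_image="https://example.com/kitchen-reference.jpg",  # sets the look
    prompt_text=(
        "She picks up a coffee mug and takes a sip, "
        "steam rising from the cup."
    ),  # motion only: no hair color, no outfit, no room description
    ratio="1280:720",  # supported ratios are listed in the API docs
    duration=5,        # 5 or 10 seconds
)

# Generation is asynchronous: poll the task until it finishes.
while True:
    task = client.tasks.retrieve(task.id)
    if task.status in ("SUCCEEDED", "FAILED"):
        break
    time.sleep(10)

if task.status == "SUCCEEDED":
    print(task.output)  # URL(s) of the generated clip
```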
The Prompt Structure That Works
Gen-4 prefers complete sentences over keyword lists. Write like you’re describing a scene to another person, not tagging an image.
Good prompt format:
The camera slowly pushes in as the woman turns her head toward the window, soft afternoon light catching her face.
Bad prompt format:
slow zoom, woman turning, window light, cinematic, 4K, dramatic
The keyword-list approach works for image generators. For Gen-4 video, complete sentences produce better motion, smoother transitions, and more natural-looking results.
One Action Per Prompt
Gen-4 generates short clips — 5 or 10 seconds. That’s enough time for one clear action, not three. Overloading a prompt with multiple actions causes the model to either rush through all of them (producing jerky, unnatural movement) or ignore some entirely.
One action (good):
He slowly opens the old leather-bound book, dust particles floating in the shaft of light from the window.
Multiple actions (bad):
He opens the book, reads the first page, closes it, and walks to the window to look outside while it starts raining.
If you need a sequence of actions, generate separate clips for each and edit them together. Five focused 5-second clips create a better 25-second sequence than one overloaded 10-second prompt.
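The stitching step is easy to script. Below is a minimal sketch that joins already-downloaded clips with ffmpeg’s concat demuxer; the file names are placeholders, and it assumes every clip shares the same resolution and codec, which holds for clips generated by the same model at the same settings.

```python
import subprocess
from pathlib import Path

clip_paths = [
    "01_opens_book.mp4",
    "02_turns_page.mp4",
    "03_closes_book.mp4",
    "04_walks_to_window.mp4",
    "05_rain_begins.mp4",
]

# The concat demuxer reads a list file with one `file '...'` line per clip.
Path("clips.txt").write_text("\n".join(f"file '{p}'" for p in clip_paths))

subprocess.run(
    ["ffmpeg", "-f", "concat", "-safe", "0",
     "-i", "clips.txt", "-c", "copy", "book_sequence.mp4"],
    check=True,
)
```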
Camera Motion Prompts
Camera movement is one of Gen-4’s strongest capabilities. You can describe standard cinematic camera moves and the model reproduces them well:
- “The camera slowly pushes forward” — a dolly-in effect
- “The camera pulls back to reveal the full scene” — a dolly-out / reveal shot
- “The camera pans left across the cityscape” — a horizontal pan
- “The camera tracks alongside the runner” — a tracking shot
- “The camera tilts up from the street to the rooftops” — a vertical tilt
Combine camera movement with subject movement for more dynamic results:
The camera tracks alongside the cyclist as she rounds the corner, the city lights blurring in the background.
Avoid asking for camera movements that fight the image composition. If your reference image is a tight close-up, asking for a wide establishing shot will produce artifacts. Match the camera move to what the image can support.
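If you’re writing many shots, it can help to keep the camera move and the subject action as separate pieces and compose them into one sentence. The helper below is purely illustrative, not a Runway feature; it just enforces the complete-sentence, camera-plus-subject structure this section describes.

```python
def motion_prompt(camera_move: str, subject_action: str, detail: str = "") -> str:
    """Compose one camera move and one subject action into a full sentence."""
    prompt = f"The camera {camera_move} as {subject_action}"
    if detail:
        prompt += f", {detail}"
    return prompt + "."

print(motion_prompt(
    "tracks alongside the cyclist",
    "she rounds the corner",
    "the city lights blurring in the background",
))
# -> The camera tracks alongside the cyclist as she rounds the corner,
#    the city lights blurring in the background.
```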
Mood and Atmosphere
You can influence the mood of the generated video through your prompt, even though the image sets the base look:
- “Gentle breeze moves through the curtains” — adds subtle environmental motion that makes static rooms feel alive
- “Rain begins to fall softly” — adds weather as a mood element
- “The light shifts as clouds pass overhead” — creates natural lighting changes
- “Fog drifts slowly across the ground” — adds atmospheric depth
These environmental details work well because they add motion without requiring complex subject animation. Gen-4 handles particle effects (rain, snow, dust, fog, steam) reliably.
What Gen-4 Does Well
Based on testing and community results, Gen-4 excels at:
- Subtle human motion — head turns, hand gestures, facial expressions, walking
- Camera movement — smooth dolly, pan, track, and tilt shots
- Environmental effects — rain, wind, fog, lighting changes, water ripples
- Fabric and hair movement — clothing rippling in the wind, hair swaying naturally
- Atmospheric shifts — time-of-day transitions, cloud movement, fire/candlelight flicker
What Gen-4 Struggles With
These areas consistently produce less reliable results:
- Complex hand interactions — picking up small objects, typing, playing instruments
- Multiple characters interacting — conversations, handshakes, group activities
- Fast action — running, fighting, sports, explosions
- Physics-dependent motion — liquid pouring, objects falling, mechanical movement
- Text in video — signs, screens, and written words tend to distort
Plan your prompts around the model’s strengths. If you need a character picking up a coffee cup, frame the shot so the hand interaction is implied rather than the focal point — or use a reference image where the hand is already near the cup.
Iterative Prompting Strategy
Runway recommends an iterative approach: start simple, then add complexity one element at a time.
Iteration 1: “The woman turns her head to the right.”
Iteration 2: “The woman slowly turns her head to the right, a slight smile forming.”
Iteration 3: “The woman slowly turns her head to the right, a slight smile forming as warm light catches her face.”
Each iteration adds one new element. If something breaks, you know exactly which addition caused the problem. This is faster than debugging a complex prompt that fails for unclear reasons.
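If you batch generations through a script, keeping the iterations as an ordered list makes that debugging property explicit: each entry builds on the previous one by a single addition. A trivial sketch:

```python
# The three iterations above as data. Each entry adds one new element to
# the previous prompt, so a regression points at the newest addition.
iterations = [
    "The woman turns her head to the right.",
    "The woman slowly turns her head to the right, a slight smile forming.",
    "The woman slowly turns her head to the right, a slight smile forming "
    "as warm light catches her face.",
]

for i, prompt in enumerate(iterations, start=1):
    # Generate a clip with this prompt and review it before moving on.
    print(f"Iteration {i}: {prompt}")
```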
Negative Prompts Don’t Work
Gen-4 does not support negative prompting. Phrases like “no blur,” “don’t zoom out,” or “avoid shaky camera” will produce unpredictable results — sometimes the exact opposite of what you want.
Instead of telling Gen-4 what not to do, describe what you want positively:
- Instead of “no shaky camera” → “smooth, steady camera movement”
- Instead of “don’t make it too fast” → “slow, deliberate motion”
- Instead of “no dark shadows” → “even, soft lighting throughout”
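If you maintain a prompt library, a small lookup table can catch negative phrasing before it reaches the model. The mapping below is illustrative, seeded with the three rewrites above:

```python
# Illustrative lookup of negative phrasing -> positive rewrites.
POSITIVE_REWRITES = {
    "no shaky camera": "smooth, steady camera movement",
    "don't make it too fast": "slow, deliberate motion",
    "no dark shadows": "even, soft lighting throughout",
}

def to_positive(phrase: str) -> str:
    """Return the positive rewrite if one is known, else the phrase as-is."""
    return POSITIVE_REWRITES.get(phrase.lower().strip(), phrase)

print(to_positive("No shaky camera"))  # -> smooth, steady camera movement
```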
Using Gen-4 for Different Styles
Cinematic / Film Look
Focus on slow, deliberate camera movement with shallow depth of field. Use reference images with film-like lighting (golden hour, practical lights, high-contrast shadows).
The camera slowly pushes in as he stares into the distance, the background gently falling out of focus.
Documentary Style
Describe observational camera work — handheld-feeling movement, natural reactions, candid moments.
She looks up from her work and brushes hair from her face, the camera observing from across the table.
Product Showcase
For product videos, start with a clean product photo and describe a rotation or reveal:
The camera orbits slowly around the watch, catching reflections on the crystal face as it rotates.
Abstract / Artistic
Gen-4 handles abstract motion well. Use reference images with graphic or artistic compositions and describe fluid transformations:
The colors swirl and blend slowly, the shapes morphing into new patterns as the light pulses gently.
Generating Prompts Faster
Writing detailed motion prompts for every shot in a project gets tedious. If you’re producing multiple AI video clips — for a short film, product video series, or social content — a prompt generator can save significant time.
LzyPrompt generates structured prompts for Runway Gen-4, Sora, Kling, Veo, and other AI video tools. Describe what you want in plain language, and it produces a detailed, model-specific prompt with camera direction, motion description, and mood cues. It’s built for people who know what they want to create but don’t want to spend 10 minutes crafting each prompt.
Generate your first prompt free →
FAQ
What’s the difference between Runway Gen-3 and Gen-4?
Gen-4 is an image-to-video model, while Gen-3 Alpha supported both text-to-video and image-to-video. Gen-4 produces significantly better motion consistency, more natural-looking human movement, and smoother camera work. The trade-off is that Gen-4 requires an input image — you can’t generate video from text alone.
How long are Runway Gen-4 videos?
Gen-4 generates clips in 5-second and 10-second durations. For longer content, generate multiple clips and edit them together. Each clip should cover one clear action or camera movement. A 30-second sequence typically requires 3-6 individual clips.
Can I use Runway Gen-4 commercially?
Yes. Runway’s paid plans include commercial usage rights for generated content. Check their current terms of service for specifics, as licensing terms can change. The Standard, Pro, and Unlimited plans all include commercial rights.
Why does my Gen-4 video have barely any motion?
This usually happens when your prompt describes what the image looks like instead of what should move. Remove visual descriptions and focus entirely on action and motion. Short, direct prompts like “she slowly smiles and tilts her head” produce more motion than long descriptive prompts about the scene’s appearance.