Portrait + audio → short talking video drafts

P Video Avatar

Create avatar-style short videos from a portrait image and an audio track — built for social clips, talking concepts, product updates, announcements, and lightweight presenter drafts you can put together right in the browser.

✨ 60 free signup credits · Credits do not expire

Console fields: portrait image, driving audio, video prompt, resolution, audio duration (seconds), and seed (-1 for random).

Avatar workflow

Turn a still portrait into a speaking video draft

P Video keeps avatar creation deliberately lightweight. Instead of writing a scene from scratch, you start from a face and a voice: upload a rights-cleared portrait, attach the driving audio, keep the video prompt to a short phrase, and generate a brief talking-style draft you can preview and download in a few moments.

That makes Avatar the fastest path from a written update to something that looks and feels spoken. It is tuned for the moment before a real recording — when you want to see how a talking concept reads on screen, test the pacing of a delivery, or share a presenter-style draft in a thread or a deck without booking a shoot or briefing talent.

Avatar is one of three focused workflows. For clips that start from a sentence or a reference image with no audio at all, use the P Video AI Video Generator. When the goal is to drive a still image with reference motion rather than a voice, switch to P Video Animate. When the source is already a video and the subject needs a replacement reference, use P Video Replace. Every workflow keeps the same light console rhythm, so moving between them feels like one product rather than separate tools.

What you provide

Three inputs shape an avatar draft

Avatar exposes only the inputs that matter for a talking-portrait clip — a face, a voice, and a light touch of direction.

🖼️

Portrait image

Start from a clear, rights-cleared portrait with a stable subject, even lighting, and a face that reads well at small sizes. A clean frontal or near-frontal shot gives the avatar workflow the most to work with.

🎙️

Driving audio

Add the audio track that should drive the clip. Keep it concise while you are testing so each generation stays quick to review and easy to compare against the take before it.

✍️

Video prompt

Describe the expression, energy, or subtle motion in a short phrase rather than a full script. A compact prompt — calm delivery, slight head movement, steady framing — keeps the result anchored on the portrait.

Workflow

From portrait to talking draft in four steps

Upload the portrait

Pick a rights-cleared image with a readable face and stable framing. This is the subject the avatar draft is built around, so a clean, well-lit shot pays off across every later iteration.

Add the driving audio

Attach the audio track that should lead the clip. Shorter audio keeps tests fast and cost predictable while you settle on the portrait, the tone, and the framing you want.

Keep the prompt simple

Use a short phrase for expression and motion instead of a long description. One clear direction reads more naturally in a brief avatar clip than several competing instructions.

Generate and review

Run the draft, preview the result, and keep the takes that land. Saved generations stay in your library so you can compare versions before committing to one direction.

Inside the console

A face, a voice, and a short direction

Avatar pairs your portrait with the driving audio and a compact prompt, then renders a brief talking-style clip. Per-second rates are listed on P Video Pricing.

Avatar inputs

Portrait image · driving audio · resolution: 720p or 1080p

⚡ Billed per second. Trim the audio to keep tests fast and cheap.

Example video prompt: Calm, conversational delivery, slight natural head movement, steady framing, soft studio light.

Result: 🗣️ Talking draft

Prompting

Direction that keeps a talking draft natural

A few habits keep avatar clips steady and readable while you iterate on the portrait and audio.

🎯

Name the delivery

Lead with the tone you want — calm, upbeat, conversational — so the expression matches the audio rather than fighting it. A single clear delivery note keeps short avatar clips readable.

🪞

Keep motion subtle

Call out gentle, intentional movement like a slight head tilt or natural blink instead of large gestures. Subtle motion holds the likeness steadier across the length of a short draft.

🔊

Match audio to length

Trim the audio to the segment you actually want to test. Because avatar generation is billed per second, shorter audio means faster, cheaper passes while you iterate on the look.

🔁

Change one thing at a time

Adjust a single variable per take — the portrait, the audio, or the expression phrase — so you can read what changed. Focused iterations converge faster than rewriting everything at once.

Use cases

Where avatar drafts help most

Lightweight talking-style clips for the updates, explainers, and concepts you make many of, quickly.

📣

Announcements

Assemble a quick talking-style clip for a product update, release note, or community post without booking a shoot.

🧑‍🏫

Explainers

Pair a portrait with narration to walk through a feature, a concept, or an onboarding step in a lightweight draft.

📱

Social updates

Draft presenter-style short clips for feeds and stories, then judge pacing and delivery before a full edit.

🏷️

Product demos

Add a friendly face and a voice to a short demo idea so a flat script feels closer to a finished segment.

💬

Internal notes

Turn a written update into a short spoken draft for a team thread, a review, or an async stand-in.

🎤

Presenter drafts

Test how a talking concept feels on screen before scheduling a recording or briefing a presenter.

Pricing

Avatar rates are based on audio duration

P Video Avatar costs 6 credits per second at 720p and 12 credits per second at 1080p, so a trimmed audio track keeps each test inexpensive. New accounts start with 60 free credits, credits do not expire, and you can compare credit packs on P Video Pricing.
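
As a rough illustration of the per-second rates above, the sketch below estimates the credit cost of a single draft. The helper name `avatar_cost` is hypothetical, and rounding partial seconds up is an assumption; this page states only the per-second rates, so check P Video Pricing for how fractional seconds are actually billed.

```python
import math

# Per-second rates quoted on this page (credits per second).
RATES = {"720p": 6, "1080p": 12}


def avatar_cost(audio_seconds: float, resolution: str = "720p") -> int:
    """Estimate credits for one draft (hypothetical helper, not an official API)."""
    if resolution not in RATES:
        raise ValueError(f"unknown resolution: {resolution!r}")
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    # Assumption: partial seconds are rounded up to the next whole second.
    return math.ceil(audio_seconds) * RATES[resolution]


print(avatar_cost(10, "720p"))   # 60
print(avatar_cost(10, "1080p"))  # 120
```

Under these assumptions, the 60 free signup credits cover about 10 seconds of 720p output or 5 seconds at 1080p.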


FAQ

Avatar questions

What does P Video Avatar need to generate a clip?

P Video Avatar works from a portrait image plus a driving audio track, with an optional short video prompt for expression or subtle motion. You upload the portrait, attach the audio, add a brief direction if you want one, and generate a short talking-style draft in the browser.

How are P Video Avatar credits calculated?

Avatar generation is billed per second of output, and output length follows the audio duration: 6 credits per second at 720p and 12 credits per second at 1080p. Because cost scales with duration, short audio keeps each test inexpensive while you refine the portrait and delivery. Credit packs are listed on P Video Pricing.

What kind of portrait works best?

Use a clear, rights-cleared image with a stable subject, even lighting, and a face that stays readable at small sizes. A clean frontal or near-frontal shot gives the avatar workflow the most consistent base to build a short draft around.

Can I upload any person's photo or voice?

No. Only upload faces, portraits, audio, and likenesses you have the rights and consent to use. Uploading someone else's image or voice without permission is not allowed, and rights-infringing or impersonating content is prohibited under the Terms of Service.

Should I generate at 720p or 1080p?

Use 720p while you experiment with the portrait, audio, and expression so passes stay fast and affordable. Switch to 1080p once the direction is locked and a draft is worth a higher-quality render. You only pay the higher per-second rate when a take is ready.
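
To make the trade-off concrete, here is a small sketch that totals the credits for a few 720p test passes followed by one 1080p final render. The function `iteration_budget` and the example durations are hypothetical; only the two per-second rates come from this page, and whole-second durations are assumed.

```python
# Per-second rates quoted on this page (credits per second).
RATE_720P = 6
RATE_1080P = 12


def iteration_budget(test_seconds: int, test_passes: int, final_seconds: int) -> dict:
    """Credits for several 720p tests plus one 1080p final (illustrative only)."""
    tests = test_seconds * test_passes * RATE_720P
    final = final_seconds * RATE_1080P
    return {"tests": tests, "final": final, "total": tests + final}


# Three 5-second tests at 720p, then one 10-second final at 1080p.
plan = iteration_budget(test_seconds=5, test_passes=3, final_seconds=10)
print(plan)  # {'tests': 90, 'final': 120, 'total': 210}
```

Running the same three tests at 1080p instead would cost 180 credits rather than 90, for a total of 300 instead of 210.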

Do I need to install anything to use P Video Avatar?

No. P Video Avatar runs entirely in the browser as part of the online product. There is no local model download and no GPU setup for the online workflow — you upload your inputs, choose your settings, and generate.

How long are my avatar drafts kept?

Generated works are retained in your library for 6 months, so you can revisit, compare, and download earlier avatar drafts while you iterate. New accounts also start with 60 free signup credits, and credits do not expire.

How is Avatar different from the other P Video tools?

Avatar is the audio-driven, talking-portrait workflow. If you only have an idea or a still image and no audio, the P Video AI Video Generator handles text and image to video. If you want to drive a still image with reference motion instead of audio, P Video Animate is the better fit. If the source is already a video and the subject needs a replacement reference, use P Video Replace.
