ressive compression, and maintain consistent styling across views.
```typescript
interface ReferenceView {
  id: string;
  angle: 'front' | 'profile' | 'three_quarter' | 'close_up' | 'full_body';
  url: string;
  metadata: {
    lighting: string;
    resolution: { width: number; height: number };
    occlusion: string[];
  };
}

class ReferenceCollageBuilder {
  private views: ReferenceView[] = [];

  addView(view: ReferenceView): this {
    if (this.views.length >= 5) {
      throw new Error('Maximum 5 views supported for Seedance 2 reference injection');
    }
    this.views.push(view);
    return this;
  }

  generateCollageManifest(): Record<string, unknown> {
    return {
      reference_type: 'multi_angle_collage',
      view_count: this.views.length,
      views: this.views.map(v => ({
        id: v.id,
        angle: v.angle,
        url: v.url,
        constraints: ['preserve_facial_geometry', 'maintain_skin_tone', 'consistent_hairstyle']
      })),
      output_format: 'png',
      compression_level: 'lossless'
    };
  }
}
```
Architecture Decision: We enforce a 5-view maximum because Seedance 2's reference encoder optimizes for compact spatial bundles. Exceeding this threshold dilutes attention weights and increases inference latency. Lossless PNG output prevents JPEG artifacts from corrupting facial feature extraction.
Step 2: Prompt Engineering & Reference Injection
Seedance 2 uses the @reference token to bind the visual identity to the generation request. The prompt must explicitly declare identity preservation constraints while describing motion, environment, and camera behavior.
```typescript
interface VideoPromptConfig {
  referenceToken: string;
  subjectAction: string;
  environment: string;
  lighting: string;
  cameraMovement: string;
  consistencyDirectives: string[];
  styleModifiers: string[];
}

class SeedancePromptBuilder {
  private config: VideoPromptConfig;

  constructor(config: VideoPromptConfig) {
    this.config = config;
  }

  build(): string {
    const base = `Generate a cinematic video of ${this.config.referenceToken} ${this.config.subjectAction} in ${this.config.environment}.`;
    const lighting = `Lighting: ${this.config.lighting}.`;
    const camera = `Camera: ${this.config.cameraMovement}.`;
    // Join with '; ' so the directives stay separate clauses instead of running together.
    const consistency = this.config.consistencyDirectives.join('; ');
    const style = this.config.styleModifiers.join(', ');
    return `${base} ${lighting} ${camera} Identity constraints: ${consistency}. Style: ${style}.`;
  }
}

// Usage Example
const prompt = new SeedancePromptBuilder({
  referenceToken: '@reference',
  subjectAction: 'walking confidently through a neon-lit urban corridor',
  environment: 'modern city street at night',
  lighting: 'volumetric neon reflections with soft rim lighting',
  cameraMovement: 'slow dolly-in with shallow depth of field',
  consistencyDirectives: [
    'maintain identical facial geometry across all frames',
    'preserve hairstyle and skin tone from reference',
    'no identity drift between shots',
    'consistent character proportions throughout clip'
  ],
  styleModifiers: [
    'cinematic color grading',
    'high-detail skin texture',
    'realistic motion blur',
    'photorealistic rendering'
  ]
}).build();
```
Architecture Decision: Explicit consistency directives are placed after the core scene description. Diffusion models prioritize early tokens for composition and late tokens for refinement. By isolating identity constraints in a dedicated clause, we prevent them from competing with motion or environment tokens in the attention matrix.
Step 3: Temporal Consistency Validation
Before committing to full-resolution generation, run a low-res preview pass. Validate frame-to-frame feature variance using structural similarity metrics or manual inspection. If drift exceeds the threshold, adjust the collage composition or tighten the prompt constraints.
```typescript
interface GenerationResult {
  status: 'success' | 'drift_detected' | 'prompt_conflict';
  previewUrl: string;
  metrics: {
    identityStability: number; // 0-1 scale
    motionCoherence: number;
    promptAlignment: number;
  };
}

function validateTemporalConsistency(result: GenerationResult): boolean {
  const STABILITY_THRESHOLD = 0.85;
  if (result.metrics.identityStability < STABILITY_THRESHOLD) {
    console.warn('Identity drift detected. Recommend: add profile view to collage or tighten consistency directives.');
    return false;
  }
  return true;
}
```
Architecture Decision: Validation happens before high-cost generation. Seedance 2 supports preview modes that consume fewer tokens. Catching drift early prevents wasted compute and accelerates iteration cycles.
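The text does not say how an `identityStability` score is produced. One possible approach, sketched below, is to average the cosine similarity between a reference face embedding and the embedding of each preview frame. This swaps structural similarity for embedding similarity, and it assumes the embeddings come from some external face-embedding model that is not part of this guide. The names `Embedding`, `cosineSimilarity`, and `identityStability` are illustrative only.

```typescript
// Assumption: each embedding is a fixed-length numeric vector produced by an
// external face-embedding model (hypothetical, not specified in this guide).
type Embedding = readonly number[];

function cosineSimilarity(a: Embedding, b: Embedding): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error('Embeddings must be non-empty and of equal length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  // A zero vector carries no direction, so treat it as completely dissimilar.
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Mean similarity of every frame to the reference, clamped to the 0-1 scale
// used by the stability threshold.
function identityStability(reference: Embedding, frames: readonly Embedding[]): number {
  if (frames.length === 0) {
    throw new Error('At least one frame embedding is required');
  }
  const total = frames.reduce(
    (sum, frame) => sum + Math.max(0, cosineSimilarity(reference, frame)),
    0
  );
  return total / frames.length;
}
```

Comparing every frame against the reference, rather than against the previous frame, keeps slow cumulative drift visible. Frame-to-frame comparison would score a gradual morph as stable.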
Pitfall Guide
1. Single-View Dependency
Explanation: Relying on one portrait forces the model to hallucinate unseen angles. The attention mechanism lacks geometric anchors, causing facial features to morph as the camera moves.
Fix: Always aggregate 3–5 distinct angles. Include at least one profile and one three-quarter view to establish depth.
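A coverage check along these lines could run before the collage is built. It is a minimal sketch: `checkAngleCoverage` and `CoverageReport` are hypothetical names, and treating front, profile, and three-quarter as the required set is an assumption taken from the configuration template later in this guide.

```typescript
type Angle = 'front' | 'profile' | 'three_quarter' | 'close_up' | 'full_body';

// Assumption: these three angles are the minimum needed to anchor depth.
const REQUIRED_ANGLES: readonly Angle[] = ['front', 'profile', 'three_quarter'];

interface CoverageReport {
  ok: boolean;
  missing: Angle[];
  distinctCount: number;
}

function checkAngleCoverage(angles: readonly Angle[]): CoverageReport {
  const distinct = new Set<Angle>(angles);
  const missing = REQUIRED_ANGLES.filter(angle => !distinct.has(angle));
  return {
    // Every required angle is present and the 5-view cap is respected.
    ok: missing.length === 0 && angles.length <= 5,
    missing,
    distinctCount: distinct.size,
  };
}
```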
2. Lighting & Style Mismatch
Explanation: If reference images use drastically different lighting or color grading, the model averages the inputs, resulting in washed-out or conflicting skin tones.
Fix: Normalize reference images to a consistent lighting profile before collage generation. Use neutral, even illumination as the baseline.
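If each view carries a lighting label (as in the `metadata.lighting` field shown earlier), a simple label comparison can flag views that need normalizing. This is only a sketch: it assumes the labels are assigned consistently by whoever tags the images, and it does not inspect pixel data. `findLightingOutliers` is a hypothetical helper.

```typescript
interface ViewLighting {
  id: string;
  lighting: string;
}

// Returns the ids of views whose lighting label differs from the most common label.
function findLightingOutliers(views: readonly ViewLighting[]): string[] {
  const normalize = (label: string): string => label.trim().toLowerCase();

  const counts = new Map<string, number>();
  for (const view of views) {
    const key = normalize(view.lighting);
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }

  let dominant = '';
  let best = 0;
  counts.forEach((count, label) => {
    if (count > best) {
      dominant = label;
      best = count;
    }
  });

  return views.filter(view => normalize(view.lighting) !== dominant).map(view => view.id);
}
```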
3. Over-Constrained Motion Prompts
Explanation: Adding excessive motion directives (e.g., "running, jumping, turning head, waving") competes with identity preservation tokens. The model prioritizes motion, dropping facial consistency.
Fix: Limit motion to one primary action. Use secondary modifiers for subtle gestures. Keep identity constraints in a separate clause.
4. Ignoring Aspect Ratio & Resolution Alignment
Explanation: Seedance 2 expects reference and output dimensions to align. Mismatched ratios cause cropping artifacts or forced stretching, breaking facial proportions.
Fix: Match collage resolution to target output (e.g., 1080x1920 for vertical, 1920x1080 for horizontal). Pre-resize references before injection.
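The alignment check and the crop needed to fix a mismatch reduce to a little arithmetic. The sketch below assumes a center crop is acceptable for the subject's framing. The function names and the 1% tolerance are illustrative choices, not platform requirements.

```typescript
interface Size {
  width: number;
  height: number;
}

interface CropRect extends Size {
  x: number;
  y: number;
}

// True when the two aspect ratios differ by no more than the relative tolerance.
function aspectRatiosMatch(reference: Size, output: Size, tolerance = 0.01): boolean {
  const refRatio = reference.width / reference.height;
  const outRatio = output.width / output.height;
  return Math.abs(refRatio - outRatio) / outRatio <= tolerance;
}

// Largest centered rectangle inside `source` that has the target aspect ratio,
// so the image can be resized afterwards without stretching.
function centerCropTo(source: Size, target: Size): CropRect {
  const targetRatio = target.width / target.height;
  const sourceRatio = source.width / source.height;
  if (sourceRatio > targetRatio) {
    // Source is too wide: keep full height, trim the sides.
    const width = Math.round(source.height * targetRatio);
    return { x: Math.floor((source.width - width) / 2), y: 0, width, height: source.height };
  }
  // Source is too tall (or already matches): keep full width, trim top and bottom.
  const height = Math.round(source.width / targetRatio);
  return { x: 0, y: Math.floor((source.height - height) / 2), width: source.width, height };
}
```

For a 2000x2000 reference and a 1080x1920 vertical target, `centerCropTo` yields a 1125x2000 region, which can then be scaled down to the output size.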
5. Reference Image Compression Artifacts
Explanation: JPEG compression introduces blocking artifacts around eyes, lips, and hair edges. The model interprets these as facial features, embedding them into the generation.
Fix: Export references as lossless PNG, or as WebP at quality 90 or higher. Avoid social media downloads, which apply aggressive recompression.
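File extensions are easy to get wrong after a download or rename, so a pre-flight check can read the file signature instead. The sketch below uses the standard PNG, JPEG, and RIFF/WebP magic bytes. The helper names are hypothetical. It identifies only the container, so it cannot tell whether a WebP file was encoded at a high enough quality.

```typescript
type ImageFormat = 'png' | 'jpeg' | 'webp' | 'unknown';

function hasSignature(bytes: Uint8Array, signature: number[], offset = 0): boolean {
  if (bytes.length < offset + signature.length) return false;
  return signature.every((value, i) => bytes[offset + i] === value);
}

function detectImageFormat(bytes: Uint8Array): ImageFormat {
  // PNG: 89 50 4E 47 0D 0A 1A 0A
  if (hasSignature(bytes, [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])) return 'png';
  // JPEG: FF D8 FF
  if (hasSignature(bytes, [0xff, 0xd8, 0xff])) return 'jpeg';
  // WebP: "RIFF", a 4-byte size field, then "WEBP"
  if (
    hasSignature(bytes, [0x52, 0x49, 0x46, 0x46]) &&
    hasSignature(bytes, [0x57, 0x45, 0x42, 0x50], 8)
  ) {
    return 'webp';
  }
  return 'unknown';
}

// Rejects JPEG and unrecognized data; accepts the two formats recommended above.
function isAcceptableReference(bytes: Uint8Array): boolean {
  const format = detectImageFormat(bytes);
  return format === 'png' || format === 'webp';
}
```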
6. Neglecting Temporal Smoothing Parameters
Explanation: Some platforms expose motion strength or temporal consistency sliders. Leaving them at default values can cause jitter or frame blending that obscures facial details.
Fix: Set temporal smoothing to medium-high for character-focused clips. Lower values increase motion freedom but sacrifice identity stability.
7. Prompt Syntax Misalignment
Explanation: Placing @reference mid-sentence or after style modifiers dilutes its binding strength. The model may treat it as a background element rather than the primary subject.
Fix: Make @reference the subject of the opening sentence, placed immediately before the action verb. Example: "video of @reference walking...", not "video walking of @reference...".
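Placement can be linted before a prompt is submitted. The check below is a crude heuristic and rests on assumptions: prompts are written in English, the action is phrased as an "-ing" verb directly after the token (as in "video of @reference walking..."), and the first period ends the opening sentence. `checkReferencePlacement` is a hypothetical helper, not part of any Seedance tooling.

```typescript
interface PlacementCheck {
  ok: boolean;
  reason?: string;
}

function checkReferencePlacement(prompt: string, token = '@reference'): PlacementCheck {
  const tokenIndex = prompt.indexOf(token);
  if (tokenIndex === -1) {
    return { ok: false, reason: 'reference token is missing' };
  }

  // The token should bind the subject in the opening sentence, not in a later clause.
  const firstSentenceEnd = prompt.indexOf('.');
  if (firstSentenceEnd !== -1 && tokenIndex > firstSentenceEnd) {
    return { ok: false, reason: 'reference token appears after the opening sentence' };
  }

  // Heuristic: the word right after the token should be the action, e.g. "walking".
  const rest = prompt.slice(tokenIndex + token.length);
  const match = /^\s*([A-Za-z]+)/.exec(rest);
  const nextWord = match?.[1] ?? '';
  if (!/ing$/i.test(nextWord)) {
    return { ok: false, reason: 'reference token is not directly followed by the action' };
  }

  return { ok: true };
}
```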
Production Bundle
Action Checklist
Decision Matrix
| Scenario | Recommended Approach | Why | Cost Impact |
|---|---|---|---|
| Single promotional clip (5s) | Multi-angle collage + standard prompt | Fast iteration, no training overhead | Low (token-based) |
| Episodic character series | Collage + saved prompt templates + consistency validation | Ensures cross-episode identity retention | Medium (preview passes + storage) |
| High-fidelity brand avatar | Fine-tuned LoRA + multi-view reference | Maximum identity stability for commercial use | High (training compute + dataset curation) |
| Rapid prototyping / mood boards | Single reference + aggressive motion prompts | Speed prioritized over consistency | Low (minimal tokens) |
Configuration Template
```json
{
  "pipeline": "seedance2_identity_stable",
  "reference": {
    "type": "multi_angle_collage",
    "max_views": 5,
    "required_angles": ["front", "profile", "three_quarter"],
    "format": "png",
    "compression": "lossless",
    "resolution_alignment": "output_matched"
  },
  "prompt": {
    "reference_token": "@reference",
    "structure": "subject_action -> environment -> lighting -> camera -> identity_constraints -> style",
    "consistency_directives": [
      "maintain identical facial geometry across all frames",
      "preserve hairstyle and skin tone from reference",
      "no identity drift between shots",
      "consistent character proportions throughout clip"
    ],
    "motion_limit": "single_primary_action"
  },
  "validation": {
    "preview_mode": true,
    "identity_stability_threshold": 0.85,
    "fallback_strategy": "add_profile_view_or_tighten_directives"
  }
}
```
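A template like this is easy to break when edited by hand, so it may be worth validating after loading. The sketch below covers only the fields that can be checked mechanically. `PipelineConfig` and `validatePipelineConfig` are hypothetical names, and the accepted formats (PNG or WebP) follow the pitfall guide rather than any documented platform rule.

```typescript
// Hypothetical typed view of the mechanically checkable parts of the template.
interface PipelineConfig {
  reference: { max_views: number; required_angles: string[]; format: string };
  validation: { preview_mode: boolean; identity_stability_threshold: number };
}

const KNOWN_ANGLES = ['front', 'profile', 'three_quarter', 'close_up', 'full_body'];

// Returns a list of human-readable problems; an empty list means the config passed.
function validatePipelineConfig(config: PipelineConfig): string[] {
  const errors: string[] = [];
  const { max_views, required_angles, format } = config.reference;
  const threshold = config.validation.identity_stability_threshold;

  if (!Number.isInteger(max_views) || max_views < 1 || max_views > 5) {
    errors.push('reference.max_views must be an integer between 1 and 5');
  }
  if (required_angles.length > max_views) {
    errors.push('reference.required_angles cannot exceed reference.max_views');
  }
  for (const angle of required_angles) {
    if (KNOWN_ANGLES.indexOf(angle) === -1) {
      errors.push(`unknown angle: ${angle}`);
    }
  }
  if (format !== 'png' && format !== 'webp') {
    errors.push('reference.format must be "png" or "webp"');
  }
  if (!(threshold > 0 && threshold <= 1)) {
    errors.push('validation.identity_stability_threshold must be in (0, 1]');
  }
  return errors;
}
```

Returning every problem at once, instead of throwing on the first, lets a single run report all the fields that need fixing.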
Quick Start Guide
- Prepare References: Collect 3–5 images of your subject. Ensure coverage of front, side, and three-quarter angles. Export as PNG at target resolution.
- Generate Collage: Use an image aggregator or AI collage tool to combine views into a single reference asset. Verify no compression artifacts or lighting mismatches.
- Construct Prompt: Make the @reference token the subject of the opening sentence, placed immediately before the action verb. Append explicit identity preservation directives. Keep motion focused on one primary action.
- Run Preview: Generate a low-resolution draft. Check frame-to-frame facial stability. If drift occurs, add a missing angle to the collage or tighten consistency constraints.
- Commit to Production: Once the preview's identity stability score reaches 0.85 or higher, trigger full-resolution generation. Save the collage-prompt pair for reuse in subsequent clips.
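The preview and commit steps above amount to a small decision rule. The sketch below adds a retry cap, which is an assumption and not something the guide specifies. `decideNextStep`, `NextStep`, and the default of three attempts are illustrative.

```typescript
type NextStep = 'commit' | 'retry_preview' | 'abort';

interface PreviewMetrics {
  identityStability: number; // 0-1 scale
}

// attempt is 1-based: pass 1 after the first preview, 2 after the second, and so on.
function decideNextStep(
  metrics: PreviewMetrics,
  attempt: number,
  maxAttempts = 3, // Assumption: cap retries so a bad reference set cannot loop forever.
  threshold = 0.85
): NextStep {
  if (metrics.identityStability >= threshold) {
    return 'commit';
  }
  // Below threshold: adjust the collage or directives and preview again, up to the cap.
  return attempt < maxAttempts ? 'retry_preview' : 'abort';
}
```

An `abort` result is a signal to revisit the reference images themselves instead of continuing to spend preview passes on prompt tweaks.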