ChatGPT 2.0 created the raw sheet music. Seedance 2.0 generates the vocals. Seedance 2.0 can't read notes from an image, but you can upload 15 seconds of audio and get it done. Its vocal generation is insane.
Seedance 2.0 is used here for AI music video generation via audio upload + TTS. The workflow: ChatGPT generates sheet music → convert to audio → upload to Seedance 2.0 → Seedance generates the vocal track synced to music. The result is a fully AI-generated music video.
The video-generation prompt describes the music video's visual direction; the clip demonstrates Seedance 2.0's ability to generate vocals from audio and sync them with visual storytelling.
Note: Prompt details are in the first comment of the original tweet (reply chain).
A new experiment with Seedance 2.0 on @mitte_ai: FACS.
For this video, I only used FACS codes in the prompt. I didn't describe the facial expressions in plain language at all.
FACS (Facial Action Coding System) is a system for describing facial expressions using individual muscle movements called Action Units (AUs), instead of general emotion labels like "happy" or "sad." It breaks the face into controllable components such as brow movement, eyelid tension, lip movement and cheek activation.
Even though it didn't follow all 14 Action Units perfectly, it still interpreted most of them surprisingly well.
Seedance 2.0 Prompt for the first 15s:
Use the provided character @[image1] as the fixed identity reference.
15s, 1:1, 14 beats, beat-synced, cinematic tight close-up, subtle neutral background, high facial clarity, slow micro push-in, shallow depth of field.
1: AU10
2: AU20
3: AU22
4: AU23
5: AU27
6: AU28
7: AU45
8: AU53
9: AU61
10: AU62
11: AU64
12: AU85
13: AU84
14: AU46
Uneasy, hypnotic, controlled mood. No monster transformation, no gore, no comedy, no text overlay, no watermark.
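The numbered beat list in the prompt above can be generated from a plain list of AU codes, which makes it easier to swap codes in and out between experiments. The sketch below only builds text; it does not call any Seedance API. The function names are hypothetical, and the evenly spaced beat windows are an assumption (the prompt only says "14 beats, beat-synced").

```python
# FACS codes from the prompt above, in beat order.
AU_SEQUENCE = [
    "AU10", "AU20", "AU22", "AU23", "AU27", "AU28", "AU45",
    "AU53", "AU61", "AU62", "AU64", "AU85", "AU84", "AU46",
]


def beat_lines(au_codes):
    """Return prompt lines in the form '1: AU10', numbered from 1."""
    return [f"{i}: {code}" for i, code in enumerate(au_codes, start=1)]


def beat_windows(n_beats, duration_s):
    """Return (start, end) times per beat, ASSUMING evenly spaced beats."""
    step = duration_s / n_beats
    return [(round(i * step, 2), round((i + 1) * step, 2)) for i in range(n_beats)]


def build_prompt(header, au_codes, footer):
    """Join the header, the numbered beat list and the footer into one prompt."""
    return "\n".join([header, *beat_lines(au_codes), footer])


if __name__ == "__main__":
    header = "15s, 1:1, 14 beats, beat-synced, cinematic tight close-up"
    footer = "Uneasy, hypnotic, controlled mood."
    print(build_prompt(header, AU_SEQUENCE, footer))
    # Under the even-spacing assumption, each beat lasts about 1.07 seconds.
    print(beat_windows(len(AU_SEQUENCE), 15.0)[0])
```

With 14 beats in 15 seconds, each Action Unit gets roughly one second on screen, which may partly explain why not every unit is rendered clearly.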
A super simple workflow: Your Image → Custom GPT Image 2 storyboard → Seedance 2 on @renoiseai
My Custom GPT Link: https://t.co/THguEGt0GU
Prompt:
(0–1s) Cinematic wide shot of an exhausted 25-year-old South Asian woman with messy black hair walking alone on a dark empty street at night, dim streetlights, moody atmosphere, slow tracking shot from behind, psychological horror, realistic, film grain.
(1–2s) Extreme close-up of an exhausted 25-year-old South Asian woman with messy hair and dark circles, tired eyes looking down, dim lighting, subtle tension, psychological horror style, realistic skin texture, shallow depth of field.
(2–3.5s) Low angle POV shot from street looking up at a third-floor apartment window at night, a tired South Asian woman standing in the lit window, eerie glow, unsettling atmosphere, slow subtle zoom in, psychological horror.
(3.5–4.5s) Cinematic shot of a confident version of a 25-year-old South Asian woman standing in a brightly lit window at night, gently waving with a warm smile, perfect hair, glowing skin, eerie and unsettling, slow zoom in, horror atmosphere.
(4.5–5.5s) Close-up of the exhausted woman's face in shock, eyes wide open, mouth slightly open, looking up at the window, dramatic lighting, intense fear, psychological horror, realistic.
(5.5–6.5s) Dynamic shot of the tired woman running fast towards an apartment building at night, panicked expression, motion blur, dark moody lighting, intense atmosphere.
(6.5–7.5s) Close-up of a trembling hand pushing open an old apartment door at night, dim light coming from inside, tense and eerie, slow motion, psychological horror.
(7.5–9s) Medium shot as the door slowly opens revealing a small apartment at night, the tired woman standing in shock in the doorway, another smiling version of herself standing calmly inside, dramatic contrast lighting, high tension.
(9–10.5s) Cinematic two-shot of two identical 25-year-old South Asian women facing each other in a dimly lit room, one exhausted and scared, the other calm and smiling, intense eye contact, psychological horror, slow subtle camera movement.
(10.5–12s) Extreme close-up of two pairs of identical eyes staring at each other, one tired and frightened, the other calm and unsettling, dramatic lighting, intense psychological tension, film grain.
(12–13.5s) Close-up of the confident version of the woman slowly smiling wider, her smile becoming slightly creepy, cold eyes, eerie atmosphere, slow zoom in, horror style.
(13.5–15s) Final cinematic shot of the exhausted woman staring in horror at her smiling double, then freeze frame, text appears "WINDOW SELF", dark moody color grade, psychological horror ending, film grain.
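A timestamped storyboard like the one above is easy to break when shots are edited by hand, for example by leaving a gap or overrunning the 15-second clip. The following is a minimal sketch of a checker for labels in the "(start–end s)" style used above. The function names are hypothetical, and the check is not part of any Seedance or GPT tooling.

```python
import re

# Matches labels such as "(0–1s)" or "(13.5–15s)"; accepts an en dash or a hyphen.
SEGMENT_RE = re.compile(r"^\((\d+(?:\.\d+)?)[–-](\d+(?:\.\d+)?)s\)")


def parse_segments(lines):
    """Extract (start, end) pairs in seconds from lines that begin with a time label."""
    segments = []
    for line in lines:
        match = SEGMENT_RE.match(line.strip())
        if match:
            segments.append((float(match.group(1)), float(match.group(2))))
    return segments


def check_timeline(segments, total_s, tol=1e-9):
    """Return a list of problems: gaps, overlaps, empty shots, or a wrong total length."""
    problems = []
    expected_start = 0.0
    for start, end in segments:
        if abs(start - expected_start) > tol:
            problems.append(f"shot starts at {start}s, expected {expected_start}s")
        if end <= start:
            problems.append(f"shot {start}-{end}s has no duration")
        expected_start = end
    if abs(expected_start - total_s) > tol:
        problems.append(f"timeline ends at {expected_start}s, expected {total_s}s")
    return problems


if __name__ == "__main__":
    labels = [
        "(0–1s)", "(1–2s)", "(2–3.5s)", "(3.5–4.5s)", "(4.5–5.5s)", "(5.5–6.5s)",
        "(6.5–7.5s)", "(7.5–9s)", "(9–10.5s)", "(10.5–12s)", "(12–13.5s)", "(13.5–15s)",
    ]
    print(check_timeline(parse_segments(labels), total_s=15.0))
```

Run on the twelve labels from the storyboard above, the checker reports no problems: the shots are contiguous and end at exactly 15 seconds.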
The individual image frames have been assembled into a music video. This is an emotionally rich, MV-style production featuring English lyrics and a touching, ballad-like aesthetic.
A lively and humorous American female influencer is hosting a livestream product promotion, holding a quirky and adorable "ChillPill Talking Stress Toy" shaped like a cute pill with a smiling face. The toy lights up and makes funny, sarcastic voice responses when squeezed. She speaks in fluent English with a natural American accent, using a playful, comedic, and highly engaging tone like a viral livestream entertainer. She laughs, exaggerates expressions, maintains eye contact with the camera, and reacts dramatically while demonstrating the product.
She says:
"Okay this is literally the funniest thing I've bought all year 😭—you squeeze it and it roasts you when you're stressed! Like—listen to this—" squeezes toy, it says something sarcastic
"EXCUSE ME?? Not the attitude 💀 But honestly, it actually makes you feel better. And right now on this livestream, it's only $19—like… why is this cheaper than my coffee??"
The scene is styled like a chaotic, fun e-commerce livestream interface. Floating chat comments appear on the left (e.g., "LMAOO I NEED THIS 😂", "Wait what did it say??", "Add to cart immediately"), along with exaggerated gift animations like exploding hearts, confetti, and meme-style emojis. On the right side, there's a colorful product info card with the price, product name, and a flashing "Buy Now" button. The livestream viewer count shows "320K+ watching".
Camera shot: medium close-up with quick zoom-ins during funny moments and reactions.
Lighting: bright and vibrant with slightly colorful highlights to match the comedic vibe.
Overall style: playful, meme-worthy, TikTok-style chaotic energy with a polished livestream aesthetic.
Language: English
Accent: American
Lip sync: enabled
Tone: funny, exaggerated, energetic
Speaking speed: fast, expressive (livestream comedy style)
Cinematic horror scene, abandoned hospital at night, cold fluorescent lighting, wet reflective floor, peeling walls, rusted wheelchairs, distant metallic sounds, tense realistic atmosphere, photorealistic, high contrast, slow suspense, no comedy, no gore, hard cuts only
Shot 1: Empty hospital corridor, flickering fluorescent lights, static wide shot, damp floor reflections, deep darkness at the far end
Cut to Shot 2: Medium shot of a frightened woman slowly backing away, breathing heavily, looking past camera in fear
Cut to Shot 3: Close-up of her eyes widening as the lights flicker rapidly
Cut to Shot 4: Static shot of an old wheelchair slightly moving on its own in the hallway
Cut to Shot 5: Long corridor again, a tall shadow figure briefly appears under a blinking light at the end
Cut to Shot 6: Close-up of the woman turning suddenly, terrified expression, harsh light flicker
Cut to Shot 7: Final wide shot, corridor empty again, silence, one light still blinking
Image1 is the main character; maintain consistent facial features and body type throughout. The main character appears only once in every frame: no duplicates, no red-haired people in the crowd. Cinematic time-freeze short film, 15 seconds, ultra-realistic, Arri Alexa Mini shooting texture, 50mm lens, natural daylight, hard shadows, shallow depth of field.
[0:00-0:03] Busy cobblestone street in an Italian old town, normal time flow. Steadicam front-facing medium shot tracking: the main character wearing a loose linen shirt tucked into high-waisted jeans and white sneakers walks confidently through the crowd. Pedestrians walk, check phones, chat; a flock of pigeons flies across the bright sunny sky in the distance. As she walks, she raises her right hand and snaps her fingers.
[0:03-0:06] The instant of the snap, a powerful white spherical shockwave bursts from her fingertips, carrying visible air distortion and light refraction, spreading rapidly in all directions...
A fearless tiny man stands in the path of a colossal giant carrying a mountain-sized boulder on his shoulder, raising his hand and shouting for him to stop - the giant glances down briefly but keeps marching forward with unstoppable force - Cinematic low-angle tracking shot, vast rocky desert filled with scattered boulders and dust clouds, golden sunlight casting dramatic shadows, intense scale contrast between both characters - Ground trembling with each giant footstep, rocks cracking beneath his feet, dust swirling through the air, epic fantasy atmosphere - Deep cinematic bass hits, rumbling footsteps, distant wind howls, ultra realistic, 4K blockbuster scene.
Cinematic continuous single-take shot, IMAX film simulation, Panavision C-series 35mm lens, f/4, strong backlighting, heavy shadows, desaturated grey colors. The camera starts with a low-angle shot from behind a battle-worn warrior sitting motionless on a massive black warhorse on dark soil under a dull grey sky. He is wearing black warrior armor. The camera smoothly rises toward his right hand as he violently pulls the reins. The horse rears up, instantly triggering a dramatic bullet-time slow-motion effect. Black-red flames violently erupt from the horse, burning nearby alien insects to ash. The rider's face, mostly in shadow, undergoes a painful, raw transformation: his previously normal eyes ignite with a piercing white-gold flash, leaking thick black smoke. His right palm splits open, erupting with dark blood and twisting nerve-like fibers that solidify into a brutal halberd made of smoke and living tissue. The horse's body fractures with glowing red cracks, channeling lava-like energy up into the rider. Heavy, uneven black bio-mechanical armor pieces aggressively slam onto his body with bursts of red light and intense shockwaves, tearing away his worn clothes. A dark, organic-metal mask violently grows across his face, forming uneven, flickering compound eyes and flaming horns. As bullet time suddenly ends, the camera circles closely into fast-paced brutal combat. The rider aggressively thrusts and sweeps the halberd, impaling and shattering enemies with trails of black smoke and flames. The heavy warhorse spins and charges, crushing alien insects beneath its hooves. Suddenly, the violence ceases.
Tags: seedance, seedance2, cinematic, warrior, transformation, slow motion, bullet-time, dark, fantasy, combat
A 19-year-old woman with micro braids dyed deep indigo, warm brown skin, wearing a shredded athletic wrap top and cargo trousers - hook: extreme snap zoom to her eyes, pupils dilate, she exhales smoke in cold air, then launches a dynamic surge directly into frame. Combat takes place in an abandoned symmetrical library - towering bookshelves as leading lines, books exploding outward with every forceful impact, paper filling the air as environmental debris. Her movement vocabulary: rhythmic loops of continuous forward momentum, never retreating, each strike building on the last with no recovery given. Recoil and recovery mechanics exaggerated on every block - shelves bend, spines crack, the architecture deforms around the exchange. Velocity ramp on the mid-fight escalation peak: dual simultaneous strikes slow to crystal clarity then detonate at full speed. Vertical stakes: shelves begin falling like dominoes, the fighters ascending the collapsing structure. Unexpected ending: she wins the exchange, opponent is down - she looks around at the total destruction, reaches into the rubble, pulls out a specific undamaged book, tucks it under her arm, and walks out through a hole in the wall. Texture duality of leather and paper against brutalist stone, practical debris physics, warm amber library light shifting to cold daylight, cinematic 4K.