
@nicolaus-huang nicolaus-huang commented Mar 19, 2025

Inference Scaling

Implements an inference-time scaling method inspired by "Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps": spend more compute during sampling to get better results. Use it by specifying the sampling option.

torchrun --nproc_per_node 4 --standalone scripts/diffusion/inference.py configs/diffusion/inference/t2i2v_768px_inference_scaling.py --save-dir samples --dataset.data-path assets/texts/sora.csv 
Original
num_subtree=3
num_scaling_steps=5
num_noise=1
time=16min

num_subtree=7
num_scaling_steps=8
num_noise=1
time=1h
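
For readers unfamiliar with the idea, here is a toy sketch of search-based inference scaling. It is not the code in this PR: `denoise_step`, `verifier`, and `scaled_sample` are hypothetical stand-ins, the "sample" is a single float, and the roles given to `num_subtree` (candidates branched per step) and `num_scaling_steps` (steps at which branching happens) are an assumed reading of the parameter names. `num_noise` is not modeled.

```python
import random


def denoise_step(x, noise):
    # Hypothetical stand-in for one denoising step:
    # pull the sample halfway toward 0 and add a small perturbation.
    return 0.5 * x + 0.1 * noise


def verifier(x):
    # Hypothetical stand-in for a quality score (higher is better).
    return -abs(x)


def scaled_sample(x0, num_subtree, num_scaling_steps, seed=0):
    """Greedy search: at each scaling step, branch into `num_subtree`
    candidates with different noise and keep the best-scoring one."""
    rng = random.Random(seed)
    x = x0
    evaluations = 0
    for _ in range(num_scaling_steps):
        candidates = [
            denoise_step(x, rng.gauss(0.0, 1.0)) for _ in range(num_subtree)
        ]
        evaluations += num_subtree
        x = max(candidates, key=verifier)
    return x, evaluations


if __name__ == "__main__":
    for subtree, steps in [(3, 5), (7, 8)]:
        x, cost = scaled_sample(1.0, subtree, steps)
        print(f"num_subtree={subtree} num_scaling_steps={steps} "
              f"final={x:.4f} evaluations={cost}")
```

In this toy, the number of candidate evaluations is `num_subtree * num_scaling_steps`, so the two settings above cost 15 and 56 evaluations respectively. The real implementation may account for cost differently.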


@Devindelarocka Devindelarocka left a comment

Cinematic Video Prompt
Aspect Ratio: 16:9 horizontal
Style: Photorealistic, raw found-footage aesthetic
Camera: First-person perspective (FPV), selfie-style, low angle looking up at the character
Device Simulation: iPhone handheld, slight shake for realism

Scene Concept
Setting:

A lively beach party at sunset in Maui. Golden light reflects off the ocean, tiki torches flicker in the background, and reggae music drifts from a nearby beach bar.
People are laughing in the distance, surfboards stacked against a palm tree, and the faint sound of waves crashing adds depth.

Character:

A Grizzly Bear with a surfer vibe: bright floral Hawaiian shirt, colorful surfer shorts, mirrored sunglasses.
He’s holding a half-empty coconut drink with a tiny umbrella and a slice of pineapple.

Camera Movement:

Starts with the camera held low, angled upward toward the bear’s face.
Slight wobble as if the person filming is tipsy.
Occasional lens flare from the sunset for realism.

Expression Progression (8 seconds):

Curious – Bear leans in, squints at the camera, tilts his head.
Drunk/Intoxicated – He sways, grins lazily, tongue slightly out, sunglasses slipping.
Wide-eyed Panic – Suddenly startled by a crab crawling on his foot.
Surprised & Dumbfounded – Mouth agape, sunglasses slide down his nose, coconut spills slightly.

Dialogue (Lip-Synced, Jamaican Reggae Accent, No Subtitles):

Bear:
“Ey mon… you ever seen a bear surf? Ha! Dis coconut talkin’ louder than me head, mon… Whoa—what’s dat?! Crab attack! Jah bless, dis beach is wild!”

Soundscape:

Ambient reggae beats in the background.
Ocean waves, distant laughter, and a sudden “snap” sound when the crab appears.
Bear’s voice is deep, rhythmic, and playful with exaggerated Jamaican inflection.

3 participants