StableDiffusion


PSA: You can get Kling/Runway quality video locally with cogvideo if you upscale the video and interpolate the frames (old.reddit.com)

submitted 3 weeks ago by bot@lemmit.online to c/stablediffusion@lemmit.online


This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/Historical-Action-13 on 2024-09-24 19:32:06+00:00.

I don't know why nobody is talking about this. You can run AI video editing software locally, so you aren't paying for failed attempts online, and basically bring your video up to studio quality. It uses AI to guess the missing frames, ramping the frame rate up from CogVideo's default of 8 fps.

You can also use ffmpeg to rip the frames and make the last frame of your refined output video the first frame of a new video, effectively giving you unlimited video length as well.
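To make the frame-chaining idea concrete, here is a minimal sketch in Python. It only builds the ffmpeg command and picks the last frame by name. The helper names (`rip_frames_cmd`, `last_frame`) and the `frame_%05d.png` naming pattern are my own assumptions for illustration, not something from the original workflow.

```python
def rip_frames_cmd(video: str, out_dir: str) -> list[str]:
    """Build an ffmpeg command that writes every frame as a zero-padded PNG.

    out_dir must already exist; ffmpeg does not create it.
    """
    pattern = f"{out_dir}/frame_%05d.png"  # hypothetical naming pattern
    return ["ffmpeg", "-i", video, pattern]


def last_frame(frame_names: list[str]) -> str:
    """Return the final frame of a clip.

    Zero-padded names sort in playback order, so the largest name
    is the last frame that was written.
    """
    if not frame_names:
        raise ValueError("no frames found")
    return max(frame_names)
```

You would run the command with something like `subprocess.run(rip_frames_cmd("clip1.mp4", "frames"), check=True)`, then feed the file returned by `last_frame` back in as the starting image for the next clip.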

Edit: here's a more detailed guide:

1. Either use Flux to generate an image in the same aspect ratio as 720x480 (3:2), or paste a real photo into MS Paint and expand the canvas to that same aspect ratio. Don't stretch the image; just leave white space. You can keep the resolution low at 720x480, but if you are using Flux I suggest making it higher to capture more detail during the initial generation. Cog will downscale it regardless, so there's no need to make your original small.
2. Use the Comfy workflow for cogvideo5b image to video. Keep your prompt simple and limit it to one major action per video. For example, if your source photo is a paladin, don't make the prompt "a paladin unsheathes his sword and slashes a rock" because it will probably fuck that up. Just do "a paladin unsheathes his sword", and then you can take the last frame of the output video and make it the first frame of your next iteration.
3. Use ffmpeg to rip the frames. Pick the last GOOD frame (before shit goes acid trip) and delete the frames after it.
4. Load your last good frame into Cog again, and now use the prompt "a paladin slashes a rock with his sword". Since the sword is already unsheathed in your starting frame, this will be simpler for Cog.
5. Use ffmpeg again and add the new frames to your collection. Make sure you are using good sequential naming with ffmpeg so they don't get out of order; ask ChatGPT if you need the syntax.
6. Repeat until your video is done, limiting yourself to one "action" per prompt/generation.
7. When you have all your frames in order, combine them with ffmpeg to make your infinitely long video.
8. Use Topaz Video (free demo with watermark) to upscale the video to 4K, and use the interpolate feature to increase the fps from 8 to 30. I am a beginner with Topaz, so I can't help you much at this step because I am still learning. However, YouTube has great videos with tips for it, and it's used by pro video editors.
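The bookkeeping in the steps above (padding to 3:2, keeping only the good frames, sequential naming, and the final combine) can be sketched roughly like this in Python. All function names, the `frame_%05d.png` pattern, and the encoder settings are assumptions of mine for illustration; the original post does not specify them.

```python
from fractions import Fraction

TARGET = Fraction(720, 480)  # 3:2, the frame shape mentioned in the guide


def padded_canvas(width: int, height: int) -> tuple[int, int]:
    """Smallest canvas of (approximately) 3:2 that holds the image unstretched."""
    if Fraction(width, height) >= TARGET:
        # Image is wide enough: keep the width, grow the height (rounded up).
        return width, -(-width * TARGET.denominator // TARGET.numerator)
    # Image is too tall or square: keep the height, grow the width (rounded up).
    return -(-height * TARGET.numerator // TARGET.denominator), height


def rename_plan(clips: list[tuple[list[str], int]]) -> list[tuple[str, str]]:
    """Map each kept frame to one continuous, zero-padded sequence.

    clips holds (frame names, number of good frames to keep) per clip,
    in the order the clips were generated.
    """
    plan = []
    n = 1
    for frames, keep in clips:
        for src in sorted(frames)[:keep]:  # drop everything after the last good frame
            plan.append((src, f"frame_{n:05d}.png"))
            n += 1
    return plan


def combine_cmd(frame_dir: str, out: str, fps: int = 8) -> list[str]:
    """Build an ffmpeg command that joins the numbered frames into one video."""
    return ["ffmpeg", "-framerate", str(fps),
            "-i", f"{frame_dir}/frame_%05d.png",
            "-c:v", "libx264", "-pix_fmt", "yuv420p", out]
```

For example, `padded_canvas(1024, 1024)` gives `(1536, 1024)`, so a square photo needs white space added on the sides rather than being stretched. The rename plan is just a list of (old name, new name) pairs you can apply with `os.rename` before running the combine command.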

One word of caution: just like with Cog, it's best with Topaz to do one "action" at a time. For example, first upscale, then interpolate. I think it can do both at the same time as well, though, so you'll want to experiment to find which models and methods work best.
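If you want to try the "one action at a time" order without Topaz, here is a rough sketch that uses ffmpeg's built-in `scale` and `minterpolate` filters as a free stand-in. This is not Topaz and I wouldn't expect it to match its quality; the function names, the 3840-pixel width, and the encoder settings are my own assumptions.

```python
def upscale_cmd(src: str, dst: str, width: int = 3840) -> list[str]:
    """Pass 1: upscale only. -2 keeps the aspect ratio with an even height."""
    return ["ffmpeg", "-i", src,
            "-vf", f"scale={width}:-2:flags=lanczos",
            "-c:v", "libx264", "-pix_fmt", "yuv420p", dst]


def interpolate_cmd(src: str, dst: str, fps: int = 30) -> list[str]:
    """Pass 2: motion-compensated frame interpolation up to the target fps."""
    return ["ffmpeg", "-i", src,
            "-vf", f"minterpolate=fps={fps}:mi_mode=mci",
            "-c:v", "libx264", "-pix_fmt", "yuv420p", dst]


def two_pass(src: str, upscaled: str, final: str) -> list[list[str]]:
    """Upscale first, then interpolate, as two separate ffmpeg runs."""
    return [upscale_cmd(src, upscaled), interpolate_cmd(upscaled, final)]
```

Run each command in order with `subprocess.run(cmd, check=True)`. Interpolating at 4K with `minterpolate` can be very slow, so it may be worth testing on a short clip first.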
