StableDiffusion

Authors of CogVideoX reveal that they have no plans to open-source their fine-tuned image-to-video model in the near future. (old.reddit.com)

submitted 1 month ago by bot@lemmit.online to c/stablediffusion@lemmit.online

This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/jmellin on 2024-09-01 12:18:26+00:00.

I love the new CogVideoX-5b model and think it's great that we finally have a strong competitor in the open-source space, rivaling Kling, Runway, and others. However, I believe the community's demand for an image-to-video (img2vid) feature is evident.

A fine-tuned image-to-video version of the current text-to-video model exists but has not been released

After doing some research on GitHub, I found that the authors have stated they have no plans to open-source their current Image-to-Video model, which I find disappointing. I hope they reconsider in the future.

I believe that the first person or team to fine-tune the current model to handle image-to-video (which I know is no small task) and open-source it will gain a lot while also becoming a community legend. Alternatively, if someone develops a software solution, similar to inpainting I guess, that allows setting the first latent image, they would also be eligible for that recognition.
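To make the "set the first latent image" idea concrete, here is a rough toy sketch of what I mean, in the spirit of how inpainting keeps known regions fixed. Everything in it is an assumption for illustration: `pin_first_frame` and `toy_denoise_step` are made-up names, the latent shape is arbitrary, the noising formula is a generic variance-preserving one, and none of it is CogVideoX's actual API or a claim that this would work on that model.

```python
import numpy as np

def pin_first_frame(latents, image_latent, noise_level, rng):
    """Overwrite frame 0 of a video latent with a noised copy of the image latent.

    latents:      array of shape (frames, channels, height, width)
    image_latent: array of shape (channels, height, width)
    noise_level:  float in [0, 1]; 0 means the clean image latent is used
    """
    noise = rng.standard_normal(image_latent.shape)
    # Assumed variance-preserving mix of signal and noise (illustrative only).
    noised = np.sqrt(1.0 - noise_level ** 2) * image_latent + noise_level * noise
    pinned = latents.copy()
    pinned[0] = noised
    return pinned

def toy_denoise_step(latents):
    # Hypothetical stand-in for a real model call; it only shrinks the values.
    return 0.9 * latents

rng = np.random.default_rng(0)
image_latent = np.ones((4, 8, 8))            # pretend-encoded start image
latents = rng.standard_normal((6, 4, 8, 8))  # pretend video latent, 6 frames

# After every denoising step, frame 0 is reset to the image at the current
# noise level, so the remaining frames are always denoised next to it.
for noise_level in (0.8, 0.5, 0.2, 0.0):
    latents = toy_denoise_step(latents)
    latents = pin_first_frame(latents, image_latent, noise_level, rng)
```

In this toy loop the first frame ends up equal to the image latent once the noise level reaches zero. Whether a model trained only on text-to-video would produce coherent motion from a frame pinned this way is exactly the open question.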

Keeping my fingers crossed for any of the above.

Links:

The authors' response to the image-to-video request on their GitHub

kijai mentions it in a reply on his ComfyUI wrapper node
