Mid-week update for r/StableDiffusion - all the major developments in a nutshell (old.reddit.com)

submitted 1 month ago by bot@lemmit.online to c/stablediffusion@lemmit.online


This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/OkSpot3819 on 2024-08-29 08:05:10+00:00.

CogVideoX-5B: Open-source video generation model originating from QingYing (with the diffusers library, it fits in < 10GB VRAM) (HUGGING FACE | GITHUB | PAPER)
Meta Sapiens: AI vision models for human analysis at 1k resolution - 2D pose estimation, body-part segmentation, depth estimation, and surface normal prediction (GITHUB | HUGGING FACE)
LayerPano3D: a novel framework to generate full-view, explorable panoramic 3D scene from a single text prompt (GITHUB)
Kolors Virtual Try-On (HUGGING FACE DEMO)
GenWarp: AI model that can generate new views of a scene from just a single input image (PAPER | HUGGING FACE DEMO | GITHUB)
Hyper-SD (Flux): ByteDance released Flux.1-Dev 8/16-step LoRAs - generate images in just 8 or 16 steps (HUGGING FACE DEMO)
Imagen 3 is now available on Gemini. Source.
Background removal with WebGPU: in-browser background removal (GITHUB | HUGGING FACE DEMO)
Deforum Studio Updates: four new presets based on "audio events", which you can detect or manually place on the audio track. Also, smoothing is now available for classic presets. Link.
Freepik Mystic: New image generator. Source.
Fotographer.ai Fuzer v0.1: image editing tool that allows users to combine foreground elements with different backgrounds. It aims to preserve the shape and style of the foreground while integrating it into the new background (HUGGING FACE DEMO)
MagicMan: generative Novel View Synthesis of Humans with 3D-Aware Diffusion and Iterative Refinement (HUGGING FACE PAPER)
MeTTA: Single-View to 3D Textured Mesh Reconstruction with Test-Time Adaptation (PROJECT PAGE)

These will all be covered in the weekly newsletter; check out the most recent issue.

Here are the updates from the previous week:

⚓ CCTV-style images: Flux dev capable of generating convincing surveillance-like footage.
⚓ Amateur Photography LoRA v2: Enhanced Flux LoRA for realistic casual photographs.
⚓ Personal likeness LoRA: Successful training with only 15 self-captioned images.
⚓ Low VRAM training: Flux LoRA training achieved on RTX 3060 with 12GB VRAM.
⚓ 16GB VRAM guide: Method for training Flux LoRA using only 16GB of VRAM shared.
⚓ FinetunersAI insights: Valuable recommendations on training LoRA models for Flux.
⚓ XLabs ControlNet: New Canny, HED, and Depth models (Version 3) for Flux released.
⚓ Union ControlNet: InstantX's union ControlNet implemented in ComfyUI for Flux.
⚓ AI in politics: Trump's use of AI-generated images sparks debate on misinformation.
⚓ Procreate's stance: Popular illustration app announces no integration of generative AI.
⚓ Pony Diffusion V7: Significant update announced with various improvements.
⚓ Black Forest Labs interview: Founders discuss journey from Stable Diffusion to new ventures.
⚓ Ideogram 2.0: New AI image generation platform released with various features.
⚓ Luma AI Dream Machine 1.5: Upgraded text-to-video generator with enhanced capabilities.
⚓ Flux Deforum: XLabs-AI releases Flux implementation of Deforum framework.
⚓ ComfyUI-Nexus: New extension enabling multiplayer collaboration in ComfyUI.
⚓ Flux LoRA showcase: New LoRAs for custom typefaces and themed designs.

A compiled resource with all the links can be found here.
