this post was submitted on 12 Jan 2025
658 points (98.0% liked)

Technology

cross-posted from: https://lemmy.ca/post/37011397

!opensource@programming.dev

The popular open-source VLC video player was demonstrated on the floor of CES 2025 with automatic AI subtitling and translation, generated locally and offline in real time. Parent organization VideoLAN shared a video on Tuesday in which president Jean-Baptiste Kempf shows off the new feature, which uses open-source AI models to generate subtitles for videos in several languages. 

[–] Nalivai@lemmy.world 12 points 2 days ago (4 children)

The technology is nowhere near being good though. On synthetic tests, on the data it was trained and tweaked on, maybe, I don't know.
I co-run an event where we invite speakers from all over the world, and we tried every way to generate subtitles; all of them perform at the level of YouTube's auto-generated ones. It's better than nothing, but you can't really rely on it.

[–] lukewarm_ozone@lemmy.today 2 points 1 day ago* (last edited 1 day ago) (1 children)

Really? This is the opposite of my experience with (distil-)whisper - I use it to generate subtitles for stuff like podcasts and was stunned at first by how high-quality the results are. I typically use distil-whisper/distil-large-v3, locally. Was it among the models you tried?

[–] Nalivai@lemmy.world 1 points 15 hours ago

Unfortunately I don't know the specific names of the models; I'll follow up if I remember to ask the people who spun up the models themselves.
The difference might be live vs. recorded material, I don't know.

[–] TriflingToad@sh.itjust.works 4 points 1 day ago* (last edited 1 day ago) (1 children)

Is your goal to rely on it, or to have it as a backup?
For my purpose of having a backup, nearly anything is better than nothing.

[–] Nalivai@lemmy.world 1 points 9 hours ago

When you do live streaming there is no time for a backup; it either works or it doesn't. Better than nothing, that's for sure, but also maybe only marginally better than whatever we had 10 years ago.

[–] Scrollone@feddit.it 6 points 2 days ago (1 children)

No, but I think it would be super helpful to synchronize subtitles that are not aligned to the video.

[–] Telodzrum@lemmy.world 5 points 2 days ago

This is already trivial. Bazarr has been doing it for all my subtitles for almost a decade.

[–] Petter1@lemm.ee -1 points 2 days ago (1 children)

You haven't been able to test it, yet you're calling it nowhere near good 🤦🏻

Like, how would you know?!

[–] Nalivai@lemmy.world 2 points 2 days ago* (last edited 2 days ago)

Relax, they didn't write a new way of doing magic, they integrated a solution that's already on the market.
I don't know what the new BMW they're introducing this year is capable of, but I know for a fact it can't fly.