this post was submitted on 12 Jan 2025

cross-posted from: https://lemmy.ca/post/37011397

!opensource@programming.dev

The popular open-source VLC video player was demonstrated on the floor of CES 2025 with automatic AI subtitling and translation, generated locally and offline in real time. Parent organization VideoLAN shared a video on Tuesday in which president Jean-Baptiste Kempf shows off the new feature, which uses open-source AI models to generate subtitles for videos in several languages. 

[–] tja@sh.itjust.works 34 points 2 days ago (5 children)

The problem is that people will now say they don't need to create accurate subtitles because VLC is doing the job for them.

Accessibility might suffer from that, because all subtitles will now be just "good enough".

[–] spankmonkey@lemmy.world 25 points 2 days ago

Regular old live broadcast closed captioning is pretty much 'good enough' and that is the standard I'm comparing to.

Actual subtitles created ahead of time should be perfect because they have the time to double check.

[–] Railcar8095@lemm.ee 32 points 2 days ago

Or they can get OK ones with this tool and fix the errors. That might save a lot of time.

[–] LandedGentry@lemmy.zip 12 points 2 days ago (1 children)

Honestly though? If your audio is even half decent you’ll get like 95% accuracy. Considering a lot of media just wouldn’t have anything, that is a pretty fair trade-off to me.

[–] TheMachineStops@discuss.tchncs.de 6 points 2 days ago* (last edited 2 days ago) (2 children)

From experience, AI translation is still garbage, especially for languages like Chinese, Japanese, and Korean, but if it only subtitles in the original language, such as creating English subtitles for English audio, then it is probably fine.

[–] catloaf@lemm.ee 2 points 2 days ago (1 children)

That's probably more due to lack of training than anything else. Existing models are mostly made by American companies and trained on English-language material. Naturally, the further you get from the model, the worse the result.

[–] TheMachineStops@discuss.tchncs.de 0 points 2 days ago (1 children)

The issue is not a lack of training material; it doesn't understand context and cultural references. Someone commented here that Crunchyroll's AI subtitles translated the name "Asura Hall" as "asshole".

[–] Petter1@lemm.ee 1 points 2 days ago (1 children)

It would be able to behave as if it understood context and cultural references if it had the appropriate training data, no problem.

[–] TheMachineStops@discuss.tchncs.de 1 points 2 days ago* (last edited 2 days ago) (1 children)

I highly doubt that it will be as good as human translation anytime soon, maybe in around 10 years or so. They also have profanity filters, and they hallucinate a lot. https://www.businessinsider.com/ai-peak-data-google-deepmind-researchers-solution-test-time-compute-2025-1

[–] Petter1@lemm.ee 1 points 1 day ago (1 children)
[–] TheMachineStops@discuss.tchncs.de 1 points 1 day ago (1 children)

You said that with training data it will be able to understand. I mean that even with training data it will take years and it also has other problems like hallucinations. I admit, I didn't word it correctly.

[–] Petter1@lemm.ee 1 points 1 day ago (1 children)

*would, not will.

It is not known whether the needed training data will ever exist. But if it did, training an AI on that data would result in great, culturally aware subtitle generation.

[–] TheMachineStops@discuss.tchncs.de 1 points 1 day ago (1 children)

Are you sure it is "would"? In that sentence you are referring to the AI understanding culture from language, which is future tense.

[–] Petter1@lemm.ee 1 points 1 day ago

"Will" is future tense in the sense that it is definitely going to happen. "Would" just means there is the possibility.

And yes, I am sure that one could brute-force a solution given enough computing power and training data. Whether it would make sense (ethically and in terms of sustainability) is a whole other question.

I am sure it can, because LLMs are statistical systems, as humans are too to a great extent (just not as strictly as a machine). If you have enough data showing actions and responses tied to such cultural traditions, nothing suggests that an LLM would fail to replicate that.

[–] LandedGentry@lemmy.zip 1 points 2 days ago

For English it’s been great for me, yes.

[–] TachyonTele@lemm.ee 11 points 2 days ago

I have a feeling that if you care enough about subtitles you're going to look for good ones, instead of using "ok" AI subs.

[–] shyguyblue@lemmy.world 2 points 2 days ago* (last edited 2 days ago) (1 children)

I imagine it would be not-exactly-simple-but-not-complicated to add a "threshold" feature. If the AI is less than X% certain, it can request human clarification.

Edit: Derp. I forgot about the "real time" part. Still, as others have said, even a single botched word would still work well enough with context.
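
For the non-real-time case (generate subtitles first, have a human fix them afterwards), the threshold idea could be sketched roughly like this. This is only an illustration: the `Segment` type, the `confidence` field, and `split_by_confidence` are all made-up names, and it assumes the speech model hands back some per-segment confidence score between 0 and 1. Nothing here is taken from VLC itself.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One subtitle cue. All fields are hypothetical, not a real VLC structure."""
    start: float       # start time in seconds
    end: float         # end time in seconds
    text: str          # transcribed or translated text
    confidence: float  # assumed score in [0.0, 1.0] supplied by the model


def split_by_confidence(segments, threshold=0.8):
    """Split cues into ones to accept as-is and ones to flag for a human."""
    accepted, needs_review = [], []
    for seg in segments:
        if seg.confidence >= threshold:
            accepted.append(seg)
        else:
            needs_review.append(seg)
    return accepted, needs_review


if __name__ == "__main__":
    cues = [
        Segment(0.0, 1.5, "Hello there", 0.95),
        Segment(1.5, 3.0, "Asura Hall", 0.42),
    ]
    ok, flagged = split_by_confidence(cues, threshold=0.8)
    for seg in flagged:
        print(f"Review {seg.start:.1f}s to {seg.end:.1f}s: {seg.text!r}")
```

The threshold value is arbitrary here; picking a sensible X% would depend on how well the model's scores actually track real errors.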

[–] spankmonkey@lemmy.world 1 points 2 days ago* (last edited 2 days ago) (1 children)

That defeats the purpose of doing it in real time as it would introduce a delay.

[–] shyguyblue@lemmy.world 1 points 2 days ago

Derp. You're right, I've added an edit to my comment.