
US Bill proposed to jail people who download Deepseek (www.404media.co)

submitted 1 day ago by JOMusic@lemmy.ml to c/technology@lemmy.world

[–] cyd@lemmy.world 20 points 1 day ago* (last edited 1 day ago) (1 children)

Base models are general purpose language models, mainly useful for AI researchers and people who want to build on top of them.

Instruct or chat models are chatbots. They are made by fine-tuning base models.

The V3 models linked by OP are DeepSeek's non-reasoning models, similar to Claude or GPT-4o. These are the "normal" chatbots that reply with whatever comes to mind. DeepSeek also has a reasoning model, R1. Such models take time to "think" before giving their final answer; they tend to perform better on things like math problems, at the cost of taking longer to answer.

It should be mentioned that you probably won't be able to run these models yourself unless you have a data-center-style rig with 4-5 GPUs. The DeepSeek V3 and R1 models are chonky beasts. There are smaller "distilled" forms of R1 that can be run locally, though.

[–] DogWater@lemmy.world 5 points 1 day ago (2 children)

I heard people saying they could run the R1 32B model on moderate gaming hardware, albeit slowly.

[–] FrederikNJS@lemm.ee 5 points 1 day ago (1 children)

32B is still distilled. The full one is 671B.
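To put the gap between 32B and 671B in perspective, here is a rough back-of-the-envelope sketch of how much memory the weights alone would take at a few precisions. The function name and the precision choices are illustrative assumptions, not anything from DeepSeek; it also ignores the KV cache, activations, and runtime overhead, so real requirements will be higher.

```python
def weight_memory_gb(params_billions: float, bits_per_param: int) -> float:
    """Approximate memory for the model weights alone, in GB (10^9 bytes).

    Assumes every parameter is stored at the same precision, which is a
    simplification; real quantized files mix precisions.
    """
    total_bytes = params_billions * 1e9 * bits_per_param / 8
    return total_bytes / 1e9


# Parameter counts as quoted in this thread (in billions).
MODELS = {"R1 full (671B)": 671, "R1 distill (32B)": 32}

# Illustrative precisions: 16-bit weights plus 8-bit and 4-bit quantization.
PRECISIONS = {"16-bit": 16, "8-bit": 8, "4-bit": 4}

for name, size in MODELS.items():
    for label, bits in PRECISIONS.items():
        gb = weight_memory_gb(size, bits)
        print(f"{name:>18} @ {label}: ~{gb:.0f} GB of weights")
```

By this estimate the 32B distill comes to roughly 16 GB of weights at 4-bit, which is why it can plausibly fit on gaming hardware (possibly spilling into system RAM, hence "slowly"), while the full 671B model stays in the hundreds of gigabytes even when quantized.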

[–] DogWater@lemmy.world 2 points 20 hours ago (1 children)

I know, but the falloff in performance isn't supposed to be severe.

[–] FrederikNJS@lemm.ee 1 points 12 hours ago

You are correct. And yes, that is kinda the whole point of the distilled models.

[–] meliante@lemmy.world 1 points 1 day ago

My Legion Slim 5 14" runs it not too badly.