
TechTakes


Big brain tech dude got yet another clueless take over at HackerNews etc? Here's the place to vent. Orange site, VC foolishness, all welcome.

This is not debate club. Unless it’s amusing debate.

For actually-good tech, you want our NotAwfulTech community


Need to let loose a primal scream without collecting footnotes first? Have a sneer percolating in your system but not enough time/energy to make a whole post about it? Go forth and be mid: Welcome to the Stubsack, your first port of call for learning fresh Awful you’ll near-instantly regret.

Any awful.systems sub may be subsneered in this subthread, techtakes or no.

If your sneer seems higher quality than you thought, feel free to cut’n’paste it into its own post — there’s no quota for posting and the bar really isn’t that high.

The post-Xitter web has spawned so many "esoteric" right-wing freaks, but there's no appropriate sneer-space for them. I'm talking redscare-ish, reality-challenged "culture critics" who write about everything but understand nothing. I'm talking about reply-guys who make the same 6 tweets about the same 3 subjects. They're inescapable at this point, yet I don't see them mocked (as much as they should be).

Like, there was one dude a while back who insisted that women couldn’t be surgeons because they didn’t believe in the moon or in stars? I think each and every one of these guys is uniquely fucked up and if I can’t escape them, I would love to sneer at them.

Last week's thread

(Semi-obligatory thanks to @dgerard for starting this)

[–] misterbngo@awful.systems 10 points 1 week ago

Stack Overflow, now with sponsored crypto blogspam: "Joining forces: How Web2 and Web3 developers can build together"

I really love the byline here: "kindest view of one another". The seething rage at the bullshittery these "web3" fuckheads keep producing certainly isn't kind.

[–] gerikson@awful.systems 10 points 1 week ago (6 children)

Dude discovers that one LLM is not entirely shit at chess, spends time and tokens proving that other models are actually also not shit at chess.

The irony? He's comparing it against Stockfish, a computer chess engine. Computers playing chess at a superhuman level is a solved problem. LLMs have now slightly approached that level.

For one, gpt-3.5-turbo-instruct rarely suggests illegal moves,

Writeup https://dynomight.net/more-chess/

HN discussion https://news.ycombinator.com/item?id=42206817

It's particularly hilarious how thoroughly they're missing the point. The fact that it suggests illegal moves at all means that no matter how good its openings are, the scaling laws and emergent behaviors haven't magicked up an internal model of the game of chess, or even of the state of the chess board it's working with. I feel like playing games is a particularly powerful example of this, because the game rules provide a very clear structure to model and it's very obvious when that model doesn't exist.
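To make it concrete: with an explicit board model, the legality check is basically a one-liner. A minimal sketch using the python-chess library (the position and candidate moves are my own invented illustration, not anything from the writeup):

```python
# Minimal sketch of the legality check being discussed, using python-chess
# (pip install chess). The point: verifying a move against an explicit board
# model is trivial, while the LLM has no such model to consult.
import chess

def is_legal_san(board: chess.Board, san_move: str) -> bool:
    """True iff the SAN move string is legal in the given position."""
    try:
        board.parse_san(san_move)  # raises a ValueError subclass if illegal
        return True
    except ValueError:
        return False

board = chess.Board()  # standard starting position
board.push_san("e4")   # 1. e4
board.push_san("e5")   # 1... e5

print(is_legal_san(board, "Nf3"))  # True: legal developing move for White
print(is_legal_san(board, "Nf6"))  # False: no White knight can reach f6
```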

[–] sailor_sega_saturn@awful.systems 8 points 1 week ago* (last edited 1 week ago)

Here are the results of these three models against Stockfish—a standard chess AI—on level 1, with a maximum of 0.01 seconds to make each move

I'm not a chess person or familiar with Stockfish, so take this with a grain of salt, but I found a few interesting things perusing the code/docs which I think provide useful context.

Skill Level

I assume "level" refers to Stockfish's Skill Level option.

If I mathed right, Stockfish roughly estimates Skill Level 1 to be around 1445 Elo (source). However, it says "This Elo rating has been calibrated at a time control of 60s+0.6s", so it may be significantly lower here.

Skill Level affects the search depth (appears to use depth of 1 at Skill Level 1). It also enables MultiPV 4 to compute the four best principal variations and randomly pick from them (more randomly at lower skill levels).
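For the curious, here's roughly what poking at that option looks like from python-chess; this assumes a stockfish binary on your PATH, which is an assumption about my setup, not anything stated in the writeup:

```python
# Sketch: inspect and set Stockfish's Skill Level option via python-chess's
# UCI wrapper (pip install chess; `stockfish` binary assumed on PATH).
import chess.engine

engine = chess.engine.SimpleEngine.popen_uci("stockfish")

opt = engine.options["Skill Level"]
print(opt.min, opt.max, opt.default)  # 0 20 20 on current builds

engine.configure({"Skill Level": 1})  # the handicap level used in the writeup
engine.quit()
```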

Move Time & Hardware

This is all independent of move time. The author used a move time of 10 milliseconds (for Stockfish; no mention of how much time the LLMs got) ... or at least they did if they accounted for the "Move Overhead" option defaulting to 10 milliseconds. If they left that at its default, then 10ms - 10ms = 0ms, so 🤷‍♀️.

There is also no information about the hardware or number of threads they ran this on, which I feel is important information.
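If that reading is right, the fix would be something like the following sketch (python-chess plus a local stockfish binary again; this reconstructs my reading of the options, not the author's actual harness):

```python
# Sketch of the time/threads side: pin to one thread and zero Move Overhead
# so a 10 ms movetime is actually 10 ms of thinking, not 10 ms minus a 10 ms
# safety buffer. Assumes `stockfish` on PATH.
import chess
import chess.engine

engine = chess.engine.SimpleEngine.popen_uci("stockfish")
engine.configure({
    "Threads": 1,        # a fixed thread count makes runs comparable
    "Move Overhead": 0,  # defaults to 10 ms; at movetime=10 ms that leaves 0
    "Skill Level": 1,
})

board = chess.Board()
result = engine.play(board, chess.engine.Limit(time=0.010))  # 10 ms per move
print(result.move)
engine.quit()
```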

Evaluation Function

After the game was over, I calculated the score after each turn in “centipawns” where a pawn is worth 100 points, and ±1500 indicates a win or loss.

Stockfish's FAQ mentions that they have gone beyond centipawns for evaluating positions, because it's strong enough that material advantage is much less relevant than it used to be. I assume it doesn't really matter at level 1 with ~0 seconds to produce moves though.

Still, since the author has Stockfish handy anyway, it'd be interesting to use it in its non-handicapped form to evaluate who won.
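Something like this sketch would do it (python-chess once more; the search depth and the mate clamp are my choices, with ±1500 matching the author's stated convention):

```python
# Sketch: score a position with an unhandicapped Stockfish search and read
# the result in centipawns, clamping forced mates to ±1500 to match the
# author's convention. Assumes `stockfish` on PATH.
import chess
import chess.engine

engine = chess.engine.SimpleEngine.popen_uci("stockfish")

board = chess.Board()  # substitute the final position of the game here
info = engine.analyse(board, chess.engine.Limit(depth=20))

cp = info["score"].white().score(mate_score=1500)  # White's point of view
print(f"eval: {cp} centipawns")
engine.quit()
```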

[–] BigMuffin69@awful.systems 8 points 1 week ago* (last edited 1 week ago)

I remember several months back (a year ago?) when the news got out that gpt-3.5-turbo-papillion-grumpalumpgus could play chess at around ~1600 Elo. I was skeptical that the apparent skill was anything more than a hacked-on patch to stop folks from clowning on their models on xitter. Like, if an LLM had just read the instructions of chess and started playing like a competent player, that would be genuinely impressive. But if what happened is they generated 10^12 synthetic games of chess played by stonk fish and used that to train the model, that ain't an emergent ability, that's just brute forcing chess. The fact that larger, open-source models that perform better on other benchmarks still flail at chess is just a glaring red flag that something funky was going on w/ gpt-3.5-turbo-instruct to drive home the "eMeRgEnCe" narrative. I'd bet decent odds that if you played with modified rules (knights move in a one-square-longer L shape, you cannot move a pawn 2 squares after it last moved, etc.), gpt-3.5 would fuckin suck.

Edit: the author asks "why skill go down tho" on later models. Like, isn't it obvious? At that moment in time, chess skills weren't a priority, so the trillions of synthetic games weren't included in the training? Like, this isn't that big of a mystery...? It's not like other NNs haven't been trained to play chess...

[–] o7___o7@awful.systems 8 points 1 week ago* (last edited 1 week ago) (2 children)

Peter Watts's Blindsight is a potent vector for brain worms.

[–] Soyweiser@awful.systems 10 points 1 week ago* (last edited 1 week ago) (1 children)

Watts has always been a bit of a weird vector. While he doesn't seem to be a far-righter himself, he accidentally uses a lot of weird far-right dogwhistles. (Prob some cross-contamination, as some of these things are just scientific concepts; the r/K selection thing especially stood out to me in the Rifters series. Of course he has a PhD in zoology, and the books predate the online hardcore racists discovering the idea by more than a decade, but it's still odd to me.)

To be very clear, I don't blame Watts for this, he is just a science fiction writer, a particularly gloomy one. The guy himself seems to be pretty ok (not a fan of trump for example).

[–] Architeuthis@awful.systems 7 points 1 week ago* (last edited 1 week ago)

That's a good way to put it. Another thing that was really en vogue at one point and might have been considered hard-ish scifi when it made it into Rifters was all the deep water telepathy via quantum brain tubules stuff, which now would only be taken seriously by wellness influencers.

not a fan of trump for example

In one of the Eriophora stories (I think it's officially the sunflower circle) I think there's a throwaway mention of the Kochs having been lynched along with other billionaires in the early days of a mass mobilization to save what's savable in the face of environmental disaster (and also rapidly push to the stars, because a Kardashev-2 civilization may have emerged in the vicinity, so an escape route could become necessary in the next few millennia, and this scifi story needs a premise).

[–] antifuchs@awful.systems 7 points 1 week ago (1 children)
[–] o7___o7@awful.systems 9 points 1 week ago* (last edited 1 week ago) (3 children)

Oh man where to begin. For starters:

  • Sentience is overrated
  • All communication is manipulative
  • Assumes intelligence has a "value" and that it stacks like a Borderlands damage buff
  • Superintelligence operates in the world like the chaos god Tzeentch from WH40K. Humans can't win, because all events are "just as planned"
  • Humanity is therefore gormless and helpless in the face of superintelligence

It just feeds right into all of the TESCREAL nonsense, particularly those parts that devalue the human part of humanity.

[–] Architeuthis@awful.systems 8 points 1 week ago* (last edited 1 week ago) (2 children)

Sentience is overrated

Not sentience, self-awareness, and not in a particularly prescriptive way.

Blindsight is pretty rough and probably Watts's worst book that I've read, but it's original, ambitious, and mostly worth it as an introduction to thinking about selfhood in a certain way, even if this type of scifi isn't one's cup of tea.

It's a book that makes more sense after the fact, i.e. after reading the appendix on the phenomenal self-model hypothesis. Which is no excuse -- cardboard characters that are that way because the author is struggling to make a point about how intelligence being at odds with self-awareness would lead to individuals with nonexistent self-reflection who more or less coast as an extension of their (ultrafuturistic) functionality are still cardboard characters that you have to spend a whole book with.

I remember he handwaves a lot of stuff regarding intelligence, like at some point straight up writing that what you are reading isn't really what's being said, it's just the jargonaut pov character dumbing it way down for you, which is to say he doesn't try that hard for hyperintelligence show-don't-tell. Echopraxia is better in that regard.

It just feeds right into all of the TESCREAL nonsense, particularly those parts that devalue the human part of humanity.

Not really; there are some common ideas, mostly because tescrealism already is scifi tropes awkwardly cobbled together, but usually what tescreals think is awesome is presented in a cautionary light or as straight-up dystopian.

Like, there's some really bleak transhumanism in this book, and the view that human cognition is already starting to become alien in the one-hour-into-the-future setting is kind of anti-longtermist, at least in the sense that the utilitarian calculus turns out way messed up.

And also I bet there's nothing in The Sequences about Captain Space Dracula.

[–] o7___o7@awful.systems 7 points 1 week ago* (last edited 1 week ago) (1 children)

I hear you. I should clarify, because I didn't do a good job of saying why those things bothered me and nerd-vented instead. I understand that an author doesn't necessarily believe the things used as plot devices in their books. Blindsight is a horror/speculative fiction book that asks "what if these horrible things were true" and works out the consequences in an entertaining way. And no doubt there's absolutely a place for horror in spec fic, but Blindsight just feels off. I think @Soyweiser explained the vibes better than I did. Watts isn't a bad guy. Maybe it's just me. To me, it feels less Hellraiser and more Human Centipede, i.e. here's a lurid idea that would be tremendously awful in reality, now buckle up and let's see how it goes, to an uncomfortable extent. That's probably just a matter of taste, though.

Unfortunately, the kind of people who read these books don't get that, because media literacy is dead. Everyone I've heard from (online) seems to think that it is saying big deep things that should be taken seriously. It surfaces in discussions about whether or not ChatGPT is "alive" and how it might be alive in a way different from us. Eric Schmidt's recent insane ramblings about LLMs being an "alien intelligence," which don't call Blindsight out directly, certainly resonate the same way.

Maybe I'm being unfair, but it all just goes right up my back.
