Singularity

Everything pertaining to the technological singularity and related topics, e.g. AI, human enhancement, etc.

126
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/singularity by /u/Woootdafuuu on 2024-10-13 15:06:47+00:00.

Original Title: I remember reading this neuroscience paper from the Max Planck Institute for Psycholinguistics and Radboud University’s Donders Institute back before ChatGPT came out, and now that we have models like o1-preview it makes complete sense. The paper is about the brain, but it makes so much sense now.

127

The original was posted on /r/singularity by /u/jiayounokim on 2024-10-13 13:42:38+00:00.

128

The original was posted on /r/singularity by /u/Nunki08 on 2024-10-13 12:52:59+00:00.

129

The original was posted on /r/singularity by /u/ryan13mt on 2024-10-13 12:34:31+00:00.

130

The original was posted on /r/singularity by /u/Designer-Pair5773 on 2024-10-13 12:23:55+00:00.

131

The original was posted on /r/singularity by /u/ryan13mt on 2024-10-13 11:56:05+00:00.

132

The original was posted on /r/singularity by /u/NewChallengers_ on 2024-10-13 11:47:07+00:00.

133

The original was posted on /r/singularity by /u/PewPewDiie on 2024-10-13 10:49:57+00:00.


A new Claude with o1-style reasoning capabilities? More iterations of o1? Will Google finally drop something language-wise? What hints have we gotten? Dario seems to be on a Twitter spree right now.

Would really appreciate your thoughts on this final quarter, need me some good Sunday reading of the tea leaves.

134

The original was posted on /r/singularity by /u/longiner on 2024-10-13 08:08:26+00:00.

135

The original was posted on /r/singularity by /u/obvithrowaway34434 on 2024-10-13 05:44:13+00:00.


I keep seeing this annoying thing on social media where everyone says LLMs cannot do this or that, most commonly vague sh*t like "reasoning", "agents", "consciousness" and so on. This is absolutely pointless. It reminds me of quantum mechanics in the 1930s, when everyone and their dog tried to come up with philosophical interpretations. It was only when physicists decided to "shut up and calculate" that we made progress and developed the most accurate and predictive physical theory humans have ever created.

I am not saying philosophical interpretations or a deeper understanding are unimportant, but they crucially rely on data, metrics, and properly controlled tests. These give us measurable outcomes, show us how to improve, and inspire better theories and interpretations. Without them you get pseudo-scientific theories like ether, soul, and phlogiston. LLMs, and deep neural networks in general, are the most complex and impenetrable objects ever created. If we are to understand how they work and what they can do, there is no alternative to extensive and rigorous testing. Everything else is a distraction.

136

The original was posted on /r/singularity by /u/obvithrowaway34434 on 2024-10-13 04:24:56+00:00.


Looks quite useful for research, especially when combined with the other tools.

Link to original tweet:

137

The original was posted on /r/singularity by /u/Anen-o-me on 2024-10-13 03:04:14+00:00.

138

The original was posted on /r/singularity by /u/Woootdafuuu on 2024-10-13 02:59:29+00:00.

Original Title: I'm confused about this recent Apple research paper, because I ran all of the test examples in the paper on OpenAI's o1-preview and it was able to answer them correctly. Is this an actual Apple paper or is someone trolling?

139

The original was posted on /r/singularity by /u/Variouss on 2024-10-13 02:35:44+00:00.

140

The original was posted on /r/singularity by /u/duluoz1 on 2024-10-13 00:41:16+00:00.


How, if at all, has this timeline changed your plans? Have your career goals shifted, have you pivoted to a different field, has your investment strategy changed, do you plan to live in another part of the world, etc.?

I’m super interested in any practical changes that you’ve all made to your lives and keen to hear about them.

141

The original was posted on /r/singularity by /u/NaoCustaTentar on 2024-10-13 00:23:49+00:00.

Original Title: Is this a new Gemini feature? It just did some kind of "o1 preview" (?) in this answer for me. Is Google testing a new reasoning model as well, or is this just a fancy way of displaying it? It answered each paragraph separately while telling its "thought process".

142

The original was posted on /r/singularity by /u/Gothsim10 on 2024-10-12 22:25:11+00:00.

143

The original was posted on /r/singularity by /u/Gothsim10 on 2024-10-12 21:35:10+00:00.

144

The original was posted on /r/singularity by /u/Junior_Edge9203 on 2024-10-12 19:01:10+00:00.


We all know the dates Ray gave us: 2029 for AGI, then 2045 for the singularity itself. But I can't help but wonder about these dates. Would it really take 16 whole years to get from AGI to the singularity? That has always seemed awfully long to me, especially since an AGI should be self-improving. What do you all think?

145

The original was posted on /r/singularity by /u/Happysedits on 2024-10-12 20:03:05+00:00.

146

The original was posted on /r/singularity by /u/Wiskkey on 2024-10-12 16:11:34+00:00.

Original Title: o1-preview (via Web) performs much better on "trick" math reasoning problems than other language models. Paper: Exploring the Compositional Deficiency of Large Language Models in Mathematical Reasoning.

147

The original was posted on /r/singularity by /u/Wiskkey on 2024-10-12 14:43:07+00:00.

Original Title: OpenAI's o1 Model Excels in Reasoning But Struggles with Rare and Complex Tasks [About paper "When a language model is optimized for reasoning, does it still show embers of autoregression? An analysis of OpenAI o1"]


OpenAI's o1 Model Excels in Reasoning But Struggles with Rare and Complex Tasks.

In an article recently submitted to the arXiv preprint server, researchers investigated whether OpenAI's o1, a language model optimized for reasoning, overcame limitations seen in previous large language models (LLMs). The study showed that while o1 performed significantly better, especially on rare tasks, it still exhibited sensitivity to probability, a trait from its autoregressive origins. This suggests that while optimizing for reasoning enhances performance, it might not entirely eliminate the probabilistic biases that remain embedded in the model.

When a language model is optimized for reasoning, does it still show embers of autoregression? An analysis of OpenAI o1.

In "Embers of Autoregression" (McCoy et al., 2023), we showed that several large language models (LLMs) have some important limitations that are attributable to their origins in next-word prediction. Here we investigate whether these issues persist with o1, a new system from OpenAI that differs from previous LLMs in that it is optimized for reasoning. We find that o1 substantially outperforms previous LLMs in many cases, with particularly large improvements on rare variants of common tasks (e.g., forming acronyms from the second letter of each word in a list, rather than the first letter). Despite these quantitative improvements, however, o1 still displays the same qualitative trends that we observed in previous systems. Specifically, o1 -- like previous LLMs -- is sensitive to the probability of examples and tasks, performing better and requiring fewer "thinking tokens" in high-probability settings than in low-probability ones. These results show that optimizing a language model for reasoning can mitigate but might not fully overcome the language model's probability sensitivity.
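To make the acronym example in the abstract concrete, here is a minimal Python sketch of the common variant (first letters) next to the rare variant (second letters). The function name `acronym` and the word list are made up for illustration and are not taken from the paper.

```python
def acronym(words, position=0):
    """Join the letter at `position` of each word, uppercased.

    position=0 is the common task (first letters);
    position=1 is the rare variant mentioned in the abstract.
    """
    return "".join(word[position].upper() for word in words)


# Hypothetical word list, chosen only for illustration.
words = ["stone", "apple", "lemon", "tiger"]

first_letters = acronym(words, position=0)
second_letters = acronym(words, position=1)

print(first_letters)   # SALT
print(second_letters)  # TPEI
```

As a program, both variants are equally easy. The paper's claim is that they are not equally easy for a language model, because the second-letter version is much rarer in training text.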

Embers of autoregression show how large language models are shaped by the problem they are trained to solve.

Significance

ChatGPT and other large language models (LLMs) have attained unprecedented performance in AI. These systems are likely to influence a diverse range of fields, such as education, intellectual property law, and cognitive science, but they remain poorly understood. Here, we draw upon ideas in cognitive science to show that one productive way to understand these systems is by analyzing the goal that they were trained to accomplish. This perspective reveals some surprising limitations of LLMs, including difficulty on seemingly simple tasks such as counting words or reversing a list. Our empirical results have practical implications for when language models can safely be used, and the approach that we introduce provides a broadly useful perspective for reasoning about AI.

Abstract

The widespread adoption of large language models (LLMs) makes it important to recognize their strengths and limitations. We argue that to develop a holistic understanding of these systems, we must consider the problem that they were trained to solve: next-word prediction over Internet text. By recognizing the pressures that this task exerts, we can make predictions about the strategies that LLMs will adopt, allowing us to reason about when they will succeed or fail. Using this approach—which we call the teleological approach—we identify three factors that we hypothesize will influence LLM accuracy: the probability of the task to be performed, the probability of the target output, and the probability of the provided input. To test our predictions, we evaluate five LLMs (GPT-3.5, GPT-4, Claude 3, Llama 3, and Gemini 1.0) on 11 tasks, and we find robust evidence that LLMs are influenced by probability in the hypothesized ways. Many of the experiments reveal surprising failure modes. For instance, GPT-4’s accuracy at decoding a simple cipher is 51% when the output is a high-probability sentence but only 13% when it is low-probability, even though this task is a deterministic one for which probability should not matter. These results show that AI practitioners should be careful about using LLMs in low-probability situations. More broadly, we conclude that we should not evaluate LLMs as if they are humans but should instead treat them as a distinct type of system—one that has been shaped by its own particular set of pressures.
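The cipher result is easier to appreciate with a small sketch. The abstract only says "a simple cipher", so the rot-13 shift used below is an assumption for illustration, and `shift_decode` is a hypothetical helper. The point is that decoding is fully deterministic: the procedure is identical whether the output is a likely sentence or an unlikely one.

```python
import string


def shift_decode(ciphertext, shift=13):
    """Decode a letter-shift cipher by moving each letter back `shift` places.

    Non-letters pass through unchanged. shift=13 (rot-13) is an assumed
    example; the abstract does not name the cipher.
    """
    decoded = []
    for ch in ciphertext:
        if ch in string.ascii_lowercase:
            decoded.append(chr((ord(ch) - ord("a") - shift) % 26 + ord("a")))
        elif ch in string.ascii_uppercase:
            decoded.append(chr((ord(ch) - ord("A") - shift) % 26 + ord("A")))
        else:
            decoded.append(ch)
    return "".join(decoded)


# The same rule applies regardless of how probable the plaintext is.
print(shift_decode("Uryyb, jbeyq!"))  # Hello, world!
```

A program like this decodes every input with the same accuracy, which is why the reported gap between high-probability and low-probability outputs is surprising.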

X thread about the 2 papers from one of the authors. Alternate link #1. Alternate link #2.

148

The original was posted on /r/singularity by /u/ryan13mt on 2024-10-12 17:08:58+00:00.

149

The original was posted on /r/singularity by /u/MetaKnowing on 2024-10-12 15:49:00+00:00.

150

The original was posted on /r/singularity by /u/MetaKnowing on 2024-10-12 15:20:28+00:00.

Original Title: Dario Amodei says AGI could arrive in 2 years, will be smarter than Nobel Prize winners, will run millions of instances of itself at 10-100x human speed, and can be summarized as a "country of geniuses in a data center"
