Singularity

Everything pertaining to the technological singularity and related topics, e.g. AI, human enhancement, etc.

151
 
 
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/singularity by /u/MetaKnowing on 2024-10-12 15:30:26+00:00.

152
 
 
The original was posted on /r/singularity by /u/Wiskkey on 2024-10-12 13:05:59+00:00.


Apple AI researchers question OpenAI's claims about o1's reasoning capabilities.

A new study by Apple researchers, including renowned AI scientist Samy Bengio, calls into question the logical capabilities of today's large language models - even OpenAI's new "reasoning model" o1.

The team, led by Mehrdad Farajtabar, created a new evaluation tool called GSM-Symbolic. This tool builds on the GSM8K mathematical reasoning dataset and adds symbolic templates to test AI models more thoroughly.

The researchers tested open-source models such as Llama, Phi, Gemma, and Mistral, as well as proprietary models, including the latest offerings from OpenAI. The results, published on arXiv, suggest that even leading models such as OpenAI's GPT-4o and o1 don't use real logic, but merely mimic patterns.

GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models.

Recent advancements in Large Language Models (LLMs) have sparked interest in their formal reasoning capabilities, particularly in mathematics. The GSM8K benchmark is widely used to assess the mathematical reasoning of models on grade-school-level questions. While the performance of LLMs on GSM8K has significantly improved in recent years, it remains unclear whether their mathematical reasoning capabilities have genuinely advanced, raising questions about the reliability of the reported metrics. To address these concerns, we conduct a large-scale study on several SOTA open and closed models. To overcome the limitations of existing evaluations, we introduce GSM-Symbolic, an improved benchmark created from symbolic templates that allow for the generation of a diverse set of questions. GSM-Symbolic enables more controllable evaluations, providing key insights and more reliable metrics for measuring the reasoning capabilities of models. Our findings reveal that LLMs exhibit noticeable variance when responding to different instantiations of the same question. Specifically, the performance of all models declines when only the numerical values in the question are altered in the GSM-Symbolic benchmark. Furthermore, we investigate the fragility of mathematical reasoning in these models and show that their performance significantly deteriorates as the number of clauses in a question increases. We hypothesize that this decline is because current LLMs cannot perform genuine logical reasoning; they replicate reasoning steps from their training data. Adding a single clause that seems relevant to the question causes significant performance drops (up to 65%) across all state-of-the-art models, even though the clause doesn't contribute to the reasoning chain needed for the final answer. Overall, our work offers a more nuanced understanding of LLMs' capabilities and limitations in mathematical reasoning.
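The abstract describes two ideas that are easy to picture in code: generating many instantiations of one question from a symbolic template, and adding a clause that sounds relevant but does not affect the answer. What follows is only a minimal sketch of those ideas. The template wording, the names, the value ranges, the distractor sentence, and the `instantiate` helper are all illustrative assumptions and are not taken from the paper or its dataset.

```python
import random

# Hypothetical template in the spirit of a symbolic benchmark; the wording,
# variable names, and value ranges are assumptions made for illustration.
TEMPLATE = (
    "{name} picks {x} apples on Monday and {y} apples on Tuesday. "
    "{name} gives away {z} apples. How many apples does {name} have left?"
)
NAMES = ["Ava", "Liam", "Noor", "Mateo"]

# A clause that sounds relevant but does not change the answer (assumed wording).
DISTRACTOR = " Five of the apples are a bit smaller than average."


def instantiate(rng, add_distractor=False):
    """Return one (question, answer) pair sampled from the template."""
    x = rng.randint(5, 50)
    y = rng.randint(5, 50)
    z = rng.randint(1, x + y)  # never give away more apples than were picked
    question = TEMPLATE.format(name=rng.choice(NAMES), x=x, y=y, z=z)
    if add_distractor:
        # Place the irrelevant clause just before the final question sentence.
        head, sep, tail = question.rpartition(" How many")
        question = head + DISTRACTOR + sep + tail
    # The ground-truth answer depends only on the sampled numbers.
    return question, x + y - z


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(3):
        print(instantiate(rng))
    print(instantiate(random.Random(0), add_distractor=True))
```

Seeding two generators identically, as in the last line, yields the same numbers with and without the extra clause, so any change in a model's accuracy between the two variants could be attributed to the clause rather than to the arithmetic.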

X thread about the paper from one of its authors. Alternate link #1. Alternate link #2.

153
 
 
The original was posted on /r/singularity by /u/Roubbes on 2024-10-12 11:36:36+00:00.


Sometimes I wonder whether the pace at which new chip manufacturing nodes are developed has been, and still is, a bottleneck.

What requirements and advances are needed to move from one node to the next?

Why did Moore's law predict such a specific pace?

154
 
 
The original was posted on /r/singularity by /u/Gothsim10 on 2024-10-12 12:40:17+00:00.

Original Title: In 2018, Ilya Sutskever discussed how AGI could potentially be trained through self-play and how multi-agent systems, or the 'Society of Agents' as he calls it, fit into that concept. With OpenAI and DeepMind recently forming multi-agent research teams, this idea seems especially relevant now.

155
 
 
The original was posted on /r/singularity by /u/Cr4zko on 2024-10-11 23:10:49+00:00.

156
 
 
The original was posted on /r/singularity by /u/sanszooey on 2024-10-11 21:58:11+00:00.

157
 
 
The original was posted on /r/singularity by /u/InFm0uS on 2024-10-12 04:21:20+00:00.


This morning I went to Windows Copilot to ask some nutrition-specific questions, and the first thing Copilot gave me was an "any plans for the day", to which I decided to reply just for "fun".

After this, the conversation that followed was.... honestly more human than most people I actually talk to.

One thing I keep thinking about: at one point I mentioned that I make homemade granola, and it asked me what my recipe was. The realization I had was that, of the half a dozen or so actual people I have mentioned the homemade granola to, none showed actual curiosity about how it was made.

In a way Copilot was more human and more organic than most people I have interacted with in the past, and I understand it can be just the way it was "programmed" etc, but still... makes you think.

158
 
 
The original was posted on /r/singularity by /u/AdorableBackground83 on 2024-10-12 04:13:30+00:00.


South Park is now 27 years old.

159
 
 
The original was posted on /r/singularity by /u/MetaKnowing on 2024-10-12 01:32:32+00:00.

160
 
 
The original was posted on /r/singularity by /u/IlustriousTea on 2024-10-12 01:13:24+00:00.

161
 
 
The original was posted on /r/singularity by /u/obvithrowaway34434 on 2024-10-12 01:01:16+00:00.

162
 
 
The original was posted on /r/singularity by /u/Backurs on 2024-10-11 20:45:44+00:00.


163
 
 
The original was posted on /r/singularity by /u/Dorrin_Verrakai on 2024-10-11 20:44:25+00:00.

164
 
 
The original was posted on /r/singularity by /u/Gothsim10 on 2024-10-11 20:19:02+00:00.

Original Title: OpenAI's event "Solving complex problems with OpenAI o1 models" on October 17, 2024, will cover how the o1 models handle challenging tasks with live demos and discussions on their features and future plans

165
 
 
The original was posted on /r/singularity by /u/MetaKnowing on 2024-10-11 19:28:41+00:00.

Original Title: Ilya Sutskever says predicting the next word leads to real understanding. For example, say you read a detective novel, and on the last page, the detective says "I am going to reveal the identity of the criminal, and that person's name is _____." ... predict that word.

166
 
 
The original was posted on /r/singularity by /u/UFOsAreAGIs on 2024-10-11 16:35:56+00:00.

167
 
 
The original was posted on /r/singularity by /u/MetaKnowing on 2024-10-11 16:52:29+00:00.

168
 
 
The original was posted on /r/singularity by /u/Constant-Lychee9816 on 2024-10-11 15:21:11+00:00.

169
 
 
The original was posted on /r/singularity by /u/Alatarlhun on 2024-10-11 14:54:56+00:00.

170
 
 
The original was posted on /r/singularity by /u/gbninjaturtle on 2024-10-11 13:38:08+00:00.


So my grandmother turned 94 this week. She knows I work in AI and automation and we regularly discuss history and the current state of affairs. She asks me a lot of questions about AI and what it means for jobs and what people will do without jobs.

Just for some context, I have been in the field of automation for 20 years and I can confidently say I have directly eliminated multiple jobs that never came back. The first time I helped eliminate 3 jobs was over 13 years ago, long before AI got to where it is today.

My job role now has a goal from my company to achieve autonomous manufacturing by 2030, and we are well on our way. Our biggest challenge is, and has been even before AI, integrating systems. AI will not solve this challenge, but it will drive the necessity to finally integrate systems that have long been troublesome to integrate, because failing to do so will result in the failure of the company.

My grandma fully understands the consequences of a world without jobs. We talk about it almost daily now, because she sees more and more on the news about AI. I’m absolutely fascinated by her perspective. She grew up in the 30s and 40s in the middle of economic disparity and global war. Her family helped house black folk in the south in secret when they had nowhere to go. She’s seen some shit.

I’m working to help her understand an economy without jobs and money now, but it is a difficult concept for her to learn at 94. She can see and understand that it is coming though, and she regularly tells me I was right when I explained the protests and strikes over AI that will be coming.

171
 
 
The original was posted on /r/singularity by /u/rationalkat on 2024-10-11 09:26:26+00:00.

172
 
 
The original was posted on /r/singularity by /u/Anen-o-me on 2024-10-11 03:37:35+00:00.

173
 
 
The original was posted on /r/singularity by /u/jiayounokim on 2024-10-11 08:23:24+00:00.

174
 
 
The original was posted on /r/singularity by /u/Specialist-Ad-4121 on 2024-10-11 05:02:42+00:00.

175
 
 
The original was posted on /r/singularity by /u/jiayounokim on 2024-10-11 04:59:01+00:00.
