Surrounded By Asymptotes: 4 Papers Showing LLMs & DNNs Won’t Become AGI

Posted on November 23rd, 2025

Filed under: Code, General — Karl Olson @ 1:44 am

(This will be rather different than my usual, but I promise, the next post won’t be so heavy.)

Lately, it’s been wild to me that the markets and the mainstream and enthusiast tech press generally haven’t sounded the alarm about Apple, OpenAI, Anthropic and Basis all publishing papers in the past six months that, taken together, underline how far we are from the oft-prophesied “Technological Singularity.” Large Language Model (LLM) approaches, including so-called Large Reasoning Models (LRMs), come up short with no path forward given the techniques used, and that even has implications for asymptotic limits on the capabilities of Deep Neural Network (DNN) models generally.

Apple revealed that LLMs and even LRMs can’t truly reason, even when given explicit instructions and effectively unlimited run time. OpenAI found that LLMs will always hallucinate, no matter the model size, training goals or data quality, due to the intrinsic resolution limits of any given parameter and the relationships between them in a DNN model. Anthropic showed that LLMs and diffusion models are easily poisoned regardless of model size: the amount of bad data required stays nearly constant yet keeps poisoning the output even as a model’s parameter count grows exponentially. Most damningly, Basis, an AI research firm, demonstrated through a new benchmark suite for World Modeling (put simply, DNNs for physical tasks and spaces) that LLMs and LRMs are all dramatically outperformed by humans on novel problem-solving tests, because those techniques rely on pattern matching alone, unlike humans, who can truly learn on the fly, which likely further explains Apple’s results. Taken together, those four papers suggest AGI will never be born from the LLM/LRM/DNN approach alone: there’s no ghost in the machine to be found here without addressing those papers’ concerns, if that’s even possible. And that calls into question the gold rush surrounding the technology generally.
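To put a rough, entirely back-of-the-envelope number on the poisoning point: if the count of bad documents that matters stays roughly fixed, then the poison’s share of the training corpus shrinks toward nothing as the corpus grows, yet its effect doesn’t shrink with it. A few lines of Python make the arithmetic plain; the corpus sizes here are made up by me for illustration, not figures from the paper:

    # Illustrative arithmetic only: the corpus sizes are invented, not taken from the paper.
    POISONED_DOCS = 250  # a roughly constant handful of bad documents

    for corpus_size in (1_000_000, 100_000_000, 10_000_000_000):
        fraction = POISONED_DOCS / corpus_size
        print(f"{corpus_size:>14,} training docs -> poison is {fraction:.8%} of the data")

The uncomfortable part is that the corpus grows by four orders of magnitude while the damage stays put.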

Financial viability and general ethics aside (I’ll mostly leave those to Ed Zitron, who you should subscribe to), these shortcomings help explain why, for some, LLMs can be amazing mirrors and amplifiers of one’s own beliefs, emotions and worldview, even to the point of self-destruction. This has been satirically demonstrated in the abstract by comedy YouTuber Eddy Burback, but also shown in research Meta specifically buried, which revealed that time off from their machine-learning-guided feeds improved mental health, and it’s reflected in LLM-induced downward spirals documented by medical journals and made easy to understand by medical YouTuber ChubbyEmu. These pattern-matching recommendation engines and conversation simulators, which are intrinsically dependent on the end user to guide their generated responses, trend towards a feedback loop, ultimately pocketing themselves and the user into a hyper-focused world in a fashion even the most self-congratulatory forums, chat rooms and skewed social media algorithms could never replicate. As an interactive medium, neither the model nor the human engaging with it is necessarily encouraged to check the other, or even capable of doing so, by design (or rather by the lack thereof). It is entirely upon the user to maintain a grasp of reality, as they are the only participant capable of conceptualizing it in any fashion that matters at all. In practice, the ability to rapidly reroll for a different response encourages gacha-game-like gambling mechanics, browbeating any model’s responses down whatever path the user wants.

Similarly, that shows why LLMs and LRMs may demonstrate value to some people as a super-charged autocomplete, a draft writer, or an automatic clean-up assistant, especially in specific verticals with heavy repetition like software development, at least when audited by a careful eye that already has the domain knowledge to catch the aforementioned impossible-to-stop hallucinations and artifacts. Barring other externalities, being wrong faster can be surprisingly fine, so long as it’s not always wrong and what’s wrong is easy to spot and correct. It’s similar to how even classic autocorrection and autocomplete techniques, like those in word processors or IntelliSense in Visual Studio, can relieve some cognitive load even when rather imperfect and limited. Unit test writing in particular, which often involves nearly identical, very repetitive blocks of code, is a space where super-charged mad-libs might be preferable to a lot of tedious copying, pasting and hand-editing. That said, like those traditional autocorrection functions, this means AI-augmented professional work is just a feature, not software or a service that can be sold or subscribed to on its own, and it doesn’t provide enough value to demand a high fee as just a feature. It’s certainly not worth the investment bubble we’re seeing, nor the approaches being taken to develop it and market it for mass adoption.
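To make the unit-test point concrete, here’s a hypothetical example of my own (a toy slugify function and its tests, not from any real project). The test blocks are nearly identical, which is exactly what autocomplete-style generation is good at filling in, and exactly where a domain-aware human still has to read every expected value:

    # Hypothetical example: a tiny function plus the repetitive tests around it.
    def slugify(title: str) -> str:
        """Lowercase a title and replace spaces with hyphens."""
        return title.strip().lower().replace(" ", "-")

    # Blocks like these differ only in input and expected output -- ideal autocomplete
    # fodder, and still worth a careful read, since a hallucinated expected value
    # would happily slip through if nobody checks it.
    def test_slugify_basic():
        assert slugify("Hello World") == "hello-world"

    def test_slugify_strips_whitespace():
        assert slugify("  Hello World  ") == "hello-world"

    def test_slugify_already_lowercase():
        assert slugify("hello world") == "hello-world"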

No, as evidenced by the myriad stories of vibe-coding creating more additional work than momentum, especially as codebases grow, if not the more damning and dangerous stories of agentic systems driven by LRMs opening malware vectors in popular operating systems, wasting the limited resolution available for any given problem space by diluting it with heaps of unrelated (and thus effectively poisonous), certainly noisy data is just not the path forward, even for the tasks where this has shown some utility. On the contrary, this points towards efficient, secure, locally-run, relatively domain-specific models with highly validated, curated and legitimate training data, and again, only to build super-charged autocomplete and assistance for certain tasks of domain specialists, not to sell an eternal subscription to a machine god slave that their various corporate creators often imply lies just beyond the horizon. Meanwhile, every state-of-the-art LLM and LRM can currently have its guardrails consistently bypassed by a single adversarial prompt, so long as it’s in poem form.
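To be concrete about the “locally-run” part of that, here’s a minimal sketch of my own, assuming the Hugging Face transformers library is installed and a small model like distilgpt2 is already cached on disk, of doing completion entirely on your own machine, with nothing leaving it and no subscription attached:

    # Minimal local-inference sketch (assumes the `transformers` package is installed
    # and a small model such as distilgpt2 has already been downloaded/cached).
    from transformers import pipeline

    # Everything below runs on the local machine; no prompt or output is sent anywhere.
    generator = pipeline("text-generation", model="distilgpt2")

    prompt = "def test_slugify_strips_whitespace():"
    result = generator(prompt, max_new_tokens=40, do_sample=False)
    print(result[0]["generated_text"])

To be clear, distilgpt2 is far too small and general to be the curated, domain-specific model I’m describing; the point is only that the plumbing for local inference already exists.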

That is to say, these companies’ own research, to say nothing of anyone else’s, shows they are already fully boxed in. Definitionally, given the findings of these papers, no matter the volume, curation and annotation of information shoved into these models, even given the most perfect training systems and benchmarks designed to avoid rewarding hallucinated results, and even given better hardware representations that reduce discontinuities and push data resolution to its theoretical limits, the mathematical spaces these techniques are built on do not continuously re-balance and update themselves in the sense of reasoning as we know it in biological systems. Given that, trying to build general-purpose models this way will not get these companies and their clients the Swiss Army Knife they’re circulating trillions around. Even beyond LLMs, LRMs and diffusion models, this likely means any DNN-based model used to generate something, including a set of actions, as is being promised for everything from self-driving cars to forthcoming humanoid home robots built on World Modeling approaches, will have the same issues of constant-rate poisoning, hallucinations from data resolution limits, and the inability to scale past its inelastic, pattern-matching underpinnings.
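One way to see the “no re-balancing” point in code (a toy illustration of my own, assuming PyTorch, with a single linear layer standing in for any trained network): inference is a frozen forward pass, and nothing about answering a query updates the weights the way experience updates a biological brain:

    # Toy illustration (assumes PyTorch): inference does not change a trained model.
    import torch

    model = torch.nn.Linear(4, 2)   # stand-in for any trained DNN
    model.eval()                    # inference mode

    weights_before = model.weight.clone()
    with torch.no_grad():           # no gradients, no learning, just a forward pass
        _ = model(torch.randn(1, 4))

    print(torch.equal(weights_before, model.weight))  # True: nothing re-balanced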

Even if we could ignore all of the external issues with this approach and its marketing to the world (and we can’t), the internal issues are crystal clear now. The sooner we can all get on that page and only use DNNs in the places where they have genuinely improved things, like computational photography, or even for light recreation if trained and run ethically and sustainably, the sooner we can look at what other new frontiers in computing exist and discover what they can do to genuinely make our lives directly, reliably better.
