Surrounded By Asymptotes: 4 Papers Showing LLMs & DNNs Won’t Become AGI

Posted on November 23rd, 2025

Filed under: Code,General — Karl Olson @ 1:44 am

(This will be rather different than my usual, but I promise, the next post won’t be so heavy.)

Lately, it’s been wild to me that the markets and mainstream/enthusiast tech news generally have not sounded the alarm about the papers Apple, OpenAI, Anthropic and Basis have all published in the past six months. Considered together, they underline that, far from being anywhere close to the oft-prophesied “Technological Singularity,” Large Language Model (LLM) approaches, including so-called Large Reasoning Models (LRMs), come up short with no path forward given the techniques used, and this even has implications for asymptotic limits on the capabilities of Deep Neural Network (DNN) models generally.

Apple revealed that LLMs and even LRMs can’t truly reason, even when given explicit instructions and the opportunity for infinite run time. OpenAI discovered that LLMs will also always hallucinate, no matter the model size, training goals or data quality, due to the intrinsic resolution limitations of any given parameter and the relationships between them in a DNN model. Anthropic showed that LLMs and diffusion models are easily poisoned regardless of model size, as the amount of bad data required stays nearly constant yet keeps poisoning the output even as a model’s parameters increase exponentially. Most damningly, Basis, an AI research firm, demonstrated through a new benchmark suite for World Modeling (put simply, DNNs for physical tasks and spaces) that LLMs and LRMs are all dramatically outperformed by humans on novel problem-solving tests, as those techniques can be shown to rely on pattern matching alone, unlike humans, who can truly learn on the fly, which likely further explains Apple’s results. Taken together, those four papers suggest AGI will never be born from the LLM/LRM/DNN approach alone: there’s no ghost in the machine to be found here without addressing those papers’ concerns, if that’s possible at all. That calls into question the gold rush surrounding the technology generally.

Financial viability and general ethics aside (I’ll mostly leave that to Ed Zitron, who you should subscribe to,) these shortcomings explain in some ways why, for some, LLMs can be amazing mirrors and amplifiers of one’s own beliefs, emotions and worldview, even to the point of self-destruction. This has been satirically demonstrated in abstract by comedy YouTuber Eddy Burback, but also shown in research Meta specifically buried, which revealed that time off from its machine-learning-guided feeds improved mental health, and reflected in LLM-induced downward spirals documented by medical journals and made easy to understand by medical YouTuber ChubbyEmu. These pattern-matching recommendation engines and conversation simulators, which are intrinsically dependent on the end user to guide their generated responses, will trend towards a feedback loop, ultimately pocketing themselves and the user in a hyper-focused world in a fashion even the most self-congratulatory forums, chat rooms and skewed social media algorithms could never replicate. As an interactive medium, neither the model nor the human engaging with it is necessarily encouraged or even capable of checking the other by design (or perhaps rather by the lack thereof.) It is entirely upon the user to maintain a grasp of reality, as they are the only participant capable of conceptualizing it in any fashion that matters at all. In practice, the ability to rapidly reroll for a different response encourages gacha-game-like gambling mechanics, brow-beating any model’s responses down whatever path the user wants.

Similarly, that shows why LLMs and LRMs may demonstrate value to some people as a super-charged autocomplete, a draft writer, or an automatic clean-up assistant, especially in specific verticals with heavy repetition like software development, at least when audited by a careful eye that already has the domain knowledge to spot the aforementioned impossible-to-stop hallucinations and artifacts. Barring other externalities, being wrong faster can be surprisingly fine, so long as it’s not always wrong, and what’s wrong is easy to spot and correct. It’s similar to how even classic autocorrection and autocomplete techniques, like those in word processors or IntelliSense in Visual Studio, can relieve some cognitive load, even when rather imperfect and limited. Especially in something like unit test writing, which often involves nearly identical, very repetitive blocks of code, supercharged mad-libs might be preferable to a lot of tedious copying, pasting and hand-editing. That said, like those traditional autocorrection functions, this means AI-augmented professional work is just a feature, not software or a service that can be sold or subscribed to on its own, and a feature alone doesn’t provide enough value to demand a high fee. It’s certainly not worth the investment bubble we’re seeing, nor the approaches being taken for developing and marketing it for mass adoption.

No. As evidenced by myriad stories of vibe-coding creating more additional work than momentum, especially as codebases grow, if not the more damning and dangerous stories of agentic systems driven by LRMs creating malware vectors in popular operating systems, wasting the limited resolution available for any given problem space by diluting it with lots of unrelated (and so effectively poisonous), certainly noisy data is just not the path forward, even for the tasks where this has shown some utility. On the contrary, this points towards efficient, secure, locally-run, relatively domain-specific models with highly validated, curated and legitimate training data, and again, only to build super-charged autocomplete and assistance for certain tasks of domain specialists, not to sell a subscription forever to a machine-god slave that their various corporate creators often imply lies just beyond the horizon. Worse still, all current state-of-the-art LLMs and LRMs can have their guard rails consistently bypassed with a single adversarial prompt, so long as it’s in poem form.

That is to say, these companies’ own research, to say nothing of anyone else’s, shows they are already fully boxed in. Definitionally, given the findings of these papers, no matter the volume, curation and annotation of information shoved into these models, even given the most perfect training systems and benchmarks intended to avoid rewarding hallucinated results, and even given better hardware to represent these models with fewer discontinuities and data resolution pushed to its theoretical limits, the mathematical space these techniques are built upon does not continuously re-balance and update itself in the sense of reasoning as we know it in biological systems. Given that, trying to build general-purpose models this way will not get these companies and their clients the Swiss Army Knife they’re circulating trillions around. Even outside of LLMs, LRMs and diffusion models, this likely means any modeling that uses a DNN approach to generate something, including a set of actions, as is being promised with everything from self-driving cars to forthcoming humanoid home robots using World Modeling approaches, will have those same issues of constant-rate poisoning, hallucinations from data resolution limitations, and the inability to scale past their inelastic, pattern-matching underpinnings.

Even if we could ignore all of the external issues with this approach and its marketing to the world (and we can’t,) the internal issues are clear as crystal now. The sooner we can all get on that page, using DNNs only in the places they have genuinely improved things, like computational photography, or even for light recreation if trained and run ethically and sustainably, the sooner we can look at what other new frontiers in computing exist and discover what they can do to make our lives directly, reliably better.

Comments Off on Surrounded By Asymptotes: 4 Papers Showing LLMs & DNNs Won’t Become AGI


Everything Old is New Again

Posted on December 24th, 2022

Filed under: Code,General — Karl Olson @ 4:03 pm

I remember setting up a LiveJournal crossposter for my blog; now it’s an ActivityPub plugin. Let’s see if it worked.

Edit: not looking good.

Edit Edit: Yeah, I think I’ll turn this off until it’s something inbuilt to WordPress.

Comments Off on Everything Old is New Again


Some Discovery Warner Implosion Thoughts

Posted on August 19th, 2022

Filed under: Code,General — Karl Olson @ 5:21 pm

I have been tweeting a lot about the endless cancellations/shelving/write-offs/whatever going on over at Discovery/WB. Until yesterday, I was mostly thinking of it as just a fan or, at most, a long-time amateur industry observer/occasional podcast talking-head. Sure, by happenstance, I am very lucky and happy to have made a few friends in animation and localization, but generally, it all felt like a disaster in the distance, as previous entertainment industry management failures have for me.

Then Discovery started shelving if not fully memory-holing a bunch of shows that weren’t as far from me as I thought, at least once I thought about it. Victor & Valentino had development and storyboard work from my great friend and Storyboard Pro code collaborator Corey Barnes. Infinity Train had storyboard work from Marie Lum, who once kindly said the Storyboard Pro scripts that Corey & I built were worthy of a Winsor McCay Animation Lifetime Achievement award: a level of praise I never expected for any code I’d write.

Ruminating on those connections changed the context. This debacle is all at a very different distance than when I used to complain about TV network mismanagement as an aimless, 20-something forum-goer turned volunteer animation critic & forum mod. Sure, that also meant I was very aware of folks moving on to new roles and new opportunities; by and large, I know this won’t instantly throw people out in the cold. However, more than I’d ever known previously, I was keenly aware of just how much work was being cast into limbo, as I’d literally helped reduce the workload with the only relevant talent I could contribute. All these realizations did was make me more upset at how callous and unjust the rules around intellectual property and copyrights owned by companies are. The artists and their fans deserve better.

So, while I usually don’t get so heavy, I want to take a moment to say that if one of your favorite shows is caught up in all of this, and if the artists who made it have any direct support options – commissions, ko-fi, gumroad, patreon, etc. – now’s a good time to lend a hand, if only emotionally, if not materially, by taking advantage of those options. Further still, we need to agitate for changes in copyright and IP write-offs so that works intentionally orphaned via said write-offs either return to the original creatives or go instantly into the public domain, freeing the original creators, the greater staff and even fans to distribute and celebrate these works so they are not lost to time. I dearly hope reform like that happens, and that hope, as I have to admit myself, is no longer me pontificating as someone on the sidelines, but as someone who at least helped people play the game a little bit, and who would like to see a system that encourages their endeavors, not one that squanders them for a single quarter’s balance sheet.

Comments Off on Some Discovery Warner Implosion Thoughts


More Storyboard Pro Scripting!

Posted on November 24th, 2018

Filed under: Code — Karl Olson @ 7:13 pm

Yet again, I’ve helped my animator friend Corey Barnes (who I’ll again note has worked on a litany of animated series) with another very useful script for ToonBoom’s Storyboard Pro.

This time, we’ve made it easier to bulk rename layers across multiple panels, so storyboard artists can relabel layers after working on a draft. Instead of cleaning up each panel by hand, you select your panels, enter a new name, pick the layer on each panel (and/or skip panels you don’t want to rename anything on), then let it rip. It’s easy to use, just like our previous script for bulk editing captions across multiple panels. Both scripts are on Corey’s Gumroad page as pay what you’d like, and he and I should have more scripts to come on Gumroad, so if you’re an animator, you’ll likely want to get updates from his page in future.
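
For the curious, the core of the idea is roughly the loop below. This is only a minimal sketch, not the Gumroad script itself: the SelectionManager/LayerManager classes and method names are assumptions based on my reading of Toon Boom’s Storyboard Pro scripting docs, and the real script additionally prompts you to pick the layer on each panel and lets you skip panels outright.

```javascript
// Hedged sketch of the bulk-rename idea (QtScript, Storyboard Pro scripting).
// SelectionManager/LayerManager method names are assumptions from the SBP docs
// and may need adjusting; this simplified version just matches layers by name.
function bulkRenameLayers(oldName, newName) {
    var selection = new SelectionManager();
    var layers = new LayerManager();
    var panels = selection.getPanelSelection(); // IDs of the panels the artist selected

    for (var i = 0; i < panels.length; i++) {
        var panelId = panels[i];
        var count = layers.numberOfLayers(panelId);
        for (var j = 0; j < count; j++) {
            // Only relabel layers matching the old name; everything else is untouched.
            if (layers.layerName(panelId, j) == oldName) {
                layers.renameLayer(panelId, j, newName);
            }
        }
    }
}
```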

Comments Off on More Storyboard Pro Scripting!


But wait, there’s more (for Storyboard Pro users)

Posted on September 4th, 2018

Filed under: Code — Karl Olson @ 9:38 pm

Set or Reset Captions in Storyboard Pro.

After the very positive response to my first Storyboard Pro script from storyboarders and animators working at countless major animation studios all over the world, from many independent creators, and from the actual software company itself, I was eager to revise it with my friend Corey and make it even more useful by extending the simple caption-delete script into something with a text input, so you can also replace captions across multiple panels (or still just delete by leaving the text input empty!) So, after a few false starts and some double checking, the result is already being just as well received as the first script. I hope I can do some more of this work in future, so maybe I’ll see if I can get an old copy of SB Pro at some point (the script engine is an older version of Qt Script, so skipping the latest/greatest is probably fine.)
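
To give a sense of how small the jump from “delete” to “set or reset” is, the whole thing boils down to something like the sketch below. This is not the published script: the SelectionManager/CaptionManager calls and the “Dialog” field name are assumptions for illustration, so check them against the Storyboard Pro scripting docs.

```javascript
// Hedged sketch of the set-or-reset caption idea (QtScript, Storyboard Pro scripting).
// getPanelSelection/setPanelCaptionText are assumed API names, not verified signatures.
function setOrResetCaption(captionName, newText) {
    var selection = new SelectionManager();
    var captions = new CaptionManager();
    var panels = selection.getPanelSelection();

    for (var i = 0; i < panels.length; i++) {
        // An empty string simply clears the caption, which is how the original
        // delete-only behavior falls out of the same code path.
        captions.setPanelCaptionText(captionName, panels[i], newText);
    }
}

// Replace a caption across the selected panels, or clear it by passing "".
setOrResetCaption("Dialog", "TEMP - revise in next pass");
```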

Comments Off on But wait, there’s more (for Storyboard Pro users)


My First Storyboard Pro Script / Remote Pair Programming

Posted on August 26th, 2018

Filed under: Code — Karl Olson @ 12:50 pm

One side effect of writing and modding for an animation news website back in the day is that I made some great friends who went on to actually work in the animation, comics or anime/manga industries professionally. One such person is Corey Barnes, an animator/storyboarder/director who has worked on a number of high-profile projects in various animation roles, including Netflix’s Big Mouth, FX’s Archer, Adult Swim’s China, IL and much, much more.

Last night, he hit me up for programming advice with an issue inside of ToonBoom’s Storyboard Pro, a rather popular piece of software used for the storyboarding process in modern studios. Basically, while it’s easy to copy and paste existing storyboard panels, you’re then left clearing out by hand anything you didn’t want duplicated, such as captions, unless you’re feeling confident enough with QtScript (a cousin of JavaScript/ECMAScript) and Storyboard Pro’s API docs to automate cleaning up the specific info you want. Thus I hopped on Skype with screenshare, and over the next hour, we worked together, stumbling through the documentation to work out which options and data lived where, towards building this script: TB_Delete_Caption_Text.js on GitHub Gist (Dropbox.)
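
For anyone curious, the rough shape of what we landed on is below. This is a simplified, hedged sketch rather than the gist itself; the manager and method names are my best recollection of the Storyboard Pro scripting API and should be checked against the docs.

```javascript
// Hedged sketch of the caption-clearing idea (QtScript, Storyboard Pro scripting).
// Assumed API: SelectionManager.getPanelSelection() and
// CaptionManager.setPanelCaptionText(fieldName, panelId, text).
function deleteCaptionText(fieldsToClear) {
    var selection = new SelectionManager();
    var captions = new CaptionManager();
    var panels = selection.getPanelSelection(); // the panels the artist selected

    for (var i = 0; i < panels.length; i++) {
        for (var j = 0; j < fieldsToClear.length; j++) {
            // Blank out only the caption fields the artist chose to clear.
            captions.setPanelCaptionText(fieldsToClear[j], panels[i], "");
        }
    }
}

// e.g. wipe the dialog and notes left over from copying and pasting panels:
deleteCaptionText(["Dialog", "Notes"]);
```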

Now, instead of manually cleaning up each storyboard panel’s captions, you can just select as many panels as you want, pick which captions to clear in that selection, and the script does everything else. I have no idea how much time this might save storyboard artists, but it’s certainly proving popular with storyboarders on Twitter. I hope this is the start of writing more scripts that help animators. I can’t draw, but I certainly can code, and if I can use that skill so software gets out of the way of animators, that seems like a cool way to help with/participate in that industry.

Comments Off on My First Storyboard Pro Script / Remote Pair Programming


Nick Arcade Game Board Demo: A 3 hour React sketch.

Posted on August 2nd, 2018

Filed under: Code — Karl Olson @ 1:19 am

Remember Nick Arcade? If so, you might recall the show’s basic gameplay involved moving a cartoon character, Mikey, around a map, where you might reveal a prize, a trivia question, a game challenge or an event where you lose control to the other team. A friend of mine who streams old shows joked that he wanted to pick shows the same way – randomly as you moved around a map. So, while he streamed recently, I wrote up this very rough draft of the Nick Arcade board, complete with shows. Since it’s a pain to share a JSFiddle otherwise, here it is embedded.
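
If you’re wondering what that looks like under the hood, a stripped-down sketch of the board logic is below. It’s not the actual fiddle (and it uses modern React hooks rather than what the fiddle does): the show list, grid size and styling are placeholders, and the real version also handles the Nick Arcade event squares.

```javascript
// Minimal React sketch of the board idea: each square is randomly assigned a
// show when the board is built, and landing on (here, clicking) a square reveals it.
const SHOWS = ["Doug", "Rugrats", "Clarissa Explains It All", "Ren & Stimpy"];

function makeBoard(rows, cols) {
  return Array.from({ length: rows * cols }, () => ({
    show: SHOWS[Math.floor(Math.random() * SHOWS.length)],
    revealed: false,
  }));
}

function Board({ rows = 3, cols = 6 }) {
  // Lazy initializer so the random layout is built once per page load.
  const [squares, setSquares] = React.useState(() => makeBoard(rows, cols));

  const reveal = (i) =>
    setSquares((prev) =>
      prev.map((sq, idx) => (idx === i ? { ...sq, revealed: true } : sq))
    );

  return (
    <div style={{ display: "grid", gridTemplateColumns: `repeat(${cols}, 1fr)` }}>
      {squares.map((sq, i) => (
        <button key={i} onClick={() => reveal(i)}>
          {sq.revealed ? sq.show : "?"}
        </button>
      ))}
    </div>
  );
}
```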

In future, I’ll definitely add a component to populate the show list so it’s not fixed in the component (it is randomly built each load though – navigate around and see which “classic” you might get to watch,) options to load a background graphic, and maybe even an animated Mikey (well, probably not.) Still, I think I have a good basis here, and it’s a fun, small project I can probably handle in the background.

Edit: took an extra half hour the next day to get it a bit more visually polished, look at the fiddle history to see the process in action.

Comments Off on Nick Arcade Game Board Demo: A 3 hour React sketch.


Lyricalist Alpha & Nerdcore MCs’ Verbal Perspicacity

Posted on May 25th, 2014

Filed under: Code — Karl Olson @ 6:50 pm

So a while back a rather interesting chart was posted online noting the unique words used in the first 35,000 words of many major MCs. Of course, being an MC, musician and computer scientist, I was immediately intrigued yet let down. Obviously, you can’t really put up the exact same source lyrics since that would be infringement (sadly), but it would’ve been awesome if some source code had been available, along with at least a listing of the sources used.

So, I just built my own solution that would be a lot more transparent, and that could eventually act as a framework for something more collaborative.

Presenting: Lyricalist.

On one hand, it’s a really basic word count. However, it has a lot of nice, little tweaks that let the user (probably an insecure rapper like myself,) carefully manipulate the behavior of any automatic reformatting and correction of their text. To further demonstrate the transparency of how it works, it shows not only the unique counts, but also a raw JSON count of the object (which will probably be wired into some graphing functionality in future,) and the processed version of the input, so that you can see exactly what any replacement has done. Beyond that, there is also a field for inputting excluded words if you want to see what happens to the unique word count when you exclude certain common words as well.
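
For a sense of what “basic word count with tweaks” means in practice, here is a hedged sketch of the counting pass. It is not the Lyricalist source; the normalization rules here are a guess at reasonable defaults (the real page makes them user-tunable), but it shows where the unique count, the raw JSON count and the processed text each come from.

```javascript
// Hedged sketch of a Lyricalist-style unique word count (plain browser JavaScript).
// Normalization: lowercase, drop apostrophes, reduce other punctuation to spaces.
function uniqueWordCount(lyrics, excludedWords) {
  var excluded = {};
  (excludedWords || []).forEach(function (w) { excluded[w.toLowerCase()] = true; });

  var processed = lyrics
    .toLowerCase()
    .replace(/['’]/g, "")           // strip apostrophes so "don't" and "dont" match
    .replace(/[^a-z0-9\s-]/g, " "); // everything else non-word becomes a space

  var counts = {};
  processed.split(/\s+/).forEach(function (word) {
    if (word && !excluded[word]) {
      counts[word] = (counts[word] || 0) + 1;
    }
  });

  return {
    unique: Object.keys(counts).length, // the headline unique-word number
    counts: counts,                     // the raw JSON count shown on the page
    processed: processed                // so you can see exactly what the cleanup did
  };
}
```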

On that note, here are unique word counts for the following artists’ first 35,000 words (or as many words as they’ve released to date if they don’t have 35,000+ words on their own official, non-best-of albums):

Aesop Rock: 8411/35000 (sourced from lyricswiki**)
mc chris: 6467/35000 (sourced from A-Z Lyrics**)
MC Frontalot*: 5163/21708 (sourced from his own website**)
Whoremoans*: 3682/19779 (forwarded directly to me from a transcription, no edits made by me.)
Ultraklystron: 6492/35000 (from my own archives**)

*=under the 35000 count.
**=manually corrected for variations in spelling, transcription errors and reduced repetition of choruses when feasible.

I want to note these numbers are obtained after removing non-essential punctuation, making everything lowercase and removing apostrophes via the Lyricalist page. Additionally, as noted, I corrected for transcription errors and inconsistent spelling, and removed obvious repetition like choruses/hooks, since there is no consistency in how that kind of thing is notated. Also omitted, when possible/reasonable, were any lyrics from featured MCs on those artists’ releases. In the case of annotations that didn’t clearly break down which MCs said what, the entire song was cut from the count.

Lessons Learned so far:
-There is so much variation in the count due to the very issues I’m trying to correct for above that, at best, you can probably say that if two rappers are within 500 words of each other, they’re probably comparable when it comes to vocabulary, even after running corrections/clean-up over all of their lyrics. This is reinforced by the fact that MCs will trade places depending on whether apostrophes are removed or not.

-Most MCs hit a logarithmic ceiling as time goes on. Aesop doesn’t appear to, though. In fact, even if a fan put serious time into working through and getting a very accurate transcription of his first 35,000 words, fully corrected for any of the possible duplicates sneaking by, he’d still probably be smashing it.

-I have a lot of artists I want to gradually add to this list (MC Lars, Megaran and YTCracker to name 3 off the top of my head,) but finding a good, preferably single source for their lyrics is going to be critical to the accuracy of the analysis.

-Making this an un-moderated, open source list will probably be a fiasco. To make this work, it almost needs to be integrated into Rap Genius or something similar. I kind of hope this spurs the major lyrics sites into integrating this kind of analysis as a way of engaging people with the words of their favorite musicians and MCs at a lexicographical level.

Comments Off on Lyricalist Alpha & Nerdcore MCs’ Verbal Perspicacity


Snake Charming

Posted on October 5th, 2012

Filed under: Code — Karl Olson @ 4:04 am

Or, why I’ve added a new category to the blog.

I’ve come to realize that when I don’t have school work to do, music to write/produce for other rad musicians, or reviews to write about cartoons, I should probably try to be proactive in keeping my programming skills sharp. While I wouldn’t feel comfortable doing anything too serious at the moment, I figure short Python scripts might be a good plan. Thus, I’ve added a code section to my blog, and (hopefully) this means I’ll put the extra time in my day that isn’t going into the above responsibilities into something more productive than reading up on internet news I’ll forget a day later.

Eventually, these posts will just be updates saying I’ve posted something new to GitHub or something similar, but to inaugurate this, I have written a little script that compares the cost of ownership between two cars. Check it out after the break.

(more…)

Comments Off on Snake Charming