Posted on May 25th, 2014
Filed under: Code — Karl Olson @ 6:50 PM
So a while back a rather interesting chart was posted online showing the number of unique words used in the first 35,000 words of lyrics by many major MCs. Of course, being an MC, musician and computer scientist, I was immediately intrigued yet let down. Obviously, you can’t really put up the exact same source lyrics since that would be infringement (sadly), but it would’ve been awesome if the source code had been available, along with at least a listing of the sources used.
So I built my own solution, one that is a lot more transparent and that could eventually act as a framework for something more collaborative.
On one hand, it’s a really basic word count. However, it has a lot of nice little tweaks that let the user (probably an insecure rapper like myself) carefully control any automatic reformatting and correction of their text. To further demonstrate the transparency of how it works, it shows not only the unique counts but also the raw JSON of the count object (which will probably be wired into some graphing functionality in the future) and the processed version of the input, so that you can see exactly what any replacement has done. Beyond that, there is also a field for inputting excluded words, if you want to see what happens to the unique word count when certain common words are left out.
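For a rough idea of what that kind of count involves, here is a minimal Python sketch. It is not the tool's actual code: the function name `count_words`, the `excluded` parameter and the sample lyric are all made up for illustration, and it assumes words are simply split on whitespace.

```python
import json
from collections import Counter


def count_words(text, excluded=()):
    """Count whitespace-separated words, skipping any excluded ones.

    Hypothetical helper: lowercases everything and does no other cleanup.
    """
    skip = {word.lower() for word in excluded}
    words = [word for word in text.lower().split() if word not in skip]
    return Counter(words)


# Made-up sample input, with two common words excluded.
counts = count_words("the beat goes on and the beat drops", excluded=["the", "and"])

unique_total = len(counts)                     # number of distinct words left
raw_json = json.dumps(counts, sort_keys=True)  # raw dump of the count object

print(unique_total)
print(raw_json)
```

Dumping the whole count object, rather than only the total, is what makes the result checkable: anyone can see which words were counted and how often.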
On that note, here are unique word counts for the following artists’ first 35000 words (or as many words as they’ve released to date if they don’t have 35000+ words on their own official, non-best-of albums):
Aesop Rock: 8411/35000 (sourced from lyricswiki**)
mc chris: 6467/35000 (sourced from A-Z Lyrics**)
MC Frontalot*: 5163/21708 (sourced from his own website**)
Whoremoans*: 3682/19779 (forwarded directly to me from a transcription, no edits made by me.)
Ultraklystron: 6492/35000 (from my own archives**)
*=under the 35000 count.
**=manually corrected for variations in spelling, transcription errors and reduced repetition of choruses when feasible.
I want to note that these numbers were obtained after removing non-essential punctuation, making everything lowercase and removing apostrophes via the Lyricist page. Additionally, as noted, I corrected for transcription errors and inconsistent spelling, and removed obvious repetition like choruses/hooks, since there is no consistency in the notation of that kind of thing. Also omitted when possible/reasonable were any lyrics from featured MCs on those artists’ releases. In the case of annotations that didn’t clearly break down which MCs said what, the entire song was cut from the count.
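The cleanup steps described here can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the page's real code: `normalize` and `unique_in_first` are hypothetical names, and "non-essential punctuation" is assumed to mean everything that is not a letter, digit, underscore, apostrophe or whitespace.

```python
import re


def normalize(text, strip_apostrophes=True):
    """Lowercase the text, optionally drop apostrophes, and split into words."""
    text = text.lower()
    # Treat curly and straight apostrophes the same way.
    text = text.replace("\u2019", "'")
    if strip_apostrophes:
        text = text.replace("'", "")
    # Assumption: all other punctuation is non-essential and becomes a space.
    text = re.sub(r"[^\w\s']", " ", text)
    return text.split()


def unique_in_first(words, limit=35000):
    """Return (unique count, total count) for the first `limit` words."""
    sample = words[:limit]
    return len(set(sample)), len(sample)


words = normalize("Don\u2019t stop, DON'T stop!")
unique, total = unique_in_first(words)
print(f"{unique}/{total}")  # prints 2/4
```

The `strip_apostrophes` switch is there because the rankings can change depending on whether apostrophes are removed, so it is worth running the count both ways.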
Lessons Learned so far:
-There is so much variation in the count due to the very issues I’m trying to correct for above that, at best, you can probably say that if two rappers are within 500 words of each other, they’re probably comparable when it comes to vocabulary, even after running corrections/clean up over all of their lyrics. This is reinforced by the fact that MCs will trade places depending on whether apostrophes are removed or not.
-Most MCs hit a logarithmic ceiling as time goes on. Aesop doesn’t appear to, though. In fact, even if a fan put serious time into producing a very accurate transcription of his first 35000 words, fully corrected for any possible duplicates sneaking by, he’d still probably be smashing it.
-I have a lot of artists I want to gradually add to this list (MC Lars, Megaran and YTCracker, to name 3 off the top of my head), but finding a good, preferably single source for their lyrics is going to be critical to the accuracy of the analysis.
-Making this an un-moderated, open source list will probably be a fiasco. To make this work, it almost needs to be integrated into Rap Genius or something similar. I kind of hope this spurs the major lyrics sites into integrating this kind of analysis as a way of engaging people with the words of their favorite musicians and MCs at a lexicographical level.
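The second lesson above, about vocabulary leveling off, is the sort of thing that could be checked by tracking the unique count at regular checkpoints through an artist's words. The following Python sketch shows one way to do that. The function name `vocabulary_growth` and the `step` parameter are hypothetical, and the input is assumed to be an already cleaned list of words in release order.

```python
def vocabulary_growth(words, step=5000):
    """Return (words processed, unique words so far) at every `step` words.

    A final checkpoint is added at the end of the list if it does not
    already fall on a multiple of `step`.
    """
    seen = set()
    curve = []
    for index, word in enumerate(words, start=1):
        seen.add(word)
        if index % step == 0 or index == len(words):
            curve.append((index, len(seen)))
    return curve


# Tiny made-up example with a small step so the checkpoints are visible.
sample = "a b a c a b d a".split()
for processed, unique in vocabulary_growth(sample, step=4):
    print(processed, unique)
```

If the unique count barely moves between later checkpoints, the artist is flattening out. If it keeps climbing at a similar rate, that matches what the Aesop Rock numbers suggest.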
Comments Off on Lyricalist Alpha & Nerdcore MCs Verbal Perspicasity
Posted on March 31st, 2014
Filed under: Music News — Karl Olson @ 11:58 PM
This isn’t April Fools’ Day nonsense. Or if it is, all you have to do is press play above or here and spoil it in the comments. Anyways, pay what you want. I assume that’ll be 0 though.
Comments Off on Unwarranted Self Importance
Posted on November 29th, 2013
Filed under: General Noise,Music News — Karl Olson @ 10:43 PM
So I have added Bitcoin payment via Coinbase to the web store. It’s rough looking as ever (I really should redo it with divs and CSS), but for now, I’m ready to take BTC (though I’m keeping the prices in USD since that’s what I paid for the material in). I’ll admit I was a skeptic, and I still am a little, but sites like Coinbase seem to provide a reasonable means of giving it a shot, and it’s not like PayPal has been a bastion of honest dealings anyway.
To celebrate the occasion, let’s repost this remix, again:
Comments Off on Bitcoin Accepted Here, Apparently.
Posted on July 7th, 2013
Filed under: Music News,Videos — Karl Olson @ 5:27 PM
Besides all of the instrumentals I’d actually finished and, in some cases, even circulated to collaborators only to have them choose a handful from what was offered, I also have a number of songs that were done or close to done that I’d never even rendered out, because they were little one-off ideas I had while I was in the middle of other projects. However, a little digging let me put together the above Drum n’ Bass EP. It’s pay what you want (including free), so as long as you have time to download it, you might as well 😀
Regarding those other unreleased instrumentals, I’m going to consider putting those together into a release as well.
Oh, and here’s the YouTube version of the EP:
Comments Off on Still Breaking Beats
Posted on July 5th, 2013
Filed under: General Noise,Music News,Videos — Karl Olson @ 8:00 AM
All of my back catalog that wasn’t posted to my main YouTube channel is now online here. I decided to split that material off from my main channel as I no longer know which albums these belonged to (well, I didn’t want to dig through CDs at my parents’ place to work it out), and because most of it is fairly mediocre. Of course, I was only a teenager when I wrote most of it, and there are a few decent gems hiding in with the junk there, but it wasn’t worth binning the content beyond maybe a few YouTube playlists I’ll put together later. Besides, I figured that since I could upload it easily thanks to a couple of nifty open source Python scripts I tweaked and knitted together (I will post my code online once I clean it up a little), I might as well do so.
Having posted those songs, that pretty much leaves mash-ups, remixes for other artists and mixtapes as the only material I haven’t uploaded to YouTube. I don’t see myself bothering with uploading them at the moment. I now have over 500 songs on YouTube on those two accounts alone, and I figure that’s good enough for now.
Otherwise, the main thing going on with me is that I’ve graduated from university, and I’m looking for work. I’m doing a lot of programming tests at the moment, and though I’m feeling a little fatigued, I’m pretty sure that’s just anxiety regarding these opportunities. Hopefully, I’ll have something lined up soon enough.