How a simple Twitter crush turned into three open source projects of questionable usefulness for the general public.

Last year I came across the Twitter profile of a developer who blogged/tweeted incessantly about computers. She had articles about deep dives into JavaScript libraries, being a woman of color in technology, being a student, a startup founder, speaking at conferences, etc. I had a technology crush on her work and, looking back, her internet presence served as early inspiration for me starting this blog. Anyways, one day I came across a cool project she had made to give an inspirational quote whenever you opened up your command line.

For the uninitiated, a command line is an interface that allows you to interact with your computer's operating system. You can use it to do things as simple as listing the files on your computer, or as complex as connecting to remote servers over the internet. Lots of developers spend tons of time on the command line, so it's cool to be able to add some personality to it. Generally speaking it'll look something like this:

Exploring Files via Terminal

I downloaded the quotes program and got it up and running. Every time I opened up the terminal - BAM! A new quote.

Fortuity Quotes on CLI Demo from Safia's ReadME... not sure how she got the GIF resolution so much nicer than mine 🤔

The project was cool, I love quotes.

But you know what I like more than quotes?

Bars.

Grown man, God body, Razor sharp, Language Bending, Fire in the booth, Bars.

I thought to myself - it'd be great if I could replace these inspirational quotes, with snippets from the most expansive, influential, and pervasive literary canon in human history. Rap lyrics.

A Problem

While my mission was now well defined, I still wasn't sure how to make it happen. The data source of the inspirational quotes was a completely separate API hosted by a third party. The actual command line tool was just making a request to the API for a quote every time the terminal opened... For some reason I had assumed that the quotes were stored within the module somewhere. Totally wrong. Was there a simple service out there that could just give you a quote from a rap song on demand?

My first thought was RapGenius, but I quickly got pretty frustrated with their API documentation. It looks nice, but after ~15 minutes of reading I realized they don't actually have direct lyric access through the API because of licensing agreements and things of that nature.

A fellow music enthusiast had written about their exact experience with this. The solution they proposed in the article works well for getting specific songs, but I wanted to grab whole artists' discographies quickly and efficiently. So back to the drawing board.

Path to A Solution

I decided to go the DIY route and collect the lyrics myself. Luckily, I was in a Slack group called Blacks In Technology and had been talking about my big dreams to anyone who would listen. The conversations eventually produced this exceedingly helpful gist from my friend ManagedKaos.

This code snippet would serve as the basis for a larger project to automate web scraping of azlyrics.com. That project would then become the basis of another project to provide a public API endpoint for anyone to retrieve bars from their favorite artist. And finally that project would feed content into the command line tool that I had been dreaming of.

Without further ado, the three projects.

Project #1: Webscraping AZ Lyrics

Without going into toooo much detail, here are the high level steps and things I ran into during development:

  1. Check the site's robots.txt file to make sure they are okay with scrapers.
  2. Gather the URLs of each artist's page on the lyrics website.
  3. Loop through the URLs and write some logic to parse through the HTML and filter out our relevant information (the lyrics). Take special care to account for special characters and their encodings.
  4. Don't get blocked. Since you are "viewing" the webpages via code, you can request the pages from the site much faster than people normally would during casual browsing. As a result of this superhuman web surfing capability, many sites will block your IP address if you start to take up more than your fair share of bandwidth. I added random sleep times between each request to take some load off of the site and prevent myself from getting blocked.
  5. Automate. I deployed the code as a Lambda on AWS and triggered it using CloudWatch events to automate this process. This means that every single day my store of rap lyrics grows by a couple of songs, which keeps the content fresh and minimizes the amount of maintenance required.
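To make steps 1 and 4 a little more concrete, here is a minimal Python sketch of the "be polite" part of a scraper. It is not the code from the gist. The function names, the wait times, and the idea of passing in your own `fetch` function are all assumptions made for illustration; only `urllib.robotparser` is a real standard library module.

```python
import random
import time
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text permits fetching `url`."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so the caller decides how to download it.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def fetch_politely(urls, fetch, min_wait=2.0, max_wait=8.0, sleep=time.sleep):
    """Call `fetch(url)` for each URL, pausing a random interval between requests.

    `fetch` is a hypothetical callable supplied by the caller (for example a
    thin wrapper around an HTTP client). The wait bounds are arbitrary
    defaults, not values taken from the original scraper.
    """
    pages = []
    for index, url in enumerate(urls):
        if index > 0:
            # Random sleep so requests don't arrive at machine speed.
            sleep(random.uniform(min_wait, max_wait))
        pages.append(fetch(url))
    return pages
```

Taking `sleep` as a parameter is just a convenience: it lets you swap in a fake during testing so you aren't actually waiting around.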

I've linked the gist here in case you are super interested in exactly how the scraping script works. The code works, but there are some admittedly unsavory patterns in there (catching unspecified exceptions, for instance), so feel free to ask in the comments if you have any questions.

Project #2: Standing Up an API Endpoint

Cardi B inspired this project about a year ago because literally every one of her songs had hella quotables in it. Fortunately the service has expanded to include a number of other sources of quotes and bars, but the name persists.

A quick walk through on using the API:

The quickest way to use the bars api is by sending a GET request.

URL: https://a3odwonexi.execute-api.us-east-2.amazonaws.com/default/Bars_API
Request Type: GET

This will return any quote from any artist available.

Test it out below; the API is currently live (🚨 warning, you're likely to encounter strong language).
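If you'd rather poke at it from code, here is a small Python sketch of that GET request using only the standard library. The function names and the timeout are my own assumptions, and since the response format isn't described here, the sketch just hands back the raw response text rather than guessing at its structure.

```python
import urllib.request

BARS_API_URL = "https://a3odwonexi.execute-api.us-east-2.amazonaws.com/default/Bars_API"


def build_get_request(url: str = BARS_API_URL) -> urllib.request.Request:
    """Build a plain GET request with no options (any quote, any artist)."""
    return urllib.request.Request(url, method="GET")


def fetch_random_bar(timeout: float = 10.0) -> str:
    """Send the GET request and return the raw response body as text.

    Assumes the body is UTF-8 encoded; the exact shape of the response
    is not specified here, so no parsing is attempted.
    """
    with urllib.request.urlopen(build_get_request(), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_random_bar()` hits the live endpoint, so the same strong language warning applies.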

Another approach to using the bars API is to send a POST request to specify more options.

This allows you to specify a number of artists from whom to retrieve quotes as well as the option to avoid any "bad" language. We included this in case you happen to be using the API at work or in an application that's meant to be a little more family friendly. You can enable this mode by including the "safe for work" parameter which we will show in some examples below.

Not specifying sfw means you are open to the full variety of colorful language included within the catalog.

The examples below highlight some of the options and how to use them.

Example 1
We are indicating to the api that we want a quote from Jay-Z or Earl Sweatshirt, however we don't mind curse words (this is implied through the absence of the sfw option):

Jay-Z, Earl Sweatshirt, and An Important Literary Device
URL: https://a3odwonexi.execute-api.us-east-2.amazonaws.com/default/Bars_API
Request Type: POST
Payload:

{
    "method": "getQuote",
    "category": ["jayz", "earlsweatshirt"]
}

Example 2
Another POST request example. Here we only want quotes from Tyler The Creator, but we don't want any offensive language... the irony 😆

Not Sure How Large The Set of Lyrics That Meet This Criteria Is.
URL: https://a3odwonexi.execute-api.us-east-2.amazonaws.com/default/Bars_API
Request Type: POST
Payload:

{
    "method": "getQuote",
    "category": ["sfw", "tylerthecreator"]
}

Example 3
And one more POST example for clarity; here we are indicating that we are open to quotes from any of the available sources, as long as they don't contain offensive language:

URL: https://a3odwonexi.execute-api.us-east-2.amazonaws.com/default/Bars_API
Request Type: POST
Payload:

{
    "method": "getQuote",
    "category": ["sfw"]
}
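The three payloads above all follow the same pattern, so here is a hedged Python sketch that builds them. The helper names are hypothetical, the `Content-Type` header is an assumption about what the endpoint expects, and I'm assuming the position of `sfw` in the list doesn't matter (it shows up first in one example and last in another elsewhere in this post).

```python
import json
import urllib.request

BARS_API_URL = "https://a3odwonexi.execute-api.us-east-2.amazonaws.com/default/Bars_API"


def build_payload(categories, sfw=False):
    """Return the request body: a method name plus a list of category entries."""
    category = list(categories)
    if sfw and "sfw" not in category:
        # "sfw" travels in the same list as the artist names.
        category.append("sfw")
    return {"method": "getQuote", "category": category}


def build_post_request(payload, url=BARS_API_URL):
    """Wrap the payload in a POST request with a JSON-encoded body."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},  # assumed, not confirmed
        method="POST",
    )
```

For instance, `build_payload(["tylerthecreator"], sfw=True)` produces the same entries as Example 2, and passing the result to `build_post_request` gives you something you can send with `urllib.request.urlopen`.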

Source code available here.

I initially used pythonanywhere to host the API because it's cheap, easy, and works really well with Flask, a lightweight Python framework for making web applications. Recently I migrated it to an AWS Lambda triggered via API Gateway to help handle larger loads and keep costs low. Serverless functions are cool or whatever: they offer pay-as-you-use plans and require zero server admin work. However, they also introduce the now-infamous cold start issue, which, depending on the nature of your application, may be an itty-bitty issue or a complete show stopper.
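For a rough picture of what a Lambda behind API Gateway looks like, here is a minimal Python sketch. To be clear, this is not the actual Bars API source: the `QUOTES` data, its field names, and the 404 behavior are all made up for illustration. The only parts borrowed from reality are the request format described above and the general shape of an API Gateway proxy event (a string `body` in, a `statusCode` and string `body` out).

```python
import json
import random

# Hypothetical stand-in data. Module-level code runs once when a new
# container starts (the cold start) and is reused by warm invocations.
QUOTES = [
    {"source": "artist_a", "text": "a clean placeholder line", "explicit": False},
    {"source": "artist_b", "text": "an explicit placeholder line", "explicit": True},
]


def lambda_handler(event, context):
    """Pick a random quote, honoring the optional category filters."""
    # A GET arrives with no body, so fall back to an empty JSON object.
    body = json.loads(event.get("body") or "{}")
    categories = body.get("category", [])
    sfw = "sfw" in categories
    sources = [c for c in categories if c != "sfw"]

    candidates = [
        quote
        for quote in QUOTES
        if (not sources or quote["source"] in sources)
        and not (sfw and quote["explicit"])
    ]
    if not candidates:
        return {"statusCode": 404, "body": json.dumps({"error": "no matching quote"})}
    return {"statusCode": 200, "body": json.dumps(random.choice(candidates))}
```

Anything expensive you do at module level (like loading a big lyrics file) is paid for on the cold start, which is one reason the issue can range from barely noticeable to painful.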

Project #3: Copying and Pasting Code from People Smarter Than Me

Leveraging the two projects above gets us to the last piece: the actual downloadable Node package. The package is hosted on npm and can be downloaded following the linked instructions. As I mentioned before, it's mostly a fork of the original project with a few poorly implemented modifications to parse the inputs and create POST request JSON bodies.

$ npm install -g cardib4cli

Once downloaded, it can be configured to run every time you open the terminal. The following section walks through a couple configurations and how to set them up in your terminal.

The "Give Me Everything" Configuration

The configuration below will run just a standard GET request with no options specified, meaning you'll get anything and everything available; you can explore exactly what is included in "everything available" on the API homepage.

$ echo 'cardib4cli' >> ~/.bash_profile

The "Woke" Configuration

The next configuration option will run a POST request where each word after cardib4cli separated by a space will be included as a category entry in the JSON body. Thus every time we open the command line we will get either a quote from James Baldwin or a lyric from Ms. Hill, and neither of them will contain curse words (since we are including the sfw option).

A command line configuration suitable for the wholesome, righteous, and woke amongst us.
$ echo 'cardib4cli james_baldwin lauryn_hill sfw' >> ~/.bash_profile
Wisdom and Bars. Sent from Above.
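The mapping from command line words to request is simple enough to sketch. The real package is written for Node; the Python below is only an illustration of the behavior described in the two configurations above, and `plan_request` is a hypothetical name, not something exported by cardib4cli.

```python
def plan_request(argv):
    """Map the words typed after the command to (HTTP method, JSON payload).

    No words means a plain GET with no payload. Otherwise every word
    becomes one entry in the payload's category list.
    """
    categories = [word for word in argv if word]
    if not categories:
        return "GET", None
    return "POST", {"method": "getQuote", "category": categories}


# In a real script you would pass sys.argv[1:], e.g.:
#   plan_request(["james_baldwin", "lauryn_hill", "sfw"])
```

So the "Give Me Everything" configuration corresponds to an empty list, and the "Woke" configuration corresponds to a POST with three category entries.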

If you loved this, consider any combination of the following actions:

  1. support
  2. subscribe
  3. comment

or....

(no, Maino is not an artist available in the bars api 🤣 butttt like if you wanted him to be, see step 3)