Bars For Days, Chips With Lays
How a simple Twitter crush turned into three open source projects of questionable usefulness for the general public.
Last year I came across the Twitter profile of a developer who blogged and tweeted incessantly about computers. They had articles about deep dives into JavaScript libraries, being a student, being a startup founder, speaking at conferences, etc. I had a technology crush on their work and, looking back, their internet presence served as early inspiration for me starting this blog. Anyways, one day I came across a cool project they had: it gave you an inspirational quote whenever you opened up your command line.
For the uninitiated, a command line is an interface that allows you to interact with your computer's operating system. You can use it to do things as simple as listing the files on your computer, or as complex as connecting to remote servers over the internet. Lots of developers spend tons of time on the command line, so it's cool to be able to add some personality to it. Generally speaking it'll look something like this:
I downloaded the quotes program and got it up and running. Every time I opened up the terminal - BAM! A new quote.
The project was cool, I love quotes.
But you know what I like more than quotes?
Bars.
Grown man, God body, Razor sharp, Language Bending, Fire in the booth, Bars.
I thought to myself - it'd be great if I could replace these inspirational quotes with snippets from the most expansive, influential, and pervasive literary canon in human history. Rap lyrics.
A Problem
While my mission was now well defined, I still wasn't sure how to make it happen. The data source of the inspirational quotes was a completely separate API hosted by a third party. The actual command line tool was just making a request to the API for a quote every time the terminal opened... For some reason I had assumed that the quotes were stored within the module somewhere. Totally wrong. Was there a simple service out there that could just give you a quote from a rap song on demand?
My first thought was RapGenius, but I quickly got pretty frustrated with their API documentation. It looks nice, but after ~15 minutes of reading I realized they don't actually offer direct lyric access through the API because of licensing agreements and things of that nature.
A fellow music enthusiast had written about their exact experience with this. The solution they proposed in the article works well for getting specific songs, but I wanted to grab whole artists' discographies quickly and efficiently. So back to the drawing board.
Path to A Solution
I decided to go the DIY route and collect the lyrics myself. Luckily, I was in a Slack group called Blacks In Technology and had been talking about my big dreams to anyone who would listen. The conversations eventually produced this exceedingly helpful gist from my friend ManagedKaos.
This code snippet would serve as the basis for a larger project to automate web scraping of azlyrics.com. That project would then become the basis of another project to provide a public API endpoint for anyone to retrieve bars from their favorite artist. And finally that project would feed content into the command line tool that I had been dreaming of.
Without further ado, the three projects.
Project #1: Webscraping AZ Lyrics
Without going into toooo much detail, here are the high level steps and things I ran into during development:
- Check the site's robots.txt file to make sure they are okay with scrapers.
- Gather the URLs of each artist's page on the lyrics website.
- Loop through the URLs and write some logic to parse the HTML and filter out the relevant information (the lyrics). Take special care to account for special characters and their encodings.
- Don't get blocked. Since you are "viewing" the webpages via code, you can request pages from the site much faster than people normally would during casual browsing. As a result of this superhuman web surfing capability, many sites will block your IP address if you start to take up more than your fair share of bandwidth. I added random sleep times between each request to take some load off of the site and prevent myself from getting blocked.
- Automate. I deployed the code as a Lambda on AWS and triggered it using CloudWatch Events to automate this process. This means that every single day my store of rap lyrics grows by a couple of songs, which keeps the content fresh and minimizes the amount of maintenance required.
I've linked the gist here in case you are super interested in exactly how the scraping script works. The code works, but there are some admittedly unsavory patterns in there (non-specific caught exceptions, for instance). Feel free to ask in the comments if you have any questions.
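To make the steps above a little more concrete, here is a minimal Python sketch of three of them: checking robots.txt, pulling text out of HTML (with entities decoded), and sleeping a random amount between requests. This is not the code from the gist. The function names, the delay range, and the sample inputs are all assumptions made for illustration, and it only uses the standard library.

```python
import random
import time
import urllib.robotparser
from html.parser import HTMLParser


def allowed_by_robots(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text permits fetching url."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


class TextExtractor(HTMLParser):
    """Collects the text of an HTML fragment and drops the tags."""

    def __init__(self):
        # convert_charrefs=True decodes entities such as &amp; into "&".
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def extract_text(html: str) -> str:
    """Return the non-empty text lines found in an HTML fragment."""
    extractor = TextExtractor()
    extractor.feed(html)
    extractor.close()
    lines = (line.strip() for line in "".join(extractor.chunks).splitlines())
    return "\n".join(line for line in lines if line)


def polite_pause(min_seconds: float = 5.0, max_seconds: float = 15.0) -> float:
    """Sleep for a random interval so requests are spread out.

    The default range is an assumption, not the value used in the gist.
    """
    delay = random.uniform(min_seconds, max_seconds)
    time.sleep(delay)
    return delay
```

In a real run you would fetch the site's robots.txt once, call `allowed_by_robots` before each page, and call `polite_pause()` between requests. Picking out only the lyrics from a page depends on that page's markup, so that part is left out here.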
Project #2: Standing Up an API Endpoint
Cardi B inspired this project about a year ago because literally every one of her songs had hella quotables in it. Fortunately the service has expanded to include a number of other sources of quotes and bars, but the name persists.
A quick walk through on using the API:
The quickest way to use the Bars API is by sending a GET request.
URL: https://a3odwonexi.execute-api.us-east-2.amazonaws.com/default/Bars_API
Request Type: GET
This will return any quote from any artist available.
Test it out below, the API is currently live (🚨 warning, you're likely to encounter strong language).
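If you'd rather poke at it from code, a plain GET could look roughly like the sketch below, using only Python's standard library. The helper names are made up for this example, and since the exact shape of the response isn't documented here, the sketch just returns the raw response body as text.

```python
import urllib.request

BARS_API_URL = (
    "https://a3odwonexi.execute-api.us-east-2.amazonaws.com/default/Bars_API"
)


def build_get_request(url: str = BARS_API_URL) -> urllib.request.Request:
    """Build (but do not send) a GET request with no options."""
    return urllib.request.Request(url, method="GET")


def fetch_bar(timeout: float = 10.0) -> str:
    """Send the GET request and return the raw response body as text."""
    with urllib.request.urlopen(build_get_request(), timeout=timeout) as response:
        return response.read().decode("utf-8")


# Usage (performs a real network call, so it is left commented out):
# print(fetch_bar())
```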
Another approach to using the Bars API is to send a POST request to specify more options.
This allows you to specify a number of artists from whom to retrieve quotes as well as the option to avoid any "bad" language. We included this in case you happen to be using the API at work or in an application that's meant to be a little more family friendly. You can enable this mode by including the "safe for work" parameter which we will show in some examples below.
Not specifying sfw means you are open to the full variety of colorful language included within the catalog.
The examples below highlight some of the options and how to use them.
Example 1
We are indicating to the API that we want a quote from Jay-Z or Earl Sweatshirt, and that we don't mind curse words (this is implied by the absence of the sfw option):
URL: https://a3odwonexi.execute-api.us-east-2.amazonaws.com/default/Bars_API
Request Type: POST
Payload:
{
  "method": "getQuote",
  "category": ["jayz", "earlsweatshirt"]
}
Example 2
Another POST request example: here we only want quotes from Tyler, the Creator, but we don't want any offensive language... the irony 😆
URL: https://a3odwonexi.execute-api.us-east-2.amazonaws.com/default/Bars_API
Request Type: POST
Payload:
{
  "method": "getQuote",
  "category": ["sfw", "tylerthecreator"]
}
Example 3
And one more POST example for clarity; here we are indicating that we are open to quotes from any of the available sources, as long as they don't contain offensive language:
URL: https://a3odwonexi.execute-api.us-east-2.amazonaws.com/default/Bars_API
Request Type: POST
Payload:
{
  "method": "getQuote",
  "category": ["sfw"]
}
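The three POST examples all follow the same pattern, so they can be sketched as one small Python helper that builds the payload and the request. The helper names are hypothetical, and the Content-Type header is an assumption on my part (the examples above only specify the URL, request type, and payload).

```python
import json
import urllib.request

BARS_API_URL = (
    "https://a3odwonexi.execute-api.us-east-2.amazonaws.com/default/Bars_API"
)


def build_payload(artists=(), sfw: bool = False) -> dict:
    """Build the getQuote payload; 'sfw' is just another category entry."""
    category = list(artists)
    if sfw:
        category.insert(0, "sfw")
    return {"method": "getQuote", "category": category}


def build_post_request(payload: dict, url: str = BARS_API_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the JSON payload."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        # Assumption: the endpoint accepts a standard JSON content type.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Example 2 from above: Tyler, the Creator, safe for work.
example_2 = build_post_request(build_payload(["tylerthecreator"], sfw=True))
# Sending it would be: urllib.request.urlopen(example_2, timeout=10)
```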
I initially used pythonanywhere to host the API because it's cheap, easy, and works really well with Flask, a lightweight Python framework for making web applications. Recently I migrated it to an AWS Lambda triggered via API Gateway to help handle larger loads and keep costs low. Serverless functions are cool or whatever: they offer pay-as-you-use plans and require zero server admin work. However, they also introduce the now infamous cold start issue, which, depending on the nature of your application, may be an itty-bitty issue or a complete show stopper.
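For a rough idea of what a Lambda behind API Gateway can look like for a service of this kind, here is a hypothetical Python handler. It is not the actual Bars API code: the in-memory catalog, the placeholder lines, the `pick_quote` helper, the `{"quote": ...}` response body, and the 404 for an empty result are all assumptions. The parts that are standard are the `lambda_handler(event, context)` signature and the proxy-style response with `statusCode`, `headers`, and a string `body`.

```python
import json
import random

# Hypothetical catalog rows: (category, has_strong_language, text).
CATALOG = [
    ("artist_a", False, "placeholder clean line from artist_a"),
    ("artist_a", True, "placeholder explicit line from artist_a"),
    ("artist_b", False, "placeholder clean line from artist_b"),
]


def pick_quote(categories, catalog=CATALOG, rng=random):
    """Pick a random line matching the requested categories.

    'sfw' acts as a filter instead of a source; no other categories
    means every source is fair game.
    """
    sfw = "sfw" in categories
    wanted = {c for c in categories if c != "sfw"}
    candidates = [
        text
        for category, explicit, text in catalog
        if (not wanted or category in wanted) and not (sfw and explicit)
    ]
    return rng.choice(candidates) if candidates else None


def lambda_handler(event, context):
    """Entry point for an API Gateway proxy event (body is a JSON string)."""
    body = json.loads(event.get("body") or "{}")
    quote = pick_quote(body.get("category", []))
    return {
        "statusCode": 200 if quote is not None else 404,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"quote": quote}),
    }
```

Anything at module level (like the catalog here) is loaded when a new execution environment starts, which is part of what you pay for on a cold start; warm invocations reuse it.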
Project #3: Copying and Pasting Code from People Smarter Than Me
Leveraging the two projects above gets us to the last piece: the actual downloadable Node package. The package is hosted on npm and can be downloaded following the linked instructions. As I mentioned before, it's mostly a fork of the original project with a few poorly implemented modifications to parse the inputs and create POST request JSON bodies.
$ npm install -g cardib4cli
Once downloaded, it can be configured to run every time you open the terminal. The following section walks through a couple configurations and how to set them up in your terminal.
The "Give Me Everything" Configuration
The configuration below will run just a standard GET request with no options specified, meaning you'll get anything and everything available; you can explore exactly what is included in "everything available" on the API homepage.
$ echo 'cardib4cli' >> ~/.bash_profile
The "Woke" Configuration
The next configuration option will run a POST request where each word after cardib4cli, separated by a space, will be included as a category entry in the JSON body. Thus every time we open the command line we will get either a quote from James Baldwin or a lyric from Ms. Hill, and neither of them will contain curse words (since we are including the sfw option).
$ echo 'cardib4cli james_baldwin lauryn_hill sfw' >> ~/.bash_profile
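The mapping from command line words to a request can be sketched in a few lines. The real package is a Node module, so the Python below is only an illustration of the behavior described above, and `request_for_args` is a hypothetical name rather than a function from the package.

```python
import json


def request_for_args(args):
    """Map the words after the command name to (HTTP method, JSON body).

    No words means a plain GET; otherwise every word becomes a
    category entry in a POST body.
    """
    if not args:
        return "GET", None
    body = json.dumps({"method": "getQuote", "category": list(args)})
    return "POST", body


# The "Woke" configuration from above:
method, body = request_for_args(["james_baldwin", "lauryn_hill", "sfw"])
```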
If you loved this, consider any combination of the following actions:
or....
(no, Maino is not an artist available in the bars api 🤣 butttt like if you wanted him to be, see step 3)