For SEO audits, one area you may want to detect and store the different technologies a website is using. Sure you can spot check and run a few console commands, but what if you could have an API do it all for you. The service BuiltWith has just that capability! BuiltWith has several very interesting APIs you’ll want to look at, but this tutorial will focus on the “Domain API“. This API returns back a list of technologies detected on the website or things like DNS. Technologies like, are they using PHP, GTM, jQuery, Google Fonts, WordPress, IPv6, and hundreds more.
When you understand the technologies being used you can be better prepared to tackle issues as you understand their environment better. Additionally, BuiltWidth returns some additional information like some domain ranking scores and social account detection. See the full API documentation for full capabilities.
Note that BuiltWith has a free web interface to get this information. The advantage of using the API as always is to break free the data for you to store, scale, manipulate, visualize and automate.
Also, note that BuiltWith has a free API with limited functions, but most require credits, as does the API used in this tutorial. In general, I find the API costs fairly reasonable. I believe it is 2000 API calls for $100.
Table of Contents
Requirements and Assumptions
- Python 3 is installed and basic Python syntax understood
- Access to a Linux installation (I recommend Ubuntu) or Google Colab
Import Modules
- json: for handling the API response
- requests: for making the API call
import json import requests
Build API Call
Now that the modules have been installed and imported we need to set our domain variable which contains the site domain you want to use and use that domain variable in our API call we will store in the variable url. There are several parameters you can use for the API call. You can see their documentation here. We just need three:
- key: your API key for authentication
- liveonly: only send current information, not historical, which is an option
- lookup: the site domain you want to check
Be sure to add your own API Key to the URL.
domain = "rocketclicks.com" url = "https://api.builtwith.com/v14/api.json?KEY=YOUR_API_KEY&liveonly=yes&LOOKUP="+domain
Next, we build the GET API call using the URL variable we built and store the response in the variable data. We also create a couple of empty variables to store the tech and social data in later. You could use a pandas dataframe, but we’ll just do a simple console printout.
payload = {} headers= {} response = requests.request("GET", url, data = payload) data = response.json() technology="" social=""
Process API Response
We now have our API JSON response in the data. Let’s print out the name, documentation link, and description of each technology detected on the domain.
for result in data["Results"][0]["Result"]["Paths"]: for tech in result["Technologies"]: technology += tech["Name"] + "\n" + tech["Link"]+ "\n" + tech["Description"] + "\n"
Output
Note, this output image is cropped in width and height. There were dozens of technologies listed.
Let’s move on to printing out the detect social media account for the domain. The value is the URL of the social account. We’ll use a try/except because there could be domains without a detected social media account.
try: for value in data["Results"][0]["Meta"]["Social"]: social += str(value) + "\n" except: social="n/a"
Social Output
Next, we can print out the Majestic Rank, Referring Subnets, Referring IPs, Ahrefs Rank, QuantCast Rank, and Umbrella Rank. Note, you’ll only get data from these metrics if your domain is considered in the top 1M sites (by Majestic).
print("MJRank: " + str(data["Results"][0]["Attributes"]["MJRank"])) print("RefSN: " + str(data["Results"][0]["Attributes"]["RefSN"])) print("RefIP: " + str(data["Results"][0]["Attributes"]["RefIP"])) print("ARank: " + str(data["Results"][0]["Meta"]["ARank"])) print("QRank: " + str(data["Results"][0]["Meta"]["QRank"])) print("URank: " + str(data["Results"][0]["Meta"]["Umbrella"]))
Stats Output
As mentioned before there are other data returned. I only highlighted what I felt was most interesting, but to see everything just loop through the entire JSON response using the code below.
for key,value in data["Results"][0]["Attributes"].items(): print(str(key) + ":" + str(value)) for key,value in data["Results"][0]["Meta"].items(): print(str(key) + ":" + str(value))
That’s all folks! As always be sure to follow me and let me know how this script is working for you and how you are extending it.
- Find Interlinking Opps via Entity N-gram Matches Using Python - April 3, 2023
- Build and Run Python Scripts on the Fly With GPT-3 - January 5, 2023
- Compare Keyword SERP Similarity in Bulk with Python - November 18, 2022