Audit URLs for SEO Using Ahrefs Backlink API Data
This step-by-step SEO tutorial shows a Python script that retrieves and analyzes domain data using the Ahrefs API. It helps SEOs, webmasters, and data analysts monitor metrics such as broken backlinks, total backlinks, and domain rating.
Note: The script requires an Ahrefs API key. For large-scale or high-speed requests, use a paid API plan; free options may impose rate limits or delays. Obtain your API key from your Ahrefs account.
Requirements and Assumptions
- Basic Python Knowledge: Ensure Python 3 is installed and you’re familiar with Python syntax. Alternatively, use a notebook such as Google Colab.
- Ahrefs API Access: You’ll need an Ahrefs API key to run the API requests in this script.
- CSV File of Domains: A CSV file named domains.csv containing a column called domains with the URLs to analyze.
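To make the expected file shape concrete, here is a minimal sketch of a domains.csv and a check that it has the right column. It uses Python's built-in csv module rather than pandas so it runs without extra installs. The sample domains and the read_domains helper are placeholders for illustration, not part of the tutorial script.

```python
import csv
import io

# Placeholder contents for domains.csv; the domains are illustrative only.
SAMPLE_CSV = "domains\nexample.com\n example.org \n\n"


def read_domains(csv_text):
    """Return the non-empty, whitespace-trimmed values of the 'domains' column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames is None or "domains" not in reader.fieldnames:
        raise ValueError("expected a column named 'domains'")
    return [
        row["domains"].strip()
        for row in reader
        if (row["domains"] or "").strip()
    ]


print(read_domains(SAMPLE_CSV))
```

If the header is spelled differently (for example "domain"), the helper raises a ValueError instead of failing later inside the loop.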
Step 1: Install Required Libraries
The script requires pandas and requests for data handling and API calls. Install them with:
!pip3 install pandas requests
Step 2: Import Libraries
After installing, import the required libraries:
import pandas as pd
import requests
import json
import time
from datetime import date
Step 3: Set Up API and Define Functions
The following functions will connect to Ahrefs and retrieve metrics for each domain:
3.1. Fetch Broken Backlinks
This function retrieves broken backlinks from Ahrefs and returns how many were found. You can adjust the “limit” in the query string; it is set to 5000 here to reduce API costs, which also means the returned count never exceeds 5000.
def fetch_broken_backlinks(url, api_token):
    bl_url = "https://api.ahrefs.com/v3/site-explorer/broken-backlinks"
    headers = {
        "Accept": "application/json",
        "Authorization": f"Bearer {api_token}"
    }
    querystring = {"limit": "5000", "select": "http_code", "target": url, "aggregation": "similar_links"}
    try:
        response = requests.get(bl_url, headers=headers, params=querystring)
        response.raise_for_status()
        data = response.json()
    except requests.RequestException as e:
        print(f"Request failed: {e}")
        return 0
    return len(data.get('backlinks', []))
3.2. Fetch Domain Rating
Retrieve the domain rating using the Ahrefs API.
def fetch_domain_rating(url, api_token):
    dr_url = "https://api.ahrefs.com/v3/site-explorer/domain-rating"
    headers = {
        "Accept": "application/json",
        "Authorization": f"Bearer {api_token}"
    }
    querystring = {"target": url, "date": date.today().strftime("%Y-%m-%d")}
    try:
        response = requests.get(dr_url, headers=headers, params=querystring)
        response.raise_for_status()
        data = response.json()
    except requests.RequestException as e:
        print(f"Request failed: {e}")
        return "n/a"
    return data.get("domain_rating", {}).get("domain_rating", "n/a")
3.3. Fetch Total Backlinks
This function retrieves the total number of live backlinks for the target, queried in exact mode.
def fetch_backlinks(url, api_token):
    bl_url = "https://api.ahrefs.com/v3/site-explorer/backlinks-stats"
    headers = {
        "Accept": "application/json",
        "Authorization": f"Bearer {api_token}"
    }
    querystring = {
        "target": url,
        "mode": "exact",
        "output": "json",
        "date": date.today().strftime("%Y-%m-%d")
    }
    try:
        response = requests.get(bl_url, headers=headers, params=querystring)
        response.raise_for_status()
        data = response.json()
    except requests.RequestException as e:
        print(f"Request failed: {e}")
        return 0
    return int(data.get('metrics', {}).get('live', 0))
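The three fetch functions repeat the same headers and differ mainly in how they read the JSON response. The sketch below pulls those two pieces into small helpers so the parsing can be tried without an API key. The helper names are hypothetical, and the sample payloads are assumptions shaped after what the parsing code above expects, not copies of real Ahrefs responses.

```python
def ahrefs_headers(api_token):
    """Build the request headers that all three fetch functions repeat."""
    return {"Accept": "application/json", "Authorization": f"Bearer {api_token}"}


def count_broken(data):
    """Number of entries in the 'backlinks' list (0 if the key is missing)."""
    return len(data.get("backlinks", []))


def extract_rating(data):
    """Nested 'domain_rating' value, or 'n/a' if it is missing."""
    return data.get("domain_rating", {}).get("domain_rating", "n/a")


def extract_live(data):
    """Live backlink count from the 'metrics' object (0 if missing)."""
    return int(data.get("metrics", {}).get("live", 0))


# Hypothetical payloads, assumed to mirror the shapes the script parses.
broken_payload = {"backlinks": [{"http_code": 404}, {"http_code": 410}]}
rating_payload = {"domain_rating": {"domain_rating": 57.0}}
stats_payload = {"metrics": {"live": 1234}}

print(count_broken(broken_payload))    # 2
print(extract_rating(rating_payload))  # 57.0
print(extract_live(stats_payload))     # 1234
print(extract_rating({}))              # n/a
```

Keeping the parsing separate from the request makes it easier to notice when a response comes back empty rather than failing outright, since both cases currently produce 0 or "n/a".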
3.4. Check Domain Status
This function checks the HTTP status of each domain.
def get_status(url):
    try:
        response = requests.get(url, timeout=5)
        return response.status_code
    except requests.RequestException:
        return "n/a"
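get_status() needs a full URL including the protocol, while the domain list may or may not include one. As a rough sketch, a small helper can add the protocol only when it is missing, so the same code handles both kinds of CSV. The with_scheme name and its default_scheme parameter are hypothetical and not part of the original script.

```python
def with_scheme(address, default_scheme="https"):
    """Prefix the address with a scheme unless it already starts with http:// or https://."""
    address = address.strip()
    if address.lower().startswith(("http://", "https://")):
        return address
    return f"{default_scheme}://{address}"


print(with_scheme("example.com"))           # https://example.com
print(with_scheme("http://example.com"))    # http://example.com
```

With a helper like this, the status check could be called as get_status(with_scheme(url)) regardless of how the addresses were entered.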
Step 4: Read the Domain List and Loop Through URLs
Load the domain list from the CSV file and loop over each domain. Note: this example assumes your CSV does not include the protocol (http/https) in the addresses. If it does, remove the added protocol from the get_status() call in the code below.
urls = pd.read_csv("domains.csv")["domains"].str.strip().tolist()
results = []
api_token = ""  # Enter your Ahrefs API key
for count, url in enumerate(urls):
    time.sleep(1)  # Delay to avoid rate limits
    broken_backlinks = fetch_broken_backlinks(url, api_token)
    # Prepend the protocol only if the domains in your CSV don't already include it.
    url_status_code = get_status("https://" + url)
    backlinks = fetch_backlinks(url, api_token)
    rating = fetch_domain_rating(url, api_token)
    if count % 10 == 0:
        print(f"{count} - {url_status_code}: {url}")
    results.append({
        "URL": url,
        "Status Code": url_status_code,
        "Domain Rating": rating,
        "Broken Backlinks": broken_backlinks,
        "Total Backlinks": backlinks
    })
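Each domain in the list triggers three Ahrefs requests, so blank cells and duplicate rows in the CSV spend API units for nothing. One possible way to trim the list before the loop is sketched below. The unique_domains helper is hypothetical, and it assumes that domains differing only in letter case or surrounding whitespace should count as the same entry. pandas represents empty cells as NaN (a float), so non-string values are skipped.

```python
def unique_domains(urls):
    """Drop blanks, non-strings, and case-insensitive duplicates, keeping first-seen order."""
    seen = set()
    unique = []
    for url in urls:
        if not isinstance(url, str):
            continue  # e.g. NaN from an empty CSV cell
        cleaned = url.strip()
        key = cleaned.lower()
        if key and key not in seen:
            seen.add(key)
            unique.append(cleaned)
    return unique


sample = ["Example.com", "example.com ", "", float("nan"), "example.org"]
print(unique_domains(sample))  # ['Example.com', 'example.org']
```

In the script this would be applied once, right after the list is read, for example urls = unique_domains(urls).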
Step 5: Compile Data in a DataFrame and Export
The metrics collected for each URL are loaded into a DataFrame and exported to a CSV file named domain_metrics.csv.
df = pd.DataFrame(results)
df.to_csv("domain_metrics.csv", index=False)
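Once the rows are collected, a quick pass over them can show which domains need attention first. The sketch below works on a list of dictionaries with the same keys as the results list, using plain Python so it runs without pandas. The sample rows contain made-up values for illustration, and both helper names are hypothetical.

```python
def rank_by_broken_backlinks(rows):
    """Sort result rows by 'Broken Backlinks', highest first."""
    return sorted(rows, key=lambda row: row["Broken Backlinks"], reverse=True)


def not_ok(rows):
    """Rows whose status check failed ('n/a') or returned anything other than 200."""
    return [row for row in rows if row["Status Code"] != 200]


# Made-up rows in the same shape as the script's results list.
sample_results = [
    {"URL": "example.com", "Status Code": 200, "Domain Rating": 57.0,
     "Broken Backlinks": 12, "Total Backlinks": 340},
    {"URL": "example.org", "Status Code": "n/a", "Domain Rating": "n/a",
     "Broken Backlinks": 0, "Total Backlinks": 0},
    {"URL": "example.net", "Status Code": 404, "Domain Rating": 31.0,
     "Broken Backlinks": 45, "Total Backlinks": 900},
]

print([row["URL"] for row in rank_by_broken_backlinks(sample_results)])
print([row["URL"] for row in not_ok(sample_results)])
```

Because the fetch functions return 0 or "n/a" when a request fails, a row full of zeros alongside an "n/a" rating is more likely a failed lookup than a domain with no backlinks.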