The Python SEO movement is not new, but in recent years has been taken to a whole new level due to influencers like the late Hamlet Batista and the progression of data science tools. The hype is at an all-time high and for good reason, but this can lead to anxiety for SEOs wondering if they missed the train, if they have the time to learn, if it is worth it to learn, and where to start.
I don’t claim to be a Python expert, nor do I have all the answers. I just want to cover some frequently asked questions I see and answer them from my perspective as someone who has grappled with these questions over the past year. I hope the information below helps those questioning Python’s role in SEO and Python’s role in their own work. Let’s dive into some questions about how this SEO tool can elevate your tasks.
What is Python?
Python is a high-level, general-purpose programming language created in 1991 (30 years ago!). High-level usually refers to a programming language’s abstraction from a computer’s core functions. With Python, you don’t generally have to deal with memory, registers, and call stacks. High-level languages hand control of some tasks to the machine rather than making the programmer worry about everything. A “half-baked” (lol) analogy could be one of baking bread. You buy the ingredients and mix them all together, but the oven is already on and set to the right temp. You put the dough in the oven and take it out when the timer goes off. A high-level language means you don’t worry about turning the oven on, setting the oven’s temp, turning the oven off, or any of the actual mechanics of the oven.
Python has always had very strong support and usage in the STEM communities. You’ll find many computational physicists, chemists, biologists, etc. using Python for their models and experiments. Because of that, Python has developed powerful data science, data manipulation, and data visualization modules.
A module is a collection of functions revolving around a specific purpose that extends the capability of core Python functionality. Instead of everyone reinventing the wheel every time they write some code, you can use what someone has already done to be efficient. Think of it as a hardware store. Say you’re working on your car and you need a socket wrench. Do you go into your garage and create a socket wrench? No, someone has already created it! Go buy it from your hardware store and move on with your life. So, with Python, need to optimize an image? You don’t need to write it from scratch. Import the Pillow module, learn how its functions work from its documentation, customize, and apply.
Is Python Just a Trend?
Python has been used in SEO since near the beginning of, well, search engines! Guido van Rossum, the creator of Python, was in fact a Google employee as far back as 2005! Google was posting API examples in Python at least 10 years ago! However, there is no doubt that there has been a large Python movement in the past few years. This is in large part due to vocal influencers like the late Hamlet Batista. We lost Hamlet earlier this year, but his passion and the movement have continued. SEOs taking advantage of Python’s utility will go on for a very long time. Regardless of language, the concepts of automation, machine learning, and general scripting are not going away.
Can I Be an Elite SEO Without Python?
There is certainly some anxiety around the Python SEO movement. Some of that is due to the pressure SEOs feel to jump on the Python bandwagon when they aren’t ready or don’t see the application in what they are doing day-to-day. I will say, you can without question become an elite SEO without one line of Python code, or any programming code for that matter. SEO is so incredibly broad, and the basics of marketing are rooted in psychology, not technology. Marketing is about understanding people and their buying motivations. Through the decades, technology started to infuse into marketing departments, not replace them. If I want to buy a new jacket, my motivations don’t change whether I am using a phone for my purchase or parking my car, about to go in-store. Technology only changes how marketers interact with those motivations.
Python SEO is nothing more than a tool for efficiency and precision. Python doesn’t generally invent new core SEO processes that can’t be done manually (even if monstrously hard). You start with your SEO process and figure out how Python can make that process more efficient and precise, even with machine learning. Don’t you think that back in the ’60s, Fortune 500 companies were using rooms of statisticians to create data models? The only thing that has changed is that the room of statisticians has been replaced with a computer processor and a Python module for machine learning.
There are thousands of SEOs all over the world making big money without ever writing a line of Python. Move where your motivation takes you. There is a strong imposter syndrome in SEO due to its mysterious and mercurial nature. Some may worry they are not an elite SEO because they aren’t on the Python SEO wagon or that they aren’t good enough. The truth is, there will always be someone better at something than you, and if you are the best today, I doubt you will be tomorrow. Learn new technologies for the right reasons: because you love it and it helps you. Be an expert in something you love and you’ll be successful. If at some point you find some free time or have a process you think you could make more efficient or automate, consider looking at Python.
Why Python? Is It the Only Option?
Now, I’ll tackle why Python. There are a few large reasons:
- There has been strong STEM community support since Python’s beginning. There are very handy modules and support communities to be able to quickly do what you need to do and move on.
- Python was written to buck the C-style syntax, which can be considered messy and confusing. Python uses a more readable and compact syntax.
- There has been a rise in online notebooks like Google Colab and Jupyter Notebooks for people to quickly write, run, and share code for collaboration. You don’t need to set up a confusing environment in Linux (you can install on Windows, Linux, or code online!), set up a compiler, or mess with GitHub to share (although GitHub is still recommended).
Where Do I Start If I Don’t Have Much Time?
This is the first and most common hurdle. You read a Python article, feel motivated to start, and then reality hits. Now what? What is the first step? It looked easy in the article, but now I am lost and feeling overwhelmed. I think I’ll just read my email and watch some Netflix. This happens to just about everyone who’s not a professional or hobbyist developer. This is a matter of thinking too big and not coming to grips with the fact that programming is not a “learn today, use tomorrow”-type activity. I have been using Python at least once a week for almost exactly a year now and I very much still consider myself a novice. It takes time, patience, repetition, application, and a series of small wins to start.
I would recommend learning and working with Python 3 times a week for 30-60 minutes each session. When you’re not coding, try to keep Python on your mind every day. Keep thinking about your lessons and scripts, read module documentation, read Python articles, read Python news, discuss Python with your peers, and so on. You don’t have to be writing code all the time in order to learn and keep your mental muscle memory alive.
The first place to start is by having an application for how to use Python. It should be small and specific. It should not be, “I want to do machine learning.” That long-term goal is great, but it will take many months or years to be able to tackle that with confidence and success. Pick something like, “I want to scrape Wikipedia for mentions of XYZ” or “I want to check if my sitemap links are returning a code 200.”
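To give a feel for how small that second idea can be, here is a rough sketch that pulls the URLs out of a sitemap and reports any that don’t return a 200. It uses only Python’s standard library. The function names (`extract_sitemap_urls`, `fetch_status`, `find_broken_links`) and the example.com address are placeholders I made up for illustration, not part of any SEO module.

```python
import urllib.error
import urllib.request
import xml.etree.ElementTree as ET

# Standard XML namespace used by sitemap files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_sitemap_urls(sitemap_xml):
    """Return every <loc> URL found in a sitemap XML string (or bytes)."""
    root = ET.fromstring(sitemap_xml)
    return [loc.text.strip() for loc in root.findall(".//sm:loc", SITEMAP_NS) if loc.text]


def fetch_status(url, timeout=10):
    """Return the HTTP status code for a URL, e.g. 200 or 404."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx and 5xx responses are raised as exceptions; keep the code.
        return error.code


def find_broken_links(urls, get_status=fetch_status):
    """Return (url, status) pairs for every URL that is not a 200."""
    results = [(url, get_status(url)) for url in urls]
    return [(url, status) for url, status in results if status != 200]


# Example usage (hypothetical sitemap address, swap in your own):
#   with urllib.request.urlopen("https://www.example.com/sitemap.xml") as response:
#       urls = extract_sitemap_urls(response.read())
#   for url, status in find_broken_links(urls):
#       print(status, url)
```

Two caveats: `urlopen` follows redirects, so a 301 shows up as the status of the final page, and a site that is unreachable raises an error rather than returning a code. For a first script, that is plenty to build on.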
Where Can I Go to Start Learning Python Basics?
There are so many paths you can take. I had some past programming experience and still used a site called Codecademy. I believe the core Python 3 course is free, but the real money-maker is their data science Python course, which covers a bit of Pandas, NumPy, and Matplotlib. Of course, there are other course platforms to research, and Python.org has a page with some learning resources. I would always recommend learning the basic syntax before diving into scripts with a purpose. I know some people can dive right in, but true beginners should skip the tutorials on applications and put in the time to learn the core language. Otherwise, you’ll find yourself patting yourself on the back after hacking your way through an SEJ Python tutorial, thinking you learned something when you didn’t.
What Are Some Editors/IDEs I Can Use?
For experienced programmers, or those of you working on a more complex multi-file project, I’d recommend either PyCharm or Visual Studio for an IDE. I really enjoy Thonny for its simplicity when writing short, one-file scripts. Just write and run. No setup or config. Certainly, you can also write straight in the terminal or use any code editor like Notepad++ or Sublime Text.
You can also write straight in Google Colab.
What Are Some Basic Applications for Using Python?
The possibilities are endless which is one of the draws of programming. Some of the more common applications for using Python in SEO are web crawling and scraping, API calling and processing, data blending, data visualization, machine learning, natural language processing, and process automation. Think of an SEO process you’re sick of doing manually, start with that, and off you go! See some of my tutorial ideas from the homepage of this blog.
Can I Use Python as a Web App?
Yes! You can certainly be successful running your scripts locally or in Google Colab, but there is also the option to turn your scripts into a web app that can be used by your team or the public via the browser. There are a number of frameworks and platforms for deployment. Admittedly, I am not very experienced in web apps and it is one of my goals for 2021.
What Are Some Popular Modules?
I love finding new modules that help uncomplicate difficult tasks or open my mind to new possibilities. Rather than listing a bunch of one-off modules I’ve used, I’ll list the modules I find myself using over and over again.
Pandas is a data storage and organization module, like a non-GUI table or spreadsheet. It is so powerful I do believe it could replace your need for Excel or Google Sheets if you don’t need spreadsheet visualization. Pandas has been around for over 10 years, but just in the past few years, it has made some remarkable strides. Pandas is a staple for SEOs and data science because of its powerful data manipulation features. With one small line of code, you can import a CSV or Excel file that contains URLs and start feeding them to another module to crawl.
Requests has become the de facto HTTP module for Python (a position once held by urllib). It can get URL text, status codes, header info, JSON, encoding, and more!
JSON, for a while now, has been a preferred data format. Nearly every API will return its response in JSON format. To help work with that format, you can use the json module to encode and decode the JSON.
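Here is a minimal sketch of that round trip. The response text below is invented for illustration and does not come from a real API.

```python
import json

# A made-up API response as raw JSON text (illustrative only).
raw_response = '{"url": "https://www.example.com/", "status": 200, "keywords": ["python", "seo"]}'

# Decode: JSON text -> Python dictionary.
data = json.loads(raw_response)
print(data["status"])
print(data["keywords"][0])

# Encode: Python dictionary -> JSON text, indented for readability.
data["checked"] = True
pretty = json.dumps(data, indent=2)
print(pretty)
```

Once the response is a regular dictionary, you can loop over it, filter it, or hand it to Pandas like any other Python data.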
One of the most powerful things you can do is record and store information over time, and that is best done in a database. There are a few MySQL modules, but I found mysql.connector to be easy and straightforward when I was working with it.
Need to grab today’s date or compare the difference in days between 3 months ago and today? You’re going to need the datetime module. Not sexy, but many times necessary.
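As a small sketch of exactly that question: the standard library has no built-in “subtract N months”, so the `months_ago` helper below is my own illustration (it clamps to the last day of shorter months). Subtracting one date from another gives the difference in days.

```python
import calendar
from datetime import date


def months_ago(day, months):
    """Return the date `months` months before `day`, clamped to the month's length."""
    month_index = day.year * 12 + (day.month - 1) - months
    year, month = divmod(month_index, 12)
    month += 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day.day, last_day))


today = date.today()
three_months_ago = months_ago(today, 3)

# Subtracting two dates gives a timedelta; .days is the whole-day difference.
days_between = (today - three_months_ago).days
print(today.isoformat(), three_months_ago.isoformat(), days_between)
```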
advertools, a module bundle created by Elias Dabbas, has several very useful functions. The crawler simplifies the already useful module scrapy. There is a Twitter API function that is great for scraping Twitter. There are some light NLP functions and some nice sitemap features. Read the full docs and see what fits your need.
There are several modules for handling email, but I’ve always found yagmail to be easy. There are many times when you need to send an email such as after a monitoring alert has been triggered. I’ll also mention there are some nice SMS modules and APIs that can send texts to your phone for very, very cheap. I monitor my websites for downtime. I get a text if the site is down and a text when it is back up.
If you can’t live in Pandas forever, the gspread module is great for interacting with Google Sheets. You can update, write, or read specific cells or the entire sheet. Speaking of Pandas, there are ways to pass data between the two modules using two gspread extensions called gspread-dataframe and gspread-pandas.
The time module helps when you want a script to pause or to measure how much time has passed. I use it for crawling, especially Google SERPs, where you can get IP-blocked super fast if you send requests back to back. Take pauses of random length, between 15 and 20 seconds per request, and in my experience Google lets you scrape.
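A minimal sketch of randomized pauses plus timing looks like this. The `polite_pause` name and the URLs are placeholders I made up, and the standard library’s `random` module stands in here for any other random number source. The demo uses very short pauses so it finishes quickly.

```python
import random
import time


def polite_pause(min_seconds=15, max_seconds=20):
    """Sleep for a random duration in the given range and return it."""
    delay = random.uniform(min_seconds, max_seconds)
    time.sleep(delay)
    return delay


# Demo with made-up URLs and very short pauses so it finishes quickly.
# For real crawling, use the defaults (15-20 seconds) instead.
urls = ["https://www.example.com/a", "https://www.example.com/b", "https://www.example.com/c"]

start = time.perf_counter()
for url in urls:
    # ... fetch and process the URL here ...
    waited = polite_pause(0.1, 0.2)
    print(f"{url}: paused {waited:.2f}s")
elapsed = time.perf_counter() - start
print(f"Total time: {elapsed:.2f}s")
```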
The os module lets you interact with the filesystem on your operating system, even in Google Colab. Need to build file paths, create folders, or list the files you want to open or save? You’ll need the os module!
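Here is a small sketch of the kind of file housekeeping I mean. It works inside a temporary folder so it won’t touch your real files, and the folder and file names are invented for the example.

```python
import os
import tempfile

# Work inside a throwaway folder so the example does not touch real files.
with tempfile.TemporaryDirectory() as folder:
    # Build paths with os.path.join so they work on Windows, Mac, and Linux.
    export_dir = os.path.join(folder, "exports")
    os.makedirs(export_dir, exist_ok=True)

    # Save a tiny report into the new folder.
    report_path = os.path.join(export_dir, "crawl_report.csv")
    with open(report_path, "w", encoding="utf-8") as report:
        report.write("url,status\nhttps://www.example.com/,200\n")

    # List the CSV files in the folder and check the one we just wrote.
    csv_files = [name for name in os.listdir(export_dir) if name.endswith(".csv")]
    report_exists = os.path.exists(report_path)
    report_size = os.path.getsize(report_path)
    print(csv_files, report_exists, report_size)
```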
The Pillow module (formerly PIL) is for light image optimization and manipulation. Sounds complicated, but for the vanilla functions, it’s a breeze.
re is for using regular expressions. RegEx is very powerful and useful in just about any script. I especially use this for some scraping.
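As a quick sketch of the scraping use case, the snippet below pulls a page title and links out of a made-up block of HTML. Regular expressions are fragile on messy real-world HTML, so treat this as a light-duty trick rather than a replacement for a proper parser.

```python
import re

# Made-up HTML for illustration.
html = """
<html><head><title>Blue Widgets | Example Store</title>
<meta name="description" content="Buy blue widgets online."></head>
<body><a href="https://www.example.com/widgets/blue">Blue</a>
<a href="/widgets/red">Red</a></body></html>
"""

# Grab the text between the <title> tags.
title_match = re.search(r"<title>(.*?)</title>", html, flags=re.IGNORECASE | re.DOTALL)
title = title_match.group(1).strip() if title_match else None

# Collect every href value, then keep only the absolute URLs.
hrefs = re.findall(r'href="([^"]+)"', html)
absolute_links = [href for href in hrefs if re.match(r"https?://", href)]

print(title)
print(hrefs)
print(absolute_links)
```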
matplotlib is a powerful graphing module that lets you turn data into a visual. To be honest, this one often trips me up. I’m not good at statistics or data visualization, so I stick to simple line and box charts, but the potential is there to get very complex.
NumPy is a mathematics module I primarily use for its random number function. This helps in conjunction with the time module to randomize script pauses. Overall, it’s a staple for anyone doing data science or advanced mathematical computations.
Beautiful Soup 4 (BS4) is an HTML and XML parser. It’s a bit different from a crawler. It won’t natively seek out URLs and automatically crawl and scrape them for SEO-important info like scrapy does, but if you are interested in manually web scraping at a granular level, this is the module for you.
Those are the majority of modules I use! There are a handful of others, like ssl, NLTK, difflib, fake_useragent, google.cloud, io, and collections, that I’ll use for special circumstances.
Who Should I Follow for Python SEO?
Below are people who I follow, who post regularly about Python, and who I trust on Python SEO. There are no doubt dozens and maybe hundreds of amazing Pythonistas out there on social media posting about their amazing work, so search and find out, but you may choose to start with the people listed below:
If you’ve read this far, I’ll be brief! Python is just one more tool available in the SEO workbench. If you are interested, dive in, but it takes time. If nothing else, you now have a base of understanding and you can introduce someone on your team to Python. Enjoy the journey, not just the end result. Most importantly: stay curious, follow your motivations, and compare yourself today to yourself yesterday and not to anyone else!
Follow me on Twitter and let me know your applications and ideas! I’ll likely be updating this as I think of more beginner questions. Please feel free to reach out to give me a question you want me to answer here.