In 2017, Chrome added a headless (no GUI) mode that can take a screenshot of a single web page at a specified viewport size. Screenshots like this are useful for keeping an archive for version comparison, monitoring, and client-facing deliverables. Because headless Chrome runs from the command line, it’s easy to drive from Python. In just a few lines we’ll run headless Chrome, take the screenshot, optimize it, and then it’s ready for whatever comes next, such as detecting changes or looking for performance opportunities. Let’s get started!
Requirements and Assumptions
- Python 3 is installed and basic Python syntax understood
- Access to a Linux installation (I recommend Ubuntu)
- Chrome Browser installed
Starting the Script
First, let’s install the optimize-images command-line tool, which we’ll use to compress the screenshot into an optimized PNG file.
pip3 install optimize-images
Next, we import our required modules. Everything we need is in the Python standard library: os to run shell commands, datetime to stamp the file name, and time to pause while those commands finish. Remember that optimize-images is a command-line tool, so we’ll need to run it as an executable rather than import it.
import os
from datetime import datetime
import time
Let’s set a few variables we’ll need. The name variable will be used when creating the screenshot file name. Reassign the name variable to your website name, with no spaces. Reassign the URL to the address of the page you want to take a screenshot of.
name = "importsem"
url = "https://importsem.com"
getdate = datetime.now().strftime("%m-%d-%y")
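The file names used later all follow the pattern name, then _org_ or _op_, then the date. As a minimal sketch, the pattern could be built in one place so the path is not retyped in every command. The function name screenshot_path and the dest_dir parameter are hypothetical and not part of the original script.

```python
from datetime import datetime
from pathlib import Path


def screenshot_path(dest_dir, name, kind, when=None):
    """Build <dest_dir>/<name>_<kind>_<mm-dd-yy>.png.

    kind is "org" for the original screenshot or "op" for the optimized copy.
    dest_dir is a placeholder for your own destination folder.
    """
    when = when or datetime.now()
    stamp = when.strftime("%m-%d-%y")  # same date format as getdate
    return Path(dest_dir) / f"{name}_{kind}_{stamp}.png"
```

Calling screenshot_path("/PATH_TO_DESTINATION", name, "org") would then give the same file name the script builds by string concatenation.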
Take the Screenshot
Now we can use the os module to run Chrome in headless mode, hide the scrollbars for a cleaner image, and set the window size. The window size is something you’ll have to experiment with depending on your layout. Be sure to swap in your screenshot destination path. Because os.popen() starts Chrome and returns immediately, we use a 15-second delay to give Chrome time to write the file before continuing. To compare the original size to the optimized size later, we use the os.stat() function to get the file size in bytes.
try:
    # Launch headless Chrome; os.popen() does not wait, so we sleep below
    stream = os.popen("chromium-browser --headless --hide-scrollbars --screenshot='/PATH_TO_DESTINATION/" + name + "_org_" + getdate + ".png' --window-size=1920,1200 " + url)
    time.sleep(15)
    # Size of the original screenshot in bytes
    org_png = os.stat("/PATH_TO_DESTINATION/" + name + "_org_" + getdate + ".png").st_size
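A fixed delay works, but it either wastes time or is too short on a slow page. As an alternative sketch (not the approach used in this script), subprocess.run() blocks until Chrome exits, so no sleep is needed. The function names and the chrome_bin parameter are assumptions for illustration; the Chrome flags are the same ones used above.

```python
import subprocess


def build_screenshot_command(url, out_path, width=1920, height=1200,
                             chrome_bin="chromium-browser"):
    """Return the headless Chrome command as an argument list."""
    return [
        chrome_bin,
        "--headless",
        "--hide-scrollbars",
        f"--screenshot={out_path}",
        f"--window-size={width},{height}",
        url,
    ]


def take_screenshot(url, out_path, timeout=60):
    """Run Chrome and wait for it to exit.

    Raises subprocess.CalledProcessError on a non-zero exit code and
    subprocess.TimeoutExpired if Chrome runs longer than `timeout` seconds.
    """
    cmd = build_screenshot_command(url, out_path)
    subprocess.run(cmd, check=True, timeout=timeout)
```

Passing the command as a list also avoids shell quoting problems when a URL contains characters such as & or ?.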
Optimize the Screenshot Image
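Now that we have our screenshot PNG we should optimize it to reduce the file size. We’ll use the optimize-images tool. The documentation describes some settings you can adjust to get better compression; I just leave them on default and generally save about 10-15% of the file size. Note that optimize-images overwrites the file it processes, so we first copy the original screenshot to a second file with _op_ in its name and optimize that copy, which keeps the original around for comparison. Once again be sure to change the destination path in the code. We will use a 20-second delay to make sure the file is ready before continuing.
    # optimize-images works in place, so optimize a copy and keep the original
    stream = os.popen("cp /PATH_TO_DESTINATION/" + name + "_org_" + getdate + ".png /PATH_TO_DESTINATION/" + name + "_op_" + getdate + ".png && optimize-images /PATH_TO_DESTINATION/" + name + "_op_" + getdate + ".png")
    time.sleep(20)
    op_png = os.stat("/PATH_TO_DESTINATION/" + name + "_op_" + getdate + ".png").st_size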
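The copy-then-optimize step can also be sketched without a shell command or a fixed delay. This is an illustrative variant, assuming optimize-images is on your PATH and accepts a file path as its argument; the function name optimize_copy and the optimizer_cmd parameter are hypothetical.

```python
import shutil
import subprocess


def optimize_copy(org_path, op_path, optimizer_cmd=("optimize-images",)):
    """Copy org_path to op_path, then optimize the copy in place.

    The original file is left untouched so the two sizes can be compared.
    subprocess.run() waits for the optimizer to finish and raises
    CalledProcessError if it exits with an error.
    """
    shutil.copyfile(org_path, op_path)
    subprocess.run([*optimizer_cmd, str(op_path)], check=True)
    return op_path
```

Keeping the optimizer command as a parameter makes it easy to add options later or to swap in a different compression tool.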
Now we close out the Try/Except catching any errors.
except Exception:
    print("Screenshot failed")
Compare the Two Images
To close out, let’s compare the file sizes and report the bytes saved from compression. First, we check if the images exist where we expect them to be. Then we print out each image size and calculate the difference.
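org_path = "/PATH_TO_DESTINATION/" + name + "_org_" + getdate + ".png"
op_path = "/PATH_TO_DESTINATION/" + name + "_op_" + getdate + ".png"
if os.path.isfile(org_path) and os.path.isfile(op_path):
    # Sizes are integers, so convert them before joining with text
    print("Original Image: " + str(org_png) + " bytes")
    print("Optimized Image: " + str(op_png) + " bytes")
    print("Saved: " + str(org_png - op_png) + " bytes")
else:
    print("One of the files doesn't exist")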
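If you also want the saving as a percentage, the arithmetic is simply the bytes saved divided by the original size. A small sketch, with the hypothetical function name savings_report:

```python
def savings_report(org_size, op_size):
    """Return (bytes_saved, percent_saved) for two file sizes in bytes."""
    saved = org_size - op_size
    # Guard against dividing by zero if the original file is empty
    percent = (saved / org_size * 100) if org_size else 0.0
    return saved, round(percent, 1)
```

For example, savings_report(org_png, op_png) could be logged alongside the date to track how image weight changes over time.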
Conclusion
You’ve just learned how easy it is to take screenshots from headless Chrome using Python. You can easily automate this script to store the data in a database and/or loop through an entire site by loading a CSV from a Screaming Frog crawl containing every URL on your site. Now get out there and try it out! Follow me on Twitter and let me know your Chrome screenshot applications and ideas!
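As a rough sketch of the crawl-driven idea, the snippet below reads URLs from a CSV and derives a file-name-safe name for each one. It assumes the export has a column named Address holding the URL; check your own export and adjust the column parameter if it differs. The function names are hypothetical.

```python
import csv
import io
import re
from urllib.parse import urlparse


def urls_from_crawl(csv_file, column="Address"):
    """Yield non-empty URLs from one column of a crawl export."""
    for row in csv.DictReader(csv_file):
        url = (row.get(column) or "").strip()
        if url:
            yield url


def slug_for(url):
    """Turn a URL into a lowercase slug usable in a file name."""
    parsed = urlparse(url)
    raw = parsed.netloc + parsed.path
    return re.sub(r"[^A-Za-z0-9]+", "-", raw).strip("-").lower()


# Example with an in-memory CSV; with a real export use open("crawl.csv", newline="")
sample = io.StringIO(
    "Address,Status Code\n"
    "https://importsem.com/,200\n"
    "https://importsem.com/about/,200\n"
)
for page_url in urls_from_crawl(sample):
    print(slug_for(page_url), page_url)
```

Inside the loop you would set name to the slug and url to the page URL, then run the screenshot and optimization steps for each page.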
Python and Chrome Screenshot FAQ
How can I use Python and Chrome to take screenshots of webpages programmatically?
To capture webpage screenshots using Python and Chrome, you can utilize tools like Selenium WebDriver. Write Python scripts that control Chrome via WebDriver, navigating to the desired webpage and taking screenshots using the appropriate commands.
What Python libraries are commonly used for taking screenshots with Chrome?
Selenium is a popular Python library for browser automation, including taking screenshots. Coupled with the ChromeDriver, it allows Python scripts to interact with the Chrome browser, enabling tasks like capturing webpage screenshots.
Are there specific dependencies or installations required to use Python and Chrome for webpage screenshots?
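Yes, you need to install the Selenium library and have a ChromeDriver executable that matches your Chrome version. Selenium can be installed via pip (pip install selenium). Recent Selenium releases (4.6 and later) can locate or download a matching ChromeDriver automatically; with older versions, download ChromeDriver yourself and specify its path in your Python script.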
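For comparison with the command-line approach in this article, here is a minimal Selenium sketch. It assumes Selenium 4 and Chrome are installed; the function names are hypothetical, and the Chrome flags mirror the ones used earlier.

```python
def chrome_arguments(width=1920, height=1200):
    """Command-line flags passed to Chrome through Selenium."""
    return ["--headless", "--hide-scrollbars", f"--window-size={width},{height}"]


def selenium_screenshot(url, out_path, width=1920, height=1200):
    """Open url in headless Chrome and save a viewport screenshot to out_path."""
    # Imported here so chrome_arguments() can be used without Selenium installed
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    for arg in chrome_arguments(width, height):
        options.add_argument(arg)

    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # save_screenshot() returns True if the file was written
        return driver.save_screenshot(out_path)
    finally:
        driver.quit()  # always close the browser, even on errors
```

Unlike the os.popen() version, driver.get() waits for the page load event before returning, so no fixed delay is needed for the screenshot itself.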
Can I capture full-page screenshots using Python and Chrome?
Yes, but it takes an extra step. Selenium’s standard screenshot call captures only the visible viewport in Chrome. To capture the entire page you can either set a window size tall enough to fit the whole page or request a full-page capture from Chrome through the Chrome DevTools Protocol.
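One way to do this is sketched below, assuming a Selenium Chrome driver that is already open on the page. It asks the DevTools Protocol for the page's content size and then requests a capture clipped to that size. The function names are hypothetical, and the exact fields returned by Page.getLayoutMetrics can vary between Chrome versions, so treat this as a starting point.

```python
import base64


def clip_from_metrics(metrics):
    """Build a capture clip covering the whole page from layout metrics."""
    # Newer Chrome versions report cssContentSize; older ones only contentSize
    size = metrics.get("cssContentSize") or metrics["contentSize"]
    return {"x": 0, "y": 0,
            "width": size["width"], "height": size["height"],
            "scale": 1}


def full_page_screenshot(driver, out_path):
    """Save a full-page PNG using Chrome DevTools Protocol commands."""
    metrics = driver.execute_cdp_cmd("Page.getLayoutMetrics", {})
    result = driver.execute_cdp_cmd("Page.captureScreenshot", {
        "format": "png",
        "captureBeyondViewport": True,
        "clip": clip_from_metrics(metrics),
    })
    # The protocol returns the image as a base64-encoded string
    with open(out_path, "wb") as fh:
        fh.write(base64.b64decode(result["data"]))
```

Very long pages can produce large images, so it is worth running the result through the same optimization step used for the viewport screenshot.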
Are there alternatives to Selenium for taking screenshots with Python and Chrome?
While Selenium is widely used, there are other browser automation tools that can drive headless Chrome from Python, such as Pyppeteer (an unofficial Python port of Puppeteer, which itself is a Node.js library) and Playwright for Python. However, Selenium remains a popular choice due to its versatility.