Mutale Mulenga
Dec 14, 2020

Here is something I have just learnt. Maybe, like me, you have a MySQL database with a long list of URLs and you wish to take screenshots of them.

After some research I found useful code on GitHub and Stack Overflow, but most of it showed how to screenshot URLs that were already hard-coded in the script.
In this case, we are getting the URLs from a MySQL database.

Process One

Make sure you have installed Selenium on your machine.
$admin: pip install selenium

The code below imports pymysql to talk to MySQL, so install that as well.
$admin: pip install pymysql

Process Two

Download chromedriver from https://chromedriver.chromium.org/downloads
You can also use the Firefox driver or any other of your choice, but in this case we are using chromedriver.
When you go to the link, select the version that matches your Chrome browser's version number. You can check your Chrome version at chrome://settings/help

Save the downloaded chromedriver in a location that you will reference in the code, e.g. on Windows you can save it in your programs folder, and on a Mac you can save it in the Applications folder.
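
If you are not sure the path is right, a quick check before starting Selenium can save some confusion. This is only a small sketch: find_chromedriver is a hypothetical helper (it is not part of Selenium), and it assumes the driver is either at the path you pass in or reachable as "chromedriver" on your system PATH.

```python
import os
import shutil


def find_chromedriver(explicit_path=None):
    """Return a chromedriver path, or None if nothing is found."""
    # Prefer the location the driver was saved to, if that file exists.
    if explicit_path and os.path.isfile(explicit_path):
        return explicit_path
    # Otherwise fall back to whatever "chromedriver" resolves to on PATH.
    return shutil.which("chromedriver")


if __name__ == "__main__":
    found = find_chromedriver("/Applications/chromedriver")
    print(found or "chromedriver not found, check where you saved it")
```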

Process Three

The code.

from selenium import webdriver
from selenium.common.exceptions import WebDriverException, NoSuchElementException, NoSuchWindowException
import pymysql
import time
#import requests
from urllib.parse import urlparse, ParseResult

# Define the chromedriver path, for example:

PATH = "/Applications/chromedriver"  # on a Mac

# PATH = r"C:\Programmes\chromedriver.exe"  # on Windows

# Connect to the MySQL database, for example:

server = pymysql.connect(host="localhost", user="root", passwd="", database="myDB")
cursor = server.cursor()

sql = cursor.execute('SELECT website_link FROM myDB')
rows = cursor.fetchall()
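
The rows come back as tuples, and a real table often has empty values or links saved without http:// in front, which the browser generally will not accept as a full URL. Below is a minimal sketch of tidying the rows first. The names clean_links and default_scheme are hypothetical, and defaulting to http is just an assumption you may want to change.

```python
def clean_links(rows, default_scheme="http"):
    """Turn DB rows like (url,) into a list of URLs that include a scheme."""
    links = []
    for row in rows:
        value = row[0]
        if value is None:
            continue  # NULL in the database
        url = str(value).strip()
        if not url:
            continue  # empty string
        if "://" not in url:
            # Assumption: links stored without a scheme are plain web addresses.
            url = "%s://%s" % (default_scheme, url)
        links.append(url)
    return links


if __name__ == "__main__":
    sample_rows = [("https://example.com",), (None,), ("  example.org ",), ("",)]
    print(clean_links(sample_rows))
```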

You can run Chrome headless or not.
Headless means chromedriver drives Chrome without opening a visible browser window.

Create a folder where you wish to store the screenshots. The code below uses a folder called screenshots/, relative to where you run the script.

options = webdriver.ChromeOptions()
options.add_argument('headless')  # we don't wish to launch the browser
folder = "screenshots/"

def strip_scheme(url):
    parsed = urlparse(url)
    scheme = "%s://" % parsed.scheme
    return parsed.geturl().replace(scheme, '', 1)

The function strip_scheme removes http:// or https:// from the URL so that the image is named after the site, like example.co-desktop.png
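
One thing to watch: if a stored URL has a path or query string (say https://example.com/blog/post?id=1), the stripped name still contains characters like / and ?, which do not work well in a file name. Here is a rough sketch of one way around that. safe_name and output_path are hypothetical helpers, and replacing the awkward characters with underscores is just one possible choice.

```python
import os
import re
from urllib.parse import urlparse


def strip_scheme(url):
    """Same idea as in the script: drop the leading http:// or https://."""
    parsed = urlparse(url)
    scheme = "%s://" % parsed.scheme
    return parsed.geturl().replace(scheme, "", 1)


def safe_name(url):
    """Hypothetical helper: make the stripped URL usable as a file name."""
    name = strip_scheme(url).strip("/")
    # Replace anything that is not a letter, digit, dot, dash or underscore.
    return re.sub(r"[^A-Za-z0-9._-]+", "_", name)


def output_path(folder, url, label):
    """Create the folder if needed and return e.g. folder/example.com-desktop.png."""
    os.makedirs(folder, exist_ok=True)
    return os.path.join(folder, "%s-%s.png" % (safe_name(url), label))


if __name__ == "__main__":
    print(output_path("screenshots", "https://example.com/blog/post?id=1", "desktop"))
```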

# Selenium 3 style call; newer Selenium 4 releases expect the driver path to be passed through a Service object instead.
with webdriver.Chrome(PATH, options=options) as driver:
    for row in rows:
        link = row[0]
        name = strip_scheme(link)
        desktop = {'output': folder + str(name) + '-desktop.png',
                   'width': 2200,
                   'height': 1800}
        tablet = {'output': folder + str(name) + '-tablet.png',
                  'width': 1200,
                  'height': 1400}
        mobile = {'output': folder + str(name) + '-mobile.png',
                  'width': 680,
                  'height': 1200}
        try:
            linkWithProtocol = str(link)

            # set the window size for desktop
            driver.set_window_size(desktop['width'], desktop['height'])

            driver.get(linkWithProtocol)
            time.sleep(2)
            driver.save_screenshot(desktop['output'])

            # set the window size for tablet
            driver.set_window_size(tablet['width'], tablet['height'])
            driver.get(linkWithProtocol)
            time.sleep(2)
            driver.save_screenshot(tablet['output'])

            # set the window size for mobile
            driver.set_window_size(mobile['width'], mobile['height'])
            driver.get(linkWithProtocol)
            time.sleep(2)
            driver.save_screenshot(mobile['output'])
        except WebDriverException:
            # skip URLs the browser cannot load and move on to the next one
            continue

The code above will screenshot each URL three times, as the Desktop, Tablet and Mobile versions of the site.
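
Since the three blocks only differ in size and file name, the same steps can also be written as a loop. This is just a sketch of that idea, not the original script: capture_all and VIEWPORTS are hypothetical names, and the function expects any driver object that offers set_window_size, get and save_screenshot, the way the Selenium driver does.

```python
import time

# Viewport sizes taken from the script above.
VIEWPORTS = {
    "desktop": (2200, 1800),
    "tablet": (1200, 1400),
    "mobile": (680, 1200),
}


def capture_all(driver, url, name, folder="screenshots/", wait=2):
    """Take one screenshot per viewport and return the files that were written."""
    saved = []
    for label, (width, height) in VIEWPORTS.items():
        output = "%s%s-%s.png" % (folder, name, label)
        driver.set_window_size(width, height)
        driver.get(url)
        time.sleep(wait)  # crude pause so the page can finish rendering
        # Selenium's save_screenshot returns False when the file could not be written.
        if driver.save_screenshot(output):
            saved.append(output)
    return saved
```

Inside the for loop over rows you would then call capture_all(driver, link, name) in place of the three repeated blocks.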

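The try/except is there so that if a URL cannot be loaded (for example the domain does not resolve or the connection is refused), it is skipped and the next URL is selected for screenshot. Note that a page answering with a 404 or 500 error still loads an error page, so Selenium does not raise an exception for it and the error page itself gets screenshotted.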
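
To actually skip 404 or 500 pages, a status check before the screenshot is one option. Here is a minimal sketch using only the standard library. http_status is a hypothetical helper, the 10 second timeout is an arbitrary choice, and some sites answer a plain script differently than they answer a browser, so treat the result as a hint.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=10, opener=urllib.request.urlopen):
    """Return the HTTP status code for url, or None if the site cannot be reached."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses are raised as HTTPError and carry the code.
        return err.code
    except (urllib.error.URLError, ValueError, OSError):
        # DNS failure, refused connection, timeout or a malformed URL.
        return None


def should_screenshot(status):
    """Only screenshot pages that answered with a non-error status."""
    return status is not None and status < 400
```

In the loop you could call should_screenshot(http_status(link)) and continue to the next row when it returns False.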

The Full Code.

from selenium import webdriver
from selenium.common.exceptions import WebDriverException, NoSuchElementException, NoSuchWindowException
import pymysql
import time
#import requests
from urllib.parse import urlparse, ParseResult

server = pymysql.connect(host="localhost", user="root", passwd="", database="myDB")
cursor = server.cursor()

sql = cursor.execute('SELECT website_link FROM myDB')
rows = cursor.fetchall()


PATH = "/Applications/chromedriver"

options = webdriver.ChromeOptions()
options.add_argument('headless') # we don't wish to launch the browser
folder = "screenshots/"

def strip_scheme(url):
    parsed = urlparse(url)
    scheme = "%s://" % parsed.scheme
    return parsed.geturl().replace(scheme, '', 1)


# Selenium 3 style call; newer Selenium 4 releases expect the driver path to be passed through a Service object instead.
with webdriver.Chrome(PATH, options=options) as driver:
    for row in rows:
        link = row[0]
        name = strip_scheme(link)
        desktop = {'output': folder + str(name) + '-desktop.png',
                   'width': 2200,
                   'height': 1800}
        tablet = {'output': folder + str(name) + '-tablet.png',
                  'width': 1200,
                  'height': 1400}
        mobile = {'output': folder + str(name) + '-mobile.png',
                  'width': 680,
                  'height': 1200}
        try:
            linkWithProtocol = str(link)

            # set the window size for desktop
            driver.set_window_size(desktop['width'], desktop['height'])

            driver.get(linkWithProtocol)
            time.sleep(2)
            driver.save_screenshot(desktop['output'])

            # set the window size for tablet
            driver.set_window_size(tablet['width'], tablet['height'])
            driver.get(linkWithProtocol)
            time.sleep(2)
            driver.save_screenshot(tablet['output'])

            # set the window size for mobile
            driver.set_window_size(mobile['width'], mobile['height'])
            driver.get(linkWithProtocol)
            time.sleep(2)
            driver.save_screenshot(mobile['output'])
        except WebDriverException:
            # skip URLs the browser cannot load and move on to the next one
            continue

This was something I enjoyed learning from https://github.com/Bilal-io/Selenium-Screenshot-Script,
and I added a few lines to make it pull the URLs dynamically from the database.
