by Al Sweigart
First, a photo file must have the file extension .png or .jpg. Also, photos are large images; a photo file’s width and height must both be larger than 500 pixels. This is a safe bet, since most digital camera photos are several thousand pixels in width and height.
As a hint, here’s a rough skeleton of what this program might look like:
#! python3 # Import modules and write comments to describe this program. for foldername, subfolders, filenames in os.walk('C:\'): numPhotoFiles = 0 numNonPhotoFiles = 0 for filename in filenames: # Check if file extension isn't .png or .jpg. if TODO: numNonPhotoFiles += 1 continue # skip to next filename # Open image file using Pillow. # Check if width & height are larger than 500. if TODO: # Image is large enough to be considered a photo. numPhotoFiles += 1 else: # Image is too small to be a photo. numNonPhotoFiles += 1 # If more than half of files were photos, # print the absolute path of the folder. if TODO: print(TODO)
When the program runs, it should print the absolute path of any photo folders to the screen.
Custom Seating Cards
Chapter 13 included a practice project to create custom invitations from a list of guests in a plaintext file. As an additional project, use the pillow module to create images for custom seating cards for your guests. For each of the guests listed in the guests.txt file from the resources at http://nostarch.com/automatestuff/, generate an image file with the guest name and some flowery decoration. A public domain flower image is available in the resources at http://nostarch.com/automatestuff/.
To ensure that each seating card is the same size, add a black rectangle on the edges of the invitation image so that when the image is printed out, there will be a guideline for cutting. The PNG files that Pillow produces are set to 72 pixels per inch, so a 4×5-inch card would require a 288×360-pixel image.
Chapter 18. Controlling the Keyboard and Mouse with GUI Automation
Knowing various Python modules for editing spreadsheets, downloading files, and launching programs is useful, but sometimes there just aren’t any modules for the applications you need to work with. The ultimate tools for automating tasks on your computer are programs you write that directly control the keyboard and mouse. These programs can control other applications by sending them virtual keystrokes and mouse clicks, just as if you were sitting at your computer and interacting with the applications yourself. This technique is known as graphical user interface automation, or GUI automation for short. With GUI automation, your programs can do anything that a human user sitting at the computer can do, except spill coffee on the keyboard.
Think of GUI automation as programming a robotic arm. You can program the robotic arm to type at your keyboard and move your mouse for you. This technique is particularly useful for tasks that involve a lot of mindless clicking or filling out of forms.
The pyautogui module has functions for simulating mouse movements, button clicks, and scrolling the mouse wheel. This chapter covers only a subset of PyAutoGUI’s features; you can find the full documentation at http://pyautogui.readthedocs.org/.
Installing the pyautogui Module
The pyautogui module can send virtual keypresses and mouse clicks to Windows, OS X, and Linux. Depending on which operating system you’re using, you may have to install some other modules (called dependencies) before you can install PyAutoGUI.
On Windows, there are no other modules to install.
On OS X, run sudo pip3 install pyobjc-framework-Quartz, sudo pip3 install pyobjc-core, and then sudo pip3 install pyobjc.
On Linux, run sudo pip3 install python3-xlib, sudo apt-get install scrot, sudo apt-get install python3-tk, and sudo apt-get install python3-dev. (Scrot is a screenshot program that PyAutoGUI uses.)
After these dependencies are installed, run pip install pyautogui (or pip3 on OS X and Linux) to install PyAutoGUI.
Appendix A has complete information on installing third-party modules. To test whether PyAutoGUI has been installed correctly, run import pyautogui from the interactive shell and check for any error messages.
Staying on Track
Before you jump in to a GUI automation, you should know how to escape problems that may arise. Python can move your mouse and type keystrokes at an incredible speed. In fact, it might be too fast for other programs to keep up with. Also, if something goes wrong but your program keeps moving the mouse around, it will be hard to tell what exactly the program is doing or how to recover from the problem. Like the enchanted brooms from Disney’s The Sorcerer’s Apprentice, which kept filling—and then overfilling—Mickey’s tub with water, your program could get out of control even though it’s following your instructions perfectly. Stopping the program can be difficult if the mouse is moving around on its own, preventing you from clicking the IDLE window to close it. Fortunately, there are several ways to prevent or recover from GUI automation problems.
Shutting Down Everything by Logging Out
Perhaps the simplest way to stop an out-of-control GUI automation program is to log out, which will shut down all running programs. On Windows and Linux, the logout hotkey is CTRL-ALT-DEL. On OS X, it is -SHIFT-OPTION-Q. By logging out, you’ll lose any unsaved work, but at least you won’t have to wait for a full reboot of the computer.
Pauses and Fail-Safes
You can tell your script to wait after every function call, giving you a short window to take control of the mouse and keyboard if something goes wrong. To do this, set the pyautogui.PAUSE variable to the number of seconds you want it to pause. For example, after setting pyautogui.PAUSE = 1.5, every PyAutoGUI function call will wait one and a half seconds after performing its action. Non-PyAutoGUI instructions will not have this pause.
PyAutoGUI also has a fail-safe feature. Moving the mouse cursor to the upper-left corner of the screen will cause PyAutoGUI to raise the pyautogui.FailSafeException exception. Your program can either handle this exception with try and except statements or let the exception crash your program. Either way, the fail-safe feature will stop the program if you quickly move the mouse as far up and left as you can. You can disable this feature by setting pyautogui.FAILSAFE = False. Enter the following into the interactive shell:
>>> import pyautogui >>> pyautogui.PAUSE = 1 >>> pyautogui.FAILSAFE = True
Here we import pyautogui and set pyautogui.PAUSE to 1 for a one-second pause after each function call. We set pyautogui.FAILSAFE to True to enable the fail-safe feature.
Controlling Mouse Movement
In this section, you’ll learn how to move the mouse and track its position on the screen using PyAutoGUI, but first you need to understand how PyAutoGUI works with coordinates.
The mouse functions of PyAutoGUI use x- and y-coordinates. Figure 18-1 shows the coordinate system for the computer screen; it’s similar to the coordinate system used for images, discussed in Chapter 17. The origin, where x and y are both zero, is at the upper-left corner of the screen. The x-coordinates increase going to the right, and the y-coordinates increase going down. All coordinates are positive integers; there are no negative coordinates.
Figure 18-1. The coordinates of a computer screen with 1920×1080 resolution
Your resolution is how many pixels wide and tall your screen is. If your screen’s resolution is set to 1920×1080, then the coordinate for the upper-left corner will be (0, 0), and the coordinate for the bottom-right corner will be (1919, 1079).
The pyautogui.size() function returns a two-integer tuple of the screen’s width and height in pixels. Enter the following into the interactive shell:
>>> import pyautogui >>> pyautogui.size() (1920, 1080) >>> width, height = pyautogui.size()
pyautogui.size() returns (1920, 1080) on a computer with a 1920×1080 resolution; depending on your screen’s resolution, your return value may be different. You can store the width and height from pyautogui.size() in variables like width and height for better readability in your programs.
Moving the Mouse
Now that you understand screen coordinates, let’s move the mouse.
The pyautogui.moveTo() function will instantly move the mouse cursor to a specified position on the screen. Integer values for the x- and y-coordinates make up the function’s first and second arguments, respectively. An optional duration integer or float keyword argument specifies the number of seconds it should take to move the mouse to the destination. If you leave it out, the default is 0 for instantaneous movement. (All of the duration keyword arguments in PyAutoGUI functions are optional.) Enter the following into the interactive shell:
>>> import pyautogui >>> for i in range(10): pyautogui.moveTo(100, 100, duration=0.25) pyautogui.moveTo(200, 100, duration=0.25) pyautogui.moveTo(200, 200, duration=0.25) pyautogui.moveTo(100, 200, duration=0.25)
This example moves the mouse cursor clockwise in a square pattern among the four coordinates provided a total of ten times. Each movement takes a quarter of a second, as specified by the duration=0.25 keyword argument. If you hadn’t passed a third argument to any of the pyautogui.moveTo() calls, the mouse cursor would have instantly teleported from point to point.
The pyautogui.moveRel() function moves the mouse cursor relative to its current position. The following example moves the mouse in the same square pattern, except it begins the square from wherever the mouse happens to be on the screen when the code starts running:
>>> import pyautogui >>> for i in range(10): pyautogui.moveRel(100, 0, duration=0.25) pyautogui.moveRel(0, 100, duration=0.25) pyautogui.moveRel(-100, 0, duration=0.25) pyautogui.moveRel(0, -100, duration=0.25)
pyautogui.moveRel() also takes three arguments: how many pixels to move horizontally to the right, how many pixels to move vertically downward, and (optionally) how long it should take to complete the movement. A negative integer for the first or second argument will cause the mouse to move left or upward, respectively.
Getting the Mouse Position
You can determine the mouse’s current position by calling the pyautogui.position() function, which will return a tuple of the mouse cursor’s x and y positions at the time of the function call. Enter the following into the interactive shell, moving the mouse around after each call:
>>> pyautogui.position() (311, 622) >>> pyautogui.position() (377, 481) >>> pyautogui.position() (1536, 637)
Of course, your return values will vary depending on where your mouse cursor is.
Project: “Where Is the Mouse Right Now?”
Being able to determine the mouse position is an important part of setting up your GUI automation scripts. But it’s almost impossible to figure out the exact coordinates of a pixel just by looking at the screen. It would be handy to have a program that constantly displays the x- and y-coordinates of the mouse cursor as you move it around.
At a high level, here’s what your program should do:
Display the current x- and y-coordinates of the mouse cursor.
Update these coordinates as the mouse moves around the screen.
This means your code will need to do the following:
Call the position() function to fetch the current coordinates.
Erase the previously printed coordinates by printing b backspace characters to the screen.
Handle the KeyboardInterrupt exception so the user can press CTRL-C to quit.
Open a new file editor window and save it as mouseNow.py.
Step 1: Import the Module
Start your program with the following:
#! python3 # mouseNow.py - Displays the mouse cursor's current position. import pyautogui print('Press Ctrl-C to quit.') #TODO: Get and print the mouse coordinates.
The beginning of the program imports the pyautogui module and prints a reminder to the user that they have to press CTRL-C to quit.
Step 2: Set Up the Quit Code and Infinite Loop
You can use an infinite while loop to constantly print the current mouse coordinates from mouse.position(). As for the code that quits the program, you’ll need to catch the KeyboardInterrupt exception, which is raised whenever the user presses CTRL-C. If you don’t handle this exception, it will display an ugly traceback and error message to the user. Add the following to your program:
#! python3 # mouseNow.py - Displays the mouse cursor's current position. import pyautogui print('Press Ctrl-C to quit.') try: while True: # TODO: Get and print the mouse coordinates. ➊ except KeyboardInterrupt: ➋ print('nDone.')
To handle the exception, enclose the infinite while loop in a try statement. When the user presses CTRL-C, the program execution will move to the except clause ➊ and Done. will be printed in a new line ➋.
Step 3: Get and Print the Mouse Coordinates
The code inside the while loop should get the current mouse coordinates, format them to look nice, and print them. Add the following code to the inside of the while loop:
#! python3 # mouseNow.py - Displays the mouse cursor's current position. import pyautogui print('Press Ctrl-C to quit.') --snip-- # Get and print the mouse coordinates. x, y = pyautogui.position() positionStr = 'X: ' + str(x).rjust(4) + ' Y: ' + str(y).rjust(4) --snip--
Using the multiple assignment trick, the x and y variables are given the values of the two integers returned in the tuple from pyautogui.position(). By passing x and y to the str() function, you can get string forms of the integer coordinates. The rjust() string method will right-justify them so that they take up the same amount of space, whether the coordinate has one, two, three, or four digits. Concatenating the right-justified string coordinates with 'X: ' and ' Y: ' labels gives us a neatly formatted string, which will be stored in positionStr.
At the end of your program, add the following code:
#! python3 # mouseNow.py - Displays the mouse cursor's current position. --snip-- print(positionStr, end='') ➊ print('b' * len(positionStr), end='', flush=True)
This actually prints positionStr to the screen. The end='' keyword argument to print() prevents the default newline character from being added to the end of the printed line. It’s possible to erase text you’ve already printed to the screen—but only for the most recent line of text. Once you print a newline character, you can’t erase anything printed before it.
To erase text, print the b backspace escape character. This special character erases a character at the end of the current line on the screen. The line at ➊ uses string replication to produce a string with as many b characters as the length of the string stored in positionStr, which has the effect of erasing the positionStr string that was last printed.
For a technical reason beyond the scope of this book, always pass flush=True to print() calls that print b backspace characters. Otherwise, the screen might not update the text as desired.
Since the while loop repeats so quickly, the user won’t actually notice that you’re deleting and reprinting the whole number on the screen. For example, if the x-coordinate is 563 and the mouse moves one pixel to the right, it will look like only the 3 in 563 is changed to a 4.
When you run the program, there will be only two lines printed. They should look like something like this:
Press Ctrl-C to quit. X: 290 Y: 424
The first line displays the instruction to press CTRL-C to quit. The second line with the mouse coordinates will change as you move the mouse around the screen. Using this program, you’ll be able to figure out the mouse coordinates for your GUI automation scripts.
Controlling Mouse Interaction
Now that you know how to move the mouse and figure out where it is on the screen, you’re ready to start clicking, dragging, and scrolling.
Clicking the Mouse
To send a virtual mouse click to your computer, call the pyautogui.click() method. By default, this click uses the left mouse button and takes place wherever the mouse cursor is currently located. You can pass x- and y-coordinates of the click as optional first and second arguments if you want it to take place somewhere other than the mouse’s current position.
If you want to specify which mouse button to use, include the button keyword argument, with a value of 'left', 'middle', or 'righ
t'. For example, pyautogui.click(100, 150, button='left') will click the left mouse button at the coordinates (100, 150), while pyautogui.click(200, 250, button='right') will perform a right-click at (200, 250).
Enter the following into the interactive shell:
>>> import pyautogui >>> pyautogui.click(10, 5)
You should see the mouse pointer move to near the top-left corner of your screen and click once. A full “click” is defined as pushing a mouse button down and then releasing it back up without moving the cursor. You can also perform a click by calling pyautogui.mouseDown(), which only pushes the mouse button down, and pyautogui.mouseUp(), which only releases the button. These functions have the same arguments as click(), and in fact, the click() function is just a convenient wrapper around these two function calls.
As a further convenience, the pyautogui.doubleClick() function will perform two clicks with the left mouse button, while the pyautogui.rightClick() and pyautogui.middleClick() functions will perform a click with the right and middle mouse buttons, respectively.