Automate the Boring Stuff with Python
Page 1
Automate the Boring Stuff with Python: Practical Programming for Total Beginners
Al Sweigart
Published by No Starch Press
For my nephew Jack
About the Author
Al Sweigart is a software developer and tech book author living in San Francisco. Python is his favorite programming language, and he is the developer of several open source modules for it. His other books are freely available under a Creative Commons license on his website http://www.inventwithpython.com/. His cat weighs 14 pounds.
About the Tech Reviewer
Ari Lacenski is a developer of Android applications and Python software. She lives in San Francisco, where she writes about Android programming at http://gradlewhy.ghost.io/ and mentors with Women Who Code. She’s also a folk guitarist.
Acknowledgments
I couldn’t have written a book like this without the help of a lot of people. I’d like to thank Bill Pollock; my editors, Laurel Chun, Leslie Shen, Greg Poulos, and Jennifer Griffith-Delgado; and the rest of the staff at No Starch Press for their invaluable help. Thanks to my tech reviewer, Ari Lacenski, for great suggestions, edits, and support.
Many thanks to our Benevolent Dictator For Life, Guido van Rossum, and everyone at the Python Software Foundation for their great work. The Python community is the best one I’ve found in the tech industry.
Finally, I would like to thank my family, friends, and the gang at Shotwell’s for not minding the busy life I’ve had while writing this book. Cheers!
Introduction
“You’ve just done in two hours what it takes the three of us two days to do.” My college roommate was working at a retail electronics store in the early 2000s. Occasionally, the store would receive a spreadsheet of thousands of product prices from its competitor. A team of three employees would print the spreadsheet onto a thick stack of paper and split it among themselves. For each product price, they would look up their store’s price and note all the products that their competitors sold for less. It usually took a couple of days.
“You know, I could write a program to do that if you have the original file for the printouts,” my roommate told them, when he saw them sitting on the floor with papers scattered and stacked around them.
After a couple of hours, he had a short program that read a competitor’s price from a file, found the product in the store’s database, and noted whether the competitor was cheaper. He was still new to programming, and he spent most of his time looking up documentation in a programming book. The actual program took only a few seconds to run. My roommate and his co-workers took an extra-long lunch that day.
This is the power of computer programming. A computer is like a Swiss Army knife that you can configure for countless tasks. Many people spend hours clicking and typing to perform repetitive tasks, unaware that the machine they’re using could do their job in seconds if they gave it the right instructions.
Whom Is This Book For?
Software is at the core of so many of the tools we use today: Nearly everyone uses social networks to communicate, many people have Internet-connected computers in their phones, and most office jobs involve interacting with a computer to get work done. As a result, the demand for people who can code has skyrocketed. Countless books, interactive web tutorials, and developer boot camps promise to turn ambitious beginners into software engineers with six-figure salaries.
This book is not for those people. It’s for everyone else.
On its own, this book won’t turn you into a professional software developer any more than a few guitar lessons will turn you into a rock star. But if you’re an office worker, administrator, academic, or anyone else who uses a computer for work or fun, you will learn the basics of programming so that you can automate simple tasks such as the following:
Moving and renaming thousands of files and sorting them into folders
Filling out online forms, no typing required
Downloading files or copy text from a website whenever it updates
Having your computer text you custom notifications
Updating or formatting Excel spreadsheets
Checking your email and sending out prewritten responses
These tasks are simple but time-consuming for humans, and they’re often so trivial or specific that there’s no ready-made software to perform them. Armed with a little bit of programming knowledge, you can have your computer do these tasks for you.
Conventions
This book is not designed as a reference manual; it’s a guide for beginners. The coding style sometimes goes against best practices (for example, some programs use global variables), but that’s a trade-off to make the code simpler to learn. This book is made for people to write throwaway code, so there’s not much time spent on style and elegance. Sophisticated programming concepts—like object-oriented programming, list comprehensions, and generators—aren’t covered because of the complexity they add. Veteran programmers may point out ways the code in this book could be changed to improve efficiency, but this book is mostly concerned with getting programs to work with the least amount of effort.
What Is Programming?
Television shows and films often show programmers furiously typing cryptic streams of 1s and 0s on glowing screens, but modern programming isn’t that mysterious. Programming is simply the act of entering instructions for the computer to perform. These instructions might crunch some numbers, modify text, look up information in files, or communicate with other computers over the Internet.
All programs use basic instructions as building blocks. Here are a few of the most common ones, in English:
“Do this; then do that.”
“If this condition is true, perform this action; otherwise, do that action.”
“Do this action that number of times.”
“Keep doing that until this condition is true.”
You can combine these building blocks to implement more intricate decisions, too. For example, here are the programming instructions, called the source code, for a simple program written in the Python programming language. Starting at the top, the Python software runs each line of code (some lines are run only if a certain condition is true or else Python runs some other line) until it reaches the bottom.
➊ passwordFile = open('SecretPasswordFile.txt') ➋ secretPassword = passwordFile.read() ➌ print('Enter your password.') typedPassword = input() ➍ if typedPassword == secretPassword: ➎ print('Access granted') ➏ if typedPassword == '12345': ➐ print('That password is one that an idiot puts on their luggage.') else: ➑ print('Access denied')
You might not know anything about programming, but you could probably make a reasonable guess at what the previous code does just by reading it. First, the file SecretPasswordFile.txt is opened ➊, and the secret password in it is read ➋. Then, the user is prompted to input a password (from the keyboard) ➌. These two passwords are compared ➍, and if they’re the same, the program prints Access granted to the screen ➎. Next, the program checks to see whether the password is 12345 ➏ and hints that this choice might not be the best for a password ➐. If the passwords are not the same, the program prints Access denied to the screen ➑.
What Is Python?
Python refers to the Python programming language (with syntax rules for writing what is considered valid Python code) and the Python interpreter software that reads source code (written in the Python language) and performs its instructions. The Python interpreter is free to download from http://python.org/, and there are versions for Linux, OS X, and Windows.
The name Python comes from the surreal British comedy group Monty Python, not from the snake. Python programmers
are affectionately called Pythonistas, and both Monty Python and serpentine references usually pepper Python tutorials and documentation.
Programmers Don’t Need to Know Much Math
The most common anxiety I hear about learning to program is that people think it requires a lot of math. Actually, most programming doesn’t require math beyond basic arithmetic. In fact, being good at programming isn’t that different from being good at solving Sudoku puzzles.
To solve a Sudoku puzzle, the numbers 1 through 9 must be filled in for each row, each column, and each 3×3 interior square of the full 9×9 board. You find a solution by applying deduction and logic from the starting numbers. For example, since 5 appears in the top left of the Sudoku puzzle shown in Figure 1, it cannot appear elsewhere in the top row, in the leftmost column, or in the top-left 3×3 square. Solving one row, column, or square at a time will provide more number clues for the rest of the puzzle.
Figure 1. A new Sudoku puzzle (left) and its solution (right). Despite using numbers, Sudoku doesn’t involve much math. (Images © Wikimedia Commons)
Just because Sudoku involves numbers doesn’t mean you have to be good at math to figure out the solution. The same is true of programming. Like solving a Sudoku puzzle, writing programs involves breaking down a problem into individual, detailed steps. Similarly, when debugging programs (that is, finding and fixing errors), you’ll patiently observe what the program is doing and find the cause of the bugs. And like all skills, the more you program, the better you’ll become.
Programming Is a Creative Activity
Programming is a creative task, somewhat like constructing a castle out of LEGO bricks. You start with a basic idea of what you want your castle to look like and inventory your available blocks. Then you start building. Once you’ve finished building your program, you can pretty up your code just like you would your castle.
The difference between programming and other creative activities is that when programming, you have all the raw materials you need in your computer; you don’t need to buy any additional canvas, paint, film, yarn, LEGO bricks, or electronic components. When your program is written, it can easily be shared online with the entire world. And though you’ll make mistakes when programming, the activity is still a lot of fun.
About This Book
The first part of this book covers basic Python programming concepts, and the second part covers various tasks you can have your computer automate. Each chapter in the second part has project programs for you to study. Here’s a brief rundown of what you’ll find in each chapter:
Part I
Chapter 1. Covers expressions, the most basic type of Python instruction, and how to use the Python interactive shell software to experiment with code.
Chapter 2. Explains how to make programs decide which instructions to execute so your code can intelligently respond to different conditions.
Chapter 3. Instructs you on how to define your own functions so that you can organize your code into more manageable chunks.
Chapter 4. Introduces the list data type and explains how to organize data.
Chapter 5. Introduces the dictionary data type and shows you more powerful ways to organize data.
Chapter 6. Covers working with text data (called strings in Python).
Part II
Chapter 7. Covers how Python can manipulate strings and search for text patterns with regular expressions.
Chapter 8. Explains how your programs can read the contents of text files and save information to files on your hard drive.
Chapter 9. Shows how Python can copy, move, rename, and delete large numbers of files much faster than a human user can. It also explains compressing and decompressing files.
Chapter 10. Shows how to use Python’s various bug-finding and bug-fixing tools.
Chapter 11. Shows how to write programs that can automatically download web pages and parse them for information. This is called web scraping.
Chapter 12. Covers programmatically manipulating Excel spreadsheets so that you don’t have to read them. This is helpful when the number of documents you have to analyze is in the hundreds or thousands.
Chapter 13. Covers programmatically reading Word and PDF documents.
Chapter 14. Continues to explain how to programmatically manipulate documents with CSV and JSON files.
Chapter 15. Explains how time and dates are handled by Python programs and how to schedule your computer to perform tasks at certain times. This chapter also shows how your Python programs can launch non-Python programs.
Chapter 16. Explains how to write programs that can send emails and text messages on your behalf.
Chapter 17. Explains how to programmatically manipulate images such as JPEG or PNG files.
Chapter 18. Explains how to programmatically control the mouse and keyboard to automate clicks and keypresses.
Downloading and Installing Python
You can download Python for Windows, OS X, and Ubuntu for free from http://python.org/downloads/. If you download the latest version from the website’s download page, all of the programs in this book should work.
Warning
Be sure to download a version of Python 3 (such as 3.4.0). The programs in this book are written to run on Python 3 and may not run correctly, if at all, on Python 2.
You’ll find Python installers for 64-bit and 32-bit computers for each operating system on the download page, so first figure out which installer you need. If you bought your computer in 2007 or later, it is most likely a 64-bit system. Otherwise, you have a 32-bit version, but here’s how to find out for sure:
On Windows, select Start▸Control Panel▸System and check whether System Type says 64-bit or 32-bit.
On OS X, go the Apple menu, select About This Mac▸More Info ▸ System Report▸Hardware, and then look at the Processor Name field. If it says Intel Core Solo or Intel Core Duo, you have a 32-bit machine. If it says anything else (including Intel Core 2 Duo), you have a 64-bit machine.
On Ubuntu Linux, open a Terminal and run the command uname -m. A response of i686 means 32-bit, and x86_64 means 64-bit.
On Windows, download the Python installer (the filename will end with .msi) and double-click it. Follow the instructions the installer displays on the screen to install Python, as listed here:
Select Install for All Users and then click Next.
Install to the C:Python34 folder by clicking Next.
Click Next again to skip the Customize Python section.
On Mac OS X, download the .dmg file that’s right for your version of OS X and double-click it. Follow the instructions the installer displays on the screen to install Python, as listed here:
When the DMG package opens in a new window, double-click the Python.mpkg file. You may have to enter the administrator password.
Click Continue through the Welcome section and click Agree to accept the license.
Select HD Macintosh (or whatever name your hard drive has) and click Install.
If you’re running Ubuntu, you can install Python from the Terminal by following these steps:
Open the Terminal window.
Enter sudo apt-get install python3.
Enter sudo apt-get install idle3.
Enter sudo apt-get install python3-pip.
Starting IDLE
While the Python interpreter is the software that runs your Python programs, the interactive development environment (IDLE) software is where you’ll enter your programs, much like a word processor. Let’s start IDLE now.
On Windows 7 or newer, click the Start icon in the lower-left corner of your screen, enter IDLE in the search box, and select IDLE (Python GUI).
On Windows XP, click the Start button and then select Programs ▸ Python 3.4▸IDLE (Python GUI).
On Mac OS X, open the Finder window, click Applications, click Python 3.4, and then click the IDLE icon.
On Ubuntu, select Applications▸Accessories▸Terminal and then enter idle3. (You may also be able to click
Applications at the top of the screen, select Programming, and then click IDLE 3.)
The Interactive Shell
No matter which operating system you’re running, the IDLE window that first appears should be mostly blank except for text that looks something like this:
Python 3.4.0 (v3.4.0:04f714765c13, Mar 16 2014, 19:25:23) [MSC v.1600 64 bit (AMD64)] on win32Type "copyright", "credits" or "license()" for more information. >>>
This window is called the interactive shell. A shell is a program that lets you type instructions into the computer, much like the Terminal or Command Prompt on OS X and Windows, respectively. Python’s interactive shell lets you enter instructions for the Python interpreter software to run. The computer reads the instructions you enter and runs them immediately.