by Al Sweigart
The password manager program you’ll create in this example isn’t secure, but it offers a basic demonstration of how such programs work.
The Chapter Projects
This is the first “chapter project” of the book. From here on, each chapter will have projects that demonstrate the concepts covered in the chapter. The projects are written in a style that takes you from a blank file editor window to a full, working program. Just like with the interactive shell examples, don’t only read the project sections—follow along on your computer!
Step 1: Program Design and Data Structures
You want to be able to run this program with a command line argument that is the account’s name—for instance, email or blog. That account’s password will be copied to the clipboard so that the user can paste it into a Password field. This way, the user can have long, complicated passwords without having to memorize them.
Open a new file editor window and save the program as pw.py. You need to start the program with a #! (shebang) line (see Appendix B) and should also write a comment that briefly describes the program. Since you want to associate each account’s name with its password, you can store these as strings in a dictionary. The dictionary will be the data structure that organizes your account and password data. Make your program look like the following:
#! python3 # pw.py - An insecure password locker program. PASSWORDS = {'email': 'F7minlBDDuvMJuxESSKHFhTxFtjVB6', 'blog': 'VmALvQyKAxiVH5G8v01if1MLZF3sdt', 'luggage': '12345'}
Step 2: Handle Command Line Arguments
The command line arguments will be stored in the variable sys.argv. (See Appendix B for more information on how to use command line arguments in your programs.) The first item in the sys.argv list should always be a string containing the program’s filename ('pw.py'), and the second item should be the first command line argument. For this program, this argument is the name of the account whose password you want. Since the command line argument is mandatory, you display a usage message to the user if they forget to add it (that is, if the sys.argv list has fewer than two values in it). Make your program look like the following:
#! python3 # pw.py - An insecure password locker program. PASSWORDS = {'email': 'F7minlBDDuvMJuxESSKHFhTxFtjVB6', 'blog': 'VmALvQyKAxiVH5G8v01if1MLZF3sdt', 'luggage': '12345'} import sys if len(sys.argv) < 2: print('Usage: python pw.py [account] - copy account password') sys.exit() account = sys.argv[1] # first command line arg is the account name
Step 3: Copy the Right Password
Now that the account name is stored as a string in the variable account, you need to see whether it exists in the PASSWORDS dictionary as a key. If so, you want to copy the key’s value to the clipboard using pyperclip.copy(). (Since you’re using the pyperclip module, you need to import it.) Note that you don’t actually need the account variable; you could just use sys.argv[1] everywhere account is used in this program. But a variable named account is much more readable than something cryptic like sys.argv[1].
Make your program look like the following:
#! python3 # pw.py - An insecure password locker program. PASSWORDS = {'email': 'F7minlBDDuvMJuxESSKHFhTxFtjVB6', 'blog': 'VmALvQyKAxiVH5G8v01if1MLZF3sdt', 'luggage': '12345'} import sys, pyperclip if len(sys.argv) < 2: print('Usage: py pw.py [account] - copy account password') sys.exit() account = sys.argv[1] # first command line arg is the account name if account in PASSWORDS: pyperclip.copy(PASSWORDS[account]) print('Password for ' + account + ' copied to clipboard.') else: print('There is no account named ' + account)
This new code looks in the PASSWORDS dictionary for the account name. If the account name is a key in the dictionary, we get the value corresponding to that key, copy it to the clipboard, and print a message saying that we copied the value. Otherwise, we print a message saying there’s no account with that name.
That’s the complete script. Using the instructions in Appendix B for launching command line programs easily, you now have a fast way to copy your account passwords to the clipboard. You will have to modify the PASSWORDS dictionary value in the source whenever you want to update the program with a new password.
Of course, you probably don’t want to keep all your passwords in one place where anyone could easily copy them. But you can modify this program and use it to quickly copy regular text to the clipboard. Say you are sending out several emails that have many of the same stock paragraphs in common. You could put each paragraph as a value in the PASSWORDS dictionary (you’d probably want to rename the dictionary at this point), and then you would have a way to quickly select and copy one of many standard pieces of text to the clipboard.
On Windows, you can create a batch file to run this program with the WIN-R Run window. (For more about batch files, see Appendix B.) Type the following into the file editor and save the file as pw.bat in the C:Windows folder:
@py.exe C:Python34pw.py %* @pause
With this batch file created, running the password-safe program on Windows is just a matter of pressing WIN-R and typing pw
Project: Adding Bullets to Wiki Markup
When editing a Wikipedia article, you can create a bulleted list by putting each list item on its own line and placing a star in front. But say you have a really large list that you want to add bullet points to. You could just type those stars at the beginning of each line, one by one. Or you could automate this task with a short Python script.
The bulletPointAdder.py script will get the text from the clipboard, add a star and space to the beginning of each line, and then paste this new text to the clipboard. For example, if I copied the following text (for the Wikipedia article “List of Lists of Lists”) to the clipboard:
Lists of animals Lists of aquarium life Lists of biologists by author abbreviation Lists of cultivars
and then ran the bulletPointAdder.py program, the clipboard would then contain the following:
* Lists of animals * Lists of aquarium life * Lists of biologists by author abbreviation * Lists of cultivars
This star-prefixed text is ready to be pasted into a Wikipedia article as a bulleted list.
Step 1: Copy and Paste from the Clipboard
You want the bulletPointAdder.py program to do the following:
Paste text from the clipboard
Do something to it
Copy the new text to the clipboard
That second step is a little tricky, but steps 1 and 3 are pretty straightforward: They just involve the pyperclip.copy() and pyperclip.paste() functions. For now, let’s just write the part of the program that covers steps 1 and 3. Enter the following, saving the program as bulletPointAdder.py:
#! python3 # bulletPointAdder.py - Adds Wikipedia bullet points to the start # of each line of text on the clipboard. import pyperclip text = pyperclip.paste() # TODO: Separate lines and add stars. pyperclip.copy(text)
The TODO comment is a reminder that you should complete this part of the program eventually. The next step is to actually implement that piece of the program.
Step 2: Separate the Lines of Text and Add the Star
The call to pyperclip.paste() returns all the text on the clipboard as one big string. If we used the “List of Lists of Lists” example, the string stored in text would look like this:
'Lists of animalsnLists of aquarium lifenLists of biologists by author abbreviationnLists of cultivars'
The n newline characters in this string cause it to be displayed with multiple lines when it is printed or pasted from the clipboard. There are many “lines” in this one string value. You want to add a star to the start of each of these lines.
You could write code that searches for each n newline character in the string and then adds the star just after that. But it would be easier to use the split() method to return a list of strings, one for each line in the original string, and then add the star to the front of each string in the list.
Make your program look like the following:
#! python3 # bulletPointAdder.py - Adds Wikipedia bullet points
to the start # of each line of text on the clipboard. import pyperclip text = pyperclip.paste() # Separate lines and add stars. lines = text.split('n') for i in range(len(lines)): # loop through all indexes in the "lines" list lines[i] = '* ' + lines[i] # add star to each string in "lines" list pyperclip.copy(text)
We split the text along its newlines to get a list in which each item is one line of the text. We store the list in lines and then loop through the items in lines. For each line, we add a star and a space to the start of the line. Now each string in lines begins with a star.
Step 3: Join the Modified Lines
The lines list now contains modified lines that start with stars. But pyperclip.copy() is expecting a single string value, not a list of string values. To make this single string value, pass lines into the join() method to get a single string joined from the list’s strings. Make your program look like the following:
#! python3 # bulletPointAdder.py - Adds Wikipedia bullet points to the start # of each line of text on the clipboard. import pyperclip text = pyperclip.paste() # Separate lines and add stars. lines = text.split('n') for i in range(len(lines)): # loop through all indexes for "lines" list lines[i] = '* ' + lines[i] # add star to each string in "lines" list text = 'n'.join(lines) pyperclip.copy(text)
When this program is run, it replaces the text on the clipboard with text that has stars at the start of each line. Now the program is complete, and you can try running it with text copied to the clipboard.
Even if you don’t need to automate this specific task, you might want to automate some other kind of text manipulation, such as removing trailing spaces from the end of lines or converting text to uppercase or lowercase. Whatever your needs, you can use the clipboard for input and output.
Summary
Text is a common form of data, and Python comes with many helpful string methods to process the text stored in string values. You will make use of indexing, slicing, and string methods in almost every Python program you write.
The programs you are writing now don’t seem too sophisticated—they don’t have graphical user interfaces with images and colorful text. So far, you’re displaying text with print() and letting the user enter text with input(). However, the user can quickly enter large amounts of text through the clipboard. This ability provides a useful avenue for writing programs that manipulate massive amounts of text. These text-based programs might not have flashy windows or graphics, but they can get a lot of useful work done quickly.
Another way to manipulate large amounts of text is reading and writing files directly off the hard drive. You’ll learn how to do this with Python in the next chapter.
That just about covers all the basic concepts of Python programming! You’ll continue to learn new concepts throughout the rest of this book, but you now know enough to start writing some useful programs that can automate tasks. You might not think you have enough Python knowledge to do things such as download web pages, update spreadsheets, or send text messages, but that’s where Python modules come in! These modules, written by other programmers, provide functions that make it easy for you to do all these things. So let’s learn how to write real programs to do useful automated tasks.
Practice Questions
Q:
1. What are escape characters?
Q:
2. What do the n and t escape characters represent?
Q:
3. How can you put a backslash character in a string?
Q:
4. The string value "Howl's Moving Castle" is a valid string. Why isn’t it a problem that the single quote character in the word Howl's isn’t escaped?
Q:
5. If you don’t want to put n in your string, how can you write a string with newlines in it?
Q:
6. What do the following expressions evaluate to?
'Hello world!'[1]
'Hello world!'[0:5]
'Hello world!'[:5]
'Hello world!'[3:]
Q:
7. What do the following expressions evaluate to?
'Hello'.upper()
'Hello'.upper().isupper()
'Hello'.upper().lower()
Q:
8. What do the following expressions evaluate to?
'Remember, remember, the fifth of November.'.split()
'-'.join('There can be only one.'.split())
Q:
9. What string methods can you use to right-justify, left-justify, and center a string?
Q:
10. How can you trim whitespace characters from the beginning or end of a string?
Practice Project
For practice, write a program that does the following.
Table Printer
Write a function named printTable() that takes a list of lists of strings and displays it in a well-organized table with each column right-justified. Assume that all the inner lists will contain the same number of strings. For example, the value could look like this:
tableData = [['apples', 'oranges', 'cherries', 'banana'], ['Alice', 'Bob', 'Carol', 'David'], ['dogs', 'cats', 'moose', 'goose']]
Your printTable() function would print the following:
apples Alice dogs oranges Bob cats cherries Carol moose banana David goose
Hint: Your code will first have to find the longest string in each of the inner lists so that the whole column can be wide enough to fit all the strings. You can store the maximum width of each column as a list of integers. The printTable() function can begin with colWidths = [0] * len(tableData), which will create a list containing the same number of 0 values as the number of inner lists in tableData. That way, colWidths[0] can store the width of the longest string in tableData[0], colWidths[1] can store the width of the longest string in tableData[1], and so on. You can then find the largest value in the colWidths list to find out what integer width to pass to the rjust() string method.
Part II. Automating Tasks
Chapter 7. Pattern Matching with Regular Expressions
You may be familiar with searching for text by pressing CTRL-F and typing in the words you’re looking for. Regular expressions go one step further: They allow you to specify a pattern of text to search for. You may not know a business’s exact phone number, but if you live in the United States or Canada, you know it will be three digits, followed by a hyphen, and then four more digits (and optionally, a three-digit area code at the start). This is how you, as a human, know a phone number when you see it: 415-555-1234 is a phone number, but 4,155,551,234 is not.
Regular expressions are helpful, but not many non-programmers know about them even though most modern text editors and word processors, such as Microsoft Word or OpenOffice, have find and find-and-replace features that can search based on regular expressions. Regular expressions are huge time-savers, not just for software users but also for programmers. In fact, tech writer Cory Doctorow argues that even before teaching programming, we should be teaching regular expressions:
“Knowing [regular expressions] can mean the difference between solving a problem in 3 steps and solving it in 3,000 steps. When you’re a nerd, you forget that the problems you solve with a couple keystrokes can take other people days of tedious, error-prone work to slog through.”[1]
In this chapter, you’ll start by writing a program to find text patterns without using regular expressions and then see how to use regular expressions to make the code much less bloated. I’ll show you basic matching with regular expressions and then move on to some more powerful features, such as string substitution and creating your own character classes. Finally, at the end of the chapter, you’ll write a program that can automatically extract phone numbers and email addresses from a block of text.
Finding Patterns of Text Without Regular Expressions
Say you want to find a phone number in a string. You know the pattern: three numbers, a hyphen, three numbers, a hyphen, and four numbers. Here’s an example: 415-555-4242.
Let’s use a function named isPhoneNumber()
to check whether a string matches this pattern, returning either True or False. Open a new file editor window and enter the following code; then save the file as isPhoneNumber.py:
def isPhoneNumber(text): ➊ if len(text) != 12: return False for i in range(0, 3): ➋ if not text[i].isdecimal(): return False ➌ if text[3] != '-': return False for i in range(4, 7): ➍ if not text[i].isdecimal(): return False ➎ if text[7] != '-': return False for i in range(8, 12): ➏ if not text[i].isdecimal(): return False ➐ return True print('415-555-4242 is a phone number:') print(isPhoneNumber('415-555-4242')) print('Moshi moshi is a phone number:') print(isPhoneNumber('Moshi moshi'))
When this program is run, the output looks like this:
415-555-4242 is a phone number: True Moshi moshi is a phone number: False
The isPhoneNumber() function has code that does several checks to see whether the string in text is a valid phone number. If any of these checks fail, the function returns False. First the code checks that the string is exactly 12 characters ➊. Then it checks that the area code (that is, the first three characters in text) consists of only numeric characters ➋. The rest of the function checks that the string follows the pattern of a phone number: The number must have the first hyphen after the area code ➌, three more numeric characters ➍, then another hyphen ➎, and finally four more numbers ➏. If the program execution manages to get past all the checks, it returns True ➐.