Automate the Boring Stuff with Python

Home > Other > Automate the Boring Stuff with Python > Page 47
Automate the Boring Stuff with Python Page 47

by Al Sweigart


  Place the line of code that might cause an error in a try clause.

  The code that could potentially cause an error goes in the try clause.

  The code that executes if an error happens goes in the except clause.

  Chapter 4

  The empty list value, which is a list value that contains no items. This is similar to how '' is the empty string value.

  spam[2] = 'hello' (Notice that the third value in a list is at index 2 because the first index is 0.)

  'd' (Note that '3' * 2 is the string '33', which is passed to int() before being divided by 11. This eventually evaluates to 3. Expressions can be used wherever values are used.)

  'd' (Negative indexes count from the end.)

  ['a', 'b']

  1

  [3.14, 'cat', 11, 'cat', True, 99]

  [3.14, 11, 'cat', True]

  The operator for list concatenation is +, while the operator for replication is *. (This is the same as for strings.)

  While append() will add values only to the end of a list, insert() can add them anywhere in the list.

  The del statement and the remove() list method are two ways to remove values from a list.

  Both lists and strings can be passed to len(), have indexes and slices, be used in for loops, be concatenated or replicated, and be used with the in and not in operators.

  Lists are mutable; they can have values added, removed, or changed. Tuples are immutable; they cannot be changed at all. Also, tuples are written using parentheses, ( and ), while lists use the square brackets, [ and ].

  (42,) (The trailing comma is mandatory.)

  The tuple() and list() functions, respectively

  They contain references to list values.

  The copy.copy() function will do a shallow copy of a list, while the copy.deepcopy() function will do a deep copy of a list. That is, only copy.deepcopy() will duplicate any lists inside the list.

  Chapter 5

  Two curly brackets: {}

  {'foo': 42}

  The items stored in a dictionary are unordered, while the items in a list are ordered.

  You get a KeyError error.

  There is no difference. The in operator checks whether a value exists as a key in the dictionary.

  'cat' in spam checks whether there is a 'cat' key in the dictionary, while 'cat' in spam.values() checks whether there is a value 'cat' for one of the keys in spam.

  spam.setdefault('color', 'black')

  pprint.pprint()

  Chapter 6

  Escape characters represent characters in string values that would otherwise be difficult or impossible to type into code.

  n is a newline; t is a tab.

  The \ escape character will represent a backslash character.

  The single quote in Howl's is fine because you’ve used double quotes to mark the beginning and end of the string.

  Multiline strings allow you to use newlines in strings without the n escape character.

  The expressions evaluate to the following:

  'e'

  'Hello'

  'Hello'

  'lo world!

  The expressions evaluate to the following:

  'HELLO'

  True

  'hello'

  The expressions evaluate to the following:

  ['Remember,', 'remember,', 'the', 'fifth', 'of', 'November.']

  'There-can-be-only-one.'

  The rjust(), ljust(), and center() string methods, respectively

  The lstrip() and rstrip() methods remove whitespace from the left and right ends of a string, respectively.

  Chapter 7

  The re.compile() function returns Regex objects.

  Raw strings are used so that backslashes do not have to be escaped.

  The search() method returns Match objects.

  The group() method returns strings of the matched text.

  Group 0 is the entire match, group 1 covers the first set of parentheses, and group 2 covers the second set of parentheses.

  Periods and parentheses can be escaped with a backslash: ., (, and ).

  If the regex has no groups, a list of strings is returned. If the regex has groups, a list of tuples of strings is returned.

  The | character signifies matching “either, or” between two groups.

  The ? character can either mean “match zero or one of the preceding group” or be used to signify nongreedy matching.

  The + matches one or more. The * matches zero or more.

  The {3} matches exactly three instances of the preceding group. The {3,5} matches between three and five instances.

  The d, w, and s shorthand character classes match a single digit, word, or space character, respectively.

  The D, W, and S shorthand character classes match a single character that is not a digit, word, or space character, respectively.

  Passing re.I or re.IGNORECASE as the second argument to re.compile() will make the matching case insensitive.

  The . character normally matches any character except the newline character. If re.DOTALL is passed as the second argument to re.compile(), then the dot will also match newline characters.

  The .* performs a greedy match, and the .*? performs a nongreedy match.

  Either [0-9a-z] or [a-z0-9]

  'X drummers, X pipers, five rings, X hens'

  The re.VERBOSE argument allows you to add whitespace and comments to the string passed to re.compile().

  re.compile(r'^d{1,3}(,d{3})*$') will create this regex, but other regex strings can produce a similar regular expression.

  re.compile(r'[A-Z][a-z]*sNakamoto')

  re.compile(r'(Alice|Bob|Carol)s(eats|pets|throws) s(apples|cats|baseballs).', re.IGNORECASE)

  Chapter 8

  Relative paths are relative to the current working directory.

  Absolute paths start with the root folder, such as / or C:.

  The os.getcwd() function returns the current working directory. The os.chdir() function changes the current working directory.

  The . folder is the current folder, and .. is the parent folder.

  C:baconeggs is the dir name, while spam.txt is the base name.

  The string 'r' for read mode, 'w' for write mode, and 'a' for append mode

  An existing file opened in write mode is erased and completely overwritten.

  The read() method returns the file’s entire contents as a single string value. The readlines() method returns a list of strings, where each string is a line from the file’s contents.

  A shelf value resembles a dictionary value; it has keys and values, along with keys() and values() methods that work similarly to the dictionary methods of the same names.

  Chapter 9

  The shutil.copy() function will copy a single file, while shutil.copytree() will copy an entire folder, along with all its contents.

  The shutil.move() function is used for renaming files, as well as moving them.

  The send2trash functions will move a file or folder to the recycle bin, while shutil functions will permanently delete files and folders.

  The zipfile.ZipFile() function is equivalent to the open() function; the first argument is the filename, and the second argument is the mode to open the ZIP file in (read, write, or append).

  Chapter 10

  assert(spam >= 10, 'The spam variable is less than 10.')

  assert(eggs.lower() != bacon.lower(), 'The eggs and bacon variables are the same!') or assert(eggs.upper() != bacon.upper(), 'The eggs and bacon variables are the same!')

  assert(False, 'This assertion always triggers.')

  To be able to call logging.debug(), you must have these two lines at the start of your program:

  import logging logging.basicConfig(level=logging.DEBUG, format=' %(asctime)s - %(levelname)s - %(message)s')

  To be able to send logging messages to a file named programLog.txt with logging.debug(), you must have these two lines at the start of your program:

  import logging >>> logging.
basicConfig(filename='programLog.txt', level=logging.DEBUG, format=' %(asctime)s - %(levelname)s - %(message)s')

  DEBUG, INFO, WARNING, ERROR, and CRITICAL

  logging.disable(logging.CRITICAL)

  You can disable logging messages without removing the logging function calls. You can selectively disable lower-level logging messages. You can create logging messages. Logging messages provides a timestamp.

  The Step button will move the debugger into a function call. The Over button will quickly execute the function call without stepping into it. The Out button will quickly execute the rest of the code until it steps out of the function it currently is in.

  After you click Go, the debugger will stop when it has reached the end of the program or a line with a breakpoint.

  A breakpoint is a setting on a line of code that causes the debugger to pause when the program execution reaches the line.

  To set a breakpoint in IDLE, right-click the line and select Set Breakpoint from the context menu.

  Chapter 11

  The webbrowser module has an open() method that will launch a web browser to a specific URL, and that’s it. The requests module can download files and pages from the Web. The BeautifulSoup module parses HTML. Finally, the selenium module can launch and control a browser.

  The requests.get() function returns a Response object, which has a text attribute that contains the downloaded content as a string.

  The raise_for_status() method raises an exception if the download had problems and does nothing if the download succeeded.

  The status_code attribute of the Response object contains the HTTP status code.

  After opening the new file on your computer in 'wb' “write binary” mode, use a for loop that iterates over the Response object’s iter_content() method to write out chunks to the file. Here’s an example:

  saveFile = open('filename.html', 'wb') for chunk in res.iter_content(100000): saveFile.write(chunk)

  F12 brings up the developer tools in Chrome. Pressing CTRL-SHIFT-C (on Windows and Linux) or ⌘-OPTION-C (on OS X) brings up the developer tools in Firefox.

  Right-click the element in the page, and select Inspect Element from the menu.

  '#main'

  '.highlight'

  'div div'

  'button[value="favorite"]'

  spam.getText()

  linkElem.attrs

  The selenium module is imported with from selenium import webdriver.

  The find_element_* methods return the first matching element as a WebElement object. The find_elements_* methods return a list of all matching elements as WebElement objects.

  The click() and send_keys() methods simulate mouse clicks and keyboard keys, respectively.

  Calling the submit() method on any element within a form submits the form.

  The forward(), back(), and refresh() WebDriver object methods simulate these browser buttons.

  Chapter 12

  The openpyxl.load_workbook() function returns a Workbook object.

  The get_sheet_names() method returns a Worksheet object.

  Call wb.get_sheet_by_name('Sheet1').

  Call wb.get_active_sheet().

  sheet['C5'].value or sheet.cell(row=5, column=3).value

  sheet['C5'] = 'Hello' or sheet.cell(row=5, column=3).value = 'Hello'

  cell.row and cell.column

  They return the highest column and row with values in the sheet, respectively, as integer values.

  openpyxl.cell.column_index_from_string('M')

  openpyxl.cell.get_column_letter(14)

  sheet['A1':'F1']

  wb.save('example.xlsx’)

  A formula is set the same way as any value. Set the cell’s value attribute to a string of the formula text. Remember that formulas begin with the = sign.

  When calling load_workbook(), pass True for the data_only keyword argument.

  sheet.row_dimensions[5].height = 100

  sheet.column_dimensions['C'].hidden = True

  OpenPyXL 2.0.5 does not load freeze panes, print titles, images, or charts.

  Freeze panes are rows and columns that will always appear on the screen. They are useful for headers.

  openpyxl.charts.Reference(), openpyxl.charts.Series(), openpyxl.charts. BarChart(), chartObj.append(seriesObj), and add_chart()

  Chapter 13

  A File object returned from open()

  Read-binary ('rb') for PdfFileReader() and write-binary ('wb') for PdfFileWriter()

  Calling getPage(4) will return a Page object for About This Book, since page 0 is the first page.

  The numPages variable stores an integer of the number of pages in the PdfFileReader object.

  Call decrypt('swordfish').

  The rotateClockwise() and rotateCounterClockwise() methods. The degrees to rotate is passed as an integer argument.

  docx.Document('demo.docx')

  A document contains multiple paragraphs. A paragraph begins on a new line and contains multiple runs. Runs are contiguous groups of characters within a paragraph.

  Use doc.paragraphs.

  A Run object has these variables (not a Paragraph).

  True always makes the Run object bolded and False makes it always not bolded, no matter what the style’s bold setting is. None will make the Run object just use the style’s bold setting.

  Call the docx.Document() function.

  doc.add_paragraph('Hello there!')

  The integers 0, 1, 2, 3, and 4

  Chapter 14

  In Excel, spreadsheets can have values of data types other than strings; cells can have different fonts, sizes, or color settings; cells can have varying widths and heights; adjacent cells can be merged; and you can embed images and charts.

  You pass a File object, obtained from a call to open().

  File objects need to be opened in read-binary ('rb') for Reader objects and write-binary ('wb') for Writer objects.

  The writerow() method

  The delimiter argument changes the string used to separate cells in a row. The lineterminator argument changes the string used to separate rows.

  json.loads()

  json.dumps()

  Chapter 15

  A reference moment that many date and time programs use. The moment is January 1st, 1970, UTC.

  time.time()

  time.sleep(5)

  It returns the closest integer to the argument passed. For example, round(2.4) returns 2.

  A datetime object represents a specific moment in time. A timedelta object represents a duration of time.

  threadObj = threading.Thread(target=spam)

  threadObj.start()

  Make sure that code running in one thread does not read or write the same variables as code running in another thread.

  subprocess.Popen('c:\Windows\System32\calc.exe')

  Chapter 16

  SMTP and IMAP, respectively

  smtplib.SMTP(), smtpObj.ehlo(), smptObj.starttls(), and smtpObj.login()

  imapclient.IMAPClient() and imapObj.login()

  A list of strings of IMAP keywords, such as 'BEFORE ', 'FROM ', or 'SEEN'

  Assign the variable imaplib._MAXLINE a large integer value, such as 10000000.

  The pyzmail module reads downloaded emails.

  You will need the Twilio account SID number, the authentication token number, and your Twilio phone number.

  Chapter 17

  An RGBA value is a tuple of 4 integers, each ranging from 0 to 255. The four integers correspond to the amount of red, green, blue, and alpha (transparency) in the color.

 

‹ Prev