Python for every one: The great book of python programming by Elias Paul & Elias Paul

Author:Elias Paul & Elias Paul [Paul, Elias]
Language: eng
Format: azw3
Published: 2020-06-14T16:00:00+00:00


Chapter 11

Regular expressions

So far we have read files, looking for patterns and extracting various sections of lines that we have found interesting. We have used string methods like split and find as well as slicing lists and strings to extract chunks of the lines.

This task of searching and extracting is so common that Python has a very powerful library, called regular expressions, that handles many of these tasks quite elegantly. The reason we have not introduced regular expressions earlier is that, although they are very powerful, they are a little complicated and their syntax takes some time to get used to.

Regular expressions are almost their own miniature programming language for searching and parsing strings. In fact, entire books have been written about regular expressions. In this chapter, we will only cover the basics. For more information, see:

https://es.wikipedia.org/wiki/Expresi%C3%B3n_regular

https://docs.python.org/library/re.html

The regular expression library, re, must be imported into your program before you can use it. The simplest use of the regular expression library is the search() function. The following program demonstrates a very simple use of this function.

# Search for lines that contain From:
import re

man = open('mbox-short.txt')
for line in man:
    line = line.rstrip()
    if re.search('From:', line):
        print(line)

# Code: https://es.py4e.com/code3/re01.py

We open the file, loop through each line, and use the regular expression search() to print only the lines that contain the string "From:". This program does not take advantage of the true power of regular expressions, since we could just as easily have used line.find() to achieve the same result.
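As a quick check of that equivalence, the following sketch applies both tests to a few sample lines. The sample lines are invented for illustration and are not taken from mbox-short.txt.

```python
import re

# Hypothetical sample lines, invented for this illustration
sample_lines = [
    "From: alice@example.com",
    "Subject: hello",
    "X-Note: mail From: a list",
]

# Lines selected with the regular expression
with_re = [line for line in sample_lines if re.search("From:", line)]

# Lines selected with the string method; find() returns -1 when nothing is found
with_find = [line for line in sample_lines if line.find("From:") != -1]

# Both approaches select the same lines
print(with_re == with_find)
```

For a plain substring like this one, the two approaches select the same lines, so the regular expression adds nothing yet.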

The power of regular expressions shows when we add special characters to the search string that let us control more precisely which lines match it. Adding these special characters to our regular expression lets us search for matches and extract data using very few lines of code.

For example, the caret character (^) is used in regular expressions to match "the beginning" of a line. We could change our program so that it only matches lines that have "From:" at the beginning, as follows:

# Search for lines that start with From:
import re

man = open('mbox-short.txt')
for line in man:
    line = line.rstrip()
    if re.search('^From:', line):
        print(line)

# Code: https://es.py4e.com/code3/re02.py

Now it will only match lines that start with the string "From:". This is still a very simple example that we could have implemented using the startswith() method from the string library. But it serves to introduce the idea that regular expressions contain special characters that give us more control over what the regular expression will match.
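The comparison with startswith() can be sketched on two sample lines, one that begins with "From:" and one that only contains it further along. Both lines are made up for this illustration.

```python
import re

# Hypothetical sample lines, invented for this illustration
sample_lines = [
    "From: alice@example.com",
    "X-Note: mail From: a list",
]

results = []
for line in sample_lines:
    # The caret anchors the match to the beginning of the line
    anchored = re.search("^From:", line) is not None
    # The string method performs the same prefix test
    prefix = line.startswith("From:")
    results.append((anchored, prefix))
    print(line, anchored, prefix)
```

Both tests accept the first line and reject the second, where "From:" appears only in the middle.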

11.1 Character matching in regular expressions

There are several other special characters that allow us to construct even more powerful regular expressions. The most commonly used is the period, which matches any character.

In the following example, the regular expression F..m: would match the strings "From:", "Fxxm:", "F12m:", or "F!@m:", since the period characters in the regular expression match any character.
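A minimal sketch of this behavior follows. The last two candidate strings are additions for illustration, chosen because they have too few or too many characters between "F" and "m:" and so should not match.

```python
import re

# The first four strings come from the text; "Frm:" and "F123m:" are
# hypothetical counterexamples added for this illustration
candidates = ["From:", "Fxxm:", "F12m:", "F!@m:", "Frm:", "F123m:"]

# Each period must match exactly one character
matches = [s for s in candidates if re.search("F..m:", s)]

print(matches)
```

Only the strings with exactly two characters between "F" and "m:" are kept, because each period stands for exactly one character.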
