Practical Programming: An Introduction to Computer Science Using Python 3.6 by Paul Gries & Jennifer Campbell & Jason Montojo

Author: Paul Gries & Jennifer Campbell & Jason Montojo
Language: eng
Format: azw3
Tags: Pragmatic Bookshelf
Publisher: Pragmatic Bookshelf
Published: 2017-12-05T16:00:00+00:00


Writing Algorithms That Use the File-Reading Techniques

There are several common ways to organize information in files. The rest of this chapter shows how to apply the file-reading techniques to each of these layouts and how to develop algorithms for processing them.

Skipping the Header

Many data files begin with a header. As described in The Readline Technique, TSDL files begin with a one-line description followed by comment lines that start with #, and the Readline technique can be used to skip that header. The skipping ends when we read the first real piece of data, which is the first line after the description that doesn’t start with #.

In English, we might try this algorithm to process this kind of a file:

    Skip the first line in the file
    Skip over the comment lines in the file
    For each of the remaining lines in the file:
        Process the data on that line

The problem with this approach is that we can’t tell whether a line is a comment line until we’ve read it, but we can read a line from a file only once—there’s no simple way to “back up” in the file. An alternative approach is to read the line, skip it if it’s a comment, and process it if it’s not. Once we’ve processed the first line of data, we process the remaining lines:

    Skip the first line in the file
    Find and process the first line of data in the file
    For each of the remaining lines:
        Process the data on that line
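The read-once constraint behind this algorithm can be seen in a short sketch. It uses an in-memory StringIO object with made-up contents in place of a real file, so the data here is an assumption for illustration only. Each call to readline consumes a line, and a for loop over the same reader picks up wherever the last readline left off:

```python
from io import StringIO

# Hypothetical file contents: a description line, one comment, two data lines.
reader = StringIO('Example\n# Comment\nLine 1\nLine 2\n')

first = reader.readline()    # consumes the description line
second = reader.readline()   # consumes the comment line

# The loop continues after the lines already read; it does not start over.
remaining = [line.strip() for line in reader]
```

After this runs, first refers to 'Example\n', second refers to '# Comment\n', and remaining refers to ['Line 1', 'Line 2']. Neither of the first two lines shows up again in the loop.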

The thing to notice about this algorithm is that it processes lines in two places: once when it finds the first “interesting” line in the file and once when it handles all of the following lines:

    from typing import TextIO
    from io import StringIO

    def skip_header(reader: TextIO) -> str:
        """Skip the header in reader and return the first real piece of data.

        >>> infile = StringIO('Example\\n# Comment\\n# Comment\\nData line\\n')
        >>> skip_header(infile)
        'Data line\\n'
        """

        # Read the description line
        line = reader.readline()

        # Find the first non-comment line
        line = reader.readline()
        while line.startswith('#'):
            line = reader.readline()

        # Now line contains the first real piece of data
        return line

    def process_file(reader: TextIO) -> None:
        """Read and print the data from reader, which must start with a single
        description line, then a sequence of lines beginning with '#', then a
        sequence of data.

        >>> infile = StringIO('Example\\n# Comment\\nLine 1\\nLine 2\\n')
        >>> process_file(infile)
        Line 1
        Line 2
        """

        # Find and print the first piece of data
        line = skip_header(reader).strip()
        print(line)

        # Read the rest of the data
        for line in reader:
            line = line.strip()
            print(line)

    if __name__ == '__main__':
        with open('hopedale.txt', 'r') as input_file:
            process_file(input_file)

In skip_header, we return the first line of real data, because once we’ve read it, we can’t read it again (we can go forward in the file but not backward). We’ll want to use skip_header in all of the file-processing functions in this section.
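As a sketch of that kind of reuse, here is a hypothetical function, count_data_lines, that is not part of the program above. It assumes the same file layout (one description line, then comment lines, then data) and relies on skip_header to get past the header. A condensed copy of skip_header is repeated so the example runs on its own, and the decisions to ignore blank lines and to report 0 for a file with no data are assumptions made for this example:

```python
from typing import TextIO
from io import StringIO

def skip_header(reader: TextIO) -> str:
    """Skip the header in reader and return the first real piece of data."""
    reader.readline()                # the description line
    line = reader.readline()
    while line.startswith('#'):      # the comment lines
        line = reader.readline()
    return line

def count_data_lines(reader: TextIO) -> int:
    """Return the number of data lines in reader (hypothetical example).

    >>> infile = StringIO('Example\\n# Comment\\nLine 1\\nLine 2\\n')
    >>> count_data_lines(infile)
    2
    """
    # skip_header has already consumed the first data line, so it has to be
    # counted here; the loop below will never see it.
    first = skip_header(reader)
    count = 1 if first.strip() else 0    # assumption: no data means 0

    for line in reader:
        if line.strip():                 # assumption: ignore blank lines
            count += 1
    return count
```

As in process_file, the first data line is handled separately from the rest, because it comes back from skip_header rather than from the loop.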


