Practical Programming: An Introduction to Computer Science Using Python 3.6 by Paul Gries & Jennifer Campbell & Jason Montojo

Author: Paul Gries & Jennifer Campbell & Jason Montojo [Gries, Paul], Date: July 4, 2020

Language: eng
Format: azw3
Tags: Pragmatic Bookshelf
Publisher: Pragmatic Bookshelf
Published: 2017-12-05T16:00:00+00:00

Writing Algorithms That Use the File-Reading Techniques

There are several common ways to organize information in files. The rest of this chapter shows how to apply the file-reading techniques to each of these layouts and develops algorithms for processing them.

Skipping the Header

Many data files begin with a header. As described in The Readline Technique, TSDL files begin with a one-line description followed by comments in lines beginning with a #, and the Readline technique can be used to skip that header. The technique ends when we read the first real piece of data, which will be the first line after the description that doesn’t start with a #.

In English, we might try this algorithm to process this kind of a file:

    Skip the first line in the file
    Skip over the comment lines in the file
    For each of the remaining lines in the file:
        Process the data on that line

The problem with this approach is that we can’t tell whether a line is a comment line until we’ve read it, but we can read a line from a file only once—there’s no simple way to “back up” in the file. An alternative approach is to read the line, skip it if it’s a comment, and process it if it’s not. Once we’ve processed the first line of data, we process the remaining lines:

    Skip the first line in the file
    Find and process the first line of data in the file
    For each of the remaining lines:
        Process the data on that line
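The reason the first data line has to be handled as soon as it is found can be checked with a small sketch. It uses an in-memory StringIO object as a stand-in for a real file; the sample text and the names first_data and remaining are illustrative assumptions, not part of the original program.

```python
from io import StringIO

# Hypothetical in-memory stand-in for a file with a one-line
# description, one comment line, and two data lines.
reader = StringIO('Example\n# Comment\nLine 1\nLine 2\n')

reader.readline()             # skip the description line
line = reader.readline()      # first line after the description
while line.startswith('#'):   # keep reading past comment lines
    line = reader.readline()

# The first data line has already been consumed from reader ...
first_data = line

# ... so looping over reader yields only the lines that follow it.
remaining = [rest for rest in reader]

print(repr(first_data))
print(remaining)
```

Here first_data holds the first data line and remaining holds only what comes after it, so code that ignored first_data would silently lose one value.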

The thing to notice about this algorithm is that it processes lines in two places: once when it finds the first “interesting” line in the file and once when it handles all of the following lines:

    from typing import TextIO
    from io import StringIO

    def skip_header(reader: TextIO) -> str:
        """Skip the header in reader and return the first real piece of data.

        >>> infile = StringIO('Example\\n# Comment\\n# Comment\\nData line\\n')
        >>> skip_header(infile)
        'Data line\\n'
        """

        # Read the description line
        line = reader.readline()

        # Find the first non-comment line
        line = reader.readline()
        while line.startswith('#'):
            line = reader.readline()

        # Now line contains the first real piece of data
        return line

    def process_file(reader: TextIO) -> None:
        """Read and print the data from reader, which must start with a single
        description line, then a sequence of lines beginning with '#', then a
        sequence of data.

        >>> infile = StringIO('Example\\n# Comment\\nLine 1\\nLine 2\\n')
        >>> process_file(infile)
        Line 1
        Line 2
        """

        # Find and print the first piece of data
        line = skip_header(reader).strip()
        print(line)

        # Read the rest of the data
        for line in reader:
            line = line.strip()
            print(line)

    if __name__ == '__main__':
        with open('hopedale.txt', 'r') as input_file:
            process_file(input_file)

In skip_header, we return the first line of real data, because once we’ve read it, we can’t read it again (we can go forward through a file but not backward). We’ll want to use skip_header in all of the file-processing functions in this section.
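As a minimal sketch of how another file-processing function might build on skip_header, the example below adds up the data values. The function total and the sample data are hypothetical, and the sketch assumes the file holds at least one data line and that every data line is a single integer. skip_header is repeated in shortened form so the block runs on its own.

```python
from io import StringIO
from typing import TextIO


def skip_header(reader: TextIO) -> str:
    """Skip the header in reader and return the first real piece of data."""
    reader.readline()              # description line
    line = reader.readline()
    while line.startswith('#'):    # comment lines
        line = reader.readline()
    return line


def total(reader: TextIO) -> int:
    """Return the sum of the data values in reader.

    Hypothetical helper: assumes at least one data line and that
    each data line contains a single integer.
    """
    # The first data line comes from skip_header, not from the loop.
    result = int(skip_header(reader).strip())

    # The loop picks up where skip_header stopped reading.
    for line in reader:
        result += int(line.strip())
    return result


infile = StringIO('Example\n# Comment\n10\n20\n30\n')
data_total = total(infile)
print(data_total)
```

As in process_file, the data is handled in two places: once for the line that skip_header returns and once inside the loop over the rest of the file.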
