Intro to Python

Reading Text Files

files.py

open()

sample_file = open("sample.txt")

print(sample_file)

# Output:
# <_io.TextIOWrapper name='D:\\sample.txt' mode='r' encoding='cp1252'>

The output shows the name of the file, sample.txt, and its mode, 'r'. By default, open() opens a file as read-only.

sample_file505 = open("sample505.txt")

print(sample_file505)

# Output:
# FileNotFoundError: [Errno 2] No such file or directory: 'sample505.txt'
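One way to keep a missing file from stopping the program is to catch the exception. The following is a minimal sketch, assuming a hypothetical helper named read_text_or_default (it is not part of Python or of this lesson) that returns a fallback string when the file does not exist:

```python
def read_text_or_default(path, default=""):
    """Return the text of the file at `path`, or `default` if it does not exist.

    Hypothetical helper, for illustration only.
    """
    try:
        # open() itself raises FileNotFoundError, before any file object exists
        text_file = open(path)
    except FileNotFoundError:
        return default
    try:
        return text_file.read()
    finally:
        # Runs whether or not read() succeeded
        text_file.close()


# Assumes sample505.txt is still missing, as in the example above
result = read_text_or_default("sample505.txt", default="(file not found)")
print(result)
```

Catching only FileNotFoundError, rather than every exception, lets other problems (such as a permissions error) surface normally.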

read()

sample_file = open("sample.txt")

print(sample_file.read())

# Output:
# Welcome to Natural Language Processing
# It is one of the most exciting research areas as of today
# We will see how Python can be used to work with text files.

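Reading a file moves its cursor to the end, so a second call to read() returns an empty string. seek(0) moves the cursor back to the start of the file so the text can be read again.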

sample_file = open("sample.txt")

print(sample_file.read())

sample_file.seek(0)

print(sample_file.read())
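To see the cursor move, the sketch below records what each read() returns. It assumes a throwaway file (the name cursor_demo.txt is made up) created in a temporary directory so that sample.txt is left untouched; the write and close calls at the top are only scaffolding.

```python
import os
import tempfile

# Scaffolding: create a small throwaway file containing "abc"
demo_path = os.path.join(tempfile.mkdtemp(), "cursor_demo.txt")
demo_file = open(demo_path, "w")
demo_file.write("abc")
demo_file.close()

demo_file = open(demo_path)
position_at_start = demo_file.tell()  # cursor position before any reading
first_read = demo_file.read()         # reads everything; cursor is now at the end
second_read = demo_file.read()        # nothing left to read
demo_file.seek(0)                     # move the cursor back to the beginning
third_read = demo_file.read()         # the full text again
demo_file.close()

print(position_at_start)  # 0
print(repr(first_read))   # 'abc'
print(repr(second_read))  # ''
print(repr(third_read))   # 'abc'
```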

close()

sample_file = open("sample.txt")

print(sample_file.read())

sample_file.seek(0)

print(sample_file.read())

sample_file.close()
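A file object remembers whether it has been closed, and reading from a closed file raises an error. A minimal sketch, again assuming a made-up throwaway file (close_demo.txt) in a temporary directory:

```python
import os
import tempfile

# Scaffolding: create a small throwaway file
demo_path = os.path.join(tempfile.mkdtemp(), "close_demo.txt")
demo_file = open(demo_path, "w")
demo_file.write("abc")
demo_file.close()

demo_file = open(demo_path)
closed_before = demo_file.closed  # False: the file is still open
demo_file.close()
closed_after = demo_file.closed   # True: close() has been called

try:
    demo_file.read()              # reading a closed file is not allowed
    outcome = "read succeeded"
except ValueError:
    outcome = "ValueError"

print(closed_before, closed_after, outcome)  # False True ValueError
```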

Line by Line

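readlines() returns the lines of the file as a list of strings. Each string keeps the \n newline character that ends its line.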

sample_file = open("sample.txt")

print(sample_file.readlines())

# Output:
# ['Welcome to Natural Language Processing\n',
# 'It is one of the most exciting research areas as of today\n',
# 'We will see how Python can be used to work with text files.']

sample_file = open("sample.txt")

for line in sample_file:
    print(line.split()[0])

# Output:
# Welcome
# It
# We

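The for loop iterates over the file one line at a time. split() breaks each line into a list of words, and [0] selects the first word of that list.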
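One caveat with the loop above: on a blank line, split() returns an empty list, so [0] raises an IndexError. The sketch below guards against that. The function name first_words and the demo file are assumptions made for illustration.

```python
import os
import tempfile


def first_words(path):
    """Collect the first word of every non-blank line in a text file."""
    words = []
    text_file = open(path)
    try:
        for line in text_file:
            parts = line.split()  # with no arguments, split() also drops the trailing "\n"
            if parts:             # a blank line gives [], and [][0] would raise IndexError
                words.append(parts[0])
    finally:
        text_file.close()
    return words


# Scaffolding: a throwaway file with a blank line in the middle
demo_path = os.path.join(tempfile.mkdtemp(), "lines_demo.txt")
demo_file = open(demo_path, "w")
demo_file.write("Welcome to NLP\n\nIt is exciting\n")
demo_file.close()

print(first_words(demo_path))  # ['Welcome', 'It']
```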

Writing to Text Files

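Opening a file in w mode lets you write to it; w+ lets you both write and read. Both w and w+ create the file if it does not exist and erase its existing contents if it does.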

sample_file = open("sample.txt", 'w+')

# w+ erased the file when it was opened, so this prints an empty string
print(sample_file.read())

sample_file.write("This file has been rewritten.")

sample_file.seek(0)

print(sample_file.read())

sample_file.close()

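write() writes a string to the file. To add to a file without erasing what it already contains, open it in a+ mode, which appends to the end of the file and also allows reading.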

sample_file = open("sample.txt", 'a+')

# In a+ mode the cursor starts at the end of the file, so this prints an empty string
print(sample_file.read())

sample_file.write("\nThis file has been appended to.")

sample_file.seek(0)

print(sample_file.read())

sample_file.close()
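The difference between overwriting and appending can be checked side by side. The sketch below is illustrative only: it assumes a made-up throwaway file (modes_demo.txt) and a small hypothetical helper, read_text, so that sample.txt is not modified.

```python
import os
import tempfile

demo_path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")


def read_text(path):
    """Hypothetical helper: return the whole file as one string."""
    text_file = open(path)
    text = text_file.read()
    text_file.close()
    return text


# "w" erases the file every time it is opened, so only the last write survives
for content in ("first", "second"):
    demo_file = open(demo_path, "w")
    demo_file.write(content)
    demo_file.close()
after_w = read_text(demo_path)

# "a+" keeps the existing text and starts with the cursor at the end
demo_file = open(demo_path, "a+")
read_before_seek = demo_file.read()  # cursor is at the end: nothing to read
demo_file.write("\nthird")           # added after the existing text
demo_file.seek(0)
read_after_seek = demo_file.read()
demo_file.close()

print(repr(after_w))           # 'second'
print(repr(read_before_seek))  # ''
print(repr(read_after_seek))   # 'second\nthird'
```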

The \n at the start of the string puts the appended text on a new line.

A with statement keeps a file open only for the duration of an indented block:

with open("sample.txt") as sample_file:
    print(sample_file.read())
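The sketch below checks that a with block closes the file on the way out, including when the block is interrupted by an error. The file name with_demo.txt and the simulated RuntimeError are assumptions made for the demonstration.

```python
import os
import tempfile

# Scaffolding: create a small throwaway file
demo_path = os.path.join(tempfile.mkdtemp(), "with_demo.txt")
with open(demo_path, "w") as demo_file:
    demo_file.write("abc")

with open(demo_path) as demo_file:
    text = demo_file.read()
    closed_inside = demo_file.closed   # False: still inside the block
closed_outside = demo_file.closed      # True: the block has ended

# The file is also closed when the block exits because of an exception
try:
    with open(demo_path) as demo_file:
        raise RuntimeError("simulated failure")
except RuntimeError:
    closed_after_error = demo_file.closed

print(closed_inside, closed_outside, closed_after_error)  # False True True
```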

Once the with block ends, the file is closed automatically, so no explicit close() call is needed.

Practice

.csv

.txt

.json

File Encoding

When attempting to read from a file, you may run into a UnicodeDecodeError. The error occurs when some of the bytes in the file are not valid in the encoding Python is using to decode it.

An encoding maps the characters of human language (e.g., 'a', '!', '7') to bytes and back. Not every encoding covers every character, and the same bytes can stand for different characters in different encodings. A UnicodeDecodeError therefore tells us that the encoding used by open() cannot decode some of the bytes in the file. When no encoding is given, open() uses a platform-dependent default, such as the cp1252 shown in the earlier output.
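A short illustration of how the same text becomes different bytes under different encodings, and what happens when bytes are decoded with the wrong one. The sample word is an arbitrary choice.

```python
text = "caf\u00e9"  # "cafe" with an acute accent on the e, written as an escape

utf8_bytes = text.encode("utf-8")      # the accented letter becomes two bytes
latin1_bytes = text.encode("latin-1")  # the accented letter becomes one byte
print(utf8_bytes)    # b'caf\xc3\xa9'
print(latin1_bytes)  # b'caf\xe9'

# Latin-1 bytes are not valid UTF-8 here, so decoding fails
try:
    latin1_bytes.decode("utf-8")
    decode_result = "decoded"
except UnicodeDecodeError:
    decode_result = "UnicodeDecodeError"
print(decode_result)  # UnicodeDecodeError

# Decoding UTF-8 bytes as Latin-1 raises no error but gives the wrong text
garbled = utf8_bytes.decode("latin-1")
print(garbled == text)  # False
```

The last case is worth noting: a wrong encoding does not always raise an error, so a successful read is not proof that the encoding was correct.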

You can specify the encoding of the input file when you first open it with the following code:

data_file = open("filename.csv", "r", encoding="utf-8")

Often, setting the encoding to UTF-8 allows the program to read the file successfully. Other common file encodings include Latin-1 and ASCII, so you may need to try different encoding arguments to find the correct one for your file.
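Trying several encodings in turn can be automated. The following is a rough sketch, not a definitive approach: the function name read_with_fallback, the candidate list, and the demo file are all assumptions made for illustration.

```python
import os
import tempfile


def read_with_fallback(path, encodings=("utf-8", "cp1252", "latin-1")):
    """Try each encoding in order; return (text, encoding) for the first that works.

    Hypothetical helper. Latin-1 goes last because it can decode any byte
    sequence, so it never raises UnicodeDecodeError.
    """
    for encoding in encodings:
        try:
            with open(path, "r", encoding=encoding) as data_file:
                return data_file.read(), encoding
        except UnicodeDecodeError:
            continue  # this encoding cannot decode the file; try the next one
    raise ValueError(f"none of {encodings} could decode {path}")


# Scaffolding: write raw bytes that are valid cp1252 but not valid UTF-8
demo_path = os.path.join(tempfile.mkdtemp(), "encoded_demo.csv")
with open(demo_path, "wb") as raw_file:
    raw_file.write(b"name\ncaf\xe9\n")

text, used_encoding = read_with_fallback(demo_path)
print(used_encoding)  # cp1252
```

A decode that succeeds is not necessarily the right one, so it is still worth checking the resulting text by eye, or finding out the file's encoding from its source when possible.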
