Reading and Writing Text Files

7 min read

Real test work involves files: a list of test cases on disk, a fixture pulled from a CSV, a run summary written for the next CI step, a log line appended after every test. Python's file API is short — open, read, write, close — wrapped in a with statement that handles cleanup for you. This lesson covers reading whole files and one line at a time, the four modes you'll actually use (r, w, a, r+), the modern pathlib approach to file paths, and why with open(...) is the only pattern you should ever use to open a file.

The with open() pattern

with open("test_log.txt", "r") as f:
    content = f.read()
print(content)

Read each piece:

  • open("test_log.txt", "r") — opens the file, returns a file object. "r" is read mode (the default).
  • with ... as f: — Python's context manager. The block runs with f bound to the file object. When the block ends — either normally or because an exception was raised — Python automatically closes the file.
  • f.read() — returns the entire file contents as a single string.

Compare to Java's try-with-resources (try (BufferedReader r = new BufferedReader(...)) { ... }) — same concept, far less ceremony. The with block is the idiomatic way to open a file in Python. Skip it and you risk leaked file handles in error paths.

File modes

The second argument to open() chooses the mode:

ModeMeaning
"r"Read (default). Errors if the file doesn't exist.
"w"Write. Truncates the file (replaces contents) — be careful. Creates if absent.
"a"Append. Writes go to the end. Existing content preserved.
"r+"Read + write. Cursor starts at byte 0; existing content stays.
"x"Exclusive create. Errors if the file already exists.

Add "b" (e.g. "rb") for binary mode — needed for images, PDFs, anything non-text. Default is text mode, which decodes as UTF-8 on most systems.

Reading line by line

For anything large, reading the whole file with f.read() can blow up memory. The idiomatic alternative is to iterate the file object directly — Python yields one line per loop:

with open("test_cases.txt", "r") as f:
    for line in f:
        print(line.strip())

line.strip() removes the trailing "\n" (and any leading or trailing whitespace). Without it, every print would add its own newline on top of the line's newline, doubling the spacing.

If you do want every line at once, f.readlines() returns a list:

with open("test_cases.txt", "r") as f:
    lines = f.readlines()
print(f"{len(lines)} lines")

Writing — a simple report

from datetime import datetime
 
passed = 28
failed = 4
 
with open("report.txt", "w") as f:
    f.write("Test Report\n")
    f.write(f"Date: {datetime.now()}\n")
    f.write(f"Passed: {passed}, Failed: {failed}\n")

Two important things:

  • f.write(s) does not add a newline. Unlike print, you have to add \n yourself. Forget it and your "lines" all run together on one line.
  • "w" truncates first. The previous contents are wiped the moment the file is opened. If you want to preserve them, use "a" (append).

For an entire list of lines: f.writelines(["one\n", "two\n", "three\n"]). Despite the name, it does not add separators — you bring the newlines yourself.

Appending — the right mode for a log

A test log accumulates over time. Use "a" so each run adds to the bottom rather than wiping previous runs:

from datetime import datetime
 
with open("test.log", "a") as f:
    f.write(f"[{datetime.now().isoformat()}] login_test PASSED\n")

Run that script twice and you'll see two lines in test.log. Run it with "w" instead and the second run replaces the first — a classic source of "where did my logs go?" puzzles.

Paths the modern way — pathlib

Python's older API uses string paths and os.path helpers (os.path.join, os.path.exists). The modern alternative is pathlib.Path, which makes paths objects with methods:

from pathlib import Path
 
data_dir = Path("fixtures")
user_file = data_dir / "users.json"      # / operator joins paths cleanly
 
if user_file.exists():
    content = user_file.read_text()
    print(f"Read {len(content)} characters")

pathlib highlights:

  • Path("a") / "b" / "c.txt" — the / operator joins path parts. Works on Windows (\\) and Unix (/) without you thinking about separators.
  • .exists(), .is_file(), .is_dir(), .suffix, .name, .parent — handy methods on every path object.
  • .read_text() / .write_text(s) — one-liner shortcuts when you just want all the text.

For QA scripts, prefer pathlib. It's been the standard since Python 3.4, every IDE knows it, and it removes a category of cross-platform path bugs.

Why with matters — the cleanup guarantee

The reason every Python tutorial insists on with open(...):

# Don't do this
f = open("report.txt", "w")
f.write("Test report\n")
process(...)        # if this raises, f never closes
f.close()

If process(...) raises, the f.close() line is skipped. The file handle leaks. On long-running processes (a CI agent, a server) leaks accumulate until the OS runs out of handles.

The with form closes the file in a hidden finally block — guaranteed, even on exceptions:

# Do this
with open("report.txt", "w") as f:
    f.write("Test report\n")
    process(...)        # exception or not, f closes when the block ends

Same idea as Java's try-with-resources or C#'s using. Always reach for it.

Encoding — UTF-8 by default, but make it explicit

Python's default text encoding has bitten more than one CI pipeline:

# Better — explicit UTF-8 on every platform
with open("report.txt", "w", encoding="utf-8") as f:
    f.write("Pass rate: 87.5% — 28/32 ✅\n")

On macOS and Linux, UTF-8 is the default; on some Windows configurations it isn't. Specifying encoding="utf-8" makes your script behave the same everywhere — worth doing for any file you share or commit.

A QA example — read inputs, run, write outputs

A complete end-to-end script: read a list of test names from a file, "run" each (we'll fake it with a function), and write a results report:

from pathlib import Path
from datetime import datetime
import random
 
def run_test(name: str) -> dict:
    """Pretend to run a test. Returns a result dict."""
    return {
        "name": name,
        "status": random.choice(["PASS", "FAIL"]),
        "duration_ms": random.randint(100, 3000)
    }
 
input_path = Path("fixtures/test_cases.txt")
output_path = Path("output/report.txt")
output_path.parent.mkdir(parents=True, exist_ok=True)   # create output/ if missing
 
with input_path.open("r", encoding="utf-8") as f:
    test_names = [line.strip() for line in f if line.strip()]
 
results = [run_test(name) for name in test_names]
 
with output_path.open("w", encoding="utf-8") as f:
    f.write(f"Test Report — {datetime.now().isoformat(timespec='seconds')}\n")
    f.write("-" * 40 + "\n")
    for r in results:
        f.write(f"{r['name']:<20} {r['status']:<6} {r['duration_ms']:>5} ms\n")
    passed = sum(1 for r in results if r["status"] == "PASS")
    f.write(f"\n{passed}/{len(results)} passed\n")
 
print(f"Wrote {output_path}")

Three patterns at once: list comprehension to filter blank lines on read, pathlib.Path.open(...), and a for loop to write a formatted report. The output_path.parent.mkdir(parents=True, exist_ok=True) line is a useful idiom — create the parent directory if it isn't already there.

What the with block actually does

Step 1 of 6

Resolve the path

Path('fixtures/users.json') — pathlib joins parts safely across operating systems.

That guarantee — open, work, close, even if something fails — is what makes with open(...) the only pattern worth memorising.

⚠️ Common mistakes

  • Opening with "w" when you meant "a". "w" wipes the file the moment it opens. If you wanted to add to the bottom of a log, use "a". There's no recovering the previous content from disk afterwards.
  • Forgetting \n in f.write. f.write("line one") followed by f.write("line two") produces line oneline two — the writes don't add newlines for you. Pass "\n" explicitly, or use print(..., file=f) which does.
  • Not using with. A bare f = open(...) followed by manual f.close() leaks the handle on any exception in between. Always with open(...) as f: — the cost is one keyword and the safety is total.

🎯 Practice task

Read inputs, transform, write outputs. 25-30 minutes.

  1. Create fixtures/test_cases.txt containing five or six test names, one per line. (Use any editor.)
  2. Create read_run_write.py. At the top: from pathlib import Path and from datetime import datetime.
  3. Read the file with with input.open("r", encoding="utf-8") as f: and a comprehension to skip blank lines: names = [line.strip() for line in f if line.strip()].
  4. Define def run_test(name: str) -> dict: returning a dict with name, status ("PASS"/"FAIL" — pick by length parity, e.g. len(name) % 2 == 0), and duration_ms (e.g. len(name) * 100).
  5. Build results = [run_test(n) for n in names].
  6. Write to output/report.txt:
    • A header line with datetime.now().isoformat(timespec="seconds").
    • One line per result, formatted like login_test PASS 1240 ms.
    • A footer with passed/total.
    • Don't forget \n and encoding="utf-8".
    • Use output.parent.mkdir(parents=True, exist_ok=True) to ensure the directory exists.
  7. Run python read_run_write.py and inspect output/report.txt.
  8. Now switch the output file mode from "w" to "a" and run the script twice. Confirm the report appended (two timestamps in the file).
  9. Stretch: add a simple log file with "a" mode. Each run, append [<timestamp>] ran N tests, P passed\n to output/run.log. After three runs you should have three lines.

You can now read, write, and append to any text file Python can reach. The next lesson moves to the format every test data spreadsheet uses: CSV.

// tip to track lessons you complete and pick up where you left off across devices.