Python Merge CSV Files Example

If you have several CSV files with the same columns, you can combine them into one output file with a short Python script.

This example shows a simple beginner-friendly way to:

  • read all .csv files from one folder
  • merge them into one file
  • write the header row only once
  • skip empty files safely

It also helps you avoid common problems such as repeated headers, wrong file paths, and CSV encoding issues.

Quick example

import csv
from pathlib import Path

input_folder = Path("csv_files")
output_file = Path("merged.csv")

csv_files = sorted(input_folder.glob("*.csv"))

with output_file.open("w", newline="", encoding="utf-8") as outfile:
    writer = None

    for file_path in csv_files:
        with file_path.open("r", newline="", encoding="utf-8") as infile:
            reader = csv.reader(infile)
            header = next(reader, None)

            if header is None:
                continue

            if writer is None:
                writer = csv.writer(outfile)
                writer.writerow(header)

            for row in reader:
                writer.writerow(row)

print(f"Merged {len(csv_files)} files into {output_file}")

This example merges all .csv files in one folder and writes the header only once.

What this example does

This script:

  • Combines multiple CSV files into one file
  • Reads all .csv files from a folder
  • Writes the header row only once
  • Skips empty files safely

When to use this approach

Use this script when:

  • You have many CSV files with the same columns
  • You want one combined file for analysis
  • You need a simple script without extra libraries
  • You are learning file handling and loops

If you are new to file operations, it may also help to read Python file handling basics first.

How the script works

Here is the same script again with the key idea in mind: read each file, keep only the first header, and write all data rows into one output file.

import csv
from pathlib import Path

input_folder = Path("csv_files")
output_file = Path("merged.csv")

csv_files = sorted(input_folder.glob("*.csv"))

with output_file.open("w", newline="", encoding="utf-8") as outfile:
    writer = None

    for file_path in csv_files:
        with file_path.open("r", newline="", encoding="utf-8") as infile:
            reader = csv.reader(infile)
            header = next(reader, None)

            if header is None:
                continue

            if writer is None:
                writer = csv.writer(outfile)
                writer.writerow(header)

            for row in reader:
                writer.writerow(row)

print(f"Merged {len(csv_files)} files into {output_file}")

Step by step, the script does this:

  • Use pathlib.Path to find CSV files in a folder
  • Open the output file in write mode
  • Loop through each input CSV file
  • Read the first row as the header
  • Write the first header only once
  • Write all remaining rows to the output file

If you want to understand the reading side in more detail, see how to read a CSV file in Python. For the writing side, see how to write a CSV file in Python.

Key lines explained

input_folder.glob("*.csv")

csv_files = sorted(input_folder.glob("*.csv"))

This finds all files ending in .csv inside the csv_files folder.

  • glob("*.csv") matches only names ending in .csv directly inside that folder, not in subfolders
  • sorted(...) gives a predictable file order

That means if your folder contains:

  • part1.csv
  • part2.csv
  • part3.csv

they will be processed in that order, because sorted() orders the paths by name.
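
One caveat: sorted() compares file names as text, so part10.csv comes before part2.csv. The sketch below shows this, along with one possible workaround. The file names and the natural_key helper are hypothetical and are not part of the original script.

```python
import re
from pathlib import Path

# Hypothetical file names used only for illustration.
paths = [Path("part10.csv"), Path("part2.csv"), Path("part1.csv")]

# Default sort: names compare as text, so "part10" comes before "part2".
text_order = [p.name for p in sorted(paths)]


def natural_key(path):
    """Split the name into text and number chunks so that 2 sorts before 10."""
    chunks = re.split(r"(\d+)", path.name)
    return [int(chunk) if chunk.isdigit() else chunk.lower() for chunk in chunks]


natural_order = [p.name for p in sorted(paths, key=natural_key)]

print(text_order)     # ['part1.csv', 'part10.csv', 'part2.csv']
print(natural_order)  # ['part1.csv', 'part2.csv', 'part10.csv']
```

If your file names use zero-padded numbers (part01.csv, part02.csv), plain sorted() already gives the order you expect.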

next(reader, None)

header = next(reader, None)

This reads the first row from the CSV file.

  • If the file has data, the first row is stored in header
  • If the file is empty, next() returns the default value given as its second argument, which is None here

That is why the script can safely skip empty files:

if header is None:
    continue
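
To see this behavior without creating real files, the short sketch below uses io.StringIO as a stand-in for an open file. It also shows an edge case: a file that contains only a blank line is not empty, so its first row comes back as an empty list rather than None.

```python
import csv
import io

# io.StringIO stands in for an open file so the example needs no real files.
empty_file = io.StringIO("")
blank_line_file = io.StringIO("\n")
normal_file = io.StringIO("name,amount\nAlice,100\n")

empty_header = next(csv.reader(empty_file), None)
blank_header = next(csv.reader(blank_line_file), None)
normal_header = next(csv.reader(normal_file), None)

print(empty_header)   # None
print(blank_header)   # []
print(normal_header)  # ['name', 'amount']
```

If your folder may contain files with only blank lines, checking "if not header:" instead of "if header is None:" would skip both cases.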

writer is None

if writer is None:
    writer = csv.writer(outfile)
    writer.writerow(header)

At the start, writer is set to None.

That lets the script know that it has not written the output header yet. When it reaches the first non-empty file:

  • it creates the CSV writer
  • it writes that file's header row once

After that, writer is no longer None, so later headers are skipped.

newline=""

with output_file.open("w", newline="", encoding="utf-8") as outfile:

Using newline="" is important when working with the csv module.

The csv writer adds its own line endings (\r\n by default). Without newline="", Python can translate those endings a second time on Windows, which shows up as an extra blank line after every row.
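
The small sketch below makes the writer's line ending visible. It writes to an in-memory io.StringIO buffer instead of a real file, which is an assumption made only to keep the example self-contained.

```python
import csv
import io

# newline="" keeps the writer's line endings untouched, as it does in open().
buffer = io.StringIO(newline="")
csv.writer(buffer).writerow(["name", "amount"])

# repr() makes the normally invisible line ending visible.
written = buffer.getvalue()
print(repr(written))  # 'name,amount\r\n'
```

Because the writer already ends each row with \r\n, the file object should not add or change line endings itself.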

If you want more detail on file opening modes and parameters, see Python open() explained.

Expected input structure

This example works best when:

  • All files are real CSV files
  • Each file uses the same column order
  • Each file starts with the same header row
  • Files are stored in the folder path used in the script

For example, your folder might look like this:

project/
├── merge_csv.py
└── csv_files/
    ├── sales_jan.csv
    ├── sales_feb.csv
    └── sales_mar.csv

And each file might contain data like this:

name,amount
Alice,100
Bob,200

Expected output

After running the script, you should get:

  • One merged CSV file
  • One header row at the top
  • All data rows from the input files below it
  • Rows in the order of the sorted file names (alphabetical, so sales_feb.csv comes before sales_jan.csv)

Example output:

name,amount
Alice,100
Bob,200
Carol,150
David,300
Eve,250
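
To try the merge without touching your own files, the sketch below builds a temporary folder with sample data and runs the same logic wrapped in a function. The function name merge_csv_files and the sample rows are assumptions made for this illustration. Note that the rows follow alphabetical file-name order, so the February rows come before the January rows.

```python
import csv
import tempfile
from pathlib import Path


def merge_csv_files(input_folder, output_file):
    """Merge every .csv file in input_folder, keeping only the first header."""
    writer = None
    with output_file.open("w", newline="", encoding="utf-8") as outfile:
        for file_path in sorted(input_folder.glob("*.csv")):
            with file_path.open("r", newline="", encoding="utf-8") as infile:
                reader = csv.reader(infile)
                header = next(reader, None)
                if header is None:
                    continue  # empty file: nothing to copy
                if writer is None:
                    writer = csv.writer(outfile)
                    writer.writerow(header)
                writer.writerows(reader)


# Hypothetical sample data for three monthly files plus one empty file.
samples = {
    "sales_jan.csv": "name,amount\nAlice,100\nBob,200\n",
    "sales_feb.csv": "name,amount\nCarol,150\nDavid,300\n",
    "sales_mar.csv": "name,amount\nEve,250\n",
    "empty.csv": "",
}

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp) / "csv_files"
    folder.mkdir()
    for name, text in samples.items():
        (folder / name).write_text(text, encoding="utf-8")

    merged_path = Path(tmp) / "merged.csv"
    merge_csv_files(folder, merged_path)

    with merged_path.open("r", newline="", encoding="utf-8") as infile:
        merged_rows = list(csv.reader(infile))

for row in merged_rows:
    print(row)
# ['name', 'amount']
# ['Carol', '150']
# ['David', '300']
# ['Alice', '100']
# ['Bob', '200']
# ['Eve', '250']
```

If you need calendar order, file names such as sales_01_jan.csv and sales_02_feb.csv sort the way you expect.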

Common problems when merging CSV files

Here are some common issues beginners run into.

Headers get repeated in the output file

This happens when you write the first row from every input file.

To avoid that:

  • write the header only for the first non-empty file
  • skip headers in all later files

The example does this with:

if writer is None:
    writer = csv.writer(outfile)
    writer.writerow(header)

Files have different columns

This simple script assumes every file has the same columns in the same order.

If one file has different headers, its rows are still written under the first file's header, so values can end up in the wrong columns.

Common cause:

  • Using CSV files with different headers

A simple improvement is to compare each header to the first one before writing rows.

The folder path is wrong

If the folder path is wrong, glob() finds nothing. In that case csv_files is an empty list, usually no error is raised, and the script writes an empty output file.

Common cause:

  • Using the wrong folder path

Useful debug checks:

print(csv_files)
print(len(csv_files))
print(output_file.resolve())

If you get file path errors, see FileNotFoundError: No such file or directory fix.
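
One way to catch a wrong path early is to check the folder before merging. The sketch below is a possible guard, not part of the original script. The function name find_csv_files and the error messages are assumptions.

```python
from pathlib import Path


def find_csv_files(input_folder):
    """Return sorted .csv paths, or raise a clear error if none can be found."""
    folder = Path(input_folder)
    if not folder.is_dir():
        raise FileNotFoundError(f"Folder not found: {folder.resolve()}")
    csv_files = sorted(folder.glob("*.csv"))
    if not csv_files:
        raise FileNotFoundError(f"No .csv files in: {folder.resolve()}")
    return csv_files


# Hypothetical folder name that is assumed not to exist.
try:
    find_csv_files("folder_that_does_not_exist")
except FileNotFoundError as error:
    message = str(error)
    print(message)
```

Printing the resolved path shows where Python actually looked, which is often different from where you expected when the script is run from another working directory.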

The folder includes empty files

An empty file has no header row and no data rows.

This example handles that safely with:

header = next(reader, None)

if header is None:
    continue

Encoding problems cause read errors

Some CSV files are not saved as UTF-8.

Common cause:

  • Trying to merge files with a different text encoding

If you see decoding errors, check the file encoding or try a different one such as "latin-1" if appropriate.
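
If you do not know the encoding in advance, one possible approach is to try a short list of encodings in order. The sketch below is an assumption-based illustration: the function name read_csv_rows and the chosen encoding list are hypothetical, and the right list depends on where your files come from. Keep in mind that "latin-1" can decode any byte sequence, so it never fails but may produce wrong characters. That is why it is placed last.

```python
import csv
import tempfile
from pathlib import Path


def read_csv_rows(path, encodings=("utf-8-sig", "cp1252", "latin-1")):
    """Return (rows, encoding) using the first encoding that decodes the file."""
    for encoding in encodings:
        try:
            with path.open("r", newline="", encoding=encoding) as infile:
                # list() reads the whole file here, so decode errors surface now.
                return list(csv.reader(infile)), encoding
        except UnicodeDecodeError:
            continue
    raise ValueError(f"Could not decode {path} with any of {encodings}")


with tempfile.TemporaryDirectory() as tmp:
    sample_path = Path(tmp) / "legacy.csv"
    # Hypothetical file saved in Windows-1252 instead of UTF-8.
    sample_path.write_bytes("name,city\nJos\u00e9,Lyon\n".encode("cp1252"))
    rows, used_encoding = read_csv_rows(sample_path)

print(used_encoding)  # cp1252
print(rows)           # [['name', 'city'], ['José', 'Lyon']]
```

The "utf-8-sig" codec reads plain UTF-8 as well as UTF-8 files that start with a byte order mark, which some spreadsheet programs add.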

Related help: UnicodeDecodeError UTF-8 codec can't decode byte fix.

Simple improvements

Once the basic script works, you can improve it in small steps.

You could:

  • Check that all headers match before writing rows
  • Skip files that do not match the first header
  • Print each file name while processing
  • Use csv.DictReader for column-based handling
  • Write only selected columns if needed

Here is a slightly safer version that checks headers before merging rows:

import csv
from pathlib import Path

input_folder = Path("csv_files")
output_file = Path("merged.csv")

csv_files = sorted(input_folder.glob("*.csv"))
expected_header = None

with output_file.open("w", newline="", encoding="utf-8") as outfile:
    writer = None

    for file_path in csv_files:
        print(f"Processing: {file_path}")

        with file_path.open("r", newline="", encoding="utf-8") as infile:
            reader = csv.reader(infile)
            header = next(reader, None)

            if header is None:
                print("  Skipped empty file")
                continue

            if expected_header is None:
                expected_header = header
                writer = csv.writer(outfile)
                writer.writerow(header)
            elif header != expected_header:
                print("  Skipped file with different header")
                continue

            for row in reader:
                writer.writerow(row)

print(f"Merged files into {output_file}")
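
Another item from the list above, writing only selected columns, can be sketched with csv.DictReader and csv.DictWriter. The input data and the wanted_columns list are hypothetical, and in-memory buffers stand in for real files.

```python
import csv
import io

# Hypothetical input with an extra column that should not reach the output.
source = io.StringIO("name,amount,notes\nAlice,100,first order\nBob,200,\n")
wanted_columns = ["name", "amount"]

output = io.StringIO(newline="")
reader = csv.DictReader(source)
# extrasaction="ignore" drops keys that are not listed in fieldnames.
writer = csv.DictWriter(output, fieldnames=wanted_columns, extrasaction="ignore")
writer.writeheader()
for row in reader:
    writer.writerow(row)

selected_text = output.getvalue()
print(selected_text)
```

Because rows are handled by column name instead of position, this approach also works when the input files list the same columns in a different order.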

Common causes

Many merge problems come from a few common mistakes:

  • Using CSV files with different headers
  • Forgetting to skip the header row after the first file
  • Using the wrong folder path
  • Trying to merge files with a different text encoding
  • Opening CSV files without newline=""

Useful debugging prints

If your script is not working as expected, these quick checks can help:

print(csv_files)
print(file_path)
print(header)
print(len(csv_files))
print(output_file.resolve())

These tell you:

  • which files were found
  • which file is being processed
  • what header was read
  • how many files matched
  • where the output file is being saved

FAQ

Can I merge CSV files with different columns?

Yes, but this simple example assumes the same columns in every file. If columns differ, use csv.DictReader and handle missing fields carefully.
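
As a rough sketch of that idea, the example below reads the files twice: once to collect every column name, and once to write the rows. The function name merge_with_all_columns and the sample files are assumptions. Missing values are written as empty strings through the restval parameter of csv.DictWriter.

```python
import csv
import tempfile
from pathlib import Path


def merge_with_all_columns(csv_files, output_file):
    """Merge files whose columns differ; missing values become empty strings."""
    # First pass: collect every column name, in first-seen order.
    fieldnames = []
    for file_path in csv_files:
        with file_path.open("r", newline="", encoding="utf-8") as infile:
            # fieldnames is None for an empty file, hence the "or []".
            for name in csv.DictReader(infile).fieldnames or []:
                if name not in fieldnames:
                    fieldnames.append(name)

    # Second pass: write rows; restval fills columns that a file does not have.
    with output_file.open("w", newline="", encoding="utf-8") as outfile:
        writer = csv.DictWriter(outfile, fieldnames=fieldnames, restval="")
        writer.writeheader()
        for file_path in csv_files:
            with file_path.open("r", newline="", encoding="utf-8") as infile:
                writer.writerows(csv.DictReader(infile))


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    # Hypothetical files: the second one has "city" instead of "amount".
    (folder / "a.csv").write_text("name,amount\nAlice,100\n", encoding="utf-8")
    (folder / "b.csv").write_text("name,city\nBob,Paris\n", encoding="utf-8")

    merged_path = folder / "merged.csv"
    merge_with_all_columns([folder / "a.csv", folder / "b.csv"], merged_path)

    with merged_path.open("r", newline="", encoding="utf-8") as infile:
        merged_rows = list(csv.reader(infile))

print(merged_rows)
# [['name', 'amount', 'city'], ['Alice', '100', ''], ['Bob', '', 'Paris']]
```

This sketch assumes every data row has no more fields than its header. Rows with extra fields would need additional handling.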

Why is the header repeated more than once?

This usually happens when you write the first row from every file. Write the header only for the first non-empty file.

What if one of the files is empty?

The example uses next(reader, None), so empty files are skipped safely.

Can I merge CSV files from subfolders too?

Yes. You can use input_folder.rglob("*.csv") instead of glob("*.csv") to search recursively.
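
The sketch below compares glob() and rglob() on a temporary folder tree. The folder layout is hypothetical. It also shows one precaution that matters more with recursive searches: if the output file sits inside the input folder, filter it out so the script does not read its own output.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    input_folder = Path(tmp) / "csv_files"
    (input_folder / "2024").mkdir(parents=True)
    (input_folder / "top.csv").write_text("name,amount\n", encoding="utf-8")
    (input_folder / "2024" / "nested.csv").write_text("name,amount\n", encoding="utf-8")

    # Hypothetical case: the output file lives inside the input folder.
    output_file = input_folder / "merged.csv"
    output_file.write_text("", encoding="utf-8")

    # glob() looks only at the top level of the folder.
    top_level = sorted(p.name for p in input_folder.glob("*.csv"))

    # rglob() also searches subfolders; the output file is filtered out.
    recursive = sorted(
        p.relative_to(input_folder).as_posix()
        for p in input_folder.rglob("*.csv")
        if p.resolve() != output_file.resolve()
    )

print(top_level)  # ['merged.csv', 'top.csv']
print(recursive)  # ['2024/nested.csv', 'top.csv']
```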

See also

If this example helped you get started, the next step is to read the task-focused guides on reading, writing, and troubleshooting CSV files.