In a Bash script, how do you join multiple lines from a file?

Joining multiple lines from a file in Bash

The most common need is collapsing multi-line records into single lines. Here are the practical approaches:

Using tr to remove newlines

The simplest method removes all newlines:

tr -d '\n' < input.txt > output.txt

This works well when none of the line breaks need to be preserved. For CSV or other structured data where the joined lines need a space between them:

tr '\n' ' ' < input.txt > output.txt
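To see both variants side by side, here they are on three made-up sample lines (piped in so no file is needed):

```shell
# Remove newlines entirely: the lines run together
printf 'alpha\nbeta\ngamma\n' | tr -d '\n'     # alphabetagamma

# Replace each newline with a space: note the trailing space,
# since the final newline is translated too
printf 'alpha\nbeta\ngamma\n' | tr '\n' ' '    # alpha beta gamma 
```

If the trailing space matters, trim it afterwards, e.g. with `sed 's/ $//'`.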

Using paste to merge adjacent lines

If you need to join every N lines, paste is efficient:

paste -d' ' - - - < input.txt

This joins every 3 lines with spaces. Adjust the number of - arguments for different groupings.
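A quick demonstration on six made-up lines, grouped three at a time:

```shell
# Each '-' reads one line from stdin per output row,
# so three dashes consume three input lines per row
printf 'a\nb\nc\nd\ne\nf\n' | paste -d' ' - - -
# a b c
# d e f
```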

Using sed for conditional joining

Join lines that match a pattern or end with a continuation character:

sed -e ':a' -e '$!N' -e '$!ba' -e 's/\n/ /g' input.txt

This reads all lines into the pattern space, then replaces newlines with spaces. Expression by expression:

:a          label 'a'
$!N         if not on the last line, append the next line to the pattern space
$!ba        if not on the last line, branch back to 'a'
s/\n/ /g    replace every newline with a space

Note that the command itself must stay on one line (or use backslash continuations); the breakdown above is annotation, not runnable shell.
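Running the one-liner on three sample lines confirms the behavior:

```shell
# All three lines accumulate in the pattern space,
# then the embedded newlines become spaces
printf 'a\nb\nc\n' | sed -e ':a' -e '$!N' -e '$!ba' -e 's/\n/ /g'
# a b c
```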

Using awk for field-based joining

For structured data, awk provides more control:

awk 'BEGIN{RS=""} {gsub(/\n/, " "); print}' input.txt

Setting RS="" puts awk in "paragraph mode", treating blank lines as record separators (useful for multi-line records). A variant that also collapses runs of whitespace within each record:

awk 'BEGIN{RS=""; OFS=" "} {$1=$1; print}' input.txt

The $1=$1 trick forces field re-evaluation with the new OFS.
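Here is paragraph mode in action on two made-up blank-line-separated records:

```shell
# Record 1 is "one\ntwo", record 2 is "three\nfour";
# $1=$1 rebuilds each record with OFS between the fields
printf 'one\ntwo\n\nthree\nfour\n' | awk 'BEGIN{RS=""; OFS=" "} {$1=$1; print}'
# one two
# three four
```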

Joining lines ending with backslash

For continuation lines (common in config files):

sed -e ':a' -e '/\\$/{N; s/\\\n//; ba' -e '}' input.txt
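To check the behavior on illustrative input, where the first line ends with a backslash and the third does not:

```shell
# 'foo \' continues onto 'bar'; 'baz' is left alone
printf 'foo \\\nbar\nbaz\n' | sed -e ':a' -e '/\\$/{N; s/\\\n//; ba' -e '}'
# foo bar
# baz
```

Splitting the `{ }` block across separate -e arguments, as above, keeps the command compatible with BSD sed as well as GNU sed.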

Real-world example: processing log entries

If you have multi-line log entries separated by blank lines:

awk 'BEGIN{RS=""; OFS=" | "} {gsub(/\n/, " | "); print NR, $0}' logfile.txt
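With two made-up log entries, each entry collapses to one numbered line (OFS supplies the separator between NR and the record):

```shell
printf 'ERROR disk full\nretrying\n\nWARN high load\n' |
  awk 'BEGIN{RS=""; OFS=" | "} {gsub(/\n/, " | "); print NR, $0}'
# 1 | ERROR disk full | retrying
# 2 | WARN high load
```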

Bash parameter expansion approach

For small files entirely in memory:

mapfile -t lines < input.txt
result=$(printf '%s ' "${lines[@]}")
echo "${result% }"  # remove trailing space

Performance considerations

  • tr is fastest for simple newline removal
  • paste excels at joining fixed numbers of lines
  • sed works well in pipelines without intermediate files
  • awk best for conditional logic or field manipulation
  • Bash loops are slowest but most readable for complex logic

For files over 1GB, avoid loading entirely into memory. Use streaming tools (sed, awk, tr) or process in chunks with split.

Common pitfalls

Preserving intentional spacing: If your file has meaningful blank lines or specific spacing, be explicit:

awk 'NF {printf "%s ", $0} END {print ""}' input.txt

This joins non-empty lines while skipping blanks.
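For example, with blank lines mixed into made-up input:

```shell
# NF is zero for blank lines, so they are skipped;
# each non-empty line is printed with a trailing space
printf 'a\n\nb\nc\n\n' | awk 'NF {printf "%s ", $0} END {print ""}'
# a b c
```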

Handling special characters: When joining, ensure delimiters work with your data:

paste -d'|' - - < input.txt  # Use pipe instead of space

Line ending issues: On files with mixed line endings (CRLF and LF), normalize first:

dos2unix input.txt
tr -d '\n' < input.txt > output.txt

If dos2unix isn't installed, tr -d '\r\n' strips both carriage returns and newlines in a single pass.

Use the right tool for your specific use case: simple removal calls for tr, structured data favors awk, and complex logic benefits from sed patterns or pure Bash when performance isn't critical.
