cat (Unix)
cat is a standard Unix utility that reads files sequentially, writing them to standard output. The name is derived from its function to concatenate files.
History
cat was part of the early versions of Unix, e.g., Version 1, and replaced pr, a PDP-7 utility for copying a single file to the screen.[1]
Usage
The Single Unix Specification defines cat as reading files in the sequence given in its arguments and writing their contents to standard output in the same sequence. The specification mandates support for one option flag, -u, for unbuffered output, meaning that each byte is written after it has been read. Many operating systems do this by default and ignore the flag.
If one of the input filenames is specified as a single hyphen (-), then cat reads from standard input at that point in the sequence. If no files are specified, cat reads from standard input only.
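The hyphen operand and the no-operand case can be sketched as follows. This is a minimal illustration, assuming a POSIX shell with mktemp available; the file names head.txt and tail.txt are hypothetical.

```shell
# Scratch directory so the example does not touch real files.
tmp=$(mktemp -d)
printf 'first\n' > "$tmp/head.txt"
printf 'last\n'  > "$tmp/tail.txt"

# "-" makes cat read standard input at that position in the sequence.
joined=$(printf 'middle\n' | cat "$tmp/head.txt" - "$tmp/tail.txt")
printf '%s\n' "$joined"

# With no operands at all, cat copies standard input to standard output.
echoed=$(printf 'only stdin\n' | cat)
printf '%s\n' "$echoed"

rm -rf "$tmp"
```

The first command prints the three lines in argument order, with the piped line in the middle.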
The command syntax is:
cat [options] [file_names]
The output of cat may be redirected to a file:
cat [options] [file_names] > newfile.txt
or it may be redirected as input to another program, e.g.:
cat file1 file2 | less
which invokes the less paging utility.
Options
The OpenBSD manual page and the GNU coreutils version of cat specify the following options:
- -b (GNU: --number-nonblank): number non-blank output lines
- -e: implies -v and also displays the end of each line as $ (GNU only: -E does the same without implying -v)
- -n (GNU: --number): number all output lines
- -s (GNU: --squeeze-blank): squeeze multiple adjacent blank lines into one
- -t: implies -v and also displays tabs as ^I (GNU: -T does the same without implying -v)
- -u: use unbuffered I/O for stdout. POSIX does not specify the behavior without this option.
- -v (GNU: --show-nonprinting): display nonprinting characters, except for tabs and the end-of-line character
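A few of the options listed above can be tried on a small sample. This is a sketch only, assuming a cat that supports -s, -n, -b, -e and -t (as the GNU and OpenBSD versions do); the file name sample.txt is hypothetical, and results are counted rather than printed because the exact spacing of line numbers differs between implementations.

```shell
tmp=$(mktemp -d)
# Sample with a run of three blank lines and one tab character.
printf 'alpha\n\n\n\nbeta\tgamma\n' > "$tmp/sample.txt"

# -s squeezes the run of blank lines down to a single blank line.
squeezed_lines=$(cat -s "$tmp/sample.txt" | wc -l | tr -d ' ')

# -n numbers every line, -b only the non-blank ones; count the numbered lines.
numbered_all=$(cat -n "$tmp/sample.txt" | grep -c '[0-9]')
numbered_nonblank=$(cat -b "$tmp/sample.txt" | grep -c '[0-9]')

# -e marks each line end with $, -t shows the tab as ^I (both imply -v).
marked=$(cat -et "$tmp/sample.txt" | tail -n 1)

printf '%s %s %s %s\n' "$squeezed_lines" "$numbered_all" "$numbered_nonblank" "$marked"
rm -rf "$tmp"
```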
Use cases
cat can be used to pipe a file to a program that expects plain text or binary data on its input stream. cat does not destroy non-text bytes when concatenating and outputting. As such, its two main use cases are text files and certain format-compatible types of binary files.
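The claim that non-text bytes pass through untouched can be checked with a short sketch. It assumes a POSIX shell with mktemp and cmp; the file names are hypothetical.

```shell
tmp=$(mktemp -d)
# Two small files containing NUL and high bytes, written with octal escapes.
printf '\000\001\377' > "$tmp/a.bin"
printf '\376\000\n'   > "$tmp/b.bin"

cat "$tmp/a.bin" "$tmp/b.bin" > "$tmp/joined.bin"

# Build the expected byte sequence independently and compare byte for byte.
printf '\000\001\377\376\000\n' > "$tmp/expected.bin"
if cmp -s "$tmp/joined.bin" "$tmp/expected.bin"; then
    bytes_preserved=yes
else
    bytes_preserved=no
fi
joined_size=$(wc -c < "$tmp/joined.bin" | tr -d ' ')

printf 'preserved=%s size=%s\n' "$bytes_preserved" "$joined_size"
rm -rf "$tmp"
```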
Text use
As a simple example, to concatenate two text files and write the result to a new file, you can use the following command:
cat file1.txt file2.txt > newcombinedfile.txt
Some implementations of cat, with option -n, can also number lines as follows:
cat -n file1.txt file2.txt > newnumberedfile.txt
cat copies bytes unchanged, so concatenating text only works cleanly for files that share one encoding, such as ASCII. cat does not provide a way to concatenate Unicode text files that each begin with a byte order mark (the mark of every file after the first ends up in the middle of the output) or files that use different text encodings from each other.
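The byte order mark problem can be made visible with a short sketch. It assumes a POSIX shell with mktemp, dd and od; the file names are hypothetical, and the UTF-8 byte order mark (bytes EF BB BF) is written with octal escapes.

```shell
tmp=$(mktemp -d)
# Two UTF-8 files, each starting with a byte order mark.
printf '\357\273\277one\n' > "$tmp/a.txt"
printf '\357\273\277two\n' > "$tmp/b.txt"

cat "$tmp/a.txt" "$tmp/b.txt" > "$tmp/both.txt"

# a.txt is 7 bytes long, so the second mark starts at offset 7 of both.txt.
mid_bytes=$(dd if="$tmp/both.txt" bs=1 skip=7 count=3 2>/dev/null \
    | od -An -tx1 | tr -d ' \n')

printf 'bytes at offset 7: %s\n' "$mid_bytes"
rm -rf "$tmp"
```

The three bytes found in the middle of the combined file are the second file's mark, copied through unchanged.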
Other files
For many structured binary data sets, the resulting combined file may not be valid; for example, if a file has a unique header or footer, the result will spuriously duplicate these. However, for some multimedia digital container formats, the resulting file is valid, and so cat provides an effective means of appending files. Video streams can be a significant example of files that cat can concatenate without issue, e.g. the MPEG program stream (MPEG-1 and MPEG-2) and DV (Digital Video) formats, which are fundamentally simple streams of packets.
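As an easily tested illustration of a format that tolerates byte-level concatenation, the sketch below uses gzip rather than the video formats named above. That substitution is an assumption made for convenience: the gzip format permits several compressed members in one file, and the gzip tool decompresses them in sequence. The file names are hypothetical.

```shell
tmp=$(mktemp -d)
printf 'part one\n' | gzip -c > "$tmp/a.gz"
printf 'part two\n' | gzip -c > "$tmp/b.gz"

# Byte-level concatenation of the two compressed files.
cat "$tmp/a.gz" "$tmp/b.gz" > "$tmp/both.gz"

# gzip decompresses each member in turn, yielding both texts in order.
restored=$(gzip -dc "$tmp/both.gz")

printf '%s\n' "$restored"
rm -rf "$tmp"
```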
Unix culture
Jargon file definition
The Jargon File version 4.4.7 lists this as the definition of cat:
- To spew an entire file to the screen or some other output sink without pause (syn. blast).
- By extension, to dump large amounts of data at an unprepared target or with no intention of browsing it carefully. Usage: considered silly. Rare outside Unix sites. See also dd, BLT.
Among Unix fans, cat(1) is considered an excellent example of user-interface design, because it delivers the file contents without such verbosity as spacing or headers between the files, and because it does not require the files to consist of lines of text, but works with any sort of data.
Among Unix critics, cat(1) is considered the canonical example of bad user-interface design, because of its woefully unobvious name. It is far more often used to blast a single file to standard output than to concatenate two or more files. The name cat for the former operation is just as unintuitive as, say, LISP's cdr.
Useless use of cat
UUOC (from comp.unix.shell on Usenet) stands for "useless use of cat". comp.unix.shell observes: "The purpose of cat is to concatenate (or catenate) files. If it is only one file, concatenating it with nothing at all is a waste of time, and costs you a process."[2] This is also referred to as "cat abuse". Nevertheless, the following usage is common:
cat filename | command arg1 arg2 argn
This can be rewritten using redirection of stdin instead, in either of the following forms (the first is more traditional):
command arg1 arg2 argn < filename
<filename command arg1 arg2 argn
Beyond other benefits, the input redirection forms allow command to perform random access on the file, whereas the cat examples do not. This is because the redirection form opens the file as the stdin file descriptor which command can fully access, while the cat form simply provides the data as a stream of bytes.
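The difference between the two forms can be observed from inside the command. The sketch below defines a hypothetical helper, stdin_kind, that reports what is attached to standard input; it assumes a system that provides /dev/stdin (or a shell such as bash that treats that name specially in test). A regular file can be repositioned with lseek, while a pipe cannot.

```shell
# Hypothetical helper: report what kind of object is connected to stdin.
stdin_kind() {
    if [ -f /dev/stdin ]; then
        echo "regular file"
    elif [ -p /dev/stdin ]; then
        echo "pipe"
    else
        echo "other"
    fi
}

tmp=$(mktemp -d)
printf 'data\n' > "$tmp/input.txt"

# Redirection hands the command the file itself.
via_redirect=$(stdin_kind < "$tmp/input.txt")
# The cat form hands the command one end of a pipe.
via_cat=$(cat "$tmp/input.txt" | stdin_kind)

printf 'redirect: %s\ncat: %s\n' "$via_redirect" "$via_cat"
rm -rf "$tmp"
```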
Another common case where cat is unnecessary is where a command defaults to operating on stdin but will read from a file if the filename is given as an argument. This is the case for many common commands. For example:
cat "$file" | grep "$pattern"
cat "$file" | less
can instead be written as:
grep "$pattern" "$file"
less "$file"
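A quick check that the forms give the same result for grep is sketched below, assuming a POSIX shell with mktemp; the file name fruits.txt and the pattern are hypothetical.

```shell
tmp=$(mktemp -d)
file="$tmp/fruits.txt"
pattern='an'
printf 'apple\nbanana\nmango\n' > "$file"

# Pipe from cat, file operand, and stdin redirection.
with_cat=$(cat "$file" | grep "$pattern")
with_arg=$(grep "$pattern" "$file")
with_redirect=$(grep "$pattern" < "$file")

printf '%s\n' "$with_arg"
rm -rf "$tmp"
```

All three variables hold the same matching lines; only the cat form starts an extra process.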
A common interactive use of cat for a single file is to output the content of a file to standard output. However, if the output is piped or redirected, cat is unnecessary.
Benefits
The primary benefits of using cat, even when unnecessary, are legibility and protection against human error. cat with one named file is safer where human error is a concern: typing the redirection symbol ">" instead of "<" (often adjacent on keyboards) may permanently destroy the contents of the file you meant to read.[3] In terms of legibility, a sequence of commands starting with cat and connected by pipes has a clear left-to-right flow of information, in contrast with the back-and-forth syntax and backwards-pointing arrows of stdin redirection. Contrast:
command < in | command2 > out
<in command | command2 > out
with:
cat in | command | command2 > out
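The redirection hazard described above can be reproduced safely in a scratch directory. The sketch also shows the shell's noclobber option (set -C), which is not discussed in this article and is included only as an assumption about how a user might guard against the typo; the file name notes.txt is hypothetical.

```shell
tmp=$(mktemp -d)
notes="$tmp/notes.txt"
printf 'one\ntwo\n' > "$notes"

# Intended: wc -l < "$notes". With ">" typed by mistake, the shell
# truncates the file before wc even starts.
wc -l > "$notes" < /dev/null
survived_typo=$(grep -c 'one' "$notes" || true)

# Restore the file and repeat the typo under noclobber, in a subshell
# so the option does not leak into the rest of the script.
printf 'one\ntwo\n' > "$notes"
if ( set -C; wc -l > "$notes" < /dev/null ) 2>/dev/null; then
    blocked=no
else
    blocked=yes
fi
survived_noclobber=$(grep -c 'one' "$notes" || true)

printf 'typo: %s, noclobber: %s (blocked=%s)\n' \
    "$survived_typo" "$survived_noclobber" "$blocked"
rm -rf "$tmp"
```

After the unguarded typo the original line is gone; with noclobber set, the shell refuses the redirection and the file is left intact.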
Culture
Since 1995, occasional awards for UUOC have been given out, usually by Perl programmer Randal L. Schwartz. In British hackerdom the activity of fixing instances of UUOC is sometimes called demoggification.[4]
Other operating systems
The equivalent command in the VMS, CP/M, DOS, OS/2, and Microsoft Windows operating system command shells is type.
In DOS and Windows, multiple files may be combined with the "copy /b" command syntax, for example:
copy /b file1.txt + file2.txt file3.txt
This copies file1.txt and file2.txt in binary mode to one file, file3.txt.
References
- [1] McIlroy, M. D. (1987). A Research Unix reader: annotated excerpts from the Programmer's Manual, 1971–1986 (PDF) (Technical report). CSTR. Bell Labs. 139.
- [2] This refers to cat being used in a pipe. cat may also be used when not in a pipe, to output a short file to the screen, or to write a file or here document to the standard output of the current script; such uses are not covered by this objection.
- [3] The default behavior for redirection is to clobber the file to its immediate right.
- [4] moggy is a chiefly British word for "(mongrel) cat", hence demoggification literally means "removal of (non-special) cats".
External links
- cat: concatenate and print files – Commands & Utilities Reference, The Single UNIX® Specification, Issue 7 from The Open Group
- UNIX Style, or cat -v Considered Harmful - A paper by Rob Pike on proper Unix command design using cat as an example.
- cat(1) original manual page in the First Edition of Unix.
- cat: concatenate and write files – GNU Coreutils reference
- cat(1): concatenate and print files – OpenBSD General Commands Manual
- cat(1) – FreeBSD General Commands Manual
- cat(1) – Plan 9 Programmer's Manual, Volume 1
- Useless Use Of Cat Award
- 13 examples of cat and tac usage from tecmint