cmp (1p) - Linux Man Pages
cmp: compare two files
PROLOG
This manual page is part of the POSIX Programmer's Manual. The Linux implementation of this interface may differ (consult the corresponding Linux manual page for details of Linux behavior), or the interface may not be implemented on Linux.
The cmp utility shall compare two files. The cmp utility shall write no output if the files are the same. Under default options, if they differ, it shall write to standard output the byte and line number at which the first difference occurred. Bytes and lines shall be numbered beginning with 1.
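The default behavior described above can be sketched with two small scratch files. This is a minimal illustration, assuming a POSIX sh and the widely available (but not POSIX-mandated) mktemp utility; the file names a.txt and b.txt are hypothetical.

```shell
# Hypothetical scratch files; mktemp is an assumption, not required by POSIX.
dir=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$dir/a.txt"
printf 'one\ntwO\nthree\n' > "$dir/b.txt"

# Identical input: nothing is written and the exit status is 0.
same_status=0
same_out=$(cmp "$dir/a.txt" "$dir/a.txt") || same_status=$?

# Differing input: one line reporting the first differing byte and its line.
# The first difference here is byte 7 ("o" vs "O"), which falls on line 2.
diff_status=0
diff_out=$(cmp "$dir/a.txt" "$dir/b.txt") || diff_status=$?

printf 'same: status=%s output=[%s]\n' "$same_status" "$same_out"
printf 'diff: status=%s output=[%s]\n' "$diff_status" "$diff_out"
rm -rf "$dir"
```

Both counts start at 1, so the three bytes and newline of the first line occupy bytes 1 to 4 and the differing byte is number 7.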
The cmp utility shall conform to the Base Definitions volume of IEEE Std 1003.1-2001, Section 12.2, Utility Syntax Guidelines.
The following options shall be supported:
- -l: (Lowercase ell.) Write the byte number (decimal) and the differing bytes (octal) for each difference.
- -s: Write nothing for differing files; return exit status only.
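As a rough sketch of how the two options are typically used (assuming a POSIX sh and mktemp; the file names old and new are hypothetical), -s suits conditionals and -l suits listing every difference:

```shell
dir=$(mktemp -d)
printf 'abc\n' > "$dir/old"
printf 'abd\n' > "$dir/new"

# -s: no output at all, so the script branches on the exit status alone.
# Note that the else branch is also taken if cmp fails with an error.
if cmp -s "$dir/old" "$dir/new"; then
    verdict=identical
else
    verdict=different
fi
printf '%s\n' "$verdict"

# -l: one line per differing byte rather than stopping at the first one.
listing=$(cmp -l "$dir/old" "$dir/new") || :
printf '%s\n' "$listing"
rm -rf "$dir"
```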
The following operands shall be supported:
- file1: A pathname of the first file to be compared. If file1 is '-', the standard input shall be used.
- file2: A pathname of the second file to be compared. If file2 is '-', the standard input shall be used.
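The '-' operand makes it possible to compare a command's output against a file without a temporary copy. A minimal sketch, assuming a POSIX sh and mktemp; the file name expected is hypothetical:

```shell
dir=$(mktemp -d)
printf 'hello\n' > "$dir/expected"

# file1 is '-', so cmp reads the output of the pipeline from standard input.
# The status of a pipeline is the status of its last command, here cmp.
stdin_status=0
printf 'hello\n' | cmp -s - "$dir/expected" || stdin_status=$?
printf 'status=%s\n' "$stdin_status"
rm -rf "$dir"
```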
The following environment variables shall affect the execution of cmp:
- LANG: Provide a default value for the internationalization variables that are unset or null. (See the Base Definitions volume of IEEE Std 1003.1-2001, Section 8.2, Internationalization Variables for the precedence of internationalization variables used to determine the values of locale categories.)
- LC_ALL: If set to a non-empty string value, override the values of all the other internationalization variables.
- LC_CTYPE: Determine the locale for the interpretation of sequences of bytes of text data as characters (for example, single-byte as opposed to multi-byte characters in arguments).
- LC_MESSAGES: Determine the locale that should be used to affect the format and contents of diagnostic messages written to standard error and informative messages written to standard output.
- NLSPATH: Determine the location of message catalogs for the processing of LC_MESSAGES.
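Because the locale variables affect message text, a script that inspects cmp output may want to pin the locale for that one command. The following is a sketch under the assumption of a POSIX sh and mktemp; f1 and f2 are hypothetical file names, and the exact wording an implementation prints may still deviate from the standard.

```shell
dir=$(mktemp -d)
printf 'x\n' > "$dir/f1"
printf 'y\n' > "$dir/f2"

# A non-empty LC_ALL overrides LANG, LC_CTYPE and LC_MESSAGES for this
# invocation only, so the message does not vary with the user's settings.
posix_status=0
posix_out=$(LC_ALL=C cmp "$dir/f1" "$dir/f2") || posix_status=$?
printf '%s\n' "$posix_out"
rm -rf "$dir"
```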
In the POSIX locale, results of the comparison shall be written to standard output. When no options are used, the format shall be:
"%s %s differ: char %d, line %d\n", file1, file2, <byte number>, <line number>
When the -l option is used, the format shall be:
"%d %o %o\n", <byte number>, <differing byte>, <differing byte>
for each byte that differs. The first <differing byte> number is from file1 while the second is from file2. In both cases, <byte number> shall be relative to the beginning of the file, beginning with 1.
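The -l format is easy to consume line by line. As an illustrative sketch (assuming a POSIX sh and mktemp; f1, f2, and diffs are hypothetical file names), the loop below reads each record and converts the octal byte values to decimal:

```shell
dir=$(mktemp -d)
printf 'abc' > "$dir/f1"
printf 'aXc' > "$dir/f2"

# Each -l line is "<byte number> <octal from file1> <octal from file2>".
# read splits on blanks, so any column padding an implementation adds is ignored.
cmp -l "$dir/f1" "$dir/f2" > "$dir/diffs" || :
while read -r offset oct1 oct2; do
    # The printf utility treats a leading 0 as an octal constant for %d.
    line=$(printf 'byte %d: %d -> %d' "$offset" "0$oct1" "0$oct2")
    printf '%s\n' "$line"    # prints: byte 2: 98 -> 88
done < "$dir/diffs"
rm -rf "$dir"
```

Byte 2 is "b" (octal 142, decimal 98) in the first file and "X" (octal 130, decimal 88) in the second.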
The standard error shall be used only for diagnostic messages. If file1 and file2 are identical for the entire length of the shorter file, in the POSIX locale the following diagnostic message shall be written, unless the -s option is specified:
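"cmp: EOF on %s%s\n", <name of shorter file>, <additional info>
The <additional info> field shall either be null or a string that starts with a <blank> and does not contain any <newline>s.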
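The split between the two streams can be observed by capturing standard error alone. This is a minimal sketch, assuming a POSIX sh and mktemp; the file names short and long are hypothetical, and the text after the file name varies between implementations.

```shell
dir=$(mktemp -d)
printf 'abc' > "$dir/short"
printf 'abcdef' > "$dir/long"

# "2>&1 >/dev/null" captures standard error only and discards standard output.
eof_status=0
eof_msg=$(LC_ALL=C cmp "$dir/short" "$dir/long" 2>&1 >/dev/null) || eof_status=$?
printf 'status=%s message=[%s]\n' "$eof_status" "$eof_msg"

# With -s the diagnostic is suppressed as well.
silent_msg=$(LC_ALL=C cmp -s "$dir/short" "$dir/long" 2>&1) || :
printf 'silent=[%s]\n' "$silent_msg"
rm -rf "$dir"
```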
The following exit values shall be returned:
- 0: The files are identical.
- 1: The files are different; this includes the case where one file is identical to the first part of the other.
- >1: An error occurred.
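Since "different" and "error" are distinct statuses, a script can separate the three outcomes instead of treating every non-zero status alike. A sketch assuming a POSIX sh and mktemp; compare_files is a hypothetical helper name, not part of cmp:

```shell
# compare_files is a hypothetical wrapper that maps cmp's exit status to a word.
compare_files() {
    status=0
    cmp -s "$1" "$2" || status=$?
    case $status in
        0) echo identical ;;
        1) echo different ;;
        *) echo error ;;      # any status greater than 1
    esac
}

dir=$(mktemp -d)
printf 'a\n' > "$dir/f1"
printf 'b\n' > "$dir/f2"
r_same=$(compare_files "$dir/f1" "$dir/f1")
r_diff=$(compare_files "$dir/f1" "$dir/f2")
# The second operand does not exist, so cmp reports an error.
r_err=$(compare_files "$dir/f1" "$dir/missing" 2>/dev/null)
printf '%s %s %s\n' "$r_same" "$r_diff" "$r_err"
rm -rf "$dir"
```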
CONSEQUENCES OF ERRORS
Although input files to cmp can be any type, the results might not be what would be expected on character special device files or on file types not described by the System Interfaces volume of IEEE Std 1003.1-2001. Since this volume of IEEE Std 1003.1-2001 does not specify the block size used when doing input, comparisons of character special files need not compare all of the data in those files.
The global language in Utility Description Defaults indicates that using two mutually-exclusive options together produces unspecified results. Some System V implementations consider the option usage:
cmp -l -s ...
to be an error. They also treat:
cmp -s -l ...
as if no options were specified. Both of these behaviors are considered bugs, but are allowed.
The word char in the standard output format comes from historical usage, even though it is actually a byte number. When cmp is supported in other locales, implementations are encouraged to use the word byte or its equivalent in another language. Users should not interpret this difference to indicate that the functionality of the utility changed between locales.
Some implementations report on the number of lines in the identical-but-shorter file case. This is allowed by the inclusion of the <additional info> fields in the output format. The restriction on having a leading <blank> and no <newline>s is to make parsing for the filename easier. It is recognized that some filenames containing white-space characters make parsing difficult anyway, but the restriction does aid programs used on systems where the names are predominantly well behaved.
COPYRIGHT
Portions of this text are reprinted and reproduced in electronic form from IEEE Std 1003.1, 2003 Edition, Standard for Information Technology -- Portable Operating System Interface (POSIX), The Open Group Base Specifications Issue 6, Copyright (C) 2001-2003 by the Institute of Electrical and Electronics Engineers, Inc and The Open Group. In the event of any discrepancy between this version and the original IEEE and The Open Group Standard, the original IEEE and The Open Group Standard is the referee document. The original Standard can be obtained online at http://www.opengroup.org/unix/online.html.