The Single UNIX ® Specification, Version 2
Copyright © 1997 The Open Group

 NAME

join - relational database operator

 SYNOPSIS



join [ -a file_number | -v file_number ][-e string][-o list][-t char]
[-1 field][-2 field] file1 file2

join [-a file_number][-e string][-j field][-j1 field][-j2 field]
[-o list...][-t char] file1 file2

 DESCRIPTION

The join utility will perform an equality join on the files file1 and file2. The joined files will be written to the standard output.

The join field is a field in each file on which the files are compared. By default, join writes one line in the output for each pair of lines in file1 and file2 that have identical join fields. The output line by default will consist of the join field, then the remaining fields from file1, then the remaining fields from file2. This format can be changed by using the -o option (see below). The -a option can be used to add unmatched lines to the output. The -v option can be used to output only unmatched lines.
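
As an informal illustration of the default behaviour described above, the following sketch joins two small files on their first field. The file names ages and cities and their contents are invented for the example; the mktemp utility is assumed to be available although it is not defined here, and LC_ALL=C is set only to make the collating sequence predictable.

```shell
# Work in a scratch directory so that no existing files are touched.
dir=$(mktemp -d)
# Both inputs are already sorted on their first field.
printf '%s\n' 'alice 30' 'bob 25' 'carol 41' > "$dir/ages"
printf '%s\n' 'alice London' 'carol Paris' 'dave Rome' > "$dir/cities"
# Default join: field 1 of each file is the join field; one output
# line is written per pair of lines with identical join fields.
out=$(LC_ALL=C join "$dir/ages" "$dir/cities")
printf '%s\n' "$out"
rm -r "$dir"
```

Only alice and carol appear in both files, so two lines are written: "alice 30 London" and "carol 41 Paris". The unpairable lines for bob and dave are dropped unless -a or -v is used.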

By default, the files file1 and file2 should be ordered in the collating sequence of sort -b on the fields on which they are to be joined; unless another field is selected, this is the first field in each line. All selected output will be written in the same collating sequence.

The default input field separators will be blank characters. In this case, multiple separators will count as one field separator, and leading separators will be ignored. The default output field separator will be a space character.

The field separator and collating sequence can be changed by using the -t option (see below).

If the input files are not in the appropriate collating sequence, the results are unspecified.
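
The ordering requirement can be met by running sort on each input first. The sketch below is one way to do so, assuming the default blank separators (hence sort -b) and the default join field (hence -k 1,1). The file names and data are hypothetical, mktemp is assumed to be available, and LC_ALL=C is used so that sort and join agree on one collating sequence.

```shell
dir=$(mktemp -d)
# Unsorted inputs using the default blank separators.
printf '%s\n' 'pear 3' 'apple 7' 'fig 1' > "$dir/stock"
printf '%s\n' 'fig purple' 'apple red' > "$dir/colour"
# Sort each file on the join field (field 1) in the order join expects.
LC_ALL=C sort -b -k 1,1 "$dir/stock" > "$dir/stock.sorted"
LC_ALL=C sort -b -k 1,1 "$dir/colour" > "$dir/colour.sorted"
# Join the sorted copies, under the same locale that sorted them.
out=$(LC_ALL=C join "$dir/stock.sorted" "$dir/colour.sorted")
printf '%s\n' "$out"
rm -r "$dir"
```

With these inputs the output is "apple 7 red" followed by "fig 1 purple", in the same collating sequence as the sorted inputs.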

 OPTIONS

The join utility supports the XBD specification, Utility Syntax Guidelines. The obsolescent version does not follow the utility argument syntax guidelines: the -j1 and -j2 options are multi-character options and the -o option takes multiple arguments.

The following options are supported:

-a file_number
Produce a line for each unpairable line in file file_number, where file_number is 1 or 2, in addition to the default output. If both -a 1 and -a 2 are specified, all unpairable lines will be output.
-e string
Replace empty output fields in the list selected by -o with the string string.
-j field
Equivalent to: -1 field -2 field.
-j1 field
Equivalent to: -1 field.
-j2 field
Equivalent to: -2 field.
-o list
Construct the output line to comprise the fields specified in list, each element of which has one of the following two forms:
  • file_number.field, where file_number is a file number and field is a decimal integer field number
  • 0 (zero), representing the join field.
The elements of list are either comma- or blank-separated, as specified in Guideline 8 of the XBD specification, Utility Syntax Guidelines. The fields specified by list will be written for all selected output lines. Fields selected by list that do not appear in the input will be treated as empty output fields. (See the -e option.) Only specifically requested fields are written. The list must be a single command line argument. However, as an obsolescent feature, the argument list can be multiple arguments on the command line.
-t char
Use character char as a separator, for both input and output. Every appearance of char in a line will be significant. When this option is specified, the collating sequence should be the same as sort without the -b option.
-v file_number
Instead of the default output, produce a line only for each unpairable line in file_number, where file_number is 1 or 2. If both -v 1 and -v 2 are specified, all unpairable lines will be output.
-1 field
Join on the fieldth field of file 1. Fields are decimal integers starting with 1.
-2 field
Join on the fieldth field of file 2. Fields are decimal integers starting with 1.
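
The options above can be combined. The following sketch shows -1 and -2 selecting different join fields, -v restricting output to unpairable lines, and -a with -o and -e producing a line for every input line. The file names f1 and f2, their contents and the replacement string NONE are invented for the example; mktemp is assumed to be available.

```shell
dir=$(mktemp -d)
# File 1 is keyed on its second field, file 2 on its first field;
# each file is sorted on its own join field.
printf '%s\n' 'x 10 north' 'y 20 south' 'z 30 east' > "$dir/f1"
printf '%s\n' '10 red' '30 blue' '40 green' > "$dir/f2"

# -1 2 -2 1: compare field 2 of file 1 with field 1 of file 2.
paired=$(LC_ALL=C join -1 2 -2 1 "$dir/f1" "$dir/f2")
# -v 1: write only the unpairable lines of file 1.
only1=$(LC_ALL=C join -1 2 -2 1 -v 1 "$dir/f1" "$dir/f2")
# -a 1 -a 2 keeps unpairable lines from both files; -o fixes the
# output fields and -e fills the empty ones with NONE.
full=$(LC_ALL=C join -1 2 -2 1 -a 1 -a 2 -e NONE -o 0,1.1,2.2 \
    "$dir/f1" "$dir/f2")
printf '%s\n' "$paired" "$only1" "$full"
rm -r "$dir"
```

Here paired holds "10 x north red" and "30 z east blue", only1 holds "20 y south", and full holds four lines: "10 x red", "20 y NONE", "30 z blue" and "40 NONE green".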

 OPERANDS

The following operands are supported:
file1
file2
A pathname of a file to be joined. If either of the file1 or file2 operands is "-", the standard input is used in its place.

 STDIN

The standard input will be used only if the file1 or file2 operand is "-". See the INPUT FILES section.
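
For illustration, the "-" operand lets one input arrive through a pipeline. In this sketch the file name left and all data are hypothetical, and mktemp is assumed to be available.

```shell
dir=$(mktemp -d)
printf '%s\n' 'a 1' 'b 2' 'c 3' > "$dir/left"
# The second input is read from standard input, named by "-".
out=$(printf '%s\n' 'b two' 'c three' | LC_ALL=C join "$dir/left" -)
printf '%s\n' "$out"
rm -r "$dir"
```

The result is "b 2 two" followed by "c 3 three".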

 INPUT FILES

The input files must be text files.

 ENVIRONMENT VARIABLES

The following environment variables affect the execution of join:
LANG
Provide a default value for the internationalisation variables that are unset or null. If LANG is unset or null, the corresponding value from the implementation-dependent default locale will be used. If any of the internationalisation variables contains an invalid setting, the utility will behave as if none of the variables had been defined.
LC_ALL
If set to a non-empty string value, override the values of all the other internationalisation variables.
LC_COLLATE
Determine the locale of the collating sequence join expects to have been used when the input files were sorted.
LC_CTYPE
Determine the locale for the interpretation of sequences of bytes of text data as characters (for example, single- as opposed to multi-byte characters in arguments and input files).
LC_MESSAGES
Determine the locale that should be used to affect the format and contents of diagnostic messages written to standard error.
NLSPATH
Determine the location of message catalogues for the processing of LC_MESSAGES.

 ASYNCHRONOUS EVENTS

Default.

 STDOUT

The join utility output will be a concatenation of selected character fields. When the -o option is not specified, the output will be:

"%s%s%s\n", <join field>,
<other file1 fields>,
<other file2 fields>

If the join field is not the first field in a file, the <other file fields> for that file are:

<fields preceding join field>,
<fields following join field>

When the -o option is specified, the output format will be:

"%s\n", <concatenation of fields>

where the concatenation of fields is described by the -o option, above.

For either format, each field (except the last) will be written with its trailing separator character. If the separator is the default (blank characters), a single space character will be written after each field (except the last).
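
The effect of the separator rules on the output can be seen in a small sketch. The first join uses the default blank separators, the second uses -t with a colon together with -o. The file names and contents are invented for the example, and mktemp is assumed to be available.

```shell
dir=$(mktemp -d)
# Default separators: runs of blanks on input, one space on output.
printf '%s\n' 'k1   a    b' 'k2 c d' > "$dir/f1"
printf '%s\n' 'k1     X' 'k2  Y' > "$dir/f2"
blank=$(LC_ALL=C join "$dir/f1" "$dir/f2")
# With -t, the same character separates fields on input and output;
# -o selects and orders the fields that are written.
printf '%s\n' 'k1:a:b' 'k2:c:d' > "$dir/g1"
printf '%s\n' 'k1:X' 'k2:Y' > "$dir/g2"
colon=$(LC_ALL=C join -t : -o 2.2,0,1.3 "$dir/g1" "$dir/g2")
printf '%s\n' "$blank" "$colon"
rm -r "$dir"
```

The first join writes "k1 a b X" and "k2 c d Y", with a single space after each field except the last. The second writes "X:k1:b" and "Y:k2:d".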

 STDERR

Used only for diagnostic messages.

 OUTPUT FILES

None.

 EXTENDED DESCRIPTION

None.

 EXIT STATUS

The following exit values are returned:
0
All input files were output successfully.
>0
An error occurred.
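
A script can branch on these exit values. The following sketch assumes that naming a nonexistent file counts as an error, as it does in common implementations; the file names are hypothetical and mktemp is assumed to be available.

```shell
dir=$(mktemp -d)
printf '%s\n' 'a 1' > "$dir/f1"
printf '%s\n' 'a 2' > "$dir/f2"
# Successful run: the exit status is expected to be 0.
ok_status=0
LC_ALL=C join "$dir/f1" "$dir/f2" > /dev/null || ok_status=$?
# Nonexistent operand: the exit status is expected to be >0.
err_status=0
LC_ALL=C join "$dir/f1" "$dir/no-such-file" > /dev/null 2>&1 || err_status=$?
printf 'ok=%s err=%s\n' "$ok_status" "$err_status"
rm -r "$dir"
```

Only the distinction between 0 and >0 is specified, so the sketch does not rely on any particular non-zero value.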

 CONSEQUENCES OF ERRORS

Default.

 APPLICATION USAGE

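Pathnames consisting of numeric digits or of the form string.string should not be specified directly following the -o list, because an implementation that accepts the obsolescent multi-argument form of -o may treat them as further elements of list.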

 EXAMPLES

The -o 0 field essentially selects the union of the join fields. For example, given file phone:

!Name           Phone Number
Don             +1 123-456-7890
Hal             +1 234-567-8901
Yasushi         +2 345-678-9012

and file fax:

!Name           Fax Number
Don             +1 123-456-7899
Keith           +1 456-789-0122
Yasushi         +2 345-678-9011

(where each large expanse of white space represents a single tab character), the command:

join -t "<tab>" -a 1 -a 2 -e '(unknown)' -o 0,1.2,2.2 phone fax

would produce:

!Name           Phone Number            Fax Number
Don             +1 123-456-7890         +1 123-456-7899
Hal             +1 234-567-8901         (unknown)
Keith           (unknown)               +1 456-789-0122
Yasushi         +2 345-678-9012         +2 345-678-9011
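
The example above can be reproduced with a literal tab character. The sketch below is one way to do so: the tab is obtained with printf and passed to -t, the files are created in a scratch directory from mktemp (assumed to be available), and LC_ALL=C is an assumption made so that "!Name" collates before the other names.

```shell
dir=$(mktemp -d)
# A literal tab character to pass to -t.
tab=$(printf '\t')
# printf reuses its format for each pair of arguments: one line per pair.
printf '%s\t%s\n' '!Name' 'Phone Number' 'Don' '+1 123-456-7890' \
    'Hal' '+1 234-567-8901' 'Yasushi' '+2 345-678-9012' > "$dir/phone"
printf '%s\t%s\n' '!Name' 'Fax Number' 'Don' '+1 123-456-7899' \
    'Keith' '+1 456-789-0122' 'Yasushi' '+2 345-678-9011' > "$dir/fax"
# Same command as above, with the real tab and the scratch file paths.
out=$(LC_ALL=C join -t "$tab" -a 1 -a 2 -e '(unknown)' -o 0,1.2,2.2 \
    "$dir/phone" "$dir/fax")
printf '%s\n' "$out"
rm -r "$dir"
```

The five lines written match the table above, with a single tab between fields.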

 FUTURE DIRECTIONS

The obsolescent -j options and the multi-argument -o option may be withdrawn in a future issue.

 SEE ALSO

awk, comm, sort, uniq.

UNIX ® is a registered Trademark of The Open Group.