The Single UNIX ® Specification, Version 2
Copyright © 1997 The Open Group

 NAME

csplit - split files based on context

 SYNOPSIS



csplit [-ks][-f prefix][-n number] file arg1 ...argn

 DESCRIPTION

The csplit utility reads the file named by the file operand, writes all or part of that file into other files as directed by the arg operands, and writes the sizes of the files.

 OPTIONS

The csplit utility supports the XBD specification, Utility Syntax Guidelines.

The following options are supported:

-f prefix
Name the created files prefix00, prefix01, ..., prefixn. The default is xx00 ... xxn. If the prefix argument would create a filename exceeding {NAME_MAX} bytes, an error will result: csplit will exit with a diagnostic message and no files will be created.
-k
Leave previously created files intact. By default, csplit will remove created files if an error occurs.
-n number
Use number decimal digits to form filenames for the file pieces. The default is 2.
-s
Suppress the output of file size messages.
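
As an illustration of how -f, -n and -s combine, the following sketch splits a small scratch file. The file name input.txt, the prefix part- and the scratch directory are assumptions made for the example, and mktemp is assumed to be available although it is not described by this specification.

```shell
# Work in a scratch directory (mktemp is an assumption; any empty directory will do).
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Hypothetical six-line input file.
printf '%s\n' one two three four five six > input.txt

# Split before line 3 and before line 5.
#   -f part-  names the pieces part-NNN instead of xxNNN
#   -n 3      uses three decimal digits in each name
#   -s        suppresses the file size messages on standard output
csplit -s -f part- -n 3 input.txt 3 5

# Lists part-000, part-001 and part-002.
ls part-*
```

Each piece holds two lines of the input here, because the split points are lines 3 and 5.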

 OPERANDS

The following operands are supported:
file
The pathname of a text file to be split. If file is "-", the standard input will be used.

The operands arg1 ... argn can be a combination of the following:

/rexp/[offset]
Create a file using the content of the lines from the current line up to, but not including, the line that results from the evaluation of the regular expression with offset, if any, applied. The regular expression rexp must follow the rules for basic regular expressions described in the XBD specification, Basic Regular Expressions. The optional offset must be a positive or negative integer value representing a number of lines. The integer value must be preceded by "+" or "-". If the selection of lines from an offset expression of this type would create a file with zero lines, or one with greater than the number of lines left in the input file, the results are unspecified. After the section is created, the current line will be set to the line that results from the evaluation of the regular expression with any offset applied. The pattern match of rexp is always applied from the current line to the end of the file.
%rexp%[offset]
This operand is the same as /rexp/[offset], except that no file will be created for the selected section of the input file.
line_no
Create a file from the current line up to (but not including) the line number line_no. Lines in the file will be numbered starting at one. The current line becomes line_no.
{num}
Repeat operand. This operand can follow any of the operands described previously. If it follows a rexp type operand, that operand will be applied num more times. If it follows a line_no operand, the file will be split every line_no lines, num times, from that point.

An error will be reported if an operand does not reference a line between the current position and the end of the file.
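
The operand forms above can be combined in a single invocation. The following is a minimal sketch, assuming a hypothetical file log.txt whose sections are delimited by BEGIN and END lines; the file name, its contents and the prefix sec are invented for the illustration, and mktemp is assumed to be available.

```shell
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Hypothetical input: two BEGIN/END sections surrounded by other lines.
printf '%s\n' header 'BEGIN a' a1 END 'BEGIN b' b1 END trailer > log.txt

#   %^BEGIN%   skip to the first BEGIN line without creating a file
#   /^END/+1   create a file that ends with the next END line (the split
#              point is one line past the match)
#   {1}        apply the /^END/+1 operand one more time
csplit -s -f sec log.txt '%^BEGIN%' '/^END/+1' '{1}'

# sec00 and sec01 each hold one BEGIN ... END section;
# sec02 holds whatever follows the last split point.
for f in sec0*; do
    echo "== $f"
    cat "$f"
done
```

The header line is discarded by the % operand, and the lines after the final split point are written to a last file of their own.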

 STDIN

See the INPUT FILES section.

 INPUT FILES

The input file must be a text file.

 ENVIRONMENT VARIABLES

The following environment variables affect the execution of csplit:
LANG
Provide a default value for the internationalisation variables that are unset or null. If LANG is unset or null, the corresponding value from the implementation-dependent default locale will be used. If any of the internationalisation variables contains an invalid setting, the utility will behave as if none of the variables had been defined.
LC_ALL
If set to a non-empty string value, override the values of all the other internationalisation variables.
LC_COLLATE
Determine the locale for the behaviour of ranges, equivalence classes and multi-character collating elements within regular expressions.
LC_CTYPE
Determine the locale for the interpretation of sequences of bytes of text data as characters (for example, single- as opposed to multi-byte characters in arguments and input files) and the behaviour of character classes within regular expressions.
LC_MESSAGES
Determine the locale that should be used to affect the format and contents of diagnostic messages written to standard error.
NLSPATH
Determine the location of message catalogues for the processing of LC_MESSAGES.

 ASYNCHRONOUS EVENTS

If the -k option is specified, created files will be retained. Otherwise the default action occurs.

 STDOUT

Unless the -s option is used, the standard output will consist of one line per file created, with a format as follows:

"%d\n", <file size in bytes>

 STDERR

Used only for diagnostic messages.

 OUTPUT FILES

The output files will contain portions of the original input file, otherwise unchanged.
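
Because the output files are unchanged portions of the input, the sizes reported on standard output should add up to the size of the input when every line ends up in some file. A small sketch of that check follows; the file name in.txt and the prefix sz are assumptions, as is the availability of mktemp.

```shell
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Hypothetical input with lines of 3, 4 and 5 bytes (newline included).
printf 'aa\nbbb\ncccc\n' > in.txt

# Without -s, csplit writes one size line per created file to standard output.
sizes=$(csplit -f sz in.txt 2 3)
printf '%s\n' "$sizes"

# Add up the reported sizes and compare with the size of the input.
total=0
for n in $sizes; do total=$((total + n)); done
input_size=$(wc -c < in.txt)
echo "pieces: $total bytes, input: $((input_size)) bytes"
```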

 EXTENDED DESCRIPTION

None.

 EXIT STATUS

The following exit values are returned:
0
Successful completion.
>0
An error occurred.

 CONSEQUENCES OF ERRORS

By default, created files will be removed if an error occurs. When the -k option is specified, created files will not be removed if an error occurs.
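
The difference between the default behaviour and -k can be observed by forcing an error with an operand that matches nothing. This is only a sketch: the file name in.txt, the prefixes gone and kept, and the use of mktemp are assumptions made for the example.

```shell
workdir=$(mktemp -d) && cd "$workdir" || exit 1
printf '%s\n' one two three > in.txt

# The second operand matches no line, so csplit reports an error.
csplit -s -f gone in.txt '/two/' '/no such line/' 2>/dev/null
status_default=$?

# Same operands, but -k asks csplit to leave created files in place.
csplit -s -k -f kept in.txt '/two/' '/no such line/' 2>/dev/null
status_keep=$?

# Both runs exit with a status greater than zero; only the -k run
# leaves its files behind.
echo "default: status=$status_default files=$(ls gone* 2>/dev/null | wc -l)"
echo "with -k: status=$status_keep files=$(ls kept* 2>/dev/null | wc -l)"
```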

 APPLICATION USAGE

None.

 EXAMPLES

  1. This example creates four files, cobol00 ... cobol03:
    
    csplit -f cobol file '/procedure division/' /par5./ /par16./
    
    

    After editing the split files, they can be recombined as follows:

    
    cat cobol0[0-3] > file
    
    

    Note that this example overwrites the original file.

  2. This example would split the file after the first 99 lines, and every 100 lines thereafter, up to 9999 lines. The first piece holds 99 lines rather than 100 because lines in the file are numbered from 1 rather than zero, for historical reasons:
    
    csplit -k file  100  {99}
    
    

  3. Assuming that prog.c follows the C-language coding convention of ending routines with a "}" at the beginning of the line, this example will create a file containing each separate C routine (up to 21) in prog.c:
    
    csplit -k prog.c '%main(%'  '/^}/+1' {20}
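
The line-number behaviour of example 2 and the recombination step of example 1 can be tried on a smaller scale. The sketch below assumes a hypothetical ten-line file nums.txt and writes the recombined copy to a separate file rather than overwriting the original; mktemp is again an assumption.

```shell
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Hypothetical ten-line input file: the numbers 1 to 10, one per line.
i=1
while [ "$i" -le 10 ]; do echo "$i"; i=$((i + 1)); done > nums.txt

# Split before line 4, then once more four lines later (before line 8).
# The first piece holds 3 lines, the second 4, and the rest go to a third file.
csplit -s -f n nums.txt 4 '{1}'
wc -l n00 n01 n02

# Recombine the pieces into a new file and compare it with the original.
cat n0[0-2] > rebuilt.txt
cmp nums.txt rebuilt.txt && echo "identical"
```

As in example 2, the first piece is one line shorter than the later ones because the split point is a line number and numbering starts at 1.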
    
    

 FUTURE DIRECTIONS

The IEEE PASC 1003.2 Interpretations Committee has forwarded concerns about parts of this interface definition to the IEEE PASC Shell and Utilities Working Group which is identifying the corrections. A future revision of this specification will align with IEEE Std. 1003.2b when finalised.

 SEE ALSO

sed, split.

UNIX ® is a registered Trademark of The Open Group.