Rationale

The Open Group Base Specifications Issue 6
IEEE Std 1003.1, 2004 Edition
Copyright © 2001-2004 The IEEE and The Open Group

A.4 General Concepts

A.4.1 Concurrent Execution

There is no additional rationale provided for this section.

A.4.2 Directory Protection

There is no additional rationale provided for this section.

A.4.3 Extended Security Controls

Allowing an implementation to define extended security controls enables the use of IEEE Std 1003.1-2001 in environments that require different or more rigorous security than that provided in POSIX.1. Extensions are allowed in two areas: privilege and file access permissions. The semantics of these areas have been defined to permit extensions with reasonable, but not exact, compatibility with all existing practices. For example, the elimination of the superuser definition precludes identifying a process as privileged or not by virtue of its effective user ID.

A.4.4 File Access Permissions

A process should not try to anticipate the result of an attempt to access data by a priori use of these rules. Rather, it should make the attempt to access data and examine the return value (and possibly errno as well), or use access(). An implementation may include other security mechanisms in addition to those specified in POSIX.1, and an access attempt may fail because of those additional mechanisms, even though it would succeed according to the rules given in this section. (For example, the user's security level might be lower than that of the object of the access attempt.) The supplementary group IDs provide another reason for a process to not attempt to anticipate the result of an access attempt.
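
A minimal sketch of the "attempt and examine" approach described above follows. The helper name try_read_open() is hypothetical and not part of POSIX.1; it only illustrates letting the implementation decide and then reading its verdict from errno, rather than predicting the outcome from permission bits.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: attempt to open `path` for reading.
 * Returns 0 on success, or the errno value explaining why the
 * attempt failed (for example EACCES or ENOENT). */
int try_read_open(const char *path)
{
    int fd = open(path, O_RDONLY);

    if (fd == -1)
        return errno;   /* the implementation's actual verdict */
    close(fd);
    return 0;
}
```

A caller would typically branch on the returned value, for instance treating EACCES differently from ENOENT, instead of inspecting st_mode and group membership beforehand.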

A.4.5 File Hierarchy

Though the file hierarchy is commonly regarded to be a tree, POSIX.1 does not define it as such for three reasons:

Links may join branches.
In some network implementations, there may be no single absolute root directory; see pathname resolution.
With symbolic links, the file system need not be a tree or even a directed acyclic graph.

A.4.6 Filenames

Historically, certain filenames have been reserved. This list includes core, /etc/passwd, and so on. Conforming applications should avoid these.

Most historical implementations prohibit case folding in filenames; that is, treating uppercase and lowercase alphabetic characters as identical. However, some consider case folding desirable:

For user convenience
For ease-of-implementation of the POSIX.1 interface as a hosted system on some popular operating systems

Variants, such as maintaining case distinctions in filenames, but ignoring them in comparisons, have been suggested. Methods of allowing escaped characters of the case opposite the default have been proposed.

Many reasons have been expressed for not allowing case folding, including:

No solid evidence has been produced as to whether case-sensitivity or case-insensitivity is more convenient for users.
Making case-insensitivity a POSIX.1 implementation option would be worse than either having it or not having it, because:
- More confusion would be caused among users.
- Application developers would have to account for both cases in their code.
- POSIX.1 implementors would still have other problems with native file systems, such as short or otherwise constrained filenames or pathnames, and the lack of hierarchical directory structure.
Case folding is not easily defined in many European languages, both because many of them use characters outside the US ASCII alphabetic set, and because:
- In Spanish, the digraph "ll" is considered to be a single letter, the capitalized form of which may be either "Ll" or "LL", depending on context.
- In French, the capitalized form of a letter with an accent may or may not retain the accent, depending on the country in which it is written.
- In German, the sharp ess may be represented as a single character resembling a Greek beta (ß) in lowercase, but as the digraph "SS" in uppercase.
- In Greek, there are several lowercase forms of some letters; the one to use depends on its position in the word. Arabic has similar rules.
Many East Asian languages, including Japanese, Chinese, and Korean, do not distinguish case and are sometimes encoded in character sets that use more than one byte per character.
Multiple character codes may be used on the same machine simultaneously. There are several ISO character sets for European alphabets. In Japan, several Japanese character codes are commonly used together, sometimes even in filenames; this is evidently also the case in China. To handle case insensitivity, the kernel would have to at least be able to distinguish for which character sets the concept made sense.
The file system implementation historically deals only with bytes, not with characters, except for slash and the null byte.
The purpose of POSIX.1 is to standardize the common, existing definition, not to change it. Mandating case-insensitivity would make all historical implementations non-standard.
Not only the interface, but also application programs would need to change, counter to the purpose of having minimal changes to existing application code.
At least one of the original developers of the UNIX system has expressed objection in the strongest terms to either requiring case-insensitivity or making it an option, mostly on the basis that POSIX.1 should not hinder portability of application programs across related implementations in order to allow compatibility with unrelated operating systems.

Two proposals were entertained regarding case folding in filenames:

Remove all wording that previously permitted case folding.
- Rationale: Case folding is inconsistent with portable filename character set definition and filename definition (all characters except slash and null). No known implementations allowing all characters except slash and null also do case folding.
Change "though this practice is not recommended:" to "although this practice is strongly discouraged."
- Rationale: If case folding must be included in POSIX.1, the wording should be stronger to discourage the practice.

The consensus selected the first proposal. Otherwise, a conforming application would have to assume that case folding would occur when it was not wanted, but that it would not occur when it was wanted.

A.4.7 File Times Update

This section reflects the actions of historical implementations. The times are not updated immediately, but are only marked for update by the functions. An implementation may update these times immediately.

The accuracy of the time update values is intentionally left unspecified so that systems can control the bandwidth of a possible covert channel.

The wording was carefully chosen to make it clear that there is no requirement that the conformance document contain information that might incidentally affect file update times. Any function that performs pathname resolution might update several st_atime fields. Functions such as getpwnam() and getgrnam() might update the st_atime field of some specific file or files. It is intended that these are not required to be documented in the conformance document, but they should appear in the system documentation.

A.4.8 Host and Network Byte Order

There is no additional rationale provided for this section.

A.4.9 Measurement of Execution Time

The methods used to measure the execution time of processes and threads, and the precision of these measurements, may vary considerably depending on the software architecture of the implementation, and on the underlying hardware. Implementations can also make tradeoffs between the scheduling overhead and the precision of the execution time measurements. IEEE Std 1003.1-2001 does not impose any requirement on the accuracy of the execution time; it instead specifies that the measurement mechanism and its precision are implementation-defined.
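
Because the precision is implementation-defined, an application that cares about it can ask the system rather than assume a value. The following is a sketch only: the helper name cpu_clock_resolution_ns() is hypothetical, and it assumes the implementation supports the optional process CPU-time clock (CLOCK_PROCESS_CPUTIME_ID).

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <time.h>

/* Hypothetical helper: report the resolution, in nanoseconds, that the
 * implementation advertises for the process CPU-time clock.
 * Returns -1 if the clock is unavailable or the query fails. */
long cpu_clock_resolution_ns(void)
{
#ifdef CLOCK_PROCESS_CPUTIME_ID
    struct timespec res;

    if (clock_getres(CLOCK_PROCESS_CPUTIME_ID, &res) != 0)
        return -1;
    return (long)(res.tv_sec * 1000000000L + res.tv_nsec);
#else
    return -1;
#endif
}
```

The value returned describes only the advertised resolution of the clock; it says nothing about the accuracy of the measurement, which remains implementation-defined.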

A.4.10 Memory Synchronization

In older multi-processors, access to memory by the processors was strictly multiplexed. This meant that a processor executing program code interrogated or modified memory in the order specified by the code, and that all the memory operations of all the processors in the system appeared to happen in some global order, though the operation histories of different processors were interleaved arbitrarily. The memory operations of such machines are said to be sequentially consistent. In this environment, threads can synchronize using ordinary memory operations. For example, a producer thread and a consumer thread can synchronize access to a circular data buffer as follows:

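int rdptr = 0;              /* next slot the consumer reads */
int wrptr = 0;              /* next slot the producer writes */
data_t buf[BUFSIZE];

Thread 1:
    /* Producer: fill the slot, wait for room, then publish it. */
    while (work_to_do) {
        int next;

        buf[wrptr] = produce();
        next = (wrptr + 1) % BUFSIZE;
        while (rdptr == next)
            ;               /* spin until the consumer frees a slot */
        wrptr = next;
    }

Thread 2:
    /* Consumer: wait for data, use it, then release the slot. */
    while (work_to_do) {
        while (rdptr == wrptr)
            ;               /* spin until the producer publishes */
        consume(buf[rdptr]);
        rdptr = (rdptr + 1) % BUFSIZE;
    }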

In modern multi-processors, these conditions are relaxed to achieve greater performance. If one processor stores values in location A and then location B, then other processors loading data from location B and then location A may see the new value of B but the old value of A. The memory operations of such machines are said to be weakly ordered. On these machines, the circular buffer technique shown in the example will fail because the consumer may see the new value of wrptr but the old value of the data in the buffer. In such machines, synchronization can only be achieved through the use of special instructions that enforce an order on memory operations. Most high-level language compilers only generate ordinary memory operations to take advantage of the increased performance. They usually cannot determine when memory operation order is important and generate the special ordering instructions. Instead, they rely on the programmer to use synchronization primitives correctly to ensure that modifications to a location in memory are ordered with respect to modifications and/or access to the same location in other threads. Access to read-only data need not be synchronized. The resulting program is said to be data race-free.

Synchronization is still important even when accessing a single primitive variable (for example, an integer). On machines where the integer may not be aligned to the bus data width or be larger than the data width, a single memory load may require multiple memory cycles. This means that it may be possible for some parts of the integer to have an old value while other parts have a newer value. On some processor architectures this cannot happen, but portable programs cannot rely on this.

In summary, a portable multi-threaded program, or a multi-process program that shares writable memory between processes, has to use the synchronization primitives to synchronize data access. It cannot rely on modifications to memory being observed by other threads in the order written in the application or even on modification of a single variable being seen atomically.
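
As an illustration of that summary, the circular buffer from earlier in this section can be rewritten so that every access to the shared indices and data happens under a mutex, with condition variables replacing the spin loops. This is a sketch under stated assumptions, not text from the standard: the names ring_put(), ring_get(), and run_demo() are hypothetical, the element type is simplified to int, and error checking of the pthread calls is mostly omitted for brevity.

```c
#include <pthread.h>
#include <stddef.h>

#define BUFSIZE 8

/* Shared ring buffer; every field is accessed only while holding `lock`. */
static int buf[BUFSIZE];
static int rdptr = 0, wrptr = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t not_full = PTHREAD_COND_INITIALIZER;
static pthread_cond_t not_empty = PTHREAD_COND_INITIALIZER;

/* Store one value, blocking while the buffer is full. */
void ring_put(int value)
{
    pthread_mutex_lock(&lock);
    while ((wrptr + 1) % BUFSIZE == rdptr)
        pthread_cond_wait(&not_full, &lock);
    buf[wrptr] = value;
    wrptr = (wrptr + 1) % BUFSIZE;
    pthread_cond_signal(&not_empty);
    pthread_mutex_unlock(&lock);
}

/* Remove and return one value, blocking while the buffer is empty. */
int ring_get(void)
{
    int value;

    pthread_mutex_lock(&lock);
    while (rdptr == wrptr)
        pthread_cond_wait(&not_empty, &lock);
    value = buf[rdptr];
    rdptr = (rdptr + 1) % BUFSIZE;
    pthread_cond_signal(&not_full);
    pthread_mutex_unlock(&lock);
    return value;
}

/* Producer thread body: sends the integers 1..100 through the buffer. */
static void *producer(void *arg)
{
    (void)arg;
    for (int i = 1; i <= 100; i++)
        ring_put(i);
    return NULL;
}

/* Runs one producer against the calling thread acting as consumer and
 * returns the sum of everything received, or -1 if the thread could
 * not be created. */
long run_demo(void)
{
    pthread_t tid;
    long sum = 0;

    if (pthread_create(&tid, NULL, producer, NULL) != 0)
        return -1;
    for (int i = 0; i < 100; i++)
        sum += ring_get();
    pthread_join(tid, NULL);
    return sum;
}
```

The mutex and condition variable functions used here are among those that synchronize memory, so the consumer cannot observe an updated wrptr without also observing the data stored before it.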

Conforming applications may only use the functions listed to synchronize threads of control with respect to memory access. There are many other candidates for functions that might also be used. Examples are: signal sending and reception, or pipe writing and reading. In general, any function that allows one thread of control to wait for an action caused by another thread of control is a candidate. IEEE Std 1003.1-2001 does not require these additional functions to synchronize memory access since this would imply the following:

All these functions would have to be recognized by advanced compilation systems so that memory operations and calls to these functions are not reordered by optimization.
All these functions would potentially have to have memory synchronization instructions added, depending on the particular machine.
The additional functions complicate the model of how memory is synchronized and make automatic data race detection techniques impractical.

Formal definitions of the memory model were rejected as unreadable by the vast majority of programmers. In addition, most of the formal work in the literature has concentrated on the memory as provided by the hardware as opposed to the application programmer through the compiler and runtime system. It was believed that a simple statement intuitive to most programmers would be most effective. IEEE Std 1003.1-2001 defines functions that can be used to synchronize access to memory, but it leaves open exactly how one relates those functions to the semantics of each function as specified elsewhere in IEEE Std 1003.1-2001. IEEE Std 1003.1-2001 also does not make a formal specification of the partial ordering in time that the functions can impose, as that is implied in the description of the semantics of each function. It simply states that the programmer has to ensure that modifications do not occur "simultaneously" with other access to a memory location.

IEEE Std 1003.1-2001/Cor 1-2002, item XBD/TC1/D6/4 is applied, adding a new paragraph beneath the table of functions: "The pthread_once() function shall synchronize memory for the first call in each thread for a given pthread_once_t object.".

A.4.11 Pathname Resolution

It is necessary to differentiate between the definition of pathname and the concept of pathname resolution with respect to the handling of trailing slashes. By specifying the behavior here, it is not possible to provide an implementation that is conforming but extends all interfaces that handle pathnames to also handle strings that are not legal pathnames (because they have trailing slashes).

Pathnames that end with one or more trailing slash characters must refer to directory paths. Previous versions of IEEE Std 1003.1-2001 were not specific about the distinction between trailing slashes on files and directories, and both were permitted.

Two types of implementation have been prevalent: those that ignored trailing slash characters on all pathnames regardless, and those that permitted them only on existing directories.

IEEE Std 1003.1-2001 requires that a pathname with a trailing slash character be treated as if it had a trailing "/." everywhere.
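
One observable consequence is that a trailing slash on the name of a regular file makes resolution fail, since "file/." asks for an entry inside something that is not a directory. The sketch below is illustrative only: the function name is hypothetical, and it assumes a writable /tmp directory.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical demo: create a temporary regular file, then stat() its
 * name with a trailing slash appended. Returns the errno set by that
 * stat(), 0 if it unexpectedly succeeded, or -1 if setup failed. */
int stat_regular_file_with_slash(void)
{
    char path[] = "/tmp/slashdemoXXXXXX";   /* assumed writable location */
    char with_slash[sizeof path + 1];
    struct stat sb;
    int fd, result;

    fd = mkstemp(path);
    if (fd == -1)
        return -1;
    close(fd);

    snprintf(with_slash, sizeof with_slash, "%s/", path);
    result = (stat(with_slash, &sb) == -1) ? errno : 0;

    unlink(path);
    return result;
}
```

On a conforming implementation the expected result is ENOTDIR.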

Note that this change does not break any conforming applications; since there were two different types of implementation, no application could have portably depended on either behavior. This change does, however, require some implementations to be altered to remain compliant. Substantial discussion over a three-year period has shown that the benefits to application developers outweigh the disadvantages for some vendors.

On a historical note, some early applications automatically appended a '/' to every path. Rather than fix the applications, the system implementation was modified to accept this behavior by ignoring any trailing slash.

Each directory has exactly one parent directory which is represented by the name dot-dot in the first directory. No other directory, regardless of linkages established by symbolic links, is considered the parent directory by IEEE Std 1003.1-2001.

There are two general categories of interfaces involving pathname resolution: those that follow the symbolic link, and those that do not. There are several exceptions to this rule; for example, open(path, O_CREAT|O_EXCL) will fail when path names a symbolic link. However, in all other situations, the open() function will follow the link.
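
The open() exception can be demonstrated with a symbolic link whose target does not exist: following the link would allow the file to be created, but with O_CREAT|O_EXCL the link itself counts as an existing name. This is a sketch only; the function name is hypothetical and a writable /tmp directory is assumed.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical demo: create a dangling symbolic link in a fresh
 * temporary directory and try to open it with O_CREAT|O_EXCL.
 * Returns the errno from open(), 0 if open() unexpectedly succeeded,
 * or -1 if setup failed. */
int excl_open_on_symlink(void)
{
    char dir[] = "/tmp/excldemoXXXXXX";     /* assumed writable location */
    char link_path[sizeof dir + 8];
    int fd, result;

    if (mkdtemp(dir) == NULL)
        return -1;
    snprintf(link_path, sizeof link_path, "%s/link", dir);
    if (symlink("target-does-not-exist", link_path) == -1) {
        rmdir(dir);
        return -1;
    }

    fd = open(link_path, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd == -1) {
        result = errno;
    } else {
        result = 0;
        close(fd);
    }

    unlink(link_path);
    rmdir(dir);
    return result;
}
```

The expected result is EEXIST, because the symbolic link is not followed in this case.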

What the filename dot-dot refers to relative to the root directory is implementation-defined. In Version 7 it refers to the root directory itself; this is the behavior mentioned in IEEE Std 1003.1-2001. In some networked systems the construction /../hostname/ is used to refer to the root directory of another host, and POSIX.1 permits this behavior.

Other networked systems use the construct //hostname for the same purpose; that is, a double initial slash is used. There is a potential problem with existing applications that create full pathnames by taking a trunk and a relative pathname and making them into a single string separated by '/', because they can accidentally create networked pathnames when the trunk is '/'. This practice is not prohibited because such applications can be made to conform by simply changing to use "//" as a separator instead of '/':

If the trunk is '/', the full pathname will begin with "///" (the initial '/' and the separator "//"). This is the same as '/', which is what is desired. (This is the general case of making a relative pathname into an absolute one by prefixing with "///" instead of '/'.)
If the trunk is "/A", the result is "/A//..."; since non-leading sequences of two or more slashes are treated as a single slash, this is equivalent to the desired "/A/...".
If the trunk is "//A", the implementation-defined semantics will apply. (The multiple slash rule would apply.)

Application developers should avoid generating pathnames that start with "//". Implementations are strongly encouraged to avoid using this special interpretation since a number of applications currently do not follow this practice and may inadvertently generate "//...".
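
The "//" separator technique described above can be sketched as follows. The helper name join_path() and its buffer-based interface are assumptions made for illustration; they are not defined by POSIX.1.

```c
#include <stdio.h>

/* Hypothetical helper: join `trunk` and a relative pathname `rel` using
 * "//" as the separator, so that a trunk of "/" yields "///..." rather
 * than the implementation-defined "//...". Returns 0 on success, or -1
 * if the result does not fit in `out`. */
int join_path(char *out, size_t outlen, const char *trunk, const char *rel)
{
    int n = snprintf(out, outlen, "%s//%s", trunk, rel);

    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

With this helper, a trunk of "/" and a relative pathname "usr" produce "///usr", and a trunk of "/A" with "b" produces "/A//b"; both resolve as if a single slash had been used.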

The term "root directory" is only defined in POSIX.1 relative to the process. In some implementations, there may be no absolute root directory. The initialization of the root directory of a process is implementation-defined.

A.4.12 Process ID Reuse

There is no additional rationale provided for this section.

A.4.13 Scheduling Policy

There is no additional rationale provided for this section.

A.4.14 Seconds Since the Epoch

Coordinated Universal Time (UTC) includes leap seconds. However, in POSIX time (seconds since the Epoch), leap seconds are ignored (not applied) to provide an easy and compatible method of computing time differences. Broken-down POSIX time is therefore not necessarily UTC, despite its appearance.

As of September 2000, 24 leap seconds had been added to UTC since the Epoch, 1 January, 1970. Historically, one leap second is added every 15 months on average, so this offset can be expected to grow steadily with time.

Most systems' notion of "time" is that of a continuously increasing value, so this value should increase even during leap seconds. However, not only do most systems not keep track of leap seconds, but most systems are probably not synchronized to any standard time reference. Therefore, it is inappropriate to require that a time represented as seconds since the Epoch precisely represent the number of seconds between the referenced time and the Epoch.

It is sufficient to require that applications be allowed to treat this time as if it represented the number of seconds between the referenced time and the Epoch. It is the responsibility of the vendor of the system, and the administrator of the system, to ensure that this value represents the number of seconds between the referenced time and the Epoch as closely as necessary for the application being run on that system.

It is important that the interpretation of time names and seconds since the Epoch values be consistent across conforming systems; that is, it is important that all conforming systems interpret "536457599 seconds since the Epoch" as 59 seconds, 59 minutes, 23 hours, 31 December 1986, regardless of the accuracy of the system's idea of the current time. The expression is given to ensure a consistent interpretation, not to attempt to specify the calendar. The relationship between tm_yday and the day of week, day of month, and month is in accordance with the Gregorian calendar, and so is not specified in POSIX.1.

Consistent interpretation of seconds since the Epoch can be critical to certain types of distributed applications that rely on such timestamps to synchronize events. The accrual of leap seconds in a time standard is not predictable. The number of leap seconds since the Epoch will likely increase. POSIX.1 is more concerned about the synchronization of time between applications of astronomically short duration.

Note that tm_yday is zero-based, not one-based, so the day number in the example above is 364. Note also that the division is an integer division (discarding remainder) as in the C language.
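
The expression referred to in this section appears in the Base Definitions volume rather than in this rationale. The sketch below transcribes it into C as it is understood to read in that volume, so it should be checked against the normative text; the function name seconds_since_epoch() is hypothetical.

```c
#include <time.h>

/* Seconds since the Epoch for a broken-down UTC time. All divisions
 * are integer divisions (remainder discarded); tm_yday is zero-based
 * and tm_year counts years since 1900. */
long long seconds_since_epoch(const struct tm *tm)
{
    long long y = tm->tm_year;

    return tm->tm_sec + tm->tm_min * 60LL + tm->tm_hour * 3600LL
         + tm->tm_yday * 86400LL
         + (y - 70) * 31536000LL
         + ((y - 69) / 4) * 86400LL
         - ((y - 1) / 100) * 86400LL
         + ((y + 299) / 400) * 86400LL;
}
```

For 23:59:59 on 31 December 1986 (tm_year 86, tm_yday 364), the terms are 86399 + 31449600 + 504576000 + 345600, with both century corrections equal to zero, giving 536457599 as stated in the text.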

Note also that the meaning of gmtime(), localtime(), and mktime() is specified in terms of this expression. However, the ISO C standard computes tm_yday from tm_mday, tm_mon, and tm_year in mktime(). Because it is stated as a (bidirectional) relationship, not a function, and because the conversion between month-day-year and day-of-year dates is presumed well known and is also a relationship, this is not a problem.

Implementations that implement time_t as a signed 32-bit integer will overflow in 2038. The data size for time_t is as per the ISO C standard definition, which is implementation-defined.
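
To make the overflow point concrete, the largest value a signed 32-bit time_t can hold is 2147483647, which corresponds to 03:14:07 UTC on 19 January 2038. A small sketch that derives this with gmtime_r() follows; the helper name last_32bit_second() is hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <stdint.h>
#include <time.h>

/* Fills `out` with the broken-down UTC time of the last second that a
 * signed 32-bit time_t can represent. Returns 0 on success, -1 if the
 * conversion fails. */
int last_32bit_second(struct tm *out)
{
    time_t t = (time_t)INT32_MAX;   /* 2147483647 */

    return gmtime_r(&t, out) != NULL ? 0 : -1;
}
```

One second later, a signed 32-bit counter would wrap to a negative value; implementations with a wider time_t are not affected at that date.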

A.4.15 Semaphore

There is no additional rationale provided for this section.

A.4.16 Thread-Safety

Where the interface of a function required by IEEE Std 1003.1-2001 precludes thread-safety, an alternate thread-safe form is provided. The names of these thread-safe forms are the same as the non-thread-safe forms with the addition of the suffix "_r". The suffix "_r" is historical, where the 'r' stood for "reentrant".

In some cases, thread-safety is provided by restricting the arguments to an existing function.
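
As an example of the "_r" convention, strtok() keeps its scan position in hidden static state, while strtok_r() takes that state as an extra argument supplied by the caller. The sketch below uses strtok_r(); the helper name count_words() is hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <string.h>

/* Counts the space-separated words in `text`, which is modified in
 * place. The scan position lives in the local `save` pointer, so
 * concurrent calls on different strings do not interfere. */
int count_words(char *text)
{
    char *save;
    int n = 0;

    for (char *tok = strtok_r(text, " ", &save); tok != NULL;
         tok = strtok_r(NULL, " ", &save))
        n++;
    return n;
}
```

Because all state is held in the caller's variables, two threads can tokenize different strings at the same time without additional locking.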

A.4.17 Tracing

Refer to Tracing.

A.4.18 Treatment of Error Conditions for Mathematical Functions

There is no additional rationale provided for this section.

A.4.19 Treatment of NaN Arguments for Mathematical Functions

There is no additional rationale provided for this section.

A.4.20 Utility

There is no additional rationale provided for this section.

A.4.21 Variable Assignment

There is no additional rationale provided for this section.

UNIX ® is a registered Trademark of The Open Group.
POSIX ® is a registered Trademark of The IEEE.