The Open Group Base Specifications Issue 8
IEEE Std 1003.1-2024
Copyright © 2001-2024 The IEEE and The Open Group

NAME

dgettext, dgettext_l, dcgettext, dcgettext_l, gettext, gettext_l, ngettext, ngettext_l, dngettext, dngettext_l, dcngettext, dcngettext_l — message handling functions

SYNOPSIS

#include <libintl.h>

char *dgettext(const char *domainname, const char *msgid);
char *dgettext_l(const char *domainname, const char *msgid,
       locale_t locale);
char *dcgettext(const char *domainname, const char *msgid,
       int category);
char *dcgettext_l(const char *domainname, const char *msgid,
       int category, locale_t locale);
char *dngettext(const char *domainname, const char *msgid,
       const char *msgid_plural, unsigned long int n);
char *dngettext_l(const char *domainname, const char *msgid,
       const char *msgid_plural, unsigned long int n,
       locale_t locale);
char *dcngettext(const char *domainname, const char *msgid,
       const char *msgid_plural, unsigned long int n,
       int category);
char *dcngettext_l(const char *domainname, const char *msgid,
       const char *msgid_plural, unsigned long int n,
       int category, locale_t locale);
char *gettext(const char *msgid);
char *gettext_l(const char *msgid, locale_t locale);
char *ngettext(const char *msgid, const char *msgid_plural,
       unsigned long int n);
char *ngettext_l(const char *msgid, const char *msgid_plural,
       unsigned long int n, locale_t locale);

DESCRIPTION

The gettext() function shall:

If the locale name in effect is "POSIX" or "C" (i.e. the name associated with the LC_MESSAGES locale category in the current locale), or if no suitable messages object exists, or if no string identified by msgid exists in the messages object, or if an error occurs, msgid shall be returned.
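As an informal illustration of this fallback rule, the sketch below compares the result of gettext() with the original msgid. The helper name was_translated() is hypothetical and not part of this specification. A translation that happens to be identical to msgid cannot be distinguished from the fallback by this test.

```c
#include <libintl.h>
#include <string.h>

/* Hypothetical helper: returns 1 if gettext() produced a string that
 * differs from msgid, 0 if the original msgid came back (for example in
 * the POSIX locale, or when no messages object was found). */
int was_translated(const char *msgid)
{
    const char *result = gettext(msgid);
    return strcmp(result, msgid) != 0;
}
```

In a program that has not called setlocale(), the C locale is in effect, so was_translated() is expected to return 0 for any msgid.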

The dgettext() function shall be equivalent to gettext(), except domainname shall be used instead of the current text domain to locate the messages object.
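One common use of dgettext() is in library code, which can name its own text domain on each call rather than changing the application's current text domain with textdomain(). The following is a minimal sketch under that assumption; the function name mail_recipient_label() is hypothetical, and the "mail" domain is borrowed from the EXAMPLES section.

```c
#include <libintl.h>

/* Hypothetical library function: looks up its message in the "mail"
 * text domain without calling textdomain(), so the caller's current
 * text domain is left untouched. */
const char *mail_recipient_label(void)
{
    return dgettext("mail", "recipient");
}
```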

The dcgettext() function shall be equivalent to dgettext(), except the locale category identified by category shall be used instead of LC_MESSAGES.

The ngettext() function shall be equivalent to gettext(), except:

The dngettext() function shall be equivalent to ngettext(), except domainname shall be used instead of the current text domain to locate the messages object.

The dcngettext() function shall be equivalent to dngettext(), except the locale category identified by category shall be used instead of LC_MESSAGES.

The *_l() functions shall be equivalent to their counterparts without the _l suffix, except locale shall be used instead of the current locale. If locale is the special locale object LC_GLOBAL_LOCALE or is not a valid locale object handle, the behavior is undefined.

The application shall ensure that the msgid and msgid_plural arguments are strings. If either msgid or msgid_plural is an empty string, or contains characters not in the portable character set, the results are unspecified. If the category argument is LC_ALL, the results are unspecified.

The location of the messages object shall be determined according to the following criteria, stopping when the first messages object is found:

  1. [XSI] [Option Start] If the NLSPATH environment variable is set to a non-empty string, an NLSPATH search shall be performed as described in XBD 8.2 Internationalization Variables. If NLSPATH identifies more than one template to use, each template in turn shall be used until a valid messages object is found. [Option End]

  2. If the LANGUAGE environment variable is set to a non-empty string, a LANGUAGE search shall be performed as described below. If LANGUAGE identifies more than one directory to search, each directory shall be searched until a valid messages object is found.

  3. A single-locale search shall be performed as described below.

For [XSI] [Option Start] the NLSPATH search and [Option End] the single-locale search, the single locale name used to locate the messages object shall be the locale name associated with the selected locale category from the current locale, or the provided locale object if calling one of the *_l() functions; additional searches of locale names without .codeset (if present), without _territory (if present), and without @modifier (if present) may be performed.
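The optional additional searches can be pictured as deriving shorter names from the full locale name. The sketch below shows only the first of them, removing the .codeset part. It assumes the locale name has the form language[_territory][.codeset][@modifier]; the helper name strip_codeset() is hypothetical, and an implementation is free to derive candidate names differently or not at all.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes name without its ".codeset" part into out.
 * Assumes the form language[_territory][.codeset][@modifier].
 * Returns 0 on success, -1 if out is too small. */
int strip_codeset(const char *name, char *out, size_t outsize)
{
    const char *dot = strchr(name, '.');
    int len;

    if (dot == NULL) {
        /* No codeset present: copy the name unchanged. */
        len = snprintf(out, outsize, "%s", name);
    } else {
        /* Keep everything before '.', then any "@modifier" suffix. */
        const char *at = strchr(dot, '@');
        len = snprintf(out, outsize, "%.*s%s",
                       (int)(dot - name), name, at ? at : "");
    }
    return (len < 0 || (size_t)len >= outsize) ? -1 : 0;
}
```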

For the LANGUAGE search, the value of the LANGUAGE environment variable shall be a list of one or more locale names separated by a <colon> (':') character. Each locale name shall be tried in the specified order. If a messages object for the locale does not exist, or cannot be opened, or is unsuitable for implementation-defined reasons (such as security), the next locale name (if any) shall be tried. If:

it is unspecified whether the next locale name (if any) is tried. In all other cases, the messages object for the locale shall be used.
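The colon-separated form of LANGUAGE can be walked one entry at a time. The following sketch extracts the entry at a given position; the helper name language_entry() is hypothetical, and the sketch covers only the splitting step, not the decision about whether a messages object is suitable.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the index-th (0-based) locale name from a
 * colon-separated LANGUAGE value into out. Returns 0 on success, -1 if
 * there is no such entry or out is too small. Empty entries are copied
 * as empty strings; how an implementation treats them is not shown. */
int language_entry(const char *language, size_t index,
                   char *out, size_t outsize)
{
    const char *start = language;

    for (;;) {
        const char *end = strchr(start, ':');
        size_t len = end ? (size_t)(end - start) : strlen(start);

        if (index == 0) {
            if (len >= outsize)
                return -1;
            memcpy(out, start, len);
            out[len] = '\0';
            return 0;
        }
        if (end == NULL)
            return -1;          /* ran out of entries */
        start = end + 1;
        index--;
    }
}
```

With the value "en_AU:en_US:en_GB" used in the EXAMPLES section, entries 0, 1, and 2 are en_AU, en_US, and en_GB, tried in that order.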

For each locale name in LANGUAGE, or if LANGUAGE is not set or is empty, or no suitable messages object is found in processing LANGUAGE, the pathname used to locate the messages object shall be dirname/localename/categoryname/textdomainname.mo, where:

Resolution of the messages object pathname shall be performed the first time one of the gettext family of functions is called for a given combination of dirname, localename, categoryname, and textdomainname. It is unspecified whether the pathname is re-resolved if the combination has been used before in a call to one of the gettext family of functions. If bindtextdomain() performs pathname resolution of its dirname argument, only the part of the messages object pathname after dirname shall be resolved by the gettext family of functions.
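The pathname layout dirname/localename/categoryname/textdomainname.mo can be assembled with a single formatting call. The sketch below is illustrative only; the helper name messages_object_path() is hypothetical, and it assumes dirname is given without a trailing <slash>.

```c
#include <stdio.h>

/* Hypothetical helper: builds the messages object pathname from its four
 * components. Returns 0 on success, -1 if out is too small. */
int messages_object_path(char *out, size_t outsize,
                         const char *dirname, const char *localename,
                         const char *categoryname,
                         const char *textdomainname)
{
    int len = snprintf(out, outsize, "%s/%s/%s/%s.mo",
                       dirname, localename, categoryname, textdomainname);
    return (len < 0 || (size_t)len >= outsize) ? -1 : 0;
}
```

For example, the components "/messagecatalogs/example", "en_GB", "LC_MESSAGES", and "mail" give /messagecatalogs/example/en_GB/LC_MESSAGES/mail.mo, one of the files used in the EXAMPLES section.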

When one of the gettext family of functions returns a message string that was found in a messages object, it shall convert the codeset of the message string to the output codeset if a codeset is specified in the messages object (see msgfmt) and the output codeset is not the same as that codeset. If a successful call to bind_textdomain_codeset() has been made with the text domain of the messages object as the domainname argument and a non-null codeset argument, the output codeset shall be the codeset argument from the most recent such call. Otherwise, the output codeset shall be the codeset of characters in the current locale, or the provided locale object if calling one of the *_l() functions, as specified by the LC_CTYPE category of the locale. The conversion shall be performed as if by a call to iconv() using a conversion descriptor returned by iconv_open(<output codeset>, <messages object codeset>), except that if the return value of iconv() would be greater than zero, the non-identical conversions performed by the gettext family of functions need not be the same as those that such an iconv() call would perform. If an error prevents the codeset conversion from being performed, the gettext family of functions shall behave as if no message string was found in the messages object. If at least one non-identical conversion is performed that results in a fallback character (one that does not provide any information about the character it was converted from, for example, a <question-mark> or "replacement-character"), the gettext family of functions may behave as if no message string was found in the messages object.
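The "as if by a call to iconv()" wording can be pictured with the sketch below, which converts one message string between two codesets. It is not how any particular implementation works; the helper name convert_message() is hypothetical, and which codeset names and conversions are supported is implementation-defined. On failure the helper reports an error, which corresponds to the case where the gettext family of functions behaves as if no message string was found.

```c
#include <iconv.h>
#include <string.h>

/* Hypothetical helper: converts the string in from codeset "from" to
 * codeset "to", writing a null-terminated result into out.
 * Returns 0 on success, -1 on any failure. */
int convert_message(const char *to, const char *from,
                    const char *in, char *out, size_t outsize)
{
    if (outsize == 0)
        return -1;

    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t)-1)
        return -1;              /* conversion not supported */

    char *inp = (char *)in;     /* iconv() reads but does not modify it */
    size_t inleft = strlen(in);
    char *outp = out;
    size_t outleft = outsize - 1;   /* reserve room for the terminator */

    size_t rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    if (rc != (size_t)-1)
        rc = iconv(cd, NULL, NULL, &outp, &outleft);  /* flush shift state */
    iconv_close(cd);

    if (rc == (size_t)-1)
        return -1;
    *outp = '\0';
    return 0;
}
```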

RETURN VALUE

The gettext(), gettext_l(), dgettext(), dgettext_l(), dcgettext(), and dcgettext_l() functions shall return the message string described in DESCRIPTION if successful. Otherwise, they shall return msgid.

The ngettext(), ngettext_l(), dngettext(), dngettext_l(), dcngettext(), and dcngettext_l() functions shall return the message string described in DESCRIPTION if successful. Otherwise, msgid shall be returned if n is equal to 1, or msgid_plural if n is not equal to 1.
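The fallback rule for the plural functions can be seen with a small formatting helper. This is a minimal sketch; the helper name format_recipients() is hypothetical, and the msgid strings are borrowed from the EXAMPLES section.

```c
#include <libintl.h>
#include <stdio.h>

/* Hypothetical helper: formats "<n> <noun>" using ngettext() to choose
 * the noun. Returns 0 on success, -1 if out is too small. */
int format_recipients(char *out, size_t outsize, unsigned long n)
{
    int len = snprintf(out, outsize, "%lu %s", n,
                       ngettext("recipient", "recipients", n));
    return (len < 0 || (size_t)len >= outsize) ? -1 : 0;
}
```

When no translation is available, as in the POSIX locale, n equal to 1 selects "recipient" and every other value, including 0, selects "recipients".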

The application shall ensure that it does not modify the returned string. A subsequent call to a gettext family function shall not overwrite or invalidate the returned string. The returned string may be invalidated by a subsequent call to bind_textdomain_codeset(), bindtextdomain(), setlocale(), or textdomain() in the same process, except for calls that only query values. The returned string shall not be invalidated by a subsequent call to uselocale().
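An application that needs to modify a message, or to keep it across a call that may invalidate it, can take a private copy. The following is a minimal sketch of that approach; the helper name translated_copy() is hypothetical.

```c
#include <libintl.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated, modifiable copy of the
 * string returned by gettext(), or NULL if allocation fails. The caller
 * owns the copy and releases it with free(). */
char *translated_copy(const char *msgid)
{
    const char *s = gettext(msgid);
    size_t size = strlen(s) + 1;
    char *copy = malloc(size);

    if (copy != NULL)
        memcpy(copy, s, size);
    return copy;
}
```

The copy remains usable after later calls to bindtextdomain(), bind_textdomain_codeset(), setlocale(), or textdomain(), whereas the pointer returned by gettext() might not.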

ERRORS

The gettext family of functions shall not modify errno. If an error occurs these functions shall return a string as described in RETURN VALUE.
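Because errno is left untouched, an error report can translate its prefix first and read errno afterwards. The sketch below illustrates this; the helper name format_open_error() and the message text are hypothetical.

```c
#include <errno.h>
#include <libintl.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: builds "<translated prefix>: <path>: <reason>"
 * from the value errno holds on entry. Returns 0 on success, -1 if out
 * is too small. */
int format_open_error(char *out, size_t outsize, const char *path)
{
    /* gettext() does not modify errno, so errno can still be read
     * after the translation lookup. */
    const char *prefix = gettext("cannot open file");
    const char *reason = strerror(errno);

    int len = snprintf(out, outsize, "%s: %s: %s", prefix, path, reason);
    return (len < 0 || (size_t)len >= outsize) ? -1 : 0;
}
```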


The following sections are informative.

EXAMPLES

The example code below assumes the following:

Furthermore, the following .mo files (and only the following .mo files) are installed:

These are compiled from a portable messages object source file (dot-po file) with the following ISO/IEC 8859-1 encoded contents (see the EXTENDED DESCRIPTION of the msgfmt utility for a description of the dot-po file format):

msgid ""
msgstr ""
"Content-Type: text/plain; charset=ISO_8859-1\n"
"Plural-Forms: nplurals=4; plural= n==1?0: (n>1&&n<10)?1: (n==0)?2:3;\n"
msgid "recipient"
msgid_plural "recipients"
msgstr[0] "1 recipient"
msgstr[1] "2 to 9 recipients"
msgstr[2] "no recipients"
msgstr[3] "more than 9 recipients"

/system/gettextlib/de_DE/LC_MESSAGES/mail.mo is compiled from a dot-po file with the following ISO/IEC 8859-1 encoded contents:

msgid ""
msgstr ""
"Content-Type: text/plain; charset=ISO_8859-1\n"
"Plural-Forms: nplurals=4; plural= n==1?0: (n>1&&n<5)?1: (n==0)?2:3;\n"
msgid "recipient"
msgid_plural "recipients"
msgstr[0] "1 Empfänger"
msgstr[1] "2 bis 4 Empfänger"
msgstr[2] "keine Empfänger"
msgstr[3] "mehr als 4 Empfänger"

/messagecatalogs/example/en_GB/LC_MESSAGES/mail.mo is compiled from a dot-po file with the following ISO/IEC 8859-1 encoded contents:

msgid ""
msgstr ""
"Content-Type: text/plain; charset=ISO_8859-1\n"
"Plural-Forms: nplurals=4; plural= n==1?0: (n>1&&n<5)?1: (n==0)?2:3;\n"
msgid "recipient"
msgid_plural "recipients"
msgstr[0] "1 recipient"
msgstr[1] "2 to 4 recipients"
msgstr[2] "no recipients"
msgstr[3] "5 or more recipients"

/messagecatalogs/example2/en_US/LC_MESSAGES/othermail.mo is not a suitable messages object file or is a suitable messages object file that does not contain the msgid "recipient".

The following example demonstrates the interactions between bindtextdomain(), bind_textdomain_codeset(), textdomain(), and the gettext family of functions.

unsigned long n_recipients;
// strdup() is used to prevent default_domain from being invalidated by
// a future call to bindtextdomain()
char *default_domain = strdup(bindtextdomain("mail", NULL));
setlocale(LC_MESSAGES, "POSIX");
setlocale(LC_CTYPE, "POSIX");

n_recipients = 1;
// The following outputs "recipient" with the same encoding as the
// "recipient" argument to ngettext():
printf("%s\n", ngettext("recipient", "recipients", n_recipients));

n_recipients = 3;
// The following outputs "recipients" with the same encoding as the
// "recipients" argument to ngettext():
printf("%s\n", ngettext("recipient", "recipients", n_recipients));

setlocale(LC_MESSAGES, "en_US");
setlocale(LC_CTYPE, "en_US");
textdomain("mail");

n_recipients = 1;
// The following outputs "1 recipient", encoded in UTF-8:
printf("%s\n", ngettext("recipient", "recipients", n_recipients));

n_recipients = 3;
// The following outputs "2 to 9 recipients", encoded in UTF-8:
printf("%s\n", ngettext("recipient", "recipients", n_recipients));

setlocale(LC_MESSAGES, "en_GB");
setlocale(LC_CTYPE, "en_GB");
bindtextdomain("mail", "/messagecatalogs/example/");

n_recipients = 3;
// The following outputs "2 to 4 recipients", encoded in UTF-8:
printf("%s\n", ngettext("recipient", "recipients", n_recipients));

setlocale(LC_MESSAGES, "en_US");
setlocale(LC_CTYPE, "en_US");
textdomain("othermail");
bindtextdomain("othermail", "/messagecatalogs/example2/");

n_recipients = 3;
// The following outputs "recipients" with the same encoding as the
// "recipients" argument to ngettext():
printf("%s\n", ngettext("recipient", "recipients", n_recipients));

// Because there is no locale named en_AU on the system, en_US is used:
setenv("LANGUAGE", "en_AU:en_US:en_GB", 1);
setlocale(LC_MESSAGES, "");
setlocale(LC_CTYPE, "");
bindtextdomain("mail", default_domain);

// The following outputs "2 to 9 recipients", encoded in UTF-8:
printf("%s\n", dngettext("mail", "recipient", "recipients", 3));

textdomain("mail");
bind_textdomain_codeset("mail", "UTF-8");
setlocale(LC_MESSAGES, "de_DE");
setlocale(LC_CTYPE, "de_DE");
// Clear the LANGUAGE environment variable, otherwise it would take
// precedence over the locale set above, and en_US would continue to
// be used.
setenv("LANGUAGE", "", 1);

n_recipients = 1;
// The following outputs "1 Empfänger", encoded in UTF-8:
printf("%s\n", ngettext("recipient", "recipients", n_recipients));

bind_textdomain_codeset("mail", "ASCII");
setlocale(LC_CTYPE, "POSIX");

n_recipients = 1;
// The following outputs "recipient" with the same encoding as the
// "recipient" argument to ngettext() - remember, the system is assumed
// to not support conversion from ISO/IEC 8859-1 to ASCII:
printf("%s\n", ngettext("recipient", "recipients", n_recipients));
free(default_domain);

APPLICATION USAGE

These functions do not impose a limit on message length. Note that translated strings typically have a different length than the input strings, possibly much longer, and applications using these translations in formatted text (for example, aligned columns for a table) should take that into account.

The dcgettext(), dcgettext_l(), dcngettext(), and dcngettext_l() functions are useful to retrieve locale-specific strings for a category other than LC_MESSAGES. For example, they can be used to obtain a time format string from the LC_TIME category; because the locale setting of LC_TIME and LC_MESSAGES can be different, using the other gettext family functions in such a case might cause an undesired result. All of the functions in the gettext family of functions, except dcgettext(), dcgettext_l(), dcngettext(), and dcngettext_l(), search for messages objects only in the LC_MESSAGES category.
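The LC_TIME case described above might look like the following sketch. The function name date_format() is hypothetical, and it assumes that the "mail" text domain from the EXAMPLES section also ships messages objects under the LC_TIME category directory, which the examples themselves do not show.

```c
#include <libintl.h>
#include <locale.h>

/* Hypothetical helper: fetches a date format string from the "mail"
 * text domain using the LC_TIME category instead of LC_MESSAGES, so the
 * format follows the locale selected for dates and times. */
const char *date_format(void)
{
    return dcgettext("mail", "%Y-%m-%d", LC_TIME);
}
```

The returned string could then be passed as the format argument to strftime(). When no translation is found, the msgid "%Y-%m-%d" itself is returned and serves as a usable default.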

Implementations typically, but are not required to, mmap() the messages object file the first time one of the gettext family of functions is called, and keep that map in place until it is no longer expected to be used. For example, a successful call to bindtextdomain() will typically cause the next call to one of the gettext family of functions to munmap() the previous file and mmap() the new file. Applications should not rely on this behavior, however: the implementation is allowed to cache previously used maps, or not use mmap() at all and reopen the file each time one of the gettext family of functions is called.

The msgid and msgid_plural arguments are typically in (US) English. The arguments are always used in the POSIX or C locale, and when a gettext family function encounters an error, so they should not be abstract message identifiers (for example, "message 123") and they should only use characters in the portable character set (to avoid outputting byte sequences that are not valid characters in the current output codeset). If the xgettext utility is used to extract the msgid and msgid_plural arguments from C source files into a template dot-po file, the arguments must be string literals in order for the resulting file to be useful to translators.

The strings returned by the gettext family of functions are not guaranteed to contain only characters that are valid in the current output codeset. In particular, byte sequences that do not form valid characters can occur when:

The strings returned by the gettext family of functions are guaranteed to remain valid until invalidated as described in the RETURN VALUE section. This includes strings that are created by codeset conversion; those strings are freed by the implementation, not the application. Thus, it is safe to call gettext family functions multiple times in situations such as:

printf("%s %s\n", gettext("foo"), gettext("bar"));

RATIONALE

Although the return type of these functions ought to be const char *, it is char * to match historical practice.

The gettext family of functions is frequently used in reporting errors. In fact, it is possible to have an application that attempts to create an error message that combines a translated string via gettext() with an error string provided by strerror(). The standard requires that the gettext family of functions does not modify errno, so that an application need not worry about complications of providing sequencing points to capture a stable value of errno prior to the translation of the error message, and so that the user will still get a somewhat useful string (even if it is the untranslated original string) on any failure.

There are no wide character equivalents for these functions; historically no implementation is known to exist, and the multi-byte message returned from these functions can, in most instances, be converted to wide characters by the application if desired.

Some historical gettext implementations returned the translated string from the messages object without codeset conversion if iconv_open() fails. This is considered to be a bug in those implementations.

FUTURE DIRECTIONS

None.

SEE ALSO

bindtextdomain, catopen, iconv, setlocale, uselocale

XBD <libintl.h>, <limits.h>

XCU gettext, msgfmt, xgettext

CHANGE HISTORY

First released in Issue 8.

End of informative text.


UNIX® is a registered Trademark of The Open Group.
POSIX™ is a Trademark of The IEEE.