Different countries and cultures have varying conventions for how to communicate. These conventions range from very simple ones, such as the format for representing dates and times, to very complex ones, such as the language spoken.
Internationalization of software means programming it to be able to adapt to the user's favorite conventions. In ISO C, internationalization works by means of locales. Each locale specifies a collection of conventions, one convention for each purpose. The user chooses a set of conventions by specifying a locale (via environment variables).
All programs inherit the chosen locale as part of their environment. Provided the programs are written to obey the choice of locale, they will follow the conventions preferred by the user.
7.1 What Effects a Locale Has | Actions affected by the choice of locale.
7.2 Choosing a Locale | How the user specifies a locale.
7.3 Categories of Activities that Locales Affect | Different purposes for which you can select a locale.
7.4 How Programs Set the Locale | How a program specifies the locale with library functions.
7.5 Standard Locales | Locale names available on all systems.
7.6 Accessing Locale Information | How to access the information for the locale.
7.7 A dedicated function to format numbers
7.8 Yes-or-No Questions | Check a response against the locale.
Each locale specifies conventions for several purposes, including the following:
Some aspects of adapting to the specified locale are handled automatically by the library subroutines. For example, all your program needs to do in order to use the collating sequence of the chosen locale is to use strcoll or strxfrm to compare strings.
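As a minimal sketch of this, the following sorts an array of strings with qsort, using strcoll as the comparison so that the order follows the current LC_COLLATE locale. The names compare_in_locale and sort_in_locale are hypothetical helpers chosen for illustration; they are not part of the library.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort callback: compare two strings using the collating sequence
   of the current LC_COLLATE locale.  */
static int
compare_in_locale (const void *a, const void *b)
{
  const char *const *sa = a;
  const char *const *sb = b;
  return strcoll (*sa, *sb);
}

/* Sort an array of COUNT strings in place.  (Hypothetical helper,
   not a library function.)  */
void
sort_in_locale (const char **strings, size_t count)
{
  assert (strings != NULL || count == 0);
  qsort (strings, count, sizeof *strings, compare_in_locale);
}
```

A program that has not called setlocale runs in the ‘C’ locale, where strcoll orders strings the same way strcmp does; after the program selects the user's locale, the same code sorts by that locale's rules without any change.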
Other aspects of locales are beyond the comprehension of the library. For example, the library can't automatically translate your program's output messages into other languages. The only way you can support output in the user's favorite language is to program this more or less by hand. The C library provides functions to handle translations for multiple languages easily.
This chapter discusses the mechanism by which you can modify the current locale. The effects of the current locale on specific library functions are discussed in more detail in the descriptions of those functions.
The simplest way for the user to choose a locale is to set the environment variable LANG. This specifies a single locale to use for all purposes. For example, a user could specify a hypothetical locale named ‘espana-castellano’ to use the standard conventions of most of Spain.
The set of locales supported depends on the operating system you are using, and so do their names. We can't make any promises about what locales will exist, except for one standard locale called ‘C’ or ‘POSIX’. Later we will describe how to construct locales.
A user also has the option of specifying different locales for different purposes—in effect, choosing a mixture of multiple locales.
For example, the user might specify the locale ‘espana-castellano’ for most purposes, but specify the locale ‘usa-english’ for currency formatting. This might make sense if the user is a Spanish-speaking American, working in Spanish, but representing monetary amounts in US dollars.
Note that both locales ‘espana-castellano’ and ‘usa-english’, like all locales, would include conventions for all of the purposes to which locales apply. However, the user can choose to use each locale for a particular subset of those purposes.
The purposes that locales serve are grouped into categories, so that a user or a program can choose the locale for each category independently. Here is a table of categories; each name is both an environment variable that a user can set, and a macro name that you can use as an argument to setlocale.
LC_COLLATE
This category applies to collation of strings (functions strcoll and strxfrm); see Collation Functions.
LC_CTYPE
This category applies to classification and conversion of characters, and to multibyte and wide characters; see Character Handling, and Character Set Handling.
LC_MONETARY
This category applies to formatting monetary values; see Generic Numeric Formatting Parameters.
LC_NUMERIC
This category applies to formatting numeric values that are not monetary; see Generic Numeric Formatting Parameters.
LC_TIME
This category applies to formatting date and time values; see Formatting Calendar Time.
LC_MESSAGES
This category applies to selecting the language used in the user interface for message translation (see section The Uniforum approach to Message Translation; see section X/Open Message Catalog Handling) and contains regular expressions for affirmative and negative responses.
LC_ALL
This is not a category; it is only a macro that you can use with setlocale to set a single locale for all purposes. As an environment variable, LC_ALL overrides all selections made by the other LC_* variables or LANG.
LANG
If this environment variable is defined, its value specifies the locale to use for all purposes except as overridden by the variables above.
When developing the message translation functions it was felt that the functionality provided by the variables above is not sufficient. For example, it should be possible to specify more than one locale name. Take a Swedish user who speaks German better than English, and a program whose messages are output in English by default. It should be possible to specify that the first choice of language is Swedish, the second German, and, if this also fails, English. This is possible with the variable LANGUAGE. For further description of this GNU extension see User influence on gettext.
A C program inherits its locale environment variables when it starts up. This happens automatically. However, these variables do not automatically control the locale used by the library functions, because ISO C says that all programs start by default in the standard ‘C’ locale. To use the locales specified by the environment, you must call setlocale. Call it as follows:
setlocale (LC_ALL, "");
to select a locale based on the user choice of the appropriate environment variables.
You can also use setlocale to specify a particular locale, for general use or for a specific category.
The symbols in this section are defined in the header file ‘locale.h’.
The function setlocale sets the current locale for category category to locale. A list of all the locales the system provides can be created by running
locale -a
If category is LC_ALL, this specifies the locale for all purposes. The other possible values of category specify a single purpose (see section Categories of Activities that Locales Affect).
You can also use this function to find out the current locale by passing a null pointer as the locale argument. In this case, setlocale returns a string that is the name of the locale currently selected for category category.
The string returned by setlocale can be overwritten by subsequent calls, so you should make a copy of the string (see section Copying and Concatenation) if you want to save it past any further calls to setlocale. (The standard library is guaranteed never to call setlocale itself.)
You should not modify the string returned by setlocale. It might be the same string that was passed as an argument in a previous call to setlocale. One requirement is that the category must be the same in the call that returned the string and in the call where the string is passed in as the locale parameter.
When you read the current locale for category LC_ALL, the value encodes the entire combination of selected locales for all categories. In this case, the value is not just a single locale name. In fact, we don't make any promises about what it looks like. But if you specify the same “locale name” with LC_ALL in a subsequent call to setlocale, it restores the same combination of locale selections.
To be sure you can use the returned string encoding the currently selected locale at a later time, you must make a copy of the string. It is not guaranteed that the returned pointer remains valid over time.
When the locale argument is not a null pointer, the string returned by setlocale reflects the newly-modified locale.
If you specify an empty string for locale, this means to read the appropriate environment variable and use its value to select the locale for category.
If a nonempty string is given for locale, then the locale of that name is used if possible.
If you specify an invalid locale name, setlocale returns a null pointer and leaves the current locale unchanged.
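The return-value rules above can be sketched as a small wrapper. The name try_locale is a hypothetical helper introduced only for this illustration, and the locale name used in the usage note is assumed not to be installed.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>

/* Try to select the locale NAME for all categories.  Returns 1 on
   success and 0 if setlocale rejects NAME, in which case the current
   locale is left as it was.  (Hypothetical helper, not a library
   function.)  */
int
try_locale (const char *name)
{
  /* A null pointer would mean "query", not "set".  */
  assert (name != NULL);
  return setlocale (LC_ALL, name) != NULL;
}
```

For example, try_locale ("C") is expected to succeed on every system, while a call such as try_locale ("no-such-locale-xyzzy") should return 0 on a system where no locale of that name exists, after which setlocale (LC_ALL, NULL) still reports the previous locale.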
Here is an example showing how you might use setlocale to temporarily switch to a new locale.
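#include <stddef.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

void
with_other_locale (char *new_locale,
                   void (*subroutine) (int),
                   int argument)
{
  char *old_locale, *saved_locale;

  /* Get the name of the current locale.  */
  old_locale = setlocale (LC_ALL, NULL);
  /* Copy the name so it won't be clobbered by setlocale.  */
  saved_locale = strdup (old_locale);
  if (saved_locale == NULL)
    abort ();
  /* Now change the locale and do some stuff with it.  */
  setlocale (LC_ALL, new_locale);
  (*subroutine) (argument);
  /* Restore the original locale.  */
  setlocale (LC_ALL, saved_locale);
  free (saved_locale);
}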
Portability Note: Some ISO C systems may define additional locale categories, and future versions of the library will do so. For portability, assume that any symbol beginning with ‘LC_’ might be defined in ‘locale.h’.
The only locale names you can count on finding on all operating systems are these three standard ones:
"C"
This is the standard C locale. The attributes and behavior it provides are specified in the ISO C standard. When your program starts up, it initially uses this locale by default.
"POSIX"
This is the standard POSIX locale. Currently, it is an alias for the standard C locale.
""
The empty name says to select a locale based on environment variables. See section Categories of Activities that Locales Affect.
Defining and installing named locales is normally a responsibility of the system administrator at your site (or the person who installed the GNU C library). It is also possible for the user to create private locales. All this will be discussed later when describing the tool to do so.
If your program needs to use something other than the ‘C’ locale, it will be more portable if you use whatever locale the user specifies with the environment, rather than trying to specify some non-standard locale explicitly by name. Remember, different machines might have different sets of locales installed.
There are several ways to access locale information. The simplest way is to let the C library itself do the work. Several of the functions in this library implicitly access the locale data, and use what information is provided by the currently selected locale. This is how the locale model is meant to work normally.
As an example take the strftime function, which is meant to nicely format date and time information (see section Formatting Calendar Time). Part of the standard information contained in the LC_TIME category is the names of the weekdays and months. Instead of requiring the programmer to take care of providing the translations, the strftime function does this all by itself. %A in the format string is replaced by the appropriate weekday name of the locale currently selected by LC_TIME. This is an easy example, and wherever possible functions do things automatically in this way.
But there are quite often situations when there is simply no function to perform the task, or it is simply not possible to do the work automatically. For these cases it is necessary to access the information in the locale directly. To do this the C library provides two functions: localeconv and nl_langinfo. The former is part of ISO C and therefore portable, but has a brain-damaged interface. The second is part of the Unix interface and is portable insofar as the system follows the Unix standards.
7.6.1 localeconv: It is portable but … | ISO C's localeconv.
7.6.2 Pinpoint Access to Locale Data | X/Open's nl_langinfo.
localeconv: It is portable but …
Together with the setlocale function the ISO C people invented the localeconv function. It is a masterpiece of poor design. It is expensive to use, not extendable, and not generally usable as it provides access to only LC_MONETARY and LC_NUMERIC related information. Nevertheless, if it is applicable to a given situation it should be used since it is very portable. The function strfmon formats monetary amounts according to the selected locale using this information.
The localeconv function returns a pointer to a structure whose components contain information about how numeric and monetary values should be formatted in the current locale.
You should not modify the structure or its contents. The structure might be overwritten by subsequent calls to localeconv, or by calls to setlocale, but no other function in the library overwrites this value.
localeconv's return value is of the data type struct lconv. Its elements are described in the following subsections.
If a member of the structure struct lconv has type char, and the value is CHAR_MAX, it means that the current locale has no value for that parameter.
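As a rough sketch of how a caller might apply the CHAR_MAX rule, the function below reports the frac_digits member of the structure returned by localeconv. The name describe_frac_digits is a hypothetical helper, not a library function.

```c
#include <assert.h>
#include <limits.h>
#include <locale.h>
#include <stdio.h>

/* Write into BUF the number of fractional digits the current locale
   wants for local monetary amounts, or the word "unspecified" when
   the member is CHAR_MAX.  Returns the snprintf result.
   (Hypothetical helper, not a library function.)  */
int
describe_frac_digits (char *buf, size_t size)
{
  const struct lconv *lc = localeconv ();

  assert (buf != NULL && size > 0);
  if (lc->frac_digits == CHAR_MAX)
    return snprintf (buf, size, "unspecified");
  return snprintf (buf, size, "%d", lc->frac_digits);
}
```

In the standard ‘C’ locale this writes "unspecified", since ISO C gives the char members of struct lconv the value CHAR_MAX there.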
7.6.1.1 Generic Numeric Formatting Parameters | Parameters for formatting numbers and currency amounts.
7.6.1.2 Printing the Currency Symbol | How to print the symbol that identifies an amount of money (e.g. ‘$’).
7.6.1.3 Printing the Sign of a Monetary Amount | How to print the (positive or negative) sign for a monetary amount, if one exists.
These are the standard members of struct lconv; there may be others.
char *decimal_point
char *mon_decimal_point
These are the decimal-point separators used in formatting non-monetary and monetary quantities, respectively. In the ‘C’ locale, the value of decimal_point is ".", and the value of mon_decimal_point is "".
char *thousands_sep
char *mon_thousands_sep
These are the separators used to delimit groups of digits to the left of the decimal point in formatting non-monetary and monetary quantities, respectively. In the ‘C’ locale, both members have a value of "" (the empty string).
char *grouping
char *mon_grouping
These are strings that specify how to group the digits to the left of the decimal point. grouping applies to non-monetary quantities and mon_grouping applies to monetary quantities. Use either thousands_sep or mon_thousands_sep to separate the digit groups.
Each member of these strings is to be interpreted as an integer value of type char. Successive numbers (from left to right) give the sizes of successive groups (from right to left, starting at the decimal point). The last member is either 0, in which case the previous member is used over and over again for all the remaining groups, or CHAR_MAX, in which case there is no more grouping; put another way, any remaining digits form one large group without separators.
For example, if grouping is "\04\03\02", the correct grouping for the number 123456787654321 is ‘12’, ‘34’, ‘56’, ‘78’, ‘765’, ‘4321’. This uses a group of 4 digits at the end, preceded by a group of 3 digits, preceded by groups of 2 digits (as many as needed). With a separator of ‘,’, the number would be printed as ‘12,34,56,78,765,4321’.
A value of "\03" indicates repeated groups of three digits, as normally used in the U.S.
In the standard ‘C’ locale, both grouping and mon_grouping have a value of "". This value specifies no grouping at all.
char int_frac_digits
char frac_digits
These are small integers indicating how many fractional digits (to the right of the decimal point) should be displayed in a monetary value in international and local formats, respectively. (Most often, both members have the same value.)
In the standard ‘C’ locale, both of these members have the value CHAR_MAX, meaning “unspecified”. The ISO standard doesn't say what to do when you find this value; we recommend printing no fractional digits. (This locale also specifies the empty string for mon_decimal_point, so printing any fractional digits would be confusing!)
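The interpretation of the grouping and mon_grouping strings described above can be sketched as follows. This is only an illustration under simplifying assumptions: the function name group_digits is hypothetical, the input is a string of single-byte digits, and the result is limited to 255 bytes.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Insert SEP into the digit string DIGITS as directed by GROUPING,
   which has the format of the grouping member of struct lconv.  The
   result is stored in OUT (capacity OUT_SIZE).  Returns 0 on success
   and -1 if the result does not fit.  (Hypothetical helper.)  */
int
group_digits (const char *digits, const char *grouping,
              const char *sep, char *out, size_t out_size)
{
  char tmp[256];                /* result, built from right to left */
  size_t pos = sizeof tmp;      /* index of the first used byte */
  size_t left = strlen (digits);
  size_t sep_len = strlen (sep);
  size_t size = 0;              /* current group size; 0 = take all */
  size_t take, used;

  assert (out != NULL && out_size > 0);
  tmp[--pos] = '\0';
  for (;;)
    {
      /* Fetch the next group size.  The terminating 0 repeats the
         previous size; CHAR_MAX stops any further grouping.  */
      if (*grouping == CHAR_MAX)
        size = 0;
      else if (*grouping != '\0')
        size = (unsigned char) *grouping++;
      take = (size == 0 || size > left) ? left : size;
      if (take > pos)
        return -1;
      pos -= take;
      memcpy (tmp + pos, digits + left - take, take);
      left -= take;
      if (left == 0)
        break;
      if (sep_len > pos)
        return -1;
      pos -= sep_len;
      memcpy (tmp + pos, sep, sep_len);
    }
  used = sizeof tmp - pos;      /* includes the terminating null */
  if (used > out_size)
    return -1;
  memcpy (out, tmp + pos, used);
  return 0;
}
```

Called with the digits 123456787654321, the grouping "\04\03\02" and the separator ",", this produces 12,34,56,78,765,4321, matching the example in the text. With the empty grouping string of the ‘C’ locale the digits are copied unchanged.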
These members of the struct lconv structure specify how to print the symbol to identify a monetary value: the international analog of ‘$’ for US dollars.
Each country has two standard currency symbols. The local currency symbol is used commonly within the country, while the international currency symbol is used internationally to refer to that country's currency when it is necessary to indicate the country unambiguously.
For example, many countries use the dollar as their monetary unit, and when dealing with international currencies it's important to specify that one is dealing with (say) Canadian dollars instead of U.S. dollars or Australian dollars. But when the context is known to be Canada, there is no need to make this explicit—dollar amounts are implicitly assumed to be in Canadian dollars.
char *currency_symbol
The local currency symbol for the selected locale.
In the standard ‘C’ locale, this member has a value of "" (the empty string), meaning “unspecified”. The ISO standard doesn't say what to do when you find this value; we recommend you simply print the empty string as you would print any other string pointed to by this variable.
char *int_curr_symbol
The international currency symbol for the selected locale.
The value of int_curr_symbol should normally consist of a three-letter abbreviation determined by the international standard ISO 4217 Codes for the Representation of Currency and Funds, followed by a one-character separator (often a space).
In the standard ‘C’ locale, this member has a value of "" (the empty string), meaning “unspecified”. We recommend you simply print the empty string as you would print any other string pointed to by this variable.
char p_cs_precedes
char n_cs_precedes
char int_p_cs_precedes
char int_n_cs_precedes
These members are 1 if the currency_symbol or int_curr_symbol strings should precede the value of a monetary amount, or 0 if the strings should follow the value. The p_cs_precedes and int_p_cs_precedes members apply to positive amounts (or zero), and the n_cs_precedes and int_n_cs_precedes members apply to negative amounts.
In the standard ‘C’ locale, all of these members have a value of CHAR_MAX, meaning “unspecified”. The ISO standard doesn't say what to do when you find this value. We recommend printing the currency symbol before the amount, which is right for most countries. In other words, treat all nonzero values alike in these members.
The members with the int_ prefix apply to the int_curr_symbol while the other two apply to currency_symbol.
char p_sep_by_space
char n_sep_by_space
char int_p_sep_by_space
char int_n_sep_by_space
These members are 1 if a space should appear between the currency_symbol or int_curr_symbol strings and the amount, or 0 if no space should appear. The p_sep_by_space and int_p_sep_by_space members apply to positive amounts (or zero), and the n_sep_by_space and int_n_sep_by_space members apply to negative amounts.
In the standard ‘C’ locale, all of these members have a value of CHAR_MAX, meaning “unspecified”. The ISO standard doesn't say what you should do when you find this value; we suggest you treat it as 1 (print a space). In other words, treat all nonzero values alike in these members.
The members with the int_ prefix apply to the int_curr_symbol while the other two apply to currency_symbol. There is one peculiarity with the int_curr_symbol, though. Since all legal values contain a space at the end of the string, one either prints this space (if the currency symbol must appear in front and must be separated) or has to avoid printing this character at all (especially when at the end of the string).
These members of the struct lconv structure specify how to print the sign (if any) of a monetary value.
char *positive_sign
char *negative_sign
These are strings used to indicate positive (or zero) and negative monetary quantities, respectively.
In the standard ‘C’ locale, both of these members have a value of "" (the empty string), meaning “unspecified”.
The ISO standard doesn't say what to do when you find this value; we recommend printing positive_sign as you find it, even if it is empty. For a negative value, print negative_sign as you find it unless both it and positive_sign are empty, in which case print ‘-’ instead. (Failing to indicate the sign at all seems rather unreasonable.)
char p_sign_posn
char n_sign_posn
char int_p_sign_posn
char int_n_sign_posn
These members are small integers that indicate how to position the sign for nonnegative and negative monetary quantities, respectively. (The string used by the sign is what was specified with positive_sign or negative_sign.) The possible values are as follows:
0
The currency symbol and quantity should be surrounded by parentheses.
1
Print the sign string before the quantity and currency symbol.
2
Print the sign string after the quantity and currency symbol.
3
Print the sign string right before the currency symbol.
4
Print the sign string right after the currency symbol.
CHAR_MAX
“Unspecified”. All four members have this value in the standard ‘C’ locale.
The ISO standard doesn't say what you should do when the value is CHAR_MAX. We recommend you print the sign after the currency symbol.
The members with the int_ prefix apply to the int_curr_symbol while the other two apply to currency_symbol.
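To make the interplay of the sign position with the currency symbol concrete, here is a simplified sketch. The function format_money and its parameter layout are assumptions made for this illustration only; it treats cs_precedes and sep_by_space as plain booleans, and it handles any sign position other than 0 to 3 (including CHAR_MAX) like 4, following the recommendation above.

```c
#include <assert.h>
#include <stdio.h>

/* Assemble a monetary string from its parts.  SIGN is positive_sign
   or negative_sign, SYMBOL the currency symbol, and QUANTITY the
   already formatted digits.  Returns the snprintf result.
   (Hypothetical helper, not a library function.)  */
int
format_money (char *buf, size_t size, const char *sign,
              const char *symbol, const char *quantity,
              int cs_precedes, int sep_by_space, int sign_posn)
{
  const char *sp = sep_by_space ? " " : "";
  /* Symbol and quantity in the order they are printed.  */
  const char *a = cs_precedes ? symbol : quantity;
  const char *b = cs_precedes ? quantity : symbol;

  assert (buf != NULL && size > 0);
  switch (sign_posn)
    {
    case 0:                     /* parentheses instead of a sign */
      return snprintf (buf, size, "(%s%s%s)", a, sp, b);
    case 1:                     /* sign before quantity and symbol */
      return snprintf (buf, size, "%s%s%s%s", sign, a, sp, b);
    case 2:                     /* sign after quantity and symbol */
      return snprintf (buf, size, "%s%s%s%s", a, sp, b, sign);
    case 3:                     /* sign right before the symbol */
      if (cs_precedes)
        return snprintf (buf, size, "%s%s%s%s", sign, symbol, sp, quantity);
      return snprintf (buf, size, "%s%s%s%s", quantity, sp, sign, symbol);
    default:                    /* 4: sign right after the symbol */
      if (cs_precedes)
        return snprintf (buf, size, "%s%s%s%s", symbol, sign, sp, quantity);
      return snprintf (buf, size, "%s%s%s%s", quantity, sp, symbol, sign);
    }
}
```

With the sign "-", the symbol "$", the quantity "1.25", a preceding symbol and no space, the sign positions 1, 0 and 4 yield -$1.25, ($1.25) and $-1.25 respectively.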
When writing the X/Open Portability Guide the authors realized that the localeconv function is not enough to provide reasonable access to locale information. The information which was meant to be available in the locale (as later specified in the POSIX.1 standard) requires more ways to access it. Therefore the nl_langinfo function was introduced.
The nl_langinfo function can be used to access individual elements of the locale categories. Unlike the localeconv function, which returns all the information, nl_langinfo lets the caller select what information it requires. This is very fast and it is not a problem to call this function multiple times.
A second advantage is that in addition to the numeric and monetary formatting information, information from the LC_TIME and LC_MESSAGES categories is available.
The type nl_item is defined in ‘nl_types.h’. The argument item is a numeric value defined in the header ‘langinfo.h’. The X/Open standard defines the following values:
CODESET
nl_langinfo returns a string with the name of the coded character set used in the selected locale.
ABDAY_1
ABDAY_2
ABDAY_3
ABDAY_4
ABDAY_5
ABDAY_6
ABDAY_7
nl_langinfo returns the abbreviated weekday name. ABDAY_1 corresponds to Sunday.
DAY_1
DAY_2
DAY_3
DAY_4
DAY_5
DAY_6
DAY_7
Similar to ABDAY_1 etc., but here the return value is the unabbreviated weekday name.
ABMON_1
ABMON_2
ABMON_3
ABMON_4
ABMON_5
ABMON_6
ABMON_7
ABMON_8
ABMON_9
ABMON_10
ABMON_11
ABMON_12
The return value is the abbreviated name of the month. ABMON_1 corresponds to January.
MON_1
MON_2
MON_3
MON_4
MON_5
MON_6
MON_7
MON_8
MON_9
MON_10
MON_11
MON_12
Similar to ABMON_1 etc., but here the month names are not abbreviated. Here the first value MON_1 also corresponds to January.
AM_STR
PM_STR
The return values are strings which can be used in the representation of time as an hour from 1 to 12 plus an am/pm specifier.
Note that in locales which do not use this time representation these strings might be empty, in which case the am/pm format cannot be used at all.
D_T_FMT
The return value can be used as a format string for strftime to represent time and date in a locale-specific way.
D_FMT
The return value can be used as a format string for strftime to represent a date in a locale-specific way.
T_FMT
The return value can be used as a format string for strftime to represent time in a locale-specific way.
T_FMT_AMPM
The return value can be used as a format string for strftime to represent time in the am/pm format.
Note that if the am/pm format does not make any sense for the selected locale, the return value might be the same as the one for T_FMT.
ERA
The return value represents the era used in the current locale.
Most locales do not define this value. An example of a locale which does define this value is the Japanese one. In Japan, the traditional representation of dates includes the name of the era corresponding to the then-emperor's reign.
Normally it should not be necessary to use this value directly. Specifying the E modifier in their format strings causes the strftime functions to use this information. The format of the returned string is not specified, and therefore you should not assume knowledge of it on different systems.
ERA_YEAR
The return value gives the year in the relevant era of the locale.
As for ERA it should not be necessary to use this value directly.
ERA_D_T_FMT
This return value can be used as a format string for strftime to represent dates and times in a locale-specific era-based way.
ERA_D_FMT
This return value can be used as a format string for strftime to represent a date in a locale-specific era-based way.
ERA_T_FMT
This return value can be used as a format string for strftime to represent time in a locale-specific era-based way.
ALT_DIGITS
The return value is a representation of up to 100 values used to represent the values 0 to 99. As for ERA this value is not intended to be used directly, but instead indirectly through the strftime function. When the modifier O is used in a format which would otherwise use numerals to represent hours, minutes, seconds, weekdays, months, or weeks, the appropriate value for the locale is used instead.
INT_CURR_SYMBOL
The same as the value returned by localeconv in the int_curr_symbol element of the struct lconv.
CURRENCY_SYMBOL
CRNCYSTR
The same as the value returned by localeconv in the currency_symbol element of the struct lconv.
CRNCYSTR is a deprecated alias still required by Unix98.
MON_DECIMAL_POINT
The same as the value returned by localeconv in the mon_decimal_point element of the struct lconv.
MON_THOUSANDS_SEP
The same as the value returned by localeconv in the mon_thousands_sep element of the struct lconv.
MON_GROUPING
The same as the value returned by localeconv in the mon_grouping element of the struct lconv.
POSITIVE_SIGN
The same as the value returned by localeconv in the positive_sign element of the struct lconv.
NEGATIVE_SIGN
The same as the value returned by localeconv in the negative_sign element of the struct lconv.
INT_FRAC_DIGITS
The same as the value returned by localeconv in the int_frac_digits element of the struct lconv.
FRAC_DIGITS
The same as the value returned by localeconv in the frac_digits element of the struct lconv.
P_CS_PRECEDES
The same as the value returned by localeconv in the p_cs_precedes element of the struct lconv.
P_SEP_BY_SPACE
The same as the value returned by localeconv in the p_sep_by_space element of the struct lconv.
N_CS_PRECEDES
The same as the value returned by localeconv in the n_cs_precedes element of the struct lconv.
N_SEP_BY_SPACE
The same as the value returned by localeconv in the n_sep_by_space element of the struct lconv.
P_SIGN_POSN
The same as the value returned by localeconv in the p_sign_posn element of the struct lconv.
N_SIGN_POSN
The same as the value returned by localeconv in the n_sign_posn element of the struct lconv.
INT_P_CS_PRECEDES
The same as the value returned by localeconv in the int_p_cs_precedes element of the struct lconv.
INT_P_SEP_BY_SPACE
The same as the value returned by localeconv in the int_p_sep_by_space element of the struct lconv.
INT_N_CS_PRECEDES
The same as the value returned by localeconv in the int_n_cs_precedes element of the struct lconv.
INT_N_SEP_BY_SPACE
The same as the value returned by localeconv in the int_n_sep_by_space element of the struct lconv.
INT_P_SIGN_POSN
The same as the value returned by localeconv in the int_p_sign_posn element of the struct lconv.
INT_N_SIGN_POSN
The same as the value returned by localeconv in the int_n_sign_posn element of the struct lconv.
DECIMAL_POINT
RADIXCHAR
The same as the value returned by localeconv in the decimal_point element of the struct lconv.
The name RADIXCHAR is a deprecated alias still used in Unix98.
THOUSANDS_SEP
THOUSEP
The same as the value returned by localeconv in the thousands_sep element of the struct lconv.
The name THOUSEP is a deprecated alias still used in Unix98.
GROUPING
The same as the value returned by localeconv in the grouping element of the struct lconv.
YESEXPR
The return value is a regular expression which can be used with the POSIX regular expression functions (regcomp and regexec) to recognize a positive response to a yes/no question. The GNU C library provides the rpmatch function for easier handling in applications.
NOEXPR
The return value is a regular expression which can be used with the POSIX regular expression functions (regcomp and regexec) to recognize a negative response to a yes/no question.
YESSTR
The return value is a locale-specific translation of the positive response to a yes/no question.
Using this value is deprecated since it is a very special case of message translation, and is better handled by the message translation functions (see section Message Translation).
NOSTR
The return value is a locale-specific translation of the negative response to a yes/no question. What is said for YESSTR is also true here: the use of this symbol is deprecated, and message translation should be used instead.
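As an illustration of the YESEXPR and NOEXPR items listed above, the following sketch compiles the expressions with the POSIX functions regcomp and regexec. The names matches and classify_response are hypothetical, and treating the patterns as extended regular expressions is an assumption of this sketch.

```c
#include <assert.h>
#include <langinfo.h>
#include <regex.h>
#include <stddef.h>

/* Return 1 if TEXT matches the regular expression PATTERN, else 0.
   A pattern that fails to compile counts as no match.  */
static int
matches (const char *pattern, const char *text)
{
  regex_t re;
  int result;

  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return 0;
  result = regexec (&re, text, 0, NULL, 0) == 0;
  regfree (&re);
  return result;
}

/* Return 1 for an affirmative RESPONSE, 0 for a negative one, and -1
   if it is neither, judged by the current LC_MESSAGES locale.
   (Hypothetical helper; rpmatch offers similar behavior.)  */
int
classify_response (const char *response)
{
  assert (response != NULL);
  if (matches (nl_langinfo (YESEXPR), response))
    return 1;
  if (matches (nl_langinfo (NOEXPR), response))
    return 0;
  return -1;
}
```

In the ‘C’ locale a response beginning with ‘y’ or ‘Y’ is classified as affirmative and one beginning with ‘n’ or ‘N’ as negative.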
The file ‘langinfo.h’ defines a lot more symbols but none of them is official. Using them is not portable, and the format of the return values might change. Therefore we recommend you not use them.
Note that the return value for any valid argument can be used in all situations (with the possible exception of the am/pm time formatting codes). If the user has not selected any locale for the appropriate category, nl_langinfo returns the information from the "C" locale. It is therefore possible to use this function as shown in the example below.
If the argument item is not valid, a pointer to an empty string is returned.
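A small sketch of item-by-item access: the function below maps a weekday number, counted as in the tm_wday member of struct tm (0 for Sunday), to the full weekday name of the current LC_TIME locale. The name weekday_name is hypothetical. The items are listed explicitly in a table because this sketch does not assume that DAY_1 through DAY_7 have consecutive values.

```c
#include <assert.h>
#include <langinfo.h>
#include <stddef.h>

/* Return the locale's full name for weekday WDAY, where 0 is Sunday
   and 6 is Saturday.  The returned string belongs to the library and
   must not be modified.  (Hypothetical helper.)  */
const char *
weekday_name (int wday)
{
  static const nl_item items[7] =
    { DAY_1, DAY_2, DAY_3, DAY_4, DAY_5, DAY_6, DAY_7 };

  assert (wday >= 0 && wday < 7);
  return nl_langinfo (items[wday]);
}
```

A program that never calls setlocale gets the names from the "C" locale, so weekday_name (0) yields "Sunday" there.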
An example of nl_langinfo usage is a function which has to print a given date and time in a locale-specific way. At first one might think that, since strftime internally uses the locale information, writing something like the following is enough:
size_t
i18n_time_n_data (char *s, size_t len, const struct tm *tp)
{
  return strftime (s, len, "%X %D", tp);
}
The format contains no weekday or month names and therefore is internationally usable. Wrong! The output produced is something like "hh:mm:ss MM/DD/YY". This format is only recognizable in the USA. Other countries use different formats. Therefore the function should be rewritten like this:
size_t
i18n_time_n_data (char *s, size_t len, const struct tm *tp)
{
  return strftime (s, len, nl_langinfo (D_T_FMT), tp);
}
Now it uses the date and time format of the locale selected when the program runs. If the user selects the locale correctly there should never be a misunderstanding over the time and date format.
We have seen that the structure returned by localeconv as well as the values given to nl_langinfo allow you to retrieve the various pieces of locale-specific information to format numbers and monetary amounts. We have also seen that the underlying rules are quite complex.
Therefore the X/Open standards introduce a function which uses such locale information, making it easier for the user to format numbers according to these rules.
The strfmon function is similar to the strftime function
in that it takes a buffer s, its size maxsize, a format string,
and values to write into the buffer as text in a form specified
by the format string. Like strftime, the function
also returns the number of bytes written into the buffer.
There are two differences: strfmon
can take more than one
argument, and, of course, the format specification is different. Like
strftime
, the format string consists of normal text, which is
output as is, and format specifiers, which are indicated by a ‘%’.
Immediately after the ‘%’, you can optionally specify various flags
and formatting information before the main formatting character, in a
similar way to printf
:
‘=f’
The single-byte character f is used as the numeric fill character. By default this character is a space. Filling with this character is performed only if a left precision is specified; it is not used to pad the output to the given field width.
‘^’
The number is printed without grouping the digits according to the rules of the current locale. By default grouping is enabled.
‘+’, ‘(’
At most one of these flags can be used. They select the format used to represent the sign of a currency amount. By default, and if ‘+’ is given, the locale equivalent of +/- is used. If ‘(’ is given, negative amounts are enclosed in parentheses. The exact format is determined by the values of the LC_MONETARY category of the locale selected at program runtime.
‘!’
The output will not contain the currency symbol.
‘-’
The output will be formatted left-justified instead of right-justified if it does not fill the entire field width.
The next part of a specification is an optional field width. If no width is specified 0 is taken. During output, the function first determines how much space is required. If it requires at least as many characters as given by the field width, it is output using as much space as necessary. Otherwise, it is extended to use the full width by filling with the space character. The presence or absence of the ‘-’ flag determines the side at which such padding occurs. If present, the spaces are added at the right making the output left-justified, and vice versa.
So far the format looks familiar, being similar to the printf
and
strftime
formats. However, the next two optional fields
introduce something new. The first one is a ‘#’ character followed
by a decimal digit string. The value of the digit string specifies the
number of digit positions to the left of the decimal point (or
equivalent). This does not include the grouping character when
the ‘^’ flag is not given. If the space needed to print the number
does not fill the whole width, the field is padded at the left side with
the fill character, which can be selected using the ‘=’ flag and by
default is a space. For example, if the number of digit positions is selected as 6,
the number is 123, and the fill character is ‘*’, the result
will be ‘***123’.
The second optional field starts with a ‘.’ (period) and consists
of another decimal digit string. Its value describes the number of
characters printed after the decimal point. The default is selected
from the current locale (frac_digits
, int_frac_digits
; see section Generic Numeric Formatting Parameters). If the exact representation needs more fractional digits
than this value allows, the displayed value is rounded. If the
number of fractional digits is selected to be zero, no decimal point is
printed.
As a GNU extension, the strfmon
implementation in the GNU libc
allows an optional ‘L’ next as a format modifier. If this modifier
is given, the argument is expected to be a long double
instead of
a double
value.
Finally, the last component is a format specifier. There are three specifiers defined:
‘i’
Use the locale's rules for formatting an international currency value.
‘n’
Use the locale's rules for formatting a national currency value.
‘%’
Place a ‘%’ in the output. No flag, width specifier, or modifier may be given; only ‘%%’ is allowed.
As for printf
, the function reads the format string
from left to right and uses the values passed to the function following
the format string. The values are expected to be either of type
double
or long double
, depending on the presence of the
modifier ‘L’. The result is stored in the buffer pointed to by
s. At most maxsize characters are stored.
The return value of the function is the number of characters stored in
s, not including the terminating null byte. If the number of
characters stored would exceed maxsize, the function returns
-1 and the content of the buffer s is unspecified. In this
case errno is set to E2BIG.
A few examples should make clear how the function works. It is
assumed that all the following pieces of code are executed in a program
which uses the USA locale (en_US
). The simplest
form of the format is this:
strfmon (buf, 100, "@%n@%n@%n@", 123.45, -567.89, 12345.678);
The output produced is
"@$123.45@-$567.89@$12,345.68@"
We can notice several things here. First, the widths of the output
numbers are different. We have not specified a width in the format
string, and so this is no wonder. Second, the third number is printed
using thousands separators. The thousands separator for the
en_US
locale is a comma. The number is also rounded.
.678 is rounded to .68 since the format does not specify a
precision and the default value in the locale is 2. Finally,
note that the national currency symbol is printed since ‘%n’ was
used, not ‘i’. The next example shows how we can align the output.
strfmon (buf, 100, "@%=*11n@%=*11n@%=*11n@", 123.45, -567.89, 12345.678);
The output this time is:
"@    $123.45@   -$567.89@ $12,345.68@"
Two things stand out. First, all fields have the same width (eleven characters), since this is the width given in the format and no number required more characters to be printed. Second, the fill character is not used. This is correct, since the white space was not needed to reach a left precision given by a ‘#’ modifier, but only to pad to the given field width. The difference becomes obvious if we now add a left precision.
strfmon (buf, 100, "@%=*11#5n@%=*11#5n@%=*11#5n@", 123.45, -567.89, 12345.678);
The output is
"@ $***123.45@-$***567.89@ $12,345.68@"
Here we can see that all the currency symbols are now aligned, and that the space between the currency sign and the number is filled with the selected fill character. Note that although the left precision is selected to be 5 and 123.45 has three digits left of the decimal point, the space is filled with three asterisks. This is correct since, as explained above, the left precision does not include the positions used to store thousands separators, so one extra position is reserved for the separator. One last example should explain the remaining functionality.
strfmon (buf, 100, "@%=0(16#5.3i@%=0(16#5.3i@%=0(16#5.3i@", 123.45, -567.89, 12345.678);
This rather complex format string produces the following output:
"@ USD 000123.450 @(USD 000567.890)@ USD 12,345.678 @"
The most noticeable change is the alternative way of representing
negative numbers. In financial circles this is often done using
parentheses, and this is what the ‘(’ flag selected. The fill
character is now ‘0’. Note that this ‘0’ character is not
regarded as a numeric zero, and therefore the first and second numbers
are not printed using a thousands separator. Since we used the format
specifier ‘i’ instead of ‘n’, the international form of the
currency symbol is used. This is a four-character string, in this case
"USD "
. The last point is that since the precision right of the
decimal point is selected to be three, the first and second numbers are
printed with an extra zero at the end and the third number is printed
without rounding.
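To try the examples above, the program has to be running in a US English locale for the LC_MONETARY category. The following is a small sketch under stated assumptions: the locale name "en_US.UTF-8" is a guess at what is installed (names vary between systems), and format_us_amounts is a hypothetical helper, not a library function.

```c
#include <locale.h>
#include <monetary.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical helper.  Switches LC_MONETARY to a US English locale
   (the name "en_US.UTF-8" is an assumption about what is installed)
   and formats the three sample amounts from the first example.
   Returns 0 on success, -1 if the locale is unavailable or the
   result does not fit.  The LC_MONETARY setting stays changed
   after the call.  */
int
format_us_amounts (char *buf, size_t size)
{
  if (setlocale (LC_MONETARY, "en_US.UTF-8") == NULL)
    return -1;

  ssize_t n = strfmon (buf, size, "@%n@%n@%n@",
                       123.45, -567.89, 12345.678);
  return n == -1 ? -1 : 0;
}
```

Checking the result of setlocale matters here: if the locale is missing, the program stays in its previous locale and the output will not match the examples.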
Some non-GUI programs ask a yes-or-no question. If the messages (especially the questions) are translated into foreign languages, be sure that you localize the answers too. It would be a very bad habit to ask a question in one language and request the answer in another, often English.
The GNU C library contains rpmatch
to give applications easy
access to the corresponding locale definitions.
The function rpmatch
checks whether the string in response is a valid
yes-or-no answer and, if so, which one. The
check uses the YESEXPR
and NOEXPR
data in the
LC_MESSAGES
category of the currently selected locale. The
return value is as follows:
1
The user entered an affirmative answer.
0
The user entered a negative answer.
-1
The answer matched neither the YESEXPR
nor the NOEXPR
regular expression.
This function is not standardized, but besides GNU libc it is also available in at least the IBM AIX library.
This function would normally be used like this:
…
/* Use a safe default. */
_Bool doit = false;

fputs (gettext ("Do you really want to do this? "), stdout);
fflush (stdout);
/* Prepare the
Note that the loop continues until a read error is detected or until a definitive (positive or negative) answer is read.
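As a rough sketch of the kind of loop described here, the hypothetical helper below reads lines with getline and passes each one to rpmatch until a definitive answer arrives. The name confirm and its fallback parameter are illustrative assumptions, not part of the library, and printing the translated question is left to the caller.

```c
#define _GNU_SOURCE   /* for rpmatch and getline on glibc */
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read lines from IN until one is a definitive
   yes or no answer in the current locale.  Returns FALLBACK if the
   input ends or a read error occurs first.  */
bool
confirm (FILE *in, bool fallback)
{
  bool answer = fallback;
  char *line = NULL;
  size_t len = 0;

  while (getline (&line, &len, in) >= 0)
    {
      /* rpmatch returns 1 for yes, 0 for no, -1 for neither.  */
      int res = rpmatch (line);
      if (res >= 0)
        {
          answer = (res > 0);
          break;
        }
    }

  /* Free what getline allocated.  */
  free (line);
  return answer;
}
```

A caller might print the question, flush stdout, and then call confirm (stdin, false), so that an unanswered question falls back to the safe choice.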
This document was generated by root on January, 9 2009 using texi2html 1.78.