16.2. locale -- Cultural Localization API
Purpose: Format and parse values that depend on the user's language and location.
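The locale module is part of Python's internationalization and localization support library. It provides a standard way to handle operations that depend on the language or location of a user. For example, it handles formatting numbers as currency, comparing strings for sorting, and working with dates and times. It does not cover translation (see the gettext module) or Unicode encoding (see the codecs module).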
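Note
Changing the locale has application-wide effects, so the recommended practice is to avoid changing the value in a library and to let the application set it one time. The examples in this section change the locale several times within a short program to highlight the differences between locales. It is far more common for an application to set the locale once, at startup or when a web request is received, and then leave it unchanged.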
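This section covers some of the high-level functions in the locale module. It also touches on lower-level operations, such as format_string() for formatting strings, and on functions that relate to managing the locale for an application, such as resetlocale() (deprecated in Python 3.11 and removed in 3.13).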
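The note above recommends leaving the locale alone in library code. When a function does need a different locale for a moment, one possible approach is to save the current setting and restore it afterwards. The sketch below assumes a hypothetical helper named temporary_locale(), which is not part of the locale module, and relies on setlocale() returning the current setting when it is called with only a category. Because the locale is process-wide, this is not safe if other threads format values at the same time.

```python
import contextlib
import locale


@contextlib.contextmanager
def temporary_locale(name, category=locale.LC_ALL):
    """Switch to locale `name`, then restore the previous setting.

    Hypothetical helper for illustration; not part of the locale module.
    """
    # Calling setlocale() with only a category returns the current setting.
    saved = locale.setlocale(category)
    try:
        yield locale.setlocale(category, name)
    finally:
        # Restore the saved setting even if the body raised an exception.
        locale.setlocale(category, saved)


if __name__ == '__main__':
    # The 'C' locale is always available, so this runs on any platform.
    with temporary_locale('C'):
        print(locale.localeconv()['decimal_point'])
```

Inside the with block, formatting functions see the temporary locale; on exit, the earlier setting is back in place.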
Probing the Current Locale
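The most common way to let the user change the locale settings for an application is through an environment variable. The variable name depends on the platform; common ones are LC_ALL, LC_CTYPE, LANG, and LANGUAGE.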
The application then calls setlocale() without a hard-coded locale name, so the value is taken from the environment.
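If the environment names a locale that is not installed, setlocale() raises locale.Error. The following is a minimal sketch of one way to handle that case, assuming a hypothetical helper named init_locale_from_environment() and assuming that falling back to the 'C' locale is acceptable for the application.

```python
import locale


def init_locale_from_environment(fallback='C'):
    """Adopt the user's locale, or `fallback` if it is not installed.

    Hypothetical helper for illustration; not part of the locale module.
    """
    try:
        # An empty string tells setlocale() to consult the environment.
        return locale.setlocale(locale.LC_ALL, '')
    except locale.Error:
        # Raised when the requested locale is unknown to the platform.
        return locale.setlocale(locale.LC_ALL, fallback)


if __name__ == '__main__':
    print('Active locale:', init_locale_from_environment())
```

Calling the helper once at startup matches the advice in the note: the application sets the locale a single time and libraries leave it alone.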
locale_env.py
import locale
import os
import pprint
# Default settings based on the user's environment.
locale.setlocale(locale.LC_ALL, '')
print('Environment settings:')
for env_name in ['LC_ALL', 'LC_CTYPE', 'LANG', 'LANGUAGE']:
    print(' {} = {}'.format(
        env_name, os.environ.get(env_name, ''))
    )
# What is the current locale?
print('\nLocale from environment:', locale.getlocale())
template = """
Numeric formatting:
Decimal point : "{decimal_point}"
Grouping positions : {grouping}
Thousands separator: "{thousands_sep}"
Monetary formatting:
International currency symbol : "{int_curr_symbol!r}"
Local currency symbol : {currency_symbol!r}
Symbol precedes positive value : {p_cs_precedes}
Symbol precedes negative value : {n_cs_precedes}
Decimal point : "{mon_decimal_point}"
Digits in fractional values : {frac_digits}
Digits in fractional values,
international : {int_frac_digits}
Grouping positions : {mon_grouping}
Thousands separator : "{mon_thousands_sep}"
Positive sign : "{positive_sign}"
Positive sign position : {p_sign_posn}
Negative sign : "{negative_sign}"
Negative sign position : {n_sign_posn}
"""
sign_positions = {
    0: 'Surrounded by parentheses',
    1: 'Before value and symbol',
    2: 'After value and symbol',
    3: 'Before value',
    4: 'After value',
    locale.CHAR_MAX: 'Unspecified',
}
info = {}
info.update(locale.localeconv())
info['p_sign_posn'] = sign_positions[info['p_sign_posn']]
info['n_sign_posn'] = sign_positions[info['n_sign_posn']]
print(template.format(**info))
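The localeconv() function returns a dictionary containing the locale's conventions. The full list of value names and definitions is covered in the standard library documentation.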
On a Mac running OS X 10.11.6 with all of the variables unset, the program produces this output:
$ export LANG=; export LC_CTYPE=; python3 locale_env.py
Environment settings:
LC_ALL =
LC_CTYPE =
LANG =
LANGUAGE =
Locale from environment: (None, None)
Numeric formatting:
Decimal point : "."
Grouping positions : []
Thousands separator: ""
Monetary formatting:
International currency symbol : "''"
Local currency symbol : ''
Symbol precedes positive value : 127
Symbol precedes negative value : 127
Decimal point : ""
Digits in fractional values : 127
Digits in fractional values,
international : 127
Grouping positions : []
Thousands separator : ""
Positive sign : ""
Positive sign position : Unspecified
Negative sign : ""
Negative sign position : Unspecified
Running the same script with the locale environment variables set to different values shows how the locale and default encoding change.
The United States (en_US):
$ LANG=en_US LC_CTYPE=en_US LC_ALL=en_US python3 locale_env.py
Environment settings:
LC_ALL = en_US
LC_CTYPE = en_US
LANG = en_US
LANGUAGE =
Locale from environment: ('en_US', 'ISO8859-1')
Numeric formatting:
Decimal point : "."
Grouping positions : [3, 3, 0]
Thousands separator: ","
Monetary formatting:
International currency symbol : "'USD '"
Local currency symbol : '$'
Symbol precedes positive value : 1
Symbol precedes negative value : 1
Decimal point : "."
Digits in fractional values : 2
Digits in fractional values,
international : 2
Grouping positions : [3, 3, 0]
Thousands separator : ","
Positive sign : ""
Positive sign position : Before value and symbol
Negative sign : "-"
Negative sign position : Before value and symbol
France (fr_FR):
$ LANG=fr_FR LC_CTYPE=fr_FR LC_ALL=fr_FR python3 locale_env.py
Environment settings:
LC_ALL = fr_FR
LC_CTYPE = fr_FR
LANG = fr_FR
LANGUAGE =
Locale from environment: ('fr_FR', 'ISO8859-1')
Numeric formatting:
Decimal point : ","
Grouping positions : [127]
Thousands separator: ""
Monetary formatting:
International currency symbol : "'EUR '"
Local currency symbol : 'Eu'
Symbol precedes positive value : 0
Symbol precedes negative value : 0
Decimal point : ","
Digits in fractional values : 2
Digits in fractional values,
international : 2
Grouping positions : [3, 3, 0]
Thousands separator : " "
Positive sign : ""
Positive sign position : Before value and symbol
Negative sign : "-"
Negative sign position : After value and symbol
Spain (es_ES):
$ LANG=es_ES LC_CTYPE=es_ES LC_ALL=es_ES python3 locale_env.py
Environment settings:
LC_ALL = es_ES
LC_CTYPE = es_ES
LANG = es_ES
LANGUAGE =
Locale from environment: ('es_ES', 'ISO8859-1')
Numeric formatting:
Decimal point : ","
Grouping positions : [127]
Thousands separator: ""
Monetary formatting:
International currency symbol : "'EUR '"
Local currency symbol : 'Eu'
Symbol precedes positive value : 0
Symbol precedes negative value : 0
Decimal point : ","
Digits in fractional values : 2
Digits in fractional values,
international : 2
Grouping positions : [3, 3, 0]
Thousands separator : "."
Positive sign : ""
Positive sign position : Before value and symbol
Negative sign : "-"
Negative sign position : Before value and symbol
Portugal (pt_PT):
$ LANG=pt_PT LC_CTYPE=pt_PT LC_ALL=pt_PT python3 locale_env.py
Environment settings:
LC_ALL = pt_PT
LC_CTYPE = pt_PT
LANG = pt_PT
LANGUAGE =
Locale from environment: ('pt_PT', 'ISO8859-1')
Numeric formatting:
Decimal point : ","
Grouping positions : []
Thousands separator: " "
Monetary formatting:
International currency symbol : "'EUR '"
Local currency symbol : 'Eu'
Symbol precedes positive value : 0
Symbol precedes negative value : 0
Decimal point : "."
Digits in fractional values : 2
Digits in fractional values,
international : 2
Grouping positions : [3, 3, 0]
Thousands separator : "."
Positive sign : ""
Positive sign position : Before value and symbol
Negative sign : "-"
Negative sign position : Before value and symbol
Poland (pl_PL):
$ LANG=pl_PL LC_CTYPE=pl_PL LC_ALL=pl_PL python3 locale_env.py
Environment settings:
LC_ALL = pl_PL
LC_CTYPE = pl_PL
LANG = pl_PL
LANGUAGE =
Locale from environment: ('pl_PL', 'ISO8859-2')
Numeric formatting:
Decimal point : ","
Grouping positions : [3, 3, 0]
Thousands separator: " "
Monetary formatting:
International currency symbol : "'PLN '"
Local currency symbol : 'zł'
Symbol precedes positive value : 1
Symbol precedes negative value : 1
Decimal point : ","
Digits in fractional values : 2
Digits in fractional values,
international : 2
Grouping positions : [3, 3, 0]
Thousands separator : " "
Positive sign : ""
Positive sign position : After value
Negative sign : "-"
Negative sign position : After value
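The "Grouping positions" values in the output above follow the convention described in the standard library documentation for localeconv(): each number is the size of a group of digits, counted from the right; a trailing 0 repeats the previous size indefinitely; and locale.CHAR_MAX (127 in this output) stops any further grouping. As a rough illustration, the hypothetical functions group_sizes() and group_digits() below apply such a list to a string of digits by hand. They are not part of the locale module, which does this work internally.

```python
import locale


def group_sizes(grouping):
    """Yield group sizes described by a localeconv() grouping list."""
    last = None
    for entry in grouping:
        if entry == locale.CHAR_MAX:
            return  # no further grouping
        if entry == 0:
            if last is None:
                return  # nothing to repeat
            while True:
                yield last  # repeat the previous size indefinitely
        yield entry
        last = entry


def group_digits(digits, grouping, separator):
    """Insert `separator` into `digits` according to `grouping`."""
    groups = []
    remaining = digits
    for size in group_sizes(grouping):
        if len(remaining) <= size:
            break
        groups.append(remaining[-size:])
        remaining = remaining[:-size]
    groups.append(remaining)
    return separator.join(reversed(groups))


if __name__ == '__main__':
    print(group_digits('1234567', [3, 3, 0], ','))
    print(group_digits('1234567', [locale.CHAR_MAX], ','))
```

With [3, 3, 0] the digits are split into groups of three, while [127] and an empty list leave the digits untouched, which matches the ungrouped numeric output shown for several of the locales above.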
Currency
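The earlier example output shows that changing the locale also changes the currency symbol and the numeric separator characters.
This example loops through several locales and prints a positive and a negative currency value formatted for each one, so that the differences can be compared.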
locale_currency.py
import locale
sample_locales = [
    ('USA', 'en_US'),
    ('France', 'fr_FR'),
    ('Spain', 'es_ES'),
    ('Portugal', 'pt_PT'),
    ('Poland', 'pl_PL'),
]
for name, loc in sample_locales:
    locale.setlocale(locale.LC_ALL, loc)
    print('{:>10}: {:>10} {:>10}'.format(
        name,
        locale.currency(1234.56),
        locale.currency(-1234.56),
    ))
The output is this small table:
$ python3 locale_currency.py
USA: $1234.56 -$1234.56
France: 1234,56 Eu 1234,56 Eu-
Spain: 1234,56 Eu -1234,56 Eu
Portugal: 1234.56 Eu -1234.56 Eu
Poland: zł 1234,56 zł 1234,56-
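The currency() function also accepts grouping and international keyword arguments, which add thousands separators and switch to the international currency symbol. The sketch below assumes a hypothetical helper named format_price() and uses the locale name 'en_US.UTF-8' as an example; which locale names are installed varies by platform, so the helper returns None when the name is not available. Note that currency() raises ValueError under the 'C' locale, which defines no currency format.

```python
import locale


def format_price(amount, locale_name):
    """Return (local, international) currency strings, or None.

    Hypothetical helper; returns None when `locale_name` is not installed.
    """
    saved = locale.setlocale(locale.LC_ALL)
    try:
        locale.setlocale(locale.LC_ALL, locale_name)
    except locale.Error:
        return None
    try:
        local = locale.currency(amount, grouping=True)
        international = locale.currency(
            amount, grouping=True, international=True)
        return local, international
    finally:
        # Put the process-wide locale back the way it was.
        locale.setlocale(locale.LC_ALL, saved)


if __name__ == '__main__':
    print(format_price(1234.56, 'en_US.UTF-8'))
```

The exact spacing and symbols in the result depend on the platform's locale database, so they are best treated as display text rather than parsed again.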
Formatting Numbers
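Numbers not related to currency are also formatted differently depending on the locale. In particular, the grouping character used to separate large numbers into readable chunks changes.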
locale_grouping.py
import locale
sample_locales = [
    ('USA', 'en_US'),
    ('France', 'fr_FR'),
    ('Spain', 'es_ES'),
    ('Portugal', 'pt_PT'),
    ('Poland', 'pl_PL'),
]
print('{:>10} {:>10} {:>15}'.format(
    'Locale', 'Integer', 'Float')
)
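for name, loc in sample_locales:
    locale.setlocale(locale.LC_ALL, loc)
    print('{:>10}'.format(name), end=' ')
    # locale.format() was removed in Python 3.12; format_string()
    # accepts the same arguments.
    print(locale.format_string('%10d', 123456, grouping=True), end=' ')
    print(locale.format_string('%15.2f', 123456.78, grouping=True))
To format numbers without the currency symbol, use format_string() instead of currency().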
$ python3 locale_grouping.py
Locale Integer Float
USA 123,456 123,456.78
France 123456 123456,78
Spain 123456 123456,78
Portugal 123456 123456,78
Poland 123 456 123 456,78
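Two related details may be useful here: format_string() accepts a tuple of values when the template contains several % placeholders, and the built-in 'n' presentation type formats an integer using the current LC_NUMERIC rules. The sketch below runs under the 'C' locale, chosen only because it is always available; under that locale no separators are inserted. On a system where a locale such as en_US.UTF-8 is installed, switching to it should add the grouping characters.

```python
import locale

# The 'C' locale is always available; it defines no grouping.
locale.setlocale(locale.LC_ALL, 'C')

# A tuple supplies values for several % placeholders at once.
line = locale.format_string(
    '%d items, total %.2f', (1234567, 9876.5), grouping=True)
print(line)

# The 'n' presentation type applies the current LC_NUMERIC rules.
plain = format(1234567, 'n')
print(plain)
```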
To convert a locale-formatted number back to a normalized, locale-agnostic format, use delocalize().
locale_delocalize.py
import locale
sample_locales = [
    ('USA', 'en_US'),
    ('France', 'fr_FR'),
    ('Spain', 'es_ES'),
    ('Portugal', 'pt_PT'),
    ('Poland', 'pl_PL'),
]
for name, loc in sample_locales:
    locale.setlocale(locale.LC_ALL, loc)
    # locale.format() was removed in Python 3.12; use format_string().
    localized = locale.format_string('%0.2f', 123456.78, grouping=True)
    delocalized = locale.delocalize(localized)
    print('{:>10}: {:>10} {:>10}'.format(
        name,
        localized,
        delocalized,
    ))
Grouping punctuation is removed and the decimal separator is always converted to a period (.).
$ python3 locale_delocalize.py
USA: 123,456.78 123456.78
France: 123456,78 123456.78
Spain: 123456,78 123456.78
Portugal: 123456,78 123456.78
Poland: 123 456,78 123456.78
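One reason to delocalize is that the built-in float() only understands the locale-agnostic form. As a small sketch, the hypothetical round_trip() function below formats a value for the current locale and then recovers the number; the name is an assumption made for this example.

```python
import locale


def round_trip(value):
    """Format `value` for the current locale, then recover the float."""
    localized = locale.format_string('%.2f', value, grouping=True)
    # delocalize() strips grouping and normalizes the decimal point,
    # producing a string that float() can parse.
    return localized, float(locale.delocalize(localized))


if __name__ == '__main__':
    locale.setlocale(locale.LC_ALL, 'C')  # always available
    print(round_trip(123456.78))
```

The recovered value matches the original only to the precision kept by the format string, two decimal places in this sketch.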
Parsing Numbers
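Besides generating output in different formats, the locale module helps with parsing input strings. It includes the atoi() and atof() functions, which convert strings to integer and floating-point values based on the locale's numeric formatting conventions.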
locale_atof.py
import locale
sample_data = [
    ('USA', 'en_US', '1,234.56'),
    ('France', 'fr_FR', '1234,56'),
    ('Spain', 'es_ES', '1234,56'),
    ('Portugal', 'pt_PT', '1234.56'),
    ('Poland', 'pl_PL', '1 234,56'),
]
for name, loc, a in sample_data:
    locale.setlocale(locale.LC_ALL, loc)
    print('{:>10}: {:>9} => {:f}'.format(
        name,
        a,
        locale.atof(a),
    ))
The parser recognizes the grouping and decimal separator values of the locale.
$ python3 locale_atof.py
USA: 1,234.56 => 1234.560000
France: 1234,56 => 1234.560000
Spain: 1234,56 => 1234.560000
Portugal: 1234.56 => 1234.560000
Poland: 1 234,56 => 1234.560000
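User input is not always well formed, and binary floating point is often a poor fit for money. The sketch below shows one way to address both, assuming a hypothetical helper named parse_amount(): atof() takes an optional second argument naming the conversion function, so passing Decimal yields an exact decimal value, and conversion errors are turned into None.

```python
import locale
from decimal import Decimal, InvalidOperation


def parse_amount(text):
    """Parse locale-formatted `text` into a Decimal, or return None.

    Hypothetical helper for illustration; not part of the locale module.
    """
    try:
        # atof() accepts a conversion function as its second argument.
        return locale.atof(text.strip(), Decimal)
    except (ValueError, InvalidOperation):
        # Decimal signals bad input with InvalidOperation; float()
        # would raise ValueError instead.
        return None


if __name__ == '__main__':
    locale.setlocale(locale.LC_ALL, 'C')  # always available
    print(parse_amount('1234.56'))
    print(parse_amount('not a number'))
```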
Dates and Times
Another important aspect of localization is date and time formatting.
locale_date.py
import locale
import time
sample_locales = [
    ('USA', 'en_US'),
    ('France', 'fr_FR'),
    ('Spain', 'es_ES'),
    ('Portugal', 'pt_PT'),
    ('Poland', 'pl_PL'),
]
for name, loc in sample_locales:
    locale.setlocale(locale.LC_ALL, loc)
    # Named date_format to avoid shadowing the built-in format().
    date_format = locale.nl_langinfo(locale.D_T_FMT)
    print('{:>10}: {}'.format(name, time.strftime(date_format)))
This example uses the locale's date and time format string to print the current date and time.
$ python3 locale_date.py
USA: Sun Mar 18 16:20:59 2018
France: Dim 18 mar 16:20:59 2018
Spain: dom 18 mar 16:20:59 2018
Portugal: Dom 18 Mar 16:20:59 2018
Poland: ndz 18 mar 16:20:59 2018
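Besides D_T_FMT, nl_langinfo() can return other locale items, such as the weekday names DAY_1 through DAY_7, where DAY_1 is Sunday. The sketch below assumes a hypothetical helper named weekday_names() and a platform that provides nl_langinfo(), which is not available on Windows.

```python
import locale


def weekday_names():
    """Return the seven weekday names for the current locale, Sunday first."""
    days = (locale.DAY_1, locale.DAY_2, locale.DAY_3, locale.DAY_4,
            locale.DAY_5, locale.DAY_6, locale.DAY_7)
    return [locale.nl_langinfo(day) for day in days]


if __name__ == '__main__':
    locale.setlocale(locale.LC_ALL, 'C')  # always available
    print(weekday_names())
```

Under the 'C' locale the names are the English ones; after switching to another installed locale, the same call returns that locale's names.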
See also
- Standard library documentation for locale
- Python 2 to 3 porting notes for locale
- gettext -- Message catalogs for translations.