1.14 Properly Capitalizing a Title or Headline
Date: 2007-01-09 Source: xiaoshengcaicai
1.14.1 Problem
You have a string, such as a headline, the title of a book, or some other work, that needs proper capitalization.
1.14.2 Solution
Use the following tc function, passing it the string to convert:
INIT {
    our %nocap;
    for (qw(
            a an the
            and but or
            as at but by for from in into of off on onto per to with
        ))
    {
        $nocap{$_}++;
    }
}
sub tc {
    local $_ = shift;
    # put into lowercase if on stop list, else titlecase
    s/(\pL[\pL']*)/$nocap{$1} ? lc($1) : ucfirst(lc($1))/ge;
    s/^(\pL[\pL']*) /\u\L$1/x; # first word guaranteed to cap
    s/ (\pL[\pL']*)$/\u\L$1/x; # last word guaranteed to cap
    # treat parenthesized portion as a complete title
    s/\( (\pL[\pL']*) /(\u\L$1/x;
    s/(\pL[\pL']*) \) /\u\L$1)/x;
    # capitalize first word following colon or semi-colon
    s/ ( [:;] \s+ ) (\pL[\pL']* ) /$1\u\L$2/x;
    return $_;
}
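1.14.3 Discussion
Capitalizing an English title correctly takes more than capitalizing the first letter of each word. If that were all you needed, you could simply write: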
s/(\w+\S*\w*)/\u\L$1/g;
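The extra \S*\w* in that pattern appears to be there so that a word containing an apostrophe or a hyphen is treated as a single unit. A minimal sketch comparing it with a plain \w+ pattern follows; the helper names cap_simple and cap_naive are made up for this illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: the one-liner from the text.
sub cap_simple {
    my $s = shift;
    $s =~ s/(\w+\S*\w*)/\u\L$1/g;
    return $s;
}

# Hypothetical helper: same idea, but matching only \w+.
sub cap_naive {
    my $s = shift;
    $s =~ s/(\w+)/\u\L$1/g;
    return $s;
}

print cap_simple("don't look back"), "\n";   # Don't Look Back
print cap_naive("don't look back"), "\n";    # Don'T Look Back
```

With \w+ alone, the apostrophe ends the match, so the "t" that follows is treated as the start of a new word and capitalized.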
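Most style guides say that the first and last words of a title should always be capitalized, along with every other word except articles, the "to" of an infinitive, conjunctions, and prepositions.
Here is a demonstration, this time showing titles with different properties. Assume that the tc function is the one given in the Solution: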
# with apologies (or kudos) to Stephen Brust, PJF,
# and to JRRT, as always.
@data = (
    "the enchantress of \x{01F3}ur mountain",
    "meeting the enchantress of \x{01F3}ur mountain",
    "the lord of the rings: the fellowship of the ring",
);
$mask = "%-20s: %s\n";
sub tc_lame {
    local $_ = shift;
    s/(\w+\S*\w*)/\u\L$1/g;
    return $_;
}
for $datum (@data) {
    printf $mask, "ALL CAPITALS", uc($datum);
    printf $mask, "no capitals", lc($datum);
    printf $mask, "simple titlecase", tc_lame($datum);
    printf $mask, "better titlecase", tc($datum);
    print "\n";
}
ALL CAPITALS        : THE ENCHANTRESS OF DZUR MOUNTAIN
no capitals         : the enchantress of dzur mountain
simple titlecase    : The Enchantress Of Dzur Mountain
better titlecase    : The Enchantress of Dzur Mountain

ALL CAPITALS        : MEETING THE ENCHANTRESS OF DZUR MOUNTAIN
no capitals         : meeting the enchantress of dzur mountain
simple titlecase    : Meeting The Enchantress Of Dzur Mountain
better titlecase    : Meeting the Enchantress of Dzur Mountain

ALL CAPITALS        : THE LORD OF THE RINGS: THE FELLOWSHIP OF THE RING
no capitals         : the lord of the rings: the fellowship of the ring
simple titlecase    : The Lord Of The Rings: The Fellowship Of The Ring
better titlecase    : The Lord of the Rings: The Fellowship of the Ring
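One thing to keep in mind is that some style guides capitalize only those prepositions longer than three, four, or sometimes five letters. O'Reilly & Associates, for example, keeps prepositions of four or fewer letters in lowercase. Here is a longer list of prepositions, which you can modify to suit your needs: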
@all_prepositions = qw{
about above absent across after against along amid amidst
among amongst around as at athwart before behind below
beneath beside besides between betwixt beyond but by circa
down during ere except for from in into near of off on onto
out over past per since than through till to toward towards
under until unto up upon versus via with within without
};
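One way to apply such a length rule is to filter the preposition list before building the stop list. The following is only a sketch, assuming that articles and the short conjunctions always stay lowercase; the names $max_len, %nocap_short, and tc_short are hypothetical, and the preposition list is abbreviated:

```perl
use strict;
use warnings;

# Abbreviated preposition list, for illustration only.
my @prepositions = qw(
    about above across after against along among around as at before
    behind below beneath between beyond by down during for from in into
    of off on onto over per through to toward under until up upon with
    within without
);

# Assumption: prepositions of $max_len letters or fewer stay lowercase.
my $max_len = 4;
my %nocap_short = map { $_ => 1 }
    qw(a an the and but or),
    grep { length($_) <= $max_len } @prepositions;

sub tc_short {
    my $s = shift;
    # lowercase stop words, titlecase everything else
    $s =~ s/(\pL[\pL']*)/$nocap_short{lc $1} ? lc($1) : ucfirst(lc($1))/ge;
    # first and last words are always capitalized
    $s =~ s/^(\pL[\pL']*)/\u$1/;
    $s =~ s/(\pL[\pL']*)$/\u$1/;
    return $s;
}

print tc_short("a walk through the valley"), "\n";  # A Walk Through the Valley
print tc_short("a walk in the valley"), "\n";       # A Walk in the Valley
```

Changing $max_len to 3 or 5 adapts the sketch to a style guide with a different cutoff.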
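This kind of approach can take you only so far, because it cannot tell which part of speech a word is playing in a given title. Some prepositions on the list double as words that should always be capitalized, such as subordinating conjunctions, adverbs, or even adjectives. For example, "by" is lowercase in "Down by the Riverside" but capitalized in "Getting By on Just $30 a Day"; compare also the "in" of "A Ringing in My Ears" with that of "Bringing In the Sheaves".
Another point to note is that applying \u or ucfirst by itself, without also applying \L or lc, uppercases the first letter and leaves the rest of the word alone. That way a word already in all capitals, such as an acronym, keeps that trait, and "FBI" and "LBJ" are not mistakenly converted into "Fbi" and "Lbj".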
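A minimal sketch of that idea applies ucfirst alone and leaves stop-list words as they are. The name tc_keep_caps and the shortened stop list are assumptions made for this illustration and are not part of the Solution:

```perl
use strict;
use warnings;

# Shortened stop list, for illustration only.
my %small = map { $_ => 1 } qw(a an the and but or at by for in of on to);

sub tc_keep_caps {
    my $s = shift;
    # ucfirst without lc: letters after the first keep their case
    $s =~ s/(\pL[\pL']*)/$small{$1} ? $1 : ucfirst($1)/ge;
    $s =~ s/^(\pL[\pL']*)/\u$1/;    # first word always capitalized
    $s =~ s/(\pL[\pL']*)$/\u$1/;    # last word always capitalized
    return $s;
}

print tc_keep_caps("the FBI files on LBJ"), "\n";   # The FBI Files on LBJ
```

The trade-off is that input typed entirely in capitals is left as it is, so this variant suits mixed-case input that contains the occasional acronym.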
1.14.4 See Also
The uc, lc, ucfirst, and lcfirst functions in perlfunc(1) and Chapter 29 of Programming Perl;
the \L, \U, \l, and \u string escapes in the "Quote and Quote-like Operators" section of perlop(1) and Chapter 5 of Programming Perl.