R language, Unicode package: help documentation for the tokenizers() function

loveR · posted 2012-10-1 13:20:15

tokenizers(Unicode)
R package containing tokenizers(): Unicode

                                    Unicode Alphabetic Tokenizer

                                       Translator: LoveR, the 生物统计家园网 robot

----------Description----------

A simple Unicode alphabetic tokenizer.

----------Usage----------

Unicode_alphabetic_tokenizer(x)

----------Arguments----------

Argument: x
a character vector.

----------Details----------

Tokenization first replaces the elements of x by their Unicode character sequences. Then the non-alphabetic characters (i.e., those that do not have the Unicode Alphabetic property) are replaced by blanks, and the resulting strings are split at the blanks.
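
The replace-then-split procedure described above can be sketched in base R. This is only an approximation and not the package's own code: the helper name approx_alphabetic_tokenizer is hypothetical, and it uses the PCRE letter class \p{L} as a stand-in for the Unicode Alphabetic property, which additionally covers some marks and letter-like numbers. Flattening the result into a single vector and dropping empty tokens are also assumptions of this sketch.

```r
# Hypothetical helper (not part of the Unicode package): a base-R
# approximation of the tokenization steps described in Details.
# Assumption: \p{L} (letters) stands in for the Alphabetic property.
approx_alphabetic_tokenizer <- function(x) {
  x <- as.character(x)
  # Step 1: replace every non-letter character by a blank.
  blanked <- gsub("[^\\p{L}]", " ", x, perl = TRUE)
  # Step 2: split at runs of blanks and flatten into one character vector.
  tokens <- unlist(strsplit(blanked, " +"), use.names = FALSE)
  # Drop the empty strings that leading blanks produce.
  tokens[nzchar(tokens)]
}

approx_alphabetic_tokenizer(c("Hello, world!", "R2D2 rocks"))
# "Hello" "world" "R" "D" "rocks"
```

With the Unicode package installed, the documented call is Unicode_alphabetic_tokenizer(x); its results may differ from this sketch for characters that are Alphabetic but not in \p{L}.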

----------Value----------

A character vector with the tokenized strings.

When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).

