R language, TRAMPR package: help documentation for the group.knowns() function

Posted on 2012-10-1 11:43:18
group.knowns(TRAMPR)
group.knowns() belongs to the R package TRAMPR

Knowns Clustering

Chinese translation by: LoveR, the translation robot of 生物统计家园网 (Biostatistics Home)

----------Description----------

Group a TRAMPknowns object so that knowns with similar TRFLP patterns and knowns that share the same species name “group” together. In general, this function will be called automatically whenever appropriate (e.g. when loading a data set or adding new knowns).  Please see Details to understand why this function is necessary, and how it works.

The main reason for manually calling group.knowns is to change the default values of the arguments; if you call group.knowns on a TRAMPknowns object, then any subsequent automatic call to group.knowns will use any arguments you passed in the manual group.knowns call (e.g. after doing group.knowns(x, cut.height=20), all future groupings will use cut.height=20).


----------Usage----------


group.knowns(x, ...)
## S3 method for class 'TRAMPknowns'
group.knowns(x, dist.method, hclust.method, cut.height, ...)
## S3 method for class 'TRAMP'
group.knowns(x, ...)



----------Arguments----------

Argument: x
A TRAMPknowns or TRAMP object, containing identified TRFLP patterns.


Argument: dist.method
Distance method used in calculating similarity between different knowns (see dist).  Valid options include "maximum", "euclidian" and "manhattan".
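
To make the difference between distance methods concrete, here is a minimal sketch using base R's dist on two made-up profiles. The fragment sizes and the names known.a and known.b are invented for illustration and are not TRAMPR data.

```r
# Two hypothetical profiles: fragment sizes for three enzyme/primer combinations
profiles <- rbind(known.a = c(100, 200, 300),
                  known.b = c(101, 203, 300))

# "maximum": the largest single difference across columns
d.max <- dist(profiles, method = "maximum")

# "manhattan": the sum of absolute differences across columns
d.man <- dist(profiles, method = "manhattan")

as.numeric(d.max)  # 3
as.numeric(d.man)  # 4
```

With "maximum", one badly matching combination dominates the distance; with "manhattan", many small mismatches add up.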


Argument: hclust.method
Clustering method used in generating clusters from the similarity matrix (see hclust).


Argument: cut.height
Passed to cutree; controls how similar members of each group should be (the larger cut.height, the more inclusive the knowns groups will be).


Argument: ...
Arguments passed to further methods.


----------Details----------

group.knowns groups together knowns in a TRAMPknowns object based on two criteria: (1) TRFLP profiles that are very similar across shared enzyme/primer combinations (based on clustering) and (2) TRFLP profiles that belong to the same species (i.e. share a common species column in the info data.frame of x; see TRAMPknowns for more information).  This is to solve three issues in TRFLP analysis:

1. The TRFLP profile of a single species can vary in peak size because of DNA sequence variation.  Including multiple collections of each species accounts for this variation.  If a TRAMPknowns object contains multiple collections of a species, group.knowns aggregates them.  This aggregation is essential for community analysis, because leaving collections ungrouped artificially inflates the number of “present species” when running TRAMP.

   Some authors have taken an alternative approach, using a larger tolerance when matching peaks between samples and knowns (effectively increasing accept.error in TRAMP) to account for within-species variation.  This is not recommended, as it dramatically increases the risk of incorrect matches.

2. Distinctly different TRFLP profiles may occur within a species (or in some cases within an individual); see Avis et al. (2006).  group.knowns looks at the species column of the info data.frame of x and joins any knowns with identical species values as a group.  This can also be used where multiple profiles are present in an individual.

3. Different species may share a similar TRFLP profile and therefore be indistinguishable using TRFLP.  If these patterns are not grouped, two species will be recorded as present wherever either is present.  group.knowns prevents this by joining knowns with “very similar” TRFLP patterns as a group.  Ideally, these problematic groups can be resolved by increasing the number of enzyme/primer pairs in the data.

Group names are generated by concatenating all unique (sorted) species names, separated by commas.
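
As a rough illustration of the naming rule, the sketch below builds a group name from made-up species labels using base R only. The labels are hypothetical, and the exact separator TRAMPR uses (here ", ") is an assumption.

```r
# Hypothetical species labels for the knowns that ended up in one group
species <- c("Russula sp.", "Amanita muscaria", "Russula sp.")

# Unique, sorted, comma-separated (the exact separator is an assumption)
group.name <- paste(sort(unique(species)), collapse = ", ")
group.name
```

A group containing several collections of a single species therefore keeps that species name, while a group spanning several species lists them all.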

To determine if knowns are “similar enough” to form a group, we use R's clustering tools: dist, hclust and cutree.  First, we generate a distance matrix of the knowns profiles using dist, and using method dist.method (see Example below; this is very similar to what TRAMP does, and dist.method should be specified accordingly).  We then generate clusters using hclust, and using method hclust.method, and “cut” the tree at cut.height using cutree.
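
The dist, hclust and cutree pipeline described above can be sketched on a toy matrix with base R alone. The profile values are invented, and hclust's "complete" linkage is used only as an assumption; it may not be the default that group.knowns applies.

```r
# Hypothetical knowns: rows are knowns, columns are enzyme/primer combinations
profiles <- rbind(k1 = c(100, 200),
                  k2 = c(101, 201),
                  k3 = c(150, 260),
                  k4 = c(152, 262))

d    <- dist(profiles, method = "maximum")  # distance matrix between knowns
tree <- hclust(d, method = "complete")      # hierarchical clustering

# Cut low: only near-identical knowns are joined (k1 with k2, k3 with k4)
groups.tight <- cutree(tree, h = 2.5)

# Cut high: every known falls into a single group
groups.loose <- cutree(tree, h = 100)
```

Here k1 and k2 differ by at most 1 and k3 and k4 by at most 2, so a cut at 2.5 keeps two groups, while a cut at 100 merges everything. This mirrors the effect of cut.height.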

Knowns are grouped together iteratively, so that all groups sharing a common cluster are grouped together, and all knowns that share a common species name are grouped together.  In certain cases this may chain together seemingly unrelated groups.
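
The chaining effect can be seen in a small sketch. This is not TRAMPR's implementation, only one simple way to merge groups iteratively in base R; the cluster ids and species labels are made up.

```r
# Hypothetical inputs: one cluster id and one species name per known
cluster <- c(1, 1, 2, 3, 3)
species <- c("sp. A", "sp. B", "sp. B", "sp. C", "sp. D")

# Start with each known in its own group, then repeatedly give every known
# the smallest group id found among knowns sharing its cluster or its species.
group <- seq_along(cluster)
repeat {
  by.cluster <- ave(group, cluster, FUN = min)
  by.species <- ave(group, species, FUN = min)
  new.group  <- pmin(by.cluster, by.species)
  if (identical(new.group, group)) break  # nothing changed: grouping is stable
  group <- new.group
}
group  # 1 1 1 4 4
```

Known 1 ("sp. A") and known 3 (cluster 2) share neither a cluster nor a species, yet they end up in the same group because known 2 links them: it shares a cluster with known 1 and a species with known 3.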

Because group.knowns is generic, it can be run on either a TRAMPknowns or a TRAMP object.  When run on a TRAMP object, it updates the TRAMPknowns object (stored as x$knowns), so that subsequent calls to plot.TRAMPknowns or summary.TRAMPknowns (for example) will use the new grouping parameters.

Parameters set by group.knowns are retained as part of the object, so that when adding additional knowns (add.known and combine), or when subsetting a knowns database (see [.TRAMPknowns,  aka TRAMPindexing), the same grouping parameters will be used.


----------Value----------

For group.knowns.TRAMPknowns, a new TRAMPknowns object. The cluster.pars element will have been updated with new parameters, if any were specified.

For group.knowns.TRAMP, a new TRAMP object, with an updated knowns element.  Note that the original TRAMPknowns object (i.e. the one from which the TRAMP object was constructed) will not be modified.


----------Warning----------

Warning about missing data: where there are NA values in certain combinations, NAs may be present in the final distance matrix, which means we cannot use hclust to generate the clusters!  In general, NA values are fine.  They just can't be everywhere.
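
A minimal base R sketch of the situation this warning describes, using invented profiles: when two knowns have no enzyme/primer combination in common, dist returns NA for that pair and hclust cannot proceed.

```r
# Hypothetical profiles; NA marks an enzyme/primer combination with no data
profiles <- rbind(k1 = c(100, NA),
                  k2 = c(NA, 200),
                  k3 = c(100, 200))

d <- dist(profiles, method = "maximum")

# k1 and k2 share no non-missing column, so their distance is NA
anyNA(d)

# hclust() cannot work with missing distances; catch the error to show this
result <- tryCatch(hclust(d), error = function(e) "hclust failed")
result
```

Checking anyNA() on the distance matrix before clustering is one way to spot the problem early; k3 alone has no trouble, since it shares a column with each of the others.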


----------References----------

Avis PG, Dickie IA, Mueller GM 2006. A 'dirty' business: testing the limitations of terminal restriction fragment length polymorphism (TRFLP) analysis of soil fungi. Molecular Ecology 15: 873-882.

----------See Also----------

TRAMPknowns, which describes the TRAMPknowns object.

build.knowns, which attempts to generate a knowns database from a TRAMPsamples data set.

plot.TRAMPknowns, which graphically displays the relationships between knowns.


----------Examples----------


data(demo.knowns)
data(demo.samples)

demo.knowns <- group.knowns(demo.knowns, cut.height=2.5)
plot(demo.knowns)

## Increasing cut.height makes groups more inclusive:
plot(group.knowns(demo.knowns, cut.height=100))

res <- TRAMP(demo.samples, demo.knowns)
m1.ungrouped <- summary(res)
m1.grouped <- summary(res, group=TRUE)
ncol(m1.grouped) # 94 groups

res2 <- group.knowns(res, cut.height=100)
m2.ungrouped <- summary(res2)
m2.grouped <- summary(res2, group=TRUE)
ncol(m2.grouped) # Now only 38 groups

## group.knowns results in the same distance matrix as produced by
## TRAMP, therefore using the same method (e.g. method="maximum") is
## important.  The example below shows how the matrix produced by
## dist(summary(x)) (as calculated by group.knowns) is the same as that
## produced by TRAMP:
f <- function(x, method="maximum") {
  ## Create a pseudo-samples object from our knowns
  y <- x
  y$data$height <- 1
  names(y$info)[names(y$info) == "knowns.pk"] <- "sample.pk"
  names(y$data)[names(y$data) == "knowns.fk"] <- "sample.fk"
  class(y) <- "TRAMPsamples"

  ## Run TRAMP, clean up and return
  ## (If method != "maximum", rescale the error to match that
  ## generated by dist()).
  z <- TRAMP(y, x, method=method)
  if ( method != "maximum" ) z$error <- z$error * z$n
  names(dimnames(z$error)) <- NULL
  z
}

g <- function(x, method="maximum")
  as.matrix(dist(summary(x), method=method))

all.equal(f(demo.knowns, "maximum")$error,   g(demo.knowns, "maximum"))
all.equal(f(demo.knowns, "euclidian")$error, g(demo.knowns, "euclidian"))
all.equal(f(demo.knowns, "manhattan")$error, g(demo.knowns, "manhattan"))

## However, TRAMP is over 100 times slower in this special case.
system.time(f(demo.knowns))
system.time(g(demo.knowns))

When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).

