aggregateRanks(RobustRankAggreg)
aggregateRanks() belongs to the R package RobustRankAggreg
Aggregate ranked lists
Description
Aggregate ranked lists
Usage
topCutoff=NA)
Arguments
Argument: glist
A list of element vectors; the order of the elements within each vector is used as the ranking.
Argument: rmat
The rankings in matrix format. By default, glist is converted to this format.
Argument: N
The number of ranked elements; important when only top-k ranks are used. By default it is calculated as the number of unique elements in the input.
Argument: method
Rank aggregation method; by default 'RRA'. Other options are 'min', 'geom.mean', 'mean', 'median' and 'stuart'.
Argument: full
Indicates whether full rankings are given; used if the sets of ranked elements do not match perfectly.
Argument: topCutoff
A vector of cutoff values used to limit the number of elements in the input lists.
Details
All the methods implemented in this function assume that the number of ranked items is known. This assumption is satisfied, for example, in the case of gene lists (the number of all genes is known to a certain extent), but not when aggregating results from Google searches (there are too many web pages). The parameter N can be set manually and has a strong influence on the end result. The p-values from the RRA algorithm can be trusted only if N is close to the real value.
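As a small illustration of the default described for N, the sketch below counts the unique elements across the input lists in base R and shows how a manually chosen N changes the scaled rank of the same item. The toy lists and variable names are made up for illustration, and dividing a rank by N is an assumption about how ranks are scaled, not code taken from the package.

```r
# Toy input: three ranked lists (hypothetical data)
glist <- list(c("a", "b", "c"), c("b", "d"), c("a", "e", "f", "g"))

# Default described above: N is the number of unique elements in the input
N_default <- length(unique(unlist(glist)))

# Assumption: a rank is scaled by N, so a larger N makes the same
# list position look more extreme
rank_of_c <- match("c", glist[[1]])    # position of "c" in the first list
norm_default <- rank_of_c / N_default  # scaled with the default N
norm_large_N <- rank_of_c / 1000       # scaled with a manually set, larger N
```

If the true universe is much larger than the items that happen to appear in the lists, the default N would understate it, which is one way to read the warning about p-values above.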
The rankings can be either full or partial. Tests with the RRA algorithm show that one does not lose too much information if only top-k rankings are used. The missing values are assumed to be equal to the maximal value and are taken into account appropriately in that way.
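The treatment of partial rankings can be sketched in base R as follows. The helper partial_rank_matrix is hypothetical and is not the package's rankMatrix; it assumes that ranks are scaled to (0, 1] by dividing by N and that an item missing from a list receives the maximal value 1.

```r
# Hypothetical helper for illustration only, NOT the package's rankMatrix
partial_rank_matrix <- function(glist, N = length(unique(unlist(glist)))) {
  items <- unique(unlist(glist))
  # Start with every entry at the maximal value 1 (item missing from a list)
  rmat <- matrix(1, nrow = length(items), ncol = length(glist),
                 dimnames = list(items, NULL))
  for (i in seq_along(glist)) {
    # Items that appear in list i get rank / N, so values lie in (0, 1]
    rmat[glist[[i]], i] <- seq_along(glist[[i]]) / N
  }
  rmat
}

# "d" is absent from the first list and "a", "c" from the second,
# so those entries stay at 1
rmat <- partial_rank_matrix(list(c("a", "b", "c"), c("b", "d")))
```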
The function can also handle the case when the elements of the different rankings do not overlap perfectly, for example if we combine results from different microarray platforms with varying coverage. In this case these structurally missing values are substituted with NAs and handled differently from omitted parts of the rankings. The function accepts as input either a list of rankings or a rank matrix based on them. It converts the list to a rank matrix automatically using the function rankMatrix. For most cases the ranking list is more convenient. Only in complicated cases, for example with top-k lists and structural missing values, would one want to construct the rank matrix manually.
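A manually built rank matrix for such a complicated case might look like the sketch below. The platform names, their coverage and the scaling of ranks by each platform's coverage are all assumptions made for illustration; the point is only the distinction between an item that was measured but omitted from a top-k list (maximal value 1) and an item the platform does not cover at all (NA).

```r
items <- letters[1:6]

# Start from NA everywhere: "not covered by this platform"
rmat <- matrix(NA_real_, nrow = length(items), ncol = 2,
               dimnames = list(items, c("platformA", "platformB")))

# platformA (hypothetical) covers a-e but reports only its top 2: b, then a
covA <- c("a", "b", "c", "d", "e")
rmat[covA, "platformA"] <- 1                                # covered, but omitted from the top-k list
rmat[c("b", "a"), "platformA"] <- c(1, 2) / length(covA)    # reported top-2 ranks

# platformB (hypothetical) covers b-f and reports a full ranking
rankedB <- c("f", "c", "b", "e", "d")
rmat[rankedB, "platformB"] <- seq_along(rankedB) / length(rankedB)

# Count the structurally missing entries per platform
na_per_platform <- colSums(is.na(rmat))
```

A matrix of this kind would then be passed through the rmat argument instead of glist.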
When the number of top elements included in the input is specified in advance (for example, some lists are limited to 100 elements) and the lengths of these lists differ significantly, we can use a more sensitive and accurate algorithm for the score calculation. One then also has to specify the parameter topCutoff, which is a vector defining a cutoff value for each input list. For example, if we have three lists of 1000 elements but the first is limited to 100, the second to 200 and the third to 900
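As a sketch of the three-list example, assuming that topCutoff is expressed as a fraction of N (the text here does not state the scale, so check this against the package help before relying on it), the cutoff vector could be computed like this:

```r
N <- 1000                      # total number of ranked elements
limits <- c(100, 200, 900)     # list lengths fixed in advance

# Assumption: each cutoff is the list limit divided by N
topCutoff <- limits / N
topCutoff                      # 0.1 0.2 0.9

# With the package loaded, the vector would be passed on as, for example:
# aggregateRanks(glist = glist, N = N, topCutoff = topCutoff)
```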
Value
Returns a two-column data frame with the element names and associated scores.
Author(s)
Raivo Kolde <rkolde@gmail.com>
Examples
glist <- list(sample(letters, 4), sample(letters, 10), sample(letters, 12))
# Aggregate the inputs
aggregateRanks(glist = glist, N = length(letters))
aggregateRanks(glist = glist, N = length(letters), method = "stuart")
# Since we know the cutoffs for the lists in advance (4, 10, 12) we can use
# the more accurate algorithm with parameter topCutoff
# Use the rank matrix instead of the gene lists as the input
r = rankMatrix(glist)
aggregateRanks(rmat = r)
# Example, when the input lists represent full rankings but the domains do not match
glist <- list(sample(letters[4:24]), sample(letters[2:22]), sample(letters[1:20]))
r = rankMatrix(glist, full = TRUE)
head(r)
aggregateRanks(rmat = r, method = "RRA")
# Dataset representing significantly changed genes after knockouts
# of cell cycle specific transcription factors
data(cellCycleKO)
r = rankMatrix(cellCycleKO$gl, N = cellCycleKO$N)
ar = aggregateRanks(rmat = r)