R WGCNA package: help documentation for the rankPvalue() function

loveR · posted on 2012-10-1 22:19:02

rankPvalue(WGCNA)
rankPvalue() belongs to the R package: WGCNA

Estimate the p-value for ranking consistently high (or low) on multiple lists

Translator: LoveR, the machine-translation robot of 生物统计家园网

----------Description----------

The function rankPvalue calculates the p-value for observing that an object (corresponding to a row of the input data frame datS) has a consistently high ranking (or low ranking) according to multiple ordinal scores (corresponding to the columns of the input data frame datS).

----------Usage----------

rankPvalue(datS, columnweights = NULL,
         na.last = "keep", ties.method = "average",
         calculateQvalue = TRUE, pValueMethod = "all")

----------Arguments----------

Argument: datS
a data frame whose rows represent the objects that will be ranked. Each column of datS represents an ordinal variable (which can take on negative values). The columns correspond to (possibly signed) object significance measures, e.g., statistics (such as Z statistics), ranks, or correlations.

Argument: columnweights
allows the user to input a vector of non-negative numbers reflecting weights for the different columns of datS. If it is set to NULL then all weights are equal.

Argument: na.last
controls the treatment of missing values (NAs) in the rank function. If TRUE, missing values in the data are put last (i.e. they get the highest rank values). If FALSE, they are put first; if NA, they are removed; if "keep", they are kept with rank NA. See rank for more details.
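
The four settings can be compared directly with base R's rank function, which is where na.last is passed on to. This is a small illustration on a made-up vector x, not output from rankPvalue itself.

```r
# A made-up score vector with one missing value.
x <- c(3.2, NA, 1.5, 2.7)

r_keep   <- rank(x, na.last = "keep")  # 3 NA 1 2 : the NA keeps rank NA
r_last   <- rank(x, na.last = TRUE)    # 3 4 1 2  : the NA gets the highest rank
r_first  <- rank(x, na.last = FALSE)   # 4 1 2 3  : the NA gets the lowest rank
r_remove <- rank(x, na.last = NA)      # 3 1 2    : the NA is dropped

print(list(keep = r_keep, last = r_last, first = r_first, remove = r_remove))
```

With "keep" (the default in the usage shown above) the result has the same length as the input, so the ranks stay aligned with the rows of datS.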

Argument: ties.method
represents the ties method used in the rank function for the percentile rank method. See rank for more details.
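
To see what the tie handling changes, the following base R snippet ranks a made-up vector containing one tie under several ties.method settings. It only illustrates rank; how much the choice matters for rankPvalue depends on how many ties the columns of datS contain.

```r
# A made-up score vector in which the second and third values are tied.
x <- c(10, 20, 20, 30)

r_average <- rank(x, ties.method = "average")  # 1 2.5 2.5 4
r_min     <- rank(x, ties.method = "min")      # 1 2 2 4
r_max     <- rank(x, ties.method = "max")      # 1 3 3 4
r_first   <- rank(x, ties.method = "first")    # 1 2 3 4

print(rbind(average = r_average, min = r_min, max = r_max, first = r_first))
```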

Argument: calculateQvalue
logical: should q-values be calculated? If set to TRUE then the function calculates the corresponding q-values (local false discovery rates) using the qvalue package, see Storey JD and Tibshirani R. (2003). This option assumes that the qvalue package has been installed.

Argument: pValueMethod
determines which method is used for calculating p-values. By default it is set to "all", i.e. both methods are used. If it is set to "rank" then only the percentile rank method is used. If it is set to "scale" then only the scale method is used.

----------Details----------

The function calculates asymptotic p-values (and optionally q-values) for testing the null hypothesis that the values in the columns of datS are independent. This allows us to find objects (rows) with consistently high (or low) values across the columns.

Example: Imagine you have 5 vectors of Z statistics corresponding to the columns of datS. Further assume that a gene has ranks 1, 1, 1, 1, 20 in the 5 lists. It seems very significant that the gene ranks number 1 in 4 out of the 5 lists. The function rankPvalue can be used to calculate a p-value for this occurrence.
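
A call along these lines might look as follows. This is a hedged sketch on simulated data: the 100 x 5 data frame and the choice of object 1 as the top-scoring row are made up for illustration, the call is guarded so it only runs if WGCNA is installed, and q-values are switched off so that the qvalue package is not required.

```r
# Simulated input: 100 objects (rows) scored in 5 lists (columns).
set.seed(1)
datS <- as.data.frame(matrix(rnorm(100 * 5), nrow = 100, ncol = 5))

# Make object 1 the top-scoring row in lists 1 to 4; list 5 stays random.
datS[1, 1:4] <- max(datS) + 1

# Only call rankPvalue() if WGCNA is available on this machine.
if (requireNamespace("WGCNA", quietly = TRUE)) {
  res <- WGCNA::rankPvalue(datS, calculateQvalue = FALSE, pValueMethod = "all")
  str(res)  # inspect which p-value components were returned
}
```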

The function uses the central limit theorem to calculate asymptotic p-values for two types of test statistics that measure consistently high or low ordinal values. The first method (referred to as the percentile rank method) leads to accurate estimates of p-values if datS has at least 4 columns, but it can be overly conservative.

The percentile rank method replaces each column of datS by the ranked version rank(datS[,i]) (referred to as low ranking) and by rank(-datS[,i]) (referred to as high ranking). Low ranking and high ranking allow one to find consistently small values or consistently large values of datS, respectively. All ranks are divided by the maximum rank so that the result lies in the unit interval [0,1]. In the following, we refer to rank/max(rank) as the percentile rank.

For a given object (corresponding to a row of datS), the observed percentile rank approximately follows a uniform distribution under the null hypothesis. The test statistic is defined as the sum of the percentile ranks (across the columns of datS). Under the null hypothesis that there is no relationship between the rankings of the columns of datS, this (row sum) test statistic follows a distribution that is given by the convolution of uniform distributions. Under the null hypothesis, the individual percentile ranks are independent, and one can invoke the central limit theorem to argue that the row sum test statistic asymptotically follows a normal distribution. It is well known that convergence to the normal distribution is extremely fast in the case of identically distributed uniform random variables. Even when datS has only 4 columns, the difference between the normal approximation and the exact distribution is negligible in practice (Killmann et al. 2001).

In summary, we use the central limit theorem to argue that the sum of the percentile ranks follows a normal distribution whose mean and variance can be calculated using the fact that the mean of a uniform random variable (on the unit interval) equals 0.5 and its variance equals 1/12.
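
The steps just described can be sketched in a few lines of base R. This is an illustrative re-implementation under simplifying assumptions (equal column weights, no missing values), not the WGCNA source code; the function name percentile_rank_pvalues and the simulated data are made up for the example. With K columns, the row sum of percentile ranks has mean K/2 and variance K/12 under the null hypothesis, which gives the normal approximation used below.

```r
# Sketch of the percentile rank method (equal weights, no NAs assumed).
percentile_rank_pvalues <- function(datS) {
  datS <- as.matrix(datS)
  K <- ncol(datS)
  # Percentile rank = rank / max(rank), computed column by column.
  pct <- function(x) { r <- rank(x, ties.method = "average"); r / max(r) }
  pct_low  <- apply(datS, 2, pct)                   # small values get small ranks
  pct_high <- apply(datS, 2, function(x) pct(-x))   # large values get small ranks
  # Row sum of K approximately Uniform(0, 1) variables: mean K/2, variance K/12.
  z_low  <- (rowSums(pct_low)  - K / 2) / sqrt(K / 12)
  z_high <- (rowSums(pct_high) - K / 2) / sqrt(K / 12)
  data.frame(pValueLow = pnorm(z_low), pValueHigh = pnorm(z_high))
}

# Simulated data: 100 genes, 5 lists of Z statistics.
set.seed(1)
datS <- matrix(rnorm(100 * 5), nrow = 100, ncol = 5)
# Gene 1 ranks 1, 1, 1, 1, 20 from the top, as in the example in the text.
datS[1, 1:4] <- max(datS) + 1
others <- sort(datS[-1, 5], decreasing = TRUE)
datS[1, 5] <- mean(others[19:20])  # exactly 19 genes score higher in list 5

res <- percentile_rank_pvalues(datS)
# High percentile ranks of gene 1 sum to 4 * 0.01 + 0.20 = 0.24.
print(res[1, ])
```

For gene 1 the "high" p-value is small, while the "low" p-value is close to 1, matching the intuition that the gene scores consistently high rather than consistently low.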

The second method for calculating p-values is referred to as the scale method. It is often more powerful, but its asymptotic p-value can only be trusted if either datS has many columns or the ordinal scores (columns of datS) follow an approximately normal distribution. The scale method scales (or standardizes) each ordinal variable (column of datS) so that it has mean 0 and variance 1. Under the null hypothesis of independence, the row sum approximately follows a normal distribution if the assumptions of the central limit theorem are met. In practice, we find that the second approach is often more powerful, but it makes more distributional assumptions (if datS has few columns).
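
A comparable base R sketch of the scale method is given below. Again, this is an illustration under assumptions (equal column weights, no missing values) rather than the package's own code, and scale_method_pvalues is a hypothetical name. After standardization, the row sum over K independent columns has mean 0 and variance K under the null hypothesis, so dividing by sqrt(K) gives an approximately standard normal statistic.

```r
# Sketch of the scale method (equal weights, no NAs assumed).
scale_method_pvalues <- function(datS) {
  datS <- as.matrix(datS)
  K <- ncol(datS)
  Z <- scale(datS)               # each column now has mean 0 and variance 1
  z_sum <- rowSums(Z) / sqrt(K)  # approximately N(0, 1) under the null
  data.frame(pValueLow = pnorm(z_sum), pValueHigh = pnorm(-z_sum))
}

# Simulated data: 100 objects, 5 roughly normal score columns.
set.seed(2)
datS <- matrix(rnorm(100 * 5), nrow = 100, ncol = 5)
datS[1, ] <- 5                   # object 1 scores very high in every column

res_scale <- scale_method_pvalues(datS)
print(res_scale[1, ])
```

Because the statistic uses the actual magnitudes rather than only the ranks, an extreme row such as object 1 receives a far smaller "high" p-value than the rank-based sum could produce, which is one way to see why the method is more powerful when its distributional assumptions hold.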

----------Value----------

A list whose actual content depends on which p-value method is selected and on whether q-values are calculated. The following inner components are calculated, organized in the outer components datoutrank and datoutscale:

Component: pValueExtremeRank
This is the minimum of pValueLowRank and pValueHighRank, i.e. min(pValueLowRank, pValueHighRank).

Component: pValueLowRank
Asymptotic p-value for observing a consistently low value across the columns of datS based on the rank method.

Component: pValueHighRank
Asymptotic p-value for observing a consistently high value across the columns of datS based on the rank method.

Component: pValueExtremeScale
This is the minimum of pValueLowScale and pValueHighScale, i.e. min(pValueLowScale, pValueHighScale).

Component: pValueLowScale
Asymptotic p-value for observing a consistently low value across the columns of datS based on the scale method.

Component: pValueHighScale
Asymptotic p-value for observing a consistently high value across the columns of datS based on the scale method.

Component: qValueExtremeRank
local false discovery rate (q-value) corresponding to the p-value pValueExtremeRank

Component: qValueLowRank
local false discovery rate (q-value) corresponding to the p-value pValueLowRank

Component: qValueHighRank
local false discovery rate (q-value) corresponding to the p-value pValueHighRank

Component: qValueExtremeScale
local false discovery rate (q-value) corresponding to the p-value pValueExtremeScale

Component: qValueLowScale
local false discovery rate (q-value) corresponding to the p-value pValueLowScale

Component: qValueHighScale
local false discovery rate (q-value) corresponding to the p-value pValueHighScale

----------Author(s)----------

Steve Horvath

----------References----------

Use in Quality Control. Economic Quality Control Vol 16 (2001), No. 1, 17-41. ISSN 0940-5151

----------See Also----------

rank, qvalue

When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).
