bicor(WGCNA)
bicor() belongs to R package: WGCNA
Biweight Midcorrelation
Translator: 生物统计家园网, robot LoveR
Description
Calculate biweight midcorrelation efficiently for matrices.
Usage
bicor(x, y = NULL,
robustX = TRUE, robustY = TRUE,
use = "all.obs",
maxPOutliers = 1,
quick = 0,
pearsonFallback = "individual",
cosine = FALSE,
cosineX = cosine,
cosineY = cosine,
nThreads = 0,
verbose = 0, indent = 0)
Arguments
Argument: x
a vector or matrix-like numeric object
Argument: y
a vector or matrix-like numeric object
Argument: robustX
use robust calculation for x?
Argument: robustY
use robust calculation for y?
Argument: use
specifies handling of NAs. One of (unique abbreviations of) "all.obs", "pairwise.complete.obs".
Argument: maxPOutliers
specifies the maximum percentile of data that can be considered outliers on either side of the median separately. For each side of the median, if a higher percentile than maxPOutliers is considered an outlier by the weight function based on 9*mad(x), the width of the weight function is increased such that the percentile of outliers on that side of the median equals maxPOutliers. Using maxPOutliers=1 will effectively disable all weight function broadening; using maxPOutliers=0 will give results that are quite similar (but not equal) to Pearson correlation.
Argument: quick
real number between 0 and 1 that controls the handling of missing data in the calculation of correlations. See details.
Argument: pearsonFallback
Specifies whether the bicor calculation should revert to Pearson when the median absolute deviation (mad) is zero. Recognized values are (abbreviations of) "none", "individual", "all". If set to "none", zero mad will result in NA for the corresponding correlation. If set to "individual", Pearson calculation will be used only for columns that have zero mad. If set to "all", the presence of a single zero mad will cause the whole variable to be treated in Pearson correlation manner (as if the corresponding robust option was set to FALSE).
Argument: cosine
logical: calculate cosine biweight midcorrelation? Cosine bicorrelation is similar to standard bicorrelation but the median subtraction is not performed.
Argument: cosineX
logical: use the cosine calculation for x? This setting does not affect y and can be used to give a hybrid cosine-standard bicorrelation.
Argument: cosineY
logical: use the cosine calculation for y? This setting does not affect x and can be used to give a hybrid cosine-standard bicorrelation.
Argument: nThreads
non-negative integer specifying the number of parallel threads to be used by certain parts of correlation calculations. This option only has an effect on systems on which a POSIX thread library is available (which currently includes Linux and Mac OSX, but excludes Windows). If zero, the number of online processors will be used if it can be determined dynamically, otherwise correlation calculations will use 2 threads.
Argument: verbose
if non-zero, the underlying C function will print some diagnostics.
Argument: indent
indentation for diagnostic messages. Zero means no indentation, each unit adds two spaces.
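The broadening described for maxPOutliers can be illustrated in base R. The sketch below is not the package's C implementation; biweight_weights is a hypothetical helper, and it assumes that mad is the raw median absolute deviation (constant = 1) and that broadening means widening the scale on one side of the median to the matching quantile of the centered data.

```r
# Biweight weights with optional per-side broadening (illustrative only).
# Assumptions: raw MAD (constant = 1); broadening is done by widening the
# scale on one side to the corresponding quantile of the centered data.
biweight_weights <- function(x, maxPOutliers = 1) {
  centered <- x - median(x)
  base_width <- 9 * mad(x, constant = 1)
  # Quantiles that bound the allowed outlier fraction on each side
  q_low  <- quantile(centered, probs = maxPOutliers, names = FALSE)
  q_high <- quantile(centered, probs = 1 - maxPOutliers, names = FALSE)
  # Never shrink the width below the default 9 * mad(x)
  width_low  <- max(base_width, -q_low)
  width_high <- max(base_width, q_high)
  u <- ifelse(centered < 0, centered / width_low, centered / width_high)
  # Points with |u| >= 1 get zero weight, i.e. are treated as outliers
  ifelse(abs(u) < 1, (1 - u^2)^2, 0)
}

set.seed(1)
x <- c(rnorm(80), rnorm(20, mean = 50))   # 20% of points far above the median

w_plain <- biweight_weights(x, maxPOutliers = 1)     # no broadening
w_broad <- biweight_weights(x, maxPOutliers = 0.05)  # cap near 5% per side

frac_zero_plain <- mean(w_plain == 0)
frac_zero_broad <- mean(w_broad == 0)
```

With maxPOutliers = 1 the helper leaves the default width untouched; with a smaller value, fewer points on the heavy side end up with zero weight.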
Details
This function implements biweight midcorrelation calculation (see references). If y is not supplied, midcorrelation of columns of x will be calculated; otherwise, the midcorrelation between columns of x and y will be calculated. Thus, bicor(x) is equivalent to bicor(x,x) but is more efficient.
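As a rough illustration of what is being computed, the biweight midcorrelation of two vectors can be written out in base R. bicor_sketch is a hypothetical helper, not part of WGCNA; it assumes the raw median absolute deviation (constant = 1) and ignores missing data, maxPOutliers and the Pearson fallback. The optional last part calls the real function only if WGCNA is installed.

```r
# Biweight midcorrelation of two numeric vectors, written out in base R.
# Assumption: mad() is the raw median absolute deviation (constant = 1).
bicor_sketch <- function(x, y) {
  standardize <- function(v) {
    centered <- v - median(v)
    u <- centered / (9 * mad(v, constant = 1))
    w <- ifelse(abs(u) < 1, (1 - u^2)^2, 0)   # zero weight for far-out points
    a <- centered * w
    a / sqrt(sum(a^2))                        # scale to unit length
  }
  sum(standardize(x) * standardize(y))
}

set.seed(42)
x <- rnorm(50)
y <- x + rnorm(50, sd = 0.5)
y_out <- y
y_out[1] <- 100                      # a single gross outlier

r_clean   <- bicor_sketch(x, y)
r_outlier <- bicor_sketch(x, y_out)  # the outlier receives zero weight
p_clean   <- cor(x, y)
p_outlier <- cor(x, y_out)           # Pearson, for comparison

# Optional: the package function, if WGCNA is available
if (requireNamespace("WGCNA", quietly = TRUE)) {
  m <- cbind(a = x, b = y, c = y_out)
  r_self <- WGCNA::bicor(m)       # correlations among the columns of m
  r_pair <- WGCNA::bicor(m, m)    # same values, computed less efficiently
}
```

The single outlier moves the sketch's result far less than it moves the Pearson correlation, which is the motivation for the robust measure.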
The options robustX, robustY allow the user to revert the calculation to standard correlation calculation. This is important, for example, if any of the variables is binary (or, more generally, discrete) as in such cases the robust methods produce meaningless results. If both robustX, robustY are set to FALSE, the function calculates the standard Pearson correlation (but is slower than the function cor).
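A small example may help show why a binary variable is a problem for the robust calculation. The data below are made up for illustration; the bicor calls use only arguments listed in the Usage section and run only if WGCNA is installed.

```r
set.seed(7)
x <- rnorm(40)
y_bin <- rep(c(0, 1), times = c(30, 10))   # unbalanced binary variable

# More than half of the values coincide with the median, so the raw MAD is
# zero and the biweight scale 9 * mad(y) cannot be formed.
mad_bin <- mad(y_bin, constant = 1)

if (requireNamespace("WGCNA", quietly = TRUE)) {
  # Robust treatment for x, standard (Pearson-style) treatment for binary y
  r_hybrid <- WGCNA::bicor(x, y_bin, robustY = FALSE)
  # With both switches off the result should agree with cor()
  r_pearson <- WGCNA::bicor(x, y_bin, robustX = FALSE, robustY = FALSE)
}
```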
The argument quick specifies the precision of handling of missing data in the correlation calculations. The value quick = 0 will cause all calculations to be executed accurately, which may be significantly slower than calculations without missing data. Progressively higher values will speed up the calculations but introduce progressively larger errors. Without missing data, all column medians and median absolute deviations (MADs) can be pre-calculated before the covariances are calculated. When missing data are present, exact calculations require the column medians and MADs to be calculated for each covariance. The approximate calculation uses the pre-calculated median and MAD and simply ignores missing data in the covariance calculation. If the number of missing data is high, the pre-calculated medians and MADs may be very different from the actual ones, thus potentially introducing large errors. The quick value times the number of rows specifies the maximum difference in the number of missing entries for median and MAD calculations on the one hand and covariance on the other hand that will be tolerated before a recalculation is triggered. The hope is that if only a few missing data are treated approximately, the error introduced will be small but the potential speedup can be significant.
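The trade-off controlled by quick can be tried on a small matrix with a few missing entries. The data and the value quick = 0.05 are arbitrary choices for illustration; the bicor calls run only if WGCNA is installed.

```r
set.seed(1)
expr <- matrix(rnorm(200 * 5), nrow = 200,
               dimnames = list(NULL, paste0("g", 1:5)))
expr[sample(length(expr), 10)] <- NA   # sprinkle in a few missing entries

quick <- 0.05
# Largest tolerated difference in missing-entry counts before the medians
# and MADs are recalculated, as described in the text: quick * number of rows
max_missing_diff <- quick * nrow(expr)

if (requireNamespace("WGCNA", quietly = TRUE)) {
  r_exact <- WGCNA::bicor(expr, use = "pairwise.complete.obs", quick = 0)
  r_fast  <- WGCNA::bicor(expr, use = "pairwise.complete.obs", quick = quick)
  # Size of the approximation error introduced by the faster setting
  max_abs_diff <- max(abs(r_exact - r_fast))
}
```

On data this small any speed difference is negligible; the comparison is only meant to show how the size of the approximation error can be checked.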
The choice "all" for pearsonFallback is not fully implemented in the sense that there are rare but possible cases in which the calculation is equivalent to "individual". This may happen if the use option is set to "pairwise.complete.obs" and the missing data are arranged such that each individual mad is non-zero, but when two columns are analyzed together, the missing data from both columns may make a mad zero. In such a case, the calculation is treated as Pearson, but other columns will be treated as bicor.
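The zero-mad situation that pearsonFallback addresses is easy to construct. In the sketch below the data are made up, and mad is again assumed to be the raw median absolute deviation; according to the argument description, "none" should yield NA for the affected correlation while "individual" falls back to Pearson for that column. The bicor calls run only if WGCNA is installed.

```r
set.seed(3)
x_flat <- c(rep(0, 30), rnorm(20))   # more than half the values are identical
y <- rnorm(50)

# The median is 0 and more than half of the absolute deviations are 0,
# so the raw MAD is zero.
mad_flat <- mad(x_flat, constant = 1)

if (requireNamespace("WGCNA", quietly = TRUE)) {
  # bicor may warn about the zero MAD; the warning is silenced here
  r_none <- suppressWarnings(
    WGCNA::bicor(x_flat, y, pearsonFallback = "none"))
  r_individual <- suppressWarnings(
    WGCNA::bicor(x_flat, y, pearsonFallback = "individual"))
}
```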
Value
A matrix of biweight midcorrelations. Dimnames on the result are set appropriately.
Author(s)
Peter Langfelder
References
Fast R Functions for Robust Correlations and Hierarchical Clustering. Journal of Statistical Software, 46(11), 1-17. http://www.jstatsoft.org/v46/i11/
http://www.unt.edu/benchmarks/archives/2001/december01/rss.htm
1977, pp. 203-209.
When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).