R WGCNA package: populationMeansInAdmixture() function help documentation

loveR · posted 2012-10-1 21:30:30

populationMeansInAdmixture(WGCNA)
populationMeansInAdmixture() belongs to R package: WGCNA

Estimate the population-specific mean values in an admixed population.

Originally machine-translated into Chinese by LoveR, the 生物统计家园网 robot.

----------Description----------

Uses the expression values from an admixed population, together with estimates of the sub-population proportions, to estimate the population-specific mean values. For example, this function can be used to estimate cell type specific mean gene expression values from the expression values of a mixture of cells. The method is described in Shen-Orr et al (2010), where it was used to estimate cell type specific gene expression.

----------Usage----------

populationMeansInAdmixture(datProportions, datE.Admixture, scaleProportionsTo1 = TRUE,
scaleProportionsInCelltype = TRUE, setMissingProportionsToZero = FALSE)

----------Arguments----------

Argument: datProportions
a matrix of non-negative numbers (ideally proportions) whose rows correspond to the samples (the rows of datE.Admixture) and whose columns correspond to the sub-populations of the mixture. The function calculates a mean expression value for each column of datProportions. Negative entries in datProportions lead to an error message. The rows of datProportions do not have to sum to 1; see the argument scaleProportionsTo1.
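
The requirements on datProportions can be checked before calling the function. The following is a minimal base R sketch; the helper name checkProportions is hypothetical and is not part of WGCNA. It only mirrors the documented constraints (one row per sample, no negative entries) and reports whether the rows already sum to 1.

```r
# Hypothetical helper (not part of WGCNA): checks the documented
# requirements on datProportions before the estimation step.
checkProportions <- function(datProportions, datE.Admixture) {
  datProportions <- as.matrix(datProportions)
  # Rows of both inputs must describe the same samples.
  if (nrow(datProportions) != nrow(datE.Admixture))
    stop("datProportions and datE.Admixture must have the same number of rows")
  # Negative entries are not allowed.
  if (any(datProportions < 0, na.rm = TRUE))
    stop("datProportions contains negative entries")
  # Rows need not sum to 1; report whether they do.
  rowsSumTo1 <- isTRUE(all.equal(unname(rowSums(datProportions)),
                                 rep(1, nrow(datProportions))))
  invisible(rowsSumTo1)
}

# Toy inputs: 2 samples, 2 sub-populations, 3 genes.
props <- matrix(c(0.5, 0.5,
                  0.2, 0.8), nrow = 2, byrow = TRUE)
expr  <- matrix(rnorm(6), nrow = 2)
ok <- checkProportions(props, expr)
```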

Argument: datE.Admixture
a matrix of numbers. The rows correspond to samples (mixtures of populations). The columns contain the variables (e.g. genes) for which the means should be estimated.

Argument: scaleProportionsTo1
logical. If set to TRUE (default) then the proportions in each row of datProportions are scaled so that they sum to 1, i.e. datProportions[i,]=datProportions[i,]/sum(datProportions[i,]). In general, we recommend setting it to TRUE.
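
The row scaling described for scaleProportionsTo1 can be illustrated in base R. This is a sketch of the idea only, not the package's internal code; the matrix counts is made-up data.

```r
# Unscaled, non-negative values: rows are samples, columns are sub-populations.
counts <- matrix(c(16, 8, 4, 2,
                    3, 3, 2, 2), nrow = 2, byrow = TRUE)

# Divide each row by its own sum so that every row sums to 1.
# A matrix divided by a vector of length nrow recycles down the columns,
# so element [i, j] is divided by the i-th row sum.
scaled <- counts / rowSums(counts, na.rm = TRUE)

rowSums(scaled)
```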

Argument: scaleProportionsInCelltype
logical. If set to TRUE (default) then the proportions within each cell type are rescaled so that their mean is 0.

Argument: setMissingProportionsToZero
logical. Default is FALSE. If set to TRUE then missing values in datProportions are set to zero.
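
What setting missing proportions to zero means can be shown on a small made-up matrix. This base R sketch illustrates the documented effect and is not taken from the package source.

```r
# Proportions with one missing entry (sample 1, sub-population 2).
props <- matrix(c(0.6, NA, 0.4,
                  0.5, 0.3, 0.2), nrow = 2, byrow = TRUE)

# Replace every missing value by zero, leaving the original untouched.
propsZero <- props
propsZero[is.na(propsZero)] <- 0
```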

----------Details----------

The function outputs a matrix of coefficients obtained by fitting a regression model. If the proportions sum to 1, then the i-th row of the output matrix reports the coefficients of the following model: lm(datE.Admixture[,i]~.-1,data=datProportions). As an aside, the minus 1 in the formula indicates that no intercept term is fit. Under certain assumptions, the coefficients can be interpreted as the mean expression values in the sub-populations (Shen-Orr et al 2010).
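
The per-gene regression described above can be sketched in base R with lm() alone. This is an illustration under stated assumptions, not the WGCNA implementation: the data are simulated, and the names trueMeans and estMeans are made up for this example.

```r
set.seed(1)
nSamples <- 20

# Simulated proportions of two sub-populations; each row sums to 1.
p1 <- runif(nSamples)
datProportions <- data.frame(popA = p1, popB = 1 - p1)

# Made-up true sub-population means for two genes (rows: genes).
trueMeans <- matrix(c(2, 0,
                      1, 3), nrow = 2, byrow = TRUE,
                    dimnames = list(c("gene1", "gene2"), c("popA", "popB")))

# Mixed expression = proportions %*% t(means) + small noise.
datE.Admixture <- as.matrix(datProportions) %*% t(trueMeans) +
  matrix(rnorm(nSamples * 2, sd = 0.05), nrow = nSamples)

# One regression without intercept per gene (column of datE.Admixture).
estMeans <- t(sapply(seq_len(ncol(datE.Admixture)), function(i) {
  y <- datE.Admixture[, i]
  coef(lm(y ~ . - 1, data = datProportions))
}))
rownames(estMeans) <- colnames(datE.Admixture)

round(estMeans, 2)
```

With low noise the estimated coefficients should lie close to trueMeans, which is the sense in which they can be read as sub-population means.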

----------Value----------

a numeric matrix whose rows correspond to the columns of datE.Admixture (e.g. to genes) and whose columns correspond to the columns of datProportions (e.g. sub-populations or cell types).
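
The layout of the returned matrix (genes in rows, sub-populations in columns) can be reproduced with a single base R lm() call on a matrix response. This sketch uses random data and made-up names (cellA, gene1, and so on); it shows the orientation only and is not how the package computes its result.

```r
set.seed(2)
nSamples <- 12

# Random proportions for three hypothetical cell types; rows sum to 1.
raw <- matrix(runif(nSamples * 3), nrow = nSamples,
              dimnames = list(NULL, c("cellA", "cellB", "cellC")))
datProportions <- as.data.frame(raw / rowSums(raw))

# Random expression values for five hypothetical genes.
datE.Admixture <- matrix(rnorm(nSamples * 5), nrow = nSamples,
                         dimnames = list(NULL, paste0("gene", 1:5)))

# lm() accepts a matrix response; coef() is then sub-populations x genes.
fit <- lm(datE.Admixture ~ . - 1, data = datProportions)

# Transpose to get the documented layout: genes x sub-populations.
est <- t(coef(fit))
dim(est)
```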

----------Note----------

This can be considered a wrapper of the lm function.

----------Author(s)----------

Steve Horvath, Chaochao Cai

----------References----------

Shen-Orr SS, Tibshirani R, Khatri P, Bodian DL, Staedtler F, Perry NM, Hastie T, Sarwal MM, Davis MM, Butte AJ (2010) Cell type-specific gene expression differences in complex tissues. Nature Methods, vol 7 no.4

----------Examples----------

library(WGCNA)  # provides populationMeansInAdmixture() and verboseScatterplot()
set.seed(1)
# this is the number of complex (mixed) tissue samples, e.g. arrays
m=10
# true count data (e.g. pure cells in the mixed sample)
datTrueCounts=as.matrix(data.frame(TrueCount1=rpois(m,lambda=16),
TrueCount2=rpois(m,lambda=8),TrueCount3=rpois(m,lambda=4),
TrueCount4=rpois(m,lambda=2)))
no.pure=dim(datTrueCounts)[[2]]

# now we transform the counts into proportions
divideBySum=function(x) t(x)/sum(x)
datProportions= t(apply(datTrueCounts,1,divideBySum))
dimnames(datProportions)[[2]]=paste("TrueProp",1:dim(datTrueCounts)[[2]],sep=".")

# number of genes that are highly expressed in each pure population
no.genesPerPure=rep(5, no.pure)
no.genes= sum(no.genesPerPure)
GeneIndicator=rep(1:no.pure, no.genesPerPure)
# true mean values of the genes in the pure populations
# in the end we hope to estimate them from the mixed samples
datTrueMeans0=matrix( rnorm(no.genes*no.pure,sd=.3), nrow= no.genes,ncol=no.pure)
for (i in 1:no.pure ){
datTrueMeans0[GeneIndicator==i,i]= datTrueMeans0[GeneIndicator==i,i]+1
}
dimnames(datTrueMeans0)[[1]]=paste("Gene",1:dim(datTrueMeans0)[[1]],sep="." )
dimnames(datTrueMeans0)[[2]]=paste("MeanPureCellType",1:dim(datTrueMeans0)[[2]],sep=".")
# plot.mat(datTrueMeans0)
# simulate the (expression) values of the admixed population samples

noise=matrix(rnorm(m*no.genes,sd=.1),nrow=m,ncol= no.genes)
datE.Admixture= as.matrix(datProportions) %*% t(datTrueMeans0) + noise
dimnames(datE.Admixture)[[1]]=paste("MixedTissue",1:m,sep=".")

datPredictedMeans=populationMeansInAdmixture(datProportions,datE.Admixture)

par(mfrow=c(2,2))
for (i in 1:4 ){
verboseScatterplot(datPredictedMeans[,i],datTrueMeans0[,i],
xlab="predicted mean",ylab="true mean",main="all populations")
abline(0,1)
}

# assume we only study 2 populations (i.e. we ignore the others)
selectPopulations=c(1,2)
datPredictedMeansTooFew=populationMeansInAdmixture(datProportions[,selectPopulations],datE.Admixture)

par(mfrow=c(2,2))
for (i in 1:length(selectPopulations) ){
verboseScatterplot(datPredictedMeansTooFew[,i],datTrueMeans0[,selectPopulations[i]],
xlab="predicted mean",ylab="true mean",main="too few populations")
abline(0,1)
}

# assume we erroneously add a population
datProportionsTooMany=data.frame(datProportions,WrongProp=sample(datProportions[,1]))
datPredictedMeansTooMany=populationMeansInAdmixture(datProportionsTooMany,datE.Admixture)

par(mfrow=c(2,2))
for (i in 1:4 ){
  verboseScatterplot(datPredictedMeansTooMany[,i],datTrueMeans0[,i],
  xlab="predicted mean",ylab="true mean",main="too many populations")
  abline(0,1)
}

When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).

