proportionsInAdmixture(WGCNA)
R package of proportionsInAdmixture(): WGCNA
Estimate the proportion of pure populations in an admixed population based on marker expression
Translator: LoveR (translation robot of 生物统计家园网)
Description
Assume that datE.Admixture provides the expression values from a mixture of cell types (admixed population) and you want to estimate the proportion of each pure cell type in the mixed samples (rows of datE.Admixture). The function allows you to do this as long as you provide a data frame MarkerMeansPure that reports the mean expression values of markers in each of the pure cell types.
Usage
proportionsInAdmixture(MarkerMeansPure, datE.Admixture, calculateConditionNumber = FALSE, coefToProportion = TRUE)
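As a rough illustration of the input shapes this signature expects, the sketch below builds toy data and calls the function only if WGCNA is installed. The gene names, cell type names, and numeric values are all made up for illustration and are not taken from any real data set.

```r
# Hypothetical toy data: 6 marker genes, 3 pure cell types, 4 admixed samples.
markers <- paste0("gene", 1:6)
pureMeans <- matrix(c(10, 1, 2,
                       8, 3, 1,
                       1, 7, 3,
                       2, 9, 1,
                       1, 2, 9,
                       3, 1, 6),
                    ncol = 3, byrow = TRUE,
                    dimnames = list(NULL, c("typeA", "typeB", "typeC")))

# First column: marker names; remaining columns: mean expression per pure type.
MarkerMeansPure <- data.frame(marker = markers, pureMeans,
                              stringsAsFactors = FALSE)

# Assumed true mixing proportions (rows = admixed samples, each row sums to 1).
trueProp <- rbind(c(0.5, 0.3, 0.2),
                  c(0.1, 0.1, 0.8),
                  c(1/3, 1/3, 1/3),
                  c(0.7, 0.2, 0.1))

# Noise-free admixed expression: rows = samples, columns = genes.
datE.Admixture <- as.data.frame(trueProp %*% t(pureMeans))
colnames(datE.Admixture) <- markers

# Call the function only when the package is available.
if (requireNamespace("WGCNA", quietly = TRUE)) {
  res <- WGCNA::proportionsInAdmixture(MarkerMeansPure, datE.Admixture,
                                       calculateConditionNumber = TRUE,
                                       coefToProportion = TRUE)
  print(round(res$PredictedProportions, 2))
}
```

Because the toy mixture is noise-free, the predicted proportions should be close to trueProp; real data will generally deviate.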
Arguments
Argument: MarkerMeansPure
is a data frame whose first column reports the name of the marker and whose remaining columns report the mean values of the markers in each of the pure populations. The function will estimate the proportion of pure cells corresponding to columns 2 through dim(MarkerMeansPure)[[2]] of MarkerMeansPure. Rows that contain missing values (NA) will be removed.
Argument: datE.Admixture
is a data frame of expression data, e.g. the columns of datE.Admixture could correspond to thousands of genes. The rows of datE.Admixture correspond to the admixed samples for which the function estimates the proportions of pure populations. Some of the markers specified in the first column of MarkerMeansPure should correspond to column names of datE.Admixture.
Argument: calculateConditionNumber
logical. Default is FALSE. If set to TRUE, the function uses kappa to calculate the condition number of the matrix MarkerMeansPure[,-1]. This allows one to determine whether the linear model for estimating the proportions is well specified. Type help(kappa) to learn more. By default, kappa() computes (an estimate of) the 2-norm condition number of a matrix or of the R matrix of a QR decomposition, perhaps of a linear fit.
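To give a feel for what the condition number indicates, the following sketch applies base R's kappa() to two made-up matrices of pure-population means. The matrices and their names are hypothetical; the point is only that nearly collinear cell type profiles produce a much larger condition number than well-separated ones.

```r
# Hypothetical marker means (rows = markers, columns = pure cell types).
# Well-separated profiles: each marker is high in one type only.
separated <- cbind(typeA = c(10, 9, 1, 2),
                   typeB = c(1, 2, 8, 9))

# Nearly collinear profiles: the two types look almost identical.
collinear <- cbind(typeA = c(10, 9, 1, 2),
                   typeB = c(10.1, 9.1, 1.0, 2.1))

# kappa() estimates the 2-norm condition number by default.
kappaSeparated <- kappa(separated)
kappaCollinear <- kappa(collinear)

print(c(separated = kappaSeparated, collinear = kappaCollinear))
```

A large value suggests that the columns of MarkerMeansPure[,-1] are hard to tell apart, so the estimated coefficients may be unstable.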
Argument: coefToProportion
logical. By default, it is set to TRUE. When estimating the proportions the function fits a multivariate linear model. Ideally, the coefficients of the linear model correspond to the proportions in the admixed samples. But sometimes the coefficients take on negative values or do not sum to 1. If coefToProportion=TRUE then negative coefficients will be set to 0 and the remaining coefficients will be scaled so that they sum to 1.
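The conversion described for coefToProportion=TRUE can be sketched in base R as follows. This is a minimal illustration of the stated rule (set negative coefficients to 0, then rescale each sample so the coefficients sum to 1), not the package's actual source; the coefficient values are invented.

```r
# Hypothetical raw coefficients (rows = admixed samples, columns = pure types).
datCoef <- rbind(sample1 = c(typeA = 0.6, typeB = -0.1, typeC = 0.7),
                 sample2 = c(typeA = 0.2, typeB = 0.3,  typeC = 0.5))

# Step 1: set negative coefficients to 0.
clipped <- datCoef
clipped[clipped < 0] <- 0

# Step 2: rescale each row so that it sums to 1.
# Dividing a matrix by a vector of length nrow recycles down the columns,
# so row i is divided by its own row sum.
PredictedProportions <- clipped / rowSums(clipped)

print(round(PredictedProportions, 3))
```

A row whose coefficients are all non-positive would yield a division by zero under this simple rule; the sketch does not handle that case.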
Details
The methods implemented in this function were motivated by the gene expression deconvolution approach described by Abbas et al. (2009), Lu et al. (2003), and Wang et al. (2006). This approach can be used to predict the proportions of (pure) cells in a complex tissue, e.g. the proportion of blood cell types in whole blood.
To define the markers, you may need expression data from pure populations. You can then define markers based on a significant t-test or ANOVA across the pure populations, and use the pure population data to estimate the corresponding mean expression values. The array platforms and normalization methods used for the admixed data (datE.MarkersAdmixtureTranspose) and for MarkerMeansPure should be comparable. For Affymetrix data, we have successfully used the function on untransformed MAS5 data.
For statisticians: to estimate the proportions, we use the coefficients of a linear model. Specifically:
datCoef = t(lm(datE.MarkersAdmixtureTranspose ~ MarkerMeansPure[,-1])$coefficients[-1,])
Here datE.MarkersAdmixtureTranspose is, as its name suggests, the transposed marker columns of datE.Admixture (markers in rows, admixed samples in columns). datCoef is a matrix whose rows correspond to the mixed samples (rows of datE.Admixture) and whose columns correspond to the pure populations (e.g. cell types), i.e. the columns of MarkerMeansPure[,-1]. More details can be found in Abbas et al. (2009).
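The linear model above can be tried out directly with base R's lm on simulated data. In this sketch the marker means and mixing proportions are invented, and the admixed expression is generated without noise so that the fitted coefficients recover the assumed proportions.

```r
# Hypothetical pure-population means (rows = markers, columns = cell types).
pureMeans <- cbind(typeA = c(10, 8, 1, 2, 1, 3),
                   typeB = c(1, 3, 7, 9, 2, 1),
                   typeC = c(2, 1, 3, 1, 9, 6))

# Assumed true proportions (rows = admixed samples).
trueProp <- rbind(s1 = c(0.5, 0.3, 0.2),
                  s2 = c(0.1, 0.1, 0.8))

# Markers in rows, admixed samples in columns (noise-free mixture).
datE.MarkersAdmixtureTranspose <- pureMeans %*% t(trueProp)

# One regression per admixed sample; lm accepts a matrix response.
fit <- lm(datE.MarkersAdmixtureTranspose ~ pureMeans)

# Drop the intercept row and transpose: rows = samples, columns = cell types.
datCoef <- t(coef(fit)[-1, ])

print(round(datCoef, 3))
```

With real, noisy data the coefficients need not be non-negative or sum to 1, which is what the coefToProportion argument addresses.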
Value
A list with the following components:
Component: PredictedProportions
data frame that contains the predicted proportions. The rows of PredictedProportions correspond to the admixed samples, i.e. the rows of datE.Admixture. The columns of PredictedProportions correspond to the pure populations, i.e. the columns of MarkerMeansPure[,-1].
Component: datCoef
data frame of numbers that is analogous to PredictedProportions. In general, datCoef will only be different from PredictedProportions if coefToProportion=TRUE. See the description of coefToProportion.
Component: conditionNumber
This is the condition number resulting from the kappa function. See the description of calculateConditionNumber.
Component: markersUsed
vector of character strings that contains the subset of marker names (specified in the first column of MarkerMeansPure) that match column names of datE.Admixture and that contain non-missing pure mean values.
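The selection rule behind markersUsed can be illustrated with a few lines of base R. This is a sketch of the rule as described (drop marker rows with missing pure means, then keep the markers that match column names of datE.Admixture), not the package's internal code; all gene names and values are made up.

```r
# Hypothetical marker table; g2 has a missing pure mean.
MarkerMeansPure <- data.frame(
  marker = c("g1", "g2", "g3", "g4"),
  typeA  = c(5, NA, 1, 2),
  typeB  = c(1, 4, 6, 3),
  stringsAsFactors = FALSE
)

# Hypothetical admixture data; g4 is not measured, g9 is not a marker.
datE.Admixture <- data.frame(g1 = c(3, 4), g3 = c(2, 5), g9 = c(7, 7))

# Keep only marker rows without missing pure mean values.
completeRows <- complete.cases(MarkerMeansPure[, -1])
pure <- MarkerMeansPure[completeRows, ]

# Keep the markers that also appear as column names of the admixture data.
markersUsed <- intersect(pure$marker, colnames(datE.Admixture))

print(markersUsed)
```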
Note
This function can be considered a wrapper of the lm function.
Author(s)
Steve Horvath, Chaochao Cai
References
Abbas et al. (2009) [...] Systemic Lupus Erythematosus. PLoS ONE 4(7): e6098. doi:10.1371/journal.pone.0006098
Lu et al. (2003) [...] reinterpretation of DNA microarray data reveals dynamic changes in cell populations. Proc Natl Acad Sci U S A 100: 10370-10375.
Wang et al. (2006) [...] deconvolution in a complex mammalian organ. BMC Bioinformatics 7: 328.
See Also
lm, kappa
When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).