
R pdmclass package: help documentation for the pdmGenes() function

Posted on 2012-2-26 10:59:46
pdmGenes(pdmclass)
pdmGenes() belongs to the R package: pdmclass

A Function to output the Top Ranked Genes from a Penalized

Translator: LoveR, the robot of 生物统计家园网

----------Description----------

After fitting a classifier, it is often desirable to output the most "interesting" genes for further validation. This function outputs the top n genes that discriminate each class from the baseline class, along with an estimate of the stability of the observed rankings (see Details for more information).


----------Usage----------


pdmGenes(formula = formula(data), method = c("pls", "pcr", "ridge"),
         data = sys.frame(sys.parent()), weights, theta, dimension = J - 1,
         eps = .Machine$double.eps, genelist = NULL, list.length = NULL,
         B = 100, ...)



----------Arguments----------

Argument: formula
A symbolic description of the model to be fit. Details are given below.


Argument: method
One of "pls", "pcr", or "ridge", corresponding to partial least squares, principal components regression, and ridge regression.


Argument: data
An optional data.frame that contains the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which this function is called. Note that, unlike most microarray analyses, in this case rows are samples and columns are genes.


Argument: weights
An optional vector of sample weights. Defaults to 1.


Argument: theta
An optional matrix of class scores, typically with fewer than J - 1 columns.


Argument: dimension
The dimension of the solution, no greater than J - 1, where J is the number of classes. Defaults to J - 1.


Argument: eps
A threshold for excluding small discriminant variables. Defaults to .Machine$double.eps.


Argument: genelist
A vector of gene names, one per gene.


Argument: list.length
The number of 'top' genes to output.


Argument: B
The number of bootstrap samples to use for estimating stability. Defaults to 100. Larger values may take an inordinate amount of time.


Argument: ...
Additional parameters to pass to method.


----------Details----------

The formula interface follows the usual R convention, namely Y ~ X, where Y is a numeric vector of class assignments and X is a matrix or data.frame containing the gene expression values. Note that, unlike most microarray analyses, here the columns of X are genes and the rows are samples, so most calls will require something similar to Y ~ t(X).
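The orientation requirement can be sketched with simulated data. Everything here is hypothetical: the matrix `expr`, the labels `y`, and the choice of `list.length = 10` and `B = 20` are illustrative only. The call to pdmGenes() uses only arguments listed under Usage, is guarded so that it runs only when pdmclass is installed, and has not been checked against any particular package version.

```r
set.seed(1)

# Hypothetical expression matrix in the usual microarray layout:
# genes in rows, samples in columns.
expr <- matrix(rnorm(50 * 12), nrow = 50, ncol = 12,
               dimnames = list(paste0("gene", 1:50),
                               paste0("sample", 1:12)))

# Numeric class assignments, one per sample (three classes of four samples).
y <- rep(1:3, each = 4)

# Transpose so that rows are samples and columns are genes.
x <- t(expr)
stopifnot(nrow(x) == length(y))

# Assumption: pdmclass is installed and accepts a matrix on the right-hand
# side of the formula, as the Details text suggests.
if (requireNamespace("pdmclass", quietly = TRUE)) {
  top <- pdmclass::pdmGenes(y ~ x, method = "pls",
                            genelist = colnames(x),
                            list.length = 10, B = 20)
}
```

A small B keeps the sketch quick; the documented default is 100.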

The dimension of the solution is typically J - 1, where J is the number of classes. The model fit uses contr.treatment contrasts, which means that every coefficient in the model compares a given class to a baseline class. Therefore, the genes listed are those that discriminate between a given class and the baseline. For instance, if there are three classes (coded as a numeric vector of 1s, 2s, and 3s), there will be two sets of 'top genes'. The first set contains the genes that discriminate between class 2 and class 1, and the second set contains the genes that discriminate between class 3 and class 1. The 'Y' vector therefore needs to be constructed to give the comparisons of interest.
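The baseline coding described above can be inspected with base R alone. This is a minimal sketch: the recoding step assumes that the class labelled 1 is treated as the baseline, as in the three-class example, and the `recode` lookup vector is a hypothetical helper, not part of pdmclass.

```r
# Three classes coded 1, 2, 3, as in the example above.
y <- c(1, 1, 2, 2, 3, 3)

# Treatment contrasts for three levels: J - 1 = 2 columns, and the
# baseline level (row "1") is all zeros.
cm <- contr.treatment(3)
print(cm)

# To compare classes 1 and 2 against class 3 instead, relabel the classes
# so that the old class 3 becomes the new class 1 (the assumed baseline).
recode <- c("3" = 1, "1" = 2, "2" = 3)
y_new <- unname(recode[as.character(y)])
print(y_new)
```

With the relabelled vector, the two comparisons would be old class 1 versus old class 3 and old class 2 versus old class 3.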


----------Value----------

A list containing a data.frame for each comparison. The first column of each data.frame contains the gene names, and the second column contains the frequency with which the gene was observed in the bootstrapped samples.
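To make the shape of one such data.frame concrete, here is a base R sketch of a bootstrap stability estimate for a single two-class comparison. It is not the pdmclass algorithm: the ranking score (absolute difference in class means) is swapped in for the penalized discriminant fit, the stratified resampling is an assumption, and the names `score`, `top_n`, and `stability` are hypothetical. Whether pdmGenes() reports counts or proportions is not stated in this page; the sketch uses proportions.

```r
set.seed(42)

n_genes <- 30
n_per_class <- 10

# Simulated data: rows are samples, columns are genes.
x <- matrix(rnorm(2 * n_per_class * n_genes), ncol = n_genes,
            dimnames = list(NULL, paste0("gene", seq_len(n_genes))))
y <- rep(1:2, each = n_per_class)
x[y == 2, 1:3] <- x[y == 2, 1:3] + 3  # genes 1-3 truly differ between classes

# Stand-in ranking score: absolute difference in class means per gene.
score <- function(x, y) {
  abs(colMeans(x[y == 2, , drop = FALSE]) - colMeans(x[y == 1, , drop = FALSE]))
}
top_n <- function(x, y, n) {
  names(sort(score(x, y), decreasing = TRUE))[seq_len(n)]
}

list_length <- 5
B <- 100
observed <- top_n(x, y, list_length)
hits <- setNames(numeric(list_length), observed)

for (b in seq_len(B)) {
  # Resample samples with replacement within each class.
  idx <- unlist(lapply(split(seq_along(y), y),
                       function(i) sample(i, replace = TRUE)))
  boot_top <- top_n(x[idx, , drop = FALSE], y[idx], list_length)
  hits[observed] <- hits[observed] + (observed %in% boot_top)
}

# Gene names in the first column, bootstrap frequency in the second.
stability <- data.frame(gene = observed, frequency = as.numeric(hits) / B)
print(stability)
```

Genes that stay in the top list across most resamples get a frequency near 1, which is the sense in which the frequency estimates ranking stability.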


----------Author(s)----------


James W. MacDonald and Debashis Ghosh. Partial least squares
and principal components regression based on code written by
Mike Denham and contributed to StatLib. Model fit based on code from
the mda package written by Trevor Hastie and Robert Tibshirani
and ported to R by Kurt Hornik, Brian D. Ripley, and Friedrich Leisch.




When reposting, please credit the source: 生物统计家园网 (http://www.biostatistic.net).


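Notes:
Note 1: For the convenience of learners, this document was translated by LoveR, the robot of 生物统计家园网. It is intended only as a personal reference for learning R; 生物统计家园 retains the copyright.
Note 2: Because the translation is automatic, inaccuracies are unavoidable. Carefully comparing the Chinese and English text while reading can help with learning R.
Note 3: If you find an inaccuracy, please reply to this thread, and we will revise the document over time.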