gIndex(rms)
gIndex() belongs to R package: rms
Calculate Total and Partial g-indexes for an rms Fit
Translator: 生物统计家园网 robot LoveR
----------Description----------
gIndex computes the total g-index for a model based on the vector of linear predictors, and the partial g-index for each predictor in a model. The latter is computed by summing all the terms involving each variable, weighted by their regression coefficients, then computing Gini's mean difference on this sum. For example, a regression model having age and sex and age*sex on the right hand side, with corresponding regression coefficients b1, b2, b3 will have the g-index for age computed from Gini's mean difference on the product of age times (b1 + b3*w) where w is an indicator set to one for observations with sex not equal to the reference value. When there are nonlinear terms associated with a predictor, these terms will also be combined.
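The age example above can be sketched in base R without fitting a model. The data and the coefficient values `b1` and `b3` below are hypothetical, and `gini_md` is an illustrative brute-force helper, not the package's `GiniMd`.

```r
# Hypothetical data and coefficients, chosen only for illustration
set.seed(2)
n   <- 30
age <- round(runif(n, 20, 60))
sex <- sample(c("female", "male"), n, replace = TRUE)
b1  <- 0.10    # assumed coefficient for age
b3  <- -0.04   # assumed coefficient for age*sex

# Brute-force Gini's mean difference: mean |x_i - x_j| over all i != j
gini_md <- function(x) {
  n <- length(x)
  sum(abs(outer(x, x, "-"))) / (n * (n - 1))
}

w <- as.numeric(sex != "female")   # 1 when sex differs from the reference level
age_terms <- age * (b1 + b3 * w)   # all terms involving age, weighted by coefficients
partial_g_age <- gini_md(age_terms)
partial_g_age
```

The sex main effect (coefficient b2) is left out because it does not involve age.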
A print method is defined, and there is a plot method for displaying g-indexes using a dot chart.
A basic function GiniMd computes Gini's mean difference on a numeric vector. This index is defined as the mean absolute difference between any two distinct elements of a vector. For a Bernoulli (binary) variable with proportion of ones equal to p and sample size n, Gini's mean difference is 2np(1-p)/(n-1). For a trinomial variable (e.g., predicted values for a 3-level categorical predictor using two dummy variables) having (predicted) values A, B, C with corresponding proportions a, b, c, Gini's mean difference is 2n[ab|A-B|+ac|A-C|+bc|B-C|]/(n-1).
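The closed forms quoted above follow from the definition by counting ordered pairs of unequal values. The derivation is written out below as a worked check; the symbol g for the index is notation introduced here.

```latex
% Definition: mean absolute difference over all ordered pairs of distinct elements
g = \frac{1}{n(n-1)} \sum_{i \ne j} |x_i - x_j|

% Bernoulli case: np ones and n(1-p) zeros.
% There are 2 (np)(n(1-p)) ordered pairs with unequal values, each contributing 1.
g = \frac{2\,(np)\,\bigl(n(1-p)\bigr)}{n(n-1)} = \frac{2np(1-p)}{n-1}

% Trinomial case: na, nb, nc observations with values A, B, C.
g = \frac{2\left[(na)(nb)|A-B| + (na)(nc)|A-C| + (nb)(nc)|B-C|\right]}{n(n-1)}
  = \frac{2n\left[ab|A-B| + ac|A-C| + bc|B-C|\right]}{n-1}
```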
----------Usage----------
gIndex(object, partials=TRUE, type=c('ccterms', 'cterms', 'terms'),
       lplabel=if(length(object$scale) && is.character(object$scale))
         object$scale[1] else 'X*Beta',
       fun, funlabel=if(missing(fun)) character(0) else
         deparse(substitute(fun)),
       postfun=if(length(object$scale)==2) exp else NULL,
       postlabel=if(length(postfun))
         ifelse(missing(postfun),
                if((length(object$scale) > 1) &&
                   is.character(object$scale)) object$scale[2] else
                  'Anti-log',
                deparse(substitute(postfun))) else character(0),
       ...)
## S3 method for class 'gIndex'
print(x, digits=4, abbrev=FALSE,
      vnames=c("names","labels"), ...)
## S3 method for class 'gIndex'
plot(x, what=c('pre', 'post'),
     xlab=NULL, pch=16, rm.totals=FALSE,
     sort=c('descending', 'ascending', 'none'), ...)
GiniMd(x, na.rm=FALSE)
----------Arguments----------
Argument: object
result of an rms fitting function
Argument: partials
set to FALSE to suppress computation of partial g-indexes
Argument: type
defaults to 'ccterms', which causes partial discrimination indexes to be computed after maximally combining all related main effects and interactions. This is usually the only way that makes sense when considering partial linear predictors. Specify type='cterms' to combine a main effect only with the interactions containing it, and not with other main effects connected through interactions. Use type='terms' to separate interactions into their own effects.
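The three groupings can be illustrated in base R for a model of the form y ~ x1*x2 + x3. This is a sketch of one reading of the description above, not the package's internal code: the coefficients are assumed values, `gini_md` is a hypothetical brute-force helper, and the row label "x1 * x2" is illustrative.

```r
# Hypothetical predictors and coefficients for the model y ~ x1*x2 + x3
set.seed(3)
n  <- 50
x1 <- runif(n); x2 <- runif(n); x3 <- runif(n)
b1 <- 1.0; b2 <- 0.5; b12 <- -0.8; b3 <- 0.7   # assumed values, not estimates

# Brute-force Gini's mean difference
gini_md <- function(x) {
  n <- length(x)
  sum(abs(outer(x, x, "-"))) / (n * (n - 1))
}

# type='terms': every term, including the interaction, is its own effect
terms_g <- c(x1 = gini_md(b1 * x1), x2 = gini_md(b2 * x2),
             "x1 * x2" = gini_md(b12 * x1 * x2), x3 = gini_md(b3 * x3))

# type='cterms': each main effect absorbs the interactions that contain it
cterms_g <- c(x1 = gini_md(b1 * x1 + b12 * x1 * x2),
              x2 = gini_md(b2 * x2 + b12 * x1 * x2),
              x3 = gini_md(b3 * x3))

# type='ccterms': x1 and x2 are connected through x1*x2, so they form one effect
ccterms_g <- c("x1, x2" = gini_md(b1 * x1 + b2 * x2 + b12 * x1 * x2),
               x3 = gini_md(b3 * x3))

terms_g; cterms_g; ccterms_g
```

Because x3 takes part in no interaction, its partial g is the same under all three groupings.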
Argument: lplabel
a replacement for default values such as "X*Beta" or "log odds"
Argument: fun
an optional function to transform the linear predictors before computing the total (only) g. When this is present, a new component gtrans is added to the attributes of the object returned by gIndex.
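As a rough base-R illustration of what fun does, the sketch below applies plogis to a hypothetical vector of log-odds before taking Gini's mean difference. The vector `lp` and the helper `gini_md` are assumptions made for this example; how gIndex stores the result in gtrans is not shown.

```r
set.seed(4)
lp <- rnorm(100, mean = -0.5, sd = 1.2)   # hypothetical linear predictors (log odds)

# Brute-force Gini's mean difference
gini_md <- function(x) {
  n <- length(x)
  sum(abs(outer(x, x, "-"))) / (n * (n - 1))
}

g_lp   <- gini_md(lp)           # g on the linear predictor scale
g_prob <- gini_md(plogis(lp))   # g after transforming to the probability scale
c(g_lp = g_lp, g_prob = g_prob)
```

On the probability scale the index is a typical absolute difference in predicted probability between two observations.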
Argument: funlabel
a character string label for fun; otherwise taken from the function name itself
Argument: postfun
a function to transform g, such as exp (anti-log), which is the default for certain models such as the logistic and Cox models
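In contrast to fun, postfun acts on the g-index itself. Below is a minimal base-R sketch, assuming hypothetical log-odds `lp` and an illustrative helper `gini_md`; it only mirrors the description above and does not reproduce gIndex output.

```r
set.seed(5)
lp <- rnorm(100, sd = 1.5)   # hypothetical linear predictors on the log-odds scale

# Brute-force Gini's mean difference
gini_md <- function(x) {
  n <- length(x)
  sum(abs(outer(x, x, "-"))) / (n * (n - 1))
}

g       <- gini_md(lp)   # typical difference in log odds between two observations
g_ratio <- exp(g)        # the same quantity expressed on the ratio (anti-log) scale
c(g = g, g_ratio = g_ratio)
```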
Argument: postlabel
a label for postfun
Argument: ...
For gIndex, passed to predict.rms. Ignored for print. Passed to dotchart2 for plot.
Argument: x
an object created by gIndex (for print or plot) or a numeric vector (for GiniMd)
Argument: digits
number of decimal places to which printed values are rounded
Argument: abbrev
set to TRUE to abbreviate labels if vnames="labels"
Argument: vnames
set to "labels" to print predictor labels instead of names
Argument: what
set to "post" to plot the transformed g-index if there is one (e.g., ratio scale)
Argument: xlab
x-axis label; constructed by default
Argument: pch
plotting character for points
Argument: rm.totals
set to TRUE to remove the total g-index when plotting
Argument: sort
specifies how to sort predictors by g-index; default is in descending order going down the dot chart
Argument: na.rm
set to TRUE if you suspect there may be NAs in x; these will then be removed. Otherwise an error will result.
----------Details----------
Stratification factors in a Cox proportional hazards model contribute no variation to a partial g, except through terms that interact with the stratification variable.
----------Value----------
gIndex returns a matrix of class "gIndex" with auxiliary information, such as variable labels, stored as attributes. GiniMd returns a scalar.
----------Author(s)----------
Frank Harrell
Department of Biostatistics
Vanderbilt University
f.harrell@vanderbilt.edu
----------See Also----------
predict.rms
----------Examples----------
library(rms)   # provides gIndex, GiniMd, ols, lrm and datadist
set.seed(1)
n <- 40
x <- 1:n
w <- factor(sample(c('a','b'), n, TRUE))
u <- factor(sample(c('A','B'), n, TRUE))
y <- .01*x + .2*(w=='b') + .3*(u=='B') + .2*(w=='b' & u=='B') + rnorm(n)/5
dd <- datadist(x,w,u); options(datadist='dd')
f <- ols(y ~ x*w*u, x=TRUE, y=TRUE)
f
anova(f)
z <- list()
for(type in c('terms','cterms','ccterms'))
{
  zc <- predict(f, type=type)
  cat('type:', type, '\n')
  print(zc)
  z[[type]] <- zc
}
# Test GiniMd against a brute-force solution
gmd <- function(x)
{
  n <- length(x)
  sum(outer(x, x, function(a, b) abs(a - b)))/n/(n-1)
}
zc <- z$cterms
gmd(zc[, 1])
GiniMd(zc[, 1])
GiniMd(zc[, 2])
GiniMd(zc[, 3])
GiniMd(f$linear.predictors)
g <- gIndex(f)
g
g['Total',]
gIndex(f, partials=FALSE)
gIndex(f, type='cterms')
gIndex(f, type='terms')
z <- c(rep(0,17), rep(1,6))
n <- length(z)
GiniMd(z)
2*mean(z)*(1-mean(z))*n/(n-1)
a <- 12; b <- 13; c <- 7; n <- a + b + c
A <- -.123; B <- -.707; C <- 0.523
xx <- c(rep(A, a), rep(B, b), rep(C, c))
GiniMd(xx)
2*(a*b*abs(A-B) + a*c*abs(A-C) + b*c*abs(B-C))/n/(n-1)
y <- y > .8
f <- lrm(y ~ x * w * u, x=TRUE, y=TRUE)
gIndex(f, fun=plogis, funlabel='Prob[y=1]')
# Manual calculation of the combined main effect + interaction effect of
# sex in a 2x2 design with treatments A B, sexes F M,
# model -.1 + .3*(treat=='B') + .5*(sex=='M') + .4*(treat=='B' & sex=='M')
set.seed(1)
X <- expand.grid(treat=c('A','B'), sex=c('F', 'M'))
a <- 3; b <- 7; c <- 13; d <- 5
X <- rbind(X[rep(1, a),], X[rep(2, b),], X[rep(3, c),], X[rep(4, d),])
y <- with(X, -.1 + .3*(treat=='B') + .5*(sex=='M') + .4*(treat=='B' & sex=='M'))
f <- ols(y ~ treat*sex, data=X, x=TRUE)
gIndex(f, type='cterms')
k <- coef(f)
b1 <- k[2]; b2 <- k[3]; b3 <- k[4]
n <- nrow(X)
( (a+b)*c*abs(b2) + (a+b)*d*abs(b2+b3) + c*d*abs(b3))/(n*(n-1)/2 )
# Manual calculation for combined age effect in a model with sex,
# age, and age*sex interaction
a <- 13; b <- 7
sex <- c(rep('female',a), rep('male',b))
agef <- round(runif(a, 20, 30))
agem <- round(runif(b, 20, 40))
age <- c(agef, agem)
y <- (sex=='male') + age/10 - (sex=='male')*age/20
f <- ols(y ~ sex*age, x=TRUE)
f
gIndex(f, type='cterms')
k <- coef(f)
b1 <- k[2]; b2 <- k[3]; b3 <- k[4]
n <- a + b
sp <- function(w, z=w) sum(outer(w, z, function(u, v) abs(u-v)))
( abs(b2)*sp(agef) + abs(b2+b3)*sp(agem) + 2*sp(b2*agef, (b2+b3)*agem) ) / (n*(n-1))
( abs(b2)*GiniMd(agef)*a*(a-1) + abs(b2+b3)*GiniMd(agem)*b*(b-1) +
2*sp(b2*agef, (b2+b3)*agem) ) / (n*(n-1))
## Not run:
# Compare partial and total g-indexes over many random fits
plot(NA, NA, xlim=c(0,3), ylim=c(0,3), xlab='Global',
     ylab='x1, x2 (black) x3 (green) x4 (blue)')
abline(a=0, b=1, col=gray(.9))
big <- integer(3)
n <- 50   # try with n=7 - see lots of exceptions esp. for interacting var
for(i in 1:100) {
  x1 <- runif(n)
  x2 <- runif(n)
  x3 <- runif(n)
  x4 <- runif(n)
  y <- x1 + x2 + x3 + x4 + 2*runif(n)
  f <- ols(y ~ x1*x2+x3+x4, x=TRUE)
  # f <- ols(y ~ x1+x2+x3+x4, x=TRUE)   # also try this
  w <- gIndex(f)[,1]
  gt <- w['Total']
  points(gt, w['x1, x2'])
  points(gt, w['x3'], col='green')
  points(gt, w['x4'], col='blue')
  big[1] <- big[1] + (w['x1, x2'] > gt)
  big[2] <- big[2] + (w['x3'] > gt)
  big[3] <- big[3] + (w['x4'] > gt)
}
print(big)
## End(Not run)
options(datadist=NULL)