R optmatch package: help documentation for the mdist() function

loveR · posted 2012-9-23 23:16:50

mdist(optmatch)
mdist() belongs to the R package optmatch

Create matching distances

Translator: LoveR, the robot of 生物统计家园网

----------Description----------

A generic function, with several supplied methods, for creating matrices of distances between observations to be used in the matching process. Given these matrices, pairmatch() or fullmatch() can determine optimal matches.

----------Usage----------

  mdist(x, structure.fmla = NULL, ...)

----------Arguments----------

Argument: x
The object used as the basis for forming the matching distance. Methods exist for formulas, functions, and fitted generalized linear models.

Argument: structure.fmla
A formula with the treatment variable on the left-hand side and an optional grouping expression on the right-hand side. For example, z ~ 1 indicates no grouping, while z ~ s computes distances only within the subsets of the data formed by s. See the method notes below for additional formula options.

Argument: ...
Additional method arguments. Most methods require a data argument.
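
To make the grouping expression concrete, the base R sketch below mimics what a formula such as z ~ s asks for: distances are computed separately within each level of s, giving one matrix per subgroup. The data frame `toy`, its columns, and the helper `within_group_dist` are invented for illustration and are not part of optmatch; mdist itself may organize its result differently.

```r
# Hypothetical toy data: z = treatment indicator, s = grouping factor, x = covariate
toy <- data.frame(
  z = c(1, 0, 0, 1, 0, 0),
  s = c("a", "a", "a", "b", "b", "b"),
  x = c(2.0, 1.5, 3.0, 5.0, 4.0, 7.0)
)
rownames(toy) <- paste0("u", seq_len(nrow(toy)))

# Illustrative helper (not an optmatch function): absolute differences in x
# between treated units (rows) and control units (columns) of one subgroup.
within_group_dist <- function(d) {
  trt <- d[d$z == 1, , drop = FALSE]
  ctl <- d[d$z == 0, , drop = FALSE]
  m <- abs(outer(trt$x, ctl$x, `-`))
  dimnames(m) <- list(rownames(trt), rownames(ctl))
  m
}

# One distance matrix per level of s, analogous to grouping with z ~ s
dists <- lapply(split(toy, toy$s), within_group_dist)
dists
```

No distance is ever computed between a unit in subgroup "a" and a unit in subgroup "b", which is the practical effect of the grouping expression.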

----------Details----------

The mdist generic provides three ways to construct a matching distance (i.e., a distance matrix or a suitably organized list of such matrices): guided by a function, by a fitted model, or by a formula. The class of the first argument given to mdist determines which of these methods is invoked.

The mdist.function method takes a function of two arguments. When called, this function receives the treatment observations as its first argument and the control observations as its second. As an example, the following computes the absolute differences between values of t1 for treatment units (here, nuclear plants with pr==1) and controls (here, plants with pr==0), returning the result as a distance matrix: sdiffs <- function(treatments, controls) {    abs(outer(treatments$t1, controls$t1, `-`))    }
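
As a rough illustration of what such a function returns, the sketch below calls sdiffs by hand in base R. The data frame `plants` is a made-up stand-in for nuclearplants, and the sketch assumes each argument arrives as a data frame holding the rows of the corresponding group; how mdist actually splits and passes the data is not shown here.

```r
# Hypothetical stand-in for nuclearplants: pr = treatment indicator,
# t1 = the covariate being compared.
plants <- data.frame(
  pr = c(1, 1, 0, 0, 0),
  t1 = c(14, 10, 13, 11, 18)
)
rownames(plants) <- c("A", "B", "C", "D", "E")

sdiffs <- function(treatments, controls) {
  abs(outer(treatments$t1, controls$t1, `-`))
}

# Split by hand into the two data frames the function expects
treated  <- plants[plants$pr == 1, ]
controls <- plants[plants$pr == 0, ]

# Rows index treated units, columns index controls
d <- sdiffs(treated, controls)
dimnames(d) <- list(rownames(treated), rownames(controls))
d
```

The result has one row per treated unit and one column per control, which is the shape a matching distance needs.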

The mdist.function method does much the same job as the earlier optmatch function makedist, although its interface is somewhat different.

The mdist.formula method computes the squared Mahalanobis distance between observations, with the right-hand side of the formula determining which variables contribute to the distance. If matching is to be done within strata, the stratification can be specified either through the structure.fmla argument (e.g. ~ grp) or as part of the main formula (e.g. z ~ x1 + x2 | grp).
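
For readers who want to see the quantity itself, here is a minimal base R sketch of squared Mahalanobis distances between treated and control units, using stats::mahalanobis. The simulated data frame `toy` is hypothetical, and using the plain sample covariance of all units is an assumption made for illustration; the covariance estimate that mdist.formula uses internally may differ.

```r
set.seed(1)
n <- 12
# Hypothetical data: z = treatment indicator, x1 and x2 = matching covariates
toy <- data.frame(
  z  = rep(c(1, 0), times = c(4, 8)),
  x1 = rnorm(n),
  x2 = rnorm(n)
)

X <- as.matrix(toy[, c("x1", "x2")])
S <- cov(X)  # assumption: sample covariance over all units

trt <- X[toy$z == 1, , drop = FALSE]
ctl <- X[toy$z == 0, , drop = FALSE]

# mahalanobis() returns squared distances from each row of its first
# argument to `center`; apply over treated rows, then transpose so that
# rows are treated units and columns are controls.
sq_dist <- t(apply(trt, 1, function(row) mahalanobis(ctl, center = row, cov = S)))
dim(sq_dist)
```

Taking the square root of such a matrix gives distances on the original (unsquared) Mahalanobis scale.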

The mdist.glm method takes an object of class glm as its first argument. It assumes that this object is a fitted propensity model, extracts distances on the linear propensity score (the logits of the estimated conditional probabilities) and, by default, rescales the distances by the reciprocal of the pooled s.d. of the treatment-group and control-group propensity scores. (For resistance to outliers, the scaling uses mad by default; to use the actual s.d. instead, or to skip rescaling entirely, set the standardization.scale argument to sd or NULL, respectively.)

The mdist.bigglm method works analogously with bigglm objects, created by the bigglm function from the package "biglm", which can handle bigger data sets than the ordinary glm function can. In contrast with mdist.glm, it requires additional data and structure.fmla arguments. (If you have enough data to need bigglm, you will probably have to subgroup before matching to avoid memory problems, so you would need the structure.fmla argument anyway.)
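
The rescaling described above can be sketched in base R as follows. The simulated data, the helper `pooled_scale`, and in particular the pooling rule (each group's squared spread weighted by its degrees of freedom) are assumptions made for illustration; they are not taken from the optmatch source and may not reproduce its numbers exactly.

```r
set.seed(2)
n <- 40
# Hypothetical data: two covariates and a treatment indicator z
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
toy$z <- rbinom(n, 1, plogis(0.5 * toy$x1 - 0.25 * toy$x2))

fit <- glm(z ~ x1 + x2, family = binomial(), data = toy)
lp <- predict(fit)  # default type = "link": the linear propensity score (logit scale)

is_trt <- toy$z == 1

# Illustrative helper (not an optmatch function): pooled spread of x across
# the two groups, using scale_fun (mad by default, sd as an alternative).
pooled_scale <- function(x, trt, scale_fun = mad) {
  n1 <- sum(trt)
  n0 <- sum(!trt)
  sqrt(((n1 - 1) * scale_fun(x[trt])^2 + (n0 - 1) * scale_fun(x[!trt])^2) /
         (n1 + n0 - 2))
}

# Absolute differences in linear propensity score, divided by the pooled spread
ps_dist <- abs(outer(lp[is_trt], lp[!is_trt], `-`)) / pooled_scale(lp, is_trt)
dim(ps_dist)
```

Passing `scale_fun = sd` to the helper corresponds to the non-robust scaling mentioned in the text, and skipping the division corresponds to no rescaling.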

----------Value----------

An object of class optmatch.dlist, suitable for use as the distance argument to fullmatch or pairmatch. For more information, see pscore.dist.

----------Author(s)----------

Mark M. Fredrickson

----------References----------

P. R. Rosenbaum and D. B. Rubin (1985), ‘Constructing a control group using multivariate matched sampling methods that incorporate the propensity score’, The American Statistician, 39 33–38.

----------See Also----------

makedist, mahal.dist, fullmatch, pairmatch

----------Examples----------

library(optmatch)  # provides mdist, fullmatch, pairmatch, caliper and the nuclearplants data
data(nuclearplants)
mdist.examples <- list()
### Propensity score distances.
### Recommended approach:
(aGlm <- glm(pr~.-(pr+cost), family=binomial(), data=nuclearplants))
mdist.examples$ps1 <- mdist(aGlm)
### A second approach: first extract propensity scores, then separately
### create a distance from them.  (Useful when importing propensity
### scores from an external program.)
plantsPS <- predict(aGlm)
mdist.examples$ps2 <- mdist(pr~plantsPS, data=nuclearplants)^(1/2)
### Full matching on the propensity score.
fullmatch(mdist.examples$ps1)
fullmatch(mdist.examples$ps2)
### Because mdist.glm uses robust estimates of spread,
### the results differ in detail -- but they are close enough
### to yield similar optimal matches.
all(fullmatch(mdist.examples$ps1)==fullmatch(mdist.examples$ps2)) # The same

### Mahalanobis distance:
mdist.examples$mh1 <- mdist(pr ~ t1 + t2, data = nuclearplants)

### Absolute differences on a scalar:

sdiffs <- function(treatments, controls) {
  abs(outer(treatments$t1, controls$t1, `-`))
}

(absdist <- mdist(sdiffs, structure.fmla = pr ~ 1, data = nuclearplants))

### Pair matching on the variable `t1`:
pairmatch(absdist)

### Propensity score matching within subgroups:
mdist.examples$ps3 <- mdist(aGlm, structure.fmla=~pt)
fullmatch(mdist.examples$ps3)

### Propensity score matching with a propensity score caliper:
mdist.examples$pscal <- mdist.examples$ps1 + caliper(1,mdist.examples$ps1)
fullmatch(mdist.examples$pscal) # Note that the caliper excludes some units

### A Mahalanobis distance for matching within subgroups:
mdist.examples$mh2 <- mdist(pr ~ t1 + t2 | pt, data = nuclearplants)
all.equal(mdist.examples$mh2,
      mdist(pr ~ t1 + t2, structure.fmla = ~ pt, data = nuclearplants))

### Mahalanobis matching within subgroups, with a propensity score
### caliper:
fullmatch(mdist.examples$mh2 + caliper(1, mdist.examples$ps3))

