The difference between odds and odds ratio in logistic regression

biostatistic · posted on 2013-4-7 10:03:02

Unfortunately, the language used to describe statistical terms is not used uniformly across fields. One example of this is odds and odds ratio. Economists especially refer to what others call the odds as the odds ratio. Below, we will be careful to define our terms.

Proof that the estimated odds ratio is constant in logistic regression
Let there be a binary outcome y; we will say y=0 or y=1, and let us assume that

      Pr(y==1) = F(Xb)
where X and b are vectors and F() is some cumulative distribution function.

If F() is the standard normal cumulative distribution function, we have the probit estimator.

If F() is the logistic distribution, we have the logit (logistic) estimator.
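
As a rough illustration of the two choices of F(), here is a minimal Python sketch using only the standard library. The function names normal_cdf and logistic_cdf and the sample values of Xb are made up for this example; they are not part of any estimation package.

```python
import math

def normal_cdf(xb):
    """Standard normal CDF: the F() behind the probit estimator."""
    return 0.5 * (1.0 + math.erf(xb / math.sqrt(2.0)))

def logistic_cdf(xb):
    """Logistic CDF: the F() behind the logit estimator."""
    return math.exp(xb) / (1.0 + math.exp(xb))

# Hypothetical values of the linear index Xb.
for xb in (-2.0, 0.0, 2.0):
    print(xb, normal_cdf(xb), logistic_cdf(xb))
```

Both functions map any real Xb into a probability between 0 and 1, and both give 0.5 at Xb = 0; they differ in how fast the tails approach 0 and 1.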

The cumulative distribution function of the logistic distribution is

      F(Xb) = exp(Xb) / [1 + exp(Xb)]
Thus,

      Pr(y==1) = exp(Xb) / [1 + exp(Xb)]
Let us write p for Pr(y==1)

      p = exp(Xb) / [1 + exp(Xb)]
The odds p/(1−p) is therefore

        p           exp(Xb) / [1 + exp(Xb)]           exp(Xb) / [1 + exp(Xb)]
      -----  =  -------------------------------  =  -------------------------
      1 - p      1 - exp(Xb) / [1 + exp(Xb)]            1 / [1 + exp(Xb)]

             =  exp(Xb)
Many authors present this formula as

      log( p/[1-p] ) = Xb
which also means

      p / (1-p) = exp(Xb)
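
The identity can be checked numerically. The sketch below is only an illustration: the helper names prob and odds and the value Xb = 0.7 are arbitrary choices for the example.

```python
import math

def prob(xb):
    """p = exp(Xb) / [1 + exp(Xb)]"""
    return math.exp(xb) / (1.0 + math.exp(xb))

def odds(p):
    """Odds p / (1 - p) for a probability p."""
    return p / (1.0 - p)

xb = 0.7  # hypothetical value of the linear index Xb
p = prob(xb)

print("p        =", p)
print("odds     =", odds(p))            # agrees with exp(Xb)
print("exp(Xb)  =", math.exp(xb))
print("log odds =", math.log(odds(p)))  # recovers Xb
```

Up to floating-point rounding, the odds equal exp(Xb) and the log odds equal Xb.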
The language here is sometimes confusing because some authors call this the odds ratio. In plain English, they are correct: it is the odds, and the odds are based on a ratio calculation. It is not, however, the odds ratio that is meant when results are reported.

The odds ratio when results are reported refers to the ratio of two odds or, if you prefer, the ratio of two odds ratios.

That is, let us write

      o(Xb) = exp(Xb)
The odds ratio is

      o(evaluated at one place)
      -------------------------
      o(evaluated at another)
In particular, we want to consider the ratio of the odds for a one-unit change in one of the components of X. Let us now write

      Xb = b0 + b1*x1 + b2*x2 + ... + bk*xk
Let us arbitrarily consider what is called the odds ratio for x1:

      o(b0 + b1*(x1+1) + b2*x2 + ... + bk*xk)
      ---------------------------------------
      o(b0 + b1*x1    + b2*x2 + ... + bk*xk)

         o(b0 + b1*x1 + b2*x2 + ... + bk*xk + b1)
      = ----------------------------------------
         o(b0 + b1*x1 + b2*x2 + ... + bk*xk)
Now, remember, o() = exp(), so

         exp(b0 + b1*x1 + b2*x2 + ... + bk*xk + b1)
      = ------------------------------------------
         exp(b0 + b1*x1 + b2*x2 + ... + bk*xk)

         exp(b0 + b1*x1 + b2*x2 + ... + bk*xk) * exp(b1)
      = -----------------------------------------------
         exp(b0 + b1*x1 + b2*x2 + ... + bk*xk)

      = exp(b1)
This is the standard result. The ratio of the odds for a one-unit increase in xi is exp(bi).

This ratio is constant: it does not depend on the values of any of the x's, including the starting value of xi itself, because they all cancel out in the calculation.
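
A small numerical check of this result, as a sketch only: the coefficients b0, b1, b2 and the evaluation points below are invented for illustration and do not come from any fitted model.

```python
import math

# Hypothetical coefficients b0, b1, b2 (not estimated from data).
b0, b1, b2 = -1.0, 0.5, 0.25

def odds(x1, x2):
    """o(Xb) = exp(b0 + b1*x1 + b2*x2)"""
    return math.exp(b0 + b1 * x1 + b2 * x2)

def odds_ratio_x1(x1, x2):
    """Ratio of the odds at x1 + 1 to the odds at x1, holding x2 fixed."""
    return odds(x1 + 1, x2) / odds(x1, x2)

# Evaluate at several arbitrary points.
for x1, x2 in [(0, 0), (1, 3), (-2, 10)]:
    print(x1, x2, odds(x1, x2), odds_ratio_x1(x1, x2))

print("exp(b1) =", math.exp(b1))
```

The odds column changes from point to point, while the odds-ratio column stays at exp(b1) up to floating-point rounding.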

Be careful about language:

1. This is called the odds ratio; it is called that because it is the ratio of two odds.
2. Some people call the odds the odds ratio because the odds itself is a ratio. That is fine English, but this can quickly lead to confusion. If you did that, you would have to call this calculation the odds ratio ratio or the ratio of the odds ratios.

It is the language, and not the math, that leads to the confusion. When we say that in a logistic model, the odds ratio is constant, we mean

      o(evaluated at one point)
      --------------------------    is constant.
      o(evaluated somewhere else)
We do not mean that

      o(evaluated at one point)       is constant.
(that is, we do not mean the odds are constant).

In logistic regression, the odds ratio is the constant exp(bi).
The odds are exp(Xb).
The log odds are Xb.
