Odds Ratio

Consider the following contingency table for a retrospective study:

| | Disease (Yes) | Disease (No) | Total |
|---|---|---|---|
| Treatment 1 | a | b | a+b |
| Treatment 2 | c | d | c+d |

There are two statistics of interest:

  • Relative Risk (RR): $\frac{a/(a+b)}{c/(c+d)}$

  • Odds Ratio (OR): $\frac{a/b}{c/d} = \frac{ad}{bc}$

Relative risk is easy to understand: it is the ratio of the disease probabilities in the two treatment groups. But why care about the odds ratio?

The answer depends on the study design:

Retrospective Study

A retrospective study collects a fixed number of random samples from each treatment group (a design more commonly called a cohort study).

Below is a retrospective study example: we collect 1000 random samples from people with BMI > 30 and 1000 from people with BMI < 30, and see how many in each group have diabetes.

| | Diabetes (Yes) | Diabetes (No) | Total |
|---|---|---|---|
| BMI > 30 | 350 | 650 | 1000 |
| BMI < 30 | 100 | 900 | 1000 |

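In retrospective studies, the relative risk (RR) makes sense: the row totals are fixed by design, so the proportion in each row estimates the disease risk in that group. Here RR $= \frac{350/1000}{100/1000} = 3.5$.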
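As a quick check of the two definitions on the BMI table, here is a minimal Python sketch. The function names `relative_risk` and `odds_ratio` are illustrative choices, not part of any library.

```python
def relative_risk(a, b, c, d):
    """Risk in row 1 divided by risk in row 2 for the table [[a, b], [c, d]]."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """Odds in row 1 divided by odds in row 2 for the table [[a, b], [c, d]]."""
    return (a / b) / (c / d)


# BMI table from the text: rows are BMI > 30 and BMI < 30,
# columns are diabetes (yes) and diabetes (no).
rr = relative_risk(350, 650, 100, 900)
or_ = odds_ratio(350, 650, 100, 900)
print(f"RR = {rr:.2f}, OR = {or_:.2f}")
```

Diabetes is common in both groups here, so the OR comes out noticeably larger than the RR.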

Case-Control Study

A case-control study collects a fixed number of random samples from each disease status (with and without the disease).

Case-control studies are usually used when the disease is rare. Consider a type of cancer with a 0.1% occurrence rate in the treatment 1 population. To get a decent number of positive patients, say 10, we would expect to need 10000 samples, which costs a lot of time and money. So we instead sample a fixed number of observations directly from the diseased population (e.g. from a hospital), and see how many of them are under treatment 1.

Below is a case-control study example: we collect 200 random samples from those with lung cancer and 800 from those without, and see how many in each group are in treatment 1 (live in the city):

| | Lung cancer (Yes) | Lung cancer (No) |
|---|---|---|
| City | 150 | 500 |
| Countryside | 50 | 300 |
| Total | 200 | 800 |

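The individual risks and the relative risk (RR) make no sense here, because the ratio of cases to controls is fixed by the design rather than by the population. The RR computed from this table is $\frac{150/(150+500)}{50/(50+300)} \approx 1.62$; but if we had collected only 80 controls instead of 800 (50 from the city and 30 from the countryside, the same proportions), it would become $\frac{150/(150+50)}{50/(50+30)} = 1.2$. In other words, the RR is NOT consistent across sample sizes.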

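But the odds ratio works in this case: it is $\frac{150/500}{50/300} = 1.8$ for the table above and $\frac{150/50}{50/30} = 1.8$ when the controls are cut to 50 and 30. Multiplying the control column by any constant cancels in $\frac{a/b}{c/d}$, so the OR does not depend on how many controls we sample.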
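The comparison can be reproduced with a short Python sketch. It keeps the 150 and 50 cases fixed and evaluates both statistics for the original controls (500 city, 300 countryside) and for the tenfold smaller control group used in the second formula above (50 and 30). Exact fractions avoid rounding noise; the function names are illustrative.

```python
from fractions import Fraction


def relative_risk(a, b, c, d):
    """Exact RR for the table [[a, b], [c, d]]."""
    return Fraction(a, a + b) / Fraction(c, c + d)


def odds_ratio(a, b, c, d):
    """Exact OR for the table [[a, b], [c, d]]."""
    return Fraction(a, b) / Fraction(c, d)


cases_city, cases_country = 150, 50

# Two control samples with the same city/countryside proportions.
for controls_city, controls_country in [(500, 300), (50, 30)]:
    rr = relative_risk(cases_city, controls_city, cases_country, controls_country)
    or_ = odds_ratio(cases_city, controls_city, cases_country, controls_country)
    n_controls = controls_city + controls_country
    print(f"controls={n_controls}: RR={float(rr):.3f}, OR={float(or_):.3f}")
```

The RR moves with the number of controls while the OR stays at 9/5 in both runs.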

Further, if the disease rate is very low in both groups, then $a \ll b$ and $c \ll d$, so $a/(a+b) \approx a/b$ and $c/(c+d) \approx c/d$, and the OR approximates the RR well.
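One way to make the approximation explicit is to write the disease probabilities as $p_1 = a/(a+b)$ and $p_2 = c/(c+d)$ (symbols introduced here for illustration), so that the odds in each group are $p_i/(1-p_i)$:

$$
\mathrm{OR} = \frac{p_1/(1-p_1)}{p_2/(1-p_2)} = \frac{p_1}{p_2} \cdot \frac{1-p_2}{1-p_1} = \mathrm{RR} \cdot \frac{1-p_2}{1-p_1}
$$

When both $p_1$ and $p_2$ are close to 0, the correction factor $\frac{1-p_2}{1-p_1}$ is close to 1 and OR $\approx$ RR. In the BMI example the disease is not rare ($p_1 = 0.35$, $p_2 = 0.10$), so the factor is $0.90/0.65 \approx 1.38$ and the OR overstates the RR by that much.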

Conclusion

For retrospective studies, use RR (OR can also be used).

For case-control studies, use OR. If the disease probability is low, the OR can also be used to approximate the RR.