## Introduction

The fourth component of the M.A.A.R.I.E. framework is the results or analysis section. Like the previous components, results require us to address three key questions (1):

- Estimation: What is the magnitude or strength of the association or relationship observed in the investigation?
- Inference: What statistical technique(s) are used to perform statistical significance testing?
- Adjustment: What statistical technique(s) are used to take into account or control for differences between the study and control groups that may affect the results?

## Estimation

### Strength of Relationship[1]–[3]

When measuring the strength of a relationship using data from samples, we are attempting to use that information to estimate the strength of the relationship within a larger group called a population. Thus, biostatisticians often refer to any measurement of the strength of a relationship as an estimate or point estimate. The data from the samples are said to estimate the population’s effect size, which is the magnitude of the association or the difference in the larger population.<sup>2.1</sup>

When we measure the strength of a relationship, we usually need to define what we mean by the independent variable(s) and the dependent variable. In general, a dependent variable is the one primary outcome or end point that we wish to estimate based on one or more independent variables. Let us take a look at how we can measure the strength of the relationship between birth control pills, an independent variable, and thrombophlebitis, a dependent variable.<sup>2.2</sup> First, we will look at the basic measure of the strength of an association that is most frequently used in cohort studies. Then we will turn to the basic measure used in case–control studies.

Let us assume that we are studying the association between birth control pills and thrombophlebitis. We want to measure the strength of the association to determine how strongly the use of birth control pills affects the risk of thrombophlebitis. Before we do this we must first clarify the concept of risk.

When used quantitatively, risk implies the probability of developing a condition over a specified period. Risk equals the number of individuals who develop the condition divided by the total number of individuals who were possible candidates to develop the condition at the beginning of the period. In assessing the 10-year risk of developing thrombophlebitis, we would divide the number of women taking birth control pills who developed thrombophlebitis over a 10-year period by the total number of women in the study group who were taking birth control pills.<sup>2.3</sup>

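A further calculation is necessary to measure the relative degree of association between birth control pills and thrombophlebitis, comparing women who are on birth control pills with women who are not. One such measure is known as relative risk. Relative risk is the probability of thrombophlebitis if birth control pills are used divided by the probability if birth control pills are not used. It is defined as follows:

$$
\text{Relative risk} = \frac{\text{Probability of thrombophlebitis if birth control pills are used}}{\text{Probability of thrombophlebitis if birth control pills are not used}}
$$

Generally,

$$
\text{Relative risk} = \frac{\text{Probability of the outcome if the risk factor is present}}{\text{Probability of the outcome if the risk factor is absent}}
$$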

Let us illustrate how the risk and relative risk are calculated using a hypothetical example:

#### Mini-Study 2.1

For 10 years, an investigator monitored 1,000 young women taking birth control pills and 1,000 young women who were nonusers. He found that 30 of the women on birth control pills developed thrombophlebitis over the 10-year period, whereas only 3 of the nonusers developed thrombophlebitis over the same period. He presented his data using what is called a 2 × 2 table:

| | Thrombophlebitis | No Thrombophlebitis | Total |
|---|---|---|---|
| Birth control pills | a = 30 | b = 970 | a + b = 1,000 |
| No birth control pills | c = 3 | d = 997 | c + d = 1,000 |

The 10-year risk of developing thrombophlebitis on birth control pills equals the number of women on the pill who develop thrombophlebitis divided by the total number of women on the pill at the beginning of the study. Thus, the risk of developing thrombophlebitis for women on birth control pills is equal to

$$
\frac{a}{a + b} = \frac{30}{1{,}000} = 0.030
$$

Likewise, the 10-year risk of developing thrombophlebitis for women not on the pill equals the number of women not on the pill who develop thrombophlebitis divided by the total number of women not on the pill at the beginning of the study. Thus, the risk of developing thrombophlebitis for women not on the pill is equal to

$$
\frac{c}{c + d} = \frac{3}{1{,}000} = 0.003
$$

The relative risk equals the ratio of these two risks:

$$
\text{Relative risk} = \frac{a/(a + b)}{c/(c + d)} = \frac{0.030}{0.003} = 10
$$

A relative risk of 1 implies that the use of birth control pills does not increase the risk of thrombophlebitis. This relative risk of 10 implies that, on average, women on the pill have a risk of thrombophlebitis 10 times that of women not on the pill. (2,4)<sup>2.4</sup>
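The arithmetic of Mini-Study 2.1 can be sketched in a few lines of Python. This is only an illustration: the function names `risk` and `relative_risk` are ours, not part of the text, and the cell labels a, b, c, d follow the 2 × 2 table above.

```python
def risk(events: int, total: int) -> float:
    """Risk = number who develop the condition / number at risk at the start."""
    if total <= 0:
        raise ValueError("total must be positive")
    return events / total


def relative_risk(a: int, b: int, c: int, d: int) -> float:
    """Relative risk from a cohort 2 x 2 table: [a/(a+b)] / [c/(c+d)].

    Undefined (division by zero) when no one in the unexposed row
    develops the condition, i.e. when c == 0.
    """
    return risk(a, a + b) / risk(c, c + d)


# Mini-Study 2.1: a = 30, b = 970, c = 3, d = 997
risk_pill = risk(30, 30 + 970)        # risk for women on the pill
risk_no_pill = risk(3, 3 + 997)       # risk for women not on the pill
rr = relative_risk(30, 970, 3, 997)   # ratio of the two risks

print(f"risk on pill = {risk_pill:.3f}")
print(f"risk off pill = {risk_no_pill:.3f}")
print(f"relative risk = {rr:.1f}")
```

The printed values match the hand calculation: risks of 0.030 and 0.003, and a relative risk of 10.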

Now let us look at how we measure the strength of association for case–control studies by looking at a study of the association between birth control pills and thrombophlebitis:

#### Mini-Study 2.2

An investigator selected 100 young women with thrombophlebitis and 100 young women without thrombophlebitis. She carefully obtained the history of prior use of birth control pills. She found that 90 of the 100 women with thrombophlebitis were using birth control pills compared with 45 of the women without thrombophlebitis. She presented her data using the following 2×2 table:

| | Thrombophlebitis | No Thrombophlebitis |
|---|---|---|
| Birth control pills | a = 90 | b = 45 |
| No birth control pills | c = 10 | d = 55 |
| Total | a + c = 100 | b + d = 100 |

Note that in case–control studies the investigator can choose the total number of patients in each group (those with and without thrombophlebitis). She could have chosen to select 200 patients with thrombophlebitis and 100 patients without thrombophlebitis, or a large number of other combinations.

Thus, the actual numbers in each vertical column, the cases and the controls, can be altered at will by the investigator. In other words, in a case–control study, the number of individuals who have and do not have the disease does not necessarily reflect the relative frequency of those with and without the disease. Whenever the number of cases relative to the number of controls is determined by the investigator, it is improper to add the boxes in the case–control 2×2 table horizontally (as we did in the preceding cohort study) and calculate relative risk.

Thus we need to use a measurement that is not altered by the relative numbers in the study and control groups. This measurement is known as the odds ratio.

The size of the odds ratio is often very close to the relative risk. That is, the odds ratio is often a good approximation of the relative risk. When this is the situation, it can be used as a substitute for the relative risk. This is the usual situation when the disease or condition under investigation occurs relatively infrequently.

To understand what we mean by an odds ratio, we first need to appreciate what we mean by odds and how odds differ from risk. Risk is a probability in which the numerator contains the number of times the event, such as thrombophlebitis, occurs over a specified period. The denominator of a risk or probability contains the number of times the event could have occurred. Odds, like probability, contain the number of times the event occurred in the numerator. However, in the denominator odds contain only the number of times the event did not occur.

The difference between odds and probability may be appreciated by thinking of the chance of drawing an ace from a deck of 52 cards. The probability of drawing an ace is the number of times an ace can be drawn divided by the total number of cards: 4 of 52 or 1 of 13. Odds, on the other hand, are the number of times an ace can be drawn divided by the number of times it cannot be drawn: 4 to 48 or 1 to 12. Thus, the odds are slightly larger than the probability, but when the event or the disease under study is relatively rare, the odds are a good approximation of the probability.
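The card example can be checked with a short Python sketch. It uses exact fractions so that 4/52 reduces to 1/13 without rounding. The helper names are illustrative only, and the 0.003 figure is the risk among nonusers from Mini-Study 2.1, used here as an example of a rare event.

```python
from fractions import Fraction


def probability(times_event_occurs: int, total_chances: int) -> Fraction:
    """Event count divided by every chance the event had to occur."""
    return Fraction(times_event_occurs, total_chances)


def odds(times_event_occurs: int, total_chances: int) -> Fraction:
    """Event count divided by the number of times the event does NOT occur."""
    return Fraction(times_event_occurs, total_chances - times_event_occurs)


def odds_from_probability(p: Fraction) -> Fraction:
    """Equivalent conversion: odds = p / (1 - p)."""
    return p / (1 - p)


p_ace = probability(4, 52)   # 4 of 52, i.e. 1 of 13
o_ace = odds(4, 52)          # 4 to 48, i.e. 1 to 12
print(p_ace, o_ace)

# For a rare event the odds and the probability nearly coincide.
p_rare = Fraction(3, 1000)
print(float(p_rare), float(odds_from_probability(p_rare)))
```

For the rare event the two numbers agree to about three decimal places, which is why the odds can stand in for the probability when the condition is uncommon.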

The odds ratio is the odds of having the risk factor if the condition is present divided by the odds of having the risk factor if the condition is not present. The odds of being on the pill if thrombophlebitis is present are equal to

$$
\frac{a}{c} = \frac{90}{10} = 9
$$

Likewise, the odds of being on the pill for women who do not develop thrombophlebitis are measured by dividing the number of women who do not have thrombophlebitis and are using the pill by the number of women who do not have thrombophlebitis and are not using the pill. Thus, the odds of being on the pill if thrombophlebitis is not present are equal to

$$
\frac{b}{d} = \frac{45}{55} = 0.82
$$

Like the calculation of relative risk, one can develop a measure of the relative odds of being on the pill if thrombophlebitis is present versus being on the pill if thrombophlebitis is not present. This measure of the strength of association is the odds ratio. Thus,

$$
\text{Odds ratio} = \frac{a/c}{b/d} = \frac{ad}{bc} = \frac{90 \times 55}{45 \times 10} = 11
$$

An odds ratio of 1, parallel with our interpretation of relative risk, implies that the odds are the same for being on the pill if thrombophlebitis is present and for being on the pill if thrombophlebitis is absent. Our odds ratio of 11 means that the odds of being on birth control pills are increased 11-fold for women with thrombophlebitis.
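The claim that the odds ratio is not altered by the investigator's choice of how many cases and controls to enroll can be illustrated with a small Python sketch. The second table below is hypothetical: it assumes 200 cases had been selected instead of 100, with the same 90% of cases using birth control pills. The function names are ours.

```python
def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds ratio for a 2 x 2 table: (a/c) / (b/d), which equals ad/bc."""
    return (a * d) / (b * c)


def row_based_ratio(a: int, b: int, c: int, d: int) -> float:
    """Adds the table horizontally as in a cohort study: [a/(a+b)] / [c/(c+d)].

    This is NOT a valid relative risk for case-control data; it is computed
    here only to show that it shifts with the case:control ratio.
    """
    return (a / (a + b)) / (c / (c + d))


mini_study = (90, 45, 10, 55)       # 100 cases, 100 controls (Mini-Study 2.2)
doubled_cases = (180, 45, 20, 55)   # hypothetical: 200 cases, same 90% exposure

for label, table in [("100 cases", mini_study), ("200 cases", doubled_cases)]:
    print(label, odds_ratio(*table), round(row_based_ratio(*table), 2))
```

The odds ratio stays at 11 in both tables, while the improperly calculated row-based ratio moves from about 4.3 to 3, even though nothing about the underlying association has changed.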

The odds ratio is the basic measure of the degree of association for case–control studies. It is a useful measurement of the strength of the association. In addition, as long as the disease (thrombophlebitis) is rare, the odds ratio is approximately equal to the relative risk. Note, however, that the odds ratio is not identical to the relative risk. As a general principle, when the relative risk is greater than 1 you can expect the odds ratio to be larger than the relative risk; more generally, the odds ratio lies farther from 1 than the relative risk. As the probability of the disease increases, the difference between the odds ratio and the relative risk will increase.
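How quickly the odds ratio drifts away from the relative risk as the disease becomes more common can be seen in a short Python sketch. The baseline risks and the fixed relative risk of 2 are hypothetical values chosen for illustration, not data from the text.

```python
def odds(p: float) -> float:
    """Convert a probability into odds: p / (1 - p)."""
    return p / (1 - p)


def odds_ratio_from_risks(baseline_risk: float, relative_risk: float) -> float:
    """Odds ratio implied by a baseline (unexposed) risk and a relative risk."""
    exposed_risk = baseline_risk * relative_risk
    if not (0 < baseline_risk < 1 and 0 < exposed_risk < 1):
        raise ValueError("both risks must lie strictly between 0 and 1")
    return odds(exposed_risk) / odds(baseline_risk)


# Hold the relative risk at 2 and let the disease become more common.
for baseline in (0.001, 0.01, 0.1, 0.4):
    implied = odds_ratio_from_risks(baseline, 2.0)
    print(f"baseline risk {baseline}: odds ratio {implied:.3f}")
```

With a baseline risk of 0.001 the odds ratio is almost exactly 2, at 0.1 it is about 2.25, and at 0.4 it reaches about 6, three times the relative risk.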

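It is possible to look at the odds ratio in reverse, as one would do in a cohort study, and come up with the same result. For instance, the odds of developing thrombophlebitis if birth control pills are used are equal to

$$
\frac{a}{b} = \frac{90}{45} = 2
$$

and the odds of developing thrombophlebitis if birth control pills are not used are equal to

$$
\frac{c}{d} = \frac{10}{55} = 0.18
$$

The odds ratio then equals

$$
\text{Odds ratio} = \frac{a/b}{c/d} = \frac{ad}{bc} = \frac{90 \times 55}{45 \times 10} = 11
$$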

Note that this is actually the same formula for the odds ratio as the one shown previously, that is, both can be expressed as ad divided by bc. This convenient property allows one to calculate an odds ratio from a cohort or randomized controlled trial instead of calculating the relative risk. This makes it easier to compare the results of a case–control study with those of a cohort study or randomized controlled trial. Learn More 2.1 looks at a special form of the odds ratio or relative risk that can be used when pairing is used as part of the investigation.
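The equivalence of the two ways of forming the odds ratio can be verified with exact fractions. The sketch below is illustrative only and the function names are ours. It also applies the cross-product formula to the cohort data of Mini-Study 2.1 to show an odds ratio calculated from a cohort study next to the relative risk.

```python
from fractions import Fraction


def odds_ratio_by_exposure(a: int, b: int, c: int, d: int) -> Fraction:
    """Case-control view: odds of exposure in cases / odds in controls."""
    return Fraction(a, c) / Fraction(b, d)


def odds_ratio_by_outcome(a: int, b: int, c: int, d: int) -> Fraction:
    """Cohort view: odds of disease if exposed / odds of disease if unexposed."""
    return Fraction(a, b) / Fraction(c, d)


def cross_product(a: int, b: int, c: int, d: int) -> Fraction:
    """Both views reduce to ad / bc."""
    return Fraction(a * d, b * c)


case_control = (90, 45, 10, 55)   # Mini-Study 2.2
cohort = (30, 970, 3, 997)        # Mini-Study 2.1

print(odds_ratio_by_exposure(*case_control),
      odds_ratio_by_outcome(*case_control),
      cross_product(*case_control))

cohort_or = cross_product(*cohort)
cohort_rr = Fraction(30, 1000) / Fraction(3, 1000)
print(float(cohort_or), float(cohort_rr))
```

All three expressions give 11 for the case–control table. For the cohort table the odds ratio is 997/97, or about 10.3, close to the relative risk of 10 because thrombophlebitis is uncommon in both groups.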

#### Learn More 2.1: Pairing and Estimating the Strength of the Association

The relative risk and odds ratio are the fundamental measures of the strength of an association between a risk factor and a disease. A special type of odds ratio (or relative risk) is calculated when pairing is used to conduct an investigation. Pairing implies that one study group individual is linked to one control group individual and the endpoints in the pair are compared. This type of matching, known as pairing, is used to ensure identical distribution of potential confounding variables between study and control groups. When pairing is used, a special type of odds ratio should be used to take advantage of the increased statistical power when estimating the strength of the association.

Let us look at the following example of this measure of the strength of the relationship:

#### Mini-Study 2.3

Assume that a case–control study of birth control pills and thrombophlebitis was conducted using 100 pairs of patients with thrombophlebitis and controls without thrombophlebitis. The cases and controls were paired so that each member of the pair was the same age and had the same number of previous pregnancies. The results of a paired case–control study are presented using the following 2×2 table<sup>2.5</sup>:

| | Controls Using Birth Control Pills | Controls Not Using Birth Control Pills |
|---|---|---|
| Cases using birth control pills | 30 | 50 |
| Cases not using birth control pills | 5 | 15 |

The odds ratio in a paired case–control study uses only the pairs in which the exposure (e.g., the use of birth control pills) is different between the case and control members of a pair. The pairs in which the cases with thrombophlebitis and the controls without thrombophlebitis differ in their use of birth control pills are known as discordant pairs.

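The odds ratio is calculated using discordant pairs as follows:

$$
\text{Odds ratio} = \frac{\text{Number of pairs with case using birth control pills and control not using birth control pills}}{\text{Number of pairs with control using birth control pills and case not using birth control pills}} = \frac{50}{5} = 10
$$

This odds ratio is interpreted the same way as an odds ratio calculated from unpaired studies.<sup>2.6</sup>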
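The paired calculation can be sketched in Python as follows. The layout of the nested list mirrors the paired 2×2 table of Mini-Study 2.3 (rows are cases, columns are controls). The function name `paired_odds_ratio` is an illustrative assumption, not a term from the text.

```python
def paired_odds_ratio(pairs: list[list[int]]) -> float:
    """Odds ratio for paired data, using only the discordant pairs.

    pairs[0][0]: case exposed,     control exposed      (concordant)
    pairs[0][1]: case exposed,     control not exposed  (discordant)
    pairs[1][0]: case not exposed, control exposed      (discordant)
    pairs[1][1]: case not exposed, control not exposed  (concordant)
    """
    case_only_exposed = pairs[0][1]
    control_only_exposed = pairs[1][0]
    if control_only_exposed == 0:
        raise ValueError("no pairs with only the control exposed; ratio undefined")
    return case_only_exposed / control_only_exposed


# Mini-Study 2.3: counts are pairs, not persons.
mini_study_pairs = [[30, 50],
                    [5, 15]]

total_pairs = sum(sum(row) for row in mini_study_pairs)
paired_or = paired_odds_ratio(mini_study_pairs)
print(total_pairs, paired_or)
```

The 45 concordant pairs (30 + 15) do not enter the calculation; only the 50 and 5 discordant pairs determine the odds ratio of 10.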

2.1 The measures that we use in this chapter are measures of the strength of an association. Associations measure the strength of a relationship in one sample (or population) compared with another. That is, associations are expressed as ratios. Differences, in contrast, subtract a measurement taken in one sample (or population) from that taken in another sample (or population). Also note that the term effect size does not imply that a cause and effect relationship is present. When describing the data obtained in an investigation, measures of central tendency such as the mean (average) or, alternatively, the median are used. Measures of the spread or dispersion of the data, such as the standard deviation, are also needed to describe the distribution of continuous data.

2.2 At times, there will be only a dependent variable and no independent variables. This type of statistical analysis is called univariable analysis. Univariable analysis is used primarily in descriptive studies. In contrast, bivariable and multivariable statistical methods are used primarily in analytical studies. Bivariable analysis implies one dependent and one independent variable. Multivariable analysis implies one dependent and more than one independent variable. In addition to the estimation measures discussed in this chapter, you will also see an estimate of the strength of the relationship for the association between two continuous variables such as blood pressure (BP) and salt intake or body mass index and blood sugar. The basic measurement of the strength of the relationship is known as the correlation coefficient (R). The correlation coefficient can vary from +1 to −1. Zero indicates no association or correlation. A positive correlation coefficient indicates that as the value of one variable increases the value of the other variable increases. A negative correlation coefficient indicates that as the value of one variable increases the other decreases. In addition, the square of R, or R², which is called the coefficient of determination, provides an estimate of the percentage of the variation in one variable that is explained by the other.

2.3 When individuals are followed for different periods of time, the denominator used to calculate risk often includes a measure known as person-years. A person-year is equivalent to one person followed for one year. Person-years allow us to include individuals who are followed for differing lengths of time by including their data for each year in which they are followed. Person-years are an example of a broader approach that can use any interval of time, such as person-months, person-days, or person-minutes.

2.4 Relative risks may also be presented with the group at lower risk in the numerator. These two forms of the relative risk are merely reciprocals of each other. Thus, the risk of thrombophlebitis for those not taking birth control pills divided by the risk for those taking birth control pills would be 0.003/0.030 = 0.1 or 1/10.

2.5 The table for a paired case–control study tells us about what happens to a pair instead of what happens to each person. Thus, the frequencies in this paired 2×2 table add up to 100 (the number of pairs) instead of 200 (the number of persons in the study).

2.6 Pairing, however, has an advantage of greater statistical power. Everything else being equal, statistical significance can be established using smaller numbers of study and control group patients. Pairing may be used in cohort studies and randomized controlled trials as well as case–control studies.