A contingency table is used to show the relationship between (usually) two categorical variables. Each categorical variable in the table has (usually) two levels, but there can certainly be more. In a \(2 \times 2\) contingency table, we refer to the cell counts with \(a,b,c,d\) as follows.
 | A+ | A- | Total |
---|---|---|---|
B+ | a | b | a+b |
B- | c | d | c+d |
Total | a+c | b+d | a+b+c+d |
To create a \(2 \times 2\) table in SPSS:
If the raw data are not given to you in the form of columns with 0’s and 1’s, but rather in the form of a pre-made \(2 \times 2\) table, you can use weights to enter the data into SPSS. You do not need to enter the 1’s and 0’s by hand! Let’s say you have the following data table.
 | Disease+ | Disease- | Total |
---|---|---|---|
Exposed+ | 10 | 30 | 40 |
Exposed- | 9 | 80 | 89 |
Total | 19 | 110 | 129 |
To enter these data into SPSS, open a blank SPSS spreadsheet and populate it as follows.
Disease | Exposure | Counts |
---|---|---|
Disease+ | Exposed+ | 10 |
Disease- | Exposed+ | 30 |
Disease+ | Exposed- | 9 |
Disease- | Exposed- | 80 |
Now weight the data in SPSS by the count variable (Data > Weight Cases, then weight cases by the Counts variable):
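Weighting by a count variable is equivalent to entering one row per subject. A short Python sketch (not part of the SPSS workflow) illustrates this: expanding each weighted row into `count` individual records recovers the original \(2 \times 2\) counts.

```python
from collections import Counter

# The four weighted rows entered above: (disease, exposure, count).
rows = [
    ("Disease+", "Exposed+", 10),
    ("Disease-", "Exposed+", 30),
    ("Disease+", "Exposed-", 9),
    ("Disease-", "Exposed-", 80),
]

# Weighting by `Counts` is equivalent to repeating each row `count` times.
records = [(d, e) for d, e, n in rows for _ in range(n)]

table = Counter(records)
print(table[("Disease+", "Exposed+")])  # 10
print(len(records))                     # grand total: 129
```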
Tables are typically presented with the observed counts in each cell. For many statistical tests it is also useful to compute the expected count for each cell. This expected value is based on the marginal row total, the marginal column total, and the grand total. For the cell in row 1, column 1, the expected count is obtained by multiplying the row 1 total by the column 1 total and dividing that number by the grand total. The expected counts for the general table above are as follows.
 | A+ | A- | Total |
---|---|---|---|
B+ | (a+b)(a+c)/(a+b+c+d) | (a+b)(b+d)/(a+b+c+d) | a+b |
B- | (c+d)(a+c)/(a+b+c+d) | (c+d)(b+d)/(a+b+c+d) | c+d |
Total | a+c | b+d | a+b+c+d |
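The expected-count formula can be checked with a few lines of Python, using the exposure/disease counts from the table above (\(a = 10, b = 30, c = 9, d = 80\)):

```python
def expected_counts(a, b, c, d):
    """Expected count for each cell of a 2x2 table:
    (row total * column total) / grand total."""
    n = a + b + c + d
    return [
        [(a + b) * (a + c) / n, (a + b) * (b + d) / n],
        [(c + d) * (a + c) / n, (c + d) * (b + d) / n],
    ]

# Exposure/disease example from above: a=10, b=30, c=9, d=80
exp = expected_counts(10, 30, 9, 80)
print([[round(x, 2) for x in row] for row in exp])
# [[5.89, 34.11], [13.11, 75.89]]
```

Note that the expected counts preserve the margins: each row of expected counts sums to the observed row total.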
To create a \(2 \times 2\) table in SPSS with both observed and expected counts:
The risk of an event is a term used synonymously in this class with probability. So, the risk of an event \(A\) happening is \(Risk(A) = P(A)\). Risks can be conditional or unconditional.
The odds of an event is defined as the probability of an event happening divided by the probability of the event not happening. So, the odds of an event \(A\), \(Odds(A) = P(A)/(1-P(A))\). Odds can be conditional or unconditional.
The risk ratio (relative risk) is the probability of an event in one group divided by the probability of an event in another group. Note that the risk ratio is the ratio of two conditional probabilities. For a generic event \(A\) and two groups the relative risk is
\[RR = \frac{P(A|Group = 1)}{P(A|Group = 2)}\]
The odds ratio is the odds of an event in group 1 divided by the odds in group 2. Similar to the relative risk, the odds ratio is a ratio of two conditional odds.
\[OR = \frac{Odds(A|Group = 1)}{Odds(A|Group = 2)} = \frac{P(A|Group = 1)/(1-P(A|Group = 1))}{P(A|Group = 2)/(1-P(A|Group = 2))}\]
To compute measures of association in SPSS:
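Using the exposure/disease table above, these measures can also be computed by hand. A Python sketch, treating Exposed+ and Exposed- as the two groups and Disease+ as the event:

```python
# a, b, c, d from the exposure/disease table above
a, b, c, d = 10, 30, 9, 80

risk_exposed   = a / (a + b)          # P(Disease+ | Exposed+)
risk_unexposed = c / (c + d)          # P(Disease+ | Exposed-)

rr = risk_exposed / risk_unexposed    # relative risk

odds_exposed   = risk_exposed / (1 - risk_exposed)      # = a/b
odds_unexposed = risk_unexposed / (1 - risk_unexposed)  # = c/d

or_ = odds_exposed / odds_unexposed   # odds ratio; equals (a*d)/(b*c)

print(round(rr, 3), round(or_, 3))    # 2.472 2.963
```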
Pearson’s \(\chi^2\) test is used to test whether the row variable and the column variable in a \(2 \times 2\) table are independent. This test relies on the asymptotic normality of a test statistic based on the differences between the observed and expected count in each cell. As such, it requires a large sample size, namely that the expected cell counts are all greater than 5.
The setup of hypotheses for Pearson’s \(\chi^2\) test are as follows, where \(A\) is a column variable, \(B\) is a row variable, and the symbol \(\perp\) denotes independence between two random variables.
\[H_0: A \perp B\] \[H_A: A \stackrel{not}{\perp} B\]
Like the other hypothesis tests we have done, Pearson’s \(\chi^2\) test produces a p-value that can be used to reject/fail to reject the null hypothesis.
To conduct Pearson’s \(\chi^2\) test in SPSS:
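For reference, the statistic and p-value can also be computed outside SPSS. The sketch below uses only the Python standard library, together with the 1-df identity \(P(\chi^2 > x) = \mathrm{erfc}(\sqrt{x/2})\), applied to the exposure/disease table above:

```python
import math

def pearson_chi2(a, b, c, d):
    """Pearson's chi-square statistic and p-value (1 df) for a 2x2 table."""
    n = a + b + c + d
    observed = [a, b, c, d]
    expected = [
        (a + b) * (a + c) / n, (a + b) * (b + d) / n,
        (c + d) * (a + c) / n, (c + d) * (b + d) / n,
    ]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 df, the chi-square upper-tail probability is erfc(sqrt(x / 2)).
    p = math.erfc(math.sqrt(stat / 2))
    return stat, p

stat, p = pearson_chi2(10, 30, 9, 80)
print(round(stat, 3), round(p, 4))
```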
Fisher’s exact test can be used when the expected cell counts are too small for Pearson’s \(\chi^2\) test. The p-value for Fisher’s exact test is given in the same output as Pearson’s \(\chi^2\).
To conduct Fisher’s exact test in SPSS:
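Fisher’s exact p-value comes from the hypergeometric distribution with the table margins held fixed. The sketch below sums the probabilities of all tables no more probable than the observed one; this is one common definition of the two-sided p-value, and software packages differ slightly in how they define it.

```python
import math

def fisher_exact(a, b, c, d):
    """Two-sided Fisher's exact p-value for a 2x2 table: sum the
    hypergeometric probabilities of all tables (with the same margins)
    that are no more probable than the observed table."""
    r1, r2 = a + b, c + d   # row totals
    c1 = a + c              # first column total
    n = r1 + r2

    def prob(x):  # P(cell (1,1) = x) with all margins fixed
        return math.comb(r1, x) * math.comb(r2, c1 - x) / math.comb(n, c1)

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)   # feasible range for cell (1,1)
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))

p = fisher_exact(10, 30, 9, 80)
print(round(p, 4))
```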
The M-H \(\chi^2\) test for ordinal data is similar to Pearson’s \(\chi^2\) but uses fewer degrees of freedom (only 1 df, regardless of the number of ordered levels). It is given in the same output as Pearson’s \(\chi^2\) and Fisher’s exact test.
To conduct the M-H trend test in SPSS:
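The M-H trend statistic is the linear-by-linear association statistic \(M^2 = (N-1)r^2\), where \(r\) is the Pearson correlation between the row scores and the column scores. A Python sketch, using 0/1 scores for the exposure/disease table above:

```python
import math

def mh_trend_stat(pairs):
    """Mantel-Haenszel (linear-by-linear) trend statistic:
    M^2 = (N - 1) * r^2, where r is the Pearson correlation between
    the row scores and the column scores; p-value uses 1 df."""
    n = len(pairs)
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    r = sxy / math.sqrt(sxx * syy)
    stat = (n - 1) * r ** 2
    p = math.erfc(math.sqrt(stat / 2))  # chi-square upper tail, 1 df
    return stat, p

# 0/1 scores: (exposure, disease), weighted by the cell counts above
counts = {(1, 1): 10, (1, 0): 30, (0, 1): 9, (0, 0): 80}
pairs = [xy for xy, m in counts.items() for _ in range(m)]
stat, p = mh_trend_stat(pairs)
print(round(stat, 3))  # 4.832, slightly smaller than the Pearson statistic
```

For a \(2 \times 2\) table, \(M^2 = \frac{N-1}{N}\chi^2\), which is why the two statistics are so close here.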
A CMH odds ratio has the same interpretation as a regular odds ratio, but it is adjusted for a given stratifying variable. For example, if we stratify our \(2 \times 2\) table into two separate \(2 \times 2\) tables, the first for males and the second for females, the CMH odds ratio measures the association between the row and column variables after adjusting for gender.
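A sketch of the Mantel-Haenszel pooled odds ratio, \(OR_{MH} = \sum_i (a_i d_i / n_i) \,/\, \sum_i (b_i c_i / n_i)\), using a hypothetical male/female split of the exposure/disease table above (the stratum counts are invented for illustration; they sum to the original table):

```python
def mh_pooled_or(strata):
    """Mantel-Haenszel pooled odds ratio across 2x2 strata.
    Each stratum is (a, b, c, d); OR_MH = sum(a*d/n) / sum(b*c/n)."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

# Hypothetical split of the exposure/disease table into two strata
males   = (6, 14, 3, 40)   # a, b, c, d
females = (4, 16, 6, 40)
print(round(mh_pooled_or([males, females]), 3))  # 2.939
```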
McNemar’s test is used to test for association between the row and column variables when the data are paired (for example, the same subjects measured before and after a treatment).
To use McNemar’s test:
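McNemar’s statistic depends only on the discordant pairs: \(\chi^2 = (b-c)^2/(b+c)\) with 1 df, where \(b\) and \(c\) are the off-diagonal counts of the paired table. A sketch with hypothetical discordant counts:

```python
import math

def mcnemar(b, c):
    """McNemar's chi-square for paired 2x2 data, using only the
    discordant counts b and c (pairs whose two outcomes disagree)."""
    stat = (b - c) ** 2 / (b + c)
    p = math.erfc(math.sqrt(stat / 2))  # chi-square upper tail, 1 df
    return stat, p

# Hypothetical paired data: 25 pairs discordant one way, 10 the other
stat, p = mcnemar(25, 10)
print(round(stat, 3), round(p, 4))
```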