Multiple Testing

We’ll begin by recreating the examples from this morning’s lecture:

Coin Toss Experiments

set.seed(231)
x=sample(c("H","T"),10,replace=TRUE,prob=c(1/3,2/3)) # Create a sample of size 10, with probability of "H" = 1/3
                                                     # and probability of "T" = 2/3. Clearly, this is a biased coin.
x
 [1] "T" "T" "T" "T" "T" "T" "T" "H" "T" "T"
# Test for bias

binom.test(sum(x=='T'), n=length(x), p = 0.5)
    Exact binomial test

data:  sum(x == "T") and length(x)
number of successes = 9, number of trials = 10, p-value = 0.02148
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
 0.5549839 0.9974714
sample estimates:
probability of success
                   0.9

So, our test concludes the coin is biased: the p-value (0.02148) is below 0.05, so we reject the null hypothesis that the coin is fair.
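The decision can also be read off programmatically from the returned "htest" object; this is the same idiom used in the loop later in this section (a minimal sketch):

result <- binom.test(sum(x=='T'), n=length(x), p = 0.5)
result$p.value          # 0.02148...
result$p.value < 0.05   # TRUE, so reject the null of a fair coin

Now we toss two coins: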

set.seed(231)
x1=sample(c("H","T"),10,replace=TRUE,prob=c(1/3,2/3))
x2=sample(c("H","T"),10,replace=TRUE,prob=c(1/3,2/3))
x1
 [1] "T" "T" "T" "T" "T" "T" "T" "H" "T" "T"
x2
 [1] "T" "T" "T" "T" "T" "T" "H" "T" "T" "T"
# Test for bias
binom.test(sum(x1=='T'), n=length(x1), p = 0.5)
    Exact binomial test

data:  sum(x1 == "T") and length(x1)
number of successes = 9, number of trials = 10, p-value = 0.02148
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
 0.5549839 0.9974714
sample estimates:
probability of success
                   0.9
binom.test(sum(x2=='T'), n=length(x2), p = 0.5)
    Exact binomial test

data:  sum(x2 == "T") and length(x2)
number of successes = 9, number of trials = 10, p-value = 0.02148
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
 0.5549839 0.9974714
sample estimates:
probability of success
                   0.9
?binom.test
binom.test {stats}        R Documentation

Exact Binomial Test

Description

Performs an exact test of a simple null hypothesis about the probability of success in a Bernoulli experiment.

Usage

binom.test(x, n, p = 0.5,
           alternative = c("two.sided", "less", "greater"),
           conf.level = 0.95)

Arguments

x: number of successes, or a vector of length 2 giving the numbers of successes and failures, respectively.

n: number of trials; ignored if x has length 2.

p: hypothesized probability of success.

alternative: indicates the alternative hypothesis and must be one of "two.sided", "greater" or "less". You can specify just the initial letter.

conf.level: confidence level for the returned confidence interval.

Details

Confidence intervals are obtained by a procedure first given in Clopper and Pearson (1934). This guarantees that the confidence level is at least conf.level, but in general does not give the shortest-length confidence intervals.

Value

A list with class "htest" containing the following components:

statistic: the number of successes.

parameter: the number of trials.

p.value: the p-value of the test.

conf.int: a confidence interval for the probability of success.

estimate: the estimated probability of success.

null.value: the probability of success under the null, p.

alternative: a character string describing the alternative hypothesis.

method: the character string "Exact binomial test".

data.name: a character string giving the names of the data.

References

Clopper, C. J. & Pearson, E. S. (1934). The use of confidence or fiducial limits illustrated in the case of the binomial. Biometrika, 26, 404–413.

William J. Conover (1971), Practical nonparametric statistics. New York: John Wiley & Sons. Pages 97–104.

Myles Hollander & Douglas A. Wolfe (1973), Nonparametric Statistical Methods. New York: John Wiley & Sons. Pages 15–22.

See Also

prop.test for a general (approximate) test for equal or given proportions.

Examples

## Conover (1971), p. 97f.
## Under (the assumption of) simple Mendelian inheritance, a cross
##  between plants of two particular genotypes produces progeny 1/4 of
##  which are "dwarf" and 3/4 of which are "giant", respectively.
##  In an experiment to determine if this assumption is reasonable, a
##  cross results in progeny having 243 dwarf and 682 giant plants.
##  If "giant" is taken as success, the null hypothesis is that p =
##  3/4 and the alternative that p != 3/4.
binom.test(c(682, 243), p = 3/4)
binom.test(682, 682 + 243, p = 3/4)   # The same.
## => Data are in agreement with the null hypothesis.

[Package stats version 3.2.0]
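The Details section above notes that the reported confidence interval is the Clopper-Pearson interval. As an optional check (a sketch using the standard Clopper-Pearson beta-quantile formulas, which are not spelled out in the help page), we can reconstruct the interval for our single-coin result of 9 tails in 10 tosses:

n.trials <- 10; n.succ <- 9; alpha <- 0.05
c(lower = qbeta(alpha/2, n.succ, n.trials - n.succ + 1),       # lower Clopper-Pearson limit
  upper = qbeta(1 - alpha/2, n.succ + 1, n.trials - n.succ))   # upper Clopper-Pearson limit
# Should match the interval (0.5549839, 0.9974714) printed by binom.test above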
# Suppose we toss 100 fair coins (10 tosses each) and test each coin for bias
count.reject=0
for (i in 1:100){
    x2=sample(c("H","T"),10,replace=TRUE,prob=c(1/2,1/2)) # 10 tosses of a fair coin
    result=binom.test(sum(x2=='T'), n=length(x2), p = 0.5)
    print(result$p.value)
    if (result$p.value<.05) {count.reject=count.reject+1} # count rejections at the 5% level
    }
count.reject
[1] 0.34375
[1] 1
[1] 0.109375
[1] 1
[1] 1
[1] 0.109375
[1] 0.34375
[1] 0.34375
[1] 1
[1] 0.34375
[1] 0.7539063
[1] 0.7539063
[1] 0.34375
[1] 1
[1] 0.02148438
[1] 0.34375
[1] 0.34375
[1] 0.7539063
[1] 0.7539063
[1] 0.7539063
[1] 0.7539063
[1] 0.34375
[1] 0.7539063
[1] 0.7539063
[1] 1
[1] 0.34375
[1] 0.7539063
[1] 0.7539063
[1] 0.34375
[1] 0.7539063
[1] 1
[1] 0.34375
[1] 0.34375
[1] 1
[1] 1
[1] 0.7539063
[1] 0.7539063
[1] 0.109375
[1] 1
[1] 0.34375
[1] 0.7539063
[1] 1
[1] 0.34375
[1] 1
[1] 0.34375
[1] 0.34375
[1] 0.7539063
[1] 0.109375
[1] 1
[1] 0.109375
[1] 1
[1] 1
[1] 1
[1] 0.7539063
[1] 0.34375
[1] 1
[1] 0.109375
[1] 0.7539063
[1] 0.7539063
[1] 1
[1] 0.7539063
[1] 0.109375
[1] 0.34375
[1] 0.7539063
[1] 1
[1] 0.7539063
[1] 0.34375
[1] 0.109375
[1] 0.34375
[1] 0.34375
[1] 0.109375
[1] 0.7539063
[1] 0.7539063
[1] 0.7539063
[1] 0.7539063
[1] 1
[1] 1
[1] 0.34375
[1] 0.7539063
[1] 1
[1] 1
[1] 1
[1] 1
[1] 0.7539063
[1] 0.34375
[1] 0.02148438
[1] 0.109375
[1] 0.34375
[1] 0.7539063
[1] 0.34375
[1] 0.7539063
[1] 0.02148438
[1] 0.34375
[1] 0.34375
[1] 1
[1] 1
[1] 1
[1] 0.34375
[1] 0.7539063
[1] 0.7539063
3
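As an aside, a more compact way to run the same experiment (a sketch; the seed is arbitrary and chosen only for reproducibility) is to collect the 100 p-values in a vector instead of printing them, which also makes them easy to adjust later:

set.seed(231)   # arbitrary seed, for reproducibility only
pvals <- replicate(100, {
    x2 <- sample(c("H","T"), 10, replace=TRUE, prob=c(1/2,1/2))   # 10 tosses of a fair coin
    binom.test(sum(x2=='T'), n=length(x2), p = 0.5)$p.value       # p-value of the bias test
})
sum(pvals < 0.05)   # number of fair coins falsely declared biased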

We can think of the number of tosses per coin as the ‘sample size’ and the number of coins tossed as the number of hypotheses tested. Here, our sample size (10 tosses) is small compared to the number of hypotheses tested (100 coins). In genome data, the sample size is the number of patients and the number of hypotheses tested is the number of genes (or SNPs) analyzed. If we set a significance level of \(0.05\), we are accepting a 5% chance of a false positive on each individual test. So in our coin-toss experiment, we would expect to falsely declare a fair coin biased about 5 times out of 100 tests; in the run above this happened 3 times.
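With \(m\) independent tests each run at level \(\alpha\), the probability of at least one false positive is \(1-(1-\alpha)^m\), which is essentially 1 for \(m=100\). As a sketch (using the pvals vector collected above and the base-R p.adjust function), we can see how p-value adjustment controls this:

1 - (1 - 0.05)^100                                  # probability of at least one false positive among 100 fair coins
sum(p.adjust(pvals, method = "bonferroni") < 0.05)  # rejections after Bonferroni (family-wise error rate control)
sum(p.adjust(pvals, method = "BH") < 0.05)          # rejections after Benjamini-Hochberg (false discovery rate control)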

# What happens if we increase the number of tosses?