Introduction to Bayes' Theorem

    Conventional statistics rely on a probabilistic model of events, such as the probability (p) that one will draw an Ace from a deck of cards (p = 4/52) or roll Boxcars with two dice (p = 1/36). The joint probability of drawing an Ace AND rolling Boxcars is then simply p' = (1/13)(1/36) = 1/468 = 0.00214. The probability of drawing an Ace OR rolling Boxcars is p'' = (1/13) + (1/36) = (36 + 13)/(13 x 36) = 49/468 = 0.1047, about fifty times as great (strictly, because both events can occur together, the joint probability 0.00214 should be subtracted, giving 0.1026). AND and OR are the operative words for multiplication and addition, respectively.
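These AND/OR calculations can be checked in a few lines of Python (illustrative only; exact fractions avoid rounding error):

```python
from fractions import Fraction

p_ace = Fraction(4, 52)      # 4 Aces in a 52-card deck = 1/13
p_boxcars = Fraction(1, 36)  # double sixes with two fair dice

# AND -> multiply (the events are independent)
p_and = p_ace * p_boxcars             # 1/468, approx. 0.00214

# OR -> add; strictly, also subtract the joint so it is not counted twice
p_or = p_ace + p_boxcars              # 49/468, approx. 0.1047
p_or_exact = p_ace + p_boxcars - p_and  # 48/468, approx. 0.1026

print(float(p_and), float(p_or), float(p_or_exact))
```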

    This can be extended to biological situations, for example that the next hospital patient you see will be male and (or) have hemophilia, based on data that about half the population is male, and that a certain fraction of the population has hemophilia. The probabilistic model becomes complicated once it is recognized that hemophilia is typically a male trait, and further that in a hospital ward, there will be a higher proportion of hemophiliacs than in the outside population. Note that the probabilistic approach will be different when applied to optometric patients who are male and (or) color-blind. A simple probabilistic approach may fail under these circumstances.

    Alternatively, the Bayes model is concerned with the likelihood of events, and explicitly considers their co-occurrence, especially where those events are not independent. This is phrased as: What is the probability of event A, given that event B also occurs?

Bayes’ Theorem as stated mathematically is:

p(A|B) = [ p(B|A) x p(A) ] / p(B)

where A & B are events, and p(B) ≠ 0. An event is something that can be true or false, or one way or another, for example, that a person is color blind, or female. Bayes' Theorem can then be stated in words as

The probability of Event A given that B is True, is equal to the probability of Event B given that A is True, times the probability of Event A, all divided by the probability of Event B.


    p(A|B) and p(B|A) are conditional probabilities: the likelihood of event A occurring given that B is true, and vice versa. The shorthand p(A|B) is read as "the probability of A given B." p(A) and p(B) are the marginal probabilities of observing A and B independently of each other: for example, the proportion of people who are color-blind, or female.

Among other uses, Bayes’ Theorem provides an improved method of assessing the likelihood that two non-independent events will occur simultaneously.
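As a sketch, the theorem translates directly into a one-line function (the numbers in the usage example are arbitrary, chosen only to illustrate the call):

```python
def bayes(p_b_given_a, p_a, p_b):
    """Bayes' Theorem: p(A|B) = p(B|A) x p(A) / p(B), with p(B) != 0."""
    if p_b == 0:
        raise ValueError("p(B) must be non-zero")
    return p_b_given_a * p_a / p_b

# Illustrative numbers only: p(B|A) = 0.8, p(A) = 0.3, p(B) = 0.5
print(bayes(0.8, 0.3, 0.5))  # approx. 0.48
```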


Example: Sensitivity & Specificity of Drug Testing

    Suppose a urine test used to detect the presence of a particular banned performance-enhancing drug is 99.9% sensitive and 99.0% specific. That is, the test will provide 99.9% true positive results for drug users, and 99.0% true negative results for non-users. Suppose further that 0.5% of the athlete population tested are drug users (incidence). We ask: What is the probability that an individual who tests positive is a User? Bayes' Theorem phrases this as, what is p(User|+), that is, what is the probability that an individual is a User, given that s/he tests positive?

    Let p(A) = p(User) and p(B) = p(+), then


p(User|+) = [ p(+|User) x p(User) ] / p(+)

    Here, p(+|User) estimates sensitivity, that 0.999 of Users tested will be detected, and p(+|Non-User) = [1 - specificity] incorporates specificity, that only (1 - 0.99) = 0.01 of Non-Users will be reported (incorrectly) as Users.

Then, p(+) estimates the total probability of a positive test, including true as well as false positives. These two components are

p(+) = [ p(+|User) x p(User) ] + [ p(+|Non-User) x p(Non-User) ]

Substituting the values defined above,

p(+) = (0.999)(0.005) + (1 - 0.99)(1 - 0.005)  =  0.01495

So that

p(User|+) = [ p(+|User) x p(User) ] / p(+) = (0.999 x 0.005) / [(0.999)(0.005) + (1 - 0.99)(1 - 0.005)] = 0.3342
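The same calculation can be reproduced in a short Python sketch, using the sensitivity, specificity, and incidence values defined above:

```python
sensitivity = 0.999   # p(+|User): true-positive rate
specificity = 0.990   # p(-|Non-User): true-negative rate
incidence   = 0.005   # p(User) in the tested population

# Total probability of a positive test: true positives + false positives
p_pos = sensitivity * incidence + (1 - specificity) * (1 - incidence)

# Bayes' Theorem: p(User|+) = p(+|User) x p(User) / p(+)
p_user_given_pos = sensitivity * incidence / p_pos

print(p_pos)             # approx. 0.014945
print(p_user_given_pos)  # approx. 0.3342
```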


    That is, even if an individual tests positive, it is twice as likely as not (1 – 33.42% = 66.58%) that s/he is not a User. Why? Even though the test appears to be highly “accurate” (99.9% sensitivity & 99% specificity), the number of Non-Users is very large compared with the number of Users. Under such conditions, the count of false positives exceeds the count of true positives. For example, if 1,000 individuals are tested, we expect 995 Non-Users and 5 Users. Among the 995 Non-Users, we expect 0.01 x 995 ≈ 10 false positives. Among the 5 Users, we expect 0.999 x 5 ≈ 5 true positives. So, out of 15 positive tests, only 5 (33%) are genuine. The test cannot be used to screen the general population for Users.
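The count-based reasoning above can be reproduced directly (these are expected counts, so the values are not rounded to whole individuals):

```python
n = 1000
users     = n * 0.005          # 5 expected Users
non_users = n - users          # 995 expected Non-Users

false_pos = (1 - 0.99) * non_users   # approx. 10 false positives
true_pos  = 0.999 * users            # approx. 5 true positives

share_genuine = true_pos / (true_pos + false_pos)
print(share_genuine)  # approx. 0.33
```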

    What are the effects of improving the “accuracy” of the test? If sensitivity were increased to 100%, and specificity remained at 99%, p(User|+) = 33.44%, a minuscule improvement. Alternatively, if sensitivity remains at 99.9% and specificity is increased to 99.5%, then p(User|+) = 50.10%, so that only half the positive tests are genuine. In either case, the test remains unreliable.
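A small parametrized function makes it easy to explore these what-if scenarios (the function name is illustrative):

```python
def p_user_given_pos(sensitivity, specificity, incidence):
    """Posterior probability that a positive test identifies a true User."""
    p_pos = sensitivity * incidence + (1 - specificity) * (1 - incidence)
    return sensitivity * incidence / p_pos

print(p_user_given_pos(1.000, 0.990, 0.005))  # approx. 0.3344: perfect sensitivity
print(p_user_given_pos(0.999, 0.995, 0.005))  # approx. 0.5010: better specificity
```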

    How can testing be improved? If sensitivity and specificity remain unchanged at 0.999 and 0.99 respectively, but in the population of interest the incidence of Users increases to 0.1, p(User|+) = 0.91736, and the test is reasonably reliable (but not at a 95% criterion). Alternatively, testing may be applied as a population screen, and any individual who tests positive may be re-tested: the probability that a Non-User will test positive twice is (0.01)² = 0.0001, but the probability that a User will escape detection twice is only (0.001)² = 0.000001.
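The re-testing strategy can also be viewed as a sequential Bayesian update, in which the posterior after the first positive test becomes the prior for the second (a sketch, assuming the two tests are independent):

```python
def update(prior, sensitivity=0.999, specificity=0.99):
    """One Bayesian update: posterior p(User) after a single positive test."""
    p_pos = sensitivity * prior + (1 - specificity) * (1 - prior)
    return sensitivity * prior / p_pos

print(update(0.1))     # approx. 0.9174: higher incidence alone
post1 = update(0.005)  # approx. 0.3342 after one positive test
post2 = update(post1)  # approx. 0.98 after a second, independent positive test
print(post1, post2)
```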


    HOMEWORK: Write an Excel spreadsheet program to calculate p(User|+) for various values of Sensitivity, Specificity, and Incidence. Use the base values above as a starting point. Under what circumstances is the test most “useful”? Explain.



Drug-test example modified after Wikipedia, retrieved 2017; text © 2024 by Steven M Carr