Lab: Probability

CSC261 - Artificial Intelligence - Weinman



Summary:
We explore the application of logic and probability.

Preparation

  1. Copy the starter source code to somewhere in your directory:
    cp /home/weinman/courses/CSC261/code/probability/charmodel.scm somewhere/
  2. Create a new Scheme file
    touch somewhere/starter.scm
  3. Open your file in DrRacket:
     drracket somewhere/starter.scm
  4. Add your names and any other pertinent info to the file (in comments).
  5. Get out a sheet of paper (yes, paper) and something to write with.

Exercises

A. Bayes Rule

Test effectiveness is a big, real world issue. Consider a recent controversy over ditching routine prostate cancer screening:
The assumption is that finding cancer early is always a good thing. Not so, said Dr. Virginia Moyer of the Baylor College of Medicine, who heads the [government] task force.
[The test] only sometimes signals prostate cancer is brewing. It also can mean a benign enlarged prostate or an infection. Worse, screening often detects small tumors that will prove too slow-growing to be deadly. And there's no sure way to tell in advance who needs aggressive therapy.
The task force analyzed...whether routine screening reduces deaths from prostate cancer. The conclusion: There's little if any mortality benefit.
But there is harm from routine screening: impotence, incontinence, infections, even death that can come from the biopsies, surgery and radiation, Moyer said.
Associated Press, "Government panel recommends against routine PSA screening for prostate cancer," Washington Post, Thursday, 6 October 2011, http://www.washingtonpost.com/national/health-science/reports-say-government-to-recommend-against-routine-psa-screening-for-prostate-cancer/2011/10/06/gIQAPKCWRL_story.html (Accessed 20 October 2011)
How might such a decision to ditch testing be made? Probably with a process not entirely unlike the following.1 (All numbers and test below are entirely fictitious.)
Consider two medical tests, A and B for a particular type of cancer. Test A is 95% accurate at recognizing the cancer when it is present, but has a 10% false positive rate (indicating cancer is present when it is not). Test B is only 90% effective at detecting the cancer, and has a 5% false positive rate. The two tests use independent detection methods. Just one percent of men actually suffer from this cancer.
Suppose a person is tested for the cancer using only one of the two possible tests, and that test comes back positive. Which test returning positive is more indicative of someone actually having the cancer?
  1. Define all the propositions that will be used to answer the question. For example
    ATest A signals positive for the cancer.
  2. Using these propositional symbols, write down all of the probabilities given in the problem statement above.
  3. What probabilities must be calculated to answer the question "which positive test is more indicative?"?
  4. Using the rules of probability, calculate the values needed. For clarity, write out intermediate probability terms before replacing with any numeric counterparts.
  5. Which test returning positive is more indicative of someone actually having the cancer?
  6. How much more likely is someone to have cancer if they test positive with the more effective test than someone who tested positive with the less effective test? (That is, calculate a ratio.)
  7. Be prepared to walk through and share your answers with the rest of the class.

B: Typo Modeling

The individual lab has you using a large Scheme library of language procedures you'll need to become familiar with. This exercises should jump start the process.
  1. Open (or get out a paper copy) the procedure reference from the lab.
  2. Open the file charmodel.scm in DrRacket and briefly inspect it. You may use the definitions browser to jump to a specific procedure and inspect its formal documentation (or implementation if you must).
  3. Read the documentation for char->index to determine how many unique characters we will model.
    Note: Beware oboes!
  4. Use the procedure create-counts to make (and define) a structure to hold the counts of each typo for one typist. (Remember, to use the rule of succession, the initial-value should be 1.)
  5. The files /home/weinman/courses/CSC261/data/mills/original/16.txt and /home/weinman/courses/CSC261/data/mills/corrupted/typist1/16.txt are an original/corrupted pair. Use the procedure count-conditionals! to update your structure from B.4 using these two files to count the number of times each correct character was observed as each corrupt character.
  6. Using char->index, determine the numerical indices of the characters "f" and "g".
  7. Using your updated counts and the procedure get-count, determine how many times the character "f" was mistyped as "g". Note: read the postconditions for count-conditionals! very carefully to make sure you use the arguments in the right order!
  8. Use normalize-counts! to transform your counts into proper conditional probabilities.
  9. Use get-count again to identify the probability P(O="g" | C="f",D=16.txt,T=1,I).
  10. Use log-likelihood to assess the log likelihood of letter number 8 and typist one's rendition of it using the model you learned from letter 16.
  11. You may wonder (aside from speed) why we calculate the log likelihood. Use exp to transform the result from B.10 into a regular (non-log) likelihood. What do you get? Why?
  12. Repeat the process in B.4,B.5, and B. 8 to establish a likelihood model (that is, normalized counts) for typist 2 using letter number 4.
  13. Use get-count again to identify the probability P(O="g" | C="f",D=4.txt,T=2,I), the probability typist two would mistype "f" as "g". How does this compare to typist one? (The procedure exact->inexact may be helpful.)
  14. Calculate the likelihood of letter number 8 and typist one's rendition using the typist two model you learned in B.12.
  15. How do the two log likelihoods from B.10 and B.14 compare? What does that comparison indicate about who typed letter number 8?
Aside from some thinking about how to make the final comparisons and the overall task of organizing your computational analysis, you should now be ready to complete a report comparing the two typists and their unattributed letters.
You may begin (working alone) on the analysis, or continue (below) by deriving one of Jaynes's weak syllogisms.

C: [Optional] Plausible Reasoning

The inference rule modus ponens allows us to make the following deduction
AB,A

B
The strength of propositional logic lies in its ability, given a sound and complete inference algorithm, to determine what we may conclude from our background information. However modern AI systems deal with incomplete information and are still required to make decisions and take action. Thus, probability affords us a useful way of quantifying uncertainty about certain proposition and measuring the change in uncertainty as more observations are made.
In this exercise we will look at a weaker, probabilistic version of the modus ponens rule. Given the implication above, what happens to our belief about A when B is observed rather than A? Our intuition suggests that observing B, which is a consquence of A, increases our belief in A. Can we prove it? Yes. Here's how.
  1. Let C º AÞ B be our background information or knowledge. Write a probability equation relating A and B, conditioned on the background C that signifies the equivalence of reasoning about A first and then B (as in modus ponens) or else B first and then A (as we are now trying to do).
  2. Using the rules of probability we have derived so far, find any terms in the equation above that may be simplified.
  3. If you note that the background information you are given does not entail B, what can you say (numerically) about your belief in B?
  4. Using this information, how would the observation of B influence your degree of belief about A? (That is, compare the probabilities before and after observing B).
  5. Be prepared to walk through and share your answers with the rest of the class.
Copyright © 2011, 2013 Jerod Weinman.
ccbyncsa.png
This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License.

Footnotes:

1Adapted from Russell and Norvig, 2010, Exercise 13.13.