Lab: Dynamic Bayesian Networks

CSC261 - Artificial Intelligence - Weinman



Summary:
We explore the use of sequential probabilistic reasoning for prediction and inference.

Preparation

  1. Copy the starter source code to the same place you put it for the last lab.
    cp /home/weinman/courses/CSC261/code/hmm/hmm.scm somewhere/
  2. Open the previous lab's starter file in DrRacket:
     drracket somewhere/starter.scm &

Exercises

A. Typo Modeling

In the last in-class lab, you should have defined likelihood models for typists one and two. Review these definitions in starter.scm and note in particular what you called them.

B. Language Modeling

  1. Load the file hmm.scm at the top of your code.
  2. Using create-counts again, define a count structure to store the language (character) transition model for our Bayesian network.
  3. Use count-transitions! (twice) to add the transitions from the two original (uncorrupted) letters 4 and 16 to your structure from B.2.
  4. Using get-count and char->index, determine how many times "t" was followed by "h" and how often "q" was followed by "u". Note: read the postconditions for count-transitions! very carefully to make sure you use the arguments in the right order!
    How do these raw frequency counts compare?
  5. Use normalize-counts! again to transform your transition counts into proper conditional probabilities.
  6. Using get-count again, determine P(C=h | C′=t, L={4.txt, 16.txt}, I) and P(C=u | C′=q, L={4.txt, 16.txt}, I). How do the probabilities compare? Based on the frequencies you saw, are these probabilities surprising? Explain the apparent discrepancy.
  7. Use marginal-counts to define a vector containing the total number of times each character appears in the two letters. (But see footnote [1] below for a caveat.)
    Make sure to insert the marginalization before the normalization of the counts matrix in B.5!
  8. Use normalize-marginal-counts! to transform your marginal vector from B.7 into a proper marginal probability.
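The pipeline in steps B.1-B.8 can be sketched as below. This is only a sketch: the argument orders and arities of create-counts, count-transitions!, get-count, marginal-counts, and the two normalizers are assumptions based on the step descriptions, so check the contracts in hmm.scm before adapting any of it.

```scheme
;; Sketch of steps B.1-B.8.  The argument orders and arities below are
;; guesses from the step descriptions -- check the pre/postconditions in
;; hmm.scm (especially for count-transitions!, per the warning in B.4).
(load "hmm.scm")                                 ; B.1

(define transitions (create-counts))             ; B.2 (perhaps initialized
                                                 ; with 1s; see footnote 1)

(count-transitions! "4.txt" transitions)         ; B.3: both uncorrupted
(count-transitions! "16.txt" transitions)        ;      letters

;; B.4: raw counts, e.g., 't' followed by 'h' and 'q' followed by 'u'
(get-count transitions (char->index #\t) (char->index #\h))
(get-count transitions (char->index #\q) (char->index #\u))

;; B.7 must precede B.5: marginalize while these are still raw counts
(define char-totals (marginal-counts transitions))

(normalize-counts! transitions)                  ; B.5: cells now hold the
                                                 ; conditional probabilities
                                                 ; queried in B.6

(normalize-marginal-counts! char-totals)         ; B.8: marginal distribution
```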

C. Marginal Sequential Inference

We already know letter 14 was transcribed by typist one. However, we don't have the original, so we can't use log-likelihood as before to compare
P(O=typed.14.txt | C=orig.14.txt,D=16.txt,T=1)
and
P(O=typed.14.txt | C=orig.14.txt,D=4.txt,T=2).
However, we now have a language model to help us infer the original values of C (and marginalize them) for comparison.
  1. Use log-evidence and the values you calculated in A, B.5, and B.8 to calculate the log marginal likelihood for the observed letter if typed by typist one:
    lnP(O=typed.14.txt | D=16.txt, L={4.txt, 16.txt}, T=1, I).
  2. Using the same process, calculate the log marginal likelihood for the observed letter if typed by typist two:
    lnP(O=typed.14.txt | D=4.txt, L={4.txt, 16.txt}, T=2, I).
  3. How do the two log (marginal) likelihoods compare? Even when you don't know the original text, might you be able to establish who typed it?
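Under the same caveats about hmm.scm's interfaces, steps C.1-C.3 might look roughly like the sketch below; log-evidence's parameter list is an assumption, and typist-one-model / typist-two-model stand in for whatever you named your likelihood models in A.

```scheme
;; Sketch of C.1-C.2, with transitions and char-totals as defined in B.
(define log-ml-one                               ; C.1: typist one
  (log-evidence "typed.14.txt" typist-one-model transitions char-totals))
(define log-ml-two                               ; C.2: typist two
  (log-evidence "typed.14.txt" typist-two-model transitions char-totals))

;; C.3: the larger log marginal likelihood points to the likelier typist
(if (> log-ml-one log-ml-two) 'typist-one 'typist-two)
```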

D. Restorative Sequential Inference

Rather than marginalize the true text C, how can we predict it?
  1. Use count-errors-file to determine how many errors are in the first typist's rendition of letter 8.
  2. Use most-likely-sequence with the same models you used for C.1 to produce the best prediction for letter 8:
    argmax_c P(C=c | O=8.txt, D=16.txt, L={4.txt, 16.txt}, T=1, I).
  3. Use count-errors-list to determine how many errors are in the restored version of the file you just calculated. Dissatisfied? The homework will help resolve your concerns.
  4. For now, assuage your concerns by using most-likely-sequence to restore letter 8 as if typist two had transcribed it:
    argmax_c P(C=c | O=8.txt, D=4.txt, L={4.txt, 16.txt}, T=2, I).
  5. How many errors are in the new restoration from D.4? (Do you feel any better about D.3 now?)
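Steps D.1-D.5 can be sketched as follows. The signatures of count-errors-file, most-likely-sequence, and count-errors-list are assumptions, and the file names are placeholders for the lab's actual letter-8 files; verify both against hmm.scm.

```scheme
;; Sketch of D.1-D.5, reusing the models and counts from A and B.
(count-errors-file "orig.8.txt" "typed.8.txt")   ; D.1: typist one's errors

(define restored-as-one                          ; D.2: restore under typist
  (most-likely-sequence "typed.8.txt"            ;      one's models
                        typist-one-model transitions char-totals))

(count-errors-list "orig.8.txt" restored-as-one) ; D.3: remaining errors

(define restored-as-two                          ; D.4: as if typist two
  (most-likely-sequence "typed.8.txt"            ;      had transcribed it
                        typist-two-model transitions char-totals))

(count-errors-list "orig.8.txt" restored-as-two) ; D.5: compare error counts
```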
Once again, aside from some thinking about the overall task of organizing your computational analysis, you should now be ready to complete a report comparing the two typists via their unattributed letters, even without the originals. You may begin working on the analysis homework.
Copyright © 2011, 2013, 2015, 2018 Jerod Weinman.
This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 4.0 International License.

Footnotes:

[1] If you initialized the structure with a 1 to use the rule of succession, then each character's frequency will be overrepresented in your count, making the marginal distribution a little more uniform than it might otherwise be. As you'll see later, that "smoothing" may actually be an advantage. For now, simply note that we've broken the rules a bit, but we will tolerate that to make our programming just a bit easier.
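As a made-up numeric illustration of that overcount (assuming a 27-character alphabet and a counts matrix initialized to 1s):

```scheme
;; Hypothetical numbers: suppose 'e' truly accounts for 30 of 200
;; transitions.  Initializing a 27x27 counts matrix to 1 adds 27 to each
;; character's marginal count and 27*27 to the grand total.
(exact->inexact (/ 30 200))                      ; true frequency: 0.15
(exact->inexact (/ (+ 30 27) (+ 200 (* 27 27)))) ; smoothed: about 0.061
```

The smoothed estimate is pulled from 0.15 toward the uniform value 1/27 (about 0.037), which is exactly the mild flattening the footnote describes.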