The only characters recognized are [A-Za-z0-9], so the "ground truth" labels contain no punctuation.

The columns of the "points" field of the struct in the .mat file contain the [row col width height] of each character. Note that row and col give the center of the box, not the upper-left corner (which would probably have made more sense).

Because a 128x128 filter was used to process the images, some of them were given extended borders (64 pixels on each side, in both dimensions) so that filters centered on character regions didn't hang over the image edge. As a result, some of the images contain unfortunate artifacts that may hamper detection algorithms. It should be easy to determine by inspection which images have extended borders (see, for instance, blair_md.tif).

The text regions are also scaled to a uniform font height (100 px, which translates to about 72 pixels for capital letters and 50 pixels for lower-case letters).

The "string" field was used to determine word boundaries between characters.

The "polarity" field of the struct is 1 if the text is light-on-dark and 0 otherwise. This bit was not used in the CVPR06 or ICDAR07 papers, since the features were phase invariant.

Jerod Weinman
jerod at acm dot org
April 2008
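Since the "points" entries are centered boxes, a tool that expects upper-left corners needs a small conversion first. The following is a minimal sketch in Python, not part of the original distribution. The function names and the sample values are hypothetical. It assumes "points" has already been read into a 4 x N nested list with one column per character, and it does not adjust for any particular pixel-indexing convention or for the extended borders.

```python
def center_to_corner(box):
    """Convert one [row, col, width, height] entry, where (row, col) is the
    box center, into (top, left, width, height) for the upper-left corner."""
    row, col, width, height = box
    top = row - height / 2.0    # move up by half the box height
    left = col - width / 2.0    # move left by half the box width
    return (top, left, width, height)


def columns_to_boxes(points):
    """Treat each column of a 4 x N nested list as one character box."""
    return [center_to_corner(column) for column in zip(*points)]


# Hypothetical values for two characters (not taken from the data set).
points = [
    [100.0, 100.0],  # center rows
    [50.0, 120.0],   # center columns
    [40.0, 60.0],    # widths
    [72.0, 50.0],    # heights
]
boxes = columns_to_boxes(points)
```

Each tuple in boxes keeps the original width and height and changes only the reference point of the box.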