The only characters recognized are [A-Za-z0-9], so the "ground truth"
labels contain no punctuation.

Each column of the "points" field of the struct in the .mat file
contains the [row col width height] of one character. Note that row
and col give the center of the box, not the upper-left corner
(which probably would have made more sense).
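Since most tools expect an upper-left corner, the conversion from the
centered convention can be sketched as below. This is only an
illustration: it assumes "points" has already been loaded into a
4 x N nested list (one column per character), and the function names
center_to_corner and columns_to_boxes are hypothetical, not part of
the dataset. Any pixel-indexing offset (0-based vs. 1-based) is left
untouched.

```python
def columns_to_boxes(points):
    """Split a 4 x N nested list (assumed layout) into N box tuples.

    Each returned tuple is (row, col, width, height) for one character.
    """
    return [tuple(column) for column in zip(*points)]


def center_to_corner(box):
    """Convert a centered box into an upper-left-corner box.

    Input:  (row, col, width, height), with (row, col) at the box center.
    Output: (top, left, width, height), with (top, left) at the corner.
    """
    row, col, width, height = box
    top = row - height / 2.0   # rows run vertically, so use the height
    left = col - width / 2.0   # columns run horizontally, so use the width
    return (top, left, width, height)


# Two made-up characters, stored one per column.
points = [
    [50, 60],    # row (center)
    [100, 200],  # col (center)
    [20, 30],    # width
    [40, 10],    # height
]
corner_boxes = [center_to_corner(b) for b in columns_to_boxes(points)]
```

The half-width and half-height are kept as floats so that boxes with
odd dimensions are not silently rounded.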

Because a 128x128 filter was used to process the images, some of them
were given extended borders (64 pixels on each side, in both
dimensions) so that filters centered on character regions didn't hang
over the image edge. As a result, some of the images contain
unfortunate artifacts which may hamper detection algorithms. It should
be easy to determine from inspection which have extended borders (see,
for instance, blair_md.tif).

The text regions are also scaled to a uniform font height (100 px,
which translates to about 72 pixels for capital letters and 50 pixels
for lower-case).


The "string" field was used to determine word boundaries between
characters. 

The "polarity" field of the struct is 1 if the text is light-on-dark
and 0 otherwise. This bit was not used in the CVPR06 or ICDAR07
papers, since the features were phase invariant.
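For features that are not phase invariant, the polarity bit can be
used to bring every region to a common convention. The sketch below
inverts light-on-dark regions so that all text ends up dark-on-light.
It assumes 8-bit grayscale pixels in a nested list; the name
to_dark_on_light and the max_value parameter are illustrative
assumptions only.

```python
def to_dark_on_light(rows, polarity, max_value=255):
    """Return a copy of the image with text rendered dark-on-light.

    `polarity` follows the convention above: 1 means light-on-dark
    (so the pixels are inverted), 0 means the image is left as is.
    `max_value` is the assumed maximum pixel intensity (8-bit here).
    """
    if polarity == 1:
        return [[max_value - value for value in row] for row in rows]
    return [list(row) for row in rows]


# A light stroke (250) on a dark background (10) becomes dark on light.
region = [[10, 250, 10]]
normalized = to_dark_on_light(region, polarity=1)
```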

Jerod Weinman
jerod at acm dot org
April 2008