Text & Sign Detection
Research of Jerod Weinman, Computer Science, Grinnell

Detection Model

Figure 1. A model for contextual detection. White boxes relate image features to a single unknown label y (sign or background), while green boxes associate image features such as texture gradients with pairs of neighboring labels.

Most previous work has either (1) used a sliding-window approach to classify individual windows independently as text (or sign), or (2) applied an uninformed segmentation followed by classifying each region as text versus background.
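The first strategy can be sketched concretely. Below is a minimal, hypothetical sliding-window detector: the window size, stride, threshold, and the `classify` function are all illustrative assumptions, not values or code from the paper.

```python
import numpy as np

def sliding_window_detect(image, classify, win=32, stride=16, threshold=0.5):
    """Independently classify each window as sign vs. background.

    `classify` is a hypothetical function mapping a 2-D window to a
    sign probability; `win`, `stride`, and `threshold` are illustrative
    choices. Each window is decided in isolation, with no use of its
    neighbors' labels.
    """
    h, w = image.shape
    detections = []
    for r in range(0, h - win + 1, stride):
        for c in range(0, w - win + 1, stride):
            p = classify(image[r:r + win, c:c + win])
            if p >= threshold:
                detections.append((r, c, p))
    return detections
```

Because each window is scored independently, a single locally sign-like patch yields a detection regardless of its surroundings, which is exactly the failure mode the contextual model addresses.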

Our model captures the dependency between neighboring regions and uses powerful texture features to discriminate between signs and uninteresting areas.
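To illustrate how pairwise dependencies change the decision, here is a toy grid-labeling sketch. It uses a Potts-style pairwise penalty and iterated conditional modes (ICM) as a simple stand-in for the paper's conditional random field and its inference; the cost values and `pair_weight` are assumptions for illustration only.

```python
import numpy as np

def icm_denoise(unary, pair_weight=1.0, iters=5):
    """Minimal sketch of contextual labeling on a 4-connected grid.

    `unary[r, c, k]` is the local cost of assigning label k
    (0 = background, 1 = sign) to site (r, c); the pairwise term charges
    `pair_weight` whenever neighbors disagree (a Potts model). ICM is a
    simple greedy stand-in for proper CRF inference.
    """
    h, w, _ = unary.shape
    labels = unary.argmin(axis=2)  # start from the independent decision
    for _ in range(iters):
        for r in range(h):
            for c in range(w):
                best, best_cost = labels[r, c], np.inf
                for k in (0, 1):
                    cost = unary[r, c, k]
                    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                        nr, nc = r + dr, c + dc
                        if 0 <= nr < h and 0 <= nc < w and labels[nr, nc] != k:
                            cost += pair_weight
                    if cost < best_cost:
                        best, best_cost = k, cost
                labels[r, c] = best
    return labels
```

An isolated site whose local evidence weakly favors "sign" gets outvoted by its four background neighbors, so the pairwise term suppresses isolated false positives while leaving coherent sign regions intact.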

Example Results

Figure 2. Scene images (left) and the corresponding detected signs (right).
By using a contextual model to eliminate isolated false positives and cover detected signs more fully, we can robustly detect text and logos of arbitrary size and layout in complex scenes.

Comparison

Figure 3. Sliding-window detections (left-hand pair) and contextual-model detections (right-hand pair) for two scenes.
A sliding-window classification approach (left-hand pair above) often fails to detect many sign regions, while the contextual model (right-hand pair above) exploits neighborhood dependencies to cover signs more fully.

Related Papers

  • Sign Detection in Natural Images with Conditional Random Fields, with A. Hanson and A. McCallum. IEEE International Workshop on Machine Learning for Signal Processing (MLSP), pp. 549-558. Sept. 2004 [PS.gz] [PDF] [bib] [doi]
  • Preliminary work appears in the paper above; it is greatly expanded in Chapter 3 of the PhD thesis.
  • Publicly available data set (with ground truth) used for training and testing.