Research Projects
Jerod Weinman, CompSci, Grinnell
Multimodal Keypoint Detection (2022–2023)
We adapt a multimodal (text+image) object detection model to the keypoint detection task and demonstrate that adding richer descriptions of the keypoints improves performance and reduces training time.
Paper: ICVS23.

Self-Supervised Depth Prediction (2019–2021)
Monocular depth estimation can be learned from egomotion or stereo pairs. We improve this process with two contributions: a novel loss function that encourages the model to predict positive depths for visible points, and a GPU-based differentiable Z-buffering algorithm for quickly rendering transformed point clouds in the training loop.
Paper: CRV21.

Map Processing (2010–2020+)
By leveraging a gazetteer of place names (toponyms) and their locations, we can improve OCR on historical map images by inferring an alignment between the gazetteer and words on a map. Given such an alignment, calculating posterior probabilities significantly reduces word error.
Papers: SIGSPATIAL19, ICDAR19, ICDAR17, ICDAR13, Technical Report.

Wearable Aid for the Blind (2004–2014)
We developed algorithms for a wearable aid for the blind called VIDI (Visual Information Dissemination for the Impaired). The work comprised several contributions and subprojects.

Text and Sign Detection
Using a contextual model to eliminate isolated false positives and more fully cover all regions of a detected sign, we are able to robustly detect text and logos with arbitrary sizes and layouts in complex scenes.
Papers: MLSP04, CVAVI05, Master's Thesis.

Robust Recognition
By integrating character appearance models more closely with statistical language and lexicon models, as well as a locally adaptive font model, we can more reliably recognize characters from signs in different fonts.
Papers: CVPR06, ICDAR07, ICPR08, PAMI09, ICPR10.

Joint Detection and Recognition
By asking the "where?" and "what?" questions of finding and identifying text simultaneously during learning, we can be faster or more accurate than the usual approach of learning the two tasks independently.
Papers: Tech Report UM-CS-2006-054.

Portions of this work have been funded by NSF grant numbers IIS-0100851, IIS-0326249, and IIS-0546666 as well as the Central Intelligence Agency and the National Security Agency.

Faster Learning for Stereo (2006–2010)
Learning algorithms that rely on message passing can be very slow when state spaces are large, as in most stereo disparity estimation problems. We improve stereo predictions by using an algorithm that can capture the appropriate amount of uncertainty during learning.
Papers: Tech Report UM-CS-2007-054, ECCV08, IJCV10.

Learning Mobile Background Subtraction (2007)
By storing background images for a stationary robot with a movable vision system, we can efficiently match the current view to known views, using a local deformation model to overcome parallax. The robot learns to perform novel object detection in the presence of camera motion.
Papers: ICRA07.

Portions of this work have been funded by NSF grant number IIS-0546666 and MIT/NASA cooperative agreement NNJ05HB61A.

Stroke Segmentation (2001–2003)
Noisy magnetic resonance (MR) images, ambiguous boundaries, and unknown, irregular shapes must all be overcome to accurately estimate the volume of brain tissue affected by ischemic stroke. We use multiple segmentation parameters to create a confidence map that can closely match physician segmentations.
Papers: MICCAI03, Tech. Report UM-CS-2003-017.

Collaborators
Undergraduates