Loren Data Corp.


COMMERCE BUSINESS DAILY ISSUE OF JULY 20,1999 PSA#2391

National Library of Medicine, Office of Acquisitions Management, Building 38A, Room B1N17, 8600 Rockville Pike, Bethesda, Maryland 20894

R -- PROFESSIONAL SUPPORT SERVICES SOL NLM99-108/RTR DUE 080399 POC Ramona Rivers, Purchasing Agent (301) 496-6127 Fax (301) 402-8169 E-MAIL: NLM 99-108/RTR, ramona_rivers@ccmail.nlm.nih.gov. It is the intent of the National Library of Medicine (NLM) to negotiate on a sole source basis with Mathsoft, Inc., 1700 Westlake Ave. N., Suite 500, Seattle, Washington 98109, for Professional Services to support ongoing R & D in the automated entry of data from medical journal literature for MEDLINE indexing, via scanning and optical character recognition technology combined with document image analysis and understanding techniques. The NLM has designed and developed a system that converts bitmapped images to text for this purpose. While conventional text consisting of Latin characters and Arabic and Roman numerals is handled well by this system, the NLM wishes to enhance the character recognition scheme currently employed to recognize Greek letters, characters with diacritical marks, and biomedical and scientific symbols (all of which are hereafter collectively referred to as "special symbols"). The primary technical objective of this task is to develop a software library for automated recognition of these symbols. The library shall be capable of being integrated with the current system with minimal effort. The recognition accuracy and speed performance of the implementation shall be comparable to those of the current system. The task includes the following subtasks: 1) Based on an analysis of the overall second-generation MARS system design and the special symbol recognition problem, including an analysis of the current system detection module, develop detailed requirements, design specifications and an implementation plan for this project. The plan shall describe all subtasks required to accomplish the goals and shall also include a description of all deliverables, both code and documentation.
2) The detection stage distinguishes special symbols from normal characters in a document. This stage was found to be a performance bottleneck in the existing recognition module. Therefore, this subtask shall focus on the development of an improved detection module, mainly by reducing the dimensionality of the feature space in the detection stage and by converting the detection problem to a two-class decision problem. In addition, the detector shall be trained on a larger data set and optimal values of the detection parameters shall be determined by cross-validation on this training data. 3) The classification stage classifies a detected special symbol into one of the many symbols in our database. This stage shall be improved in various ways, mostly to handle errors propagated from the detection stage. Primarily, an artificial neural network (ANN) shall be used at this stage and the classifier trained on a number of normal characters to improve the overall recognition accuracy. Other features such as vertical line position shall be incorporated into the classifier at this stage. In addition to the above classification improvement tasks, an appropriate combination of the detection stage and the classification stage shall be investigated, again through the use of the ANN. This combined ANN shall take the following data as input: the OCR outputs (including confidence levels) for the current character and the previous and next characters, the normalized input image, and other features such as line position. The output class shall be a character, either a normal one or a special symbol from the total character set considered. 4) Design and develop algorithms for other character recognition tasks, such as more accurate text-line detection for improved segmentation, detection of superscript and subscript characters, and character attribute detection (e.g., italic versus non-italic characters).
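For illustration only, the detection work in subtask 2 (a two-class decision on a reduced feature space, with a detection parameter chosen by cross-validation) might be sketched as follows. This is a minimal sketch under stated assumptions, not the NLM or contractor design: the nearest-centroid detector, the use of "keep the first `dims` features" as the dimensionality reduction, and every function name are hypothetical.

```python
from statistics import mean

def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    return [mean(col) for col in zip(*rows)]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train(samples, dims):
    """Fit one centroid per class using only the first `dims` features.

    `samples` is a list of (features, label) pairs, where label is True for
    a special symbol and False for a normal character (assumed encoding).
    """
    special = [f[:dims] for f, label in samples if label]
    normal = [f[:dims] for f, label in samples if not label]
    return centroid(special), centroid(normal)

def detect(model, features, dims):
    """Two-class decision: True if closer to the special-symbol centroid."""
    special_c, normal_c = model
    x = features[:dims]
    return sq_dist(x, special_c) < sq_dist(x, normal_c)

def cv_accuracy(samples, dims, k):
    """Mean held-out accuracy over k interleaved folds.

    Assumes every training split still contains both classes.
    """
    scores = []
    for i in range(k):
        held_out = samples[i::k]
        training = [s for j, s in enumerate(samples) if j % k != i]
        model = train(training, dims)
        scores.append(mean(
            1.0 if detect(model, f, dims) == label else 0.0
            for f, label in held_out))
    return mean(scores)

def select_dims(samples, candidates, k=4):
    """Pick the candidate dimensionality with the best cross-validated accuracy."""
    return max(candidates, key=lambda d: cv_accuracy(samples, d, k))
```

A call such as `select_dims(samples, [1, 2])` returns the candidate with the highest held-out accuracy; on ties, the first candidate listed is kept, so listing smaller dimensionalities first favors the cheaper detector.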
Firms interested in responding to this notice must submit as part of their response clear and convincing documentation of their ability to meet the Government's requirements. If a response indicates that a competitive acquisition would be more advantageous to the Government, a formal solicitation may be issued.***** Posted 07/16/99 (W-SN355410). (0197)

Loren Data Corp. http://www.ld.com (SYN# 0069 19990720\R-0002.SOL)

