COMMERCE BUSINESS DAILY ISSUE OF FEBRUARY 24, 1997 PSA#1788
Patent & Trademark Office, Office of Procurement, Box 6, Washington,
D.C. 20231 70 -- SOURCES SOUGHT FOR AUTOMATED BIOTECHNOLOGY SEQUENCE SEARCH
(ABSS) SYSTEM POC Henrietta V. Brox, Contract Specialist, 703-305-8016,
Michael J. Anastasio, Jr., Contracting Officer. The U.S. Patent and
Trademark Office (USPTO) is considering alternatives to enhance its
capabilities for automated molecular sequence searching in support of
USPTO examiners. The USPTO has identified improving and streamlining
the examination of sequence patent applications as a critical
requirement and has concluded that it needs to evaluate alternatives
for meeting that requirement. In
order to handle the present workload and future increases in
biotechnology applications, the USPTO is considering enhancing or
replacing its existing hardware/software platform with additional
tools and computer processing power. Currently, the USPTO computing
platform, the
Automated Biotechnology Sequence Search (ABSS) system, consists of Sun
Microsystems workstations which access RNA/DNA and protein sequence
databases resident on the Sun Microsystems database server,
supplemented by two (2) MasPar massively-parallel computing platforms.
IBM compatible computers and Macintosh computers are used for input
data preparation and administrative functions. The system is configured
as a distributed processing system. The ABSS System has one (1) major
application function: automated sequence search processing. This major
application is further divided into three (3) functional subsystems:
Computer Readable Forms (CRF), Sequence Search subsystem, and Sequence
Dissemination subsystem. However, from a hardware configuration
viewpoint, the ABSS may be viewed as two (2) separate subsystems: the
CRF and Sequence Search subsystems. The CRF subsystem has Sun SPARC
workstations that are attached to a LAN. The CRF is responsible for the
pre-processing of all computer readable format data received from
patent applications. Data from patent applicants may be submitted on
magnetic media with MS-DOS format, Macintosh format, or UNIX format.
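
One conversion step implied by accepting three media formats is normalizing text line endings: MS-DOS text files end lines with CR LF, classic Macintosh files with CR, and UNIX files with LF. The notice does not describe how the CRF subsystem performs its conversion, so the following is only a minimal sketch under that assumption; the function name normalize_line_endings and the sample sequences are hypothetical, and file-system and character-set differences are not addressed.

```python
def normalize_line_endings(raw: bytes) -> bytes:
    """Convert MS-DOS (CR LF) and classic Macintosh (CR) line endings to UNIX (LF).

    Hypothetical helper for illustration; not part of the ABSS system.
    """
    # Replace CR LF first so the lone-CR pass does not double the line breaks.
    return raw.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


# Made-up sample submissions, one per media format named in the notice.
samples = {
    "MS-DOS": b"ACGT\r\nTTGA\r\n",
    "Macintosh": b"ACGT\rTTGA\r",
    "UNIX": b"ACGT\nTTGA\n",
}

for name, raw in samples.items():
    print(name, normalize_line_endings(raw))
```

Working on bytes rather than decoded text keeps the sketch independent of whatever character encoding a given submission uses.
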
After a standalone personal computer has been used to verify that the
incoming data does not contain a virus, the data is transferred to a
LAN-connected PC for validation processing. Validation processing
includes steps to convert the data, edit checks to ensure that the data
is compliant with CRF standards, correction of obvious format errors,
and reformatting of the data into standard IntelliGenetics (IG)
sequence search format. The Sequence Search subsystem has two (2) Sun
SPARC servers acting as database servers for performing sequence
searches. Sun SPARC workstations are connected to the database servers
with an Ethernet LAN and are used to enter sequence search
information. In addition, the MasPar system provides a searching
platform for extensive and sensitive searches that would take much
longer to process on the Sun workstations. Software used to perform the
sequence search processing is the IG Suite (primarily the FASTDB
program) and MPSRCH developed by IG, Inc. (now the Oxford Molecular
Group), and the Genetics Computer Group (GCG) Wisconsin package. The
MPSRCH software provides: (1) support for affine gap penalties to
maximize sensitivity; (2) support for standard nucleotide and amino
acid codes as well as IUPAC ambiguity codes; (3) reporting of
alignment, prediction values, and annotations for each result, and
reporting of percent matching residues for each result in the list of
top scores and in the individual alignment output for each match; (4)
support for standard and user-defined matrices such as PAM, BLOSUM,
etc.; (5) support for automated batch submission of multiple searches
using defined parameters; (6) support for a graphical user interface
(GUI) that automates the sequence selection and searching process,
including performing multiple database searches of commercial
and/or in-house (non-commercial) databases; (7) support for searching
sequence ranges and for oligomer searches; (8) support for reverse
translation in all six (6) frames (for comparing DNA queries with
protein databanks and for comparing protein queries with DNA
databanks); (9) support for search procedures for both DNA strands;
and (10) creation of output which can be further processed
analytically. Commercially
available databases such as Genbank, EMBL, PIR, etc. are received from
IG, Inc. and GCG on a regular basis as part of their maintenance
contract with USPTO. The data is used to update the commercial databases
on the Sun SPARC servers. The Sequence Dissemination subsystem
processes sequence data in the Issued databases and creates output on
a magnetic tape which is then distributed to Genbank. Examiners can
have the sequences searched against the commercially available
databases or the in-house pending sequence databases. The choice of
which software package is used is based upon the nature of each
application and its claimed sequences. After processing on the Sun
workstations or on the MasPar, the search results files are downloaded
to disk or printed and forwarded to the examiner. Results files
contain a predefined number of highest scoring sequence alignments and
associated annotation information. The examiner opens the files on
his/her workstation and examines the results. Results returned from
database searching often identify literature references pertinent to
sequences matched. The examiner can review files on his/her
workstation, isolating pertinent material for printout and storage for
the record for that particular application. The USPTO is interested in
obtaining information technology desktop tools which can improve the
sequence examination process by facilitating analysis (and utilization)
of search results. Current sequence search tools produce voluminous
textual results which require considerable examiner analysis to
manually isolate and summarize the results most pertinent to a given
application. Specifically, the USPTO is interested in emerging or
existing software tools which can aid examiners in improving the
quality and efficiency of sequence examinations. Such
tools could include pre-processing tools for expediting the setup of
sequence searches, search engines which meet or exceed the performance
of existing search system capabilities, and post-processing tools
which could help examiners in reviewing raw sequence search results to
best meet patentability decision-making needs. Of special interest
would be the ability to output the results into a database format such
that one could sort by different fields. Potentially useful fields
would include author/inventor, gene, locus, date, best score, etc. Also
of interest would be the ability to merge various results files into
one file containing non-redundant data which can then be further
incorporated into a database. The ability to hyperlink search
output directly to full-text online documents would also be of special
interest for future application. Parties interested in responding to
this announcement should send as full and specific a description of
proposed hardware and software tools as possible, including a
description of the potential and methods for integrating the proposed
tools into the existing ABSS platform, as described above. For example,
proposed software should be fully described functionally and should
include technical specifications such as programming languages,
required hardware platform, size, level of support available, costs,
and installed base. Interested sources should send their responses to
Henrietta V. Brox within fifteen (15) days of this publication. (0051)
Loren Data Corp. http://www.ld.com (SYN# 0350 19970224\70-0008.SOL)
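
The post-processing capability described in the notice (merging several results files into one set of non-redundant records that can be sorted by field) could be sketched as follows. This is an illustration only, not a description of any existing ABSS tool. The field names follow the notice (author/inventor, gene, locus, date, best score), but the Hit record, the rule that one record is kept per locus with the highest score winning, and all sample values are assumptions made up for the example.

```python
from dataclasses import dataclass
from operator import attrgetter


@dataclass(frozen=True)
class Hit:
    """One search result record (hypothetical layout)."""
    locus: str
    gene: str
    author: str
    date: str          # ISO 8601, so string order matches chronological order
    best_score: float


def merge_hits(*result_sets):
    """Merge result lists, keeping one record per locus (highest score wins)."""
    best = {}
    for hits in result_sets:
        for hit in hits:
            kept = best.get(hit.locus)
            if kept is None or hit.best_score > kept.best_score:
                best[hit.locus] = hit
    return list(best.values())


def sort_hits(hits, field, descending=False):
    """Return the hits sorted by the named field."""
    return sorted(hits, key=attrgetter(field), reverse=descending)


# Made-up results from two separate searches.
run_a = [Hit("LOC1", "geneA", "Smith", "1995-03-01", 812.0),
         Hit("LOC2", "geneB", "Jones", "1996-07-15", 640.0)]
run_b = [Hit("LOC1", "geneA", "Smith", "1995-03-01", 790.0),
         Hit("LOC3", "geneC", "Adams", "1994-11-30", 705.0)]

merged = merge_hits(run_a, run_b)
for hit in sort_hits(merged, "best_score", descending=True):
    print(hit.locus, hit.author, hit.date, hit.best_score)
```

Treating the locus as the redundancy key is a design choice for the sketch; a real tool might instead compare accession numbers or the aligned sequences themselves.
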