SOURCES SOUGHT
70 -- Tick Platform Tool
- Notice Date
- 5/5/2015
- Notice Type
- Sources Sought
- NAICS
- 541512 — Computer Systems Design Services
- Contracting Office
- Department of the Treasury, Bureau of the Public Debt (BPD), Bureau of the Fiscal Service, Avery 5F, 200 Third Street, Parkersburg, West Virginia, 26106-5312, United States
- ZIP Code
- 26106-5312
- Solicitation Number
- RFI-OFR-15-0048
- Archive Date
- 5/26/2015
- Point of Contact
- Missy Forbes, Phone: 304-480-7135, Tina White, Phone: 304-480-8994
- E-Mail Address
- Melissa.Forbes@bpd.treas.gov, PURCHASING@fiscal.treasury.gov
- Small Business Set-Aside
- N/A
- Description
- Background: THIS IS A REQUEST FOR INFORMATION (RFI) ONLY - a solicitation is NOT available at this time. Requests for a solicitation will not receive a response. The Government will not award a contract on the basis of this RFI. The Office of Financial Research (OFR) is seeking sources capable of providing a Tick Platform Tool capable of analyzing financial market data such as Tick/Trade and Quote data. This solution should include mechanisms for tick data collection, management, research, data cleansing, analysis, and reporting. Tick data and the related analysis and reporting are generally well understood in the industry. OFR is seeking information to help design its internal infrastructure for analysis of these data. OFR would prefer to leverage its existing data presentation platforms and distributed computing infrastructure for this work.

Required Functionality:
• The solution shall support multiple sources and types of financial market data. Example data that should be available through this platform include:
o Trade and quote data; examples include:
  o NYSE TAQ
  o OptionMetrics (IvyDB)
  o FINRA TRACE (corporate bonds)
o Integration with low-frequency meta-datasets, e.g., CRSP for equities
o Integration with tick-data providers, e.g., Thomson Reuters
o Machine-readable news feeds
o Order book reconstruction from messages (i.e., FIX protocol messages)
• The solution shall support role-based access control and shall integrate securely with OFR's existing Microsoft Active Directory authentication/authorization infrastructure (e.g., encrypted with Kerberos).
• The solution shall run on a system or systems whose operating system is either Red Hat Enterprise Linux Server 6 or Microsoft Windows Server 2012.
• The solution shall support a system that runs either on physical hardware or as a virtual machine on a VMware hypervisor.
• The solution shall support storage of data in a compressed format (e.g., GZIP, ZIP, Snappy, etc.).
• The solution shall support automated updates of data and shall include either an API or a programmatic ability to update supported datasets as new data arrive on a periodic basis, to allow the solution to integrate with OFR's custom-built, Linux- and Ruby-based automated data ingestion tool.
• The solution shall support standard remote SQL queries via ODBC or JDBC (an illustrative sketch of such access follows the Preferred Functionality list below).
• The solution shall include the ability to create graphs, charts, and reports from raw or analyzed data.
• The solution shall support standard analytic functions for stock market domain-specific data.
• The solution shall include the facility to extract subsets of data.

Preferred Functionality:
• The solution should support integration with OFR's existing secure (Kerberized) Cloudera Hadoop implementation.
• The solution should use OFR's existing secure (Kerberized) Cloudera Hadoop implementation as a data storage backend.
• The solution should use OFR's existing secure (Kerberized) Cloudera Hadoop implementation as a SQL-storage backend (e.g., Hive/Impala).
• The solution should use OFR's existing secure (Kerberized) Cloudera Hadoop implementation for analytics (YARN, Spark, MapReduce, etc.).
• The solution should allow auditing of user access to data stored or presented by the solution. This audit data should be written to a standard logging facility (e.g., syslog, log4j, etc.) that can be ingested by OFR's existing log analysis tool, Splunk. Auditing should be configurable so that only certain subsets of data presented by the solution are audited - either by auditing only those subsets or through a filtering component applied before logs are passed to Splunk.
• The solution should include a mechanism to share stored procedures, cleansing, analysis, charts, etc. between users; e.g., user 1 develops a workflow and user 2 can leverage that workflow as part of an expanded workflow.
• The solution should use a client-server model. The client piece should either be web-based or an application that can be virtualized with either Citrix XenApp or Microsoft App-V.
• Any RDBMS connectivity or required back ends should support either Microsoft SQL Server (2012 or 2014) or Postgres EnterpriseDB (9.x).
• The system should support verification of ingested data updates - e.g., MD5SUM/SHASUM.
• The system should support partitioning/parallelization of analytics/queries. OFR would prefer to leverage its existing Hadoop infrastructure or High-Performance Computing infrastructure (batch-queued, based on SLURM) for this functionality.
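For illustration only (not a requirement of this RFI), remote SQL access of the kind required above might look like the following minimal Python sketch. The data source name, table, and column names are hypothetical placeholders patterned on the query example later in this notice.

    # Minimal sketch of remote SQL access over ODBC, assuming a hypothetical
    # ODBC data source named "tickdb" exposing a TAQ-style "taq" table.
    # Placeholder names only; the actual schema is solution-dependent.
    import pyodbc

    conn = pyodbc.connect("DSN=tickdb")
    cursor = conn.cursor()
    cursor.execute(
        "SELECT * FROM taq "
        "WHERE TICKER = ? AND DATE = ? AND TIME >= ? AND TIME <= ?",
        ("IBM", "2015-04-15", "09:30:00", "16:00:00"),
    )
    rows = cursor.fetchall()  # one row per trade/quote record in the window
    conn.close()

An equivalent JDBC connection would serve the same purpose for Java-based clients.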
Required Tick Database User Functionality:
• Platform-independent, performant data store: From the point of view of a SQL database, it should be simple and fast to fetch data using a query like:
    SELECT * FROM taq WHERE TICKER='IBM' AND DATE='4-15-2015' AND TIME >= '9:30:00' AND TIME <= '16:00:00'
This functionality should be available across the range of supported statistical packages, including MATLAB, Python, R, and Excel.
• Data Cleaning: High-frequency data is often contaminated with errors - zero prices or trade sizes, or "spikes" where a single price or a block of prices is far away from the prices before or after. These require filtering for errors. These filters are typically implemented as leave-one-out moving estimators of the standard deviation or mean absolute deviation. A set of these filters is needed to utilize tick data (an illustrative sketch appears after this list).
• Aggregation: Tick data is vast in nature, often having more than 1,000,000 observations per instrument per day (counting both quotes and trades). Aggregation is needed using last-price interpolation, average-price interpolation, or HLOC bars over some pre-specified time window (e.g., every 5 seconds, 5 minutes, etc.).
• Cross-market integration: Many interesting questions require accessing assets across many markets. For example, for the flash crash, it is necessary to access SPY from NYSE TAQ, ES from CME (via, e.g., Thomson Reuters or CFTC data, if available), and options data from OptionMetrics (IvyDB). These should be accessible in a time-aligned manner so that an overall view of trading across these exchanges can be readily analyzed. Similar cross-market aggregation would have been needed in October 2014, where important assets trade on ICAP (BrokerTec) and CME (Thomson Reuters or CFTC data, if available).
• Contract/Treasury rolling: Futures trade for a fixed amount of time, with contracts typically ending either monthly or quarterly. The vast majority of trading occurs in the front-month (or front-quarter) contract, and so data is usually required from this futures contract. However, as a contract expires it is necessary to "roll" to the next contract in a smooth manner so that prices or volumes are comparable. This requires knowledge of futures contract structure and appropriate methods to splice the returns together.
• Merge Quotes and Trades: Quotes and trades (transactions) are generally stored separately since they have different fields. It is common to merge these two databases, where the merge keeps the last valid quote N seconds before the transaction, where N >= 0. It is also useful to use the Lee & Ready algorithm to sign trades based on the side of the order book (see the sketch after this list).
• Pull data directly into Excel: It should be easy to pull aggregated data straight into an Excel workbook using a macro that will auto-update. A native API would be helpful, along with connections to frequently used statistical software packages such as MATLAB, R, Stata, and SAS.
• User-generated tables or views: Some mechanism to store cleaned data, especially if the data cleaning involves many rules and is slow.
• Portfolio Aggregates: Aggregating within the same database using pre-specified portfolio weights - for example, real-time construction of the S&P 500.
• Order Book Reconstruction/Level 2 data: This is hypothetical, but reconstructing the order book from the message stream.
• Integrate external reference data (e.g., CRSP): Integrating with external reference data used to track firms across mergers and acquisitions or ticker changes.
• News-driven market analysis: Pulling trades/quotes within some time window of a news event.
• Helpdesk Assistance/Support: Access to product support on questions related to functionality and query development, with reasonable response times.
• Manuals/Documentation: Documentation of functions must exist such that new users can easily learn and implement functions as needed.
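Again for illustration only, the cleaning, aggregation, and quote/trade-merge operations described in the list above might be sketched as follows in Python with pandas. Column names, window sizes, and thresholds are hypothetical placeholders, and the spike filter approximates a leave-one-out estimator with a centered rolling median absolute deviation.

    # Illustrative sketches of tick-data cleaning, HLOC aggregation, and
    # quote/trade merging with Lee & Ready trade signing. Assumes two
    # hypothetical DataFrames: trades(time, price, size) and
    # quotes(time, bid, ask), where 'time' holds pandas Timestamps.
    import numpy as np
    import pandas as pd

    def filter_spikes(prices, window=51, k=10.0):
        """NaN out zero prices and 'spikes' far from the local price level
        (centered rolling median/MAD, approximating a leave-one-out filter)."""
        prices = prices.mask(prices <= 0)  # zero prices are data errors
        med = prices.rolling(window, center=True, min_periods=5).median()
        mad = (prices - med).abs().rolling(window, center=True, min_periods=5).median()
        ok = (prices - med).abs() <= k * mad
        return prices.where(ok | med.isna())

    def hloc_bars(trades, freq="5min"):
        """Aggregate raw trades into High/Low/Open/Close bars per time window."""
        return trades.set_index("time")["price"].resample(freq).ohlc()

    def merge_quotes_trades(trades, quotes, n_seconds=5):
        """Attach to each trade the last valid quote at most N seconds before
        it (N >= 0), then sign trades with the Lee & Ready rule."""
        merged = pd.merge_asof(
            trades.sort_values("time"), quotes.sort_values("time"),
            on="time", direction="backward",
            tolerance=pd.Timedelta(seconds=n_seconds),
        )
        mid = (merged["bid"] + merged["ask"]) / 2.0
        sign = np.sign(merged["price"] - mid)   # +1 buy, -1 sell, 0 at midpoint
        tick = np.sign(merged["price"].diff())  # tick test for midpoint trades
        tick = tick.replace(0, np.nan).ffill().fillna(0)
        merged["trade_sign"] = sign.where(sign != 0, tick)
        return merged

In a deployed solution these operations would presumably run server-side (in the tick store or on the Hadoop cluster) rather than client-side as sketched here.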
RFI Response Instructions:
We anticipate responses will come in one of the following three general categories, although we do not intend this list to limit responses or suggestions - alternatives are welcomed.
1. A commercial, off-the-shelf (COTS) software product that uses OFR's existing Hadoop infrastructure as its data storage/presentation backend.
2. A commercial, off-the-shelf (COTS) software product that is self-contained.
3. A custom-built solution that uses OFR's existing Hadoop infrastructure as its data storage/presentation backend.
RFI responses shall be in the form of a Capability Statement or White Paper, no more than 10 pages (including graphs and charts) in no less than an 11-point font, that addresses:
1. Commercial off-the-shelf software/product availability and ability to fulfill the above functionality. Describe any customization that would be required to fulfill the required functionality.
2. Company's profile and capability in providing these types of solutions, if any.
3. Business type (i.e., large business, small business, small disadvantaged business, woman-owned business, HUBZone small business, etc.) based upon North American Industry Classification System (NAICS) code 541512, Computer Systems Design Services. Please refer to Federal Acquisition Regulation (FAR) Part 19 for additional detailed information on Small Business Size Standards. The FAR is available at http://www.acquisition.gov/comp/far/index.html
4. Past performance history/experience in providing this area of expertise and other relevant information that would enhance the Government's understanding of the information submitted.
5. Budgetary estimates, to include a pricing model for all services, hardware, and software licensing to implement the proposed solution.
6. Companies responding to this RFI may be invited to demonstrate their solutions.
- Web Link
- FBO.gov Permalink: https://www.fbo.gov/spg/TREAS/BPD/DP/RFI-OFR-15-0048/listing.html
- Place of Performance
- Address: Washington, District of Columbia, United States
- Record
- SN03721210-W 20150507/150506000025-1b0fe4bf45e4d96c3914ea525289d5ea (fbodaily.com)
- Source
- FedBizOpps Link to This Notice (may not be valid after Archive Date)