The last two decades have seen a large number of sudden and severe disruptions to financial markets in the U.S.: the 9/11 attacks, flu pandemics, the Northeast blackout, and a whole series of hurricanes and major storms that have knocked out power, disrupted transportation networks, and forced the temporary closure of market infrastructure.
These events have tested the resiliency of the futures industry's business operations and highlighted the importance of having a strategy for limiting the impact on liquidity, stability and confidence. In response, financial firms have developed business continuity plans that identify the businesses, systems, processes and people that are critical to the enterprise, along with strategies to deal with everything from terrorist attacks to the death of the CEO.
Developing a business continuity strategy is not a one-time exercise, however. The financial services technology environment has undergone a number of changes in recent years, including the growth of algorithmic and high-frequency trading, the introduction of new market data delivery systems, the increasing reliance on the co-location of trading infrastructure in data centers, and a restructuring of the exchange landscape through mergers and the introduction of new venues.
As a result, market participants need to incorporate the latest changes in the technology environment into their business continuity plans. A crucial component of the process is testing these plans to make sure they work. Frequent testing is the number one lesson learned from previous disruptions—waiting until a disruption occurs is a bad way to find out that your network connections have changed and your recovery plan does not work.
Testing the Trading Network
For the past 14 years, FIA has spearheaded an effort by the U.S. futures industry to enhance business continuity planning. Each fall, FIA conducts a disaster recovery test to assess industry-wide preparedness to recover from disruptions and resume business. This annual initiative typically involves conducting a mock trading session on the weekend. This allows the exchanges, clearinghouses, brokers, trading firms and key service providers to test their backup systems and identify any problems that might disrupt trading during a real-life disaster.
The goal is to complement the testing that individual firms do on their own. Rather than each firm trying to schedule a test with each exchange or clearinghouse, the industry test provides an opportunity for all firms to test their systems at the same time. This provides economies of scale and creates a framework for bringing the U.S. futures markets up to a higher level of preparedness.
The mock trading session provides an opportunity to involve secondary and backup sites in the flow of communications among the various market participants and the key nodes in the market infrastructure. A limited number of orders are submitted to the exchanges, trades are matched, and the resulting fills are returned for processing. This process tests systems such as order management, FIX messaging gateways, risk management, trade matching and execution reporting.
The test is voluntary, but all the major segments of the infrastructure take part: the futures exchanges and clearinghouses, the brokers that provide market access and clearing services, and the trading firms that are connected directly to the exchanges and provide a large amount of the liquidity. A number of key service providers and software vendors also participate, reflecting their mission-critical roles in the futures industry ecosystem. Some of these companies provide front and middle office systems for execution, risk management and market data. Others provide back office systems for clearance, settlement and trade reporting.
Another important dimension of the test is the inclusion of the limit hub operated by Traiana that clearing firms use to manage credit checks on swap execution facilities. This limit hub is an important part of the infrastructure for the swap markets, and the mock trading session provides an opportunity to test the systems involved in trading and clearing credit default swaps and interest rate swaps.
Running Systems on the Weekend
Testing takes place on a Saturday when markets are closed. The systems used in the test are prepared and data is backed up for recovery purposes on Friday evening, as is the norm. During the test, the exchanges and clearinghouses typically operate from their backup sites using their backup systems. Some firms also have test staff working remotely from alternate sites or home locations to test the efficacy of remote worksites and remote communications.
Each exchange that participates in the test provides a test script that dictates which futures and/or options products will be used in the test. This determines the orders that firms are expected to create, the functionalities that they will exercise, and the information they will need to retrieve from the exchange's trading engines, web portals and clearing systems. The test plans also lay out the type of pre-test information that the firms and service providers are required to provide to the exchanges, including key personnel contacts and the identification of the systems and interfaces to be used during the test.
Data created during the test—i.e., orders, quotes, trades, output reports, and so on—are captured by the exchanges and stored for future research and analysis purposes. At the conclusion of testing, all systems are returned to a production-ready state, in anticipation of trading on Sunday evening or Monday morning.
These tests are not designed to assess the ability of these systems to handle unusually high volume or volatility. The rationale is that if a firm can use its backup systems during this mock trading session to connect to an exchange, enter a small but meaningful number of orders, receive execution reports and clearinghouse outputs, and bring that information into its risk management systems, then the odds are that it will be able to rely on those backup systems during an emergency.
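The rationale above amounts to a short checklist, and a firm could record its own outcome against it. The following is a minimal sketch in Python, not part of any FIA or exchange tooling; the `BackupSessionResult` class, its field names and the sample values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BackupSessionResult:
    """Hypothetical record of one firm's mock session at one exchange."""
    connected_to_exchange: bool
    orders_entered: int
    execution_reports_received: int
    clearing_outputs_received: bool
    loaded_into_risk_system: bool


def failed_steps(result):
    """Return the names of the checklist steps the backup systems did not complete."""
    checks = [
        ("connect to exchange", result.connected_to_exchange),
        ("enter orders", result.orders_entered > 0),
        ("receive execution reports", result.execution_reports_received > 0),
        ("receive clearinghouse outputs", result.clearing_outputs_received),
        ("load into risk management", result.loaded_into_risk_system),
    ]
    return [name for name, ok in checks if not ok]


# Hypothetical outcome: everything worked except the risk system load.
result = BackupSessionResult(True, 5, 5, True, False)
print(failed_steps(result))
```

An empty list would mean the firm completed every step in the chain; anything else names the steps to document for follow-up.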
Lessons Learned from 2017 Test
2017 test in summary
Date: Oct. 14
Number of market infrastructure operators involved: 24
Number of clearing and non-clearing firms involved: 66
Number of people involved: more than 800
Locations of backup sites tested: Dubai, Frankfurt, London, Madrid, Montreal, Paris, Toronto, Florida, Illinois, Missouri, New Jersey, New York, North Carolina, Utah
Organized by: FIA’s Market Technology Division and its Business Continuity Management Committee.
The most recent test was conducted on Oct. 14, 2017. Preparations began six months in advance and involved more than 800 support staff and management from various market participants, exchanges, clearinghouses and service providers across the U.S., Canada and Europe. On the day of the test, 66 firms joined the mock trading sessions from backup sites across the U.S. and in Europe and the Middle East. Most firms tested successfully, meaning that they were able to fulfill the instructions in the test scripts. Success rates varied by exchange, however, ranging from 83% to 100%.
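For readers who want to see how a per-exchange success rate of this kind is derived, here is a minimal Python sketch. It assumes the rate is simply the share of participating firms that fulfilled the test script. The exchange names and firm counts are hypothetical and are not the actual 2017 figures.

```python
def success_rate(passed, tested):
    """Percentage of participating firms that fulfilled the test script."""
    if tested <= 0:
        raise ValueError("tested must be positive")
    return 100.0 * passed / tested


# Hypothetical counts per exchange: (firms that passed, firms that tested).
results = {"Exchange A": (24, 24), "Exchange B": (25, 30)}

rates = {name: success_rate(p, t) for name, (p, t) in results.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.0f}%")
```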
Each exchange determined the parameters of the products and systems tested at that exchange. For example, at Cboe Futures Exchange, firms tested their ability to submit orders for VIX futures and use the exchange's web-based applications such as Market Replay and Risk Controls. Firms also tested their connections to OCC, the clearinghouse that supports CFE, by transmitting trades to OCC's backup systems via MQ and SFTP.
At Eurex, the test simulated the outage of its primary data center, which hosts its matching engine, gateways and co-location site, and the fail-over to its backup data center. The test validated trade data and customer account setup in the exchange's disaster recovery environment and the ability of firms to log into the exchange's systems, manage orders and quotes, and process trades.
As in prior years, firms and exchanges also tested the “people side” of their business continuity plans, in the sense that their staff worked from remote locations and in some cases from home. The test also included the National Futures Association, the self-regulatory organization for the U.S. futures industry. NFA successfully received regulatory file uploads from an exchange that outsources its regulatory compliance to NFA.
After the test, the firms indicated that it helped them to:
- Exercise their business continuity/disaster recovery plans,
- Identify internal single points of failure,
- Test other in-house applications and systems at the same time,
- Tighten up and document their business continuity procedures,
- Better understand the need for cross-training, and
- Test connectivity with exchanges’ DR sites.
Exchanges and clearinghouses indicated that the test helped them to:
- Test connectivity to/from DR sites,
- Identify/refine pre-test and post-test procedures for connectivity testing,
- Tighten up and document their business continuity and system fail-over procedures,
- Improve test scripts and plans for future tests,
- Identify some internal single points of failure, and
- Better understand the need for cross-training.
The test uncovered some areas for improvement. Most of the problems encountered were categorized as “real world” issues, such as incorrect network IP addresses or MQ channel configurations, missing passwords, and anomalies in hardware or software configurations. Most of these issues were promptly resolved in concert with staff from the exchanges and clearinghouses. Issues that could not be resolved were documented for follow-up actions.
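Configuration drift of this kind can often be caught before test day by comparing the backup endpoints an exchange publishes against what a firm actually has configured. The following Python sketch illustrates the idea under stated assumptions: the endpoint names, addresses (taken from the reserved documentation range 192.0.2.0/24) and ports are hypothetical, and `can_connect` is an optional reachability probe that is defined but not called here.

```python
import socket


def find_config_mismatches(expected, configured):
    """Return (name, expected_endpoint, configured_endpoint) for each entry that differs."""
    mismatches = []
    for name, endpoint in sorted(expected.items()):
        actual = configured.get(name)  # None if the firm has no entry at all
        if actual != endpoint:
            mismatches.append((name, endpoint, actual))
    return mismatches


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Hypothetical endpoints: what the exchange published vs. what the firm configured.
expected = {
    "order_gateway": ("192.0.2.10", 9001),
    "drop_copy": ("192.0.2.11", 9002),
}
configured = {
    "order_gateway": ("192.0.2.10", 9001),
    "drop_copy": ("192.0.2.99", 9002),  # stale address left over from an earlier setup
}

mismatches = find_config_mismatches(expected, configured)
for name, want, have in mismatches:
    print(f"{name}: expected {want}, configured {have}")
```

A comparison like this only catches address and port mismatches. Missing passwords and MQ channel settings would need their own checks against whatever the exchange's test plan specifies.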
Another type of problem occurred when firms used different staff to run the systems during the weekend session. In some cases, these staff did not have the technical or domain knowledge needed to fix problems that cropped up during the test, or did not have access to all of the applications being tested. In effect, the test served as a cross-training exercise, which is another dimension of business continuity planning.
One recommendation that emerged from the test was that exchanges should investigate more efficient methods to facilitate seamless failover from primary to backup systems. One problem was that firms with primary systems co-located in an exchange’s main data center were impacted when the exchange failed over to its disaster recovery data center.
Looking Ahead to 2018
The 2018 test* has been scheduled for Saturday, Oct. 13. The committee plans to expand the scope to address firms’ ability to test from remote recovery sites (i.e., out-of-region capabilities) and to introduce variations on what has been tested in the past. This is in part to address evolving regulatory requirements that affect the business continuity and disaster recovery planning of market participants.
The committee will continue to coordinate and refine the annual test initiative and strive to add participants. These may include more clearing and non-clearing firms, exchanges, swap execution facilities and swap data repositories.
As has been done in prior years, the committee will liaise with its counterparts at the Securities Industry and Financial Markets Association. Both industry groups have conducted similar testing on common dates in prior years and many firms are members of both organizations.
More information about the 2017 test, including a list of participating exchanges, is available here.
John Rapa is the president and chief executive officer of Tellefsen and Company, a management consulting firm focused on the capital markets and derivatives industries. He is chair of the FIA Market Technology Division’s Business Continuity Management committee and has helped facilitate and manage the annual disaster recovery test over the past 14 years.
* The test date was originally published as Oct. 27. The correct date for the 2018 test is Oct. 13.