Sunday, March 12, 2006

The FAST Protocol

The Market Data Optimization Working Group (aka MDOWG) under FLP has recently released the first version of the FAST protocol. And it really seems like FAST is fast. The initial POCs show improvements of up to 70% compared to “native� FIX.

But what is FAST (Fix Adapted for STreaming) and how will it influence the existing FIX session? This blog entry will try to elaborate on these questions. I have looked at the material supplied at the FAST Technical Summit held at the London Stock Exchange back in January.

The FAST protocol consists to two specifications: FAST Field Encoding Specification (FAST CODEC) and FAST Serialization Specification (FAST SERDES). The first has to do with sending fewer tags, and hence less data over the wire, the second has to do with compression of the shorter message before sending it.

There are two approaches to implementing FAST: as an integrated part of the FIX session or as a separate FAST session layer. It is recommended to use the later as this will have a minimal or no impact on the existing applications. An illustration of the message stack could be the following:

Business application
FIX message parsing
FAST Field encoding/decoding
FAST Wire format encoding/decoding
TCP
IP

But why FAST at all? We have in the last few years seen dramatically increased market data volumes leading to high band width and processing costs, and there is no doubt that this trend will continue, as the different exchanges add more and more new products to their offerings. It is then easy to see, that if FAST offer up to 70% better utilization of your existing line capacity, you may well be able to stick with the T1 line you have, and not invest in a new T3 line.

If we should look at the details of the FAST protocol we could begin by looking at the basic feature set making up the protocol.

First of all it has been designed and optimized for message streams. This does not mean that FAST can not be used for other purposes, e.g. order routing as we shall see later. FAST is content aware. This requires knowledge about the different messages structures, which on one hand leads to less flexibility, but on the other hand to a much more efficient protocol. FAST uses a byte-oriented binary representation and variable-length fields. It has been defined that each message must contain at least one or more fields, hence no fields, no message, nothing to send. The last feature is the use of a presence map which enables efficient use of default values.

A basic FAST Implementation consists of the following:

  • A simple configuration in which the Sender encodes the data and the Receiver decodes the data.
  • No additional session management is needed
  • FAST-encoded data is sent directly over the native transport.
  • Templates are sent out-of-band or statically downloaded.
  • Templates are defined in a simple, human-readable format


At this point we have to define a Template. We need to understand how they are defined and used by FAST.

In general, a template is used to specify the structure, data types, and field operations of a message type. It specifies all fields included in a message as well as the sequence of those fields. If required it may also support repeating groups which allow a single message to efficiently convey multiple instructions: bids, asks, trades, etc.

When planning to encode a data feed the user should begin by converting standard message formats to temples, as all message types must be expressed in the proper template format.

When the template is defined, FAST uses it in a number of ways. First of all it provides the required content-awareness as described in the basic feature set. It allows FAST to encode and decode on a field by field basis and it provides critical information for both field encoding and serialization operations. The field encoding instructions given in the template are conveyed to FAST which performs the appropriate field encoding operations. Last data type descriptions are specified in the template which informs the Serializer whether a field is a string, an integer or a decimal value.

Templates can be defined using two different notations known as the Compact notation and the XML notation. I will only look at the former.

The structure of the Compact Notation is

((tag number)(data type)(field encoding operator))

Please note, that this is not an extensible solution and has known limitations, which is why the XML format is also proposed. I just prefer the compact notation as it is a straightforward way to define a template.

A summary of the Field Encoding Operators is given below:

! : Default Coding – default value per template
= : Copy Coding – copy prior value
+ : Increment Coding – increment prior value
- : Delta Coding – numeric or string differential
@ : Constant Value Coding – constant value specified in template
* : Implicit Value Coding – implies field values

The Data Type Descriptors is defined as:

s : string
u : unsigned integer
U : Unsigned integer supporting a NULL value
I : Signed integer
i : Signed integer supporting a NULL value
F : Scaled number

An example is in order to see how this works. Let us look at an Order – Single (35=D), just to stress the point that FAST can be used for other purposes than market data. Please note that only the logical result of the field encoding is shown. The serialization will compact the message even further and produce the physical message to be sent.

“[0]� represents a basic field delimiter.
The template on compact form is the following:

8s@FIX.4.2[0]9u[0]35s@D[0]49s=[0]56s=[0]34u[0]52s-[0]11s-[0]54s=[0]38u=[0]40s=[0]21s=[0]55s=[0]44F=[0]10u

We want to place a Limit order to Buy 100 Microsoft (MSFT.OQ) at the price of 24.75. This will result in the following order as FIX. To ease readability we only identify the instrument by tag 55 (Symbol) and not the usual tag 22 (IDSource), tag 48 (SecurityID), and tag 100 (ExDestination):

8=FIX4.2[0]9=108[0]35=D[0]49=SENDER[0]56=TARGET[0]34=7[0]52=20060312-21:53:05[0]11=12345678[0]54=1[0]38=100[0]40=2[0]21=1[0]55=MSFT.OQ[0] [0]44=24.75[0]10=120[0]

If we FAST field encode the above, we will save around 41% before serialization:

[0]108[0][0]SENDER[0]TARGET[0]7[0]20060312-21:53:05[0]12345678[0]1[0]100[0]2[0]1[0]MSFT.OQ[0]24.74[0]120

The second order we want to place is a Limit to Sell 100 Apple (AAPL.OQ) at 12.55. This will give the following FIX order:

8=FIX4.2[0]9=108[0]35=D[0]49=SENDER[0]56=TARGET[0]34=8[0]52=20060312-21:53:15[0]11=12345679[0]54=2[0]38=100[0]40=2[0]21=1[0]55=AAPL.OQ[0][0]44=12.55[0]10=129[0]

When we FAST field encode the second order we save 73% before serialization:

[0]105[0][0][0][0]8[0]15[0]9[0]2[0][0][0][0]AAPL.OQ[0]12.55[0]129

We can likewise give the template for the corresponding execution report message. On compact form it can be written as:

8s@FIX.4.2[0]9u[0]35s@8[0]49s=[0]56s=[0]34u[0]52t-[0]11s-[0]54s=[0]38u=[0]40s=[0]55s=[0]44F=[0]37s-[0]17s-[0]20s=[0]39s=[0]150s=[0]59s=[0]31F=[0]32u=[0]14u=[0]6F=[0]151u=[0]60s-[0]58s=[0]10u

The accept message - Execution Report (New) - for our order place request for the Microsoft equities is as FIX:

8=FIX.4.2[0]9=204[0]35=8[0]49=TARGET[0]56=SENDER[0]34=6[0]52=20060312-21:53:05[0]11=12345678[0]54=1[0]38=100[0]40=2[0]55=MSFT.OQ[0]44=24.75[0]37=OrderID001[0]17=ExecID1[0]20=0[0]39=0[0]150=0[0]59=0[0]31=0[0]32=0[0]14=0[0]6=0[0]151=100[0]60=20060312-21:53:06[0]58=New order[0]10=033[0]

Again we will FAST field encode the FIX message. This will give us a saving of 39% before serialization

[0]204[0][0]TARGET[0]SENDER[0]6[0]20060312-21:53:05[0]12345678[0]1[0]100[0]2[0]MSFT.OQ[0]24.75[0]OrderID001[0]ExecID1[0]0[0]0[0]0[0]0[0]0[0]0[0]0[0]0[0]100[0]20060312-21:53:06[0]New order[0]033

The accept message of the second order as FIX is:

8=FIX.4.2[0]9=204[0]35=8[0]49=TARGET[0]56=SENDER[0]34=7[0]52=20060312-21:53:15[0]11=12345679[0]54=2[0]38=100[0]40=2[0]55=AAPL.OQ[0]44=12.55[0]37=OrderID002[0]17=ExecID2[0]20=0[0]39=0[0]150=0[0]59=0[0]31=0[0]32=0[0]14=0[0]6=0[0]151=100[0]60=20060312-21:53:18[0]58=New order[0]10=043[0]

As was the case for the actual place order request, it is for the second execution report that we really see the advantage of FAST. If we field encode the second response we save around 78% before serialization.

[0]204[0][0][0][0]7[0]15[0]1[0]2[0][0][0]AAPL.OQ[0]12.55[0]4[0]2[0][0][0][0][0][0][0][0][0][0]18[0][0]043

After this nobody should be in doubt about the strength of the FAST protocol and the enormous potential it holds to optimize the existing FIX session.

No comments: