Thursday, March 30, 2006

FAST Reference Code for C++

I have been asked to make the reference code for C++ for the FAST protocol. The reference code – and all other information about the FAST protocol – can be found at www.fixprotocol.org/fast.

This work will most likely be done sometime this summer.

Tuesday, March 21, 2006

Lock free caching

All developers writing multi-threaded, real-time, mission-critical systems have encountered the problem of reading from a cache and updating it at the same time. This is something that should be avoided, as it will generally lead to exceptions, but how do you get your cache updated without introducing concurrency problems in your code?

The following will address this problem and give a (well-known) design to solve it. The small code samples are in C++, but any language will do. I will also show a way to check for updates without reading all values (we assume here that the values are stored in a database), hence keeping the load on the database to a minimum.

Please add the usual disclaimers about this not being production code, that you have to add error checking, etc. I must also apologize for the layout; when I have more time, I will create a template/style to accommodate code snippets.

Assume that you have a map of some sort.

typedef std::map<key,value> CMapCollection;
CMapCollection m_colMyMap[2];
long m_lMapCacheIdx;

As can be seen, we declare an array of two maps. This holds the active and the passive map. m_lMapCacheIdx holds the index of the active cache, so when we wish to look up a key in the map, we do the following:

CMapCollection::iterator it = m_colMyMap[m_lMapCacheIdx].find(theKey);

We can now do our lookups in the active map, and update the passive map without introducing concurrency problems in the form of locks.

My UpdateCache() function will connect to a database, retrieve a recordset and update the cache accordingly.

bool CMyClass::UpdateCache( void )
{
    bool bUpdateOk = m_pConnection->AccessDatabase(...);

    // Find the passive cache
    const long lPassiveIdx = m_lMapCacheIdx ? 0 : 1;

    // Clear the passive cache
    m_colMyMap[lPassiveIdx].clear();

    // Loop over the recordset returned by the call to AccessDatabase(...) and
    // update the just-cleared passive cache.

    // Switch active cache
    if ( bUpdateOk )
    {
        m_lMapCacheIdx = lPassiveIdx;
    }

    return true;
}


We periodically want to check for new values, so we have a worker thread that times out at a given interval, here 1000 ms.

unsigned long CMyClass::_WorkerThread( CMyClass *pThis )
{
    HRESULT hRes = CoInitializeEx(NULL, COINIT_MULTITHREADED);
    if (hRes != S_OK) ErrorExit( _T("CoInitializeEx failed"), hRes );

    unsigned long ulRes = pThis->WorkerThread();

    CoUninitialize(); // Closes the COM library on this thread and frees its resources

    return ulRes;
}

unsigned long CMyClass::WorkerThread()
{
    while ( true )
    {
        unsigned long ulEvent = WaitForSingleObject(m_hEvent, 1000); // Timeout every sec
        switch (ulEvent)
        {
        case WAIT_TIMEOUT:
            UpdateCache();
            break;
        case WAIT_OBJECT_0: // Something in queue or stop flag set
            if (m_bStopFlag)
            {
                return ThreadExitCleanUp(0);
            }
            break;
        default:
            return ThreadExitCleanUp(-1);
        } // end switch on ulEvent
    } // end while true
}

To ease readability I will reduce my UpdateCache function to the following:

bool CMyClass::UpdateCache( void )
{
    bool bRet = m_pConnection->AccessDatabase(...);
    // Do the update stuff
    return true;
}


I've declared my UpdateCache function as returning bool; old habit, as we do not yet use the return value; it might as well have been declared void.

So every second we call our UpdateCache() function. However, we may not want to actually check for updates this frequently.

We will add some more code to UpdateCache. Assume the following:

m_llUpdateTime is declared as __int64. This is the time stamp of the last update.
UtcTimeNow() is a helper function that returns the current UTC time (always use UTC when doing anything with time if you want it to work across time zones).

bool CMyClass::UpdateCache( void )
{
    __int64 tNow;
    UtcTimeNow(&tNow);

    if ( tNow < m_llUpdateTime ||
         (tNow - m_llUpdateTime)/TSW_SECONDS_TO_NANOSEC >= 900 )
    {
        m_pConnection->AccessDatabase(...);
        // Do the update stuff
        m_llUpdateTime = tNow;
    }
    return true;
}


We now only access the database every 15 minutes (900 seconds), but we still do it regardless of whether any rows in the database have been updated or not. If we are reading in large amounts of data, this is really a waste of resources.

To accommodate this last requirement, I will use a design pattern of first checking a timestamp in another (small) table, and only if this timestamp is more recent than when I last read my data will I access the database. The database schema for the TableUpdates table is: TableName (varchar), Timestamp (datetime).

The member variable m_TableUpdate is an instance of a class maintaining a connection to the database. It has two helper functions: CheckForTableUpdate() and AcceptTableUpdate(). CheckForTableUpdate() takes a table name as input and reports, through an output parameter, whether the time stamp for that table has changed since the last time we checked. AcceptTableUpdate() records when the last update was read. This design allows us to use the same class for checking multiple tables for updates.

Update our UpdateCache() function to the following:

bool CMyClass::UpdateCache( void )
{
    LPTSTR TableUpdateTag = _T("Exchanges");
    __int64 tNow;
    UtcTimeNow(&tNow);

    if ( tNow < m_llUpdateTime ||
         (tNow - m_llUpdateTime)/TSW_SECONDS_TO_NANOSEC >= 900 )
    {
        bool bIsUpdated = false;
        if ( !m_TableUpdate.CheckForTableUpdate(TableUpdateTag, &bIsUpdated) )
        {
            bIsUpdated = true; // We force a read if the check failed
            // Some error logging
        }

        if ( bIsUpdated )
        {
            if ( !m_TableUpdate.AcceptTableUpdate(TableUpdateTag) )
            {
                // Some more error logging
            }
            m_pConnection->AccessDatabase(...);
            // Do the update stuff
            m_llUpdateTime = tNow;
        }
    }
    return true;
}


We have now achieved what we wanted, namely to read the (map) values from the database only when they have been updated.

Monday, March 13, 2006

Mark Seemann has a blog

My good friend (and former colleague back when I did consultancy work in the happy dot-com era) Mark has a blog about .NET design and programming. Read it! Mark is a much better writer than I am, and he knows this stuff. Personally I don't understand half of what he is writing about, but that is more an issue with my lack of .NET knowledge than with the quality of the contributions.

We actually met back when both of us studied economics at the University of Copenhagen. If I remember correctly, it was a class in game theory, and as usual Mark had to explain the finer details to me.

Sunday, March 12, 2006

The FAST Protocol

The Market Data Optimization Working Group (aka MDOWG) under FPL has recently released the first version of the FAST protocol. And it really seems like FAST is fast. The initial POCs show improvements of up to 70% compared to "native" FIX.

But what is FAST (Fix Adapted for STreaming) and how will it influence the existing FIX session? This blog entry will try to elaborate on these questions. I have looked at the material supplied at the FAST Technical Summit held at the London Stock Exchange back in January.

The FAST protocol consists of two specifications: the FAST Field Encoding Specification (FAST CODEC) and the FAST Serialization Specification (FAST SERDES). The first has to do with sending fewer tags, and hence less data, over the wire; the second with compressing the shorter message before sending it.

There are two approaches to implementing FAST: as an integrated part of the FIX session or as a separate FAST session layer. It is recommended to use the latter, as this will have minimal or no impact on existing applications. An illustration of the message stack could be the following:

Business application
FIX message parsing
FAST Field encoding/decoding
FAST Wire format encoding/decoding
TCP
IP

But why FAST at all? In the last few years we have seen dramatically increasing market data volumes, leading to high bandwidth and processing costs, and there is no doubt that this trend will continue as the different exchanges add more and more new products to their offerings. It is then easy to see that if FAST offers up to 70% better utilization of your existing line capacity, you may well be able to stick with the T1 line you have instead of investing in a new T3 line.

To look at the details of the FAST protocol, we can begin with the basic feature set making up the protocol.

First of all, it has been designed and optimized for message streams. This does not mean that FAST cannot be used for other purposes, e.g. order routing, as we shall see later. FAST is content-aware. This requires knowledge of the different message structures, which on one hand leads to less flexibility, but on the other to a much more efficient protocol. FAST uses a byte-oriented binary representation and variable-length fields. Each message must contain at least one field; hence no fields, no message, nothing to send. The last feature is the use of a presence map, which enables efficient use of default values.

A basic FAST Implementation consists of the following:

  • A simple configuration in which the Sender encodes the data and the Receiver decodes the data.
  • No additional session management is needed.
  • FAST-encoded data is sent directly over the native transport.
  • Templates are sent out-of-band or statically downloaded.
  • Templates are defined in a simple, human-readable format.


At this point we have to define a Template. We need to understand how templates are defined and used by FAST.

In general, a template is used to specify the structure, data types, and field operations of a message type. It specifies all fields included in a message as well as the sequence of those fields. If required, it may also contain repeating groups, which allow a single message to efficiently convey multiple instructions: bids, asks, trades, etc.

When planning to encode a data feed, the user should begin by converting the standard message formats to templates, as all message types must be expressed in the proper template format.

When the template is defined, FAST uses it in a number of ways. First of all, it provides the required content-awareness described in the basic feature set. It allows FAST to encode and decode on a field-by-field basis, and it provides critical information for both field encoding and serialization operations. The field encoding instructions given in the template are conveyed to FAST, which performs the appropriate field encoding operations. Lastly, the data type descriptions specified in the template inform the Serializer whether a field is a string, an integer, or a decimal value.

Templates can be defined using two different notations known as the Compact notation and the XML notation. I will only look at the former.

The structure of the Compact Notation is

((tag number)(data type)(field encoding operator))

Please note, that this is not an extensible solution and has known limitations, which is why the XML format is also proposed. I just prefer the compact notation as it is a straightforward way to define a template.

A summary of the Field Encoding Operators is given below:

! : Default Coding – default value per template
= : Copy Coding – copy prior value
+ : Increment Coding – increment prior value
- : Delta Coding – numeric or string differential
@ : Constant Value Coding – constant value specified in template
* : Implicit Value Coding – implies field values

The Data Type Descriptors are defined as:

s : string
u : unsigned integer
U : Unsigned integer supporting a NULL value
I : Signed integer
i : Signed integer supporting a NULL value
F : Scaled number

An example is in order to see how this works. Let us look at an Order – Single (35=D), just to stress the point that FAST can be used for purposes other than market data. Please note that only the logical result of the field encoding is shown. The serialization will compact the message even further and produce the physical message to be sent.

"[0]" represents a basic field delimiter.
The template on compact form is the following:

8s@FIX.4.2[0]9u[0]35s@D[0]49s=[0]56s=[0]34u[0]52s-[0]11s-[0]54s=[0]38u=[0]40s=[0]21s=[0]55s=[0]44F=[0]10u

We want to place a Limit order to Buy 100 Microsoft (MSFT.OQ) at the price of 24.75. This will result in the following order as FIX. To ease readability, we identify the instrument only by tag 55 (Symbol) and not the usual tag 22 (IDSource), tag 48 (SecurityID), and tag 100 (ExDestination):

8=FIX.4.2[0]9=108[0]35=D[0]49=SENDER[0]56=TARGET[0]34=7[0]52=20060312-21:53:05[0]11=12345678[0]54=1[0]38=100[0]40=2[0]21=1[0]55=MSFT.OQ[0][0]44=24.75[0]10=120[0]

If we FAST field encode the above, we will save around 41% before serialization:

[0]108[0][0]SENDER[0]TARGET[0]7[0]20060312-21:53:05[0]12345678[0]1[0]100[0]2[0]1[0]MSFT.OQ[0]24.75[0]120

The second order we want to place is a Limit to Sell 100 Apple (AAPL.OQ) at 12.55. This will give the following FIX order:

8=FIX.4.2[0]9=108[0]35=D[0]49=SENDER[0]56=TARGET[0]34=8[0]52=20060312-21:53:15[0]11=12345679[0]54=2[0]38=100[0]40=2[0]21=1[0]55=AAPL.OQ[0][0]44=12.55[0]10=129[0]

When we FAST field encode the second order we save 73% before serialization:

[0]105[0][0][0][0]8[0]15[0]9[0]2[0][0][0][0]AAPL.OQ[0]12.55[0]129

We can likewise give the template for the corresponding execution report message. On compact form it can be written as:

8s@FIX.4.2[0]9u[0]35s@8[0]49s=[0]56s=[0]34u[0]52s-[0]11s-[0]54s=[0]38u=[0]40s=[0]55s=[0]44F=[0]37s-[0]17s-[0]20s=[0]39s=[0]150s=[0]59s=[0]31F=[0]32u=[0]14u=[0]6F=[0]151u=[0]60s-[0]58s=[0]10u

The accept message - Execution Report (New) - for our order place request for the Microsoft equities is as FIX:

8=FIX.4.2[0]9=204[0]35=8[0]49=TARGET[0]56=SENDER[0]34=6[0]52=20060312-21:53:05[0]11=12345678[0]54=1[0]38=100[0]40=2[0]55=MSFT.OQ[0]44=24.75[0]37=OrderID001[0]17=ExecID1[0]20=0[0]39=0[0]150=0[0]59=0[0]31=0[0]32=0[0]14=0[0]6=0[0]151=100[0]60=20060312-21:53:06[0]58=New order[0]10=033[0]

Again we will FAST field encode the FIX message. This will give us a saving of 39% before serialization

[0]204[0][0]TARGET[0]SENDER[0]6[0]20060312-21:53:05[0]12345678[0]1[0]100[0]2[0]MSFT.OQ[0]24.75[0]OrderID001[0]ExecID1[0]0[0]0[0]0[0]0[0]0[0]0[0]0[0]0[0]100[0]20060312-21:53:06[0]New order[0]033

The accept message of the second order as FIX is:

8=FIX.4.2[0]9=204[0]35=8[0]49=TARGET[0]56=SENDER[0]34=7[0]52=20060312-21:53:15[0]11=12345679[0]54=2[0]38=100[0]40=2[0]55=AAPL.OQ[0]44=12.55[0]37=OrderID002[0]17=ExecID2[0]20=0[0]39=0[0]150=0[0]59=0[0]31=0[0]32=0[0]14=0[0]6=0[0]151=100[0]60=20060312-21:53:18[0]58=New order[0]10=043[0]

As was the case for the actual order placement request, it is for the second execution report that we really see the advantage of FAST. If we field encode the second response, we save around 78% before serialization.

[0]204[0][0][0][0]7[0]15[0]1[0]2[0][0][0]AAPL.OQ[0]12.55[0]4[0]2[0][0][0][0][0][0][0][0][0][0]18[0][0]043

After this, nobody should be in doubt about the strength of the FAST protocol and the enormous potential it holds for optimizing the existing FIX session.