With the normal TCP/IP setup it can take up to two hours for a dropped connection to terminate — the Samba project faced this problem when creating clustered Samba.
In a video interview at linux.conf.au in Melbourne, the founder of the Samba project, Andrew Tridgell, explained how Samba tackled the problem of node failure in a cluster.
"The problem is that the client doesn't know it's happened — the client is waiting for a reply from the previous server and doesn't know the new server has taken over. It can take up to 2 hours with normal TCP setup for what's called a keep-alive packet to kick in and cause the connection to reset."
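The two-hour figure matches the usual default idle time before the first keep-alive probe is sent (7200 seconds, for example, in Linux's `tcp_keepalive_time` setting). As a rough illustration only, and not something the article says Samba does, an application that controls its own sockets could ask for faster detection. The function name and the timer values in this sketch are assumptions chosen for the example, and the per-socket timer options are applied only where the platform exposes them.

```python
import socket


def enable_fast_keepalive(sock, idle=60, interval=10, probes=5):
    """Turn on TCP keep-alive and, where supported, shorten its timers.

    The idle/interval/probes values are illustrative, not recommendations.
    """
    # Keep-alive is off by default for a new socket; switch it on.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # These per-socket options are not available on every platform,
    # so each one is set only if the socket module defines it.
    for name, value in (
        ("TCP_KEEPIDLE", idle),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between probes
        ("TCP_KEEPCNT", probes),      # unanswered probes before giving up
    ):
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_fast_keepalive(s)
    s.close()
```

Tuning like this has to be done on the client, which a file server generally cannot control, so it does not help a cluster node that has just taken over someone else's connections.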
Clustered Samba solves this problem with a "tickle ACK", an exchange of acknowledgement packets that allows the replacement node to issue a proper reset packet.
The "tickle ACK" mechanism is necessary because the reset packet needs a valid sequence number to be obeyed — an invalid reset packet is ignored. The catch is that only the client and the failed node know the correct sequence number, and this is where the "tickle ACK" proves useful.
Since every node knows the connections on every other node, a node that takes over can send the client an acknowledgement packet with a deliberately invalid sequence number. The client responds with an acknowledgement packet carrying the correct sequence number, which the new node then uses to issue a valid reset.
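The exchange described above can be sketched as a small simulation. This is a toy model, not Samba's code: the `Segment`, `Client`, and `take_over` names are hypothetical, the client follows only a simplified version of TCP's rules (an out-of-window ACK is answered with an ACK, an out-of-window reset is dropped), and sequence-number wraparound is ignored.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Segment:
    flags: str    # "ACK" or "RST"
    seq: int      # sender's sequence number
    ack: int = 0  # next sequence number the sender expects from its peer


class Client:
    """Simplified client-side TCP endpoint holding an established connection."""

    def __init__(self, snd_nxt: int, rcv_nxt: int, window: int = 65535):
        self.snd_nxt = snd_nxt  # next sequence number the client will send
        self.rcv_nxt = rcv_nxt  # next sequence number it expects from the server
        self.window = window
        self.state = "ESTABLISHED"

    def _in_window(self, seq: int) -> bool:
        return self.rcv_nxt <= seq < self.rcv_nxt + self.window

    def receive(self, seg: Segment) -> Optional[Segment]:
        if self.state != "ESTABLISHED":
            return None
        if seg.flags == "RST":
            # A reset is obeyed only if its sequence number is acceptable.
            if self._in_window(seg.seq):
                self.state = "CLOSED"
            return None
        if seg.flags == "ACK" and not self._in_window(seg.seq):
            # An unacceptable ACK is answered with an ACK that reveals
            # the sequence numbers the client currently holds.
            return Segment("ACK", seq=self.snd_nxt, ack=self.rcv_nxt)
        return None


def take_over(client: Client, bogus_seq: int = 0) -> bool:
    """Replacement node: send a tickle ACK, then reset with the learned number."""
    reply = client.receive(Segment("ACK", seq=bogus_seq))
    if reply is None:
        return False
    # reply.ack is what the client expects next from the server side,
    # so a reset carrying that sequence number will be accepted.
    client.receive(Segment("RST", seq=reply.ack))
    return client.state == "CLOSED"


if __name__ == "__main__":
    client = Client(snd_nxt=1000, rcv_nxt=500000)
    client.receive(Segment("RST", seq=12345))  # guessed number: ignored
    print(client.state)
    print(take_over(client), client.state)
```

In the demo the guessed reset leaves the connection untouched, while the reset built from the client's own reply closes it. That mirrors the point that only the client can supply the number the new node is missing.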
"The end result is that you can flick a service backwards and forwards between nodes in a cluster incredibly fast," said Tridgell.
"Which, for a person like myself who enjoys dealing with TCP packages and low level networking, is really great fun."