What are yRFCs?

yRFCs are discussion documents on one or more issues related to the design, development or implementation of the Y-Comm architecture. Y-Comm is a new architecture being developed to support heterogeneous networking. yRFCs represent the views of the authors of the document. They are non-binding and do not oblige anyone to agree with or to implement any concepts or details expressed therein. They can also be modified without notice. Finally, yRFCs are public documents and should not in whole or in part be the basis of a patent or copyright claim. Please contact the authors directly to discuss relevant issues.

yRFC2: The Simple Protocol (SP) Specification

Authors: Andriy Padiy, Leroy Riley and Glenford Mapp

This document was released to the Y-Comm Website Team on 12th January 2012

Update: 14th May 2012

Updates:

a) Replaced the ECN field with the SCOPE field.

b) The mess_mtu size refers to the maximum message size, not the maximum packet size.

1.0 Introduction:

This yRFC discusses the specification of a transport protocol for local area networking called the Simple Protocol (SP). The protocol was designed to optimize transport in the local area. The motivation is that Local Area Networking, which may be shaped by local conditions such as heterogeneous wireless networking or high-speed communications, needs to be treated separately from Internetworking, which is based more on Wide Area Networking. The strategy of tuning TCP to adapt to these conditions has met with mixed results, so the plan is to develop a simple protocol that can be used to optimize local interactions. SP is a simple message-based protocol, in contrast to TCP, which is stream-based.

1.1 Background

Communications in Local-Area Network (LAN) and Wide-Area Network (WAN) environments are beginning to take divergent paths. This has been motivated by several factors. The first is that local network speeds are still increasing; 1 Gbps is common in the local area, with 10 Gbps becoming available in a few years. In addition, the rise of wireless means that many peripheral networks will be wireless networks. This indicates that the transport characteristics of end networks will be dominated by the characteristics of wireless communications, which are completely different from those of wired systems. Hence, transport protocols such as TCP, which were developed to support wired communications, are not able to perform optimally in wireless environments. Adapting TCP has had mixed results because it is difficult to tune the protocol for these diverse LAN conditions.

The authors therefore believe that the argument that one transport protocol should be used for both global and local environments has been severely weakened. This paper looks at the development of a transport protocol specially designed for the local area. The authors believe that TCP should be used as a WAN protocol while a local protocol is used for local communications.

2.0 Requirements for a LAN Transport

A transport protocol for LAN communications needs certain properties to optimize its performance, and these differ from the properties of WAN transport protocols such as TCP.

Larger Window Sizes

In order to make use of high-speed LANs, LAN protocols should use a much larger window size compared with WAN protocols. Since the LAN is fast, a bigger window size can be used by default.

Support for Message-Based Communications

TCP is based on stream-like communication, so there are no message boundaries. However, most communication in LAN environments tends to be message-based or transactional. A message-based approach is therefore better suited to LAN protocols.

Ease of Packet Processing: Keeping it simple.

There is a strong case for using this design to simplify protocol processing. The idea is to have a small number of connection states as well as clearly defined packet types. The packet type is then used as the key parameter to drive the main loop.

Keep it flexible.

A key issue is that there are many diverse applications, so the protocol must be able to give different qualities of service to different applications. This means that mechanisms such as checksumming and error correction need to be set independently to yield different qualities of service.

3.0 Protocol Specification

Figure 1 shows the Diagram of the Simple Protocol while Figure 2 shows the length of the individual fields.

Figure 1: Diagram of the Simple Protocol

Figure 2: Showing the length of the Fields

The individual fields are detailed below:

The DEST_ID is a connection identifier on the remote machine. The SRC_ID identifies the connection on the local machine. So the connection is independently identified by [DEST_ID, IPaddress(DEST_ID)] or [SRC_ID, IPaddress(SRC_ID)]. Note that a value of zero is not regarded as a valid connection identifier.

Packet type is the type of packet being sent or received. SP supports a number of them:

START: the first packet transmitted to set up a connection

REJ: the connection has been rejected.

CNTL: this is a control packet and will be sent reliably

DATA: this is a data packet

ACK: this is an Acknowledgement packet

NACK: this is a NACK packet

ECHO: this is an echo packet

END: this is an end packet which is used to close a connection.
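The packet types above, together with the earlier remark that the packet type drives the main loop, can be sketched as a small dispatch table. This is only an illustration: the numeric codes and the names PacketType, make_dispatcher and the sample handlers are assumptions, since the text names the packet types but does not give their encodings.

```python
from enum import IntEnum


class PacketType(IntEnum):
    # Numeric codes are assumed for illustration; the specification text
    # lists the packet types but not their on-the-wire values.
    START = 0
    REJ = 1
    CNTL = 2
    DATA = 3
    ACK = 4
    NACK = 5
    ECHO = 6
    END = 7


def make_dispatcher(handlers):
    """Return a function that routes a packet to the handler for its type."""
    def dispatch(packet_type, packet):
        handler = handlers.get(PacketType(packet_type))
        if handler is None:
            return "dropped"  # no handler registered for this type
        return handler(packet)
    return dispatch


# Hypothetical handlers, present only to show the dispatch in action.
handlers = {
    PacketType.DATA: lambda pkt: f"data:{len(pkt)}",
    PacketType.ECHO: lambda pkt: "echo-reply",
}
dispatch = make_dispatcher(handlers)
```

With a table like this, the main loop reduces to reading a header, extracting the packet type and calling dispatch.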

PRI – 2 bits are used; hence SP supports 4 levels of priority. SP guarantees that a higher-priority packet will always be delivered before a lower-priority packet.

SC – 2 bits are used to support the idea of scope. Each server has a scope of operation, which defines the region in which it operates. Only machines that are within the scope of operation of the server are allowed to access it. There are 4 scopes, which are represented by the two bits:

00: This means that the server can only be accessed by processes on the same machine.

01: This means that the server is only accessible by machines on the same Local Area Network (LAN).

10: This means that the server is only accessible by machines on the same site.

11: This means that the server is globally accessible.
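A scope check along these lines might look like the following sketch. The text does not say how "same LAN" or "same site" is determined, so this example assumes they are given as IP prefixes; the names Scope, within_scope, lan_net and site_nets are hypothetical.

```python
import ipaddress
from enum import IntEnum


class Scope(IntEnum):
    MACHINE = 0b00  # same machine only
    LAN = 0b01      # same Local Area Network
    SITE = 0b10     # same site
    GLOBAL = 0b11   # globally accessible


def within_scope(scope, client_ip, server_ip, lan_net, site_nets):
    """Decide whether a client may reach a server with the given scope.

    Assumption: the server's LAN is one IP prefix (lan_net) and its site
    is a list of prefixes (site_nets). The specification does not define
    how these regions are discovered.
    """
    client = ipaddress.ip_address(client_ip)
    server = ipaddress.ip_address(server_ip)
    if scope == Scope.GLOBAL:
        return True
    if scope == Scope.MACHINE:
        return client == server or client.is_loopback
    if scope == Scope.LAN:
        return client in ipaddress.ip_network(lan_net)
    if scope == Scope.SITE:
        return any(client in ipaddress.ip_network(n) for n in site_nets)
    raise ValueError(f"unknown scope: {scope!r}")
```

For example, a client at 192.168.2.7 would be refused by a LAN-scoped server on 192.168.1.0/24 but accepted by a SITE-scoped server whose site covers 192.168.0.0/16.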

FLAGS: this comprises a field containing 8 bits:

BIT (0): Window-Size is valid

BIT (1): ST_CKS: Checksum this packet

BIT (2): ST_RTR: Recover packet if checksum error or missing

BIT (3): ST_RETRANS: This is an indication that the packet has been retransmitted

BIT (4): REMOTE_RESET: The connection has been reset by the other side

BIT (5): REPLY_REQUESTED: A reply has been requested for this packet

BIT (6): REPLY: This is a reply to a previous request

BIT (7): End-of-Message: Indicates that the last message was completely received.
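The eight flag bits can be modelled as a bit mask. The sketch below assumes that BIT (0) is the least significant bit of the FLAGS byte, which the text does not state; the names WINDOW_VALID and END_OF_MESSAGE are chosen here for the two bits that are described without a short identifier.

```python
from enum import IntFlag


class SPFlags(IntFlag):
    # Assumption: BIT (0) is the least significant bit of the FLAGS byte.
    WINDOW_VALID = 1 << 0     # Window-Size is valid
    ST_CKS = 1 << 1           # checksum this packet
    ST_RTR = 1 << 2           # recover packet if checksum error or missing
    ST_RETRANS = 1 << 3       # packet has been retransmitted
    REMOTE_RESET = 1 << 4     # connection reset by the other side
    REPLY_REQUESTED = 1 << 5  # a reply has been requested
    REPLY = 1 << 6            # this is a reply to a previous request
    END_OF_MESSAGE = 1 << 7   # last message was completely received


# A reliable, checksummed packet that asks for a reply.
flags = SPFlags.ST_CKS | SPFlags.ST_RTR | SPFlags.REPLY_REQUESTED
```

Because ST_CKS and ST_RTR are separate bits, an application can request checksumming without recovery, which matches the requirement that such mechanisms be set independently.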

CHKSUM: this is the 16-bit checksum. It is the same as the one used in TCP.
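TCP uses the 16-bit one's-complement Internet checksum, so a minimal sketch of the computation is given below. Which bytes SP feeds into the checksum (for example, whether a pseudo-header is included, as in TCP) is not stated in this document, so the function simply checksums whatever bytes it is given.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum over data, as used by TCP."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # big-endian 16-bit words
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF
```

A receiver can verify a packet by running the same function over the data with the checksum field included; a result of zero indicates that no error was detected.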

TOTAL_LEN: This is the total length of the packet including the SP header.

PBLOCK: This is used to signify which part of the message is contained in this packet.

TBLOCK: The total number of blocks/packets in a message.

MESS_SEQ_NO: the last message sent. Only DATA, CNTL and END packets can increase the MESS_SEQ_NO. The sending of other packet types does not increase the MESS_SEQ_NO of a connection.

MESS_ACK_NO: the last message received.

WINDOW_SIZE: This is 22 bits long and specifies the number of bytes that can be sent by the sending side before waiting for an acknowledgement from the receiver. Hence a maximum of 4 MB can be sent before waiting for an acknowledgement.

SYNC_NO: This is a variable which is used to ensure that ACKs have been correctly received. Every time a unique acknowledgement is received, the SYNC_NO is incremented. The SYNC_NO is 10 bits long and must be randomly assigned at connection start-up.
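The sizes of WINDOW_SIZE and SYNC_NO add up to 32 bits, and the arithmetic can be checked with a short sketch. The packing order (WINDOW_SIZE in the upper 22 bits) and the wrap-around of SYNC_NO on increment are assumptions, as neither is spelled out in the text; all function names are hypothetical.

```python
import secrets

WINDOW_BITS = 22
SYNC_BITS = 10
MAX_WINDOW = (1 << WINDOW_BITS) - 1  # 4194303 bytes, just under 4 MB
SYNC_MODULUS = 1 << SYNC_BITS        # 1024 possible SYNC_NO values


def pack_window_sync(window_size: int, sync_no: int) -> int:
    """Pack both fields into one 32-bit word (assumed layout)."""
    if not 0 <= window_size <= MAX_WINDOW:
        raise ValueError("WINDOW_SIZE does not fit in 22 bits")
    if not 0 <= sync_no < SYNC_MODULUS:
        raise ValueError("SYNC_NO does not fit in 10 bits")
    return (window_size << SYNC_BITS) | sync_no


def unpack_window_sync(word: int):
    """Return (window_size, sync_no) from a packed 32-bit word."""
    return word >> SYNC_BITS, word & (SYNC_MODULUS - 1)


def initial_sync_no() -> int:
    """Random 10-bit starting value, as required at connection start-up."""
    return secrets.randbelow(SYNC_MODULUS)


def next_sync_no(sync_no: int) -> int:
    """Increment SYNC_NO, wrapping at 10 bits (wrap-around is assumed)."""
    return (sync_no + 1) % SYNC_MODULUS
```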

Connection States

The Simple Protocol supports the following connection states:

NOT_INUSE = 0; this connection is not valid

CONN_REQUESTED = 1; a connection has been requested so a START packet has been transmitted but there has not been a reply.

CONNECTED = 2; the connection is in the connected state.

END_REQUESTED_LOCAL = 3; the local end has sent an END packet to close down the connection and is waiting on an END packet from the remote end to completely close the connection.

END_REQUEST_REMOTE = 4; the remote end has sent an END packet and is waiting on the local end.

CLOSING = 5; the connection is closing. This means that END packets have been received and sent by both sides of the connection. However, unacknowledged or missing or retransmitted packets could still be received in this state.

CLOSED = 6; the connection is closed and the resources can now be reclaimed.
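The states can be written down as an enumeration with a small transition table. The state values come from the list above; the event names and the transitions themselves are inferred from the state descriptions for the side that initiates the connection, so they should be read as a sketch rather than as the authoritative state machine.

```python
from enum import IntEnum


class ConnState(IntEnum):
    NOT_INUSE = 0
    CONN_REQUESTED = 1
    CONNECTED = 2
    END_REQUESTED_LOCAL = 3
    END_REQUEST_REMOTE = 4
    CLOSING = 5
    CLOSED = 6


# Event names are hypothetical labels chosen for this example.
TRANSITIONS = {
    (ConnState.NOT_INUSE, "send_start"): ConnState.CONN_REQUESTED,
    (ConnState.CONN_REQUESTED, "recv_start_reply"): ConnState.CONNECTED,
    (ConnState.CONNECTED, "send_end"): ConnState.END_REQUESTED_LOCAL,
    (ConnState.CONNECTED, "recv_end"): ConnState.END_REQUEST_REMOTE,
    (ConnState.END_REQUESTED_LOCAL, "recv_end"): ConnState.CLOSING,
    (ConnState.END_REQUEST_REMOTE, "send_end"): ConnState.CLOSING,
    (ConnState.CLOSING, "drained"): ConnState.CLOSED,
}


def next_state(state: ConnState, event: str) -> ConnState:
    """Look up the next state, rejecting events that are not allowed."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(
            f"event {event!r} not allowed in state {state.name}"
        ) from None
```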

TIMERS

There are a number of timers associated with every SP connection:

CONN_TIMER: This is activated when a connection request is sent. When the timer expires, the START packet is resent. This process is repeated 3 times, after which the connection is dropped.

ACK_TIMER: This is activated when an acknowledgement is requested. When the timer expires, an ACK packet is sent with REPLY_REQUESTED set. This process is repeated 3 times, after which the outstanding data packets are retransmitted and the RETRANS_TIMER is started. When the RETRANS_TIMER expires, the process is repeated; after 3 repetitions the connection is dropped.

ECHO_TIMER: This is used to time the end-to-end network latency. When an ECHO_TIMER expires, a packet is sent with REPLY_REQUESTED set. When the receiving stack gets this packet, it simply replies to it. Echo packets are therefore NULL packets which allow the system to measure the network latency of the connection. The ECHO_TIMER is used to ensure that these packets are sent periodically.

END_TIMER: This timer is set to ensure that the first END packet is acknowledged. An END packet is regarded as part of the data stream and has a distinct message sequence number. When one side wants to close the connection, it sends an END packet with a distinct message sequence number and starts the END_TIMER. If the END_TIMER expires, the END packet is resent. This process is repeated 3 times and then the connection is dropped.
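The CONN_TIMER and END_TIMER share the same pattern: send, wait, resend on expiry, and give up after three attempts. The sketch below captures that pattern without real clocks. It reads "repeated 3 times" as up to three resends after the initial transmission, which is an interpretation; send and reply_arrived are hypothetical callbacks, where reply_arrived blocks until either a reply is seen (True) or the timer expires (False).

```python
def send_with_retries(send, reply_arrived, max_resends=3):
    """Send once, then resend on each timer expiry.

    Returns True if a reply arrived, or False if the connection
    should be dropped because every attempt timed out.
    """
    send()  # initial transmission; the timer starts here
    for _ in range(max_resends):
        if reply_arrived():
            return True
        send()  # timer expired: resend and restart the timer
    return reply_arrived()  # wait one last time after the final resend
```

For example, if the first wait times out and the second sees a reply, the packet is sent exactly twice and the function returns True.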

Figure 3: Showing the Different Connection States

Packet Formats

SP is a bit unusual in that, apart from DATA and CNTL packets, all packets in SP are the size of the SP header. This means that for certain packet types some header fields have been renamed to reflect the function of that packet type. This approach means that no part of the protocol header is wasted.

However, this is compounded by the fact that SP is an asynchronous protocol, so normal SP packets do not reflect the complete state of both sides of a connection. This can be seen in a normal DATA packet. PBLOCK and TBLOCK reflect the block that is being transmitted in the message given by MESS_SEQ_NO, so these are variables associated with the transmission of data. The reception of messages is given by the variable MESS_ACK_NO, which indicates the current message being received. Notice that with this SP header, you do not know the last block (PBLOCK) of that message that was received. You only know whether the entire message has been received, because the End-of-Message bit is set in the flags when the entire message has been received.

In the case of Acknowledgement packets, the PBLOCK and TBLOCK parameters are associated with MESS_ACK_NO and not MESS_SEQ_NO. So an ACK reveals more than the piggybacked information: it also reveals the last block of that message that was received.

The most radical format change is for NACK packets. NACK packets indicate that there is a gap of missing packets. In SP, NACK packets delineate that gap by sending back information on the packets on either side of it. The last packet received before the gap is given by the MESS_ACK_NO and the PBLOCK number in the NACK SP header, while the received packet at the other end of the gap is given by MESS_SEQ_NO and the TBLOCK number. It is very important to realise that there is no way for SP to reliably work out exactly how many packets are missing. It could know how many messages are missing, but it would not know the size of each message. It is therefore up to the sender to retransmit the packets.

Note that in a stream with a number of gaps, SP is set up to deal with one gap at a time. So the system will keep transmitting NACK packets for the oldest gap until it is filled and then goes to the second oldest, etc.
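The way a NACK delineates the oldest gap can be sketched as follows. The record types Rx and Nack and the helper names are hypothetical, sequence-number wrap-around is ignored, and the receiver is assumed to remember the TBLOCK value of every packet it has received so that it can tell whether a message was complete.

```python
from typing import NamedTuple, Optional


class Rx(NamedTuple):
    msg: int     # MESS_SEQ_NO of the received packet
    pblock: int  # block number within the message
    tblock: int  # total blocks in that message


class Nack(NamedTuple):
    mess_ack_no: int  # message of the last packet before the gap
    pblock: int       # block of the last packet before the gap
    mess_seq_no: int  # message of the first packet after the gap
    tblock: int       # block of the first packet after the gap


def is_next(a: Rx, b: Rx) -> bool:
    """True if b directly follows a in the packet stream."""
    if a.msg == b.msg:
        return b.pblock == a.pblock + 1
    return (b.msg == a.msg + 1
            and a.pblock == a.tblock - 1
            and b.pblock == 0)


def oldest_gap(received) -> Optional[Nack]:
    """Return the NACK fields for the oldest gap, or None if no gap."""
    ordered = sorted(received, key=lambda r: (r.msg, r.pblock))
    for before, after in zip(ordered, ordered[1:]):
        if not is_next(before, after):
            return Nack(before.msg, before.pblock, after.msg, after.pblock)
    return None
```

Calling oldest_gap again after the missing packets arrive yields the next gap, which mirrors the one-gap-at-a-time behaviour described above.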

The Mechanisms

Connection

When Process A wants to start a connection to Process B, it chooses a SRC_ID which locally represents the connection structure. Note that the SRC_ID cannot be zero. It sends a START message with the REPLY_REQUESTED bit set. Note that in this initial START message, the DEST_ID must be set to zero or the call is rejected, since a connection id has not yet been allocated at the other end. The flags are set in the START packet and are taken to represent the type of the connection being requested. The application can also set its receive window size. If not, the default starting window size of 128 KB is used. Process A must also randomly generate a 10-bit SYNC_NO value which is placed in the START packet. After sending the START packet, a CONN_TIMER is started.

Process B gets the connection request and examines the source address as well as the type of connection being requested. If Process B does not want to connect, it issues a REJ packet. When the REJ packet is received by Process A, the connection is immediately shut down and all structures associated with the connection are released.

If Process B accepts the connection, it sends a START packet with the REPLY flag set indicating that it has accepted the connection. It first takes the SRC_ID of the incoming packet and then makes it the DEST_ID of the outgoing packet. It then chooses a local number or src_id and sets the SRC_ID of the outgoing packet to src_id. Note that the value of src_id cannot be zero. It then sets the same flags as the incoming packet. Note that since SP is meant to be quick there is no QoS negotiation built into the protocol, so if Process B does not want the same type of connection as Process A, then it must reject the connection. Process B then generates its window size and also generates a random 10-bit SYNC_NO value which it sends back to Process A.

When Process A receives a START packet with the REPLY bit set and the DEST_ID equal to the SRC_ID of its START packet, it knows that the connection has been accepted. It stops the CONN_TIMER and fills out the rest of the connection structure. It is also worth pointing out that in SP, the SYNC_NOs are crossed. So Process B must use the SYNC_NO generated by Process A in the original START packet and Process A must use the SYNC_NO generated by Process B in the reply packet. This helps to prevent replay attacks. Both Process A and Process B move to the CONNECTED state.
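The START exchange and the crossing of SYNC_NOs can be sketched as below. The Start record carries only the fields needed for the illustration, and all names (Start, make_request, accept, crossed_sync) are hypothetical; flags, window sizes and timers are left out.

```python
import secrets
from dataclasses import dataclass


@dataclass
class Start:
    src_id: int
    dest_id: int
    sync_no: int
    reply_requested: bool = False
    reply: bool = False


def new_sync_no() -> int:
    return secrets.randbelow(1 << 10)  # random 10-bit value


def make_request(src_id: int) -> Start:
    """Process A: build the initial START packet."""
    if src_id == 0:
        raise ValueError("SRC_ID cannot be zero")
    return Start(src_id=src_id, dest_id=0, sync_no=new_sync_no(),
                 reply_requested=True)


def accept(request: Start, local_id: int) -> Start:
    """Process B: accept the request and build the START reply."""
    if request.dest_id != 0:
        raise ValueError("initial START must carry DEST_ID zero")
    if local_id == 0:
        raise ValueError("SRC_ID cannot be zero")
    return Start(src_id=local_id, dest_id=request.src_id,
                 sync_no=new_sync_no(), reply=True)


def crossed_sync(request: Start, reply: Start):
    """Return (SYNC_NO that A must use, SYNC_NO that B must use)."""
    if not reply.reply or reply.dest_id != request.src_id:
        raise ValueError("reply does not match the outstanding request")
    return reply.sync_no, request.sync_no
```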

We now look at the use of the SC field. When a client wishes to talk to a server, it must first get the IP address of the server and the scope of the server. This information will be stored in the DNS. The client asks to be connected to the server and also includes the scope of the server. SP will check to see if the server is reachable according to the specified scope. If not, the connection request is rejected. If the request is admissible, it sends a START packet with the IP destination of the server along with the scope of the server. On receiving a START packet with its REPLY_REQUESTED bit set, the server looks at the scope of the destination in the IP packet. If the scope does not match or the source IP address is not within the receiving server’s scope, then a REJ packet is sent back to the client.

Data Transmission

After the connection is made, i.e., both Processes are in the CONNECTED state, they can begin to exchange data. In SP, data is sent using messages and each message can be divided into a number of blocks. The total number of blocks of a message is given by the parameter TBLOCK in the SP header. Each block of the message is sent as one SP packet. The particular block is given by the parameter PBLOCK in the SP header. In SP, it is recommended that if a message is composed of several blocks, then each block except the last should be of the same size. This allows the receiver to allocate memory for the entire message at the start of the message transfer. Every message is uniquely identified by the MESS_SEQ_NO parameter.
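Splitting a message into equal-sized blocks, with only the last block allowed to be shorter, can be sketched as follows. The function name split_message and the block_size parameter are hypothetical; the text does not say how the block size is chosen.

```python
def split_message(message: bytes, block_size: int):
    """Split a message into (PBLOCK, TBLOCK, payload) tuples.

    Every block except possibly the last has exactly block_size bytes.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    if not message:
        raise ValueError("message must not be empty")
    tblock = -(-len(message) // block_size)  # ceiling division
    return [
        (pblock, tblock,
         message[pblock * block_size:(pblock + 1) * block_size])
        for pblock in range(tblock)
    ]
```

Because all blocks but the last share one size, a receiver that sees any full-sized block can reserve at most TBLOCK times that size for the whole message.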

When the receiver gets a DATA packet that is the start of a message, it increases the MESS_ACK_NO number for that connection. The total number of blocks in the message is given by TBLOCK and this is used to set the local variable tblock_rx in the connection structure. The block of the message, PBLOCK, is used to set a local variable pblock_rx. For the first block in a message, PBLOCK will be zero and hence pblock_rx is set to zero. When the last block of a message is received, i.e., PBLOCK is equal to TBLOCK – 1, the receiver sets an End-of-Message Flag (EOF) which is sent on outgoing packets. This indicates to the sender that the message given by MESS_ACK_NO has been completely received, so the sender can de-allocate the blocks of that message.
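The receive-side bookkeeping described above can be sketched as a small class. It assumes in-order, loss-free delivery, so gap handling and NACKs are left out; the class name MessageReceiver, the starting value of zero for mess_ack_no and the end_of_message attribute are assumptions made for the example, while tblock_rx and pblock_rx follow the text.

```python
class MessageReceiver:
    """Tracks one connection's receive state for in-order DATA packets."""

    def __init__(self):
        self.mess_ack_no = 0         # assumed starting value
        self.tblock_rx = 0
        self.pblock_rx = -1
        self.end_of_message = False  # End-of-Message flag for outgoing packets
        self._blocks = []

    def on_data(self, pblock: int, tblock: int, payload: bytes):
        """Process one DATA packet; return the message once it is complete."""
        if pblock == 0:
            # Start of a new message: advance MESS_ACK_NO and reset state.
            self.mess_ack_no += 1
            self.tblock_rx = tblock
            self._blocks = []
            self.end_of_message = False
        self.pblock_rx = pblock
        self._blocks.append(payload)
        if pblock == self.tblock_rx - 1:
            # Last block: signal the sender that it may free the message.
            self.end_of_message = True
            return b"".join(self._blocks)
        return None
```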