EROS Object Reference

Standard Processes

(BSD Copyright Notice)

Network Socket

NOTE: This design is not implemented and is considered a failed attempt. It is presented here for historical information only.

Description

Networks don't really exist. They are a figment of some poor Steve Muir's imagination. The bits actually move from one machine to the other by magic; the physical infrastructure merely serves to establish the appropriate preconditions for belief in magic on the part of the users.

Note that the UNIX-isms have not yet been cleaned out of this page.

The Network Socket object encapsulates all of the functionality needed to build and use network connections to either remote or local hosts. The current object interface is based closely on the UNIX system call interface for sockets, and was chosen for speed of implementation rather than for the merits of the interface. In some places the functions of several UNIX system calls should probably be merged into a single request.

A Network Socket object has the following internal states:

    State      Description
    Unbound    Not bound to any particular network interface. May be used to establish new connections.
    Bound      Bound by a connect or bind operation.
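
The two states above can be pictured with a small model. This is only an illustrative sketch in Python, not EROS code: the names `SocketState` and `SocketStateModel` are hypothetical, and the model assumes that either bind or connect moves an Unbound socket to Bound and that a Bound socket stays Bound.

```python
from enum import Enum


class SocketState(Enum):
    UNBOUND = "Unbound"
    BOUND = "Bound"


class SocketStateModel:
    """Toy model of the two internal states (hypothetical, for illustration)."""

    def __init__(self):
        # A new socket is not bound to any network interface.
        self.state = SocketState.UNBOUND
        self.address = None

    def _enter_bound(self, address):
        # Only an Unbound socket may make the transition.
        if self.state is SocketState.BOUND:
            return False
        self.state = SocketState.BOUND
        self.address = address
        return True

    def bind(self, address):
        return self._enter_bound(address)

    def connect(self, address):
        return self._enter_bound(address)
```

Under this reading, a second bind or connect on the same object is rejected, which matches the error results listed for those operations later in the page.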

Every socket has an associated domain (protocol family), type, and protocol. The supported protocol families are:

    AF_UNIX    UNIX-style internal protocols
    AF_EROS    EROS-style internal protocols. Local connections only, but survives system restart.
    AF_INET    ARPA Internet protocols
    AF_ISO     ISO protocols
    AF_NS      Xerox Network Systems protocols

The supported connection types are:

    SOCK_STREAM       Reliable byte stream connection.
    SOCK_DGRAM        Unreliable, connectionless datagram service.
    SOCK_RAW          Provides access to internal network protocols and interfaces. It is not clear if this should be supported other than for reasons of compatibility.
    SOCK_SEQPACKET    Reliable, connection-oriented sequenced datagram protocol. Supported only by the AF_NS protocol family.
    SOCK_RDM          Not implemented.

A SOCK_STREAM type provides sequenced, reliable, two-way, connection-based byte streams. An out-of-band data transmission mechanism may be supported. A SOCK_DGRAM socket supports datagrams (connectionless, unreliable messages of a fixed, typically small, maximum length). A SOCK_SEQPACKET socket may provide a sequenced, reliable, two-way, connection-based data transmission path for datagrams of fixed maximum length; a consumer may be required to read an entire packet with each read system call. This facility is protocol specific, and presently implemented only for AF_NS. SOCK_RAW sockets provide access to internal network protocols and interfaces. The types SOCK_RAW, which is available only to the super-user, and SOCK_RDM, which is planned but not yet implemented, are not described here.

The protocol specifies a particular protocol to be used with the socket. Normally only a single protocol exists to support a particular socket type within a given protocol family. However, it is possible that many protocols may exist, in which case a particular protocol must be specified in this manner. The protocol number to use is particular to the "communication domain" in which communication is to take place.

Sockets of type SOCK_STREAM are full-duplex byte streams, similar to pipes. A stream socket must be in a connected state before any data may be sent or received on it. A connection to another socket is created with a connect invocation. Once connected, data may be transferred using read and write invocations or send and recv invocations. When a session has been completed a close invocation should be performed.
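
The full-duplex behavior described above can be tried on a UNIX-like host with Python's standard `socket` module, which exposes the same UNIX calls this interface is modeled on. This is a sketch, not EROS code: `socket.socketpair()` stands in for an explicit connect, and the helper names `recv_exact` and `ping_pong` are hypothetical.

```python
import socket


def recv_exact(sock, count):
    """Read exactly `count` bytes, since a stream recv may return fewer."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed before all bytes arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def ping_pong():
    # socketpair() returns two stream sockets that are already connected.
    left, right = socket.socketpair()
    try:
        left.sendall(b"ping")            # data flows left -> right
        request = recv_exact(right, 4)
        right.sendall(b"pong")           # and right -> left on the same pair
        reply = recv_exact(left, 4)
        return request, reply
    finally:
        # Close both ends once the session is complete.
        left.close()
        right.close()
```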

Out-of-band data may also be transmitted as described in send and received as described in recv.

The communications protocols used to implement a SOCK_STREAM ensure that data is not lost or duplicated. If a piece of data for which the peer protocol has buffer space cannot be successfully transmitted within a reasonable length of time, then the connection is considered broken and calls will indicate an error with -1 returns and with ETIMEDOUT as the specific code in the global variable errno. The protocols optionally keep sockets warm by forcing transmissions roughly every minute in the absence of other activity. An error is then indicated if no response can be elicited on an otherwise idle connection for an extended period (e.g. 5 minutes). A SIGPIPE signal is raised if a process sends on a broken stream; this causes naive processes, which do not handle the signal, to exit.

SOCK_SEQPACKET sockets employ the same system calls as SOCK_STREAM sockets. The only difference is that read calls will return only the amount of data requested, and any remaining in the arriving packet will be discarded.

SOCK_DGRAM and SOCK_RAW sockets allow sending of datagrams to correspondents named in send calls. Datagrams are generally received with recvfrom, which returns the next datagram with its return address.
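
As a rough illustration of datagram exchange, the sketch below sends one UDP datagram over the loopback interface and reads it back with `recvfrom`, which also yields the sender's return address. It uses Python's standard `socket` module on a UNIX-like host as a stand-in for the invocations described here; the function name `datagram_round_trip` is hypothetical.

```python
import socket


def datagram_round_trip(message):
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Port 0 asks the system to pick a free local port.
        receiver.bind(("127.0.0.1", 0))
        receiver.settimeout(2.0)  # do not hang forever if the datagram is lost
        # The correspondent is named in the send call itself.
        sender.sendto(message, receiver.getsockname())
        data, return_address = receiver.recvfrom(2048)
        # The sender was bound implicitly by sendto; compare port numbers.
        return data, return_address[1], sender.getsockname()[1]
    finally:
        sender.close()
        receiver.close()
```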

A control invocation can be used to specify a process group to receive a SIGURG signal when the out-of-band data arrives. It may also enable non-blocking I/O and asynchronous notification of I/O events via SIGIO.

The operation of sockets is controlled by socket level options. These options are defined in the file socket.h. Setsockopt and getsockopt are used to set and get options, respectively.


Socket Options

The following options are recognized at the socket level. Except as noted, each may be examined with getsockopt and set with setsockopt. SO_DEBUG enables debugging in the underlying protocol modules. SO_REUSEADDR indicates that the rules used in validating addresses supplied in a bind call should allow reuse of local addresses. SO_KEEPALIVE enables the periodic transmission of messages on a connected socket. Should the connected party fail to respond to these messages, the connection is considered broken and processes using the socket are notified via a SIGPIPE signal when attempting to send data. SO_DONTROUTE indicates that outgoing messages should bypass the standard routing facilities. Instead, messages are directed to the appropriate network interface according to the network portion of the destination address.
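
A minimal sketch of setting and examining a boolean socket-level option, written against Python's standard `socket` module on a UNIX-like host rather than the EROS object. The helper name `enable_and_read` is hypothetical. SO_DEBUG is left out because many systems restrict it to privileged callers.

```python
import socket


def enable_and_read(option):
    """Enable a boolean SOL_SOCKET option and return its value before and after."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        before = sock.getsockopt(socket.SOL_SOCKET, option)
        # A non-zero parameter enables a boolean option; zero disables it.
        sock.setsockopt(socket.SOL_SOCKET, option, 1)
        after = sock.getsockopt(socket.SOL_SOCKET, option)
        return before, after
```

For example, `enable_and_read(socket.SO_REUSEADDR)` should report a non-zero second value once the option is on.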

SO_LINGER controls the action taken when unsent messages are queued on a socket and a close is performed. If the socket promises reliable delivery of data and SO_LINGER is set, the system will block the process on the close attempt until it is able to transmit the data or until it decides it is unable to deliver the information (a timeout period, termed the linger interval, is specified in the setsockopt call when SO_LINGER is requested). If SO_LINGER is disabled and a close is issued, the system will process the close in a manner that allows the process to continue as quickly as possible.
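
The linger interval travels in a `struct linger`. The sketch below packs and unpacks that structure with Python's standard `socket` and `struct` modules. It assumes the common layout of two C ints (on/off flag, then interval in seconds), as on Linux; the helper names are hypothetical.

```python
import socket
import struct

LINGER_FORMAT = "ii"  # assumed layout: int l_onoff; int l_linger;


def set_linger(sock, enabled, seconds):
    packed = struct.pack(LINGER_FORMAT, 1 if enabled else 0, seconds)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, packed)


def get_linger(sock):
    size = struct.calcsize(LINGER_FORMAT)
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_LINGER, size)
    onoff, seconds = struct.unpack(LINGER_FORMAT, raw)
    return bool(onoff), seconds
```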

The option SO_BROADCAST requests permission to send broadcast datagrams on the socket. Broadcast was a privileged operation in earlier versions of the system. With protocols that support out-of-band data, the SO_OOBINLINE option requests that out-of-band data be placed in the normal data input queue as received; it will then be accessible with recv or read calls without the MSG_OOB flag. Some protocols always behave as if this option is set. SO_SNDBUF and SO_RCVBUF are options to adjust the normal buffer sizes allocated for output and input buffers, respectively. The buffer size may be increased for high-volume connections, or may be decreased to limit the possible backlog of incoming data. The system places an absolute limit on these values.
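
Buffer sizes can be requested and then read back to see what the system actually granted. This sketch uses Python's standard `socket` module on a UNIX-like host; the function name `request_buffer_sizes` is hypothetical. The granted values may differ from the request because the system may round, scale, or clamp them.

```python
import socket


def request_buffer_sizes(sock, send_bytes, recv_bytes):
    """Ask for new buffer sizes and return what the system reports afterwards."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, send_bytes)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, recv_bytes)
    granted_send = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    granted_recv = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return granted_send, granted_recv
```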

SO_SNDLOWAT is an option to set the minimum count for output operations. Most output operations process all of the data supplied by the call, delivering data to the protocol for transmission and blocking as necessary for flow control. Nonblocking output operations will process as much data as permitted subject to flow control without blocking, but will process no data if flow control does not allow the smaller of the low water mark value or the entire request to be processed. A select operation testing the ability to write to a socket will return true only if the low water mark amount could be processed. The default value for SO_SNDLOWAT is set to a convenient size for network efficiency, often 1024.

SO_RCVLOWAT is an option to set the minimum count for input operations. In general, receive calls will block until any (non-zero) amount of data is received, then return with the smaller of the amount available or the amount requested. The default value for SO_RCVLOWAT is 1. If SO_RCVLOWAT is set to a larger value, blocking receive calls normally wait until they have received the smaller of the low water mark value or the requested amount. Receive calls may still return less than the low water mark if an error occurs, a signal is caught, or the type of data next in the receive queue is different from that returned.

SO_SNDTIMEO is an option to set a timeout value for output operations. It accepts a struct timeval parameter with the number of seconds and microseconds used to limit waits for output operations to complete. If a send operation has blocked for this much time, it returns with a partial count or with the error EWOULDBLOCK if no data were sent. In the current implementation, this timer is restarted each time additional data are delivered to the protocol, implying that the limit applies to output portions ranging in size from the low water mark to the high water mark for output. SO_RCVTIMEO is an option to set a timeout value for input operations. It accepts a struct timeval parameter with the number of seconds and microseconds used to limit waits for input operations to complete. In the current implementation, this timer is restarted each time additional data are received by the protocol, and thus the limit is in effect an inactivity timer. If a receive operation has been blocked for this much time without receiving additional data, it returns with a short count or with the error EWOULDBLOCK if no data were received.
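
The receive timeout can be observed with a connected pair of sockets on which nothing has been sent. The sketch below uses Python's standard `socket` and `struct` modules. It assumes a 64-bit Linux-like host where `struct timeval` is two C longs and where Python reports EWOULDBLOCK as `BlockingIOError`; the function name `recv_with_timeout` is hypothetical.

```python
import socket
import struct


def recv_with_timeout(sock, seconds, microseconds):
    """Return received bytes, or None if the receive timed out with no data."""
    # Assumed layout of struct timeval: long tv_sec; long tv_usec;
    timeval = struct.pack("ll", seconds, microseconds)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVTIMEO, timeval)
    try:
        return sock.recv(1024)
    except BlockingIOError:
        # The blocked receive gave up with EWOULDBLOCK: no data arrived in time.
        return None
```

Calling this on one end of `socket.socketpair()` before the other end writes anything should return `None` after the timeout elapses.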

Finally, SO_TYPE and SO_ERROR are options used only with getsockopt. SO_TYPE returns the type of the socket, such as SOCK_STREAM; it is useful for servers that inherit sockets on startup. SO_ERROR returns any pending error on the socket and clears the error status. It may be used to check for asynchronous errors on connected datagram sockets or for other asynchronous errors.
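
Both options are read with getsockopt. A short sketch using Python's standard `socket` module on a UNIX-like host; the function name `describe_socket` is hypothetical.

```python
import socket


def describe_socket(sock):
    """Return (socket type, pending error); reading SO_ERROR also clears it."""
    sock_type = sock.getsockopt(socket.SOL_SOCKET, socket.SO_TYPE)
    pending_error = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    return sock_type, pending_error
```

A freshly created datagram socket should report `socket.SOCK_DGRAM` and a pending error of 0.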

Socket Options

    Option          Meaning
    SO_DEBUG        enables recording of debugging information
    SO_REUSEADDR    enables local address reuse
    SO_KEEPALIVE    enables keep connections alive
    SO_DONTROUTE    enables routing bypass for outgoing messages
    SO_LINGER       linger on close if data present
    SO_BROADCAST    enables permission to transmit broadcast messages
    SO_OOBINLINE    enables reception of out-of-band data in band
    SO_SNDBUF       set buffer size for output
    SO_RCVBUF       set buffer size for input
    SO_SNDLOWAT     set minimum count for output
    SO_RCVLOWAT     set minimum count for input
    SO_SNDTIMEO     set timeout value for output
    SO_RCVTIMEO     set timeout value for input
    SO_TYPE         get the type of the socket (get only)
    SO_ERROR        get and clear error on the socket (get only)

Operations

Check Alleged Key Type (OC = KT)

Returns the alleged type of the key.
    Reply    R1    ???? No type is assigned (yet).

SetSockType (OC = 1)

Sets the network domain, connection type, and protocol for this socket object.

    Request  L     Protocol family (a.k.a. protocol domain).
             L     Connection type.
             L     Protocol. An integer identifying the family-specific subprotocol.
    Results  0     Successful completion.
             1     Socket is currently connected.

Connect (OC = 2)

Establishes a connection to the peer named by the supplied socket address.

    Request  W*    A socket address structure. The content of the socket address structure is protocol specific.
    Results  0     Connection established.
             1     Socket is currently connected.
             2     Connection timed out.
             3     Connection failed.

Close (OC = 3)

Tears down the network connection, with or without waiting for the connection to drain. Regardless of reported errors, the connection is torn down on return from this invocation.

    Request  W     If 0, connection will be drained before it is torn down.
    Results  0     Connection closed.
             2     Drain timed out, connection is closed.

Read (OC = 4)

Reads the requested number of bytes from the socket. Some protocols discard packet data if a read operation will not hold a complete packet.

    Reply    B*    Bytes copied from the byte stream into the reply message.
    Results  0     Read completed successfully.
             4     One or more packets were truncated.

Write (OC = 5)

Writes some number of bytes to the socket. For datagram protocols, the payload must be of a size that will fit within the datagram size limitations.

    Request  B*    Data to be written to the stream.
    Results  0     Write completed successfully.
             4     One or more packets were truncated.

Send (OC = 6)

Send, Sendto, Sendmsg - send a message to the socket. For datagram protocols, the payload must be of a size that will fit within the datagram size limitations.

    Request  B*    Data to be written to the stream.
    Results  0     Send completed successfully.
             4     One or more packets were truncated.

Recv (OC = 7)

Receives the requested number of bytes from the socket. May be used to receive data on a socket whether or not it is connection-oriented.

Some protocols discard packet data if a read operation will not hold a complete packet.

    Request  B     The flags argument to a recv call is formed by or'ing one or more of the values:

                   MSG_OOB        process out-of-band data
                   MSG_PEEK       peek at incoming message
                   MSG_WAITALL    wait for full request or error

The MSG_OOB flag requests receipt of out-of-band data that would not be received in the normal data stream. Some protocols place expedited data at the head of the normal data queue, and thus this flag cannot be used with such protocols. The MSG_PEEK flag causes the receive operation to return data from the beginning of the receive queue without removing that data from the queue. Thus, a subsequent receive call will return the same data. The MSG_WAITALL flag requests that the operation block until the full request is satisfied. However, the call may still return less data than requested if a signal is caught, an error or disconnect occurs, or the next data to be received is of a different type than that returned.

    Reply    B*    Bytes copied from the byte stream into the reply message.
    Results  0     Recv completed successfully.
             4     One or more packets were truncated.
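
The effect of MSG_PEEK described above can be checked with a connected pair of stream sockets. This is a sketch against Python's standard `socket` module on a UNIX-like host, not the EROS Recv operation; the function name `peek_then_read` is hypothetical, and it assumes a short payload that arrives in one piece.

```python
import socket


def peek_then_read(payload):
    sender, receiver = socket.socketpair()
    try:
        sender.sendall(payload)
        # MSG_PEEK returns data from the head of the queue without removing it.
        peeked = receiver.recv(len(payload), socket.MSG_PEEK)
        # An ordinary receive afterwards sees the same bytes and consumes them.
        consumed = receiver.recv(len(payload))
        return peeked, consumed
    finally:
        sender.close()
        receiver.close()
```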

RecvFrom (OC = 8)

GetSockOpt (OC = 16), SetSockOpt (OC = 17)

Manipulate the options associated with a socket. Options may exist at multiple protocol levels; they are always present at the uppermost socket level.

Optname and any specified options are passed uninterpreted to the appropriate protocol module for interpretation. For setsockopt, the parameter should be non-zero to enable a boolean option, or zero if the option is to be disabled. SO_LINGER uses a struct linger parameter, defined in sys/socket.h, which specifies the desired state of the option and the linger interval (see above). SO_SNDTIMEO and SO_RCVTIMEO use a struct timeval parameter defined in sys/time.h.

    Request  H     Level at which the option resides. To manipulate options at the socket level, level is specified as SOL_SOCKET. To manipulate options at any other level, the protocol number of the appropriate protocol controlling the option is supplied. For example, to indicate that an option is to be interpreted by the TCP protocol, level should be set to the protocol number of TCP.
             W     Option name (see table above).
    Results  0     Operation completed successfully.
             9     The option is unknown at the level indicated (ENOPROTOOPT).

As per UNIX manual pages

Control (OC = 33)

Bind (OC = 48)

Gives a local address to the socket; that is, it "assigns a name to the socket". When a socket is created with socket, it exists in a name space (address family) but has no name assigned. Binding a name creates a socket in the system that must be deleted when it is no longer needed.

    Request  W*    The local address.
    Results  0     Bind completed successfully.
             8     The socket is already bound to an address (EINVAL).
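
The "already bound" result corresponds to EINVAL in the UNIX interface. The sketch below provokes it by binding the same socket twice, using Python's standard `socket` module; it assumes a Linux-like host, and the function name `second_bind_errno` is hypothetical.

```python
import errno
import socket


def second_bind_errno():
    """Bind a socket twice and return the errno raised by the second attempt."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind(("127.0.0.1", 0))      # first bind assigns the name
        try:
            sock.bind(("127.0.0.1", 0))  # second bind should be refused
        except OSError as exc:
            return exc.errno
    return None
```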

Accept (OC = 49)

Waits for connections after a listen. Accept extracts the first connection request on the queue of pending connections and creates a new socket with the same properties as the original socket. If no pending connections are present on the queue, and the socket is not marked as non-blocking, accept blocks the caller until a connection is present. If the socket is marked non-blocking and no pending connections are present, accept returns an error (EWOULDBLOCK). The accepted socket may not be used to accept more connections. The original socket remains open.

The argument addr is a result parameter that is filled in with the address of the connecting entity, as known to the communications layer. The exact format of the addr parameter is determined by the domain in which the communication is occurring. This call is used with connection-based socket types, currently with SOCK_STREAM.

    Reply    W*    Result parameter that is filled in with the address of the connecting entity, as known to the communications layer. The exact format of the addr parameter is determined by the domain in which the communication is occurring.
    Results  0     Completed successfully.
             6     The referenced socket is not of type SOCK_STREAM (EOPNOTSUPP).
             7     The socket is marked non-blocking and no connections are present to be accepted (EWOULDBLOCK).

Listen (OC = 50)

Listens for connections on a socket. To accept connections, a socket is first created, a willingness to accept incoming connections and a queue limit for incoming connections are specified with listen, and then the connections are accepted with accept. The parameter defines the maximum length to which the queue of pending connections may grow. If a connection request arrives with the queue full, the client may receive an error with an indication of ECONNREFUSED, or, if the underlying protocol supports retransmission, the request may be ignored so that retries may succeed.

    Request  W     Queue limit.
    Results  0     Completed successfully.
             3     The socket is not of a type that supports the operation listen (EOPNOTSUPP).
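
The bind, listen, accept sequence, including the non-blocking case, can be walked through on the loopback interface. This sketch uses Python's standard `socket` module on a UNIX-like host in place of the EROS operations; the function name `accept_demo` is hypothetical.

```python
import socket


def accept_demo():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        listener.bind(("127.0.0.1", 0))  # port 0: let the system choose
        listener.listen(1)               # queue limit of one pending connection

        # Non-blocking accept with an empty queue reports EWOULDBLOCK.
        listener.setblocking(False)
        try:
            listener.accept()
            would_block = False
        except BlockingIOError:
            would_block = True

        # With a client queued, a blocking accept returns a new socket
        # together with the address of the connecting entity.
        listener.setblocking(True)
        client.connect(listener.getsockname())
        connection, peer_address = listener.accept()
        connection.close()
        return would_block, peer_address == client.getsockname()
    finally:
        client.close()
        listener.close()
```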

Shutdown (OC = 51)

Causes all or part of a full-duplex connection on the socket to be shut down.

    Request  B     How the socket should be shut down. If how is 0, further receives will be disallowed. If 1, further sends will be disallowed. If 2, further sends and receives will be disallowed.
    Results  0     Completed successfully.
             5     The specified socket is not connected (ENOTCONN).
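
A partial shutdown leaves the other direction usable. The sketch below shuts down only the sending side (how = 1) on one end of a connected pair and shows that the peer sees end-of-stream while replies still flow back. It uses Python's standard `socket` module on a UNIX-like host; the function name `half_close_demo` is hypothetical, and it assumes short payloads that arrive in one piece.

```python
import socket


def half_close_demo():
    first, second = socket.socketpair()
    try:
        first.sendall(b"last words")
        first.shutdown(socket.SHUT_WR)  # how = 1: no further sends from `first`
        data = second.recv(64)          # the data sent before the shutdown
        end_of_stream = second.recv(64) # empty bytes signal end of stream
        second.sendall(b"reply")        # the reverse direction is still open
        reply = first.recv(64)
        return data, end_of_stream, reply
    finally:
        first.close()
        second.close()
```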


Copyright 1999 by Jonathan Shapiro. All rights reserved. For terms of redistribution, see the GNU General Public License