The Sockets Interface

From the perspective of the Linux kernel, a socket is an end point for communication. From the perspective of a Linux program, a socket is an open file with a corresponding descriptor.
Internet socket addresses are stored in 16-byte structures of type sockaddr_in.
/* IP socket address structure */
struct sockaddr_in {
    uint16_t sin_family;        /* Protocol family */
    uint16_t sin_port;          /* Port number in network byte order */
    struct in_addr sin_addr;    /* IP address in network byte order */
    unsigned char sin_zero[8];  /* Pad to sizeof(struct sockaddr) */
};
sin_addr is a 32-bit IP address. Both the IP address and the port number are always stored in network (big-endian) byte order.
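As a minimal sketch of what "network byte order" means in practice, the helper below fills a sockaddr_in by hand. The name make_ipv4_addr is hypothetical (it is not part of the sockets interface); htons and inet_pton are the standard conversion routines.

```c
#include <arpa/inet.h>   /* htons, inet_pton */
#include <assert.h>
#include <netinet/in.h>  /* struct sockaddr_in */
#include <stdint.h>
#include <string.h>      /* memset */
#include <sys/socket.h>  /* AF_INET */

/* Hypothetical helper: fill *sa with an IPv4 address and port.
 * Returns 0 on success, -1 if ip is not a valid dotted-decimal string. */
int make_ipv4_addr(struct sockaddr_in *sa, const char *ip, uint16_t port)
{
    assert(sa != NULL && ip != NULL);
    memset(sa, 0, sizeof(*sa));      /* also clears the sin_zero padding */
    sa->sin_family = AF_INET;
    sa->sin_port = htons(port);      /* host byte order -> network byte order */
    /* inet_pton writes the address in network byte order */
    if (inet_pton(AF_INET, ip, &sa->sin_addr) != 1)
        return -1;
    return 0;
}
```

For port 8080 (0x1F90), the two bytes of sin_port in memory are 0x1F followed by 0x90 on any host, which is exactly what big-endian order guarantees.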
The connect, bind, and accept functions require a pointer to a protocol-specific socket address structure.
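In the early days, the sockets functions were defined to expect a pointer to a generic sockaddr structure.
struct sockaddr {
    uint16_t sa_family;  /* Protocol family */
    char sa_data[14];    /* Address data */
};
Pointers to protocol-specific structures must be cast to struct sockaddr * when they are passed to these functions.
A newly designed interface would use a generic void * pointer instead, which did not exist in C when the sockets interface was created. The sockets functions still take struct sockaddr *, so the cast is still required.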
In outline, we set up a server with the following steps:
1. We invoke getaddrinfo(const char *host, const char *service, const struct addrinfo *hints, struct addrinfo **result) to obtain a list of addrinfo structures that represent the possible network addresses we can use. For a server, host is set to NULL and service is set to the port number.
2. We iterate through the addrinfo structures in the result list and try each one with the socket and bind functions. The goal is to create a socket and bind it to an address and port where it can listen for incoming connections. If the socket or bind call fails, we try the next addrinfo structure in the list until we find one that works or exhaust all options.
3. We invoke the listen function on the bound socket to turn it into a listening socket. This tells the operating system that we want this socket to be used to accept incoming connection requests.
4. We call accept in a loop to wait for and accept incoming connections. When a client connects, accept returns a new socket that is specifically linked to that client. We can then use this new socket to communicate with the client.
The socket Function
Clients and servers use the socket function to create a socket descriptor.
#include <sys/types.h>
#include <sys/socket.h>
int socket(int domain, int type, int protocol);
// For the protocol parameter you can pass 0, and the system chooses the default protocol for the given domain and type.
// Returns: nonnegative descriptor if OK, −1 on error
clientfd = socket(AF_INET, SOCK_STREAM, 0);
AF_INET indicates that we are using 32-bit IP addresses and SOCK_STREAM indicates that the socket will be an end point for a connection.
The best practice is to use the getaddrinfo function to generate these parameters automatically, so that the code is protocol-independent.
The connect Function
#include <sys/socket.h>
int connect(int clientfd, const struct sockaddr *addr, socklen_t addrlen);
// Returns: 0 if OK, −1 on error
The connect function attempts to establish an Internet connection with the server at the socket address addr, where addrlen is sizeof(struct sockaddr_in) for an IPv4 address. The connect function blocks until either the connection is successfully established or an error occurs. If successful, the clientfd descriptor is ready for reading and writing, and the resulting connection is characterized by the socket pair (x:y, addr.sin_addr:addr.sin_port), where x is the client's IP address and y is the ephemeral port that identifies the client process on the client host. As with socket, the best practice is to use getaddrinfo to supply the arguments to connect.
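To make the cast to struct sockaddr * and the socket pair concrete, here is a minimal IPv4-only sketch. The helper names connect_ipv4 and local_port are hypothetical, and hard-coding AF_INET is a simplification; the protocol-independent version uses getaddrinfo.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: connect to ip:port over IPv4/TCP.
 * Returns a connected descriptor, or -1 on error. */
int connect_ipv4(const char *ip, uint16_t port)
{
    struct sockaddr_in addr;
    int clientfd;

    assert(ip != NULL);
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1)
        return -1;

    if ((clientfd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
        return -1;
    /* The protocol-specific pointer is cast to the generic type */
    if (connect(clientfd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(clientfd);
        return -1;
    }
    return clientfd;
}

/* Hypothetical helper: the local port y of the socket pair,
 * in host byte order, or 0 on error. */
uint16_t local_port(int fd)
{
    struct sockaddr_in local;
    socklen_t len = sizeof(local);

    if (getsockname(fd, (struct sockaddr *)&local, &len) < 0)
        return 0;
    return ntohs(local.sin_port);
}
```

After a successful connect_ipv4 call, local_port reports the ephemeral port that the kernel assigned to the client end of the connection.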
The bind Function
#include <sys/socket.h>
int bind(int sockfd, const struct sockaddr *addr, socklen_t addrlen);
// Returns: 0 if OK, −1 on error
The bind function asks the kernel to associate the server’s socket address in addr with the socket descriptor sockfd.
The listen Function
The client is the active side that initiates connection requests. The server is the passive side that waits for connection requests from clients. By default, the kernel assumes that a descriptor created by the socket function is the client side of a connection. A server calls the listen function to tell the kernel that the descriptor is on the server side.
#include <sys/socket.h>
int listen(int sockfd, int backlog);
The backlog argument is a hint about the number of outstanding connection requests that the kernel should queue up before it starts to refuse requests.
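The bind and listen calls can be sketched together as follows. This is an illustration under stated assumptions, not the helper used later in these notes: the name listen_on_loopback is hypothetical, the address is fixed to 127.0.0.1, the backlog of 16 is arbitrary, and port 0 asks the kernel to choose a free port, which we then read back with getsockname.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a listening socket on 127.0.0.1 with a
 * kernel-chosen port. Stores the port (host byte order) in *port_out.
 * Returns the listening descriptor, or -1 on error. */
int listen_on_loopback(uint16_t *port_out)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int listenfd;

    assert(port_out != NULL);
    if ((listenfd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);  /* 127.0.0.1 */
    addr.sin_port = htons(0);                       /* 0: kernel picks a port */

    if (bind(listenfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(listenfd, 16) < 0 ||                 /* backlog of 16 is arbitrary */
        getsockname(listenfd, (struct sockaddr *)&addr, &len) < 0) {
        close(listenfd);
        return -1;
    }
    *port_out = ntohs(addr.sin_port);
    return listenfd;
}
```

Until listen is called, the descriptor is still an active (client-side) socket as far as the kernel is concerned; the call is what makes it eligible for accept.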
The accept Function
Servers wait for connection requests from clients by calling the accept function.
#include <sys/socket.h>
int accept(int listenfd, struct sockaddr *addr, socklen_t *addrlen);
// Returns: nonnegative connected descriptor if OK, −1 on error
The accept function waits for a connection request from a client to arrive on the listening descriptor listenfd, then fills in the client’s socket address in addr, and returns a connected descriptor that can be used to communicate with the client using Unix I/O functions.
The listening descriptor serves as an end point for client connection requests. It is typically created once and exists for the lifetime of the server.
The connected descriptor is the end point of the connection that is established between the client and the server. It is created each time the server accepts a connection request and exists only as long as it takes the server to service a client.


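Because every accepted connection gets its own connected descriptor, a single port on a host can carry multiple connections at the same time. Each connection is identified by its 5-tuple: protocol, source IP address, source port, destination IP address, and destination port.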
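The difference between the listening descriptor and the connected descriptors can be checked with a small loopback experiment. The sketch below is illustrative only: both function names are hypothetical, the address is fixed to 127.0.0.1, and it relies on the kernel completing each handshake and queuing the connection in the backlog before accept is called.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: listening socket on 127.0.0.1, kernel-chosen port.
 * On success *addr holds the bound address. Returns the descriptor or -1. */
static int loopback_listener(struct sockaddr_in *addr)
{
    socklen_t len = sizeof(*addr);
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    assert(addr != NULL);
    if (fd < 0)
        return -1;
    memset(addr, 0, sizeof(*addr));
    addr->sin_family = AF_INET;
    addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr->sin_port = htons(0);
    if (bind(fd, (struct sockaddr *)addr, sizeof(*addr)) < 0 ||
        listen(fd, 8) < 0 ||
        getsockname(fd, (struct sockaddr *)addr, &len) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Connects two clients to one listening port and accepts both.
 * Returns 1 if each connection got its own connected descriptor and the
 * two clients differ in source port, 0 if not, -1 on error. */
int two_connections_one_port(void)
{
    struct sockaddr_in srv, peer[2];
    int cli[2] = {-1, -1}, conn[2] = {-1, -1};
    int listenfd, i, result = -1;

    if ((listenfd = loopback_listener(&srv)) < 0)
        return -1;

    for (i = 0; i < 2; i++) {   /* both clients target the same server port */
        cli[i] = socket(AF_INET, SOCK_STREAM, 0);
        if (cli[i] < 0 ||
            connect(cli[i], (struct sockaddr *)&srv, sizeof(srv)) < 0)
            goto done;
    }
    for (i = 0; i < 2; i++) {   /* one connected descriptor per client */
        socklen_t len = sizeof(peer[i]);
        conn[i] = accept(listenfd, (struct sockaddr *)&peer[i], &len);
        if (conn[i] < 0)
            goto done;
    }
    result = conn[0] != conn[1] && conn[0] != listenfd &&
             conn[1] != listenfd && peer[0].sin_port != peer[1].sin_port;
done:
    for (i = 0; i < 2; i++) {
        if (cli[i] >= 0) close(cli[i]);
        if (conn[i] >= 0) close(conn[i]);
    }
    close(listenfd);
    return result;
}
```

Both connections share the same destination address and port; only the source port in the address that accept fills in tells them apart.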
Host and Service Conversion
getaddrinfo and getnameinfo convert back and forth between binary socket address structures and the string representations of hostnames, host addresses, service names, and port numbers.
The getaddrinfo Function
The getaddrinfo function converts string representations of hostnames, host addresses, service names, and port numbers into socket address structures.
It is reentrant and works with any protocol.
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
int getaddrinfo(const char *host, const char *service,
const struct addrinfo *hints,
struct addrinfo **result);
// Returns: 0 if OK, nonzero error code on error
void freeaddrinfo(struct addrinfo *result);
// Returns: nothing
const char *gai_strerror(int errcode);
// Returns: error message
struct addrinfo {
    int ai_flags;              /* Hints argument flags */
    int ai_family;             /* First arg to socket function */
    int ai_socktype;           /* Second arg to socket function */
    int ai_protocol;           /* Third arg to socket function */
    char *ai_canonname;        /* Canonical hostname */
    socklen_t ai_addrlen;      /* Size of ai_addr struct */
    struct sockaddr *ai_addr;  /* Ptr to socket address structure */
    struct addrinfo *ai_next;  /* Ptr to next item in linked list */
};
Given host and service (the two components of a socket address), getaddrinfo returns a result that points to a linked list of addrinfo structures, each of which points to a socket address structure that corresponds to host and service.
After a client calls getaddrinfo, it walks this list, trying each socket address in turn until the calls to socket and connect succeed and the connection is established.
Similarly, a server tries each socket address on the list until the calls to socket and bind succeed and the descriptor is bound to a valid socket address.
To avoid memory leaks, the application must eventually free the list by calling freeaddrinfo. If getaddrinfo returns a nonzero error code, the application can call gai_strerror to convert the code to a message string.
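The call, walk, and free pattern can be sketched without any hints at all. The helper name first_ipv4_addr is hypothetical, and returning EAI_NONAME when the list contains no IPv4 entry is a choice made for this illustration.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: copy the first IPv4 socket address for
 * host:service into *out. Returns 0 on success, otherwise a nonzero
 * getaddrinfo error code that gai_strerror can decode. */
int first_ipv4_addr(const char *host, const char *service,
                    struct sockaddr_in *out)
{
    struct addrinfo *listp, *p;
    int rc;

    assert(out != NULL);
    if ((rc = getaddrinfo(host, service, NULL, &listp)) != 0)
        return rc;                  /* no list was allocated on failure */

    rc = EAI_NONAME;                /* assumption: report "no IPv4 entry" this way */
    for (p = listp; p; p = p->ai_next) {
        if (p->ai_family == AF_INET && p->ai_addrlen == sizeof(*out)) {
            memcpy(out, p->ai_addr, sizeof(*out));
            rc = 0;
            break;
        }
    }
    freeaddrinfo(listp);            /* free the list on every path that got one */
    return rc;
}
```

A caller would typically report a failure with something like fprintf(stderr, "%s\n", gai_strerror(rc)).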
The optional hints argument is an addrinfo structure that provides finer control over the list of socket addresses that getaddrinfo returns. When passed as a hints argument, only the ai_family, ai_socktype, ai_protocol, and ai_flags fields can be set. The other fields must be set to zero (or NULL). We use memset to zero the entire structure and then set a few selected fields:
- Setting ai_family to AF_INET restricts the list to IPv4 addresses. Setting it to AF_INET6 restricts the list to IPv6 addresses.
- Setting ai_socktype to SOCK_STREAM restricts the list to at most one addrinfo structure for each unique address, one whose socket address can be used as the end point of a connection. SOCK_STREAM is the reliable, connection-oriented socket type used by TCP. UDP uses SOCK_DGRAM.
- The ai_flags field is a bit mask that further modifies the default behavior. You create it by ORing combinations of various values:
  - AI_ADDRCONFIG. This flag is recommended if you are using connections. It asks getaddrinfo to return IPv4 addresses only if the local host is configured for IPv4, and likewise for IPv6.
  - AI_CANONNAME. By default, the ai_canonname field is NULL. If this flag is set, it instructs getaddrinfo to point the ai_canonname field in the first addrinfo structure in the list to the canonical (official) name of host.
  - AI_NUMERICSERV. By default, the service argument can be a service name or a port number. This flag forces the service argument to be a port number.
  - AI_PASSIVE. By default, getaddrinfo returns socket addresses that can be used by clients as active sockets in calls to connect. This flag instructs it to return socket addresses that can be used by servers as listening sockets. In this case, the host argument should be NULL. The address field in the resulting socket address structure(s) will be the wildcard address, which tells the kernel that this server will accept requests to any of the IP addresses for this host. This is the desired behavior for all of our example servers.
When getaddrinfo creates an addrinfo structure in the output list, it fills in each field except for ai_flags.
One of the elegant aspects of getaddrinfo is that the fields in an addrinfo structure are opaque, in the sense that they can be passed directly to the functions in the sockets interface without any further manipulation by the application code.
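The effect of the hints can be observed directly. The sketch below zeroes a hints structure, asks for passive IPv4 stream addresses with a numeric port, and copies out the resulting socket address. The helper name passive_ipv4_addr is hypothetical, and AI_ADDRCONFIG is left out on purpose so that the result does not depend on how the local host's interfaces are configured.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: ask getaddrinfo for a passive (server-side) IPv4
 * TCP address on the given numeric port and copy it into *out.
 * Returns 0 on success, otherwise a nonzero getaddrinfo error code. */
int passive_ipv4_addr(const char *port, struct sockaddr_in *out)
{
    struct addrinfo hints, *listp;
    int rc;

    assert(out != NULL);
    memset(&hints, 0, sizeof(hints));       /* unused fields must be zero */
    hints.ai_family = AF_INET;              /* IPv4 only */
    hints.ai_socktype = SOCK_STREAM;        /* connections only */
    hints.ai_flags = AI_PASSIVE             /* wildcard address for servers */
                   | AI_NUMERICSERV;        /* port must be a number */

    if ((rc = getaddrinfo(NULL, port, &hints, &listp)) != 0)
        return rc;
    memcpy(out, listp->ai_addr, sizeof(*out));
    freeaddrinfo(listp);
    return 0;
}
```

With these hints the returned address is the wildcard address (INADDR_ANY), and a service name such as "http" is rejected because AI_NUMERICSERV requires a port number.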
The getnameinfo Function
The getnameinfo function is the inverse of getaddrinfo. It is reentrant and protocol-independent.
#include <sys/socket.h>
#include <netdb.h>
int getnameinfo(const struct sockaddr *sa, socklen_t salen,
                char *host, socklen_t hostlen,
                char *service, socklen_t servlen, int flags);
// Returns: 0 if OK, nonzero error code on error
The sa argument points to a socket address structure of size salen bytes, host to a buffer of size hostlen bytes, and service to a buffer of size servlen bytes.
If getnameinfo returns a nonzero error code, the application can convert it to a string by calling gai_strerror.
The flags argument is a bit mask that modifies the default behavior.
- NI_NUMERICHOST. By default, getnameinfo tries to return a domain name in host. Setting this flag will cause it to return a numeric address string instead.
- NI_NUMERICSERV. By default, getnameinfo will look in /etc/services and, if possible, return a service name instead of a port number. Setting this flag forces it to skip the lookup and simply return the port number.
Helper Functions for the Sockets Interface
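#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Backlog passed to listen. The value 1024 is an assumption; the source does not define it. */
#define LISTENQ 1024

/* Open a connection to the server at <hostname, port>.
 * Returns a connected descriptor, or -1 on error. */
int open_clientfd(char *hostname, char *port) {
    int clientfd;
    struct addrinfo hints, *listp, *p;

    /* Get a list of potential server addresses */
    memset(&hints, 0, sizeof(struct addrinfo));
    hints.ai_socktype = SOCK_STREAM;  /* Open a connection */
    hints.ai_flags = AI_NUMERICSERV;  /* port is a numeric string */
    hints.ai_flags |= AI_ADDRCONFIG;  /* Recommended for connections */
    if (getaddrinfo(hostname, port, &hints, &listp) != 0)
        return -1;

    /* Walk the list until one address connects successfully */
    for (p = listp; p; p = p->ai_next) {
        if ((clientfd = socket(p->ai_family, p->ai_socktype, p->ai_protocol)) < 0)
            continue;         /* socket failed, try the next address */
        if (connect(clientfd, p->ai_addr, p->ai_addrlen) != -1)
            break;            /* success */
        close(clientfd);      /* connect failed, try the next address */
    }

    freeaddrinfo(listp);
    if (!p)                   /* every address failed */
        return -1;
    else
        return clientfd;
}

/* Open a listening descriptor on the given port.
 * Returns the listening descriptor, or -1 on error. */
int open_listenfd(char *port) {
    struct addrinfo hints, *listp, *p;
    int listenfd, optval = 1;

    /* Get a list of potential server addresses */
    memset(&hints, 0, sizeof(struct addrinfo));
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_PASSIVE | AI_ADDRCONFIG;  /* on any IP address */
    hints.ai_flags |= AI_NUMERICSERV;             /* using a port number */
    if (getaddrinfo(NULL, port, &hints, &listp) != 0)
        return -1;

    /* Walk the list until one address binds successfully */
    for (p = listp; p; p = p->ai_next) {
        if ((listenfd = socket(p->ai_family, p->ai_socktype, p->ai_protocol)) < 0)
            continue;         /* socket failed, try the next address */
        /* Eliminates "Address already in use" error from bind */
        setsockopt(listenfd, SOL_SOCKET, SO_REUSEADDR, (const void *)&optval, sizeof(int));
        if (bind(listenfd, p->ai_addr, p->ai_addrlen) == 0)
            break;            /* success */
        close(listenfd);      /* bind failed, try the next address */
    }

    freeaddrinfo(listp);
    if (!p)                   /* no address worked */
        return -1;

    /* Make it a listening socket ready to accept connection requests */
    if (listen(listenfd, LISTENQ) < 0) {
        close(listenfd);
        return -1;
    }
    return listenfd;
}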
It is good programming practice to explicitly close any descriptors that you have opened.