The Sockets Interface

From the perspective of the Linux kernel, a socket is an end point for communication. From the perspective of a Linux program, a socket is an open file with a corresponding descriptor.
Internet socket addresses are stored in 16-byte structures having the type sockaddr_in.
/* IP socket address structure */
struct sockaddr_in {
uint16_t sin_family; /* Protocol family*/
uint16_t sin_port; /* Port number in network byte order */
struct in_addr sin_addr; /* IP address in network byte order */
unsigned char sin_zero[8]; /* Pad to sizeof(struct sockaddr) */
};sin_addr is a 32-bit address. The IP address and port number are always stored in network byte order.
The connect, bind, and accept functions require a pointer to a protocol-specific socket address structure.
In old days, the sockets functions expect a pointer to a generic sockaddr structure.
struct sockaddr {
uint16_t sa_family; /* Protocol family */
char sa_data[14]; /* Address data */
};Pointers pointing to protocol-specific structures should be cast to sockaddr.
Now, we use a generic void * pointer.
Literally we setup a server through the following procedures:
We invoke
getaddrinfo(const char *host, const char *service, const struct addrinfo *hints, struct addrinfo **result)to obtain a list ofaddrinfostructures that represent the possible network addresses we can use.For a server, the
hostis set toNULLand theserviceis set to the port number.We iterate through each of the
addrinfostructures in the result list and try to use them with thesocketandbindfunctions. The goal is to create a socket and bind it to an address and port where it can listen for incoming connections. If thesocketorbindcall fails, we try the nextaddrinfostructure in the list until we find one that works or exhaust all options.We invoke the
listenfunction on the bound socket to turn it into a listening socket. This tells the operating system that we want this socket to be used to accept incoming connection requests.We call
acceptin a loop to wait for and accept incoming connections. When a client connects,acceptreturns a new socket that's specifically linked to that client. We can then use this new socket to communicate with the client.
The socket Function
socket FunctionClients and servers use the socket function to create a socket descriptor.
#include <sys/types.h>
#include <sys/socket.h>
int socket(int domain, int type, int protocol);
// for the parameter protocol you can just use 0 and the system will choose the correct protocol based on the type.
// Returns: nonnegative descriptor if OK, −1 on error
clientfd = socket(AF_INET, SOCK_STREAM, 0);AF_INET indicates that we are using 32-bit IP addresses and SOCK_ STREAM indicates that the socket will be an end point for a connection.
The best practice is to use the getaddrinfo function to generate these parameters automatically, so that the code is protocol-independent.
The connect Function
connect Function#include <sys/socket.h>
int connect(int clientfd, const struct sockaddr *addr, socklen_t addrlen);
// Returns: 0 if OK, −1 on errorThe connect function attempts to establish an Internet connection with the server at the socket address addr, where addrlen is sizeof(sockaddr_in). The connect function blocks until either the connection is successfully established or an error occurs. If successful, the clientfd descriptor is now ready for reading and writing, and the resulting connection is characterized by the socket pair (x:y, addr.sin_addr:addr.sin_port). As with socket, the best practice is to use getaddrinfo to supply the arguments to connect.
The bind Function
bind Function#include <sys/socket.h>
int bind(int sockfd, const struct sockaddr *addr, socklen_t addrlen);
// Returns: 0 if OK, −1 on errorThe bind function asks the kernel to associate the server’s socket address in addr with the socket descriptor sockfd.
The listen Function
listen FunctionThe client is the active side that initiates connection requests. The server side is the passive entities that wait for connection requests from clients. By default, the kernel assumes that a descriptor created by the socket function is the client side of a connection. A server calls the listen function to tell the kernel that the descriptor is on the server side.
#include <sys/socket.h>
int listen(int sockfd, int backlog);The backlog argument is a hint about the number of outstanding connection requests that the kernel should queue up before it starts to refuse requests.
The accept Function
accept FunctionServers wait for connection requests from clients by calling the accept function.
#include <sys/socket.h>
int accept(int listenfd, struct sockaddr *addr, int *addrlen);
// Returns: nonnegative connected descriptor if OK, −1 on errorThe accept function waits for a connection request from a client to arrive on the listening descriptor listenfd, then fills in the client’s socket address in addr, and returns a connected descriptor that can be used to communicate with the client using Unix I/O functions.
The listening descriptor serves as an end point for client connection requests. It is typically created once and exists for the lifetime of the server.
The connected descriptor is the end point of the connection that is established between the client and the server. It is created each time the server accepts a connection request and exists only as long as it takes the server to service a client.


This implies that a port in a host can have multiple connections. A connection is identified by the 5-tuple.
Host and Service Conversion
getaddrinfo and getnameinfo converts back and forth between binary socket address strcutures and the string representations of hostnames, host addresses, service names and port numbers.
The getaddrinfo Function
The getaddrinfo function converts string representations of hostnames, host addresses, service names, and port numbers into socket address structures.
It is reentrant and works with any protocol.
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
int getaddrinfo(const char *host, const char *service,
const struct addrinfo *hints,
struct addrinfo **result);
// Returns: 0 if OK, nonzero error code on error
void freeaddrinfo(struct addrinfo *result);
// Returns: nothing
const char *gai_strerror(int errcode);
// Returns: error message
struct addrinfo {
int ai_flags; /* Hints argument flags */
int ai_family; /* First arg to socket function */
int ai_socktype; /* Second arg to socket function */
int ai_protocol; /* Third arg to socket function */
char *ai_canoname; /* Canonical hostname */
size_t ai_addrlen; /* Size of ai_addr struct */
struct sockaddr *ai_addr; /* Ptr to socket address structure */
struct addrinfo *ai_next; /* Ptr to next item in linked list */
};Given host and service (the two components of a socket address), getaddrinfo returns a result that points to a linked list of addrinfo structures, each of which points to a socket address structure that corresponds to host and service.
After a client calls getaddrinfo, it walks this list, trying each socket address in turn until the calls to socket and connect succeed and the connection is established.
Similarly, a server tries each socket address on the list until the calls to socket and bind succeed and the descriptor is bound to a valid socket address.
To avoid memory leaks, the application must eventually free the list by calling freeaddrinfo. If getaddrinfo returns a nonzero error code, the application can call gai_strerror to convert the code to a message string.
The optional hints argumet is an addrinfo structure that provides finer control over the list of socket addresses that getaddrinfo returns. When passes as a hints argument, only the ai_family, ai_socktype, ai_protocol, and ai_flags fields can be set. The other fields must be set to zero (or NULL). We use memset to zero the entire structure and set a few selected fields:
Setting
ai_familytoAF_INETrestricts the list to IPv4 addresses. Setting it toAF_INET6restricts the list to IPv6 addresses.Setting
ai_socktypetoSOCK_STREAMrestricts the list to at most oneaddrinfostructure for each unique address, one whose socket address can be used as the end point of a connection.SOCK_STREAMis a type of reliable and connection-oriented socket adopted by TCP. UDP usesSOCK_DGRAM.The
ai_flagsfield is a bit mask that further modifies the default behavior. You create it by oring combinations of various values.AI_ADDRCONFIG. This flag is recommended if you are using connections. It asks getaddrinfo to return IPv4 addresses only if the local host is configured for IPv4. Similarly for IPv6.AI_CANONNAME. By default, the ai_canonname field is NULL. If this flag is set, it instructsgetaddrinfoto point the ai_canonname field in the first addrinfo structure in the list to the canonical (official) name of host.AI_NUMERICSERV. By default, the service argument can be a service name or a port number. This flag forces the service argument to be a port number.AI_PASSIVE. Bydefault,getaddrinforeturns socket addresses that can be used by clients as active sockets in calls to connect. This flag instructs it to return socket addresses that can be used by servers as listening sockets. In this case, the host argument should be NULL. The address field in the resulting socket address structure(s) will be the wildcard address, which tells the kernel that this server will accept requests to any of the IP addresses for this host. This is the desired behavior for all of our example servers.
When getaddrinfo creates an addrinfo structure in the output list, it fills in each field except for ai_flags.
One of the elegant aspects of getaddrinfo is that the fields in an addrinfo structure are opaque, in the sense that they can be passed directly to the functions in the sockets interface without any further manipulation by the application code.
The getnameinfo Function
The getnameinfo function is inverse of getaddrinfo. It is reentrant and protocol-independent.
#include <sys/socket.h>
#include <netdb.h>
int getnameinfo(const struct sockaddr *sa, socklen_t salen,
char *host, size_t hostlen,
char *service, size_t servlen, int flags);
// Returns: 0 if OK, nonzero error code on errorThe sa argument points to a socket address structure of size salen bytes, host to a buffer of size hostlen bytes and service to a buffer of size servlen bytes.
If getnameinfo returns a nonzero error code, the application can convert it to a string by calling gai_strerror.
The flags argument is a bit mask that modifies the default behavior.
NI_NUMERICHOST. By default,getnameinfotries to return a domain name in host. Setting this flag will cause it to return a numeric address string instead.NI_NUMERICSERV. By default,getnameinfowill look in /etc/services and if possible, return a service name instead of a port number. Setting this flag forces it to skip the lookup and simply return the port number.
Helper Functions for the Sockets Interface
int open_clientfd(char *hostname, char *port) {
int clientfd;
struct addrinfo hints, *listp, *p;
memset(&hints, 0, sizeof(struct addrinfo));
hints.ai_socktype = SOCK_STREAM;
hints.ai_flags = AI_NUMERICSERV;
hints.ai_flags |= AI_ADDRCONFIG;
getaddrinfo(hostname, port, &hints, &lisp);
for(p = listp; p; p = p->ai_next) {
if ((clientfd = socket(p->ai_family, p->ai_socktype, p->ai_protocol)) < 0)
continue;
if (connect(clientfd, p->ai_addr, p->ai_addrlen) != -1)
break;
close(clientfd);
}
freeaddrinfo(listp);
if(!p)
return -1;
else
return clientfd;
}int open_listenfd(char *port) {
struct addrinfo hints, *listp, *p;
int listenfd, optval=1;
memset(&hints, 0, sizeof(struct addrinfo));
hints.ai_socktype = SOCK_STREAM;
hints.ai_flags = AI_PASSIVE | AI_ADDRCONFIG; // on any IP addresses
hint.ai_flags = AI_NUMERICSERV; // using port number
getaddrinfo(NULL, port, &hints, &listp);
for(p = listp; p; p = p->ai_next) {
if((listenfd = socket(p->ai_family, p->ai_socktype, p->ai_protocol)) < 0)
continue;
// Eliminates "Address already in use" error from bind
setsockopt(listenfd, SOL_SOCKET, SO_REUSEADDR, (const void *)&optval, sizeof(int));
if (bind(listenfd, p->addr, p->ai_addrlen) == 0)
break;
close(listenfd);
}
freeaddrinfo(listp);
if(!p)
return -1;
if(listen(listenfd, LISTRENQ) < 0) {
close(listenfd);
return -1;
}
return listenfd;
}It is good programming practice to explicitly close any descriptors that you have opened.
Last updated