Copyright © 2007 Jeremy Kerr
v2007.09 Sept 2007, released under GPL.
Abstract
This document is provided to assist in deploying the feedbackd system on a Linux Virtual Server (LVS) based server cluster. It is assumed that the LVS system is working, and you have a basic understanding of Linux and compiling software packages.
The LVS HOWTO gives a comprehensive guide to setting up an LVS cluster.
The feedbackd system is aimed at improving the performance of a server cluster, by evenly distributing a request load amongst the available servers. It does this by reporting the servers' load to the director, where it is used to allocate subsequent requests to the least busy servers.
To do this, feedbackd is split into two processes - the master, which is run on the director, and the agents, which are run on the servers. The agents and master communicate using the Network Element Control Protocol (NECP).
The following terms are used in this document:
cluster
The set of nodes that provide the entire load-balanced network service. This includes director(s) and realservers.
director
The host performing the load balancing, using the LVS software. Requests (from the clients) are first received by a director and forwarded to a realserver to be processed.
realserver
The host providing the network service.
health
A metric of a realserver's ability to provide a network service. There are a number of possible measurements for this (for example, CPU load). The most suitable measure will depend on the particular network service that the realserver provides.
weight
The director must allocate each incoming request to a particular realserver. The weight of each realserver determines the relative probability that it will be chosen to process the request.
master
The feedbackd process that runs on the director. The master's role is to request health information from the agents and update the weight for each realserver accordingly.
agent
The feedbackd process that runs on the realservers. Each agent connects to the master and reports realserver health when polled.
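As an illustration of how weights translate into request shares, the following sketch computes each realserver's share of new requests from a set of weights. It assumes a weighted scheduler (such as wrr) that allocates requests in proportion to weight; the function name weight_shares and the example weights are made up for this illustration.

```shell
#!/bin/sh
# Print the share of new requests implied by each weight, assuming the
# scheduler allocates requests in proportion to weight.
# weight_shares is a hypothetical helper, not part of feedbackd.
weight_shares() {
    total=0
    for w in "$@"; do
        total=$((total + w))
    done
    # With every weight at zero, no realserver receives new requests.
    [ "$total" -gt 0 ] || return 1
    for w in "$@"; do
        echo "weight $w: $((100 * w / total))%"
    done
}

# Three realservers; the first is twice as healthy as the other two.
weight_shares 80 40 40
```

Only the ratio between weights matters here: weights of 80, 40 and 40 give the same split as 2, 1 and 1.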
When a command is to be entered to the shell, a prompt is given (for example director:$), specifying the host to enter the command on (either director or realserver). The host is followed by either a $ (indicating that the command is to be entered as a normal user) or a # (indicating that the command is to be entered by the root user).
This document was written against version 0.5 of the feedbackd package. However, most of the information it contains should also be relevant to earlier versions.
Feedbackd uses the Network Element Control Protocol (NECP) to communicate load data between master and agents. The NECP specification is available at http://www.circlemud.org/~jelson/writings/draft-cerpa-necp-03.txt. An overview of NECP is available at http://www.circlemud.org/~jelson/writings/necp-ietf/.
A typical NECP session proceeds like this:
The master starts and listens for connections on TCP port 3262.
An agent starts, connects to the master and performs initialisation.
The agent informs the master of the services that it can provide. These are defined in the agent's configuration file, typically /etc/feedbackd-agent.conf.
The master registers these services in the ipvs tables, initially with a "weight" of zero, so no requests will be forwarded to the realserver.
Periodically, the master will ask the agent for its current health status for each of its services. When the reply is received by the master, it updates the ipvs tables with the new data. The NECP specification calls a request for health data a 'KEEPALIVE request'. The reply (which contains the health data) is called a 'KEEPALIVE ACK'.
If no response is seen to a keepalive packet, the master resends it. After a predefined number of retries, the realserver is considered dead and quiesced.
When an agent is shut down, it tells the master, which quiesces the service (by setting the realserver's weight to zero again) - no new connections will be made to the realserver.
For any services in the process of quiescing, the master checks the number of active connections to that service on a regular basis. When this number of connections is zero, the service is removed from the ipvs tables.
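The ipvs table changes made during such a session can be pictured in terms of ipvsadm commands. The sketch below only prints the roughly equivalent commands; it does not run them. It assumes a TCP virtual service with NAT forwarding (the -m flag); the function name ipvs_equivalent and the addresses are hypothetical, and the master is not expected to call ipvsadm itself.

```shell
#!/bin/sh
# Print (not run) the ipvsadm command that roughly corresponds to each
# step of the session described above.
# Usage: ipvs_equivalent ACTION VIP:PORT RIP:PORT [WEIGHT]
ipvs_equivalent() {
    action=$1 vs=$2 rs=$3 weight=${4:-0}
    case "$action" in
        # New realserver: added with weight 0, so it receives no requests yet.
        register) echo "ipvsadm -a -t $vs -r $rs -m -w 0" ;;
        # KEEPALIVE ACK received: weight updated from the reported health.
        update)   echo "ipvsadm -e -t $vs -r $rs -m -w $weight" ;;
        # Agent stopped or timed out: weight back to 0, existing connections drain.
        quiesce)  echo "ipvsadm -e -t $vs -r $rs -m -w 0" ;;
        # No active connections remain: entry removed.
        remove)   echo "ipvsadm -d -t $vs -r $rs" ;;
        *)        echo "unknown action: $action" >&2; return 1 ;;
    esac
}

# Walk one hypothetical realserver through its whole lifecycle.
for step in register update quiesce remove; do
    ipvs_equivalent "$step" 192.168.0.1:80 10.0.0.2:80 57
done
```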
This procedure outlines a few important concepts:
The agents are responsible for finding the master process, so they must be configured with the address of the director node. The director has no advance knowledge of where its realservers will be, or of the forwarding type required to send requests to each realserver.
The realserver contacts the director with a list of the services it has available for the LVS.
These previous two points allow some nice functionality - no reconfiguration needs to be done to the director when new realservers are added to the cluster. New realservers can be added by plugging them into the network and running the agent. Similarly, to remove a realserver, the agent is halted - after all existing connections have been processed, the realserver can be removed.
To deploy feedbackd, you must have a working LVS-based cluster. Any forwarding type will work, as long as your network topology allows the realservers to establish a TCP session to port 3262 on the director.
The feedbackd master and agent are distributed as one tarball, which can be downloaded from the feedbackd website.
Building the master and agent is just a matter of the usual ./configure, make, make install:
$ tar zxvf feedbackd-0.5.tar.gz
$ cd feedbackd-0.5
$ ./configure --prefix=/usr --sysconfdir=/etc --localstatedir=/var
$ make
$ sudo make install
This will build (and install) the master and agent. If you only need to build one of the components, provide --disable-master or --disable-agent on the configure command line.
The configure script also accepts the following options:
--without-ipvs
Make a master binary that does not interface to the ipvs system. This is useful for testing on a system that does not have the ipvs kernel or libipvs installed.
--with-linux-dir=PATH
Where to find the Linux includes. Defaults to /lib/modules/`uname -r`/source.
If configure completes without errors, skip to the next section.
Otherwise, if you get the following error:
configure: error: cannot find net/ip_vs.h in your kernel source.
then the ipvs headers are not installed in the expected location. There are a few possible reasons:
The kernel has not been patched with the ipvs software. Either patch and recompile the kernel or, if you're testing feedbackd on a non-LVS kernel, configure feedbackd with --without-ipvs.
The kernel headers are not in /lib/modules/`uname -r`/source. Add a --with-linux-dir=PATH-TO-LINUX-DIR argument to configure (for example, --with-linux-dir=/usr/src/linux-2.6.23).
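Before re-running configure, it can help to check by hand whether a candidate directory contains the header. The sketch below assumes that configure looks for include/net/ip_vs.h beneath the Linux directory; that location is inferred from the error message above and has not been checked against the configure script. The function name has_ipvs_header is hypothetical.

```shell
#!/bin/sh
# Succeed if DIR looks like a kernel tree that ships the ipvs header.
# Assumption: the header lives at DIR/include/net/ip_vs.h.
has_ipvs_header() {
    [ -f "$1/include/net/ip_vs.h" ]
}

# Check the default location used when --with-linux-dir is not given.
dir="/lib/modules/$(uname -r)/source"
if has_ipvs_header "$dir"; then
    echo "found: $dir/include/net/ip_vs.h"
else
    echo "not found under $dir: try --with-linux-dir=PATH or --without-ipvs"
fi
```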
The make install command should install the binaries, sample configuration files, required libraries and man pages. If you have only enabled the master (or only the agent), then only the relevant files will be installed.
The feedbackd master and agent are both configured using a simple configuration file syntax. During installation, sample configuration files are installed, but you will need to edit these (at least the director IP address in the agent's configuration file) to start using feedbackd.
The master configuration file is installed in /etc/feedbackd-master.conf by default.
Here is the sample configuration file:
# configuration file for feedbackd-master
# logging configuration
# log file
logfile = /usr/local/var/log/feedbackd-master.log
# log level: one of CRIT, ERR, WARN, INFO, DEBUG or VDEBUG
loglevel = WARN
keepalive-interval = 2
keepalive-timeout = 10
keepalive-retries = 3
removal-interval = 30
# Example virtual service definition.
# name (a symbolic name, used in logging)
# address (local address to listen on)
# protocol (tcp or udp),
# port (numeric or a name in /etc/services)
# scheduler (eg, rr, wrr - see ipvsadm documentation)
[virtual-service]
name = http
address = 127.0.0.1
protocol = tcp
port = http
scheduler = wrr
At present, the logging options are the only ones you need to worry about - the scheduler settings are covered in the section called “Adjusting feedbackd parameters”.
Logging is configured by the logfile and loglevel directives. The loglevel directive defines how verbose the log should be - it can be any one of the following levels, in order of increasing verbosity:
CRIT
Critical messages
ERR
Errors
WARN
Warnings
INFO
Informational messages
DEBUG
Debugging messages
VDEBUG
Verbose debugging
When first deploying feedbackd, log at DEBUG or INFO. Once you have it working, change to WARN to keep the size of the log files down.
The agent configuration file is installed in /etc/feedbackd-agent.conf by default. You'll need to change (at least) the director IP address, and perhaps the service definitions.
Here is the sample configuration file:
# configuration file for feedbackd-agent
# Where to connect to the director. Either an IP address or hostname
director = $DIP
# How many connection attempts (to the master) to make before giving up.
# 0 = never give up (default)
connection-retries = 0
# How long to wait for the master
master-timeout = 10
# Path to the monitor plugins
moduledir = /usr/local/lib/feedbackd
# logging configuration
# log file
logfile = /usr/local/var/log/feedbackd-agent.log
# log level: one of CRIT, ERR, WARN, INFO, DEBUG or VDEBUG
loglevel = WARN
# Example service definition.
# The following directives are required:
# name (a symbolic name, used in logging)
# protocol (tcp or udp),
# port (numeric or a name in /etc/services)
# forwarding (DR, NAT or TUN)
# monitor, followed by any configuration for the monitor plugin
[service]
name = http
protocol = tcp
port = http
forwarding = NAT
monitor = cpuload.so
# Another service definition
[service]
name = mail
protocol = tcp
port = smtp
forwarding = NAT
# Use the exec.so plugin, which requires a command parameter
monitor = exec.so
command = /usr/local/bin/monitor-mailqueue.sh
Logging is configured in exactly the same fashion as the master - see the section called “Logging Configuration”.
The director directive gives the IP address or hostname of the director, where the master process is running. You will need to change this to suit your network.
The optional moduledir directive sets the path to find monitor plugin modules. This defaults to /usr/lib/feedback-agent (with the default libdir given to configure).
The agent must inform the master of which services are available on this host, and each service needs a definition in the config file. There may be multiple service definitions, each with a [service] section, with a symbolic name for the service given in the 'name' directive.
Each service definition contains the following data:
protocol
The IP sub-protocol that the service uses; either TCP or UDP.
port
The port that the service uses (numeric format, or a symbolic name from /etc/services).
forwarding
The method used by the director to forward packets to the host. One of NAT, TUN or DR. (See the LVS documentation for details on forwarding types.)
monitor
The monitor plugin to use to measure realserver health for this service. The plugin may require other directives to be present to configure it. See the section called “Monitor Plugins” for details on the available plugins.
The example service given in the sample configuration file is for an HTTP service (TCP port 80), with NAT forwarding. The cpuload plugin is used to measure realserver health - this plugin requires no extra configuration directives.
Both the master and agent accept the following common options:
-c FILE, --conffile=FILE
Source configuration data from FILE instead of the default.
-t, --test-config
Test configuration data, then exit. This is useful after the configuration file has been altered.
-d, --debug
Debug mode: run in the foreground, and log to stdout. The log level is automatically set to DEBUG (any log settings provided in the configuration file will be ignored).
-V, --version
Print version and exit
-h, --help, -u, --usage
Print usage information and exit
Additionally, the master supports the following option:
-n, --dry-run
Don't alter the ipvs tables (realservers are not added to ipvs tables, weights are not updated). The ipvs tables are still read for active connection count.
Now you can run the master process. If you have compiled with ipvs support (the default behaviour, without the --without-ipvs option to ./configure), you will need to be root to run it.
director:$ sudo feedbackd-master
feedbackd-master will parse the configuration and, if all is ok, run in the background. You can view the logs with:
director:$ sudo tail -f /var/log/feedbackd-master.log
To stop the process, issue a:
director:$ sudo killall feedbackd-master
The default behaviour on exit is to leave the ipvs tables as they are - the realservers will be weighted with the last value that was set.
Now you can run the agent process. You'll probably be able to do this as a non-root user, as long as your configuration allows it (specifically, the monitor configuration and logging):
realserver:$ feedbackd-agent
feedbackd-agent will parse the configuration and, if all is ok, run in the background. You can view the logs with:
realserver:$ tail -f /var/log/feedbackd-agent.log
After the agent has started, you should see a new realserver added to the ipvs tables on the director. To view these on the director:
director:$ sudo ipvsadm -L
Depending on the plugin in use, the weights will be updated on a regular basis.
To stop the agent, issue a:
realserver:$ killall feedbackd-agent
When an agent exits, it informs the master that it is shutting down by sending a NECP_STOP packet. The master will then quiesce the realserver in the ipvs tables, so no new connections will be made to that realserver. Be careful when shutting down agents on all nodes of a live cluster - you may end up with no active realservers in the ipvs tables.
Because the definition of "realserver health" changes between applications, feedbackd-agent uses dynamically-loaded plugins to measure the health of a service. Plugins can be added without requiring the agent to be rebuilt, and allow development of monitor plugins for custom applications. At present, the following plugins are available:
The constant module is the simplest plugin - it returns a (configurable) constant value for every measurement, defined by a value directive. For example:
monitor = constant.so
value = 32
The cpuload plugin reports the proportion of time that the CPU has spent idle. The higher this value, the 'healthier' the realserver is considered to be. This plugin requires no additional configuration directives:
monitor = cpuload.so
The exec plugin allows an external program to measure the server health. This program should generate a single line of output (on stdout), containing a single number indicating the health of this server (between 0 and 100, with 100 being the healthiest). This plugin requires a command directive:
monitor = exec.so
command = /usr/bin/measure-load.sh
The external program should complete quickly, so as not to hold up the agent process (which would lead the master to assume that the server is down).
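As a rough illustration, a script suitable for the command directive could look like the sketch below. It derives health from the 1-minute load average in /proc/loadavg (Linux-specific) and treats a load of 4.0 as fully saturated. The load-based formula, the saturation value and the function name load_to_health are all assumptions made for this example, not something feedbackd prescribes.

```shell
#!/bin/sh
# Hypothetical monitor script for the exec plugin: prints a single
# health value between 0 and 100 on stdout.

# Map a load average to a health value.
# $1 = current load (e.g. 0.75), $2 = load at which health reaches 0
# (must be greater than zero).
load_to_health() {
    awk -v cur="$1" -v sat="$2" 'BEGIN {
        h = 100 - (cur / sat) * 100
        if (h < 0) h = 0        # clamp to the range the agent expects
        if (h > 100) h = 100
        printf "%d\n", h
    }'
}

# The first field of /proc/loadavg is the 1-minute load average.
if [ -r /proc/loadavg ]; then
    read -r load1 _rest < /proc/loadavg
    load_to_health "$load1" 4.0
fi
```

The script does no network or disk-heavy work, so it returns almost immediately and should not delay the agent's reply to the master.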
The perl plugin allows a monitor plugin to be implemented in a Perl script. The script to use is defined in a file directive:
monitor = perl.so
file = /usr/lib/feedbackd-agent/monitor.pl
The Perl script must contain a get_health() sub, which returns a scalar between 0 and 100, representing the health of the service:
sub get_health() {
    # perform necessary health measurement
    my $health = 100;  # placeholder value; replace with a real measurement
    return $health;
}
The Perl script is loaded and interpreted on startup, and the function is called when necessary. Therefore, altering the Perl file while feedbackd-agent is running will not alter the monitor code - feedbackd-agent must be restarted to reload the Perl file.
Contributions for new plugins are welcome - either as a suggestion for development or a full implementation.
Because the feedbackd-master accepts connections from all remote hosts, it is possible for a malicious agent to connect to the master and add a realserver entry to the ipvs tables. This would allow a denial-of-service or man-in-the-middle attack on the services provided by the cluster. Therefore, it is important to block connections from unknown hosts to the director on tcp port 3262.
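One way to do this on the director is with iptables. The sketch below prints a candidate rule set for a list of known realserver addresses, so that it can be reviewed before being applied. The function name necp_firewall_rules and the addresses are hypothetical, and the rules assume that nothing earlier in the INPUT chain already accepts traffic to port 3262.

```shell
#!/bin/sh
# Print iptables rules that accept NECP connections (TCP port 3262) only
# from the listed realserver addresses and drop all others.
# The rules are printed, not executed, so they can be reviewed first.
necp_firewall_rules() {
    for rip in "$@"; do
        echo "iptables -A INPUT -p tcp -s $rip --dport 3262 -j ACCEPT"
    done
    # Anything not accepted above is dropped.
    echo "iptables -A INPUT -p tcp --dport 3262 -j DROP"
}

# Two hypothetical realservers.
necp_firewall_rules 10.0.0.2 10.0.0.3
```

Because the rules are appended in order, the ACCEPT rules for known realservers are evaluated before the final DROP rule.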
There are a number of parameters that allow optimisation of feedbackd behaviour to suit specific applications and network topologies.
The frequency of the NECP keepalive messages can be altered on the master. Lower values of the interval will give the director more up-to-date load data, but will increase network and processor load.
The keepalive interval can be changed in the keepalive-interval directive of the feedbackd-master configuration file. The interval is given in seconds.
The keepalive messages also inform the agent that it still has an active connection to the master. See the section called “Master Timeout”. If keepalive-interval is higher than master-timeout, then the agent will presume that the connection has been broken and will disconnect from the master and attempt a reconnection.
The default value for the keepalive interval is 2 seconds.
If a response is not received for a keepalive message within a predefined amount of time, then the keepalive message is considered lost and resent. After a predefined number of successive retries, the master assumes that the remote host is down, and will quiesce the realserver (so that future requests are not forwarded to the realserver).
These two predefined values are the "keepalive timeout" and "keepalive retries", respectively. These are found in the master configuration file, in the keepalive-timeout (value given in seconds) and keepalive-retries directives.
These values should be adjusted to suit the network connection between the director and realservers. If the realservers are only reachable through high-latency links, then the timeout value should be higher, to prevent active realservers from being inadvertently quiesced. A timeout that is too high, however, will mean that faulty realservers take longer to be quiesced (giving a larger proportion of errors in the load-balanced service).
The default value for the keepalive timeout is 10 seconds, and the default number of retries is 3.
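To get a feel for the trade-off, the following sketch estimates how long a dead realserver might keep receiving requests before it is quiesced. It assumes that the original keepalive and each retry wait for one full timeout, giving timeout × (retries + 1); the exact counting used by the master has not been verified, so treat the result as a rough upper bound. The function name detection_bound is hypothetical.

```shell
#!/bin/sh
# Rough upper bound, in seconds, on the time taken to quiesce a dead
# realserver. Assumption: the first keepalive plus each retry waits
# one full keepalive-timeout before the next step.
# $1 = keepalive-timeout (seconds), $2 = keepalive-retries
detection_bound() {
    timeout=$1 retries=$2
    echo $(( timeout * (retries + 1) ))
}

# With the default values given above (timeout 10, retries 3):
detection_bound 10 3
```

Under this assumption, lowering either value shortens the detection time at the cost of more false positives on slow links.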
(More specifically, Master Timeout as seen by the agent on the realserver)
The agent expects to receive regular keepalive messages from the master. If the agent receives no data from the master before the timeout, then the agent will immediately close the current connection to the master and reconnect. If you are logging at WARN or a more verbose level, then the following message will be logged:
Master timed out, attempting reconnection
If these messages occur frequently, you may have the master timeout set too low - the agent is timing out before the master has had a chance to send the keepalive packet. However, large values will result in a slow detection of broken connections.
The master timeout is set in the feedbackd-agent config file, in the master-timeout directive. The value is given in seconds.
If the master timeout is lower than (or close to) the keepalive interval, then the agent may time out unnecessarily. Make sure the master timeout value is not too low.
The default value of the master timeout is 30 seconds.
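The relationship between the two settings can be checked mechanically. The sketch below reads keepalive-interval from a master configuration file and master-timeout from an agent configuration file, and warns if the timeout does not exceed the interval. It assumes that both files have been copied to the same host and use the simple "name = value" syntax shown earlier; the function names conf_value and check_timeouts are hypothetical, and missing directives fall back to the defaults stated in this document.

```shell
#!/bin/sh
# Print the value of a "name = value" directive from a config file.
# $1 = file, $2 = directive name. Comment lines are ignored; if the
# directive appears more than once, the last value wins.
conf_value() {
    awk -F= -v key="$2" '
        /^[ \t]*#/ { next }
        {
            k = $1; v = $2
            gsub(/^[ \t]+|[ \t]+$/, "", k)
            gsub(/^[ \t]+|[ \t]+$/, "", v)
            if (k == key) val = v
        }
        END { if (val != "") print val }
    ' "$1"
}

# Compare the agent's master-timeout with the master's keepalive-interval.
# $1 = master config file, $2 = agent config file
check_timeouts() {
    interval=$(conf_value "$1" keepalive-interval)
    timeout=$(conf_value "$2" master-timeout)
    interval=${interval:-2}     # documented default
    timeout=${timeout:-30}      # documented default
    if [ "$timeout" -le "$interval" ]; then
        echo "WARNING: master-timeout ($timeout) <= keepalive-interval ($interval)"
        return 1
    fi
    echo "OK: master-timeout ($timeout) > keepalive-interval ($interval)"
}

# Example (paths are illustrative):
# check_timeouts ./feedbackd-master.conf ./feedbackd-agent.conf
```

A passing check only shows that the timeout is larger than the interval; values that are close together may still cause occasional unnecessary reconnections.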