Ubuntu: changing the MySQL data directory

https://stackoverflow.com/questions/17968287/how-to-find-the-mysql-data-directory-from-command-line-in-windows

https://www.digitalocean.com/community/tutorials/how-to-move-a-mysql-data-directory-to-a-new-location-on-ubuntu-16-04

https://askubuntu.com/questions/790685/cannot-set-a-different-database-directory-for-mysql-errcode-13-permission-d

rsync helper script

#!/bin/bash

# Start/stop/status/restart helper for the rsync daemon.
# date: 2012/2/13

pidfile="/var/run/rsyncd.pid"
start_rsync="rsync --daemon --config=/etc/rsyncd.conf"

# Print the matching daemon processes (empty output when not running).
# Checking on demand, rather than once at script start, keeps restart
# correct: the old script cached the status in a variable, so a restart
# saw stale state after the stop step.
rsync_status() {
    ps -ef | egrep "rsync --daemon.*rsyncd.conf" | grep -v 'grep'
}

rsyncstart() {
    if [ -z "$(rsync_status)" ]; then
        rm -f "$pidfile"
        ${start_rsync}
        if [ -n "$(rsync_status)" ]; then
            echo "rsync service start.......OK"
        fi
    else
        echo "rsync service is running !"
    fi
}

rsyncstop() {
    if [ -n "$(rsync_status)" ]; then
        kill -9 "$(cat "$pidfile")"
        sleep 1    # give the process a moment to exit before re-checking
        if [ -z "$(rsync_status)" ]; then
            echo "rsync service stop.......OK"
        fi
    else
        echo "rsync service is not running !"
    fi
}

rsyncstatus() {
    if [ -n "$(rsync_status)" ]; then
        echo "rsync service is running !"
    else
        echo "rsync service is not running !"
    fi
}

rsyncrestart() {
    if [ -z "$(rsync_status)" ]; then
        echo "rsync service is not running..."
        rsyncstart
    else
        rsyncstop
        rsyncstart
    fi
}

case "$1" in
    "start")
        rsyncstart
        ;;
    "stop")
        rsyncstop
        ;;
    "status")
        rsyncstatus
        ;;
    "restart")
        rsyncrestart
        ;;
    *)
        echo
        echo "Usage: $0 start|stop|restart|status"
        echo
        ;;
esac

Passwordless SSH login on Linux

Background:
With many Linux systems to maintain, passwordless SSH login is worth configuring. Both one-way and two-way setups are covered here; details below.

Environment:
master:
192.168.38.45
slave:
192.168.38.58
192.168.38.60
First configure the one-way case, i.e. passwordless SSH from the master to the slaves.

One-way setup:
1. On the master and on every slave, run as user yourname:

ssh-keygen -t dsa -P '' -f /home/yourname/.ssh/id_dsa

2. In /home/yourname/.ssh on the master, run:

cat id_dsa.pub > authorized_keys

3. Copy the master's authorized_keys into the same directory on every slave:

scp   /home/yourname/.ssh/authorized_keys yourname@192.168.38.58:/home/yourname/.ssh/
scp   /home/yourname/.ssh/authorized_keys yourname@192.168.38.60:/home/yourname/.ssh/

The master can now log in to the slaves over SSH without a password.

If you want passwordless login between all nodes in both directions, follow the steps below.

Two-way setup:
1. On the master and on every slave, run as user yourname:

ssh-keygen -t dsa -P '' -f /home/yourname/.ssh/id_dsa

2. In /home/yourname/.ssh on the master, run:

cat id_dsa.pub > authorized_keys

3. Copy the master's authorized_keys into the same directory on one slave (58):

scp   /home/yourname/.ssh/authorized_keys yourname@192.168.38.58:/home/yourname/.ssh/

4. On 58, append its own public key to authorized_keys:

cat id_dsa.pub >> authorized_keys

5. Copy 58's authorized_keys to 60, then append 60's own key on 60:

scp   /home/yourname/.ssh/authorized_keys yourname@192.168.38.60:/home/yourname/.ssh/
cat id_dsa.pub >> authorized_keys


6. authorized_keys now contains the id_dsa.pub of every machine, so scp it back to the remaining nodes:

scp   /home/yourname/.ssh/authorized_keys yourname@192.168.38.58:/home/yourname/.ssh/
scp   /home/yourname/.ssh/authorized_keys yourname@192.168.38.45:/home/yourname/.ssh/

Passwordless login now works in both directions.
PS: every node's id_dsa.pub must end up in authorized_keys. So what if a new node joins the cluster? Append its public key to authorized_keys in the same way, then scp the merged file out to all nodes.
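That redistribution step can be scripted. A minimal sketch, where NODES and the key path are assumptions to adjust for your cluster; the loop only echoes each scp command so you can review it first (drop the `echo` to actually run them):

```shell
# Hypothetical helper: after the new node's id_dsa.pub has been appended
# to authorized_keys, push the merged file back to every node.
NODES="192.168.38.45 192.168.38.58 192.168.38.60"
KEYS=/home/yourname/.ssh/authorized_keys

for node in $NODES; do
    # dry run: print the command; remove 'echo' to execute it
    echo scp "$KEYS" "yourname@${node}:/home/yourname/.ssh/"
done
```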

Be sure to verify the following:
1. AuthorizedKeysFile .ssh/authorized_keys is configured in /etc/ssh/sshd_config.
2. If it is not, configure it and restart sshd: /etc/rc.d/init.d/sshd restart
3. Permissions:
chmod 700 .ssh
chmod 600 authorized_keys

Redis persistence: RDB and AOF

Original article

Redis supports two persistence schemes: RDB and AOF.

RDB

RDB persistence is the Redis default. It produces a snapshot of the data at a point in time; the RDB file is a compressed binary file, and under RDB persistence the server keeps only a single RDB file (which keeps maintenance simple).

  • Every RDB save writes the complete in-memory dataset to the file; the persistence is not incremental (no dirty-data-only writes).
  • When writing the RDB file, Redis first writes the data to a temporary file and then swaps it in for the old RDB file.

1. Generating the RDB file:
RDB persistence can be triggered manually or run periodically according to the server configuration.
1) The save and bgsave commands (manual commands for generating an RDB file):

  • save: blocks the Redis server process until the RDB file has been created (the process handles no requests in the meantime);
  • bgsave: forks a child process to create the RDB file while the parent process keeps handling command requests;

2) Automatic execution:
The server can be configured, via the save option in the configuration file, to run bgsave automatically every so often. For example, the save options below:
save 900 1
save 300 10
save 60 10000

The server internally maintains a dirty counter and a lastsave attribute:

  • dirty: the number of database modifications (writes, deletes, updates, and so on) since save or bgsave last completed successfully;
  • lastsave: a Unix timestamp recording when save or bgsave last completed successfully;

Redis's periodic function serverCron runs every 100 milliseconds by default. One of its main jobs is to check whether any save condition from the configuration is satisfied and, if so, run bgsave. The check is based on the current system time and the values of dirty and lastsave.
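The condition check can be sketched as a small function. This is an illustrative model of the logic, not Redis source, with the three example save lines above hard-coded:

```shell
# Sketch of serverCron's save-condition check (illustrative, not Redis
# source). Mirrors: save 900 1 / save 300 10 / save 60 10000.
should_bgsave() {
    local dirty=$1 elapsed=$2   # elapsed = seconds since lastsave
    [ "$elapsed" -ge 900 ] && [ "$dirty" -ge 1 ]     && return 0
    [ "$elapsed" -ge 300 ] && [ "$dirty" -ge 10 ]    && return 0
    [ "$elapsed" -ge 60 ]  && [ "$dirty" -ge 10000 ] && return 0
    return 1
}

should_bgsave 15 400 && echo "trigger bgsave"   # 400 s elapsed, 15 changes
should_bgsave 5 400  || echo "no save yet"      # only 5 changes
```

A save is triggered as soon as any one of the configured (time, change-count) pairs is satisfied.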

Notes:

  • While bgsave is running, save and bgsave commands from clients are rejected (to prevent races);
  • If bgsave is running, a bgrewriteaof command is deferred until bgsave completes;
  • If bgrewriteaof is running, a bgsave command is likewise deferred until bgrewriteaof completes (both run in child processes, so there is no real conflict; this is mainly for performance);

2. Loading the RDB file:

  • Redis has no dedicated command for loading an RDB file; the file is loaded automatically when the server detects it at startup.
  • If AOF persistence is enabled, the server prefers the AOF file for restoring data.
  • While loading an RDB file the server stays blocked until loading finishes.
  • During loading, keys are checked automatically and expired keys are not loaded into the database.

3. Miscellaneous:
1) The rdbcompression option (default yes) controls whether the RDB file is stored compressed; with compression on, Redis compresses strings of 20 bytes or longer and stores shorter strings uncompressed;
2) Redis ships with an RDB file checking tool, redis-check-dump;
3) An RDB file can be inspected with od: [root@centOS1 dir]# od -oc dump.rdb
4) The RDB file name is set with dbfilename dump.rdb and its location with dir /var/lib/redis/ in the configuration file;

AOF

From the analysis above, RDB persistence is coarse-grained: when the server goes down, everything written since the last save or bgsave is lost. AOF persistence is much finer-grained: after a crash, only the operations that had not yet been appended to the AOF are lost.

1. How AOF works:

1) AOF persistence records the database state by saving the write commands the server executes; every command written to the AOF file is stored in the Redis command request protocol format (which is plain text). On startup, the server restores its state by loading the AOF file and executing the commands in it.
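As an illustration of that protocol format (a sketch, not Redis source code), the function below frames a command the way it would appear in the AOF:

```shell
# Render a command in the Redis request protocol (RESP):
# "*<argc>CRLF", then per argument "$<len>CRLF<arg>CRLF".
resp() {
    printf '*%d\r\n' "$#"
    for arg in "$@"; do
        printf '$%d\r\n%s\r\n' "${#arg}" "$arg"
    done
}

# The AOF entry for: SET msg hello
resp SET msg hello | od -c    # od -c makes the \r\n terminators visible
```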

2) The AOF file name is set with appendfilename appendonly.aof and its location with dir /var/lib/redis/ in the configuration file;

3) AOF steps:

  • Command append: after executing a write command, the server appends it, in protocol format, to the end of the aof_buf buffer;
  • File write: the Redis server process is an event loop; at the end of each loop iteration it decides, based on the appendfsync option in the configuration file, whether to write the contents of aof_buf to the AOF file;
  • File sync: flushing the buffered data from memory to disk (needed because of how the OS buffers writes);

Note:

To improve write performance, modern operating systems usually keep written data in an in-memory buffer and flush it to disk in one go when the buffer fills up or a timeout expires.

4) The appendfsync option:

  • always: write everything in aof_buf to the AOF file and sync it;
  • everysec: write everything in aof_buf to the AOF file; if the last AOF sync was more than 1 s ago, sync the file (handled by a dedicated thread);
  • no: write everything in aof_buf to the AOF file and leave it to the operating system to decide when to sync;

5) AOF efficiency and safety (controlled by appendfsync):
always writes and syncs on every event loop, so it is the safest and the slowest; everysec is fast enough and still bounds data loss to about one second; no is the fastest but the least safe.
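Putting the pieces together, a typical AOF section of redis.conf might look like this (a sketch of common settings, not a recommendation for every workload):

```
# redis.conf -- AOF persistence (illustrative values)
appendonly yes
appendfilename "appendonly.aof"
dir /var/lib/redis/
appendfsync everysec
```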

2. Loading the AOF file:
The AOF file records every write command issued against the database, so the server only needs to replay those commands to restore the state it had before shutting down. The steps:

  • create a pseudo-client with no network connection;
  • parse and read one write command from the AOF file;
  • execute the write command through the pseudo-client (repeating until the whole file has been processed);

3. AOF rewrite:
Because the AOF records every write command, the file keeps growing as the server runs. In practice a single key accumulates many write commands through repeated modification; the intermediate writes can be dropped and only the final key-value recorded, which shrinks the AOF file.

1) AOF rewrite:
To solve AOF file bloat, Redis provides AOF rewriting: it creates a new AOF file to replace the current one. Old and new files describe the same database state, but the new file contains none of the space-wasting redundant commands, so it is usually much smaller.

2) How it works:
AOF rewriting does not read, parse, or write the existing AOF file at all; it works from the server's current database state. For each key, Redis reads the key's current value and records the key-value pair with a single command, replacing the multiple commands that previously recorded it. That is the whole principle of AOF rewrite.

Note:

To avoid overflowing the client input buffer when the commands are later replayed, the rewrite process checks collections (sets, lists, and so on) with multiple elements: if the element count exceeds Redis's default of 64, several commands are emitted instead of one.
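To illustrate that splitting, the sketch below (illustrative names, not Redis source) emits one RPUSH per chunk of at most 64 elements for a 150-element list:

```shell
# Emit one RPUSH command per chunk of at most 64 elements (sketch).
emit_rpush() {
    local key=$1 total=$2 limit=64 start=1 end
    while [ "$start" -le "$total" ]; do
        end=$(( start + limit - 1 ))
        [ "$end" -gt "$total" ] && end=$total
        echo "RPUSH $key element_${start}..element_${end}"
        start=$(( end + 1 ))
    done
}

emit_rpush mylist 150   # prints 3 commands: 1..64, 65..128, 129..150
```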

3) The bgrewriteaof command and the AOF rewrite buffer:
Because Redis handles commands in a single process, it performs AOF rewriting in a child process so that the rewrite does not block the server: the bgrewriteaof command.

Rewriting in a child process creates another problem: while the child rewrites the AOF, the parent process keeps handling command requests, and new commands may modify the database, so the server's current state drifts from the state captured in the rewritten AOF file. To solve this, Redis introduces the AOF rewrite buffer.

The AOF rewrite buffer comes into use once the child process has been created; for each command the server then:

  • executes the client's command;
  • appends the resulting write command to aof_buf (the AOF buffer);
  • appends the same write command to the AOF rewrite buffer;

aof_buf is still periodically written and synced to the (old) AOF file, while commands issued during the rewrite also accumulate in the AOF rewrite buffer. When the rewrite finishes, the child sends a signal to the parent; on receiving it, the parent blocks the service and:

  • writes the contents of the AOF rewrite buffer into the new AOF file (ensuring the new file matches the current database state);
  • renames the new AOF file, atomically replacing the old one;

Note:

In the whole AOF rewrite, only this signal handling step blocks; once it finishes, the parent resumes accepting command requests.

4. If Redis has been persisting with RDB only and AOF is then enabled with appendonly yes, the next restart will load from the (still empty) AOF file, so none of the existing data is loaded into the database.

HAProxy logging explained

Log levels
global parameters: if an instance sets no log parameters of its own and only has "log global", it uses the global log settings.

log global
log <address> [len <length>] <facility> [<level> [<minlevel>]]

address is the destination address for log messages.

            - an IPv4 address, default UDP port 514, e.g. 127.0.0.1:514
            - an IPv6 address, default UDP port 514
            - a filesystem path to a UNIX socket; make sure the file is writable.

length is the maximum character length of a log line; range 80-65535, default 1024.
facility must be one of the 24 standard syslog facilities:

             kern   user   mail   daemon auth   syslog lpr    news
             uucp   cron   auth2  ftp    ntp    audit  alert  cron2
             local0 local1 local2 local3 local4 local5 local6 local7

level is an optional log level filter: when set, only messages at this severity or more important are sent; if minlevel is also set, messages more severe than it are capped to that level.

            emerg  alert  crit   err    warning notice info  debug
Example :

log global

log 127.0.0.1:514 local0 notice         

log 127.0.0.1:514 local0 notice notice

log ${LOCAL_SYSLOG}:514 local0 notice 

Log Formats
HAProxy supports five log formats.
1. Default format: very basic and rarely used. It only provides very basic information about the incoming connection: source IP:port, destination IP:port, and the frontend name.

Example :
    listen www
        mode http
        log global
        server srv1 127.0.0.1:8000

>>> Feb  6 12:12:09 localhost \
      haproxy[14385]: Connect from 10.0.1.2:33312 to 10.0.3.31:8012 \
      (www/HTTP)

    Field   Format                                Extract from the example above
      1   process_name '[' pid ']:'                            haproxy[14385]:
      2   'Connect from'                                          Connect from
      3   source_ip ':' source_port                             10.0.1.2:33312
      4   'to'                                                              to
      5   destination_ip ':' destination_port                   10.0.3.31:8012
      6   '(' frontend_name '/' mode ')'                            (www/HTTP)

2. TCP format: enabled with "option tcplog". This format provides much richer information, such as timers, connection counts, and queue sizes; it is the recommended format for pure TCP proxies.

The TCP format's log-format definition string:
log-format %ci:%cp\ [%t]\ %ft\ %b/%s\ %Tw/%Tc/%Tt\ %B\ %ts\ %ac/%fc/%bc/%sc/%rc\ %sq/%bq


Example :
    frontend fnt
        mode tcp
        option tcplog
        log global
        default_backend bck

    backend bck
        server srv1 127.0.0.1:8000

>>> Feb  6 12:12:56 localhost \
      haproxy[14387]: 10.0.1.2:33313 [06/Feb/2009:12:12:51.443] fnt \
      bck/srv1 0/0/5007 212 -- 0/0/0/0/3 0/0
  Field   Format                                Extract from the example above
      1   process_name '[' pid ']:'                            haproxy[14387]:
      2   client_ip ':' client_port                             10.0.1.2:33313
      3   '[' accept_date ']'                       [06/Feb/2009:12:12:51.443]
      4   frontend_name                                                    fnt
      5   backend_name '/' server_name                                bck/srv1
      6   Tw '/' Tc '/' Tt*                                           0/0/5007
      7   bytes_read*                                                      212
      8   termination_state                                                 --
      9   actconn '/' feconn '/' beconn '/' srv_conn '/' retries*    0/0/0/0/3
     10   srv_queue '/' backend_queue                                      0/0
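Since the fields up to this point are whitespace-separated, a quick awk one-liner can pull values out of a line like the sample above; the field positions follow the table:

```shell
# Extract client, timers, and bytes_read from the sample TCP log line.
line='haproxy[14387]: 10.0.1.2:33313 [06/Feb/2009:12:12:51.443] fnt bck/srv1 0/0/5007 212 -- 0/0/0/0/3 0/0'
echo "$line" | awk '{print "client=" $2, "timers=" $6, "bytes=" $7}'
# → client=10.0.1.2:33313 timers=0/0/5007 bytes=212
```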

Field meanings:

Detailed fields description :
  - "client_ip" is the IP address of the client which initiated the TCP
    connection to haproxy. If the connection was accepted on a UNIX socket
    instead, the IP address would be replaced with the word "unix". Note that
    when the connection is accepted on a socket configured with "accept-proxy"
    and the PROXY protocol is correctly used, then the logs will reflect the
    forwarded connection's information.

  - "client_port" is the TCP port of the client which initiated the connection.
    If the connection was accepted on a UNIX socket instead, the port would be
    replaced with the ID of the accepting socket, which is also reported in the
    stats interface.

  - "accept_date" is the exact date when the connection was received by haproxy
    (which might be very slightly different from the date observed on the
    network if there was some queuing in the system's backlog). This is usually
    the same date which may appear in any upstream firewall's log.

  - "frontend_name" is the name of the frontend (or listener) which received
    and processed the connection.

  - "backend_name" is the name of the backend (or listener) which was selected
    to manage the connection to the server. This will be the same as the
    frontend if no switching rule has been applied, which is common for TCP
    applications.

  - "server_name" is the name of the last server to which the connection was
    sent, which might differ from the first one if there were connection errors
    and a redispatch occurred. Note that this server belongs to the backend
    which processed the request. If the connection was aborted before reaching
    a server, "<NOSRV>" is indicated instead of a server name.

  - "Tw" is the total time in milliseconds spent waiting in the various queues.
    It can be "-1" if the connection was aborted before reaching the queue.
    See "Timers" below for more details.

  - "Tc" is the total time in milliseconds spent waiting for the connection to
    establish to the final server, including retries. It can be "-1" if the
    connection was aborted before a connection could be established. See
    "Timers" below for more details.

  - "Tt" is the total time in milliseconds elapsed between the accept and the
    last close. It covers all possible processing. There is one exception, if
    "option logasap" was specified, then the time counting stops at the moment
    the log is emitted. In this case, a '+' sign is prepended before the value,
    indicating that the final one will be larger. See "Timers" below for more
    details.

  - "bytes_read" is the total number of bytes transmitted from the server to
    the client when the log is emitted. If "option logasap" is specified, the
    this value will be prefixed with a '+' sign indicating that the final one
    may be larger. Please note that this value is a 64-bit counter, so log
    analysis tools must be able to handle it without overflowing.

  - "termination_state" is the condition the session was in when the session
    ended. This indicates the session state, which side caused the end of
    session to happen, and for what reason (timeout, error, ...). The normal
    flags should be "--", indicating the session was closed by either end with
    no data remaining in buffers. See below "Session state at disconnection"
    for more details.

  - "actconn" is the total number of concurrent connections on the process when
    the session was logged. It is useful to detect when some per-process system
    limits have been reached. For instance, if actconn is close to 512 when
    multiple connection errors occur, chances are high that the system limits
    the process to use a maximum of 1024 file descriptors and that all of them
    are used. See section 3 "Global parameters" to find how to tune the system.

  - "feconn" is the total number of concurrent connections on the frontend when
    the session was logged. It is useful to estimate the amount of resource
    required to sustain high loads, and to detect when the frontend's "maxconn"
    has been reached. Most often when this value increases by huge jumps, it is
    because there is congestion on the backend servers, but sometimes it can be
    caused by a denial of service attack.

  - "beconn" is the total number of concurrent connections handled by the
    backend when the session was logged. It includes the total number of
    concurrent connections active on servers as well as the number of
    connections pending in queues. It is useful to estimate the amount of
    additional servers needed to support high loads for a given application.
    Most often when this value increases by huge jumps, it is because there is
    congestion on the backend servers, but sometimes it can be caused by a
    denial of service attack.

  - "srv_conn" is the total number of concurrent connections still active on
    the server when the session was logged. It can never exceed the server's
    configured "maxconn" parameter. If this value is very often close or equal
    to the server's "maxconn", it means that traffic regulation is involved a
    lot, meaning that either the server's maxconn value is too low, or that
    there aren't enough servers to process the load with an optimal response
    time. When only one of the server's "srv_conn" is high, it usually means
    that this server has some trouble causing the connections to take longer to
    be processed than on other servers.

  - "retries" is the number of connection retries experienced by this session
    when trying to connect to the server. It must normally be zero, unless a
    server is being stopped at the same moment the connection was attempted.
    Frequent retries generally indicate either a network problem between
    haproxy and the server, or a misconfigured system backlog on the server
    preventing new connections from being queued. This field may optionally be
    prefixed with a '+' sign, indicating that the session has experienced a
    redispatch after the maximal retry count has been reached on the initial
    server. In this case, the server name appearing in the log is the one the
    connection was redispatched to, and not the first one, though both may
    sometimes be the same in case of hashing for instance. So as a general rule
    of thumb, when a '+' is present in front of the retry count, this count
    should not be attributed to the logged server.

  - "srv_queue" is the total number of requests which were processed before
    this one in the server queue. It is zero when the request has not gone
    through the server queue. It makes it possible to estimate the approximate
    server's response time by dividing the time spent in queue by the number of
    requests in the queue. It is worth noting that if a session experiences a
    redispatch and passes through two server queues, their positions will be
    cumulated. A request should not pass through both the server queue and the
    backend queue unless a redispatch occurs.

  - "backend_queue" is the total number of requests which were processed before
    this one in the backend's global queue. It is zero when the request has not
    gone through the global queue. It makes it possible to estimate the average
    queue length, which easily translates into a number of missing servers when
    divided by a server's "maxconn" parameter. It is worth noting that if a
    session experiences a redispatch, it may pass twice in the backend's queue,
    and then both positions will be cumulated. A request should not pass
    through both the server queue and the backend queue unless a redispatch
    occurs.

3. HTTP format: enabled with "option httplog"; this is the recommended format for HTTP proxies.

The HTTP format's log-format definition string:
    log-format %ci:%cp\ [%t]\ %ft\ %b/%s\ %Tq/%Tw/%Tc/%Tr/%Tt\ %ST\ %B\ %CC\ \
               %CS\ %tsc\ %ac/%fc/%bc/%sc/%rc\ %sq/%bq\ %hr\ %hs\ %{+Q}r

Example :
    frontend http-in
        mode http
        option httplog
        log global
        default_backend bck

    backend static
        server srv1 127.0.0.1:8000

>>> Feb  6 12:14:14 localhost \
      haproxy[14389]: 10.0.1.2:33317 [06/Feb/2009:12:14:14.655] http-in \
      static/srv1 10/0/30/69/109 200 2750 - - ---- 1/1/1/1/0 0/0 {1wt.eu} \
      {} "GET /index.html HTTP/1.1"
  Field   Format                                Extract from the example above
      1   process_name '[' pid ']:'                            haproxy[14389]:
      2   client_ip ':' client_port                             10.0.1.2:33317
      3   '[' accept_date ']'                       [06/Feb/2009:12:14:14.655]
      4   frontend_name                                                http-in
      5   backend_name '/' server_name                             static/srv1
      6   Tq '/' Tw '/' Tc '/' Tr '/' Tt*                       10/0/30/69/109
      7   status_code                                                      200
      8   bytes_read*                                                     2750
      9   captured_request_cookie                                            -
     10   captured_response_cookie                                           -
     11   termination_state                                               ----
     12   actconn '/' feconn '/' beconn '/' srv_conn '/' retries*    1/1/1/1/0
     13   srv_queue '/' backend_queue                                      0/0
     14   '{' captured_request_headers* '}'                   {haproxy.1wt.eu}
     15   '{' captured_response_headers* '}'                                {}
     16   '"' http_request '"'                      "GET /index.html HTTP/1.1"
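For HTTP log lines the request itself is the only quoted field, so splitting on the double quote is an easy way to recover it; a sketch on the sample line above (note that captured headers may contain spaces, which shifts whitespace-split positions, so a real parser needs to handle that):

```shell
# Pull the status code (field 7) and the quoted HTTP request line.
line='haproxy[14389]: 10.0.1.2:33317 [06/Feb/2009:12:14:14.655] http-in static/srv1 10/0/30/69/109 200 2750 - - ---- 1/1/1/1/0 0/0 {1wt.eu} {} "GET /index.html HTTP/1.1"'
status=$(echo "$line" | awk '{print $7}')
request=$(echo "$line" | awk -F'"' '{print $2}')
echo "status=$status request=$request"
# → status=200 request=GET /index.html HTTP/1.1
```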

Field meanings:

Detailed fields description :
  - "client_ip" is the IP address of the client which initiated the TCP
    connection to haproxy. If the connection was accepted on a UNIX socket
    instead, the IP address would be replaced with the word "unix". Note that
    when the connection is accepted on a socket configured with "accept-proxy"
    and the PROXY protocol is correctly used, then the logs will reflect the
    forwarded connection's information.

  - "client_port" is the TCP port of the client which initiated the connection.
    If the connection was accepted on a UNIX socket instead, the port would be
    replaced with the ID of the accepting socket, which is also reported in the
    stats interface.

  - "accept_date" is the exact date when the TCP connection was received by
    haproxy (which might be very slightly different from the date observed on
    the network if there was some queuing in the system's backlog). This is
    usually the same date which may appear in any upstream firewall's log. This
    does not depend on the fact that the client has sent the request or not.

  - "frontend_name" is the name of the frontend (or listener) which received
    and processed the connection.

  - "backend_name" is the name of the backend (or listener) which was selected
    to manage the connection to the server. This will be the same as the
    frontend if no switching rule has been applied.

  - "server_name" is the name of the last server to which the connection was
    sent, which might differ from the first one if there were connection errors
    and a redispatch occurred. Note that this server belongs to the backend
    which processed the request. If the request was aborted before reaching a
    server, "<NOSRV>" is indicated instead of a server name. If the request was
    intercepted by the stats subsystem, "<STATS>" is indicated instead.

  - "Tq" is the total time in milliseconds spent waiting for the client to send
    a full HTTP request, not counting data. It can be "-1" if the connection
    was aborted before a complete request could be received. It should always
    be very small because a request generally fits in one single packet. Large
    times here generally indicate network trouble between the client and
    haproxy. See "Timers" below for more details.

  - "Tw" is the total time in milliseconds spent waiting in the various queues.
    It can be "-1" if the connection was aborted before reaching the queue.
    See "Timers" below for more details.

  - "Tc" is the total time in milliseconds spent waiting for the connection to
    establish to the final server, including retries. It can be "-1" if the
    request was aborted before a connection could be established. See "Timers"
    below for more details.

  - "Tr" is the total time in milliseconds spent waiting for the server to send
    a full HTTP response, not counting data. It can be "-1" if the request was
    aborted before a complete response could be received. It generally matches
    the server's processing time for the request, though it may be altered by
    the amount of data sent by the client to the server. Large times here on
    "GET" requests generally indicate an overloaded server. See "Timers" below
    for more details.

  - "Tt" is the total time in milliseconds elapsed between the accept and the
    last close. It covers all possible processing. There is one exception, if
    "option logasap" was specified, then the time counting stops at the moment
    the log is emitted. In this case, a '+' sign is prepended before the value,
    indicating that the final one will be larger. See "Timers" below for more
    details.

  - "status_code" is the HTTP status code returned to the client. This status
    is generally set by the server, but it might also be set by haproxy when
    the server cannot be reached or when its response is blocked by haproxy.

  - "bytes_read" is the total number of bytes transmitted to the client when
    the log is emitted. This does include HTTP headers. If "option logasap" is
    specified, the this value will be prefixed with a '+' sign indicating that
    the final one may be larger. Please note that this value is a 64-bit
    counter, so log analysis tools must be able to handle it without
    overflowing.

  - "captured_request_cookie" is an optional "name=value" entry indicating that
    the client had this cookie in the request. The cookie name and its maximum
    length are defined by the "capture cookie" statement in the frontend
    configuration. The field is a single dash ('-') when the option is not
    set. Only one cookie may be captured, it is generally used to track session
    ID exchanges between a client and a server to detect session crossing
    between clients due to application bugs. For more details, please consult
    the section "Capturing HTTP headers and cookies" below.

  - "captured_response_cookie" is an optional "name=value" entry indicating
    that the server has returned a cookie with its response. The cookie name
    and its maximum length are defined by the "capture cookie" statement in the
    frontend configuration. The field is a single dash ('-') when the option is
    not set. Only one cookie may be captured, it is generally used to track
    session ID exchanges between a client and a server to detect session
    crossing between clients due to application bugs. For more details, please
    consult the section "Capturing HTTP headers and cookies" below.

  - "termination_state" is the condition the session was in when the session
    ended. This indicates the session state, which side caused the end of
    session to happen, for what reason (timeout, error, ...), just like in TCP
    logs, and information about persistence operations on cookies in the last
    two characters. The normal flags should begin with "--", indicating the
    session was closed by either end with no data remaining in buffers. See
    below "Session state at disconnection" for more details.

  - "actconn" is the total number of concurrent connections on the process when
    the session was logged. It is useful to detect when some per-process system
    limits have been reached. For instance, if actconn is close to 512 or 1024
    when multiple connection errors occur, chances are high that the system
    limits the process to use a maximum of 1024 file descriptors and that all
    of them are used. See section 3 "Global parameters" to find how to tune the
    system.

  - "feconn" is the total number of concurrent connections on the frontend when
    the session was logged. It is useful to estimate the amount of resource
    required to sustain high loads, and to detect when the frontend's "maxconn"
    has been reached. Most often when this value increases by huge jumps, it is
    because there is congestion on the backend servers, but sometimes it can be
    caused by a denial of service attack.

  - "beconn" is the total number of concurrent connections handled by the
    backend when the session was logged. It includes the total number of
    concurrent connections active on servers as well as the number of
    connections pending in queues. It is useful to estimate the amount of
    additional servers needed to support high loads for a given application.
    Most often when this value increases by huge jumps, it is because there is
    congestion on the backend servers, but sometimes it can be caused by a
    denial of service attack.

  - "srv_conn" is the total number of concurrent connections still active on
    the server when the session was logged. It can never exceed the server's
    configured "maxconn" parameter. If this value is very often close or equal
    to the server's "maxconn", it means that traffic regulation is involved a
    lot, meaning that either the server's maxconn value is too low, or that
    there aren't enough servers to process the load with an optimal response
    time. When only one of the server's "srv_conn" is high, it usually means
    that this server has some trouble causing the requests to take longer to be
    processed than on other servers.

  - "retries" is the number of connection retries experienced by this session
    when trying to connect to the server. It must normally be zero, unless a
    server is being stopped at the same moment the connection was attempted.
    Frequent retries generally indicate either a network problem between
    haproxy and the server, or a misconfigured system backlog on the server
    preventing new connections from being queued. This field may optionally be
    prefixed with a '+' sign, indicating that the session has experienced a
    redispatch after the maximal retry count has been reached on the initial
    server. In this case, the server name appearing in the log is the one the
    connection was redispatched to, and not the first one, though both may
    sometimes be the same in case of hashing for instance. So as a general rule
    of thumb, when a '+' is present in front of the retry count, this count
    should not be attributed to the logged server.

  - "srv_queue" is the total number of requests which were processed before
    this one in the server queue. It is zero when the request has not gone
    through the server queue. It makes it possible to estimate the approximate
    server's response time by dividing the time spent in queue by the number of
    requests in the queue. It is worth noting that if a session experiences a
    redispatch and passes through two server queues, their positions will be
    cumulated. A request should not pass through both the server queue and the
    backend queue unless a redispatch occurs.

  - "backend_queue" is the total number of requests which were processed before
    this one in the backend's global queue. It is zero when the request has not
    gone through the global queue. It makes it possible to estimate the average
    queue length, which easily translates into a number of missing servers when
    divided by a server's "maxconn" parameter. It is worth noting that if a
    session experiences a redispatch, it may pass twice in the backend's queue,
    and then both positions will be cumulated. A request should not pass
    through both the server queue and the backend queue unless a redispatch
    occurs.

  - "captured_request_headers" is a list of headers captured in the request due
    to the presence of the "capture request header" statement in the frontend.
    Multiple headers can be captured, they will be delimited by a vertical bar
    ('|'). When no capture is enabled, the braces do not appear, causing a
    shift of remaining fields. It is important to note that this field may
    contain spaces, and that using it requires a smarter log parser than when
    it's not used. Please consult the section "Capturing HTTP headers and
    cookies" below for more details.

  - "captured_response_headers" is a list of headers captured in the response
    due to the presence of the "capture response header" statement in the
    frontend. Multiple headers can be captured, they will be delimited by a
    vertical bar ('|'). When no capture is enabled, the braces do not appear,
    causing a shift of remaining fields. It is important to note that this
    field may contain spaces, and that using it requires a smarter log parser
    than when it's not used. Please consult the section "Capturing HTTP headers
    and cookies" below for more details.

  - "http_request" is the complete HTTP request line, including the method,
    request and HTTP version string. Non-printable characters are encoded (see
    below the section "Non-printable characters"). This is always the last
    field, and it is always delimited by quotes and is the only one which can
    contain quotes. If new fields are added to the log format, they will be
    added before this field. This field might be truncated if the request is
    huge and does not fit in the standard syslog buffer (1024 characters). This
    is the reason why this field must always remain the last one.

4. CLF HTTP format: carries the same information as the HTTP log format, rendered in Common Log Format (CLF) style so that standard CLF tooling can parse it.

The CLF HTTP format is defined by the following log-format string:
    log-format %{+Q}o\ %{-Q}ci\ -\ -\ [%T]\ %r\ %ST\ %B\ \"\"\ \"\"\ %cp\ \
               %ms\ %ft\ %b\ %s\ %Tq\ %Tw\ %Tc\ %Tr\ %Tt\ %tsc\ %ac\ %fc\ \
               %bc\ %sc\ %rc\ %sq\ %bq\ %CC\ %CS\ %hrl\ %hsl

5. Custom log format

R	var	field name (8.2.2 and 8.2.3 for description)	type
%o	special variable, apply flags on all next var	
%B	bytes_read (from server to client)	numeric
H	%CC	captured_request_cookie	string
H	%CS	captured_response_cookie	string
%H	hostname	string
H	%HM	HTTP method (ex: POST)	string
H	%HP	HTTP request URI without query string (path)	string
H	%HQ	HTTP request URI query string (ex: ?bar=baz)	string
H	%HU	HTTP request URI (ex: /foo?bar=baz)	string
H	%HV	HTTP version (ex: HTTP/1.0)	string
%ID	unique-id	string
%ST	status_code	numeric
%T	gmt_date_time	date
%Tc	Tc	numeric
%Tl	local_date_time	date
H	%Tq	Tq	numeric
H	%Tr	Tr	numeric
%Ts	timestamp	numeric
%Tt	Tt	numeric
%Tw	Tw	numeric
%U	bytes_uploaded (from client to server)	numeric
%ac	actconn	numeric
%b	backend_name	string
%bc	beconn (backend concurrent connections)	numeric
%bi	backend_source_ip (connecting address)	IP
%bp	backend_source_port (connecting address)	numeric
%bq	backend_queue	numeric
%ci	client_ip (accepted address)	IP
%cp	client_port (accepted address)	numeric
%f	frontend_name	string
%fc	feconn (frontend concurrent connections)	numeric
%fi	frontend_ip (accepting address)	IP
%fp	frontend_port (accepting address)	numeric
%ft	frontend_name_transport (‘~’ suffix for SSL)	string
%lc	frontend_log_counter	numeric
%hr	captured_request_headers default style	string
%hrl	captured_request_headers CLF style	string list
%hs	captured_response_headers default style	string
%hsl	captured_response_headers CLF style	string list
%ms	accept date milliseconds (left-padded with 0)	numeric
%pid	PID	numeric
H	%r	http_request	string
%rc	retries	numeric
%rt	request_counter (HTTP req or TCP session)	numeric
%s	server_name	string
%sc	srv_conn (server concurrent connections)	numeric
%si	server_IP (target address)	IP
%sp	server_port (target address)	numeric
%sq	srv_queue	numeric
S	%sslc	ssl_ciphers (ex: AES-SHA)	string
S	%sslv	ssl_version (ex: TLSv1)	string
%t	date_time (with millisecond resolution)	date
%ts	termination_state	string
H	%tsc	termination_state with cookie status	string
R = Restrictions : H = mode http only ; S = SSL only

Example :

global
        maxconn 65535
        chroot /usr/local/haproxy
        uid 99
        gid 99
        daemon
        nbproc 1
        description haproxy
        pidfile /var/run/haproxy.pid
defaults
        log global
        mode http
        balance roundrobin
        option forceclose
        option dontlognull
        option redispatch
        option abortonclose
        log-format %ci:%cp\ [%t]\ %U\ %HM\ %HU\ %HV\ %ST\ %si:%sp

>>> Sep 12 10:17:52 localhost haproxy[22909]: 10.1.250.98:53300 [12/Sep/2016:10:17:52.532] 496 GET / HTTP/1.1 200 10.1.1.20:9090

Error logs

Error log format:

 >>> Dec  3 18:27:14 localhost \
          haproxy[6103]: 127.0.0.1:56059 [03/Dec/2012:17:35:10.380] frt/f1: \
          Connection error during SSL handshake

  Field   Format                                Extract from the example above
      1   process_name '[' pid ']:'                             haproxy[6103]:
      2   client_ip ':' client_port                            127.0.0.1:56059
      3   '[' accept_date ']'                       [03/Dec/2012:17:35:10.380]
      4   frontend_name "/" bind_name ":"                              frt/f1:
      5   message                        Connection error during SSL handshake

Capturing HTTP headers

Example:
capture request header Host len 15
capture request header X-Forwarded-For len 15
capture request header Referer len 15

Capturing custom headers:
frontend webapp
        bind *:80
        capture request header test len 20
        capture request header test2 len 20
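
With the two captures above enabled, the values of the "test" and "test2" request headers are appended to each HTTP log line between braces, delimited by a vertical bar ('|'), as described for "captured_request_headers" earlier; a header absent from the request leaves its slot empty. A hypothetical log line (addresses, timings, and header values invented for illustration, middle fields elided) would end like:

    >>> Sep 12 10:25:03 localhost haproxy[22909]: 10.1.250.98:53310 \
          [12/Sep/2016:10:25:03.123] webapp webapp/srv1 ... \
          {abc|} "GET / HTTP/1.1"

Here the request carried "test: abc" but no "test2" header, hence the empty second slot.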

What is the difference between SPI and API?

the API is the description of classes/interfaces/methods/… that you call and use to achieve a goal and

the SPI is the description of classes/interfaces/methods/… that you extend and implement to achieve a goal

Put differently, the API tells you what a specific class/method does for you and the SPI tells you what you must do to conform.

Usually API and SPI are separate. For example in JDBC the Driver class is part of the SPI: If you simply want to use JDBC, you don’t need to use it directly, but everyone who implements a JDBC driver must implement that class.

Sometimes they overlap, however. The Connection interface is both SPI and API: You use it routinely when you use a JDBC driver and it needs to be implemented by the developer of the JDBC driver.

// Service provider framework sketch

// Service interface
public interface Service {
    ... // Service-specific methods go here
}

// Service provider interface
public interface Provider {
    Service newService();
}

// Noninstantiable class for service registration and access
public class Services {
    private Services() { }  // Prevents instantiation (Item 4)

    // Maps service names to services
    private static final Map<String, Provider> providers =
        new ConcurrentHashMap<String, Provider>();
    public static final String DEFAULT_PROVIDER_NAME = "<def>";

    // Provider registration API
    public static void registerDefaultProvider(Provider p) {
        registerProvider(DEFAULT_PROVIDER_NAME, p);
    }
    public static void registerProvider(String name, Provider p){
        providers.put(name, p);
    }

    // Service access API
    public static Service newInstance() {
        return newInstance(DEFAULT_PROVIDER_NAME);
    }
    public static Service newInstance(String name) {
        Provider p = providers.get(name);
        if (p == null)
            throw new IllegalArgumentException(
                "No provider registered with name: " + name);
        return p.newService();
    }
}
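
The framework above can be exercised as follows. This is a minimal self-contained sketch: the Service, Provider, and Services declarations from the sketch are repeated here in condensed form so the example compiles on its own, and the "demo" provider name is invented for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class Demo {
    public static void main(String[] args) {
        // Register providers; the lambda satisfies the Provider interface,
        // and each service it creates is a trivial anonymous implementation.
        Services.registerDefaultProvider(() -> new Service() { });
        Services.registerProvider("demo", () -> new Service() { });

        Service s1 = Services.newInstance();        // via default provider
        Service s2 = Services.newInstance("demo");  // via named provider
        System.out.println(s1 != null && s2 != null);  // prints "true"
    }
}

interface Service { }

interface Provider {
    Service newService();
}

class Services {
    private Services() { }  // noninstantiable

    private static final Map<String, Provider> providers =
        new ConcurrentHashMap<>();
    static final String DEFAULT_PROVIDER_NAME = "<def>";

    static void registerDefaultProvider(Provider p) {
        registerProvider(DEFAULT_PROVIDER_NAME, p);
    }
    static void registerProvider(String name, Provider p) {
        providers.put(name, p);
    }
    static Service newInstance() {
        return newInstance(DEFAULT_PROVIDER_NAME);
    }
    static Service newInstance(String name) {
        Provider p = providers.get(name);
        if (p == null)
            throw new IllegalArgumentException(
                "No provider registered with name: " + name);
        return p.newService();
    }
}
```

Note that client code only touches the access API (Services.newInstance); which concrete Service it gets is decided entirely by whichever Provider was registered, which is the SPI side of the design.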
