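Redis provides two persistence strategies, RDB and AOF. RDB is the default: it periodically writes a complete on-disk snapshot of the database, the dump.rdb file. The snapshot interval is configurable; if a snapshot is taken every 5 minutes, for example, a crash can cost the user up to 5 minutes of data. RDB is therefore not a highly durable option, but it performs well. RDB is also very easy to back up, since the user only has to copy the dump.rdb file. For better durability, Redis supports AOF, which appends operations to a log (the appendonly.aof file). The log can be synced to disk once per second or once per operation: once per second means the user may lose up to 1 second of data, while once per operation is the most durable choice and the slowest. The log file can grow very large, so Redis rewrites it in the background to compact it. AOF is not well suited to backups.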
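The trade-off above can be sketched as a small Python helper. It renders the redis.conf directives involved (`save`, `dbfilename`, `appendonly`, `appendfilename`, `appendfsync`) and returns the worst-case data-loss window under the simplified model used in this article. The function names and the 300-second default are illustrative assumptions, not part of Redis, and the real loss window can be somewhat larger in practice.

```python
def persistence_conf(snapshot_secs=300, min_changes=1, aof=False, fsync="everysec"):
    """Return redis.conf lines for RDB snapshots and, optionally, AOF."""
    if fsync not in ("always", "everysec", "no"):
        raise ValueError(f"unknown appendfsync policy: {fsync}")
    lines = [
        # Snapshot if at least min_changes keys changed within snapshot_secs.
        f"save {snapshot_secs} {min_changes}",
        "dbfilename dump.rdb",
        f"appendonly {'yes' if aof else 'no'}",
    ]
    if aof:
        lines.append("appendfilename appendonly.aof")
        lines.append(f"appendfsync {fsync}")
    return lines


def worst_case_loss_secs(snapshot_secs, aof=False, fsync="everysec"):
    """Simplified worst-case loss window in seconds (assumption: article's model)."""
    if not aof:
        return snapshot_secs  # everything since the last snapshot
    if fsync == "always":
        return 0  # every operation is synced before it is acknowledged
    if fsync == "everysec":
        return 1  # roughly one second of writes
    return None  # "no": flushing is left to the operating system


if __name__ == "__main__":
    print("\n".join(persistence_conf(aof=True)))
    print(worst_case_loss_secs(300), worst_case_loss_secs(300, aof=True))
```

Calling `persistence_conf()` with the defaults gives the RDB-only setup, and passing `aof=True` gives the RDB+AOF setup with a once-per-second sync.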
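Redis recommends RDB, or RDB+AOF when durability matters, and advises against using AOF alone. To reduce disk load, Redis never runs an RDB snapshot and an AOF rewrite at the same time. More details on RDB and AOF are available on the official website. This article measures the performance of RDB and of RDB+AOF under a fairly heavy write load.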
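Redis also provides master-slave synchronization: a slave can be configured for a Redis server, and when the master crashes the slave can take over its work. This relies on copying the master's data to the slave, that is, replication. Redis replication is asynchronous. This article measures the impact of replication on performance under a very heavy write load.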
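To make "asynchronous" concrete, here is a toy Python model, not Redis code: the master acknowledges a write immediately and ships it to the slave later. The class `ToyMaster` and its methods are hypothetical names invented for this illustration.

```python
from collections import deque


class ToyMaster:
    """Toy model: writes are acknowledged before the slave has applied them."""

    def __init__(self):
        self.data = {}
        self.backlog = deque()  # writes not yet shipped to the slave

    def set(self, key, value):
        self.data[key] = value
        self.backlog.append((key, value))
        return "OK"  # acknowledged without waiting for the slave

    def ship(self, slave, max_items):
        """Send up to max_items pending writes to the slave, in order."""
        sent = 0
        while self.backlog and sent < max_items:
            key, value = self.backlog.popleft()
            slave[key] = value
            sent += 1
        return sent


master = ToyMaster()
slave = {}
for i in range(5):
    master.set(f"user{i}", i)
master.ship(slave, max_items=3)

# If the master crashed now, the slave would take over without these writes.
lost = sorted(set(master.data) - set(slave))
print(lost)  # ['user3', 'user4']
```

The point of the sketch is only that a failover can lose the writes still sitting in the backlog; that gap is the price of keeping the slave off the client's write path.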
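1. Experimental setup
Because persistence and replication are closely tied to write load, I chose workloada, which is 50% reads and 50% updates. The two servers are on the same 100 Mbps Ethernet network and have 16GB of memory each; one runs YCSB and the other runs redis-server. Each test first inserts 1 million records into Redis and then executes 1 million operations (50% read, 50% update), recording latency. The replication test needs a third server, which also has 16GB of memory. AOF is synced once per second.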
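The exact commands used are not given here, so the following is a sketch that assembles YCSB invocations for the two phases, assuming the standard `bin/ycsb` launcher, the bundled `workloads/workloada` file, and the Redis binding's `redis.host` and `redis.port` properties. The helper name `ycsb_command` and the address 192.0.2.10 are placeholders.

```python
def ycsb_command(phase, redis_host, redis_port=6379,
                 records=1_000_000, operations=1_000_000):
    """Build the argument list for one YCSB phase against Redis."""
    if phase not in ("load", "run"):
        raise ValueError("phase must be 'load' or 'run'")
    props = {
        "redis.host": redis_host,
        "redis.port": redis_port,
        "recordcount": records,        # records inserted by the load phase
        "operationcount": operations,  # operations executed by the run phase
    }
    # workloada already defines the 50% read / 50% update mix.
    cmd = ["bin/ycsb", phase, "redis", "-s", "-P", "workloads/workloada"]
    for name, value in props.items():
        cmd += ["-p", f"{name}={value}"]
    return cmd


if __name__ == "__main__":
    # 192.0.2.10 is a placeholder address for the redis-server machine.
    for phase in ("load", "run"):
        print(" ".join(ycsb_command(phase, "192.0.2.10")))
```

The list form can be passed directly to `subprocess.run` from the YCSB directory on the client machine.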
2. RDB and AOF
Results of the load phase.
Load phase | RDB | AOF |
run time (ms) | 807436.0 | 864731.0 |
Avg. insert latency (µs) | 801.21 | 858.34 |
Min insert latency (µs) | 605 | 645 |
Max insert latency (µs) | 37397 | 5855846 |
95th percentile insert latency (ms) | 0 | 0 |
99th percentile insert latency (ms) | 0 | 0 |
Results of the run phase.
Run phase | RDB | AOF |
run time (ms) | 401831.0 | 437653.0 |
Avg. update latency (µs) | 299.55 | 341.54 |
Min update latency (µs) | 157 | 174 |
Max update latency (µs) | 39069 | 7333105 |
95th percentile update latency (ms) | 0 | 0 |
99th percentile update latency (ms) | 0 | 0 |
Avg. read latency (µs) | 489.56 | 519.13 |
Min read latency (µs) | 342 | 359 |
Max read latency (µs) | 42185 | 7278363 |
95th percentile read latency (ms) | 0 | 0 |
99th percentile read latency (ms) | 0 | 0 |
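AOF raises average latency by roughly 6% to 14%, and its effect on maximum latency is far larger: the maximums are about 150 to 190 times higher than with RDB alone. A possible cause is high latency while the AOF rewrite runs, but this is purely a guess.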
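The percentages can be reproduced from the two tables with a few lines of Python. The numbers are copied from the RDB and AOF columns; the helper name `overhead_pct` is just for this sketch.

```python
def overhead_pct(baseline, value):
    """Relative increase of value over baseline, in percent."""
    return (value - baseline) / baseline * 100.0


# (RDB, AOF) pairs taken from the load-phase and run-phase tables.
avg_latency = {
    "insert": (801.21, 858.34),
    "update": (299.55, 341.54),
    "read": (489.56, 519.13),
}
max_latency = {
    "insert": (37397, 5855846),
    "update": (39069, 7333105),
    "read": (42185, 7278363),
}

for op, (rdb_avg, aof_avg) in avg_latency.items():
    rdb_max, aof_max = max_latency[op]
    print(f"{op}: average +{overhead_pct(rdb_avg, aof_avg):.1f}%, "
          f"maximum x{aof_max / rdb_max:.0f}")
```

Averages move by single-digit to low double-digit percentages, while the maximums change by more than two orders of magnitude, which is why the two are worth reporting separately.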
3. replication
AOF is turned off for this experiment. The master and slave servers are started together before the experiment begins.
Results of the load phase.
Load phase | non-replication | replication |
run time (ms) | 807436.0 | 786098.0 |
Avg. insert latency (µs) | 801.21 | 780.44 |
Min insert latency (µs) | 605 | 496 |
Max insert latency (µs) | 37397 | 61072 |
95th percentile insert latency (ms) | 0 | 0 |
99th percentile insert latency (ms) | 0 | 0 |
Results of the run phase.
Run phase | non-replication | replication |
run time (ms) | 401831.0 | 398683.0 |
Avg. update latency (µs) | 299.55 | 294.72 |
Min update latency (µs) | 157 | 164 |
Max update latency (µs) | 39069 | 37605 |
95th percentile update latency (ms) | 0 | 0 |
99th percentile update latency (ms) | 0 | 0 |
Avg. read latency (µs) | 489.56 | 488.82 |
Min read latency (µs) | 342 | 318 |
Max read latency (µs) | 42185 | 201731 |
95th percentile read latency (ms) | 0 | 0 |
99th percentile read latency (ms) | 0 | 0 |
Replication has little effect on performance here: run times and average latencies differ by less than 3%, although the maximum insert and read latencies are higher. To look more closely, after the load phase finished I started an empty slave at the same time as the run-phase workload.
Run phase | replication |
run time (ms) | 490977.0 |
Avg. update latency (µs) | 390.52 |
Min update latency (µs) | 171 |
Max update latency (µs) | 153180 |
95th percentile update latency (ms) | 0 |
99th percentile update latency (ms) | 5 |
Avg. read latency (µs) | 577.92 |
Min read latency (µs) | 304 |
Max read latency (µs) | 201658 |
95th percentile read latency (ms) | 0 |
99th percentile read latency (ms) | 5 |
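In this experiment a large amount of replication takes place while the workload runs. The run time increases by about 23% (490977 ms versus 398683 ms in the earlier replication run), and the maximum update latency and the 99th percentile latencies rise noticeably as well.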
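As a cross-check, the slowdown can be computed from the two replication runs, and the run time can be estimated from the average latencies. The sketch assumes a single YCSB client thread, an even 50/50 read/update split, and average latencies in microseconds; none of these is stated explicitly, but they are the reading under which the numbers line up.

```python
OPS = 1_000_000  # operations executed in the run phase


def estimated_runtime_ms(avg_update_us, avg_read_us, ops=OPS):
    """Estimate run time for one client thread and a 50/50 operation mix.

    Assumption: average latencies are in microseconds.
    """
    mean_us = 0.5 * avg_update_us + 0.5 * avg_read_us
    return ops * mean_us / 1000.0


# Slave already in sync before the run vs. empty slave started with the run.
steady = {"run_time_ms": 398683.0, "update": 294.72, "read": 488.82}
syncing = {"run_time_ms": 490977.0, "update": 390.52, "read": 577.92}

slowdown = syncing["run_time_ms"] / steady["run_time_ms"] - 1.0
print(f"run time increase: {slowdown * 100:.1f}%")

for name, run in (("steady", steady), ("syncing", syncing)):
    est = estimated_runtime_ms(run["update"], run["read"])
    print(f"{name}: estimated {est:.0f} ms, measured {run['run_time_ms']:.0f} ms")
```

In both runs the estimate lands within about 2% of the measured run time, which suggests that the extra time during the initial sync comes from higher per-operation latency rather than from something outside the measured operations.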