中国科学技术大学学报 ›› 2020, Vol. 50 ›› Issue (6): 825-831.DOI: 10.3969/j.issn.0253-2778.2020.06.015

• 论著 • 上一篇    下一篇

一种读写均衡的高性能键值存储系统

吴加禹,李永坤,许胤龙   

  1. 中国科学技术大学计算机科学与技术学院,安徽合肥 230026
  • 收稿日期:2020-03-13 修回日期:2020-06-04 接受日期:2020-06-04 出版日期:2020-06-30 发布日期:2020-06-04
  • 通讯作者: 李永坤
  • 作者简介:吴加禹,男,1994年生,硕士生,研究方向:键值存储系统.E-mail:wujy@mail.ustc.edu.cn
  • 基金资助:
    国家自然科学基金(61772484);统筹推进世界一流大学和一流学科建设专项资金(YD2150002003)资助.

A read/write balanced high-performance key-value store

WU Jiayu, LI Yongkun, XU Yinlong   

  1. School of Computer Science and Technology,University of Science and Technology of China,Hefei 230026, China
  • Received:2020-03-13 Revised:2020-06-04 Accepted:2020-06-04 Online:2020-06-30 Published:2020-06-04

摘要: 日志结构合并树(LSM-tree)因能利用外存设备的顺序访问性能,被广泛应用于键值存储系统的核心数据结构.由于LSM-tree层次化的、有序的数据组织结构需要通过大量的数据合并操作维护,故引起了严重的写放大效应.最近的研究工作提出了若干优化方案缓解LSM-tree的写放大,但是牺牲了查询性能和空间利用率.为此基于LSM-tree的键值存储系统提出一种新的架构,其核心设计是采用键值分离的方案降低数据合并开销,并以一种新型的树状结构vTree为值维护一定程度有序性,保障高效的范围查询;同时为vTree设计了相应的数据合并和空间回收方法.实验结果表明,基于该架构实现的键值存储系统在写入、点查询、范围查询各方面有均衡的高性能表现,且空间开销较低.

关键词: 日志结构合并树, 写放大, 键值分离, 范围查询

Abstract: Log-structured merge tree (LSM-tree) is widely used as the core data structure of modern key-value stores due to its ability to utilize sequential access performance of external storage. However, it suffers from expensive merge operations to maintain the layered and ordered data organization, which induces significant write amplification. Recent researches have proposed various optimizations to mitigate write amplification, but at the cost of query performance and space utilization. A new architecture for LSM-tree based key-value store is proposed. The key idea is to leverage the key-value separation design to mitigate merge overhead, while maintaining a certain degree of order for values by using a new tree structure called vTree to improve the range query performance. In the meantime, corresponding data merge and space reclaim algorithms were developed for the vTree. The experimental results show that the key-value store developed with this architecture has well balanced performance on write, point lookup, and range query, with low space cost.

Key words: log-structured merge tree, write amplification, key-value separation, range query

中图分类号: