Journal of University of Science and Technology of China, 2020, Vol. 50, Issue (7): 1013-1018. DOI: 10.3969/j.issn.0253-2778.2020.07.019

• Original Article •

Identification of PBMC-related cells in single-cell RNA sequencing data

龚乐君,周佘海,程逸飞,高志宏,李华康   

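  1. School of Computer Science, School of Software, and School of Cyberspace Security, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu 210023; 2. Jiangsu Key Laboratory of Big Data Security and Intelligent Processing, Nanjing, Jiangsu 210023; 3. Zhejiang Engineering Research Center of Intelligent Medicine, Wenzhou, Zhejiang 325035; 4. Key Laboratory of Urban Land Resources Monitoring and Simulation, Ministry of Natural Resources, Shenzhen, Guangdong 518034; 5. Suzhou Privacy Information Technology Company, Suzhou, Jiangsu 215011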
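  • Received: 2020-06-03; Revised: 2020-06-21; Accepted: 2020-06-21; Issue date: 2020-07-31; Released online: 2020-06-21
  • Supported by:
    National Natural Science Foundation of China (61502243); Zhejiang Engineering Research Center of Intelligent Medicine (2016E10011); China Postdoctoral Science Foundation (2018M632349); Jiangsu Province "Six Talent Peaks" High-Level Talent Project (XYDXX-204); Open Fund of the Key Laboratory of Urban Land Resources Monitoring and Simulation (KF-2019-04-011, KF-2019-04-065); Suzhou Gusu Science and Technology Entrepreneurship Angel Program (CYTS2018233); Scientific Research Startup Fund for Introduced Talents of Nanjing University of Posts and Telecommunications (NY217136).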

Identification of PBMC-related cells of single cell RNA sequence data

GONG Lejun, ZHOU Shehai, CHENG Yifei, GAO Zhihong, LI Huakang

  1. School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210023, China; 2. Jiangsu Key Lab of Big Data Security & Intelligent Processing, Nanjing 210023, China; 3. Zhejiang Engineering Research Center of Intelligent Medicine, Wenzhou 325035, China; 4. Key Laboratory of Urban Land Resources Monitoring and Simulation, Shenzhen 518034, China; 5. Suzhou Privacy Information Technology Company, Suzhou 215011, China
  • Received: 2020-06-03; Revised: 2020-06-21; Accepted: 2020-06-21; Online: 2020-07-31; Published: 2020-06-21

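Abstract: Cell type identification is one of the main tasks in single-cell RNA sequencing. To address this problem, a method for automatic identification of cell type based on random forest (AICTRF) is proposed to identify cell types in single-cell sequencing data. The method trains a random forest classification model and then uses the trained model to predict unknown cell types. A random forest classification model was trained on a human peripheral blood mononuclear cell (PBMC) sequencing dataset, and the model was used to predict the cell types of B cell-related subtypes in human PBMC. Experimental results show that the method can help researchers identify cell types in single-cell sequencing data automatically, quickly, and effectively.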

Key words: single-cell RNA sequencing data mining, cell type, B cell subtype, clustering, classification

Abstract: Cell type identification is one of the main tasks of single-cell RNA sequencing. This paper proposes AICTRF (automatic identification of cell type based on random forest), a method for identifying cell types in single-cell sequencing data. The method trains a random forest classification model and then uses the trained model to predict unknown cell types. A random forest classifier was trained on a human peripheral blood mononuclear cell (PBMC) sequencing dataset and used to predict the cell types of B cell-related subtypes in human PBMC. The results show that the proposed method can help researchers automatically identify cell types in single-cell sequencing data.

Key words: scRNA-seq data mining, cell type, B cell subtype, clustering, classification
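
The train-then-predict workflow described in the abstract can be illustrated with a minimal sketch. This is not the authors' AICTRF implementation: the expression values are synthetic, the four "genes" and the labels `naive_B` and `memory_B` are hypothetical placeholders, and the tiny random forest (bootstrap sampling, random feature subsets, Gini splits, majority vote) is written from scratch only to keep the example self-contained. A real analysis would more likely use an established library and real PBMC count data.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y, feature_ids):
    """Return the (feature, threshold) with the lowest weighted Gini, or None."""
    best, best_score, n = None, gini(y), len(y)
    for f in feature_ids:
        for t in sorted({row[f] for row in X}):
            left = [y[i] for i in range(n) if X[i][f] <= t]
            right = [y[i] for i in range(n) if X[i][f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(X, y, rng, max_depth, n_sub):
    """Grow one tree; leaves are class labels, inner nodes are tuples."""
    majority = Counter(y).most_common(1)[0][0]
    if max_depth == 0 or len(set(y)) == 1:
        return majority
    feats = rng.sample(range(len(X[0])), n_sub)  # random feature subset
    split = best_split(X, y, feats)
    if split is None:
        return majority
    f, t = split
    li = [i for i in range(len(y)) if X[i][f] <= t]
    ri = [i for i in range(len(y)) if X[i][f] > t]
    return (f, t,
            build_tree([X[i] for i in li], [y[i] for i in li], rng, max_depth - 1, n_sub),
            build_tree([X[i] for i in ri], [y[i] for i in ri], rng, max_depth - 1, n_sub))

def predict_tree(node, row):
    """Walk from the root to a leaf label."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def fit_forest(X, y, n_trees=25, max_depth=4, seed=0):
    """Train n_trees trees, each on a bootstrap resample of the cells."""
    rng = random.Random(seed)
    n_sub = max(1, int(len(X[0]) ** 0.5))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(y)) for _ in y]  # sample with replacement
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx],
                                 rng, max_depth, n_sub))
    return forest

def predict_forest(forest, row):
    """Majority vote over all trees."""
    return Counter(predict_tree(t, row) for t in forest).most_common(1)[0][0]

def make_toy_cells(n_per_class, rng):
    """Synthetic cells: genes 0-1 high in one label, genes 2-3 high in the other."""
    X, y = [], []
    for _ in range(n_per_class):
        X.append([rng.uniform(5, 6), rng.uniform(5, 6), rng.uniform(0, 1), rng.uniform(0, 1)])
        y.append("naive_B")    # hypothetical label
        X.append([rng.uniform(0, 1), rng.uniform(0, 1), rng.uniform(5, 6), rng.uniform(5, 6)])
        y.append("memory_B")   # hypothetical label
    return X, y

X_train, y_train = make_toy_cells(20, random.Random(42))
forest = fit_forest(X_train, y_train)

# Two "unknown" cells with clear-cut marker patterns (assumed values).
unknown_cells = [[5.5, 5.5, 0.0, 0.0], [0.0, 0.0, 5.5, 5.5]]
predictions = [predict_forest(forest, cell) for cell in unknown_cells]
print(predictions)
```

Because the toy classes are perfectly separated, this only demonstrates the mechanics of training on labeled cells and voting on unlabeled ones; it says nothing about accuracy on real PBMC data.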
