Journal of University of Science and Technology of China ›› 2013, Vol. 43 ›› Issue (8): 607-621. DOI: 10.3969/j.issn.0253-2778.2013.08.002

• Original Paper •

Joint semiparametric mean-covariance modeling by moving average Cholesky decomposition for longitudinal data

XING Xin

  1. Dept. of Statistics and Finance, School of Management, University of Science and Technology of China, Hefei 230026, China
  • Received: 2012-12-13 Revised: 2013-05-17 Online: 2013-08-31 Published: 2013-08-31
  • Contact: ZHANG Weiping
  • About author: XING Xin, male, born in 1987, master's degree. Research field: large-sample theory. E-mail: xingxin@mail.ustc.edu.cn

Abstract: Joint modeling of the mean and covariance matrix has recently received considerable attention in longitudinal data analysis. To satisfy the positive-definiteness constraint on the covariance matrix, the literature commonly works with a decomposition of its inverse. Here, the covariance matrix itself is decomposed using a novel Cholesky factor, which yields an unconstrained and statistically interpretable reparameterization: the entries of the decomposition can be interpreted as moving average coefficients and log innovation variances, take values anywhere on the real line, and can therefore be modeled as functions of covariates. Building on this decomposition, and with model robustness and efficient inference in mind, new semiparametric models are proposed for jointly modeling the mean and the covariance itself, rather than its inverse. An estimation approach based on generalized estimating equations and B-splines is developed for the parameters in the mean and the covariance. It is shown that the estimators of the parametric parts in both the mean and the covariance are consistent and asymptotically normal, and that the nonparametric parts can be estimated at the optimal rate of convergence. Simulation studies and a real data analysis illustrate that the proposed approach can yield highly reliable estimates of the mean and covariance matrix.

Key words: longitudinal data, semiparametric model, generalized estimating equation, modified Cholesky decomposition, moving average, B-spline
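
The reparameterization described in the abstract can be illustrated with a minimal numerical sketch. It assumes the usual form of a moving average Cholesky decomposition, Sigma = L D L^T, where L is unit lower triangular (its below-diagonal entries play the role of moving average coefficients) and D is diagonal (the logarithms of its entries are the log innovation variances). The function names `build_covariance` and `decompose_covariance` are hypothetical and chosen for illustration only; the sketch does not cover the covariate models, B-splines, or generalized estimating equations used in the paper.

```python
import numpy as np


def build_covariance(ma_coefs, log_innov_var):
    """Assemble Sigma = L D L^T from unconstrained real-valued parameters.

    ma_coefs: m*(m-1)/2 below-diagonal entries of L, in row-major order.
    log_innov_var: m log innovation variances (log of the diagonal of D).
    """
    m = len(log_innov_var)
    L = np.eye(m)                              # unit diagonal
    L[np.tril_indices(m, k=-1)] = ma_coefs     # moving average coefficients
    D = np.diag(np.exp(log_innov_var))         # exp() keeps variances positive
    return L @ D @ L.T


def decompose_covariance(sigma):
    """Recover (ma_coefs, log_innov_var) from a positive definite matrix."""
    m = sigma.shape[0]
    C = np.linalg.cholesky(sigma)              # sigma = C C^T, C lower triangular
    diag_c = np.diag(C)
    L = C / diag_c                             # scale column j by 1 / C[j, j]
    log_innov_var = 2.0 * np.log(diag_c)       # D[j, j] = C[j, j] ** 2
    return L[np.tril_indices(m, k=-1)], log_innov_var


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m = 4
    ma_coefs = rng.normal(size=m * (m - 1) // 2)   # any real values are allowed
    log_innov_var = rng.normal(size=m)

    sigma = build_covariance(ma_coefs, log_innov_var)
    # Sigma is positive definite for every choice of the parameters.
    print("smallest eigenvalue > 0:", np.linalg.eigvalsh(sigma).min() > 0)

    ma_back, lv_back = decompose_covariance(sigma)
    print("round trip ok:",
          np.allclose(ma_back, ma_coefs) and np.allclose(lv_back, log_innov_var))
```

Because every real-valued parameter vector maps to a valid covariance matrix, the positive-definiteness constraint never has to be enforced during estimation, which is what allows the entries to be modeled freely as functions of covariates.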