[1] TSAI C, WU J. Using neural network ensembles for bankruptcy prediction and credit scoring[J]. Expert Systems With Applications, 2008, 34(4): 2639-2649.
[2] PEROLS J. Financial statement fraud detection: An analysis of statistical and machine learning algorithms[J]. Auditing: A Journal of Practice and Theory, 2011, 30(2): 19-50.
[3] CRAWFORD M, KHOSHGOFTAAR T M, PRUSA T M, et al. Survey of review spam detection using machine learning techniques[J]. Journal of Big Data, 2015, 2(23): 1-24.
[4] DE BRUIJNE M. Machine learning approaches in medical image analysis: From detection to diagnosis[J]. Medical Image Analysis, 2016, 33: 94-97.
[5] RUBIN V L, CHEN Y. Information manipulation classification theory for LIS and NLP[J]. Proceedings of the Association for Information Science and Technology, 2012, 49(1): 1-5.
[6] FAN J, FAN Y. High dimensional classification using features annealed independence rules[J]. Annals of Statistics, 2008, 36(6): 2605-2637.
[7] SCHAPIRE R E. The strength of weak learnability[J]. Machine Learning, 1990, 5(2): 197-227.
[8] FREUND Y, SCHAPIRE R E. A decision-theoretic generalization of on-line learning and an application to boosting[J]. Journal of Computer and System Sciences, 1997, 55(1): 119-139.
[9] FRIEDMAN J H, HASTIE T, TIBSHIRANI R. Additive logistic regression: A statistical view of boosting[J]. The Annals of Statistics, 2000, 28(2): 337-407.
[10] FRIEDMAN J H. Greedy function approximation: A gradient boosting machine[J]. Annals of Statistics, 2001, 29(5): 1189-1232.
[11] BREIMAN L. Bagging predictors[J]. Machine Learning, 1996, 24(2): 123-140.
[12] BREIMAN L. Random forests[J]. Machine Learning, 2001, 45(1): 5-32.
[13] ZUKOTYNSKI K, GAUDET V, KUO P H, et al. The use of random forests to identify brain regions on amyloid and FDG PET associated with MoCA score[J]. Clinical Nuclear Medicine, 2020, 45(6): 427-433.
[14] CHEN D R, LI H. On the performance of regularized regression learning in Hilbert space[J]. Neurocomputing, 2012, 93(2): 41-47.
[15] JOHNSON W B, LINDENSTRAUSS J. Extensions of Lipschitz mappings into a Hilbert space[J]. Contemporary Mathematics, 1984, 26(1): 189-206.
[16] DASGUPTA S, GUPTA A. An elementary proof of the Johnson-Lindenstrauss Lemma[J]. Random Structures and Algorithms, 1999, 22(1): 1-5.
[17] LI P, HASTIE T J, CHURCH K W. Very sparse random projections[C]// Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2006: 287-296.
[18] ALZU'BI A, ABUARQOUB A. Deep learning model with low-dimensional random projection for large-scale image[J]. Engineering Science and Technology, 2020, 23(4): 911-920.
[19] NGUYEN X V, SARAH E, SAKRAPEE P, et al. Training robust models using Random Projection[C]// 2016 23rd International Conference on Pattern Recognition. IEEE, 2017: 531-536.
[20] TURK M, PENTLAND A. Eigenfaces for recognition[J]. Journal of Cognitive Neuroscience, 1991, 3(1): 71-86.
[21] WANG Y, KLIJN J G, ZHANG Y, et al. Gene-expression profiles to predict distant metastasis of lymph-node-negative primary breast cancer[J]. Lancet, 2005, 365(9460): 671-679.
[22] LÓPEZ-SÁNCHEZ D, CORCHADO J M, GONZÁLEZ ARRIETA A, et al. Data-independent random projections from the feature-map of the homogeneous polynomial kernel of degree two[J]. Information Sciences, 2018, 436-437: 214-226.
[23] CANNINGS T I, SAMWORTH R J. Random-projection ensemble classification[J]. Journal of the Royal Statistical Society: Series B, 2017, 79(4): 959-1035.
Appendix
A.1 Proof of Theorem 1.1
Write $h_k=(h_k^{(1)},h_k^{(2)},\dots)$, $h_k^{N}=(h_k^{(1)},\dots,h_k^{(N)})^{\mathrm{T}}$, $h_k^{-N}=(h_k^{(N+1)},h_k^{(N+2)},\dots)^{\mathrm{T}}$, $\Phi^{N}(x_i)=(\phi_1(x_i),\phi_2(x_i),\dots,\phi_N(x_i))^{\mathrm{T}}$, $\Phi^{-N}(x_i)=(\phi_{N+1}(x_i),\phi_{N+2}(x_i),\dots)^{\mathrm{T}}$, $H^{N}=(h_1^{N},h_2^{N},\dots,h_d^{N})^{\mathrm{T}}$, and $H^{-N}=(h_1^{-N},h_2^{-N},\dots,h_d^{-N})^{\mathrm{T}}$. Then
$$H(\Phi(x_i)-\Phi(x_j))=H^{N}(\Phi^{N}(x_i)-\Phi^{N}(x_j))+H^{-N}(\Phi^{-N}(x_i)-\Phi^{-N}(x_j)). \tag{A1}$$
Since $\Phi(x_1),\dots,\Phi(x_n)$ are pairwise distinct, $\|\Phi(x_i)-\Phi(x_j)\|_2>0$. By the Cauchy-Schwarz inequality,
$$\|H^{-N}(\Phi^{-N}(x_i)-\Phi^{-N}(x_j))\|_2\le \sum_{k=1}^{d}\|h_k^{-N}\|_2\,\|\Phi^{-N}(x_i)-\Phi^{-N}(x_j)\|_2. \tag{A2}$$
Since $h_k\in\ell^2$ and $\Phi(x_i)\in\ell^2$, for any $\varepsilon\in(0,1/3)$ there exists $N>0$ such that
$$\|H^{-N}(\Phi^{-N}(x_i)-\Phi^{-N}(x_j))\|_2\le \varepsilon\,\|\Phi(x_i)-\Phi(x_j)\|_2 \tag{A3}$$
and
$$\|\Phi^{N}(x_i)-\Phi^{N}(x_j)\|_2\ge (1-\varepsilon)\,\|\Phi(x_i)-\Phi(x_j)\|_2. \tag{A4}$$
Hence, when $d>O(\log n/\varepsilon^2)$, the Johnson-Lindenstrauss theorem [15] gives
$$\begin{aligned}
\|H(\Phi(x_i)-\Phi(x_j))\|_2&\le \|H^{N}(\Phi^{N}(x_i)-\Phi^{N}(x_j))\|_2+\|H^{-N}(\Phi^{-N}(x_i)-\Phi^{-N}(x_j))\|_2\\
&\le (1+\varepsilon)\|\Phi^{N}(x_i)-\Phi^{N}(x_j)\|_2+\varepsilon\,\|\Phi(x_i)-\Phi(x_j)\|_2\\
&\le (1+2\varepsilon)\|\Phi(x_i)-\Phi(x_j)\|_2. 
\end{aligned} \tag{A5}$$
On the other hand,
$$\begin{aligned}
\|H(\Phi(x_i)-\Phi(x_j))\|_2&\ge \|H^{N}(\Phi^{N}(x_i)-\Phi^{N}(x_j))\|_2-\|H^{-N}(\Phi^{-N}(x_i)-\Phi^{-N}(x_j))\|_2\\
&\ge (1-\varepsilon)\|\Phi^{N}(x_i)-\Phi^{N}(x_j)\|_2-\varepsilon\,\|\Phi(x_i)-\Phi(x_j)\|_2\\
&\ge (1-\varepsilon)(1-\varepsilon)\|\Phi(x_i)-\Phi(x_j)\|_2-\varepsilon\,\|\Phi(x_i)-\Phi(x_j)\|_2\\
&=(\varepsilon^2-3\varepsilon+1)\|\Phi(x_i)-\Phi(x_j)\|_2\\
&\ge (1-3\varepsilon)\|\Phi(x_i)-\Phi(x_j)\|_2. 
\end{aligned} \tag{A6}$$
Combining (A5) and (A6) completes the proof.
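The two-sided bound just proved can be illustrated numerically. The sketch below (an illustration, not part of the proof; the sizes $n$, $N$, $d$ and the choice of Gaussian entries are assumptions) draws a Gaussian matrix in the role of $H^N$, which satisfies the Johnson-Lindenstrauss property used above, and checks that all pairwise distances among $n$ truncated feature vectors are distorted by far less than the $(1-3\varepsilon,\,1+2\varepsilon)$ band allows:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: n points, feature map truncated at N coordinates,
# projected dimension d (theory asks for d on the order of log(n)/eps^2).
n, N, d = 50, 1000, 400
eps = 0.3  # eps in (0, 1/3), as in the proof

# Stand-ins for the truncated feature vectors Phi^N(x_1), ..., Phi^N(x_n).
X = rng.normal(size=(n, N))

# Gaussian projection H^N with entries of variance 1/d, so that
# E||H^N v||_2^2 = ||v||_2^2 -- the normalization behind the JL theorem.
H = rng.normal(scale=1.0 / np.sqrt(d), size=(d, N))
Y = X @ H.T

# Worst-case relative distortion of squared pairwise distances.
worst = 0.0
for i in range(n):
    for j in range(i + 1, n):
        ratio = np.sum((Y[i] - Y[j]) ** 2) / np.sum((X[i] - X[j]) ** 2)
        worst = max(worst, abs(ratio - 1.0))

print(f"worst relative distortion: {worst:.3f}")
assert worst < 3 * eps  # inside the (1 - 3eps, 1 + 2eps) band
```

With these sizes the observed worst-case distortion is typically a small fraction of the theoretical allowance, reflecting the concentration that drives the JL argument.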
A.2 Proof of Theorem 1.2
For $ER_{CRP_n}$ we have
$$\begin{aligned}
ER_{CRP_n}&=E\,\pi_0\int_{\mathbb{R}^p}\mathbf{1}\{CRP_n(\Phi(x))=1\}\,dP_0(x)+E\,\pi_1\int_{\mathbb{R}^p}\mathbf{1}\{CRP_n(\Phi(x))=0\}\,dP_1(x)\\
&=E\,\pi_0\int_{\mathbb{R}^p}\mathbf{1}\{\nu_n(\Phi(x))\ge 1/2\}\,dP_0(x)+E\,\pi_1\int_{\mathbb{R}^p}\mathbf{1}\{\nu_n(\Phi(x))<1/2\}\,dP_1(x)\\
&=\pi_0\int_{\mathbb{R}^p}P\{\nu_n(\Phi(x))\ge 1/2\}\,dP_0(x)+\pi_1\int_{\mathbb{R}^p}P\{\nu_n(\Phi(x))<1/2\}\,dP_1(x).
\end{aligned}$$
Let $U_m=\mathbf{1}\{C_n^{H_m}(\Phi(X))=1\}$, $m=1,\dots,M$. Conditional on $\mu_n(\Phi(X))=\theta\in[0,1]$, the random variables $U_1,\dots,U_M$ are independent, each following the Bernoulli$(\theta)$ distribution. Noting that $G_{n,0}$ and $G_{n,1}$ are the distribution functions of $\mu_n(\Phi(X))\mid\{Y=0\}$ and $\mu_n(\Phi(X))\mid\{Y=1\}$, respectively, we obtain
$$\int_{\mathbb{R}^p}P\{\nu_n(\Phi(x))<1/2\}\,dP_1(x)=\int_{[0,1]}P\Big\{\frac{1}{M}\sum_{m=1}^{M}U_m<\frac12\,\Big|\,\mu_n(\Phi(X))=\theta\Big\}\,dG_{n,1}(\theta)=\int_{[0,1]}P\{T<M/2\}\,dG_{n,1}(\theta),$$
where $T$ follows the binomial distribution with parameters $M$ and $\theta$, written $T\sim\mathrm{Bin}(M,\theta)$. Similarly,
$$\int_{\mathbb{R}^p}P\{\nu_n(\Phi(x))\ge 1/2\}\,dP_0(x)=1-\int_{[0,1]}P\{T<M/2\}\,dG_{n,0}(\theta).$$
Therefore,
$$ER_{CRP_n}=\pi_0\Big[1-\int_{[0,1]}P\{T<M/2\}\,dG_{n,0}(\theta)\Big]+\pi_1\int_{[0,1]}P\{T<M/2\}\,dG_{n,1}(\theta)=G_n^{\circ}(1/2)+\frac{1/2-[[M/2]]}{M}\,g_n^{\circ}(1/2)+\frac{1}{8M}\,(g_n^{\circ})'(1/2)+o\Big(\frac{1}{M}\Big). \tag{A7}$$
Indeed, by the result of [23],
$$\int_{[0,1]}P\{T<M\alpha\}\,dG_n^{\circ}(\theta)=G_n^{\circ}(\alpha)+\frac{1-\alpha-[[M\alpha]]}{M}\,g_n^{\circ}(\alpha)+\frac{\alpha(1-\alpha)}{2M}\,(g_n^{\circ})'(\alpha)+o\Big(\frac{1}{M}\Big), \tag{A8}$$
where $[[M\alpha]]$ denotes the fractional part of $M\alpha$.
Setting $\alpha=1/2$ in (A8) proves the theorem.
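The expansion (A8) can be checked numerically. In the sketch below (an illustration under assumptions: a Beta(2, 2) distribution stands in for $G_n^{\circ}$, with hypothetical parameters $M=201$ and $\alpha=0.3$, and $[[M\alpha]]$ read as the fractional part), the left-hand integral is computed by quadrature and compared with the right-hand side of (A8):

```python
import math

M, alpha = 201, 0.3
K = math.ceil(M * alpha) - 1  # P{T < M*alpha} = P{T <= K}
combs = [math.comb(M, k) for k in range(K + 1)]

# Beta(2, 2) as a hypothetical stand-in for G_n^o:
g = lambda t: 6.0 * t * (1.0 - t)       # density g_n^o
G = lambda t: t * t * (3.0 - 2.0 * t)   # distribution function G_n^o
gp = lambda t: 6.0 - 12.0 * t           # derivative of the density

def p_below(theta):
    """P{T < M*alpha} for T ~ Bin(M, theta)."""
    return sum(c * theta**k * (1.0 - theta)**(M - k)
               for k, c in enumerate(combs))

# Left-hand side of (A8), computed by the midpoint rule.
nodes = 4000
exact = sum(p_below((i + 0.5) / nodes) * g((i + 0.5) / nodes)
            for i in range(nodes)) / nodes

# Right-hand side of (A8), with [[M*alpha]] as the fractional part.
frac = M * alpha - math.floor(M * alpha)
approx = (G(alpha)
          + (1.0 - alpha - frac) / M * g(alpha)
          + alpha * (1.0 - alpha) / (2.0 * M) * gp(alpha))

print(f"exact  = {exact:.6f}, approx = {approx:.6f}")
# The remainder is o(1/M): much smaller than the 1/M-order corrections.
assert abs(exact - approx) < 0.05 / M
assert abs(exact - G(alpha)) > 10.0 * abs(exact - approx)
```

A non-integer $M\alpha$ is chosen deliberately: at integer values of $M\alpha$ the fractional-part term sits at a discontinuity, where floating-point rounding of $M\alpha$ can flip it between $0$ and nearly $1$.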