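Talk title: Conditional probability estimation based classification with class label missing at random
Speaker: Professor Qihua Wang, Academy of Mathematics and Systems Science, Chinese Academy of Sciences
Time: January 11, 2021, 10:00-10:50
Venue: Zoom meeting (Zoom meeting ID: 770 311 8512, password: 378548)
On-campus contact: Peijie Wang, wangpeijie@jlu.edu.cn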
Abstract: In binary classification, it is common for the class labels of some subjects to be missing. Generally, complete case analysis and the two-stage procedure can be used to extend existing full-data classification methods to handle missing class labels. Nevertheless, these two approaches cannot take full advantage of unlabeled subjects. In this paper, binary classification with the class label missing at random (MAR) is considered. Based on the inverse probability weighting (IPW) method and the augmented inverse probability weighting (AIPW) method, two new methods called IPW-CPC and AIPW-CPC are proposed to construct powerful classifiers by estimating the conditional probability in a reproducing kernel Hilbert space (RKHS). Compared with complete case analysis and the two-stage procedure, the proposed IPW-CPC and AIPW-CPC methods make fuller use of unlabeled subjects, which improves classification accuracy. Theoretically, we show that the conditional misclassification rates of the proposed classifiers converge in probability to the Bayes misclassification rate, and rates of convergence are also obtained. Finally, simulations and a real data analysis demonstrate the good performance of the proposed IPW-CPC and AIPW-CPC methods in comparison with existing methods.
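Speaker bio: Qihua Wang is a research professor and doctoral supervisor at the Academy of Mathematics and Systems Science, Chinese Academy of Sciences. He is a recipient of the National Science Fund for Distinguished Young Scholars, a Distinguished Professor under the Ministry of Education's Changjiang Scholars Program, a member of the Chinese Academy of Sciences "Hundred Talents Program", and a winner of the first National Top 100 Excellent Doctoral Dissertations award. He has taught at Peking University and the University of Hong Kong, and has visited Carleton University (Canada), the University of California, Davis, the University of California, Los Angeles, Yale University, the University of Washington, Northwestern University, Humboldt University (Germany), the Australian National University, and the University of Sydney. His research focuses on survival analysis, missing data analysis, high-dimensional statistical analysis, and nonparametric and semiparametric statistical inference. He has published three monographs and more than one hundred papers in leading international journals, including The Annals of Statistics, JASA, and Biometrika. He is president of the High-Dimensional Statistics branch society and vice president of both the Survival Analysis and the Biostatistics branch societies, has served as a committee member of IMS-China and IBS-China, and sits on the editorial boards of several international and domestic academic journals.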