Optimal Subsampling Algorithm for Big Data Regression
Speaker: Professor Ai Mingyao, Peking University
Time: 10:00-11:00 a.m., 21 November 2019
Location: 2nd lecture hall, mathematics building
Abstract:
To quickly approximate the maximum likelihood estimator (MLE) with massive data, this paper studies the optimal subsampling method under the A-optimality criterion for generalized linear models (GLMs). The consistency and asymptotic normality of the estimator from a general subsampling algorithm are established, and optimal subsampling probabilities under the A- and L-optimality criteria are derived. Furthermore, using a Frobenius norm matrix concentration inequality, finite-sample properties of the subsample estimator based on the optimal subsampling probabilities are also derived. Since the optimal subsampling probabilities depend on the full-data estimate, an adaptive two-step algorithm is developed. The asymptotic normality and optimality of the estimator from this adaptive algorithm are established. The proposed methods are illustrated and evaluated through numerical experiments on simulated and real datasets.
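The adaptive two-step idea in the abstract can be illustrated with a minimal Python sketch for logistic regression, one particular GLM. This is not the speaker's implementation. The function names, the sample sizes r0 and r, and the simulated data are all hypothetical, and the sampling probabilities proportional to |y_i - p_i| * ||x_i|| are an assumed form of the L-optimality probabilities for the logistic case. Step 1 fits a pilot estimate on a uniform subsample; step 2 resamples with probabilities computed from the pilot and fits an inverse-probability-weighted MLE on the combined subsample.

```python
import numpy as np

def weighted_logistic_mle(X, y, w, n_iter=50, tol=1e-8):
    """Newton's method for the weighted logistic log-likelihood."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (w * (y - p))                      # weighted score
        hess = (X * (w * p * (1.0 - p))[:, None]).T @ X  # weighted information
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def two_step_subsample_mle(X, y, r0, r, rng):
    """Hypothetical two-step subsampling estimator (illustrative only)."""
    n = X.shape[0]
    # Step 1: uniform pilot subsample, drawn with replacement.
    idx0 = rng.choice(n, size=r0, replace=True)
    beta0 = weighted_logistic_mle(X[idx0], y[idx0], np.ones(r0))
    # Step 2: assumed L-optimality-type probabilities from the pilot fit.
    p = 1.0 / (1.0 + np.exp(-X @ beta0))
    score = np.abs(y - p) * np.linalg.norm(X, axis=1)
    pi = score / score.sum()
    idx1 = rng.choice(n, size=r, replace=True, p=pi)
    # Combine both subsamples with weights 1 / (n * sampling probability);
    # the uniform pilot draws have probability 1/n, hence weight 1.
    idx = np.concatenate([idx0, idx1])
    w = np.concatenate([np.ones(r0), 1.0 / (n * pi[idx1])])
    return weighted_logistic_mle(X[idx], y[idx], w)

# Simulated data (an assumption for illustration, not from the talk).
rng = np.random.default_rng(0)
n, beta_true = 20000, np.array([0.5, -0.5, 0.5])
X = rng.normal(size=(n, 3))
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ beta_true)))

beta_hat = two_step_subsample_mle(X, y, r0=500, r=2000, rng=rng)
beta_full = weighted_logistic_mle(X, y, np.ones(n))  # full-data MLE for comparison
```

Here beta_hat uses only 2,500 of the 20,000 observations in the two fitting steps and can be compared against beta_full. Computing the probabilities still requires one pass over the full data, which is much cheaper than iterating the full-data MLE to convergence.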
About the speaker:
Ai Mingyao is a professor and doctoral supervisor in, and director of, the Department of Statistics, School of Mathematical Sciences, Peking University. He also serves as secretary-general of the Probability and Statistics Society of the Chinese Mathematical Society, executive director of the Chinese Society of Field Statistics, chairman of the Experimental Design Branch, and vice chairman of the High-Dimensional Data Statistics Branch, among other posts. He is an associate editor of Statistica Sinica, Journal of Statistical Planning and Inference, Statistics and Probability Letters, and STAT, and an editorial board member of Systems Science and Mathematics, a national core journal, and of the Science Press series on statistics and data science. His teaching and research focus on the design and analysis of experiments, computer experiments, data analysis, and applied statistics. He has published more than 60 papers in leading journals at home and abroad, including Ann Statist, JASA, Biometrika, Technometrics, and Statist Sinica, and has led a number of National Natural Science Foundation of China projects, including General Program and key projects, as well as two topics under the Ministry of Science and Technology's 973 Program.