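Speaker: Zhang Xinyu (张新雨), Researcher
Title: Optimal Weighted Random Forests
Time: Friday, July 14, 2023, 10:30-11:30 a.m.
Venue: Academic Lecture Hall 1506, Jingyuan Building (静远楼)
Organizers: Institute of Mathematics; School of Mathematics and Statistics; Institute of Science and Technology
About the speaker:
       Zhang Xinyu is a researcher at the Center for Forecasting Science, Academy of Mathematics and Systems Science, Chinese Academy of Sciences. Zhang works on the theory and application of statistics and econometrics, with research interests including model averaging, machine learning, and forecast combination, and has published more than 80 papers, several of them in top journals in econometrics and statistics. Zhang serves as an area editor of the SCI-indexed journal JSSC and as an editorial board member of five other major domestic and international journals, is an executive council member of the Society of Management Science and Engineering and an Elected Member of the International Statistical Institute, has led projects funded by the National Natural Science Foundation of China's Excellent Young Scientists Fund and Distinguished Young Scholars Fund, and has received the China Youth Science and Technology Award.
Abstract: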
       The random forest (RF) algorithm has become a very popular prediction method for its great flexibility and promising accuracy. In RF, it is conventional to put equal weights on all the base learners (trees) to aggregate their predictions. However, the predictive performances of different trees within the forest can be very different due to the randomization of the embedded bootstrap sampling and feature selection. In this paper, we focus on RF for regression and propose two optimal weighting algorithms, namely the 1 Step Optimal Weighted RF (1step-WRFopt) and 2 Steps Optimal Weighted RF (2steps-WRFopt), that combine the base learners through the weights determined by weight choice criteria. Under some regularity conditions, we show that these algorithms are asymptotically optimal in the sense that the resulting squared loss and risk are asymptotically identical to those of the infeasible but best possible model averaging estimator. Numerical studies conducted on real-world data sets indicate that these algorithms outperform the equal-weight forest and two other weighted RFs proposed in the existing literature in most cases.
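To make the idea of replacing equal weights with data-driven ones concrete, here is a minimal Python sketch. It is not the 1step-WRFopt or 2steps-WRFopt procedure: the abstract does not spell out the weight choice criteria, so this stand-in simply minimizes the plain squared error of the weighted combination, with the weights constrained to be non-negative and to sum to one, using projected gradient descent. The function names (`project_to_simplex`, `fit_weights`, `weighted_mse`) and the toy numbers are hypothetical and chosen only for illustration.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w_i >= 0, sum(w) = 1}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        t = (cumulative - 1.0) / k
        if u_k - t > 0:
            theta = t  # keep the threshold of the last index that qualifies
    return [max(x - theta, 0.0) for x in v]


def weighted_mse(tree_preds, y, w):
    """Mean squared error of the weighted combination of tree predictions."""
    total = 0.0
    for row, target in zip(tree_preds, y):
        fitted = sum(wj * pj for wj, pj in zip(w, row))
        total += (fitted - target) ** 2
    return total / len(y)


def fit_weights(tree_preds, y, steps=2000):
    """Projected gradient descent on the squared error over the simplex.

    tree_preds[i][j] is the prediction of tree j for observation i.
    Assumption: a plain squared-error criterion, used as a stand-in for
    the weight choice criteria mentioned in the abstract.
    """
    n = len(y)
    m = len(tree_preds[0])
    w = [1.0 / m] * m  # start from the conventional equal weights
    # Step size 1/L, where L = (2/n) * (sum of squared predictions) is an
    # upper bound on the Lipschitz constant of the gradient.
    frob = sum(p * p for row in tree_preds for p in row)
    lr = n / (2.0 * frob) if frob > 0 else 1.0
    for _ in range(steps):
        grad = [0.0] * m
        for row, target in zip(tree_preds, y):
            resid = sum(wj * pj for wj, pj in zip(w, row)) - target
            for j in range(m):
                grad[j] += 2.0 * resid * row[j] / n
        w = project_to_simplex([wj - lr * gj for wj, gj in zip(w, grad)])
    return w


# Toy data (made up): three "trees", the first of which happens to be exact.
toy_y = [1.0, 2.0, 3.0, 4.0]
toy_preds = [
    [1.0, 2.0, 2.5],
    [2.0, 1.0, 2.5],
    [3.0, 4.0, 2.5],
    [4.0, 3.0, 2.5],
]
weights = fit_weights(toy_preds, toy_y)
equal = [1.0 / 3.0] * 3
print("fitted weights:", [round(w, 3) for w in weights])
print("equal-weight MSE:", weighted_mse(toy_preds, toy_y, equal))
print("fitted-weight MSE:", weighted_mse(toy_preds, toy_y, weights))
```

In practice the prediction matrix would come from the individual trees of a fitted forest, and fitting the weights on the same data used to grow the trees would tend to favor overfitted trees; a criterion that corrects for this is presumably what the weight choice criteria in the talk are designed to provide.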