Review of Distributed Learning on Non/Semi-parametric Estimation

This article provides a comprehensive review of distributed learning methods for nonparametric and semiparametric estimation.

Divide-and-Conquer (or One-Shot) Methods

  • Zhang et al. (2015) and Lin et al. (2017) study nonparametric regression: each machine fits a kernel ridge regression estimator on its local data, and the final estimator is the average of the local estimators.
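  • Zhao et al. (2016) develop a partially linear framework for massive heterogeneous data.
  • Lian et al. (2019) (B-spline), Wang et al. (2021) (B-spline), and Lv & Lian (2022) (RKHS) treat partially linear or partially linear additive models with a high-dimensional linear part.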
  • Chen et al. (2022) investigate the kernel-based Smoothed Maximum Score Estimator (SMSE) for estimating semi-parametric binary response models.
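
The averaging idea behind divide-and-conquer kernel ridge regression can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian kernel, the bandwidth, the value of the regularization parameter, the simulated data, and the helper names (rbf_kernel, fit_local_krr, dc_krr) are all assumptions made for illustration. The same regularization parameter is used on every machine, in the spirit of choosing it with the full sample size in mind rather than the local one.

```python
import numpy as np

def rbf_kernel(a, b, bandwidth=0.2):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq_dists = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))

def fit_local_krr(x, y, lam, bandwidth=0.2):
    """Fit kernel ridge regression on one machine's data.

    Minimizes (1/n) * sum (f(x_i) - y_i)^2 + lam * ||f||^2, whose solution
    has coefficients alpha = (K + n * lam * I)^{-1} y.
    """
    n = x.shape[0]
    gram = rbf_kernel(x, x, bandwidth)
    alpha = np.linalg.solve(gram + n * lam * np.eye(n), y)
    return lambda x_new: rbf_kernel(x_new, x, bandwidth) @ alpha

def dc_krr(x, y, num_machines, lam, bandwidth=0.2):
    """One-shot estimator: the plain average of the local KRR predictors."""
    parts = zip(np.array_split(x, num_machines), np.array_split(y, num_machines))
    local_fits = [fit_local_krr(xk, yk, lam, bandwidth) for xk, yk in parts]
    return lambda x_new: np.mean([f(x_new) for f in local_fits], axis=0)

# Simulated data (hypothetical): y = sin(2*pi*x) + noise, split over 8 machines.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=(2000, 1))
y = np.sin(2.0 * np.pi * x[:, 0]) + 0.3 * rng.standard_normal(2000)

f_bar = dc_krr(x, y, num_machines=8, lam=1e-3)
x_grid = np.linspace(0.05, 0.95, 50)[:, None]
mse = np.mean((f_bar(x_grid) - np.sin(2.0 * np.pi * x_grid[:, 0])) ** 2)
print(f"MSE of the averaged estimator on the grid: {mse:.4f}")
```

Each machine only solves a linear system of its local size and communicates its fitted function once, which is what makes the procedure one-shot.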

Communication-Efficient Methods

  • Gao & Wang (2023) consider the partially linear additive model and propose a communication-efficient method based on local polynomial regression.
  • Chen et al. (2022) also consider communication-efficient distributed estimation of the semi-parametric binary response model using the SMSE.
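
For orientation, the SMSE used in both entries for Chen et al. (2022) can be written in a standard form. The notation is an assumption introduced here for illustration and is not taken from the text above: y_i is the binary response, (x_i, z_i) are the covariates with the coefficient of x_i normalized to one, H is a smooth approximation of the indicator function, and h is a bandwidth that shrinks with the sample size n.

```latex
% Binary response model (coefficient of x_i normalized to one):
%   y_i = 1\{ x_i + z_i^{\top}\beta + \varepsilon_i \ge 0 \},
%   \operatorname{median}(\varepsilon_i \mid x_i, z_i) = 0.
\hat{\beta}_{\mathrm{SMSE}}
  = \arg\max_{\beta}\; \frac{1}{n}\sum_{i=1}^{n} (2y_i - 1)\,
    H\!\left(\frac{x_i + z_i^{\top}\beta}{h}\right)
```

Replacing the indicator in the maximum score objective with the smooth function H makes the objective differentiable in β. A one-shot variant would average the maximizers computed on each machine, while a communication-efficient variant would instead refine a common estimate over several rounds.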

References

Chen, X., Jing, W., Liu, W., & Zhang, Y. (2022). Distributed estimation and inference for semi-parametric binary response models (arXiv:2210.08393). arXiv. https://arxiv.org/abs/2210.08393
Gao, J., & Wang, L. (2023). Communication-efficient distributed estimation of partially linear additive models for large-scale data. Information Sciences, 631, 185–201. https://doi.org/10.1016/j.ins.2023.02.065
Lian, H., Zhao, K., & Lv, S. (2019). Projected spline estimation of the nonparametric function in high-dimensional partially linear models for massive data. Annals of Statistics, 47(5), 2922–2949. https://doi.org/10.1214/18-AOS1769
Lin, S.-B., Guo, X., & Zhou, D.-X. (2017). Distributed learning with regularized least squares. Journal of Machine Learning Research, 18(92), 1–31.
Lv, S., & Lian, H. (2022). Debiased distributed learning for sparse partial linear models in high dimensions. Journal of Machine Learning Research, 23(2), 1–32.
Wang, Y., Zhang, W., & Lian, H. (2021). Distributed partially linear additive models with a high dimensional linear part. IEEE Transactions on Signal and Information Processing over Networks, 7, 611–625. https://doi.org/10.1109/TSIPN.2021.3111555
Zhang, Y., Duchi, J., & Wainwright, M. (2015). Divide and conquer kernel ridge regression: A distributed algorithm with minimax optimal rates. Journal of Machine Learning Research, 16(1), 3299–3340.
Zhao, T., Cheng, G., & Liu, H. (2016). A partially linear framework for massive heterogeneous data. Annals of Statistics, 44(4). https://doi.org/10.1214/15-AOS1410