I am grateful to the National Science Foundation (NSF) and its Division of Computer and Network Systems (CNS) for supporting my CAREER project, which started in February 2022. This webpage documents the details of the project.

Award Information

NSF CNS-2146171

CAREER: From Federated to Fog Learning: Expanding the Frontier of Model Training in Heterogeneous Networks


Synopsis

Today’s networked systems are undergoing fundamental transformations as the number of Internet-connected devices continues to scale. Fueled by the volumes of data generated, there has been a parallel rise in the complexity of machine learning (ML) algorithms envisioned for edge intelligence, and a desire to provide this intelligence in near real-time. However, contemporary techniques for distributing ML model training encounter critical performance issues due to two salient properties of the wireless edge: (1) heterogeneity in device communication/computation resources and (2) statistical diversity across locally collected datasets. These properties are further exacerbated at the geographic scale of the Internet of Things (IoT), where the cloud may be coordinating millions of heterogeneous devices.

This project is establishing the foundation for fog learning, a new model training paradigm that orchestrates computing resources across the cloud-to-things continuum to elevate and optimize over the fundamental model learning vs. resource efficiency tradeoff. The driving principle of fog learning is to intelligently distribute federated model aggregations throughout a multi-layer network hierarchy. The proliferation of device-to-device (D2D) communications in wireless protocols including 5G-and-beyond will act as a substrate for inexpensive local synchronization of datasets and model updates.
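As a rough illustration of this principle, the sketch below shows a two-tier variant of federated averaging in which device updates are first aggregated within local clusters before a global (cloud-level) aggregation. The synthetic data, cluster structure, least-squares update rule, and size-weighted averaging are simplified assumptions for illustration only, not the project's specific algorithms.

    # Minimal sketch of hierarchical (fog) federated averaging.
    # Assumptions: two-tier hierarchy, least-squares local loss, aggregation
    # weighted by local dataset size. Illustrative only.
    import numpy as np

    def local_update(model, data, lr=0.01):
        """One gradient step on a least-squares loss at a single device."""
        X, y = data
        grad = X.T @ (X @ model - y) / len(y)
        return model - lr * grad

    def weighted_average(models, weights):
        """Aggregate models, weighting each by its local dataset size."""
        weights = np.asarray(weights, dtype=float)
        weights /= weights.sum()
        return sum(w * m for w, m in zip(models, weights) for w, m in [(m, w)][::-1]) if False else sum(w * m for w, m in zip(weights, models))

    def fog_round(global_model, clusters):
        """One round: device updates -> fog-level aggregation -> cloud aggregation."""
        fog_models, fog_sizes = [], []
        for cluster in clusters:  # each cluster is a list of (X, y) device datasets
            device_models = [local_update(global_model.copy(), data) for data in cluster]
            sizes = [len(data[1]) for data in cluster]
            fog_models.append(weighted_average(device_models, sizes))
            fog_sizes.append(sum(sizes))
        return weighted_average(fog_models, fog_sizes)

    # Example: 2 fog clusters, each with 3 devices holding synthetic data.
    rng = np.random.default_rng(0)
    d = 5
    clusters = [[(rng.normal(size=(20, d)), rng.normal(size=20)) for _ in range(3)]
                for _ in range(2)]
    model = np.zeros(d)
    for _ in range(10):
        model = fog_round(model, clusters)

In a fog deployment, the within-cluster aggregation step could equally be carried out over D2D links among nearby devices rather than at a dedicated edge server; the hierarchy simply determines where each aggregation happens.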

This project is leading to a concrete understanding of the fundamental relationships between contemporary fog network architectures and ML model training. The orchestration of device-level, fog-level, and cloud-level decision-making will expand the limits of distributed training performance under resource heterogeneity and statistical diversity. In addition to its focus on optimizing ML training through networks, this project is developing innovative ML techniques leveraging domain knowledge to support and enhance these optimizations at scale.

This project has both technical and educational broader impacts. The improvement in the model learning vs. resource efficiency tradeoff will lower mobile energy consumption from emerging edge intelligence tasks and improve quality of experience for end users. Further, the results of this project will motivate new research directions in (1) ML, based on heterogeneous system constraints, and (2) distributed computing for other applications. The educational broader impacts will promote the importance of data-driven optimization for/by network systems. They are being achieved through new undergraduate and graduate courses with personalized learning modules on specific topics.


Personnel

The following individuals are the current main personnel involved in the project:


Collaborators

I am thankful to have several collaborators on both the research and education development efforts:


Publications

The following is a list of publications produced since the start of the project, ordered chronologically:


GitHub Code Repositories

The following is a list of GitHub repositories created as part of the research efforts in this project:


Educational Activities

The following educational activities have been undertaken as part of the integrated research and education plan of the project:

  • ECE 301: Signals and Systems: This course has been taught twice by the PI at Purdue, in fall 2022 and spring 2023. An undergraduate student from the fall 2022 offering, Adam Piaseczny, has since been consistently engaged in research on fog learning and will present his resulting first-authored paper at IEEE ICC 2024.
  • ECE 60022: Wireless Communication Networks: The PI taught this course at Purdue in spring 2022. It includes modules on the fundamentals of data-driven and machine learning-driven services delivered over wireless networks.
  • HON 399: Principles of Networks: The PI co-created this course at Purdue for an interdisciplinary audience. The course is based on his book and was first offered in fall 2023. In the course project, teams of students from humanities, business, science, and engineering majors worked together on contemporary networking problems.
  • EPICS: Harnessing the Data Revolution: The PI has co-led an Engineering Projects in Community Service (EPICS) team at Purdue each semester since fall 2021. This team is focused on providing data science solutions to Native American tribes in the Dakotas.
  • Personalized Education: The team is involved in a Purdue University-wide initiative on developing analytical tools for improving engagement in first-year online courses. We have tailored the federated learning solutions developed in this project to improve the quality of analytics provided to students from underrepresented groups.


Outreach Activities

The following outreach activities have been undertaken in the project:

  • FOGML Workshop: The PI has been co-organizing an annual workshop (2021, 2023, 2024) on Distributed Machine Learning and Fog Networks (FOGML), in conjunction with IEEE INFOCOM.
  • Quantum Summer School: The PI has been delivering annual lectures on the basic principles of fog learning, highlighting synergies with quantum computing, at the Quantum Summer School hosted by Purdue's Quantum Science and Engineering Institute.