3rd International Workshop on AI in Networks and Distributed Systems

Thanks to rapid growth in network bandwidth and connectivity, networks and distributed systems have become critical infrastructures that underpin many of today’s Internet services. They provide services through the cloud, monitor the physical world with sensor networks of IoT devices, and offer huge computational power through data centers and edge and fog computing. At the same time, AI and Machine Learning are being widely exploited in networking and distributed systems. Examples include algorithms and solutions for fault isolation, intrusion detection, event correlation, log analysis, capacity planning, resource management, scheduling, and design optimization, to name just a few.

The scale and complexity of today’s networks and distributed systems make their design, analysis, optimization and management a daunting task. Smart and scalable approaches that leverage machine learning are therefore needed to take full advantage of these networks.

The WAIN workshop aims to present new contributions in these fields to the community. The workshop looks for smart approaches and use cases that help in understanding when and how to apply AI. WAIN will allow researchers and practitioners to share their experiences and ideas and to discuss the open issues related to the application of machine learning to computer networks.

Schedule

3.00pm-4.25pm Milan, 9.00am-10.25am NYC

  • WAIN Chairs – Welcome message
  • Keynote: Evangelia Kalyvianaki – Federated Asynchronous Learning for Data Center Scheduling
  • Shimin Tao, Weibin Meng, Yimeng Chen, Yichen Zhu, Ying Liu, Chunning Du, Tao Han, Yongpeng Zhao, Xiangguang Wang and Hao Yang. LogStamp: Automatic Online Log Parsing Based on Sequence Labelling
  • Wenwen Hao, Ben Niu, Yin Luo, Kangkang Liu and Na Liu. Improving accuracy and adaptability of SSD failure prediction in hyper-scale data centers

4.35pm-6.00pm Milan, 10.35am-12.00pm NYC

  • Shiva Ketabi, Matthew Buckley, Parsa Pazhooheshy, Faraz Farahvashy and Yashar Ganjali. Correlation-Aware Flow Consolidation for Load Balancing and Beyond
  • David Pujol-Perich, José Suárez-Varela, Albert Cabellos-Aparicio and Pere Barlet-Ros. Unveiling the potential of Graph Neural Networks for robust Intrusion Detection
  • Gustavo de Carvalho Bertoli, Lourenço Alves Pereira Júnior and Osamu Saotome. Improving detection of scanning attacks on heterogeneous networks with Federated Learning
  • Matheus F. C. Barros, Carlos H. G. Ferreira, Bruno Pereira dos Santos, Lourenço A. P. Júnior, Marco Mellia and Jussara M. Almeida. Understanding mobility in networks: A node embedding approach

Keynote speech – Evangelia Kalyvianaki

Federated Asynchronous Learning for Data Center Scheduling

Data center resource allocation or scheduling is a fundamental operation for the placement of workloads on nodes and the allocation of resources from nodes to workloads. It must satisfy workload performance objectives while keeping data center nodes highly utilized. Even small deviations from the desired objectives can have detrimental effects, resulting in potentially millions of dollars in lost revenue. A fundamental requirement in effective scheduling is the ability to predict accurately and in real time a node’s resource availability when placing incoming jobs. Most existing data center scheduling approaches rely on estimates of nodes’ future resource availability to schedule workloads in ways that avoid saturation and utilize resources efficiently across nodes.

In this talk we describe PRONTO, a federated, asynchronous, memory-limited approach for online prediction of nodes’ availability in data center scheduling. PRONTO uses the industry-standard virtualized CPU Ready metric to predict a node’s future availability in real time. CPU Ready expresses a node’s saturation at a fine time granularity and is used extensively in practice as an indicator of performance problems. PRONTO uses federated learning to combine CPU Ready measurements across nodes for more accurate prediction at scale. It incrementally computes local model updates within each node and, using the aggregate of the iterates, computes a global system view. Using large-scale real-world traces from a virtualized production data center, we show that, while using limited memory, PRONTO can use the CPU Ready metric to predict changes in responsiveness ahead of time, leading to better scheduling decisions, while scaling horizontally.

Evangelia Kalyvianaki is a Senior Lecturer (Associate Professor) in the Department of Computer Science and Technology at the University of Cambridge and a Fellow at The Alan Turing Institute. Previously, she was a Lecturer in the Department of Computer Science at City University London and a post-doctoral researcher in the Department of Computing, Imperial College London. She obtained her Ph.D. from the Computer Laboratory at the University of Cambridge. Her research interests span the areas of Cloud Computing, Big Data Processing, Autonomic Computing, Distributed Systems and Systems Research in general. She is interested in the design and management of next-generation, large-scale applications in the Cloud.

Topics of Interest

The following is a non-exhaustive list of topics of interest for the WAIN workshop:

  • Applications of ML in communication networks and distributed systems
  • Data analytics and mining in networking and distributed systems
  • Traffic monitoring through AI
  • AI applied to IoT and 5G
  • Applications of reinforcement learning
  • Methodologies for anomaly detection and cybersecurity
  • Performance optimization through AI/ML and Big Data
  • Experiences and best practices using machine learning in operational networks
  • Reproducibility of AI/ML in networking and distributed systems
  • Methodologies for performance evaluation of distributed infrastructure
  • Machine learning applications in cloud, edge, and fog computing
  • Performance evaluation of Content Delivery Networks
  • Application of AI/ML in sensor networks
  • AI/ML for data center management
  • AI/ML for cyber-physical systems
  • ML-driven resource management and scheduling
  • AI-driven fault tolerance in distributed systems

Important dates:

Submission deadline: September 16, 2021 (Anywhere on Earth); originally September 2, 2021

Notification of acceptance: October 15, 2021; originally October 7, 2021

Camera ready version deadline: October 22, 2021

Workshop day: November 12, 2021

Submission Guidelines:

Papers will be published in ACM SIGMETRICS Performance Evaluation Review (PER, https://www.sigmetrics.org/per.shtml). Submissions must be original, unpublished work, and not under consideration at another conference or journal. Submissions must follow the PER format (two-column 10pt ACM format), with a maximum of 5 pages plus references. Papers must include authors’ names and affiliations for single-blind peer review by the TPC. Authors of accepted papers are expected to present their papers at the workshop.

The PER style file can be downloaded from http://www.sigmetrics.org/sig-alternate-per.cls. Please change the argument of the command \conferenceinfo to \conferenceinfo{Workshop on AI in Networks and Distributed Systems (WAIN) 2021}{~~~Milan,Italy}.
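As a rough illustration, the \conferenceinfo change might sit in a paper's source as sketched below. This is only a sketch under assumptions: the class file is saved next to the paper as sig-alternate-per.cls, the command is placed just after \begin{document} as is common in ACM sig-alternate templates, and the title is a placeholder. Author metadata and the paper body are omitted.

```latex
% Sketch only; assumes sig-alternate-per.cls is in the same directory as the paper.
\documentclass{sig-alternate-per}

\begin{document}

% Conference information as requested in the submission guidelines.
\conferenceinfo{Workshop on AI in Networks and Distributed Systems (WAIN) 2021}{~~~Milan,Italy}

% Hypothetical placeholder; replace with the actual paper title.
\title{Placeholder Title}

% Author metadata, \maketitle, and the paper body follow here,
% as in the PER template shipped with the class file.

\end{document}
```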

The submission page is available at https://easychair.org/conferences/?conf=wain2021.

Chairs

Luca Vassio, Politecnico di Torino, Italy

Danilo Giordano, Politecnico di Torino, Italy

Jinoh Kim, Texas A&M University-Commerce, USA

Jon Crowcroft, University of Cambridge, UK

Publicity Chair

Martino Trevisan, Politecnico di Torino, Italy

TPC Members

Abhishek Chandra, University of Minnesota, USA

Ana Paula Couto da Silva, Universidade Federal de Minas Gerais, Brazil

Andrea Morichetta, TU Wien, Austria

Baochun Li, University of Toronto, Canada

Carlos Henrique Gomes Ferreira, Universidade Federal de Ouro Preto, Brazil

Chunglae Cho, Electronics and Telecommunications Research Institute, South Korea

Edmundo de Souza e Silva, Federal University of Rio de Janeiro, Brazil

Eiko Yoneki, University of Cambridge, UK

Eric Chan-Tin, Loyola University Chicago, USA

Giuseppe Siracusano, NEC Heidelberg, Germany

Hamed Haddadi, Imperial College, UK

Jerry Chou, National Tsing Hua University, Taiwan

Laurent Bindschaedler, Massachusetts Institute of Technology, USA

Mário Almeida, Samsung AI in Cambridge, UK

Martino Trevisan, Politecnico di Torino, Italy

Sang-Yoon Chang, University of Colorado at Colorado Springs, USA

Tian Guo, Worcester Polytechnic Institute, USA

Zied Ben Houidi, Huawei Technologies Co. Ltd, France