Leandro Lanzieri: Detecting Sleigh Fatigue (Christmas Lecture)
Detecting Sleigh Fatigue: Monitoring Ageing in Embedded Systems for Predictive Maintenance at the North Pole
- http://inet.haw-hamburg.de/events/inet-seminar/leandro-lanzieri-detecting-sleigh-fatigue-christmas-lecture
- 2025-12-09T17:00:00+01:00
- 2025-12-09T18:00:00+01:00
Dec 09, 2025 from 05:00 PM to 06:00 PM (Europe/Berlin, UTC+01:00)
Ensuring the reliability of embedded systems is critical in highly dependable and time-sensitive missions, especially when a failure could jeopardize a once-per-year logistics operation as complex as Christmas. As part of our collaboration with the North Pole Reliability Initiative (NPRI), we investigate how hardware ageing affects Commercial-Off-The-Shelf (COTS) devices, which are used extensively in Santa's sleigh avionics and workshop automation. In this talk, we present the results of large-scale empirical studies performed on naturally aged hardware, with the objective of evaluating the presence of ageing indicators and the feasibility of applying machine learning (ML) techniques to detect ageing.
First, we analyse the wear-out of embedded SRAM by studying the cell initialization bias of 154 naturally aged microcontrollers. By extracting features from the uninitialized memories, we perform a statistical analysis and draw insights into the type of hardware usage. We then use these features to train, evaluate, and compare various ML models for estimating operating time. Second, we study and forecast the degradation of a large deployment of 298 naturally aged FPGAs. We collect and statistically analyse in-field measurements over 280 days and find a general, continuous degradation of the propagation delay across all devices. To evaluate predictive maintenance, we forecast future degradation trends by training ML models on the collected data. Third, we complement these real-world studies with the implementation and experimental evaluation of a software-based self-testing approach to monitor hardware degradation on microcontrollers. This technique leverages timing windows of variable length to determine the maximum operational frequency of COTS microcontrollers. Finally, we give a glimpse into current efforts to automate the generation of test firmware for the proposed technique.
Together, these results demonstrate the feasibility of monitoring ageing in COTS embedded systems and of using ML for prediction, thus helping ensure that Santa's devices remain reliable, efficient, and ready for every upcoming Christmas mission.

