#### ALESSANDRO VACCARO

## INVESTIGATION AND MODELING OF RELIABILITY IN POWER ELECTRONIC DEVICES UNDER POWER CYCLING



## Administrative seat: Università degli Studi di Padova

Department of Management and Engineering (Dipartimento di Tecnica e Gestione dei Sistemi Industriali)

## DOCTORAL SCHOOL IN MECHATRONICS AND PRODUCT INNOVATION ENGINEERING

Cycle XXXVI

School director: Prof. Daria Battini

Supervisor: Prof. Paolo Magnone

PhD candidate: Alessandro Vaccaro

Alessandro Vaccaro: *Investigation and modeling of reliability in power electronic devices under power cycling*, November 30, 2023

#### ABSTRACT

Power electronic systems, crucial for a wide range of applications, operate in demanding environmental conditions and face several stressors, including high voltages, power and thermal cycling, and mechanical vibrations. These stressors can degrade components, with consequent impacts on system performance, safety, and lifespan. Enhancing the reliability of power electronic systems is therefore of paramount importance.

Reliability assessment in the field of power electronics, especially during power and thermal cycling, necessitates a holistic approach that encompasses the analysis of failure mechanisms, the prediction of failure rates, and the development of strategies to improve system reliability. This complex task requires a profound understanding of the behavior of power semiconductor devices, thermal management techniques, and packaging technologies.

The thesis focuses on the study and development of reliability models related to power cycling in power semiconductor devices. To this end, a dedicated experimental set-up was first designed to conduct accelerated tests that trigger the two primary failure mechanisms associated with power cycling: solder joint fatigue and wire bond degradation. The set-up implements two power cycling test methodologies: constant current and constant junction temperature swing ($\Delta T_j$). Both methodologies are adopted in this work to calibrate analytical lifetime models, with the aim of determining their impact on the accuracy of lifetime estimation for power devices subjected to non-constant stress conditions.

Furthermore, this work explores the use of deep learning techniques to build a model capable of predicting the lifetime of power devices subject to various degradation mechanisms. Initial investigations use an Artificial Neural Network (ANN) to develop a non-linear static model. Experimental tests validate the accuracy of this model and its superiority over traditional analytical approaches. In addition, a data-driven model has been developed using a bidirectional Long Short-Term Memory (bLSTM) network to predict the Remaining Useful Lifetime (RUL) of devices based on voltage degradation profiles. Particular attention is paid to the impact of dataset partitioning on the model's performance, highlighting its potential for accurate RUL predictions even with a limited amount of data.

#### PUBLICATIONS

- A. Vaccaro and P. Magnone, "Analysis of thermal cycling effects in power devices under non-constant cumulative stress," 2022 IEEE Applied Power Electronics Conference and Exposition (APEC), Houston, TX, USA, 2022, pp. 330-335, doi: 10.1109/APEC43599.2022.9773598.
- A. Vaccaro, P. Magnone, A. Zilio and P. Mattavelli, "Predicting Lifetime of Semiconductor Power Devices under Power Cycling Stress using Artificial Neural Network," in IEEE Journal of Emerging and Selected Topics in Power Electronics, 2022, doi: 10.1109/JESTPE.2022.3194189.
- A. Vaccaro and P. Magnone, "Influence of Power Cycling Test Methodology on the Applicability of the Linear Damage Accumulation Rule for the Lifetime Estimation in Power Devices," in IEEE Transactions on Power Electronics, vol. 38, no. 5, pp. 6545-6554, May 2023, doi: 10.1109/TPEL.2023.3242314.
- A. Vaccaro, A. Zilio and P. Magnone, "Lifetime Prediction in Power Semiconductor Devices: A Comparative study between Analytical Modeling and Artificial Neural Network," 2023 IEEE Applied Power Electronics Conference and Exposition (APEC), Orlando, FL, USA, 2023, pp. 1172-1176, doi: 10.1109/APEC43580.2023.10131380.
- A. Vaccaro, D. Biadene and P. Magnone, "Remaining Useful Lifetime Prediction of Discrete Power Devices by Means of Artificial Neural Networks," in IEEE Open Journal of Power Electronics, vol. 4, pp. 978-986, 2023, doi: 10.1109/OJPEL.2023.3331814.

## CONTENTS

- 1 INTRODUCTION
- 2 SEMICONDUCTOR POWER DEVICES: TECHNOLOGIES AND LIFETIME MODELS
  - 2.1 Reliability in Power Systems
  - 2.2 Power Device Packaging Technologies
    - 2.2.1 Discrete
    - 2.2.2 Module
    - 2.2.3 Press Pack or Capsules
  - 2.3 Power Cycling Stress
    - 2.3.1 Wire Bond Fatigue
    - 2.3.2 Wire Bond Lift-Off
    - 2.3.3 Heel Cracking
    - 2.3.4 Aluminum Reconstruction
    - 2.3.5 Solder Joint Fatigue
  - 2.4 Methods for Estimating Junction Temperature
    - 2.4.1 Direct Mode
      - 2.4.1.1 Physical Methods
      - 2.4.1.2 Optical Methods
    - 2.4.2 Indirect Mode
      - 2.4.2.1 PN-junction Voltage
      - 2.4.2.2 Gate Threshold Voltage
  - 2.5 Techniques for Performing Power Cycling Tests
    - 2.5.1 Constant Current
    - 2.5.2 Constant $\Delta T_j$
    - 2.5.3 Constant Power
  - 2.6 Lifetime Models
    - 2.6.1 Model-Driven
    - 2.6.2 Data-Driven
- 3 DESIGN OF POWER CYCLING SET-UP
  - 3.1 Introduction
  - 3.2 TSEP model calibration
  - 3.3 Experimental Set-up Description for Power Cycling Tests
    - 3.3.1 Circuit Diagram and Instruments Used
    - 3.3.2 LabView Code
      - 3.3.2.1 The Framework
      - 3.3.2.2 VI executed by the FPGA
      - 3.3.2.3 VI executed by the DSP
      - 3.3.2.4 VI executed by the PC
    - 3.3.3 Power Cycling Experiments under constant current
    - 3.3.4 Power Cycling Experiments under constant $\Delta T_j$
- 4 RELIABILITY INVESTIGATION BASED ON ANALYTICAL LIFETIME MODELS
  - 4.1 Introduction
  - 4.2 Linear Damage Accumulation Rule for the Lifetime Estimation
    - 4.2.1 The Impact of Statistics
      - 4.2.1.1 Weibull Statistic
  - 4.3 The Influence of Power Cycling Test Methodology on the LDA Rule Accuracy
    - 4.3.1 Power Cycling Tests Under Constant $\Delta T_j$ Stress
    - 4.3.2 Power Cycling Tests Under Non-Constant $\Delta T_j$ Stress
  - 4.4 Analysis of Degradation Mechanisms
- 5 APPLICATION OF ANN TO MODEL THE RELIABILITY OF SEMICONDUCTOR POWER DEVICES
  - 5.1 Introduction
  - 5.2 Basics of ANN
    - 5.2.1 Multi Layer Perceptron Neural Network
      - 5.2.1.1 Training Process
    - 5.2.2 Recurrent Neural Network
  - 5.3 ANN-based Static Lifetime Model
    - 5.3.1 Limit of Empirical Analytical Lifetime Models
    - 5.3.2 Methodology
    - 5.3.3 Configuration and Performance of MLP-NN
    - 5.3.4 Training of MLP-NN with Experimental Data and Validation
    - 5.3.5 A Comparative Study between Analytical Modeling and ANN
      - 5.3.5.1 Methodology
      - 5.3.5.2 Experimental Results and Comparison of Accuracy of Models
  - 5.4 Adoption of Neural Networks to Predict Remaining Useful Lifetime of Devices
    - 5.4.1 Methodology for RUL Estimation
    - 5.4.2 Results and Discussion
    - 5.4.3 A Methodology to Estimate On-Voltage Degradation of Power Devices According to a Power Cycling Mission Profile
    - 5.4.4 Implementing an ANN for the Development of a Dynamic-Static Model for RUL Prediction
- 6 CONCLUSIONS
- BIBLIOGRAPHY

## LIST OF FIGURES

| Figure | Caption |
|--------|---------|
| Figure 1 | The historical development of power semiconductors, power electronics, and reliability engineering |
| Figure 2 | Damage Investment together with total cost as a function of system reliability |
| Figure 3 | Critical failure occurring in power devices |
| Figure 4 | The division of failure phenomena in power electronics systems |
| Figure 5 | The subdivision of failure phenomena matrix in power electronics systems |
| Figure 6 | The Bathtub curve |
| Figure 7 | Predominant package type as a function of power range |
| Figure 8 | TO packages |
| Figure 9 | Cross-section of TO-247 package |
| Figure 10 | Different sections for the lead |
| Figure 11 | Standard module package |
| Figure 12 | Internal section of power module without baseplate |
| Figure 13 | Press pack package |
| Figure 14 | Schematic representation of a generic thermal cycle as a function of time |
| Figure 15 | Wire bond fatigue |
| Figure 16 | Wire bond lift-off |
| Figure 17 | Heel cracking in a double wire bond |
| Figure 18 | Wire bond heel crack due to improper bonding process |
| Figure 19 | Aluminum reconstruction |
| Figure 20 | Aluminum reconstruction with the use of compressive overlayer |
| Figure 21 | Solder fatigue due to power-thermal cycling stress |
| Figure 22 | Voids in the solder between ceramic substrate and base plate |
| Figure 23 | $V_{ce}$-$T_j$ relationship |
| Figure 24 | $V_{th}$-$T_j$ measurement circuit |
| Figure 25 | $V_{th}$-$T_j$ relationship |
| Figure 26 | Current, voltage and junction temperature trends during a standard power cycling test |
| Figure 27 | Power cycling tests in different control strategies |
| Figure 28 | A modified CIPS08 model for $\Delta T_j < 30\,^{\circ}\mathrm{C}$ |
| Figure 29 | Flowchart of PF algorithm |
| Figure 30 | A simplified schematic circuit for the calibration of $V_{on}$-$T_j$ curves |
| Figure 31 | The schematic circuit adopted to calibrate $V_{on}$-$T_j$ curves |
| Figure 32 | A picture of the experimental set-up adopted for $V_{on}$-$T_j$ calibration |
| Figure 33 | The $V_{ce,on}(T_j)$ curves based on the $I_m$ currents |
| Figure 34 | Simple schematic representation of the set-up adopted for power cycling tests |
| Figure 35 | The driving section of switch and DUT |
| Figure 36 | The circuit utilized to sense $V_{ce}$ and acquire it by means of CompactRIO ADC |
| Figure 37 | Typical curves acquired during a power cycling test |
| Figure 38 | The schematic of $I_{ref}$ generation circuit |
| Figure 39 | The bypass circuit |
| Figure 40 | Picture of the set-up |
| Figure 41 | Real schematic description of PCB used for power cycling tests |
| Figure 42 | Gerber file of PCB used for power cycling experiments |
| Figure 43 | PCB for power cycling experiments |
| Figure 44 | The phases into which a power cycling period is divided for the LabView program |
| Figure 45 | VI code FPGA: driving signal |
| Figure 46 | VI code FPGA: update phase by implementing global variables |
| Figure 47 | VI code DSP: the acquisition phase |
| Figure 48 | VI PC code: evaluate the $\Delta T_j$ |
| Figure 49 | VI PC code: save data for post processing |
| Figure 50 | VI PC code: processing ON phase |
| Figure 51 | VI PC code: processing OFF1 phase |
| Figure 52 | VI PC code: processing OFF2 phase |
| Figure 53 | Temperature swing profiles for 8 different samples (Test1) |
| Figure 54 | $V_{ce,on}$ in DC power cycling in case of $\Delta T_j = 140\,^{\circ}\mathrm{C}$ stress |
| Figure 55 | $Z_{th,jc}$ in the case of non controlled $\Delta T_j = 140\,^{\circ}\mathrm{C}$ stress |
| Figure 56 | Variation of junction temperature cycling for 8 different samples (Test2) |
| Figure 57 | $V_{ce,on}$ in DC power cycling in case of $\Delta T_j = 120\,^{\circ}\mathrm{C}$ stress |
| Figure 58 | $Z_{th,jc}$ in the case of non controlled $\Delta T_j = 120\,^{\circ}\mathrm{C}$ stress |
| Figure 59 | Schematic representation of the non controlled and active controlled $\Delta T_j$ methodologies adopted for power cycling tests |
| Figure 60 | $\Delta T_j$ in DC power cycling test in case of active control $\Delta T_j$ stress |
| Figure 61 | $V_{ce,on}$ in DC power cycling test in case of active control $\Delta T_j$ stress |
| Figure 62 | LabView Code for implementing the active control of $\Delta T_j$ |
| Figure 63 | Schematic diagram of LC evaluation based on the mission profile to which a typical power system is subjected |
| Figure 64 | The CDFs of Weibull distribution with same $\beta$ but different $\alpha$ |
| Figure 65 | Linearization of Weibull CDF |
| Figure 66 | Temperature swing and $V_{ce,on}$ profiles for power cycling tests carried out under constant stress conditions in non-controlled and active control $\Delta T_j$ condition |
| Figure 67 | Experimental CDF for constant $\Delta T_j = 120\,^{\circ}\mathrm{C}$ and $\Delta T_j = 140\,^{\circ}\mathrm{C}$ |
| Figure 68 | Application of the Miner's rule for the determination of the lifetime |
| Figure 69 | Theoretical prediction for Test 1 |
| Figure 70 | Experimental $V_{ce,on}$ and CDF profiles for Test 1 |
| Figure 71 | Combined non-constant power cycling stress, experimental CDFs and Miner's rule predictions for different test conditions in Tab. 6 |
| Figure 72 | X-ray images of the solder joint for a fresh device and after failure |
| Figure 73 | Microscope images of wire bonds after power cycling failures |
| Figure 74 | Example of MLP-NN with three inputs, three hidden layers and a single output |
| Figure 75 | Relationship between input and output in a neuron |
| Figure 76 | Schematic description of a gated cell (LSTM network) |
| Figure 77 | bLSTM network |
| Figure 78 | Schematic representation of the methodology adopted to investigate the performance of MLP-NN to model power cycling effects |
| Figure 79 | RMSRE calculated in case of MLP-NN with 1 hidden layer and a number of neurons ranging from 1 to 60 |
| Figure 80 | RMSRE averaged over 100 tests as a function of the number of neurons, in the case of 1HL (a) or 2HL (b) |
| Figure 81 | Number of cycles to failure as a function of the junction temperature cycling $\Delta T_j$ (a) and the heating time $t_{on}$ (b) via simulated data |
| Figure 82 | Number of cycles to failure as a function of the junction temperature cycling and of the heating time $t_{on}$ via experimental data |
| Figure 83 | Schematic representation of the proposed analysis |
| Figure 84 | Experimental number of cycles to failure as a function of the junction temperature cycling |
| Figure 85 | Typical power cycling tests |
| Figure 86 | Number of cycles to failure as a function of the junction temperature cycling |
| Figure 87 | Number of cycles to failure as a function of the heating time |
| Figure 88 | Graphic representation of the expected outcome of the data-driven model |
| Figure 89 | On-voltage prediction according to the sliding window methodology |
| Figure 90 | Experimental on-voltage profiles as a function of the number of cycles |
| Figure 91 | $V_{ce,on}$ profiles estimated by the neural network in the case of $\Delta T_j = 120\,^{\circ}\mathrm{C}$ and $\Delta T_j = 140\,^{\circ}\mathrm{C}$ |
| Figure 92 | Relative error between predicted and experimental lifetime as a function of the monitored number of cycles |
| Figure 93 | Predicted RULs vs ideal RULs (dashed curves) for all 8 samples stressed at $\Delta T_j = 120\,^{\circ}\mathrm{C}$ |
| Figure 94 | Predicted RULs vs ideal RULs (dashed curves) for all 8 samples stressed at $\Delta T_j = 140\,^{\circ}\mathrm{C}$ |
| Figure 95 | The methodology based on the knowledge of a mission profile of $\Delta T_j$ and of a database of typical $V_{on}$ degradation curves at several stress conditions $\Delta T_j$ |
| Figure 96 | $V_{on}$ profiles in the case of constant $\Delta T_{j,1}$ and $\Delta T_{j,2}$, and average $V_{on}$ profile based on the application of both stresses |
| Figure 97 | Schematic of a typical power cycling test |
| Figure 98 | Experimental $V_{on}$ profiles after power cycling tests |
| Figure 99 | Comparison of experimental on-voltage profiles to validate the methodology |
| Figure 100 | The schematic of the adopted ANN for predicting the RUL under variable stress conditions |
| Figure 101 | Average $V_{on}$ profiles utilized to train the ANN for RUL prediction under general mission profile |
| Figure 102 | Experimental results and RUL predictions generated by the ANN under the stress conditions of Mission Profile 1 |
| Figure 103 | Experimental results and RUL predictions generated by the ANN under the stress conditions of Mission Profile 2 |
| Figure 104 | Experimental results and RUL predictions generated by the ANN under the stress conditions of Mission Profile 3 |

## LIST OF TABLES

| Table    | Caption |
|----------|---------|
| Table 1  | Comparison of the three common methods for junction temperature measurement [6] |
| Table 2  | Voltage and current ranges of SMUs |
| Table 3  | Electrical and thermal parameters of TO-247 used as DUT |
| Table 4  | Electrical and thermal parameters of used switches |
| Table 5  | Summary of DC power cycling tests under constant current condition |
| Table 6  | List of experiments under non-constant $\Delta T_j$ stress |
| Table 7  | Experimental lifetime vs. lifetime prediction according to the Miner's rule (19). Lifetime estimated with (19) and prediction bound are expressed as a percentage of the experimental number of cycles to failure. The "active $\Delta T_j$" approach is compared with the "non-controlled $\Delta T_j$" approach |
| Table 8  | Experimental lifetime vs. lifetime prediction according to the Miner's rule (19). Lifetime estimated with (19) and prediction bound are expressed as a percentage of the experimental number of cycles to failure. Different non-constant power cycling test conditions are considered in the case of "active control of $\Delta T_j$" |
| Table 9  | Input parameters adopted for the training process of MLP-NNs according to the methodology presented in Fig. 78 |
| Table 10 | List of power cycling experiments |
| Table 11 | Fitting parameters of empirical analytical model (23) |
| Table 12 | Summary of RMSRE |
| Table 13 | Summary of experimental test conditions |
| Table 14 | List of mission profiles employed for power cycling experiments under non-constant stress conditions |

#### ACRONYMS

- ADC Analog-to-Digital Converter
- AlN Aluminum Nitride
- AlSiC Aluminum Silicon Carbide
- Al<sub>2</sub>O<sub>3</sub> Aluminum Oxide
- ANN Artificial Neural Network
- bLSTM Bidirectional Long Short-Term Memory
- CDF Cumulative Distribution Function
- CSV Comma-Separated Values
- CuSn Copper-Tin
- CTE Coefficient of Thermal Expansion
- DCB Direct Copper Bonding
- DSP Digital Signal Processing
- DUT Device Under Test
- EoL End of Life
- FEM Finite Element Method
- FFNN Feed-Forward Neural Network
- FIT Failures In Time
- FPGA Field-Programmable Gate Array
- GIT Gate Injection Transistor
- GTO Gate Turn-Off
- HL Hidden Layer
- IC Integrated Circuit
- IGBT Insulated Gate Bipolar Transistor
- IR Infra-Red
- LC Life Consumption
- LDA Linear Damage Accumulation
- LSTM Long Short-Term Memory
- MLP-NN Multilayer Perceptron Neural Network
- MOSFET Metal-Oxide-Semiconductor Field-Effect Transistor
- NPSV Network-Published Shared Variable
- p-AlGaN p-type doped Aluminium Gallium Nitride
- PCB Printed Circuit Board
- PDF Probability Density Function
- PF Particle Filter
- PID Proportional Integral Derivative
- PoF Probability of Failure
- ppm Parts per Million
- RMSE Root Mean Square Error
- RMSRE Root Mean Square Relative Error
- RNN Recurrent Neural Network
- RUL Remaining Useful Lifetime
- SiC Silicon Carbide
- SMU Source Meter Unit
- SoH State of Health
- TSEP Temperature Sensitive Electrical Parameters
- TO Transistor Outline

#### INTRODUCTION

Power electronics is a multidisciplinary field that plays a pivotal role in the efficient conversion, control, and conditioning of electrical power [1]. It encompasses the design, analysis, and application of electronic devices and systems for a wide range of power-related applications, including renewable energy systems, electric vehicles, industrial motor drives, and consumer electronics.

The continuous advancements in power electronic technologies, as depicted in Fig.1, have revolutionized the utilization of electrical energy, contributing to improved energy efficiency, reduced carbon emissions, and enhanced overall performance in numerous applications [2]. The fundamental components of power electronics include passive components, such as inductors, capacitors, and transformers, and active components, i.e., power semiconductor devices.

Among these active components, the invention of thyristor switches in 1957 (see Fig.1) marked a significant milestone [3]. The commercialization of thyristors triggered the rapid advancement of power electronics and paved the way for the development of other devices, including the first power diodes. In the 1970s, the demand for power devices with enhanced switching capabilities surged with the advent of control technologies such as Pulse Width Modulation (PWM) [4]. This demand spurred the evolution of power semiconductor devices, leading to the development of Gate Turn-Off (GTO) thyristors. Furthermore, the quest for even higher performance brought about the emergence of power MOSFETs (metal-oxide-semiconductor field-effect transistors), solidifying their place in the field of power electronics. In the 1980s, the insulated gate bipolar transistor (IGBT) was introduced [5] (see Fig.1), marking a turning point in power electronics and enabling the development of more advanced and sophisticated applications. Currently, the development of power semiconductor devices is entering its third phase with the utilization of wide bandgap materials such as SiC or GaN. This phase follows the earlier ones, which relied on vacuum tube rectifiers (phase 1 of Fig.1) and on the silicon power devices described above (phase 2 of Fig.1).

In general, all of these components are strategically integrated to form various power electronic circuits, including rectifiers, inverters, converters, and voltage regulators. Although they typically represent only 10 to 30% of the total value of a power electronics system, they have a significant impact on the system's overall value, dimension, and technical functionality [6].

As the demand for energy efficiency and the integration of renewable energy sources into the power grid continue to rise, the role of power electronics becomes even more critical [7]. Power electronic systems enable the seamless integration of renewable energy sources like solar and wind into the grid by efficiently managing power flow, regulating voltage and frequency, and facilitating energy storage. Furthermore, power electronics plays a vital role in the electrification of transportation, supporting the transition from internal combustion engines to electric vehicles through the provision of efficient charging systems, motor drives, and energy management systems.



**Fig. 1**: The historical development of power semiconductors, power electronics, and reliability engineering [8].

#### RELIABILITY IN POWER ELECTRONICS

While power electronics has achieved remarkable progress in terms of performance and efficiency, reliability remains a critical aspect that demands attention. Reliability refers to the ability of a system or component to consistently perform its intended function under specified conditions for a specified period [9, 10]. In power electronic systems, reliability is of utmost importance due to the critical nature of the applications they serve. Many industrial, healthcare, automotive, energy, transportation, and aerospace applications rely on power electronic circuits [11]. The requirement for reliability in this field has increased considerably [12]. For instance, in some applications such as avionics, zero failure tolerance is required [11]. Moreover, the sustainability of a power electronic circuit/system is closely related to its durability. Consequently, reliability has a significant impact from both economic and safety perspectives [11, 13, 14]. In particular, Fig.2 illustrates the concept of reliability cost/benefit analysis. It shows that by increasing investments to improve the system's reliability (red curve), the overall system reliability increases. As a result, the costs incurred due to system failures decrease (blue curve). The sum of these two curves represents the total cost (green curve), which exhibits a minimum point, identifying the optimal level of annual cost/system reliability [15].



**Fig. 2**: Damage, investment together with total cost as a function of system reliability [15].
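The trade-off shown in Fig.2 can be written compactly. As a sketch, let $C_{inv}(R)$ be the annual investment cost and $C_{dam}(R)$ the annual cost of failures at system reliability $R$ (these symbols are introduced here for illustration and are not taken from [15]). Assuming both curves are smooth, the total cost and its minimum read:

$$
C_{tot}(R) = C_{inv}(R) + C_{dam}(R), \qquad
\left.\frac{dC_{tot}}{dR}\right|_{R_{opt}} = 0
\;\Rightarrow\;
\left.\frac{dC_{inv}}{dR}\right|_{R_{opt}} = -\left.\frac{dC_{dam}}{dR}\right|_{R_{opt}}
$$

Under this assumption, the optimum lies where the marginal investment needed to gain reliability equals the marginal saving in failure costs.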

Power electronic systems operate under diverse environmental conditions and are subjected to various stresses, such as high voltage, high current, temperature cycling, and mechanical vibrations. These stresses can lead to component degradation, thermal fatigue, material fatigue, and other failure mechanisms, which can significantly impact the system's performance, safety, and lifespan. Therefore, understanding and improving the reliability of power electronic systems is crucial to ensure their efficient and long-term operation.

**Fig. 3**: Critical failures occurring in power electronic devices: discrete (a) and module (b). (c) Fire in a wind turbine blade as a consequence of a critical failure in its integrated power system.

Reliability assessment in power electronics involves the analysis of failure mechanisms, the prediction of failure rates, and the development of strategies to enhance system reliability. It requires a comprehensive understanding of the behavior of power semiconductor devices, thermal management techniques, packaging technologies, and system-level considerations [12]. One of the primary concerns in power electronics reliability is the degradation and failure of power semiconductor devices. Power semiconductor devices, such as IGBTs, MOSFETs, and thyristors, are exposed to high power densities and thermal cycling, which can induce stresses leading to device degradation and eventual failure, as shown in Fig.3.

To evaluate the reliability of power electronic systems, accelerated lifetime testing is often employed [16]. Accelerated life testing involves subjecting the devices or systems to harsh conditions, such as elevated temperatures, high voltages, and increased current levels, to expedite the aging and failure process. By analyzing the failure data obtained from accelerated tests, the reliability characteristics, such as failure rates and failure mechanisms, can be estimated and used to predict the expected lifetime under given operating conditions. Advancements in reliability modeling and prediction techniques have enabled researchers and engineers to enhance the design and operation of power electronic systems. Mathematical models have been utilized to estimate the expected lifetime based on the observed degradation mechanisms and stress levels. These models consider factors such as temperature, electrical stress, and mechanical stress to provide insights into the system's reliability and identify critical components that may require additional attention in terms of design improvements or maintenance strategies.

In recent years, the application of artificial neural networks (ANNs) and machine learning algorithms has gained prominence in power electronics reliability analysis [17–19]. These techniques can capture complex relationships between input variables (such as temperature, current, and voltage) and the corresponding system reliability, enabling more accurate predictions and proactive maintenance strategies. Moreover, by training these models with historical data and monitoring the system's health parameters in real-time, it is possible to estimate the Remaining Useful Lifetime (RUL) of critical components and take timely actions to prevent failures and optimize system performance.

In conclusion, as power electronics continues to advance, addressing reliability challenges will remain a key focus area. By combining interdisciplinary knowledge, advanced modeling techniques, and data-driven approaches, researchers and engineers can continue to improve the reliability of power electronic systems, leading to increased performance, longevity, and safety in a wide range of applications.

#### OBJECTIVES

Overall, the objectives of the thesis are to enhance the understanding of power device reliability under power cycling conditions, identify critical stress parameters, and investigate the applicability of advanced methodologies, such as deep learning, in accurately predicting device lifetime.

In more detail, the dissertation aims to conduct studies and develop reliability models for power devices subjected to power cycling stresses. To achieve this, a dedicated set-up is designed and implemented to conduct power cycling experiments. The set-up allows for simultaneous stress-testing of multiple components under well-defined experimental conditions. The main goal is to induce solder degradation and wire bond degradation, which are the primary failure mechanisms associated with power cycling.

Additionally, the thesis focuses on evaluating different methodologies for accelerated lifetime testing of IGBT devices under power cycling stress. Special attention is given to understanding the differences in lifetime estimation under non-constant stress conditions and analyzing the performance of the LDA rule in predicting device reliability. Furthermore, the thesis will explore the potential of utilizing deep learning techniques, specifically ANNs, for lifetime prediction to advance the field of research.

#### STRUCTURE OF THIS WORK

This dissertation is structured into six primary sections. The initial section, presented in **Chapter 1**, elucidates the research objectives and delineates the scope of the dissertation.

**Chapter 2** offers a concise introduction to the impact of power cycling stress on the reliability of power electronic systems. It explores various packaging types and their associated mechanisms of wear-out failure. Additionally, the chapter delves into the standard direct current power cycling test and examines several advanced power cycling test methods published in recent years. The historical progression and the current state of the art in lifetime models for power semiconductor devices are also presented.

**Chapter 3** primarily presents detailed information regarding the designed experimental set-up for conducting power cycling experiments, along with the appropriate techniques and methodologies employed to carry out such experiments. Additionally, it describes the measurement set-up used to implement the temperature sensitive electrical parameters (TSEP) method during the power cycling experiments, aiming to estimate the junction temperature.

**Chapter 4** focuses on the traditional analytical model for estimating life consumption in the presence of non-constant cumulative stresses. Specifically, the impact of the experimental methodology on the accuracy of linear damage accumulation theory is investigated.

**Chapter 5** delves into the utilization of deep learning techniques to enhance reliability assessment within the context of power cycling stress. To begin, an empirical static model based on ANN is constructed for estimating the lifetime. This model is then compared against the conventional analytical empirical model. Following that, a dynamic lifetime prediction model, also based on ANN, is developed. This dynamic model enables the estimation of the RUL and future prediction of an electrical parameter used as the state of health (SoH) for power devices.

In **Chapter 6**, a comprehensive summary of the dissertation's principal findings is provided, with reference to the research objectives defined in the introduction.

# SEMICONDUCTOR POWER DEVICES: TECHNOLOGIES AND LIFETIME MODELS

#### 2.1 RELIABILITY IN POWER SYSTEMS

The demand for increased reliability in power electronic systems has grown in recent years for three main reasons:

- The increase in device density, driven by a growing demand for higher current density on the chip. This has led to advancements in packaging technologies to limit the impact of high temperatures and associated temperature gradients on power devices;
- The widespread use of power electronic systems in various sectors, in particular the aviation and medical fields, where zero failure tolerance is required;
- Modern industrial systems demand low failure rates as multiple power systems need to operate simultaneously at the same frequency, significantly increasing the failure rate [11, 20].

Driven by these considerations, the average failure rate of power modules decreased from 1000 FIT in 1995 to 20 FIT in 2000, where 1 FIT corresponds to 1 failure per 10<sup>9</sup> device-hours [21]. To address these issues, it is first necessary to identify the components that are susceptible to failure in power systems and determine the causes behind such failures. In general, in order of importance, the following elements are typically involved [22]:

- Capacitors;
- PCBs;
- Semiconductors;
- Solder joints;
- Connections or other factors.
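To put the FIT figures quoted before this list into more familiar terms, the following minimal sketch converts a FIT value into a failure rate, a mean time between failures (MTBF) and an expected number of failures in a fleet. It assumes the standard definition of 1 FIT as one failure per 10<sup>9</sup> device-hours and a constant failure rate; the fleet size and operating time are hypothetical values chosen only for illustration.

```python
HOURS_PER_YEAR = 8760  # 365 days x 24 h, leap years ignored


def fit_to_failure_rate(fit: float) -> float:
    """Convert a FIT value into a failure rate in failures per device-hour."""
    return fit * 1e-9


def mtbf_hours(fit: float) -> float:
    """Mean time between failures in hours, valid for a constant failure rate."""
    return 1.0 / fit_to_failure_rate(fit)


def expected_failures(fit: float, n_devices: int, years: float) -> float:
    """Expected number of failures in a fleet over the given operating time."""
    return fit_to_failure_rate(fit) * n_devices * years * HOURS_PER_YEAR


if __name__ == "__main__":
    # 1000 FIT and 20 FIT are the two values quoted in the text;
    # 10,000 devices and 10 years are hypothetical fleet parameters.
    for fit in (1000, 20):
        print(fit, mtbf_hours(fit), expected_failures(fit, n_devices=10_000, years=10))
```

Under the constant-rate assumption, 1000 FIT corresponds to an MTBF of 10<sup>6</sup> hours and 20 FIT to 5·10<sup>7</sup> hours.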

Fig.4 reports the percentage breakdown of the most frequently occurring failure phenomena in power systems.

From Fig.5, it can be observed that the failure mechanisms listed above and reported in Fig.4 are predominantly caused by thermal phenomena, both static and cyclic, which account for more than half of the cases. Consequently, considering the significant impact of thermal phenomena on power electronic systems, this PhD thesis focuses on various aspects related to these phenomena arising from an active action of the power system, known as power cycling.

In particular, the effects of power cycling are considered on solder joints and semiconductors, which are responsible for more than one-third of the causes of failure in power electronic systems, as reported in Fig.4. This is mainly because the different operating conditions of the load connected to the power system affect the currents flowing through it. These current variations induce temperature changes in the power devices, leading to cyclic thermal stresses [25, 26].

**Fig. 4**: According to [23], the highest percentage of failures is attributed to capacitors (30%) followed by PCBs (26%). Semiconductor and solder joints (failures predominantly affecting power devices) account overall for 34%. Finally, 10% of failure is attributed to various causes or to connections.

**Fig. 5**: According to [24], the majority of the failure phenomena reported in Fig.4 are primarily attributed to thermal phenomena, both static and cyclic, accounting for approximately 55%. Other causes include vibration/shock (20%), humidity/moisture (19%), and dust (6%).

**Fig. 6**: The bathtub curve describes the relationship between the failure rate and the operational time of a component.

The failure phenomenon due to power cycling is predominantly attributed to device aging and falls within the third region of the bathtub curve shown in Fig.6. This type of curve exhibits three regions, in which three different mathematical trends of the failure rate, defined as the anticipated number of times that an item fails in a specified period of time [27], are identified:

- *Early Failure*: As evident from the curve, the first region represents failure occurring solely due to production defects. Therefore, this type of failure can only occur in the early stages of component operation, resulting in a decreasing failure rate.
- *Random Failure*: In the random failure region, the failure rate remains constant. Here, there are no manufacturing defects or aging effects, so the only type of failure that can occur is of a random nature.
- *Wear-out Failure*: In the third region, failure is attributed to device aging, indicating that the device is approaching the end of its life. The failure rate exhibits an exponential increase as a function of operating time.
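The three regions listed above can be reproduced qualitatively with a simple numerical sketch. The example below is not the model used in this work: it assumes a Weibull hazard function, a common textbook choice in which the shape parameter `beta` selects a decreasing (`beta < 1`), constant (`beta = 1`) or increasing (`beta > 1`) failure rate. Note that for `beta > 1` the increase follows a power law rather than the exponential trend mentioned above. All parameter values are hypothetical.

```python
def weibull_hazard(t: float, beta: float, eta: float) -> float:
    """Weibull hazard rate h(t) = (beta / eta) * (t / eta)**(beta - 1), for t > 0."""
    if t <= 0 or beta <= 0 or eta <= 0:
        raise ValueError("t, beta and eta must be positive")
    return (beta / eta) * (t / eta) ** (beta - 1)


def bathtub_hazard(t: float) -> float:
    """Sum of three Weibull hazards, one per region (illustrative parameters only)."""
    early = weibull_hazard(t, beta=0.5, eta=2.0e5)        # decreasing: early failures
    random_part = weibull_hazard(t, beta=1.0, eta=1.0e5)  # constant: random failures
    wear_out = weibull_hazard(t, beta=5.0, eta=5.0e4)     # increasing: wear-out failures
    return early + random_part + wear_out


if __name__ == "__main__":
    for t in (10.0, 1.0e4, 1.0e5):  # operating time in hours
        print(f"t = {t:>8.0f} h, failure rate = {bathtub_hazard(t):.3e} 1/h")
```

Summing the three hazards corresponds to treating the three failure causes as independent competing mechanisms, which is an assumption of this sketch.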

Under nominal operating conditions, power cycling-related failure events take a long time to occur. In order to analyze this phenomenon, understand the underlying physical mechanisms, and identify potential lifetime improvements brought by technological innovations in packaging, accelerated tests need to be implemented in the laboratory [16, 28, 29].

#### 2.2 POWER DEVICE PACKAGING TECHNOLOGIES

In power electronics, one of the main considerations is to ensure that the heat generated by Joule effect in the chip can flow properly to the heat sink. It is important to note that in this field of electronics, extremely high power densities are involved. For example, it is common for power in the range of several hundred watts to be dissipated on a surface of a few cm<sup>2</sup> [20]. Therefore, it is necessary to encapsulate such chips (e.g., silicon chips) to improve thermal conductivity, but also to:

- Increase reliability: provide high durability due to the changing current conditions required by the load;
- Reduce parasitic elements such as resistances, capacitances, and inductances, improving electrical performance;
- Provide electrical isolation between switches (in the case of modules) or with the heat sink;
- Protect the device from external environmental influences such as humidity;
- Provide protection against mechanical stress.



**Fig. 7**: Power range of modern semiconductor devices as a function of the predominant package technology [20].

It is precisely due to these challenges that the development of packaging for power devices lags behind IC packages [30]. The choice of packaging depends on the required power rating. Fig.7 gives an overview of the power ratings covered by each package technology.

The packaging technologies currently used in the field of power electronics are:

- Discrete;
- Module;
- Press pack.

#### 2.2.1 Discrete

As shown in Fig.7, discrete packages cover the entire range of low-power applications (ranging from tens of watts to a few kilowatts). A wide variety of discrete packages is available (some examples are reported in Fig.8), but the most used are the TO-247 (a cross-section diagram is provided in Fig.9) and the TO-220 [20]. Discrete devices are encapsulated in a transfer mold compound based on silicon gel, which provides higher dielectric constants and enhances resistance to moisture, mechanical stresses and external chemical agents [31]. In this type of packaging, the silicon die is directly soldered to the copper substrate, and aluminum wires are directly connected to the leads. Typically, one of these wires is connected to the gate pin, while the others are connected to the source lead. Because of this structure, the drain lead and the substrate are electrically connected, so an insulating pad is needed between the heat sink and the power device package.

One of the weak points of these devices is the leads themselves. Since the leads also have an ohmic resistance, although much smaller than that of the device, the power dissipated in them could heat the solder joints at the PCB holes up to their melting temperature. To overcome this issue, the power dissipated in the leads needs to be reduced. However, enlarging the lead's size is not feasible due to fixed hole dimensions (see Fig.10b). Furthermore, maintaining minimal clearance distances between the leads is necessary to satisfy insulation requirements [20]. Therefore, the focus should be on optimizing the shape of the lead cross-section [20]. In particular, by improving the placement of the lead cross-section (black square in Fig.10) to better fit within the hole of the PCB where it will be soldered (white circle in Fig.10), it is possible to reduce the ohmic resistance of the power device's lead on the PCB, thus decreasing power dissipation, as demonstrated with the proposed solution in Fig.10c [20].

Another drawback of these devices is the presence of wire bonds, which are subjected to power cycling stress and are susceptible to the well-known phenomenon of lift-off, as well as having an impact on parasitic inductances. To limit these issues, alternative discrete packages, also known as DirectFET structures, have been developed (see Fig.8a). In these packages, the interconnections are significantly thicker (clip contacts) than traditional wire bonds, and the pins are replaced with pads. This allows for handling higher currents, reducing parasitic inductances and improving the reliability in terms of wire bonding degradation [20, 32]. However, these newer packaging types require greater attention to thermal design, as they have the potential for higher current capability. Additionally, their encapsulation type makes them more susceptible to atmospheric corrosion and moisture (the plastic packaging typical of TO packages is eliminated), and they also have higher costs compared to the traditional TO packages.

**Fig. 8**: Commercial packages: DirectFET (a), TO-220 (b) and TO-247 (c).



**Fig. 9**: Cross-section of TO-247 package.



**Fig. 10**: (a) Standard section of a lead in a TO-247 package; (b) enlarged version of the lead cross-section; (c) super leg, which provides a 16% improvement in current capability [20].

#### 2.2.2 Module

The development of this type of package allows for encapsulating multiple chips connected in parallel, enhancing their electrical capabilities. In Fig.11, a commercial power module is shown, as well as a schematic representation of the cross-section. In this package, the silicon chip is connected through a wire bond to the top side contacts, while it is soldered to the DCB (Direct Copper Bonding) substrate consisting of two copper layers and an insulator. The presence of the insulator ensures electrical isolation of the device from the heat sink (which is not possible in TO packages). The substrate is then connected to the base plate through soldering. Typically, the wire bond material is aluminum, while the metallization layers and the substrate are made of copper. The oxide layer is Al<sub>2</sub>O<sub>3</sub> to achieve a proper balance of the CTE.

However, the traditional power module structure is susceptible to two common failure phenomena related to power/thermal cycling, namely solder joint fatigue and wire bond degradation. To mitigate or eliminate these issues, various solutions have been developed. For example, to address the CTE mismatch between neighboring materials such as silicon and aluminum, which is the cause of wire bond degradation, a solution with copper wire bonding has been proposed. Copper and aluminum have a CTE of 16.5 ppm/°C and 22.5 ppm/°C, respectively. Hence, the CTE of copper is closer to that of silicon, which is 2.6 ppm/°C. However, this solution has the drawback of creating a thick metallization layer of copper for subsequent electrical contacts made of aluminum in the upper part of the module. This practice does not provide sufficient protection for the ultrasonic bonding process of the copper wire bonds to the chip. Additionally, this copper layer adds extra thermal capacitance to the device, leading to increased thermal stress compared to thermal phenomena [33, 34].
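The benefit of copper over aluminum can be quantified with a first-order estimate. The sketch below computes the unconstrained thermal mismatch strain, $\Delta\alpha \cdot \Delta T$, between the wire material and silicon using the CTE values quoted above. It is a simplified illustration that ignores geometry, elasticity and plasticity; the temperature swing of 120 °C is an assumed example value.

```python
# CTE values quoted in the text, in ppm/°C
CTE_PPM = {"Si": 2.6, "Cu": 16.5, "Al": 22.5}


def mismatch_strain(material_a: str, material_b: str, delta_t: float) -> float:
    """Unconstrained thermal mismatch strain |alpha_a - alpha_b| * delta_t (dimensionless)."""
    delta_alpha = abs(CTE_PPM[material_a] - CTE_PPM[material_b]) * 1e-6
    return delta_alpha * delta_t


if __name__ == "__main__":
    delta_t = 120.0  # °C, assumed temperature swing for illustration
    for wire in ("Al", "Cu"):
        strain = mismatch_strain(wire, "Si", delta_t)
        print(f"{wire}-Si mismatch strain at dT = {delta_t:.0f} °C: {strain * 100:.3f} %")
```

With these values the Cu-Si mismatch is roughly 30% lower than the Al-Si mismatch for any temperature swing, since the ratio depends only on the CTE differences (13.9 versus 19.9 ppm/°C).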

For modules requiring higher thermal conductivity, the Al<sub>2</sub>O<sub>3</sub> insulator is replaced by AlN; Al<sub>2</sub>O<sub>3</sub> and AlN have thermal conductivities of 24 W/(m·K) and 170 W/(m·K), respectively.
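To give a feeling for what this difference in conductivity means, the following sketch estimates the thermal resistance of the ceramic layer alone with the one-dimensional conduction formula $R_{th} = t / (k \cdot A)$. Heat spreading and all other layers are ignored, and the layer thickness (0.4 mm) and heat-flow area (1 cm<sup>2</sup>) are assumed values chosen only for illustration.

```python
def layer_thermal_resistance(thickness_m: float, conductivity_w_mk: float, area_m2: float) -> float:
    """1-D conduction resistance R_th = t / (k * A) in K/W, ignoring heat spreading."""
    if min(thickness_m, conductivity_w_mk, area_m2) <= 0:
        raise ValueError("all arguments must be positive")
    return thickness_m / (conductivity_w_mk * area_m2)


# Thermal conductivities quoted in the text, in W/(m*K)
CONDUCTIVITY = {"Al2O3": 24.0, "AlN": 170.0}

if __name__ == "__main__":
    thickness = 0.4e-3  # m, assumed ceramic thickness
    area = 1.0e-4       # m^2 (1 cm^2), assumed heat-flow area
    for name, k in CONDUCTIVITY.items():
        r_th = layer_thermal_resistance(thickness, k, area)
        print(f"{name}: {r_th:.3f} K/W")
```

For equal geometry, the resistance of the AlN layer is lower by the ratio of the conductivities, i.e. by a factor of about 7.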



**Fig. 11**: (a) A commercial IGBT power module; (b) a simplified internal cross-section of the power module with baseplate.

The presence of AlN insulation may require the use of an AlSiC baseplate instead of copper in applications where high performance is required, in order to mitigate CTE differences. However, the presence of an AlSiC substrate increases the thermal impedance of the entire system. This is why, currently, an increasing number of power modules are being designed without a baseplate (as depicted in Fig.12) [20]. The absence of a baseplate also relaxes the dimensional constraints in the substrate zone (copper-oxide-copper), whose layer thicknesses are typically 0.3 mm, 0.4/0.6 mm, and 0.3 mm, respectively. Consequently, there are no inherent limitations on the footprint size, potentially reducing production efforts and the likelihood of errors during manufacturing [20]. Additionally, this approach reduces thermal gradients on the chip, enhancing reliability. However, the absence of a baseplate negatively affects the heat dissipation pathway, resulting in a spread of temperature across the device. Therefore, the baseplate-less power module solution is preferable only for small-sized chips. Furthermore, the substrate-heat sink interface requires the use of very thin thermal interface materials (TIM) to minimize the impact on the overall thermal impedance of the system.



**Fig. 12**: A simplified internal section of power module with the use of thermal interface materials (TIM) and without baseplate.

#### 2.2.3 Press Pack or Capsules

This type of packaging is widely used for high or very high power applications, particularly for encapsulating devices such as thyristors or GTOs. The first power IGBTs used this type of package; however, due to the much higher cell density compared to thyristors and space limitations, power devices have transitioned to module-type packaging. Additionally, it has been observed that in terms of power cycling capability, IGBTs encapsulated in press-pack packages are less robust compared to those encapsulated in modules [6]. This package type consists of a double metal disc (anode and cathode) with the silicon chip sandwiched in between (see Fig.13). This design is intended to distribute pressure uniformly within the package. Furthermore, a molybdenum disc improves pressure uniformity and ensures excellent thermal expansion coefficient matching with silicon, resulting in better heat dissipation. The gate of the thyristor is placed on a metal lead extending from the cathode into the silicon chip. A defined physical contact pressure, typically around 10-20 N/mm$^2$, is applied to establish electrical and thermal contact via external clipping [35]. The interconnection between the silicon device and molybdenum can vary depending on the package size, but the process typically involves sintering techniques where a noble metal layer is created to form the metallic contact, using metal powder under high pressure and at a temperature of approximately 250°C [20].

The positive aspects of this packaging type include:

- The ability to cool both surfaces of the device.
- No wire bonding, eliminating the associated failure mechanisms.
- Excellent die/package surface ratio and compact design.
- Due to its specific structure, especially the arrangement of cathode and anode, it is possible to easily connect multiple of them in series, which is highly advantageous, especially in applications that demand such solutions, such as high-voltage applications.

The negative aspects include:

- Isolation circuits are necessary due to the absence of electrical insulation in the package structure.
- It is difficult to choose and maintain an appropriate pressure during the package assembly phase.



**Fig. 13**: (a) A typical press pack commercial GTO. (b) A simplified internal section of the capsule [20].

#### 2.3 POWER CYCLING STRESS

Sec.2.1 discussed the significant impact of power/thermal cycling phenomena on power systems and the need to perform accelerated tests to analyze their effects.

Power cycling denotes the phenomenon in which the load connected to the power system necessitates varying current levels over time. This phenomenon directly impacts the current flowing through the constituent devices of the system, consequently causing variations in their operating junction temperatures. These load-induced temperature variations are typically non-periodic and often involve rather lengthy timescales, allowing power devices to cool down  $(t_{cold})$  and heat up  $(t_{heat})$ , thus defining a cycle of temperature variation (like in Fig.14).

This cyclic temperature variation causes expansion and contraction of the adjacent materials within the power device, the intensity of which depends on the difference in their coefficient of thermal expansion (CTE), resulting in thermo-mechanical stresses at the interface of these materials. These repeated stresses induce material degradation and damage to the device, altering its electrical and thermo-electrical properties. This can adversely affect the device's performance or even lead to failure in the overall system where the power device is integrated.
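As a rough illustration of the mechanism described above, the mismatch strain that an interface has to accommodate can be estimated to first order as the CTE difference multiplied by the temperature swing. The sketch below applies this to an aluminum/silicon interface. The CTE values are typical room-temperature handbook figures assumed here for illustration (they are not taken from this work), and the function name is hypothetical.

```python
# First-order estimate of the thermal mismatch strain at a bonded interface.
# Assumption: both materials are forced to share the same in-plane length, so the
# strain to be accommodated is |alpha_a - alpha_b| * delta_T.

# Typical handbook CTE values near room temperature (assumed, in 1/K).
CTE_ALUMINUM = 23e-6
CTE_SILICON = 2.6e-6


def mismatch_strain(cte_a: float, cte_b: float, delta_t: float) -> float:
    """Return the dimensionless mismatch strain for a temperature swing delta_t (K)."""
    return abs(cte_a - cte_b) * delta_t


if __name__ == "__main__":
    for delta_t in (40.0, 80.0, 120.0):
        strain = mismatch_strain(CTE_ALUMINUM, CTE_SILICON, delta_t)
        print(f"dT = {delta_t:5.1f} K -> mismatch strain = {strain:.2e}")
```

The estimate ignores geometry, plasticity and the temperature dependence of the CTEs, so it only indicates how the stress driver scales with the temperature swing.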



**Fig. 14**: Schematic representation of a generic thermal cycle as a function of time, where $t_{heat}$ and $t_{cold}$ are the periods when the power device heats up and cools down, respectively.

This paragraph focuses on the failure mechanisms associated with power cycling. These fully permanent failures include:

- Wire bond fatigue;
- Wire bond lift off;
- Wire bond heel cracking;
- Aluminum reconstruction;
- Solder joint fatigue.

#### 2.3.1 Wire Bond Fatigue

The wire bonds connected to the leads of the power device make contact with the silicon chip through an aluminum pad. These wire bonds generally have diameters ranging from 300 to 500 µm and are primarily composed of aluminum with small additions of magnesium or nickel. This composition helps prevent the formation of corrosive phenomena on the silicon chip and mitigates the differences in CTE between aluminum and silicon if the interface were exclusively composed of the contact between these two materials.

The portion of the wire bond closest to the chip is the most exposed to temperature gradients during device operation, and the power dissipated in it typically ranges from 100 mW to a maximum of 400 mW, depending on the wire bond diameter. However, during transient phases (switching modes) of the power device, the current density through the wire bond section is non-uniform due to the skin effect. This results in shear stress between the wire bond and the pad to which it is connected on the silicon chip, as well as repeated flexing of the wire bond. Consequently, there is a change in the device's resistive behavior and a modification in current distribution, as the electrical resistance of the interface pad increases. The point at which this stress occurs is indicated in Fig.15 [36, 37].



**Fig. 15**: Cross section of a wire bond, with a representation of the transition from wire bond to the silicon chip via an intermediate aluminum pad [36].

#### 2.3.2 Wire Bond Lift-Off

The wire bond lift-off failure is caused by thermal stress at the interface between two materials, silicon and aluminum, which have different CTE. This phenomenon does not affect the copper lines connected to the aluminum wire bond, as the CTE difference between copper and aluminum is smaller compared to that between silicon and aluminum.

The crack generated by thermo-mechanical stress forms at the base of the wire bond and propagates until the wire detaches from the contact pad connected to the silicon chip. It is known that polycrystalline materials like aluminum have a maximum acceptable stress value (elastic stress region), and if this value is exceeded, the impact of thermo-mechanical stress is amplified (plastic stress region). Of course, this depends on the intensity and duration of the applied stress. Fig.16a depicts the wire bond lift-off phenomenon, while Fig.16b displays the footprint of the wire bond after lift-off. Additionally, it is possible to observe the traces of the wire bond's solder placed on its sides rather than in the central part.

To mitigate the impact of wire bonding lift-off, a series of polymer layers called "coating layers" are used, which mix with the wire bond during the application of the ultrasonic bonding technique to connect to the bond pad of the silicon chip [36].



Fig. 16: (a) Wire bond lift-off (SEM image, 40x). (b) Footprint of wire bond after lift-off (SEM image, 100x) [36].

#### 2.3.3 Heel Cracking

This type of failure effect is less common compared to wire bond lift-off and is primarily caused by an incorrect ultrasonic bonding process. It is triggered by thermo-mechanical stress, specifically related to the contraction and expansion of the wire bond in the presence of temperature gradients. The viscosity of the silicone gel used for device encapsulation can also influence this phenomenon.

Furthermore, during the wire bond displacement phase, where the degradation mechanism is accelerated, this phenomenon can be observed in the section of the wire bond connected to the pins. Fig.17 illustrates an example of heel cracking [36].

In Fig.18, it can be observed how the crack forms at the site of a defect during the ultrasonic bonding phase. This type of failure effect is particularly evident when the self-heating effect is stronger and non-uniform, which typically occurs when there are two wire bonds for the emitter lead [38].

#### 2.3.4 Aluminum Reconstruction

This phenomenon is often a secondary effect related to wire bond lift-off. It is caused by thermo-mechanical stress as well, which, due to the expansion and contraction of the wire bond, leads to the formation of granules in the metallic interface layer between the wire bond and the silicon chip. This is due to the significant difference in CTE between silicon and aluminum. The roughness of the silicon chip layer has a certain impact on this type of phenomenon, as it causes the stress to exceed the elastic limit and become plastic.

The phenomenon manifests itself through the formation of granules, cracks, or plastic deformations (see Fig.19). The specific manifestation depends on the cyclic nature of the stress and its impact. The type of metallization layer texture also determines the size of the granules. This phenomenon tends to occur in the areas of the layer near the center of the silicon chip, where higher temperatures are reached, while it is less significant at the edges. This is also why the phenomenon is related to wire bond lift-off: when one of the contacts detaches, the remaining contact has to bear the entire current flow, leading to increased temperature and plastic stress levels in the area of the pad where the contact is still connected.

Fig. 17: Heel cracking in a double wire bond [36].

Fig. 18: Wire bond heel crack due to improper bonding process (SEM image, 25x) [36].

This effect contributes to an increase in the resistance offered by the device, as it increases the electrical resistance of the metallization layer. Among the technological solutions implemented to mitigate this effect is the use of a "compressive overlayer" that reduces the increase in electrical resistance of the metallization layer subjected to high temperature. Fig.20 shows a section of the layer without the compressive overlayer next to one with it, highlighting how the effect is significantly mitigated [36].

#### 2.3.5 Solder Joint Fatigue

The material used for soldering the silicon chip to the baseplate (in the case of a discrete package) or to the DCB substrate (in a power module) is typically composed of tin, indium, or tin-lead alloys. These alloys exhibit excellent thermoelectric properties with melting points reaching 185°C, and their self-heating effect is negligible in terms of thermal characteristics. However, when they come into contact with other materials such as copper (of which the baseplate is composed, for example), it is necessary to use combinations of copper and tin, such as the $CuSn_6$ alloy, to mitigate the differences in CTEs. In the case of tin-lead solders, their actual composition consists of a tin-enriched phase and a lead-enriched phase. These two phases tend to expand during device power cycling or thermal cycling, leading to fractures near the copper layer, particularly in the areas adjacent to the ceramic substrate. Usually, to activate these mechanisms, the size of the system involved needs to be considered. For instance, in discrete devices, the duration of the stress has a greater influence than the effect of the temperature gradient in causing material degradation.

**Fig. 19**: (a) Emitter metallization of an IGBT chip before power cycling test (SEM image, 1000x). (b) Reconstruction phenomenon in the same emitter metallization after power cycling stress (SEM image, 1000x) [36].

**Fig. 20**: Reconstructed emitter metallization after removal of the polyimide passivation (compressive overlayer) in the left part of the metallization. In the right part, the mitigation effect due to the compressive overlayer can be observed (SEM image, 800x) [36].

Fig.21a shows the impact of solder failure after a power cycling test, and Fig.21b after a thermal cycling test. During thermal cycling, the failure is observed at the edges of the solder joints, as these are the points where passive thermal stress is most pronounced. Conversely, under power cycling stress, where only the chip heats up, solder failure is found in the regions where the chip is located, typically in the central areas [39–42].

Another potential cause of solder failure is the presence of voids within the solder joints resulting from production processes (see Fig.22). Generally, the presence of large voids creates highly localized and steep temperature gradients, leading to expansion of the solder and the formation of cracks. These voids also contribute to increased temperatures on the silicon chip. To address this issue, efforts are made to minimize the occurrence of large voids by promoting the formation of smaller voids, thereby reducing the localized temperature gradients.



**Fig. 21**: (a) Damage is developed below active chips of a module in the substrate solder joint after power cycling stress [39]. (b) Damage is developed at the corners of the substrate solder as a result of passive thermal cycling in the module [39].



Fig. 22: Voids in the solder between ceramic substrate and base plate (SEM image, 100x) [42].

| Category                           | Examples                                           | Advantages                             | Disadvantages                                                                 |
|------------------------------------|----------------------------------------------------|----------------------------------------|-------------------------------------------------------------------------------|
| Physical<br>method<br>(direct)     | Thermocouple<br>Thermal probes<br>Liquid crystal   | Temperature map<br>Spatial resolution  | Low response time<br>Package opening<br>Physical contact<br>Noise measurement |
| Optical<br>method<br>(direct)      | IR radiation<br>Spectroscopy<br>Thermo reflectance | Temperature map<br>Spatial resolution  | Package opening<br>Expensive                                                  |
| Electrical<br>method<br>(indirect) | PN-junction<br>Gate threshold<br>On-resistance     | Small time constant<br>No contact need | Average temperature                                                           |

**Tab. 1**: Comparison of the three common methods for junction temperature measurement [6].

#### 2.4 METHODS FOR ESTIMATING JUNCTION TEMPERATURE

Measuring or estimating the temperature of devices during power cycling tests is of paramount importance as it allows defining critical operating points used to identify failure phenomena, replicate operating conditions for qualification tests (e.g., JEDEC standards), and develop models that consider the actual impact of temperature. These models are essential for providing a reliable prognosis of the power device's lifetime.

The methods for measuring and estimating temperature can be essentially divided into two categories: direct and indirect.

#### 2.4.1 Direct Mode

In the case of direct methods, the two main approaches are physical and optical, each with its positive and negative aspects as reported in Tab.1.

#### 2.4.1.1 *Physical Methods*

If direct access to the silicon chip is possible, a way to measure the junction temperature is by using temperature sensors such as thermocouples and thermistors. The spatial resolution of this approach varies from sub-micrometer to millimeter scales, based on the number and dimensions of the probes utilized. The use of liquid crystals, commonly employed to identify hot spots in power devices, is implemented in this method to enable the extraction of temperature maps [6]. The accuracy of the measurement depends on the sensor's capabilities, especially in terms of time response related to its thermal capacity. In general, these approaches are highly accurate, even though the response times are on the order of 5 ms [6]. However, this method requires the chip to be exposed.

#### 2.4.1.2 Optical Methods

An optical beam is directed towards the region where the silicon chip is located. This beam is reflected, and by measuring the energy of the reflected beam, indications of the junction temperature can be obtained. These methods include the use of IR sensors, microscopes or infrared cameras, and optical fibers [43–48]. Measurements obtained with these methods are sufficiently precise: for example, with an IR camera, spatial resolutions of a few tens of micrometers with temporal resolutions of tens of milliseconds can be achieved [47]. Their drawback is that they require the removal of the ceramic case and protective gel to detect the junction temperature.

Both direct methods require chip exposure, which is rarely practical; for this reason, indirect methods are preferred [43].

#### 2.4.2 Indirect Mode

The indirect method, i.e. the electrical method (see Tab.1), is based on equivalent circuits or on intrinsic electrical properties of the semiconductor [48]. The most well-known and widely used indirect method relies on the intrinsic electrical properties of silicon, such as the intrinsic carrier concentration and the mobility of charge carriers, which are influenced by temperature. For example, the on-voltage depends on the junction temperature, and the use of such a parameter is referred to as the TSEP method [43, 49–52]. This method provides much faster response times than the two direct methods described above. However, the estimated temperature must be considered an average value over the chip area, as the method cannot distinguish the points of maximum temperature, such as the center of the chip, from the corners. This implies that the measurement requires a certain time for the chip to effectively reach a temperature close to the estimated average temperature, typically in the order of several hundred milliseconds [53].

Some studies utilize the on-voltage under high current conditions as the TSEP method [54–56] because it ensures good measurement sensitivity. However, this approach is not advantageous when the currents are sufficiently high to significantly influence $R_{on}$, considering its temperature dependence (as in MOSFET devices), resulting in variations in the on-voltage. Consequently, significant errors occur in estimating the junction temperature. Additionally, this approach is not recommended in cases where accelerated testing is necessary and wire bond degradation occurs. Therefore, when the on-voltage is used as TSEP, it is primarily measured at a very low current. In this case, the monitored voltage is the one across the pn junction and is not affected by voltage variations due to $R_{on}$ (and thus mobility). Further details on this method are provided in the following section.

#### 2.4.2.1 PN-junction Voltage

Considering a basic pn junction structure, its forward voltage is closely related to temperature. This voltage is often used as a "temperature sensor" for silicon chip devices like bipolar transistors, MOSFETs, and IGBTs, as well as for GITs with p-AlGaN, where the pn junction exists between the gate and source; examples are reported in [47, 57–64]. This measurement is performed under very low current conditions, so that the voltage drop can be attributed solely to the effect of the pn junction, without being influenced by the base region, channel resistance, or ohmic contacts. Assuming an ideal pn junction, this voltage is given by the following expression [65]:

$$V_{j}(T_{j}) = \frac{nk_{b}T_{j}}{q} \ln\left(\frac{I_{d}}{I_{d,s}} + 1\right) \tag{1}$$

where $V_j$ is the junction voltage, q is the elementary charge, n is the ideality factor of the pn junction, $k_b$ is the Boltzmann constant, and $T_j$ is the junction temperature. $I_d$ and $I_{d,s}$ represent the current density and the reverse saturation current density, respectively. The expression of $I_{d,s}$ is as follows:

$$I_{d,s} = qn_i^2 \left( \frac{D_p}{L_p N_D} + \frac{D_n}{L_n N_A} \right) \tag{2}$$

where $D_n$ ($D_p$) is the diffusion coefficient of electrons (holes), $L_n$ ($L_p$) is the diffusion length of electrons (holes), and $N_D$, $N_A$ and $n_i$ are the donor, acceptor and intrinsic concentrations, respectively.
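To make the temperature dependence of (1) concrete, the following sketch evaluates the junction voltage at a fixed sensing current for two temperatures. It is a minimal illustration under stated assumptions: the saturation current is scaled with temperature as $n_i^2 \propto T^3 e^{-E_g/(k_b T)}$ while the temperature dependence of the diffusion terms in (2) is neglected, currents are used instead of current densities (the ratio is the same for a given area), and the reference saturation current, sensing current and function names are hypothetical.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
Q_E = 1.602176634e-19   # elementary charge, C
E_G_SI_EV = 1.12        # silicon band gap in eV (assumed constant with temperature)


def saturation_current(t_j: float, i_s_ref: float = 1e-12, t_ref: float = 300.0) -> float:
    """Scale a reference saturation current to t_j (K), assuming I_s ~ T^3 exp(-Eg/kT)."""
    k_b_ev = K_B / Q_E  # Boltzmann constant in eV/K
    exponent = -(E_G_SI_EV / k_b_ev) * (1.0 / t_j - 1.0 / t_ref)
    return i_s_ref * (t_j / t_ref) ** 3 * math.exp(exponent)


def junction_voltage(t_j: float, i_sense: float = 1e-3, n: float = 1.0) -> float:
    """Eq. (1): V_j = (n k_b T_j / q) ln(I / I_s + 1), with T_j in kelvin."""
    return n * K_B * t_j / Q_E * math.log(i_sense / saturation_current(t_j) + 1.0)


if __name__ == "__main__":
    v_cold, v_hot = junction_voltage(300.0), junction_voltage(400.0)
    slope_mv_per_k = (v_hot - v_cold) / 100.0 * 1e3
    print(f"V_j(300 K) = {v_cold:.3f} V, V_j(400 K) = {v_hot:.3f} V")
    print(f"average slope = {slope_mv_per_k:.2f} mV/K")
```

With these assumed values the voltage decreases with temperature by roughly 2 mV/K, because the growth of the saturation current outweighs the linear $k_b T_j / q$ prefactor.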

To establish the empirical or mathematical relationship between  $T_j$  and forward voltage ( $V_F$ ), the following steps are followed:

- The device is turned on, by applying a gate voltage larger than the threshold voltage.
- A very small sensing current is set, typically 1/1000 of the rated current of the device, to ensure that the current in the device is negligible in terms of self-heating effects [20].
- The device is externally heated, for example in an oven, and the temperature of the case is measured using a thermocouple, assuming that the case temperature corresponds to that of the silicon chip.

The $T_j$-$V_F$ relationship is expressed through a linear dependence as in (3), where the slope coefficient is negative in the case of an IGBT device (see Fig.23):

$$T_j = m \cdot V_F \tag{3}$$

Generally, the measurement sensitivity is around $-2\,mV/^{\circ}C$. Additionally, the sensing current should not be excessively small, as this would worsen the resolution error in the $V_{ce}$ measurement (see the case for 100 µA in Fig.23).
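The calibration procedure above can be sketched as a least-squares line fit. The example below is illustrative only: the calibration points are synthetic values chosen to give about $-2\,mV/^{\circ}C$, not measurements from this work, and the fit includes an offset term in addition to the slope of (3), which is an assumption about how such calibration lines are fitted in practice. The function names are hypothetical.

```python
def fit_line(x: list[float], y: list[float]) -> tuple[float, float]:
    """Ordinary least-squares fit y = m * x + c; returns (m, c)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    m = s_xy / s_xx
    return m, mean_y - m * mean_x


# Hypothetical calibration points: oven temperature (degC) and measured V_F (V)
# at a fixed sensing current. Synthetic values, roughly -2 mV/degC.
T_CAL = [25.0, 50.0, 75.0, 100.0, 125.0, 150.0]
V_CAL = [0.550, 0.499, 0.451, 0.400, 0.349, 0.301]

# Fit T_j as a function of V_F so the result can be used directly as a lookup.
M_SLOPE, C_OFFSET = fit_line(V_CAL, T_CAL)


def estimate_tj(v_f: float) -> float:
    """Convert a sensed forward voltage (V) into an estimated junction temperature (degC)."""
    return M_SLOPE * v_f + C_OFFSET


if __name__ == "__main__":
    print(f"slope m = {M_SLOPE:.1f} degC/V, offset = {C_OFFSET:.1f} degC")
    print(f"T_j at V_F = 0.425 V: {estimate_tj(0.425):.1f} degC")
```

Fitting $T_j$ against $V_F$ (rather than the reverse) gives the conversion needed during a test directly, at the cost of treating the voltage as the error-free variable.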



Fig. 23: Calibration curves  $V_{ce}$ -T<sub>j</sub> for IGBT for different level of sensing current [6].

#### 2.4.2.2 Gate Threshold Voltage

Another possible indirect method is to find the experimental relationship between threshold voltage ( $V_{th}$ ) and ( $T_j$ ), where, for the IGBT, the mathematical expression showing the temperature dependence is given as:

$$V_{th}(T_j) = \frac{k_b T_j}{q} \ln\left(\frac{N_A}{N_D}\right) - \frac{Q_{ox}}{C_{ox}} + \frac{\sqrt{4 \cdot \epsilon_{si} \cdot N_A \cdot k_b}}{C_{ox}} \cdot \sqrt{\ln\left(\frac{N_A}{n_i}\right) \cdot T_j} \tag{4}$$

In (4), $Q_{ox}$ is the effective gate oxide charge density, $\epsilon_{si}$ is the permittivity of silicon, and $C_{ox}$ is the oxide capacitance [56].

Typically, the measurement circuit is implemented as shown in Fig.24. The gate-emitter ( $V_{ge}$ ) and collector-emitter ( $V_{ce}$ ) voltages are set equal, a fixed current  $I_m$  is applied and the corresponding voltage  $V_{th}$  is measured. This method is applied in the same manner for both MOSFETs and IGBTs. A typical  $V_{th}$  vs  $T_j$  relationship is shown in Fig.25.

From Fig.25, it can be observed that the dependence of  $V_{th}$  on the temperature is nonlinear. This type of measurement exhibits better sensitivity compared to the  $V_{ce}$  vs  $T_j$  dependence, typically around 10-15 mV/°C. This is likely due to the presence of the oxide layer relative to the pn junction, resulting in different impurity quantities, thus improving the measurement sensitivity [60].

In power cycling tests, it is crucial to keep the parameters used as TSEP constant during the experiment. Consequently, the  $V_{on}(T_j)$  method under low measuring current conditions is preferred over  $V_{th}(T_j)$ . This preference stems from the fact that during the aging process, the value of  $V_{th}$  may experience variations due to gate oxide degradation caused by charge trapping issues.

This phenomenon has been observed in IGBT devices but not in MOSFETs, as demonstrated in [37]. However, this limits the feasibility of using $V_{th}$ as TSEP in power cycling tests. Furthermore, an additional limitation of this method is its difficulty in implementation on a device used in a real converter. In SiC technology, this method is not preferred due to the trapping effects occurring in the gate oxide, which capture some charge carriers and cause a shift in $V_{th}$ [62, 66, 67].

Fig. 24: Circuit for calibration curves V<sub>th</sub>-T<sub>j</sub> for IGBT [6].

Fig. 25: Calibration curves $V_{th}$-T<sub>j</sub> for an IGBT [6].

#### 2.5 TECHNIQUES FOR PERFORMING POWER CYCLING TESTS

As discussed in Sec.2.1, accelerated stress tests are conducted in laboratories to analyze failure phenomena, evaluate new materials and/or packaging designs, identify their weaknesses, and seek improvements [11, 16].

The strategy of using accelerated tests is based on the fact that under standard operating conditions (excluding the region of premature failure shown in Fig.6), a device takes years to reach a failure event, making the aforementioned objectives impractical.

There are two possible methods for accelerating tests:

- Thermal cycling;
- Power cycling.

In thermal cycling tests, the device is passively heated, usually using an oven. This approach ensures accurate estimates of the junction temperature but is relatively slow due to the sluggishness of the thermal cycling system.

A faster way to conduct such accelerated tests is through power cycling, which involves active heating cycles of the device. Although this electrical-based process does not provide as precise an estimation of temperature as the previous method, it is significantly faster, even in executing individual stress cycles. Moreover, power cycling tests approximate the device's behavior under operational conditions more effectively [68].

In these tests, the parameters that influence the device's reliability from a thermo-mechanical stress perspective are increased compared to those in standard operating conditions. This necessitates the creation of models, whether empirical or simulation-based, that are fitted to these accelerated parameters and provide lifetime estimates when given the stress parameters under standard operating conditions as input.

Three main methods are used to perform power cycling tests:

- Constant current;
- Constant ΔT<sub>j</sub>;
- Constant power.

#### 2.5.1 Constant Current

The constant current method is recommended as the standard power cycling method by AQG 324 and IEC 60749-34. It is also referred to as the DC-power cycling test [69–72]. In this type of accelerated test, a high DC current is applied to stress the component. This current is injected into the device for a defined interval of time (heating time/on-time), representing a percentage of the cycle period. As a result, the power dissipated in the device leads to a junction temperature ($T_j$) increase due to the Joule effect, resulting in thermo-mechanical stress.

The remaining part of the cycle period is defined as the cooling time. During the cooling time, the high current is stopped, and a small sensing current is injected into the device in order to acquire the voltage adopted as TSEP. The stress parameters associated with power cycling that can be controlled to calibrate the experiment are:

- DC current.
- Heating time, cooling time.
- Case temperature.



Fig. 26: Current, voltage and junction temperature trends during a standard power cycling test.

In Fig.26, the high current value ($I_{dc}$) is chosen to reach the desired maximum junction temperature ($T_{j,max}$). However, it is important to note that the on-time ($t_{on}$) must be sufficiently long to ensure that $T_{j,max}$ is evenly distributed across the entire chip, as discussed in Sec.2.4.2.1. This requires a minimum $t_{on}$ of several hundreds of milliseconds [53].
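The link between the controllable stress parameters (dissipated power, heating time, cooling time) and the resulting junction temperature excursion can be illustrated with a lumped single-pole thermal model. This is a deliberate simplification introduced here for illustration: a real device has a multi-pole thermal impedance, and the model says nothing about temperature uniformity across the chip. All parameter values and the function name are hypothetical.

```python
import math


def steady_state_swing(p_d: float, r_th: float, tau: float,
                       t_on: float, t_off: float) -> tuple[float, float]:
    """Return (T_j,max - T_c, T_j,min - T_c) in periodic steady state.

    Single-pole model: heating relaxes towards p_d * r_th, cooling towards zero,
    both with time constant tau. T_c is the (constant) case temperature.
    """
    a = math.exp(-t_on / tau)    # fraction of the initial offset left after heating
    b = math.exp(-t_off / tau)   # fraction of the peak offset left after cooling
    rise_max = p_d * r_th * (1.0 - a) / (1.0 - a * b)
    rise_min = rise_max * b
    return rise_max, rise_min


if __name__ == "__main__":
    # Assumed values: 200 W dissipated, 0.5 K/W, 0.5 s thermal time constant.
    for t in (0.1, 0.5, 2.0, 5.0):
        rise_max, rise_min = steady_state_swing(200.0, 0.5, 0.5, t, t)
        print(f"t_on = t_off = {t:3.1f} s -> dT_j = {rise_max - rise_min:6.2f} K")
```

In this model, short on and off times yield only a small fraction of the full swing $P_d \cdot R_{th}$, which is one way to see why the cycle timing has to be chosen together with the current.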

During the cooling phase ($t_{off}$), the first voltage value measured with the low current $I_{ref}$ in stable conditions (i.e., once the electrical transients have died out, typically after a time interval of hundreds of microseconds) retains the information of the recently heated device and therefore corresponds to $T_{j,max}$ (assuming that the thermal time constant is greater than hundreds of microseconds). On the other hand, to estimate the minimum junction temperature ($T_{j,min}$), which can be controlled through the heat sink, the voltage is sampled at the last point before the onset of the next on-phase (or heating time) [6]. This type of test is considered the fastest and most critical [73]. Through power cycling, it is possible to induce both of the main failure mechanisms in devices, as reported in [39, 74–79]. Specifically, during the heating time, the on-voltage is used to monitor the SoH of the wire bonds, while the junction-to-case temperature difference ($\Delta T_{jc}$) and the dissipated power ($P_d$) are used to monitor the junction-to-case thermal resistance ($Z_{th,jc} = \Delta T_{jc}/P_d$) and thus the condition of the solder joints inside the component. In general, increases of 5% in $V_{on}$ and 20% in $Z_{th,jc}$ with respect to their initial values are set as the EoL condition [20].
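The end-of-life criteria described above can be expressed as a simple check over logged cycles. The sketch below is a minimal illustration, assuming that each logged cycle provides the on-voltage, the junction-to-case temperature difference and the dissipated power, and that the first record is the reference. The record layout and function name are hypothetical; only the 5% and 20% thresholds come from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CycleRecord:
    cycle: int
    v_on: float        # on-voltage during heating, V
    delta_t_jc: float  # junction-to-case temperature difference, K
    p_d: float         # dissipated power, W


def first_eol_cycle(records: list[CycleRecord],
                    v_on_limit: float = 0.05,
                    z_th_limit: float = 0.20) -> Optional[tuple[int, str]]:
    """Return (cycle, reason) for the first record exceeding an EoL threshold, else None.

    The first record is taken as the initial (reference) state of the device.
    """
    v_on_0 = records[0].v_on
    z_th_0 = records[0].delta_t_jc / records[0].p_d
    for rec in records:
        z_th = rec.delta_t_jc / rec.p_d
        if rec.v_on >= v_on_0 * (1.0 + v_on_limit):
            return rec.cycle, "wire bond (V_on)"
        if z_th >= z_th_0 * (1.0 + z_th_limit):
            return rec.cycle, "solder joint (Z_th,jc)"
    return None


if __name__ == "__main__":
    log = [
        CycleRecord(0, 2.00, 50.0, 100.0),
        CycleRecord(1000, 2.04, 52.0, 100.0),
        CycleRecord(2000, 2.11, 55.0, 100.0),
    ]
    print(first_eol_cycle(log))
```

Returning the reason alongside the cycle count keeps the two failure mechanisms distinguishable when the results are later used to fit a lifetime model.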

#### 2.5.2 Constant $\Delta T_{j}$

The constant  $\Delta T_j$  test is considered less critical compared to the conventional test, as it limits the most significant stress parameter, namely  $\Delta T_j$ , thereby eliminating the nonlinear effects caused by self-heating and temperature increase when the power device approaches its EoL [80, 81].

A possible method to keep  $\Delta T_j$  constant is the online control of the DC current (I<sub>dc</sub>) like in [81] (see Fig.27a). Alternatively, it is possible to change, during the experiment, the heating time (t<sub>on</sub>) or the gate voltage.

A controller, such as the PID controller in [80], can be used to regulate the device's gate voltage during the heating phase, ensuring that any measured $\Delta T_j$ variations are compensated in the subsequent cycle (an example is shown in Fig.27b). If the stress induces wire bond degradation, this method requires interrupting the control and cyclically evaluating the on-voltage under standard conditions (as indicated in Sec.2.5.1).

To maintain a constant $\Delta T_j$ during the test, another possibility is to manage the heating time (t<sub>on</sub>). In this case, the current parameter remains constant, as well as the gate voltage parameter, allowing the measurement of the on-voltage (V<sub>on</sub>) as a parameter of SoH using the same reference (with no need to periodically interrupt the test and measure under standard conditions) [72].
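The cycle-by-cycle regulation idea can be sketched with a toy simulation in which the DC current is corrected after each cycle to hold $\Delta T_j$ at a target while the thermal impedance slowly degrades. This is not the controller of [80] or [81]: it uses an integral-only correction on $I_{dc}$, a linear on-state model and a fixed degradation rate, all of which are assumptions made here for illustration, with hypothetical parameter values.

```python
def delta_tj(i_dc: float, z_th: float, v0: float = 1.0, r_on: float = 0.01) -> float:
    """Toy plant: dT_j = Z_th * P, with P = (v0 + r_on * I) * I (assumed on-state model)."""
    return z_th * (v0 + r_on * i_dc) * i_dc


def run_constant_dtj(target: float = 100.0, cycles: int = 200,
                     k_i: float = 0.3, z_th_growth: float = 0.001) -> list[tuple[float, float]]:
    """Adjust I_dc once per cycle with an integral-type law; return (I_dc, dT_j) per cycle."""
    i_dc, z_th = 100.0, 0.5  # initial current (A) and thermal impedance (K/W)
    history = []
    for _ in range(cycles):
        measured = delta_tj(i_dc, z_th)
        history.append((i_dc, measured))
        i_dc += k_i * (target - measured)   # correction applied to the next cycle
        z_th *= 1.0 + z_th_growth           # emulate gradual solder degradation
    return history


if __name__ == "__main__":
    hist = run_constant_dtj()
    print(f"first cycle: I_dc = {hist[0][0]:.1f} A, dT_j = {hist[0][1]:.2f} K")
    print(f"last cycle:  I_dc = {hist[-1][0]:.1f} A, dT_j = {hist[-1][1]:.2f} K")
```

In this toy setting the current is progressively reduced as the thermal impedance grows, while $\Delta T_j$ stays close to the target with a small lag caused by the ongoing degradation.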

#### 2.5.3 Constant Power

In constant power tests, it is necessary to maintain a constant power dissipation on the device during the experiment. This can be achieved by controlling either the stress current or the on-voltage [72, 81]. Even in this case, if the gate voltage or the current are changed during the on-phase, it is necessary to periodically interrupt the experiment to measure SoH of the wire bonds through the on-voltage under reference conditions. However, if the stress induces solder degradation only, the test does not require cyclic interruption as shown in the example of Fig.27c.
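As a small worked example of controlling the stress current to hold the dissipated power constant, the sketch below solves for the current that yields a target power when the on-state voltage is modeled as $V_{on} = V_0 + R_{on} I$. This linear on-state model and all numerical values are assumptions made for illustration, and the function name is hypothetical.

```python
import math


def current_for_power(p_target: float, v0: float, r_on: float) -> float:
    """Solve (v0 + r_on * I) * I = p_target for the positive root I (in A)."""
    if r_on == 0.0:
        return p_target / v0
    return (-v0 + math.sqrt(v0 * v0 + 4.0 * r_on * p_target)) / (2.0 * r_on)


if __name__ == "__main__":
    # As the on-state resistance rises with degradation, the current needed
    # to keep 200 W dissipated decreases.
    for r_on in (0.010, 0.011, 0.012):
        i_dc = current_for_power(200.0, 1.0, r_on)
        print(f"R_on = {r_on * 1e3:4.1f} mOhm -> I_dc = {i_dc:6.2f} A")
```

Because the current changes during such a test, the on-voltage is no longer measured at a fixed operating point, which is consistent with the need for periodic reference measurements noted above.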



**Fig. 27**: (a) On-voltage, cycling current, and $\Delta T_j$ trends of three power cycling tests (red, blue, and green curves) in constant $\Delta T_j$ stress condition via $I_{dc}$ control [81]. (b) Gate voltage regulation over a long period to maintain constant $\Delta T_j$ [80]. (c) Parameter trends of an IGBT module during a constant power cycling test by means of gate-emitter voltage control [72].

#### 2.6 LIFETIME MODELS

Lifetime models are crucial in studying the reliability of power systems. As mentioned earlier, one of the objectives of reliability is to predict the LC or RUL of a power system under specific mission profiles. One of the key steps to achieve this goal is to extract the device's lifetime through an equation. Generally, there are two possible approaches:

- Model-driven;
- Data-driven.

#### 2.6.1 Model-Driven

These models determine the component's lifetime by defining a static equation. They are typically categorized as physically-based models [82–85] and empirical models [51, 86–93].

Physically-based models calculate the lifetime based on information related to the physical and mechanical structure of the device. They rely on parameters such as deformation intensity and crack depth, which are obtained through stress tests or FEM simulations. However, developing a physical model that accurately predicts lifetime is a complex task, as it requires integrating electrical, thermal, and mechanical phenomena. These models have certain limitations and their accuracy is dependent on their validity range.

Empirical models, on the other hand, are more prevalent. They rely on well-fitted experimental data that consider thermal and electrical aspects, although they lack information about the physical aspects of failure.

One of the early empirical lifetime models developed was the Coffin-Manson model (5) [86, 87]. This model formulated a power law to describe the relationship between  $\Delta T_j$  and lifetime in terms of the number of applied stress cycles (N<sub>f</sub>):

$$N_{f} = K \cdot \Delta T_{j}^{\alpha} \tag{5}$$

where K and $\alpha$ are fitting parameters whose values depend on the technology of the specific device type for which the model has been extrapolated. The value of $\alpha$, which determines the slope of the power law, is negative, reflecting the fact that as the $\Delta T_j$ stress increases (the most influential parameter on the lifetime regarding power cycling [94]), the average lifetime of the device decreases.
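Since (5) is a power law, its parameters can be obtained from a straight-line fit in log-log space, $\ln N_f = \ln K + \alpha \ln \Delta T_j$. The sketch below illustrates this under stated assumptions: the data points are synthetic, generated from hypothetical parameters ($K = 10^{14}$, $\alpha = -5$) that are not results of this work, and the function names are hypothetical.

```python
import math


def coffin_manson(delta_tj: float, k: float, alpha: float) -> float:
    """Eq. (5): N_f = K * dT_j^alpha."""
    return k * delta_tj ** alpha


def fit_coffin_manson(delta_tj: list[float], n_f: list[float]) -> tuple[float, float]:
    """Least-squares fit of ln(N_f) = ln(K) + alpha * ln(dT_j); returns (K, alpha)."""
    x = [math.log(v) for v in delta_tj]
    y = [math.log(v) for v in n_f]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    alpha = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return math.exp(my - alpha * mx), alpha


if __name__ == "__main__":
    # Synthetic points generated from hypothetical parameters (K = 1e14, alpha = -5).
    swings = [60.0, 80.0, 100.0, 120.0]
    cycles = [coffin_manson(dt, 1e14, -5.0) for dt in swings]
    k_fit, alpha_fit = fit_coffin_manson(swings, cycles)
    print(f"K = {k_fit:.3e}, alpha = {alpha_fit:.3f}")
```

With noise-free synthetic data the fit recovers the generating parameters; with real test results, the scatter in $\ln N_f$ determines how well $\alpha$ is constrained.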

Subsequently, additional analysis was performed to investigate the interdependencies of various stress parameters on the lifetime of the IGBT module, as discussed in [51]. Observing the dependence of lifetime on the average junction temperature ($T_{j,av}$), the Arrhenius term was introduced (6), taking into account the impact of the chemical reaction rate through the activation energy (E<sub>a</sub>) and the Boltzmann constant (k<sub>b</sub>):

$$N_{f} = K \cdot \Delta T_{j}^{\alpha} \cdot e^{\frac{E_{a}}{k_{b} T_{j,av}}} \tag{6}$$
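Because K cancels in a ratio, (6) can be used to compare two stress conditions directly. The sketch below computes such an acceleration factor between a test condition and a milder field condition. The values of $\alpha$ and E<sub>a</sub>, and both operating points, are assumptions for illustration; the absolute temperature in kelvin is used in the Arrhenius term.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def lifetime_cm_arrhenius(k, alpha, e_a_ev, delta_tj, tj_av_c):
    """Eq. (6): Coffin-Manson term multiplied by the Arrhenius term."""
    tj_av_k = tj_av_c + 273.15  # convert degrees Celsius to kelvin
    return k * delta_tj ** alpha * math.exp(e_a_ev / (K_B_EV * tj_av_k))


def acceleration_factor(alpha, e_a_ev, test, field):
    """Ratio N_f(field) / N_f(test); each condition is (dT_j, T_j,av in degC)."""
    n_test = lifetime_cm_arrhenius(1.0, alpha, e_a_ev, *test)
    n_field = lifetime_cm_arrhenius(1.0, alpha, e_a_ev, *field)
    return n_field / n_test


# Hypothetical parameters and operating points (not fitted values).
ALPHA, E_A_EV = -5.0, 0.1
af = acceleration_factor(ALPHA, E_A_EV, test=(140.0, 95.0), field=(40.0, 60.0))
print(f"acceleration factor = {af:.1f}")
```

Such a ratio is only meaningful inside the range in which the model has been validated, which is the concern raised later for low $\Delta T_j$ values.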

In-depth studies and improvements in the packaging structure of power devices, particularly IGBT modules, led to the introduction of a new empirical model known as CIPS08 [88]. This model (7) was the first to include the dependence on the heating time ($t_{on}$), as well as on the geometric parameters (D), the current density per wire (I) and the voltage rating (V) of the power device.

$$N_{f} = K \cdot \Delta T_{j}^{\beta_{1}} \cdot e^{\frac{\beta_{2}}{T_{j,av}}} \cdot t_{on}^{\beta_{3}} \cdot I^{\beta_{4}} \cdot V^{\beta_{5}} \cdot D^{\beta_{6}} \tag{7}$$

The CIPS08 model paved the way for several other models, allowing for the development of even more complex ones (8), as shown in [89]:

$$N_{f} = K \cdot \Delta T_{j}^{\alpha} \cdot e^{\frac{E_{a}}{k_{b} T_{j,av}}} \cdot ar^{\left(\beta_{1} \cdot \Delta T_{j} + \beta_{0}\right)} \cdot \frac{C + t_{on}^{\gamma}}{C + 1} \cdot f_{diode} \tag{8}$$

In this model, which was extrapolated for the IGBT module (SKiM63), additional parameters were introduced, including the aspect ratio (ar) which is the ratio between the height of the wire bond and the distance between two wire bonds, and the derating factor of the anti-parallel diode ( $f_{diode}$ ). The inclusion of this latter parameter in the empirical model allowed for improvements regarding the size of the wire in the anti-parallel diode, which was found to have lower reliability compared to the wire bond on the silicon chip. It should be noted that the implementation of lifetime models is also useful for identifying critical technological aspects that were previously not considered in terms of lifetime.

A first empirical model (9) developed for discrete devices was presented in [90]:

$$N_{f} = K \cdot \Delta T_{j}^{\alpha} \cdot e^{\frac{E_{a}}{k_{b} T_{j,av}}} \cdot I^{\gamma} \tag{9}$$

This model was based on [88] and aimed to consider the impact of current density, temperature variation, and the Arrhenius law. A more comprehensive study covering multiple discrete packages and also taking into account the impact of  $t_{on}$  was reported in [91]. The fitting coefficients were recalculated considering whether the dependence on the Arrhenius law involved  $T_{j,min}$ ,  $T_{j,max}$ , or  $T_{j,av}$ .

It is important to consider that these empirical models are derived from adapting experimental data obtained under accelerated test conditions, which do not correspond to real operational conditions. The potential of these methods lies in extrapolating lifetime predictions to the stress parameters encountered in operational conditions. Therefore, efforts have been made to validate these models for low variations in junction temperature ( $\Delta T_j$ ), or at least close to operational conditions. For example, in [92], an attempt was made to verify the model in [88] for low values of  $\Delta T_j$ , specifically  $\Delta T_j = 30$ °C, demonstrating that the model remains valid as it is still within the plastic deformation zone.

Meanwhile, [93] tested the validity of the model in [88] for  $\Delta T_j$  less than 30°C and proposed an approximation in such cases, introducing an exponential term (10) to compensate for the underestimation error of the model introduced in [88] (see results in Fig.28):

$$N_{f} = K' \cdot \Delta T_{j}^{\left(e^{\frac{\Delta T_{j} - 27.1\,\mathrm{K}}{2.08\,\mathrm{K}}} + \beta_{1}\right)} \tag{10}$$

It should be noted that both models were extrapolated from extremely accelerated power cycling tests with a stress cycle period ( $T_s$ ) of less than 100ms. This condition is necessary to analyze failure mechanisms under such conditions; otherwise, the test duration would be impractical [34].



Fig. 28: Comparison of the CIPS08 model and the CIPS08 model modified according to (10), which fits the experimental data well [93].

#### 2.6.2 Data-Driven

Data-driven models predict the lifetime by analyzing and monitoring over time one of the electrical parameters related to the SoH of a power device. Compared to empirical models, these models require fewer experiments for validation but still rely on power cycling tests for practical validation purposes. The precursor parameters for SoH, such as on-voltage ( $V_{on}$ ) and thermal junction impedance ( $Z_{th}$ ), have been investigated in several literature works [95–97].

These methods are based on equations that define the initial conditions and parameter values of the function, which are then recalibrated to make future predictions of the parameter used to monitor SoH. Predictive algorithms are implemented to achieve this goal. In this context, the use of particle filters (PFs) is gaining popularity, mainly because they can handle the inherent uncertainty in failure phenomena [95]. In [96, 97], PFs were implemented to monitor  $V_{on}$  and make predictions about it to determine the RUL.

The predicted value of  $V_{on}$  ( $V_{(on),pre}$ ) and its measured value ( $V_{(on),act}$ ) at the k-th instant involve the functions f and h, where f represents a predefined nonlinear transition function and h represents a measurement function (intended as y=x) [97]:

$$V_{(on),pre,k} = f(V_{(on),pre,k-1}) + v_{k-1} \tag{11}$$

$$V_{(on),act,k} = h\left(V_{(on),pre,k}\right) + m_k \tag{12}$$

The terms v and m represent the process and measurement noise, respectively, which follow Gaussian distributions according to the central limit theorem. The quantity to be estimated is the posterior PDF, defined as  $p(V_{(on),pre,k}|V_{(on),act,k})$ . At k = 0, the posterior PDF exhibits a constant pattern, since no previous information is available.

By employing an iterative approach, the posterior PDF gradually converges towards a uniform distribution while at the end remains non-Gaussian; its Gaussian approximation is described by an ideal PDF called "importance PDF" [96, 97]. The Particle Filter (PF) algorithm approximates the posterior PDF, using a collection of discrete random sampling points, commonly referred to as particles:

$$p\left(V_{(\text{on}),\text{pre},k}|V_{(\text{on}),\text{act},k}\right) = \sum_{i=1}^{N} w_k^i \delta(V_{(\text{on}),\text{pre},k} - V_{(\text{on}),\text{pre},k}^i) \tag{13}$$

where  $\delta()$  is the Dirac delta function,  $w_k^i$  is the importance weight of the i-th sample at time k, and i is the index of the i-th sample drawn from the posterior PDF [96].

The weight  $w_k^i$  is calculated as the normalized ratio between the posterior PDF and the importance PDF at time k. During the iterative calculation of the weights, many of them tend to approach zero (weight degeneracy). This poses a critical issue for the estimation performance of the PF since calculations are performed on particles with negligible weights.

To address this, a resampling process is conducted, replacing particles with very small weights with copies of particles with higher weights. After resampling, the weight values become equal for all particles and are set to 1/N. Therefore, the posterior PDF after resampling is expressed by (14). The flowchart in Fig.29 illustrates the organization of the PF implementation.

$$p\left(V_{(\text{on}),\text{pre},k}|V_{(\text{on}),\text{act},k}\right) = \frac{1}{N}\sum_{i=1}^{N}\delta(V_{(\text{on}),\text{pre},k} - V_{(\text{on}),\text{pre},k}^{i}) \tag{14}$$
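The predict, update and resample steps of (11)-(14) can be sketched as follows. This is a minimal bootstrap particle filter (the transition model is used as importance PDF) with systematic resampling, which is one common variant and not necessarily the scheme adopted in [96, 97]. The transition function f, the noise levels and the initial $V_{on}$ are hypothetical values chosen for the example.

```python
import math
import random


def gaussian_pdf(x, mean, sigma):
    """Gaussian likelihood of x for the given mean and standard deviation."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def systematic_resample(particles, weights, rng):
    """Draw N particles in proportion to their weights; reset weights to 1/N."""
    n = len(particles)
    offset = rng.random() / n
    resampled, cumulative, j = [], weights[0], 0
    for i in range(n):
        position = offset + i / n
        while position > cumulative and j < n - 1:
            j += 1
            cumulative += weights[j]
        resampled.append(particles[j])
    return resampled, [1.0 / n] * n


def pf_step(particles, weights, measurement, f, sigma_v, sigma_m, rng):
    """One predict/update/resample iteration."""
    n = len(particles)
    # Prediction, eq. (11): propagate each particle and add process noise.
    particles = [f(p) + rng.gauss(0.0, sigma_v) for p in particles]
    # Update, eq. (12) with h(x) = x: weight by the measurement likelihood.
    weights = [w * gaussian_pdf(measurement, p, sigma_m)
               for w, p in zip(weights, particles)]
    total = sum(weights)
    weights = [w / total for w in weights] if total > 0.0 else [1.0 / n] * n
    # Resample only when the effective sample size signals weight degeneracy.
    n_eff = 1.0 / sum(w * w for w in weights)
    if n_eff < 0.5 * n:
        particles, weights = systematic_resample(particles, weights, rng)
    return particles, weights


def run_demo(seed=0, n_particles=500, n_steps=100):
    """Track a slowly drifting V_on from noisy synthetic measurements."""
    rng = random.Random(seed)
    f = lambda v: v * 1.0005        # hypothetical degradation trend per step
    sigma_v, sigma_m = 1e-3, 5e-3   # assumed process / measurement noise [V]
    true_v = 1.50                   # assumed initial on-voltage [V]
    particles = [rng.gauss(true_v, 0.01) for _ in range(n_particles)]
    weights = [1.0 / n_particles] * n_particles
    for _ in range(n_steps):
        true_v = f(true_v)
        measurement = true_v + rng.gauss(0.0, sigma_m)
        particles, weights = pf_step(particles, weights, measurement,
                                     f, sigma_v, sigma_m, rng)
    estimate = sum(w * p for w, p in zip(weights, particles))
    return estimate, true_v, weights


estimate, true_v, final_weights = run_demo()
print(f"estimated V_on = {estimate:.4f} V, true V_on = {true_v:.4f} V")
```

For RUL prediction, the particles would then be propagated forward with f alone until each one crosses the failure threshold, which yields a distribution of remaining cycles rather than a single value.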



Fig. 29: Flow chart of PF algorithm predicting RUL.

## 3.1 INTRODUCTION

This chapter focuses on the design of the set-up employed for power cycling experiments. It provides an in-depth understanding of the various technical details implemented to conduct these tests, as outlined in the theory in Chapter 2, Sec.2.5. Additionally, the chapter presents crucial details of the LabView code used for controlling and managing the experimental setup. Furthermore, it describes the measurement configuration necessary for developing a Temperature Sensitive Electrical Parameters (TSEP) model, in accordance with the description provided in Chapter 2, SubSec.2.4.2.1. This model will subsequently be implemented in the LabView code for power cycling tests, with the aim of estimating the junction temperature ( $T_j$ ) of the Device Under Test (DUT). Finally, the experimental results of the two different power cycling test methods adopted (constant current and constant  $\Delta T_j$ ) are presented, demonstrating the operational effectiveness of both the set-up and the LabView code.

### 3.2 TSEP MODEL CALIBRATION

In the literature, several works such as [60, 98-101] have proposed circuit solutions for the empirical estimation of the V<sub>on</sub>-T<sub>j</sub> model. Among these circuit solutions, the most widely adopted one, which is also used in this work, is schematically depicted in Fig.30.



Fig. 30: A simplified schematic circuit for the calibration of the  $V_{on} - T_j$  curves.

In Fig.30, a DC voltage ( $V_{dc}$ ) is used to drive the power device and put it in the on state ( $V_{dc}$ > $V_{th}$ ). An SMU is used to inject a low current ( $I_m$ ) and measure the voltage across the DUT, while the device is passively heated to specific temperatures using a suitable system. For the purpose of this work, the set-up that allows for the extraction of the  $V_{on}$ - $T_j$  model is schematized in Fig.31 and depicted in Fig.32. It consists of:

- Two SMUs: Keithley 2651A and Keithley 2450;
- One power device (DUT);
- An oven with internal temperature control.

In Fig.31, the Keithley 2651A SMU is used to provide the bias voltage to the power device ( $V_{dc}$  in Fig.30), delivering an output voltage of 15V. The Keithley 2450 SMU supplies the measurement current to be injected into the power device ( $I_m$  in Fig.30) and measures the on-voltage across its terminals.

SMU2 (the Keithley 2450) is used as the measurement instrument because of its superior measurement sensitivity compared to SMU1 (the Keithley 2651A, see Tab.2). The oven heats the power device by means of a PID controller. Its internal temperature sensor, a Class A PT100 with a tolerance of ±0.15°C, makes it possible to bring the device precisely to a specific temperature. The DUT is an IGBT in a discrete TO-247 package, whose technical specifications are listed in Tab.3. It is used for all the experiments conducted in this work. The chosen current  $I_m$  to be injected into the device is 50mA. Given its small value, self-heating is negligible and the junction temperature is approximated to be equal to the case temperature of the power device, which corresponds to the temperature detected by the oven's internal temperature sensor. The 4-wire technique is adopted to avoid measurement errors introduced by the resistance of the cables. Additionally, considering the measurement environment, cables with a polymer coating suitable for withstanding ambient temperatures above 180°C have been selected for the connections.



**Fig. 31**: The schematic circuit adopted to calibrate  $V_{on} - T_j$  curves. The DUT (an IGBT device) is biased with two SMUs.

The measurements were carried out at a set of selected temperature points, at each of which the device was heated and its voltage was measured. Since the oven has its own thermal dynamics, a waiting period of a couple of minutes was allowed at each point to ensure temperature homogeneity and to let the voltage measurement stabilize. In total, 14 temperature points were chosen, ranging from  $25^{\circ}$ C to  $150^{\circ}$ C.



**Fig. 32**: A picture of the experimental set-up adopted for  $V_{on} - T_j$  calibration. The DUT is placed in an oven to control its temperature.

| Parameter             | Range     | Unit |
|-----------------------|-----------|------|
| Voltage ranges (SMU1) | 100m - 40 | [V]  |
| Voltage ranges (SMU2) | 20m - 200 | [V]  |
| Current ranges (SMU1) | 100n - 50 | [A]  |
| Current ranges (SMU2) | 10n - 1   | [A]  |

Tab. 2: Voltage and current ranges of the SMUs.

Fig.33 presents the measured  $V_{ce}$ - $T_j$  curves. As mentioned in Chapter 2, SubSubSec.2.4.2.1, for sensing currents around 1/1000 of the DC current of the power device, the linear relationship between  $V_{ce}$  and  $T_j$  expected from theory is verified. Based on these measurements, the value of  $I_m$  (also called  $I_{ref}$ ) closest to 1/1000 of the DC current, namely 50mA, was chosen. Moreover, the linear response  $V_{ce}(T_j)$  obtained at 50mA has, in accordance with theory, a slope of approximately -2mV/°C.
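The calibration described above amounts to fitting a straight line to the measured points and inverting it to obtain $T_j$ from a voltage reading. The following sketch shows this under stated assumptions: the calibration data are synthetic, generated with a slope of -2mV/°C and a hypothetical intercept of 0.60V, and the 14 temperature points are assumed to be evenly spaced between 25°C and 150°C.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope


def tj_from_vce(v_ce, intercept, slope):
    """Invert the linear TSEP model to estimate the junction temperature."""
    return (v_ce - intercept) / slope


# Synthetic calibration points at I_m = 50 mA (illustrative values only).
temps_c = [25.0 + i * 125.0 / 13.0 for i in range(14)]
v_ce = [0.60 - 0.002 * t for t in temps_c]

intercept, slope = fit_line(temps_c, v_ce)
print(f"slope = {slope * 1e3:.2f} mV/degC, intercept = {intercept:.3f} V")
print(f"T_j at V_ce = 0.40 V: {tj_from_vce(0.40, intercept, slope):.1f} degC")
```

With a slope of about -2mV/°C, a voltage error of 1mV corresponds to a temperature error of about 0.5°C, which is why the measurement sensitivity of the SMU matters.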

## 3.3 EXPERIMENTAL SET-UP DESCRIPTION FOR POWER CYCLING TESTS

| Parameter                           | Symbol            | Value | Unit  |
|-------------------------------------|-------------------|-------|-------|
| Voltage rating                      | V<sub>bk</sub>    | 650   | [V]   |
| Pulsed current rating               | I<sub>p</sub>     | 120   | [A]   |
| DC collector current                | I<sub>dc</sub>    | 40    | [A]   |
| Thermal resistance, junction - case | Z<sub>th,jc</sub> | 0.6   | [K/W] |
Tab. 3: Electrical and thermal parameters of TO-247 used as DUT.



**Fig. 33**: The  $V_{ce,on}(T_j)$  curves for different  $I_m$  currents.

| Parameter                           | Symbol            | Value | Unit  |
|-------------------------------------|-------------------|-------|-------|
| Voltage rating                      | V<sub>bk</sub>    | 40    | [V]   |
| Pulsed current rating               | I<sub>p</sub>     | 1500  | [A]   |
| DC collector current                | I<sub>dc</sub>    | 200   | [A]   |
| Thermal resistance, junction - case | Z<sub>th,jc</sub> | 0.41  | [K/W] |

Tab. 4: Electrical and thermal parameters of used switches.

#### 3.3.1 Circuit Diagram and Instruments Used

The power cycling experiments are conducted using a custom-designed board specifically created for this purpose, with a simplified schematic representation shown in Fig.34. Four devices are alternately subjected to equal stress using the same current value, I<sub>dc</sub>, supplied by the EA-PSB 9080 current generator through a multiplexing approach. This approach is achieved by employing electronic switches S<sub>0</sub>-S<sub>3</sub>, whose electrical characteristics are listed in Tab.4.

The FPGA on the CompactRio board, as depicted in Fig.34, is utilized to generate digital control signals for switches S<sub>0</sub>-S<sub>3</sub>. Specifically, these digital signals are sent to the driver section of the switches, as illustrated in Fig.34. The switches are MOSFET devices whose electrical characteristics (see Tab.4) ensure that almost all the power is dissipated in the DUT.

The driver section of switch " $S_x$ " consists of a digital isolator and a MOSFET driver, as shown in Fig.35a. The digital signal output from the FPGA, which is used as an input in the driver section, has a voltage range between 0V and 5V, while the voltage range at the output of the driver section is between 0V and 15V. The signal periodicity is managed through the LabView software.

The driver section of the IGBTs, used as DUTs for power cycling experiments, consists of a negative-feedback amplifier that acts as a voltage buffer (see Fig.35b), ensuring that the DUTs are always in an on-state with a gate voltage of 15V (DC<sub>bias</sub> in Fig.34). In Fig.36, the V<sub>ce</sub> voltage at the device terminals is fed into a differential amplifier with a gain of 3. The output signal from the amplifier is sampled by the ADC of the CompactRio. The acquired data is then processed on the PC using LabView software through an Ethernet connection (see Fig.34).

Fig. 34: Simplified schematic representation of the set-up adopted for power cycling tests.



**Fig. 35**: (a) The driving section of MOSFET used as a switch. (b) The driving section of DUT.



Fig. 36: The circuit utilized to sense V<sub>ce</sub> and acquire it by means of CompactRio ADC.



**Fig. 37**: The current flowing in the DUT (a) is switched between a large value (ON phase) and a small value (OFF phase). The corresponding voltage drop is reported in (b). During the ON phase a large voltage drop is observed and the  $V_{ce,on}$  estimation is taken at the end of this phase. During the OFF phase, a sensing current ( $I_{ref}$ =50mA) is injected in the DUT and the  $V_{ce,off}$  profile is used for the estimation of  $T_j$ . The inset illustrates the temperature profile resulting from the application of the TSEP methodology to the  $V_{ce,off}$  profile.

Fig.37 depicts the typical current and voltage profiles in a DUT. During the heating period (ON phase), a high current  $I_{dc}$  flows through the device, while during the cooling phase (OFF phase), a reference current  $I_{ref}$  of 50 mA is injected into the device (via the circuit shown in Fig.38) to measure  $V_{ce,off}$ .

As shown in Fig.37b, the  $V_{ce,off}$  profile is used to estimate  $T_j$  by adopting the TSEP method. The inset of Fig.37b presents the result of the TSEP methodology, whose model was developed following the steps in Sec.3.2.

The temperature profile starts with a maximum junction temperature  $(T_{j,max})$ , reached at the end of the heating phase, and gradually approaches a minimum value ( $T_{j,min}$ ) at the end of the cooling phase. The temperature swing of the thermal cycle is defined as  $\Delta T_j = T_{j,max} - T_{j,min}$ .

Fig. 38: The schematic of I<sub>ref</sub> generation circuit.

As shown in Fig.34, a bypass section can be implemented to support the  $I_{dc}$  current. Specifically, when a DUT fails due to reaching its EoL condition, the corresponding switch must be permanently opened, and switch S4 is activated to maintain the same  $t_{on}$  and  $t_{off}$  values for the remaining DUTs. Fig.39 illustrates the configuration of the bypass circuit, composed of two identical IGBT devices that serve as protective elements and are not utilized as test components. To ensure this, their DC current rating is twice that of the DUTs. Furthermore, these two IGBT devices are connected in parallel to evenly divide the current, thereby further reducing the potential impact of stress on them.



Fig. 39: The bypass circuit implemented for the custom board.

Fig.40 displays an image of the experimental set-up. The custom board, containing the DUTs, is mounted on a liquid-cooled thermal plate. A thermal controller, Julabo Presto A40 (also depicted in the schematic diagram of Fig.34), is used to control the temperature of the thermal plate. The DUTs are positioned on the backside of the custom board and are in direct contact with the thermal plate.

**Fig. 40**: Picture of the set-up. The test circuit is placed on a liquid-cooled thermal plate, whose temperature is fixed by means of a temperature controller.

The average junction temperature can be estimated using the formula:

$$T_{j,av} = T_{ref} + R_{th,jh} \cdot P_{av} \tag{15}$$

Where  $P_{av}$  is the average power dissipated in the DUTs,  $R_{th,jh}$  is the thermal resistance between the DUT junction and the thermal plate, and  $T_{ref}$  is the temperature of the thermal plate set by the temperature controller. Therefore, for each experiment, the value of  $T_{ref}$  is properly adjusted to achieve the desired  $T_{j,av}$ . It is important to note that the value of  $R_{th,jh}$  of a DUT is also influenced by neighboring devices (mutual heating effects). Hence, when a DUT fails and is permanently turned off, small changes in  $T_j$  are observed in the adjacent DUTs. This effect is compensated by appropriately modifying the value of  $T_{ref}$ . However, a transient effect may be visible in the  $T_j$  and  $V_{ce}$  profiles.
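As a worked example of (15), the sketch below computes the plate set-point $T_{ref}$ required to reach a target $T_{j,av}$. The average power is approximated as the conduction loss during the ON phase scaled by the duty cycle, which neglects the small loss due to the sensing current; this approximation and all numerical values (including $R_{th,jh}$) are assumptions for illustration.

```python
def average_power(v_ce_on, i_dc, duty):
    """Approximate average dissipated power: ON-phase conduction loss x duty."""
    return v_ce_on * i_dc * duty


def plate_setpoint(tj_av_target, r_th_jh, p_av):
    """Rearranged eq. (15): T_ref = T_j,av - R_th,jh * P_av."""
    return tj_av_target - r_th_jh * p_av


# Hypothetical operating point (not measured values).
p_av = average_power(v_ce_on=2.0, i_dc=73.5, duty=0.25)        # [W]
t_ref = plate_setpoint(tj_av_target=95.0, r_th_jh=1.0, p_av=p_av)
print(f"P_av = {p_av:.2f} W, T_ref = {t_ref:.2f} degC")
```

When a neighboring DUT fails, the effective $R_{th,jh}$ changes, so the same function can be re-evaluated with the updated value to obtain the corrected set-point.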

The health status of the DUTs subjected to power cycling stress is estimated by monitoring the thermal impedance between the junction and case ( $Z_{th,jc}$ ) and the  $V_{ce}$  voltage in the on-phase ( $V_{ce,on}$ ). The failure event is generally determined by a 20% increase in  $Z_{th,jc}$  or a 5% increase in  $V_{ce,on}$  [20].
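The two failure criteria can be expressed as a simple check against the values recorded at the beginning of the test. The sketch below is a minimal illustration; the function name, the returned labels and the example readings are hypothetical.

```python
def eol_criterion(v_ce_on, z_th_jc, v_ce_on_0, z_th_jc_0,
                  v_ce_limit=0.05, z_th_limit=0.20):
    """Return which EoL criterion is met, or None if the DUT is still healthy.

    v_ce_on_0 and z_th_jc_0 are the initial (reference) values of the DUT.
    """
    if v_ce_on >= v_ce_on_0 * (1.0 + v_ce_limit):
        return "V_ce,on"   # 5% rise of the on-state voltage
    if z_th_jc >= z_th_jc_0 * (1.0 + z_th_limit):
        return "Z_th,jc"   # 20% rise of the thermal impedance
    return None


# Hypothetical readings for a DUT with initial V_ce,on = 2.00 V, Z_th,jc = 0.60 K/W.
print(eol_criterion(2.05, 0.61, 2.00, 0.60))  # below both limits
print(eol_criterion(2.12, 0.61, 2.00, 0.60))  # on-voltage limit exceeded
```

In practice the check would be applied to a filtered value rather than to a single cycle, so that the transient caused by the failure of a neighboring DUT does not trigger a false EoL detection.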

Regarding the power cycling tests, different methodologies can be adopted to achieve the desired cycling of the junction temperature [72, 81, 102, 103]. In this set-up, the developed approaches are "non-controlled  $\Delta T_j$" and "active control of  $\Delta T_j$", and the details are reported in SubSec.3.3.3 and SubSec.3.3.4.

Overall, Fig.41, Fig.42, and Fig.43 respectively depict the actual schematic, associated Gerber file, and the physical appearance of the PCB board.



Fig. 41: Schematic of PCB.



Fig. 42: Gerber file of PCB used for Power Cycling experiments.



Fig. 43: PCB for power cycling experiments.



Fig. 44: The phases into which a power cycling period is divided in the LabView program.

### 3.3.2 LabView Code

The code and the overall framework in LabView have been divided into different sections based on the "phases" that make up a power cycle of a DUT.

This division allows the program to process the data extremely quickly, as the PC already knows in advance the phase of the data set received from the DSP. Fig.44 reports a generic example of a period in the thermal cycle of a DUT to illustrate the concept.

- Phase 1: on-phase (ON). This phase starts at the beginning of the DUT heating process, when the load current starts to flow into the device. During this phase, the conduction voltage under high current conditions is sampled to monitor the state of health (SoH) of the device.
- Phase 2: initial off-phase (OFF1). This phase starts at the end of the heating phase to include the rapid initial cooling dynamics and lasts for about ten milliseconds. The goal of this phase is to sample the conduction voltage at the beginning of the off-phase, where the reference current I<sub>ref</sub> is injected. Generally, a delay of about 50 microseconds, associated with the switching transient, is introduced before using the first sampled voltage data to estimate the maximum junction temperature using the TSEP method.
- Phase 3: advanced off-phase (OFF2). This phase starts after the end of OFF1 and includes the slower cooling dynamics. It ends before the start of the next heating process. During this phase, the sampled voltage value is used to estimate the minimum junction temperature.

#### 3.3.2.1 The Framework

The framework of the virtual instrument (VI) is divided into three parts:

- "ThermalCycle.vi": This VI is executed by the FPGA of the CompactRio.
- "acquisition.vi": This VI is executed by the DSP of the CompactRio.
- "getDataprocessing.vi": This VI is executed by the PC.

Each one is hence executed on a different device and has a specific function. The three main VIs automatically synchronize during execution by implementing global variables.

#### 3.3.2.2 VI executed by the FPGA

This section of the program is executed within the FPGA of the CompactRio and is responsible for managing the power cycle. In particular, it handles the switching-on and switching-off of the MOSFETs, which makes it possible to control the activation time  $(t_{on})$  during power cycling and to manage the dead times of the driving signals.

Fig.45 represents a portion of the code that manages these specifications. In this section, the voltage for driving the DUTs is provided and the activation or deactivation of the power supply is managed. Communication between this part of the program in the FPGA and the other components occurs exclusively through the control panel, using global variables as shown in Fig.46.



**Fig. 45**: Main cycle (partial figure) that manages the entire power cycling and gate voltage of the DUTs.

#### 3.3.2.3 VI executed by the DSP

The following VI is executed within the DSP of the CompactRio and is responsible for data acquisition and its corresponding transmission to the PC via Ethernet communication. For data transmission, network-published shared variables (NPSVs) are used, one for each phase (identified by the global variables in the blue box in Fig.47) and for each channel (3 phases x 4 channels = 12 NPSVs), as shown in the red boxes in Fig.47.

The DSP processor utilizes the NI9223 module with an integrated ADC. The ADC sampling frequency is set to the maximum value of 1MHz. This value determines the actual sampling frequency for the OFF1 phase, while the other phases (ON and OFF2) undergo decimation to avoid the accumulation of less significant data, as they exhibit slower and more stable dynamics. The management of decimation is performed in the corresponding PC VI.



**Fig. 46**: Cycle responsible for updating the current phase (for VI synchronization) of all DUTs through the use of global variables.



**Fig. 47**: The code consists of a single main loop that handles the data transmission to the PC through the NPSV of the current phase for each DUT. The blue box highlights the global variables that identify in which of the three phases (ON, OFF1, OFF2) the DUTx is situated (in this case for DUT0), as communicated by the FPGA. The three NPSVs, highlighted in red, store the sampled data from the ADC during each of the three phases.

The "LoopTime" parameter has been introduced, indicating the cycle time for data transmission to the PC. This value has been determined through stability tests of the code and is also useful for detecting any anomalies. It is set to 3ms, representing the minimum time interval in which the PC can reliably keep up with the data processing executed by the DSP processor.

#### 3.3.2.4 VI executed by the PC

The following VI is executed by the PC and is responsible for processing the data received from the DSP processor. Specifically, it manages the data acquired by sampling the  $V_{ce}$  voltage in the three phases. During the ON phase, the data is decimated by a factor of 50, while in the OFF2 phase, the decimation factor is 20. As mentioned earlier, the OFF1 phase is not decimated. By using the  $V_{ce,on}$  voltage in the ON phase, the SoH related to contact degradation is monitored. Additionally, the first sample obtained after the switching transient in the OFF1 phase and the last sample acquired in the OFF2 phase are used to estimate  $T_{j,max}$  and  $T_{j,min}$ , respectively, according to the previously developed TSEP model. The temperature swing  $\Delta T_j$  can then be calculated, as shown in Fig.48.

Temperature information, along with the voltage in the ON phase and the  $I_{dc}$  current information, is used to derive the  $Z_{th,jc}$  information and monitor the SoH of the solder joints. Some of the processed data is then sent to the FPGA via global variables to update the experiment's state, for example, in case of DUT failure and activation of the bypass circuit driving signal. The main data, such as the on-voltage,  $T_{j,max}$  and  $T_{j,min}$ , are saved in CSV files for subsequent post-processing, as depicted in Fig.49. This VI also manages the interfacing with the experimental system's peripherals, such as the power supply and temperature controller, via RS232. The code sections that handle and, if necessary, decimate the data from the DSP processor are presented in Fig. 50, Fig. 51 and Fig. 52 for the ON, OFF1, and OFF2 phases, respectively.

**Fig. 48**: Part of the code used to estimate  $\Delta T_j$  (one for each DUT).

Fig. 49: Part of the code used for data storage.
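The decimation and sample-selection steps performed by the PC VI can be sketched in text form as follows. This is not a transcription of the LabView code: plain down-sampling without filtering is assumed, the input samples are assumed to be already expressed as $V_{ce}$ in volts, and the TSEP coefficients and waveforms are synthetic.

```python
SAMPLE_RATE_HZ = 1_000_000  # ADC sampling frequency (1 MHz)
ON_DECIMATION = 50          # decimation factor of the ON phase
OFF2_DECIMATION = 20        # decimation factor of the OFF2 phase
BLANKING_S = 50e-6          # delay covering the switching transient


def decimate(samples, factor):
    """Keep every factor-th sample (plain down-sampling, no filtering)."""
    return samples[::factor]


def tj_from_vce(v_ce, intercept, slope):
    """Invert the linear TSEP model V_ce = intercept + slope * T_j."""
    return (v_ce - intercept) / slope


def junction_temperatures(off1, off2, intercept, slope):
    """T_j,max from the first valid OFF1 sample, T_j,min from the last OFF2 one."""
    skip = round(BLANKING_S * SAMPLE_RATE_HZ)  # samples inside the transient
    tj_max = tj_from_vce(off1[skip], intercept, slope)
    tj_min = tj_from_vce(off2[-1], intercept, slope)
    return tj_max, tj_min, tj_max - tj_min


# Hypothetical TSEP coefficients and synthetic V_ce,off waveforms.
INTERCEPT, SLOPE = 0.60, -0.002
off1 = [5.0] * 50 + [INTERCEPT + SLOPE * 165.0] * 200           # transient, then data
off2_raw = [INTERCEPT + SLOPE * (25.0 + 10.0 * (1.0 - i / 2000.0))
            for i in range(2001)]                               # slow cooling tail
off2 = decimate(off2_raw, OFF2_DECIMATION)

tj_max, tj_min, delta_tj = junction_temperatures(off1, off2, INTERCEPT, SLOPE)
print(f"T_j,max = {tj_max:.1f}, T_j,min = {tj_min:.1f}, dT_j = {delta_tj:.1f} degC")
```

Skipping the first samples of OFF1 trades a small underestimation of $T_{j,max}$, because the junction already cools during the delay, for immunity to the switching transient.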














| Test  | I<sub>dc</sub> [A]  | T<sub>j,min</sub> [°C]  | $\Delta T_j$ [°C] | T<sub>s</sub> [s] | duty [-]  |
|-------|---------------------|-------------------------|-------------------|---------------|-----------|
| Test1 | 73.5                | 25                      | 140               | 2.5           | 0.22-0.28 |
| Test2 | 68.5                | 25                      | 120               | 2.5           | 0.22-0.28 |

Tab. 5: Summary of DC power cycling tests under constant current condition.

### 3.3.3 Power Cycling Experiments under constant current

The previously described codes in SubSec.3.3.2 allow conducting standard experiments, namely DC power cycling. In Chapter 2, Sec. 2.5, the theory related to this type of test has been explained. In this section, some of the results from DC power cycling experiments conducted under the conditions indicated in Tab. 5 are reported. Two tests, Test1 and Test2, were performed with currents I<sub>dc</sub> of 73.5A and 68.5A, respectively, while maintaining a controlled temperature of 25°C using the temperature control device for the heatsink (JULABO Presto A40). The currents were selected to obtain an appropriate value of  $\Delta T_j$ , and for each test condition, eight samples were experimentally evaluated. Differences between the samples can affect power losses and, consequently, the values of  $\Delta T_j$ . Therefore, the duty cycle of individual devices was slightly adjusted (in a range between 0.22 and 0.28) in order to obtain similar  $\Delta T_j$  values.



Fig. 53: Temperature swing profiles, as a function of the number of power cycles for 8 different samples (Test1).

For Test1, the results of  $\Delta T_j$ ,  $V_{ce,on}$ , and  $Z_{th,jc}$  are shown in Fig.53, Fig.54 and Fig.55, respectively, as a function of the number of power cycles. From the conducted experiment and the obtained results, it can be observed that the failure phenomenon is associated with wire bond degradation, as EoL is reached with a 5% increase in  $V_{ce,on}$  compared to its initial value. Furthermore, this increase is responsible also for the  $\Delta T_j$  rise as observed in Fig.53. Towards the end of the device's useful life, a slight increase in  $Z_{th,jc}$  is observed, although it is much lower than 20%. Therefore, it is not considered the primary factor causing device failure. Similar considerations can be made for the results obtained in Test2, as reported in Fig.56, Fig.57, Fig.58. By observing Fig.54 and Fig.57, abrupt variations can be noticed. These variations are mainly attributed to a transient phase in which a failed device no longer generates the same heating effect on the plate and adjacent devices, causing a transient phase of thermal imbalance, which can also be observed for  $\Delta T_j$  profiles in Fig.53 and Fig.56. However, this imbalance will be compensated by controlling the reference temperature value ( $T_{ref}$ ) through the active control system responsible for the heat sink temperature, bringing the experiment back to a steady-state condition.



Fig. 54:  $V_{ce,on}$  as a function of the number of cycles in DC power cycling under  $\Delta T_j{=}140~^\circ C$  stress



**Fig. 55**:  $Z_{th,jc}$  as a function of the number of cycles in the case of Test1 condition. The increase of thermal resistance is always lower than about 1%, indicating that the degradation in the device is mainly linked to wire bonding effects.



Fig. 56: Temperature swing profiles, as a function of the number of power cycles for 8 different samples (Test2).



Fig. 57:  $V_{ce,on}$  as a function of the number of cycles in DC power cycling under  $\Delta T_j$ =120 °C stress.



Fig. 58:  $Z_{th,jc}$  as a function of the number of cycles in the case of Test2 condition.

### 3.3.4 Power Cycling Experiments under constant $\Delta T_j$

In addition to the standard DC power cycling stress, referred to as non-controlled  $\Delta T_j$  in this work, this set-up also included power cycling experiments conducted under a constant  $\Delta T_j$ , also indicated as active control of  $\Delta T_j$  in this thesis.

To maintain the actual value of  $\Delta T_j$  constant, or within a limited range of variation, a control on  $\Delta T_j$  is adopted. As shown in Fig.59 (red curves), the temperature increase (due to wear-out phenomenon) is compensated by reducing the heating time of the DUT. In this set-up, a hysteresis control is considered. The hysteresis thresholds are  $\pm 1^{\circ}$ C with respect to the  $\Delta T_j$  reference value. Therefore, whenever the  $\Delta T_j$  value goes beyond the thresholds, the t<sub>on</sub> time is reduced or potentially increased. It is important to note that the heating time is also a parameter that affects the power components' lifetime, albeit to a lesser extent compared to  $\Delta T_j$ . Therefore, it is crucial to avoid significant changes in t<sub>on</sub> during a test.
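The hysteresis control described above can be sketched as a single update rule applied once per cycle. The ±1°C band comes from the text, while the adjustment step and the limits on t<sub>on</sub> are assumptions, since they are not specified here.

```python
def update_t_on(t_on, delta_tj, delta_tj_ref, band=1.0, step=0.005,
                t_on_min=0.05, t_on_max=1.0):
    """Hysteresis control of the heating time t_on [s].

    t_on is changed only when dT_j leaves the band around the reference.
    step, t_on_min and t_on_max are hypothetical values.
    """
    if delta_tj > delta_tj_ref + band:
        t_on -= step   # swing too large: heat for a shorter time
    elif delta_tj < delta_tj_ref - band:
        t_on += step   # swing too small: heat for a longer time
    return min(max(t_on, t_on_min), t_on_max)


# Three example readings around a 140 degC reference.
for measured in (140.5, 141.5, 138.5):
    print(measured, update_t_on(0.50, measured, 140.0))
```

A small step keeps the cumulative change of t<sub>on</sub> limited during a test, which matters because t<sub>on</sub> itself affects the lifetime.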




**Fig. 59**: Schematic representation of the different methodologies adopted for power cycling tests. In the case of standard DC power cycling stress ("non-controlled  $\Delta T_j$ " approach), the temperature swing, obtained by means of a constant heating current, deviates from the nominal value (a) because of V<sub>ce,on</sub> or Z<sub>th,jc</sub> degradation. The "active control of  $\Delta T_j$ " allows limiting the temperature increase by dynamically reducing the t<sub>on</sub> time (b).

This method is applied to all four devices mounted on the test board. Consequently, the sum of all  $t_{on}$  times might be different (lower) than the periodicity of the control signals. In such cases, the bypass transistor (see Fig.34 in SubSec.3.3.1) is activated for a short period to ensure a constant periodicity of power cycling tests. An example of a controlled  $\Delta T_j$  experiment is reported here to evaluate the code implementation and the robustness of the set-up for this power cycling test methodology. An appropriate current was chosen to achieve a  $\Delta T_j$  value of 140°C. The period was set to 2 seconds, while the  $t_{on}$  time varies based on the  $\Delta T_j$  value. In Fig.60, it can be observed that the  $\Delta T_j$  values remain within the range of ±1°C with respect to the 140°C reference value throughout the duration of the experiment. Additionally, the phases where the control becomes more active can be observed through abrupt temperature changes when the power device is nearing EoL. Fig.61 shows the on-voltage curves, highlighting that this type of experiment induces failure due to wire bond degradation. Fig.62 shows the LabVIEW code that implements the method for the active control of  $\Delta T_j$  described in Fig.59.
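The role of the bypass transistor can be illustrated with a short calculation. The sketch below assumes that the four devices are heated one after the other within a single period, so that the bypass interval is simply the remainder of the period; the function name and the t<sub>on</sub> values are hypothetical.

```python
def bypass_time(t_on_list, period):
    """Time (s) for which the bypass transistor must conduct so that the
    cycle period stays constant when the heating times are shortened."""
    total = sum(t_on_list)
    if total > period:
        raise ValueError("sum of heating times exceeds the cycle period")
    return period - total


# Hypothetical heating times of the four devices after the control has
# shortened two of them; the period is the 2 s value used in the example.
print(bypass_time([0.50, 0.50, 0.48, 0.47], period=2.0))
```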



**Fig. 60**:  $\Delta T_j$  as a function of the number of cycles, in the case of active control  $\Delta T_j$  stress.



**Fig. 61**:  $V_{ce,on}$  as a function of the number of cycles, in the case of active control  $\Delta T_j$  stress.



**Fig. 62**: LabVIEW code for implementing the active control of  $\Delta T_j$ .

### 4.1 INTRODUCTION

This chapter explores the use of analytical models to investigate and predict failures caused by power cycling. Specifically, the focus is on enhancing the applicability of the Linear Damage Accumulation (LDA) rule. The LDA approach provides a systematic methodology for predicting the life consumption (LC) of components exposed to repetitive load cycles, which is highly relevant in the field of power electronics. Consequently, this chapter will examine typical scenarios where the LDA is applied, analyze the literature evaluating the validity of its implementation, and emphasize the significance of statistical analysis for its proper application, particularly when dealing with non-constant stress conditions over time.

By focusing on non-constant stress, this chapter delves into how the methodology used in power cycling experiments affects the accuracy of predictions derived from the LDA. This aspect is particularly important because understanding how design of experiments (DoE), conditions, and procedures can influence the accuracy of the model is crucial for obtaining reliable and meaningful results. Furthermore, the analysis conducted will align with a comprehensive exploration of the observed failure mechanisms. This approach allows for the verification of the underlying assumption of the LDA, specifically, the hypothesis of a single stress-induced failure mechanism.

## 4.2 LINEAR DAMAGE ACCUMULATION RULE FOR THE LIFETIME ESTIMATION

In general, there is a strong demand for an accurate prediction of the lifetime in power electronics, in order to reduce development and testing time [104]. In a consolidated approach, the analysis of the reliability of a generic power system begins with the study of the mission profile [105–107].

As schematized in Fig.63, based on the electric and thermal model of the system, the mission profile is translated into a temperature profile in power semiconductor devices. The rainflow algorithm can be adopted to evaluate the number and the amplitude of temperature cycles [108]. Lifetime models are used to predict the number of cycles to failure as a function of relevant parameters: temperature swing ( $\Delta T_j$ ), minimum junction temperature ( $T_{j,min}$ ), heating time  $t_{on}$  and current density per wire.

The number of cycles to failure can be defined either as the average number or as the number leading to a given PoF. Hence, based on the considered lifetime model, the Miner's rule is adopted to predict the LC for a given temperature profile, under the assumption of LDA (see Fig.63). The applicability of LDA is a fundamental point, which has been considered in the literature.

In [107], the LDA rule was validated considering the superimposition of different temperature profiles, having different heating times. Moreover, the analysis in [107] was carried out at different values of PoF. In [109], the application of combined power cycling stresses led to an underestimation in the lifetime prediction, which was explained assuming a dual degradation mechanism, resulting in a prediction error. In [110], under the assumption of a single degradation mechanism, combined power cycling stresses verified the applicability of the LDA rule. In [26], combined experimental tests at different  $\Delta T_j$  values did not verify the linearity of the Miner's rule, particularly in the case of a combined stress with significantly different  $\Delta T_j$  values (varying between 110°C and 70°C). In [111, 112] the impact of combined vibration and thermal cycling stresses was analyzed. An overestimation of the lifetime was found by applying the Miner's rule. This inconsistency was ascribed to a change of the thermo-mechanical response due to the interaction between different types of stress [111] or to an additional stress phenomenon due to random vibrations during the test being temperature dependent [112]. In [113], a non-linear cumulative damage model was proposed for a ceramic column grid array electronic package subjected to a combination of thermal cycling and vibration.

Also in [114], combined thermal cycling and vibration stress, under the assumption of a single failure mechanism, i.e. solder fatigue, led to an overestimation of lifetime with the Miner's rule, because of dynamic effects combined with thermal stress. In [115], a non-linearity in the accumulation of damage to solders in combined thermal cycling and vibration stress was found. In this case, the prediction error, despite the hypothesis of a single degradation mechanism and no interaction between the two stresses, was ascribed to the formation of intermetallic material at the interfaces or to the increase of voids size, amplifying the degradation process.



Fig. 63: Schematic diagram of LC evaluation based on the mission profile to which a typical power system is subjected. The use of the electro-thermal model allows for the extraction of a temperature profile, whose cyclic stress values  $(\Delta T_j, T_{j,min}, ...)$  will be counted using the Rainflow algorithm. The information obtained from the counting algorithm, along with the mean lifetime value relative to the i-th counted stress values, will be used by the Miner's Rule under the assumption of the LDA theory.

### 4.2.1 The Impact of Statistics

The study of the lifetime under a generic mission profile, hence experiencing a non-constant cumulative stress, requires the knowledge of the statistics of failure events occurring under constant stress conditions. Therefore, in this subsection, a brief theoretical explanation of the statistical methods commonly implemented is provided.

## 4.2.1.1 Weibull Statistic

The CDF gives the probability that a device will fail within a given number of cycles N (in other words, the percentage of the population expected to have failed as a function of N). The Weibull statistics is widely adopted to describe thermal/power cycling phenomena in power semiconductor devices [116]. Its CDF is expressed as:

$$\text{CDF} = 1 - e^{-\left(\frac{N}{\beta}\right)^{\alpha}}$$
(16)

where  $\alpha$  is the shape parameter and  $\beta$  is the scale parameter. In particular, the shape parameter provides information about the behavior of the failure rate [116]:

- $\alpha < 1$: decreasing failure rate, it indicates that the samples fail prematurely;
- $\alpha = 1$: constant failure rate, if it occurs, the failure is not correlated to an aging mechanism;
- $\alpha > 1$: increasing failure rate, it indicates that the type of failure is associated with a wear-out mechanism.

It is possible to observe how the CDF changes with respect to alpha in Fig.64.
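The effect of the shape parameter can also be checked numerically. The following Python sketch evaluates the Weibull CDF and the corresponding failure rate (hazard function) $\frac{\alpha}{\beta}\left(\frac{N}{\beta}\right)^{\alpha-1}$, which is the standard expression for a two-parameter Weibull distribution. The function names and the numerical values of $\alpha$ and $\beta$ are illustrative assumptions, not results of this work.

```python
import math


def weibull_cdf(n, alpha, beta):
    """Two-parameter Weibull CDF: probability of failure within n cycles."""
    return 1.0 - math.exp(-((n / beta) ** alpha))


def weibull_failure_rate(n, alpha, beta):
    """Weibull failure rate (hazard function) at n cycles."""
    return (alpha / beta) * (n / beta) ** (alpha - 1.0)


beta = 10_000.0  # hypothetical scale parameter (cycles)
for alpha in (0.5, 1.0, 3.0):
    early = weibull_failure_rate(2_000.0, alpha, beta)
    late = weibull_failure_rate(8_000.0, alpha, beta)
    trend = "decreasing" if late < early else "constant" if late == early else "increasing"
    print(f"alpha = {alpha}: failure rate is {trend}")
```

Independently of $\alpha$, the CDF evaluated at $N = \beta$ equals $1 - e^{-1}$, i.e. about 63.2% of the population is expected to have failed after $\beta$ cycles.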



**Fig. 64**: The CDF of Weibull distribution with the same  $\beta$  but different  $\alpha$ .

When experimental data are available, specifically the number of cycles to failure, the CDF plot is typically constructed by means of Benard's approximation for the median ranks [117]:

$$CDF_{exp}(N_k) = \left(\frac{k - 0.3}{N_{tot} + 0.4}\right)$$
(17)

where  $N_k$  is the number of cycles to failure of the $k$-th experiment (with experiments sorted in ascending order according to the number of cycles to failure) and  $N_{tot}$  is the total number of experiments.
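As a minimal sketch of how (17) is applied, the Python function below sorts the cycles to failure and assigns the experimental CDF value to each sample. The function name and the sample values are hypothetical.

```python
def experimental_cdf(cycles_to_failure):
    """Return (N_k, CDF_exp(N_k)) pairs according to (17).

    Samples are sorted in ascending order and ranked k = 1..N_tot.
    """
    ordered = sorted(cycles_to_failure)
    n_tot = len(ordered)
    return [(n_k, (k - 0.3) / (n_tot + 0.4))
            for k, n_k in enumerate(ordered, start=1)]


# Hypothetical cycles to failure of 8 samples (not measured data).
samples = [15200, 17800, 16400, 21000, 18900, 19500, 17100, 20200]
for n_k, cdf in experimental_cdf(samples):
    print(f"{n_k:6d} cycles -> CDF = {cdf:.3f}")
```

With 8 samples, the first failure is assigned a CDF of 0.7/8.4 and the last one 7.7/8.4, so the experimental CDF never reaches 0 or 1 and can always be linearized.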

From the values obtained using (17), it is possible to determine the  $\alpha$  and  $\beta$  parameters of (16) using two approaches: the least squares fitting for the linearization of the Weibull distribution (also known as Weibit) or the maximum likelihood estimation method. Both methods are employed to validate the obtained experimental results.

The first method is based on fitting the experimental data with a linear function that minimizes the mean squared error. The second method requires solving a system of equations. Both methods, if executed correctly, yield consistent results. In the context of this work, the first approach was utilized. This method is based on the following mathematical steps:

$$\ln(1 - \text{CDF}) = -\left(\frac{N}{\beta}\right)^{\alpha}$$

$$\ln(-\ln(1 - \text{CDF})) = \alpha \cdot \ln\left(\frac{N}{\beta}\right)$$

$$\ln(-\ln(1 - \text{CDF})) = \alpha \cdot \ln(N) - \alpha \cdot \ln(\beta)$$
(18)

According to (18), considering a plot of  $\ln(-\ln(1 - \text{CDF}))$  vs  $\ln(N)$ , the experimental data can be fitted with a linear function having a slope equal to  $\alpha$ . An example of this approach is provided using the results of the two real power cycling experiments described in Chapter 3, SubSec.3.3.3, and depicted in Figure 65. Furthermore, Fig.65 displays the confidence and prediction intervals at a certain percentage (to account for data variability and method robustness), along with different levels of PoF.
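The Weibit fit can be sketched with NumPy as shown below. This is an illustrative implementation under stated assumptions, not the code used for Fig.65: the function names are hypothetical, the median ranks follow (17), and the slope and intercept of (18) are obtained with an ordinary least squares fit (`numpy.polyfit`). The helper `b_life` inverts (16) to obtain the number of cycles at a given PoF (e.g. B10).

```python
import numpy as np


def fit_weibull_weibit(cycles_to_failure):
    """Estimate (alpha, beta) by least squares on the linearized CDF (18)."""
    n = np.sort(np.asarray(cycles_to_failure, dtype=float))
    k = np.arange(1, n.size + 1)
    cdf = (k - 0.3) / (n.size + 0.4)          # experimental CDF, see (17)
    x = np.log(n)
    y = np.log(-np.log(1.0 - cdf))
    slope, intercept = np.polyfit(x, y, 1)    # y = alpha*x - alpha*ln(beta)
    alpha = slope
    beta = np.exp(-intercept / slope)
    return alpha, beta


def b_life(pof, alpha, beta):
    """Number of cycles at which the Weibull CDF reaches `pof` (0 < pof < 1)."""
    return beta * (-np.log(1.0 - pof)) ** (1.0 / alpha)


# Hypothetical cycles to failure of 8 samples (not measured data).
samples = [15200, 17800, 16400, 21000, 18900, 19500, 17100, 20200]
alpha, beta = fit_weibull_weibit(samples)
print(f"alpha = {alpha:.2f}, beta = {beta:.0f} cycles")
print(f"B10 = {b_life(0.10, alpha, beta):.0f} cycles")
```

Confidence and prediction intervals, such as those reported in Fig.65, are not computed in this sketch.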



**Fig. 65**: The linearization of CDF for  $\Delta T_j$ =140 °C and  $\Delta T_j$ =120 °C constant stresses (described in Chapter 3, SubSec.3.3.3), fitted assuming Weibull statistics. Different PoF, ranging from 10% to 75%, are also reported and named B10, B25, B50 and B75.

## 4.3 THE INFLUENCE OF POWER CYCLING TEST METHODOLOGY ON THE LDA RULE ACCURACY

This section explores the impact of the power cycling test methodology on the accuracy of the LDA rule from an academic point of view. To investigate this, power cycling tests are conducted using two different approaches:

- Non-controlled  $\Delta T_j$  approach, in which a constant heating current is used to achieve the desired  $\Delta T_j$  (constant current or DC power cycling test as indicated in Chapter 2 and Chapter 3);
- Active control of  $\Delta T_j$  approach (power cycling test under constant  $\Delta T_j$  as outlined in Chapter 2 and Chapter 3), in which the heating time is modulated in order to keep the  $\Delta T_j$  value close to the desired value for the entire experiment.

### 4.3.1 Power Cycling Tests Under Constant $\Delta T_{j}$ Stress

As mentioned in the more theoretical SubSec.4.2.1, the study of the lifetime under non-constant cumulative stress requires the knowledge of the statistics of failure events occurring under constant stress conditions.

In this section, the experimental results of power cycling tests are reported, by considering constant  $\Delta T_j$  values: 120°C and 140°C. In both cases, the "non-controlled  $\Delta T_j$ " and "active control of  $\Delta T_j$ " approaches are considered for the sake of comparison.  $V_{ce,on}$  and  $\Delta T_j$  profiles are reported in Fig.66 for the nominal  $\Delta T_j = 120^{\circ}$C. The adopted heating current is 63.5A, with a maximum variation of  $\pm 0.5$A, and  $T_{j,min} = 25^{\circ}$C. In the case of "active control of  $\Delta T_j$ ", the temperature swing is kept constant at the nominal value of 120°C, within the hysteresis threshold of  $\pm 1^{\circ}$C (see Fig.66a). The  $V_{ce,on}$  profile, reported in Fig.66b, is initially flat, while it sharply increases close to the EoL of the components. In all twelve experiments the failure is determined only by an increase of  $V_{ce,on}$  by 5%. On the other hand, in the case of the "non-controlled  $\Delta T_j$ " approach, the temperature swing increases up to around 127°C (see Fig.66c). Although the qualitative profile of  $V_{ce,on}$  (see Fig.66d) is in agreement with the one observed in the case of "active control of  $\Delta T_j$ ", the increase of temperature reported in Fig.66c is responsible for changes in the number of cycles to failure. In the case of  $\Delta T_j = 140^{\circ}$C, a heating current of 68.5A is adopted, with  $T_{j,min} = 25^{\circ}$C.  $V_{ce,on}$  and  $\Delta T_j$  profiles are analogous to those reported in Fig.66.

The experimental number of cycles to failure can be adopted to build the CDF plot. By means of (17), the experimental CDFs are reported in Fig.67 for both  $\Delta T_j = 120^{\circ}$C and  $\Delta T_j = 140^{\circ}$C and for both "non-controlled  $\Delta T_j$ " and "active control of  $\Delta T_j$ " approaches. Aiming at linearizing the dependence between CDF and N, using (18) a linear fitting has been adopted to estimate both  $\alpha$  and  $\beta$  parameters. Lines at specific PoFs are reported in Fig.67 and are labeled B10, B25, B50 and B75. In general, the adoption of the "active control of  $\Delta T_j$ " approach leads to a larger number of cycles to failure for a given PoF with respect to the "non-controlled  $\Delta T_j$ " approach. In the latter case, according to [80], a positive feedback relationship between the wire bond degradation and  $\Delta T_j$  leads to lower lifetimes. CDFs, estimated in the case of the "active control of  $\Delta T_j$ " approach, exhibit a similar shape parameter  $\alpha$ , while the change of the stress level leads to a modification of the scale parameter  $\beta$ . On the other hand, the adoption of the "non-controlled  $\Delta T_j$ " approach leads to a statistic in which the shape parameter  $\alpha$  is significantly reduced in the case of  $\Delta T_j = 140^{\circ}$C. It is possible that during the degradation phase the non-controlled increase of temperature can cause some early failures, hence modifying the  $\alpha$  parameter of the distribution.

**Fig. 66**: Power cycling tests carried out under constant stress conditions. Temperature swing and  $V_{ce,on}$  profiles are reported as a function of the number of cycles in the case of "active control of  $\Delta T_j$ ", Fig. (a) and (b), and in the case of "non-controlled  $\Delta T_j$ ", Fig. (c) and (d). Twelve samples are stressed under the same test conditions ( $T_{j,min} = 25^{\circ}C$  and  $\Delta T_j = 120^{\circ}C$ ). The increase of  $V_{ce,on}$  by 5% is considered as failure criterion.



**Fig. 67**: Experimental CDF for constant  $\Delta T_j = 120^{\circ}$ C and  $\Delta T_j = 140^{\circ}$ C. Results arising from both techniques, "active control of  $\Delta T_j$ " and "non-controlled  $\Delta T_j$ ", are reported. Experimental data are fitted assuming a Weibull distribution. Prediction bounds (99%) are also included in the plots.

### 4.3.2 Power Cycling Tests Under Non-Constant $\Delta T_j$ Stress

In the case of non-constant  $\Delta T_j$  stress, the Miner's rule, which formalizes the LDA theory and relies on an understanding of the statistics described in the previous subsection, is adopted for the estimation of LC. Its expression for a specific percentage of PoF is as follows:

$$LC_y = \sum_{i=1}^m \frac{n_i}{N_{i,y}}$$
(19)

where  $n_i$  is the number of cycles for the i-th stress level, m is the number of stress levels considered, and  $N_{i,y}$  is the expected lifetime to achieve y percentage PoF under the i-th stress level. When LC is equal to 1, the targeted y percentage of PoF has been achieved.
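Equation (19) translates directly into code. The sketch below is a minimal Python illustration in which the function name, the stress labels and the values of $N_{i,y}$ are hypothetical placeholders; in practice the $N_{i,y}$ values would be read from the fitted CDFs at the chosen PoF.

```python
def life_consumption(blocks, n_fail):
    """Miner's rule (19): sum of n_i / N_{i,y} over the applied stress blocks.

    blocks: list of (stress label, number of applied cycles n_i)
    n_fail: dict mapping each stress label to N_{i,y} at the chosen PoF
    """
    return sum(n_i / n_fail[stress] for stress, n_i in blocks)


# Hypothetical B10 lifetimes (cycles) for two stress levels.
n_fail_b10 = {"dTj=140C": 10_000.0, "dTj=120C": 25_000.0}

# 6000 cycles at the harsher stress consume 60% of the life; 10000 further
# cycles at the milder stress consume the remaining 40%.
profile = [("dTj=140C", 6_000), ("dTj=120C", 10_000)]
print(life_consumption(profile, n_fail_b10))
```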

The case of Test 1 (as illustrated in Tab.6) is reported in Fig.68. Non-constant stress is defined as: 6000 cycles at  $\Delta T_j = 140^{\circ}$C and the remaining cycles at  $\Delta T_j = 120^{\circ}$C, with  $T_{j,min} = 25^{\circ}$C. According to (19), the LC is calculated by considering the expected number of cycles at  $\Delta T_j = 120^{\circ}$C and  $\Delta T_j = 140^{\circ}$C. These values can be directly derived from Fig.67a and Fig.67b (active control of  $\Delta T_j$ ) for the given PoF (10%). However, the CDFs of Fig.67 are defined within given prediction bounds with a level of certainty of 99%. Consequently, the LC profile is also known in a prediction interval, as reported in Fig.68. The lifetime is then calculated as the number of cycles leading to LC = 1. Overall, a lifetime interval can be estimated, arising from the limited statistics in the experimental activity.
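For a two-level profile such as Test 1, the condition LC = 1 can be solved in closed form. As a short worked step, let $n_1 = 6000$ be the number of cycles applied at  $\Delta T_j = 140^{\circ}$C, and let $N_{140,y}$ and $N_{120,y}$ denote the expected lifetimes at the two stress levels for a given PoF (this notation is introduced here only for illustration). Assuming $n_1 < N_{140,y}$, (19) gives:

$$\frac{n_1}{N_{140,y}} + \frac{N_{EoL,y} - n_1}{N_{120,y}} = 1 \quad\Rightarrow\quad N_{EoL,y} = n_1 + N_{120,y}\left(1 - \frac{n_1}{N_{140,y}}\right)$$

The same expression, evaluated at the lower and upper prediction bounds of $N_{140,y}$ and $N_{120,y}$, yields the corresponding lifetime interval.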

| N range (×10<sup>3</sup>) | 0-6   | 6-10  | 10-14 | 14-15 | 15-EoL |
|---------------------------|-------|-------|-------|-------|--------|
| Test 1                    | 140°C | 120°C | 120°C | 120°C | 120°C  |
| Test 2                    | 140°C | 140°C | 120°C | 120°C | 120°C  |
| Test 3                    | 120°C | 120°C | 120°C | 140°C | 140°C  |
| Test 4                    | 120°C | 120°C | 120°C | 120°C | 140°C  |

**Tab. 6**: List of experiments under non-constant  $\Delta T_j$  stress.

For the sake of comparison, the analysis of Test 1 is then carried out by considering lifetime models derived with both "non-controlled  $\Delta T_j$ " and "active control of  $\Delta T_j$ " approaches and for the probability of failure ranging from 10% to 75%. The application of the Miner's rule for both cases is reported in Fig.69. The LC is estimated in Fig.69a and Fig.69c at different PoF. By using these pairs of values, i.e. the number of cycles to failure and the PoF, a CDF can be predicted according to the Miner's rule (see Fig.69b and Fig.69d). Although both predicted CDFs are included in the range of constant stresses ( $\Delta T_j$  = 120°C and  $\Delta T_j$  = 140°C), the adoption of lifetime models calibrated with a "non-controlled  $\Delta T_j$ " approach leads to higher probability of failure (under non-constant stress).
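The construction of a predicted CDF from the Miner's rule can be sketched as follows. For each PoF, the B-lives at the two constant stresses are computed from the Weibull parameters, and the number of cycles giving LC = 1 is obtained for a two-level profile. This is a simplified Python illustration: the function names are hypothetical and the Weibull parameters are placeholders, not the values fitted in Fig.67.

```python
import math


def b_life(pof, alpha, beta):
    """Cycles at which a Weibull CDF with (alpha, beta) reaches `pof`."""
    return beta * (-math.log(1.0 - pof)) ** (1.0 / alpha)


def two_level_lifetime(n_first, n_fail_first, n_fail_second):
    """Cycles to LC = 1 for n_first cycles at stress 1, then stress 2."""
    if n_first >= n_fail_first:
        return n_fail_first  # life is consumed before the stress changes
    return n_first + (1.0 - n_first / n_fail_first) * n_fail_second


def predicted_cdf(n_first, weibull_first, weibull_second,
                  pofs=(0.10, 0.25, 0.50, 0.75)):
    """Return (predicted cycles to failure, PoF) pairs, i.e. B10..B75."""
    return [(two_level_lifetime(n_first,
                                b_life(y, *weibull_first),
                                b_life(y, *weibull_second)), y)
            for y in pofs]


# Placeholder (alpha, beta) pairs for the 140 degC and 120 degC stresses.
w140 = (8.0, 12_000.0)
w120 = (8.0, 26_000.0)
for n_pred, pof in predicted_cdf(6_000, w140, w120):
    print(f"B{int(pof * 100)}: {n_pred:.0f} cycles")
```

For any PoF at which the first block does not already exhaust the life, the predicted number of cycles lies between the B-lives of the two constant stresses, which is why the predicted CDFs fall within the region delimited by the constant-stress fittings.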

Experimental non-constant  $\Delta T_j$  stresses are reported in Fig.70 in the case of Test 1. In Fig.70a,  $\Delta T_j$  profiles were obtained by actively controlling t<sub>on</sub> and hence exactly matching the conditions of Test 1 (see Tab.6). This is the most appropriate profile to consider, since the only available lifetime models are those for  $\Delta T_j = 120^{\circ}$C and  $\Delta T_j = 140^{\circ}$C. For the sake of comparison, in Fig.70b  $\Delta T_j$  profiles were generated by only controlling the heating current, hence an uncontrolled temperature increase close to the end of life is observed. In Fig.70c, by considering the CDF calculated on the basis of models calibrated with the "active control of  $\Delta T_j$ " methodology, the application of the Miner's rule leads to a lifetime prediction in very good agreement with the experimental CDF deriving from the tests of Fig.70a.

**Fig. 68**: Application of the Miner's rule for the determination of the lifetime in the case of non-constant stress (Test 1 of Tab.6). LC is calculated according to (19), by considering the expected number of cycles estimated in Fig.67 for a PoF of 10%. The prediction interval arises from the prediction bound of Fig.67.

As reported in Tab.7, the experimental number of cycles to failure is always included in the prediction interval (associated with the Miner's rule estimation) for the full range of PoFs. In the case of the "non-controlled  $\Delta T_j$ " approach, the application of the Miner's rule leads to a lifetime prediction which is accurate in the case of large PoFs, while at low PoF the experimental results differ from the calculated values (they are even outside of the prediction intervals).

Considering the non-constant stress profile of Fig.70b (in which the stress methodology is analogous to the one adopted for the calibration of lifetime models) the difference between the Miner's prediction and the experimental CDF decreases but it is still relevant in the case of PoF close to 10%. The error around PoF = 10% can be explained by considering the CDF at constant stress reported in Fig.67. More specifically, in Fig.67 the number of cycles to failure for  $\Delta T_j = 140$ °C is very low in the case of PoF = 10%. As discussed before, this is probably due to the positive feedback relationship between the wire bond degradation and  $\Delta T_j$ , possibly leading to the premature failure of samples in which the thermo-mechanical stress is not kept constant. As a result, the application of (19) in the case of combined 140°C/120°C stress leads to an underestimation of the lifetime with respect to the experimental value at PoF = 10%.

According to the analysis reported in Fig.70 and Tab.7, the way in which accelerated lifetime tests are performed can have an impact on the accuracy of the linear damage accumulation theory. On the one hand, if lifetime models are calibrated by means of accelerated tests with "active control of  $\Delta T_j$ ", the thermo-mechanical stress can be considered constant, since  $\Delta T_j$  is fixed at the nominal value. Consequently, the Miner's rule gives an accurate prediction when the considered stress is a combination of the stresses at constant  $\Delta T_j$ . On the other hand, during the calibration of lifetime models based on a "non-controlled  $\Delta T_j$ " approach, power devices are subjected to a temperature cycling exceeding the nominal  $\Delta T_j$  value. Therefore, the effective  $\Delta T_j$  value to be considered for the lifetime modeling purpose should be higher. When applying the Miner's rule for a given (non-constant) temperature profile, the adopted lifetime model is based on the nominal  $\Delta T_j$  value rather than the effective  $\Delta T_j$  value. Hence, some inaccuracies are introduced in the lifetime estimation. The "active control of  $\Delta T_j$ " approach is extensively verified in all the test conditions reported in Tab.6 and the results are illustrated in Fig.71.

**Fig. 69**: Combined non-constant power cycling stress (Test 1 of Tab.6) for both "active control of  $\Delta T_j$ " (left column) and "non-controlled  $\Delta T_j$ " (right column). Lifetime consumption is reported in (a) and (c) considering different probabilities of failure. CDFs arising from the application of the Miner's rule are reported in (b) and (d), along with prediction bounds. Weibull fittings for constant  $\Delta T_j = 120^{\circ}$C and  $\Delta T_j = 140^{\circ}$C are included in order to delimit the region in which results are expected.

**Fig. 70**: Experimental non-constant  $\Delta T_j$  stresses for Test 1. In (a) the temperature cycling profile is obtained by actively controlling the heating time. In (b) a constant heating current is adopted, leading to an increase of temperature close to the end of life. Experimental CDFs, for both  $\Delta T_j$  profiles, are reported in (c) and compared with those calculated according to the Miner's rule (Fig.69).

| Test                                   | PoF | Exp. # cycles | Estimated with (19) [%] | Prediction bound [%] |
|----------------------------------------|-----|---------------|-------------------------|----------------------|
| Test1, active control of $\Delta T_j$  | B10 | 16028         | 112.22                  | [98.30; 130.03]      |
|                                        | B25 | 17373         | 101.36                  | [92.65; 110.95]      |
|                                        | B50 | 19858         | 99.84                   | [91.02; 105.90]      |
|                                        | B75 | 22006         | 98.58                   | [87.35; 103.78]      |
| Test1, non-controlled $\Delta T_j$     | B10 | 11506         | 208.81                  | [150.77; 289.21]     |
|                                        | B25 | 13723         | 128.79                  | [107.62; 178.83]     |
|                                        | B50 | 17210         | 106.86                  | [93.24; 122.64]      |
|                                        | B75 | 20790         | 100.51                  | [85.00; 113.71]      |

**Tab. 7**: Experimental lifetime vs. lifetime prediction according to the Miner's rule (19). The lifetime estimated with (19) and the prediction bound are expressed as a percentage of the experimental number of cycles to failure. The "active control of  $\Delta T_j$ " approach is compared with the "non-controlled  $\Delta T_j$ " approach.
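Since all quantities in Tab.7 are normalized to the experimental number of cycles to failure, the experimental value corresponds to 100%, and the comparison reduces to checking whether 100% lies inside each prediction bound. The short Python sketch below performs this check on the bounds listed in Tab.7; the variable and function names are illustrative.

```python
# Prediction bounds [%] copied from Tab.7 (Test 1).
TAB7_BOUNDS = {
    "active control": {
        "B10": (98.30, 130.03), "B25": (92.65, 110.95),
        "B50": (91.02, 105.90), "B75": (87.35, 103.78),
    },
    "non-controlled": {
        "B10": (150.77, 289.21), "B25": (107.62, 178.83),
        "B50": (93.24, 122.64), "B75": (85.00, 113.71),
    },
}


def covers_experiment(bound):
    """True if the experimental lifetime (100%) lies within the bound."""
    lower, upper = bound
    return lower <= 100.0 <= upper


coverage = {
    approach: {pof: covers_experiment(bound) for pof, bound in rows.items()}
    for approach, rows in TAB7_BOUNDS.items()
}
for approach, rows in coverage.items():
    print(approach, rows)
```

The check returns true for all PoFs with active control, and only for B50 and B75 in the non-controlled case, in line with the discussion above.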

The different conditions involve the same  $\Delta T_j$  stresses (120°C and 140°C), but with different orders and switching points. Therefore, the Miner's rule predictions can again be estimated from the results of Fig.67. As illustrated in Fig.71, the experimental CDFs are always well aligned with the application of the Miner's rule. The maximum error, which is reported in Tab.8, is on the order of 10%, which typically falls in the prediction bound calculated for the lifetime estimation. Therefore, we can conclude that the application of the Miner's rule allows an accurate calculation of the number of cycles to failure at any PoF.

However, guidelines for the qualification of power devices, such as [71], typically do not allow for modifications of the heating time during the power cycling test. This approach is referred to as the "non-controlled  $\Delta T_j$ " method in this discussion, since the application of a constant heating current leads to a change of the temperature swing during the test (due to the modification of self-heating effects). In this case, the application of the Miner's rule for an arbitrary mission profile can lead to less accurate results, but still in the prediction bounds if large probabilities of failure are considered.

| Test (active control of $\Delta T_j$) | PoF | Exp. # cycles | Estimated with (19) [%] | Prediction bound [%] |
|---------------------------------------|-----|---------------|-------------------------|----------------------|
| Test1                                 | B10 | 16028         | 111.22                  | [98.30; 130.03]      |
|                                       | B25 | 17373         | 101.36                  | [92.65; 110.95]      |
|                                       | B50 | 19858         | 99.84                   | [91.02; 105.90]      |
|                                       | B75 | 22006         | 98.58                   | [87.35; 103.78]      |
| Test2                                 | B10 | 11228         | 93.19                   | [80.74; 112.79]      |
|                                       | B25 | 13374         | 90.87                   | [81.93; 101.62]      |
|                                       | B50 | 16740         | 96.14                   | [86.10; 102.55]      |
|                                       | B75 | 18436         | 93.08                   | [81.04; 98.31]       |
| Test3                                 | B10 | 16013         | 97.13                   | [90.51; 105.32]      |
|                                       | B25 | 17166         | 94.22                   | [89.30; 99.06]       |
|                                       | B50 | 19884         | 96.76                   | [94.06; 103.44]      |
|                                       | B75 | 21089         | 98.43                   | [90.81; 101.65]      |
| Test4                                 | B10 | 16843         | 100.50                  | [93.15; 117.39]      |
|                                       | B25 | 16994         | 91.49                   | [86.80; 97.97]       |
|                                       | B50 | 19300         | 95.50                   | [88.58; 98.97]       |
|                                       | B75 | 19972         | 91.19                   | [81.49; 95.89]       |

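**Tab. 8**: Experimental lifetime vs. lifetime prediction according to the Miner's rule (19). The lifetime estimated with (19) and the prediction bound are expressed as a percentage of the experimental number of cycles to failure ("active control of  $\Delta T_j$ " approach).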





### 4.4 ANALYSIS OF DEGRADATION MECHANISMS

This section aims to provide further insight into the previously discussed results by conducting an analysis of the degradation mechanisms. The devices under test considered in these experimental results are discrete IGBTs in TO-247 package. They are characterized by a typical lead-frame substrate and solderable pins as terminal contacts. Discrete devices are encapsulated in a transfer mold compound based on an epoxy resin [31]. In order to observe the presence of stress in the solder joint region, X-ray images are captured by means of an EasyTom tomograph. The fresh sample, reported in Fig.72a, shows some voids at the interface between the silicon die and the copper tab, which can be ascribed to the manufacturing process [90, 107, 110]. Similarly, devices subjected to power cycling (both constant and non-constant temperature cycling) exhibit some voids, but no signs of delamination can be found in Fig.72b, Fig.72c or Fig.72d. It is worth noting that the solder joint has a significantly larger volume than the wire bonds, with a consequently higher thermal time constant. For this reason, solder joint fatigue typically occurs when considering a longer heating time than the value considered in these experiments ( $t_{on} = 0.625s$ ) [37].




**Fig. 72**: X-ray images of solder joint regions taken from the back side of the component: (a) fresh device; (b) after failure -  $\Delta T_j = 120^{\circ}C$  with "active control of  $\Delta T_j$ "; (c) after failure -  $\Delta T_j = 120^{\circ}C$  with "non-controlled  $\Delta T_j$ "; (d) after failure - Test 1. Voids can be observed in all samples at the Si/Copper tab interface. The shadow of the two wire bonds is also visible in the images.

The packages of some samples have been opened for the inspection of wire bonds. As reported in Fig.73, in all considered cases (both constant and non-constant temperature cycling, both "active control of  $\Delta T_j$ " and "non-controlled  $\Delta T_j$ ") the formation of a crack at the Al/Si interface is visible in the images acquired through a Leica MS5 microscope. Therefore, according to this study, the discussion about the applicability of the LDA theory is related to the wire bond degradation mechanism. It is worth mentioning that the Miner's rule can be considered only if a single failure mode occurs in the component [110]. The analysis of degradation mechanisms is in agreement with the electrical wear-out of the components. In fact, as reported in Fig.66, the failure events are associated only with an increase of  $V_{ce,on}$ .





**Fig. 73**: Microscope images of wire bonds after power cycling failures: (a)  $\Delta T_j = 120^{\circ}C$  with "active control of  $\Delta T_j$ "; (b)  $\Delta T_j = 120^{\circ}C$  with "non-controlled  $\Delta T_j$ "; (c) Test 1. Red arrows indicate the localization of the crack formation.

# APPLICATION OF ANN TO MODEL THE RELIABILITY OF SEMICONDUCTOR POWER DEVICES

## 5.1 INTRODUCTION

This chapter presents an overview of the contributions enabled by the utilization of deep learning techniques in the domain of power cycling reliability, particularly focusing on predicting the lifetime of power devices. To achieve this, two distinct approaches have been evaluated:

- Employing artificial neural network (ANN) as a lifetime model to develop a unified framework for various types of failure mechanisms.
- Developing a deep learning technique for estimating the remaining useful lifetime (RUL) of semiconductor power devices.

These studies involved the development of a specialized methodology and were validated through experimental testing.

## 5.2 BASICS OF ANN

ANNs represent a class of computational models inspired by the human brain's functioning. These networks distinguish themselves through their capacity to learn from data and discern complex patterns. Comprising interconnected units termed neurons, ANNs process and transmit information via weighted connections [118, 119]. The significance of ANNs lies in their ability to address intricate problems that frequently surpass the scope of conventional analysis techniques [118, 120]. There are various types of ANNs, including the multi layer perceptron neural network (MLP-NN) and recurrent neural network (RNN), both employed in this work, with each being trained for specific objectives.

Notably in the field of power electronics, ANNs find substantial utility in monitoring and predicting the efficiency, reliability, and remaining useful lifetime of electronic devices [17–19, 121–129]. Their proficiency in decoding intricate data and identifying anomalies renders them invaluable tools for enhancing the performance and management of power electronic devices and systems.

## 5.2.1 Multi Layer Perceptron Neural Network

In this type of ANN, the neurons are divided into layers which are typically fully connected. They are also called feed-forward neural networks (FFNNs) due to the fact that the information only travels forward in the network. When the relationship between input and output is a (non-linear) static function, MLP-NN is the most suitable ANN to be used [17, 121], with an appropriate number of hidden layers and neurons. The basic elements in a MLP-NN are as follows [130]:

- Number of layers;
- Number of neurons in each layer;
- Activation function of each layer;
- Algorithm used during the training process.

In MLP-NNs there are at least three layers: the input layer, the output layer and one hidden layer. The number of neurons in the input layer must be equal to the number of input signals, while in the output layer it depends on the type of problem. In classification problems the number of output neurons matches the number of classes, while in regression problems it matches the number of outputs. The internal layers between input and output are denoted as hidden layers. Commonly, increasing the problem's complexity (i.e. the complexity of the function to be estimated) leads to an increase in the number of neurons and hidden layers. Unfortunately, it is not possible to theoretically determine how many hidden layers or neurons are needed for each problem. To address this issue, in this work, a systematic analysis is carried out in Sec.5.3, aiming at establishing the most suitable configuration in the case of power cycling tests. Fig.74 depicts a regression MLP-NN with three inputs, three hidden layers, each with four neurons, and a single neuron in the output layer. The activation functions are essential in each ANN to describe non-linear functions. Among the different activation functions developed in the literature [131, 132], the Hyperbolic Tangent Sigmoid Activation Function is adopted for the architectures used for this work. It is defined as:

$$f(x) = \frac{2}{1 + e^{-2x}} - 1 = \tanh(x)$$
(20)





**Fig. 74**: Example of MLP-NN with three inputs, three hidden layers and a single output.

As depicted in Fig.75, the output of a generic neuron is defined as:

$$y = f\left(b \cdot w_0 + \sum_{i=1}^{n} [x_i \cdot w_i]\right)$$
(21)



Fig. 75: Relationship between input and output in a neuron.

where $x_i$ is the i-th input of the neuron, b is the bias value, $w_0$ is the weight associated with the bias, $w_i$ is the weight of the i-th input, f() is the activation function and n is the number of inputs of the neuron.
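As a minimal sketch of (21), the following Python code evaluates a single neuron and then chains neurons into a small feed-forward network with the same shape as the example of Fig.74. The weight values, the fixed bias value b = 1, the linear output neuron and all function names are assumptions made only for illustration; they are not taken from the networks trained in this work.

```python
import math

def neuron_output(inputs, weights, bias, bias_weight, activation=math.tanh):
    """Return y = f(b*w0 + sum_i x_i*w_i) for a single neuron, as in (21)."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    pre_activation = bias * bias_weight + sum(x * w for x, w in zip(inputs, weights))
    return activation(pre_activation)

def layer_output(inputs, layer, activation=math.tanh):
    """Evaluate every neuron of a fully connected layer on the same inputs.

    `layer` is a list of (weights, bias_weight) pairs, one per neuron.
    The bias value b is fixed to 1.0 here (an assumption).
    """
    return [neuron_output(inputs, w, 1.0, w0, activation) for w, w0 in layer]

def mlp_forward(inputs, hidden_layers, output_layer):
    """Propagate inputs through tanh hidden layers and a linear output layer."""
    signal = list(inputs)
    for layer in hidden_layers:
        signal = layer_output(signal, layer)
    # Linear output neuron: an assumption commonly made for regression.
    return layer_output(signal, output_layer, activation=lambda v: v)

# Hypothetical weights: 3 inputs, three hidden layers of 4 neurons, 1 output.
hidden = [
    [([0.1, -0.2, 0.3], 0.0)] * 4,
    [([0.2, 0.2, -0.1, 0.1], 0.1)] * 4,
    [([0.3, -0.3, 0.2, 0.1], -0.1)] * 4,
]
output = [([0.5, 0.5, 0.5, 0.5], 0.0)]

y_single = neuron_output([0.5, -1.0, 2.0], [0.1, 0.4, -0.2], bias=1.0, bias_weight=0.3)
y_network = mlp_forward([0.5, -1.0, 2.0], hidden, output)
```

Because information only flows from one layer to the next, the forward pass is a plain loop over the layers, which is what distinguishes this architecture from the recurrent one.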

## 5.2.1.1 Training Process

In the training process, the MLP-NN has to extrapolate and to learn the correlations between inputs and outputs of the dataset provided. To this purpose, the weights and the biases of each neuron are adjusted to reduce a specific cost function. The training process can be divided into three steps:

- *Preliminary operations*: in order to obtain an accurate and robust MLP-NN, the dataset is split into training (around 80%) and test (around 20%) data. Training data are used during the learning process and test data during the performance evaluation step. After this, the data are usually normalized in order to achieve better performance during the training process. The normalization presented in [133] can be adopted. When presenting the final estimated results, denormalization is performed. During this phase, the MLP-NN structure is defined in terms of number of inputs, outputs and hidden layers.
- *Training*: the training algorithm is executed in order to find the minimum in a cost function (i.e. the RMSE). It is typically split into two sub-steps. In the first one, the output is calculated using inputs and network weights, while in the second one the backpropagation algorithm [134] is applied and the network weights and biases are updated. The number of epochs measures how many times the algorithm has been executed on the entire dataset. The training process ends when a minimum (which may be local) is reached in the cost function or when a predefined RMSE is achieved. The validation data are used to detect overfitting or underfitting.
- *Performance evaluation*: the test set is evaluated, and the obtained output is compared with the dataset output. If the performance of the MLP-NN is not sufficient, the MLP-NN structure (i.e., number of hidden layers, number of neurons) or the training algorithm should be updated.
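The preliminary operations listed above can be sketched as follows. This is only an illustration: the min-max scaling to the range [-1, 1] is an assumption and is not necessarily the normalization of [133], and the function names and the random seed are hypothetical.

```python
import random

def split_dataset(samples, train_fraction=0.8, seed=0):
    """Shuffle a copy of the samples and split it into training and test sets."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]

def fit_minmax(values):
    """Return the (min, max) pair used to scale a feature."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("a constant feature cannot be min-max scaled")
    return lo, hi

def normalize(value, lo, hi):
    """Map a value from [lo, hi] to [-1, 1]."""
    return 2.0 * (value - lo) / (hi - lo) - 1.0

def denormalize(scaled, lo, hi):
    """Inverse of normalize: map a value from [-1, 1] back to [lo, hi]."""
    return (scaled + 1.0) * (hi - lo) / 2.0 + lo

samples = list(range(25))                  # placeholder for 25 dataset entries
train_set, test_set = split_dataset(samples)
lo, hi = fit_minmax(train_set)             # scaling is fitted on training data only
scaled_train = [normalize(v, lo, hi) for v in train_set]
```

Fitting the scaling limits on the training data only keeps the test data unseen until the performance evaluation step.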

With the rapid development of artificial intelligence techniques, several deep learning frameworks and libraries have been developed. The most used ones are Matlab Deep Learning, TensorFlow and PyTorch. The ANN architectures utilized in this work are implemented with the aforementioned frameworks.

### 5.2.2 Recurrent Neural Network

In the previous section, it was discussed how, among the architectures of ANNs, the MLP-NN is the most suitable when the input-output relationship is described by a nonlinear static function. Within ANNs, there are architectures capable of predicting the future performance and reliability trends of devices. These types of ANNs are referred to as RNNs. These neural networks allow, through historical information, the definition of a dynamic nonlinear model, which improves its prediction as new data is added during the testing phase.



**Fig. 76**: Schematic description of a gated cell (LSTM network). σ and tanh are the sigmoid and hyperbolic tangent functions, respectively.

To tackle time-sequence forecasting, RNNs are designed to effectively process sequential data. Compared to traditional MLP-NNs, where inputs are propagated and processed through the hidden layer stack, RNNs allow previous outputs to be used as inputs. The key feature of RNNs is their ability to maintain an internal memory or hidden state that can capture temporal dependencies in the input data. This memory enables RNNs to process sequences of variable length and make predictions based on previous elements in the sequence.

RNNs are affected by the vanishing gradient issue, which makes it challenging for them to learn and capture long-term dependencies effectively. Long short-term memory (LSTM) networks can be adopted to overcome this problem, thanks to their ability to selectively retain or discard information [135]. The atomic element of an LSTM network is the gated cell shown in Fig.76. The cell is supplied with three gates, namely forget, input and output, regulating the flow of information into and out of the cell. Each gate processes the linear combination of its inputs through a non-linear function (i.e., the activation function) and returns a value between 0 and 1 used to weigh the desired information.

The input and forget gates act directly on the cell state $c_k$. The first one determines how to balance the new knowledge from the input $x_k$ and the previous cell output $h_{k-1}$, while the second one decides the contribution of the preceding state $c_{k-1}$. Lastly, the output gate regulates the cell output $h_k$ based on the current cell state. Remarkably, inputs and states are both processed using the tanh function to mitigate the vanishing or exploding gradient issues.
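The gate mechanism described above can be illustrated with one time step of a scalar LSTM cell (hidden size 1), written with the standard LSTM update equations. The parameter dictionary, its key names and the zero-valued weights are hypothetical and serve only to make the sketch self-contained.

```python
import math

def sigmoid(v):
    """Logistic function, returning a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-v))

def lstm_step(x_k, h_prev, c_prev, params):
    """Advance a scalar LSTM cell by one step and return (h_k, c_k).

    `params` maps each gate name to a (w_x, w_h, bias) tuple.
    """
    def affine(name):
        w_x, w_h, b = params[name]
        return w_x * x_k + w_h * h_prev + b

    f = sigmoid(affine("forget"))        # share of the previous state c_{k-1} to keep
    i = sigmoid(affine("input"))         # share of the new candidate to store
    o = sigmoid(affine("output"))        # share of the cell state exposed as output
    g = math.tanh(affine("candidate"))   # candidate built from x_k and h_{k-1}
    c_k = f * c_prev + i * g             # updated cell state
    h_k = o * math.tanh(c_k)             # cell output
    return h_k, c_k

# With all weights and biases at zero, every gate returns 0.5 and the candidate is 0.
zero = {name: (0.0, 0.0, 0.0) for name in ("forget", "input", "output", "candidate")}
h_k, c_k = lstm_step(x_k=1.0, h_prev=0.0, c_prev=1.0, params=zero)
```

In this degenerate case the cell simply halves the previous state, which makes the role of the forget gate as a weighting factor on $c_{k-1}$ easy to verify by hand.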

An extension and improvement of LSTM performance is achieved with the bidirectional LSTM (bLSTM) [136]. As illustrated in Fig.77, a bLSTM consists of two chains of LSTM cells that consider both time directions. According to the temporal input order $x_k$, gated cells connected in ascending order define the forward state. On the contrary, the ones associated with the descending order give the backward state. The output layer (i.e., the output sequence $y_k$) is then given by a combination of both forward and backward states.



Fig. 77: Bidirectional Long-Short Term Memory (bLSTM) network.

The network training process, to develop a single-step time-series forecasting model, is based on a sliding window approach. A window of a predetermined size is fed as input to the network for training, and the point immediately following the window is given as the training target. This type of approach has been found to be among the most suitable for forecasting and estimating the RUL, ensuring high accuracy, as indicated in [137]. A fixed window, containing m samples, from the input sequence x is selected as the model's input (i.e., $x_k, ..., x_{k-m+1}$). The neural network predicts the subsequent value $\tilde{x}_{k+1}$, where k is the index of the last input value. The learning process is aimed at tuning the parameters of the non-linear function $f_{NN}$ associated with the ANN architecture, minimizing the loss function (e.g., RMSE) of the predicted value

$$\tilde{x}_{k+1} = f_{NN} \left( x_k, x_{k-1}, \dots, x_{k-m+1} \right)$$
(22)

with respect to the real one  $x_{k+1}$ . To this purpose, the input dataset used for the training is composed of portions of the on-voltage profiles arising from different samples. The corresponding next value of the sequence window is the target output.
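The sliding window construction behind (22) can be sketched as follows. The function name and the short on-voltage sequence are hypothetical; the windows are stored in chronological order, i.e. from $x_{k-m+1}$ to $x_k$.

```python
def make_windows(sequence, window_size):
    """Build (input window, next value) pairs for single-step forecasting."""
    if window_size < 1 or window_size >= len(sequence):
        raise ValueError("window_size must be in [1, len(sequence) - 1]")
    inputs, targets = [], []
    # k is the index of the last sample in the window, as in (22).
    for k in range(window_size - 1, len(sequence) - 1):
        inputs.append(sequence[k - window_size + 1 : k + 1])
        targets.append(sequence[k + 1])
    return inputs, targets

# Placeholder on-voltage profile (normalized values, not measured data).
profile = [1.00, 1.01, 1.02, 1.04, 1.07]
windows, next_values = make_windows(profile, window_size=3)
```

Windows extracted from the profiles of different samples can then be concatenated into a single training set.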

#### 5.3 ANN-BASED STATIC LIFETIME MODEL

This section analyzes the problem of using ANNs to model the lifetime of semiconductor power devices subjected to more than one failure mechanism linked to power cycling stress. It discusses the optimal configuration of ANNs for the considered problem, aiming at minimizing the error in the predicted lifetime and at reducing the required number of training data. Moreover, since the device lifetime is a stochastic parameter, the suitability of ANNs is verified in the case of variability in the input training data.

### 5.3.1 Limit of Empirical Analytical Lifetime Models

In Chapter 2, Subsection 2.6.1, an in-depth analysis of the primary empirical lifetime models found in the literature has been conducted. Among these models, particular attention is drawn to the model described in (8) in Chapter 2 and comprehensively presented in [89]. The same group of researchers carried out an analysis in [34] and [138], highlighting that the fitting parameters of the empirical model developed in [89] vary significantly depending on whether the model predicts wire bond degradation or solder degradation. This underscores a well-known issue affecting analytical models, namely that a given calibration can effectively describe only a single failure mechanism. As extensively discussed in [139], it is crucial to differentiate between these two failure mechanisms. The primary rationale behind this necessity is the evidence that the impact of stress parameters on the device is fundamentally different for the two failure mechanisms, as reported in both [138] and [139].

The challenge of obtaining accurate predictions in classic empirical lifetime models when considering multiple failure mechanisms can be attributed to the excessive complexity required to adapt a single model to this array of scenarios. Consequently, the utilization of deep learning techniques emerges as a promising solution to this challenge. These techniques have demonstrated remarkable capabilities in describing and modeling complex phenomena that link input and output data.

### 5.3.2 Methodology

The goal of this work is to investigate the suitability of an ANN to define a non-linear static model for power cycling failure phenomena. Failures are stochastic events characterized by a Weibull random distribution [116] (see Chapter 4). Hence, when considering the training process of MLP-NNs, the variability of training data must be considered. Averaging several tests under the same conditions is a solution to reduce the randomness of the lifetime estimated under power cycling effects. However, this can be significantly time consuming, also considering that a considerable amount of data is required in the training phase of an MLP-NN.

In order to evaluate the influence of random distribution of training data on the accuracy of a MLP-NN prediction, a preliminary analysis using a failure model validated in the literature [91] is performed. The methodology reported in Fig.78 is adopted using the following steps. For a given combination of junction temperature cycling ( $\Delta T_j$ ), minimum junction temperature ( $T_{j,min}$ ), heating time ( $t_{on}$ ) and current density per wire (I), the average number of cycles to failure is estimated according to [91]:

$$N_{av} = A \cdot \Delta T_{j}^{\gamma} \cdot e^{\frac{\beta}{T_{j,\min}}} \cdot t_{on}^{\alpha} \cdot I^{\phi}$$
(23)

where A, $\alpha$, $\beta$, $\gamma$ and $\phi$ are fitting parameters based on experimental data [91], which are also considered valid for IGBT devices in the TO-247 package. The current density I is calculated as the ratio between the current flowing in the device and the wire diameter, whose typical value is taken from [91]. Although, according to (23), four parameters are assumed to affect the lifetime of IGBT devices, the major contribution is expected to come from the junction temperature cycling [88]. N<sub>av</sub> represents the average number of cycles to reach the failure condition under power cycling tests. Afterwards, a Weibull distribution is considered, with its 50% percentile at N<sub>av</sub> and a shape parameter of 10, which is in agreement with typical experimental power cycling tests. A random value of the number of cycles to failure (N<sub>random</sub>) is then generated according to the considered Weibull distribution and to the N<sub>av</sub> input value. Step a) is repeated K times under different input combinations, with K=22 chosen as a tradeoff between precision and measurement time. However, different values of K could be adopted, depending on the available hardware setup.

The dataset of N<sub>random</sub>, obtained at step a), is used for the training process of the MLP-NN. Depending on the input conditions, the number of cycles to failure can change by orders of magnitude, possibly leading to larger errors for very low values of N during the error minimization process. Hence, the MLP-NN is trained on the logarithm of N<sub>random</sub>.
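Step a) and the logarithmic transformation of the training target can be sketched as follows. The fitting parameters below are placeholders and not the values of [91]; expressing $T_{j,min}$ in kelvin, the base-10 logarithm, the function names and the random seed are likewise assumptions made only for this illustration. The Weibull scale is chosen so that the median of the distribution coincides with N<sub>av</sub>.

```python
import math
import random

def average_cycles(delta_tj, tj_min_kelvin, t_on, current_density, params):
    """Evaluate the lifetime model (23) for one test condition."""
    return (params["A"]
            * delta_tj ** params["gamma"]
            * math.exp(params["beta"] / tj_min_kelvin)
            * t_on ** params["alpha"]
            * current_density ** params["phi"])

def weibull_scale_from_median(median, shape):
    """Weibull scale such that the 50% percentile equals `median`."""
    return median / math.log(2.0) ** (1.0 / shape)

def sample_cycles(n_av, shape=10.0, rng=random):
    """Draw one random number of cycles to failure around n_av."""
    scale = weibull_scale_from_median(n_av, shape)
    # random.weibullvariate takes the scale first and the shape second.
    return rng.weibullvariate(scale, shape)

# Placeholder parameters (NOT the fitted values of [91]).
PLACEHOLDER_PARAMS = {"A": 1.0e12, "gamma": -4.0, "beta": 1000.0,
                      "alpha": -0.3, "phi": -0.5}

n_av = average_cycles(100.0, 298.15, 1.0, 100.0, PLACEHOLDER_PARAMS)
rng = random.Random(1)                   # fixed seed for repeatability
n_random = sample_cycles(n_av, rng=rng)
log_target = math.log10(n_random)        # value used as the training target
```

Repeating the last three lines for each of the K input combinations yields one synthetic training dataset.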

The ANN derived at step b) is used to predict the number of cycles to failure $N_{ANN}$ under several input conditions, different from those considered in step a). $N_{ANN}$ values are compared to the average values $N_{av}$ predicted by (23), and the relative error is calculated as:

$$e_{r} = \frac{N_{ANN} - N_{av}}{N_{av}} \tag{24}$$

Consequently, the root mean square relative error (RMSRE) is estimated as:

$$RMSRE = \sqrt{\sum_{i=1}^{M} \frac{e_{r,i}^2}{M}}$$
(25)

where M is the number of test conditions and  $e_{r,i}$  is the relative error referred to the i-th test condition.
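Equations (24) and (25) translate directly into a few lines of Python. The function names and the numerical values in the example are hypothetical.

```python
import math

def relative_error(n_ann, n_av):
    """Relative error of one prediction, as in (24)."""
    return (n_ann - n_av) / n_av

def rmsre(predicted, reference):
    """Root mean square relative error over M test conditions, as in (25)."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [relative_error(p, r) for p, r in zip(predicted, reference)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))

# Two predictions that deviate by +10% and -10% from the reference values.
example = rmsre([110.0, 90.0], [100.0, 100.0])
```

Since the error is relative, conditions with very different lifetimes contribute to the metric on the same footing.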



Fig. 78: Schematic representation of the methodology adopted to investigate the performance of MLP-NN to model power cycling effects: (a) The number of cycles to failure is calculated considering the application of (23) and a random Weibull distribution; (b) N<sub>random</sub> values are used in the training phase of MLP-NN; (c) Trained MLP-NN predicts the number of cycles to failure N<sub>ANN</sub>; (d) The relative error is calculated considering the difference between N<sub>ANN</sub> and the average value N<sub>av</sub> given by (23).

## 5.3.3 Configuration and Performance of MLP-NN

The list of input parameters considered for the training process of the MLP-NN is reported in Tab.9. A set of K=22 input combinations is selected among those arising from Tab.9. Following the methodology reported in Fig.78, the numbers of cycles to failure $N_{random}$ are generated considering the selected input combinations. The search for the optimum MLP-NN involves the analysis of networks with a different number of hidden layers (from 1 to 2) and a different number of neurons within each hidden layer (from 1 to 60). Therefore, the methodology reported in Fig.78 is carried out for each MLP-NN configuration. Moreover, the statistical relevance of the results is verified by repeating the entire procedure 100 times. Every time the procedure is repeated, although the $N_{av}$ dataset is unchanged (because the 22 input combinations are the same), the random generation of the number of cycles, based on the Weibull statistics, leads to different $N_{random}$ values. Therefore, a total of 60 (number of neurons) x 2 (number of HLs) x 100 (number of tests) = 12000 MLP-NNs are trained for this task, each of them being characterized by a dataset of 22 inputs.

**Fig. 79**: RMSRE calculated in the case of an MLP-NN with 1 hidden layer and a number of neurons ranging from 1 to 60. Each neural network is trained with a dataset of 20 input combinations. The training process is repeated 100 times for each network configuration in order to gain statistical relevance of results.
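The size of the configuration sweep can be checked by enumerating it explicitly. This is only a bookkeeping sketch with hypothetical variable names; each tuple stands for one network to be trained following the methodology of Fig.78.

```python
from itertools import product

hidden_layer_counts = (1, 2)       # one or two hidden layers
neuron_counts = range(1, 61)       # 1 to 60 neurons
repetitions = range(100)           # 100 randomly generated datasets

# One (hidden layers, neurons, repetition) tuple per trained MLP-NN.
configurations = list(product(hidden_layer_counts, neuron_counts, repetitions))
total_trainings = len(configurations)
```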

The results in the case of a single hidden layer and Hyperbolic Tangent Sigmoid activation function are reported in Fig.79. For some MLP-NN configurations (i.e. for a given number of neurons in the hidden layer) the RMSRE value can significantly increase (up to 70% in some cases) when a different randomly generated dataset is considered. Therefore, it is fundamental to select an MLP-NN configuration in which the RMSRE always remains limited, and to verify this over a large number of tests. The average values of RMSRE, over 100 tests, are reported in Fig.80 in the case of one or two hidden layers. The average errors are close to 10% in both cases, even if slightly lower values are observed in the case of 1 hidden layer. In order to select the number of neurons, the following desired conditions are considered in this work: i) low average value of RMSRE; ii) reduced standard deviation of RMSRE; iii) limited sensitivity to a change of the number of neurons. The box reported in Fig.80 indicates the region in which the selected configuration of the MLP-NN is located. As a consequence, an MLP-NN having a single hidden layer with 15 neurons is considered for the remainder of this work. However, alternative values for the number of neurons are also suitable for the goal of this work. For example, in the case of neurons ranging from 3 to 10, characteristics similar to the case of 15 neurons are observed in Fig.80.

**Fig. 80**: RMSRE averaged over 100 tests as a function of the number of neurons, in the case of a single hidden layer (a) or two hidden layers (b). Error bars represent standard deviations around the average values. The RMSRE in the case of fitting with model (23) is also reported (dashed line) as a reference.

For the sake of comparison, Fig.80 also reports the RMSRE value obtained in the case of fitting performed with the analytical model. In this case, step (b) of Fig.78 is replaced with a conventional fitting by means of the model (23), and in step (c) a simple analytical calculation of the number of cycles to failure is carried out, according to the model calibration. The procedure is repeated 100 times and the average RMSRE is then calculated and reported in Fig.80. It is worth noting that, for the selected MLP-NN configuration, the RMSRE value is slightly lower than that of the analytical fitting.
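As a side note on the conventional fitting, model (23) becomes linear in its parameters once the logarithm of both sides is taken. This is a standard property of power-law and Arrhenius-type expressions, and it is reported here only as an illustration; it is not necessarily the fitting procedure adopted in this work.

$$\ln N_{av} = \ln A + \gamma \cdot \ln \Delta T_{j} + \frac{\beta}{T_{j,\min}} + \alpha \cdot \ln t_{on} + \phi \cdot \ln I$$

Under this form, the unknowns $\ln A$, $\gamma$, $\beta$, $\alpha$ and $\phi$ multiply the known regressors $1$, $\ln \Delta T_j$, $1/T_{j,\min}$, $\ln t_{on}$ and $\ln I$, so that a linear least-squares fit on the logarithm of the number of cycles to failure can be used, consistently with the logarithmic target adopted for the MLP-NN.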

Once the selected neural network has been trained, an extensive comparison is made between the model of (23) and the MLP-NN prediction. Some of the significant results are reported in Fig.81, where the dependence on the junction temperature cycling and on the heating time is investigated. The lifetime predicted by the MLP-NN is in good agreement with the model of (23), which is adopted in the methodology of Fig.78. When evaluating the accuracy of results, two important considerations should be kept in mind: a) the MLP-NN is trained with randomly distributed numbers of cycles to failure ($N_{random}$), while in Fig.81 the average values $N_{av}$ are used as a comparison; b) the test conditions reported in Fig.81 are in general different from those adopted for the training phase, hence confirming the suitability of the MLP-NN in the whole range of parameters (reported in Tab.9).

| Input parameter | Values                                                |
|-----------------|-------------------------------------------------------|
| $\Delta T_j$    | [60; 70; 80; 90; 100; 110; 120; 150] °C               |
| $T_{j,min}$     | [25; 51.67; 70.33; 85] °C                             |
| $t_{on}$        | [0.3; 0.75; 1; 1.2; 1.875; 2; 3; 18; 32] s            |
| I               | [71.11; 80; 100; 116.67; 136.44; 151.11; 175.56] A/mm |

**Tab. 9**: Input parameters adopted for the training process of MLP-NNs according to the methodology presented in Fig.78.

The detailed list of experimental tests conducted in this work is reported in Tab.10. Overall, 25 experimental power cycling tests are carried out under different parameter configurations. As reported in Tab.10, when a large $t_{on}$ is considered (in the order of 30 seconds), a significant increase of $Z_{th}$ is observed, eventually determining the failure of the component when the threshold of +20% is reached. Hence, solder joint fatigue can be assumed as the failure mechanism in this case. For low values of $t_{on}$, instead, the predominant failure is due to wire bond degradation.

| $\Delta T_j$ [°C] | $T_{j,min}$ [°C] | $t_{on}$ [s] | I [A/mm] | $N_f$ [-] | $V_{ce,on}$ @ failure | $Z_{th}$ @ failure | Scope |
|----------------------------------|----------------------------|------------------------|-----------------|------------|---------------------------------|------------------------------|----------|
| 60                               | 85                         | 0.3                    | 100             | 486700     | +5%                             | $\sim +0\%$                  | training |
| 70                               | 70.33                      | 0.3                    | 109.56          | 151200     | +5%                             | $\sim +0\%$                  | training |
| 80                               | 70.33                      | 2                      | 85.11           | 73820      | +5%                             | $\sim +0\%$                  | training |
| 90                               | 51.67                      | 1                      | 102.89          | 54720      | +5%                             | $\sim +0\%$                  | training |
| 90                               | 70.33                      | 1                      | 100.44          | 38465      | +5%                             | $\sim +0\%$                  | training |
| 90                               | 70.33                      | 3                      | 85.33           | 32760      | +5%                             | $\sim +0\%$                  | training |
| 100                              | 25                         | 0.3                    | 145.56          | 82390      | +5%                             | $\sim +0\%$                  | training |
| 100                              | 25                         | 1                      | 116.67          | 64691      | +5%                             | $\sim +0\%$                  | training |
| 100                              | 51.67                      | 0.3                    | 136.44          | 37315      | +5%                             | $\sim +0\%$                  | training |
| 108                              | 85                         | 18                     | 80              | 2355       | +5%                             | ~+1%                         | training |
| 110                              | 51.67                      | 1.25                   | 111.11          | 22305      | +5%                             | $\sim +0\%$                  | test     |
| 110                              | 85                         | 32                     | 71.11           | 560        | ~+4%                            | +20%                         | training |
| 116                              | 80                         | 29                     | 75.55           | 571        | +5%                             | $\sim$ +16%                  | test     |
| 120                              | 25                         | 0.3                    | 158.89          | 24480      | +5%                             | $\sim +0\%$                  | training |
| 120                              | 25                         | 0.625                  | 139.33          | 21775      | +5%                             | $\sim +0\%$                  | test     |
| 120                              | 25                         | 0.75                   | 137.11          | 22885      | +5%                             | $\sim +0\%$                  | test     |
| 120                              | 25                         | 1.875                  | 110.89          | 18250      | +5%                             | $\sim +0\%$                  | training |
| 120                              | 25                         | 3                      | 105.56          | 14055      | +5%                             | $\sim +0\%$                  | training |
| 120                              | 51.67                      | 0.3                    | 152.67          | 22510      | +5%                             | $\sim +0\%$                  | training |
| 120                              | 51.67                      | 2                      | 106.67          | 15646      | +5%                             | $\sim +0\%$                  | training |
| 120                              | 51.67                      | 3                      | 112             | 13305      | +5%                             | $\sim +0\%$                  | training |
| 140                              | 25                         | 0.625                  | 151.11          | 10764      | +5%                             | $\sim +0\%$                  | test     |
| 150                              | 25                         | 0.3                    | 175.56          | 10584      | +5%                             | $\sim +0\%$                  | training |
| 150                              | 25                         | 1.2                    | 71.11           | 10183      | +5%                             | $\sim +0\%$                  | test     |
| 150                              | 25                         | 2                      | 126.22          | 7459       | +5%                             | $\sim +0\%$                  | training |

Tab. 10: List of power cycling experiments.



Fig. 81: Number of cycles to failure as a function of the junction temperature cycling $\Delta T_j$ (a) and the heating time $t_{on}$ (b). Values calculated with the analytical model of (23) are compared with values predicted by the selected neural network (1 HL and 15 neurons).

#### 5.3.4 Training of MLP-NN with Experimental Data and Validation

The ANN has been configured according to the discussion reported in the previous section: an MLP-NN with a single hidden layer having 15 neurons. The experimental data reported in Tab.10 are adopted during the training process of the MLP-NN.

The dataset is divided into 19 training data and 6 test data, as illustrated in Tab.10. The input parameters for these experiments are in ranges similar to those adopted for the analysis in the previous section. The trained MLP-NN is then used to predict the number of cycles to failure as a function of the junction temperature cycling, the minimum junction temperature and the heating time, and a representative dataset is reported in Fig.82. Experimental results, adopted as both training and test data, are also included in Fig.82 for the sake of comparison.

Despite the reduced number of experimental data, a good fitting is observed in a large range of input parameters. In particular, the RMSRE value, calculated between the full experimental dataset and MLP-NN predictions, is 16.23%, while its value decreases to 10.94% when only test data are considered. This error is consistent with the randomness of the experimental number of cycles to failure. In fact, an excessively small error would be characteristic of an overfitted model. In Fig.82a the number of cycles to failure vs. $\Delta T_j$ is reported for different $T_{j,min}$ and $t_{on}$ combinations. In the case of $T_{j,min}=70.33$ °C and $t_{on}=3$ s a single experimental point was available. Nevertheless, a reasonable dependence on $\Delta T_j$ is found, confirming the suitability of the model in a large range of input parameters. A similar result is observed in Fig.82c for $T_{j,min}=51.67$ °C and $\Delta T_j=110$ °C, where no training data are present at all, but still a consistent dependence on $t_{on}$ is verified and a good matching with test data is observed.

In Fig.82a and Fig.82b experimental results arising from both bond wire failure ($t_{on} \leq 3$ s) and solder joint fatigue ($t_{on} = 32$ s) are reported. The proposed neural network is able to properly predict the lifetime of the component in a large range of operating conditions, including the case in which more than one failure mechanism linked to power cycling occurs in the power device.



**Fig. 82**: Number of cycles to failure as a function of the junction temperature cycling $\Delta T_j$ (a) and of the heating time $t_{on}$ (b). A zoom of (b) in the $t_{on}$ range from 0 to 3.5 s is reported in (c). MLP-NN is trained against experimental data and predicted values are reported as dashed lines. Some of the training data (open symbols) and of the test data (stars) are also included for the sake of comparison.

#### 5.3.5 A Comparative Study between Analytical Modeling and ANN

This section aims at comparing the adoption of ANNs and analytical models to predict the lifetime of discrete IGBT devices under power cycling stress. To this purpose, an experimental activity is carried out, based on Sec.5.3 and with additional test conditions, in which different failure mechanisms limit the lifetime of components.

# 5.3.5.1 *Methodology*

The workflow adopted for the calibration and verification of both the MLP-NN and the analytical model is reported in Fig.83. In the first step, junction temperature cycling, minimum junction temperature, heating time and current density per wire are selected in a large range of test conditions. A total of 27 experimental tests are carried out (25 of the 27 are those reported in Tab.10) and the dataset is split into training (22) and test (5) data.

In this work two different approaches are followed for the lifetime modeling:

- MLP-NN with four inputs (ΔT<sub>j</sub>, T<sub>j,min</sub>, t<sub>on</sub>, I), with a certain number of hidden layers and neurons, and one output (N, number of cycles to failure), based on the methodology and results discussed in Sec.5.3.
- Fitting of a conventional analytical model, accounting for relevant power cycling parameters, whose expression is (23).

In order to define the best ANN structure in terms of number of neurons and hidden layers, the methodology proposed in Sec.5.3 is applied. According to this study, the selected MLP-NN has 2 hidden layers with a total number of neurons equal to 9. The activation functions are log-sigmoid and hyperbolic tangent sigmoid, used for the first and second hidden layers, respectively.
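The structure described above can be illustrated with a minimal forward-pass sketch. It is not the MATLAB implementation used in this work: the weights are random and untrained, the split of the 9 hidden neurons into 5 + 4 is an assumption (only the total is stated), and the linear output layer and the normalised input values are likewise assumed for illustration.

```python
import numpy as np

def logsig(x):
    """Log-sigmoid activation, used here for the first hidden layer."""
    return 1.0 / (1.0 + np.exp(-x))

def init_mlp(layer_sizes, seed=0):
    """Random (untrained) weights and zero biases for each layer."""
    rng = np.random.default_rng(seed)
    return [(rng.normal(0.0, 0.5, size=(n_out, n_in)), np.zeros(n_out))
            for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

def mlp_forward(params, x):
    """Four inputs -> log-sigmoid layer -> tanh layer -> one linear output."""
    (W1, b1), (W2, b2), (W3, b3) = params
    h1 = logsig(W1 @ x + b1)          # first hidden layer
    h2 = np.tanh(W2 @ h1 + b2)        # second hidden layer
    return float((W3 @ h2 + b3)[0])   # assumed linear output (e.g. scaled N)

# Assumption: the 9 hidden neurons are split as 5 + 4.
params = init_mlp([4, 5, 4, 1])
# Hypothetical normalised inputs (dT_j, T_j,min, t_on, I).
x = np.array([0.5, -0.2, 0.1, 0.3])
y = mlp_forward(params, x)
n_hidden = sum(W.shape[0] for W, _ in params[:-1])
```

In practice the inputs and the target would be normalised (and the lifetime possibly log-transformed) before training; those choices are not specified here and are left out of the sketch.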

Bayesian regularization is used as the training algorithm, with the learning rate set to 0.1; it is implemented by means of the MATLAB Deep Learning libraries.

According to Fig.83, experimental training data are adopted for the calibration of both the MLP-NN and the analytical lifetime model. Afterward, their accuracy is evaluated by comparing the lifetime prediction with test data (not adopted for the training/fitting phase) as well as with training data. The relative error is calculated with (24). As a final step, the RMSRE values are calculated using (25) and compared quantitatively between the MLP-NN and the analytical model.
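The error metrics can be sketched as follows. Expressions (24) and (25) are not repeated in this section, so the sketch assumes the usual definitions: relative error as (predicted − experimental)/experimental, and RMSRE as the square root of the mean squared relative error. The numbers are made up for illustration.

```python
import numpy as np

def relative_error(n_pred, n_exp):
    """Relative error of predicted vs experimental cycles to failure (assumed form of (24))."""
    n_pred = np.asarray(n_pred, dtype=float)
    n_exp = np.asarray(n_exp, dtype=float)
    return (n_pred - n_exp) / n_exp

def rmsre(n_pred, n_exp):
    """Root mean square relative error (assumed form of (25))."""
    return float(np.sqrt(np.mean(relative_error(n_pred, n_exp) ** 2)))

# Hypothetical lifetimes, not taken from Tab.10.
n_exp = [10000, 20000, 50000]
n_pred = [11000, 18000, 50000]
err = rmsre(n_pred, n_exp)
```

Because the error is normalised by the experimental lifetime, short-lived and long-lived test conditions contribute on the same scale, which matters when lifetimes span several orders of magnitude.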







#### 5.3.5.2 Experimental Results and Comparison of Accuracy of Models

Fig.84, based also on data in Tab.10, reports a summary of all experimental number of cycles to failure as a function of  $\Delta T_j$  and  $t_{on}$ . It is clear that devices tested with a short heating time ( $\leq$ 3s) follow a different trend with respect to the case of a longer heating time ( $\geq$ 18s). This is a confirmation of the presence of different failure mechanisms occurring in devices subjected to different stress conditions.

As further evidence of this, Fig.85 reports power cycling tests carried out with very different heating times. In the case of  $t_{on}=0.3s$ , the on-voltage  $(V_{ce,on})$  sharply increases close to the end of life, while the junction-to-case thermal impedance  $(Z_{th})$  does not change significantly. This suggests that bond wire degradation is occurring in the device. On the other hand, in the case of  $t_{on}=29s$ ,  $Z_{th}$  significantly increases, determining the failure event when the threshold value (+20%) is reached. At the same time,  $V_{ce,on}$  appears to be almost unchanged. Therefore, solder joint failure can be assumed in this case.
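The reasoning above can be expressed as a simple end-of-life check. This is a minimal sketch with a hypothetical helper name and synthetic data; it uses the +20% threshold on $Z_{th}$ stated above and the +5% threshold on the on-voltage adopted later in this chapter, both relative to the first recorded value.

```python
def first_failure(v_on, z_th, v_limit=0.05, z_limit=0.20):
    """Return (index, mechanism) of the first threshold crossing, or (None, None).

    v_on, z_th: sequences monitored over the cycles (same length).
    Thresholds are relative increases with respect to the first sample.
    """
    v0, z0 = v_on[0], z_th[0]
    for k, (v, z) in enumerate(zip(v_on, z_th)):
        if v >= v0 * (1 + v_limit):
            return k, "bond wire degradation"
        if z >= z0 * (1 + z_limit):
            return k, "solder joint fatigue"
    return None, None

# Synthetic traces (arbitrary units), not experimental data.
short_ton = first_failure([2.00, 2.01, 2.05, 2.12], [1.00, 1.00, 1.01, 1.02])
long_ton = first_failure([2.00, 2.00, 2.01, 2.01], [1.00, 1.10, 1.19, 1.25])
```

Attributing the failure to a mechanism from the monitored parameter alone is the same inference drawn in the text; it is an indication, not a physical failure analysis.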



**Fig. 84**: Experimental number of cycles to failure as a function of the junction temperature cycling. Power cycling tests are carried out at different heating times and minimum junction temperatures.

Having trained the MLP-NN and fitted the analytical model (whose fitting parameters are reported in Tab.11) with the experimental data, the lifetime prediction as a function of  $\Delta T_j$  is compared in Fig.86(a) and Fig.86(b) with both training and test data. Several combinations of  $T_{j,min}$  and  $t_{on}$  are considered in Fig.86(a) and Fig.86(b), while I is linearly swept in the range reported in the legend. In Fig.86(a), the adoption of the MLP-NN leads to curves with different slopes, depending on the heating time. This makes it possible to achieve good accuracy for both bond wire degradation (short  $t_{on}$ ) and solder joint fatigue (long  $t_{on}$ ). In Fig.86(b), the adoption of a conventional analytical model leads to a constant slope in the plot, according to the  $\Delta T_j^{\gamma}$  dependence of (23). However,



**Fig. 85**: Typical power cycling tests. On-voltage (a) and thermal impedance (b) are monitored as a function of the number of cycles. Failure thresholds are reported as dashed lines.

| A         | γ      | β      | α       | φ      |
|-----------|--------|--------|---------|--------|
| 5.217e+11 | -4.779 | 2287.7 | -0.7118 | -0.366 |

Tab. 11: Fitting parameters of empirical analytical model (23).

since different failure mechanisms are involved, the  $\gamma$  parameter should change with  $t_{on}$ . This kind of relationship cannot be captured by the simple model (23) and significant errors are observed in Fig.86(b).
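The constant-slope limitation can be checked numerically. Expression (23) is not repeated in this section, so the sketch below assumes a product form, $N = A \cdot \Delta T_j^{\gamma} \cdot \exp(\beta / T_{j,min}) \cdot t_{on}^{\alpha} \cdot I^{\phi}$ with $T_{j,min}$ in kelvin; this form, the units and the operating points are assumptions, while the parameter values are those of Tab.11.

```python
import math

# Fitting parameters from Tab.11.
A, GAMMA, BETA, ALPHA, PHI = 5.217e11, -4.779, 2287.7, -0.7118, -0.366

def cycles_to_failure(dTj, Tj_min_C, t_on, I):
    """Assumed product form of (23); T_j,min converted to kelvin (assumption)."""
    Tj_min_K = Tj_min_C + 273.15
    return A * dTj**GAMMA * math.exp(BETA / Tj_min_K) * t_on**ALPHA * I**PHI

def loglog_slope(t_on, dT1=80.0, dT2=120.0, Tj_min_C=25.0, I=10.0):
    """Slope of log(N) vs log(dT_j) between two hypothetical operating points."""
    n1 = cycles_to_failure(dT1, Tj_min_C, t_on, I)
    n2 = cycles_to_failure(dT2, Tj_min_C, t_on, I)
    return (math.log(n2) - math.log(n1)) / (math.log(dT2) - math.log(dT1))

slope_short = loglog_slope(t_on=0.3)   # bond-wire regime
slope_long = loglog_slope(t_on=29.0)   # solder-joint regime
```

Under this assumed form both slopes equal $\gamma$ regardless of $t_{on}$: the heating time only shifts the curve vertically, which is why a single parameter set cannot follow two mechanisms with different $\Delta T_j$ sensitivities.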

In Fig.87(a) and Fig.87(b), the number of cycles to failure as a function of the heating time is reported in the case of the MLP-NN and the analytical model, respectively. Again, the analytical model fails to accurately predict the lifetime over the entire  $t_{on}$  range, while the ANN approach is able to properly model the lifetime at any  $t_{on}$  value. Tab.12 summarizes the RMSRE values calculated with (25) for both approaches. The RMSRE value of the analytical model is more than double that of the ANN approach. In more detail, when considering  $t_{on} \ge 18s$  (i.e. for failure events due to the increase of  $Z_{th}$ ) the error of the analytical model with respect to the experimental data grows up to 56.29%.

In the case of the ANN approach, the RMSRE is very similar for short and long  $t_{on}$  values, and it is close to 21%. It is worth noting that failures are stochastic events, hence a certain randomness is expected in the experimental data, leading to a non-negligible RMSRE value. To reduce this number, more experimental tests should be performed and averaged. Nevertheless, in Sec.5.3, it is shown that the modeling of lifetime by means of ANN yields relatively accurate results also in the case of training data affected by randomness.

|                  | all dataset | test data | $t_{on} \leqslant 3s$ | $t_{on} \geqslant 18s$ |
|------------------|-------------|-----------|-----------------------|------------------------|
| ANN              | 21.58%      | 8.83%     | 21.59%                | 21.56%                 |
| Analytical model | 45.26%      | 41.64%    | 42.17%                | 56.29%                 |

Tab. 12: Summary of RMSRE.

These results confirm that conventional analytical models are suitable only to model a single failure mechanism (either bond wire degradation or solder joint fatigue), while they fail if multiple failure mechanisms are considered. In the latter case, multiple lifetime models need to be derived, requiring dedicated and more complicated test structures. On the other hand, artificial neural networks with the MLP-NN architecture prove to be a good solution, since they are able to find complex correlations between inputs and output that are difficult to model with a conventional analytical method.










# 5.4 ADOPTION OF NEURAL NETWORKS TO PREDICT REMAINING USEFUL LIFETIME OF DEVICES

In Chapter 2, Subsec.2.6.2, the data-driven models are introduced, which are used to estimate the lifetime by monitoring and forecasting the future behavior of a parameter known as SoH (e.g., on-voltage or thermal impedance).

Among the widely used analytical methods for describing a data-driven model, the PF is described (see Chapter 2, Subsec.2.6.2). However, according to [122, 140], imprecise knowledge of the parameters of the function that describes the SoH, as well as inaccurate initialization of the filter, can lead to inconsistent prognosis results. The ANNs represent a viable solution for data-driven prognostic methods, as they avoid the need for model definition, can learn online, and adapt themselves to the degradation profile [122]. In Sub-Sec.5.2.2, families of architectures capable of implementing dynamic models for future parameter prediction are defined. For this purpose, an ANN based on bLSTM is adopted. The bLSTM is trained by using experimental on-voltage degradation profiles. The proposed method relies on monitoring a precursor, specifically the on-voltage degradation, and based on this precursor, the model allows for the prediction of the RUL.

#### 5.4.1 *Methodology for RUL Estimation*

The proposed approach aims at developing a deep learning-based model for predicting the degradation profile of the on-voltage of switching devices under fixed stress conditions. Since the failure event is a stochastic phenomenon, ANN models are the most suitable to account for the variability in the degradation process. Fig.88 illustrates the expected outcome of the data-driven model, with the predicted on-voltage profile over time as the model output. In power cycling stress scenarios, the on-voltage is expected to increase due to wire bond degradation, and a 5% increment is considered as the failure threshold [20]. The estimated on-voltage profile, and consequently the lifetime prediction, relies on the real-time on-voltage acquisition. Initially, the prediction is mainly based on the off-line training of the model, resulting in an approximation close to the average value of the voltage profiles used in the training phase. However, as the monitoring time increases and the on-voltage of the tested device is experimentally measured, the accuracy of the lifetime prediction improves. Consequently, the RUL estimation approaches the ideal value.

The forecast is based on recursive iterations of the bLSTM model to obtain the on-voltage profile along the thermal cycles, as schematically reported in Fig.89. At the first iteration (initial guess), m samples of the experimental profile are provided to the bLSTM model to guess the subsequent value  $\tilde{x}_{k+1}$ . At the next iteration, the predicted value  $\tilde{x}_{k+1}$  is used as the model's input, discarding the oldest sample  $x_{k-m+1}$  and sliding the m-length window one step forward. At the i-th iteration, the on-voltage is predicted through both experimental and predicted samples if  $i \le m$ , or only predicted values if  $i > m$ :

$$\tilde{x}_{k+i} = f_{NN} \left( \tilde{x}_{k+i-1}, \dots, \tilde{x}_{k+1}, x_k, x_{k-1}, \dots, x_{k-m+i} \right), \quad i \le m$$

$$\tilde{x}_{k+i} = f_{NN} \left( \tilde{x}_{k+i-1}, \dots, \tilde{x}_{k+i-m} \right), \quad i > m$$
(26)
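The recursion in (26) can be sketched with a sliding window of fixed length m. The predictor passed to the function is a placeholder for the trained bLSTM: here a simple linear extrapolation is used purely so the example runs, and the function names and data are hypothetical.

```python
from collections import deque

def recursive_forecast(history, predictor, m, n_steps):
    """Forecast n_steps values by feeding each prediction back into the window.

    history:   measured samples, oldest first (at least m of them)
    predictor: callable taking a list of m values (oldest first), returning the next value
    """
    if len(history) < m:
        raise ValueError("at least m measured samples are required")
    window = deque(history[-m:], maxlen=m)  # maxlen drops the oldest sample automatically
    forecast = []
    for _ in range(n_steps):
        nxt = predictor(list(window))
        forecast.append(nxt)
        window.append(nxt)  # slide the window one step forward
    return forecast

def linear_extrapolation(window):
    """Stand-in for f_NN (NOT a bLSTM): extend the last increment."""
    return 2 * window[-1] - window[-2]

measured = [2.00, 2.01, 2.02, 2.03]   # synthetic on-voltage samples
pred = recursive_forecast(measured, linear_extrapolation, m=3, n_steps=5)
```

After m iterations the window contains only predicted values, so any systematic error of the predictor accumulates; this is consistent with early forecasts being less accurate than those made after a longer observation window.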



Fig. 88: Graphic representation of the expected outcome of the data-driven model.



**Fig. 89**: On-voltage prediction according to the proposed methodology. m samples  $(x_k, \ldots, x_{k-m+1})$  are considered as the input of the NN, which allows  $\tilde{x}_{k+1}$  to be calculated. Subsequently, the vector  $(\tilde{x}_{k+1}, \ldots, x_{k-m+2})$  is considered as a new input of the NN and another value  $(\tilde{x}_{k+2})$  is estimated. This process is repeated until the EoL condition is reached.

This process is iterated until  $\tilde{x}_{k+i}$  reaches the EoL condition (i.e., an increase of 5% of the initial on-voltage value). From this definition, the RUL can be expressed as

$$RUL(k) = i \;\big|\; \tilde{x}_{k+i} \ge x_{EoL} \quad \text{and} \quad \tilde{x}_{k+i-1} \le x_{EoL}$$
(27)

where k and i represent the number of monitored cycles and the remaining number of cycles to failure, respectively.  $x_{EoL}$  is the failure threshold.
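Definition (27) amounts to counting the forecast steps until the first threshold crossing. The sketch below is self-contained and uses a placeholder predictor (a constant increment per step) instead of the bLSTM; the function name, the step cap and the numeric values are assumptions chosen for illustration.

```python
def remaining_useful_life(measured, predictor, m, x_eol, max_steps=100000):
    """Return the number of forecast steps i until the prediction first reaches x_eol.

    measured:  monitored samples up to cycle k, oldest first
    predictor: callable mapping a list of m values to the next value
    Returns 0 if the last measured sample is already at or above x_eol,
    and None if the threshold is not reached within max_steps.
    """
    if measured[-1] >= x_eol:
        return 0
    window = list(measured[-m:])
    for i in range(1, max_steps + 1):
        nxt = predictor(window)
        if nxt >= x_eol:
            return i
        window = window[1:] + [nxt]  # discard oldest, append newest
    return None

# Synthetic example in arbitrary units: initial value 100, EoL at +5%.
measured = [100.0, 100.5, 101.0]
x_eol = 105.0
rul = remaining_useful_life(measured, lambda w: w[-1] + 0.5, m=3, x_eol=x_eol)
```

If the profiles are decimated before being fed to the network, the returned step count would have to be multiplied by the decimation factor to obtain a number of cycles.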

#### 5.4.2 Results and Discussion

The **bLSTM** has been trained according to the procedure reported in Sub-Sec.5.2.2, by using the experimental  $V_{ce,on}$  profiles reported in Fig.90. These profiles are decimated by a factor of 100 in order to reduce the complexity of the neural network while maintaining good performance. A window size (m) of 45 elements is considered for both training and testing phases, corresponding to 4500 cycles for the chosen decimation factor. Regularization techniques have been implemented to improve the network's learning ability, and the Adam algorithm with a learning rate of 0.1 has been used to train the bLSTM [141]. The dataset is split into a training subset (6 profiles) and a test subset (2 profiles). To verify the robustness of the model with respect to the partition of the available dataset, the model is trained with all the possible combinations of training/test subsets, totaling 28 possible neural networks, i.e. the binomial coefficient  $\binom{8}{6}$ . Two different conditions are considered for the training phase:  $\Delta T_j = 120^{\circ}C$  and  $\Delta T_j = 140^{\circ}C$ . An example of  $V_{ce,on}$  profiles estimated by means of the neural network is reported in Fig.91. In particular, Fig.91a (or 91b) considers a neural network trained at  $\Delta T_j = 120^{\circ}C$  (or  $\Delta T_j = 140^{\circ}C$ ) with samples #1, #2, #4, #5, #6, and #7 (or samples #9, #10, #12, #13, #14 and #15) and tested on sample #3 (or #11). Experimental  $V_{ce,on}$  profiles as a function of the number of cycles are reported in black (solid lines), along with the thresholds assumed for the failure criterion (dashed lines). The other curves are those predicted by the neural network according to the selected observation windows, i.e. the monitored number of cycles indicated as k in (26) and (27).
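The data preparation described above can be sketched as follows. The profile is synthetic, decimation is assumed to be plain subsampling (any filtering applied before subsampling is not specified), and the helper names are hypothetical.

```python
from itertools import combinations
import numpy as np

DECIMATION = 100   # keep one sample every 100 cycles
WINDOW = 45        # window size m (45 samples = 4500 cycles)

def make_windows(profile, decimation=DECIMATION, m=WINDOW):
    """Decimate a profile and build (input window, next value) training pairs."""
    x = np.asarray(profile, dtype=float)[::decimation]
    if len(x) <= m:
        raise ValueError("profile too short for the chosen window")
    inputs = np.stack([x[i:i + m] for i in range(len(x) - m)])
    targets = x[m:]
    return inputs, targets

def train_test_splits(sample_ids, n_train=6):
    """All partitions of the samples into n_train training and the rest test."""
    ids = list(sample_ids)
    return [(train, tuple(s for s in ids if s not in train))
            for train in combinations(ids, n_train)]

profile = np.linspace(2.0, 2.1, 10000)        # synthetic 10000-cycle V_ce,on profile
X, y = make_windows(profile)
splits = train_test_splits(range(1, 9))       # 8 samples per stress condition
tests_per_sample = sum(1 in test for _, test in splits)
```

Enumerating the partitions shows that each sample ends up in the test subset of 7 of the 28 networks, which gives 28 × 2 = 56 test curves per stress condition.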

After an observation of 4500 cycles, predicted lifetimes are relatively different from those experimentally evaluated. However, the predicted values are within the range of values adopted for the neural network training. As the monitored number of cycles increases, the predicted  $V_{ce,on}$  profiles get closer to the expected ones, hence improving the accuracy in the lifetime estimation.

The RUL represents the difference between the predicted lifetime and the monitoring time, both expressed as number of cycles. The results of the RUL analysis are reported in Fig.93 and Fig.94 for  $\Delta T_j = 120^{\circ}C$  and  $\Delta T_j = 140^{\circ}C$ , respectively. Both the RUL and the monitored number of cycles are expressed as a percentage of the effective lifetime. The RUL is estimated for all the 16 samples (8 for each stress condition). The 7 different curves reported in each sub-plot refer to different **bLSTMs**, each trained with a different combination of samples. For each  $\Delta T_j$  stress condition, 28 bLSTMs are trained in total, each of which is used to test the 2 samples not adopted in its training phase. As a result, 56 RUL curves are visible in each figure. Although the estimated RULs can initially differ from the ideal ones (black dashed lines), the accuracy of the RUL prediction improves with the monitored number of cycles. In some cases, like samples #4 and #6, the estimated RULs differ from the ideal values also when approaching the EoL. This inaccuracy could be ascribed to the high variability of the individual experimental sample. By increasing the number of samples adopted for the training phase, the



**Fig. 90**: Experimental on-voltage profiles as a function of the number of cycles.  $V_{ce,on}$  profiles are obtained for  $\Delta T_j = 120^{\circ}C$  (a) and  $\Delta T_j = 140^{\circ}C$  (b).

**bLSTM** is expected to be more robust against variability in on-voltage profiles. In order to assess the performance of the proposed **bLSTM**, the relative error, defined as the relative difference between the predicted and the experimental lifetime, is averaged over all the 56 tests performed at a given  $\Delta T_j$ . The results are reported in Fig.92. When the monitored number of cycles lies between 20% and 100% of the device lifetime, the average relative error is always equal to or lower than 13%. As the number of monitored cycles increases, the relative error, along with the standard deviation associated with the averaging process, tends to decrease. For example, beyond 80% of the device lifetime, the average relative error is below 7%, with a standard deviation lower than 5%. This is a remarkable result for predictive maintenance, since the EoL can be accurately predicted well before the failure event.



**Fig. 91**:  $V_{ce,on}$  profiles estimated by the neural network in the case of  $\Delta T_j = 120^{\circ}C$  (a) and  $\Delta T_j = 140^{\circ}C$  (b). Each curve arises from the experimental observation of a given number of power cycles (as reported in the legend) and from the application of the proposed recursive algorithm. As a result, the accuracy in the lifetime estimation improves as the monitored number of cycles increases.



**Fig. 92**: Relative error between predicted and experimental lifetime as a function of the monitored number of cycles. Average values are reported for both  $\Delta T_j = 120^{\circ}C$  and  $\Delta T_j = 140^{\circ}C$ . The error bars refer to the standard deviation.



**Fig. 93**: Predicted RULs in comparison with ideal RULs (dashed curves) for all 8 samples stressed at  $\Delta T_j = 120^{\circ}$ C. Each sample is tested with 7 differently trained neural networks.



**Fig. 94**: Predicted RULs in comparison with ideal RULs (dashed curves) for all 8 samples stressed at  $\Delta T_j = 140$ °C. Each sample is tested with 7 differently trained neural networks.

#### 5.4.3 A Methodology to Estimate On-Voltage Degradation of Power Devices According to a Power Cycling Mission Profile

In the previous section, a deep learning-based solution was introduced, enabling the prediction of the Remaining Useful Life (RUL) of a power device under power cycling stress conditions. This approach utilizes the degradation profile of the on-voltage ( $V_{on}$ ) as a key parameter. The objective here is to extend this result by defining a new ANN capable of predicting the SoH of a power device using  $V_{on}$  information, while considering variations in stress conditions in terms of  $\Delta T_j$ . This extension is motivated by the fact that current data-driven models, whether analytical as in [96, 97] or based on ANNs as in [122, 127–129], rely exclusively on experimental  $V_{on}$  degradation curves. Typically, constant stress conditions (e.g., constant temperature swing  $\Delta T_j$ ) are assumed for model calibration and training, which, however, appears to be unrealistic when a generic mission profile is taken into account. Consequently, models should be trained with  $V_{on}$  degradation curves obtained under non-constant stress conditions, requiring a broad range of test conditions to achieve statistical relevance. Nevertheless, power cycling tests are time-consuming, making this approach impractical. To overcome this limitation, this section proposes a methodology for generating the  $V_{on}$  degradation curve of a power device subjected to a generic power cycling mission profile, which can be subsequently used for ANN training, implementing a data-driven model for predicting the RUL. This methodology eliminates the need to conduct tests under non-constant  $\Delta T_j$  stress conditions. To achieve this goal, experimental power cycling tests under constant  $\Delta T_j$  stress are combined to predict the  $V_{on}$  degradation curve under arbitrary mission profiles.

The basic idea of the proposed methodology is reported in Fig.95. This method is based on a database of  $V_{on}$  vs N curves, where N is the number of cycles, arising from power cycling tests at constant  $\Delta T_j$  stress. A mission profile is then taken into account and, following a typical Rainflow algorithm [108], the i-th stress  $\Delta T_{j,i}$  is quantified in terms of the fraction  $p_i = \frac{n_i}{N_{tot}}$ , where  $n_i$  is the number of cycles occurring at the specific level of stress and  $N_{tot}$  is the total number of cycles. Subsequently, an average  $V_{on}$  profile, characteristic of the given mission profile, is estimated. Defining  $f_i$  as the  $V_{on}$ -N relationship at the i-th stress level, the number of cycles  $N_i$  corresponding to a generic value  $V_{on} = V^*$  is calculated as:

$$N_{i} = f_{i}^{-1}(V^{*})$$
(28)

The average number of cycles  $(N_{av})$  is then estimated, for  $V_{on} = V^*$ , on the basis of the fraction of number of cycles  $p_i$ :

$$N_{av}(V^*) = \sum_{i=1}^{n} p_i \cdot N_i$$
(29)
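Expressions (28) and (29) can be sketched numerically by inverting each constant-stress curve through interpolation. The sketch assumes monotonically increasing $V_{on}$-N curves sampled on a grid; the curves below are synthetic straight lines, and the function names are hypothetical.

```python
import numpy as np

def cycles_at_voltage(n_cycles, v_on, v_star):
    """N_i = f_i^{-1}(V*): invert a monotonically increasing V_on(N) curve, as in (28)."""
    return np.interp(v_star, v_on, n_cycles)

def average_profile(curves, fractions, v_grid):
    """N_av(V*) = sum_i p_i * N_i(V*), evaluated on a grid of V* values, as in (29)."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions p_i must sum to 1")
    n_av = np.zeros_like(v_grid, dtype=float)
    for (n_cycles, v_on), p in zip(curves, fractions):
        n_av += p * cycles_at_voltage(n_cycles, v_on, v_grid)
    return n_av

# Synthetic constant-stress curves: V_on rises from 2.0 V to 2.1 V.
s = np.linspace(0.0, 1.0, 101)
curve_1 = (20000 * s, 2.0 + 0.1 * s)   # harsher stress: 20000 cycles to reach 2.1 V
curve_2 = (40000 * s, 2.0 + 0.1 * s)   # milder stress: 40000 cycles to reach 2.1 V
v_grid = np.linspace(2.0, 2.1, 11)
n_av = average_profile([curve_1, curve_2], [0.533, 0.467], v_grid)
```

Note that the averaging is carried out along the cycle axis at fixed voltage, not along the voltage axis at fixed cycle count; the pairs (`n_av`, `v_grid`) then form the averaged $V_{on}$ profile.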

This process is schematized in Fig.96 for the simplified case in which only two levels of stress are considered. In order to validate the proposed method, three types of experimental power cycling tests have been carried out (as illustrated in Tab.13): constant  $\Delta T_j = 140$ °C (Test1); constant  $\Delta T_j = 120$ °C (Test2); a combination of Test1 and Test2 to generate an arbitrary mission profile (Test3). More details about Test3 are reported in Fig.97. Eight cycles are carried out at 140°C, followed by 7 cycles at 120°C. After that, a relatively low current value I<sub>s</sub> is injected in order to sense the on-voltage degradation. It is worth noting that, although a different level of stress requires a different level of current, for the method to be successful the on-voltage degradation must always be evaluated under standard conditions (i.e. constant I<sub>s</sub> current).
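The stress fractions of Test3 follow directly from the cycle counts of one repetition block. The short sketch below counts cycles per stress level; for a real mission profile the levels would first have to be extracted with a Rainflow algorithm, which is not reproduced here, and the helper name is hypothetical.

```python
from collections import Counter

def stress_fractions(cycle_levels):
    """Fraction p_i = n_i / N_tot of cycles spent at each stress level."""
    counts = Counter(cycle_levels)
    total = sum(counts.values())
    return {level: n / total for level, n in counts.items()}

# One repetition block of Test3: 8 cycles at 140 °C followed by 7 cycles at 120 °C.
block = [140] * 8 + [120] * 7
p = stress_fractions(block)
```

This gives 8/15 ≈ 53.3% of the cycles at 140°C and 7/15 ≈ 46.7% at 120°C.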



Fig. 95: The methodology considered in this work is based on the knowledge of a mission profile of  $\Delta T_j$  and of a database of typical  $V_{on}$  degradation curves at several stress conditions  $\Delta T_j$ . An average  $V_{on}$  profile is then calculated, being representative of the considered mission profile.

| Test  | T<sub>j,min</sub> [°C] | t<sub>on</sub> [s] | I<sub>stress</sub> [A] | I<sub>s</sub> [A] | ΔT<sub>j</sub> [°C]       |
|-------|------------------------|--------------------|------------------------|-------------------|---------------------------|
| Test1 | 25                     | 0.5                | 68                     | 20                | 140                       |
| Test2 | 25                     | 0.5                | 63                     | 20                | 120                       |
| Test3 | 25                     | 0.5                | 68-63                  | 20                | 140 (53.3%), 120 (46.7%)  |

Tab. 13: Summary of experimental test conditions.

To validate the described methodology, power cycling experiments have been conducted on discrete IGBT devices, having TO-247 package. For each




**Fig. 96**:  $V_{on}$  profiles in the case of constant  $\Delta T_{j,1}$  and  $\Delta T_{j,2}$ , and average  $V_{on}$  profile based on the application of both stresses:  $\Delta T_{j,1}$  for a fraction of time  $p_1$  and  $\Delta T_{j,2}$  for a fraction of time  $p_2$ .



**Fig. 97**: Typical power cycling tests carried out in this work. Different levels of stress current are applied to the DUT in order to generate a non-constant temperature cycling. Periodically, a sensing current  $(I_s)$  is applied in order to measure the on-voltage in standard conditions.

type of test, six different samples have been considered. The results of power cycling tests are reported in Fig.98. Fig.98a reports the case of constant  $\Delta T_j$  stress, which must be adopted as an input for the proposed methodology. On the other hand, Fig.98b illustrates the case of non-constant  $\Delta T_j$  stress, which serves as the experimental reference to validate the proposed methodology. As expected, a statistical dispersion is observed in the  $V_{on}$  degradation curves. In order to apply the proposed methodology, a single  $V_{on}$ -N profile must be identified for Test1 and Test2. Therefore, the profiles arising from different samples are averaged according to (29), considering the same weight  $p_i=1/6$ . Following the same approach, the average curve of Test3 can be estimated from Fig.98b.



Fig. 98: Experimental  $V_{on}$  profiles. A constant  $\Delta T_j$  is considered in (a), equal to 140°C (Test1) or 120°C (Test2). A non-constant  $\Delta T_j$  is considered in (b): 53.3% of cycles at 140°C and 46.7% of cycles at 120°C (Test3). For each test condition, six samples are analyzed.

The average  $V_{on}$  profiles, arising from the experimental power cycling tests under Test1, Test2, and Test3 conditions, are reported in Fig.99. Test1 and Test2 are then used as an input for the proposed model and, according to (29), the  $V_{on}$  profile for Test3 is calculated considering the given mission profile:  $\Delta T_{j,1} = 140^{\circ}$ C,  $p_1 = 0.533$ ,  $\Delta T_{j,2} = 120^{\circ}$ C,  $p_2 = 0.467$ . The estimated curve is also reported in Fig.99. In the case of Test3, prediction intervals (with a level of certainty of 99%) are also included. The  $V_{on}$  degradation curves arising from experimental tests and those estimated with the proposed methodology are very close to each other. It is important to note that there is an overlap between the prediction intervals of the two profiles, confirming the validity of the proposed methodology. In conclusion, the study has successfully validated a methodology for generating  $V_{on}$  profiles representative of a generic power cycling mission profile. These  $V_{on}$  profiles can be used to train data-driven models.



**Fig. 99**: On-voltage profiles in the case of: i) average of experimental Test1 (constant  $\Delta T_j = 140^{\circ}$ C); ii) average of experimental Test2 (constant  $\Delta T_j = 120^{\circ}$ C); iii) average of experimental Test3 (non-constant  $\Delta T_j$ ); iv) predicted according to the proposed methodology (considering the same mission profile of Test3). Prediction intervals (with level of certainty of 99%) for experimental Test3 overlap the prediction intervals of the proposed methodology.

#### 5.4.4 Implementing an ANN for the Development of a Dynamic-Static Model for RUL Prediction

In SubSec.5.4.1, the methodology for training an ANN was presented, along with a specification of the type of ANN that can be employed to implement a data-driven model. Simultaneously, in SubSec.5.4.3, a method for creating a dataset intended for the training of an ANN similar to the one described in SubSec.5.4.1 was outlined. This method takes into account variable stress conditions, expressed in terms of  $\Delta T_i$ , arising from power cycling. In this section, the two outcomes are combined to define an ANN capable of implementing a data-driven model for predicting RUL, taking into account the static information regarding the stress conditions to which the power device is subjected. Fig.100 provides a schematic representation of the ANN designed for this purpose. Specifically, the **bLSTM** (highlighted in red) models the dynamic variation of the  $V_{on}$ , while the green box defines an FFNN capable of considering the percentages of constant stress within a cycle of variable stress observation. The outputs of both networks are combined in a cascade sequence of "Dense Layers", with the first layer having 64 neurons and the second layer having 128 neurons. The final result is represented by a single neuron in the output layer, predicting the subsequent point in the Von profile after observing a window composed of 45 previous samples, under the variable stress conditions considered by the FFNN.
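The data flow of this two-branch architecture can be sketched with a plain NumPy forward pass. This is not the trained network: the weights are random, and the hidden size of the bLSTM (16 per direction), the width of the FFNN branch (8), the ReLU activations and the use of the last hidden state of each direction are all assumptions. Only the 45-sample window, the 64- and 128-neuron dense layers and the single output neuron are taken from the description above.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    return np.maximum(x, 0.0)

def dense(n_in, n_out):
    """Random (untrained) weights and zero bias of a fully connected layer."""
    return rng.normal(0.0, 0.1, (n_out, n_in)), np.zeros(n_out)

def lstm_params(h_dim):
    """Input weights, recurrent weights and bias for the 4 stacked LSTM gates."""
    return (rng.normal(0.0, 0.1, (4 * h_dim, 1)),
            rng.normal(0.0, 0.1, (4 * h_dim, h_dim)),
            np.zeros(4 * h_dim))

def lstm_last_state(seq, W, U, b):
    """Run one LSTM direction over a scalar sequence; return the last hidden state."""
    h_dim = U.shape[1]
    h, c = np.zeros(h_dim), np.zeros(h_dim)
    for x in seq:
        z = W[:, 0] * x + U @ h + b      # gate pre-activations, shape (4*h_dim,)
        i, f, g, o = np.split(z, 4)      # input, forget, candidate, output
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
    return h

H, S = 16, 8                              # assumed branch widths
lstm_fwd, lstm_bwd = lstm_params(H), lstm_params(H)
ffnn = dense(2, S)                        # static branch: two stress fractions
d1, d2, out = dense(2 * H + S, 64), dense(64, 128), dense(128, 1)

def predict_next(window, stress_fractions):
    """Predict the next V_on sample from a 45-sample window and the stress fractions."""
    dyn = np.concatenate([lstm_last_state(window, *lstm_fwd),
                          lstm_last_state(window[::-1], *lstm_bwd)])
    stat = relu(ffnn[0] @ stress_fractions + ffnn[1])
    x = np.concatenate([dyn, stat])       # merge dynamic and static features
    x = relu(d1[0] @ x + d1[1])           # dense layer, 64 neurons
    x = relu(d2[0] @ x + d2[1])           # dense layer, 128 neurons
    return float((out[0] @ x + out[1])[0])

window = np.linspace(2.00, 2.05, 45)      # synthetic V_on window
y_next = predict_next(window, np.array([0.533, 0.467]))
```

Keeping the static stress information in a separate branch means that the recurrent part only has to model the shape of the degradation curve, while the dense layers decide how that shape is modulated by the stress mix.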



**Fig. 100**: The schematic of the adopted ANN for predicting the RUL under variable stress conditions.

The ANN in Fig.100 is trained following the methodology described in Sub-Sec.5.4.1, and the training dataset is created using the procedure outlined in SubSec.5.4.3. In particular, for the training dataset, the V<sub>on</sub> average profiles of Test1 and Test2 in SubSec.5.4.3, whose conditions are described in Tab.13, serve as the basis for generating additional V<sub>on</sub> profiles through data augmentation, following the methodology outlined in the same section. The resulting five V<sub>on</sub> profiles, as depicted in Fig.101, represent the primary combinations of stress (in terms of  $\Delta$ T<sub>j</sub>) percentages from Test1 and Test2 in Tab.13.

After training, the ANN is tested using power cycling experiments under variable stress conditions, as detailed in Tab.14. The mission profiles are defined by combining the stress conditions listed in Tab.13 (SubSec.5.4.3). Specifically, for each predefined mission profile, power cycling tests are conducted on four samples to assess the performance of the ANN. It is important to note that the ANN is trained on average profiles obtained with the data augmentation approach of SubSec.5.4.3, so evaluating its performance on experimental profiles measured under varying conditions is crucial.

The experiments are conducted following a procedure that involves measuring the on-voltage, as described in SubSec.5.4.3 and schematically depicted in Fig.97. The mission profile shown in the upper part of Fig.102a consists of 8000 cycles under Test1 stress conditions (see Tab.13), with the remaining cycles up to the end of life (EoL) conducted under Test2 conditions. The associated V<sub>on</sub> profiles are displayed in the lower part of Fig.102a.

By applying the methodology described in SubSec.5.4.1, the ideal RUL for each experimental  $V_{on}$  profile (the blue curves in Fig.102b) is calculated and then compared with the predictions made by the ANN. The ANN takes into account the stress variation during the test and updates its predictions based on the percentage of constant stress applied in the mission profile. The predicted RUL curves from the network show increased accuracy as the number of acquired samples increases, confirming a similar trend observed in the bLSTM trained and tested in SubSec.5.4.1.
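The two quantities used in this comparison can be sketched as follows: the ideal RUL, i.e. the experimental lifetime minus the monitored cycles, and the running stress fractions that a static branch could receive as input. How the fractions are actually computed and fed to the network is not detailed here, so the second function is an assumption; the mission profile is Mission Profile 1 as described above, while the lifetime value is hypothetical.

```python
def ideal_rul(n_eol, k):
    """Ideal RUL after k monitored cycles, given the experimental lifetime n_eol."""
    return max(n_eol - k, 0)

def running_fractions(segments, k):
    """Fraction of the first k cycles spent under each stress condition.

    segments: list of (start_cycle, label) sorted by start_cycle; each segment
              lasts until the next one starts (the last one lasts until EoL).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    spent = {}
    for idx, (start, label) in enumerate(segments):
        end = segments[idx + 1][0] if idx + 1 < len(segments) else float("inf")
        spent[label] = spent.get(label, 0) + max(0, min(k, end) - start)
    return {label: n / k for label, n in spent.items()}

# Mission Profile 1: 8000 cycles under Test1, then Test2 until EoL.
profile_1 = [(0, "Test1"), (8000, "Test2")]
early = running_fractions(profile_1, 4000)
later = running_fractions(profile_1, 10000)
rul = ideal_rul(30000, 10000)   # hypothetical lifetime of 30000 cycles
```

With this convention the static input changes only after the stress level switches, so the RUL estimate can react to the change in the mission profile as well as to the newly acquired on-voltage samples.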

The same analysis is applied to two additional mission profiles, listed in Tab.14. The results, comprising the  $V_{on}$  profiles and the corresponding RUL predictions of the ANN, are presented in Fig.103 and Fig.104. In both cases, as shown in Fig.103b and Fig.104b, the initial RUL estimates are imprecise. This initial discrepancy can be attributed mainly to the fact that, during the first prediction phase (the initial guess), the forecasted profile tends toward the average profile on which the ANN was trained. Nevertheless, as more samples are acquired and the variable stress conditions are taken into account, the predictions gradually converge towards the ideal RUL curve of the respective degradation profile. These findings show that the ANN adapts and improves its predictions as additional data are acquired, even as the stress conditions change during the test.



**Fig. 101**: Average V<sub>on</sub> profiles utilized to train the ANN for RUL prediction under a general mission profile. The average V<sub>on</sub> profiles from Test1 and Test2 in Tab.13, represented by the red and blue curves, respectively, are linearly combined using the methodology described in SubSec.5.4.3. This combination generates 5 additional average profiles through data augmentation, accounting for the different percentages of constant stress applicable in a generic mission profile.

Tab. 14: List of mission profiles employed for power cycling experiments under non-constant stress conditions.

| N range (×10<sup>3</sup>) | 0-8                              | 8-14.4                           | 14.4-24                          | 24-EoL                           |
|---------------------------|----------------------------------|----------------------------------|----------------------------------|----------------------------------|
| Mission Profile 1         | Stress type 1 (Test 1 in Tab.13) | Stress type 2 (Test 2 in Tab.13) | Stress type 2 (Test 2 in Tab.13) | Stress type 2 (Test 2 in Tab.13) |
| Mission Profile 2         | Stress type 3 (Test 3 in Tab.13) | Stress type 3 (Test 3 in Tab.13) | Stress type 1 (Test 1 in Tab.13) | Stress type 1 (Test 1 in Tab.13) |
| Mission Profile 3         | Stress type 2 (Test 2 in Tab.13) | Stress type 2 (Test 2 in Tab.13) | Stress type 2 (Test 2 in Tab.13) | Stress type 1 (Test 1 in Tab.13) |



**Fig. 102**: Mission Profile 1 (see Tab.14) is depicted in the upper part of (a). The lower part of (a) reports the V<sub>on</sub> degradation profiles of 4 different DUTs. In (b), the RUL profiles predicted by the ANN for each test sample are compared with their respective ideal RUL profiles (dashed lines).



**Fig. 103**: Mission Profile 2 (see Tab.14) is depicted in the upper part of (a). The lower part of (a) reports the V<sub>on</sub> degradation profiles of 4 different DUTs. In (b), the RUL profiles predicted by the ANN for each test sample are compared with their respective ideal RUL profiles (dashed lines).



**Fig. 104**: Mission Profile 3 (see Tab.14) is depicted in the upper part of (a). The lower part of (a) reports the V<sub>on</sub> degradation profiles of 4 different DUTs. In (b), the RUL profiles predicted by the ANN for each test sample are compared with their respective ideal RUL profiles (dashed lines).

## 6 CONCLUSIONS

This work presents studies and reliability models for power cycling stress on power semiconductor devices. To this end, a dedicated set-up was built to conduct power cycling experiments. The set-up stress-tests four components with a multiplexed approach under established experimental conditions. By varying the stress parameters, both solder degradation and wire bond degradation, the main failure mechanisms associated with power cycling, can be triggered. The devices under test are discrete IGBTs in TO-247 packages. The set-up implements both the standard experimental conditions indicated by the AQG 324 standard, also known as DC power cycling (referred to as non-controlled $\Delta T_j$ in this work), and constant $\Delta T_j$ stress (referred to as active control of $\Delta T_j$). Specifically, the "active control of $\Delta T_j$" approach dynamically modulates the heating time to maintain a constant $\Delta T_j$.

The two methodologies implemented in the experimental set-up are used to apply power cycling stress in accelerated life tests on TO-247 IGBT devices. Specifically, the aim is to analyze how the methodology employed in accelerated power cycling tests affects lifetime estimation when power devices are subjected to non-constant stress conditions. To this end, lifetime models are derived with both the "non-controlled $\Delta T_j$" and "active control of $\Delta T_j$" approaches, spanning a range of PoF from 10% to 75%. With the "active control of $\Delta T_j$" methodology, the application of Miner's rule yields lifetime predictions that are highly consistent with the experimental results under non-constant stress conditions. More specifically, the experimental number of cycles to failure consistently falls within the 99% prediction interval associated with the Miner's rule estimate over the full range of PoFs. In contrast, with the "non-controlled $\Delta T_j$" approach, Miner's rule yields accurate lifetime predictions at high PoFs, but discrepancies appear at low PoFs. These discrepancies can be attributed to the positive feedback between wire bond degradation and $\Delta T_j$ when the thermo-mechanical stress is not maintained at a constant level.

These results emphasize the impact of the chosen methodology in accelerated lifetime testing. On the one hand, when lifetime models are calibrated through tests employing the "active control of $\Delta T_j$" approach, the thermo-mechanical stress can be considered constant, as $\Delta T_j$ remains fixed at its nominal value. Consequently, Miner's rule provides accurate predictions when dealing with a combination of stresses at constant $\Delta T_j$. On the other hand, during the calibration of lifetime models based on the "non-controlled $\Delta T_j$" approach, power devices undergo temperature cycling that exceeds the nominal $\Delta T_j$ value, primarily due to the positive feedback mechanism mentioned above. Therefore, the effective $\Delta T_j$ value considered for lifetime modeling should be higher. When Miner's rule is applied to a given non-constant temperature profile, however, the adopted lifetime model relies on the nominal $\Delta T_j$ value, which introduces inaccuracies in the lifetime estimation.
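The role of Miner's rule in this comparison can be illustrated with a short sketch of linear damage accumulation. The lifetime model used here is a hypothetical Coffin-Manson-type law with placeholder coefficients (`a`, `alpha`), not the model calibrated in this work, and the mission profile values are illustrative only. Failure is assumed when the accumulated damage $\sum_i n_i / N_{f,i}$ reaches 1.

```python
def cycles_to_failure(delta_tj, a=3.0e14, alpha=5.0):
    """Hypothetical Coffin-Manson-type model: N_f = a * delta_tj ** (-alpha).

    The coefficients are placeholders, not values calibrated in this work.
    """
    return a * delta_tj ** (-alpha)


def miner_lifetime(segments, final_delta_tj):
    """Predict the total number of cycles to failure with Miner's rule.

    segments: list of (n_cycles, delta_tj) applied in sequence.
    final_delta_tj: stress applied after the last segment until end of life.
    """
    damage = 0.0
    total_cycles = 0.0
    for n_cycles, delta_tj in segments:
        n_f = cycles_to_failure(delta_tj)
        remaining = (1.0 - damage) * n_f  # cycles still sustainable at this stress
        if n_cycles >= remaining:
            return total_cycles + remaining  # failure occurs inside this segment
        damage += n_cycles / n_f
        total_cycles += n_cycles
    # Spend the remaining damage budget at the final stress level
    return total_cycles + (1.0 - damage) * cycles_to_failure(final_delta_tj)


# Illustrative profile: 8000 cycles at a 120 K swing, then a 140 K swing until end of life
predicted = miner_lifetime([(8000, 120.0)], 140.0)
print(f"Predicted cycles to failure: {predicted:.0f}")
```

The sketch makes the sensitivity discussed above explicit: if the model is evaluated at the nominal $\Delta T_j$ while the device actually experienced a larger swing during calibration, every term $n_i / N_{f,i}$ is computed with a biased $N_{f,i}$.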

The possibility of applying deep learning techniques in the field of reliability has also been explored. As an initial study, the adoption of an ANN was investigated for developing a non-linear static model to predict the lifetime of power semiconductor devices under power cycling stress. The methodology has potential applications in cases involving degradation related to different failure mechanisms or novel packaging designs.

As a preliminary step, an MLP-NN was trained on data generated from a well-established analytical power cycling model. The training process considered a Weibull random distribution in the number of cycles to failure. A single-hidden-layer MLP-NN with 15 neurons was found to be the most suitable configuration, minimizing the error introduced by the MLP-NN (approximately 10%) and yielding reproducible results.
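A synthetic training set of this kind can be generated along the following lines. This is only a sketch: the analytical model, its coefficients, the Weibull shape parameter, and the choice of using the model prediction as the Weibull scale parameter are all assumptions made here for illustration. The inputs mirror the stress parameters considered in this work (junction temperature swing, minimum junction temperature, heating time).

```python
import math
import random


def analytical_nf(delta_tj, tj_min_c, t_on, a=2.0e10, b1=-4.0, b2=2000.0, b3=-0.3):
    """Hypothetical analytical lifetime model with placeholder coefficients."""
    return a * delta_tj ** b1 * math.exp(b2 / (tj_min_c + 273.15)) * t_on ** b3


def sample_training_set(conditions, n_per_condition=20, shape=3.0, seed=0):
    """Draw Weibull-distributed cycles to failure around the analytical prediction.

    conditions: list of (delta_tj, tj_min_c, t_on) tuples.
    The analytical prediction is used as the Weibull scale parameter (assumption).
    """
    rng = random.Random(seed)
    samples = []
    for delta_tj, tj_min_c, t_on in conditions:
        scale = analytical_nf(delta_tj, tj_min_c, t_on)
        for _ in range(n_per_condition):
            n_f = rng.weibullvariate(scale, shape)  # (scale, shape) argument order
            samples.append(((delta_tj, tj_min_c, t_on), n_f))
    return samples


# Illustrative stress conditions: (delta_tj in K, tj_min in degC, heating time in s)
training = sample_training_set([(120.0, 40.0, 2.0), (140.0, 40.0, 2.0)])
print(len(training), "synthetic samples")
```

Each sample pairs an input vector with a scattered lifetime, so a network trained on such data has to recover the underlying trend from noisy targets, as it would with experimental results.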

Experimental power cycling tests were performed on 25 different combinations of junction temperature cycling, minimum junction temperature, and heating time. The MLP-NN was subsequently trained using the acquired experimental data. Despite the limited training data available, the MLP-NN demonstrated good performance in accurately fitting the experimental data across a wide range of input parameters involving various failure mechanisms.

To further assess the validity of the proposed model, the performance of the MLP-NN model was compared with that of an analytical model. Conventional analytical models can represent only a single failure mechanism, such as bond wire degradation or solder joint fatigue, and are inadequate when multiple failure mechanisms are present. In such cases, multiple lifetime models must be derived, which requires complex test structures. ANNs, on the other hand, prove to be a viable solution, as they can capture complex correlations between input and output variables that are difficult to model with traditional analytical methods. In fact, the RMSRE of the analytical model is more than double that of the MLP-NN approach. This highlights the capability of neural networks to handle multiple failure mechanisms and to provide more accurate predictions.
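For reference, the comparison metric can be written compactly. The sketch below assumes that RMSRE denotes the usual root-mean-square relative error between predicted and measured cycles to failure; the function name and the sample values are illustrative.

```python
import math


def rmsre(predicted, measured):
    """Root-mean-square relative error between predicted and measured lifetimes."""
    if len(predicted) != len(measured) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [((p - m) / m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical lifetimes in cycles (illustrative only)
measured_nf = [10000.0, 20000.0, 40000.0]
model_nf = [11000.0, 18000.0, 42000.0]
print(f"RMSRE = {rmsre(model_nf, measured_nf):.3f}")
```

Because each error is normalized by the measured lifetime, test conditions with very different numbers of cycles to failure contribute on the same scale.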

The application of deep learning techniques has been extended to develop a data-driven model for predicting the lifetime of power semiconductor devices. The proposed model uses **bLSTM** blocks and is trained on experimental on-voltage degradation profiles obtained from power cycling tests with temperature swings of 120°C and 140°C. The model predicts the lifetime of the devices by monitoring the on-voltage profile. When only a limited amount of data is available, the lifetime prediction falls within the range observed in the experimental samples used for training. As more data on the SoH of the tested device are acquired, the model's accuracy improves. The impact of dataset partitioning on the performance of the **bLSTM** networks is also analyzed. Specifically, 28 **bLSTM** networks are trained for each $\Delta T_j$ stress condition. These trained networks are then used to evaluate the RUL of test samples based on the monitored number of cycles. The relative error between the lifetime predicted by the neural network and the actual experimental lifetime tends to decrease as the number of monitored cycles increases. On average, the relative error among all the trained neural networks remains below 13%, and it can be as low as 5% when the monitoring time exceeds 80% of the device's lifetime.
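The quantities compared in this evaluation can be expressed in a few lines. This is a sketch under the assumption that the predicted lifetime equals the number of monitored cycles plus the predicted RUL; the function names and numbers are illustrative and not taken from the experiments.

```python
def ideal_rul(n_monitored, n_failure):
    """Ideal remaining useful life: cycles left before the experimental end of life."""
    return max(n_failure - n_monitored, 0)


def lifetime_relative_error(n_monitored, rul_predicted, n_failure):
    """Relative error of the lifetime implied by an RUL prediction made at n_monitored."""
    predicted_lifetime = n_monitored + rul_predicted
    return abs(predicted_lifetime - n_failure) / n_failure


# Hypothetical predictions for a device failing at 20000 cycles:
# (monitored cycles, RUL predicted by the network at that point)
n_failure = 20000
predictions = [(4000, 19000), (10000, 11500), (16000, 5000)]
for n_monitored, rul in predictions:
    err = lifetime_relative_error(n_monitored, rul, n_failure)
    print(n_monitored, ideal_rul(n_monitored, n_failure), f"{err:.1%}")
```

Evaluating the error at increasing fractions of the lifetime reproduces the kind of trend described above, where predictions made later in the test rely on more of the degradation profile.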

This work also defines a new ANN capable of predicting the State of Health (SoH) of a power device from $V_{on}$ information while accounting for variations in stress conditions in terms of $\Delta T_j$. This extension is motivated by the fact that current data-driven models, whether analytical or based on ANNs, rely exclusively on experimental $V_{on}$ degradation curves. Typically, constant stress conditions (e.g., a constant temperature swing $\Delta T_j$) are assumed for model calibration and training. However, this assumption is unrealistic for a generic mission profile. To overcome this limitation, a methodology has been developed to generate the $V_{on}$ degradation curve of a power device subjected to a generic power cycling mission profile. This curve can subsequently be used for ANN training, implementing a data-driven model for predicting the Remaining Useful Life (RUL). Importantly, this methodology eliminates the need to conduct tests under non-constant $\Delta T_j$ stress conditions. To validate the methodology, power cycling experiments were conducted on discrete IGBT devices in TO-247 packages. The $V_{on}$ degradation curves obtained experimentally under non-constant stress conditions and those estimated with the proposed methodology closely resemble each other. In particular, the prediction intervals of the two profiles overlap, confirming the validity of the proposed methodology.

A comprehensive ANN implementing a data-driven model to predict the RUL under varying stress conditions has been developed. This model combines two key elements: the dynamics of $V_{on}$ over the stress time, captured by the **bLSTM**, and the information on varying stress conditions, integrated into the FFNN through the dataset created by data augmentation. The training technique used for the initial **bLSTM**-based data-driven model has been reapplied to this new ANN. The ANN is tested through power cycling experiments across various mission profiles. The initial RUL estimates made by the ANN are imprecise, because during the initial prediction phase (referred to as the initial guess) the forecasted profile tends to converge towards the average profile used to train the ANN. However, as more samples are acquired and the variable stress conditions are taken into account, the predictions progressively approach the ideal RUL curve of the respective degradation profile. These findings show that the ANN adapts and improves its predictive accuracy as it acquires more data, especially in response to changing stress conditions during the test.

Extending the applicability of such models to wide bandgap devices entails addressing several significant issues. One of the primary challenges is the diversity of available packages, which requires adapting the experimental set-up to ensure accurate and comparable results. Moreover, the TSEP must be recalibrated to account for the specific characteristics of the new devices, introducing a dynamic element into the experimental procedure. These adaptations are crucial to preserve the reliability and precision of the results obtained during power cycling on wide bandgap devices. As a future perspective, further refinement of the experimental methodologies and calibration techniques will be essential for broadening the scope of these models to a wider array of wide bandgap devices.

- [1] European Center for Power Electronics (ECPE). *ECPE Position Paper on Energy Efficiency – the Role of Power*. Tech. rep. 2007.
- [2] Frede Blaabjerg and Simon Round. POWER ELECTRONICS: REVOLU-TIONIZING THE WORLD'S FUTURE ENERGY SYSTEMS. 2021. URL: https://www.hitachienergy.com/news/perspectives/2021/08/ power-electronics-revolutionizing-the-world-s-future-energysystems.
- [3] J. L. Moll, M. Tanenbaum, J. M. Goldey, and N. Holonyak. "PNPN Transistor Switches." In: *Proc. IRE* 44 (1956), pp. 1174–82.
- [4] Noriyuki Iwamuro and Thomas Laska. "IGBT History, State-of-the-Art, and Future Prospects." In: *IEEE Transactions on Electron Devices* 64.3 (2017), pp. 741–752. DOI: 10.1109/TED.2017.2654599.
- [5] H. W. Becke and C. F. Wheatley. "Power MOSFET with an anode region." Patent 4,364,073. 1982.
- [6] Guang Zeng. "Some aspects in lifetime prediction of power semiconductor devices." PhD thesis. Technische Universitat Chemntz, 2019.
- [7] Zhongting Tang, Yongheng Yang, and Frede Blaabjerg. "Power electronics: The enabling technology for renewable energy integration." In: *CSEE Journal of Power and Energy Systems* 8.1 (2022), pp. 39–52. DOI: 10.17775/CSEEJPES.2021.02850.
- [8] Huai Wang and Frede Blaabjerg. "Power Electronics Reliability: State of the Art and Outlook." In: *IEEE Journal of Emerging and Selected Topics in Power Electronics* 9.6 (2021), pp. 6476–6493. DOI: 10.1109/JESTPE. 2020.3037161.
- [9] Institute of Electrical and Electronics Engineers. *IEEE Standard Computer Dictionary: A Compilation of IEEE Standard Computer Glossaries*. New York, NY: IEEE, 1990. ISBN: 1-55937-079-3.
- [10] Kailash Kapur and Michael Pecht. *Reliability Engineering*. First. Hoboken, NJ, USA: Wiley, 2014.
- [11] Abu Hanif, Yuechuan Yu, Douglas DeVoto, and Faisal Khan. "A Comprehensive Review Toward the State-of-the-Art in Failure and Life-time Predictions of Power Electronic Devices." In: *IEEE Transactions on Power Electronics* 34.5 (2019), pp. 4729–4746. DOI: 10.1109/TPEL. 2018.2860587.
- [12] Huai Wang, Marco Liserre, Frede Blaabjerg, Peter de Place Rimmen, John B. Jacobsen, Thorkild Kvisgaard, and Jørn Landkildehus. "Transitioning to Physics-of-Failure as a Reliability Driver in Power Electronics." In: *IEEE Journal of Emerging and Selected Topics in Power Electronics* 2.1 (2014), pp. 97–114. DOI: 10.1109/JESTPE.2013.2290282.

- [13] Ariya Sangwongwanich and Frede Blaabjerg. "Reliability Assessment of Fault-Tolerant Power Converters including Wear-Out Failure." In: 2022 IEEE Applied Power Electronics Conference and Exposition (APEC).
   2022, pp. 300–306. DOI: 10.1109/APEC43599.2022.9773367.
- [14] Murat Demir, Gürmen Kahramanoğlu, and Ali Bekir Yıldız. "Importance of reliability for power electronic circuits, case study: Inrush current test and calculating of fuse melting point." In: 2016 IEEE International Power Electronics and Motion Control Conference (PEMC). 2016, pp. 830–834. DOI: 10.1109/EPEPEMC.2016.7752101.
- [15] Roy Billinton and W. Zhang. "Cost related reliability evaluation of bulk power systems." In: *International Journal of Electrical Power and Energy Systems* 23 (Feb. 2001), pp. 99–112. DOI: 10.1016/S0142-0615(00) 00046-6.
- [16] Lakshmi Reddy GopiReddy, Leon M. Tolbert, and Burak Ozpineci.
   "Power Cycle Testing of Power Switches: A Literature Survey." In: *IEEE Transactions on Power Electronics* 30.5 (2015), pp. 2465–2473. DOI: 10.1109/TPEL.2014.2359015.
- [17] S. Peyghami, T. Dragicevic, and F. Blaabjerg. "Intelligent long-term performance analysis in power electronics systems." In: *Scientific Reports* 11.1 (2021), p. 7557. DOI: 10.1038/s41598-021-87165-3.
- [18] Tomislav Dragičević, Patrick Wheeler, and Frede Blaabjerg. "Artificial Intelligence Aided Automated Design for Reliability of Power Electronic Systems." In: *IEEE Transactions on Power Electronics* 34.8 (2019), pp. 7161–7171. DOI: 10.1109/TPEL.2018.2883947.
- [19] Shiyi Liu, Dao Zhou, Chao Wu, and Frede Blaabjerg. "Recurrent neural networks model based reliability assessment of power semiconductors in PMSG converter." In: *Microelectronics Reliability* 126 (2021), p. 114314.
   ISSN: 0026-2714. DOI: https://doi.org/10.1016/j.microrel.2021. 114314.
- [20] J. Lutz, Heinrich Schlangenotto, Uwe Scheuermann, and Rik De Doncker. Semiconductor Power Devices - Physics, Characteristics, Reliability. Jan. 2018. ISBN: 978-3-319-70917-8. DOI: 10.1007/978-3-319-70917-8.
- [21] Shaoyong Yang, Angus Bryant, Philip Mawby, Dawei Xiang, Li Ran, and Peter Tavner. "An Industry-Based Survey of Reliability in Power Electronic Converters." In: *IEEE Transactions on Industry Applications* 47.3 (2011), pp. 1441–1451. DOI: 10.1109/TIA.2011.2124436.
- [22] Shaoyong Yang, Dawei Xiang, Angus Bryant, Philip Mawby, Li Ran, and Peter Tavner. "Condition Monitoring for Device Reliability in Power Electronic Converters: A Review." In: *IEEE Transactions on Power Electronics* 25.11 (2010), pp. 2734–2752. DOI: 10.1109/TPEL.2010.2049377.
- [23] E. Wolfgang. "Examples for failures in power electronics system." In: ECPE Tutorial on Reliability of Power Electronics System. Nuremberg, Germany, 2007.
- [24] ZVEL. Handbook for Robustness Validation of Automotive Electrical/Electronic Modules. Tech. rep. ZVEL, 2007.
- [25] Infineon Technologies AG. AN2019-05: PC and TC Diagrams. [Online]. 2019. URL: https://www.infineon.com/dgdl/Infineon-AN2019-05\_ PC\_and\_TC\_Diagrams - ApplicationNotes - v02\_01 - EN.pdf?fileId= 5546d46269e1c019016a594443e4396b.
- [26] Zoubir Khatir, Son-Ha Tran, Ali Ibrahim, Richard Lallemand, and Nicolas Degrenne. "Effect of load sequence interaction on bond-wire lifetime due to power cycling." In: *Scientific Reports* 11 (Mar. 2021). DOI: 10.1038/s41598-021-84976-2.
- [27] "8 Production capability management." In: *Practical E-Manufacturing and Supply Chain Management*. Ed. by Gerhard Greeff and Ranjan Ghoshal. Oxford: Newnes, 2004, pp. 214–242. ISBN: 978-0-7506-6272-7. DOI: https://doi.org/10.1016/B978-075066272-7/50011-7.
- [28] G.A. Klutke, P.C. Kiessler, and M.A. Wortman. "A critical look at the bathtub curve." In: *IEEE Transactions on Reliability* 52.1 (2003), pp. 125– 129. DOI: 10.1109/TR.2002.804492.
- [29] Huai Wang, Ke Ma, and Frede Blaabjerg. "Design for reliability of power electronic systems." In: IECON 2012 - 38th Annual Conference on IEEE Industrial Electronics Society. 2012, pp. 33–44. DOI: 10.1109/ IECON.2012.6388833.
- [30] Yong Liu and Dan Kinzer. "Challenges of power electronic packaging and modeling." In: 2011 12th Intl. Conf. on Thermal, Mechanical & Multi-Physics Simulation and Experiments in Microelectronics and Microsystems.
   2011, pp. 1/9–9/9. DOI: 10.1109/ESIME.2011.5765799.
- [31] Alexander Otto, Rainer Dudek, Ralf Doering, and Sven Rzepka. "Investigating the mold compounds influence on power cycling lifetime of discrete power devices." In: *PCIM Europe 2019; International Exhibition and Conference for Power Electronics, Intelligent Motion, Renewable Energy and Energy Management.* 2019, pp. 1–8.
- [32] Kay Hofmann, Christian Herold, Menia Beier, Josef Lutz, and Jens Friebe. "Reliability of discrete power semiconductor packages and systems — D2Pak and CanPAK in comparison." In: 2013 15th European Conference on Power Electronics and Applications (EPE). 2013, pp. 1–10. DOI: 10.1109/EPE.2013.6634611.
- [33] Karsten Guth, Dirk H. Siepe, Jens Görlich, Holger Torwesten, Roman Roth, Frank Hille, and Frank Umbach. "New assembly and interconnects beyond sintering methods." In: 2010.
- [34] Uwe Scheuermann and Marion Junghaenel. "Limitation of Power Module Lifetime Derived from Active Power Cycling Tests." In: CIPS 2018; 10th International Conference on Integrated Power Electronics Systems. 2018, pp. 1–10.
- [35] Xizi Zhang, Zhongyuan Chen, Zhongkang Lin, Lei Zhang, Jinyuan Li, and Guanbin Wu. "Internal pressure distributions of press-pack IGBT modules under two contact methods." In: 2019 IEEE 3rd International Electrical and Energy Conference (CIEEC). 2019, pp. 612–617. DOI: 10. 1109/CIEEC47146.2019.CIEEC-2019254.
- [36] Mauro Ciappa. "Selected failure mechanisms of modern power modules." In: *Microelectron. Reliab.* 42 (2002), pp. 653–667.

- [37] Serkan Dusmez, Syed Huzaif Ali, Mehrdad Heydarzadeh, Anant S. Kamath, Hamit Duran, and Bilal Akin. "Aging Precursor Identification and Lifetime Estimation for Thermally Aged Discrete Package Silicon Power Switches." In: *IEEE Transactions on Industry Applications* 53.1 (2017), pp. 251–260. DOI: 10.1109/TIA.2016.2603144.
- [38] S. Ramminger, Norbert Seliger, and Gerhard Wachutka. "Reliability model for Al wire bonds subjected to heel crack failures." In: *Microelectronics Reliability* 40 (Aug. 2000), pp. 1521–1525. DOI: 10.1016/S0026-2714(00)00139-6.
- [39] Tobias Herrmann, Marco Feller, Josef Lutz, Reinhold Bayerer, and Thomas Licht. "Power cycling induced failure mechanisms in solder layers." In: 2007 European Conference on Power Electronics and Applications. 2007, pp. 1–7. DOI: 10.1109/EPE.2007.4417702.
- [40] Wuchen Wu, M. Held, P. Jacob, P. Scacco, and A. Birolini. "Thermal stress related packaging failure in power IGBT modules." In: *Proceedings of International Symposium on Power Semiconductor Devices and IC's: ISPSD* '95. 1995, pp. 330–334. DOI: 10.1109/ISPSD.1995.515059.
- [41] J.-M. Thebaud, E. Woirgard, C. Zardini, S. Azzopardi, O. Briat, and J.-M. Vinassa. "Strategy for designing accelerated aging tests to evaluate IGBT power modules lifetime in real operation mode." In: *IEEE Transactions on Components and Packaging Technologies* 26.2 (2003), pp. 429– 438. DOI: 10.1109/TCAPT.2003.815112.
- [42] Mauro Ciappa. Some Reliability Aspects of IGBT Modules for High-Power Applications. Jan. 2001. ISBN: 3-89649-657-3. DOI: 10.13140/RG.2.1. 3644.4006.
- [43] D.L. Blackburn. "Temperature measurements of semiconductor devices

   a review." In: *Twentieth Annual IEEE Semiconductor Thermal Measurement and Management Symposium (IEEE Cat. No.04CH37545)*. 2004, pp. 70–80. DOI: 10.1109/STHERM.2004.1291304.
- [44] Janusz Zarebski and Krzysztof Górecki. "The Electrothermal Large-Signal Model of Power MOS Transistors for SPICE." In: *IEEE Transactions on Power Electronics* 25.5 (2010), pp. 1265–1274. DOI: 10.1109/TPEL. 2009.2036850.
- [45] Leonardo Hillkirk. "Dynamic surface temperature measurements in SiC epitaxial power diodes performed under single-pulse self-heating conditions." In: *Solid-state Electronics - SOLID STATE ELECTRON* 48 (Dec. 2004), pp. 2181–2189. DOI: 10.1016/j.sse.2004.05.077.
- [46] Xavier Perpiñà, J.F. Serviere, J. Saiz, D. Barlini, Michel Mermet-Guyennet, and Jose Millan. "Temperature measurement on series resistance and devices in power packs based on on-state voltage drop monitoring at high current." In: *Microelectronics Reliability* 46 (Sept. 2006), pp. 1834– 1839. DOI: 10.1016/j.microrel.2006.07.078.
- [47] Laurent Dupont, Yvan Avenas, and Pierre-Olivier Jeannin. "Comparison of junction temperature evaluations in a power IGBT module using an IR camera and three thermo-sensitive electrical parameters." In: 2012 Twenty-Seventh Annual IEEE Applied Power Electronics Conference

and Exposition (APEC). 2012, pp. 182–189. DOI: 10.1109/APEC.2012. 6165817.

- [48] Mohamed Halick Mohamed Sathik, Josep Pou, Sundararajan Prasanth, Vivek Muthu, Rejeki Simanjorang, and Amit Kumar Gupta. "Comparison of IGBT junction temperature measurement and estimation methods-a review." In: 2017 Asian Conference on Energy, Power and Transportation Electrification (ACEPT). 2017, pp. 1–8. DOI: 10.1109/ACEPT. 2017.8168600.
- [49] Kerry Maize, Xi Wang, Dustin Kendig, Ali Shakouri, William French, Barry O'Connell, Philip Lindorfer, and Peter Hopper. "Thermal characterization of high power transistor arrays." In: 2009 25th Annual IEEE Semiconductor Thermal Measurement and Management Symposium. 2009, pp. 50–54. DOI: 10.1109/STHERM.2009.4810742.
- [50] Yong-Seok Kim and Seung-Ki Sul. "On-line estimation of IGBT junction temperature using on-state voltage drop." In: *Conference Record* of 1998 IEEE Industry Applications Conference. Thirty-Third IAS Annual Meeting (Cat. No.98CH36242). Vol. 2. 1998, 853–859 vol.2. DOI: 10.1109/ IAS.1998.730245.
- [51] M. Held, P. Jacob, G. Nicoletti, P. Scacco, and M.-H. Poech. "Fast power cycling test of IGBT modules in traction application." In: *Proceedings of Second International Conference on Power Electronics and Drive Systems*. Vol. 1. 1997, 425–430 vol.1. DOI: 10.1109/PEDS.1997.618742.
- [52] Ui-Min Choi, F. Blaabjerg, Francesco Iannuzzo, and S. Jørgensen. "Junction temperature estimation method for a 600V, 30A IGBT module during converter operation." In: *Microelectronics Reliability* 55 (July 2015). DOI: 10.1016/j.microrel.2015.06.146.
- [53] Christian Herold, Jörg Franke, Riteshkumar Bhojani, Andre Schleicher, and J. Lutz. "Requirements in power cycling for precise lifetime estimation." In: *Microelectronics Reliability* 58 (Jan. 2016). DOI: 10.1016/j. microrel.2015.12.035.
- [54] Andreas Koenig, Thomas Plum, Peter Fidler, and Rik W. De Doncker.
   "On-line Junction Temperature Measurement of CoolMOS Devices." In: 2007 7th International Conference on Power Electronics and Drive Systems.
   2007, pp. 90–95. DOI: 10.1109/PEDS.2007.4487683.
- [55] Laurent Dupont and Yvan Avenas. "Evaluation of thermo-sensitive electrical parameters based on the forward voltage for on-line chip temperature measurements of IGBT devices." In: 2014 IEEE Energy Conversion Congress and Exposition (ECCE). 2014, pp. 4028–4035. DOI: 10.1109/ECCE.2014.6953950.
- [56] Bastian Strauss and Andreas Lindemann. "Measuring the junction temperature of an IGBT using its threshold voltage as a temperature sensitive electrical parameter (TSEP)." In: 2016 13th International Multi-Conference on Systems, Signals & Devices (SSD). 2016, pp. 459–467. DOI: 10.1109/SSD.2016.7473664.
- [57] Uwe Scheuermann and Ralf Schmidt. "Investigations on the VCE(T)-Method to Determine the Junction Temperature by Using the Chip Itself as Sensor." In: May 2009.

- [58] Sophia Frankeser, Sebastian Hiller, Gerd Wachsmuth, and Josef Lutz. "Using the on-state-Vbe,sat-voltage for temperature estimation of SiC-BJTs during normal operation." In: *Proceedings of PCIM Europe 2015; International Exhibition and Conference for Power Electronics, Intelligent Motion, Renewable Energy and Energy Management.* 2015, pp. 1–8.
- [59] Zoubir Khatir, Laurent Dupont, and Ali Ibrahim. "Investigations on junction temperature estimation based on junction voltage measurements." In: *Microelectronics Reliability* 50 (Sept. 2010), pp. 1506–1510. DOI: 10.1016/j.microrel.2010.07.102.
- [60] Paolo Cova, Mauro Ciappa, Giovanni Franceschini, P. Malberti, and Fausto Fantini. "Thermal characterization of IGBT power modules." In: *Microelectronics Reliability* 37 (Oct. 1997), pp. 1731–1734. DOI: 10. 1016/S0026-2714(97)00150-9.
- [61] David L. Blackburn and David W. Berning. "Power MOSFET temperature measurements." In: 1982 IEEE Power Electronics Specialists conference. 1982, pp. 400–407. DOI: 10.1109/PESC.1982.7072436.
- [62] C. Herold, J. Sun, P. Seidel, L. Tinschert, and J. Lutz. "Power cycling methods for SiC MOSFETs." In: 2017 29th International Symposium on Power Semiconductor Devices and IC's (ISPSD). 2017, pp. 367–370. DOI: 10.23919/ISPSD.2017.7988994.
- [63] Jörg Franke, Guang Zeng, Tom Winkler, and J. Lutz. "Power cycling reliability results of GaN HEMT devices." In: May 2018, pp. 467–470. DOI: 10.1109/ISPSD.2018.8393704.
- [64] David L. Blackburn. "An Electrical Technique for the Measurement of the Peak Junction Temperature of Power Transistors." In: 13th International Reliability Physics Symposium. 1975, pp. 142–150. DOI: 10.1109/ IRPS.1975.362688.
- [65] W. Shockley. "The theory of p-n junctions in semiconductors and p-n junction transistors." In: *The Bell System Technical Journal* 28.3 (1949), pp. 435–489. DOI: 10.1002/j.1538-7305.1949.tb03645.x.
- [66] Aivars J. Lelis, Daniel Habersat, Ronald Green, Aderinto Ogunniyi, Moshe Gurfinkel, John Suehle, and Neil Goldsman. "Time Dependence of Bias-Stress-Induced SiC MOSFET Threshold-Voltage Instability Measurements." In: *IEEE Transactions on Electron Devices* 55.8 (2008), pp. 1835– 1840. DOI: 10.1109/TED.2008.926672.
- [67] Thomas Aichinger, Gerald Rescher, and Gregor Pobegen. "Threshold voltage peculiarities and bias temperature instabilities of SiC MOS-FETs." In: *Microelectronics Reliability* 80 (Jan. 2018), pp. 68–78. DOI: 10.1016/j.microrel.2017.11.020.
- [68] C. Durand, M. Klingler, D. Coutellier, and H. Naceur. "Power Cycling Reliability of Power Module: A Survey." In: *IEEE Transactions on Device and Materials Reliability* 16.1 (2016), pp. 80–97. DOI: 10.1109/TDMR.2016. 2516044.
- [69] Semiconductor device Mechanical and climatic test methods Part 34: Power cycling. Tech. rep. 60749-34. International Electrotechnical Commission, Year.

- [70] Qualification of power electronic modules for use in motor vehicle components, general requirements, test conditions and tests. Tech. rep. LV 324. International Electrotechnical Commission, 2017.
- [71] European Center for Power Electronics (ECPE). *ECPE Guideline AQG* 324: *Qualification of power modules for use in power electronics converter units (PCUs) in motor vehicles.* Tech. rep. ECPE AQG 324. 2018.
- [72] Guang Zeng, Felix Wenisch-Kober, and J. Lutz. "Study on power cycling test with different control strategies." In: *Microelectronics Reliability* 88-90 (Sept. 2018), pp. 756–761. DOI: 10.1016/j.microrel.2018.07.088.
- [73] Stefan Schuler and Uwe Scheuermann. "Impact of Test Control Strategy on Power Cycling Lifetime." In: May 2010.
- [74] Uwe Scheuermann and Ralf Schmidt. "Impact of solder fatigue on module lifetime in power cycling tests." In: *Proceedings of the 2011 14th European Conference on Power Electronics and Applications*. 2011, pp. 1– 10.
- [75] Camille Durand, Markus Klingler, Maxence Bigerelle, and Coutellier Daniel. "Solder fatigue failures in a new designed power module under Power Cycling." In: *Microelectronics Reliability* 66 (Oct. 2016). DOI: 10.1016/j.microrel.2016.10.002.
- [76] R. Dudek et al. "Investigations on power cycling induced fatigue failure of IGBTs with silver sinterea interconnects." In: 2015 European Microelectronics Packaging Conference (EMPC). 2015, pp. 1–8.
- [77] V.A. Sankaran, C. Chen, C.S. Avant, and X. Xu. "Power cycling reliability of IGBT power modules." In: IAS '97. Conference Record of the 1997 IEEE Industry Applications Conference Thirty-Second IAS Annual Meeting. Vol. 2. 1997, 1222–1227 vol.2. DOI: 10.1109/IAS.1997.630841.
- [78] J. Lutz, T. Herrmann, M. Feller, R. Bayerer, T. Licht, and Raed Amro. "Power cycling induced failure mechanisms in the viewpoint of rough temperature environment." In: 5th International Conference on Integrated Power Electronics Systems. 2008, pp. 1–4.
- [79] A. Morozumi, K. Yamada, T. Miyasaka, and Y. Seki. "Reliability of power cycling for IGBT power semiconductor modules." In: *Conference Record of the 2001 IEEE Industry Applications Conference*. 36th IAS Annual Meeting (Cat. No.01CH37248). Vol. 3. 2001, 1912–1918 vol.3. DOI: 10. 1109/IAS.2001.955791.
- [80] Son-Ha Tran, Zoubir Khatir, Richard Lallemand, Ali Ibrahim, Jean-Pierre Ousten, Jeffrey Ewanchuk, and Stefan V. Mollov. "Constant DeltaTj Power Cycling Strategy in DC Mode for Top-Metal and Bond-Wire Contacts Degradation Investigations." In: *IEEE Transactions on Power Electronics* 34.3 (2019), pp. 2171–2180. DOI: 10.1109/TPEL.2018.2847234.
- [81] Zoltan Sarkany, Andras Vass-Varnai, and Marta Rencz. "Comparison of different power cycling strategies for accelerated lifetime testing of power devices." In: *Proceedings of the 5th Electronics System-integration Technology Conference (ESTC)*. 2014, pp. 1–5. DOI: 10.1109/ESTC.2014. 6962833.

- [82] Nausicaa Dornic, Zoubir Khatir, Son Ha Tran, Ali Ibrahim, Richard Lallemand, Jean-Pierre Ousten, Jeffrey Ewanchuk, and Stefan V. Mollov. "Stress-Based Model for Lifetime Estimation of Bond Wire Contacts Using Power Cycling Tests and Finite-Element Modeling." In: *IEEE Journal of Emerging and Selected Topics in Power Electronics* 7.3 (2019), pp. 1659–1667. DOI: 10.1109/JESTPE.2019.2918941.
- [83] Koji Sasaki, Naoko Iwasa, Toshiki Kurosu, Katsuaki Saito, Yoshihiko Koike, Yukio Kamita, and Yasushi Toyoda. "Thermal and Structural Simulation Techniques for Estimating Fatigue Life of an IGBT Module." In: 2008 20th International Symposium on Power Semiconductor Devices and IC's. 2008, pp. 181–184. DOI: 10.1109/ISPSD.2008.4538928.
- [84] Oliver Schilling, M. Schäfer, Krzysztof Mainka, M. Thoben, and Frank Sauerland. "Power cycling testing and FE modelling focussed on Al wire bond fatigue in high power IGBT modules." In: *Microelectronics Reliability* 52 (Sept. 2012), pp. 2347–2352. DOI: 10.1016/j.microrel.2012.06.095.
- [85] Borong Hu, Sylvia Konaklieva, Nadia Kourra, Mark A. Williams, Li Ran, and Wei Lai. "Long-Term Reliability Evaluation of Power Modules With Low Amplitude Thermomechanical Stresses and Initial Defects." In: *IEEE Journal of Emerging and Selected Topics in Power Electronics* 9.1 (2021), pp. 602–615. DOI: 10.1109/JESTPE.2019.2958737.
- [86] L F Coffin Jr. "A STUDY OF THE EFFECTS OF CYCLIC THERMAL STRESSES ON A DUCTILE METAL." In: (June 1953). URL: https://www.osti.gov/biblio/4363016.
- [87] S. S. Manson. "Behavior of materials under conditions of thermal stress." In: 1953.
- [88] Reinhold Bayerer, Tobias Herrmann, Thomas Licht, Josef Lutz, and Marco Feller. "Model for Power Cycling lifetime of IGBT Modules various factors influencing lifetime." In: 5th International Conference on Integrated Power Electronics Systems. 2008, pp. 1–6.
- [89] Uwe Scheuermann and Ralf Schmidt. "A New Lifetime Model for Advanced Power Modules with Sintered Chips and Optimized Al Wire Bonds." In: May 2013.
- [90] Guang Zeng, Ludger Borucki, Oliver Wenzel, Oliver Schilling, and Josef Lutz. "First Results of Development of a Lifetime Model for Transfer Molded Discrete Power Devices." In: PCIM Europe 2018; International Exhibition and Conference for Power Electronics, Intelligent Motion, Renewable Energy and Energy Management. 2018, pp. 1–8.
- [91] Alexander Otto and Sven Rzepka. "Lifetime modelling of discrete power electronic devices for automotive applications." In: AmE 2019 - Automotive meets Electronics; 10th GMM-Symposium. 2019, pp. 1–6.
- [92] M. Hernes, Salvatore D'Arco, Antonios Antonopoulos, and Dimosthenis Peftitsis. "Failure analysis and lifetime assessment of IGBT power modules at low temperature stress cycles." In: *IET Power Electronics* 14 (May 2021). DOI: 10.1049/pel2.12083.

- [93] Josef Lutz, Christian Schwabe, Guang Zeng, and Lukas Hein. "Validity of power cycling lifetime models for modules and extension to low temperature swings." In: 2020 22nd European Conference on Power Electronics and Applications (EPE'20 ECCE Europe). 2020, P.1–P.9. DOI: 10.23919/EPE20ECCEEurope43536.2020.9215609.
- [94] Ralf Schmidt, Felix Zeyss, and Uwe Scheuermann. "Impact of absolute junction temperature on power cycling lifetime." In: 2013 15th European Conference on Power Electronics and Applications (EPE). 2013, pp. 1–10. DOI: 10.1109/EPE.2013.6631835.
- [95] Keting Hu, Zhigang Liu, He Du, Lorenzo Ceccarelli, Francesco Iannuzzo, Frede Blaabjerg, and Ibrahim Adamu Tasiu. "Cost-Effective Prognostics of IGBT Bond Wires With Consideration of Temperature Swing." In: *IEEE Transactions on Power Electronics* 35.7 (2020), pp. 6773–6784. DOI: 10.1109/TPEL.2019.2959953.
- [96] Moinul Shahidul Haque, Seungdeog Choi, and Jeihoon Baek. "Auxiliary Particle Filtering-Based Estimation of Remaining Useful Life of IGBT." In: *IEEE Transactions on Industrial Electronics* 65.3 (2018), pp. 2693–2703. DOI: 10.1109/TIE.2017.2740856.
- [97] Zhen Rao, Meng Huang, and Xiaoming Zha. "IGBT Remaining Useful Life Prediction Based on Particle Filter With Fusing Precursor." In: *IEEE Access* 8 (2020), pp. 154281–154289. DOI: 10.1109/ACCESS.2020.3017949.
- [98] Mahera Musallam and C. Mark Johnson. "Real-Time Compact Thermal Models for Health Management of Power Electronics." In: *IEEE Transactions on Power Electronics* 25.6 (2010), pp. 1416–1425. DOI: 10.1109/TPEL.2010.2040634.
- [99] Anis Ammous, Bruno Allard, and Hervé Morel. "Transient temperature measurements and modeling of IGBT's under short circuit." In: *Power Electronics, IEEE Transactions on* 13 (Feb. 1998), pp. 12–25. DOI: 10.1109/63.654955.
- [100] L. Meysenc, L. Saludjian, A. Bricard, S. Rael, C. Schaeffer, and D. Wagner. "A high heat flux IGBT micro exchanger setup." In: IAS '96. Conference Record of the 1996 IEEE Industry Applications Conference Thirty-First IAS Annual Meeting. Vol. 3. 1996, 1309–1316 vol.3. DOI: 10.1109/IAS.1996.559235.
- [101] Yvan Avenas, Laurent Dupont, and Zoubir Khatir. "Temperature Measurement of Power Semiconductor Devices by Thermo-Sensitive Electrical Parameters—A Review." In: *IEEE Transactions on Power Electronics* 27.6 (2012), pp. 3081–3092. DOI: 10.1109/TPEL.2011.2178433.
- [102] Ui-Min Choi, Søren Jørgensen, and Frede Blaabjerg. "Advanced Accelerated Power Cycling Test for Reliability Investigation of Power Device Modules." In: *IEEE Transactions on Power Electronics* 31.12 (2016), pp. 8371–8386. DOI: 10.1109/TPEL.2016.2521899.
- [103] Uwe Scheuermann and Stefan Schuler. "Power cycling results for different control strategies." In: *Microelectronics Reliability* 50.9-11 (2010), pp. 1203–1209.

- [104] Ke Ma, Ui-Min Choi, and Frede Blaabjerg. "Prediction and Validation of Wear-Out Reliability Metrics for Power Semiconductor Devices With Mission Profiles in Motor Drive Application." In: *IEEE Transactions on Power Electronics* 33.11 (2018), pp. 9843–9853. DOI: 10.1109/TPEL.2018.2798585.
- [105] Lorenzo Ceccarelli, Ramchandra M. Kotecha, Amir Sajjad Bahman, Francesco Iannuzzo, and Homer Alan Mantooth. "Mission-Profile-Based Lifetime Prediction for a SiC MOSFET Power Module Using a Multi-Step Condition-Mapping Simulation Strategy." In: *IEEE Transactions on Power Electronics* 34.10 (2019), pp. 9698–9708. DOI: 10.1109/TPEL.2019.2893636.
- [106] Dao Zhou, Huai Wang, Frede Blaabjerg, Soeren Knudsen Kaer, and Daniel Blom-Hansen. "Real mission profile based lifetime estimation of fuel-cell power converter." In: 2016 IEEE 8th International Power Electronics and Motion Control Conference (IPEMC-ECCE Asia). 2016, pp. 2798–2805. DOI: 10.1109/IPEMC.2016.7512741.
- [107] Ui-Min Choi, Ke Ma, and Frede Blaabjerg. "Validation of Lifetime Prediction of IGBT Modules Based on Linear Damage Accumulation by Means of Superimposed Power Cycling Tests." In: *IEEE Transactions on Industrial Electronics* 65.4 (2018), pp. 3520–3529. DOI: 10.1109/TIE.2017.2752142.
- [108] Lakshmi Reddy GopiReddy, Leon M. Tolbert, Burak Ozpineci, and João O. P. Pinto. "Rainflow Algorithm-Based Lifetime Estimation of Power Semiconductors in Utility Applications." In: *IEEE Transactions on Industry Applications* 51.4 (2015), pp. 3368–3375. DOI: 10.1109/TIA.2015.2407055.
- [109] Uwe Scheuermann and U. Hecht. "Power cycling lifetime of advanced power modules for different temperature swings." In: May 2002.
- [110] Guang Zeng, Christian Herold, Torsten Methfessel, Marc Schäfer, Oliver Schilling, and Josef Lutz. "Experimental Investigation of Linear Cumulative Damage Theory With Power Cycling Test." In: *IEEE Transactions on Power Electronics* 34.5 (2019), pp. 4722–4728. DOI: 10.1109/TPEL.2018.2859479.
- [111] Haiyu Qi, Michael Osterman, and Michael Pecht. "Plastic Ball Grid Array Solder Joint Reliability for Avionics Applications." In: *Components and Packaging Technologies, IEEE Transactions on* 30 (July 2007), pp. 242–247. DOI: 10.1109/TCAPT.2007.898346.
- [112] Kumar Upadhyayula and Abhijit Dasgupta. "An incremental damage superposition approach for reliability of electronic interconnects under combined accelerated stresses." In: *ASME international mechanical engineering congress & exposition.* Dallas, 1997.
- [113] Andy Perkins and Suresh K. Sitaraman. "A study into the sequencing of thermal cycling and vibration tests." In: 2008 58th Electronic Components and Technology Conference. 2008, pp. 584–592. DOI: 10.1109/ECTC.2008.4550032.

- [114] Cemal Basaran and Rumpa Chandaroy. "Thermomechanical Analysis of Solder Joints Under Thermal and Vibrational Loading." In: *Journal of Electronic Packaging* 124 (Mar. 2002). DOI: 10.1115/1.1400752.
- [115] D. Ghaderi, M. Pourmahdavi, V. Samavatian, O. Mir, and M. Samavatian. "Combination of thermal cycling and vibration loading effects on the fatigue life of solder joints in a power module." In: *Proceedings of the Institution of Mechanical Engineers, Part L: Journal of Materials: Design and Applications* 233.9 (2019), pp. 1753–1763. DOI: 10.1177/1464420718780525.
- [116] R. Amro, J. Lutz, and A. Lindemann. "Power cycling with high temperature swing of discrete components based on different technologies." In: 2004 IEEE 35th Annual Power Electronics Specialists Conference (IEEE Cat. No.04CH37551). Vol. 4. 2004, 2593–2598 Vol.4. DOI: 10.1109/PESC.2004.1355239.
- [117] Stephen Luko. "A Review of the Weibull Distribution and Selected Engineering Applications." In: (Sept. 1999). DOI: 10.4271/1999-01-2859.
- [118] Anil K. Jain, Jianchang Mao, and K. M. Mohiuddin. "Artificial Neural Networks: A Tutorial." In: 29.3 (1996). ISSN: 0018-9162. DOI: 10.1109/2.485891. URL: https://doi.org/10.1109/2.485891.
- [119] Amruta Dongare, R. R. Kharde, and Amit D. Kachare. "Introduction to Artificial Neural Network." In: 2012.
- [120] Roman Novak, Yasaman Bahri, Diego A Abolafia, Jeffrey Pennington, and Jascha Sohl-Dickstein. "Sensitivity and generalization in neural networks: An empirical study." In: (2018).
- [121] Fionn Murtagh. "Multilayer perceptrons for classification and regression." In: Neurocomputing 2.5 (1991), pp. 183–197. ISSN: 0925-2312. DOI: https://doi.org/10.1016/0925-2312(91)90023-5. URL: https://www.sciencedirect.com/science/article/pii/0925231291900235.
- [122] Alireza Alghassi, Suresh Perinpanayagam, and Mohammad Samie. "Stochastic RUL Calculation Enhanced With TDNN-Based IGBT Failure Modeling." In: *IEEE Transactions on Reliability* 65.2 (2016), pp. 558–573. DOI: 10.1109/TR.2015.2499960.
- [123] Zhan Li, Yuan Gao, Hao Ma, and Xin Zhang. "A Simple ANN-Based Diagnosis Method for Open-Switch Faults in Power Converters." In: *IECON 2020 The 46th Annual Conference of the IEEE Industrial Electronics Society.* 2020, pp. 2835–2839. DOI: 10.1109/IECON43393.2020.9254607.
- [124] Zhiqiang Xu, Yang Gao, Xupeng Wang, Xiangyu Tao, and Qiang Xu. "Surrogate Thermal Model for Power Electronic Modules using Artificial Neural Network." In: *IECON Proceedings (Industrial Electronics Conference)*. Vol. 2019-Octob. 2019, pp. 3160–3165. DOI: 10.1109/IECON.2019.8927494.
- [125] Yao Zhang, Zeyu Wang, Haoran Wang, and Frede Blaabjerg. "Artificial Intelligence-Aided Thermal Model Considering Cross-Coupling Effects." In: *IEEE Transactions on Power Electronics* 35.10 (2020). DOI: 10.1109/TPEL.2020.2980240.

- [126] Baozhu Hu et al. "Heat-Flux-Based Condition Monitoring of Multichip Power Modules Using a Two-Stage Neural Network." In: *IEEE Transactions on Power Electronics* 36.7 (2021). DOI: 10.1109/TPEL.2020.3045604.
- [127] Karthik Pugalenthi, Huisung Park, and N. Raghavan. "Prognosis of power MOSFET resistance degradation trend using artificial neural network approach." In: *Microelectronics Reliability* 100-101 (2019). DOI: 10.1016/j.microrel.2019.113467.
- [128] Weizhi Li, Binyu Wang, Junyong Liu, Guorong Zhang, and Junyan Wang. "IGBT aging monitoring and remaining lifetime prediction based on long short-term memory (LSTM) networks." In: *Microelectronics Reliability* 114 (2020). DOI: 10.1016/j.microrel.2020.113902.
- [129] Zhenyu Lu, Changming Guo, Ming Liu, and Ruoyu Shi. "Remaining useful lifetime estimation for discrete power electronic devices using physics-informed neural network." In: *Scientific Reports* 13.1 (2023). DOI: 10.1038/s41598-023-37154-5.
- [130] Panorios Benardos and George-Christopher Vosniakos. "Optimizing feedforward artificial neural network architecture." In: *Engineering Applications of Artificial Intelligence* 20 (Apr. 2007), pp. 365–382. DOI: 10.1016/j.engappai.2006.06.005.
- [131] Sibi Ittiyavirah, S. Jones, and P. Siddarth. "Analysis of different activation functions using Backpropagation Neural Networks." In: *Journal of Theoretical and Applied Information Technology* 47 (Jan. 2013), pp. 1344–1348.
- [132] Siddharth Sharma, Simone Sharma, and Anidhya Athaiya. "ACTIVATION FUNCTIONS IN NEURAL NETWORKS." In: International Journal of Engineering Applied Sciences and Technology 04 (May 2020), pp. 310–316. DOI: 10.33564/IJEAST.2020.v04i12.054.
- [133] Anil Jain, Karthik Nandakumar, and Arun Ross. "Score normalization in multimodal biometric system." In: *Pattern Recognition* 38 (Dec. 2005), pp. 2270–2285. DOI: 10.1016/j.patcog.2005.01.012.
- [134] David E. Rumelhart, Geoffrey E. Hinton, and Ronald J. Williams. "Learning representations by back-propagating errors." In: *Nature* 323 (1986), pp. 533–536.
- [135] Sima Siami-Namini, Neda Tavakoli, and Akbar Siami Namin. "The Performance of LSTM and BiLSTM in Forecasting Time Series." In: 2019 IEEE International Conference on Big Data (Big Data). 2019, pp. 3285–3292. DOI: 10.1109/BigData47090.2019.9005997.
- [136] M. Schuster and K.K. Paliwal. "Bidirectional recurrent neural networks." In: *IEEE Transactions on Signal Processing* 45.11 (1997), pp. 2673–2681. DOI: 10.1109/78.650093.
- [137] Tangbin Xia, Ya Song, Yu Zheng, Ershun Pan, and Lifeng Xi. "An ensemble framework based on convolutional bi-directional LSTM with multiple time windows for remaining useful life estimation." In: *Computers in Industry* 115 (Feb. 2020), p. 103182. DOI: 10.1016/j.compind.2019.103182.

- [138] Ralf Schmidt and Uwe Scheuermann. "Separating Failure Modes in Power Cycling Tests." In: 2012 7th International Conference on Integrated Power Electronics Systems (CIPS). 2012, pp. 1–6.
- [139] Ivana Kovacevic-Badstuebner, Johann Kolar, and Uwe Schilling. "Modelling for the lifetime prediction of power semiconductor modules." In: Dec. 2015, pp. 3–137. ISBN: 978-1-84919-901-8. DOI: 10.1049/PBP0080E\_ch5.
- [140] James Durbin and Siem Jan Koopman. *Time Series Analysis by State Space Methods: Second Edition*. Oxford Stat. Sci. Ser. 2012.
- [141] P. Murugan and S. Durairaj. "Regularization and Optimization strategies in Deep Convolutional Neural Network." In: (2017). URL: http://arxiv.org/abs/1712.04711.
- [142] Alessandro Vaccaro, Andrea Zilio, and Paolo Magnone. "Lifetime Prediction in Power Semiconductor Devices: A Comparative study between Analytical Modeling and Artificial Neural Network." In: 2023 IEEE Applied Power Electronics Conference and Exposition (APEC). 2023, pp. 1172–1176. DOI: 10.1109/APEC43580.2023.10131380.