To some skeptics, the value of failure analysis is not readily obvious. Spending time and resources on devices that, by definition, are nonfunctional and will not be released to a customer may seem wasteful. The true value of failure analysis, however, lies in its ability to identify characteristics that may lead to further failures, which can cost a company untold amounts in production fallout and damage its reputation with customers. Presented here are two case studies of devices that both exhibited multiple anomalous electrical opens. Even though the devices behaved similarly, the root causes of failure were vastly different. Determining the root cause for each device required in-depth IC failure analysis, which identified a major process weakness in each case.
Device A is a high-speed switching device, designed for use in communications equipment that requires an exceptionally high data rate. The device uses flip-chip interconnect technology to bond the die to its substrate, which is made of ceramic to withstand thermal stress. Wile E. Coyote Microelectronics (hereafter referred to as WEC Micro), the manufacturer of the device, received several customer complaints stating that the device malfunctioned immediately after being installed in the customers' products. After determining that electrical opens were the cause of the malfunctions, WEC Micro sent a sample of the failing devices to an external lab for failure analysis. Following good practice, the external lab ran several non-destructive tests on the failing units, including x-ray imaging and acoustic microscopy. The x-ray images revealed several misshapen and potentially missing flip-chip solder joints between the die and the ceramic substrate. Acoustic microscopy showed greater than 75% delamination at the first subsurface interface, between the die and the underfill material that fills the space between the die and the ceramic substrate. Armed with the knowledge gained from non-destructive testing, the analyst chose to cross-section the device in situ, without removing the die from its substrate. The cross-section revealed the problem: the die bump solder appeared to have reflowed after processing and had wicked into the delaminated area between the die surface and the underfill material, leaving too little solder to make good contact between the die and the substrate. Based on the results of the analysis, WEC Micro concluded that its process would need to be improved to prevent this type of delamination; otherwise, the company would face many more of these failures.
Device B is a microcontroller, designed by Real Genius Integrated Circuits (RGIC) to control a high-power laser system. The device is packaged as a plastic-encapsulated ball grid array (pBGA) using traditional gold wire bond interconnect technology. RGIC noted anomalous behavior in first-run production testing of the device and requested failure analysis to determine the root cause of the problem. Initial inspection and non-destructive testing showed no obvious defects; however, electrical testing revealed opens on several signal pins. Since the non-destructive testing had not indicated any problem with the packaging of the device, the analyst determined that the failure was most likely somewhere on the die. The device was decapsulated to remove the plastic covering the die surface. The analyst then performed passive voltage contrast imaging using an electron microscope and found several metal traces that were “glowing”, indicating possible sites of an open circuit. Cross-sectioning these sites revealed improperly processed vias between metal layers, which were causing the open circuits. With the failure mechanism identified, RGIC was able to alter its process and eliminate the anomalous behavior in future production runs.
In both of these case studies, the failure mode (electrical opens) was identical. The actual causes of the failures, however, were completely different: one device failed because of a packaging defect, while the other suffered from improper die processing. The value of IC failure analysis becomes more apparent when viewed in this light. Without proper analysis, the root cause of failure would not have been found for either device, and the manufacturers would not have been able to improve their processes and deliver functional, reliable products to market.