PCB Failure Analysis - Dealing with Burnt and Mangled Boards
Semiconductor and Electronic Failure Analysis Blog
Welcome to the Semiconductor and Electronics Failure Analysis Blog, a discussion forum for all things related to electrical, integrated circuit (IC) board and electronics failure analysis. Please subscribe to our feed and feel free to leave a comment or question. Thanks for visiting.
In order to successfully characterize, isolate, and eventually uncover a defect on a semiconductor device, it is necessary to begin with a basic understanding of the problem at hand. A basic description of the failure – for example, “output pin stuck high” or “device draws excessive power” – can go a long way towards helping an analyst formulate a plan for tackling a defective part. Once this basic semiconductor failure mode has been identified, the proper tools and procedures can be chosen to locate even the most minuscule of defects.
Considering the relative ubiquity of printed circuit boards in modern electronics, a typical failure analysis engineer will undoubtedly see countless numbers of printed circuit board failures over the course of his or her career. At first blush, many of these jobs may seem to have very little in common – a twisted, charred circuit board from the onboard computer of a river ferry and a defective video game console could hardly be more dissimilar. While it is true that no two failure analysis jobs are alike, and that all defects have subtle nuances that make them unique, PCB failures can generally be broken down into two categories: those occurring during the manufacturing process, and those that occur after the unit has been delivered to the end user.
As discussed previously, current leakage is one of the most prevalent failures in modern semiconductors and electronic devices. One of the most common techniques for locating current leakage is liquid crystal analysis, which is a quick and effective way of isolating failure sites; however, liquid crystal has some limitations that prevent it from being useful in all cases. Liquid crystal works by using the heat generated by a leakage site to raise the temperature of the crystal to a “transition point”, where an analyst can optically observe a change in the properties of the crystal and thereby identify the leakage site. A more subtle failure may never be able to heat the liquid crystal to its transition point, since smaller defects dissipate less power and therefore generate less heat. At the opposite end of the spectrum, high amounts of leakage can heat the entire device so quickly that the whole surface passes through the transition at once, making it impossible to pinpoint the leakage site. To combat these shortcomings, fluorescent microthermal imaging can be used to supplement the standard liquid crystal technique.
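The two limits described above can be illustrated with a toy thermal model. This is only a sketch under stated assumptions: the function name, the two thermal resistance parameters, and every number in the usage example are hypothetical and not taken from any real device or liquid crystal datasheet. Power is computed as voltage times leakage current, and temperature rise as power times thermal resistance.

```python
def classify_leakage(voltage_v, current_a, ambient_c, transition_c,
                     theta_die_c_per_w, theta_spot_c_per_w):
    """Classify a leakage site for liquid crystal analysis (toy model).

    theta_die_c_per_w:  assumed thermal resistance from the whole die to ambient
    theta_spot_c_per_w: assumed extra thermal resistance local to the defect
    """
    power_w = voltage_v * current_a  # heat dissipated at the leakage site
    die_temp_c = ambient_c + power_w * theta_die_c_per_w
    spot_temp_c = die_temp_c + power_w * theta_spot_c_per_w

    if die_temp_c >= transition_c:
        # The entire surface crosses the transition, hiding the defect.
        return "whole_die"
    if spot_temp_c >= transition_c:
        # Only the area around the defect crosses the transition.
        return "hot_spot"
    # Not enough heat to reach the transition anywhere.
    return "below_transition"


if __name__ == "__main__":
    # Hypothetical values: 5 V bias, 25 C ambient, 29 C transition point.
    for leakage_a in (0.001, 0.002, 0.1):
        result = classify_leakage(5.0, leakage_a, 25.0, 29.0, 50.0, 500.0)
        print(f"{leakage_a * 1000:.0f} mA leakage: {result}")
```

In this model only the middle case yields a usable liquid crystal result, which is the gap that a more sensitive thermal technique is meant to fill.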
One of the single most common failures that plague modern electronic devices is current leakage. This leakage can manifest in many ways; some devices may exhibit normal functionality with excessive power consumption, while others may stop working altogether. This is partly due to the multitude of different causes for current leakage – improper processing, packaging, or handling (in the form of electrostatic discharge damage) can result in defects that will draw excessive current, as can electrical overstress of a device in the field. Since current leakage is such a common failure mode, a good failure analyst will have many different tools to assist in the detection and isolation of defects that may cause an anomalous current draw.
Most modern electronic devices are packaged as proverbial “black boxes”; it is nearly impossible to tell what is happening inside a device by looking at the outside packaging. What’s more, many devices are designed to be virtually impossible to open without causing irreversible changes to the product. These types of devices pose a unique problem for failure analysis – without being able to see the functional pieces of a device, it is nearly impossible to find a failing component or signal. While a plethora of destructive techniques is available, giving the analyst access to the “guts” of a device, these techniques often carry with them a certain level of risk; destructively opening an integrated circuit or other assembly can, in very rare cases, induce damage. To help prove beyond reasonable doubt that any damage an analyst finds was pre-existing and not created during the course of the analysis, a non-destructive way of looking inside the black box is necessary. X-ray imaging lends itself perfectly to this application, penetrating the shroud surrounding most devices with ease.
The world of electronics grows and evolves at a breakneck pace. On a seemingly daily basis, new electronic gadgets hit the market – the tech-savvy consumer is inundated with choices for faster home computers, powerful smartphones, and more visually stunning TVs; these examples only scratch the surface of the ever-changing landscape of electronic devices.
This process of continual growth and discovery seems clearly beneficial for all; there is, however, an unspoken corollary to the unfettered progress made in electronics: the specter of obsolescence looms large, relegating the old, broken, and tragically untrendy devices to the wastebasket. As electronics have become cheaper and more commonplace, the amount of electronics waste in landfills around the world grows at a seemingly exponential rate.
To limit the ecological impact of the growing problem of “e-waste”, the European Union created the Restriction of Hazardous Substances (RoHS) directive, establishing limits on the amounts of the most ecologically dangerous materials commonly used in electronics. Manufacturers who choose to “go green” take this directive to heart; however, given the complex supply chain involved in most modern manufacturing, it is sometimes difficult to ensure that all components of a device meet RoHS requirements. In these situations, RoHS auditing can provide these manufacturers with some much-needed peace of mind.
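A screening step in such an audit can be sketched as a comparison of measured concentrations against the directive's limits. The limits below are the maximum concentrations by weight in homogeneous material for the six originally restricted substances (0.1%, or 1000 ppm, for all except cadmium at 0.01%, or 100 ppm); later revisions added further substances, so the current directive text should be treated as the authority. The function name and the sample measurements are hypothetical.

```python
# Maximum concentration by weight in homogeneous material, in ppm,
# for the six originally restricted RoHS substances.
ROHS_LIMITS_PPM = {
    "lead": 1000,
    "mercury": 1000,
    "cadmium": 100,
    "hexavalent_chromium": 1000,
    "pbb": 1000,   # polybrominated biphenyls
    "pbde": 1000,  # polybrominated diphenyl ethers
}


def rohs_violations(measured_ppm):
    """Return the sorted names of substances measured above their limit.

    measured_ppm maps a substance name to its measured concentration in
    ppm. Substances without a limit in ROHS_LIMITS_PPM are ignored.
    """
    return sorted(
        name
        for name, ppm in measured_ppm.items()
        if name in ROHS_LIMITS_PPM and ppm > ROHS_LIMITS_PPM[name]
    )


if __name__ == "__main__":
    # Hypothetical measurements for a single solder sample.
    sample = {"lead": 1500, "cadmium": 80, "mercury": 10}
    print(rohs_violations(sample))
```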
At first glance, the modern integrated circuit may appear to be nothing more than a jumbled mess. Billions of transistors are connected to one another by a vast, labyrinthine network of metal traces, vias, wire bonds, and solder connections; a single electrical pulse may weave its way through countless other signals, moving through a spiraling spider’s web of conductors, before reaching its final destination at an output pin. For an analyst tasked with inspection or failure analysis of such a device, this convoluted system may resemble the proverbial Gordian knot. There is hope, however: just as Alexander was able to cut through the jumble of the fabled knot with his sword, an analyst skilled in deprocessing can slice through the tangles of circuitry, driving to the heart of the device under test.
While many different tools and techniques are used in performing failure analysis and assembling a report, the crux of the analysis is a clear, sharp photograph of the defect that lies at the root of the failure. Indeed, not only in failure analysis but in any of the sciences, it can be said that "seeing is believing": a detailed picture can remove any shadow of a doubt as to the nature of an object. In the case of failure analysis, a good image can help to identify the type of corrective action that must be implemented to resolve a recurring problem. For larger defects, an image taken with an optical microscope is often sufficient; however, given the extremely small geometries used in modern semiconductors, a defect that may be catastrophically huge in terms of circuit performance may still be so small that it is effectively invisible to a traditional microscope. Some defects are smaller than the resolution limit set by the wavelength of visible light, so it is physically impossible to image them accurately with any sort of visible light optics. In these cases, electron microscopy is more than capable of peeling away the cloak of invisibility enshrouding a defect, providing crisp, detailed images at magnifications far beyond the limits of a traditional microscope.
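The physical limit mentioned above can be made concrete with a short calculation. The sketch below uses the Abbe diffraction limit, d = wavelength / (2 × numerical aperture), for an optical microscope, and the relativistic de Broglie wavelength for an electron accelerated through a given voltage. The function names and example values (green light, a 0.95 NA objective, a 10 kV beam) are illustrative assumptions, and the practical resolution of an electron microscope is set by lens aberrations and beam spot size rather than by wavelength alone.

```python
import math

PLANCK_H = 6.62607015e-34            # Planck constant, J*s
ELECTRON_MASS = 9.1093837015e-31     # electron rest mass, kg
ELEMENTARY_CHARGE = 1.602176634e-19  # elementary charge, C
SPEED_OF_LIGHT = 299792458.0         # speed of light, m/s


def abbe_limit_nm(wavelength_nm, numerical_aperture):
    """Smallest resolvable feature spacing for an optical microscope."""
    return wavelength_nm / (2.0 * numerical_aperture)


def electron_wavelength_pm(accel_voltage_v):
    """Relativistic de Broglie wavelength of an accelerated electron."""
    energy_j = ELEMENTARY_CHARGE * accel_voltage_v
    rest_energy_j = ELECTRON_MASS * SPEED_OF_LIGHT ** 2
    momentum = math.sqrt(
        2.0 * ELECTRON_MASS * energy_j * (1.0 + energy_j / (2.0 * rest_energy_j))
    )
    return PLANCK_H / momentum * 1e12  # metres to picometres


if __name__ == "__main__":
    print(f"Optical limit: {abbe_limit_nm(550.0, 0.95):.0f} nm")
    print(f"Electron wavelength at 10 kV: {electron_wavelength_pm(10_000):.1f} pm")
```

The optical limit comes out at a few hundred nanometers, while the electron wavelength is on the order of picometers, which is why features far below the optical limit remain accessible to electron microscopy.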
As discussed in previous blogs, acoustic microscopy is a valuable part of the failure analysis process. The ability to use ultrasonic waves to construct an image of a device and study its construction without damaging or destroying the part is useful for guiding an analyst in choosing the proper approach for finding a failure. There are many different types of defects that can be detected with acoustic microscopy, each providing several avenues for further analysis.
The ability to isolate a defect in a sea of circuitry, pinpointing a problem hiding amongst a plethora of transistors and metal lines, is one of the cornerstones of successful failure analysis. An analyst would be hard-pressed to study an anomaly in depth without first knowing where the anomaly is. The resourceful analyst has many tools and techniques to aid in the detection of defects on an integrated circuit; some, like liquid crystal or thermal imaging, are best used to find short circuits that generate large quantities of heat, while others, like time domain reflectometry, are best suited to finding open circuits. Unfortunately, these techniques are often not sufficient, and an analyst must find a way to characterize a device, creating a baseline against which to contrast a failing unit in order to detect the defect at the root of an electronic component failure. In these cases, emission microscopy provides the perfect platform upon which to build an analysis.
To some skeptics, the value of failure analysis is not readily obvious. Spending time and resources on devices that, by definition, are nonfunctional and will not be released to a customer may seem wasteful. The true value in failure analysis, however, lies in its ability to identify characteristics that may lead to further failures, costing a company untold amounts in production fall-out and damaging their reputation with their customers. Presented here are two case studies of devices, both exhibiting multiple anomalous electrical opens. Even though both devices behaved similarly, the root cause of the failure between the two units was vastly different; determining the root cause of failure for both devices required in-depth IC failure analysis, resulting in major process weaknesses being identified in both cases.
Since the demonstration of the first integrated circuit in the late 1950s, semiconductor technology has developed explosively, growing at an exponential rate. The guidance computers that were used in the Apollo space program, performing the critical calculations necessary to land a manned spacecraft on the moon, have been completely dwarfed in complexity, memory capacity, and processing power by modern video game consoles and handheld MP3 players. Where an early microchip might contain several hundred devices, today's IC is home to billions of transistors. Even though semiconductor technology has come so far from its inception, it is not yet infallible, and failures do occur as a result of improper processing, misuse, or simply due to the inexorable march of time. Finding a defect on such a complex device may bring to mind clichéd sayings about needles and haystacks; however, the process of semiconductor failure analysis brings together a comprehensive toolset, a breadth of industry experience, and a certain degree of intuition, all in order to find that one in a billion defect.
Printed circuit board (PCB) technology serves as one of the fundamental building blocks of modern electronics. One would be hard-pressed to find an electronic device of even moderate complexity made in the past ten to twenty years that does not include at least one PCB in its construction. The ubiquity of PCBs in electronics means, of course, that a failure analyst is likely to see several malfunctioning boards in his or her professional lifetime. The size and complexity of a modern circuit board would seem to make successfully finding a defect an impossibility; however, with experience and the right mindset, PCB failure analysis can be a successful endeavor.
Electronics failure analysis is, at times, a daunting task. An analyst must constantly question his or her assumptions about a given device, circuit, or process, discarding false premises and peeling away the myriad layers of a problem until the root cause of the failure can be determined. Sometimes, one of the steps in this grand inquisition is to question the fundamental composition and purity of a material. Could ionic contamination be causing a short circuit? Was residual material, left behind on an improperly cleaned printed circuit board, the underlying cause for a solder joint failure? Fortunately, the analyst has tools to analyze materials, even down to their elemental makeup. Auger spectroscopy failure analysis is one of several such tools that an analyst might choose in such a case.
In this article, we take a look at common semiconductor defects or faults that can occur inside a package. Each type of defect has multiple detection techniques, and the electronic failure analysis method chosen depends on the sensitivity required, the type of chip, and whether or not the process is destructive.
Scanning Acoustic Microscopy (SAM) is a fast, non-destructive investigative technique frequently used in electronic failure analysis.
SAM uses ultrasound waves to image interfaces and detect possible defects within optically opaque structures and components such as chip capacitors, chip resistors, circuit board traces, discrete semiconductor devices, integrated circuits (ICs), and other electronic components.
SAM is frequently used in failure analysis to evaluate die attach integrity, heat spreader adhesion, and solder quality.
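Why ultrasound is so sensitive to interface defects can be sketched with the standard normal-incidence reflection coefficient, R = (Z2 - Z1) / (Z2 + Z1), where Z1 and Z2 are the acoustic impedances of the materials on either side of an interface. The impedance values below are rough, illustrative figures assumed for this example (in MRayl), not measured data.

```python
def reflection_coefficient(z_incident, z_transmitted):
    """Amplitude reflection coefficient at normal incidence.

    z_incident:    acoustic impedance of the material the wave travels in
    z_transmitted: acoustic impedance of the material beyond the interface
    """
    return (z_transmitted - z_incident) / (z_transmitted + z_incident)


if __name__ == "__main__":
    # Rough, assumed impedances in MRayl for illustration only.
    mold_compound = 6.0
    silicon = 20.0
    air = 0.0004

    bonded = reflection_coefficient(mold_compound, silicon)
    delaminated = reflection_coefficient(mold_compound, air)
    print(f"Mold compound to silicon: R = {bonded:+.3f}")
    print(f"Mold compound to air gap: R = {delaminated:+.3f}")
```

A well-bonded interface reflects only part of the signal, while an air gap reflects nearly all of it with inverted polarity, which is what makes delamination and voids stand out in an acoustic image.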
Integrated circuits can be fragile and will fail if not packaged correctly. Even those circuits which are designed to withstand shock are required to operate within very specific parameters. In this article, we take a look at how the failure analysis procedure may need to focus on whether the packaging of the electronic components is compromised and, if so, to what extent. Various methods exist to detect the loss of package integrity; fine and gross leak testing is one example that we use to evaluate package integrity.
When a complicated system fails, it is not always easy to understand what caused the failure. For example, if people are dissatisfied after an employee evaluation, the cause may not be immediately apparent. There can be many factors that have to occur simultaneously, or several individual factors operating independently, or both. If your car fails to start, there isn't always just one cause or even a fixed set of causes. Imagine how much more complex it must be to analyze the failure of an integrated circuit with its millions of components all networked together. To assist with the evaluation of high-level faults, engineers have developed Fault Tree Analysis (FTA).
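The distinction between factors that must occur together and factors that act independently maps onto the AND and OR gates of a fault tree. The following is a minimal sketch, assuming all basic events are statistically independent; the event names and probabilities for the car example are invented for illustration.

```python
import math


def and_gate(probabilities):
    """Output event occurs only if ALL independent input events occur."""
    return math.prod(probabilities)


def or_gate(probabilities):
    """Output event occurs if ANY independent input event occurs."""
    return 1.0 - math.prod(1.0 - p for p in probabilities)


if __name__ == "__main__":
    # Hypothetical basic-event probabilities for "car fails to start".
    p_dead_battery = 0.02
    p_no_fuel = 0.01
    # Assume spark is lost only if both a primary and a backup part fail.
    p_no_spark = and_gate([0.1, 0.2])

    p_no_start = or_gate([p_dead_battery, p_no_fuel, p_no_spark])
    print(f"P(no spark) = {p_no_spark:.4f}")
    print(f"P(car fails to start) = {p_no_start:.4f}")
```

Real fault trees often contain shared or dependent events, in which case these simple gate formulas no longer apply directly.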
Much of failure analysis work is done before any actual testing is done. It might seem that taking a chip and putting it in a scanning electron microscope or thermal microscope to find points of failure is the most direct way to detect where a problem is occurring, but this is reactive and the most costly approach. It also damages a company's reputation with its customers, which can be very costly.
Failure analysis is a procedure that should start from the ground up at the design stage itself. The initial investment of designing something to work around the failures of past systems can be repaid many times over later on with reduced failure costs.
In this article, we look at the Failure Mode and Effects Analysis (FMEA) procedure, a technique for preventing failures from occurring in a chip in the first place.
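One common way to prioritize the output of an FMEA, though not the only one, is the Risk Priority Number (RPN): each failure mode is scored from 1 to 10 for severity, occurrence, and difficulty of detection, and the three scores are multiplied. The sketch below assumes that convention; the class name, failure modes, and scores are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    severity: int    # 1 (negligible) to 10 (catastrophic)
    occurrence: int  # 1 (rare) to 10 (almost certain)
    detection: int   # 1 (always caught) to 10 (essentially undetectable)

    def __post_init__(self):
        # Reject scores outside the assumed 1-10 scale.
        for score in (self.severity, self.occurrence, self.detection):
            if not 1 <= score <= 10:
                raise ValueError("scores must be between 1 and 10")

    @property
    def rpn(self):
        """Risk Priority Number: severity x occurrence x detection."""
        return self.severity * self.occurrence * self.detection


def rank_by_rpn(modes):
    """Return failure modes ordered from highest to lowest RPN."""
    return sorted(modes, key=lambda mode: mode.rpn, reverse=True)


if __name__ == "__main__":
    # Hypothetical failure modes for a packaged IC.
    modes = [
        FailureMode("wire bond lift", severity=8, occurrence=3, detection=4),
        FailureMode("die attach void", severity=5, occurrence=4, detection=6),
        FailureMode("ESD damage at input", severity=7, occurrence=2, detection=5),
    ]
    for mode in rank_by_rpn(modes):
        print(f"{mode.name}: RPN {mode.rpn}")
```

The ranking suggests where design or process changes would pay off first, which is the preventive intent described above.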
Detecting and isolating a failure in an integrated circuit is no easy matter. There are many techniques used in failure analysis, and choosing the right one is an art as well as a science. Sometimes we may need to use several techniques, both for better detection and for independent corroboration, so that we can be sure of the results of a particular test. But all tests fall into one of two categories: destructive testing and non-destructive testing. In this article, we look at why non-destructive testing is so important and which methods fall into this category.