Why AI Struggles to Crack the Code of Nondestructive Testing

Disclaimer: this is an AI-generated article intended to highlight interesting concepts / methods / tools used within the SmartDATA Lab’s research. This is for educating lab members as well as general readers interested in the lab. The article may contain errors.
When algorithms meet ultrasonic echoes, missing data and messy signals stand in the way of a smarter inspection future.
There’s a common trope in science fiction: an all-seeing machine, unblinking and exact, scanning the world for flaws we can’t detect. In the real world of nondestructive testing (NDT)—the science of using sensors to find cracks, corrosion, and hidden damage in everything from aircraft wings to oil pipelines—that dream still feels frustratingly out of reach.
We have artificial intelligence that can beat world champions at Go, generate Hollywood-quality dialogue, and diagnose rare diseases from pixelated scans. And yet, when we point these same tools at ultrasonic signals or thermographic images from NDT inspections, the results are… inconsistent at best.
Why? Because the core challenge of AI in NDT isn’t just technical—it’s fundamentally structural. The very nature of NDT data throws a wrench into most off-the-shelf AI approaches. It’s a problem of scarcity, diversity, and mismatch—three interlocking issues that keep inspection-focused machine learning stuck in second gear.
Scarcity: The Paradox of Critical, Rare Data
Modern AI thrives on abundance. It learns by example—hundreds of thousands of examples, ideally. But in NDT, data is precious, and defects are rare by design. Engineers don’t build bridges to fail, and manufacturers don’t ship cracked turbine blades for training purposes.
That makes compiling robust datasets a headache. A typical dataset might contain hundreds of “normal” signals, and just a handful of signals from parts with actual defects. Worse still, some critical defects (like delaminations or embedded voids) only appear under specific load conditions, environments, or materials. Training a deep neural network on this kind of data imbalance is like teaching a dog breed classifier with 10,000 photos of Labradors and five blurry images of Dalmatians—it’s going to guess “Labrador” every time.
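The imbalance problem is easy to demonstrate. The sketch below (with made-up counts, not real inspection data) shows how a classifier that simply always predicts "normal" scores near-perfect accuracy on an imbalanced dataset while catching zero defects:

```python
import numpy as np

# Hypothetical imbalanced inspection dataset: 1000 "normal" labels and
# 5 "defect" labels (counts are illustrative, not from a real survey).
y_true = np.array([0] * 1000 + [1] * 5)

# A lazy classifier that always predicts "normal", the majority class.
y_pred = np.zeros_like(y_true)

accuracy = (y_pred == y_true).mean()
defect_recall = (y_pred[y_true == 1] == 1).mean()

print(f"accuracy:      {accuracy:.3f}")   # ~0.995, looks excellent
print(f"defect recall: {defect_recall:.3f}")  # 0.000, misses every defect
```

This is why raw accuracy is a misleading metric in NDT; recall on the defect class (or precision-recall curves) tells the real story.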
One emerging solution is synthetic data: generating training examples using simulations based on physical models, like finite element methods (FEM). This approach has promise, especially for ultrasonic inspections. But it raises its own set of questions: how realistic is the simulated noise? Are the defect geometries faithful? Can a model trained on synthetic data generalize to messy, real-world environments?
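As a toy illustration of the synthetic-data idea, the sketch below fabricates ultrasonic A-scans as Gaussian-windowed tone bursts plus noise. This is a stand-in for FEM output, not a physical simulation; all parameters (sampling rate, center frequency, echo times, amplitudes) are invented for illustration:

```python
import numpy as np

def synthetic_ascan(defect, fs=50e6, f0=5e6, n=1024, noise_std=0.05, seed=0):
    """Toy synthetic A-scan: a backwall echo plus an optional defect echo,
    each modeled as a Gaussian-windowed tone burst, with additive noise."""
    rng = np.random.default_rng(seed)
    t = np.arange(n) / fs

    def burst(center, amp):
        envelope = np.exp(-((t - center) / 0.3e-6) ** 2)
        return amp * envelope * np.sin(2 * np.pi * f0 * t)

    signal = burst(15e-6, 1.0)       # backwall reflection
    if defect:
        signal += burst(8e-6, 0.4)   # earlier, weaker defect echo
    return signal + rng.normal(0, noise_std, n)

clean = synthetic_ascan(defect=False)
flawed = synthetic_ascan(defect=True)
```

The open questions in the paragraph above live precisely in functions like `burst` and `noise_std`: if the simulated envelope shape or noise statistics don't match the real transducer and material, a model trained on such data may not transfer.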
Diversity: NDT Data Is a Patchwork Quilt
Nondestructive testing isn’t one field—it’s a constellation of overlapping technologies, each with its own data types and quirks. There’s ultrasonic testing (A-scans, B-scans, C-scans), radiography, thermography, eddy current testing, and acoustic emission, to name a few. Each produces data with different structures: waveforms, images, time-frequency representations, and more.
Even within a single method like ultrasonics, signal characteristics can vary wildly depending on material properties, transducer frequency, geometry, surface roughness, and coupling conditions. That means a model trained to detect delaminations in carbon fiber wind turbine blades might completely fail on stainless steel pipes.
In some domains, transfer learning has emerged as a fix: repurposing a model trained on one dataset for another with modest fine-tuning. But transfer learning assumes a shared underlying structure. When your data spans multiple physical domains, transfer becomes less “fine-tuning” and more “translation.”
Mismatch: AI Assumptions vs. Physical Reality
Most AI models assume their data is independently and identically distributed (i.i.d.), meaning each sample is drawn from the same distribution. But NDT data often breaks this assumption. Imagine inspecting two aircraft panels: one aluminum, one composite. Even if both are defect-free, their ultrasonic responses will look different. AI might mistakenly interpret those material differences as signs of damage.
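The failure mode is easy to reproduce. In the sketch below, defect-free baselines from two materials are modeled as Gaussian noise with different variances (an assumption for illustration, not measured data); an anomaly threshold calibrated on one material flags essentially everything from the other:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical defect-free baseline responses from two materials.
# The composite is modeled as noisier purely for illustration.
aluminum = rng.normal(0.0, 0.05, size=(200, 256))
composite = rng.normal(0.0, 0.15, size=(200, 256))

def rms(signals):
    """Per-signal root-mean-square energy."""
    return np.sqrt((signals ** 2).mean(axis=1))

# Anomaly detector calibrated on aluminum only: flag any signal whose
# RMS energy exceeds the 99th percentile of the aluminum baseline.
threshold = np.percentile(rms(aluminum), 99)

false_alarm_rate = (rms(composite) > threshold).mean()
print(f"false alarms on defect-free composite: {false_alarm_rate:.0%}")
```

Nothing here is a defect; the detector is simply reading a shift in the baseline distribution as damage, which is exactly the i.i.d. violation described above.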
Compounding the problem is label uncertainty. Many NDT datasets rely on manual annotation, where inspectors label whether a signal contains a flaw. But ground truth is hard to come by. Did the signal come from a crack, or just a rough weld? Was that reflection from a real defect or a calibration artifact? Even seasoned inspectors sometimes disagree.
This mismatch between AI expectations and physical complexity is where many promising models stumble. A classifier might reach 99% accuracy on a clean lab dataset—then drop to 65% when tested in the field, where things are noisy, irregular, and unlabeled.
The Mathematics Behind the Curtain
The challenges aren’t just data-driven—they also emerge from the mathematical tools AI uses to make sense of signals.
Take dimensionality reduction, a staple of many AI pipelines. Algorithms like principal component analysis (PCA) or t-SNE map high-dimensional sensor data into lower-dimensional spaces, where patterns (and outliers) become more visible. But these mappings depend heavily on the structure of the data. Inconsistent sampling, noise, or missing features can distort the low-dimensional embedding and lead to false positives or negatives.
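A minimal PCA sketch, using synthetic "signals" that secretly live near a 2-D subspace, shows what the method does when the structure is clean:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy high-dimensional data: 100 samples near a 2-D subspace of R^50.
basis = rng.normal(size=(2, 50))
coeffs = rng.normal(size=(100, 2))
X = coeffs @ basis + 0.01 * rng.normal(size=(100, 50))

# PCA via SVD of the mean-centered data matrix.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = S ** 2 / (S ** 2).sum()

print(f"variance captured by first 2 components: {explained[:2].sum():.1%}")
embedding = Xc @ Vt[:2].T  # 2-D coordinates for plotting / outlier hunting
```

With real inspection data the picture is rarely this tidy: inconsistent sampling or missing channels perturb `Xc`, and the leading components can end up encoding acquisition artifacts rather than defects.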
Similarly, when convolutional neural networks (CNNs) process ultrasonic waveforms or scan images, they rely on local kernels—essentially sliding windows of weights—to extract features. But unlike natural images (which have translation-invariant patterns like edges or textures), NDT signals often encode position-specific physics. The same reflection pattern might mean “flaw” in one part of the signal, and “normal boundary” in another. Without integrating physical constraints, CNNs are prone to misinterpretation.
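The translation-invariance point can be made with a single 1-D convolution. Below, the same echo pattern is planted at two positions in a synthetic trace; the filter response at both locations is identical, so position-specific meaning must come from somewhere else (extra positional features or physics-based constraints):

```python
import numpy as np

# A matched-filter-style kernel and a signal containing the same pattern
# at two positions: one where it might mean "backwall", one "flaw".
kernel = np.array([0.2, 0.5, 1.0, 0.5, 0.2])

signal = np.zeros(100)
signal[20:25] = kernel   # pattern centered at index 22
signal[70:75] = kernel   # identical pattern centered at index 72

# Correlate (convolve with the flipped kernel) to get filter responses.
response = np.convolve(signal, kernel[::-1], mode="same")

# Peak responses at the two pattern centers are exactly equal:
print(response[22], response[72])
```

A plain CNN built from such kernels sees these two events as the same feature; distinguishing them requires injecting position or physics into the model.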
Toward Smarter Inspection: What’s Next?
There’s growing interest in physics-informed machine learning—a hybrid approach that embeds domain knowledge directly into AI models. Instead of letting the model learn everything from scratch, researchers use known wave equations, boundary conditions, or material properties to guide the learning process. It’s like giving your algorithm a compass before setting it loose in a data jungle.
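One common way to embed physics is to add a penalty on the residual of a governing equation to the training loss. The sketch below does this for the 1-D wave equation u_tt = c² u_xx, with derivatives taken by finite differences on a grid (a real physics-informed network would typically use automatic differentiation; the wave speed and grid here are arbitrary choices for illustration):

```python
import numpy as np

c = 1.0            # assumed wave speed
dx = dt = 0.01
x = np.arange(0, 1, dx)
t = np.arange(0, 1, dt)
X, T = np.meshgrid(x, t, indexing="ij")

def physics_residual(u):
    """Finite-difference residual of u_tt - c^2 u_xx on the interior grid."""
    u_tt = (u[:, 2:] - 2 * u[:, 1:-1] + u[:, :-2]) / dt ** 2
    u_xx = (u[2:, :] - 2 * u[1:-1, :] + u[:-2, :]) / dx ** 2
    return u_tt[1:-1, :] - c ** 2 * u_xx[:, 1:-1]

def total_loss(u_pred, u_obs, weight=1e-6):
    """Data misfit plus a weighted physics penalty."""
    data_loss = np.mean((u_pred - u_obs) ** 2)
    phys_loss = np.mean(physics_residual(u_pred) ** 2)
    return data_loss + weight * phys_loss

# An exact traveling wave satisfies the physics term almost perfectly...
u_exact = np.sin(2 * np.pi * (X - c * T))
# ...while an arbitrary field of the same shape is heavily penalized.
u_wrong = np.sin(2 * np.pi * X) * np.exp(T)
```

The physics term acts as the "compass": among candidate wavefields that fit sparse data equally well, it steers learning toward ones consistent with wave propagation.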
Another promising area is uncertainty quantification. Rather than producing a hard yes/no defect classification, AI systems are being designed to report confidence intervals, allowing inspectors to focus on ambiguous regions and override the model if needed. This human-in-the-loop approach blends AI’s speed with expert judgment.
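A lightweight way to get such confidence scores is an ensemble: train several models on resampled data and report the fraction that vote "defect". The sketch below does this with deliberately simple threshold detectors on made-up amplitude data (all distributions and thresholds are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical echo amplitudes: "normal" around 0.3, "defect" around 0.7.
normal = rng.normal(0.3, 0.08, 500)
defect = rng.normal(0.7, 0.08, 40)

def fit_threshold(normal_s, defect_s):
    """One ensemble member: midpoint between the two class means."""
    return (normal_s.mean() + defect_s.mean()) / 2

# Each member learns its threshold from a bootstrap resample.
thresholds = np.array([
    fit_threshold(rng.choice(normal, normal.size),
                  rng.choice(defect, defect.size))
    for _ in range(50)
])

def defect_probability(amplitude):
    """Fraction of ensemble members voting 'defect': a crude confidence."""
    return float((amplitude > thresholds).mean())

print(defect_probability(0.25))  # far below every threshold -> near 0
print(defect_probability(0.50))  # near the learned boundary; members may disagree
print(defect_probability(0.85))  # far above every threshold -> near 1
```

Signals scoring near 0 or 1 can be auto-cleared or auto-flagged; the ambiguous middle is exactly where the human inspector's attention is worth spending.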
Finally, there’s a push for standardized datasets—something NDT has historically lacked. The creation of open-access, labeled repositories for different materials, geometries, and defect types could help build more generalizable models and benchmark progress across research groups.
The Bottom Line
Applying AI to nondestructive testing isn’t just about upgrading tools—it’s about reshaping how we think about data, modeling, and uncertainty in high-stakes environments. While the dream of fully autonomous inspections is still some distance away, we’re making progress—not by forcing AI to do everything, but by respecting the complexity of the problem and designing systems that balance pattern recognition with physical understanding.
Sometimes, the smartest machine is the one that knows when to ask for a second opinion.
References on AI in Nondestructive Testing
- Uhlig, S., Alkhasli, I., Schubert, F., Tschöpe, C., & Wolff, M. (2023).
A review of synthetic and augmented training data for machine learning in ultrasonic non-destructive evaluation.
Ultrasonics, 134, 107041.
https://doi.org/10.1016/j.ultras.2023.107041
This paper provides a comprehensive review of methods for generating synthetic and augmented ultrasonic testing (UT) data, addressing challenges in data scarcity for machine learning applications in NDT.
- Taheri, H., Gonzalez Bocanegra, M., & Taheri, M. (2022).
Artificial Intelligence, Machine Learning and Smart Technologies for Nondestructive Evaluation.
Sensors, 22(11), 4055.
https://doi.org/10.3390/s22114055
This survey discusses the state-of-the-art AI and machine learning techniques in NDT, including the integration of smart technologies like digital twins and machine vision.
- Gardner, P., Fuentes, R., Dervilis, N., Mineo, C., Pierce, S., Cross, E., & Worden, K. (2020).
Machine learning at the interface of structural health monitoring and non-destructive evaluation.
Philosophical Transactions of the Royal Society A, 378(2184), 20190581.
https://doi.org/10.1098/rsta.2019.0581
This article explores the intersection of machine learning with structural health monitoring and NDT, highlighting the challenges and opportunities in data analysis and interpretation.
- Meyendorf, N., Bond, L., & Curtis-Beard, J. (2017).
NDE 4.0—NDE for the 21st Century.
APCNDT Proceedings.
https://www.researchgate.net/publication/316899060_NDE_for_the_21st_century_industry_40_requires_NDE_40_Conference_Presentation
This presentation introduces the concept of NDE 4.0, emphasizing the need for digital transformation in nondestructive evaluation practices to meet the demands of Industry 4.0.
- Wunderlich, C., Tschöpe, C., & Duckhorn, F. (2018).
Advanced methods in NDE using machine learning approaches.
AIP Conference Proceedings, 1949(1), 020022.
https://doi.org/10.1063/1.5031519
This paper discusses the application of machine learning techniques in NDT, focusing on pattern recognition in acoustic signals and automated processing of imaging data.