An open labelling process for annotating metal L-PBF data

26 February 2021

AMEXCI is developing a deep learning-based module for metal Laser Powder Bed Fusion (L-PBF) defect detection to support the qualification process of Additive Manufactured (AM) parts. Bridging Artificial Intelligence (AI) techniques and AM can bring tremendous value for part qualification through the analysis of AM data, given that AI is becoming more mature and AM is a digital industry. AMEXCI aims to use AI to make AM a reliable and sustainable technology, by ensuring that parts comply with quality standards and by avoiding build failures that imply material loss.

Yet, like any type of intelligence, an artificial one needs to be trained. This training must be based on structured input knowledge, meaning an organized and meaningful set of data. You want your model to automatically recognize cats and dogs? Get thousands of images of cats and dogs from the net and train it. You want your model to recognize deformations during your L-PBF print? Well… there is no such labelled database available out there. You just need to create one.

Creating this labelled database as a ground truth for machine learning development requires a great effort in data acquisition, structuring and annotation before one can build a model on it. At AMEXCI, we have been creating this defect database from our internal build jobs, based on surface, geometrical and recoating-related deformations within the metal L-PBF process. Once we had gathered enough data, the next and most challenging step was to label it accurately.

In this process we wondered: how could the labelling itself be facilitated by open innovation?

1. What is data labelling and why AMEXCI did it

There are different types of AI techniques depending on whether you are analysing pictures, values, or text, and different types of AI depending on whether your data is structured and labelled or not. When data is not labelled, one can try the unsupervised way, meaning that the model sorts the dataset and structures it by common features. When the data is already labelled, it is much easier to build a model that can process it. Labelling simply means tagging a data point, that is, attributing a label to it, so that the model can be trained to automatically recognize the learnt feature in a new dataset.

The complexity of any AI project comes from the fact that the data must not only be labelled to ease development, but must also be available in vast quantities. This is why annotating many different data types is so complex, and why developing intelligent tools for data analysis is challenging.

In our case, working with L-PBF monitoring data has not been easy and has raised many questions: what is this colour map telling us about the quality of the layer? What is this blurry melting powder saying about the process? How can one recognize on this type of picture that those internal channels have been deformed? So imagine: how could an AI figure this out by itself when it is hard to make sense of this data at all in the AM industry? AMEXCI’s in-house expertise enabled this labelling.

2. Data labelling in-house – creating this ground truth at AMEXCI

Creating a ground truth for metal Laser Powder Bed Fusion process data requires time and expertise. It is indeed hard to get vast sets of labelled data in a highly specialized field like metal L-PBF. At AMEXCI, we have worked with our domain experts (Application Engineers, Designers, AM Process Specialists) to progressively get an understanding of our process data.

Thanks to our partner Peltarion, we had our 3×40,000 pictures (three types of picture for each process deviation, our main data) clustered by type of defect in an annotation platform. This enabled us to work with a pre-processed dataset of defects. In other words, instead of tagging 120,000 data points we could look at 39 separate clusters, each showing different common features. By analysing the defect patterns in our pictures, and having run specific Designs of Experiments to generate this data, we could understand why the clustering model had sorted certain pictures together in those clusters.
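To illustrate why working at cluster level saves so much effort, here is a minimal sketch in Python of how a label given once to a cluster can be propagated to every picture in it. The function name, picture identifiers and label names are hypothetical and chosen for illustration only; this is not the annotation platform's actual mechanism.

```python
from collections import Counter


def propagate_cluster_labels(cluster_of, cluster_label):
    """Give each picture the label assigned to its cluster.

    cluster_of:    {picture_id: cluster_id}
    cluster_label: {cluster_id: label}, only for clusters reviewed so far
    Pictures whose cluster has not been reviewed yet are left out.
    """
    return {
        picture: cluster_label[cluster]
        for picture, cluster in cluster_of.items()
        if cluster in cluster_label
    }


# Hypothetical toy data: four pictures spread over three clusters.
cluster_of = {"img_001": 0, "img_002": 0, "img_003": 1, "img_004": 2}
# An expert has reviewed clusters 0 and 1, but not yet cluster 2.
cluster_label = {0: "recoating", 1: "geometrical"}

labels = propagate_cluster_labels(cluster_of, cluster_label)
print(labels)
print(Counter(labels.values()))

# Scale of the saving described above: 3 x 40,000 pictures in 39 clusters.
total_pictures = 3 * 40_000
print(total_pictures, "pictures reviewed as 39 clusters")
```

In practice a cluster is rarely pure, so propagated labels of this kind would still need to be spot-checked by an expert.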

Although we have the knowledge required to annotate the pictures, defining this ground truth on how to interpret the data required more input from other experts. That is why we called on researchers in the field through an open innovation process.

3. AM data labelling hackathon – Open-innovation as the way forward

From the 5th to the 18th of February 2021, a labelling hackathon brought together 8 teams of experts in the field of metal L-PBF process data, from different universities across Europe. All experts got access to our annotation platform to tag the pictures using a pre-selected list of labels chosen by AMEXCI. During those two weeks, the annotation progress of the teams was monitored. At the end of the challenge, we at AMEXCI evaluated all inputs based on the quantity of data annotated and the relevance of the labelling. By relevance we mean how closely those annotations aligned with our work at AMEXCI and the work of the other participants.
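The two criteria above, quantity and relevance, can be sketched in a few lines of Python. This is only one possible reading, under the assumption that relevance is measured as agreement with a majority-vote consensus across all teams; the team names, picture identifiers, labels and function names are hypothetical, and this is not the scoring AMEXCI actually used.

```python
from collections import Counter


def consensus_labels(annotations):
    """Majority label per picture across all teams.

    annotations: {team: {picture_id: label}}
    Ties go to the label that was encountered first.
    """
    votes = {}
    for team_labels in annotations.values():
        for picture, label in team_labels.items():
            votes.setdefault(picture, Counter())[label] += 1
    return {pic: counts.most_common(1)[0][0] for pic, counts in votes.items()}


def team_scores(annotations):
    """Per team: number of pictures annotated and share agreeing with consensus."""
    consensus = consensus_labels(annotations)
    scores = {}
    for team, labels in annotations.items():
        agreed = sum(1 for pic, label in labels.items() if consensus[pic] == label)
        scores[team] = {
            "quantity": len(labels),
            "agreement": agreed / len(labels) if labels else 0.0,
        }
    return scores


# Hypothetical toy annotations from three teams.
annotations = {
    "team_a": {"p1": "geometrical", "p2": "recoating", "p3": "false_positive"},
    "team_b": {"p1": "geometrical", "p2": "recoating"},
    "team_c": {"p1": "false_positive", "p2": "recoating", "p3": "false_positive"},
}

scores = team_scores(annotations)
for team, score in scores.items():
    print(team, score)
```

Note that each team's own vote counts towards the consensus in this sketch; a stricter variant would compare each team only against the others, or against a reference labelling by in-house experts.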

Opening the labelling to experts enabled us to gain a better understanding of our pictures by collecting diverse interpretations of our data. Since our defect classification model will be based on this ground truth, making sure that the labels are correct is key. Hence, we could double-check our first label choices: was this a geometrical deformation or rather a false positive? Was this a lack of recoating or a recording problem?

Opening the development process to external experts is key, especially when managing novel and innovative technologies. Through this challenge, AMEXCI demonstrated that it is possible to gather collective intelligence to boost the application of Artificial Intelligence for Additive Manufacturing, where the initial lack of ground truth in the industry can be seen as a major barrier. At the same time, this open approach ensured that the knowledge generated within AMEXCI is disseminated back to research institutions around Europe, supporting further progress in the industry for years to come.

We want to raise awareness among businesses of the benefits of collaboration in overcoming the biggest challenges, using this example of applying AI to AM. It is necessary to encourage collective data generation, data cooperation and exchanges across both companies and research entities. At AMEXCI we strongly believe in building those ecosystems of trust to enhance collaboration, as illustrated by the Rosetta Protocol for data sharing among AM users, which we initiated almost a year ago. Having created this dynamic for data sharing, we now want to promote the establishment of a possible voluntary labelling scheme within the Additive Manufacturing industry. Join us and reach out to maud.chidiac@amexci.com if you are facing the same challenges!
