Cooking up a report, but first, something fun

To begin this post, it is worth sharing a company update. WE HAVE A FIRST CUSTOMER. More accurately, we should say “user,” as we are not asking this group for money. They have happily consented to being included in today’s blog: they are Natural Way Food Group (NWFG), a Fayetteville-based peanut butter manufacturer. We will take this opportunity to say that if you have not tried their peanut butter, you should. It is genuinely excellent. With its short ingredient list, their product is a novelty in the modern age. They are a perfect case study for our group as a food industry manufacturer here in Northwest Arkansas. Further, because they are a growing factory, they have growing-factory problems. These are the same struggles most factories face, but amplified by a lack of support from the traditional channels the industry relies on. We will save the incoming rant for the next blog post, where we will dive more deeply into what we are actually doing with NWFG, but it is hard not to call out how poorly they have been treated. At Innova-Harmonics, we believe successful manufacturing comes from focused, valuable engineering attention, and it borders on criminal how difficult it is to secure that attention when starting a factory from scratch.

For next week’s update, look forward to how we used a virtual machine to revive a piece of software that is over twenty years old, and hopefully a machine of similar age in their facility. In the meantime, today we are going to talk about a fun dataset we came across while studying vibrational harmonics and their use in machine learning for industry. I, Nathaniel, am going to do my best not to butcher the data science behind the paper and the dataset, while almost certainly earning our CSO Micah, a data scientist, a few extra points on his blood pressure score.

The dataset we are talking about today is “ToyADMOS: A Dataset of Miniature Machine Operating Sounds for Anomalous Sound Detection.” It is a genuinely interesting paper, available on arXiv: https://arxiv.org/pdf/1908.03299

As you can see from the title, figures, and abstract, this paper and dataset focus on compiling data for anomaly detection using toys. The approach strikes us as genuinely clever, especially having seen firsthand the cost of analyzing full-scale industrial machines. We at Innova-Harmonics get particularly excited reading even the opening line of the introduction: “Since anomalies might indicate faults or malicious activities, prompt detection of anomalies may prevent such problems. Microphones have been used as sensors to detect anomalies, referred to as anomaly detection in sounds (ADS) [1] or acoustic condition monitoring [2], in many applications such as audio surveillance [3–6], machine condition inspection, and fault diagnosis [7–9].” Honestly, that is really cool.

They go on to state that, to the best of their knowledge, no freely available datasets exist for anomaly detection in sounds, and we agree. It is striking just how valuable this line of research could be if paired with the right datasets. I am struggling to remember whether I have said this before here on the blog, but where we find ourselves in industry is at the intersection of “we know what we want” and “we know it is hard to get.” Individuals and organizations can often provide electrical hardware, software, machines to pull data from, or the means to compile that data, but very rarely all of those pieces together in one place.

The paper addresses this problem by working with three different toy types and introducing several ways those toys can “malfunction” through intentional damage. The recordings are then labeled as “normal sound,” “anomalous sound,” or “environmental noise.” The environmental noise samples are included to simulate different factory conditions, with the goal of building simple baseline models that can later be adapted to more realistic use cases. The whole setup is refreshingly approachable, and we hope it leads to stronger and more accessible research in this space going forward. We certainly found the dataset useful, if only as a motivator for the approaches we are now taking with other data sources and other machines.
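To make the environmental noise idea a little more concrete, here is a minimal sketch of how a clean machine recording could be blended with a noise recording at a chosen signal-to-noise ratio. To be clear, this is our own illustration and not code from the paper or the dataset: the function names `mix_at_snr` and `measured_snr_db`, the 6 dB target, and the synthetic sine-plus-random-noise signals standing in for real recordings are all assumptions made for the example.

```python
import numpy as np


def mix_at_snr(signal, noise, snr_db):
    """Add noise to signal, scaled so the mix has the requested SNR in dB.

    Hypothetical helper for illustration; not part of any dataset tooling.
    """
    signal = np.asarray(signal, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)

    # Repeat the noise if it is shorter than the signal, then trim to length.
    if len(noise) < len(signal):
        reps = int(np.ceil(len(signal) / len(noise)))
        noise = np.tile(noise, reps)
    noise = noise[: len(signal)]

    signal_power = np.mean(signal ** 2)
    noise_power = np.mean(noise ** 2)
    if signal_power == 0 or noise_power == 0:
        raise ValueError("signal and noise must both be non-silent")

    # SNR(dB) = 10 * log10(signal_power / noise_power), solved for the noise gain.
    target_noise_power = signal_power / (10 ** (snr_db / 10))
    gain = np.sqrt(target_noise_power / noise_power)
    return signal + gain * noise


def measured_snr_db(signal, mixed):
    """Recover the SNR of a mix, given the clean signal that went into it."""
    signal = np.asarray(signal, dtype=np.float64)
    residual = np.asarray(mixed, dtype=np.float64) - signal
    return 10 * np.log10(np.mean(signal ** 2) / np.mean(residual ** 2))


# Synthetic stand-ins: a 440 Hz tone as the "machine" and random noise as the "factory".
sample_rate = 16000  # assumed sample rate for this toy example only
t = np.arange(sample_rate) / sample_rate
machine = np.sin(2 * np.pi * 440 * t)
factory = np.random.default_rng(0).normal(size=sample_rate // 2)

noisy = mix_at_snr(machine, factory, snr_db=6.0)
print(f"measured SNR: {measured_snr_db(machine, noisy):.2f} dB")
```

With real data, the two arrays would come from loading a machine recording and a noise recording, and sweeping `snr_db` downward would simulate progressively louder factory floors.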

Readers should check out the paper and, if it sparks their interest, go the extra mile and download the dataset. We believe open datasets like this one would go a long way toward helping industry “catch on” to sound-based anomaly detection as a way of characterizing machine failure.
