Creme Global Head of Data Modelling and Statistics, John O’Brien, offered his insights for the discussion paper ‘AI & Predictive Analytics for Food Risk Prevention’, published as part of the EU-funded digital transformation initiative called Big Data Grapes.
What impacts do you believe that Artificial Intelligence (AI) and predictive analytics are having on the food supply chain?
So far, the impact of AI/machine learning/predictive analytics seems to be concentrated at one end of the food supply chain, with retailers using data for demand forecasting and customer engagement. As an example, the French retailer Intermarché reported a 15% improvement in accuracy and a 75% reduction in error rates when using a demand-forecasting AI compared with its existing statistical systems.
Meeting predicted demand effectively will require extending these methods back into supply chain planning, and some retailers are starting to use AI solutions to address areas such as inventory and warehouse management, delivery coordination, and shelf-life prediction.
More broadly, there are other interesting areas within the manufacturing environment that are amenable to predictive analytics, ranging from assessing incoming material, to safety within the factory, to exposure and reformulation studies.
What are the critical food risk assessment and prevention questions that you would expect AI to help answer?
All foodstuffs are at risk of microbial contamination, with potential consequences of pathogenicity and/or spoilage. This is a particular issue in food processing plants, where throughput is high and varied, where many niches for microbial growth may exist, and where large numbers of consumers may be impacted by a contamination event.
These microbial species do not exist in isolation, but many species can be found together forming a community that is known as a microbiome. Understanding the risk from microbial contamination means understanding the microbiome’s composition – which organisms are present and in what numbers. Compounding the problem, many bacteria (or other microbes) are non-culturable, either because they are unknown or because they are not recoverable using current growth media and conditions and so are intractable to conventional techniques.
The solution to that issue is to use DNA sequence data, which can now be obtained without the need for culturing the microbes beforehand. There are a number of different techniques to achieve this, but they all overcome the culturing problem and provide a catalog of the bacteria present in the particular microbiome under investigation.
This is where predictive analytics comes in. It has been shown that the presence of particular bacteria in a microbiome can create conditions conducive to the growth of other organisms. This means that the make-up of the microbiome can be predictive of the appearance of organisms of concern, such as pathogens or food spoilage microbes. This is an area in which Creme Global is currently active.
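As a rough illustration of the idea (on entirely synthetic data, not Creme Global’s actual model), the sketch below trains a simple classifier to predict a risk flag from microbiome composition. The taxon names, sample counts and label are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Each row is one environmental swab; the columns are relative abundances of taxa
# recovered by culture-free DNA sequencing. The label marks whether an organism of
# concern was subsequently detected at that site. All values here are synthetic.
rng = np.random.default_rng(0)
taxa = ["taxon_A", "taxon_B", "taxon_C", "taxon_D"]
X = pd.DataFrame(rng.dirichlet(np.ones(len(taxa)), size=200), columns=taxa)
y = (X["taxon_A"] + rng.normal(0.0, 0.05, 200) > 0.35).astype(int)  # synthetic label

# Fit a classifier that predicts the risk flag from microbiome composition.
model = RandomForestClassifier(n_estimators=200, random_state=0)
print("Cross-validated accuracy:", cross_val_score(model, X, y, cv=5).mean())
```

In practice the features would come from real sequencing counts and the label from confirmed detections, but the shape of the problem is the same: composition in, risk flag out.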
What kind of public or private data assets will be required to train and feed such AI applications?
Since DNA sequence data is the key to unlocking the microbiome, annotated repositories of reference sequences are required to identify the microbiome’s constituent organisms.
A thoroughly annotated genome sequence permits a deeper analysis of the microbiome, where many interesting questions about the organisms present can be asked. What can they metabolize? What temperatures do they like? Can they form spores? Which antimicrobial agents are they susceptible to?
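As a minimal sketch of how such an annotated catalogue might be queried, the example below uses a toy record structure; the organism names and every trait value are illustrative placeholders, not authoritative genome annotations.

```python
from dataclasses import dataclass, field

# Toy annotation record: the fields mirror the kinds of questions above
# (metabolism, growth temperature, spore formation, antimicrobial susceptibility).
@dataclass
class GenomeRecord:
    organism: str
    metabolises: set = field(default_factory=set)
    growth_temp_c: tuple = (0.0, 0.0)   # (min, max) growth temperature
    spore_former: bool = False
    susceptible_to: set = field(default_factory=set)

catalogue = [
    GenomeRecord("Organism A", {"glucose"}, (10.0, 48.0), True, {"agent X"}),
    GenomeRecord("Organism B", {"lactose", "lipids"}, (2.0, 32.0), False, set()),
]

# Example queries: which organisms could grow at chill-chain temperatures,
# and which are spore formers?
chill_tolerant = [r.organism for r in catalogue if r.growth_temp_c[0] <= 5.0]
spore_formers = [r.organism for r in catalogue if r.spore_former]
print(chill_tolerant, spore_formers)
```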
Which factors will accelerate or impede the widespread adoption of AI and predictive analytics for food risk prevention?
Speed will be the critical factor. All the utility and power of these tools will count for little if it takes weeks to provide results. Of course, in some circumstances speed is not crucial, for example when conducting a baseline survey, temporal or spatial, of the microbiome in a plant. But if the tools are to be used, and trusted, for predictive and diagnostic purposes on a regular, ongoing basis, then prompt output is essential. The need of machine learning algorithms for training data can mean a lag before such techniques are implemented, but this is less likely to be an issue: in many cases archives of the required data will already be available, and where they are not, the burden of leaving existing protocols in place while the data are collected will be slight.
Another factor will be presenting results in a manner that the user can interpret. This is a challenge for AI generally, where the reasons for particular outcomes can be very difficult to isolate. Even so, leaving the user with little more to go on than “because I said so” can result in mistrust of the tool.
Beyond hampering acceptance, this matters because, for all their seeming omniscience, AI/ML algorithms are blind to everything outside their input. Users who are familiar with the plant and its processes can be good partners to the algorithm when they have a good sense of how its decisions are made, potentially to the extent of providing valuable feedback that further improves the algorithm’s performance.
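One hedged sketch of supporting that partnership, again on hypothetical data rather than any particular production system, is to report which features most influence the model’s predictions so that plant staff can weigh them against their own experience.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

# Hypothetical data, as in the earlier sketch: relative abundances per swab plus
# a synthetic label for whether an organism of concern was later detected.
rng = np.random.default_rng(1)
taxa = ["taxon_A", "taxon_B", "taxon_C", "taxon_D"]
X = pd.DataFrame(rng.dirichlet(np.ones(len(taxa)), size=200), columns=taxa)
y = (X["taxon_A"] > 0.35).astype(int)

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)

# Taxa whose shuffling hurts accuracy most: a rough, inspectable answer to
# "why did you say that?" which an operator can compare with plant experience.
for name, score in sorted(zip(X.columns, result.importances_mean), key=lambda p: -p[1]):
    print(f"{name:10s} {score:.3f}")
```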
Finally, and perhaps obviously, the tools must offer an improvement on conventional approaches. The improvement should be in predictive power but, ideally, in cost and time as well. In many cases the predictive tools will likely run side by side with the conventional approaches used in the plant at first, and this is the window for the algorithm to prove itself and demonstrate its worth to the user.