How do you create a new data model?
To make a model, we work with all the data that we have to hand. Because wine is a liquid, a sample is very homogeneous, so the components we want to measure are evenly distributed throughout it.
Working with thousands of samples, we have a full spectrum of data to go on, and each spectrum is paired with a reference value measured in the lab. Measuring a spectrum is easy – it’s just a case of pushing a button. The samples are also tested on a reference instrument in the lab, and that reference data is used to create the model.
The beauty of working at FOSS is that we already have all the wine sample data that we need, so we don’t have to go out and collect it – we can make use of the data at our fingertips. We feed the data into the algorithms, and the resulting models predict the values for new wine samples coming into the analyzer. Our data is so well structured and linear that we don’t need fancy methods to build our models – it’s very easy for us to do.
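To illustrate what building a linear model from spectra and reference values can look like, here is a minimal sketch. It assumes ordinary least squares, which the interview does not name as the method actually used, and the function names, the two-channel "spectra" and the reference values are all made up for the example.

```python
def fit_linear_calibration(spectra, reference_values):
    """Least-squares fit of: reference = b0 + sum(b_j * absorbance_j)."""
    # Prepend a constant 1.0 to every spectrum so the model has an intercept.
    rows = [[1.0] + list(s) for s in spectra]
    n = len(rows[0])
    # Build the normal equations (X^T X) b = X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    xty = [sum(r[i] * y for r, y in zip(rows, reference_values)) for i in range(n)]
    # Solve them by Gaussian elimination with partial pivoting.
    aug = [xtx[i] + [xty[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        known = sum(aug[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (aug[i][n] - known) / aug[i][i]
    return coeffs


def predict(coeffs, spectrum):
    """Apply the fitted model to one new spectrum."""
    return coeffs[0] + sum(b * a for b, a in zip(coeffs[1:], spectrum))


# Hypothetical calibration set: absorbance at two wavelengths per sample,
# paired with a lab reference value (synthetic and exactly linear).
spectra = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.4], [0.5, 0.3], [0.4, 0.6]]
reference_values = [3.0, 3.5, 6.0, 7.5, 8.0]

coeffs = fit_linear_calibration(spectra, reference_values)
new_sample_value = predict(coeffs, [0.25, 0.25])
```

A real calibration would use far more wavelengths and samples than this toy set; the point is only that, when the relationship is linear, fitting and applying the model are both short steps.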
How much data do you need for it to be a reliable model, and how can you have a model that works for different wine regions around the world?
We spend a lot of time ensuring that we have covered what we consider to be the main components within a wine sample. We have winemakers on our team, who we work closely with to ensure that we cover all bases, from high sugar and high alcohol wines to low sugar, low alcohol expressions, giving us all the necessary combinations for validation. After nearly 25 years of data gathering, we have what we need to cover the whole world and every possible type of wine sample that may go into one of our machines.
Can you explain the role of infrared light in the analysis process?
When analyzing a wine sample, we use infrared technology to look at the vibrational patterns in its organic compounds. We can measure over 20 key components in a wine sample, from its alcohol and sugar level to tartaric acid – it all comes down to what the infrared can see within the sample. Generally, our instruments can detect any variable above 50 ppm. They have a source that emits infrared light, which passes through a cuvette, a component at the heart of the instrument. The cuvette is a third of the thickness of a sheet of paper – it needs to be incredibly thin in order for the light to pass through it. As the light passes through, it’s modulated and we get a signal.
Different characteristics of the sample stop some of the light going through it, so less light comes out on the other side. The sensor creates an infrared spectrum based on the amount of light that has come out. The instrument is programmed to recognize certain spectral patterns, such as that of tartaric acid, to give just one example. Tartaric acid shows up as a particular pattern, and the instrument can work out its concentration from that. The more data that goes into creating the models, the more reliable the results and the more variables the instrument can cope with.
How is this infrared light used in the modelling process?
All the organic molecules in the liquid have bonds. When light of the right frequency passes through the sample, a bond absorbs its energy, so that light is not detected on the other side.
Our measurements are transmission measurements, as we send light through the samples. We’re looking for what’s not coming through on the other side, that is, how the light is absorbed by the wine sample. This gives us the fingerprint of the wine, which contains the signals of all the molecules present within it, many of which overlap with each other. If we’re testing for malic acid, for example, we can build a model that predicts malic acid and then apply this model to the spectrum to return a malic acid value. It’s a fairly easy process.
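The two steps described here, turning transmitted light into an absorption spectrum and then applying a model to it, can be sketched as follows. This assumes the standard definition of absorbance, A = -log10(I / I0), and a simple linear model; the intensities, intercept and coefficients are hypothetical values chosen for illustration, not figures from a FOSS instrument.

```python
import math


def absorbance(sample_intensity, reference_intensity):
    """Absorbance per wavelength channel: A = -log10(I / I0)."""
    return [
        -math.log10(i / i0)
        for i, i0 in zip(sample_intensity, reference_intensity)
    ]


def apply_model(intercept, coefficients, spectrum):
    """Linear model: weighted sum of the absorbances plus an intercept."""
    return intercept + sum(b * a for b, a in zip(coefficients, spectrum))


# Hypothetical detector readings at three wavelengths.
reference_intensity = [1000.0, 1000.0, 1000.0]  # light reaching the sensor with nothing absorbed
sample_intensity = [100.0, 500.0, 1000.0]       # light reaching the sensor through the sample

spectrum = absorbance(sample_intensity, reference_intensity)

# Hypothetical malic acid model (intercept and one coefficient per wavelength).
malic_acid = apply_model(0.2, [1.5, 2.0, -0.5], spectrum)
```

In this sketch the first channel loses 90% of its light and so has an absorbance of 1, while the third channel loses nothing and contributes nothing to the prediction.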
How unique is this data modelling concept?
We are no longer unique in using the modelling idea – our value lies in the fact that we were first movers and now have data that we’ve collected over the last 25 years. The data covers all the different wines in the world from a variety of growing seasons and vintages, which hasn’t been done before on such a scale, and that is what makes us unique.
Are the analyzers updated from a data cloud, or would customers need to buy a new instrument if they want the latest updates?
FOSS updates the instruments remotely all the time through a data cloud, much like a software update that installs while your phone is on charge, so users don’t need to worry about it. Our FOSS analyzers are quite alike in nature, so if you have two or more analyzers, you can put them in a network and apply the adjustments to your whole fleet through the cloud.
What are the main things to consider about data modelling?
It’s important to know that a data model is never a fixed thing; you can always update it and change it. The value lies in the data that goes into the model, not the model itself.
The more data you put into these models, the better and more reliable the results you get out of them. People often forget that and just look at the graph of reference values against predictions. Some less sophisticated wine analyzers may seem good to begin with, but they don’t operate with the same level of core data, so if there is a change in season or conditions that the machine doesn’t recognize, the model won’t be able to handle the sample and you’ll be thrown off course.
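One simple way to picture a sample that the model "doesn't recognize" is a spectrum that falls outside the range of the data the model was built on. The sketch below is an illustrative assumption, not a description of how any analyzer actually flags such samples; the function names and numbers are hypothetical.

```python
def calibration_range(calibration_spectra):
    """Lowest and highest absorbance seen at each wavelength channel."""
    channels = zip(*calibration_spectra)
    return [(min(c), max(c)) for c in channels]


def channels_outside_range(spectrum, ranges):
    """Indices of channels where a new sample lies outside the calibration data."""
    return [
        index
        for index, (value, (low, high)) in enumerate(zip(spectrum, ranges))
        if value < low or value > high
    ]


# Hypothetical calibration spectra with two wavelength channels each.
calibration_spectra = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.3]]
ranges = calibration_range(calibration_spectra)

familiar = channels_outside_range([0.3, 0.3], ranges)    # inside the covered range
unfamiliar = channels_outside_range([0.7, 0.3], ranges)  # channel 0 is beyond anything seen
```

The broader the calibration data, the wider these ranges become and the fewer new samples fall outside them, which is the practical sense in which more data makes a model more robust.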