Let's take a quick look at what sits at each layer, bottom-up.
First, you still have the software backend.
Yep, you cannot ignore this: let's place it at the base of the pyramid.
To generate predictions, you need to invoke the ML model somehow. The simpler case is batch inference. You can run it daily, hourly, or on demand and use a workflow manager to orchestrate the process. The job accesses the data source, runs the model, and writes the predictions to a database. Online inference is a bit more complex. You might wrap the model as a service and expose a REST API to serve predictions on request. There are more moving pieces to track.
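To make the batch flow concrete, here is a minimal sketch of such a job: read rows, run the model, write predictions to a database, and return a small run summary that software-level monitoring could pick up. Everything named here is an assumption for illustration: `run_batch_job`, `toy_model`, the `predictions` table, and the use of SQLite as the database. A real setup would plug in its own data source, model, and storage.

```python
import sqlite3
import time
from typing import Callable, Dict, Iterable, Sequence, Tuple


def toy_model(features: Sequence[float]) -> float:
    """Stand-in for a real model (hypothetical): the mean of the features."""
    return sum(features) / len(features)


def run_batch_job(
    rows: Iterable[Tuple[int, Sequence[float]]],
    predict: Callable[[Sequence[float]], float],
    conn: sqlite3.Connection,
) -> Dict[str, object]:
    """Score every row, store the predictions, and report how the run went."""
    started = time.perf_counter()
    conn.execute(
        "CREATE TABLE IF NOT EXISTS predictions (row_id INTEGER, prediction REAL)"
    )
    status = "success"
    written = 0
    try:
        for row_id, features in rows:
            conn.execute(
                "INSERT INTO predictions VALUES (?, ?)",
                (row_id, predict(features)),
            )
            written += 1
        conn.commit()
    except Exception:
        # Discard the partial batch so the table never holds half a run.
        conn.rollback()
        status = "failed"
        written = 0
    return {
        "status": status,
        "rows_written": written,
        "duration_s": time.perf_counter() - started,
    }


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    batch = [(1, [1.0, 3.0]), (2, [2.0, 2.0])]
    print(run_batch_job(batch, toy_model, connection))
```

The returned summary is the part that matters for monitoring: it records whether the job finished, how many rows it wrote, and how long it took.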
At the ground level, you still need to monitor how this software component works. Did the prediction job execute successfully? Did the service respond? Is it working fast enough?
Second, you have the data.
Production ML models take new data as input, and this data changes over time. Many data quality and integrity issues can also occur at the source or during transformation.
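As an illustration of what a basic data quality check might look like, the sketch below counts missing and out-of-range values per column in a batch of input rows. The column names, the allowed ranges in `EXPECTED_RANGES`, and the `check_batch` helper are all hypothetical. In practice, the expectations would come from your own schema or from the training data.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical expectations: column name -> (min, max) allowed values.
EXPECTED_RANGES: Dict[str, Tuple[float, float]] = {
    "age": (0, 120),
    "basket_value": (0.0, 10_000.0),
}


def check_batch(
    rows: List[Dict[str, Any]],
    expected_ranges: Dict[str, Tuple[float, float]],
) -> Dict[str, Dict[str, int]]:
    """Count missing and out-of-range values for every expected column."""
    report = {
        column: {"missing": 0, "out_of_range": 0} for column in expected_ranges
    }
    for row in rows:
        for column, (low, high) in expected_ranges.items():
            value = row.get(column)
            if value is None:
                # Covers both an absent key and an explicit null.
                report[column]["missing"] += 1
            elif not (low <= value <= high):
                report[column]["out_of_range"] += 1
    return report


if __name__ == "__main__":
    batch = [
        {"age": 34, "basket_value": 20.0},
        {"age": None, "basket_value": -5.0},
        {"age": 150, "basket_value": 50.0},
    ]
    print(check_batch(batch, EXPECTED_RANGES))
```

A report like this can gate the prediction job: if too many values are missing or out of range, it may be safer to stop than to score the batch.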
This data represents the model's reality, and you must monitor this crucial component. Is the data OK? Can you use it to generate predictions? Can you retrain the model using this data?
Third, you have the hero: the ML model itself.
No model is perfect, and no model lasts forever. Still, some of them are useful and relevant for the given task. Once the model is in production, you must ensure its quality remains satisfactory.
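One simple way to check that quality "remains satisfactory" is to compare a metric computed on recent production data against a minimum you are willing to accept. The sketch below does this for accuracy. It assumes a classification task and assumes the true labels eventually become available, which is not always the case. The `accuracy` and `quality_alert` helpers and the 0.8 default threshold are illustrative, not a recommendation.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("cannot compute accuracy on an empty batch")
    correct = sum(1 for truth, pred in zip(y_true, y_pred) if truth == pred)
    return correct / len(y_true)


def quality_alert(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    min_accuracy: float = 0.8,  # hypothetical threshold, set per use case
) -> bool:
    """Return True when accuracy on this batch falls below the minimum."""
    return accuracy(y_true, y_pred) < min_accuracy


if __name__ == "__main__":
    labels = [1, 0, 1, 1]
    predictions = [1, 0, 0, 1]
    print(accuracy(labels, predictions), quality_alert(labels, predictions))
```

The same pattern works with any metric that fits the task, such as precision, recall, or mean absolute error.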
This model-focused component of ML monitoring is the most specific one. Is the model still fit for the task? Are the predictions accurate? Can you trust them?
Lastly, there is the business or product KPI.
No one uses ML to get "90% accuracy" in something. There is a business need behind it, such as converting users into buyers, making them click on something, getting better forecasts, decreasing delivery costs, etc. You need a dollar value assigned to the model, a measurable product metric, or the best proxy you can get.
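As a rough sketch of tracking a product metric, the code below computes a conversion rate and checks whether it has dropped too far below a baseline. The choice of conversion rate as the KPI, the baseline value, and the 10% tolerated relative drop are all assumptions for illustration. The right metric and threshold depend entirely on the product.

```python
def conversion_rate(buyers: int, visitors: int) -> float:
    """Share of visitors who became buyers."""
    if visitors <= 0:
        raise ValueError("visitors must be positive")
    return buyers / visitors


def kpi_within_tolerance(
    current: float,
    baseline: float,
    max_relative_drop: float = 0.1,  # hypothetical: tolerate a 10% relative drop
) -> bool:
    """Return True while the KPI has not fallen too far below the baseline."""
    return current >= baseline * (1 - max_relative_drop)


if __name__ == "__main__":
    rate = conversion_rate(buyers=48, visitors=1000)
    print(rate, kpi_within_tolerance(rate, baseline=0.05))
```

A drop in the KPI does not prove the model is at fault, since many other factors move product metrics, but it is a signal to look at the layers below.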
That is the ultimate reason you have the ML system in place, and it is the tip of the monitoring pyramid. Does the model bring value to the business? Are the product metrics affected by the model OK?
An ML system has these four components: the software piece, the flowing data, the machine learning model, and the business reason for its existence.
Since an ML system is all of the above, ML monitoring has to be too.