Blogs


MLOps: debugging a pipeline

26.11.2022 11:34

Healthcare in England is broken down into about 40 regions. For each region, we want to measure the differences in clinical outcomes conditioned on the ethnic and socioeconomic categories of the patients. To do this, we feed the data for each health region into a Spark GLM (generalized linear model).

The problem

Everything was fine with our pipeline for six months before it started to blow up with: Caused by: org.

MLOps debugging: an example

11.04.2022 11:43

In our ML pipeline, we use generalized linear models to calculate the odds of certain clinical outcomes. We showed this to the client, but odds were hard for them to understand. “Can we have probabilities instead?” they asked. So, having trained the GLMs, we feed the same data back through them and calculate the average predicted probability for each cohort. We then bastardise the data to create the counterfactuals. For example, socioeconomic status is of interest, so we set everybody to the same level (the counterfactual), make our predictions again and compare them with the factual results.
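The factual-versus-counterfactual comparison can be sketched in a few lines of plain Python. This is only an illustration, not the pipeline's actual Spark code: it assumes a logistic GLM, and the coefficients, the feature names (`deprived`, `age_over_65`) and the four-patient cohort are all made up for the example.

```python
import math

# Hypothetical fitted coefficients on the log-odds scale (illustrative values only).
COEFFS = {"intercept": -2.0, "deprived": 0.8, "age_over_65": 0.5}


def predict_probability(row):
    """Apply the inverse logit link: turn the linear predictor (log-odds) into a probability."""
    log_odds = COEFFS["intercept"] + sum(
        COEFFS[name] * row[name] for name in COEFFS if name != "intercept"
    )
    return 1.0 / (1.0 + math.exp(-log_odds))


def mean_probability(rows):
    """Average predicted probability over a cohort."""
    return sum(predict_probability(r) for r in rows) / len(rows)


def counterfactual(rows, feature, value):
    """Return a copy of the data with one feature forced to the same value for everyone."""
    return [{**r, feature: value} for r in rows]


# Made-up cohort: 1 means the patient is in the category, 0 means they are not.
patients = [
    {"deprived": 1, "age_over_65": 0},
    {"deprived": 1, "age_over_65": 1},
    {"deprived": 0, "age_over_65": 1},
    {"deprived": 0, "age_over_65": 0},
]

factual = mean_probability(patients)
levelled = mean_probability(counterfactual(patients, "deprived", 0))
print(f"factual: {factual:.3f}, counterfactual: {levelled:.3f}, "
      f"difference: {factual - levelled:.3f}")
```

The difference between the two averages is the quantity that can be reported as a change in probability rather than as an odds ratio. Note that `counterfactual` builds new rows and leaves the original data untouched, so the factual predictions can be recomputed at any point.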