SH: To get started, what is your answer to the question ‘what is a model’?
AA: I’m glad that you said ‘what is your answer to this’ because these definitions are difficult. You need to be very careful about what you include and exclude. Broadly speaking, a prediction model is a simplified representation of reality that allows us to make projections into the future, to make educated guesses about what is more likely to happen. A familiar example would be a weather forecast. It gives us an idea of what might happen tomorrow and it allows us to make better decisions. Models are educated guesses, with a level of uncertainty about what might happen in the future, but generally they allow us to make better decisions.
SH: That part I find interesting: they’re "educated guesses".
AA: With any model you simplify reality, and you do that with a set of assumptions. You also feed your models some data. Your model is only as good as your assumptions and as good as your data. This is one of the places where you can criticize a model: if you don’t agree with certain assumptions of the model, or if you think the data is inadequate or biased. All predictive models are concerned about what would happen in the future under different sets of assumptions. Those could be ‘what if’ scenarios – 'what if we do this?', 'what if we did that?'. The idea is to give you a peek into what’s more likely to happen under different scenarios and, hopefully, that will allow you to make better decisions.
SH: So, what is a 'clinical prediction' model?
AA: Again, broadly speaking, a clinical prediction model is a predictive model that applies to either outcomes of diseases or diagnosis of diseases. Say you have a CT scan of someone’s chest and you want to determine whether they have a certain disease or not, or a patient is sitting in front of you and you want to predict their risk of developing a certain complication. If you can differentiate your patient population into high-risk and low-risk individuals, then you might be able to tailor treatment to their individual needs in a more efficient way. You might treat high-risk patients more aggressively to protect them from, say, a heart attack, and low-risk patients less aggressively to protect them from the potential harms that are associated with treatment.
SH: Can you tell me about ACCEPT?
AA: ACCEPT stands for Acute COPD Exacerbation Prediction Tool. COPD, or chronic obstructive pulmonary disease, is a common disease of the lungs. It’s a major cause of death and disability across the globe. In Canada, for example, COPD is the second leading cause of hospitalization, second only to childbirth, so it places a huge burden on patients and society in general. Most of this burden is due to occasional flare-ups of symptoms, something we refer to as 'exacerbations' or lung attacks, when the patient gets worse and has to seek care. It could be as mild as calling their doctor, going to see their doctor, or as severe as being hospitalized in an Intensive Care Unit.
Because these lung attacks are the major driver of the burden of COPD, if you get an idea of who’s more likely to face a lung attack, that could help provide preventative treatment targeted at those high-risk individuals. This is what ACCEPT does. ACCEPT uses simple clinical and demographic data, such as age, biological sex, current medications, and lung function tests, to predict the rate, severity, and probability of exacerbations for an individual over the next year. That allows the doctor to tailor treatment to the needs of that specific patient. For example, there are two preventative therapies. Daily azithromycin is an antibiotic therapy that can reduce the risk of exacerbation but is also associated with some adverse effects, including hearing loss. Roflumilast, another potential preventative treatment, is likewise associated with some harm. A benefit-harm study showed that the benefits of roflumilast would outweigh the harms if the patient has a severe exacerbation risk of at least 22%. This shows how, by having an idea of how likely an individual is to have a lung attack, we can tailor the treatment to the particular needs of that individual and hopefully achieve a better outcome.
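The benefit-harm threshold described above amounts to a simple decision rule. The sketch below is purely illustrative, not ACCEPT’s actual code: the function name and example risk values are hypothetical, and only the 22% figure comes from the benefit-harm study mentioned in the interview.

```python
# Illustrative sketch of the roflumilast benefit-harm threshold described
# above. The 22% figure is from the benefit-harm study mentioned in the
# interview; the function and inputs are hypothetical, not ACCEPT's code.

ROFLUMILAST_RISK_THRESHOLD = 0.22  # predicted one-year severe exacerbation risk

def benefits_outweigh_harms(predicted_severe_risk: float) -> bool:
    """True when the predicted severe exacerbation risk is high enough that
    the expected benefit of roflumilast outweighs its expected harms."""
    return predicted_severe_risk >= ROFLUMILAST_RISK_THRESHOLD

# A high-risk and a low-risk hypothetical patient:
print(benefits_outweigh_harms(0.30))  # True
print(benefits_outweigh_harms(0.10))  # False
```

In practice this number would feed into shared decision-making rather than act as an automatic rule, as the interview goes on to discuss.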
SH: How do you define harms and benefits?
AA: Harms are side effects and adverse events that are associated with therapy. Say you take the treatment to prevent exacerbation and you end up with hearing loss; that would be the harm. The benefit, in this context, would be preventing an exacerbation or not receiving a medication that you do not need. So, when we talk about benefit or harm we are kind of talking about what can happen to the quality of life.
SH: So, would the harms and benefits be put into a [formal] quality of life framework?
AA: That is usually beyond the scope of a model development and validation study, but yes, this is something you would do if you are doing a cost-effectiveness analysis or a harm-benefit analysis. Our model development and validation study, however, ends with reporting the risk. We report to the doctor and the patient the risk of a severe exacerbation within the next year and how many exacerbations the patient might experience within that year, and it will be up to the doctor and the patient to use this information in their shared decision-making process.
SH: In the clinical context [using ACCEPT], would the interaction between the doctor and the patient go like, "If you go with the preventative treatment, these are the odds that you’re going to have hearing loss, or these are the odds of another adverse outcome"? Then would it be up to the patient to weigh those harms and benefits for themselves?
AA: Pretty much. Shared decision-making is one of the things that this model enables. The doctor and the patient can sit together in front of the results of a prediction model and discuss the potential benefits of this treatment and the potential harms of this treatment, and then it would be up to them both to decide on which course of treatment they want to go with. It would be one of the possibilities that this model would bring about.
SH: Can you tell me about how you went about choosing the predictors of exacerbation that made it into ACCEPT?
AA: Sure. Broadly speaking, there are two different strategies for choosing predictors in any model. Either you can decide on what your predictors will be – based on, say, clinical expertise – before you start doing anything. Or, you could go and collect data and use your data to guide the process of selecting predictors. This latter approach is something that people refer to as variable selection. The risk with variable selection would be that you open yourself up to the potential of having bias, 'overfitting' and 'optimism', because you may pick up random noise in your particular dataset and interpret it as a signal to predict the outcome. It might work in your particular dataset, but as soon as you move to another dataset, or to the real world, it might not hold up. To prevent that, we prespecified predictors using clinical knowledge. We consulted clinician experts and asked them what they consider when they are trying to determine risk for COPD exacerbation.
We also did a survey of the literature. We looked at the variables that people use in actual clinical practice to guide their decision-making. We basically came up with the predictors beforehand, based on clinical knowledge. This method has the advantage of preventing overfitting, bias, and optimism. It also has the advantage of clinicians potentially being more willing to take up this model because it makes sense to them. It follows their line of thinking and makes it a bit more numerical and objective.
SH: Are there other predictors of exacerbation than those included in ACCEPT?
AA: Yes, for example, there are behavioural and environmental factors. I’ve seen studies that have collected daily symptom reports from patients, something the patient would text or enter on their smartphone on a daily basis. There have been efforts to harness the predictive potential of that data. As we move towards using more and more smart devices that have a bunch of sensors on them and collect information from us, we will have more data to look at for these kinds of predictions. In the case of COPD, nothing of that sort has made it to clinical application, but it does remain a possibility that I believe will be explored more and more in the future.
SH: What other variables of interest would be fruitful to have?
AA: Air quality, for example, indoors and outdoors. In our model, we used data from three different clinical trials. If we had data on pollution exposure per individual, that could have benefited us immensely, but the data is just not that easily available. There are also nuances to that data. For example, you may live in a city or a particular neighbourhood of a city that does not have a lot of air pollution, but you may be a professional cook, and that subjects you to a considerable amount of particulate matter pollution. You know how radiologists wear dosimeters that record their radiation exposure? If we could have something like that, reporting on the air pollution exposure an individual accumulates over the course of their daily living, that might have some predictive power in it. Another thing is infections. Many exacerbations are caused by infections of the lower respiratory tract. The more data on that you have, and the closer it is to the timeframe in which you want to predict exacerbations, the better.
SH: Why do you think we don’t collect data on some exposures, like workplace hazards, like air quality?
AA: There are a couple of things that come to my mind. First of all, just to clarify, we used clinical trial data to build this model, and those clinical trials were aimed at evaluating a preventative therapy for exacerbations. They were looking at a treatment, at a medication, and we’re doing a secondary analysis of the data collected in those trials. The purpose of these trials was not to look at something like air quality, so they didn’t collect air quality data because they didn’t need it. The reason we are using clinical trial data is that it is of very good quality, both in terms of exposures and outcomes. It is also cost-effective, because the data has already been collected.
SH: If you had built ACCEPT with a patient partner in the team, is that something that would have changed the process?
AA: We do have a patient advisory committee, and they did provide input on the web app for ACCEPT. They say in product design all the time ‘talk to the user, talk to the user, talk to the user’. It’s common sense because if you build something for someone and they’re not willing to use it, then it’s pointless, right? And potentially ACCEPT could be used as a shared decision-making tool. Because of that, it was important to see what patients think of the model. When we did consult our patient partners, one of the things that came up was a lot of them had privacy concerns when they were using the web app. This was something that we never thought of because it was obvious to us that the web app was not recording any personal information, but patients didn’t know this and they were a little bit alarmed by it. The lesson we learned there was that we need to explain this better. We need to make sure that we make it clear that we’re not collecting any of this data. We just use it on the fly to make a prediction and it goes away. So again, there are all sorts of aspects of the work that can be affected by this input. Patient input is definitely something that you can benefit from, especially when it comes to under-represented populations. One of the positive developments of the past few years has been giving more attention to things like race and ethnicity. So, are we looking at how this disease might progress differently in different ethnic, racial populations, or are we just developing all of this for a certain ethnic group? Again, these are things in which we can benefit a lot from the patient’s perspective. It’s important to underscore that we, as modellers, have a certain perspective that is not identical to that of a patient. 
We don’t have that lived experience; we may not be able to appreciate the many ways these predictive tools might affect patient care, and that is why it’s important for us modellers to listen and to learn from patients, so we can design a tool that is useful to them.
SH: Was there ever any point in developing ACCEPT where you came to a fork in the road?
AA: This question reminds me of a very interesting article Andrew Gelman wrote a couple of years ago, which he cleverly called the garden of forking paths. The reality of developing models is that there are many of these forks in the road; we notice some of them and we make explicit choices, but there might be many other decision points where the researcher isn’t even consciously aware that they are making a choice.
One thing that we did explore but eventually did not include in the model was the effect of seasonality. We do know that exacerbations have a seasonal pattern, for a wide variety of potential reasons, including flu season. But then we ran into a number of problems in terms of the computational capacity required to incorporate that variable relative to the potential benefit it would have. It was a lot of complication and cost for little predictive benefit to the model. We ended up reporting the outcome as a one-year prediction, and part of the reason was that in a 12-month prediction you’re averaging out the seasonal effects, so you don’t need to worry about them. Whereas if you want to say whether or not your patient will have an exacerbation within the next six or three months, then it matters what time of the year it is right now. So that was one of those decisions.
SH: You recently wrote about making clinical prediction models available online. Can you break down your argument?
AA: Sure. We published an opinion piece in which we talked about the problem of having too many prediction models and not enough validations. At the time we were writing that manuscript, there were 408 clinical prediction models for COPD outcomes. This was more than the total number of completed phase III randomized trials, ever, on COPD outcomes at that point in time. All these models have been built, but very few of them end up being actually used in the clinic, for a wide variety of reasons. One of them is that you need to extensively validate a model. Ideally, you want to validate your model across geographies, in different parts of the world, and across different populations, depending on your model, obviously. If you have a model that has been widely validated, then you will have more confidence in using it in a clinical setting or in running the randomized trials to see whether or not the model is actually useful in the clinic. But not enough validations are being done. One reason is that there’s not a lot of incentive to do a validation study. It’s much easier to attract funding for a new prediction model than for a study that validates someone else’s model, even though this validation might actually be more useful.
We are suggesting a potential strategy to solve this problem. We’re saying, look, we might be able to take inspiration from the principles underlying natural selection and selective breeding. For millennia, our ancestors have bred animals and plants for desired traits. What if we could do something similar for clinical prediction models? What if we could sort of make them come alive, give the ones that perform better a bit of survival advantage, and wait for the best models to emerge? Because the risk right now is that there might be some very good models being developed, but they are kind of buried in the noise.
So, the idea is to put models on the cloud and, with the required privacy and security measures in place, constantly feed them data from administrative databases, such as electronic medical/health record systems, over time. Then rank the models based on how they perform over time and in different geographies. A physician in a certain hospital, in a certain city, can then see which models are performing better in their particular setting and use those. The idea is to make this whole process of determining which models perform better as automated and as smart as possible, without a lot of effort on the side of researchers.
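The ranking step could look something like the following sketch. Everything here is a hypothetical illustration of the idea, not an existing system: it scores each model’s streamed predictions against observed outcomes using the Brier score (one common accuracy measure for probability predictions; lower is better) and ranks models best-first within a setting.

```python
# Hypothetical sketch of ranking prediction models by ongoing performance
# in one setting. Model names, data, and the choice of the Brier score as
# the metric are all illustrative assumptions, not an actual system.

def brier_score(predictions, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes;
    lower is better."""
    return sum((p - o) ** 2 for p, o in zip(predictions, outcomes)) / len(predictions)

def rank_models(results_by_model):
    """results_by_model maps model name -> (predictions, outcomes) observed
    in one setting; returns model names sorted best-first."""
    scores = {name: brier_score(p, o) for name, (p, o) in results_by_model.items()}
    return sorted(scores, key=scores.get)

# Toy example: model A tracks the observed outcomes much more closely
# than model B, so it ranks first in this setting.
outcomes = [1, 0, 0, 1]
ranked = rank_models({
    "model_A": ([0.9, 0.1, 0.2, 0.8], outcomes),
    "model_B": ([0.5, 0.5, 0.5, 0.5], outcomes),
})
print(ranked)  # ['model_A', 'model_B']
```

A real system would also need calibration and discrimination measures, per-geography stratification, and the privacy safeguards mentioned above, but the ranking logic itself can stay this simple.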
SH: You’re making ACCEPT available on the cloud. For you, what would success look like in making that available?
AA: We decided early on that we wanted to go for maximum transparency, so our model is ‘open source’. But not just open source: we’re publishing a web app, publishing an R package, publishing an interactive Excel sheet, and now publishing this web API that would allow people to integrate this model into other pieces of software. My hope is that it inspires other researchers to critique this model and to attempt to validate it independently, in the settings available to them, maybe up to the point of conducting a clinical trial to see if it actually is useful in practice. I also hope that it might inspire other researchers to share their models and make them more easily available, because I think this is a win-win situation for everyone.
This helps the iterative process of science because now we can learn from each other's models. We can discredit the models that don’t perform well, and we can show which ones do perform well. This also serves as an excellent educational tool for trainees who are starting to get into these fields.
If 10 years down the road, we learn that because we made it simpler to access this model, five more validations have been attempted, regardless of the outcome, regardless of whether the model survived those validation attempts or didn’t, we’ve learned something which wasn’t as easy to learn before. I think I would call that a success, even if the model failed. And if the model performed well in different settings, then perfect.
SH: Do you see that article as being connected to ideas about transparency?
AA: It is. I would also like to point out that it’s important to make sure the model is accessible, because you can be transparent but not accessible. I can publish a very complex piece of software code that no one understands and claim that it is open source, you know. But if people don’t understand it, if people can’t interact with it, if people cannot easily point out what’s wrong with it, then it’s not that accessible; so, I think transparency is required but not necessarily enough. You also need to make it easy to understand and easy to interact with.
SH: What’s the end goal of that cluster of virtues, like transparency, accessibility and interpretability? What are we trying to achieve with those things?
AA: I think we put a lot of effort into research and it’s sad when you see there’s so much research waste. A lot of work, a lot of financial resources, a lot of human capital, goes into doing all this research. If we can prevent research waste, prevent things that do not end up being used, and if we can make it more efficient to improve upon what we already have, I think that’s a great outcome.
SH: Why does some research get wasted and some become very useful?
AA: I think part of the problem is that the short-term incentives for academics are not necessarily always aligned with the long-term goals of science. That, for example, leads to this explosion of clinical prediction models. Sometimes we have very complex clinical prediction models that don’t perform as well as an incredibly simple one, and I understand why that happens. A lot of the time it’s because it’s easier to get funding for that kind of project, that kind of project gets better news coverage, and there’s a lot of hype around it.
So, I think what we’re trying to do here is to develop some tools and best practices that make it easier to do good research and contribute to a better outcome for everyone. As a modeller, my goal is to contribute to improved care. Sometimes this could mean picking up a model that is performing well in validations; other times it means getting rid of a model that is not improving care. Either way, we need to know how well a model performs, right? And I think Peer Models Network makes it easier for people to critique models – to show which models work, which models don’t work – and opens models [up] to a wider community that did not previously have easy access to models. This could be patients, advocacy groups, journalists, policymakers, or any curious citizen really.
I would love to see someone who comes from a different background – not from this community of modellers, not bound by our shared ideas of how to make these models – take a look at it with a different angle, with a different perspective, and criticize it for something that we never even thought of. I think a lot of benefits can come from opening up your models to the entire world.