A POMDP Approach to Personalize Mammography Screening Decisions

Introduction

In the September-October issue of Operations Research, Turgay Ayer, Oguzhan Alagoz and Natasha Stout write about personalizing protocols for breast cancer screening using mammography (the full paper is available in Articles in Advance). The purpose of a mammogram is to detect breast cancer at an early stage. When a cancer is detected early, there is greater flexibility in treatment modalities and an increased probability of cure. As a result, it has become standard for women to receive regular mammograms every one or two years from the age of 40. However, mammograms have high false positive rates, leading to unnecessary testing and treatment as well as anxiety. Mammograms also expose women to radiation that, over time, may itself cause cancer. The goal of this paper is to develop a method for creating screening protocols that improve detection and survival rates through more timely detection while at the same time reducing the overall use of mammography and the number of false positives. Current screening protocols take a one-size-fits-all approach, and this paper seeks to customize them to an individual woman’s personal risk characteristics and screening history. The paper is indicative of an important trend in healthcare. As medical researchers discover more genetic links to diseases, and as patient information profiles become richer and easier to store, communicate and analyze, it will become easier to personalize healthcare by customizing diagnostic and treatment protocols to individuals. The challenges to personalizing healthcare that arise in the mammography context will apply in other contexts as well.

Invited Comments

The editors have invited comments on this work from several experts.

Dr. Diana Buist is the Principal Investigator of the Breast Cancer Surveillance Project at the Group Health Research Institute and is an affiliated Professor of Epidemiology at the University of Washington School of Public Health. She is an epidemiologist and health services researcher whose work focuses primarily on early detection of cancer and reducing illness and death from cancer in populations.

Dr. Brian Denton is an Associate Professor and Edward P. Fitts Faculty Fellow at North Carolina State University in the Department of Industrial & Systems Engineering. He also holds a fellowship appointment at the Cecil Sheps Health Services Research Center at University of North Carolina at Chapel Hill. He researches applications of optimization techniques to health care delivery and medical decision making.

Dr. Doug Owens is the Henry J. Kaiser, Jr. Professor, and Director of the Center for Health Policy in the Freeman Spogli Institute for International Studies (FSI) and of the Center for Primary Care and Outcomes Research (PCOR) in the Department of Medicine and School of Medicine at Stanford. He is a general internist and Associate Director of the Center for Health Care Evaluation at the VA Palo Alto Health Care System. Owens is a professor of medicine and, by courtesy, of health research and policy at Stanford University. His research focuses on technology assessment, cost-effectiveness analysis, evidence synthesis, and methods for clinical decision making, and he has developed methods for developing clinical practice guidelines tailored to specific patient populations.

Dr. Diana Petitti is a Professor at Arizona State University in the Department of Biomedical Informatics. She was the Vice Chair of the U.S. Preventive Services Task Force that recommended changes to mammography screening protocols in 2009. She is a highly regarded epidemiologist and public health expert.

Dr. Abraham Seidmann is the Xerox Professor of Computers and Information Systems at the Simon School of Business, University of Rochester. He is an expert on medical informatics and health care, having researched and consulted on many process design and technology adoption problems in healthcare, particularly in medical imaging.

Discussion

The idea that screening decisions should be based upon one’s own probability of disease is not a new one.  As Petitti states:

 

"

The attempt to tailor recommendations about screening based on a person’s underlying probability of disease using information on epidemiologic and other risk factors has a long history.

"
Petitti

However, in the context of the march toward greater personalization of medicine, the work of Ayer et al. makes an important contribution. There is a progression from blanket recommendations, such as that all women over the age of 40 should have a mammogram every one or two years, to the recommendation made by the United States Preventive Services Task Force (USPSTF) in 2009 that there be "individualized, informed decision making about when to begin screening mammography" for average-risk younger women (ages 40 to 49), and biennial mammography from ages 50 to 74. The first is a global screening policy based on aggregate statistics that treats all women 40 and above as the same. The second acknowledges that different women are at different levels of risk and that they may also have individual preferences when it comes to the tradeoffs involved in choosing when to screen. In the end, though, the recommendation is rather crude, because it still aggregates women by age and does not indicate how the decision should be made at the personal level for women in the 40 to 49 age range. To reach the next level of personalization, we need a decision tool that can take a richer set of personal data and turn it into a screening recommendation. This is where the paper by Ayer et al. comes in.

"

The partially observable Markov decision process (POMDP) model described by Ayer, Alagoz and Stout is another effort at developing a model that could be used to tailor recommendations about screening mammography. It is an especially well-crafted extension of prior modeling efforts aimed at providing better information upon which individual and “personalized” recommendations about mammography screening [can be based]. The dynamic nature of the model is a noteworthy advance. By incorporating prior screening history and outcomes of screening, the model permits identification of women who do not need to be screened (or need to be screened less often) because they are [at] low risk of developing breast cancer. The incorporation of information on the age-dependence of disease progression, mortality, and test accuracy is also important.

"
Petitti
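To make the role of screening history concrete, here is a minimal sketch of the kind of belief update that sits at the core of any POMDP for screening. It is my own simplified illustration, not the model in the paper: it assumes only two hidden states (cancer or no cancer), and the function names, prior, sensitivity, specificity and onset probability are all hypothetical values chosen for the example.

```python
def update_belief(prior, sensitivity, specificity, positive):
    """Bayes update of P(cancer) after one mammogram result.

    Simplified two-state illustration; all inputs are hypothetical.
    """
    if positive:
        p_result_if_cancer = sensitivity
        p_result_if_healthy = 1.0 - specificity
    else:
        p_result_if_cancer = 1.0 - sensitivity
        p_result_if_healthy = specificity
    numerator = p_result_if_cancer * prior
    return numerator / (numerator + p_result_if_healthy * (1.0 - prior))


def age_belief(belief, onset_prob):
    """Drift of P(cancer) over one period with no test: healthy women may develop cancer."""
    return belief + (1.0 - belief) * onset_prob


# Hypothetical inputs, for illustration only.
belief = 0.01
after_negative = update_belief(belief, sensitivity=0.8, specificity=0.9, positive=False)
after_positive = update_belief(belief, sensitivity=0.8, specificity=0.9, positive=True)
one_year_later = age_belief(after_negative, onset_prob=0.005)

print(f"after a negative result: {after_negative:.4f}")
print(f"after a positive result: {after_positive:.4f}")
print(f"one year after a negative result: {one_year_later:.4f}")
```

Under these assumed numbers, a negative mammogram lowers the estimated probability of cancer below the starting value, which is the mechanism by which a history of negative screens can justify screening less often.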

The commentators have raised a number of implementation challenges that have broader implications for personalizing medicine as well.

The approach in the paper is built upon maximizing quality-adjusted life years (QALYs), but as Denton says:

"

Some patients would undoubtedly struggle with interpreting the notion of a QALY used by the authors, which is dependent on how patients would rate the impact of a mammogram or biopsy on their quality of life.  In fact, the goal of maximizing average QALYs may not accurately represent a patient’s personal criteria. For example, some patients are likely to be risk averse, and not simply seeking to maximize average QALYs.

"
Denton
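Denton's point about risk aversion can be illustrated with a small sketch. The two strategies, their outcome probabilities and the square-root utility function below are all hypothetical assumptions of mine, chosen only to show that a patient who maximizes average QALYs and a risk-averse patient can rank the same options differently. None of these numbers come from the paper or from the commentary.

```python
import math


def expected_qalys(lottery):
    """Average QALYs of a list of (probability, qalys) outcomes."""
    return sum(p * q for p, q in lottery)


def expected_utility(lottery, utility):
    """Average utility of the same outcomes under a given utility function."""
    return sum(p * utility(q) for p, q in lottery)


# Hypothetical strategies: A is a gamble, B is a sure thing with a lower average.
strategy_a = [(0.5, 10.0), (0.5, 30.0)]
strategy_b = [(1.0, 19.0)]

# A concave utility (square root) is one common way to represent risk aversion.
risk_averse = math.sqrt

print("average QALYs:", expected_qalys(strategy_a), expected_qalys(strategy_b))
print("risk-averse utility:",
      round(expected_utility(strategy_a, risk_averse), 3),
      round(expected_utility(strategy_b, risk_averse), 3))
```

With these assumed values, strategy A has the higher average QALYs, while the risk-averse patient prefers strategy B.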

Beyond the issue of QALY, as Dr. Buist notes:

"

While personalized screening protocols sound good in practice, there are a number of complexities that currently limit their reality in most US settings. Significant infrastructure and resources need to be in place to support risk assessment (and continued monitoring of changes to risk) and sustained monitoring of outreach to ensure that individuals are getting screened is needed as well. Information sharing across providers is also a prerequisite.

"
Buist

For the personalized approach modeled in the paper to achieve its potential, accurate and consistent data must be collected, and women need to comply with the “optimal” screening schedule. Buist’s experiences with breast cancer surveillance indicate the effort that must be expended to make sure that updated risk factor data are collected and that women come for their screenings. As she says:

"

Personalized medicine has the potential to increase the effectiveness of medical care and the findings from this study add to the accumulating evidence about various strategies for decreasing the harms of screening.  However these benefits come at a price.  Adopting a systematic approach to personalized risk based screening has several noteworthy challenges: systematic collection of accurate data that can be updated to reflect changes in risk factor status over time and that is centrally located for health care providers to access to provide reminders on the appropriate interval..

"
Buist

Further, she notes that another important challenge of personalized screening is that it puts a burden on providers to “be able to engage in informed discussions about the risks and benefits of differing screening strategies with women.”

To have these discussions, the healthcare provider must have not only the time but also enough understanding of the models used to generate a screening regimen to explain the recommendations to the patient, and the patient must be able to understand the explanation.

Dr. Petitti’s experience with the USPSTF and the intense resistance to its 2009 recommendations suggests that such understanding is sorely lacking.

"

My overall conclusions about models to guide recommendations, decisions and policy about mammography and to “tailor” or individualize these recommendations based on models flowing from this experience are as follows.  First, there is essentially no interest of women and their physicians in numeric data from models as a guide to decision making about mammography, unless perhaps the data were to support a recommendation of “mammography every day, for every woman, forever.”  Second, the ability of women, physicians, and the media to understand information from models is so limited as to make potentially useful model essentially useless.  Third, with only a few exceptions, policymakers in the United States are unwilling to use models as input to decisions that would mean that something as “popular” as mammography would be unavailable to anyone, no matter how small the benefit.

"
Petitti

As Operations Research modelers, we naturally ask whether these challenges mean that we cannot expect modeling to influence healthcare and contribute to something like the personalization of medicine. The study effectively demonstrates the potential benefits of personalized screening regimens, but such regimens will only be effective if they are accepted.

Petitti is pessimistic about this:

"

Can a better model turn the tide of mammography overscreening?  Would an average, even an above average physician, be able to use a better model better?  Given the sorry state of numeracy of the general population and the inherent complexity of the models that accurately capture the complexity of the underlying issue of tailored screening, can modelers hope to convince women, physicians, the media and policymakers of their utility?  Depressingly, for now, I conclude that the answer to all of these questions is no.

"
Petitti

On the other hand, we know that in the case of radiation treatments, for example, complex mathematical models are used to optimize doses and treatment plans. No one expects the patient to resist the model and advocate for a different orientation of the radiation beam. Something is different about breast cancer screening. Partly it is because, in the case of radiation treatment, the patient has already made the decision to undergo treatment and leaves the technical details to the experts. In the case of breast cancer screening, the patient perceives that she is choosing whether or not to be screened each time, rather than choosing to be screened in general and letting the expert decide the details of the screening regimen. Perhaps if a woman turning 40 were presented with a comprehensive screening plan for the next twenty years of her life, with clearly marked update points, it would reframe the issue in a way that led to decisions that better optimized screening.

Additionally, it is important to note that mammography has become highly politicized, and the public has been conditioned to believe that all screening is good preventive care and that more is better. It is always going to be difficult to advocate for less of something that people have gotten used to having, especially if those people represent 50% of the voting public.

Petitti offers advice for Ayer et al. that would serve any modeler taking on a controversial subject:

"

Keep up the fight.  But buy armor or develop a thick skin.  On reflection, do both.

"
Petitti

I am not as pessimistic as Petitti. Still, it is important to realize that, while we are making progress in modeling and risk analysis, there is a long way to go, and it is fair for people to be skeptical of the results. In his commentary, Owens points out that at the core of the model in Ayer et al. is the Gail risk model, which, while widely used, has significant limitations.

"

Assessing the risk for a woman is challenging. As the authors note, the Gail model is commonly used to estimate risk.  The Gail model has good calibration (it accurately predicts the number of cancers in a group of women); however, its discrimination is not strong.  Discrimination is the ability to determine whether an individual woman will develop cancer.  In a study of 82,000 women, the probability that a randomly chosen woman who developed breast cancer had a higher Gail-model estimated risk than a woman who was disease free was only 0.58 (5).  Further work is therefore needed on approaches for estimating risk for individual women.

"
Owens
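The discrimination measure Owens cites is a concordance probability: the chance that a randomly chosen woman who developed cancer received a higher risk score than a randomly chosen woman who did not. A value of 0.5 is what coin flipping would achieve, so 0.58 is only modestly better than chance. The sketch below shows how such a number is computed by comparing every case with every non-case. The function name and the risk scores are hypothetical and are not Gail model outputs.

```python
def concordance(case_scores, control_scores):
    """Fraction of (case, control) pairs in which the case has the higher score.

    Ties count as half a win, so uninformative scores give 0.5.
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical five-year risk estimates, for illustration only.
developed_cancer = [0.02, 0.03]
disease_free = [0.01, 0.03]

print(concordance(developed_cancer, disease_free))
```

Note that this measure says nothing about calibration: a model can predict the right total number of cancers in a group while still ranking individual women poorly.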

In his commentary, Seidmann raises additional obstacles to a rational approach to mammography decision making.  He points out that:

"

Studies looking at the value of population screening, and early interventions, to lower cancer mortality suffer from two standard biases. The lead-time bias will tend to show longer survival rates from the date of diagnosis.  And, for purely probabilistic reasons, screening primarily detects the slow growing tumors, leading to the length bias. The longer the tumor stays in the body, the more likely it is to be picked up. But, what is more worrisome are the rapidly growing tumors that are detected clinically between screening epochs. These two biases mean that the commonly used measure of ten-year survival of patients with breast cancer is highly misleading in a screening setting.

"
Seidmann

At the same time

"

Seeing more, with advanced medical imaging technologies, has created a lot of difficult judgment calls.  In a growing number of cases it is tough to tell with certainty whether some detected abnormalities should be treated, or whether the patient would have fared better if they had been left undiscovered.

"
Seidmann

And,

"

The matters surrounding screening are made worse because physicians are graded on performance based on how many mammograms and other screening tests they order, and patients are penalized if they do not participate in recommended early detection programs.

"
Seidmann
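The lead-time and length biases that Seidmann describes can be illustrated with a short sketch. Every number here is hypothetical. The first function shows that moving the date of diagnosis earlier lengthens measured survival even when the date of death does not change. The second uses a simplifying assumption of mine: a perfectly sensitive screen repeated at a fixed interval, with each tumor's detectable preclinical period starting at a random point in the screening cycle, so that the chance a tumor is caught by a screen is the length of that period divided by the interval, capped at one.

```python
def survival_years(age_at_diagnosis, age_at_death):
    """Survival measured from the date of diagnosis."""
    return age_at_death - age_at_diagnosis


def screen_detection_chance(detectable_years, interval_years):
    """Chance a periodic, perfectly sensitive screen falls inside the detectable window."""
    return min(1.0, detectable_years / interval_years)


# Lead-time bias: same hypothetical woman, same age at death, earlier diagnosis.
age_at_death = 68
clinical = survival_years(age_at_diagnosis=60, age_at_death=age_at_death)
screened = survival_years(age_at_diagnosis=57, age_at_death=age_at_death)
print("ten-year survivor without screening:", clinical >= 10)
print("ten-year survivor with screening:", screened >= 10)

# Length bias: slow tumors are far more likely to be found by a screen.
slow = screen_detection_chance(detectable_years=4.0, interval_years=2.0)
fast = screen_detection_chance(detectable_years=0.5, interval_years=2.0)
print("chance screen finds slow tumor:", slow)
print("chance screen finds fast tumor:", fast)
```

In this example the woman counts as a ten-year survivor only when she is screened, although screening did not extend her life at all, and the slow tumor is four times as likely as the fast one to be found by a screen.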

To summarize, we are still at a relatively early stage in developing optimal screening strategies for breast cancer.  As a result, many of the model parameters in an analysis such as that of Ayer et al. will be suspect, if only because imaging technologies are constantly evolving.  But as Owens points out:

"

… a key factor in personalizing decisions is incorporating a woman’s utilities for the outcomes.  The framework that Ayer, Alagoz and Stout develop is the kind of approach that will enable such personalization.  It’s an important step, and could help provide a way forward for a woman who is trying to understand when and how often to undergo screening, and to ensure that her decisions lead to outcomes that are consistent with her preferences.

"
Owens

Even in an environment as empirically fluid as breast cancer screening, there is a role for good models that help us organize our thinking about the problem and assess the impact of changes in the available evidence.
