Global Health Policy

 

How Plausible Are the Predictions of AIDS Models?

November 29, 2010


UNAIDS, WHO, PEPFAR and the Global Fund for AIDS, TB and Malaria (GFATM) all depend on long-run projections in order to make the case for increased attention and financing for AIDS.  This dependency is a response to the reality that HIV is a slow epidemic with extraordinary “momentum”.  Even small changes in the course of new infections require years to implement and have health and fiscal consequences for decades thereafter.  According to the UNAIDS web site, “[s]ince 2001, the UNAIDS Secretariat have led cutting-edge international work to define and project the developing world’s HIV/AIDS financing needs.” In 2007 UNAIDS published estimated future resource needs here.  The GFATM used projection models to argue unsuccessfully for sustained funding here.  And according to Congressional testimony here, PEPFAR has “looked at the impact of combination interventions on HIV infection rates, applying sophisticated modeling techniques to a generalized, high-prevalence context, and found that infections could be cut by more than half.” All of these projections were produced by one modeling group, The Futures Institute, with its suite of modeling tools, called GOALS, which is available as a free download here.

Computer models are also at the heart of the active policy debate over the degree to which the AIDS community should depend on AIDS treatment as a way to prevent future infections.  The possibility that the so-called “test-and-treat” proposal, which I have blogged about here, could eliminate the AIDS epidemic is contested by other modelers using other models here.

But often these disputes occur above the heads and out of sight of the policy makers, US Congressional staffers and other consumers of these long-run projections, who have few grounds on which to judge their plausibility.  What are the questions that consumers of these projections should  (and should not) be asking?

First of all, it is clear that consumers should not be asking whether these estimates are “correct,” because no model is ever “correct,” but whether they are “plausible” and internally consistent guesses that obey some fundamental adding-up constraints.  A further criterion is whether they convey to the decision maker a realistic appreciation of the impact that policy decisions will have on the AIDS epidemic and also, importantly, the *uncertainty* around the “best guess” scenario.  Finally, one might ask whether policy makers exposed to the model results will be less likely to make decisions today that they or their successors will regret ten years down the road.

In approaching the GOALS model, or any other model of the future course of the AIDS epidemic, the consumer might be better armed for critical engagement if he or she understands that any set of model projections is dependent on both the structure of the model and the data or “parameters” that populate that structure.  As an annex to this posting, I present a few questions that the model consumer might raise, divided into issues of “structure” and issues of “parameter estimation or data”.

Having considered all the issues I raise below, should we be skeptical about the modeling results promulgated by UNAIDS, WHO and PEPFAR, which are all based on a single model?  Yes, I think we should.  First, we should ask how this particular model is constructed and how its parameters are estimated.  Were these the best choices to inform the policy questions under discussion?  And we should also ask for modeled predictions of the effect of alternative AIDS policies to be replicated by various groups of modelers.  We should ask whether each model has been validated by being subjected to a barrage of independent tests.  As is the case for projections of the future of the US economy, we should be asking for “consensus models” or  perhaps for the “consensus forecast” of a group of modelers.

To correct the market failure caused by insufficient academic rewards for impact evaluation, various public sector financiers have seen the wisdom of establishing specialized impact evaluation institutions like NICE and 3IE.  Similarly, the academic community provides few rewards for the mundane task of replicating already published, agency-supported predictions of the future course of the global HIV epidemic.  As in the case of impact evaluation, there is a strong justification for public support of an institution that would facilitate and underwrite public comparisons and even competitions among epidemiological projection models for AIDS and other long-cycle epidemics.  An important principle of such an institution would be that a model is run through its paces and evaluated on criteria like those I propose below by someone other than its author.  For example, why not turn loose squads of graduate students on each of the available models?  Do I have any volunteers?

===========================================================================

Questions to ask of any model:

Issues of Structure

A model’s structure, like the structure of an airplane, affects not only whether it flies at all (i.e. whether it can make plausible predictions of future trends based on past trends), but also its behavior in response to its pilot’s guidance (i.e. what it predicts will happen as a result of a policy change).  Predicting the continuation of a past trend is relatively easy.  Correctly predicting how a dynamic system like the AIDS epidemic will respond to policy changes is a much more daunting challenge for the modelers.  And structure plays a particularly important role in the latter.

Epidemiological structure

In the structure category, I would include characteristics of the epidemiological model of HIV transmission, such as whether it is “compartment-based” or “agent-based”.  A “compartment-based” model characterizes a population of people by allocating each person to a stage or a compartment and then specifying equations to describe how people move or transition from one compartment to another.  The Wikipedia entry gives a good introduction here.  Instead of compartments containing aggregates of people, each component of an agent-based model represents an individual person who sequentially “decides” how to “behave” in response to a sequence of situations or events to which that simulacrum is exposed.  Again Wikipedia has a detailed description here.
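To make the idea of compartments concrete, here is a minimal sketch in Python of a two-compartment model (susceptible and infected).  It is a toy for illustration only and is not the GOALS model; the function name `simulate_si`, the parameters `beta` and `mu`, and all numerical values are assumptions.

```python
def simulate_si(beta, mu, s0, i0, years, steps_per_year=10):
    """Toy two-compartment model: susceptible (S) and infected (I).

    beta: hypothetical transmission rate per year
    mu:   hypothetical exit (death) rate of infected people per year
    Returns the prevalence I / (S + I) at the end of each year.
    """
    dt = 1.0 / steps_per_year
    s, i = float(s0), float(i0)
    prevalence = []
    for _ in range(years):
        for _ in range(steps_per_year):
            n = s + i
            # Flow from S to I: depends on contact between the two compartments.
            new_infections = beta * s * i / n * dt
            # Flow out of I.
            deaths = mu * i * dt
            s -= new_infections
            i += new_infections - deaths
        prevalence.append(i / (s + i))
    return prevalence


trajectory = simulate_si(beta=0.3, mu=0.1, s0=9900, i0=100, years=20)
for year, share in enumerate(trajectory, start=1):
    print(f"year {year:2d}: prevalence {share:.3f}")
```

The two update lines are the “equations that link the compartments”.  A realistic model would have many more compartments (risk groups, disease stages, treatment status), each with its own flows.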

For some purposes, such as understanding the impact of concurrent sexual partners on the spread of HIV, an agent-based model is thought to be better suited.  Since GOALS is compartment-based, it is fair to ask whether it can successfully capture concurrency and if not, how much damage that inability does to its projections.  If one accepts using a compartment-based model, because of its relative simplicity, it then becomes relevant to know how many compartments there are, what the transition probabilities are across them, etc.  The equations that link the compartments constitute the most basic description of the structure.  But these are hard for most of us to parse, so a diagram and sensitivity analysis would be helpful.

In addition to these structural characteristics, a model has “emergent” characteristics – which are only revealed by running it many times to check its sensitivity to alternative assumptions.  For example, the figures that I used in two previous blogs (here and here) emerge from about 5000 runs each of my AIDSCost model.  The GOALS model could similarly be run multiple times in order to trace out response frontiers which would more clearly reveal the emergent properties of its underlying structure than do a handful of runs.

Similarly, for the GOALS model, which is used to construct most of the projections cited in the first paragraph of this blog, I am curious whether the relationship between various prevention interventions is additive or synergistic or possibly one of substitution.  That is, when two prevention interventions are expanded to scale jointly, does the GOALS model predict that their effects on averted HIV infections would be the simple sum of their independent effects, or more than the sum (synergy) or less than the sum (redundancy)?  While an analysis of the equations of the model would yield clues to the answer to this question, running the model for a variety of intervention combinations and tracing out response frontiers would be more informative.
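The additive-versus-synergistic question can be posed to any model by running it four times: with neither intervention, with each alone, and with both.  The sketch below does this for a toy two-compartment model.  Everything in it is an assumption made for illustration, including the function `cumulative_infections`, the effect sizes, and the choice to let each intervention scale the transmission rate multiplicatively; it says nothing about how GOALS behaves.

```python
def cumulative_infections(beta, mu=0.1, s0=9900.0, i0=100.0,
                          years=20, steps_per_year=10):
    """Total new infections over the horizon in a toy S/I model."""
    dt = 1.0 / steps_per_year
    s, i, total = s0, i0, 0.0
    for _ in range(years * steps_per_year):
        new_infections = beta * s * i / (s + i) * dt
        s -= new_infections
        i += new_infections - mu * i * dt
        total += new_infections
    return total


BASE_BETA = 0.3   # hypothetical baseline transmission rate
EFFECT_A = 0.30   # hypothetical: intervention A cuts transmission by 30%
EFFECT_B = 0.40   # hypothetical: intervention B cuts transmission by 40%

baseline = cumulative_infections(BASE_BETA)
averted_a = baseline - cumulative_infections(BASE_BETA * (1 - EFFECT_A))
averted_b = baseline - cumulative_infections(BASE_BETA * (1 - EFFECT_B))
averted_ab = baseline - cumulative_infections(
    BASE_BETA * (1 - EFFECT_A) * (1 - EFFECT_B))

# Positive: synergy.  Negative: redundancy.  Near zero: additive.
interaction = averted_ab - (averted_a + averted_b)
print(f"averted by A alone: {averted_a:8.0f}")
print(f"averted by B alone: {averted_b:8.0f}")
print(f"averted by A and B: {averted_ab:8.0f}")
print(f"interaction term:   {interaction:8.0f}")
```

With these made-up values the combined effect falls short of the simple sum, because each intervention alone already averts most of the baseline infections.  Repeating the comparison over a grid of coverage levels would trace out the response frontier.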

For a dynamic model, the mathematics suggests that emergent properties are particularly likely to be surprising and unpredictable from structure alone when the model is non-linear.  Nonlinearities can occur in the epidemiology (e.g. from a standard SIS model) or from any of the areas of structure I list below.

Structure of the cost of supplying services

For models like my AIDSCost model and the Futures Institute’s model, which project the cost of the supply of delivered HIV treatment or prevention services, it becomes relevant to ask whether the structure of the cost model is linear (i.e. constant unit costs) or more realistic.  For example, can the cost structure of the model capture economies of scale (at the national or the facility level), economies of scope (ditto), economies of integration with the health system (ditto), and economies that accrue to competitive service delivery as opposed to hierarchically controlled monopolistic service delivery (whether public or private)?  All of these nonlinearities are relevant to projecting the future costs of HIV service delivery, but their relative magnitudes and the degree to which they are amenable to policy manipulation are empirical questions that have not yet been answered – or in some cases even addressed.
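The difference between a linear cost structure and one with economies of scale can be seen in a small Python sketch.  The functional form (a fixed cost plus a variable cost that grows less than proportionally with volume) and every number below are assumptions for illustration; they are not taken from AIDSCost or GOALS.

```python
def linear_cost(patients, unit_cost=500.0):
    """Constant unit cost: total cost is proportional to volume."""
    return unit_cost * patients


def scale_cost(patients, fixed_cost=200_000.0, variable_cost=2_000.0,
               exponent=0.8):
    """Hypothetical cost curve with economies of scale.

    An exponent below 1 means each additional patient costs a little
    less than the one before; the fixed cost is spread over volume.
    """
    return fixed_cost + variable_cost * patients ** exponent


for patients in (1_000, 10_000, 100_000):
    avg_linear = linear_cost(patients) / patients
    avg_scale = scale_cost(patients) / patients
    print(f"{patients:>7} patients: average cost {avg_linear:8.2f} (linear), "
          f"{avg_scale:8.2f} (with scale economies)")
```

Under the linear structure, scaling up tenfold costs ten times as much.  Under the second structure the average cost per patient falls as volume rises, so long-run projections from the two can diverge widely.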

Structure of the determinants of service uptake (i.e. of the “demand” for services)

In order to make plausible predictions of the cost of achieving any given degree of future service uptake or utilization, it is necessary to model not only the cost of supplying services but also the demand for those services.  (Econ 101: Utilization is the intersection of supply and demand.)  Thus one must ask how any projection model captures the demand for services.  It is well known that demand for any service is elastic to varying degrees with respect to price, distance, convenience, attractiveness, and the price, distance, convenience and attractiveness of substitute and complement goods and services.  What assumptions does the GOALS model make about these elasticities?
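One simple way a model could represent demand is a constant-elasticity function, sketched here in Python.  The function `uptake`, the reference values and the elasticities are all hypothetical; no claim is made that any existing AIDS projection model uses this form.

```python
def uptake(price, distance_km, base_uptake=0.60, ref_price=2.0,
           ref_distance_km=5.0, price_elasticity=-0.5,
           distance_elasticity=-0.3):
    """Share of the target population using a service.

    Constant-elasticity form: a 1% rise in price changes uptake by
    price_elasticity percent, and likewise for distance.  Price and
    distance must be positive (a zero price would need another form).
    """
    share = (base_uptake
             * (price / ref_price) ** price_elasticity
             * (distance_km / ref_distance_km) ** distance_elasticity)
    return min(1.0, share)  # uptake cannot exceed the whole population


print(f"reference case:      {uptake(2.0, 5.0):.3f}")
print(f"price doubled:       {uptake(4.0, 5.0):.3f}")
print(f"clinic twice as near: {uptake(2.0, 2.5):.3f}")
```

A model with no demand side implicitly sets both elasticities to zero: uptake is whatever the planner assumes, regardless of price or distance.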

(To my knowledge the only AIDS cost projection model that incorporates demand elasticities is that done for Thailand by Tim Brown, Wiwat Peerapatanapokin, myself and co-authors here.  For example, demand elasticities do not appear in my AIDSCost model.)

Stochastic structure

Uncertainty can influence a model’s predictions either as part of a model’s structure or by way of its data.  Some models embody the view that all human and natural phenomena are fundamentally stochastic and therefore make a random draw from a probability distribution at every point that an arithmetic computation is performed.  Other models are a mix of deterministic computations and a few stochastic components.  Still others are fundamentally deterministic, but could be run many times with randomly distributed parameter values.  I believe that GOALS (like my AIDSCost model and many others) is in this last category.  Imagine two distributions of projected HIV prevalence in the year 2020.  The two are produced from:
(1) a stochastic model run 1000 times with the same mean values for every parameter, yielding 1000 predictions for HIV prevalence in the year 2020,
(2) a deterministic model run with 1000 randomly chosen values of those same parameters, again yielding 1000 predictions for HIV prevalence in the year 2020.
Other things equal, which of these depictions of the uncertainty in future HIV prevalence is more plausible?  This is a deep question, to which I don’t have an answer.  I suspect a case could be made for the stochastic model, provided the details of its stochastic specifications are themselves plausible.  But ultimately one would have to compare actual models.
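The two ways of generating a distribution can be sketched side by side in Python.  This is a toy under stated assumptions: a two-compartment model, made-up parameter values, a uniform distribution for the uncertain transmission rate, and 200 runs rather than 1000 to keep it quick.  It is not a representation of GOALS or AIDSCost.

```python
import random
import statistics

# All values below are hypothetical.
YEARS, S0, I0, MU, MEAN_BETA, RUNS = 15, 490, 10, 0.1, 0.3, 200


def stochastic_run(rng, beta):
    """Fixed parameters; every infection and death is a random draw."""
    s, i = S0, I0
    for _ in range(YEARS):
        n = s + i
        if n == 0:
            return 0.0
        p_infect = min(1.0, beta * i / n)
        new_infections = sum(rng.random() < p_infect for _ in range(s))
        deaths = sum(rng.random() < MU for _ in range(i))
        s -= new_infections
        i += new_infections - deaths
    n = s + i
    return i / n if n else 0.0


def deterministic_run(beta):
    """No randomness inside the model; uncertainty enters via beta."""
    s, i = float(S0), float(I0)
    for _ in range(YEARS):
        new_infections = min(s, beta * s * i / (s + i))
        s -= new_infections
        i += new_infections - MU * i
    return i / (s + i)


rng = random.Random(2010)  # fixed seed so the sketch is repeatable
stochastic = [stochastic_run(rng, MEAN_BETA) for _ in range(RUNS)]
deterministic = [deterministic_run(rng.uniform(0.2, 0.4)) for _ in range(RUNS)]

for label, runs in (("stochastic model", stochastic),
                    ("random parameters", deterministic)):
    print(f"{label:18s} mean {statistics.mean(runs):.3f}  "
          f"st. dev. {statistics.stdev(runs):.3f}")
```

The spread of the first list reflects chance events within a known epidemic; the spread of the second reflects ignorance about the parameter.  Which is wider depends entirely on the assumed population size and parameter range, which is one reason the comparison has to be made with actual models.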

Given the stochastic structure of any specific model, the question then arises how to convey truthfully to policy makers the uncertainty contained in model predictions.  This is a difficult communication challenge.  Although the UNAIDS modelers wanted for years to release upper and lower bounds for their estimates of AIDS prevalence, UNAIDS only began to publish ranges rather than single estimates after data accumulated showing that they had badly over-estimated the worldwide total number of HIV infections for decades.  See my discussion of the revision here.

Parsimony

“A model should be as simple as possible, but no simpler.” (Einstein said this but so did the Lord of Occam before him.)  Of course, parsimony, like beauty, is in the eye of the beholder, so one person’s “beautifully parsimonious” model is another’s “overly reductionist caricature” of reality.  While model builders often like to add bells and whistles to their models, there is a serious danger that the fillips and adumbrations they add to a model’s basic structure will, like the epicycles added to the Ptolemaic model of the solar system, lead the model farther and farther away from reality.  (The Ptolemaic model, with the earth at the center instead of the sun, predicted the motions of the planets across the sky pretty well but would have done a really bad job of predicting the impact of a policy intervention – such as blasting a rocket towards Mars.)  Thus all of the complexity that I suggest above should be introduced only to the degree that it improves the plausibility of the model’s predictions and the usability of the model for policy analysis.  Anybody like me who would like to see some additions to a model must make a convincing case that greater complexity would be worth the loss of parsimony.

Issues of Parameter “guesstimation” and Data

A model’s structure, with the components described above, is just a set of equations with unknown parameters.  In order to make predictions, we must of course attach values to those parameters.  For a simple model of demand, when one has observations of 1000 individuals choosing how much detergent to buy at 1000 different combinations of price, distance, characteristics and the price, distance and characteristics of substitute products, one has enough degrees of freedom to formally estimate the dozen or so parameters of the structure of detergent demand.  Unfortunately for these epidemiological-economic projection models, we are in the opposite situation, with perhaps ten times as many parameters as we have data points.  So instead of estimating those parameters, we have to “guesstimate” them.  The French call it “la pifométrie”.  And that’s appropriate, because the process of attaching values to these parameters, we would all agree, requires one to hold one’s nose.

Perhaps it is useful to distinguish among: Epidemiological parameters, biological or medical parameters, efficiency parameters, effectiveness parameters and demand parameters.

Epidemiological parameters.  These include the shares of the various risk groups in the population and the baseline rates of activity and the rate of sexual mixing between the various groups.  Depending on the structure of the model, a measure of concurrency or a measure of mean partnership duration is also required, for each compartment in the model.   Important theoretical work by Anderson and May has shown that accurate prediction requires information about not only the mean but also the variance of each of these numbers, but whether it would be useful to know the variance within each separate compartment or only the variance across all compartments (which could be deduced from the distribution of their mean values) is unclear to me.

Biological or medical parameters.  These include features of the natural history of HIV and the degree of infectiousness of an infected person in the various compartments through which s/he passes and the susceptibility of the uninfected individual.  If acquired and used, condoms and microbicides intervene at this point to reduce an individual’s infectiousness and susceptibility.

Efficiency parameters.  These include the parameters of the structure of the cost of production and distribution of HIV services.

Effectiveness parameters.  These include the effectiveness of a service at preventing an HIV infection and/or prolonging the life of an infected person – assuming that the service is used.  But this is a big assumption.  To relax this assumption, a model would need to have a demand structure and …

Demand parameters.  These include the responsiveness of utilization to changes in the policy instruments that governments and donors can manipulate, such as the price, distance, convenience and attractiveness of a service and of its substitutes and complements.

With respect to each parameter within each class of parameters, we can ask whether sufficient data exists to rigorously construct estimates of a mean and a confidence interval.  Where insufficient data exists, we can ask whose subjective Bayesian priors are represented in the model, what those priors are and how sensitive the model’s predictions would be to the choice of alternative priors.
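A rough way to check sensitivity to the choice of priors is to push draws from two alternative priors through the same model and compare the resulting intervals.  The Python sketch below does this for a toy logistic prevalence curve with one uncertain growth rate.  The curve, the two priors and the helper names `prevalence` and `interval` are all assumptions for illustration.

```python
import math
import random


def prevalence(rate, years=10, p0=0.05):
    """Toy logistic prevalence curve starting from prevalence p0."""
    odds = p0 / (1 - p0) * math.exp(rate * years)
    return odds / (1 + odds)


def interval(draws, lo=0.05, hi=0.95):
    """Approximate 5th and 95th percentiles of a list of numbers."""
    ordered = sorted(draws)
    last = len(ordered) - 1
    return ordered[int(lo * last)], ordered[int(hi * last)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
N = 2000

# Two hypothetical priors for the same growth-rate parameter.
priors = {
    "uniform on [0, 0.2]": lambda: rng.uniform(0.0, 0.2),
    "triangular, mode 0.05": lambda: rng.triangular(0.0, 0.2, 0.05),
}

results = {}
for name, draw in priors.items():
    results[name] = interval([prevalence(draw()) for _ in range(N)])
    low, high = results[name]
    print(f"{name:22s} 90% interval for prevalence: {low:.3f} to {high:.3f}")
```

If the two intervals differ a great deal, then the projection rests heavily on whoever chose the prior, and the consumer is entitled to ask who that was.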

If you have gotten this far, you must be either really interested in modeling per se or really interested in whether HIV models are producing reliable predictions about the magnitude of the future health and fiscal burden of AIDS.  If you are a model “consumer,” rather than a modeler, I’m curious whether this checklist seems helpful.  Or would you rather just accept the model predictions from someone else – and then take them on faith?



4 Responses to “How Plausible Are the Predictions of AIDS Models?”

  1. Thank you for this interesting and thought-provoking article. I agree that there is a need to question the structure and parameters of any model being used for such high-level decision-making, and the checklist you provide is helpful. However, I believe you are misrepresenting what the modelers are attempting to achieve. The GOALS applications are rarely done by “the authors” or a few people at Futures Institute. The model was designed to facilitate policy dialogue around HIV/AIDS planning and resource estimation/mobilization and each application is conducted not in a room in Connecticut or Washington, but amidst a multisectoral group, including modelers, decision-makers, advocates, members of CCMs, etc. The process of engaging this broad audience and using the GOALS model to reach consensus in the strategic planning process has been an invaluable process at the country level. The flexible nature of the model is such that individual applications can be tailored to a country’s epidemic but also to country priorities.

    Apart from the value added to the process of strategic planning, GOALS has added value by providing a “best estimate” of the resources required – at the country, regional and global level. Of course these models rest on several assumptions but the authors/creators of GOALS have always been very clear about these assumptions, as well as the limitations of the models.

    Greater public accountability and competition would indeed be fruitful and is warranted given the huge amount of resources allocated to the HIV/AIDS epidemic. However, turning “loose squads of graduate students” to the models would not add value. One doesn’t simply “run” the model. Instead, one uses the model as a tool to facilitate dialogue and planning amongst a multisectoral group.

    Finally, it is worth noting that Futures Institute is not working in isolation but has instead collaborated with Tim Brown’s group at the East-West Center, as well as other universities, donors, governments, UN agencies, etc. in an attempt to ensure the resource estimates and estimates of impact are as “plausible” as possible. I encourage ongoing efforts to increase public accountability and competition in this very important area of work.

  2. please see the website above for a conference call for papers on this very topic.

  3. Also see the papers of:

    The OptAIDS project: towards global halting of HIV/AIDS http://www.biomedcentral.com/b.....9?issue=S1

  4. Thanks to Sarah for describing how GOALS and other forward-looking computer models can be used to inform policy discussions in affected countries, apparently based on her own experience.

    Since the 1960s, donor-supported policy analysts have conducted itinerant workshops in developing countries aimed at training local policy makers to think about the future consequences of current policy decisions. To my knowledge, in the population, health and development domain John Stover was the first analyst to take advantage of the advent of “luggable” portable computers in the early 1980’s in order to add to these workshops the “wow” factor of real-time, audience-responsive computer projections. The “RAPID” model he developed for showing workshop participants the future impacts of their population policy choices today was successful in my view not only in persuading the audience that family planning was in their countries’ long-term interest, but also in persuading funders that a real-time computer model with an easy-to-understand user interface could be a powerful tool in spreading messages about the future consequences of current actions.

    When the HIV epidemic became a policy concern, John and his team at the Futures Institute developed the SPECTRUM family of models, including GOALS. John and people like yourself who have put in years of hard work developing these models and helping others to apply them in customized country-specific settings deserve recognition as having made major contributions towards improving the rationality, coherency and integrity of the planning exercises to which you have contributed. Without you, policy analysis in the fields of demography, family planning and HIV/AIDS planning would be just groping in the dark.

    Of course, I also recognize that the Futures Institute’s models are informed by the mathematical epidemiological literature and community, in part through John’s two decades of participation in the UNAIDS Epidemiology Reference Group, formerly chaired by Roy Anderson and now by Geoff Garnett of Imperial College, which gives John and its other members the benefit of the constructive criticism of their peers from among the world’s best epidemiological modelers.

    Although I think you and I agree on all of the above, we differ on whether “turning loose squads of graduate students on the models would add value”. In my view, science is only likely to advance when younger cohorts of imaginative thinkers replicate, over and over again, the work of their predecessors, stress testing it from youthful perspectives in order to discover its weaknesses and build on those to generate new insights and – eventually – to create stronger models. A major weakness of the itinerant policy workshop cum computer model approach that you and I describe and have implemented is that the underlying structural assumptions of the model typically remain opaque to the participants – who after all are NOT graduate students, but only momentary dabblers in HIV computer modeling. The typical in-country user lacks the technical skills and breadth of epidemiology and mathematical competence of the graduate student and thus has a disadvantage vis-à-vis the workshop “animateurs” such as you and me.

    Thanks to these policy models, workshop participants are no longer groping in the dark, but instead are following an experienced guide on a pre-charted path through what nevertheless remains for them a dark and unknown terrain. As a result when we experts leave the country, the model we leave behind is rarely used by the people we have trained for a few days. No matter how “user-friendly” the model, most workshop participants go back to their regular jobs feeling that they don’t really understand the model well enough or have the proper modeling credentials to use or present its projections credibly in discussing policy options with their own government or donors.

    I believe that “squads of graduate students” testing the alternative HIV/AIDS projection models will not only enrich the discussion of the strengths and weaknesses of the existing models, but will also create a supply of competent modelers who can be employed full-time in the most affected developing countries – so that sophisticated computer models become an integrated part of local planning processes rather than just the flashy part of the presentation of an occasional itinerant expert.
