Medicare claims as an epidemiological tool

A simple methodological question has always fascinated me:

How good are Medicare claims records as an epidemiological tool?

Medicare claims record the payment by Medicare for services, and have been used to answer a variety of research questions. For some purposes they are great, such as determining how much Medicare paid for care. For other purposes, such as tracking the incidence and prevalence of disease through the use of ICD-9 diagnosis codes, their value is less clear and likely differs by disease. The motivation for using Medicare claims is that they provide a way to look at a large population of great policy interest, namely those persons covered by traditional fee-for-service Medicare.*

About 12 years ago, during a program project site visit at Duke, I suggested that Medicare claims could be an efficient way to track changes in the prevalence of Alzheimer’s Disease (AD). A neurologist on the site visit committee said he bet that only 10% of true AD cases would be identified in claims. A challenge!

We put together a project team and provided an initial answer to the question (gated; looking for an ungated version; reference below): 79% of true AD cases were identified in Medicare claims when we used 5 years of claims and included physician and outpatient claims, not only hospital files. (87% of true cases were identified when using a broader group of ICD-9 codes covering AD and related dementias.)

This provided some evidence that claims were a great deal better than the neurologist on the site visit had surmised, but questions remained. The source of the true dementia diagnosis was autopsy data collected as part of the Consortium to Establish a Registry for Alzheimer’s Disease (CERAD) study; participants had their Medicare claims history linked to their records. However, because CERAD enrolled patients through memory disorders clinics in major teaching hospitals, the worry remained that the cases were more severe than average, and that their participation in the CERAD study could have affected the way their Medicare claims were coded by community-based physicians (CERAD-specific interactions did not bill Medicare). Further, to truly assess the ability of claims to identify AD, you need to know not only the sensitivity (the share of true cases that claims identify) but also the specificity (the share of people without AD whom claims correctly leave unflagged, which determines the false positive rate). More pieces of the puzzle were needed to give a full answer, which I will get to in a follow-up post.
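
For readers who want the two measures spelled out, here is a minimal sketch in Python. The function names and the counts are hypothetical and chosen only for illustration; they are not the study's data. The true-positive and false-negative counts are picked so that sensitivity works out to the 79% reported above, while the specificity figure is purely made up, since the study described here could not estimate it.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of people who truly have the disease that claims flag."""
    total = true_pos + false_neg
    if total == 0:
        raise ValueError("no true cases to evaluate")
    return true_pos / total


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of people without the disease that claims leave unflagged."""
    total = true_neg + false_pos
    if total == 0:
        raise ValueError("no true non-cases to evaluate")
    return true_neg / total


# HYPOTHETICAL counts, for illustration only (not from any study).
HYPOTHETICAL_COUNTS = {
    "true_pos": 79,   # true cases with a matching diagnosis in claims
    "false_neg": 21,  # true cases that claims missed
    "true_neg": 90,   # non-cases with no such diagnosis in claims
    "false_pos": 10,  # non-cases that claims flagged anyway
}

sens = sensitivity(HYPOTHETICAL_COUNTS["true_pos"], HYPOTHETICAL_COUNTS["false_neg"])
spec = specificity(HYPOTHETICAL_COUNTS["true_neg"], HYPOTHETICAL_COUNTS["false_pos"])

print(f"sensitivity: {sens:.2f}")
print(f"specificity: {spec:.2f}")
print(f"false positive rate: {1 - spec:.2f}")
```

The point of keeping the two functions separate is that they draw on different groups of people: sensitivity needs only confirmed cases, while specificity needs confirmed non-cases, which a sample made up of confirmed cases cannot supply.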

Donald H. Taylor, Jr., Gerda G. Fillenbaum, Michael E. Ezell. The accuracy of Medicare claims data in identifying Alzheimer’s disease. Journal of Clinical Epidemiology 2002;55:929-37.

*A limitation of Medicare claims is that they do not provide information for persons choosing Medicare Advantage plans.


About Don Taylor
Professor of Public Policy at Duke University (with appointments in Business, Nursing, Community and Family Medicine, and the Duke Clinical Research Institute). I am one of the founding faculty of the Margolis Center for Health Policy, and currently serve as Chair of Duke's University Priorities Committee (UPC). My research focuses on improving care for persons who are dying, and I am co-PI of a CMMI award in Community Based Palliative Care. I teach both undergrads and grad students at Duke. On twitter @donaldhtaylorjr

4 Responses to Medicare claims as an epidemiological tool

  1. Brad F. says:

    Thanks for highlighting this difficult subject. I would add that gain seekers in reimbursement will overcode (a regional phenomenon), which is well documented in the big journals. This further muddies the waters at a population level.

    Also not discussed: as we migrate to bundled care with less emphasis on FFS payment, the coding system that drives reimbursement will affect our current administrative infrastructure. If providers are no longer paid based on disease intensity or individual procedures, documentation priorities diminish. How we measure morbidity and do cost accounting will need a new chassis.

    In terms of administrative data, one of the more important studies published this year (and folks have not figured it out yet):

    Automated identification of postoperative complications within an electronic medical record using natural language processing. Murff HJ, FitzHenry F, Matheny ME, et al. JAMA. 2011;306:848-855.

    Essentially, rather than scrubbing ICD codes, it’s using Google-like logic on EMRs with search terms. This is the future.


  2. Don Taylor says:

    @Brad F
    all great points. Will blend in the paper in later post. Need some sort of practical guidance that claims are great for this, good for that, ok for some, not so good here and terrible in some cases.

  3. Weiwen Ng says:

    I had actually been looking for a study that assessed the sensitivity and specificity of Medicare claims in identifying Alzheimer’s disease and/or other dementias, so I’m glad you mentioned this one. I think Medicare claims are a pretty good epidemiological tool, for many but not all diseases. I just came across an article evaluating the Chronic Condition Data Warehouse algorithm to identify about 15 chronic diseases with claims data: Yelena Gorina and Ellen Kramarow, “Identifying Chronic Conditions in Medicare Claims Data: Evaluating the Chronic Condition Data Warehouse Algorithm,” Health Services Research, 46(5), 2011.

  4. Don Taylor says:

    This one looks at sensitivity and specificity in a more representative sample of people; will write about it later: Taylor, D. H., et al. “The Accuracy of Medicare Claims as an Epidemiological Tool: The Case of Dementia Revisited.” Journal of Alzheimer’s Disease 17.4 (2009): 807-15.
