There is growing emphasis on defining the quality and safety of health care through reporting of process and outcome measures. The Agency for Healthcare Research and Quality’s (AHRQ) Patient Safety Indicators (PSIs) were designed to screen for potentially preventable complications of inpatient care using hospital discharge data (AHRQ 2003). Although they were originally intended for case finding and quality improvement (QI), they are increasingly used for comparing hospital performance. The Centers for Medicare and Medicaid Services (CMS) recently added several of the PSIs to its Hospital Compare website for the Hospital Inpatient Quality Reporting (IQR) Program (CMS 2013a). Beginning in 2015, the National Quality Forum (NQF)-endorsed PSI composite measure (PSI-90) will also be used in the Hospital Value-Based Purchasing (HVBP) Program to reduce payment for Medicare hospitalizations based on hospitals’ achievement and improvement performance scores relative to benchmark quality standards (CMS 2013b). Because of these high-impact, high-stakes programs, it is critical that the PSIs accurately reflect hospital quality.

Concerns over the use of the PSIs for comparative reporting and pay-for-performance are prevalent in the scientific literature and popular press. For example, previous studies examining the criterion validity of the PSIs using explicit chart review found that flagged PSI events do not always represent true safety events (Utter et al. 2009, 2010; White et al. 2009; Sadeghi et al. 2010; Zrelak et al. 2011; Rosen et al. 2012). The positive predictive value (PPV), the proportion of flagged cases confirmed by chart review (the “gold standard”) to have a PSI, ranged from 28 percent for Postoperative Hip Fracture to 91 percent for Accidental Puncture or Laceration. Most of the false positives were attributed to well-known coding limitations inherent in administrative data and to the lack of present-on-admission (POA) codes (Iezzoni 1990, 1997; Utter et al. 2009, 2010; White et al. 2009; Zrelak et al. 2011; Rosen et al. 2012).

In addition, criticisms have been raised about the specific methods used in calculating the PSI composite, as well as the moderate reliability of the individual PSI component measures (i.e., their ability to distinguish high- from low-quality hospitals), both of which potentially threaten the reliability and validity of the composite measure (Schone, Hubbard, and Jones 2011). Although combining indicators into a composite may improve reliability, it is not known whether the PSI composite does, in fact, improve reliability with respect to hospital profiling (AHRQ 2008; Schone, Hubbard, and Jones 2011). Thus, hospital rankings based on flagged PSI composite rates could be affected by limitations inherent in the use of an administrative data-based measure. An unintended consequence of the HVBP Program would be to penalize all hospitals with high rates: those whose rates reflect poor performance, as well as those whose rates might reflect better coding practices and documentation of events (Haut et al. 2007). It might also lead hospitals to misdirect QI activities and focus on the wrong areas.

Given the lack of knowledge about the ability of the PSI composite measure to accurately detect differences between high- and low-quality hospitals, and the potentially serious implications of this gap, we conducted a study in the Veterans Health Administration (VA) to assess whether hospital ranks, hospital categorization, and hospital payments under a pay-for-performance program changed when the AHRQ PSI composite measure, based on PSI-flagged events, was replaced by two modified composite measures that estimate true safety event rates from previous chart abstraction findings. Because the VA has fewer financial incentives for the upcoding of diagnoses and, compared to the private sector, lacks penalties or rewards tied to PSI events (Rosen et al. 2012), it represents an ideal setting in which to study the impact of using routinely collected hospital discharge data on hospital performance. However, given increasing financial imperatives to reduce the excess costs associated with patient safety events, as well as an already constrained global budget, the VA is likely to adopt similar payment policies to control costs in the near future. Thus, evaluating the changes in hospital profiling that result from using the modified composites rather than the original AHRQ PSI composite measure in the VA will hopefully yield important insights into areas that are modifiable, as well as opportunities for improving the measure.
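To make the PPV logic concrete, the sketch below illustrates (in Python, with entirely hypothetical hospitals, event counts, and PPVs, not the published estimates or the study's actual adjustment method) how multiplying flagged PSI rates by chart-review PPVs can reorder hospital rankings:

```python
# Illustrative sketch only: hypothetical data showing how PPV adjustment
# of flagged PSI rates can change hospital rank order.

def ppv(true_positives, flagged):
    """Positive predictive value: share of flagged cases confirmed by chart review."""
    return true_positives / flagged

# Hypothetical hospitals: (flagged PSI events, discharges at risk)
hospitals = {
    "A": (30, 10_000),
    "B": (24, 10_000),
    "C": (18, 10_000),
}

# Hypothetical per-hospital PPVs (e.g., reflecting different coding practices)
ppv_by_hospital = {"A": 0.50, "B": 0.80, "C": 0.90}

def rank(rates):
    """Rank hospitals from highest to lowest event rate (1 = worst)."""
    ordered = sorted(rates, key=rates.get, reverse=True)
    return {h: i + 1 for i, h in enumerate(ordered)}

flagged_rates = {h: f / n for h, (f, n) in hospitals.items()}
adjusted_rates = {h: flagged_rates[h] * ppv_by_hospital[h] for h in hospitals}

print(rank(flagged_rates))   # ranks based on flagged events: A worst
print(rank(adjusted_rates))  # ranks after PPV adjustment: B worst
```

In this toy example, hospital A looks worst on flagged rates alone, but after adjusting for its low PPV (many false positives), hospital B ranks worst, which is the kind of reordering the study examines.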