BACKGROUND: Pay‐for‐Performance (P4P) is a payment model that rewards health care providers for meeting pre‐defined targets for quality indicators or efficacy parameters to increase the quality or efficacy of care. OBJECTIVES: Our objective was to assess the impact of P4P for in‐hospital delivered health care on the quality of care, resource use and equity. We aimed not only to answer whether P4P works in general (simple perspective) but also to provide a comprehensive and detailed overview of P4P with a focus on analyzing the intervention components, the context factors and their interrelation (more complex perspective). SEARCH METHODS: We searched CENTRAL, MEDLINE, Embase, three other databases and two trial registers on 27 June 2018. In addition, we searched conference proceedings, gray literature and web pages of relevant health care institutions, contacted experts in the field, conducted cited reference searches and performed cross‐checks of included references and systematic reviews on the same topic. SELECTION CRITERIA: We included randomized trials, cluster randomized trials, non‐randomized clustered trials, controlled before‐after studies, interrupted time series and repeated measures studies that analyzed hospitals, hospital units or groups of hospitals and that compared any kind of P4P to a basic payment scheme (e.g. capitation) without P4P. Studies had to analyze at least one of the following outcomes to be eligible: patient outcomes; quality of care; utilization, coverage or access; resource use, costs and cost shifting; healthcare provider outcomes; equity; adverse effects or harms. DATA COLLECTION AND ANALYSIS: Two review authors independently screened all citations for inclusion, extracted study data and assessed risk of bias for each included study. Study characteristics were extracted by one reviewer and verified by a second.
We did not perform meta‐analysis because the included studies were too heterogeneous regarding hospital characteristics, the design of the P4P programs and study design. Instead, we present a structured narrative synthesis considering the complexity as well as the context/setting of the intervention. We assessed the certainty of evidence using the GRADE approach and present the results narratively in 'Summary of findings' tables. MAIN RESULTS: We included 27 studies (20 controlled before‐after studies [CBA], 7 interrupted time series [ITS]) on six different P4P programs. Studies analyzed between 10 and 4267 centers. All P4P programs targeted acute or emergency physical conditions and compared a capitation‐based payment scheme without P4P to the same capitation‐based payment scheme combined with a P4P add‐on. Two P4P programs used rewards or penalties; one used first rewards and then penalties; two used penalties only and one used rewards only. Four P4P programs were established and evaluated in the USA, one in England and one in France. Most studies showed no difference or a very small effect in favor of the P4P program. The impact of each P4P program was as follows. Premier Hospital Quality Incentive Demonstration Program: It is uncertain whether this program, which used rewards for some hospitals and penalties for others, has an impact on mortality, adverse clinical events, quality of care, equity or resource use as the certainty of the evidence was very low. Value‐Based Purchasing Program: It is uncertain whether this program, which used rewards for some hospitals and penalties for others, has an impact on mortality, adverse clinical events or quality of care as the certainty of the evidence was very low. Equity and resource use outcomes were not reported in the studies that evaluated this program. Non‐payment for Hospital‐Acquired Conditions Program: It is uncertain whether this penalty‐based program has an impact on adverse clinical events as the certainty of the evidence was very low.
Mortality, quality of care, equity and resource use outcomes were not reported in the studies that evaluated this program. Hospital Readmissions Reduction Program: None of the studies that examined this penalty‐based program reported mortality, adverse clinical events, quality of care (process quality score), equity or resource use outcomes. Advancing Quality Program: It is uncertain whether this reward‐/penalty‐based program has an impact on mortality as the certainty of the evidence was very low. Adverse clinical events, quality of care, equity and resource use outcomes were not reported in any study. Financial Incentive to Quality Improvement Program: It is uncertain whether this reward‐based program has an impact on quality of care, as the certainty of the evidence was very low. Mortality, adverse clinical events, equity and resource use outcomes were not reported in any study. Subgroup analysis (analysis of modifying design and context factors): Analysis of P4P design factors provides some hints that non‐payments, compared to additional payments, and payments for quality attainment (e.g. falling below a specified mortality threshold), compared to payments for quality improvement (e.g. reduction of mortality by a specified number of percentage points within one year), may have a stronger impact on performance. AUTHORS' CONCLUSIONS: It is uncertain whether P4P, compared to capitation‐based payments without P4P for hospitals, has an impact on patient outcomes, quality of care, equity or resource use as the certainty of the evidence was very low (or we found no studies on the outcome) for all P4P programs. The effects on patient outcomes of P4P in hospitals were at most small, regardless of design factors and context/setting. It seems that additional payments can achieve only small, short‐term effects that are not sustained. Non‐payments seem to be slightly more effective than bonuses, and payments for quality attainment seem to be slightly more effective than payments for quality improvement.