External Validation of the Surgical Outcome Risk Tool (SORT) in 3305 Abdominal Surgery Patients in the Independent Sector in the United Kingdom

Background: Assessing risk of post-surgical mortality is a key component of pre-surgical planning. The Surgical Outcome Risk Tool (SORT) uses pre-operative variables to predict 30-day mortality. The aim of this study was to externally validate SORT in patients undergoing major abdominal surgery. Methods: Data were collected from patients treated in five independent hospitals in the United Kingdom. Individualised SORT scores were calculated, and area under the receiver operating characteristic (AUROC) and precision-recall curves (PRC) plus 95% confidence intervals (CI) were drawn to test the ability of SORT to identify in-hospital death. Outcomes of patients with a SORT predicted risk of mortality of ≥ 5% (high risk) were compared to those with a predicted risk of < 5% (standard risk). Results: The study population comprised 3305 patients with a mean age of 51 years; 2783 (84.2%) underwent elective surgery, and surgery most frequently involved the colon (24.6%) or the liver, pancreas or gallbladder (18.2%). Overall 1551 (46.9%) patients were admitted to ICU and 29 (0.88%) died. The AUROC of SORT for discriminating patients at risk of death was 0.899 (95% CI 0.849 to 0.949) and the area under the PRC was 0.247. In total 72 (2.18%) patients were stratified as high risk. There were more unplanned ICU admissions and deaths in this group compared to the standard risk group (25.0% and 3.3%, versus 3.1% and 0.5%, respectively). Conclusion: We externally validated SORT in a large population of abdominal surgery patients. SORT performed well in patients with lower risk profiles, but some patients who were predicted to be standard risk did experience adverse outcomes.


Introduction
In 2010 the National Confidential Enquiry into Patient Outcome and Death (NCEPOD) conducted a national review of care provided to high risk surgical patients. A key finding was the need for a UK-wide system that could reliably identify patients at high risk of mortality and morbidity. 1 In 2018 the Royal College of Surgeons of England (RCS) formalised this, recommending that all adult patients admitted under the care of a general surgeon should have their risk of morbidity and mortality assessed and recorded. The RCS also recommended that high-risk surgical patients, defined as those with a predicted mortality of ≥ 5%, should receive timely surgery in the presence of a consultant surgeon, and should immediately be admitted to critical care post-operatively. 2 Prediction tools have been developed to quantify risk of death or morbidity, but have either not been designed to generate individualised risk profiles or require variables that are only available intra-operatively, limiting their use in the pre-operative setting.
Recently the Surgical Outcome Risk Tool (SORT) was developed with the aim of predicting 30-day mortality following surgery, 3 but it has not yet been fully externally validated. It comprises procedure code, operation severity, American Society of Anaesthesiologists' physical status classification (ASA), clinical urgency, surgical site (thoracic, gastrointestinal or vascular surgery), cancer (active malignancy within the last 5 years) and age, all of which are available pre-operatively.
The aim of this study was to externally validate SORT in a large population of general surgical patients admitted to and treated in five independent hospitals in the United Kingdom (UK).

Methods
This study was conducted across five independent hospitals operated by HCA Healthcare UK in London.
All participating hospitals had a 24/7 level 3 intensive care unit (ICU), and on-site access to interventional radiology and emergency theatres.

Patient Population
We studied all insured adult patients who underwent elective or emergency major abdominal surgery in an HCA facility between 1st January 2013 and 30th September 2018. Major abdominal surgery was defined using the Clinical Coding and Schedule Development Group (CCSD) schedule of procedures, comprising 125 individual procedures within the following groups of codes: stomach, duodenum, small intestine, large intestine, rectum, repair of a major vessel, oesophagus, other abdominal organs and peritoneum (Appendix 1). If a patient had multiple procedures performed synchronously (with separate procedure codes), the most complex procedure code was used to calculate SORT. Procedure codes that were associated with a discrete hospital admission were considered as separate cases.
Patients who were transferred to other hospitals (National Health Service, NHS, or other independent hospitals beyond HCA) were excluded from the analyses as it was not possible to collect data on their clinical outcomes after transfer. It was also not possible to determine why these patients were transferred to the NHS. Routine administrative data that are collected prospectively on patient demographics, surgical procedure, ASA and patient outcomes were used for this study. These data are collected automatically or by clinical or administrative staff and are entered directly into hospitals' electronic health records. Post-operative ICU admission was defined as level 2 or 3 care and was classified as planned or unplanned. These data were entered into the electronic health record by clinical staff at the point of admission to ICU. However, for some patients it was not clear from the data whether ICU admission was due to clinical need or lack of ward capacity. These cases were handled as missing data in the analysis. Post-surgery ICU admission was limited to ICU admissions that occurred within seven days of surgery. In cases of multiple ICU admissions during the same hospital episode of care, only the first admission after surgery was considered.
The study proposal was reviewed by the hospitals' Research Review Committee who deemed that ethical approval was not required as no new data were collected, and the study involved no patient intervention.
The study was performed and reported in accordance with the TRIPOD statement. 4

Applying SORT
SORT was calculated for each patient. SORT classifies the 'procedure urgency' variable using the NCEPOD classification of interventions: immediate (within minutes of decision to operate), urgent (within hours), expedited (within days) or elective (routine admission). 5 However, the hospitals' electronic database defined this variable as only 'elective' or 'unplanned'. Due to the nature of surgical cases in the independent sector, true 'immediate' cases would be extremely rare. It was not possible to differentiate 'urgent' from 'expedited', so these categories were grouped together.

Study outcomes
The primary outcome of interest was all-cause in-hospital mortality. We used in-hospital death as opposed to 30-day mortality as it was not possible to collect outcome data after hospital discharge. The RCS defines 'high-risk' patients as those with a risk of death of ≥ 5%. 2 The applicability of SORT generated mortality predictions was tested by using each patient's predicted risk to stratify need for ICU admission. SORT generated predicted probabilities were used to classify patients as high or standard risk; the high risk group was defined as patients with a SORT generated risk of 30-day mortality of ≥ 5% and the standard risk group was defined as those with a SORT generated risk of 30-day mortality of < 5%.
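The stratification rule described above can be sketched in a few lines. This is a minimal illustration only: the function name, the constant name and the example risks are hypothetical and are not taken from the study's analysis code; the only element drawn from the text is the ≥ 5% threshold.

```python
from collections import Counter

# Threshold cited in the text: predicted mortality of >= 5% defines 'high risk'.
HIGH_RISK_THRESHOLD = 0.05


def risk_group(predicted_mortality: float) -> str:
    """Return 'high' for a predicted risk of >= 5%, otherwise 'standard'."""
    if not 0.0 <= predicted_mortality <= 1.0:
        raise ValueError("predicted_mortality must be a probability between 0 and 1")
    return "high" if predicted_mortality >= HIGH_RISK_THRESHOLD else "standard"


# Hypothetical predicted risks (as probabilities), for illustration only.
example_risks = [0.0013, 0.012, 0.049, 0.05, 0.4381]
groups = [risk_group(p) for p in example_risks]
counts = Counter(groups)
print(groups)
print(dict(counts))
```

Note that the boundary is inclusive: a patient with a predicted risk of exactly 5% falls in the high risk group.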

Statistical analysis
Continuous data are reported as mean and standard deviation (SD) or median and interquartile range (IQR). A receiver operating characteristic curve was drawn and the area under it (AUROC) calculated with 95% confidence intervals (CI) to assess the ability of SORT to predict in-hospital mortality. As the dataset was imbalanced, with a small number of in-hospital deaths, a precision-recall curve (PRC) was also drawn and the area under the PRC (AUPRC) calculated. A PRC reduces the impact of a large population of 'true negative' cases in a dataset with few events of interest. 6 The literature on calculating CI for an AUPRC is controversial, 7 therefore 95% CI are not reported for this metric.
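To illustrate why both metrics were reported, the sketch below computes an AUROC (as the proportion of death/survivor pairs in which the patient who died has the higher score) and an AUPRC (estimated here as average precision, one common estimator) on a toy cohort. The data, function names and choice of estimator are assumptions made for illustration; the study's own software and method are not stated in the text. Scores in the toy data are distinct, so ties are not an issue for the average precision calculation.

```python
def auroc(labels, scores):
    """AUROC as the probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Estimate the area under the precision-recall curve as average precision."""
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive case")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    true_pos = 0
    running = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            true_pos += 1
            running += true_pos / rank  # precision at this recall step
    return running / total_pos


def make_toy_cohort(n_low_risk_survivors):
    """Hypothetical cohort: 2 deaths, 2 higher-risk survivors, plus many low-risk survivors."""
    labels = [1, 0, 0, 1] + [0] * n_low_risk_survivors
    scores = [0.30, 0.20, 0.10, 0.08] + [0.01 + 0.0001 * i for i in range(n_low_risk_survivors)]
    return labels, scores


for n in (16, 160):
    labels, scores = make_toy_cohort(n)
    print(n, round(auroc(labels, scores), 3), round(average_precision(labels, scores), 3))
```

In this toy example, adding more low-risk survivors raises the AUROC while the average precision stays the same, because the additional true negatives never enter the precision or recall calculations.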

Results
In total 3357 patients were identified. After excluding patients who were transferred to the NHS (n = 43, 1.3%) or to other independent hospitals (n = 9, 0.3%) the study population included 3305 patients. The mean age of patients was 51 years, the most frequent ASA grading was two (47.8%) and the majority of cases were elective (84.2%). The most common sites of surgery were the colon (812/3305, 24.6%), liver, pancreas and gallbladder (600/3305, 18.2%) and the rectum (376/3305, 11.4%; Table 1). In total there were 29 in-hospital deaths (0.88%). In comparison to patients who survived to discharge, patients who died were older, more likely to have cancer and other medical co-morbidities, had higher ASA scores and were more likely to be unplanned admissions to hospital.

The clinical performance of SORT
The observed and predicted mortality rates are shown in Table 2. For quantiles 1 to 4 the mean predicted mortality was < 0.2%, and there were no observed deaths in these groups. In quantiles 5 to 9, the mean predicted probability of death ranged from 0.21% to 4.19% (Fig. 1). Overall SORT under-predicted the number of deaths. Across the entire cohort of patients 29 patients died. SORT predicted 25 of these. On an individual case level, the SORT predicted risk of in-hospital mortality ranged from 0.13% to 43.81%. The AUROC (c-statistic) for SORT was 0.899 (95% CI 0.849 to 0.949, Fig. 2), suggesting good discriminative ability. However, the area under the precision-recall curve for SORT was 0.247, suggesting that the large proportion of true negatives may have artificially improved the ROC curve.
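A comparison of observed and predicted mortality by quantile, of the kind summarised in Table 2, can be sketched as follows. This assumes that the predicted number of deaths in a group is the sum of the individual predicted probabilities, which is a common convention but is not stated in the text; the function name and the toy data are hypothetical.

```python
def calibration_by_quantile(predicted, died, n_groups=10):
    """Sort patients by predicted risk, split into near-equal groups, and compare
    expected deaths (sum of predicted probabilities) with observed deaths."""
    if len(predicted) != len(died):
        raise ValueError("predicted and died must have the same length")
    pairs = sorted(zip(predicted, died))
    n = len(pairs)
    rows = []
    for g in range(n_groups):
        band = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        if not band:
            continue  # more groups requested than patients available
        rows.append({
            "group": g + 1,
            "n": len(band),
            "mean_predicted": sum(p for p, _ in band) / len(band),
            "expected": sum(p for p, _ in band),
            "observed": sum(d for _, d in band),
        })
    return rows


# Hypothetical cohort of 10 patients (1 = died in hospital), for illustration only.
toy_predicted = [0.001] * 6 + [0.01] * 2 + [0.2, 0.4]
toy_died = [0] * 8 + [0, 1]

toy_rows = calibration_by_quantile(toy_predicted, toy_died, n_groups=5)
for row in toy_rows:
    print(row)
print("total expected:", round(sum(r["expected"] for r in toy_rows), 3))
print("total observed:", sum(r["observed"] for r in toy_rows))
```

When the total expected count is lower than the total observed count, as in this toy cohort, the model under-predicts deaths overall.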

The use of SORT to identify high risk patients
Overall 72/3305 (2.2%) patients had an individual predicted risk of post-operative in-hospital mortality of ≥ 5% and were therefore classified as high risk (Table 3). The remaining patients had SORT predicted mortalities of < 5% and were classified as standard risk. Patients in the high risk group were older, had higher Charlson Co-morbidity Indices and were more likely to have had emergency surgery in comparison to the standard risk group.

Discussion
In this large external validation study examining the performance of SORT in patients undergoing abdominal surgery, we found that SORT accurately predicted risk of post-operative death. It performed particularly well in low risk patients, but under-predicted the risk of death in patients who were stratified as the highest risk. When SORT was used to identify patients at risk of adverse outcome, only 2.2% of the study population were identified as being high risk. In this high risk group 25% of patients had unplanned ICU admissions. These may have been avoidable if SORT had been used to risk assess patients pre-operatively.
SORT was originally developed in 11,219 non-cardiac surgical patients. 3 The predictive ability of risk stratification tools is frequently assessed using AUROCs and the c-statistic. However, in populations where the outcome of interest is infrequent, such as the low mortality rate seen in the present study, AUROCs may over-estimate the performance of the model. This is due to the impact of a large proportion of patients without the event (true negatives) on the calculation of specificity. In imbalanced populations the more appropriate analysis may be the PRC, where true negatives do not feature in the calculation of precision (positive predictive value) or recall (sensitivity). 6 The present study is the first to assess the performance of SORT using a PRC as well as a ROC curve, finding that its performance was considerably poorer when assessed by PRC (area under the curve 0.247, compared with an AUROC of 0.899). This was notable in patients with the highest risk profiles, where SORT under-quantified their risk. In lower risk patients, however, SORT performed well. Arguably risk prediction tools are more useful in these patients than in patients with higher risk profiles, as the latter will have risk factors for poor outcome, such as advanced age, complex co-morbidity or emergency surgery, which are readily identified by clinicians.
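The quantities named in this paragraph can be written in terms of true positives (TP), false positives (FP), true negatives (TN) and false negatives (FN). These are the standard definitions and are given here for reference; the notation is not taken from the original analysis.

```latex
\begin{align}
\text{specificity} &= \frac{TN}{TN + FP}, &
\text{false positive rate} &= 1 - \text{specificity} = \frac{FP}{FP + TN}, \\
\text{recall (sensitivity)} &= \frac{TP}{TP + FN}, &
\text{precision (positive predictive value)} &= \frac{TP}{TP + FP}.
\end{align}
```

TN appears only in specificity, and therefore only on the horizontal axis of the ROC curve. When survivors greatly outnumber deaths, TN is large and the false positive rate stays small across most thresholds, whereas precision and recall are unaffected by the size of TN.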
Several other tools have been designed to predict post-operative morbidity and mortality, such as ASA 11 and the Portsmouth Physiological and Operative Severity Score for Mortality and Morbidity (P-POSSUM). 12 In an external validation study of 5569 patients, SORT was superior to ASA at predicting mortality, although both performed well (AUROCs of 0.91 and 0.87, respectively). 3 ASA is a population based tool defining physical status, not operative risk, and although it is widely used, misclassifications are common, particularly amongst patients with multiple co-morbidities. 13 The performance of SORT is yet to be compared to that of P-POSSUM. A limitation of P-POSSUM is that it requires laboratory data, a chest radiograph and an electrocardiogram, making it more difficult to calculate than SORT.
Once risk prediction is established as being accurate, the next question concerns the discrete level of risk that qualifies a patient as 'high risk'. The RCS recommends using a predicted risk of death of ≥ 5% to identify high risk patients. 2 This represents a departure from previous guidance that categorised patients as high risk if they had a predicted risk of death of ≥ 10%. 14 The present study is the first to assess ICU utilisation following the new recommendation of a threshold of 5%, and the first to use SORT to stratify patients. We demonstrate that lowering the threshold to 5% does not generate large volumes of new post-operative ICU admissions; only 2.2% of the study population met the criteria for direct ICU admission, and most of these had already been recognised as requiring post-operative ICU care. This group of additional ICU admissions represents only 0.45% of the study population. Of note, 25% of the high-risk group had unplanned ICU admissions. These patients represent a sub-group of high risk patients who could have been identified pre-operatively by SORT and electively admitted to ICU. However, there were also patients in the high risk group who were managed without ICU admission, and conversely patients in the standard risk group who had unplanned ICU admissions or died in hospital. In the standard risk group there were 16 deaths, suggesting that a predicted mortality threshold of 5% may yet be too high to safely identify all patients at risk of death.
Historically post-operative ICU admission has been thought to be of benefit as it permits rapid recognition and treatment of life-threatening post-operative complications. A study of 572,598 general surgical procedures found that a patient who receives post-operative ward-based care but then requires unplanned ICU admission has twice the risk of 30-day mortality. 15 However, in elective surgery a recent study of 44,814 patients found no association between direct admission to ICU following surgery and in-hospital mortality. 16 These findings may be explained by advances in surgical and anaesthetic techniques that have reduced the physiological disturbance caused by surgery and therefore reduced the impact of ICU-based care. In the present study half of the patients in the standard risk group were admitted to ICU post-operatively. Given the acuity of the surgical procedures this is not an unexpected finding, but in the future a proportion of these patients may be eligible to receive critical care interventions, such as telemetry or vasopressors, outside of the traditional ICU.
Within the standard risk group 3.3% of patients had an unplanned ICU admission. These patients would not have been identified if risk stratification had been restricted to SORT and the 5% mortality threshold. It is therefore important to highlight that risk tools serve to aid, rather than replace, clinical judgement. None of the previously described scores have been directly compared to clinical opinion, but when assessing pre-operative risk, guidelines recommend that risk tools are used in conjunction with surgical judgement. 2 In keeping with this, the American College of Surgeons National Surgical Quality Improvement Programme risk tool has an in-built option to allow surgeons to modify risk calculations if they deem it necessary. 18 Mortality is not the only outcome of importance to clinicians and patients. Prediction of complications and morbidity that allows accurate discussion of risk during surgical consent and pre-operative optimisation would also be of value and is a key area for further research. The creators of SORT have developed the SORT morbidity model, which they have validated in a mixed population of 527 elective surgery patients. 19 It is yet to be further externally validated.
This study uses data collected from patients treated in five independent hospitals in the UK, a sector of healthcare that is traditionally thought to deliver simple treatments to stable patients. When comparing the demographics of this population to that of a contemporaneous NHS population of 16,788 surgical patients 3 there are important similarities. High ASA classifications were common (ASA 3 and 4 were found in 19.9% and 2.7%, respectively, in the NHS study, 3 and 14.4% and 1.5%, respectively, in the present study) and the majority of patients were undergoing major or complex-major operations (32.7% and 34.2% of patients, respectively, in the NHS study 3 and 41.6% and 49.8%, respectively, in the present study). The mortality rate was also similar (1.8% in the NHS study and 0.88% in the present study) and comparable to reported rates of 1.4 to 1.9% in other large NHS-based population studies of surgical patients. 20 21
There are some important limitations to the present study. SORT was initially developed to predict 30-day mortality, but the present study was limited to in-hospital death as we were unable to collect data on patient outcomes after discharge. We were also unable to capture the outcomes of patients who were transferred to the NHS or other healthcare providers. However, these cases represented only 1.6% of the study population. In some cases we were unable to determine the rationale for post-operative ICU admission, so these cases were excluded from this sub-analysis. In the remaining cases we assumed that ICU admissions categorised as unplanned were categorised on the basis of clinical need. However, a proportion of these may represent elective admissions where the operating surgeon failed to book a bed, and were not truly unplanned admissions. It was not possible to sub-classify procedure urgency beyond elective or unplanned, so we were unable to identify which patients were truly 'expedited' or 'emergency' procedures. This may mean that true 'emergency procedures' are under-represented in the study population, leading to under-estimation of the ICU capacity needed to implement the 5% risk threshold. It may also mean that the performance of SORT described in the present study is not as good as it could be if all variations of procedure urgency were included.
In summary, this large study externally validates SORT in a population of patients undergoing major abdominal surgery. SORT performed particularly well in patients with low risk profiles, but under-predicted the number of deaths in patients with the highest risk. When SORT was used to identify patients with a predicted post-surgery mortality of ≥ 5%, and therefore requiring direct ICU admission, some patients who were stratified as standard risk ultimately required unplanned ICU admission. However, SORT did identify high risk patients who had unplanned ICU admissions, demonstrating the value of using SORT in conjunction with clinical judgement.

Declarations
Ethics approval and consent to participate
The study proposal was reviewed by the hospitals' Research Review Committee who deemed that ethical approval was not required as no new data were collected, and the study involved no patient intervention.

Consent for publication
Not applicable.

Availability of data and materials
The datasets generated and/or analysed during the current study are not publicly available as they contain commercially sensitive information, but are available from the corresponding author on reasonable request.