UMICRS poster+mre

Improving Trauma Triage Models for Motor Vehicle Crashes

Yaoyaun Vincent Tan (a), Michael Elliott (a, b), Carol Flannagan (c)

(a) University of Michigan Department of Biostatistics, (b) Institute for Social Research, (c) University of Michigan Transportation Research Institute

Introduction

Delta-v, a measure of the near-instantaneous change in vehicle velocity after the impact of a crash, is a strong predictor of severe injury. However, most prediction models of severe injury use delta-v estimated during post-crash investigations. Because a realistic and comprehensive real-time prediction model would help Emergency Medical Services (EMS) allocate resources more efficiently and reduce morbidity and mortality in crashes, we develop a real-time prediction model using the vehicle's acceleration profile during a crash, recorded by the vehicle's Event Data Recorder (EDR). We use functional data analysis (FDA) to estimate the mean trend of the acceleration and then build our prediction model around summary measures of the estimated mean trend (its absolute integral and the absolute integral of its slope) as well as its residual variance. We applied our method to the acceleration profiles recorded in 2002-2012 EDR reports from the National Highway Traffic Safety Administration (NHTSA) website. We obtained our outcomes from the National Automotive Sampling System (NASS) Crashworthiness Data System (CDS) datasets of the same years. Our results are an important step towards the development of a comprehensive near real-time prediction model for severe injury in a motor vehicle crash.

Dataset and variables

We have EDR data from 3,460 vehicles that were involved in frontal impacts (direction of impact 0 to 40 degrees and 320 to 350 degrees) from 2002-2012. We analyzed data from 249 usable crashes, of which 27 had a severe injury outcome.

Outcome

• VAIS: Maximum AIS 3+ in vehicle (Yes/No)

Deceleration (crash pulse) Data

• Time (1 ms)

• Gs (9.8 m/s²)

Baseline Data

• Driver seat belt use (Yes/No/Not reported)

• Front-seat passenger belt use (Yes/No/No front-seat passenger/Not reported)

• Curb weight (kg)

• Body type (Car/Pickups & Vans/SUV)

• Multiple crash indication (Yes/No)

• Principal direction of force (°)

• Sampling weight

Method

We used a two-stage approach, with summary measures from a first-stage FDA model serving as inputs to the second-stage model (Jiang et al., 2014). FDA requires converting the observed values $y_{i1}, y_{i2}, \ldots, y_{im_i}$ to a function $y_i(t_{ij})$. For each crash, we considered four methods of estimating $y_i(t_{ij})$.

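3 millisecond (3mil) Method

$$y_i(t_{ij}) = \sum_{l=1}^{m_i} b_{il}\,\phi_l(t_{ij}) \qquad (1)$$

where $\phi_l(t_{ij}) \equiv \phi_{l,d}(t_{ij})$ are B-spline basis functions of degree $d$, obtained by the recursion relation

$$\phi_{l,d}(t_{ij}) = \frac{t_{ij} - \kappa_{il}}{\kappa_{i,l+d} - \kappa_{il}}\,\phi_{l,d-1}(t_{ij}) + \frac{\kappa_{i,l+1+d} - t_{ij}}{\kappa_{i,l+1+d} - \kappa_{i,l+1}}\,\phi_{l+1,d-1}(t_{ij}). \qquad (2)$$

The $\kappa_{il}$ were the internal knots, set at 3 millisecond intervals.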
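The recursion in equation (2) can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used for the poster: the names `knot_vector` and `bspline_basis`, the 0 to 30 ms window, the repeated boundary knots, and the convention that a term with a zero denominator contributes zero are all assumptions made for the example.

```python
def knot_vector(t_start, t_end, step, degree):
    """Clamped knot vector: internal knots every `step` ms, boundary knots repeated."""
    n_steps = int(round((t_end - t_start) / step))
    internal = [t_start + k * step for k in range(1, n_steps)]
    return [t_start] * (degree + 1) + internal + [t_end] * (degree + 1)


def bspline_basis(l, d, t, knots):
    """Value at time t of the l-th (0-based) B-spline basis function of degree d."""
    if d == 0:
        # Degree 0: indicator of the half-open knot interval [knots[l], knots[l+1]).
        return 1.0 if knots[l] <= t < knots[l + 1] else 0.0
    left_den = knots[l + d] - knots[l]
    right_den = knots[l + d + 1] - knots[l + 1]
    left = 0.0
    right = 0.0
    if left_den != 0:  # assumed convention: a 0/0 term contributes 0
        left = (t - knots[l]) / left_den * bspline_basis(l, d - 1, t, knots)
    if right_den != 0:
        right = (knots[l + d + 1] - t) / right_den * bspline_basis(l + 1, d - 1, t, knots)
    return left + right


# Cubic basis on a hypothetical 0-30 ms pulse with knots every 3 ms.
knots = knot_vector(0.0, 30.0, 3.0, 3)
n_basis = len(knots) - 3 - 1
row = [bspline_basis(l, 3, 4.5, knots) for l in range(n_basis)]
```

Each `row` is one row of the design matrix for equation (1); the coefficients $b_{il}$ would then come from a least-squares fit, which is not shown here.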

Combinatorial (Combi) Method

We considered all possible combinations of 5 internal knots chosen from the knots at 3 millisecond intervals. The optimal knot placement was the one that gave the smallest $\sum_{j=1}^{m_i} [y_{ij} - y_i(t_{ij})]^2$.
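The exhaustive knot search can be sketched as follows. To keep the example short and dependency-free it swaps in a simpler basis than the B-splines of equation (1): a linear truncated-power basis fitted through the normal equations. The function names (`solve`, `sse_for_knots`, `best_knots`) and the synthetic pulse are hypothetical; only the idea of scoring every knot combination by its residual sum of squares comes from the text.

```python
from itertools import combinations


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def sse_for_knots(times, values, knots):
    """Residual sum of squares of a least-squares fit with the given internal knots."""
    # Assumed basis: 1, t, and (t - knot)_+ for each knot (piecewise linear).
    rows = [[1.0, t] + [max(t - k, 0.0) for k in knots] for t in times]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(p)]
    coef = solve(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coef, r)) for r in rows]
    return sum((y - f) ** 2 for y, f in zip(values, fitted))


def best_knots(times, values, candidates, n_knots=5):
    """Try every combination of n_knots candidates and keep the smallest SSE."""
    return min(combinations(candidates, n_knots),
               key=lambda ks: sse_for_knots(times, values, ks))


# Synthetic pulse with a single kink at 12 ms; candidate knots every 3 ms.
times = [float(t) for t in range(31)]
values = [-2.0 * t + 3.5 * max(t - 12.0, 0.0) for t in times]
candidates = [3.0 * k for k in range(1, 10)]
chosen = best_knots(times, values, candidates, n_knots=1)
```

With nine candidate knots and five to choose, the search evaluates 126 fits per crash, which is why an exhaustive search stays feasible at this knot spacing.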

Penalized natural cubic splines (PNCS) Method

$$y_i(t_{ij}) = \arg\min_{y} \left\{ \sum_{l=1}^{m_i} [y_{il} - y(t_{il})]^2 + \lambda_i \int_T [L(t)]^2\,dt \right\} \qquad (3)$$

where $L(t) = w_0 y(t) + w_1 y'(t) + \dots + w_{d-1} y^{(d-1)}(t) + y^{(d)}(t)$ and $d$ is the degree of the polynomial. We estimated $\lambda_i$ using generalized cross-validation (Ramsay et al., 1997).

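Mixed Model (MM) Method (Wang, 1998)

$$y_i(t_{ij}) = \beta_{0i} + \beta_{1i} t_{ij} + \sum_{p=1}^{m_i-1} Z_{ijp} b_p + \varepsilon_{ij}, \qquad \varepsilon_{ij} \stackrel{iid}{\sim} N(0, \sigma_i^2), \qquad (4)$$

where $t_{ij} \in [0, 1]$, $(b_1, b_2, \ldots, b_{m_i-1})^T \sim N(0, \tau_i^2 I)$, $Z_i Z_i^T = \Omega_i$, and $\Omega_i$ is an $m_i \times m_i$ matrix with entries

$$\Omega_{k,l} = \int_0^1 (t_k - \mu)_+ (t_l - \mu)_+ \, d\mu = \frac{1}{2}[\min(t_k, t_l)]^2 \max(t_k, t_l) - \frac{1}{6}[\min(t_k, t_l)]^3.$$

The tuning parameter $\lambda_i$ could be estimated by $\lambda_i = \sigma_i^2 / (m_i \tau_i^2)$. The $t_{ij}$ in equation (4) was a transformation of the original time, written here as $t^{*}_{ij}$, given by

$$t_{ij} = \frac{t^{*}_{ij} - \min_j(t^{*}_{ij})}{\max_j(t^{*}_{ij}) - \min_j(t^{*}_{ij})}$$

so that $t_{i1} = 0 < t_{i2} < \ldots < t_{im_i} = 1$.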
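As a quick check of the closed form for $\Omega_{k,l}$ and of the time rescaling, the sketch below builds both and compares the closed form with a numerical integral. The function names and the sample times are hypothetical, and the midpoint rule is used only as an independent check.

```python
def rescale_times(times):
    """Map the original sample times linearly onto [0, 1]."""
    lo, hi = min(times), max(times)
    return [(t - lo) / (hi - lo) for t in times]


def omega_entry(tk, tl):
    """Closed form of the integral of (tk - u)_+ (tl - u)_+ over u in [0, 1]."""
    lo, hi = min(tk, tl), max(tk, tl)
    return 0.5 * lo ** 2 * hi - lo ** 3 / 6.0


def omega_numeric(tk, tl, n=20000):
    """Midpoint-rule approximation of the same integral, for checking only."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        total += max(tk - u, 0.0) * max(tl - u, 0.0)
    return total * h


def omega_matrix(times):
    """Full Omega matrix for one crash, given times already rescaled to [0, 1]."""
    return [[omega_entry(tk, tl) for tl in times] for tk in times]


scaled = rescale_times([10.0, 11.0, 14.0])  # hypothetical sample times in ms
omega = omega_matrix(scaled)
```

Because the first rescaled time is 0, the first row and column of `omega` are zero, which is consistent with the sum in equation (4) running over $m_i - 1$ random effects.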

We obtained $Z_i$ from the Cholesky decomposition of $\Omega_i$, computed using Smith's (1995) method. We used standard linear mixed model programs to estimate $y_i(t_{ij})$ as

$$y_i(t_{ij}) = \beta_{0i} + \beta_{1i} t_{ij} + \sum_{p=1}^{m_i-1} Z_{ijp} b_p. \qquad (5)$$

Summary measure

For each crash pulse estimated under the four FDA methods, we computed four summary measures:

1. $G = \int_{t_i} |y_i(t_{ij})|\,dt_i$ - Absolute area under the deceleration profile. This could be seen as an estimate of total delta-v.

2. $g = \int_{t_i} |y'_i(t_{ij})|\,dt_i$ - Absolute integral of the slope of $y_i(t_{ij})$ over the duration $t_i$. This measure gives a sense of the amount of fluctuation in $y_i(t_{ij})$.

3. $\sigma^2$ - The residual variance of $y_i(t_{ij})$.

4. tt025 - Time the crash pulse took to return to within ±0.25 Gs.
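The four summary measures can be approximated from a fitted pulse sampled on a time grid. The sketch below is illustrative only and rests on several assumptions not stated in the poster: the integrals use the trapezoid rule on a piecewise-linear interpolant, the residual variance is the plain mean of squared residuals, and tt025 is taken as the earliest sample time after which the fitted pulse stays within the threshold. All function names and the sample values are hypothetical.

```python
def trapezoid(xs, ys):
    """Trapezoid-rule integral of ys over xs."""
    return sum(0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i])
               for i in range(len(xs) - 1))


def time_to_settle(times, fitted, threshold=0.25):
    """Earliest sample time after which |fitted| stays within the threshold."""
    settle = None
    for t, v in zip(reversed(times), reversed(fitted)):
        if abs(v) > threshold:
            break
        settle = t
    return settle  # None if the pulse never settles inside the window


def summary_measures(times, fitted, observed, threshold=0.25):
    """Return G, g, residual variance, and tt025 for one crash pulse."""
    big_g = trapezoid(times, [abs(v) for v in fitted])
    # For a piecewise-linear curve the integral of |slope| is the total variation.
    small_g = sum(abs(fitted[i + 1] - fitted[i]) for i in range(len(fitted) - 1))
    sigma2 = sum((o - f) ** 2 for o, f in zip(observed, fitted)) / len(fitted)
    return {"G": big_g, "g": small_g, "sigma2": sigma2,
            "tt025": time_to_settle(times, fitted, threshold)}


# Hypothetical pulse: time in ms, deceleration in Gs.
times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
fitted = [0.0, -10.0, -20.0, -5.0, -0.2, 0.1]
observed = [f + (1.0 if i % 2 == 0 else -1.0) for i, f in enumerate(fitted)]
measures = summary_measures(times, fitted, observed)
```

With time in ms and deceleration in Gs, `G` is in G·ms; converting it to a velocity change would require multiplying by 9.8 m/s² and by 0.001 s/ms.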

Predicting severe injury risk

We merged the chosen summary measures with VAIS and the baseline data: driver seat belt use, front-seat passenger belt use, curb weight, body type, and multiple crash indicator. We fitted weighted logistic regression models and ran an all-subset analysis on the nine covariates (summary measures in their chosen form). We obtained three models:

1. Model with the highest ROC (Model A),

2. Model with the highest ROC excluding any summary measure (Model B), and

3. Centered summary measures in Model A and their squared terms, together with the baseline covariates of Model A (Quadratic Model A).

We compared the ROC of these three models using 1,000 bootstrap samples. To complement the weighted ROC results, we plotted the weighted precision-recall (PR) curve (Davis and Goadrich, 2006) and constructed a false discovery rate (FDR) table. Finally, we conducted a leave-one-out cross-validation (CV) to investigate how well our models would perform given new data.
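The bootstrap comparison of two models can be sketched as below. This is a simplified illustration under stated assumptions: "ROC" is read as the area under the weighted ROC curve, crashes are resampled with their predicted risks held fixed (the models are not refitted in each sample), and the interval is a plain percentile interval. The function names and the toy data are hypothetical.

```python
import random


def weighted_auc(scores, labels, weights):
    """Weighted chance that a severe crash outranks a non-severe crash."""
    pos = [(s, w) for s, l, w in zip(scores, labels, weights) if l == 1]
    neg = [(s, w) for s, l, w in zip(scores, labels, weights) if l == 0]
    num = 0.0
    den = 0.0
    for sp, wp in pos:
        for sn, wn in neg:
            pair = wp * wn
            den += pair
            if sp > sn:
                num += pair
            elif sp == sn:
                num += 0.5 * pair  # ties count as half
    return num / den


def bootstrap_auc_difference(scores_a, scores_b, labels, weights,
                             n_boot=1000, seed=0):
    """Percentile 95% interval for AUC(A) - AUC(B) over resampled crashes."""
    rng = random.Random(seed)
    n = len(labels)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        lab = [labels[i] for i in idx]
        if len(set(lab)) < 2:
            continue  # a resample needs both outcome classes
        wts = [weights[i] for i in idx]
        auc_a = weighted_auc([scores_a[i] for i in idx], lab, wts)
        auc_b = weighted_auc([scores_b[i] for i in idx], lab, wts)
        diffs.append(auc_a - auc_b)
    diffs.sort()
    return diffs[int(0.025 * n_boot)], diffs[int(0.975 * n_boot) - 1]


# Toy data: 1 = severe injury, 0 = not; weights play the role of sampling weights.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
weights = [2.0, 1.0, 3.0, 5.0, 4.0, 6.0, 2.0, 8.0]
risk_a = [0.9, 0.7, 0.6, 0.2, 0.3, 0.1, 0.4, 0.05]
risk_b = [0.5, 0.2, 0.6, 0.3, 0.4, 0.1, 0.5, 0.2]
interval = bootstrap_auc_difference(risk_a, risk_b, labels, weights, n_boot=200)
```

Refitting both models inside every bootstrap sample would also capture model-selection variability, at a much higher computational cost.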

Results

We selected the summary measures computed under the MM method, in log-transformed form, because the CV results for this choice were better than those of the model with the highest ROC. Our all-subset analysis produced a Model A consisting of all covariates except log(g). The coefficients of log(G), log(tt025), belted driver, and multiple crash were significant. Model B consisted of all baseline covariates. The coefficients of belted driver and multiple crashes were significant. Because the ROC of Quadratic Model A was smaller than that of Model A, we do not focus on Quadratic Model A.

The weighted ROC curve of Model B was to the right of that of Model A for false positive rates between 0 and 0.8. The loess PR curve of Model B was to the left of that of Model A for all recall values (Figure 1). The FDR table (Table 1) reflected results similar to the PR curve (Figure 1). The leave-one-out cross-validation results were generally similar to the non-cross-validated results, but with the weighted ROC curves shifted right, the weighted PR curves shifted left, capture rates decreased, and FDR increased for all three prediction models (results not shown here).

Figure 1: Weighted Receiver Operating Curves (ROC) and Precision-Recall (PR) Curves for Models A and B.

[Figure: two panels, each showing Model A and Model B. First panel, "Receiver Operating Curves for Models A and B": True Positive Rate against False Positive Rate. Second panel, "Precision Recall Curves for Models A and B": Precision against Recall.]

Table 1: Fraction of AIS 3+ Injury Crashes Captured and Associated Fraction of Crashes that are not AIS 3+ [False Discovery Rate (FDR)] at various injury risk cutpoints.

                   Model A                    Model B
Cutpoint (%)    Captured (%)   FDR (%)     Captured (%)   FDR (%)
 1              93.60          94.95       87.90          97.22
 2              81.04          88.50       54.41          95.63
 3              77.45          85.54       44.88          94.46
 4              76.07          81.09       40.47          90.72
 5              71.13          82.10       24.92          87.28
 6              71.13          82.09       17.17          86.32
 7              70.32          81.30       17.17          82.85
 8              70.32          79.44       17.17          80.71
 9              46.36          78.87       17.17          80.71
10              46.36          78.81       17.17          80.71
15              44.17          68.74       17.17          68.46
20              40.74          63.27       17.17          47.76
25              40.74          60.36       17.17          47.76
30              40.16          37.73        2.45          83.15
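The two columns reported for each model in Table 1 can be computed as in the sketch below. It assumes that a crash is flagged when its predicted risk is at or above the cutpoint and that the sampling weights enter both fractions; neither detail is spelled out in the poster. The function name and the toy values are hypothetical.

```python
def capture_and_fdr(risks, labels, weights, cutpoint):
    """Weighted capture rate and false discovery rate at one risk cutpoint."""
    flagged_w = 0.0      # weight of all crashes flagged as high risk
    flagged_pos_w = 0.0  # weight of flagged crashes that are AIS 3+
    pos_w = 0.0          # weight of all AIS 3+ crashes
    for r, l, w in zip(risks, labels, weights):
        if l == 1:
            pos_w += w
        if r >= cutpoint:  # assumed flagging rule
            flagged_w += w
            if l == 1:
                flagged_pos_w += w
    captured = flagged_pos_w / pos_w
    fdr = (flagged_w - flagged_pos_w) / flagged_w if flagged_w > 0 else None
    return captured, fdr


# Toy example: predicted risks, outcomes (1 = AIS 3+), and unit weights.
risks = [0.5, 0.2, 0.05, 0.01]
labels = [1, 0, 1, 0]
weights = [1.0, 1.0, 1.0, 1.0]
captured, fdr = capture_and_fdr(risks, labels, weights, cutpoint=0.10)
```

Raising the cutpoint can only lower or hold the capture rate, while the FDR can move in either direction, which matches the non-monotone FDR column for Model B in Table 1.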

Table 2: Coefficient Estimates for Models A and B with weighted ROC.

                                    Model A                       Model B
Parameter/Statistic                 Estimate (Conf. Int)          Estimate (Conf. Int)
log(G)                              3.34 (1.43, 5.25)**
log(σ²)                             0.19 (-0.41, 0.80)
log(tt025)                          -6.07 (-10.05, -2.07)**
Driver belt use
  Belted                            -2.95 (-4.55, -1.36)***       -3.23 (-4.79, -1.68)***
  Not reported                      2.77 (-0.58, 6.13)            0.25 (-1.78, 2.29)
  Not belted
Front-seat passenger belt use†
  Belted                            1.79 (-1.99, 5.58)            1.01 (-1.66, 3.69)
  No front passenger                2.52 (-0.23, 5.28)            1.58 (-0.34, 3.50)
  Not belted
Curb weight                         0.0003 (-0.001, 0.002)        -0.0007 (-0.003, 0.001)
Body type
  Car                               0.85 (-1.43, 3.14)            0.96 (-1.15, 3.07)
  Pickup or Van                     -0.58 (-2.88, 1.72)           0.31 (-1.49, 2.12)
  SUV
Multiple crashes?
  Yes                               5.20 (1.09, 9.31)*            2.04 (0.11, 3.96)*
  No
ROC                                 0.93                          0.78
ROC A - ROC B                       0.151 (0.040, 0.227)

† Crashes that did not report front-seat passenger belt status were the same crashes that did not report driver belt status.

* 0.01 ≤ p < 0.05; ** 0.001 ≤ p < 0.01; *** p < 0.001.

Discussion

We developed a new severe injury risk prediction model that can estimate the probability of a severe injury in a motor vehicle crash in near real-time. Model A performed fairly well in predicting the severe injury crashes from our data (ROC 0.93). We used a novel variable, the crash pulse, as the main variable in our regression model. We are not aware of any other application of FDA to crash pulses to predict severe injury risk in the trauma literature. The summary measures we defined are new in trauma research and helped us avoid the problems we faced when $y_i(t_{ij})$ was defined for each crash by a different linear combination of functions.

Limitations

A majority of our frontal crashes (92.4%) were unusable because vehicle acceleration was not reported. A comparison (results not shown here) of eligible and ineligible crashes on driver's age, sex, belt use, and intrusion, front-seat passenger belt use, maximum vehicle injury severity, and sampling weight showed no significant difference except for driver intrusion. Excluded crashes had greater rates of driver intrusion, suggesting that our analytic dataset may under-represent the most severe crashes.

Future direction

• Develop a separate prediction model that uses information from the adjusted velocity change instead of acceleration.

• Develop a method to combine the crash pulse and adjusted velocity change models.

• Add the lateral, vertical, and rollover components of the crash pulse and adjusted velocity to the model.

• Develop a joint model to compute the FDA estimates and the logistic regression model in a single step. This will make the estimated coefficients more efficient, i.e., give them smaller estimated variance.

References

• Davis, J., and Goadrich, M. (2006). The Relationship Between Precision-Recall and ROC Curves. Proceedings of the 23rd International Conference on Machine Learning, Pittsburgh, PA.

• Jiang, B., Wang, N., Sammel, M.D., and Elliott, M.R. (2014). Modeling short- and long-term variability of variation of follicle stimulating hormone as predictors of severe hot flashes in Penn Ovarian Aging Study. Submitted for publication.

• Ramsay, J.O., Heckman, N., and Silverman, B.W. (1997). Spline smoothing with model-based penalties. Behav. Res. Meth. Ins. C., 29(1):99-106.

• Smith, S.P. (1995). Differentiation of the Cholesky Algorithm. J. Comput. Graph. Stat., 4:134-147.

• Wang, Y. (1998). Mixed-Effects Smoothing Spline ANOVA. J. R. Stat. Soc. Ser. B Stat. Methodol., 60:159-174.

Acknowledgments

We would like to acknowledge the help of Dr. Patrick Carter and Dr. Jonathan Rupp in providing an understanding of the background and goals of the analysis. This work was supported jointly by Dr. Michael Elliott and an MCubed project awarded to Drs. Patrick Carter, Jonathan Rupp, and Carol Flannagan.
