[58150] Article:

Selected aspects of prior and likelihood information for a Bayesian classifier in a road safety analysis

(Wybrane aspekty informacji apriorycznej i danych wiarygodności dla klasyfikatora bayesowskiego w analizie bezpieczeństwa ruchu drogowego)
Journal: Accident Analysis and Prevention   Volume: 101, Pages: 97-106
ISSN:  0001-4575
Published: April 2017
Number of publisher's sheets:  0.50
 
Authors / Editors / Creators
Name: Marzena Nowakowska
Faculty: WZiMK
Department: Katedra Informatyki i Matematyki Stosowanej**
For declaration no. 3: Yes
Affiliation group: counted among "N"
Scientific discipline: Management and quality sciences
Share (%): 100
Points for employee assessment: 40.00
Points according to evaluation criteria: 0.00

MNiSW group:  Publication in journals listed in the register of the MNiSzW minister (part A)
MNiSW points: 40



Keywords (Polish):

bayesowska regresja logistyczna  informatywne rozkłady aprioryczne  wiedza o wiarygodności  próba zbalansowana  ocena modelu  ciężkość wypadku drogowego 


Keywords:

Bayesian logistic regression  Informative prior  Likelihood knowledge  Balanced sample  Model assessment  Crash severity 



Summary:

This paper discusses the estimation of a Bayesian logistic regression model that classifies road accident severity. Informative priors are analysed for the case where no expert opinion is available: those proposed by Yu and Abdel-Aty (Yu R., Abdel-Aty M., 2013. Investigating different approaches to develop informative priors in hierarchical Bayesian safety performance functions. Accident Analysis and Prevention 56, 51-58), obtained from the method of moments, maximum likelihood estimation, and two-stage Bayesian updating, and those obtained with the Boot prior method developed by the author. In addition, two possible approaches to updating the prior knowledge are presented, in the form of a balanced and an unbalanced training data set used to estimate the model. The resulting Bayesian logistic models are assessed using the DIC criterion, the highest posterior density intervals of the model parameters, and the coefficients of variation of those parameters. Model accuracy was verified using sensitivity, specificity, and the harmonic mean of sensitivity and specificity, all calculated from a test data set. Models obtained from the balanced training data set showed better classification quality than models obtained from the unbalanced training data set. The models using parameter priors derived from two-stage Bayesian updating and from the Boot prior method, both estimated on the balanced training data set, proved better than the models using non-informative priors and informative priors derived from the method of moments and maximum likelihood estimation. In every case the parameters should be interpreted with caution, because different priors can yield different resulting models.
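
The record does not say how the balanced training set was built. As a minimal sketch, assuming simple random undersampling of the majority class (one common way to balance a rare-class sample), the idea could look like the following. The `severe` field name, the record layout, and the function name are hypothetical and chosen only for illustration.

```python
import random


def balance_by_undersampling(records, label_key="severe", seed=0):
    """Return a class-balanced copy of `records` by randomly dropping
    majority-class rows. Assumes binary labels 0/1 under `label_key`
    (a hypothetical field name, not taken from the article)."""
    rng = random.Random(seed)
    positives = [r for r in records if r[label_key] == 1]
    negatives = [r for r in records if r[label_key] == 0]
    # The shorter list is the minority class; keep all of it.
    minority, majority = sorted((positives, negatives), key=len)
    sampled_majority = rng.sample(majority, len(minority))
    balanced = minority + sampled_majority
    rng.shuffle(balanced)
    return balanced


# Illustrative data only: 10 severe crashes among 100 records.
crashes = [{"id": i, "severe": 1 if i < 10 else 0} for i in range(100)]
balanced_crashes = balance_by_undersampling(crashes)
n_severe = sum(r["severe"] for r in balanced_crashes)
```

In this sketch the balanced set keeps every minority-class record and an equal number of randomly chosen majority-class records, so both classes contribute equally to the likelihood.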




Abstract:

The development of the Bayesian logistic regression model classifying the road accident severity is discussed. The already exploited informative priors (method of moments, maximum likelihood estimation, and two-stage Bayesian updating), along with the original idea of a Boot prior proposal, are investigated when no expert opinion has been available. In addition, two possible approaches to updating the priors, in the form of unbalanced and balanced training data sets, are presented. The obtained logistic Bayesian models are assessed on the basis of a deviance information criterion (DIC), highest probability density (HPD) intervals, and coefficients of variation estimated for the model parameters. The verification of the model accuracy has been based on sensitivity, specificity and the harmonic mean of sensitivity and specificity, all calculated from a test data set. The models obtained from the balanced training data set have a better classification quality than the ones obtained from the unbalanced training data set. The two-stage Bayesian updating prior model and the Boot prior model, both identified with the use of the balanced training data set, outperform the non-informative, method of moments, and maximum likelihood estimation prior models. It is important to note that one should be careful when interpreting the parameters since different priors can lead to different models.
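
The three accuracy measures named in the abstract can be illustrated with a short sketch. It assumes binary labels where 1 marks the positive class (for example a severe crash; the coding is an assumption, not stated in this record), and the sample labels below are made up for illustration.

```python
def classification_metrics(y_true, y_pred):
    """Sensitivity, specificity and their harmonic mean for 0/1 labels.
    Label 1 is treated as the positive class (an assumption)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)  # share of positives correctly found
    specificity = tn / (tn + fp)  # share of negatives correctly found
    total = sensitivity + specificity
    harmonic_mean = 2 * sensitivity * specificity / total if total else 0.0
    return sensitivity, specificity, harmonic_mean


# Made-up test labels: 4 positives and 6 negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
sens, spec, hm = classification_metrics(y_true, y_pred)
```

The harmonic mean is low whenever either sensitivity or specificity is low, which makes it a stricter summary than overall accuracy when one class is rare.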



Bibliography:
Aguero-Valverde, J., 2013. Full Bayes Poisson gamma, Poisson lognormal, and zero inflated random effects models: comparing the precision of crash frequency estimates. Accid. Anal. Prev. 50, 289–297.
Ahmed, M.M., Abdel-Aty, M., Lee, J., Yu, R., 2014. Real-time assessment of fog-related crashes using airport weather data: a feasibility analysis. Accid. Anal. Prev. 72, 309–317.
Berg, A., Meyer, R., Yu, J., 2004. Deviance information criterion for comparing stochastic volatility models, American Statistical Association. J. Bus. Econ. Stat. 22, 1.
Box, G.E.P., Tiao, G.C., 1992. Bayesian Inference in Statistical Analysis. Wiley Classic Library, John Wiley & Sons, Inc.
Breiman, L., 1996. Bagging predictors. Mach. Learn. 24, 123–140.
Chung, Y.-S., 2013. Factor complexity of crash occurrence: an empirical demonstration using boosted regression tree. Accid. Anal. Prev. 61, 107–118.
Congdon, P., 2006. Bayesian Statistical Modelling. In: Wiley Series in Probability and Statistics, Second Edition. John Wiley & Sons, Ltd.
Das, A., Abdel-Aty, M., Pande, A., Santos, J.B., 2009. Severity analysis of crashes on multilane arterials using conditional inference forests. In: Proceedings of Transportation Research Board 89th Annual Meeting, Washington DC, USA.
De Oña, J., Mujalli, R.O., Calvo, F.J., 2011. Analysis of traffic accident injury severity on Spanish rural highways using Bayesian networks. Accid. Anal. Prev. 43, 402–411.
De Oña, J., López, G., Mujalli, R., Calvo, F.J., 2013. Analysis of traffic accidents on rural highways using latent class clustering and Bayesian networks. Accid. Anal. Prev. 51, 1–10.
Efron, B., Tibshirani, R.J., 1993. An Introduction to the Bootstrap. Chapman and Hall, New York.
El-Basyouny, K., Sayed, T., 2009. Urban arterial accident prediction models with spatial effects. In: Proceedings of Transportation Research Board 89th Annual Meeting, Washington DC, USA.
El-Basyouny, K., Sayed, T., 2012. Measuring safety treatment effects using full Bayes non-linear safety performance intervention functions. Accid. Anal. Prev. 45, 152–163.
El-Basyouny, K., Barua, S., Islam, M.T., 2014. Investigation of time and weather effects on crash types using full Bayesian multivariate Poisson lognormal models. Accid. Anal. Prev. 73, 91–99.
Elvik, R., 2008. A comparative analysis of techniques for identifying hazardous road locations. In: Proceedings of Transportation Research Board 88th Annual Meeting, Washington DC, USA.
European Road Safety Observatory, 2015. Annual Accident Report 2015. http://ec.europa.eu/transport/road_safety/pdf/statistics/dacota/asr2015.pdf (2015/12/15).
Häggström, O., 2003. Finite Markov Chains and Algorithmic Applications. Cambridge University Press. http://uosis.mif.vu.lt/~stepanauskas/MK/Haggstrom%20O.%20Finite%20Markov%20chains%20and%20algorithmic%20applications%20(CUP,%202002)(125s).pdf (2015/11/05).
Han, S., Yuan, B., Liu, W., 2009. Rare class mining: progress and prospect. In: Proceedings of the 2009 Chinese Conference on Pattern Recognition, Nanjing, China. http://boyuan.global-optimization.com/Mypaper/CCPR2009-0282.pdf (2015/12/15).
Hand, D., Mannila, H., Smyth, P., 2001. Principles of Data Mining. Prentice Hall India Pvt. Ltd.
Haque, M.M., Chin, H.C., Huang, H., 2010. Applying Bayesian hierarchical models to examine motorcycle crashes at signalized intersections. Accid. Anal. Prev. 42, 203–212.
Heydari, S., Miranda-Moreno, L.F., Lord, D., Fu, L., 2014. Bayesian methodology to estimate and update safety performance functions under limited data conditions: a sensitivity analysis. Accid. Anal. Prev. 64, 41–51.
Huang, H., Abdel-Aty, M., 2010. Multilevel data and Bayesian analysis in traffic safety. Accid. Anal. Prev. 42, 1556–1565.
Huang, H., Chin, C.H., Haque, M.M., 2008. Severity of driver injury and vehicle damage in traffic crashes at intersections: a Bayesian hierarchical analysis. Accid. Anal. Prev. 40, 45–54.
Hughes, B.P., Newstead, S., Anund, A., Shu, C.C., Falkmer, T., 2015. A review of models relevant to road safety. Accid. Anal. Prev. 74, 250–270.
Islam, M.T., El-Basyouny, K., 2015. Full Bayesian evaluation of the safety effects of reducing the posted speed limit in urban residential area. Accid. Anal. Prev. 80, 18–25.
Jung, S., Qin, X., Noyce, D.A., 2010. Rainfall effect on single-vehicle crash severities using polychotomous response models. Accid. Anal. Prev. 42, 213–224.
Khattak, A.J., Kantor, P., Council, F.M., 1998. Role of adverse weather in key crash types on limited-access roadways. Implications for advanced weather systems. Transp. Res. Record J. Transp. Res. Board 1621, 10–19 (Washington DC).
Kubat, M., Matwin, S., 1997. Addressing the curse of imbalanced training sets: One-sided selection. Proc. ICML’97, Nashville, Tennessee, 179–186.
Kwon, O.H., Rhee, W., Yoon, Y., 2015. Application of classification algorithms for analysis of road safety risk factor dependencies. Accid. Anal. Prev. 75, 1–15.
Larose, D.T., 2006. Data Mining Methods and Models. John Wiley & Sons, Inc.
Lee, C., Park, Y.-J.P., Abdel-Aty, M., 2009. Effects of lane-change and car-following-related traffic flow parameters on crash occurrence by lane. In: Proceedings of Transportation Research Board 89th Annual Meeting, Washington DC, USA.
Lord, D., Miranda-Moreno, L.F., 2008. Effects of low sample mean values and small sample size on the estimation of the fixed dispersion parameter of Poisson-gamma models for modeling motor vehicle crashes: a Bayesian perspective. Saf. Sci. 45, 751–770.
Ma, J., Kockelman, K., 2006. Bayesian multivariate Poisson regression for models of injury count, by severity. Transp. Res. Record J. Transp. Res. Board 1950, 24–34 (Washington DC).
Mitra, S., Washington, S., 2007. On the nature of over-dispersion in motor vehicle crash prediction models. Accid. Anal. Prev. 39, 459–468.
Mujalli, R.O., López, G., Garach, L., 2016. Bayes classifiers for imbalanced traffic accidents datasets. Accid. Anal. Prev. 84, 37–51.
Nowakowska, M., 2010. Logistic models in the crash severity classification on the basis of chosen road characteristics. Transp. Res. Record J. Transp. Res. Board 2148, 16–26 (Washington DC).
Olszewski, P., Szagała, P., Wolański, M., Zielińska, A., 2015. Pedestrian fatality risk in accidents at unsignalized zebra crosswalks in Poland. Accid. Anal. Prev. 84, 83–91.
Pei, X., Wong, S.C., Sze, N.N., 2011. A joint probability approach to crash prediction models. Accid. Anal. Prev. 43, 1160–1166.
Pei, X., Wong, S.C., Sze, N.N., 2012. The roles of exposure and speed in road safety analysis. Accid. Anal. Prev. 48, 464–471.
Polish Police, 2006. The regulation No 635 by the Main Commanding Officer of the Polish Police Headquarters from the 30-th of June 2006 regarding the methods and forms of processing road crash statistics by the Police. Warszawa (in Polish).
Rifaat, S.M., Tay, R., 2009. Effects of street patterns on injury risks in two-vehicle crashes. In: Proceedings of Transportation Research Board 89th Annual Meeting, Washington DC, USA.
SAS, 2009. SAS/STAT® 9.2 User’s Guide (Introduction to Bayesian Analysis Procedures). Second Edition, SAS Institute Inc., Cary, NC, USA.
Savolainen, P.T., Mannering, F.L., Lord, D., Quddus, M.A., 2011. The statistical analysis of highway crash-injury severities: a review and assessment of methodological alternatives. Accid. Anal. Prev. 43, 1666–1686.
Schneider, W.H., Zimmerman, K., Vavilikolanu, S., 2009. A Bayesian analysis of the effect of horizontal curvature on truck crashes using training and validation data sets. In: Proceedings of Transportation Research Board 89th Annual Meeting, Washington DC, USA.
Strauss, J., Miranda-Moreno, L.F., Morency, P., 2013. Cyclist activity and injury risk analysis at signalized intersections: a Bayesian modelling approach. Accid. Anal. Prev. 59, 9–17.
Tay, R., 2016. Comparison of the binary logistic and skewed logistic (Scobit) models of injury severity in motor vehicle collisions. Accid. Anal. Prev. 88, 52–55.
Vaez, M., Laflamme, L., 2005. Impaired driving and motor vehicle crashes among Swedish youth: an investigation into drivers’ sociodemographic characteristics. Accid. Anal. Prev. 37, 605–611.
Yu, R., Abdel-Aty, M., 2013. Investigating different approaches to develop informative priors in hierarchical Bayesian safety performance functions. Accid. Anal. Prev. 56, 51–58.
Yu, R., Abdel-Aty, M., 2014. Using hierarchical Bayesian binary probit models to analyze crash injury severity on high speed facilities with real-time traffic data. Accid. Anal. Prev. 62, 161–167.