KTH Mathematics


Mathematical Statistics

Time: June 16, 2017, 13:15-13:45.

Seminar room 3721, Department of Mathematics, KTH, Lindstedtsvägen 25, floor 7.

Speaker: Daniel Berlin

Title: Multi-class supervised classification techniques for high-dimensional data: applications to vehicle maintenance at Scania (Master's thesis)

Abstract: In vehicle repair, locating the cause of a fault can often be more time-consuming than the repair itself. Hence a systematic way to accurately predict the fault-causing part would be a valuable tool, especially for faults that are difficult to diagnose. This thesis explores the predictive ability of Diagnostic Trouble Codes (DTCs), produced by the electronic system on Scania vehicles, as indicators of fault-causing parts. The statistical analysis is based on about 18800 observations of vehicles where both DTCs and replaced parts could be identified during the period March 2016 - March 2017. Two different approaches to forming classes are evaluated. Many classes had only a few observations and, to give the classifiers a fair chance, observations were omitted based on the frequency of their class in the data. After processing, the resulting data comprised 1547 observations on 4168 features, demonstrating very high dimensionality and making it impossible to apply standard methods of large-sample statistical inference. Two supervised statistical learning procedures that can cope with high dimensionality and multiple classes, Support Vector Machines (SVM) and Neural Networks (NN), are applied and evaluated. The analysis showed that on data with 1547 observations of 4168 features (unique DTCs) and 7 classes, SVM yielded an average prediction accuracy of 79.4%, compared to 75.4% using NN. The conclusion of the analysis is that DTCs hold potential as indicators of fault-causing parts in a predictive model, but that the learning data needs improvement in order to increase prediction accuracy. Scope for future research to improve and expand the model is provided, along with practical suggestions for exploiting supervised classifiers at Scania.
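The preprocessing described in the abstract (dropping classes with too few observations, then treating each unique DTC as one feature) could be sketched roughly as follows. This is only an illustration under stated assumptions: the data layout, the function names `filter_rare_classes` and `encode_dtcs`, the threshold `min_count`, and the toy DTC and part labels are all hypothetical and are not taken from the thesis.

```python
from collections import Counter


def filter_rare_classes(observations, min_count):
    """Keep observations whose class label occurs at least min_count times.

    observations: list of (dtc_set, label) pairs. This layout and the
    threshold are illustrative assumptions, not the thesis's actual rule.
    """
    counts = Counter(label for _, label in observations)
    return [(dtcs, label) for dtcs, label in observations
            if counts[label] >= min_count]


def encode_dtcs(observations):
    """Build a 0/1 feature matrix with one column per unique DTC."""
    vocab = sorted({dtc for dtcs, _ in observations for dtc in dtcs})
    rows = [[1 if dtc in dtcs else 0 for dtc in vocab]
            for dtcs, _ in observations]
    return vocab, rows


# Hypothetical toy data: a set of DTCs and the replaced part (class label).
toy = [
    ({"DTC_A", "DTC_B"}, "part_1"),
    ({"DTC_A"}, "part_1"),
    ({"DTC_C"}, "part_2"),
    ({"DTC_B", "DTC_C"}, "part_1"),
]

kept = filter_rare_classes(toy, min_count=2)  # drops the single part_2 case
vocab, rows = encode_dtcs(kept)
labels = [label for _, label in kept]
```

The resulting `rows` and `labels` are in the shape that a multi-class classifier such as an SVM or a neural network would typically take as input. With thousands of unique DTCs, a sparse representation would likely be preferable to the dense lists used here.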

The full report (pdf)

To the list of seminars

Page maintainer: Filip Lindskog
Updated: 25/02-2009