Methods of selecting explanatory variables in classification trees

Hana Řezanková, Faculty of Informatics and Statistics, University of Economics in Prague, Czech Republic

Pages: 40 – 53

Abstract

The paper focuses on different ways of evaluating relationships between categorical variables and on their application to the selection of explanatory variables in classification trees. On the one hand, approaches available in commercial software systems are discussed (chi-square tests and comparison of the variability explained in different groups of objects using the Gini measure and the entropy); on the other hand, possibilities for further development are outlined. The established approaches are illustrated by a data analysis in the IBM SPSS Decision Trees system. Recent research focuses on evaluating the directional association of the target variable with a nominal variable and on implementing it in classification trees. Such approaches have been implemented in packages for the R environment.
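The selection criteria named in the abstract can be illustrated with a minimal sketch. It assumes a contingency table whose rows are the categories of one candidate explanatory variable and whose columns are the classes of the target variable. The function names and the example counts are hypothetical, and the code is not the procedure used by IBM SPSS Decision Trees or by any R package; it only shows the textbook definitions of the Gini measure, the entropy, and the Pearson chi-square statistic.

```python
import math


def gini(counts):
    """Gini measure of variability for a list of class counts."""
    n = sum(counts)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in counts)


def entropy(counts):
    """Entropy (in bits) for a list of class counts."""
    n = sum(counts)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in counts if c > 0)


def impurity_reduction(table, measure):
    """Variability of the target overall minus the weighted variability
    within the groups defined by the explanatory variable."""
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(col_totals)
    within = sum(sum(row) / n * measure(row) for row in table)
    return measure(col_totals) - within


def chi_square(table):
    """Pearson chi-square statistic for a two-way contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical counts: two categories of a candidate variable (rows)
# cross-classified with two target classes (columns).
example_table = [[30, 10], [10, 30]]

gini_gain = impurity_reduction(example_table, gini)
entropy_gain = impurity_reduction(example_table, entropy)
chi2 = chi_square(example_table)
```

For the hypothetical table above, the Gini reduction is 0.125 and the chi-square statistic is 20. A tree-growing procedure would compute such a value for every candidate explanatory variable and split on the one with the strongest relationship to the target; which criterion is used, and how ties or significance levels are handled, depends on the particular algorithm.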
