Heterogeneous distance functions for prototype rules: influence of parameters on probability estimation.

Authors

  • Marcin Blachnik

Abstract

An interesting and little-explored way to understand data is based on prototype rules (P-rules). The goal of this approach is to find optimal similarity (or distance) functions and positions of the prototypes to which unknown vectors are compared. In real applications, similarity functions frequently involve different types of attributes, such as continuous, discrete, binary or nominal. Heterogeneous distance functions that can handle such diverse information are usually based on probabilistic distance measures, such as the Value Difference Metric (VDM). For continuous attributes, calculating the probabilities requires estimating probability density functions. This process requires careful selection of several parameters that may have an important impact on overall classification accuracy. In this paper, various heterogeneous distance functions based on the VDM measure are presented, among them some new heterogeneous distance functions based on different types of probability estimation. Results of many numerical experiments with such distance functions are presented on artificial and real datasets, and quite simple P-rules are extracted for several heterogeneous databases.
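As an illustration of the kind of measure the abstract refers to, the following is a minimal sketch of the VDM for a single nominal attribute, using the commonly cited form: the sum over classes c of |P(c | x) - P(c | y)|^q. The function names `fit_vdm` and `vdm`, the exponent `q`, and the toy data are assumptions made for this example; they are not taken from the paper, and the paper's own variants (in particular the probability estimation for continuous attributes) may differ.

```python
from collections import Counter


def fit_vdm(values, labels):
    """Estimate P(class | value) for one nominal attribute from raw counts.

    Returns a dict mapping each attribute value to a list of conditional
    class probabilities, ordered by the sorted class labels.
    """
    classes = sorted(set(labels))
    value_counts = Counter(values)
    joint_counts = Counter(zip(values, labels))
    return {
        v: [joint_counts[(v, c)] / n for c in classes]
        for v, n in value_counts.items()
    }


def vdm(probs, x, y, q=2):
    """VDM between two attribute values: sum_c |P(c|x) - P(c|y)|**q."""
    return sum(abs(px - py) ** q for px, py in zip(probs[x], probs[y]))


# Hypothetical toy data: one nominal attribute and a class label per sample.
colours = ["red", "red", "blue", "blue", "green"]
labels = ["a", "a", "b", "b", "a"]

probs = fit_vdm(colours, labels)
d_red_blue = vdm(probs, "red", "blue")    # values with opposite class profiles
d_red_green = vdm(probs, "red", "green")  # values with identical class profiles
```

Two values are close under this measure when they predict similar class distributions, regardless of any ordering of the values themselves. For a continuous attribute the count-based estimate in `fit_vdm` would have to be replaced by some density or discretisation-based estimate, which is where the parameter choices discussed in the abstract come in.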

Published

15.12.2006

How to Cite

Blachnik, M. (2006). Heterogeneous distance functions for prototype rules: influence of parameters on probability estimation. Studia Informatica. System and Information Technology, 7(1-2), 19-30. https://czasopisma.uws.edu.pl/studiainformatica/article/view/2847