Document worth reading: “How Complex is your classification problem? A survey on measuring classification complexity”

Extracting characteristics from the training datasets of classification problems has proven effective in a number of meta-analyses. Among them, measures of classification complexity can estimate the difficulty of separating the data points into their expected classes. Descriptors of the spatial distribution of the data and estimates of the shape and size of the decision boundary are among the existing measures for this characterization. This information can support the formulation of new data-driven pre-processing and pattern recognition techniques, which can in turn focus on the challenging characteristics of the problems. This paper surveys and analyzes measures that can be extracted from the training datasets in order to characterize the complexity of the respective classification problems. Their use in recent literature is also reviewed and discussed, which points to opportunities for future work in the area. Finally, descriptions are given of an R package named Extended Complexity Library (ECoL) that implements a set of complexity measures and is made publicly available. How Complex is your classification problem? A survey on measuring classification complexity
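To make the idea of a complexity measure concrete, here is a minimal Python sketch of one classic feature-based descriptor, Fisher's discriminant ratio, for a two-class problem. It is only an illustration and is not taken from the paper or from ECoL: the function names (`fisher_ratio`, `max_fisher_ratio`) and the toy data are hypothetical, and the survey's own definitions and normalizations may differ.

```python
from statistics import mean, pvariance


def fisher_ratio(feature_a, feature_b):
    """Fisher's discriminant ratio for one feature and two classes.

    Squared distance between the class means divided by the sum of the
    class variances. Larger values suggest the feature separates the
    classes more easily.
    """
    spread = pvariance(feature_a) + pvariance(feature_b)
    if spread == 0:
        # Both classes are constant on this feature: perfectly separable
        # if the constants differ, uninformative otherwise.
        return float("inf") if mean(feature_a) != mean(feature_b) else 0.0
    return (mean(feature_a) - mean(feature_b)) ** 2 / spread


def max_fisher_ratio(class_a, class_b):
    """Largest per-feature ratio; rows are samples, columns are features."""
    columns_a = zip(*class_a)  # transpose rows into feature columns
    columns_b = zip(*class_b)
    return max(fisher_ratio(a, b) for a, b in zip(columns_a, columns_b))


# Hypothetical toy data: two features per sample.
class_a = [[0.0, 1.0], [1.0, 2.0], [2.0, 3.0]]
far_class_b = [[10.0, 1.5], [11.0, 2.5], [12.0, 3.5]]   # far apart on feature 0
near_class_b = [[0.5, 1.5], [1.5, 2.5], [2.5, 3.5]]     # overlaps class_a

print("well separated:", max_fisher_ratio(class_a, far_class_b))
print("overlapping:   ", max_fisher_ratio(class_a, near_class_b))
```

The well-separated pair yields a much larger ratio than the overlapping pair, which is the kind of signal a complexity measure is meant to provide before any classifier is trained.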