In Chapter 3, we discussed the design and operation of CLIFF with the help of a simple example, and showed in Section \ref{section:time} that its time complexity is linear, $O(n)$. In this chapter, we examine how CLIFF performs against other PLS using three evaluation methods. First, CLIFF is compared with three PLS spanning the decades from Hart's CNN in 1968 \cite{Hart68a} to MCS in 1994 \cite{Dasarathy94} and finally PSC in 2010 \cite{lot2010}. We then examine the noise tolerance of each PLS studied here by introducing artificial noise into the training sets. Finally, we look at what we call the \emph{brittleness} measure. Brittleness, discussed in Chapter 5, is defined as the property that \emph{a tiny change in the input data can lead to a major change in the output}. As the reader will see, we view instance selection as a viable method to decrease brittleness, and the following sections will show that CLIFF does a better job of reducing the impact of brittleness than any other PLS.

%\section{CLIFF Assessment on Standard Data Sets}
%\label{section:assess}
\section{Data Sets}
\fig{info} lists the seven data sets used to assess CLIFF. For each data set, the number of instances, the number of attributes per instance, and the number of distinct classes are shown. All of these data sets were acquired from the UCI repository \cite{Frank+Asuncion:2010}. Except for the Iris data set, all attribute values are discrete and so require no pre-processing. Since the attribute values for Iris are numeric, we discretize them using an equal frequency binning algorithm so that ranges of values, rather than individual values, are ranked. In the experiments to follow, the number of bins is set to $10$.
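The discretization step can be sketched as follows. This is a minimal, hypothetical implementation of equal frequency binning (the actual pre-processing code is not shown in this work): values are ranked in sorted order and split into bins of roughly equal population.

```python
def equal_frequency_bins(values, n_bins=10):
    """Map each numeric value to a bin index such that every bin
    holds roughly the same number of values (equal frequency binning)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    per_bin = len(values) / n_bins
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        # bin index grows with the sorted rank, capped at the last bin
        bins[i] = min(int(rank / per_bin), n_bins - 1)
    return bins
```

With the 150 Iris instances and 10 bins, each bin would receive about 15 values per attribute.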

%These data sets represent a variety of data types and characteristics. For example, three of the data sets (Sonar, Soybean and Splice) have large dimensions of 60, 35 and 60 respectively, while the others have dimensions in the range of four(4) to eighteen(18). Also the number of instances range from 148 (Lymph) to 3190 (Splice), while the number of classes range from two(2) to fifteen(15).  

%Assessing CLIFF on such a diverse set of data will give a good indication of the level of generalization CLIFF is capable off. However in the interest of speed, 2 pre-processing tools are applied to the high dimensional data sets in \fig{info}. The following section details the algorithms used.

\begin{figure}
\begin{center}
\begin{tabular}{ | l | l | l | l | l |}
\hline
Data Set & Code & Instances & Attributes & Class \\ \hline
Breast Cancer &bc & 286 & 9 & 2 \\ \hline
Dermatology &dm & 366 & 34 & 6 \\ \hline
Heart (Cleveland) &hc & 303 & 13 & 5 \\ \hline
Heart (Hungarian) &hh & 297 & 13 & 2 \\ \hline
Iris & ir& 150 & 4 & 3 \\ \hline
Liver (Bupa) &lv & 345 & 6 & 2 \\ \hline
Mammography& mm & 150 & 4 & 3 \\ \hline
\end{tabular}
\end{center}
\caption{Data Set Characteristics}
\label{fig:info}
\end{figure}     


%In this chapter, we evaluate CLIFF as a prototype learner on standard data sets in cross validation experiments. In the following sections we present results which show the probability of detection (pd) and probability of false alarm (pf) before and after the use of CLIFF. 

\section{Experimental Method}
\label{section:brit}

%The data set used in this work is donated by \cite{Karslake09}. It contains 37 samples each with five(5) replicates (37 x 5 = 185 instances). Each instance has 1151 infrared measurements ranging from 1800-650cm-1. (Further details of this algorithm can be found elsewhere \cite{Karslake09}). For our experiments we took the original data set and created four (4) data sets each with a different number of clusters (3, 5, 10 and 20) or groups. These clusters were created using the K-means algorithm (\fig{kmeans}).

We evaluate CLIFF as a prototype learning scheme on standard data sets in cross-validation experiments. Its performance against CNN, MCS and PSC is measured using the probability of detection (pd) and the probability of false alarm (pf), computed as follows \cite{burak}: let A, B, C and D denote true negatives, false negatives, false positives and true positives respectively. Then \emph{pd}, also known as recall, is the fraction of actual positives that are detected, $pd = D / (B + D)$, while \emph{pf} is the fraction of actual negatives that raise a false alarm, $pf = C / (A + C)$. Both $pd$ and $pf$ range from 0 to 1: when there are no false alarms, $pf = 0$, and at 100\% detection, $pd = 1$.
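The two measures can be computed directly from paired lists of actual and predicted labels. The following sketch (function and variable names are ours, not from the cited work) makes the A, B, C, D bookkeeping explicit:

```python
def pd_pf(actual, predicted, target):
    """Probability of detection (recall) and probability of false alarm
    for one target class, from paired actual/predicted labels."""
    A = B = C = D = 0
    for a, p in zip(actual, predicted):
        if a == target and p == target:
            D += 1            # true positive
        elif a == target:
            B += 1            # false negative
        elif p == target:
            C += 1            # false positive
        else:
            A += 1            # true negative
    pd = D / (B + D) if B + D else 0.0
    pf = C / (A + C) if A + C else 0.0
    return pd, pf
```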

The results were visualized using \emph{quartile charts} as in \cite{burak}. To generate these charts, the $pd$ and $pf$ measures are sorted and the median, lower quartile and upper quartile are extracted. In our quartile charts, the upper and lower quartiles are marked with black lines, the median is marked with a black dot, and vertical bars mark the 50\% percentile value. \fig{qc} shows an example where the upper and lower quartiles are 39\% and 59\% respectively, while the median is 49\%.
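As a sketch of how the chart statistics are obtained, the sort-and-index scheme below is one simple quartile convention among several (interpolating definitions differ slightly):

```python
def quartiles(xs):
    """Lower quartile, median and upper quartile by sorting and indexing.
    (A simple convention; interpolating variants give slightly different values.)"""
    s = sorted(xs)
    n = len(s)
    return s[n // 4], s[n // 2], s[(3 * n) // 4]
```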

\begin{figure}[ht!]
  \begin{center}
  \scalebox{0.7}{
    \begin{tabular}{l}
      \resizebox{100mm}{!}{\includegraphics{quartc}}      
    \end{tabular}}
    \caption{Example quartile chart with lower quartile 39\%, median 49\% and upper quartile 59\%.}
    \label{fig:qc}
  \end{center}
\end{figure}


%need to include examples and cite paragraph

Finally, the Mann-Whitney U test was used to test for statistically significant differences between the PLS. These results are shown as rank values starting at one; the lower the rank value, the better the performance of the prototype learner. Note that prototype learners with the same rank value are not statistically different.
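For reference, the U statistic underlying the test can be computed from tie-averaged ranks. This is a minimal sketch (the p-value lookup and the grouping of learners into ranks is omitted):

```python
def mann_whitney_u(a, b):
    """U statistic for sample a versus sample b, using tie-averaged ranks."""
    pooled = sorted(a + b)
    rank = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # tied values share the mean of positions i+1 .. j
        rank[pooled[i]] = (i + 1 + j) / 2
        i = j
    r_a = sum(rank[v] for v in a)                  # rank sum of sample a
    return r_a - len(a) * (len(a) + 1) / 2
```

U ranges from 0 (every value in `a` below every value in `b`) to `len(a) * len(b)` (the reverse); values near the middle indicate overlapping samples.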

%The brittleness level measure is conducted as follows: First we calculate Euclidean distances between the validation or testing set which has already been validated and the training set. For each instance in the validation set the distance from its nearest like neighbor (NLN) and its nearest unlike neighbor (NUN) is found. Using these NLN and NUN distances from the entire validation set a Mann-Whitney U test was used to test for statistical difference between the NLN and NUN distances. 

The following sections describe the experiments and discuss the results.

\section{Experiment 1: Is CLIFF viable as a Prototype Learning Scheme for NNC?}

The goal here is to see if the performance of CLIFF is comparable to or better than that of the plain k-nearest neighbor (KNN) algorithm and the CNN, MCS and PSC prototype learners. In this experiment we compare the performance of predicting the target class using the entire training set against using only the prototypes generated by each prototype learner, including CLIFF. Our experiment design follows the pseudo code given in \fig{knnexp1} for the standard data sets. For each data set, tests were built from 20\% of the data, selected at random. The models/prototypes were then learned from the remaining 80\% of the data.

This procedure was repeated 5 times, randomizing the order of data in each data set each time. In the end, CLIFF is trained and tested 25 times for each data set.

\begin{figure}[h!]
\small
\begin{center}
\begin{tabular}{ p{7cm} }
\hline
\begin{verbatim}
DATA = [bc dm hc hh ir lv mm]
LEARNER = [KNN]
PLS = [KNN CLIFF CNN MCS PSC]
STAT_TEST = [Mann Whitney]

FOR EACH PLS
  REPEAT 5 TIMES
    FOR EACH data IN DATA
     TRAIN = random 80% of data
     TEST = data - TRAIN
		
     // Construct model from TRAIN data
     r_TRAIN = Reduce TRAIN with PLS
     MODEL = Train LEARNER with r_TRAIN
     
     // Evaluate model on test data
     [pd, pf] = MODEL on TEST
   END
  END
END  	
\end{verbatim}
 \\ \hline
    \end{tabular}
\end{center}
\caption{Pseudo code for Experiment 1}\label{fig:knnexp1}
\end{figure}
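The train/test split at the heart of the pseudo code can be sketched as follows. This is a simplified, hypothetical harness: the learner and PLS calls are elided, and only the repeated random 80/20 partition is shown.

```python
import random

def random_splits(data, repeats=5, train_frac=0.8, seed=1):
    """Yield (TRAIN, TEST) pairs: shuffle the data, train on 80%, test on 20%."""
    rng = random.Random(seed)
    for _ in range(repeats):
        shuffled = data[:]
        rng.shuffle(shuffled)
        cut = int(len(shuffled) * train_frac)
        yield shuffled[:cut], shuffled[cut:]
```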

\subsection{Results from Experiment 1}
The results for this experiment and Experiment 2 (discussed in the following section) are shown in \fig{results100} to \fig{results106}. For each data set, each figure shows the results for a \emph{clean} data set (without noise) and a \emph{noisy} data set (with noise). For both, we present the [pd, pf] quartile charts, the percentage of the training data retained after applying each PLS, and the rank values indicating statistically significant differences between the PLS and KNN.

Let us focus on the \emph{clean} results for each data set from \fig{info}. First, we compare the CLIFF results with those of the baseline KNN. As shown, despite using only 9\% to 15\% of the training set, CLIFF's pd and pf results compare favorably with those of KNN, showing similar or better rank values in most cases. For example, on the Mammography (mm) data set (\fig{results106}), CLIFF ranks first for both pd and pf while KNN ranks third. In \fig{results105}, the Liver (lv) data set exhibits statistically similar pd results, while for pf CLIFF performs statistically better than KNN. \fig{results101} to \fig{results104} also show encouraging results for CLIFF as compared with KNN. \fig{results100}, the Breast Cancer (bc) data set, is the exception: there the pd and pf statistical results for KNN are better than those for CLIFF.

Next, let us consider the results relative to the other PLS. CLIFF does as well as or better than the others in all but one case: on the $bc$ data set, CNN presents statistically better pd and pf results. However, CNN does so using 62\% of the training data for median pd and pf values of 55\% and 39\% respectively, while CLIFF needs only 11\% of the training data for median pd and pf values of 67\% and 20\% respectively.

Generally, CLIFF has markedly lower pfs than the other PLS and KNN; this is especially true for the $hh$ and $ir$ data sets. Moreover, the low pfs do not come at the cost of lower pds; in fact, CLIFF's median pd results are competitive with (and sometimes the best of) all the other PLS.

\section{Experiment 2: How well does CLIFF handle the presence of noise?}

The goal here is to see if CLIFF works well in the presence of noise. This is important because, according to \cite{wilson00}:
\begin{quote}
In the presence of class noise, ... there are two main problems that can occur. The first is that very few instances will be removed from the training set because many instances are needed to maintain the noisy (and thus overly complex) decision boundaries. The second problem is that generalization accuracy can suffer, especially if noisy instances are retained while good instances are removed.
\end{quote}

With that said, in this experiment we repeat Experiment 1, only this time noise is introduced by randomly changing the target class of 10\% of the instances in the training data to some other target class value. The results here indicate how well the 1NN classifier, combined with each PLS, is able to predict the correct target class even when some of its training data is faulty \cite{wilson00}.
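The noise injection step can be sketched as follows (a hypothetical helper; each row is assumed to carry its class label in the last position):

```python
import random

def add_class_noise(rows, classes, level=0.10, seed=1):
    """Return a copy of rows in which `level` of the instances have their
    class label flipped to a randomly chosen different class."""
    rng = random.Random(seed)
    noisy = [list(r) for r in rows]
    for i in rng.sample(range(len(noisy)), int(level * len(noisy))):
        others = [c for c in classes if c != noisy[i][-1]]
        noisy[i][-1] = rng.choice(others)   # flip to any other target class
    return noisy
```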

\subsection{Results from Experiment 2}
The \emph{noisy} tables in \fig{results100} to \fig{results106} display the results of Experiment 2. The first thing to notice is that, compared to the \emph{clean} tables, there is a general degradation of the pd and pf results for each data set. However, in some cases, such as the pd results for $bc$, the degradation is as little as 1\%. The second thing to notice is that while the training-set sizes remain essentially the same for CLIFF (differences of no more than 2\%) across all data sets, the other PLS show a general increase in training set size, except for PSC, whose training size decreases from 42\% to 40\%.



%\fig{result1} shows the 25\%, 50\% and 100\% percentile values of the $pd$, $pf$ and position values in each data set when r=1 (upper table) and r=2 (lower table. Next to these is the brittleness signal where $high$ signals an unacceptable level of brittleness and $low$ signals an acceptable level of brittleness. The results show that the brittleness level for each data set is $low$. The $pd$ and $pf$ results are promising showing that 50\% of the pd values are at or above 95\% for the data set with 3 clusters and at 100\% for the other data sets. While 50\% of the pf values are at 3\% for 3 clusters and 0\% for the others. These results show that our model is highly discriminating and can be used successfully in the evaluation of trace evidence.


%\end{center}
%\caption{Results for Experiment 1 for the 4 data sets distinguished by the number of clusters. Here for the upper and lower tables f=4 is used while r=1 is used for the upper table and r=2 for the lower table.}\label{fig:result1}
%\end{figure}

\section{Experiment 3: Can CLIFF reduce brittleness?}

In this work, $brittleness$ refers to the following:

\begin{quote}
\emph{Brittleness} is a measure of whether a solution (predicted target class) comes from a region of similar solutions or from a region of dissimilar solutions. Put another way, it asks how far a test instance would have to move before a different target class is predicted.
\end{quote}
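Under this definition, brittleness relates to the distance from a test instance to its nearest training instance of a different class (its nearest unlike neighbor). A minimal sketch, assuming numeric features and Euclidean distance:

```python
def nearest_unlike_distance(test_features, train_rows, predicted_class):
    """Euclidean distance from a test instance to the closest training
    instance whose class differs from the predicted one."""
    def euclid(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    return min(euclid(test_features, feats)
               for feats, label in train_rows if label != predicted_class)
```

The larger this distance, the further the test instance would have to move before its predicted class changes, i.e. the less brittle the model.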


\begin{figure}[ht!]
  \begin{center}
  \scalebox{1}{
    \begin{tabular}{l}
      \resizebox{100mm}{!}{\includegraphics{ir2fss}} \\
      \resizebox{100mm}{!}{\includegraphics{ir2fssaf}}      
    \end{tabular}}
    \caption{The Iris data before (top) and after (bottom) applying CLIFF.}
    \label{fig:charts2}
  \end{center}
\end{figure} 

Take, for example, the \emph{Before CLIFF} chart in \fig{charts2}: the classes \emph{versicolor} and \emph{virginica} overlap severely, and the versicolor test instance, represented by the purple square, does not have to move very far before its predicted target class changes to \emph{virginica}. In the \emph{After CLIFF} chart in \fig{charts2}, a subset of the instances has been selected as prototypes. This increases the distance from the \emph{versicolor} test instance to the nearest prototype of the \emph{virginica} class, thereby reducing brittleness.

With this measure in mind, the goal here is to see how well each prototype learning scheme reduces the brittleness of the KNN model where k=1. Brittleness is measured by the distance each test instance must move before its original predicted target class changes; intuitively, the further the test instance has to move, the less \emph{brittle} the model. The experiment design for brittleness is run in conjunction with Experiment 1 by collecting, for each test instance with a predicted target class, the distance to the nearest training instance with a different target class. This is done for each prototype learning scheme and for 1NN. The distances generated are joined, sorted and labelled according to their position in the list. For example, suppose A and B are PLS with the distance values [2, 2, 2, 3, 4, 3, 77] and [6, 7, 3, 9, 1, 1, 1, 100] respectively. After being joined, sorted and labelled, the result is as follows:

\begin{verbatim}
PLS      B  B  B  A  A  A  A  A  B   A   B   B   B   A    B
Sort     1  1  1  2  2  2  3  3  3   4   6   7   9  77  100
Position 1  2  3  4  5  6  7  8  9  10  11  12  13  14   15
Label    2  2  2  5  5  5  8  8  8  10  11  12  13  14   15
\end{verbatim}

As shown, labels are assigned according to the position of a value in the list; however, if values are tied, the mean of their positions is used as the label.
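The labelling rule can be sketched as follows: sort the pooled distances, then give every run of equal values the mean of the positions it occupies.

```python
def tie_labels(values):
    """Sort the pooled distance values and label each position;
    runs of equal values all receive the mean of their positions."""
    s = sorted(values)
    labels = []
    i = 0
    while i < len(s):
        j = i
        while j < len(s) and s[j] == s[i]:
            j += 1
        labels.extend([(i + 1 + j) / 2] * (j - i))  # mean of positions i+1..j
        i = j
    return labels
```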

\subsection{Results from Experiment 3}
\fig{charts1} and \fig{instances} present the results for this experiment. The pattern is very clear: in all cases, CLIFF does a much better job of reducing brittleness than any of the other PLS. Each chart in \fig{charts1} presents the results for one of the data sets used in this work. They all show that the CLIFF test instances must (most of the time) move further away before their target classes change. These results are confirmed by a Mann-Whitney statistical test (\fig{instances}), which shows that the CLIFF results are statistically different from, and better than, those of the other PLS.

\section{Summary}
Collectively, the results of the above experiments indicate that CLIFF may be effective in the field of forensic interpretation, where a very low false alarm rate ($pf$) is desired. Here, CLIFF can be used to help lower the $pf$s of a forensic interpretation model. The results of Experiment 1 indicate this possibility in the median $pf$ results, where CLIFF's values range from 0 to 47 and are lower than or the same as the baseline (KNN) $pf$ results.


\begin{figure}[ht!]
%  \begin{center}
  \scalebox{0.87}{
    \begin{tabular}{l}
      \resizebox{90mm}{!}{\includegraphics{bc1}}
      \resizebox{90mm}{!}{\includegraphics{dm1}} \\
      \resizebox{90mm}{!}{\includegraphics{hc1}} 
      \resizebox{90mm}{!}{\includegraphics{hh1}} \\
      \resizebox{90mm}{!}{\includegraphics{ir1}} 
      \resizebox{90mm}{!}{\includegraphics{lv1}} \\
      \resizebox{90mm}{!}{\includegraphics{mm1}} \\
    \end{tabular}}
    \caption{Position of distance values for PLS}
    \label{fig:charts1}
 % \end{center}
\end{figure}


\begin{figure}
\begin{center}
\begin{tabular}{l@{~}|c@{~}| c@{~}|}
\cline{1-3}
Data Set &  PLS & Significance \\\hline
\multirow{5}{*}{Breast Cancer (bc)} &  cliff & 1  \\
 &  mcs & 2 \\
 &  psc & 2 \\
 &  cnn & 3 \\
 & knn & 3 \\
 \hline
\multirow{5}{*}{Dermatology (dm)} & cliff & 1  \\
 &  mcs & 2 \\
 &  psc & 2 \\
  & cnn & 3 \\
 & knn & 3 \\
  \hline
\multirow{5}{*}{Heart Cleveland (hc)} &   cliff & 1  \\
 &  mcs & 2 \\
 &  psc & 2 \\
 &  cnn & 2 \\
  & knn& 2 \\
 \hline
\multirow{5}{*}{Heart Hungarian (hh)} &   cliff & 1  \\
 &  mcs & 2 \\
 &  psc & 2 \\
 &  cnn & 2 \\
 & knn & 2 \\
  \hline 
\multirow{5}{*}{Iris (ir)} &  cliff & 1  \\
 &  mcs & 2 \\
 &  psc & 3 \\
 & cnn & 3 \\
  & knn & 4 \\
  \hline 
 \multirow{5}{*}{Liver Bupa (lv)} &  cliff & 1  \\
 &  mcs & 2 \\
 &  psc & 2 \\
 &  cnn & 3 \\
 & knn & 3 \\
 \hline 
\multirow{5}{*}{Mammography (mm)} &  cliff & 1  \\
 &  mcs & 2 \\
 &  psc & 2 \\
 &  cnn & 3 \\
 & knn & 3 \\
  \hline    
\end{tabular}
\end{center}
\caption{Summary of Mann-Whitney U-test results for Experiment 3 (95\% confidence): in the Significance column, a rank of one indicates the greatest brittleness reduction, showing that CLIFF performs better than the other PLS.}\label{fig:instances}
\end{figure}




%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{figure}[ht]
\begin{center}
\small
\scalebox{1}{

\begin{tabular}{ l }
\begin{tabular}{l@{~}| l@{~}| c@{~}| r@{~}|r@{~}r@{~}@{~}r@{~}|c}
\multicolumn{8}{ c }{Clean Breast Cancer Results} \\ \hline
bc & PLS & rank & size\% & 25\%& 50\% & 75\%&Q1 median Q3\\\hline 
pd
	&knn      & 1& 100& 38 & 58 & 79 & \boxplot{0.0}{37.5}{58.3}{78.9}{21.1}  \\
	&cnn+knn  & 1&  62& 40 & 55 & 76 & \boxplot{0.0}{40.0}{55.0}{76.3}{23.7}  \\
	&cliff+knn& 2&  11& 33 & 67 & 89 & \boxplot{0.0}{33.3}{66.7}{88.6}{11.4}  \\ 
	&psc+knn  & 2&  15& 40 & 50 & 64 & \boxplot{0.0}{40.0}{50.0}{64.3}{35.7}  \\ 
  &mcs+knn  & 3&  22& 41 & 50 & 60 & \boxplot{0.0}{41.2}{50.0}{60.0}{40.0}  \\
  \hline
pf
	&knn      & 1& 100&  21 & 38 & 60 & \boxplot{0.0}{21.1}{38.2}{60.0}{40.0}  \\
	&cnn+knn  & 1&  62&  22 & 39 & 60 & \boxplot{0.0}{22.0}{38.9}{60.0}{40.0}  \\
	&cliff+knn& 2&  11&  9 & 20 & 64 & \boxplot{0.0}{9.1}{19.6}{64.3}{35.7}  \\
	&psc+knn  & 2&  15& 36 & 50 & 61 & \boxplot{0.0}{35.7}{50.0}{61.4}{38.6}  \\
	&mcs+knn  & 3&  22& 39 & 49 & 59 & \boxplot{0.0}{38.9}{48.6}{58.8}{41.2}  \\
	\hline
\multicolumn{7}{c}{~}&~~~~~0~~~~~~~~50~~~~100
\end{tabular}
\\    \\    \\
\begin{tabular}{l@{~}| l@{~}| c@{~}| r@{~}|r@{~}r@{~}@{~}r@{~}|c}
\multicolumn{8}{ c }{Noisy Breast Cancer Results} \\ \hline
bc & PLS & rank & size\% & 25\%& 50\% & 75\%&Q1 median Q3\\\hline 
pd
	&knn      & 1& 100& 41 & 57 & 67 & \boxplot{0.0}{41.2}{57.1}{66.7}{33.3}  \\
	&cliff+knn& 1&  11&  39 & 57 & 82 & \boxplot{0.0}{38.9}{57.1}{81.6}{18.4}  \\ 
	&mcs+knn  & 2&  25& 43 & 57 & 63 & \boxplot{0.0}{42.9}{56.5}{62.8}{37.2}  \\
	&cnn+knn  & 2&  71& 44 & 54 & 65 & \boxplot{0.0}{43.8}{54.1}{65.1}{34.9}  \\
	&psc+knn  & 3&  19& 37 & 49 & 64 & \boxplot{0.0}{36.6}{48.8}{64.3}{35.7}  \\
	\hline
pf
	&cliff+knn& 1&  11&  17 & 35 & 58 & \boxplot{0.0}{17.1}{35.3}{57.9}{42.1}  \\
	&knn      & 1& 100&  32 & 40 & 53 & \boxplot{0.0}{32.4}{40.0}{52.9}{47.1}  \\
	&cnn+knn  & 2&  71&  33 & 42 & 54 & \boxplot{0.0}{33.3}{41.7}{53.8}{46.2}  \\
	&psc+knn  & 3&  19&  33 & 50 & 62 & \boxplot{0.0}{33.3}{50.0}{61.9}{38.1}  \\
	&mcs+knn  & 2&  25&  37 & 44 & 54 & \boxplot{0.0}{37.2}{43.9}{54.1}{45.9}  \\
	\hline
\multicolumn{7}{c}{~}&~~~~~0~~~~~~~~50~~~~100
\end{tabular}
\end{tabular}}
\caption{Clean and noisy results for breast cancer.}
\label{fig:results100}
\end{center}
\end{figure}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{figure}[ht]
\begin{center}
\small
\scalebox{1}{
\begin{tabular}{ l }
\begin{tabular}{l@{~}| l@{~}| c@{~}| r@{~}|r@{~}r@{~}@{~}r@{~}|c}
\multicolumn{8}{ c }{Clean Dermatology Results} \\ \hline
dm & PLS & rank & size\% & 25\%& 50\% & 75\%&Q1 median Q3\\\hline 
pd
	&knn      & 1& 100& 89 & 100 & 100 & \boxplot{0.0}{88.9}{100.0}{100.0}{0.0}  \\
	&cliff+knn& 2&  13&  80 & 93 & 100 & \boxplot{0.0}{80.0}{93.3}{100.0}{0.0}  \\ 
	&cnn+knn  & 3&  27& 77 & 88 & 100 & \boxplot{0.0}{76.9}{87.5}{100.0}{0.0}  \\
	&psc+knn  & 4&  10&  69 & 86 & 96 & \boxplot{0.0}{68.8}{85.7}{95.7}{4.3}  \\
	&mcs+knn  & 4&  11& 73 & 85 & 91 & \boxplot{0.0}{73.3}{84.6}{90.9}{9.1}  \\
  \hline
pf
	&knn      & 1& 100& 0 & 0 & 2 & \boxplot{0.0}{0.0}{0.0}{1.6}{98.4}  \\
	&cliff+knn& 1&  13& 0 & 0 & 3 & \boxplot{0.0}{0.0}{0.0}{2.9}{97.1}  \\
	&cnn+knn  & 1&  27& 0 & 0 & 3 & \boxplot{0.0}{0.0}{0.0}{3.2}{96.8}  \\
	&psc+knn  & 1&  10& 0 & 0 & 5 & \boxplot{0.0}{0.0}{0.0}{4.8}{95.2}  \\
	&mcs+knn  & 1&  11& 0 & 0 & 5 & \boxplot{0.0}{0.0}{0.0}{4.8}{95.2}  \\	
\hline
\multicolumn{7}{c}{~}&~~~~~0~~~~~~~~50~~~~100
\end{tabular}
\\    \\    \\
\begin{tabular}{l@{~}| l@{~}| c@{~}| r@{~}|r@{~}r@{~}@{~}r@{~}|c}
\multicolumn{8}{ c }{Noisy Dermatology Results} \\ \hline
dm & PLS & rank & size\% & 25\%& 50\% & 75\%&Q1 median Q3\\\hline 
pd
	&cliff+knn& 1&   13& 69 & 91 & 100 & \boxplot{0.0}{68.8}{90.9}{100.0}{0.0}  \\ 	
	&cnn+knn  & 2&  94& 67 & 80 & 90 & \boxplot{0.0}{66.7}{80.0}{90.0}{10.0}  \\
	&knn      & 2& 100& 64 & 78 & 88 & \boxplot{0.0}{63.6}{77.8}{87.5}{12.5}  \\
	&mcs+knn  & 3&  27& 46 & 60 & 73 & \boxplot{0.0}{46.2}{60.0}{72.7}{27.3}  \\
	&psc+knn  & 4&  22& 27 & 50 & 73 & \boxplot{0.0}{27.3}{50.0}{72.7}{27.3}  \\	
	\hline
pf
	&cliff+knn& 1&  13&  0 & 1 & 6 & \boxplot{0.0}{0.0}{1.4}{5.6}{94.4}  \\
	&cnn+knn  & 2&  94& 2 & 3 & 6 & \boxplot{0.0}{1.6}{3.3}{6.2}{93.8}  \\
	&knn      & 2& 100&  2 & 4 & 8 & \boxplot{0.0}{1.7}{3.7}{7.7}{92.3}  \\
	&mcs+knn  & 3&  27& 4 & 7 & 12 & \boxplot{0.0}{3.9}{7.0}{11.8}{88.2}  \\
	&psc+knn  & 4&  22& 5 & 9 & 16 & \boxplot{0.0}{4.5}{9.0}{16.1}{83.9}  \\
\hline
\multicolumn{7}{c}{~}&~~~~~0~~~~~~~~50~~~~100
\end{tabular}
\end{tabular}}
\caption{Clean and noisy results for dermatology.}
\label{fig:results101}
\end{center}
\end{figure}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\begin{figure}[ht]
\begin{center}
\small
\scalebox{1}{
\begin{tabular}{ l }
\begin{tabular}{l@{~}| l@{~}| c@{~}| r@{~}|r@{~}r@{~}@{~}r@{~}|c}
\multicolumn{8}{ c }{Clean Heart (Cleveland) Results} \\ \hline
hc & PLS & rank & size\% & 25\%& 50\% & 75\%&Q1 median Q3\\\hline 
pd
	&psc+knn  & 1&  42&  9 & 26 & 42 & \boxplot{0.0}{9.1}{25.7}{41.9}{58.1}  \\
	&mcs+knn  & 1&  36&  9 & 25 & 42 & \boxplot{0.0}{9.1}{25.0}{41.7}{58.3}  \\
  &cnn+knn  & 1&  67&  0 & 20 & 50 & \boxplot{0.0}{0.0}{20.0}{50.0}{50.0}  \\ 
	&cliff+knn& 1&  11&  0 & 20 & 42 & \boxplot{0.0}{0.0}{20.0}{41.7}{58.3}  \\
	&knn      & 1& 100&  0 & 20 & 40 & \boxplot{0.0}{0.0}{20.0}{40.0}{60.0}  \\
	\hline
pf
	&cliff+knn& 1&  11& 3 & 9 & 22 & \boxplot{0.0}{3.4}{8.9}{21.7}{78.3}  \\
	&mcs+knn  & 2&  36& 5 & 10 & 25 & \boxplot{0.0}{5.4}{10.0}{25.0}{75.0}  \\
	&cnn+knn  & 2&  67& 6 & 11 & 23 & \boxplot{0.0}{5.8}{10.7}{23.3}{76.7}  \\
	&knn      & 2& 100& 6 & 11 & 23 & \boxplot{0.0}{5.7}{11.3}{23.4}{76.6}  \\
	&psc+knn  & 3&  42& 7 & 14 & 23 & \boxplot{0.0}{7.4}{13.5}{23.1}{76.9}  \\	
\hline
\multicolumn{7}{c}{~}&~~~~~0~~~~~~~~50~~~~100
\end{tabular}
\\    \\    \\
\begin{tabular}{l@{~}| l@{~}| c@{~}| r@{~}|r@{~}r@{~}@{~}r@{~}|c}
\multicolumn{8}{ c }{Noisy Heart (Cleveland) Results} \\ \hline
hc & PLS & rank & size\% & 25\%& 50\% & 75\%&Q1 median Q3\\\hline 
pd
	&cliff+knn& 1&  12&  0 & 17 & 47 & \boxplot{0.0}{0.0}{16.7}{46.7}{53.3}  \\
	&psc+knn  & 2&  40& 10 & 21 & 33 & \boxplot{0.0}{10.0}{20.5}{33.3}{66.7}  \\
	&cnn+knn  & 2&  86&  0 & 17 & 38 & \boxplot{0.0}{0.0}{16.7}{37.5}{62.5}  \\
	&mcs+knn  & 2&  48&  0 & 17 & 33 & \boxplot{0.0}{0.0}{16.7}{33.3}{66.7}  \\
  &knn      & 3& 100&  0 & 18 & 33 & \boxplot{0.0}{0.0}{18.2}{33.3}{66.7}  \\ 
  \hline
pf
	&cliff+knn& 1&  12&  3 & 10 & 22 & \boxplot{0.0}{3.4}{10.3}{22.0}{78.0}  \\
	&cnn+knn  & 2&  86&  8 & 13 & 23 & \boxplot{0.0}{7.5}{13.2}{22.7}{77.3}  \\
	&knn      & 2& 100&  8 & 16 & 23 & \boxplot{0.0}{7.7}{15.7}{23.4}{76.6}  \\
	&mcs+knn  & 3&  48&  9 & 14 & 26 & \boxplot{0.0}{8.5}{13.8}{25.5}{74.5}  \\
	&psc+knn  & 4&  40&  10 & 18 & 25 & \boxplot{0.0}{9.8}{18.2}{25.0}{75.0}  \\
\hline
\multicolumn{7}{c}{~}&~~~~~0~~~~~~~~50~~~~100
\end{tabular}
\end{tabular}}
\caption{Clean and noisy results for heart (Cleveland).}
\label{fig:results102}
\end{center}
\end{figure}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{figure}[ht]
\begin{center}
\small
\scalebox{1}{
\begin{tabular}{ l }
\begin{tabular}{l@{~}| l@{~}| c@{~}| r@{~}|r@{~}r@{~}@{~}r@{~}|c}
\multicolumn{8}{ c }{Clean Heart (Hungarian) Results} \\ \hline
hh & PLS & rank & size\% & 25\%& 50\% & 75\%&Q1 median Q3\\\hline 
pd
	&cliff+knn& 1&   9& 68 & 82 & 90 & \boxplot{0.0}{68.4}{82.1}{89.7}{10.3}  \\
	&knn      & 1& 100& 65 & 75 & 83 & \boxplot{0.0}{65.0}{75.0}{83.3}{16.7}  \\
	&cnn+knn  & 1&  65& 57 & 74 & 85 & \boxplot{0.0}{57.1}{74.4}{85.4}{14.6}  \\
	&psc+knn  & 2&  14& 50 & 63 & 75 & \boxplot{0.0}{50.0}{62.5}{75.0}{25.0}  \\ 
	&mcs+knn  & 2&  19& 53 & 62 & 71 & \boxplot{0.0}{52.6}{61.5}{70.6}{29.4}  \\
  \hline
pf
	&cliff+knn& 1&   9&  10 & 19 & 31 & \boxplot{0.0}{10.3}{19.0}{31.2}{68.8}  \\
	&knn      & 1& 100& 16 & 24 & 33 & \boxplot{0.0}{16.0}{24.0}{33.3}{66.7}  \\
	&cnn+knn  & 1&  65&  13 & 25 & 37 & \boxplot{0.0}{12.5}{25.0}{36.8}{63.2}  \\
	&mcs+knn  & 2&  19& 28 & 38 & 46 & \boxplot{0.0}{28.0}{37.5}{45.8}{54.2}  \\
	&psc+knn  & 2&  14& 28 & 38 & 48 & \boxplot{0.0}{28.0}{38.1}{47.8}{52.2}  \\
\hline
\multicolumn{7}{c}{~}&~~~~~0~~~~~~~~50~~~~100
\end{tabular}
\\    \\    \\
\begin{tabular}{l@{~}| l@{~}| c@{~}| r@{~}|r@{~}r@{~}@{~}r@{~}|c}
\multicolumn{8}{ c }{Noisy Heart (Hungarian) Results} \\ \hline
hh & PLS & rank & size\% & 25\%& 50\% & 75\%&Q1 median Q3\\\hline 
pd
	&cliff+knn& 1&   7& 68 & 79 & 89 & \boxplot{0.0}{68.4}{78.9}{89.2}{10.8}  \\
	&cnn+knn  & 2&  76& 60 & 65 & 72 & \boxplot{0.0}{60.0}{65.0}{71.8}{28.2}  \\
	&knn      & 2& 100& 58 & 64 & 69 & \boxplot{0.0}{57.9}{64.1}{69.0}{31.0}  \\
	&mcs+knn  & 2&  25& 51 & 59 & 68 & \boxplot{0.0}{51.3}{59.1}{68.4}{31.6}  \\
  &psc+knn  & 3&  19& 38 & 53 & 68 & \boxplot{0.0}{37.5}{52.6}{68.2}{31.8}  \\ \hline
pf
	&cliff+knn& 1&   7&  11 & 21 & 32 & \boxplot{0.0}{10.5}{20.5}{31.6}{68.4}  \\
	&knn      & 2& 100&  29 & 35 & 41 & \boxplot{0.0}{29.4}{35.3}{41.0}{59.0}  \\
	&cnn+knn  & 2&  76&  28 & 36 & 41 & \boxplot{0.0}{28.2}{35.5}{40.7}{59.3}  \\
	&mcs+knn  & 2&  25&  28 & 37 & 47 & \boxplot{0.0}{28.0}{37.1}{47.1}{52.9}  \\
	&psc+knn  & 3&  19&  28 & 44 & 61 & \boxplot{0.0}{28.2}{43.5}{60.5}{39.5}  \\
\hline
\multicolumn{7}{c}{~}&~~~~~0~~~~~~~~50~~~~100
\end{tabular}
\end{tabular}}
\caption{Clean and noisy results for heart (Hungarian).}
\label{fig:results103}
\end{center}
\end{figure}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{figure}[ht]
\begin{center}
\small
\scalebox{1}{
\begin{tabular}{ l }
\begin{tabular}{l@{~}| l@{~}| c@{~}| r@{~}|r@{~}r@{~}@{~}r@{~}|c}
\multicolumn{8}{ c }{Clean Iris Results} \\ \hline
ir & PLS & rank & size\% & 25\%& 50\% & 75\%&Q1 median Q3\\\hline 
pd
	&knn   & 1& 100& 92& 100& 100&  \boxplot{0.0}{91.7}{100.0}{100.0}{0.0}  \\
	&psc+knn  & 1&   9& 86& 100& 100&  \boxplot{0.0}{85.7}{100.0}{100.0}{0.0}  \\
	&cliff+knn& 1&  15& 80& 100& 100&  \boxplot{0.0}{80.0}{100.0}{100.0}{0.0}  \\
	&mcs+knn  & 1&   4& 80& 100& 100&  \boxplot{0.0}{80.0}{100.0}{100.0}{0.0}  \\
  &cnn+knn & 1&  14& 88&  93& 100&  \boxplot{0.0}{87.5}{92.9}{100.0}{0.0}  \\ \hline
pf
	&knn   & 1& 100& 0& 0& 4&  \boxplot{0.0}{0.0}{0.0}{4.3}{95.7} \\
	&psc+knn  & 1&   9& 0& 0& 6&  \boxplot{0.0}{0.0}{0.0}{5.6}{94.4}  \\
	&cnn+knn & 1&  14& 0& 0& 6&  \boxplot{0.0}{0.0}{0.0}{5.6}{94.4} \\
	&cliff+knn& 1&  15& 0& 0& 7&  \boxplot{0.0}{0.0}{0.0}{7.1}{92.9} \\
	&mcs+knn  & 1&   4& 0& 0& 9&  \boxplot{0.0}{0.0}{0.0}{9.1}{90.9} \\
	\hline
\multicolumn{7}{c}{~}&~~~~~0~~~~~~~~50~~~~100
\end{tabular}
\\    \\    \\
\begin{tabular}{l@{~}| l@{~}| c@{~}| r@{~}|r@{~}r@{~}@{~}r@{~}|c}
\multicolumn{8}{ c }{Noisy Iris Results} \\ \hline
ir & PLS & rank & size\% & 25\%& 50\% & 75\%&Q1 median Q3\\\hline 
pd
	&cliff+knn& 1&  13& 56&  78& 100&  \boxplot{0.0}{55.6}{77.8}{100.0}{0.0}  \\
	&knn   & 2& 100& 69&  83&  92&  \boxplot{0.0}{69.2}{83.3}{91.7}{8.3}  \\
	&cnn+knn & 3&  33& 38&  67&  80&  \boxplot{0.0}{37.5}{66.7}{80.0}{20.0}  \\
	&mcs+knn  & 3&  10& 42&  63&  78&  \boxplot{0.0}{41.7}{62.5}{77.8}{22.2}  \\
  &psc+knn  & 4&  18& 30&  50&  67&  \boxplot{0.0}{30.0}{50.0}{66.7}{33.3}  \\ \hline
pf
	&cliff+knn& 1&  13&  0&  5& 19&  \boxplot{0.0}{0.0}{5.0}{19.0}{81.0} \\
	&knn   & 2& 100&  0&  9& 14&  \boxplot{0.0}{0.0}{9.1}{14.3}{85.7} \\
	&mcs+knn  & 3&  10&  6& 16& 25&  \boxplot{0.0}{5.6}{15.8}{25.0}{75.0} \\
	&cnn+knn & 3&  33&  9& 17& 29&  \boxplot{0.0}{8.7}{16.7}{29.4}{70.6} \\
	&psc+knn  & 3&  18& 10& 25& 38&  \boxplot{0.0}{10.0}{25.0}{38.1}{61.9}  \\
\hline
\multicolumn{7}{c}{~}&~~~~~0~~~~~~~~50~~~~100
\end{tabular}
\end{tabular}}
\caption{Clean and noisy results for iris.}
\label{fig:results104}
\end{center}
\end{figure}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{figure}[ht]
\begin{center}
\small
\scalebox{1}{
\begin{tabular}{ l }
\begin{tabular}{l@{~}| l@{~}| c@{~}| r@{~}|r@{~}r@{~}@{~}r@{~}|c}
\multicolumn{8}{ c }{Clean Liver (Bupa) Results} \\ \hline
lv & PLS & rank & size\% & 25\%& 50\% & 75\%&Q1 median Q3\\\hline 
pd
	&knn   & 1& 100& 50 & 56 & 64 & \boxplot{0.0}{50.0}{56.4}{64.0}{36.0}  \\
	&cliff+knn& 1&  9& 29 & 56 & 80 & \boxplot{0.0}{28.9}{56.0}{80.0}{20.0}  \\ 
	&mcs+knn  & 1&  25& 50 & 55 & 60 & \boxplot{0.0}{50.0}{54.8}{59.5}{40.5}  \\
  &psc+knn  & 1&  13& 45 & 55 & 60 & \boxplot{0.0}{44.7}{54.5}{60.0}{40.0}  \\
  &cnn+knn & 1&  59& 46 & 53 & 58 & \boxplot{0.0}{45.8}{52.5}{58.3}{41.7}  \\
  \hline
pf
	&cliff+knn& 1&  9& 19 & 36 & 68 & \boxplot{0.0}{18.8}{36.0}{68.4}{31.6}  \\
	&cnn+knn & 2&  59& 40 & 46 & 53 & \boxplot{0.0}{40.0}{46.2}{52.6}{47.4}  \\
	&knn   & 2& 100& 36 & 44 & 50 & \boxplot{0.0}{35.9}{44.0}{50.0}{50.0}  \\
	&mcs+knn  & 3&  25& 37 & 44 & 48 & \boxplot{0.0}{36.7}{43.9}{48.3}{51.7}  \\
	&psc+knn  & 3&  13& 36 & 45 & 55 & \boxplot{0.0}{35.5}{44.8}{55.0}{45.0}  \\
	\hline
\multicolumn{7}{c}{~}&~~~~~0~~~~~~~~50~~~~100
\end{tabular}
\\    \\    \\
\begin{tabular}{l@{~}| l@{~}| c@{~}| r@{~}|r@{~}r@{~}@{~}r@{~}|c}
\multicolumn{8}{ c }{Noisy Liver (Bupa) Results} \\ \hline
lv & PLS & rank & size\% & 25\%& 50\% & 75\%&Q1 median Q3\\\hline 
pd
	&mcs+knn  & 1&  26& 44 & 55 & 61 & \boxplot{0.0}{44.2}{54.8}{61.3}{38.7}  \\
	&knn      & 1& 100& 44 & 54 & 60 & \boxplot{0.0}{44.0}{53.8}{60.0}{40.0}  \\
	&cnn+knn  & 1&  57& 47 & 54 & 59 & \boxplot{0.0}{46.5}{53.8}{59.4}{40.6}  \\
	&psc+knn  & 1&  15& 42 & 53 & 58 & \boxplot{0.0}{42.3}{52.5}{58.1}{41.9}  \\ 
  &cliff+knn& 1&   9& 30 & 48 & 74 & \boxplot{0.0}{30.0}{48.3}{74.3}{25.7}  \\
  \hline
pf
	&cliff+knn& 1& 	  9&  21 & 42 & 68 & \boxplot{0.0}{21.4}{41.7}{67.6}{32.4}  \\
	&knn   		& 1&	100&  38 & 44 & 54 & \boxplot{0.0}{37.5}{44.4}{53.5}{46.5}  \\
	&mcs+knn  & 1&	 26&  39 & 45 & 54 & \boxplot{0.0}{38.5}{45.0}{53.5}{46.5}  \\
	&psc+knn  & 1&   15&  41 & 47 & 57 & \boxplot{0.0}{40.6}{47.4}{56.8}{43.2}  \\
	&cnn+knn 	& 1& 	 57&  41 & 48 & 54 & \boxplot{0.0}{40.6}{48.3}{54.1}{45.9}  \\
	\hline
\multicolumn{7}{c}{~}&~~~~~0~~~~~~~~50~~~~100
\end{tabular}
\end{tabular}}
\caption{Clean and noisy results for liver (Bupa).}
\label{fig:results105}
\end{center}
\end{figure}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{figure}[ht]
\begin{center}
\small
\scalebox{1}{
\begin{tabular}{ l }
\begin{tabular}{l@{~}| l@{~}| c@{~}| r@{~}|r@{~}r@{~}@{~}r@{~}|c}
\multicolumn{8}{ c }{Clean Mammography Results} \\ \hline
mm & PLS & rank & size\% & 25\%& 50\% & 75\%&Q1 median Q3\\\hline 
pd
	&cliff+knn& 1&   8& 49 & 62 & 77 & \boxplot{0.0}{48.6}{62.2}{77.3}{22.7}  \\
	&cnn+knn  & 2&  57& 47 & 54 & 61 & \boxplot{0.0}{46.5}{53.7}{60.5}{39.5}  \\
	&knn      & 3& 100& 45 & 53 & 57 & \boxplot{0.0}{45.0}{52.9}{57.1}{42.9}  \\
	&mcs+knn  & 3&  17& 48 & 52 & 57 & \boxplot{0.0}{47.6}{51.8}{56.8}{43.2}  \\
  &psc+knn  & 4&  10& 44 & 50 & 56 & \boxplot{0.0}{44.2}{50.0}{55.8}{44.2}  \\ 
  \hline
pf
	&cliff+knn& 1&   8& 21 & 36 & 45 & \boxplot{0.0}{21.3}{36.0}{45.2}{54.8}  \\
	&cnn+knn  & 2&  57& 39 & 46 & 52 & \boxplot{0.0}{38.6}{46.3}{52.3}{47.7}  \\
	&knn      & 3& 100& 41 & 46 & 51 & \boxplot{0.0}{41.3}{45.9}{51.2}{48.8}  \\
	&mcs+knn  & 3&  17& 41 & 48 & 52 & \boxplot{0.0}{40.8}{47.6}{52.2}{47.8}  \\
	&psc+knn  & 4&  10& 43 & 47 & 55 & \boxplot{0.0}{42.5}{47.4}{54.5}{45.5}  \\
\hline
\multicolumn{7}{c}{~}&~~~~~0~~~~~~~~50~~~~100
\end{tabular}
\\    \\    \\
\begin{tabular}{l@{~}| l@{~}| c@{~}| r@{~}|r@{~}r@{~}@{~}r@{~}|c}
\multicolumn{8}{ c }{Noisy Mammography Results} \\ \hline
mm & PLS & rank & size\% & 25\%& 50\% & 75\%&Q1 median Q3\\\hline 
pd
	&cliff+knn& 1&   9& 51 & 63 & 76 & \boxplot{0.0}{51.3}{63.4}{76.1}{23.9}  \\
	&mcs+knn  & 2&  19& 44 & 51 & 57 & \boxplot{0.0}{44.0}{51.1}{57.4}{42.6}  \\ 
	&psc+knn  & 2&  11& 43 & 50 & 60 & \boxplot{0.0}{42.9}{50.0}{60.0}{40.0}  \\
	&cnn+knn  & 3&  61& 35 & 48 & 57 & \boxplot{0.0}{34.8}{47.8}{56.6}{43.4}  \\
	&knn      & 4& 100& 34 & 46 & 53 & \boxplot{0.0}{34.1}{45.5}{52.8}{47.2}  \\
	\hline
pf
	&cliff+knn& 1&   9&  22 & 37 & 52 & \boxplot{0.0}{22.0}{36.6}{52.2}{47.8}  \\
	&mcs+knn  & 2&  19&  39 & 48 & 54 & \boxplot{0.0}{38.5}{47.7}{53.5}{46.5}  \\
	&psc+knn  & 2&  11&  41 & 50 & 57 & \boxplot{0.0}{40.5}{50.0}{56.8}{43.2}  \\
	&cnn+knn  & 3&  61&  42 & 51 & 65 & \boxplot{0.0}{41.7}{51.2}{65.0}{35.0}  \\
	&knn      & 4& 100&  44 & 54 & 63 & \boxplot{0.0}{43.9}{53.5}{63.4}{36.6}  \\
	\hline
\multicolumn{7}{c}{~}&~~~~~0~~~~~~~~50~~~~100
\end{tabular}
\end{tabular}}
\caption{Clean and noisy results for mammography.}
\label{fig:results106}
\end{center}
\end{figure}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

