
\begin{figure*}[!t]
\begin{center}
{\small
\begin{tabular}{p{0.95\linewidth}}\hline
Instance-based learners draw conclusions from instances {\em near}
the test instance.
Mendes et al.~\cite{Mendes2003} discuss various {\em near}-ness measures.
\bi
\item[$M_1$]: A simple Euclidean measure;
\item[$M_2$]: A ``maximum distance'' measure that focuses on the single feature that maximizes
inter-project distance;
\item[$M_3$]: More elaborate kernel estimation methods.
\ei
Once the nearest neighbors are found, they must be used to generate an effort
estimate via...
\bi
\item[$R_1$]: Reporting the  median effort value of the analogies;
\item[$R_2$]: Reporting the mean dependent value;
\item[$R_3$]: Reporting a weighted mean where the nearer analogies are weighted higher than those further away~\cite{Mendes2003};
\ei
Prior to running an instance-based learner, it is sometimes recommended
to handle anomalous rows by:
\bi
\item[$N_1$]: Doing nothing at all;
\item[$N_2$]: Using outlier removal~\cite{keung2008b};
\item[$N_3$]: {\em Prototype generation}; i.e. 
replace the data set with a smaller set of most representative examples~\cite{chang74}.
\ei
When computing distances between pairs,  some feature weighting scheme is often applied:
\bi
\item[$W_1$]: All features have uniform weights;
\item[$W_2,\ldots,W_9$]: Some pre-processing scores the relative value of the features
using various methods~\cite{keung2008b,Li2009,hall03}.
The pre-processors may require  {\em discretization}.
\ei
Discretization breaks up continuous ranges
at points $b_1,b_2,\ldots$, where each bin $i$ contains some count $c_i$ of numbers~\cite{gama06}.
Discretization methods include:
\bi
\item[$D_1$]: Equal-frequency, where $c_i=c_j$;
\item[$D_2$]: Equal-width, where $b_{i+1} - b_i$ is a constant;
\item[$D_3$]: Entropy~\cite{FayIra93Multi};
\item[$D_4$]: PKID~\cite{YanWeb02Comparative};
\item[$D_5$]: Do nothing at all.
\ei
Finally, there is the issue of how many $k$ neighbors should be used:
\bi
\item[$K_1$]: $k=1$ is used by Lipowezky et al.~\cite{Lipowezky1998} and
Walkerden \&
Jeffery~\cite{Walkerden1999};
\item[$K_2$]: $k=2$ is used by Kirsopp \& Shepperd~\cite{Kirsopp2002};
\item[$K_3$]: $k=1,2,3$ is used by Mendes et al.~\cite{Mendes2003};
\item[$K_4$]: Li et al. use $k=5$~\cite{Li2009};
\item[$K_5$]: Baker  tuned $k$ to a particular training set using an experimental method~\cite{baker07}.
\ei
\\\hline
\end{tabular}}
\end{center}
\caption{Each combination of the above
$N{\times}W{\times}D{\times}M{\times}R{\times}K$ techniques is one {\em algorithm} for instance-based effort estimation.
This figure shows
$3\times9\times5\times3\times3\times5>6,000$ algorithms for effort estimation.
Some of these combinations can be ruled out straight away.
For example, when $k=1$, all the adaptation mechanisms
return the same result. Also, not all the feature weighting
techniques require discretization, decreasing the space of
options by a factor of five. However, even after discarding
some combinations, there are still hundreds to thousands of algorithms
to explore.  }\label{fig:cbr}
\end{figure*}
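To make the design space of Figure~\ref{fig:cbr} concrete, the following sketch shows one point in that space: the $M_1$ Euclidean distance combined with the $R_1$/$R_2$/$R_3$ adaptation mechanisms and a configurable $k$. This is a minimal illustration, not any author's reference implementation; the function names and the toy project data are invented for the example.

```python
import math
import statistics

def euclidean(a, b):
    """M1: simple Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def estimate(train, test_features, k=3, adapt="mean"):
    """Instance-based effort estimate from the k nearest analogies.

    train is a list of (feature_vector, effort) pairs.
    adapt selects the adaptation mechanism:
      "median" -> R1 (median effort of the analogies)
      "mean"   -> R2 (mean effort of the analogies)
      "weighted" -> R3 (inverse-distance weighted mean)
    """
    ranked = sorted(train, key=lambda p: euclidean(p[0], test_features))
    nearest = ranked[:k]
    efforts = [effort for _, effort in nearest]
    if adapt == "median":                                   # R1
        return statistics.median(efforts)
    if adapt == "mean":                                     # R2
        return statistics.mean(efforts)
    # R3: nearer analogies get higher weight (epsilon avoids div-by-zero)
    weights = [1.0 / (euclidean(f, test_features) + 1e-9) for f, _ in nearest]
    return sum(w * e for w, e in zip(weights, efforts)) / sum(weights)

# Invented toy data: ([feature1, feature2], effort)
projects = [([2.0, 100.0], 10.0), ([3.0, 120.0], 14.0),
            ([8.0, 400.0], 60.0), ([9.0, 420.0], 66.0)]

print(estimate(projects, [2.5, 110.0], k=2, adapt="mean"))  # -> 12.0
```

Note that at $k=1$ all three adaptation mechanisms collapse to the single nearest analogy's effort, which is exactly why the caption discards those redundant combinations.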
