Numerous methods exist for estimating effort.  While this paper cannot comprehensively assess all of them, it assesses a subset of algorithms using a combination of algorithms (COMBA) approach.  Under this approach, different data preprocessors are combined with different learner algorithms, so that adding a new preprocessor or learner multiplies the number of algorithms generated.  Each combination sends the data to the preprocessor, then sends that output to the learner to obtain an estimate.
\\
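The combination scheme can be sketched as follows.  The preprocessor and learner shown here (a log transform and a mean-of-efforts stand-in) are illustrative only, not the algorithms evaluated in this paper:

```python
import math

def identity(rows):
    # "none" preprocessor: data passes through unchanged
    return rows

def log_transform(rows):
    # "log" preprocessor: replace each feature with its natural log
    return [[math.log(v) for v in row] for row in rows]

def mean_learner(train_rows, train_efforts, test_row):
    # Trivial stand-in learner: predict the mean historical effort.
    return sum(train_efforts) / len(train_efforts)

def comba_estimate(pre, learn, train_rows, train_efforts, test_row):
    # Data flows preprocessor -> learner, as described in the text.
    processed = pre(train_rows + [test_row])
    return learn(processed[:-1], train_efforts, processed[-1])
```

Adding one new preprocessor or learner to the pools multiplies the number of available pipelines, which is the point of the COMBA design.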
Multiple methods are used within the literature to assess the accuracy of estimation across a dataset.  As this paper seeks to evaluate these different measures, multiple measures of error will be used and compared.  In addition, error measures that synthesize multiple error measures will be assessed to see whether a trend can be uncovered.
\\ 
These different error measures will be compared using Mann-Whitney (Wilcoxon rank-sum) statistical tests to determine which algorithms perform better and whether their error distributions differ significantly.  In each pairwise comparison, an algorithm can win, lose, or tie (if the distributions are similar).  Algorithms will be ranked by their win and loss counts from these comparisons.
\\
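The win/tie/loss bookkeeping can be sketched as follows.  The significance check here is a crude median-gap placeholder kept only to make the sketch self-contained; a real run would substitute the Mann-Whitney test (e.g. `scipy.stats.mannwhitneyu`):

```python
from statistics import median

def significantly_different(a, b, gap=1.0):
    # Placeholder for the Mann-Whitney test, NOT a real statistical test.
    return abs(median(a) - median(b)) > gap

def win_tie_loss(errors_by_algo):
    # errors_by_algo: {name: [error per project]}; lower error is better.
    record = {name: {"win": 0, "tie": 0, "loss": 0} for name in errors_by_algo}
    names = list(errors_by_algo)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            ea, eb = errors_by_algo[a], errors_by_algo[b]
            if not significantly_different(ea, eb):
                record[a]["tie"] += 1
                record[b]["tie"] += 1
            elif median(ea) < median(eb):
                record[a]["win"] += 1
                record[b]["loss"] += 1
            else:
                record[b]["win"] += 1
                record[a]["loss"] += 1
    return record
```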
The ranking will also be done across multiple datasets, both to provide a broader reference and to potentially gain information about the datasets themselves: which algorithms perform well on a given dataset could be indicative of that dataset's terrain.
\\
This section will discuss the preprocessors, learners, and error measures used, as well as the datasets on which they were used.

\subsection{Preprocessors}
Before being passed to the learners, the data was run through a preprocessor.  Some preprocessors change the values of the data, and others change the shape of the dataset by converting it into a representative model.  The specific preprocessors used are discussed below.
\subsubsection{None} 
The data is passed to the learner without any preprocessing performed.\\
Abbreviation in results: none
\subsubsection{Logarithmic}
The features in the data are replaced with the natural log of their value.  This reduces the distances between feature values, which affects many of the learners the data is sent to.\\
Abbreviation in results: log
\subsubsection{Equal Frequency Discretization}
The numeric data is discretized into a number of bins, with each bin holding an equal number of items (an equal frequency).  Values are assigned to bins in sorted numeric order.  For example, a 2-bin equal frequency discretization of the data\\
$\{1, 4, 2, 8, 3, 9\}$\\
would produce the bins $\{1, 2, 3\}$ and $\{4, 8, 9\}$.  Equal Frequency Discretization is performed using 3 bins and 5 bins in this experiment.\\
Abbreviation in results: freq3bin, freq5bin
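A minimal sketch of equal frequency discretization, reproducing the example above (when the count does not divide evenly, the earlier bins absorb the extra items, an assumption of this sketch):

```python
def equal_frequency_bins(values, n_bins):
    # Sort values, then split into n_bins groups of (near) equal size.
    ordered = sorted(values)
    size, extra = divmod(len(ordered), n_bins)
    bins, start = [], 0
    for i in range(n_bins):
        end = start + size + (1 if i < extra else 0)
        bins.append(ordered[start:end])
        start = end
    return bins
```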
\subsubsection{Equal Width Discretization}
The numeric data is discretized into a number of bins, with each bin spanning an equal range of values.  The width of each bin is computed as:\\
$\frac{\text{max value} - \text{min value}}{\text{number of bins}}$\\
To provide an example,\\
$\{1, 2, 3, 4, 8, 9\}$\\
passed through a 2-bin Equal Width Discretization would produce a bin containing $\{1, 2, 3, 4\}$ and a bin containing $\{8, 9\}$.  Equal Width Discretization is done in this experiment with 3 and 5 bins.\\
Abbreviation in results: width3bin, width5bin
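A minimal sketch of equal width discretization, reproducing the example above:

```python
def equal_width_bins(values, n_bins):
    # Bin width is (max - min) / n_bins; each value maps to the bin
    # whose range contains it.
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    bins = [[] for _ in range(n_bins)]
    for v in values:
        # The maximum value lands on the upper edge; clamp into last bin.
        index = min(int((v - lo) / width), n_bins - 1)
        bins[index].append(v)
    return bins
```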
\subsubsection{Normalization}
Each numeric entry in the data is replaced with a normalized value, computed as:\\
$\frac{\text{value} - \text{min value}}{\text{max value} - \text{min value}}$\\
Abbreviation in results: norm
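The normalization formula above maps every feature onto $[0, 1]$, as a short sketch shows:

```python
def normalize(values):
    # Min-max normalization: (value - min) / (max - min).
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]
```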
\subsubsection{Stepwise Regression}
A stepwise regression is performed on the data.  This removes attributes that do not fall within a certain significance tolerance, in order to reduce noise in the data.\\
Abbreviation in results: SWReg
\subsubsection{Principal Component Analysis}
Principal Component Analysis reduces the data to a set of features which are not correlated with one another.\\
Abbreviation in results: PCA
\subsubsection{Sequential Filter Sampler}
Sequential Filter Sampling filters the data down to a set of relevant instances by applying a filter to the data, testing the changes, and then applying a different filter to sample a subset of the overall data.  This sample is then passed on in place of the full dataset.\\
Abbreviation in results: SFS

\subsection{Learners}
After the data has been preprocessed, it is passed to a learner which makes an estimate on the effort required.
\subsubsection{Stepwise Regression} 
Stepwise Regression is used as a learner as well as a data preprocessor.  After a model has been built, the project under test is placed in that model to predict its effort value.\\
Abbreviation in results: SWReg
\subsubsection{Simple Linear Regression}
Simple Linear Regression applies n-dimensional linear regression to the data, attempting to determine how the attributes correlate with a given effort value.  The instance under test is then placed along the regression to estimate its effort value.\\
Abbreviation in results: SLReg
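As an illustration of the idea only, the one-feature case of such a regression is the ordinary least squares fit (the learner itself operates over all attributes):

```python
from statistics import mean

def fit_slr(xs, ys):
    # Closed-form ordinary least squares for one feature:
    # slope = cov(x, y) / var(x), intercept = mean(y) - slope * mean(x).
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

def slr_estimate(xs, ys, x_new):
    # Place the new instance along the fitted line.
    slope, intercept = fit_slr(xs, ys)
    return slope * x_new + intercept
```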
\subsubsection{Partial Least Squares Regression}
Partial Least Squares Regression projects the value being predicted and the known values onto a new space to create a hyperplane relating them.  The project under test is then placed on this hyperplane, and the predicted effort value is read off.\\
Abbreviation in results: PlSR
\subsubsection{Principal Component Regression}
Instead of using all features besides effort to make predictions, Principal Component Regression reduces the space to a set of features with high variance.  Regression is then performed on this space, and the unknown project is placed in the new space to predict its effort value.\\
Abbreviation in results: PCR
\subsubsection{Single Nearest Neighbor}
Single Nearest Neighbor finds the project in the dataset with the smallest Euclidean distance to the unknown project, and uses that project's effort value as the estimate.\\
Abbreviation in results: 1NN
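A minimal sketch of the 1NN learner over numeric feature vectors:

```python
import math

def euclidean(a, b):
    # Standard Euclidean distance between two feature vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def one_nn_estimate(train_rows, train_efforts, test_row):
    # Return the effort of the single closest historical project.
    best = min(range(len(train_rows)),
               key=lambda i: euclidean(train_rows[i], test_row))
    return train_efforts[best]
```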
\subsubsection{Analogy Based Estimation}
Analogy Based Estimation finds historical instances similar to the unknown instance.  In the COMBA system used, Analogy Based Estimation finds the five nearest neighbors of the unknown project and uses their median as the estimated effort value.\\
Abbreviation in results: ABE0
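A sketch of ABE0 as described above, assuming Euclidean distance is used to rank neighbors (the usual choice for analogy based estimation):

```python
import math
from statistics import median

def abe0_estimate(train_rows, train_efforts, test_row, k=5):
    # Rank historical projects by distance to the unknown project,
    # then return the median effort of the k nearest.
    def dist(row):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(row, test_row)))
    ranked = sorted(range(len(train_rows)), key=lambda i: dist(train_rows[i]))
    return median(train_efforts[i] for i in ranked[:k])
```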

\subsection{Error Measures}
For each dataset, leave-one-out analysis is performed: each instance in the dataset is removed in turn and tested under all available preprocessor and learner combinations, and the estimates are stored.  Once all estimates have been found, collective error measures are gathered for each combination of preprocessor and learner on each dataset.  The error measures used are detailed below.
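The leave-one-out protocol can be sketched as follows, where `estimate` stands for any preprocessor-and-learner pipeline:

```python
def leave_one_out(rows, efforts, estimate):
    # estimate(train_rows, train_efforts, test_row) -> predicted effort.
    # Each instance is held out in turn; the rest form the training set.
    predictions = []
    for i in range(len(rows)):
        train_rows = rows[:i] + rows[i + 1:]
        train_efforts = efforts[:i] + efforts[i + 1:]
        predictions.append(estimate(train_rows, train_efforts, rows[i]))
    return predictions
```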
\subsubsection{Mean Absolute Residual (MAR)}
The absolute residual error of an estimate is computed as:\\
$|actual - predicted|$ \\
After the absolute residuals have been calculated across a dataset, their mean is taken and reported for the error value.
\subsubsection{Mean Magnitude of Relative Error (MMRE)}
The magnitude of relative error is calculated across a dataset, computed as:\\
$\frac{|actual - predicted|}{actual}$\\
After the magnitude of relative error for each instance in the dataset is computed, their mean is reported.
\subsubsection{Mean Magnitude of Error Relative to the Estimate (MMER)}
The magnitude of relative error is computed, but in contrast to MRE it is computed relative to the estimate, as follows:\\
$\frac{|actual - predicted|}{predicted}$\\
After the MER is computed for each instance in the dataset, their mean is computed and reported.
\subsubsection{Median Magnitude of Relative Error (MDMRE)}
As with MMRE, but the median of the MRE values across the dataset is computed and reported instead of the mean.
\subsubsection{Pred25}
The number of instances in a dataset whose predicted value had an MRE score of less than 25\% is divided by the total number of instances, and reported as the Pred25 score.
\subsubsection{Mean Balanced Relative Error (MBRE)}
Balanced relative error is computed as:\\
$\frac{|actual - predicted|}{\min(actual, predicted)}$\\
Once the BRE scores have been computed for each instance in the dataset, their mean is computed and reported.
\subsubsection{Mean Inverted Balanced Relative Error (MIBRE)}
Inverted balanced relative error is computed as:\\
$\frac{|actual - predicted|}{\max(actual, predicted)}$\\
Once the IBRE scores have been computed for each instance in the dataset, their mean is computed and reported.
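All of the error measures above can be computed together from paired lists of actual and predicted effort values:

```python
from statistics import mean, median

def error_measures(actual, predicted):
    pairs = list(zip(actual, predicted))
    ar   = [abs(a - p) for a, p in pairs]           # absolute residual
    mre  = [abs(a - p) / a for a, p in pairs]       # relative to actual
    mer  = [abs(a - p) / p for a, p in pairs]       # relative to estimate
    bre  = [abs(a - p) / min(a, p) for a, p in pairs]
    ibre = [abs(a - p) / max(a, p) for a, p in pairs]
    return {
        "MAR":    mean(ar),
        "MMRE":   mean(mre),
        "MMER":   mean(mer),
        "MdMRE":  median(mre),
        "Pred25": sum(1 for e in mre if e < 0.25) / len(mre),
        "MBRE":   mean(bre),
        "MIBRE":  mean(ibre),
    }
```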

\subsection{Data Sets}
The datasets used in this experiment were obtained from the PROMISE data repository, which provides freely available software engineering data from real-world projects.  The COMBA software platform used could not handle discrete elements in the data, so discrete elements were removed before the data was sent to the preprocessors.  Information about each dataset, after removing discrete elements, is provided below.
\begin{center}
\begin{tabular} {| l || c | r |}
\hline
Dataset & Features & Instances \\ \hline
Cocomo81o & 17 & 21 \\ \hline
Cocomo81s & 17 & 11 \\ \hline
Finnish & 8 & 38 \\ \hline
Miyazaki94 & 8 & 48 \\ \hline
DesharnaisL2 & 11 & 25 \\ \hline
DesharnaisL3 & 11 & 10 \\ \hline
Nasa93\_center\_1 & 17 & 12 \\ \hline
Albrecht & 8 & 24 \\ \hline
Telecom1 & 3 & 18 \\ 
\hline
\end{tabular}
\end{center}
