\input{ekrem/combination-algorithms}

%This section reviewed the effort estimation literature with regards to 
%the major estimation techniques 
%used by empirical research studies on cost estimation within the last 15 years.

\subsubsection{Algorithmic Methods}
Many algorithmic effort predictors have been introduced over the past 15 years. 
For instance, within the class of instance-based algorithms alone, \fig{cbr} shows
that there are thousands of options.
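To make the instance-based idea concrete, the following is a minimal sketch of analogy-based estimation: a new project's effort is predicted as the mean effort of its $k$ most similar past projects. The feature choices (size in KLOC, team experience) and all figures are invented for illustration only.

```python
import math

# Hypothetical historical projects:
# (size_kloc, team_experience_years) -> effort in person-months.
# All values are invented for illustration, not from any real dataset.
HISTORY = [
    ((10.0, 3.0), 24.0),
    ((12.0, 2.5), 30.0),
    ((40.0, 5.0), 90.0),
    ((8.0, 4.0), 18.0),
    ((35.0, 1.0), 110.0),
]

def euclidean(a, b):
    """Distance between two project feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_estimate(new_project, history, k=2):
    """Estimate effort as the mean effort of the k most similar past projects."""
    ranked = sorted(history, key=lambda rec: euclidean(rec[0], new_project))
    neighbors = ranked[:k]
    return sum(effort for _, effort in neighbors) / k

print(knn_estimate((11.0, 3.0), HISTORY))  # mean of the two closest projects: 27.0
```

Real instance-based estimators differ in their distance measure, feature weighting, and neighbor-adaptation rules, which is one source of the thousands of variants noted above.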

As for non-instance methods, many have been proposed in the literature,
including various kinds of regression (simple, partial least squares,
stepwise, regression trees) and neural networks, to name just a
few. Refer to \tion{learners} for further information on these non-instance methods.
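As an example of the simplest of these, the sketch below fits a one-variable linear effort model, $\mathit{effort} = a + b \cdot \mathit{size}$, by ordinary least squares. The data are invented for illustration.

```python
# Minimal sketch of simple linear regression, the most basic of the
# non-instance methods listed above. Data are invented for illustration.
def fit_simple_regression(sizes, efforts):
    """Ordinary least squares for one predictor: returns (intercept, slope)."""
    n = len(sizes)
    mean_x = sum(sizes) / n
    mean_y = sum(efforts) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, efforts))
    sxx = sum((x - mean_x) ** 2 for x in sizes)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

sizes = [10.0, 20.0, 30.0, 40.0]
efforts = [25.0, 45.0, 65.0, 85.0]   # perfectly linear: effort = 5 + 2 * size
a, b = fit_simple_regression(sizes, efforts)
print(a, b)  # 5.0 2.0
```

The other regression variants (partial least squares, stepwise, regression trees) refine this basic scheme by selecting or transforming the predictor variables.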

Combining instance-based and non-instance-based methods can create even more algorithms. For example,
once an instance-based method finds its nearest neighbors, those
neighboring items can be adapted to the problem under investigation using regression or
neural nets~\cite{Li2009}.
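A minimal sketch of that two-step combination, assuming a single size feature and invented data: first select the nearest neighbors (instance-based step), then fit a least-squares line through only those neighbors to adapt their efforts to the new project (regression step).

```python
# Sketch of combining instance- and non-instance-based methods:
# find the nearest neighbors, then fit a local regression on just those
# neighbors to adapt their efforts. Data are invented for illustration.
HISTORY = [(8.0, 18.0), (10.0, 24.0), (12.0, 30.0), (35.0, 100.0), (40.0, 115.0)]

def local_regression_estimate(new_size, history, k=3):
    # Step 1 (instance-based): pick the k past projects closest in size.
    neighbors = sorted(history, key=lambda rec: abs(rec[0] - new_size))[:k]
    # Step 2 (adaptation): least-squares line fitted through the neighbors only.
    n = len(neighbors)
    mx = sum(x for x, _ in neighbors) / n
    my = sum(y for _, y in neighbors) / n
    sxy = sum((x - mx) * (y - my) for x, y in neighbors)
    sxx = sum((x - mx) ** 2 for x, _ in neighbors)
    slope = sxy / sxx if sxx else 0.0
    return my + slope * (new_size - mx)

print(local_regression_estimate(11.0, HISTORY))  # 27.0
```

Substituting a neural net for the local regression in step 2 yields yet another variant, which is how such combinations multiply the space of candidate algorithms.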

\subsubsection{Non-Algorithmic Methods}
A popular alternative to algorithmic approaches (e.g., the instance-based methods of \fig{cbr})
is to draw on the knowledge of an experienced expert. 
Expert-based estimation \cite{Jor2004e} is a human-intensive approach that is most commonly adopted in practice. 
Estimates are usually produced by domain experts based on their own personal experience. The approach is flexible and intuitive in the sense that it can be applied in circumstances where other estimation techniques do not work (for example, when historical data is lacking). 
%Furthermore in many cases requirements are simply unavailable at the bidding stage of a project where a rough estimate is required in a very short period of time.

Jorgensen \cite{Jor2005b} provides guidelines for producing realistic software development effort estimates, derived from industrial experience and empirical studies. One important finding was that {\em combined estimation} offers the most robust and accurate results, since combining estimates (for example, an analogy-based estimate with an expert-based one) captures a broader range of information relevant to the target problem. Data and knowledge relevant to the project's context and characteristics are more likely to improve prediction accuracy.
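The combined-estimation idea can be sketched very simply: pool estimates from independent sources rather than trusting any single one. All figures below are invented for illustration; the median is used here as a robust combiner, though other combination rules (e.g., the mean) are equally plausible.

```python
import statistics

# Sketch of combined estimation: pool estimates from independent sources
# (e.g., an analogy-based model plus two experts). Figures are invented.
def combine_estimates(estimates):
    """Combine via the median, which resists a single wildly-off estimate."""
    return statistics.median(estimates)

analogy_estimate = 26.0          # hypothetical analogy-based model output
expert_estimates = [30.0, 55.0]  # two experts; the second is an outlier
print(combine_estimates([analogy_estimate] + expert_estimates))  # 30.0
```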

%In contrast, software estimation by analogy \cite{shepperd96} is a more formal and systematic approach to expert based estimation using direct comparison with one or more past projects. The distinction between expert based and analogy based in current software engineering research is that the former is a human-intensive approach and can based on variety of different methods such as rules of thumbs, personal recollection of past experiences etc. and the later is a data-intensive approach based on one or more specified potential analogous projects, and can be automated and repeated. The general principle of data-intensive analogy is to reuse software development experience in the form of past project cases stored in a project repository or a database, which includes what are considered to be the important project features of those projects from the point of view of their possible effect on development effort . An estimate of the effort to complete a new software project is made by analogy with one or more previously completed projects, based on the {\em K-Nearest Neighbour Algorithm} \cite{shepperd96,shepperd97}.

Although widely used in industry, expert-based estimation still relies on many ad-hoc methods. Shepperd et al. \cite{shepperd96} do not
consider expert-based estimation an empirical method because the
means of deriving an estimate are not explicit and are therefore neither
repeatable nor easily transferable to other staff. In addition,
knowledge relevance is also a problem: an expert may not be able
to justify estimates for a new application domain. Hence, the rest of this paper does not consider non-algorithmic methods.


%\subsection {Data quality and effort estimation}
%Software effort estimation researchers have focused on the development of advanced algorithms and optimizing models in order to achieve better prediction accuracy measured by different evaluation criteria, so that the result is comparable to many existing prediction accuracy results produced. In many cases, the prediction performance improvements are less than significant in a statistical sense. Keung \cite{keung08b} argues that estimates are probabilistic in nature, they represents the most likely values based on historical data, an error-free prediction in software cost estimation is both empirically and theoretically impossible. Using different prediction systems may result in similar outcomes only with a small variation in the prediction accuracy, this is because different quality and characteristics of the dataset determine the {\em Theoretical Maximum Prediction Accuracy} (TMPA)\cite{keung08b} of any prediction system being used. Some prediction system may be more suitable than the others on a dataset, this is because the  algorithm within a prediction system is more suitable for the dataset characteristic. In Keung \cite{keung08b}'s experiment, to optimize the prediction accuracy, one approach is to dynamically select a method that would produce a favourable result given the actual effort is known for comparison, he used a different number of k-nearest neighbours for each data point estimate, resulting in an improved overall performance accuracy using the entire dataset. It shows that variance in the dataset drastically changed the prediction accuracy, rather than using different prediction models.  
%
%The dataset quality and their characteristics are usually overlooked in the development of a better algorithm for software effort estimation. Studies shown the use of a data preprocessor in the estimation experiment generally show a significant improvement in the estimation accuracy. Dataset homogenization will also result in improved performance in estimation. \cite{mendes_04}\cite{kitchenham_07}
%
%Without looking into the eminent issue of data quality and their characteristics, the research into the development of a better effort predictor had reached its destination. 
%If the above statement sustains, then the selection of a single useful evaluation criteria and the prediction algorithm are less important than the dataset quality itself. Research effort should be more focusing on the evaluation of dataset quality and its characteristics, including variance in the dataset. 
%

%\subsection{Conclusion Instability}
%
%To derive  stable conclusions about which estimator is ``best'',
%there have been attempts in trying to compare model prediction
%performance of different approaches. For example,
%Shepperd and Kododa \cite{shepperd01b} compared regression, rule
%induction, nearest neighbor and neural nets, in an attempt to
%explore the relationship between accuracy, choice of prediction
%system, and different dataset characteristics by using a simulation
%study based on artificial datasets. They also reported a number of
%conflicting results exist in the literature as to which method
%provides better prediction accuracy, and offered possible explanations
%including the use of an evaluation criteria such as MMRE or the
%underlying characteristics of the dataset being used can
%have a strong influence upon the relative effectiveness of different
%prediction models. Their work as a {\em simulation study}
%that took a single dataset, then generated very large artificial datasets
%using the distributions seen on that data. They concluded that:
%\bi
%\item
%{\em None}
%of these existing estimators  were consistently ``best'';
%\item
%The accuracy of an estimate depends on the dataset characteristic
%and a suitable prediction model for the dataset. 
%\ei
%They conclude that it is
%generally {\em infeasible} to determine which prediction technique
%is "best".
% 
%Recent results suggest that it is appropriate to revisit the conclusion instability hypothesis.
%Menzies et al.~\cite{menzies11} applied 158 estimators to various subsets of two
%COCOMO datasets. In a result consistent  with Shepperd and Kododa, 
%they found the precise ranking of the 158 estimators 
%changed according to the random number seeds used to generate train/test sets;
%the performance evaluation criteria used; and which subset of the data was used.
%However, they also found that four methods consistently out-performed the other 154
%across all datasets, across 5 different random number seeds, and across
%three different evaluation criteria. 
%
%Also,  there are now many more public domain datasets, readily available for stability studies. 
%\fig{datasets} lists 20 datasets which have become available in the last
%year at the PROMISE repository of reusable SE data\footnote{\url{http://promisedata.org/data}}.
%Given the availability of this data, it is no longer necessary to work on
%simulated data (as done by Shepperd and Kadoda~\cite{shepperd01b}) or to study merely two datasets (as done by Menzies et al.~\cite{menzies11}).
%The rest of this paper explores conclusion stability over 20 datasets given in \fig{datasets}.

%The literature placed strong emphasis on {\em No General Conclusion}
%as the "accepted wisdom" in the field of software effort estimation.
%Given the instability and conflicting results of many experiments,
%and the covered algorithms and simulated datasets, we doubt that
%some of the results were not general enough to produce general
%stable conclusion. In this study, we use a large number of real
%project datasets against a large number of different algorithms to
%revisit this challenging issue.



