\section{Conclusion}

Although our techniques could not predict first- or second-year retention with significantly higher accuracy than the baseline, they did improve on it for third-year retention. Based on first-year, beginning-of-term data, they obtained a probability of detection approximately 15\% higher for the class value of $Y$ and 20\% higher for the class value of $N$ than the baseline percentages. In the literature we reviewed, we found no study reporting such a significant improvement over the baseline for third-year retention. In addition, if policies are designed to improve the third-year retention rate (using this predictive model), they will improve not only first- and second-year retention rates but also six-year graduation rates.

For the studied institution, family background and the family's socioeconomic status are critical for a student's third-year persistence. Using feature subset selection methods, we found that the attributes from the ``financial aid'' hypothesis were selected most often as predictors of retention. Although the attributes from the ``performance'' hypothesis were also selected, their predictive power, in isolation, was lower than that of the attributes from the ``financial aid'' hypothesis. None of the attributes from the ``faculty tenure and experience'' hypothesis were selected by the feature subset selectors.

These results could very well be true only for the studied institution; however, by following the approach detailed in this study, other institutions can identify their own top-performing classifiers and important attributes. We recommend: (a) data discretization; (b) feature subset selection with cross-validation and evaluation of the performance across various learners; and (c) treatment learners, such as TAR3, to find succinct strategic actions in complex data. We welcome the opportunity to study data from other institutions and are willing to share the experiment platform used in this study.
