\section{Results}
Figure~\ref{fig:rankingsr} ranks all attribute ranges that, in isolation, predict third-year retention with a probability
higher than the ZeroR baseline (55\%) and are supported by a sufficient number of records. The top six attributes affecting third-year retention were from the financial aid hypothesis: student's wages, parent's adjusted gross income, student's adjusted gross income, mother's income, father's income, and high school percentile. Of those students who reported their wages, those who earned between \$7,850 and \$9,958 had a 79\% retention rate. Similar rules were found for parent's income and adjusted gross income. This suggests that students with stronger financial support are more likely to stay in college than students with weaker financial support.

After these top six attributes, a high school percentile of 81 or greater was an important attribute, with 69\% of such students returning after three years. Other ``performance'' attributes included ACT scores and ranks. This supports the argument that test scores have some predictive power for student retention.
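
The ranking described above can be illustrated with a minimal sketch. It assumes that every attribute has already been discretized into named ranges and that each record carries a Boolean \texttt{retained} flag; the function name \texttt{rank\_ranges}, the \texttt{min\_support} threshold, and the toy records are hypothetical and are not taken from the study data. The baseline is the overall retention rate, which matches the ZeroR rate whenever retained students form the majority class.

```python
from collections import defaultdict


def rank_ranges(records, target="retained", min_support=3):
    """Rank (attribute, range) pairs whose retention rate beats the baseline.

    Assumes attributes are already discretized into range labels.
    """
    n = len(records)
    # Baseline: overall retention rate (equals the ZeroR rate when
    # "retained" is the majority class).
    baseline = sum(1 for r in records if r[target]) / n

    # (attribute, range) -> [number retained, number of records]
    counts = defaultdict(lambda: [0, 0])
    for r in records:
        for attr, rng in r.items():
            if attr == target:
                continue
            tally = counts[(attr, rng)]
            tally[0] += int(bool(r[target]))
            tally[1] += 1

    # Keep ranges with enough supporting records and a rate above baseline.
    ranked = [
        (attr, rng, kept / total, total)
        for (attr, rng), (kept, total) in counts.items()
        if total >= min_support and kept / total > baseline
    ]
    ranked.sort(key=lambda row: row[2], reverse=True)
    return baseline, ranked


# Toy records for illustration only (not study data).
TOY_RECORDS = [
    {"wage_range": "high", "on_campus": "Y", "retained": True},
    {"wage_range": "high", "on_campus": "N", "retained": True},
    {"wage_range": "high", "on_campus": "Y", "retained": True},
    {"wage_range": "low", "on_campus": "N", "retained": False},
    {"wage_range": "low", "on_campus": "Y", "retained": True},
    {"wage_range": "low", "on_campus": "N", "retained": False},
]
```

Calling \texttt{rank\_ranges(TOY\_RECORDS)} returns the baseline rate together with the ranges that exceed it; the support threshold guards against ranges that look predictive only because they cover very few records.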

The TAR3 results, given in Figure~\ref{fig:rx}, are simple theories (treatments) that combine ranges of several attributes to maximize student retention. For example, retention was very high for students whose AGI was in the range from \$7,000 to \$724,724 and whose father's wages were in the range from \$56,289 to \$999,999. Another interesting theory that predicted high retention combined a father's education level of 3 (college) with a student's rank amongst the freshmen cohort between 66.3 and 98.4.

Treatments that predicted student drop-out were based on the total number of classes in which the student was enrolled, enrollment in English 10000 (an introductory college writing and supplemental instruction class), and on-campus living. Students who took fewer than five classes, students who enrolled in English 10000, and students who did not live on campus were each at high risk of drop-out. The chart at the bottom of Figure~\ref{fig:rx} shows the retention percentage for each treatment. For example, students enrolled in English 10000 had a 40\% retention rate in their third year.
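
To make the notion of a treatment concrete, the following sketch evaluates one given treatment: it selects the records that satisfy every range in the conjunction and compares their retention rate with the rate over all records. It does not reproduce the search that TAR3 performs to find treatments. The attribute names and bounds follow treatment 1 of Figure~\ref{fig:rx}; the helper names and the toy records are hypothetical.

```python
def in_range(lo, hi):
    """Return a predicate that is true when lo <= x < hi (missing values fail)."""
    return lambda x: x is not None and lo <= x < hi


# Treatment 1 from the treatments table: a conjunction of two ranges.
TREATMENT_1 = {
    "FinAidSTUDENT_AG": in_range(7_000, 724_724),
    "FinAidFATHER_WAG": in_range(56_289, 999_999),
}


def apply_treatment(records, treatment):
    """Keep only the records that satisfy every condition of the treatment."""
    return [
        r for r in records
        if all(test(r.get(attr)) for attr, test in treatment.items())
    ]


def retention_rate(records, target="retained"):
    """Fraction of records with a true target; NaN for an empty selection."""
    if not records:
        return float("nan")
    return sum(1 for r in records if r[target]) / len(records)


# Toy records for illustration only (not study data).
TOY_STUDENTS = [
    {"FinAidSTUDENT_AG": 8000, "FinAidFATHER_WAG": 60000, "retained": True},
    {"FinAidSTUDENT_AG": 9000, "FinAidFATHER_WAG": 70000, "retained": True},
    {"FinAidSTUDENT_AG": 7500, "FinAidFATHER_WAG": 58000, "retained": False},
    {"FinAidSTUDENT_AG": 3000, "FinAidFATHER_WAG": 60000, "retained": False},
    {"FinAidSTUDENT_AG": 8000, "FinAidFATHER_WAG": 20000, "retained": True},
    {"FinAidSTUDENT_AG": None, "FinAidFATHER_WAG": 90000, "retained": False},
]

selected = apply_treatment(TOY_STUDENTS, TREATMENT_1)
```

Comparing \texttt{retention\_rate(selected)} with \texttt{retention\_rate(TOY\_STUDENTS)} gives the kind of per-treatment contrast against the baseline that the bottom chart of the figure reports.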

Key findings were:
\begin{itemize}
    \item Students' and parents' income levels affected student retention. Third-year retention was higher for students with high income than for students with low income. According to treatment 1, approximately 82\% of students who had an AGI of at least \$7,000 and whose fathers' income was at least \$56,289 returned after three years. Similarly, according to treatment 5, approximately 79\% of students who earned at least \$5,383 and whose parents' AGI was at least \$84,744 returned after three years.
	\item Students with better high school performance amongst their peers had higher chances of retention. According to treatment 2, approximately 81\% of students who had an AGI of at least \$7,000 and a high school percentile of 72 or better returned after three years. According to treatment 4, approximately 79\% of students who had a high school GPA of at least 3.34 and whose parents had an AGI of at least \$84,744 returned after three years.
	\item ACT scores, the rank of these scores amongst peers, and COMPASS scores affected student retention. Students with higher scores and ranks had higher chances of retention. According to treatment 3, approximately 80\% of students who had an AGI of at least \$7,000 and an ACT math score of 21 or better returned after three years. Similarly, according to treatment 6, 77\% of students who scored at least 23 on the ACT composite (or SAT equivalent) and earned at least \$5,383 and less than \$561,500 returned after three years.
	\item Parents' education level had a positive effect on student retention. Students whose parents did not attend college had lower retention than students whose parents did attend college. As given in treatments 7 and 10, a student was highly likely (77\%) to return after three years if (7)~the student's mother attended college, the student had an ACT composite score of 22 or better, and the parents' AGI was at least \$84,744; or (10)~the student's father attended college and the student's percentile rank amongst other freshmen in the cohort was at least 66.3.
	\item Enrolling in fewer than five classes, enrolling in English 10000 (an introductory college writing class), and living off campus had a negative effect on student retention, as given in treatments 11, 12, and 13. It is important to note that enrolling in that English course is not in itself a predictor of non-retention; rather, the students who took this class were at high risk of dropping out. Given funding for further investigation, we would focus more data collection on this high-risk group.
\end{itemize}
%
%Using data mining techniques, we were unable to significantly improve the classification rates for first-year and second-year retention prediction over the baseline, but we achieved approximately 20\% higher probability of detection for third-year retention over the baseline. As we can predict third-year retention probability with high accuracy, based only on the first-year, beginning of term data, this result is significant in student persistence research.    
%
%\begin{figure}
%\begin{center}
%\begin{tabular}{rlr}
%             & $Y$=students enrolled & \\
%$X$= classes & in $X$ classes        & percent\\\hline
%1	&119&	0.45\%\\
%2	&184&	0.70\%\\
%3&	174&	0.66\%\\
%4	&481&	1.83\%\\
%5&	10237	&39.05\%\\
%6&	12861&	49.06\%\\
%7&	1613&	6.15\%\\
%8&	484&	1.85\%\\
%9&	60&	0.23\%\\
%10&	3&	0.01\%\\
%\end{tabular}
%\end{center}
%\caption{Profile of number of classes taken by each student.}\label{ref:profile}
%\end{figure}
%
%\begin{figure}
%\begin{center}
%\includegraphics[width=3in]{gploty.pdf}
%\end{center}
%\caption{Effects of the 13 treatments of Figure~\label{fig:rx}.
%Treatment 0 is the baseline rate for third year students;
%i.e. 55\% retention. Treatments 1 to 10 try to increase the
%baseline retention rate. Treatments 11,12,13 try to decrease the baseline
%retention rates.
%}\label{fig:rxx}
%\end{figure}
%
\begin{figure}
\small
\begin{center}


\begin{tabular}{r@{~}|p{5in}}
\# & Treatment \\\hline
1  & 7,000 $\le$ FinAidSTUDENT\_AG $<$ 724,724   and  56,289 $\le$ FinAidFATHER\_WAG $<$ 999,999 \\
2  & 7,000 $\le$ FinAidSTUDENT\_AG $<$ 724,724  and HS\_PERCENT $\ge$ 72   \\
3  & 7,000 $\le$ FinAidSTUDENT\_AG $<$ 724,724  and   21 $\le$ ACT1\_MATH $<$ 36 \\
4  & 84,744 $\le$ FinAidPARENT\_AGI $<$ 999,999 and       HS\_GPA $\ge$ 3.34 \\
5  & 84,744 $\le$ FinAidPARENT\_AGI $<$ 999,999 and 5,383 $\le$ FinAidSTUDENT\_WA $<$ 561,500 \\
6  & 23 $\le$ MaxACT $<$ 35                    and 5,383 $\le$ FinAidSTUDENT\_WA $<$ 561,500 \\
7  & 22 $\le$ ACT1\_COMP $<$ 35 and 84,744 $\le$ FinAidPARENT\_AGI $<$ 999,999 and\newline FinAidMOTHER\_ED=3\\
8  & 5,383 $\le$ FinAidSTUDENT\_WA $<$  561,500 and  21 $\le$ ACT1\_MATH $<$ 36\\
9  & HS\_GPA $\ge$ 3.34 and 32,570 $\le$ FinAidMOTHER\_WAG $<$ 533,395 \\
10 & FinAidFATHER\_ED=3 and 66.3 $\le$ PercentileRankHSGPA $<$ 98.4\\\hline
11 & 1 $\le$ TotalClass $\le$ 5\\
12 & ENG10=Y \\
13 &  LIVE.ON.CAMP=N 
\end{tabular}

\includegraphics[width=3in]{gploty.pdf}
\end{center}
\caption{
Treatments 1 to 10 are the top ten treatments found by this
analysis that increase the third-year retention rate.
Treatments 11, 12, and 13 are the three treatments found by this
analysis that {\em most decrease} the third-year retention rate.
The effect of each treatment is shown in the bottom plot.
}\label{fig:rx}
\end{figure}

%\begin{center}
%\begin{table}
%\begin{tabular}{llr}
%Name & Description & Importance \\ \hline
%FinAidSTUDENT\_TA & Student's Tax Form Type &     1.0000 \\
%
%FinAidPARENT\_HOU & Parent's Household Size &     0.9677 \\
%
%FinAidMOTHER\_ED & Mother's Education Level &    0.9542 \\
%
%FinAidSTUDENT\_MA &  Student's Marital Status &   0.9486 \\
%
%FinAidFATHER\_ED &  Father's Education Level &    0.9430 \\
%
%FinAidSTUDENT\_HO & Student's  Household Size &   0.9426 \\
%
%FinAidDEPENDENCY & Student's Dependency Status  & 0.9402 \\
%
%FirstGenInd & First Generation Student &    0.9402 \\
%
%FinAidPARENT\_TAX &   Parent's Tax Form Type &  0.9331 \\
%
%FinAidSTUDENT\_AG &   Student's Adjusted Gross Income &   0.8685 \\
%
%FinAidSTUDENT\_WA &   Student's Wage &   0.7956 \\
%
%    HS\_GPA &  High School GPA &   0.7526 \\
%
%FinAidPARENT\_MAR &    Parent's Marital Status &  0.7466 \\
%
%PercentileRankHSGPA &    Percentile Of Hs Gpa Among Freshmen Cohort &  0.6701 \\
%
%FinAidPARENT\_AGI &  Parent's Adjusted Gross Income &   0.6665 \\
%
%FinAidFATHER\_WAG &  Father's Income &    0.6546 \\
%
%FinAidMOTHER\_WAG &   Mother's Income &   0.6183 \\
%
%HS\_PERCENT &  High School Percentile &   0.5092 \\
%
%    MaxACT &   Max Of ACT Score And ACT Equivalent &   0.4709 \\
%
%PercentileRankMaxACT & Percentile Of Max ACT Among  Freshmen Cohort &     0.4510 \\
%
%CUR\_ERLHRS &   Total  Enrolled Hours &  0.4263 \\
%
% ACT1\_COMP &    ACT Comprehensive Score (new) &  0.4112 \\
%
% ACT1\_MATH &    ACT Math Score (new) &  0.4040 \\
%
% ACT1\_ENGL &    ACT English Score (new) &  0.3928 \\
%
%       AGE &   Age of Student at Matriculation &  0.3016 \\
%
%     ENG10 &  Enrolled in English Courses &    0.2900 \\
%
%LIVEONCAMP &   On-Campus Indicator &  0.2773 \\
%
% ADMIT\_MAJ &  Admit Major &   0.2586 \\
%
%COMP\_WRITE &   Compass Writing Score &  0.2558 \\
%
%TotalClasses & Total Number of Enrolled Classes  &  0.2498 
%
%\end{tabular}  
%\caption{Top 30 Attributes}
%\label{tabTop30Attrs}
%\end{table}
%\end{center}

%After selecting the best combination of FSS (oneR) and classifier  (Bayes Network) based on Mann-Whitney test rankings, we found that  attributes given in Table~\ref{tabTop30Attrs} are critical to third-year persistence. Out of these 30 attributes, top ten attributes described student's family background and family's economic condition, and the most selected attribute was the student's tax form type, which came from the FAFSA submission and had these values: 
%\begin{enumerate}
%    \item IRS 1040
%\item  IRS 1040A, 1040EZ
%\item A foreign tax return
%\item A tax return with Puerto Rico, another U.S. territory or a Freely Associated State
%\end{enumerate}
%
%A person is eligible to file 1040A or 1040EZ if he or she makes less than \$100,000, does not itemizes deductions, does not claim dependents, etc. As shown in Figure~\ref{figRET3HSGPAS_TA}, there is a positive correlation between tax form type 2 and third-year retention for lower high school GPA ranges with the exception of the range: 2.645 to 2.905. Third-year retention percentages are significantly higher for the students who (or their parents) have filed a foreign tax return (type 3) or a U.S. territory tax return (type 4) than those who have filed U.S tax return (type 1 or 2).
%
%
%Second attribute in the list was the parent's household size, which had a positive correlation with third-year retention percentage as shown in Figure~\ref{figParentsHHSizeRET3} along with the distribution of the parent's household size. The sample size was low for student's with large number of people in the household, therefore,  retention percentages in such cases is meaningless. 
%
%As previous research has concluded that parent's education level plays an important role in student's dropout decision \citep{Spa70,Tin75,Bea79}, Figure~\ref{figParentsEdLevelRET3} shows that chances of student's persistence are higher if the parent's education level is higher. If the parents did attend college and beyond, father's education level has greater impact than mother's education level on student's persistence. 
%
%
%As shown in the Table~\ref{tabTop30Attrs}, student's marital status does play a role in persistence, especially if the student is separated (denoted by S in the table). Out of 24 students, who indicated in FAFSA as separated,  only four students persisted till the third year. Students income (FinAidSTUDENT\_WA) also affect their persistence; students with wages in the range of \$7850.5-\$9958 had the highest percentages of return (close to 80\%).

\subsection{Strategic Actions}
This study provides insights into student retention using beginning-of-term data. These insights can be used to design effective policies and strategic actions, such as:
\begin{itemize}
	\item Most of the important attributes were related to the socio-economic levels of students and their parents. These cannot be controlled when admitting students, but better support programs and carefully calculated financial-aid packages can be created for students with lower economic capacity.
	\item First-year students should be encouraged to live on campus through incentives, as on-campus students have higher chances of retention.
	\item Special guidance and supplemental instruction in writing and reading should be provided to first-generation students. Parents of first-generation students have considerably lower incomes than parents of non-first-generation students, and according to the results of this study, parental income is a critical factor in student retention even among students with similar academic performance.
	\item Students are placed in supplemental instruction classes, such as English 10000, based on their COMPASS and ACT scores. Because these scores indicate a lack of academic preparedness in some areas, academic advisers correctly place students in such classes; however, if students fail or perform poorly in these classes, it leaves a lasting impression and sets them up for future drop-out, even after three years. It is therefore paramount that advisers not only place students in supplemental instruction classes, but also ensure that students succeed in these classes and improve the skills they lack. Of all the classes considered in this study, English seemed to have the greatest impact. Intuitive as it may be, students need good writing and reading skills to succeed in college.
\end{itemize}
