CS572: project2b

Welcome to the bleeding edge of search-based software engineering. Right now, WVU is being funded to prepare a report for NASA on the future of software engineering and on which software process options will best reduce effort, development time, bugs, and threats to future software projects. That project is using some prototype "C" code developed by WVU super-star masters student Ous El-Rawas. That prototype code will soon need replacing with a better system written with more design clarity. I don't know how good your LISP coding skills are, but there is a real chance that what you do in this assignment will become an oracle that shows the way to a saner SE enterprise.

That's the good news. The bad news is that the bleeding edge can be a confusing place to live. The following project description is far more verbose than it should be but, as Blaise Pascal said (in French), "I made this letter very long, because I did not have time to make it shorter." So, read on and excuse my verbose-ity-ousity-ness.

At first pass, there is a lot of work here. This is why...
  a) ... there is a whole project devoted to this work
  b) ... doing this work gets you out of the exam

But before you get anxious, none of the following is "clever" work. It's all just "effort-in, effort-out" stuff. So get to it, work at a considered pace, and get lots of sleep.

------------------------------------------------------------------
DATES

Due midnight Monday March 3 (week 8).
Late assignments accepted (with deductions for late work) till March 6.

------------------------------------------------------------------
YOUR TASK

In Project3, you'll be applying AI search engines to SE models. But before you can search a space, you have to define that space.
So, in this project you will get four prediction models running for software's
  - development time (the COCOMO months predictor),
  - effort           (the COCOMO staff-months predictor),
  - quality          (the COQUALMO #defects/KLOC predictor),
  - threats          (the MADACHY THREAT predictor for "dumb ass management" decisions).

------------------------------------------------------------------
SPECIFIC TASKS

This project divides into the following tasks:
  0) lots of reading
  1) get my code
  2) store my code in an on-line version control system
  3) get my code running
  4) debug my code (the effort and months prediction models)
  5) extend my code (with the defects and threats models)
  6) do a walk-through of the code with me

------------------------------------------------------------------
WHAT NOT TO DO (YET)

You'll have enough to do in this project, so an extra task will be deferred till the next project:
  7) Implement a mutator for the threats model.
     (Mutators are already in place for the other three models.)

------------------------------------------------------------------
WHAT TO HAND IN

To prove that all of the above is working, you will submit(*):
  - a list of seven debugging reports (described in task #4 and task #5 below)
  - a list of bugs / enhancements for my code (I'm expecting a long list)
  - a set of revised code files
  - graphs showing the outputs from the different models (from debugging test #7)

(*) Note that you won't "submit" your code. Rather, your code will be stored in a subversion repository located at the url I send out. So you won't "submit" the system. Instead, you will just commit your code regularly and, on the submission date, I'll just grab a copy.

Please note:
  - Your code will be in some directory "nova".
  - Your list of (unfixed) bugs will be stored in nova/doc/proj2/bugs.txt
  - Your list of (fixed) bugs will be stored in nova/doc/proj2/fixes.txt
  - Your list of enhancements will be stored in nova/doc/proj2/fixes.txt
  - Your plots will be stored in the nova/doc/proj2/plots directory
  - Your debugging reports will be stored in nova/doc/proj2/debug1.txt, debug2.txt, etc.

------------------------------------------------------------------
HOW TO DO TASK 0) : LOTS OF READING

  a) pages 1 to 8 of http://now.unbox.org/trunk/doc/06/xomo2/xomo101.pdf
  b) the paper "business case for software engineering"
  c) the tutorial notes on this code in http://menzies.us/csx72/nova.php?xomo
  d) an old implementation of this stuff that you might find useful:
     nova/etc/xomo.awk
       - the effort model is on lines 322 to 366
       - the months model is missing
       - the defects model is on lines 691 to 744
       - the risk model is on lines 750 to 791
  e) the current lisp implementation that you will have to extend:
     nova/lisp/xomo
       - the defects model is xomo/defects-model.lisp ; currently empty
       - the effort model is xomo/effort-model.lisp
       - the months model is xomo/months-model.lisp
       - the threats model is xomo/threats-model.lisp ; currently empty
  f) a sample NOVA application that does the dumbest search possible:
     nova/apps/search1.lisp
     (it just thrashes around and, if it finds something better, jumps to it)
       - (search1) ; high-level driver. does some inits, calls (run1)
       - (run1 &optional (n 100) (r 1000))
           - run1 is called "n=100" times
           - each time, it calls "step1"
       - (step1)
           - for r=1000 repeats...
             - (setup) ; resets the cache
             - (run)   ; asks multiple models for a score
             - (score) ; combines the scores into one figure
               - see the call "euclidean" that only combines effort and months
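The search1 strategy in (f) above (reset, probe at random, and jump to anything better) can be sketched in Python. Everything below is illustrative; the score and candidate functions are hypothetical stand-ins for the model calls in search1.lisp.

```python
import random

def random_search(score, new_candidate, n=100, r=1000):
    """Dumbest search possible: sample candidates at random and
    remember the best one seen (here, lower score = better)."""
    best, best_score = None, float("inf")
    for _ in range(n):              # like (run1): n outer resets
        for _ in range(r):          # like (step1): r random probes
            candidate = new_candidate()
            s = score(candidate)
            if s < best_score:      # found something better? jump to it
                best, best_score = candidate, s
    return best, best_score

# Toy usage: minimize (x-3)^2 over random x drawn from [0,10]
best, best_score = random_search(lambda x: (x - 3) ** 2,
                                 lambda: random.uniform(0, 10))
```

In Project3 you will replace this blind thrashing with smarter search engines, but the score-and-compare skeleton stays the same.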
  svn export http://unbox.org/wisp/tags/nova/0.12/ nova

------------------------------------------------------------------
HOW TO DO TASK 2) : STORE MY CODE IN AN ON-LINE VERSION CONTROL SYSTEM

Using the url and login name I gave you...

  svn import nova http://unbox.org/08aiXX -m "initial import"

For notes on subversion, see the end of this document.

------------------------------------------------------------------
HOW TO DO TASK 3) : GET MY CODE RUNNING

  cd nova/lisp
  emacs         ; start slime
  > (load "apps/search1.lisp")
  > (search1)

Hopefully you'll see a lot of output. Watch the "effort" and "tdev" figures; if you are lucky, they should be decreasing.

------------------------------------------------------------------
HOW TO DO TASK 4) : DEBUG MY CODE (THE EFFORT AND MONTHS PREDICTION MODELS)

Before debugging the BIG model, you must micro-debug the small model.

nova/lisp/xomo/model-defaults.lisp defines what we know about the slopes associated with each variable:

  :range  is some project-specific parameter; e.g. acap=1 means your
          analyst capabilities are very low
  :effort is the effect of a :range value on the effort
  :rin    is a linear effect on defect INTRODUCTION at requirements time
  :din    is a linear effect on defect INTRODUCTION at design time
  :cin    is a linear effect on defect INTRODUCTION at coding time
  :rsf    is an exponential effect on defect INTRODUCTION at requirements time
  :dsf    is an exponential effect on defect INTRODUCTION at design time
  :csf    is an exponential effect on defect INTRODUCTION at coding time
  :rout   is the effect on defect REMOVAL at requirements time
  :dout   is the effect on defect REMOVAL at design time
  :cout   is the effect on defect REMOVAL at coding time

Debugging test 1:
  - The slopes for rin, din, cin, rsf, dsf, rout, dout, cout, etc. come
    from the maximum slopes in tables 9, 13, 14, 15 of xomo101.pdf.
    Those slopes are summarized in lines 75 to 101. Are they correct?
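While debugging the effort and months models, it helps to have the core COCOMO II equations in view. The sketch below uses the published COCOMO II.2000 calibration constants from the literature; I have not checked them against the constants in the lisp code, so treat them as assumptions to verify, not ground truth.

```python
from math import prod

# Published COCOMO II.2000 calibration constants (assumed here;
# verify against model-defaults.lisp and xomo101.pdf).
A, B, C, D = 2.94, 0.91, 3.67, 0.28

def effort_pm(ksloc, scale_factors, effort_multipliers):
    """Staff-months: PM = A * Size^E * prod(EMs), E = B + 0.01*sum(SFs)."""
    e = B + 0.01 * sum(scale_factors)
    return A * ksloc ** e * prod(effort_multipliers)

def months_tdev(pm, scale_factors):
    """Calendar months: TDEV = C * PM^F, F = D + 0.2 * 0.01 * sum(SFs)."""
    f = D + 0.2 * 0.01 * sum(scale_factors)
    return C * pm ** f

# Illustrative 100 KSLOC project: five scale factors, 17 effort multipliers
pm = effort_pm(100, [4.0] * 5, [1.0] * 17)
tdev = months_tdev(pm, [4.0] * 5)
```

If your lisp output disagrees wildly with hand calculations like these, suspect the slopes or the ranges first.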
Debugging test 2:
  - The specific slopes for the cocomo parameters are set in
    model-defaults.lisp. Have the right ranges and slopes been assigned
    to the right variables?

Debugging test 3:
  - If the slopes are right, and if they have been set to the right
    values, then it should be possible to recreate tables similar to
    9, 13, 14, 15 (they won't be exactly the same due to all the random
    variations, but you should get close). To test that, you'll need the
    following code snippet:

      (init-db)
      (! 'stor) ; see what we have guessed about "stor"
      => #(S(EM :RANGE 6 :EFFORT 0.2344466 :RIN 0.0138 :DIN 0.0998 :CIN 0.05178000))
      (em2effort 'stor) ; what is the impact of stor=6 on effort?
      => 1.703

    Now, is 1.703 anything like stor=xh in fig 9 of xomo101.pdf?
    Go check. Explain any differences.

Debugging test 4:
  - 20 times, pick any random part of my code and write a tiny "eg"
    that exercises that bit. Does the output make sense?

Debugging test 5:
  - For the effort and months models, you can compare outputs from your
    system to
    http://sunset.usc.edu/research/COCOMOII/expert_cocomo/expert_cocomo2000.html.
    The source code for that system is at /nova/etc/cocomo.c.
    Try to explain any differences between your code and expert cocomo.

------------------------------------------------------------------
HOW TO DO TASK 5) : EXTEND MY CODE (WITH THE DEFECTS AND THREATS MODELS)

The files lisp/xomo/defects-model.lisp and threats-model.lisp are
incomplete. Fix that. The awk code in
http://menzies.us/csx72/src/nova/etc/xomo contains working versions of
the missing models. See:
  - the defects model is on lines 691 to 744
  - the risk model is on lines 750 to 791

To test the defect model, well, pray a little. To test the threats
model, see below.

Debugging test 6:
  - To test the risk model, run
    http://sunset.usc.edu/research/COCOMOII/expert_cocomo/expert_cocomo2000.html.
    The source code for that system is at /nova/etc/cocomo.c.
    Try to explain any differences between your code and expert cocomo.
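The 1.703 figure in debugging test 3 can be checked by hand. Judging from the numbers in the snippet (:RANGE 6, :EFFORT 0.2344466, result 1.703), em2effort appears to apply its slope linearly around the nominal setting of 3. That linear form is my inference from the worked example, not something read from the lisp source, so confirm it against effort-model.lisp.

```python
def em2effort(setting, slope, nominal=3):
    """Effort multiplier for one COCOMO attribute: 1.0 at the nominal
    setting, moving linearly with the slope as the setting changes.
    (Linear form inferred from the stor example in debugging test 3.)"""
    return 1.0 + slope * (setting - nominal)

em2effort(6, 0.2344466)   # ~ 1.703, matching the snippet above
```

Spot-checks like this one, against figures you can compute by hand, are exactly what debugging test 4 is asking for.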
Once you have all four models going (effort, months, defects, threats), you need to combine these four scores into one value. This one value will be used by your search engine in Project3. We call this combination the "energy" function.

The energy of a project can be thought of as the normalized euclidean distance of the project from the point of minimal energy (i.e. the origin) in nth-dimensional space. In this space, the axes are the normalized and weighted project stats. These stats are the Effort (COCOMO II), Defects (COQUALMO), Threat and Months. Each of these stats is normalized and then weighted with its corresponding weight before being included in the energy calculation.

The default value of these weights is 1.0 for all except the Defects weight. The Defects weight is made of two parts: one is constant and defaults to 1.0, and the other depends on the reliability level of the software project. This is further explained below.

  Effort weight = alpha
  Defect weight = beta + RD
  Threat weight = gamma
  Months weight = delta

The default for alpha, beta, gamma and delta is 1.0.

  RD = relydefect^(rely-3)

  relydefect: set by default to 1.8 (can be set by the user); this is
              the factor relating the set reliability of a project
              (rely) to the defects.
  rely:       the required reliability level of a software project.
              This varies from 1 (very low) to 5 (very high) in the
              COCOMO II model.
For the default relydefect = 1.8:
  if rely is very low  (1) => RD = 1.8^-2 ~ 0.31
  if rely is low       (2) => RD = 1.8^-1 ~ 0.56
  if rely is nominal   (3) => RD = 1.8^0  = 1
  if rely is high      (4) => RD = 1.8^1  = 1.8
  if rely is very high (5) => RD = 1.8^2  = 3.24

Unnormalized Project Energy
  = square root of the sum of the squares of the normalized and
    weighted project stats
  = sqrt( (Effort*alpha)^2 + (Defect*(beta+RD))^2 +
          (Threat*gamma)^2 + (Months*delta)^2 )

Final Project Energy
  = Unnormalized Project Energy / sqrt( sum of the squared weights )
  = Unnormalized Project Energy /
    sqrt( alpha^2 + (beta+RD)^2 + gamma^2 + delta^2 )

Debugging test 7:
  - Note that all your models should generate random output according
    to the guesses made about ranges, rin, din, cin, rsf, dsf, rout,
    dout, cout, etc.
  - To test that, generate ten graphs showing 1000 outputs from 1000
    randomly selected inputs:
      Ee: energy vs effort
      Ed: energy vs defects
      Et: energy vs threats
      Em: energy vs months
      em: effort vs months
      ed: effort vs defects
      et: effort vs threats
      md: months vs defects
      mt: months vs threats
      td: threats vs defects

To generate the plots, generate a two-column ascii file (tab separated) called, e.g., em.dat, then:

  show() {
      sort $1.dat > $1.sorted
      cat <<EOF > $1.plt
  set terminal dumb
  set xlabel "$2"
  set ylabel "$3"
  set title "$2 vs $3"
  unset key
  plot "$1.sorted"
  EOF
      gnuplot $1.plt > $1.plot
  }
  show Ee energy  effort
  show Ed energy  defects
  show Et energy  threats
  show Em energy  months
  show em effort  months
  show ed effort  defects
  show et effort  threats
  show md months  defects
  show mt months  threats
  show td threats defects

------------------------------------------------------------------
HOW TO DO TASK 6) : DO A WALK-THROUGH OF THE CODE WITH ME

Strange to say, a "walk-through" is usually conducted by people sitting down. Odd, heh?

The walk-through will have the same rules as proj1: round robin on the project members, me pointing to random parts of the code, asking questions.
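Returning to the energy function defined in Task 5: the RD and energy formulas above can be sketched in a few lines of Python. This is a minimal sketch of the stated formulas only; the four stats passed in are assumed to be already normalized, as the text requires, and in the real system they come from the four models.

```python
from math import sqrt

def rd(rely, relydefect=1.8):
    """Reliability-dependent part of the defects weight: relydefect^(rely-3)."""
    return relydefect ** (rely - 3)

def energy(effort, defects, threat, months, rely=3,
           alpha=1.0, beta=1.0, gamma=1.0, delta=1.0):
    """Normalized, weighted euclidean distance of a project from the origin."""
    w_defect = beta + rd(rely)
    unnormalized = sqrt((effort * alpha) ** 2 +
                        (defects * w_defect) ** 2 +
                        (threat * gamma) ** 2 +
                        (months * delta) ** 2)
    return unnormalized / sqrt(alpha ** 2 + w_defect ** 2 +
                               gamma ** 2 + delta ** 2)
```

With the defaults (rely nominal, so RD = 1 and the defects weight is 2.0), a project whose four normalized stats are all 1.0 gets an energy of exactly 1.0, and a project at the origin gets 0.0.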
------------------------------------------------------------------
TUTORIAL ON SUBVERSION

Imagine never losing code ever again, being able to back up to old versions, being able to share code with others. Welcome to the wonderful world of version control.

For the subversion tool:

  * You should start the day by updating your local work from the repository:
      cd nova
      svn update

  * You should end the day by committing your changes:
      cd nova
      svn commit -m "my new changes"

  * Use the "svn status" command to find what files are new in your repository:
      cd nova
      svn status

  * Use the "svn add" command to add local files to your local version of the repository:
      cd nova
      svn add x

CAUTION

When working with a directory under subversion control, there is a magic hidden .svn directory that you can corrupt if you aren't careful. And by "careful" I mean not moving things around without telling subversion about it. So don't move, copy or delete files using (a) any graphical browser for your directories or (b) standard unix commands. Instead, prefix those operations with "svn", e.g.:

  svn cp old new
  svn mv here toThere
  svn rm this

SEE A FILE'S HISTORY

If you want to see who has edited a certain file, or a directory tree, the log command is what you need:

  svn log