Numerous studies have used historical datasets to build
and validate models for estimating software development
effort. Very few used a chronological split (where
projects' end dates are used so that training sets only
contain projects that were completed before the start date
of each project in the validation set), and only one
compared chronological split to random split. Therefore,
the aim of this study is to investigate further and compare
the use of chronological and random splitting. We do so
in the context of comparing cross-company and single-company
models for effort estimation. We used 450
single-company projects and 741 cross-company projects
from the ISBSG Release 10 repository, and estimates
were obtained using manual stepwise regression. We
found that with these data the use of chronological
splitting, and different splitting dates, did not affect
prediction accuracy. We were not able to obtain a
converging set of findings when comparing cross- to
single-company predictions given that different accuracy
measures presented contradictory results.
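
The chronological split described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure as implemented: the `Project` fields, both function names, the strict "ended before started" comparison, and the four example projects are all assumptions made up for illustration, not ISBSG data.

```python
import random
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Project:
    # Hypothetical record; the field names are assumptions.
    name: str
    start: date
    end: date


def chronological_training_set(projects, target):
    """Keep only projects completed before the target project started."""
    return [p for p in projects if p.end < target.start]


def random_split(projects, train_fraction=0.67, seed=0):
    """Date-blind baseline: shuffle, then cut at the requested fraction."""
    shuffled = list(projects)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Invented example projects, for illustration only.
projects = [
    Project("A", date(2001, 1, 1), date(2001, 6, 30)),
    Project("B", date(2001, 3, 1), date(2002, 2, 28)),
    Project("C", date(2002, 1, 1), date(2002, 9, 30)),
    Project("D", date(2003, 1, 1), date(2003, 8, 31)),
]

# Project B was still running when C started, so it is excluded for C.
train_for_c = chronological_training_set(projects, projects[2])
print([p.name for p in train_for_c])  # ['A']
```

A random split may place project B in the training set for project C, even though B's actual effort was not yet known when C began; the chronological rule excludes that case by construction.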
Cite as: Lokan, C. and Mendes, E. (2009). Using Chronological Splitting to Compare Cross- and Single-company Effort Models: Further Investigation. In Proc. Thirty-Second Australasian Computer Science Conference (ACSC 2009), Wellington, New Zealand. CRPIT, 91. Mans, B., Ed. ACS. 35-42.