Author:
Description:
Genomic technology has the potential to provide invaluable insight into the mechanisms underlying several important traits. Unfortunately, this information comes at a cost: the data are high-dimensional and sometimes of poor quality. One potential application of genomics is the diagnosis of diseases, such as Alzheimer’s disease, that have ambiguous and confounding clinical markers. To predict disease status, an algorithm must first be trained on a data set in which disease status is known without error. For incipient Alzheimer’s disease this is rarely the case. To this end, a misclassification algorithm was applied to a data set containing healthy individuals and incipient Alzheimer’s patients to examine the effects of potential misclassification on diagnostic accuracy. Results obtained without invoking the misclassification algorithm showed limited predictive power of the model. When the misclassification algorithm was invoked, a significant increase in the model’s predictive ability was obtained. These results demonstrate the utility of the misclassification algorithm in data sets containing potential misdiagnoses. In addition to potential misdiagnosis, the high dimensionality of genomic data sets can pose substantial problems for statistical analysis. Because of the large number of features in many genomic data sets, explicit modeling of gene interactions is often infeasible. To eliminate the need for simplifying assumptions, a machine learning algorithm, referred to as the ant colony algorithm (ACA), was adapted for the analysis of high-dimensional genomic data. In a study examining the selection of predictive gene expression patterns, the performance of the ACA was compared with that of several standard methodologies. When applied to high-dimensional data sets, the ACA was able to identify small subsets of highly predictive genes, yielding superior prediction accuracy compared with several standard feature selection methods.
In an application involving single nucleotide polymorphism marker ...
Publisher:
uga
Year of Publication:
2007-12
Document Type:
Dissertation ; [Doctoral and postdoctoral thesis]
Language:
eng
Subjects:
ant colony optimization ; genomics ; latent variable model ; logistic regression ; misclassification algorithm
DDC:
006 Special computer methods (computed)
Rights:
public
Relations:
robbins_kelly_200712_phd ; http://purl.galileo.usg.edu/uga_etd/robbins_kelly_200712_phd ; http://hdl.handle.net/10724/24479
Content Provider:
University of Georgia: Athenaeum@UGA
- URL: http://athenaeum.libs.uga.edu/
- Continent: North America
- Country: us
- Latitude / Longitude: 33.951032 / -83.375083
- Number of documents: 20,268
- Open Access: 1 (1%)
- Type: Academic publications
- System: DSpace XOAI
- Content provider indexed in BASE since:
- BASE URL: https://www.base-search.net/Search/Results?q=coll:ftunivgeorgia