A shortcut function for cross-validating the classifier is also provided.
The software is best suited to very high-dimensional data, for example diagnosing cancer from gene expression data.
If you find this software useful in your work, please cite the software or the papers below.
Download the add-on R package, say mypkg, and type the following command in a Unix console to install it to /my/own/R-packages/:
$ R CMD INSTALL mypkg -l /my/own/R-packages/
To install it to /my/own/R-packages/ directly from CRAN and then load it from that library, type the following commands in the R console:
> install.packages("mypkg", lib="/my/own/R-packages/")
> library("mypkg", lib.loc="/my/own/R-packages/")
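If the package is loaded often, the private library can be added to R's library search path so that lib.loc does not have to be repeated. The following is a minimal sketch only: lib_dir is a hypothetical stand-in (a temporary directory) for /my/own/R-packages/, used so the example runs anywhere.

```r
# Hypothetical stand-in for /my/own/R-packages/ so the sketch runs anywhere.
lib_dir <- file.path(tempdir(), "R-packages")
dir.create(lib_dir, showWarnings = FALSE, recursive = TRUE)

# Prepend the private library to the search path for this session.
# .libPaths() silently drops directories that do not exist, hence dir.create().
.libPaths(c(lib_dir, .libPaths()))

# The private library is now searched first by library() and install.packages().
print(.libPaths())
```

With the real directory in place of lib_dir, library("mypkg") should then find the package without the lib.loc argument. Putting the .libPaths() call in a startup file such as ~/.Rprofile would make it apply to every session.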
The original real-valued colon data in R format: colon.rda. The binary colon data in R format: colon.bin.rda. The data cover 62 patients (40 vs 22) and 2000 genes. Either file can be loaded into the R workspace with the "load" function:
> load("colon.bin.rda")
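It can help to check what "load" actually restored before passing the data on. The sketch below uses a small made-up 0/1 matrix saved to a temporary file instead of colon.bin.rda; the object name toy.bin and its dimensions are illustrative assumptions, not the layout of the real data.

```r
# Stand-in for colon.bin.rda: a tiny 0/1 matrix saved to a temporary file.
# The object name `toy.bin` and its shape are illustrative assumptions.
set.seed(1)
toy.bin <- matrix(rbinom(6 * 5, size = 1, prob = 0.5), nrow = 6, ncol = 5)
rda_path <- tempfile(fileext = ".rda")
save(toy.bin, file = rda_path)
rm(toy.bin)

# load() restores the saved objects into the workspace and
# invisibly returns their names as a character vector.
loaded_names <- load(rda_path)
print(loaded_names)

# Inspect the restored object: dimensions and a compact summary.
print(dim(toy.bin))
str(toy.bin)
```

The same two checks, the value returned by load and dim or str on the restored object, can be applied after loading colon.bin.rda.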
Test how well the above method performs with leave-one-out cross-validation:
> cv.bayes(colon.bin, TRUE, 62, 4, 0.4, 30, 0.8, 5, 30, TRUE, 40)
Results:
The output of the above R command is shown in cv-colon-result. The error rate of this analysis is 0.0967742, i.e. 6 out of 62 cases were misclassified. To my knowledge, this is the lowest error rate reported for the colon data. Only 4 of the 2000 features were selected in each cross-validation iteration. Our method is also very fast, taking 103 seconds in total for the 62-fold cross-validation, including the time for feature selection. Finally, our method is conceptually simple.
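To make the procedure behind these numbers concrete, here is a minimal sketch of leave-one-out cross-validation with feature selection repeated inside every fold. It is not cv.bayes: the classifier is a simple nearest-centroid stand-in, the data are simulated, and all names (fit_predict, x, y, k) are hypothetical. Only the overall pattern and the error-rate arithmetic correspond to the text above.

```r
# Simulated data: n cases, p features, the first 4 features informative.
set.seed(42)
n <- 20
p <- 50
y <- rep(c(0, 1), each = n / 2)
x <- matrix(rnorm(n * p), nrow = n, ncol = p)
x[y == 1, 1:4] <- x[y == 1, 1:4] + 2

# Stand-in classifier (NOT the package's method): select the k features with
# the largest class-mean difference, then assign the nearest class centroid.
fit_predict <- function(x_train, y_train, x_new, k = 4) {
  m0 <- colMeans(x_train[y_train == 0, , drop = FALSE])
  m1 <- colMeans(x_train[y_train == 1, , drop = FALSE])
  # Selection uses the training fold only, so the held-out case
  # never influences which features are kept.
  keep <- order(abs(m1 - m0), decreasing = TRUE)[seq_len(k)]
  d0 <- sum((x_new[keep] - m0[keep])^2)
  d1 <- sum((x_new[keep] - m1[keep])^2)
  as.numeric(d1 < d0)
}

# Leave-one-out: each case is held out once and predicted from the rest.
loo_predictions <- vapply(seq_len(n), function(i) {
  fit_predict(x[-i, , drop = FALSE], y[-i], x[i, ])
}, numeric(1))

n_errors <- sum(loo_predictions != y)
error_rate <- n_errors / n
cat("misclassified:", n_errors, "of", n, " error rate:", error_rate, "\n")

# The reported figure follows the same arithmetic: 6 errors in 62 folds.
reported_rate <- 6 / 62
print(round(reported_rate, 7))
```

Redoing the selection in every fold is what keeps the error estimate honest. Choosing the features once on all cases and cross-validating afterwards would let the held-out case leak into the model.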