changeset 4:4494c973f643 draft default tip
Deleted selected files
author | testtool |
---|---|
date | Fri, 13 Oct 2017 10:15:08 -0400 |
parents | a5a5716e0317 |
children | |
files | accuracy.R accuracy.xml |
diffstat | 2 files changed, 0 insertions(+), 104 deletions(-) |
--- a/accuracy.R Fri Oct 13 10:14:29 2017 -0400
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,48 +0,0 @@
-require(caret, quietly = TRUE)
-
-args <- commandArgs(trailingOnly = TRUE)
-
-input = args[1]
-p = args[2]
-output1 = args[3]
-output2 = args[4]
-
-dataset <- read.csv(input, header=TRUE)
-
-validation_index <- createDataPartition(dataset$Species, p=p, list=FALSE)
-
-validation <- dataset[-validation_index,]
-
-validdataset <- dataset[validation_index,]
-
-percentage <- prop.table(table(validdataset$Species)) * 100
-cbind(freq=table(validdataset$Species), percentage=percentage)
-
-output_summary <- summary(validdataset)
-write.csv(output_summary,output1)
-
-control <- trainControl(method="cv", number=10)
-metric <- "Accuracy"
-
-# a) linear algorithms
-set.seed(7)
-fit.lda <- train(Species~., data=validdataset, method="lda", metric=metric, trControl=control)
-# b) nonlinear algorithms
-# CART
-set.seed(7)
-fit.cart <- train(Species~., data=validdataset, method="rpart", metric=metric, trControl=control)
-# kNN
-set.seed(7)
-fit.knn <- train(Species~., data=validdataset, method="knn", metric=metric, trControl=control)
-# c) advanced algorithms
-# SVM
-set.seed(7)
-fit.svm <- train(Species~., data=validdataset, method="svmRadial", metric=metric, trControl=control)
-# Random Forest
-set.seed(7)
-fit.rf <- train(Species~., data=validdataset, method="rf", metric=metric, trControl=control)
-
-results <- resamples(list(lda=fit.lda, cart=fit.cart, knn=fit.knn, svm=fit.svm, rf=fit.rf))
-output_results <- summary(results)
-
-write.csv(as.matrix(output_results),output2)
--- a/accuracy.xml Fri Oct 13 10:14:29 2017 -0400
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,56 +0,0 @@
-<tool id="accuracy" name="accuracy" version="1.0.0">
- <description>model creation and accuracy estimation</description>
- <requirements>
- <requirement type="package" version="6.0_76">r-caret</requirement>
- </requirements>
- <command detect_errors="aggressive">
- Rscript '$__tool_directory__/accuracy.R' '$input' '$p' '$output1' '$output2'
- </command>
-<inputs>
- <param format="csv" type="data" name="input" value="" label="Input dataset" help="
- e.g. iris species table
-Sepal.Length,Sepal.Width,Petal.Length,Petal.Width,Species
-5.1,3.5,1.4,0.2,Iris-setosa
-4.9,3,1.4,0.2,Iris-setosa
-4.7,3.2,1.3,0.2,Iris-setosa
-4.6,3.1,1.5,0.2,Iris-setosa''"/>
- <param name="p" type="integer" value="0.80" label="Select % of data to training and testing the models"/>
- </inputs>
- <outputs>
- <data format="csv" name="output1" label="dataset_summary.csv" />
- <data format="csv" name="output2" label="accuracy_summary.csv" />
- </outputs>
- <tests>
- <test>
- <param name="test">
- <element name="test-data">
- <collection type="data">
- <element format="csv" name="input" label="test-data/input.csv"/>
- </collection>
- </element>
- </param>
- <output format="csv" name="fit" label="test-data/dataset_summary.csv"/>
- <output format="csv" name="fit" label="test-data/accuracy_summary.csv"/>
- </test>
- </tests>
- <help>
-Tool allow us to build 5 different models to predict e.g. species from flower measurements.
-In the end we can select the best model for further analysis.
-
-Let’s evaluate 5 different algorithms:
-
-**Linear Discriminant Analysis (LDA)**
-**Classification and Regression Trees (CART).**
-**k-Nearest Neighbors (kNN).**
-**Support Vector Machines (SVM) with a linear kernel.**
-**Random Forest (RF)**
-
-This is a good mixture of simple linear (LDA), nonlinear (CART, kNN) and complex nonlinear methods (SVM, RF).
-We reset the random number seed before reach run to ensure that the evaluation of each algorithm is performed
-using exactly the same data splits. It ensures the results are directly comparable.
-
-</help>
-<citations>
- <citation>https://CRAN.R-project.org/package=caret</citation>
-</citations>
-</tool>
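
Note on the seed reset described in the removed help text: the sketch below is an illustration only and is not part of the deleted files. It uses base R rather than caret, and `make_folds` is a hypothetical helper standing in for whatever fold assignment the resampling step draws from the random number generator. Under that assumption, it shows why calling `set.seed(7)` before each run gives every model the same splits.

```r
# Hypothetical helper (not in accuracy.R): randomly assign n rows to k folds
# of near-equal size by shuffling a repeated sequence of fold labels.
make_folds <- function(n, k) {
  sample(rep(seq_len(k), length.out = n))
}

# Resetting the seed before each draw restarts the random stream,
# so both calls produce the same fold assignment.
set.seed(7)
folds_a <- make_folds(150, 10)
set.seed(7)
folds_b <- make_folds(150, 10)

# Stops with an error if the two assignments differ.
stopifnot(identical(folds_a, folds_b))

# A third draw without a reset continues the random stream,
# so it is not guaranteed to match the first two.
folds_c <- make_folds(150, 10)
```

With identical folds, differences in cross-validated accuracy between models reflect the models rather than the luck of the split.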