Software Technology Lab, Queen's University, Kingston, Canada
The replication package: Defect Prediction Using RCLUV.7z contains the bug datasets needed to replicate our study. It also has the RCLUV extractor for Java. The datasets were built from two existing benchmark bug datasets: the Eclipse bug dataset and the GitHub bug database.
The thesis is available for download from qthesis.
Our report gives the performance of models developed with the WEKA machine learning tool, version 3.8. The *.arff data files provided in Eclipse\Eclipse_filename_bug_RCLUV\ and GitHub\GitHub_filename_bug_RCLUV\ can be loaded into WEKA directly to train machine learning models with minimal effort.
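Before loading a file into WEKA, it can help to confirm that it has the expected shape. The following is a minimal sketch, not part of the replication package: arff_summary is a hypothetical helper that counts the @attribute declarations and data rows of an ARFF file, assuming the dense layout with one instance per line.

```shell
#!/bin/sh
# Sketch only: arff_summary is a hypothetical helper, not shipped with the package.
# Assumes a dense ARFF file: one instance per line after the @data marker.
arff_summary() {
    awk '
        { sub(/\r$/, "") }                        # tolerate Windows line endings
        /^[ \t]*%/ || /^[ \t]*$/ { next }         # skip comments and blank lines
        in_data { rows++; next }                  # every remaining line is an instance
        tolower($1) == "@attribute" { attrs++ }   # ARFF keywords are case-insensitive
        tolower($1) == "@data" { in_data = 1 }
        END { printf "attributes=%d instances=%d\n", attrs, rows }
    ' "$1"
}
```

Calling arff_summary with the path of one of the *.arff files prints a line of the form attributes=N instances=M. The two counts should agree with what WEKA reports after loading the same file.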
Below we present the hyperparameters that were used to develop the models.
Table.1 - Hyperparameters of the models used to plot the learning curves for the Eclipse bug dataset
Table.2 - Hyperparameters of the models used to measure the training time for the merged Eclipse bug dataset
Table.3 - Hyperparameters with cost matrix used to develop models for the Eclipse bug dataset
Table.4 - Hyperparameters used to develop models for the GitHub bug database
The RCLUV extractor is written in TXL, so TXL must be installed on your machine.
The RCLUVExtractor directory contains a script, AnalyzeRCLUV.sh. It takes a Java project directory as its argument and finds all the *.java files in that directory. After extracting the RCLUVs of the source files, it generates a *.RCLUVAnalysis.csv file in the parent directory of the given project directory. The extractor also writes any errors or warnings triggered during feature extraction to the result file. Invoke the AnalyzeRCLUV.sh script with the following command.
./AnalyzeRCLUV.sh Java_directory
Below we run this command on the source files in the examples/java directory provided with the RCLUVExtractor.
rahman@MSI-GP62-6QF:~/Downloads/RCLUVExtractor$ ./AnalyzeRCLUV.sh examples/java
Compiling
.....................................................................................Done.
This generates a java.RcluvAnalysis.csv file inside the examples/ directory.
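When the extractor is run on several projects, it is convenient to check that each result file landed where expected. The sketch below is an illustration under stated assumptions: expected_csv and run_extractor are hypothetical helpers, and the file name follows the java.RcluvAnalysis.csv pattern seen in the run above, so adjust it if your output is named differently.

```shell
#!/bin/sh
# Sketch only: expected_csv and run_extractor are hypothetical helpers,
# not part of the RCLUVExtractor distribution.

# Print the path where the result CSV is expected for a project directory.
# Assumes the <parent>/<name>.RcluvAnalysis.csv naming seen in the run above.
expected_csv() {
    dir="${1%/}"                       # drop a trailing slash, if any
    parent=$(dirname -- "$dir")
    name=$(basename -- "$dir")
    printf '%s/%s.RcluvAnalysis.csv\n' "$parent" "$name"
}

# Run the extractor (from inside the RCLUVExtractor directory) and
# report whether the expected result file exists afterwards.
run_extractor() {
    ./AnalyzeRCLUV.sh "$1" || return 1
    csv=$(expected_csv "$1")
    if [ -f "$csv" ]; then
        echo "Result written to $csv"
    else
        echo "Expected result file not found: $csv" >&2
        return 1
    fi
}
```

For example, run_extractor examples/java runs the script on the bundled example and then looks for examples/java.RcluvAnalysis.csv.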
The titles of the feature columns must be generated separately using the following command.
txl -q -s 1000 dummy.java rcluv-java.txl -d SPREADSHEET -d TITLES 2> titles.txt
We run the command above, replacing the dummy.java placeholder with the path of a Java file, as shown below.
rahman@MSI-GP62-6QF:~/Downloads/RCLUVExtractor$ txl -q -s 1000 examples/java/Policy.java rcluv-java.txl -d SPREADSHEET -d TITLES 2> titles.txt
It generates a titles.txt file inside the RCLUVExtractor directory.
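The titles can then be combined with the result file so that each feature column is labelled. The following is only a sketch: add_header is a hypothetical helper, and it assumes that the first line of titles.txt is a comma-separated header whose columns line up with the rows of the result CSV. Inspect both files before relying on it.

```shell
#!/bin/sh
# Sketch only: add_header is a hypothetical helper.
# Assumption: the first line of the titles file is a comma-separated header
# whose columns line up with the rows of the result CSV.
add_header() {
    titles="$1"    # file produced by the txl TITLES command
    csv="$2"       # result file produced by AnalyzeRCLUV.sh
    out="$3"       # new file: header line followed by the data rows
    head -n 1 "$titles" > "$out" || return 1
    cat "$csv" >> "$out"
}
```

For example, add_header titles.txt examples/java.RcluvAnalysis.csv examples/java.with-titles.csv writes a new file and leaves both inputs unchanged.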
Get in Touch: E-mail: toashiqur@gmail.com, LinkedIn: www.linkedin.com/in/toashiqur