
Parameter-Tuning-Learners

Assignments, projects, and other course-related material.

Project

Parameter Tuning for Comparison of Learners in Ordinal Data Classification

Original Reference Paper

DRONE: Predicting Priority of Reported Bugs by Multi-factor Analysis
http://ieeexplore.ieee.org/document/6676891/

Dataset

Eclipse Bugzilla bug reports from 2001-10-10 to 2007-12-14
~ 103k Bug Reports.

Train - Test Split: 80:20

The raw and processed datasets are in the Data directory.
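As a rough illustration of the 80:20 split, the sketch below divides a list of records into training and test portions. The function name `split_80_20`, the seeded shuffle, and the toy `bug_id` records are assumptions for illustration only; the README does not say whether the project shuffles the reports or splits them chronologically.

```python
import random


def split_80_20(rows, seed=0, shuffle=True):
    """Return (train, test) holding 80% and 20% of the rows.

    Hypothetical helper, not part of this repository.
    """
    rows = list(rows)
    if shuffle:
        # Seeded shuffle so the split is reproducible (an assumption).
        random.Random(seed).shuffle(rows)
    # Integer arithmetic avoids floating-point rounding at the cut point.
    cut = len(rows) * 8 // 10
    return rows[:cut], rows[cut:]


# Toy records standing in for bug reports.
reports = [{"bug_id": i} for i in range(10)]
train, test = split_80_20(reports)
```

Passing `shuffle=False` keeps the original order, which would give a chronological split if the reports are already sorted by date.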

Feature Extraction

Temporal, author, and product features were generated as described in the paper.
Each set of features is available individually here

The code to generate all these features is in the [features](./features) folder.
Feature names follow the paper's nomenclature.
Temporal Features - (TMP1 to TMP9)
Author Features - (AUT1 to AUT3)
Product Features - (PRO1 to PRO11) & Component Features - (PRO12 to PRO22)

Text Features

Features from the Summary text field were generated as described in the paper, using a tokenizer and a count vectorizer. The code is available here

Original Algorithm

An implementation of DRONE is available here

Presentation

Steps to Reproduce

We provide the raw dataset as well as the processed dataset here.
Disclaimer: The datasets were sourced from Eclipse Bugzilla and are available under the terms and conditions specified on the Eclipse Bugzilla webpage. Refer to this for any queries about the license for use or distribution of the dataset.

If you would like to generate the features again, the code and instructions are available here.

Once these features are generated, the next step is to build count vectors for the summary column. For this we provide DataPreprocessing.py, which can be imported and used to preprocess the training and testing data (using its fit() and transform() methods).
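The fit/transform pattern described above can be sketched as follows. This is a minimal stand-in written with the standard library only, not the contents of DataPreprocessing.py; the class name `SummaryCountVectorizer`, the regex tokenizer, and the example summaries are all assumptions. The point it illustrates is that the vocabulary is learned from the training summaries only, so words that appear only in the test data are ignored.

```python
import re
from collections import Counter


class SummaryCountVectorizer:
    """Hypothetical bag-of-words vectorizer with fit() and transform()."""

    def __init__(self):
        self.vocabulary_ = {}

    @staticmethod
    def tokenize(text):
        # Lowercase and keep alphanumeric runs (an assumed tokenization rule).
        return re.findall(r"[a-z0-9]+", text.lower())

    def fit(self, summaries):
        # Learn the vocabulary from the training summaries only.
        tokens = sorted({t for s in summaries for t in self.tokenize(s)})
        self.vocabulary_ = {tok: i for i, tok in enumerate(tokens)}
        return self

    def transform(self, summaries):
        # Map each summary to a vector of token counts over the vocabulary.
        rows = []
        for summary in summaries:
            row = [0] * len(self.vocabulary_)
            for tok, n in Counter(self.tokenize(summary)).items():
                idx = self.vocabulary_.get(tok)
                if idx is not None:  # tokens unseen during fit() are dropped
                    row[idx] = n
            rows.append(row)
        return rows


# Made-up summaries for illustration.
train_summaries = ["NPE in editor", "Editor crash on save"]
test_summaries = ["crash crash in debugger"]

vec = SummaryCountVectorizer().fit(train_summaries)
X_train = vec.transform(train_summaries)
X_test = vec.transform(test_summaries)
```

Fitting on the training data and reusing the same vocabulary for the test data keeps the two feature matrices column-aligned.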

Finally, the code for running Differential Evolution is available here.
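For readers unfamiliar with Differential Evolution, the sketch below shows the general shape of the common DE/rand/1/bin variant on a toy objective. It is not the project's tuning code: the function name, the default settings (`pop_size`, `f`, `cr`, `generations`), and `toy_objective` are assumptions. In a tuning setup like the one this project describes, the objective would presumably train a learner with the candidate parameters and return an error score to minimize.

```python
import random


def differential_evolution(objective, bounds, pop_size=20, f=0.8, cr=0.9,
                           generations=200, seed=0):
    """Minimize `objective` over box `bounds` with DE/rand/1/bin (sketch)."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial population inside the bounds.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [objective(ind) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct candidates other than the current one.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one dimension is mutated
            trial = []
            for d in range(dim):
                if rng.random() < cr or d == forced:
                    value = pop[a][d] + f * (pop[b][d] - pop[c][d])
                    lo, hi = bounds[d]
                    value = min(max(value, lo), hi)  # clip to the bounds
                else:
                    value = pop[i][d]
                trial.append(value)
            trial_score = objective(trial)
            # Greedy selection: keep the trial if it is no worse.
            if trial_score <= scores[i]:
                pop[i], scores[i] = trial, trial_score
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


def toy_objective(params):
    # Stand-in for "error of a learner trained with these parameters".
    x, y = params
    return (x - 1.0) ** 2 + (y + 2.0) ** 2


best_params, best_score = differential_evolution(
    toy_objective, [(-5, 5), (-5, 5)]
)
```

Each entry in `bounds` corresponds to one tunable parameter, so integer or categorical learner parameters would need rounding or mapping inside the objective.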

An implementation of the original DRONE algorithm is also available here.

Report

A detailed report with results can be found here