Code and data for the paper "Detecting Duplicate Bug Reports with Convolutional Neural Network" (APSEC 2018).
- Python 3.6
- Anaconda (numpy, pandas, sklearn)
- PyTorch 0.4.0
- torchtext
- gensim
- CUDA 8.0
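A quick way to compare an environment against the list above is to import each package and print its version. The following is only a minimal sketch: the helper name `report_versions` is hypothetical and is not part of this repository, and it reports what is installed without enforcing the versions listed here.

```python
import importlib
import sys

# Import names of the packages listed in the requirements above.
REQUIRED = ["numpy", "pandas", "sklearn", "torch", "torchtext", "gensim"]


def report_versions(modules=REQUIRED):
    """Return a dict mapping each module name to its version string.

    A module that cannot be imported maps to None; a module without a
    __version__ attribute maps to "unknown".
    """
    report = {"python": "%d.%d" % sys.version_info[:2]}
    for name in modules:
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = None
        else:
            report[name] = getattr(module, "__version__", "unknown")
    return report
```

Calling `print(report_versions())` lists the installed versions; any entry shown as `None` is a package that still needs to be installed.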
Before running the code, set the parameters (paths) in each .py file in codes/.
- Train model:
  python main.py
- Evaluate an existing model:
  python main.py -snapshot *.pt
  (*.pt is the existing model file)
- Generate DBR-CNN result: go into codes/[data_set]/ and run
  python cb.py
  ([data_set] means the specific dataset you are using, such as spark, hadoop, hdfs, or mapreduce)
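The three steps above can also be driven from a small Python script. This is a sketch under assumptions, not part of the repository: it assumes main.py is run from codes/, the helper names (`train_command`, `evaluate_command`, `cb_step`, `run`) are hypothetical, and the dataset names are only the ones mentioned in this README.

```python
import subprocess
import sys

# Dataset names mentioned in this README.
DATASETS = ["spark", "hadoop", "hdfs", "mapreduce"]


def train_command():
    """Command for training: python main.py"""
    return [sys.executable, "main.py"]


def evaluate_command(snapshot_path):
    """Command for evaluating a saved model: python main.py -snapshot <file>.pt"""
    return [sys.executable, "main.py", "-snapshot", snapshot_path]


def cb_step(data_set):
    """Return (working directory, command) for the DBR-CNN result step."""
    if data_set not in DATASETS:
        raise ValueError("unknown dataset: %s" % data_set)
    return "codes/" + data_set, [sys.executable, "cb.py"]


def run(command, cwd="codes"):
    """Run a command in the given directory and raise if it fails.

    The default directory is an assumption about where main.py lives.
    """
    subprocess.run(command, cwd=cwd, check=True)
```

For example, `run(train_command())` would start training, and `cwd, cmd = cb_step("spark")` followed by `run(cmd, cwd=cwd)` would run cb.py inside codes/spark/.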
In main.py:
- set use_global_w2v = False
- set wordvec_save to a specific .save file
In main.py:
change the parameter values directly in the argument parser
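For readers unfamiliar with this pattern, the sketch below shows how parameters defined in an argparse parser are typically changed, either by editing the `default=` value in the file or by passing the flag on the command line. Only `-snapshot` appears in this README; `-lr` and `-epochs`, their default values, and the function name `build_parser` are hypothetical placeholders, and the real parser in main.py may look different.

```python
import argparse


def build_parser():
    """Build an illustrative parser; not the actual parser from main.py."""
    parser = argparse.ArgumentParser(description="illustrative parser")
    # Hypothetical hyperparameters: edit default= to change them permanently.
    parser.add_argument("-lr", type=float, default=0.001,
                        help="learning rate (hypothetical)")
    parser.add_argument("-epochs", type=int, default=10,
                        help="number of training epochs (hypothetical)")
    # Flag named in this README: path to a saved *.pt model to evaluate.
    parser.add_argument("-snapshot", type=str, default=None,
                        help="saved model file; if omitted, train a new model")
    return parser


# With no arguments, the defaults written in the parser are used.
args = build_parser().parse_args([])
mode = "evaluate" if args.snapshot is not None else "train"
```

Passing `["-snapshot", "model.pt", "-lr", "0.01"]` to `parse_args` instead overrides the defaults for a single run without editing the file.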
Pretrained word vectors can be downloaded from https://pan.baidu.com/s/18R_lZhlOdp-kgDlbrBq7iA
Unzip the archive in codes/.