# FixMiner
![Build Status](https://travis-ci.com/SerVal-DTF/fixminer_source.svg?branch=master)[![Coverage Status](https://coveralls.io/repos/github/SerVal-DTF/fixminer_source/badge.svg?branch=master)](https://coveralls.io/github/SerVal-DTF/fixminer_source?branch=master)![Java CI with Maven](https://github.com/SerVal-DTF/fixminer_source/workflows/Java%20CI%20with%20Maven/badge.svg)

# Code of FixMiner

Reference: [FixMiner: Mining Relevant Fix Patterns for Automated Program Repair](http://arxiv.org/pdf/1810.01791) (Empirical Software Engineering, [doi:10.1007/s10664-019-09780-z](https://doi.org/10.1007/s10664-019-09780-z))

# FixMiner

* [I. Introduction of FixMiner](#user-content-i-introduction)
* [II. Environment setup](#user-content-ii-environment)
* [III. Replication Data](#user-content-iii-data)
* [IV. Step-by-Step execution](#user-content-iv-how-to-run)

## I. Introduction

FixMiner is a systematic and automated approach to mine relevant and actionable fix patterns for automated program repair.

![The workflow of this technique.\label{workflow}](worflow.png)

## II. Environment setup

* OS: macOS Mojave (10.14.3)
* JDK8: (**important!**)
* Download and configure Anaconda
* Create a Python environment using the [environment file](environment.yml)
  ```powershell
  conda env create -f environment.yml
  ```
* After creating the environment, activate it. It contains the necessary dependencies for Redis and Python.
  ```powershell
  source activate fixminerEnv
  ```
* Update the config.yml file with the corresponding paths on your computer. An example config.yml file can be found under
  ```powershell
  fixminer_source/src/main/resources/config.yml
  ```

## IV. Step-by-Step execution

#### Before running

* Update the [config file](src/main/resources/config.yml) with the corresponding user paths.
* Activate the conda environment from the shell
  ```powershell
  source activate fixminerEnv
  ```

In order to launch FixMiner, execute [fixminer.sh](python/fixminer.sh)

    bash fixminer.sh [JOB] [CONFIG_FILE]

e.g.


    bash fixminer.sh dataset4c /Users/projects/release/fixminer_source/src/main/resources/config.yml

#### Job Types

*FixMiner* requires a job to be specified for each run.

1. __dataset4j__ / __dataset4c__: Creates a Java/C mining dataset from the projects listed in [subjects.csv](python/data/subjects.csv), or in [datasets.csv](python/data/datasets.csv) for C.

2. __richEditScript__: Calls the jar file produced by the Maven package step to compute rich edit scripts.
   This step can be invoked natively from Java or by using the [Launcher](src/main/java/edu/lu/uni/serval/richedit/Launcher.java) with appropriate arguments.
   ```powershell
   java -jar FixPatternMiner-1.0.0-jar-with-dependencies.jar /Users/projects/release/fixminer_source/src/main/resources/config.yml RICHEDITSCRIPT
   ```

3. __shapeSI__: Creates the search index for shapes.
   The output of this step is written to the __pairs__ folder, which is generated under the __datapath__ set in the [config file](src/main/resources/config.yml).

4. __compare__: Calls the jar file produced by the Maven package step to compare the trees.
   This step can be invoked natively from Java or by using the [Launcher](src/main/java/edu/lu/uni/serval/richedit/Launcher.java) with appropriate arguments.
   ```powershell
   java -jar FixPatternMiner-1.0.0-jar-with-dependencies.jar /Users/projects/release/fixminer_source/src/main/resources/config.yml COMPARE
   ```

5. __cluster__: Forms clusters of identical trees.
   The output of this step is written to the __shapes__ folder, which is generated under the __datapath__ set in the [config file](src/main/resources/config.yml).

## III. Replication Data

Replication Data:

[singleBR.pickle](python/data/singleBR.pickle)
This pickle contains the list of bug reports (i.e. bid) with their corresponding fixes (i.e. commit) for each project in the dataset (i.e. project).

[bugReports.7z.00X](python/data/bugReports.7z.001)
This is the dump of the bug reports archive extracted from each commit.
These bug reports are not necessarily classified as BUG, CLOSED; this archive contains the initial bug reports, before the fixes were identified.

[gumInput.7z.001](python/data/gumInput.7z.001)
This archive contains all the patches in our dataset, formatted so that they can be processed by GumTree (i.e. DiffEntries, prevFiles, revFiles).

[ALLbugReportsComplete.pickle](python/data/ALLbugReportsComplete.pickle)
The pickle object that represents the bug reports under the following columns:
'bugReport', 'summary', 'description', 'created', 'updated', 'resolved', 'reporterDN', 'reporterEmail', 'hasAttachment', 'attachmentTime', 'hasPR', 'commentsCount'

#### Data Viewer

The data provided with the replication package is listed in the directory [python/data](python/data).
The data is stored in different formats (e.g. pickle, db, csv, etc.).

To see the content of a .pickle file, the following script can be used.

```python
import pickle as p
import gzip


def load_zipped_pickle(filename):
    # The pickle files are gzip-compressed, so open them through gzip
    with gzip.open(filename, 'rb') as f:
        loaded_object = p.load(f)
        return loaded_object
```

Usage

```python
result = load_zipped_pickle('code/LANGbugReportsComplete.pickle')
# Result is a pandas object which can be exported to several formats
# Details on how to export are listed in the official library documentation
# https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html
```
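
As a follow-up to the usage note above, here is a minimal sketch of one way to export a loaded object to CSV. It assumes the file is gzip-compressed, as the loader above implies. The helper name `export_to_csv` is hypothetical and not part of the FixMiner code. If the loaded object is a pandas DataFrame, the sketch relies on its `to_csv` method; the fallback for a plain list of dict rows is an assumption added so that the sketch also works without pandas.

```python
import csv
import gzip
import pickle


def load_zipped_pickle(filename):
    """Load a gzip-compressed pickle, as in the Data Viewer script."""
    with gzip.open(filename, 'rb') as f:
        return pickle.load(f)


def export_to_csv(obj, csv_path):
    """Hypothetical helper: write a loaded object to a CSV file.

    Objects exposing to_csv (e.g. a pandas DataFrame) are exported with
    their own method. Otherwise the object is assumed to be an iterable
    of dict rows and is written with the standard csv module.
    """
    if hasattr(obj, 'to_csv'):
        obj.to_csv(csv_path, index=False)
        return
    rows = list(obj)
    if not rows:
        raise ValueError('nothing to export')
    with open(csv_path, 'w', newline='') as f:
        # Column names are taken from the keys of the first row
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
```

For example, `export_to_csv(load_zipped_pickle('python/data/singleBR.pickle'), 'singleBR.csv')` would write the bug report list to a file named `singleBR.csv` (an illustrative output name), provided that pickle is stored in the gzip format the loader expects. Unpickling a DataFrame requires pandas to be installed, and pickle files should only be loaded from sources you trust.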