FixMiner is a systematic and automated approach to mine relevant and actionable fix patterns for automated program repair.

Figure: The workflow of this technique.

II. Environment setup

OS: macOS Mojave (10.14.3)
JDK8: Oracle jdk1.8 (important!)
Download and configure Anaconda
Create a Python environment using the environment file:
```
conda env create -f environment.yml
```
After creating the environment, activate it. It contains the necessary dependencies for Redis and Python.
```
source activate redisEnv
```
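Because the setup flags JDK 8 as important, it can help to verify the active Java version before running anything. The following is a minimal sketch and not part of FixMiner: `is_java8` and `installed_java_version` are hypothetical helper names, and the check assumes the usual `java -version` banner format, such as `java version "1.8.0_201"`.

```python
import re
import subprocess


def is_java8(version_output):
    """Return True if a `java -version` banner reports a 1.8.x runtime."""
    # Banner looks like: java version "1.8.0_201" (or: openjdk version "1.8.0_292")
    match = re.search(r'version "(\d+)\.(\d+)', version_output)
    return bool(match) and match.group(1) == "1" and match.group(2) == "8"


def installed_java_version():
    """Run `java -version` and return its banner (the JVM prints it to stderr)."""
    completed = subprocess.run(
        ["java", "-version"], capture_output=True, text=True, check=True
    )
    return completed.stderr
```

With Java on the path, `is_java8(installed_java_version())` reports whether the active runtime is a 1.8 release.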

III. Replication Data

Replication Data:

singleBR.pickle

This pickle contains the list of bug reports (i.e. bid) with their corresponding fixes (i.e. commit) for each project in the dataset (i.e. project).
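To illustrate how these three fields relate, the sketch below groups fix commits by project and bug report. It assumes each record exposes `project`, `bid`, and `commit` values. The sample rows are made up for the example and are not taken from the dataset, and `group_fixes` is a hypothetical helper.

```python
from collections import defaultdict


def group_fixes(records):
    """Map project -> bug report id -> list of fixing commits."""
    grouped = defaultdict(lambda: defaultdict(list))
    for project, bid, commit in records:
        grouped[project][bid].append(commit)
    # Convert to plain dicts so the result prints and compares cleanly.
    return {project: dict(bugs) for project, bugs in grouped.items()}


# Made-up rows in (project, bid, commit) order, for illustration only.
sample = [
    ("projA", "A-1", "c0ffee1"),
    ("projA", "A-1", "c0ffee2"),
    ("projB", "B-7", "deadbee"),
]
fixes = group_fixes(sample)
```

If the pickle loads as a pandas DataFrame with those column names (an assumption), the rows could be supplied with `df[['project', 'bid', 'commit']].itertuples(index=False)`.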

bugReports.7z.00X

This is the dump of the bug report archive extracted from each commit. These bug reports are not necessarily of type BUG with status CLOSED; the archive contains the initial bug reports, before the fixes are identified.

gumInput.7z.001

This archive contains all the patches in our dataset, formatted so that they can be processed by GumTree (i.e. DiffEntries, prevFiles, revFiles).

ALLbugReportsComplete.pickle

This pickle object represents the bug reports under the following columns: 'bugReport', 'summary', 'description', 'created', 'updated', 'resolved', 'reporterDN', 'reporterEmail', 'hasAttachment', 'attachmentTime', 'hasPR', 'commentsCount'.
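A quick sanity check after loading is to compare the columns actually present with the ones listed above. This is a sketch only: `missing_columns` is a hypothetical helper, and it accepts any iterable of column names (for a pandas DataFrame, that would be `df.columns`).

```python
EXPECTED_COLUMNS = [
    "bugReport", "summary", "description", "created", "updated", "resolved",
    "reporterDN", "reporterEmail", "hasAttachment", "attachmentTime",
    "hasPR", "commentsCount",
]


def missing_columns(columns):
    """Return the expected column names that are absent, in listed order."""
    present = set(columns)
    return [name for name in EXPECTED_COLUMNS if name not in present]
```

An empty result means every listed column was found.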

Data Viewer

The data provided with the replication package is located in the directory python/data. It is stored in different formats (e.g. pickle, db, csv).

To see the content of a .pickle file, the following script can be used.

    import gzip
    import pickle as p


    def load_zipped_pickle(filename):
        """Load a gzip-compressed pickle file and return the stored object."""
        with gzip.open(filename, 'rb') as f:
            return p.load(f)

Usage

result = load_zipped_pickle('code/LANGbugReportsComplete.pickle')
# The result is a pandas object, which can be exported to several formats
# Details on how to export are listed in the official library documentation
# https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html
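To try the loader without the replication data, a small round trip can be sketched: write an object to a gzip-compressed pickle and read it back. The temporary file name and the sample object are made up for illustration, and `save_zipped_pickle` is a hypothetical counterpart to the loader, not something the repository is known to provide.

```python
import gzip
import os
import pickle
import tempfile


def save_zipped_pickle(obj, filename):
    """Write obj to filename as a gzip-compressed pickle."""
    with gzip.open(filename, "wb") as f:
        pickle.dump(obj, f)


def load_zipped_pickle(filename):
    """Same loader as above, repeated so this sketch runs on its own."""
    with gzip.open(filename, "rb") as f:
        return pickle.load(f)


# Made-up sample object, for illustration only.
sample = {"bugReport": "EXAMPLE-1", "summary": "made-up entry"}
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.pickle")
    save_zipped_pickle(sample, path)
    restored = load_zipped_pickle(path)
```

After the round trip, `restored` holds an object equal to `sample`.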

IV. Step-by-Step execution

Before running

Update the config file with the corresponding user paths.
Activate the conda environment from the shell:
```
source activate redisEnv
```

To launch FixMiner, execute fixminer.sh:

bash fixminer.sh [PATH_TO_PYTHON_FOLDER] [OPTIONS]
 e.g. bash fixminer.sh Users/fixminer-core/python/ stats
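If the launcher is driven from Python rather than typed by hand, the invocation above can be assembled as an argument list. This is a minimal sketch: `build_fixminer_command` and `run_fixminer` are hypothetical helpers, and the only facts they rely on are the script name and the argument order shown above.

```python
import subprocess


def build_fixminer_command(python_folder, option, script="fixminer.sh"):
    """Return the argv list for: bash fixminer.sh [PATH_TO_PYTHON_FOLDER] [OPTIONS]."""
    if not python_folder or not option:
        raise ValueError("both the python folder and an option are required")
    return ["bash", script, python_folder, option]


def run_fixminer(python_folder, option):
    """Launch the script and raise CalledProcessError on a non-zero exit."""
    return subprocess.run(build_fixminer_command(python_folder, option), check=True)


# Mirrors the example invocation above; nothing is executed here.
command = build_fixminer_command("Users/fixminer-core/python/", "stats")
```

Passing the arguments as a list avoids shell quoting problems when the folder path contains spaces.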

Running Options

An option must be specified when running FixMiner.

1. 'dataset': Create a mining dataset from the projects specified in [subjects.csv](python/data/subjects.csv).
The dataset option executes the following steps, which are merged under the 'dataset' option
for this demo. Individual steps can be enabled or disabled by commenting the corresponding option in or out in [main.py](python/main.py)

    `clone` : Clone target project repository.

    `collect` : Collect all commits from the repository.

    `fix` : Collect commits linked to a bug report.

    `bugPoints` : Identify the snapshot of the repository before the bug-fixing commit was introduced.

    `brDownload` : Download bug reports recovered from commit log

    `brParser` : Parse bug reports to select those whose type is labelled BUG and whose status is RESOLVED or CLOSED.
    
2. 'richEditScript': Rich edit script computation step.    

3. 'shapeSI': Search index creation for shapes. The output of this step is written to [pairs](python/data/pairs)

4. 'compareShapes' : ShapeTree comparison

5. 'cluster': Forms clusters of identical ShapeTrees. The output of this step is written to [shapes](python/data/shapes)

6. 'actionSI': Search index creation for actions. The output of this step is written to [pairsAction](python/data/pairsAction)

7. 'compareActions' : ActionTree comparison

8. 'clusterActions': Forms clusters of identical ActionTrees. The output of this step is written to [actions](python/data/actions)

9. 'tokenSI': Search index creation for tokens. The output of this step is written to [pairsToken](python/data/pairsToken)

10. 'compareTokens' : TokenTree comparison

11. 'clusterTokens': Forms clusters of identical TokenTrees. The output of this step is written to [tokens](python/data/tokens)

12. 'stats' : Calculate statistics about the patterns, written under python/data to statsactions.csv, statsshapes.csv, and statstokens.csv, and export the fix patterns for APR integration to [fixpatterns](actionPattern2verify.csv)
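
The options above can be collected into an ordered list, which makes it easy to resume from a given step. This sketch assumes the options are meant to be run in the numbered order shown, which the list suggests but does not state outright; `steps_from` is a hypothetical helper.

```python
# Option names copied from the numbered list, in listed order.
RUN_OPTIONS = [
    "dataset", "richEditScript",
    "shapeSI", "compareShapes", "cluster",
    "actionSI", "compareActions", "clusterActions",
    "tokenSI", "compareTokens", "clusterTokens",
    "stats",
]


def steps_from(option):
    """Return option and every option listed after it, in order."""
    if option not in RUN_OPTIONS:
        raise ValueError(f"unknown option: {option!r}")
    return RUN_OPTIONS[RUN_OPTIONS.index(option):]
```

For example, `steps_from("tokenSI")` returns the token-level steps followed by `stats`, and an unrecognised name raises `ValueError` before anything is launched.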
