changes for jar

This commit is contained in:
fixminer
2020-04-07 18:06:53 +02:00
parent c5463f91f8
commit 40a8b0c290
57 changed files with 1180 additions and 301 deletions
## II. Environment setup
* OS: macOS Mojave (10.14.3)
* JDK8 (**important!**)
* Download and configure Anaconda
* Create a Python environment using the [environment file](environment.yml):
```powershell
conda env create -f environment.yml
```
* After creating the environment, activate it. It contains the necessary dependencies for Redis and Python.
```powershell
source activate fixminerEnv
```
* Update the config.yml file with the corresponding paths on your machine. An example config.yml file can be found under:
```powershell
fixminer_source/src/main/resources/config.yml
```
<!---
[fixminer.sh](python/fixminer.sh)
In order to launch FixMiner, execute [fixminer.sh](python/fixminer.sh)
bash fixminer.sh /Users/..../enhancedASTDiff/python/ stats
--->
## III. Replication Data
[singleBR.pickle](python/data/singleBR.pickle)
This pickle contains the list of bug reports (i.e. `bid`) with their corresponding fixes (i.e. `commit`) for each project in the dataset (i.e. `project`).
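The project/bid/commit layout described above can be pictured as a nested mapping; the sketch below is an illustrative stand-in (the ids and commit hashes are made up, not taken from the dataset):

```python
# Illustrative stand-in for the singleBR layout: project -> bug id -> fixing commit
single_br = {
    'LANG': {'LANG-1304': 'a1b2c3d'},
    'MATH': {'MATH-999': 'd4e5f6a'},
}

# Flatten to a bug id -> fixing commit lookup across all projects
fixes = {bid: commit
         for bids in single_br.values()
         for bid, commit in bids.items()}
```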
[bugReports.7z.00X](python/data/bugReports.7z.001)
This is the dump of the bug report archives extracted from each commit. These bug reports are not necessarily labelled as BUG, CLOSED; this archive contains the initial bug reports collected before the fixes were identified.
[gumInput.7z.001](python/data/gumInput.7z.001)
This archive contains all the patches in our dataset, formatted in a way that can be processed by GumTree (i.e. DiffEntries, prevFiles, revFiles).
[ALLbugReportsComplete.pickle](python/data/ALLbugReportsComplete.pickle)
The pickle object that represents the bug reports under the following columns: 'bugReport', 'summary', 'description', 'created', 'updated', 'resolved', 'reporterDN', 'reporterEmail', 'hasAttachment', 'attachmentTime', 'hasPR', 'commentsCount'.
#### Data Viewer
The data provided with the replication package is listed in the directory [python/data](python/data).
The data is stored in different formats (e.g. pickle, db, csv).
To see the content of a .pickle file, the following script can be used:
```python
import gzip
import pickle as p

def load_zipped_pickle(filename):
    # Load an object from a gzip-compressed pickle file
    with gzip.open(filename, 'rb') as f:
        loaded_object = p.load(f)
    return loaded_object
```
Usage:
```python
result = load_zipped_pickle('code/LANGbugReportsComplete.pickle')
# result is a pandas object which can be exported to several formats.
# Details on how to export are listed in the official library documentation:
# https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html
```
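For completeness, the gzip round trip can be sketched with a hypothetical `save_zipped_pickle` counterpart (the record below is a stand-in; the shipped pickles hold pandas objects):

```python
import gzip
import os
import pickle as p
import tempfile

def save_zipped_pickle(obj, filename):
    # Hypothetical counterpart of load_zipped_pickle: gzip-compress the pickle
    with gzip.open(filename, 'wb') as f:
        p.dump(obj, f)

def load_zipped_pickle(filename):
    with gzip.open(filename, 'rb') as f:
        return p.load(f)

# Round trip with a stand-in record
record = {'bugReport': 'LANG-123', 'summary': 'example summary'}
path = os.path.join(tempfile.mkdtemp(), 'demo.pickle')
save_zipped_pickle(record, path)
assert load_zipped_pickle(path) == record
```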
## IV. Step-by-Step execution
#### Before running
* Update the [config file](src/main/resources/config.yml) with the corresponding user paths.
* Activate the conda environment from the shell:
```powershell
source activate fixminerEnv
```
#### Running Options
In order to launch *FixMiner*, execute [fixminer.sh](python/fixminer.sh):
```powershell
bash fixminer.sh [JOB] [CONFIG_FILE]
```
e.g.
```powershell
bash fixminer.sh dataset4c /Users/projects/release/fixminer_source/src/main/resources/config.yml
```
*FixMiner* needs a job to be specified.
#### Job Types
1. __dataset4j__ / __dataset4c__: Create a Java/C mining dataset from the projects listed in [subjects.csv](python/data/subjects.csv), or in [datasets.csv](python/data/datasets.csv) for C. For this demo, the job merges the following steps; individual steps can be run by commenting out the corresponding option in [main.py](python/main.py):
    * `clone`: Clone the target project repository.
    * `collect`: Collect all commits from the repository.
    * `fix`: Collect the commits linked to a bug report.
    * `bugPoints`: Identify the snapshot of the repository before the bug fixing commit was introduced.
    * `brDownload`: Download the bug reports recovered from the commit log.
    * `brParser`: Parse the bug reports and keep those labelled with type BUG and status RESOLVED or CLOSED.
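As an illustration of the `fix` step above, linking commits to bug reports can be sketched as matching commit messages against a Jira-style bug id pattern; the pattern and the sample commits here are assumptions, not the actual implementation:

```python
import re

# Assumed Jira-style bug id pattern, e.g. "LANG-1304"
BUG_ID = re.compile(r'\b[A-Z]+-\d+\b')

def link_fix_commits(commits):
    """Return (sha, bug_id) pairs for commits whose message cites a bug report."""
    linked = []
    for sha, message in commits:
        match = BUG_ID.search(message)
        if match:
            linked.append((sha, match.group()))
    return linked

commits = [('a1b2c3d', 'LANG-1304: fix NPE in StringUtils'),
           ('d4e5f6a', 'refactor build scripts')]
print(link_fix_commits(commits))  # [('a1b2c3d', 'LANG-1304')]
```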
2. __richEditScript__: Calls the jar file produced by the Maven package build to compute Rich Edit Scripts. This step can be invoked natively from Java or via the [Launcher](src/main/java/edu/lu/uni/serval/richedit/Launcher.java) with the appropriate arguments:
```powershell
java -jar FixPatternMiner-1.0.0-jar-with-dependencies.jar /Users/projects/release/fixminer_source/src/main/resources/config.yml RICHEDITSCRIPT
```
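The jar invocation above can also be scripted, e.g. from Python; a minimal sketch of assembling the command (the paths below are placeholders for your local setup):

```python
def build_launcher_cmd(jar_path, config_path, job):
    # Assemble the `java -jar <jar> <config> <JOB>` command line shown above;
    # jar_path and config_path are placeholders for your local paths.
    return ['java', '-jar', jar_path, config_path, job]

cmd = build_launcher_cmd('FixPatternMiner-1.0.0-jar-with-dependencies.jar',
                         'src/main/resources/config.yml', 'RICHEDITSCRIPT')
# Run with: subprocess.run(cmd, check=True)
```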
3. __shapeSI__: Search index creation for shapes. The output of this step is written to __pairs__, which will be generated under __datapath__ in the [config file](src/main/resources/config.yml).
4. __compareShapes__: Calls the jar file produced by the Maven package build to compare the shape trees. This step can be invoked natively from Java or via the [Launcher](src/main/java/edu/lu/uni/serval/richedit/Launcher.java) with the appropriate arguments:
```powershell
java -jar FixPatternMiner-1.0.0-jar-with-dependencies.jar /Users/projects/release/fixminer_source/src/main/resources/config.yml COMPARETREES
```
5. __cluster__: Forms clusters of identical shape trees. The output of this step is written to [shapes](python/data/shapes).
6. __actionSI__: Search index creation for actions. The output of this step is written to [pairs](python/data/pairsAction).
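The clustering of identical shape trees (job 5) can be sketched as grouping serialized trees by exact equality; this is a simplified stand-in for the actual step, and the tree strings below are invented:

```python
from collections import defaultdict

def cluster_identical(trees):
    # trees: mapping of patch id -> serialized shape-tree string.
    # Patches with identical serialized trees fall into the same cluster.
    groups = defaultdict(list)
    for patch_id, tree in trees.items():
        groups[tree].append(patch_id)
    # Keep only non-trivial clusters (two or more identical trees)
    return sorted(sorted(ids) for ids in groups.values() if len(ids) > 1)

trees = {'p1': 'IF(COND,BLOCK)', 'p2': 'IF(COND,BLOCK)', 'p3': 'RETURN(EXPR)'}
print(cluster_identical(trees))  # [['p1', 'p2']]
```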