Commit 3ca71aa

Script Readme
1 parent 97ce924 commit 3ca71aa

1 file changed: +8 -9 lines changed


README.md

Lines changed: 8 additions & 9 deletions
@@ -18,7 +18,7 @@ This repository is composed of the following elements:
 * `EMCL.py`: script that reproduces the experiments of our paper submitted to ECML PKDD.
 * `PANG.py`: script that implements the Pang method.
 * `ProcessingPattern.py`: script that computes the number of occurrences and the set of induced patterns.
-* `Pattern.sh`: **TODO (identifies the patterns with SPMF and counts them with `ProcessingPattern.py` ?).**
+* `Pattern.sh`: script that computes the patterns of a dataset.
 * `CORKcpp.zip`: archive containing the CORK source code (used in `EMCL.py`), cf. Section [Installation](#installation).
 * `data`: folder containing the input data. Each subfolder corresponds to a distinct dataset, cf. Section [Datasets](#datasets).
 * `results`: files produced by the processing.
@@ -41,12 +41,11 @@ Second, one of the dependencies, SPMF, is not a Python package, but rather a Jav
 
 Note that SPMF is available both as a JAR and as a source code archive. However, the former does not contain all the features required by Pang, so one should use only the latter.
 
-**TODO In order to run the script that reproduces our ECML PKDD experiments, you also need to install CORK.**
+In order to run the script that reproduces our ECML PKDD experiments, you also need to install CORK. This is done by unzipping the archive `CORKcpp.zip` in the `src` folder.
 
 ## Data
 Third, you need to set up the data to which you want to apply Pang. This can be the dataset from our paper, in which case you will need to unzip several archives, or your own data, in which case they need to respect the appropriate format. In both cases, see Section [Use](#use).
 
-
 # Use
 We provide two scripts to use Pang:

@@ -100,9 +99,7 @@ For information, the files produced by our scripts to list the identified patter
 
 4. `x A B C A,B,C` : graphs containing the pattern
 
-The format of the file containing the graph labels is as follows:
-
-**TODO**
+The format of the file containing the graph labels is as follows: each line contains a single integer, corresponding to the label of the graph on the same line of the graph file.
 
 ### Processing
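As a minimal sketch of the label-file format described in the hunk above, the following Python snippet reads such a file and returns one integer label per line. The function name `read_graph_labels` is hypothetical and not part of the Pang scripts; the only thing taken from the README is that each line holds one integer. The sketch is deliberately strict: a blank or non-integer line raises an error, so that the position of each label stays aligned with the graph file.

```python
from pathlib import Path


def read_graph_labels(path):
    """Return the integer labels of a label file, one per line.

    Hypothetical helper (not part of Pang). Assumes one integer per line;
    any blank or non-integer line is reported with its line number.
    """
    labels = []
    lines = Path(path).read_text().splitlines()
    for line_number, line in enumerate(lines, start=1):
        text = line.strip()
        try:
            labels.append(int(text))
        except ValueError as err:
            raise ValueError(
                f"line {line_number}: expected an integer, got {text!r}"
            ) from err
    return labels
```

The label of the i-th graph is then `labels[i]`, provided the graphs are read from the graph file in the same order.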

@@ -111,9 +108,11 @@ Once the data are ready, you need to run a script to identify the patterns, and
 1. Open the `Python` console.
 2. Run the script `Patterns.sh` in order to create the files `XXX_patterns.txt`.
 3. Run `ProcessingPattern.py` with the option `-d XXX` in order to create the files `XXX_mono.txt` and `XXX_iso.txt`.
-4. Run `PANG.py` with the option `-d XXX` in order to run Pang on the data `XXX`.
+4. Run `PANG.py`. Two parameters are required:
+* `-d XXX` : the name of the dataset
+* `-k k` : the number of patterns to consider. It can be a single value, or a list of values separated by commas.
 
-For each value of the parameter `k` **TODO what is this k?**, Pang will create a file `KResults.txt` containing the results of the classification and a file `KPatterns.txt` containing the patterns.
+For each value of the parameter `k`, Pang will create a file `KResults.txt` containing the results of the classification and a file `KPatterns.txt` containing the patterns.
 
 
 # Dependencies
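To illustrate the `-d` and `-k` parameters introduced in the hunk above, here is a minimal sketch that parses a `-k` value (a single integer or a comma-separated list) and assembles the command line for step 4. Both helper names, `parse_k_values` and `build_pang_command`, are hypothetical; how `PANG.py` itself parses `-k` is not shown in this diff, so the parsing below is an assumption based only on the README wording.

```python
import sys


def parse_k_values(raw):
    """Parse a -k value: one integer, or several separated by commas.

    Hypothetical helper; mirrors the README description, not PANG.py itself.
    """
    values = [int(part) for part in raw.split(",") if part.strip()]
    if not values:
        raise ValueError("at least one value of k is required")
    return values


def build_pang_command(dataset, k_values, python=sys.executable):
    """Build the argument list for step 4 without executing it."""
    k_arg = ",".join(str(k) for k in k_values)
    return [python, "PANG.py", "-d", dataset, "-k", k_arg]
```

The returned list can be passed to `subprocess.run(command, check=True)` from the folder containing `PANG.py`, for example with `build_pang_command("XXX", parse_k_values("5,10"))`.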
@@ -136,7 +135,7 @@ For the ECML PKDD assessment, we use the following algorithms for the sake of co
 * The `WL` and `WLOA` algorithms are included in the `Grakel` library, documentation available [here](https://ysig.github.io/GraKeL/0.1a8/benchmarks.html)
 * `Graph2Vec` is included in the `karateclub` library, documentation available [here](https://karateclub.readthedocs.io/en/latest/)
 * `DGCNN` is included in the `stellargraph` library, documentation available [here](https://stellargraph.readthedocs.io/en/stable/).
-* We use the implementation of `CORK` from Marisa Thoma. This implementation is available in the `CORKcpp.zip` archive.
+* We use the implementation of `CORK` from Marisa Thoma. This implementation is available in the `CORKcpp.zip` archive, from [here](http://www.dbs.ifi.lmu.de/~thoma/pub/sam2010/sam2010.zip)
 
 
 # References
