Commit 4f1549f

2 parents 83b79d0 + 93fa264 commit 4f1549f

2 files changed: +32 -19 lines changed

README.md

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
11
ldbc_socialnet_bm
22
=================
33

4-
Social Network Benchmark: DBGEN dataset generator and QGEN workload generator
4+
Social Network Benchmark: [DBGEN](https://github.com/ldbc/ldbc_socialnet_bm/tree/master/ldbc_socialnet_dbgen) dataset generator and QGEN workload generator

ldbc_socialnet_dbgen/README.md

Lines changed: 31 additions & 18 deletions
@@ -1,36 +1,47 @@
11
# Introduction
22

3+
The LDBC Social Network Dataset Generator (SNDG) is responsible for providing the data sets used by all the LDBC benchmarks. It is designed to produce directed labeled graphs that mimic the characteristics of graphs found in real data. A detailed description of the generator can be found in the following pages:
4+
5+
* In **[Data Schema](https://github.com/ldbc/ldbc_socialnet_bm/wiki/Data-Schema)**, a description of the schema of the data produced by the generator.
6+
* In **[Data Generation Process](https://github.com/ldbc/ldbc_socialnet_bm/wiki/Data-Generation)**, information about the generation process of the data.
7+
* In **[Data Output](https://github.com/ldbc/ldbc_socialnet_bm/wiki/Data-Output)**, a description of the contents and the format of the files produced by the generator.
8+
9+
310
ldbc_socialnet_dbgen is part of the LDBC project (http://www.ldbc.eu/).
411
ldbc_socialnet_dbgen is GPLv3 licensed; for detailed information about this license, read LICENSE.txt.
512

6-
This software was build using Apache hadoop version 1.0.3 and we not guarantee compatibility with newer releases.
7-
You can download hadoop 1.0.3 from http://archive.apache.org/dist/hadoop/core/hadoop-1.0.3/
813

14+
## Requirements
15+
16+
This software is built using Apache Hadoop version 1.0.3, and we do not guarantee compatibility with newer releases.
17+
You can download Hadoop 1.0.3 from [here](http://archive.apache.org/dist/hadoop/core/hadoop-1.0.3/). To configure your Hadoop machine or cluster, please visit [here](http://hadoop.apache.org/docs/stable/index.html).
918

10-
## Compilation
1119

12-
The compilation uses Apache Maven to automatically detect and download the necessary dependencies. See: maven.apache.org.
20+
## Compilation
1321

14-
Make sure you are in your ldbc_socialnet_bm/ldbc_socialnet_dbgen/ project folder.
15-
To generate the jar containing all the dependencies the following maven instruction is used:
22+
The compilation uses [Apache Maven](http://maven.apache.org) to automatically detect and download the necessary dependencies. Make sure you are in your ldbc_socialnet_bm/ldbc_socialnet_dbgen/ project folder.
23+
To generate the jar containing all the dependencies, type
1624

25+
```
1726
mvn assembly:assembly
27+
```
1828

19-
This can lead to the generation of two jars in the target folder the default one called ldbc_socialnet_dbgen-<Version-Number>.jar or the one containing all the dependencies inside the jar called ldbc_socialnet_dbgen.jar.
29+
This can generate two jars in the target folder: the default one, called ldbc_socialnet_dbgen-\<Version-Number\>.jar, and one containing all the dependencies, called ldbc_socialnet_dbgen.jar.
2030

2131

2232
## Configuration
2333

24-
* Configure your hadoop machine or cluster. For more information on how to do it, please refer its official page http://hadoop.apache.org/docs/stable/index.html
34+
The SNDG is configured by means of the ldbc\_socialnet\_bm/ldbc\_socialnet\_dbgen/_params.ini_ file. Set the parameters to meet your needs. The file has the following format:
2535

26-
* Configure the params.ini to your needs. This file contains:
27-
- numtotalUser: The number of users the social network will have. It shoud be bigger than 1000.
28-
- startYear: The first year.
29-
- numYears: The period of years.
30-
- serializerType: The serializer type has to be one of this three values: ttl (Turtle format), n3 (N3 format), csv (coma separated value).
31-
- rdfOutputFileName: The base name for the files generated in rdf format (Turtle and N3)
36+
```
37+
numtotalUser: #The number of users the social network will have. It should be bigger than 1000.
38+
startYear: #The first year.
39+
numYears: #The period of years.
40+
serializerType: #The serializer type has to be one of these three values: ttl (Turtle format), n3 (N3 format), csv (comma-separated values).
41+
rdfOutputFileName: #The base name for the files generated in rdf format (Turtle and N3)
42+
```
3243

33-
This configuration will generate for the startYear-01-01 to the (startYear+numYears)-01-01 period activity in the simulated social network for the amount of users configurated.
44+
This configuration will generate a database covering the activity in the simulated social network for the period from startYear-01-01 to (startYear+numYears)-01-01, for the configured number of users.
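
As an illustration of the constraints described above, the following is a minimal Python sketch that parses a `key: value` parameter file, checks the two documented restrictions (more than 1000 users, serializer one of `ttl`, `n3`, `csv`) and computes the generation period. It is not part of the generator: the function names, the treatment of `#` as a comment marker, and the example values are all assumptions made for this sketch.

```python
from datetime import date

# Serializer values listed in the configuration description.
ALLOWED_SERIALIZERS = {"ttl", "n3", "csv"}


def parse_params(text):
    """Parse 'key: value' lines into a dict.

    Assumption: anything after '#' on a line is a comment.
    """
    params = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        key, value = line.split(":", 1)
        params[key.strip()] = value.strip()
    return params


def validate(params):
    """Return a list of problems found in the parsed parameters."""
    errors = []
    if int(params["numtotalUser"]) <= 1000:
        errors.append("numtotalUser should be bigger than 1000")
    if params["serializerType"] not in ALLOWED_SERIALIZERS:
        errors.append("serializerType must be one of ttl, n3, csv")
    return errors


def generation_period(params):
    """Return (start, end) dates: startYear-01-01 to (startYear+numYears)-01-01."""
    start_year = int(params["startYear"])
    end_year = start_year + int(params["numYears"])
    return date(start_year, 1, 1), date(end_year, 1, 1)


# Hypothetical example values, chosen only for illustration.
example = """\
numtotalUser: 10000
startYear: 2010
numYears: 3
serializerType: csv
rdfOutputFileName: example_output
"""

params = parse_params(example)
problems = validate(params)
period = generation_period(params)
print(problems, period)
```

With these example values, the period spans three years starting on 1 January of `startYear`, matching the rule stated above.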
3445

3546

3647
## Execution
@@ -41,11 +52,13 @@ Terminology:
4152
* $HADOOP_HOME is used to refer to the hadoop-1.0.3 folder in your system.
4253
* $LDBC_SOCIALNET_DBGEN_HOME is used to refer to the ldbc_socialnet_dbgen folder in your system.
4354

44-
The execution instruction is:
55+
To execute the generator, please type:
4556

57+
```
4658
$HADOOP_HOME/bin/hadoop jar $LDBC_SOCIALNET_DBGEN_HOME/ldbc_socialnet_dbgen.jar hadoop_input_folder hadoop_output_folder Num_machines_ldbc_will_use $LDBC_SOCIALNET_DBGEN_HOME/ Final_output_folder
59+
```
4760

48-
You can refer to the run.sh script to see a clearer example of how to run it.
61+
In ldbc\_socialnet\_bm/ldbc\_socialnet\_dbgen/run.sh you can find a full example of how to compile and execute the SNDG.
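
The invocation above takes its arguments in a fixed order, which can be made explicit with a small Python sketch that assembles the argument list. This is only an illustration of the command shown in this section, not a replacement for run.sh; the function name, parameter names, and all paths below are hypothetical.

```python
def build_command(hadoop_home, dbgen_home, input_folder, output_folder,
                  num_machines, final_output_folder):
    """Assemble the hadoop invocation as an argument list, in the documented order."""
    hadoop_home = hadoop_home.rstrip("/")
    dbgen_home = dbgen_home.rstrip("/")
    return [
        f"{hadoop_home}/bin/hadoop",
        "jar",
        f"{dbgen_home}/ldbc_socialnet_dbgen.jar",
        input_folder,            # hadoop_input_folder
        output_folder,           # hadoop_output_folder
        str(num_machines),       # Num_machines_ldbc_will_use
        f"{dbgen_home}/",        # $LDBC_SOCIALNET_DBGEN_HOME/ (with trailing slash)
        final_output_folder,     # Final_output_folder
    ]


# Hypothetical paths, for illustration only.
cmd = build_command(
    hadoop_home="/opt/hadoop-1.0.3",
    dbgen_home="/opt/ldbc_socialnet_bm/ldbc_socialnet_dbgen",
    input_folder="input",
    output_folder="output",
    num_machines=1,
    final_output_folder="/tmp/sndg_output",
)
print(" ".join(cmd))
```

The resulting list could be passed to `subprocess.run(cmd, check=True)` on a machine where Hadoop 1.0.3 and the built jar are available.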
4962

5063
## Output
51-
The generator will create CSV files [with the following format](https://github.com/ldbc/ldbc_socialnet_bm/wiki/Generated-CSV-Files)
64+
The generator can create data in three formats: CSV, TTL and N3. For more information, please check the [wiki](https://github.com/ldbc/ldbc_socialnet_bm/wiki/Data-Output).
