Commit 5da7822

Merge branch 'batching'
2 parents 02e05ce + 2248579 commit 5da7822

7 files changed, +16 -13 lines changed

.travis.yml

Lines changed: 2 additions & 2 deletions
@@ -13,10 +13,10 @@ before_install:
  - docker build . --tag ldbc/datagen
  install: true
  script:
- - mvn test
  - cp params-csv-basic.ini params.ini
  - docker run --rm --mount type=bind,source="$(pwd)/",target="/opt/ldbc_snb_datagen/out" --mount type=bind,source="$(pwd)/params.ini",target="/opt/ldbc_snb_datagen/params.ini" ldbc/datagen
- - bash check-md5sums-csv-basic.sh
+ - md5sum social_network/*.csv | sort
+ - "[[ `md5sum social_network/*.csv | sort | md5sum` == 'ee9e6dd99bf7c3459f4c6156d355f5ea -' ]]"
  - mkdir out
  - cp -r substitution_parameters out/
  notifications:
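
The new `.travis.yml` check folds all per-file checksums into a single digest and compares it with a known value. A minimal sketch of the same idea as reusable shell functions follows. The names `combined_md5` and `verify_combined_md5` are hypothetical and not part of the repository, and the sketch compares only the hash field, so the result does not depend on the spacing `md5sum` prints before the `-` placeholder.

```bash
# Hypothetical helper (not in the repository): print one digest for a set of files.
# The per-file md5sum lines are sorted first, so argument order does not matter.
combined_md5() {
  md5sum "$@" | sort | md5sum | cut -d' ' -f1
}

# Hypothetical helper: succeed only if the combined digest equals the expected hash.
verify_combined_md5() {
  expected="$1"
  shift
  [ "$(combined_md5 "$@")" = "$expected" ]
}
```

In CI this could be called as `verify_combined_md5 <expected-hash> social_network/*.csv`. Each `md5sum` line includes the file path, so the combined digest covers the paths as well as the contents.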

README.md

Lines changed: 9 additions & 6 deletions
@@ -42,11 +42,11 @@ To grab Hadoop, extract it, and set the environment values to sensible defaults,

  ```bash
  cp params-csv-basic.ini params.ini
- wget http://archive.apache.org/dist/hadoop/core/hadoop-2.9.2/hadoop-2.9.2.tar.gz
- tar xf hadoop-2.9.2.tar.gz
+ wget http://archive.apache.org/dist/hadoop/core/hadoop-3.2.1/hadoop-3.2.1.tar.gz
+ tar xf hadoop-3.2.1.tar.gz
  export HADOOP_CLIENT_OPTS="-Xmx2G"
- # set this to the Hadoop 2.9.2 directory
- export HADOOP_HOME=`pwd`/hadoop-2.9.2
+ # set this to the Hadoop 3.2.1 directory
+ export HADOOP_HOME=`pwd`/hadoop-3.2.1
  # set this to the repository's directory
  export LDBC_SNB_DATAGEN_HOME=`pwd`
  ./run.sh
@@ -66,10 +66,13 @@ docker build . --tag ldbc/datagen

  Set the `params.ini` in the repository as for the pseudo-distributed case. The file will be mounted in the container by the `--mount type=bind,source="$(pwd)/params.ini,target="/opt/ldbc_snb_datagen/params.ini"` option. If required, the source path can be set to a different path.

- The container outputs its results in the `/opt/ldbc_snb_datagen/out/` directory which contains two sub-directories, `social_network/` and `subsitution_parameters`. In order to save the results of the generation, a directory must be mounted in the container from the host. The driver requires the results be in the datagen repository directory. To generate the data, run the following command which includes changing the owner (`chown`) of the Docker-mounted volumes:
+ The container outputs its results in the `/opt/ldbc_snb_datagen/out/` directory which contains two sub-directories, `social_network/` and `substitution_parameters`. In order to save the results of the generation, a directory must be mounted in the container from the host. The driver requires the results be in the datagen repository directory. To generate the data, run the following command which includes changing the owner (`chown`) of the Docker-mounted volumes.
+
+ :warning: This removes the previously generated `social_network` directory:

  ```bash
- docker run --rm --mount type=bind,source="$(pwd)/",target="/opt/ldbc_snb_datagen/out" --mount type=bind,source="$(pwd)/params.ini",target="/opt/ldbc_snb_datagen/params.ini" ldbc/datagen && \
+ rm -rf social_network/ substitution_parameters && \
+ docker run --rm --mount type=bind,source="$(pwd)/",target="/opt/ldbc_snb_datagen/out" --mount type=bind,source="$(pwd)/params.ini",target="/opt/ldbc_snb_datagen/params.ini" ldbc/datagen; \
  sudo chown -R $USER:$USER social_network/ substitution_parameters/
  ```

base-docker-image/Dockerfile

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ FROM openjdk:8-jdk-stretch
  WORKDIR /opt
  RUN apt-get update
  RUN apt-get install -y bash curl maven python
- RUN curl -L 'http://archive.apache.org/dist/hadoop/core/hadoop-2.9.2/hadoop-2.9.2.tar.gz' | tar -xz
+ RUN curl -L 'http://archive.apache.org/dist/hadoop/core/hadoop-3.2.1/hadoop-3.2.1.tar.gz' | tar -xz
  RUN curl -L 'https://julialang-s3.julialang.org/bin/linux/x64/1.2/julia-1.2.0-linux-x86_64.tar.gz' | tar -xz

  # Copy the project

base-docker-image/pom.xml

Lines changed: 1 addition & 1 deletion
@@ -40,7 +40,7 @@
  <dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client</artifactId>
- <version>2.9.2</version>
+ <version>3.2.1</version>
  </dependency>
  <dependency>
  <groupId>ca.umontreal.iro</groupId>

docker_run.sh

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@ if [ ! -f /opt/ldbc_snb_datagen/params.ini ]; then
  fi

  # Running the generator
- /opt/hadoop-2.9.2/bin/hadoop jar /opt/ldbc_snb_datagen/target/ldbc_snb_datagen-0.4.0-SNAPSHOT-jar-with-dependencies.jar /opt/ldbc_snb_datagen/params.ini
+ /opt/hadoop-3.2.1/bin/hadoop jar /opt/ldbc_snb_datagen/target/ldbc_snb_datagen-0.4.0-SNAPSHOT-jar-with-dependencies.jar /opt/ldbc_snb_datagen/params.ini

  # Cleanup
  rm -f m*personFactors*

pom.xml

Lines changed: 1 addition & 1 deletion
@@ -40,7 +40,7 @@
  <dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client</artifactId>
- <version>2.9.2</version>
+ <version>3.2.1</version>
  </dependency>
  <dependency>
  <groupId>ca.umontreal.iro</groupId>

run.sh

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ if [ ! -f params.ini ]; then
  exit 1
  fi

- DEFAULT_HADOOP_HOME=/home/user/hadoop-2.9.2 #change to your hadoop folder
+ DEFAULT_HADOOP_HOME=/home/user/hadoop-3.2.1 #change to your hadoop folder
  DEFAULT_LDBC_SNB_DATAGEN_HOME=`pwd` #change to your ldbc_snb_datagen folder

  # allow overriding configuration from outside via environment variables
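
The `run.sh` hunk ends at the comment about environment overrides, so the override logic itself is not shown. One common way to implement it, assumed here rather than taken from `run.sh`, is the `${VAR:-default}` parameter expansion:

```bash
# Defaults as they appear in the run.sh hunk
DEFAULT_HADOOP_HOME=/home/user/hadoop-3.2.1
DEFAULT_LDBC_SNB_DATAGEN_HOME=$(pwd)

# Assumed override logic (not shown in the diff): keep the caller's value
# when it is set and non-empty, otherwise fall back to the default.
HADOOP_HOME="${HADOOP_HOME:-$DEFAULT_HADOOP_HOME}"
LDBC_SNB_DATAGEN_HOME="${LDBC_SNB_DATAGEN_HOME:-$DEFAULT_LDBC_SNB_DATAGEN_HOME}"
export HADOOP_HOME LDBC_SNB_DATAGEN_HOME
```

With a pattern like this, `HADOOP_HOME=/opt/hadoop-3.2.1 ./run.sh` would pick up the given directory without editing the script.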
