Commit 03ca936

Remove ENV options, add ini requirement
1 parent 327a8ed commit 03ca936

3 files changed: 13 additions and 25 deletions
Dockerfile

Lines changed: 3 additions & 6 deletions

@@ -8,12 +8,9 @@ RUN curl -L 'http://archive.apache.org/dist/hadoop/core/hadoop-2.6.0/hadoop-2.6.
 # Copy the project
 COPY . /opt/ldbc_snb_datagen
 WORKDIR /opt/ldbc_snb_datagen
+# Remove sample parameters
+RUN rm params*.ini
+# Build jar bundle
 RUN mvn -DskipTests clean assembly:assembly

-ENV HADOOP_CLIENT_OPTS '-Xmx8G'
-ENV DATAGEN_SCALE_FACTOR 'snb.interactive.1'
-ENV DATAGEN_PERSON_SERIALIZER 'ldbc.snb.datagen.serializer.snb.interactive.CSVPersonSerializer'
-ENV DATAGEN_INVARIANT_SERIALIZER 'ldbc.snb.datagen.serializer.snb.interactive.CSVInvariantSerializer'
-ENV DATAGEN_PERSON_ACTIVITY_SERIALIZER 'ldbc.snb.datagen.serializer.snb.interactive.CSVPersonActivitySerializer'
-
 CMD /opt/ldbc_snb_datagen/docker_run.sh

README.md

Lines changed: 4 additions & 13 deletions

@@ -43,25 +43,16 @@ docker build . --tag ldbc/datagen

 ### Running

-The project will output its results in the `/opt/ldbc_snb_datagen/social_network/` directory. In order to save the results of the generation, a directory must be mounted in the container from the host:
+In order to run the container, a `params.ini` file is required. For reference, please see the `params*.ini` files in the repository. The file is mounted in the container by the `--mount type=bind,source="$(pwd)/params.ini",target="/opt/ldbc_snb_datagen/params.ini"` option. If required, the source path can be set to a different path.
+
+The container will output its results in the `/opt/ldbc_snb_datagen/social_network/` directory. In order to save the results of the generation, a directory must be mounted in the container from the host:

 ```
 mkdir datagen_output

-docker run --rm --mount type=bind,source="$(pwd)/datagen_output/",target="/opt/ldbc_snb_datagen/social_network/" ldbc/datagen
+docker run --rm --mount type=bind,source="$(pwd)/datagen_output/",target="/opt/ldbc_snb_datagen/social_network/" --mount type=bind,source="$(pwd)/params.ini",target="/opt/ldbc_snb_datagen/params.ini" ldbc/datagen
 ```

-### Options
-
-The container image can be customized with environment variables passed through the `docker run` command. The following options are present:
-* `HADOOP_CLIENT_OPTS`: A standard Hadoop environment variable controlling the Hadoop client parameters. The default is `-Xmx8G`
-* `DATAGEN_SCALE_FACTOR`: The scale factor of the generated dataset. The default is `snb.interactive.1`
-* `DATAGEN_PERSON_SERIALIZER`: The serializer used for Person objects. The default is `ldbc.snb.datagen.serializer.snb.interactive.CSVPersonSerializer`
-* `DATAGEN_INVARIANT_SERIALIZER` The serializer used for Invariant objects. The default is `ldbc.snb.datagen.serializer.snb.interactive.CSVInvariantSerializer`
-* `DATAGEN_PERSON_ACTIVITY_SERIALIZER` The serializer used for Invariant objects. The default is `ldbc.snb.datagen.serializer.snb.interactive.CSVPersonActivitySerializer`
-
-<!-- **Datasets** -->
-
 <!-- Publicly available datasets can be found at the LDBC-SNB Amazon Bucket. These datasets are the official SNB datasets and were generated using version 0.2.6. They are available in the three official supported serializers: CSV, CSVMergeForeign and TTL. The bucket is configured in "Requester Pays" mode, thus in order to access them you need a properly set up AWS client.
 * http://ldbc-snb.s3.amazonaws.com/ -->
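
Since this commit drops the `DATAGEN_*` environment variables, their former defaults now have to come from the mounted `params.ini`. The sketch below writes a file containing the four keys that the removed `docker_run.sh` lines used to append, with the default values from the removed `ENV` lines. Whether these four keys alone form a complete configuration is an assumption: the old script appended them to a `params.ini` that already existed in the image, so the repository's `params*.ini` samples remain the reference.

```shell
# Recreate the former defaults as a params.ini in the current directory.
# Keys come from the deleted docker_run.sh lines, values from the deleted
# ENV lines of the Dockerfile. Assumption: no other keys are required.
cat > params.ini <<'EOF'
ldbc.snb.datagen.generator.scaleFactor:snb.interactive.1
ldbc.snb.datagen.serializer.personSerializer:ldbc.snb.datagen.serializer.snb.interactive.CSVPersonSerializer
ldbc.snb.datagen.serializer.invariantSerializer:ldbc.snb.datagen.serializer.snb.interactive.CSVInvariantSerializer
ldbc.snb.datagen.serializer.personActivitySerializer:ldbc.snb.datagen.serializer.snb.interactive.CSVPersonActivitySerializer
EOF
```

The resulting file can then be bind-mounted with the `docker run` command from the README. The image also no longer sets `HADOOP_CLIENT_OPTS` (previously `-Xmx8G`); because it is a standard Hadoop environment variable, it can presumably still be supplied at run time through the `-e` option of `docker run`.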

docker_run.sh

Lines changed: 6 additions & 6 deletions

@@ -1,11 +1,11 @@
 #o!/bin/bash

-# Parameter serialization
-PARAMS_FILE=params.ini
-echo "ldbc.snb.datagen.generator.scaleFactor:${DATAGEN_SCALE_FACTOR}" >> ${PARAMS_FILE}
-echo "ldbc.snb.datagen.serializer.personSerializer:${DATAGEN_PERSON_SERIALIZER}" >> ${PARAMS_FILE}
-echo "ldbc.snb.datagen.serializer.invariantSerializer:${DATAGEN_INVARIANT_SERIALIZER}" >> ${PARAMS_FILE}
-echo "ldbc.snb.datagen.serializer.personActivitySerializer:${DATAGEN_PERSON_ACTIVITY_SERIALIZER}" >> ${PARAMS_FILE}
+set -e
+
+if [ ! -f /opt/ldbc_snb_datagen/params.ini ]; then
+    echo "The params.ini file is not present"
+    exit 1
+fi

 # Running the generator
 /opt/hadoop-2.6.0/bin/hadoop jar /opt/ldbc_snb_datagen/target/ldbc_snb_datagen-0.2.7-jar-with-dependencies.jar /opt/ldbc_snb_datagen/params.ini
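
The existence check added above could also be written as a small reusable function. The following is only a sketch of one possible variant, not code from this commit: the function name `require_params_file` is hypothetical, and the extra check for an empty file and the use of stderr are assumptions about what might be useful.

```shell
# Hypothetical helper, not part of the commit: fail early when the
# parameter file is missing or empty, reporting the problem on stderr.
require_params_file() {
    params_file="$1"
    if [ ! -f "$params_file" ]; then
        echo "The params.ini file is not present: $params_file" >&2
        return 1
    fi
    if [ ! -s "$params_file" ]; then
        echo "The params.ini file is empty: $params_file" >&2
        return 1
    fi
    return 0
}
```

In a script that runs under `set -e`, calling `require_params_file /opt/ldbc_snb_datagen/params.ini` before the generator would stop the run with a non-zero exit status, which is the same outcome as the explicit `exit 1` in the commit.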
