Setting up Giraph
Follow the instructions on our docker image here: https://registry.hub.docker.com/u/uwsampa/giraph-docker/
What's below here is old and broken.
These instructions will help you build a recent release of Giraph. The build produces jar files that include their dependencies; such jars are not included in the release tarball from the Giraph website. You should be able to use a release tarball too, but using the jars with dependencies was easier for our simple experiments.
You must have the following installed to build and use Giraph:
- Java 1.6 or later
- Maven 3 or later
- A supported version of Hadoop, as described here. This tutorial assumes Hadoop 2.6.0, with Yarn.
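Before building, it can help to confirm that the prerequisites above are on the PATH. The following is a minimal sketch; `check_tools` is a hypothetical helper name, and the script only checks that the commands exist, not that their versions are new enough.

```shell
# Hypothetical helper: report which of the named tools are missing from PATH.
# Returns 0 if every tool was found, 1 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Tools this tutorial relies on. Versions still need a manual look, e.g.
# with: java -version, mvn -version, hadoop version
check_tools java mvn hadoop git || echo "install the missing tools before building"
```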
git clone https://git-wip-us.apache.org/repos/asf/giraph.git
git checkout -t origin/release-1.1
First, get rid of this symbol as described here: [[http://mail-archives.apache.org/mod_mbox/giraph-user/201501.mbox/%3C54B17196.4040107@hiro-tan.org%3E]]
Then, change into the giraph directory and run this command:
mvn -Phadoop_yarn -Dhadoop.version=2.6.0 -DskipTests package
Giraph doesn't seem to play well with Hadoop 2.*, jar dependencies, and HDFS with permissions. I couldn't find a way to use the --yarnjars arguments to get the jars in the right place at this point. Not sure how to proceed to get Giraph running on our cluster.
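Once the Maven build has finished, the jars with dependencies should sit under each module's target directory (the demo command on this page uses one from giraph-examples/target). A small sketch for listing them; `find_fat_jars` is a hypothetical helper name, and the filename pattern is taken from the jar names used on this page.

```shell
# Hypothetical helper: list jars-with-dependencies under a build tree.
# Takes the directory to search as its first argument (default: current dir).
find_fat_jars() {
    find "${1:-.}" -type f -name '*-jar-with-dependencies.jar' | sort
}

# Run from the top of the giraph checkout after the build.
find_fat_jars .
```

The paths it prints are the ones to pass to `hadoop jar` and to the `-yj` option.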
These instructions will help you set up a version of Giraph that works with recent versions of Hadoop. At the time this was written, the most recent Giraph release (1.0.0) doesn't support the most recent release of Hadoop (2.4.1), so we use the Giraph development trunk.
You must have the following installed to build and use Giraph:
- Java 1.6 or later
- Maven 3 or later
- A supported version of Hadoop, as described here. This tutorial assumes Hadoop 2.4.1, with Yarn.
git clone https://git-wip-us.apache.org/repos/asf/giraph.git
Change into the giraph directory and run this command:
mvn -Phadoop_yarn -Dhadoop.version=2.4.1 package -DskipTests
zookeeper
Run a demo.
hadoop jar /shared/hadoop/giraph/giraph-examples/target/giraph-examples-1.1.0-for-hadoop-2.6.0-jar-with-dependencies.jar org.apache.giraph.GiraphRunner org.apache.giraph.examples.SimpleShortestPathsComputation -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat -vip tiny_graph.txt -vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat -op giraph-output-$(date +%s) -yj giraph-examples-1.1.0-SNAPSHOT-for-hadoop-2.6.0-jar-with-dependencies.jar,giraph-1.1.0-SNAPSHOT-for-hadoop-2.6.0-jar-with-dependencies.jar -w 1
$HADOOP_HOME/bin/hdfs dfs -put /scratch/nelson/giraph-1.0.0-x/tiny-graph.txt input
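The contents of the graph file are not shown on this page. As a sketch, assuming it matches the five-vertex example graph from the Giraph quick start guide, it could be created like this. Each line is a JSON array in the form expected by JsonLongDoubleFloatDoubleVertexInputFormat: vertex id, vertex value, then a list of [target id, edge weight] pairs.

```shell
# Assumed contents, following the Giraph quick start example graph.
# Each line is [vertex_id, vertex_value, [[target_id, edge_weight], ...]].
cat > tiny-graph.txt <<'EOF'
[0,0,[[1,1],[3,3]]]
[1,0,[[0,1],[2,2],[3,1]]]
[2,0,[[1,2],[4,4]]]
[3,0,[[0,3],[1,1],[4,4]]]
[4,0,[[3,4],[2,4]]]
EOF

# Quick sanity check: there should be one line per vertex.
wc -l < tiny-graph.txt
```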
$HADOOP_HOME/bin/hadoop jar
$HADOOP_HOME/bin/hadoop jar /scratch/nelson/giraph/giraph-examples/target/giraph-examples-1.1.0-SNAPSHOT-for-hadoop-2.4.1-jar-with-dependencies.jar org.apache.giraph.GiraphRunner org.apache.giraph.examples.SimpleShortestPathsComputation -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat -vip /user/nelson/input/tiny-graph.txt -vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat -op /user/nelson/output -yj giraph-examples-1.1.0-SNAPSHOT-for-hadoop-2.4.1-jar-with-dependencies.jar,giraph-1.1.0-SNAPSHOT-for-hadoop-2.4.1-jar-with-dependencies.jar -w 1
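The long command above is easier to read when the version-dependent jar names are factored out. The sketch below only prints the command (a dry run) so it can be inspected before running it; `giraph_demo_cmd` and the variable names are hypothetical, and the version values are the ones used on this page. The options are: `-vif` vertex input format, `-vip` vertex input path, `-vof` vertex output format, `-op` output path, `-yj` jars to ship to YARN, `-w` number of workers.

```shell
# Hypothetical values; adjust to your build and HDFS layout.
GIRAPH_VERSION="1.1.0-SNAPSHOT"
HADOOP_VERSION="2.4.1"
EXAMPLES_JAR="giraph-examples-${GIRAPH_VERSION}-for-hadoop-${HADOOP_VERSION}-jar-with-dependencies.jar"
CORE_JAR="giraph-${GIRAPH_VERSION}-for-hadoop-${HADOOP_VERSION}-jar-with-dependencies.jar"

# Print the demo command for the given jar directory ($1), HDFS input
# path ($2), and HDFS output path ($3). Nothing is executed.
giraph_demo_cmd() {
    printf '%s ' \
        hadoop jar "$1/$EXAMPLES_JAR" org.apache.giraph.GiraphRunner \
        org.apache.giraph.examples.SimpleShortestPathsComputation \
        -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat \
        -vip "$2" \
        -vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat \
        -op "$3" \
        -yj "$EXAMPLES_JAR,$CORE_JAR" \
        -w 1
    echo
}

giraph_demo_cmd /scratch/nelson/giraph/giraph-examples/target \
    /user/nelson/input/tiny-graph.txt /user/nelson/output
```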
$GIRAPH_PREFIX/bin/giraph ./giraph-examples-1.1.0-SNAPSHOT.jar org.apache.giraph.examples.SimpleShortestPathsComputation -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat -vip /user/root/input/tiny-graph.txt -vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat -op /user/root/output -yj giraph-core-1.1.0-SNAPSHOT.jar,giraph-examples-1.1.0-SNAPSHOT.jar -w 1
runs: