Running Shark Locally
HarveyFeng edited this page May 18, 2012 · 6 revisions
This guide describes how to get Shark running locally.
Shark requires Hive 0.7.0 and Spark (0.4-SNAPSHOT).
Get the patched Hive from the AMPLab GitHub account:
$ export HIVE_DEV_HOME=/path/to/hive
$ git clone git://github.com/amplab/hive.git -b shark-0.7.0 $HIVE_DEV_HOME
$ cd $HIVE_DEV_HOME
$ ant package
Get Spark from GitHub, compile it, and publish it to the local Ivy repository:
$ git clone git://github.com/mesos/spark.git spark
$ cd spark
$ sbt/sbt publish-local
Get Shark from GitHub:
$ git clone git://github.com/amplab/shark.git shark
$ cd shark
Before building Shark, first edit the config file conf/shark-env.sh.
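A minimal sketch of conf/shark-env.sh is shown below. The paths and the Scala version are illustrative assumptions, not definitive values; substitute the locations of your own checkouts:

```shell
#!/usr/bin/env bash
# Illustrative settings only -- adjust every path for your machine.
export SCALA_HOME=/path/to/scala-2.9.1       # assumed Scala install location
export HIVE_HOME=$HIVE_DEV_HOME/build/dist   # the patched Hive built above
export SPARK_HOME=/path/to/spark             # the Spark checkout built above
export MASTER=local                          # run Spark in local mode
```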
Compile Shark (make sure $HIVE_HOME is set to $HIVE_DEV_HOME/build/dist, either in the config file or as an environment variable):
$ sbt/sbt products
There are several executables in the bin/ directory:
- shark: Runs the Shark CLI.
- shark-withinfo: Runs Shark with INFO-level logs printed to the console.
- shark-withdebug: Runs Shark with DEBUG-level logs printed to the console.
- shark-shell: Runs the Shark Scala console. This provides an experimental feature to convert Hive QL queries into a TableRDD.
- clear-buffer-cache.py: Automatically clears OS buffer caches on Mesos EC2 clusters. This is handy for performance studies.
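As a quick smoke test once the build completes, a query can be run non-interactively. Shark's CLI inherits the Hive CLI's flags, so -e runs a single query; the table name here is hypothetical and assumes you have created tables of your own:

```shell
# Run a single HiveQL query and exit (-e comes from the Hive CLI).
# "src" is a hypothetical table name; create your own tables first.
bin/shark -e "SELECT COUNT(*) FROM src;"

# Or start the interactive CLI:
bin/shark
```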