Running InfoSphere Streams benchmark
Before you begin, create the dataset required by the InfoSphere Streams benchmark: [Create dataset for InfoSphere Streams benchmark](Create dataset for InfoSphere Streams benchmark)
The StreamsEmailBenchmark project contains the InfoSphere Streams application that processes the emails.
- Before you can compile and run the application, copy two directories from the StreamsPrepareDataset project to the root of StreamsEmailBenchmark:
- avroDecode
- avroEncode
- Copy your serialized/compressed dataset (obtained using StreamsPrepareDataset) to
StreamsEmailBenchmark/data
Note: Input files should follow the naming convention filename0.av to filename<parallelism>.av
Samantha: What does this naming convention mean? Do I have to name the files this way?
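The naming convention in the note can be sketched as a small check that lists which input files are expected for a given parallelism and which are missing from the data directory. This is only an illustration: the helper names `expected_input_files` and `missing_input_files` are hypothetical, and the sketch assumes the range in the note is inclusive (index 0 through `<parallelism>`), which the page does not confirm.

```python
from pathlib import Path


def expected_input_files(prefix, parallelism, inclusive=True):
    """Return <prefix>0.av ... <prefix><parallelism>.av, as the note reads.

    Assumption: the upper bound is inclusive. Pass inclusive=False if the
    application turns out to expect indices 0 .. parallelism-1 instead.
    """
    last = parallelism if inclusive else parallelism - 1
    return [f"{prefix}{i}.av" for i in range(last + 1)]


def missing_input_files(data_dir, prefix, parallelism, inclusive=True):
    """List the expected input files that are not present in data_dir."""
    root = Path(data_dir)
    return [
        name
        for name in expected_input_files(prefix, parallelism, inclusive)
        if not (root / name).is_file()
    ]


if __name__ == "__main__":
    # Hypothetical values: prefix "filename" and PARALLELISM=4.
    for name in missing_input_files("StreamsEmailBenchmark/data", "filename", 4):
        print("missing:", name)
```

Running the check before the build makes a mismatch between the file names and the PARALLELISM value visible early, rather than at job submission.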
To build the application, go to the root directory of StreamsEmailBenchmark, and type make all PARALLELISM=<parallelism> at the command line.
To run the application:
- Make sure a Streams instance has been created and started
- Submit the job to the Streams instance:
streamtool submitjob output/Main/Distributed/Main.adl -P filename=<input_file_name> -P windowTime=<flush_interval_for_metrics> -P printWindowMetrics=<yes_or_no>
Samantha: This step does not work: streamtool submitjob -i storm@chanskw output/Main/Distributed/Main.adl -P filename=25percentEnronDataSet -P windowTime=5 -P printWindowMetrics=yes
CDISC5093E The output/Main/Distributed/Main.adl compiled application file could not be accessed. Check that the permissions of the application file are set to allow access. The error is: No such file or directory
Are you sure make all builds the right thing?
Also, what are we supposed to put down for windowTime? What is the unit for window time, seconds or ms?
- Metrics will be dumped to stdout
- CPU time can be obtained by visually inspecting the SPL graph in Streams Studio
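The submission step can be sketched as a helper that assembles the `streamtool submitjob` argument list and first checks that the compiled .adl file exists, which is the condition behind the CDISC5093E error quoted above. This is a minimal sketch, not part of the benchmark: `build_submit_command` and its `check_exists` parameter are hypothetical, and the `-i` and `-P` options are taken from the command shown in the comment above.

```python
import shlex
from pathlib import Path


def build_submit_command(adl_path, filename, window_time, print_window_metrics,
                         instance_id=None, check_exists=True):
    """Assemble the argument list for `streamtool submitjob`.

    Raises FileNotFoundError when the compiled .adl file is missing, so the
    problem shows up before streamtool is invoked.
    """
    if check_exists and not Path(adl_path).is_file():
        raise FileNotFoundError(
            f"compiled application not found at {adl_path}; "
            "check that the build finished and where it wrote its output"
        )
    cmd = ["streamtool", "submitjob"]
    if instance_id is not None:
        cmd += ["-i", instance_id]
    cmd.append(adl_path)
    # Submission-time parameters are passed as repeated -P name=value pairs.
    params = (
        ("filename", filename),
        ("windowTime", window_time),
        ("printWindowMetrics", print_window_metrics),
    )
    for name, value in params:
        cmd += ["-P", f"{name}={value}"]
    return cmd


if __name__ == "__main__":
    # Values copied from the comment above; check_exists=False only prints the command.
    cmd = build_submit_command(
        "output/Main/Distributed/Main.adl", "25percentEnronDataSet", 5, "yes",
        instance_id="storm@chanskw", check_exists=False,
    )
    print(shlex.join(cmd))
```

With the default `check_exists=True`, the returned list can be passed to `subprocess.run(cmd, check=True)` from the root of StreamsEmailBenchmark, since the .adl path is relative to that directory.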