
Guideline to Run Tests

Matt Sun edited this page Jul 6, 2016 · 10 revisions

This guide explains how to run the tests in MLCP and the Hadoop Connector. It also describes the tests your contribution must pass before you submit a pull request.

Test Environment

MLCP tests require a running MarkLogic Server instance. By default, the tests assume the following MarkLogic configuration:

  • A MarkLogic Server of the same version as MLCP is running on localhost
  • The Documents database is available
  • The Documents database uses the "Modules" database as its modules database
  • Port 8000 is available
  • The tests run with the username/password admin/admin

You can customize some of these settings with Configurable Test Parameters.

MLCP tests make the following changes to the server:

  • Enable triple index on Documents database
  • Enable collection lexicon on Documents database
  • Create range element indexes on Documents database
  • Create temporal axes on Documents database
  • Create temporal collections on Documents database
  • Create the CopyDst database and the CopyDstForest forest if they do not exist. If you specify a custom CopyDst database name, MLCP checks whether that database exists and creates it if necessary.

If you specify a custom test database, the changes listed above for the Documents database are applied to that database instead. MLCP does not undo these changes after the tests finish.

Hadoop Connector tests don't require a running MarkLogic Server instance.

How to Run

To run the tests in both MLCP and the Hadoop Connector, run the following command from the marklogic-contentpump root directory:

$ mvn test

Alternatively, you can run only the MLCP tests or only the Hadoop Connector tests by running the same command from the mlcp or mapreduce directory, respectively. Things to note:

  • Maven builds the product before the "test" phase, and building MLCP requires that the Hadoop Connector has been built successfully first.
  • To run the MLCP distributed tests, run the MLCP build command first and then the command above, because MLCP uses the built binary jar to run distributed jobs.
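As a minimal sketch of running the tests for a single module, the helper below changes into a module directory and runs the same Maven command there. The function name run_module_tests and the checkout path in the usage comment are hypothetical; only the mlcp and mapreduce directory names and the "mvn test" command come from this page.

```shell
# Run "mvn test" inside one module of a marklogic-contentpump checkout.
#   $1: path to the checkout root (hypothetical; use your own path)
#   $2: module directory, "mlcp" or "mapreduce"; defaults to "." (both modules)
run_module_tests() {
  root="$1"
  module="${2:-.}"
  # The subshell keeps the caller's working directory unchanged.
  (cd "$root/$module" && mvn test)
}

# Usage (not executed here):
#   run_module_tests ~/src/marklogic-contentpump mapreduce
#   run_module_tests ~/src/marklogic-contentpump mlcp
```

Running the mapreduce module first mirrors the note above that MLCP depends on a successful Hadoop Connector build.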

MLCP Tests: Local vs Distributed

MLCP can run in local mode or distributed mode. Running distributed mode requires a Hadoop cluster. Please refer to this page for more information about MLCP running in different modes.

MLCP tests run in the default mode, which is determined by your environment. If the environment variable HADOOP_CONF_DIR is set, MLCP (and its tests) run in distributed mode by default. Otherwise, they run in local mode by default (read more about MLCP running mode).
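To see which default mode your current shell would select, a quick check along these lines can help. This is only a sketch: mlcp_default_mode is a hypothetical helper name, and it treats an empty HADOOP_CONF_DIR the same as an unset one, which is an assumption rather than documented MLCP behavior.

```shell
# Report the default mode implied by the current environment.
# Assumption: an empty HADOOP_CONF_DIR counts as "not set".
mlcp_default_mode() {
  if [ -n "${HADOOP_CONF_DIR:-}" ]; then
    echo "distributed"
  else
    echo "local"
  fi
}

mlcp_default_mode
```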

MLCP contains local tests and distributed tests. Distributed tests verify that MLCP works with a Hadoop cluster. By default, only the local tests are run. To also run the distributed tests, append "-Ddistributed=true" to the test command. See the Configurable Test Parameters section for additional parameters. Besides a Hadoop environment, the MLCP distributed tests have these requirements:

  • HDFS is installed and running - Some tests use HDFS as the file system for the export destination
  • MLCP runs on a node of the Hadoop cluster - The distributed tests use test resources located in the MLCP directory on the local file system (Read more about this).
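The requirements above can be partly checked before starting a long test run. The sketch below prints the distributed test command only when two rough preconditions hold. The helper name distributed_test_cmd is hypothetical, and the checks are approximations: they confirm that HADOOP_CONF_DIR is set and that an hdfs command can be found, not that HDFS is actually running or that this machine is a cluster node.

```shell
# Print the distributed test command if basic prerequisites look satisfied.
# Returns 1 (with a message on stderr) when a check fails.
distributed_test_cmd() {
  if [ -z "${HADOOP_CONF_DIR:-}" ]; then
    echo "HADOOP_CONF_DIR is not set" >&2
    return 1
  fi
  # Only checks that an hdfs command exists, not that HDFS is up.
  if ! command -v hdfs >/dev/null 2>&1; then
    echo "hdfs command not found" >&2
    return 1
  fi
  echo "mvn test -Ddistributed=true"
}

distributed_test_cmd || echo "prerequisites missing; run the local tests only"
```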

Configurable Test Parameters

You can customize your test environment by passing additional parameters to the test command. All parameters below are optional; the default value is used for any parameter you do not specify.

| Parameter | Type | Default | Usage Example | Applicable To | Description |
|---|---|---|---|---|---|
| test | string | Not used | -Dtest=TestImportDocs | MLCP, Hadoop Connector | Specify the test to run. Read more from Maven Surefire Plugin |
| testDb | string | Documents | -DtestDb=Documents | MLCP | The database that tests will use |
| testPort | string | 8000 | -DtestPort=8000 | MLCP | The port that tests will use to talk to MarkLogic Server |
| skipTests | boolean | false | -DskipTests=false | MLCP, Hadoop Connector | Whether tests should be skipped |
| testCopyDst | string | CopyDst | -DtestCopyDst=CopyDst | MLCP | The database that tests for the COPY command will copy to |
| testOutputPath | string | /tmp/mlcpout | -DtestOutputPath=/tmp/mlcpout | MLCP | The directory path where test output is saved |
| distributed | boolean | false | -Ddistributed=true | MLCP | Whether distributed tests will also be run |
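Several parameters can be combined on one command line. As a sketch, the helper below assembles (but does not run) an "mvn test" command from optional overrides. The function name build_test_cmd, the MLCP_* variable names, and the example values MlcpTest and 8011 are all hypothetical; only the -D parameter names come from the list above.

```shell
# Assemble an "mvn test" command line from optional overrides and print it.
# Unset or empty variables are skipped, so the defaults apply.
# Note: values containing spaces are not handled by this sketch.
build_test_cmd() {
  cmd="mvn test"
  if [ -n "${MLCP_TEST_DB:-}" ]; then cmd="$cmd -DtestDb=$MLCP_TEST_DB"; fi
  if [ -n "${MLCP_TEST_PORT:-}" ]; then cmd="$cmd -DtestPort=$MLCP_TEST_PORT"; fi
  if [ -n "${MLCP_TEST_COPY_DST:-}" ]; then cmd="$cmd -DtestCopyDst=$MLCP_TEST_COPY_DST"; fi
  if [ -n "${MLCP_TEST_OUTPUT:-}" ]; then cmd="$cmd -DtestOutputPath=$MLCP_TEST_OUTPUT"; fi
  echo "$cmd"
}

# Example: a non-default database and port (hypothetical values).
MLCP_TEST_DB=MlcpTest
MLCP_TEST_PORT=8011
build_test_cmd
```

Printing the command instead of running it makes it easy to review the final parameters before starting a test run against a shared server.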

Verify Your Changes

If you made changes to MLCP, verify that your changes don't introduce any issues by running the MLCP unit tests and confirming that they pass.

If you made changes to the Hadoop Connector, verify that your changes don't introduce any issues by running BOTH the Hadoop Connector tests and the MLCP tests and confirming that they pass.

Currently, the minimum requirement for the project maintainers to accept a pull request to MLCP or the Hadoop Connector is that all Hadoop Connector tests and all MLCP LOCAL tests pass. Pull requests with failing tests won't be accepted. The unit tests included in MLCP and the Hadoop Connector are only a small subset of the tests we have for the products and are designed only as a sanity check.

More contribution policies are under discussion.
