Documentation changes

Jeff Bornemann · Jeff Bornemann · commit f39741a6faa9 · 2017-03-01T13:56:51.000-06:00
diff --git a/docs/GeneralLayout.adoc b/docs/GeneralLayout.adoc
@@ -1,7 +1,8 @@
 == General Layout
 
-There are two primary components to Grabbit: a client and a server that run in the two CQ instances that you want to copy to and from (respectively).
+Any server with Grabbit installed acts as a Grabbit peer that can send content to, or receive content from, another Grabbit peer.
+
+To pull content into a server, create a new job on the receiving server. To do this, send a `PUT` request to `/grabbit/job` on the RESTful API exposed by Grabbit, with a configuration that specifies the server to pull from, the paths to pull, etc. This is outlined in more detail in link:Running.adoc[Running Grabbit].
 
 A recommended systems layout style is to have all content from a production publisher copied down to a staging "data warehouse" (DW) server to which all lower environments (beta, continuous integration, developer workstations, etc.) will connect. This way minimal load is placed on Production, and additional DW machines can be added to scale out if needed, each of which can grab from the "main" DW.
-The client sends an HTTP(S) GET request with a content path and "last grab time" to the server and receives a protobuf stream of all the content below it that has changed. The client's BasicAuth credentials are used to create the JCR Session, so the client can never see content they don't have explicit access to. There are a number of ways to tune how the client works, including specifying multiple focused paths, parallel or serial execution, JCR Session batch size (the number of nodes to cache before flushing to disk), etc.
 
diff --git a/docs/Monitoring.adoc b/docs/Monitoring.adoc
@@ -25,7 +25,7 @@ A job status has the following format :
 ```
 
 Couple of points worth noting here:
-`"exitCode"` can have 4 states - `UNKNOWN`, `COMPLETED`, `FAILED`, or `VALIDATION_FAILED`. `UNKNOWN` means the job is still running. `COMPLETED` means that the job was completed successfully. `FAILED` means the job failed. `VALIDATION_FAILED` means the job was aborted due to client configuration; This could mean that although the configuration was valid, Grabbit refused to perform some work due to imminent introduction of unintended consequences.
+`"exitCode"` can have 4 states - `UNKNOWN`, `COMPLETED`, `FAILED`, or `VALIDATION_FAILED`. `UNKNOWN` means the job is still running. `COMPLETED` means that the job was completed successfully. `FAILED` means the job failed. `VALIDATION_FAILED` means the job was aborted due to its configuration; this could mean that although the configuration was valid, Grabbit refused to sync a path, for example because its parent path does not exist. Grabbit will not implicitly write parent nodes.
 `"jcrNodesWritten"` : This indicates how many nodes are currently written (increments by 1000)
 `"timeTaken"` : This will indicate the total time taken to complete content grab for `currentPath`
 
@@ -36,9 +36,9 @@ __Sample of a real Grabbit Job status__
 
 image::../assets/jobStatus.png[Job Status]
 
-Two loggers are predefined for Grabbit. One for Grabbit Server and the other for Grabbit Client.
-They are link:grabbit/src/main/content/SLING-INF/content/apps/grabbit/config/org.apache.sling.commons.log.LogManager.factory.config-com.twcable.grabbit.server.batch.xml[batch-server.log] and link:grabbit/src/main/content/SLING-INF/content/apps/grabbit/config/org.apache.sling.commons.log.LogManager.factory.config-com.twcable.grabbit.client.batch.xml[batch-client.log] respectively.
-These log files are for anything logged in **com.twcable.grabbit.server.batch** and **com.twcable.grabbit.client.batch** packages.
+Two loggers are predefined for Grabbit: one for content send operations, and another for content receive operations.
+They are link:../src/main/content/SLING-INF/content/apps/grabbit/config/org.apache.sling.commons.log.LogManager.factory.config-com.twcable.grabbit.send.xml[grabbit-send.log] and link:../src/main/content/SLING-INF/content/apps/grabbit/config/org.apache.sling.commons.log.LogManager.factory.config-com.twcable.grabbit.receive.xml[grabbit-receive.log] respectively.
+These log files capture anything logged in the **com.twcable.grabbit.server** and **com.twcable.grabbit.client** packages.
 
-If you want to see what nodes are being written on the Grabbit Client, change the logging for `batch-client.log` above to `DEBUG` or `TRACE`.
+If you want to see which nodes are being written to a receiving server, change the log level for `grabbit-receive.log` above to `DEBUG` or `TRACE`.
 
diff --git a/docs/Running.adoc b/docs/Running.adoc
@@ -1,6 +1,6 @@
 == Running
 
-Make sure Grabbit Package is installed on both client and server. You can download the package from image:https://api.bintray.com/packages/twcable/aem/Grabbit/images/download.svg[title = "Download", link = "https://bintray.com/twcable/aem/Grabbit/_latestVersion"]
+Make sure the Grabbit package is installed on both the server you are sending content from and the server you are sending content to. You can download the package from image:https://api.bintray.com/packages/twcable/aem/Grabbit/images/download.svg[title = "Download", link = "https://bintray.com/twcable/aem/Grabbit/_latestVersion"]
 
 Once that is done, you need just 2 files to sync content between the servers:
 
@@ -12,7 +12,7 @@ Once that is done, you need just 2 files to sync content between the servers:
 link:../grabbit.sh[This] shell script can be used to initiate new Grabbit jobs, or monitor existing jobs.
 
 - Run grabbit.sh
-- Enter connection details to your Grabbit "client" server (The server you wish to pull content into)
+- Enter connection details for your receiving server (the server you wish to pull content into)
 
 image::../assets/grabbitConnection.png[Grabbit Connection Example]
 
@@ -118,27 +118,26 @@ The corresponding `YAML` configuration for the JSON above will look something li
          - someContent/someOtherExcludeContent
        workflowConfigIds : *damWorkflows
 ```
-
 ===== Required fields
 
-* __serverHost__: The server that the client should get its content from.
-* __serverPort__: The port to connect to on the server that the client should use.
-* __serverUsername__: The username the client should use to authenticate against the server.
-* __serverPassword__: The password the client should use to authenticate against the server.
+* __serverHost__: The host of the sending server, from which content is pulled.
+* __serverPort__: The port to connect to on the host above.
+* __serverUsername__: The username used to authenticate against the sending server.
+* __serverPassword__: The password used to authenticate against the sending server.
 * __pathConfigurations__: The list of paths and their options to pull from the server.
 ** __path__: The path to recursively grab content from.
 
 ===== Optional fields
 
-* __serverScheme__: string. The protocol the client should use when connecting to the server. Supported options are `http` and `https`. Defaults to `http`.
-* __deltaContent__: boolean, ```true``` syncs only 'delta' or changed content. Changed content is determined by comparing one of a number of date properties including jcr:lastModified, cq:lastModified, or jcr:created Date with the last successful Grabbit sync date. Nodes without any of previously mentioned date properties will always be synced even with deltaContent on, and if a node's data is changed without updating a date property (ie, from CRX/DE), the change will not be detected.  Most common throughput bottlenecks are usually handled by delta sync for cases such as large DAM trees; but if your case warrants a more fine tuned use of delta sync, you may consider adding mix:lastModified to nodes not usually considered for exclusion, such as extremely large unstructured trees. The deltaContent flag __only__ applies to changes made on the server - changes to the client environment will not be detected (and won't be overwritten if changes were made on the client's path but not on the server).
+* __serverScheme__: string. The protocol to use when connecting to the sending server. Supported options are `http` and `https`. Defaults to `http`.
+* __deltaContent__: boolean, ```true``` syncs only 'delta' or changed content. Changed content is determined by comparing one of a number of date properties, including jcr:lastModified, cq:lastModified, or jcr:created, with the last successful Grabbit sync date. Nodes without any of the previously mentioned date properties will always be synced, even with deltaContent on, and if a node's data is changed without updating a date property (i.e., from CRX/DE), the change will not be detected. Most common throughput bottlenecks are usually handled by delta sync for cases such as large DAM trees; but if your case warrants a more fine-tuned use of delta sync, you may consider adding mix:lastModified to nodes not usually considered for exclusion, such as extremely large unstructured trees. The deltaContent flag __only__ applies to changes made on the sending server - changes to the receiving environment will not be detected (and won't be overwritten if changes were made on the receiving path but not on the sending path).
 * __batchSize__: integer. Used to specify the number of nodes in one batch, Defaults to 100.
-* __deleteBeforeWrite__: boolean. Before the client retrieves content, should content under each path be cleared? When used in combination with excludePaths, nodes indicated by excludePaths will not be deleted
+* __deleteBeforeWrite__: boolean. Whether content under each path should be cleared before the receiving server retrieves content. When used in combination with excludePaths, nodes indicated by excludePaths will not be deleted.
 
 Under path configurations
 
 ** __excludePaths__: This allows excluding specific subpaths from what will be retrieved from the parent path. See more detail below.
-** __workflowConfigIds__: Before the client retrieves content for the path from the server, it will make sure that the specified workflows are disabled. They will be re-enabled when all content specifying that workflow has finished copying. (Grabbit handles the situation of multiple paths specifying "overlapping" workflows.) This is particularly useful for areas like the DAM where a number of relatively expensive workflows will just "redo" what is already being copied.
+** __workflowConfigIds__: Before the receiving server retrieves content for the path from the sending server, it will make sure that the specified workflows are disabled. They will be re-enabled when all content specifying that workflow has finished copying. (Grabbit handles the situation of multiple paths specifying "overlapping" workflows.) This is particularly useful for areas like the DAM where a number of relatively expensive workflows will just "redo" what is already being copied.
 ** __deleteBeforeWrite__: Individual path overwrite for global deleteBeforeWrite setting.
 ** __deltaContent__: boolean. Individual path overwrite for the global deltaContent setting. Functionality is the same, but on a path-by-path basis, instead of applying to all path configurations. No matter what the global setting is, specifying this field will overwrite it. If not specified, the path will sync according to the global setting.
 ** __batchSize__: integer. Individual path override the global batchSize configuration. Functionality is the same, but on path-by-path basis. No matter what the global setting is, specifying this field will overwrite it. If not specified, the path will sync according to the global setting.
diff --git a/src/main/content/SLING-INF/content/apps/grabbit/config/org.apache.sling.commons.log.LogManager.factory.config-com.twcable.grabbit.receive.xml b/src/main/content/SLING-INF/content/apps/grabbit/config/org.apache.sling.commons.log.LogManager.factory.config-com.twcable.grabbit.receive.xml
@@ -2,7 +2,7 @@
     <primaryNodeType>sling:OsgiConfig</primaryNodeType>
     <property>
         <name>org.apache.sling.commons.log.file</name>
-        <value>logs/batch-client.log</value>
+        <value>logs/grabbit-receive.log</value>
         <type>String</type>
     </property>
 
diff --git a/src/main/content/SLING-INF/content/apps/grabbit/config/org.apache.sling.commons.log.LogManager.factory.config-com.twcable.grabbit.send.xml b/src/main/content/SLING-INF/content/apps/grabbit/config/org.apache.sling.commons.log.LogManager.factory.config-com.twcable.grabbit.send.xml
@@ -2,7 +2,7 @@
     <primaryNodeType>sling:OsgiConfig</primaryNodeType>
     <property>
         <name>org.apache.sling.commons.log.file</name>
-        <value>logs/batch-server.log</value>
+        <value>logs/grabbit-send.log</value>
         <type>String</type>
     </property>