Commit 6b6d2ff

fixing load binaries example
1 parent 146e98a commit 6b6d2ff

11 files changed, +301 -182 lines changed

examples/load-binaries/README.md

Lines changed: 19 additions & 40 deletions
@@ -2,9 +2,9 @@
 This example shows how to load binary documents with the Hub Framework.
 
 # TLDR; How do I run it?
-1. Download the latest quick-start jar from the [releases page](https://github.com/marklogic-community/marklogic-data-hub/releases) into this folder.
+1. Download the [latest quick-start war](https://github.com/marklogic-community/marklogic-data-hub/releases/download/v2.0.0-rc.2/quick-start-2.0.0-rc.2.war) into this folder.
 
-1. Run the quick-start jar `java -jar quick-start.jar`
+1. Run the quick-start war `java -jar quick-start-2.0.0-rc.2.war`
 
 1. Open your web browser to [http://localhost:8080](http://localhost:8080).
 
@@ -16,52 +16,31 @@ This example shows how to load binary documents with the Hub Framework.
 
 1. Install the Hub into MarkLogic (if necessary)
 
-1. Load the sample pdf.
-
-1. Click on the Guides entity on the left.
-1. Click on the "LoadAsXml" input flow.
-1. Click on the "Run Flow" button.
-1. Browse to the input folder.
-1. Expand the "General Options" section.
-1. Add ,\\.pdf,'.xml' to the end of "Output URI Replace". It should look something like: /Users/yourname/data-hub/examples/load-binaries,'','\\.pdf','.xml'
-1. Change "Document Type" to "binary".
-1. Scroll down and press the "RUN IMPORT" button.
-
-1. At this point you have loaded the sample data. You can browse the data via [QConsole](http://localhost:8000/qconsole) or by searching the REST endpoint on the Staging Http Server [http://localhost:8010/v1/search](http://localhost:8010/v1/search). *Your port may be different if you changed it during setup*
+## Loading the Sample PDF
+1. Click on the **Entities** Tab at the top.
+1. Click on the Guides entity on the left.
+1. Click on the "LoadAsXml" or "LoadAsJson" input flow.
+1. Browse to the input folder.
+1. Expand the "General Options" section.
+1. Add ,\\.pdf,'.xml' to the end of "Output URI Replace". It should look something like:
+***nix**
+`/Users/yourname/data-hub/examples/load-binaries/input,'',\\.pdf,'.xml'`
+**windows**
+`/c:/Users/yourname/data-hub/examples/load-binaries/input,'',\\.pdf,'.xml'`
+1. Change "Document Type" to "binary".
+1. Scroll down and press the "RUN IMPORT" button.
+
+## Browsing the sample data
+1. At this point you have loaded the sample data. You can browse the data by clicking on the **Browse** tab at the top.
 
 # Ingesting Binaries
 
 When you ingest a binary via the Quick Start MLCP process you will want to do a few things.
 
 1. Change the file extension to xml or json via "Output URI Replace".
 
-This is because we will be manually storing the binary an using the ingest for creating XML or JSON metadata.
+This is because we will be manually storing the binary, but returning XML or JSON metadata for MLCP to store into Marklogic.
 
 1. Change the document type to binary via the "Document Type" option. This is necessary because MLCP might think this document is an XML or JSON after we changed the file extension above.
 
-1. Store the binary manually and return the metatada. In your content.xqy put this:
-
-```xquery
-declare function plugin:create-content(
-  $id as xs:string,
-  $raw-content as node()?,
-  $options as map:map) as node()?
-{
-  (: name the binary uri with a pdf extension :)
-  let $binary-uri := fn:replace($id, ".xml", ".pdf")
-
-  (: stash the binary uri in the options map for later:)
-  let $_ := map:put($options, 'binary-uri', $binary-uri)
-
-  (: save the incoming binary as a pdf :)
-  return
-    xdmp:document-insert($binary-uri, $raw-content),
-
-  (: extract the contents of the pdf and return them
-   : as the content for the envelope
-   :)
-  xdmp:document-filter($raw-content)
-};
-```
-
 Now you have loaded the binary manually and returned xhtml content to store in the envelope.
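
The "Output URI Replace" value in the README steps above is a comma-separated list of pattern and replacement pairs. The following is a minimal sketch, in plain JavaScript, of how such pairs could turn a file path into a document URI. It is an approximation for illustration, not MLCP's implementation: the helper name `applyUriReplace`, the file name `guide.pdf`, and the assumption that every match of each pattern is replaced are all hypothetical.

```javascript
// Hypothetical helper: apply [pattern, replacement] pairs in order,
// treating each pattern as a regular expression (assumed replace-all).
function applyUriReplace(uri, pairs) {
  return pairs.reduce(function (current, pair) {
    return current.replace(new RegExp(pair[0], 'g'), pair[1]);
  }, uri);
}

// Pairs corresponding to the *nix value shown in the README:
// /Users/yourname/data-hub/examples/load-binaries/input,'',\\.pdf,'.xml'
var pairs = [
  ['/Users/yourname/data-hub/examples/load-binaries/input', ''],
  ['\\.pdf', '.xml']
];

// Made-up input file used only for this illustration.
var sourcePath =
  '/Users/yourname/data-hub/examples/load-binaries/input/guide.pdf';

console.log(applyUriReplace(sourcePath, pairs)); // prints "/guide.xml"
```

The first pair strips the local folder prefix and the second swaps the extension, which is why the content plugin later receives an id ending in `.xml` even though the payload is a PDF.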
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+#
+#Mon Sep 18 10:42:06 EDT 2017
+mainModule=main.sjs
+mainCodeFormat=sjs
+codeFormat=sjs
+dataFormat=json
Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
+/*
+ * Create Content Plugin
+ *
+ * @param id - the identifier returned by the collector
+ * @param rawContent - the raw content being loaded.
+ * @param options - an object containing options. Options are sent from Java
+ *
+ * @return - your content
+ */
+function createContent(id, rawContent, options) {
+  // name the binary uri with a pdf extension
+  var binaryUri = fn.replace(id, ".xml", ".pdf");
+
+  // stash the binary uri in the options map for later
+  options.binaryUri = binaryUri;
+
+  // save the incoming binary as a pdf
+  xdmp.eval('declareUpdate(); xdmp.documentInsert(binaryUri, rawContent)', {
+    binaryUri: binaryUri,
+    rawContent: rawContent
+  },{
+    isolation: 'different-transaction',
+    commit: 'auto'
+  });
+
+  // extract the contents of the pdf and return them
+  // as the content for the envelope
+  return xdmp.documentFilter(rawContent);
+
+}
+
+module.exports = {
+  createContent: createContent
+};
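
A note on the extension swap in the plugin above: `fn.replace` follows XQuery's `fn:replace`, which treats its second argument as a regular expression, so the unescaped `.` in `".xml"` matches any character. The sketch below reproduces that behavior in plain JavaScript and compares it with an escaped, end-anchored pattern. Both function names and the sample URI are hypothetical, and the stricter pattern is a suggestion rather than part of this commit.

```javascript
// Mimics the regex semantics of fn:replace with the pattern used in the plugin:
// "." matches any character and every match is replaced.
function swapExtensionLoose(id) {
  return id.replace(/.xml/g, '.pdf');
}

// Hypothetical stricter variant: escape the dot and anchor at the end of the id.
function swapExtensionAnchored(id) {
  return id.replace(/\.xml$/, '.pdf');
}

var id = '/docs/axml-guide.xml'; // made-up URI containing "xml" mid-path

console.log(swapExtensionLoose(id));    // prints "/docs/.pdf-guide.pdf"
console.log(swapExtensionAnchored(id)); // prints "/docs/axml-guide.pdf"
```

For ids such as `/guide.xml` both variants give the same result, so the sample data loads either way. Inside the plugin the stricter pattern would presumably be written as `"\\.xml$"`.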
Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
+/*
+ * Create Headers Plugin
+ *
+ * @param id - the identifier returned by the collector
+ * @param content - the output of your content plugin
+ * @param options - an object containing options. Options are sent from Java
+ *
+ * @return - an object of headers
+ */
+function createHeaders(id, content, options) {
+  return {
+    binaryUri: options.binaryUri
+  };
+}
+
+module.exports = {
+  createHeaders: createHeaders
+};
+
Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
+// dhf.xqy exposes helper functions to make your life easier
+// See documentation at:
+// https://github.com/marklogic/marklogic-data-hub/wiki/dhf-lib
+const dhf = require('/com.marklogic.hub/dhf.xqy');
+
+const contentPlugin = require('./content.sjs');
+const headersPlugin = require('./headers.sjs');
+const triplesPlugin = require('./triples.sjs');
+
+/*
+ * Plugin Entry point
+ *
+ * @param id - the identifier returned by the collector
+ * @param rawContent - the raw content being loaded
+ * @param options - a map containing options. Options are sent from Java
+ *
+ */
+function main(id, rawContent, options) {
+  var contentContext = dhf.contentContext(rawContent);
+  var content = dhf.run(contentContext, function() {
+    return contentPlugin.createContent(id, rawContent, options);
+  });
+
+  var headerContext = dhf.headersContext(content);
+  var headers = dhf.run(headerContext, function() {
+    return headersPlugin.createHeaders(id, content, options);
+  });
+
+  var tripleContext = dhf.triplesContext(content, headers);
+  var triples = dhf.run(tripleContext, function() {
+    return triplesPlugin.createTriples(id, content, headers, options);
+  });
+
+  return dhf.makeEnvelope(content, headers, triples, options.dataFormat);
+}
+
+module.exports = {
+  main: main
+};
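
The entry point above runs the plugins in a fixed order and passes the same `options` object to each, which is how the binary URI stashed by the content plugin reaches the headers plugin. A minimal sketch of that data flow in plain JavaScript follows. The plugin bodies are stand-ins without MarkLogic calls, and `makeEnvelopeSketch` with its envelope shape is a hypothetical simplification, not the output of `dhf.makeEnvelope`.

```javascript
// Stand-in for the content plugin: records the binary URI on the shared
// options object and returns placeholder content (no xdmp calls here).
function createContent(id, rawContent, options) {
  options.binaryUri = id.replace(/\.xml$/, '.pdf');
  return { text: String(rawContent) };
}

// Stand-in for the headers plugin: reads what the content plugin stashed.
function createHeaders(id, content, options) {
  return { binaryUri: options.binaryUri };
}

// Stand-in for the triples plugin: this example emits none.
function createTriples(id, content, headers, options) {
  return [];
}

// Hypothetical, simplified envelope shape used only for this illustration.
function makeEnvelopeSketch(content, headers, triples) {
  return { envelope: { headers: headers, triples: triples, content: content } };
}

// Same ordering as main(): content, then headers, then triples.
function mainSketch(id, rawContent, options) {
  var content = createContent(id, rawContent, options);
  var headers = createHeaders(id, content, options);
  var triples = createTriples(id, content, headers, options);
  return makeEnvelopeSketch(content, headers, triples);
}

var result = mainSketch('/guide.xml', 'extracted text', {});
console.log(JSON.stringify(result, null, 2));
```

Because the headers plugin depends on a side effect of the content plugin, the ordering matters: if the headers step ran first, `binaryUri` would be undefined.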
Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
+/*
+ * Create Triples Plugin
+ *
+ * @param id - the identifier returned by the collector
+ * @param content - the output of your content plugin
+ * @param headers - the output of your heaaders plugin
+ * @param options - an object containing options. Options are sent from Java
+ *
+ * @return - an array of triples
+ */
+function createTriples(id, content, headers, options) {
+  return [];
+}
+
+module.exports = {
+  createTriples: createTriples
+};
+
Lines changed: 49 additions & 37 deletions
@@ -1,37 +1,49 @@
-xquery version "1.0-ml";
-
-module namespace plugin = "http://marklogic.com/data-hub/plugins";
-
-declare namespace envelope = "http://marklogic.com/data-hub/envelope";
-
-declare option xdmp:mapping "false";
-
-(:~
- : Create Content Plugin
- :
- : @param $id - the identifier returned by the collector
- : @param $raw-content - the raw content being loaded.
- : @param $options - a map containing options. Options are sent from Java
- :
- : @return - your transformed content
- :)
-declare function plugin:create-content(
-  $id as xs:string,
-  $raw-content as node()?,
-  $options as map:map) as node()?
-{
-  (: name the binary uri with a pdf extension :)
-  let $binary-uri := fn:replace($id, ".xml", ".pdf")
-
-  (: stash the binary uri in the options map for later:)
-  let $_ := map:put($options, 'binary-uri', $binary-uri)
-
-  (: save the incoming binary as a pdf :)
-  return
-    xdmp:document-insert($binary-uri, $raw-content),
-
-  (: extract the contents of the pdf and return them
-   : as the content for the envelope
-   :)
-  xdmp:document-filter($raw-content)
-};
+xquery version "1.0-ml";
+
+module namespace plugin = "http://marklogic.com/data-hub/plugins";
+
+declare option xdmp:mapping "false";
+
+(:~
+ : Create Content Plugin
+ :
+ : @param $id - the identifier returned by the collector
+ : @param $raw-content - the raw content being loaded.
+ : @param $options - a map containing options. Options are sent from Java
+ :
+ : @return - your transformed content
+ :)
+declare function plugin:create-content(
+  $id as xs:string,
+  $raw-content as node()?,
+  $options as map:map) as node()?
+{
+  (: name the binary uri with a pdf extension :)
+  let $binary-uri := fn:replace($id, ".xml", ".pdf")
+
+  (: stash the binary uri in the options map for later:)
+  let $_ := map:put($options, 'binary-uri', $binary-uri)
+
+  (: save the incoming binary as a pdf :)
+  let $_ :=
+    xdmp:eval('
+      declare variable $binary-uri external;
+      declare variable $raw-content external;
+      xdmp:document-insert($binary-uri, $raw-content)
+    ',
+    map:new((
+      map:entry("binary-uri", $binary-uri),
+      map:entry("raw-content", $raw-content)
+    )),
+    map:new((
+      map:entry("isolation", "different-transaction"),
+      map:entry("commit", "auto")
+    ))
+  )
+  return
+    (:
+     : extract the contents of the pdf and return them
+     : as the content for the envelope
+     :)
+    xdmp:document-filter($raw-content)
+};
Lines changed: 25 additions & 27 deletions
@@ -1,27 +1,25 @@
-xquery version "1.0-ml";
-
-module namespace plugin = "http://marklogic.com/data-hub/plugins";
-
-declare namespace envelope = "http://marklogic.com/data-hub/envelope";
-
-declare option xdmp:mapping "false";
-
-(:~
- : Create Headers Plugin
- :
- : @param $id - the identifier returned by the collector
- : @param $content - the output of your content plugin
- : @param $options - a map containing options. Options are sent from Java
- :
- : @return - zero or more header nodes
- :)
-declare function plugin:create-headers(
-  $id as xs:string,
-  $content as node()?,
-  $options as map:map) as node()*
-{
-  (: put the binary uri in the header of the envelope :)
-  (
-    <binary-uri>{map:get($options, "binary-uri")}</binary-uri>
-  )
-};
+xquery version "1.0-ml";
+
+module namespace plugin = "http://marklogic.com/data-hub/plugins";
+
+declare option xdmp:mapping "false";
+
+(:~
+ : Create Headers Plugin
+ :
+ : @param $id - the identifier returned by the collector
+ : @param $content - the output of your content plugin
+ : @param $options - a map containing options. Options are sent from Java
+ :
+ : @return - zero or more header nodes
+ :)
+declare function plugin:create-headers(
+  $id as xs:string,
+  $content as node()?,
+  $options as map:map) as node()*
+{
+  (: put the binary uri in the header of the envelope :)
+  (
+    <binary-uri>{map:get($options, "binary-uri")}</binary-uri>
+  )
+};
