Commit 27d410e

Merge pull request #421 from eXist-db/feature/facets-multi-hierarchical
(feature) multi-value hierarchical facets documentation
2 parents 6dd94b3 + 85404a1 commit 27d410e

4 files changed

+50
-5
lines changed

Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<collection xmlns="http://exist-db.org/collection-config/1.0">
+    <index xmlns:db="http://docbook.org/ns/docbook" xmlns:xs="http://www.w3.org/2001/XMLSchema">
+        <lucene>
+            <module uri="http://exist-db.org/lucene/test/" prefix="idx" at="module.xql"/>
+            <text qname="db:article">
+                <facet dimension="keyword" expression="db:info/db:keywordset/db:keyword"/>
+                <facet dimension="date" expression="tokenize(db:info/db:pubdate, '-')" hierarchical="yes"/>
+                <facet dimension="subject" expression="idx:subject-hierarchy(db:info/db:subjectset/db:subject/db:subjectterm)" hierarchical="yes"/>
+            </text>
+        </lucene>
+    </index>
+</collection>
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+declare function idx:subject-hierarchy($key as xs:string*) {
+    array:for-each (array {$key}, function($k) {
+        doc('/db/subjects/subjects.xml')//subject[@name=$k]/ancestor-or-self::subject/@name
+    })
+};
Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+<subject>
+    <subject name="science">
+        <subject name="math"/>
+        <subject name="physics"/>
+    </subject>
+    <subject name="humanities">
+        <subject name="art"/>
+        <subject name="sociology"/>
+        <subject name="history"/>
+    </subject>
+</subject>
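
The function and the subject tree added above can be tried outside the index configuration. The following is a minimal sketch and not part of the commit: it inlines the subject tree in a variable (`$subjects`) and uses a hypothetical `local:subject-hierarchy` function in place of `idx:subject-hierarchy`, so it does not depend on `/db/subjects/subjects.xml` being stored in the database. It also appends `string()` so that strings rather than attribute nodes are returned, which is an assumption made only to keep the output readable.

```xquery
xquery version "3.1";

(: Inline copy of the subject tree from the listing above.
   Assumption: the committed function reads the same structure
   from /db/subjects/subjects.xml instead. :)
declare variable $subjects :=
    <subject>
        <subject name="science">
            <subject name="math"/>
            <subject name="physics"/>
        </subject>
        <subject name="humanities">
            <subject name="art"/>
            <subject name="sociology"/>
            <subject name="history"/>
        </subject>
    </subject>;

(: Hypothetical stand-in for idx:subject-hierarchy.
   Each key becomes one array member holding the path from the
   top-level category down to the key, in document order. :)
declare function local:subject-hierarchy($key as xs:string*) as array(*) {
    array:for-each(array { $key }, function($k) {
        $subjects//subject[@name = $k]/ancestor-or-self::subject/@name/string()
    })
};

local:subject-hierarchy(("math", "history"))
```

The unnamed root `subject` element contributes no `@name`, so each member starts at the top-level category. For the keys `math` and `history` the result should therefore be an array with the members `("science", "math")` and `("humanities", "history")`, the same shape as the example array given in the documentation change to `lucene.xml` below.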

src/main/xar-resources/data/lucene/lucene.xml

Lines changed: 21 additions & 5 deletions
@@ -3,7 +3,7 @@
 schematypens="http://purl.oclc.org/dsdl/schematron"?><article xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink" version="5.0">
     <info>
         <title>Full Text Index</title>
-        <date>4Q19</date>
+        <date>1Q20</date>
         <keywordset>
             <keyword>indexing</keyword>
         </keywordset>
@@ -394,14 +394,30 @@
     expression rooted in the parent node being indexed. In the example the parent will be a <tag>db:article</tag> element, so the context item
     for the expression is set to this element.</para>
 <para>The expression is evaluated and for each result item, a facet value is added
-    to the dimension using the string value of the item. If the expression returns
-    the empty sequence for the current parent node, the corresponding facet will be
-    empty as well.</para>
+    to the dimension using the string value of the item. Therefore, if the expression
+    returns multiple items, a facet for that dimension will also hold multiple values.
+    If the expression returns the empty sequence for the current parent node, the corresponding
+    facet will be empty as well.</para>
 <para>A facet can also be defined to be hierarchical. A typical example would be a date, which consists of a year, month and day component. By
     indexing the single components as separate parts of a hierarchical facet, we enable the user to drill down by year first, then by month and
     finally by day. Let's assume each of our docbook articles has a <tag>pubdate</tag> containing a date in <code>xs:date</code>
     format:</para>
 <programlisting language="xml" xlink:href="listings/listing-52.txt"></programlisting>
+<para>Hierarchical facets may also hold multiple values, for example if we would like to associate
+    our documents with a subject classification on various levels of granularity (say: <emphasis>science</emphasis> with
+    <emphasis>math</emphasis> and <emphasis>physics</emphasis> as subcategories or <emphasis>humanities</emphasis> with
+    <emphasis>art</emphasis>, <emphasis>sociology</emphasis> and <emphasis>history</emphasis>).
+    This way we enable the user to drill down into the broad <emphasis>humanities</emphasis>
+    or <emphasis>science</emphasis> subject first and choose particular topics afterwards.
+    If the hierarchical facet <code>expression</code>
+    evaluates to an array, each of the array members will be treated as a hierarchical value for that facet.
+    In XQuery, such an array could look like <code>[('science', 'math'), ('humanities', 'history')]</code> and be
+    the result of evaluating a function like <code>idx:subject-hierarchy</code> below, stored in an imported module (see <link linkend="external-module">below</link>).
+</para>
+<programlisting language="xml" xlink:href="listings/listing-520.xml"/>
+<programlisting language="xquery" xlink:href="listings/listing-521.txt"/>
+<para>which assumes a hierarchical subject structure stored in <emphasis>/db/subjects/subjects.xml</emphasis>:</para>
+<programlisting language="xml" xlink:href="listings/listing-522.xml"/>
 <para>Next, we may want to define fields for the authors and title of the article. In docbook, <tag>author</tag> can be a complex element,
     consisting e.g. of a <tag>personname</tag> with nested
     <tag>surname</tag> and <tag>firstname</tag>. For display to the user and sorting we want to pre-compute a normalized string out of those
@@ -428,7 +444,7 @@
     attribute <code>store="no"</code>. The field will still be indexed and
     available for queries though.</para>
 
-<para><emphasis role="bold">Importing external modules</emphasis>: as can be seen in the field definition for "author" above, expressions can easily become quite verbose, so writing them into an attribute
+<para xml:id="external-module"><emphasis role="bold">Importing external modules</emphasis>: as can be seen in the field definition for "author" above, expressions can easily become quite verbose, so writing them into an attribute
     is not convenient. It is thus also possible to import one or more XQuery modules into the index configuration and use the functions declared
     in the module:</para>
 <programlisting language="xml" xlink:href="listings/listing-58.xml"></programlisting>
