JenaTextSparql: multi-word search returns 0 results (space escaped as Lucene special character) #1930

@fvogel

Description

Summary

Multi-word search queries return 0 results when using sparqlDialect "JenaText", even though each word individually matches correctly. This affects both the UI search and the REST API.

Reproduction

Using stock Skosmos 3.1 with Fuseki 5.4.0 (StandardAnalyzer, default config).

Test vocabulary — a simple 9-concept SKOS vocabulary with multi-word labels:

@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix ex:   <http://example.org/test/> .

ex:scheme a skos:ConceptScheme ;
    skos:hasTopConcept ex:cat, ex:dog .

ex:cat a skos:Concept ;
    skos:inScheme ex:scheme ;
    skos:prefLabel "Cat"@en ;
    skos:altLabel "Domestic cat"@en .

ex:siamese a skos:Concept ;
    skos:inScheme ex:scheme ;
    skos:prefLabel "Siamese cat"@en ;
    skos:broader ex:cat .

ex:dog a skos:Concept ;
    skos:inScheme ex:scheme ;
    skos:prefLabel "Dog"@en .

ex:labrador a skos:Concept ;
    skos:inScheme ex:scheme ;
    skos:prefLabel "Labrador retriever"@en ;
    skos:broader ex:dog .

Results via REST API:

| Search query | Expected | Actual |
| --- | --- | --- |
| `Siamese*` | 1 result (Siamese cat) | 1 result ✅ |
| `Siamese cat*` | 1 result (Siamese cat) | 0 results |
| `Labrador retriever*` | 1 result | 0 results |
| `Domestic cat*` | 1 result (altLabel match) | 0 results |

# Works:
curl -s 'http://localhost:9090/rest/v1/test/search?query=Siamese*&lang=en'
# → {"results":[{"prefLabel":"Siamese cat",...}]}

# Broken:
curl -s 'http://localhost:9090/rest/v1/test/search?query=Siamese+cat*&lang=en'
# → {"results":[]}

Root Cause

In src/model/sparql/JenaTextSparql.php, the LUCENE_ESCAPE_CHARS constant includes a space character at position 0:

public const LUCENE_ESCAPE_CHARS = ' +-&|!(){}[]^"~?:\\/';
//                                  ^ space here

The createTextQueryCondition() method escapes every character in that list:

foreach (str_split(self::LUCENE_ESCAPE_CHARS) as $char) {
    $lucenemap[$char] = '\\' . $char;
}
$term = strtr($term, $lucenemap);

This transforms "Siamese cat*" into "Siamese\ cat*", telling Lucene to treat the space as a literal character rather than a word separator.

With StandardAnalyzer (the default for Jena Text indexes), labels are tokenized into individual words. No indexed token ever contains a literal space, so the escaped query "Siamese\ cat*" never matches anything.
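The faulty transformation is easy to reproduce outside Skosmos. The following is a standalone sketch mirroring the constant and the `strtr()` approach described above (it is not the actual `createTextQueryCondition()` code):

```php
<?php
// Standalone sketch of the buggy escaping described above
// (mirrors the strtr() approach; not the actual Skosmos method).
const LUCENE_ESCAPE_CHARS = ' +-&|!(){}[]^"~?:\\/';

function escapeBuggy(string $term): string
{
    $lucenemap = [];
    foreach (str_split(LUCENE_ESCAPE_CHARS) as $char) {
        $lucenemap[$char] = '\\' . $char;
    }
    return strtr($term, $lucenemap);
}

// The space is escaped into a literal character...
echo escapeBuggy('Siamese cat*'), PHP_EOL; // Siamese\ cat*

// ...but word-level tokenization (as StandardAnalyzer performs) never
// produces an index token containing a space, so the escaped query
// cannot match any token.
$tokens = preg_split('/\s+/', strtolower('Siamese cat'));
echo implode(', ', $tokens), PHP_EOL;      // siamese, cat
```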

Why the space was included

Space is listed in the Lucene Classic Query Parser documentation as a special character. However, escaping it is incorrect when using word-level analyzers like StandardAnalyzer. The space should act as a term separator, not be escaped into a literal character.

Proposed Fix

  1. Remove space from LUCENE_ESCAPE_CHARS
  2. Split multi-word queries into individual required Lucene terms using the + (required) operator

public const LUCENE_ESCAPE_CHARS = '+-&|!(){}[]^"~?:\\/';
//                                  (no leading space)

Transform multi-word queries by splitting on whitespace and prefixing each word with +:

"Siamese cat*"        → "+Siamese +cat*"
"Labrador retriever*" → "+Labrador +retriever*"

This ensures each word must match independently, which works correctly with StandardAnalyzer's word-level tokenization. Any wildcard suffix the user typed (typically on the last word) is preserved on its term.
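A minimal sketch of the proposed transformation follows. The function names are illustrative, not the actual patch:

```php
<?php
// Sketch of the proposed fix (illustrative names, not the actual patch):
// escape Lucene specials *except* space, then require every word with '+'.
const LUCENE_ESCAPE_CHARS = '+-&|!(){}[]^"~?:\\/'; // no leading space

function escapeTerm(string $term): string
{
    $lucenemap = [];
    foreach (str_split(LUCENE_ESCAPE_CHARS) as $char) {
        $lucenemap[$char] = '\\' . $char;
    }
    return strtr($term, $lucenemap);
}

function buildLuceneQuery(string $query): string
{
    $words = preg_split('/\s+/', trim($query));
    if (count($words) === 1) {
        return escapeTerm($words[0]); // single word: behavior unchanged
    }
    // Multi-word: every term is required; wildcards (*) pass through
    // untouched because '*' is not in the escape list.
    return implode(' ', array_map(
        fn (string $w): string => '+' . escapeTerm($w),
        $words
    ));
}

echo buildLuceneQuery('Siamese cat*'), PHP_EOL;        // +Siamese +cat*
echo buildLuceneQuery('Labrador retriever*'), PHP_EOL; // +Labrador +retriever*
```

Single-word queries keep their current behavior, so existing searches like `Siamese*` are unaffected.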

Environment

  • Skosmos 3.1 (also present in v2.18-maintenance)
  • Apache Jena Fuseki 5.4.0
  • Jena Text with Lucene (StandardAnalyzer, default config)
  • sparqlDialect "JenaText", searchByNotation true

This bug has existed since at least 2016 (the space has been in LUCENE_ESCAPE_CHARS since the early versions of JenaTextSparql.php).
