ci/praktika/job.py
Lines changed: 1 addition & 1 deletion (+1 -1)
@@ -92,7 +92,7 @@ def parametrize(
             == len(timeout)
             == len(provides)
             == len(requires)
-        ), f"Parametrization lists must be of the same size [{len(parameter)}, {len(runs_on)}, {len(timeout)}, {len(provides)}, {len(requires)}]"
+        ), f"Parametrization lists for job [{self.name}] must be of the same size [{len(parameter)}, {len(runs_on)}, {len(timeout)}, {len(provides)}, {len(requires)}]"
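The change above adds the job name to the size-mismatch message so a failing parametrization can be traced to its job. A minimal standalone sketch of the same idea follows. The function name `check_parametrization` and its keyword-argument interface are hypothetical and not part of praktika, and it raises `ValueError` where the diff uses `assert`.

```python
def check_parametrization(job_name, **lists):
    """Raise if the given parametrization lists differ in length.

    Hypothetical helper for illustration only; praktika performs this
    check inline with an assert inside parametrize().
    """
    sizes = {name: len(values) for name, values in lists.items()}
    if len(set(sizes.values())) > 1:
        # Naming the job makes the failure traceable when many jobs
        # are parametrized in the same workflow.
        raise ValueError(
            f"Parametrization lists for job [{job_name}] "
            f"must be of the same size {sizes}"
        )


if __name__ == "__main__":
    try:
        check_parametrization(
            "Build", parameter=["a", "b"], runs_on=[["x"]], timeout=[60, 60]
        )
    except ValueError as error:
        print(error)
```

Reporting the sizes keyed by list name, rather than as a bare list of numbers, is a design choice of this sketch and not something the diff does.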
ci/praktika/validator.py
Lines changed: 2 additions & 2 deletions (+2 -2)
@@ -56,7 +56,7 @@ def validate(cls):
             for job in workflow.jobs:
                 cls.evaluate_check(
                     isinstance(job, Job.Config),
-                    f"Invalid job type [{job}]",
+                    f"Invalid job type [{job}]: type [{type(job)}]",
                     workflow.name,
                 )
@@ -158,7 +158,7 @@ def validate(cls):
                     artifact.is_s3_artifact()
                 ), f"All artifacts must be of S3 type if enable_cache|enable_html=True, artifact [{artifact.name}], type [{artifact.type}], workflow [{workflow.name}]"
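To illustrate why the first hunk adds `type(job)` to the message: a value such as a plain string can print exactly like a job name, so the value alone does not show what went wrong. The sketch below is a simplified stand-in. `JobConfig` and `find_invalid_jobs` are hypothetical names and do not reproduce praktika's `Job.Config` or `evaluate_check`.

```python
from dataclasses import dataclass


@dataclass
class JobConfig:
    """Hypothetical stand-in for praktika's Job.Config."""

    name: str


def find_invalid_jobs(jobs):
    """Return one message per entry that is not a JobConfig."""
    return [
        # Including the type distinguishes, say, the string "Build"
        # from a config object that merely prints as "Build".
        f"Invalid job type [{job}]: type [{type(job)}]"
        for job in jobs
        if not isinstance(job, JobConfig)
    ]


if __name__ == "__main__":
    for message in find_invalid_jobs([JobConfig("Build"), "Build"]):
        print(message)
```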
Like [`IPv4StringToNum`](#IPv4StringToNum) but takes a string form of IPv4 address and returns value of [IPv4](../data-types/ipv4.md) type.
+Converts a string or a UInt32 form of IPv4 address to [IPv4](../data-types/ipv4.md) type.
+Similar to [`IPv4StringToNum`](#IPv4StringToNum) and [IPv4NumToString](#IPv4NumToString) functions but it supports both string and unsigned integer data types as input arguments.
Same as `toIPv4`, but if the IPv4 address has an invalid format, it returns `0.0.0.0` (0 IPv4), or the provided IPv4 default.
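The documented behaviour (accept either a string or an unsigned integer, and fall back to `0.0.0.0` or a supplied default on invalid input) can be mimicked with Python's standard `ipaddress` module. This is only an analogy for readers, not ClickHouse code. The function name `to_ipv4_or_default` is hypothetical, and ClickHouse's exact parsing rules may differ from Python's.

```python
import ipaddress


def to_ipv4_or_default(value, default="0.0.0.0"):
    """Parse a string or integer as IPv4, returning a default on failure.

    Python analogy only; not the ClickHouse implementation.
    """
    try:
        # IPv4Address accepts both dotted-quad strings and integers.
        return ipaddress.IPv4Address(value)
    except ValueError:
        # AddressValueError is a subclass of ValueError.
        return ipaddress.IPv4Address(default)


if __name__ == "__main__":
    print(to_ipv4_or_default("192.168.1.1"))
    print(to_ipv4_or_default(3232235777))
    print(to_ipv4_or_default("not-an-ip"))
    print(to_ipv4_or_default("not-an-ip", default="127.0.0.1"))
```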
@@ -412,7 +428,7 @@ Result:
 ## toIPv6 {#toipv6}

 Converts a string or a UInt128 form of IPv6 address to [IPv6](../data-types/ipv6.md) type. For strings, if the IPv6 address has an invalid format, returns an empty value.
-Similar to [IPv6StringToNum](#ipv6stringtonum) function, which converts IPv6 address to binary format.
+Similar to [IPv6StringToNum](#ipv6stringtonum) and [IPv6NumToString](#ipv6numtostringx) functions, which convert IPv6 address to and from binary format (i.e. `FixedString(16)`).

 If the input string contains a valid IPv4 address, then the IPv6 equivalent of the IPv4 address is returned.

@@ -425,7 +441,7 @@ toIPv6(UInt128)

 **Argument**

-- `string` or `UInt128` — IP address. [String](../data-types/string.md).
+- `x` — IP address. [`String`](../data-types/string.md) or [`UInt128`](../data-types/int-uint.md).
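The `toIPv6` description above can be illustrated the same way with Python's `ipaddress` module. This sketch assumes that the "IPv6 equivalent" of an IPv4 address is the IPv4-mapped form `::ffff:a.b.c.d`, and it uses `None` as a stand-in for the "empty value" returned on invalid strings. Both are assumptions of the sketch, and `to_ipv6` is a hypothetical name.

```python
import ipaddress


def to_ipv6(value):
    """Convert a string or integer to an IPv6 address (Python analogy).

    Assumptions of this sketch: IPv4 strings map to the IPv4-mapped
    range (::ffff:a.b.c.d), and invalid strings yield None.
    """
    if isinstance(value, int):
        # Integers play the role of the UInt128 form.
        return ipaddress.IPv6Address(value)
    try:
        return ipaddress.IPv6Address(value)
    except ValueError:
        pass
    try:
        v4 = ipaddress.IPv4Address(value)
    except ValueError:
        return None
    return ipaddress.IPv6Address(f"::ffff:{v4}")


if __name__ == "__main__":
    address = to_ipv6("192.168.1.1")
    print(address.ipv4_mapped)
    # .packed is the 16-byte binary form, comparable to FixedString(16).
    print(len(to_ipv6("2001:db8::1").packed))
```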
+import DeprecatedBadge from '@theme/badges/DeprecatedBadge';
+
 # Functions for Splitting Strings

 ## splitByChar {#splitbychar}
@@ -347,9 +349,14 @@ Result:

 ## ngrams {#ngrams}

+<DeprecatedBadge/>
+
+
 Splits a UTF-8 string into n-grams of `ngramsize` symbols.
+This function is deprecated. Prefer to use [tokens](#tokens) with the `ngram` tokenizer.
+The function might be removed at some point in future.

-**Syntax**
+**Syntax**

 ```sql
 ngrams(string, ngramsize)
@@ -380,18 +387,23 @@ Result:

 ## tokens {#tokens}

-Splits a string into tokens using non-alphanumeric ASCII characters as separators.
+Splits a string into tokens using the given tokenizer.
+The default tokenizer uses non-alphanumeric ASCII characters as separators.

 **Arguments**

-- `input_string` — Any set of bytes represented as the [String](../data-types/string.md) data type object.
+- `value` — The input string. [String](../data-types/string.md) or [FixedString](../data-types/fixedstring.md).
+- `tokenizer` — The tokenizer to use. Valid arguments are `default`, `ngram`, and `noop`. Optional, if not set explicitly, defaults to `default`. [const String](../data-types/string.md)
+- `ngrams` — Only relevant if argument `tokenizer` is `ngram`: An optional parameter which defines the length of the ngrams. If not set explicitly, defaults to `3`. [UInt8](../data-types/int-uint.md).

 **Returned value**

 - The resulting array of tokens from input string. [Array](../data-types/array.md).

 **Example**

+Using the default settings:
+
 ```sql
 SELECT tokens('test1,;\\ test2,;\\ test3,;\\ test4') AS tokens;
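As a rough illustration of the three tokenizers named in the new `tokens` documentation, the Python sketch below mimics them. It is not the ClickHouse implementation. The name `sketch_tokens` is hypothetical. Treating `noop` as "return the input as one token" and letting `ngram` slide over every character, separators included, are assumptions of this sketch.

```python
import re


def sketch_tokens(value, tokenizer="default", ngrams=3):
    """Approximate the documented tokenizers (illustrative only)."""
    if tokenizer == "default":
        # Non-alphanumeric ASCII characters act as separators, so a
        # token is a run of ASCII alphanumerics or non-ASCII characters.
        return re.findall(r"(?:[0-9A-Za-z]|[^\x00-\x7f])+", value)
    if tokenizer == "ngram":
        # Sliding window of length `ngrams` over the whole string.
        return [value[i:i + ngrams] for i in range(len(value) - ngrams + 1)]
    if tokenizer == "noop":
        # Assumption: the input is passed through as a single token.
        return [value]
    raise ValueError(f"Unknown tokenizer [{tokenizer}]")


if __name__ == "__main__":
    print(sketch_tokens("test1,;\\ test2,;\\ test3,;\\ test4"))
    print(sketch_tokens("abcde", "ngram", 3))
```

The `ngram` branch also shows what the deprecated `ngrams(string, ngramsize)` function is described as producing, which is why the documentation can point users to `tokens` as a replacement.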