dsq provides a large collection of built-in functions through the dsq-functions crate, covering jq-compatible operations plus DataFrame-specific functions. Depending on the function, inputs can be strings, arrays, objects, DataFrames, or Series.
- Array Functions - Array manipulation and operations
- String Functions - Text processing and transformation
- Math Functions - Mathematical operations
- DataFrame Functions - Tabular data operations
- Statistical Functions - Statistical analysis and aggregation
- Date/Time Functions - Date and time operations
- URL Functions - URL parsing and manipulation
- Utility Functions - General-purpose utilities

- length(value) - Returns the length of arrays, strings, objects, or DataFrame height
- keys(value) - Returns array indices, object keys, or DataFrame column names
- has(container, key) - Checks if an object contains a key
- values(value) - Returns array elements or object values
- type(value) - Returns the type name of a value
- empty(value) - Checks if a value is empty (null, empty array/string/object)
- error(message) - Throws an error with the given message

- reverse(array) - Reverses the order of array elements
- sort(array) - Sorts array elements
- sort_by(array, keys) - Sorts array by key values
- unique(array) - Removes duplicate elements
- flatten(array) - Flattens nested arrays
- add(array) - Sums numeric array elements
- min(array), max(array) - Find minimum/maximum values
- first(array), last(array) - Get first/last elements

- array_unshift(array, value) - Add element to start
- array_shift(array) - Remove element from start
- array_push(array, value) - Add element to end
- array_pop(array) - Remove element from end
- repeat(value, count) - Repeat a value n times
- zip(arrays...) - Combine multiple arrays element-wise
- transpose(array) - Transpose a 2D array
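To make "element-wise" and "transpose" concrete, here is a minimal Python sketch of the behavior described for zip and transpose. It illustrates the semantics only and is not dsq's implementation. The helper name `zip_arrays` is hypothetical, and stopping at the shortest input is an assumption; dsq may handle arrays of unequal length differently.

```python
# Illustrative Python equivalents of the zip and transpose descriptions.
# These are sketches of the semantics, not dsq's implementation.

def zip_arrays(*arrays):
    """Combine arrays element-wise.

    Assumption: stops at the shortest input, as Python's zip does.
    """
    return [list(group) for group in zip(*arrays)]


def transpose(rows):
    """Swap rows and columns of a rectangular 2D array."""
    return [list(column) for column in zip(*rows)]


pairs = zip_arrays([1, 2, 3], ["a", "b", "c"])
grid = transpose([[1, 2, 3], [4, 5, 6]])

print(pairs)  # [[1, 'a'], [2, 'b'], [3, 'c']]
print(grid)   # [[1, 4], [2, 5], [3, 6]]
```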

- tostring(value) - Convert any value to string
- tonumber(string) - Parse string as number
- split(string, separator) - Split string by separator
- join(array, separator) - Join array elements with separator
- concat(strings...) - Concatenate strings
- replace(string, old, new) - Replace substrings
- contains(string, substring) - Check if string contains substring

- startswith(string, prefix) - Check if string starts with prefix
- endswith(string, suffix) - Check if string ends with suffix
- is_valid_utf8(string) - Check UTF-8 validity

- lstrip(string), rstrip(string), trim(string) - Remove whitespace
- tolower(string), toupper(string) - Case conversion
- lowercase(string), uppercase(string) - Aliases for case conversion
- titlecase(string) - Convert to title case
- snake_case(string), camel_case(string) - Case style conversion
- pluralize(string), singular(string) - Plural/singular forms
- to_ascii(string) - Convert to ASCII (remove accents)
- to_valid_utf8(string) - Convert invalid UTF-8 to valid

- dos2unix(string), unix2dos(string) - Convert line endings
- tabs_to_spaces(string), spaces_to_tabs(string) - Tab/space conversion
- humanize(number) - Format numbers for human readability

- base32_encode(string), base32_decode(string) - Base32 encoding/decoding
- base58_encode(string), base58_decode(string) - Base58 encoding/decoding
- base64_encode(string), base64_decode(string) - Base64 encoding/decoding

- sha512(string) - SHA-512 hash
- sha256(string) - SHA-256 hash
- sha1(string) - SHA-1 hash
- md5(string) - MD5 hash
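When checking the output of the encoding and hash functions, an independent reference can help. The sketch below computes the same transformations with Python's standard library. It assumes the input string is encoded as UTF-8 and that digests are rendered as lowercase hex; the source does not state how dsq formats its output, so treat any difference in casing or padding as a formatting question. Base58 is omitted because the Python standard library has no implementation.

```python
import base64
import hashlib

text = "hello"
raw = text.encode("utf-8")  # assumption: inputs are hashed/encoded as UTF-8 bytes

# Encodings (reversible)
b64 = base64.b64encode(raw).decode("ascii")
b32 = base64.b32encode(raw).decode("ascii")
round_trip = base64.b64decode(b64).decode("utf-8")

# Hashes (one-way), rendered as lowercase hex
md5_hex = hashlib.md5(raw).hexdigest()
sha256_hex = hashlib.sha256(raw).hexdigest()

print(b64)         # aGVsbG8=
print(md5_hex)     # 5d41402abc4b2a76b9719d911017c592
print(sha256_hex)  # 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824
```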

- tojson(value) - Convert to JSON string
- fromjson(string) - Parse JSON string

- abs(number) - Absolute value
- sqrt(number) - Square root
- pow(base, exponent) - Power function
- exp(number) - Exponential function
- log10(number) - Base-10 logarithm

- floor(number), ceil(number) - Floor and ceiling functions
- round(number) - Round to nearest integer
- roundup(number), rounddown(number) - Round up/down
- mround(number, multiple) - Round to nearest multiple
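"Round to nearest multiple" can be sketched as dividing by the multiple, rounding, and multiplying back. The Python version below shows that idea only. Returning 0 for a zero multiple is an assumption, and Python's built-in `round` breaks exact ties toward the even neighbor, which may differ from dsq's tie-breaking.

```python
def mround(number, multiple):
    """Round `number` to the nearest multiple of `multiple`.

    Sketch only: ties follow Python's round() (half to even), and a
    zero multiple returns 0 by assumption.
    """
    if multiple == 0:
        return 0
    return round(number / multiple) * multiple


print(mround(7, 5))        # 5
print(mround(8, 5))        # 10
print(mround(1.26, 0.25))  # 1.25
```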

- sin(number), cos(number), tan(number) - Trigonometric functions
- asin(number), acos(number), atan(number) - Inverse trigonometric functions

- rand() - Random number between 0 and 1
- randarray(count) - Array of random numbers
- randbetween(min, max) - Random integer in range

- pi() - Pi constant (3.14159...)

- columns(dataframe) - Get column names
- shape(dataframe) - Get (rows, columns) dimensions
- dtypes(dataframe) - Get column data types

- cut(dataframe, columns) - Select specific columns
- head(dataframe, n?) - Get first n rows (default 5)
- tail(dataframe, n?) - Get last n rows (default 5)
- sample(dataframe, n?) - Random sample of n rows

- group_by(dataframe, column) - Group DataFrame by column
- pivot(dataframe, index, columns, values) - Pivot table
- melt(dataframe, id_vars, value_vars?) - Unpivot DataFrame

- sum(array|dataframe) - Sum of values
- count(array|dataframe) - Count of non-null values
- mean(array|dataframe), avg(array|dataframe) - Arithmetic mean
- median(array|dataframe) - Median value
- min(array|dataframe), max(array|dataframe) - Minimum/maximum values

- quartile(array|dataframe, percentile) - Percentile (0.25, 0.5, 0.75)
- percentile(array|dataframe, p) - p-th percentile
- histogram(array|dataframe, bins?) - Value distribution

- std(array|dataframe), stdev_p(array|dataframe) - Population standard deviation
- stdev_s(array|dataframe) - Sample standard deviation
- var(array|dataframe) - Variance
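The population and sample variants differ only in the divisor: the population form divides the summed squared deviations by n, the sample form by n - 1. The Python sketch below shows the difference with the standard `statistics` module; it illustrates the definitions and says nothing about how dsq computes them. The data values are made up for the example.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # mean is 5, squared deviations sum to 32

# Population standard deviation: sqrt(32 / 8), as described for std / stdev_p
population_sd = statistics.pstdev(data)

# Sample standard deviation: sqrt(32 / 7), as described for stdev_s
sample_sd = statistics.stdev(data)

print(population_sd)
print(sample_sd)
```

On the same data the sample value is always the larger of the two, and the gap shrinks as n grows.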

- correl(array1, array2) - Correlation coefficient

- avg_if(values, mask) - Average with condition
- count_if(values, mask) - Count with condition
- avg_ifs(values, mask1, mask2, ...) - Average with multiple conditions
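One plausible reading of the mask arguments is a boolean array parallel to the values, where only positions whose mask entry is true take part in the aggregate. The Python sketch below follows that reading. It is an assumption about the semantics, not dsq's implementation, and returning `None` when nothing matches is likewise a choice made for the example. All data values are invented.

```python
def count_if(values, mask):
    """Count positions whose mask entry is truthy."""
    return sum(1 for _, keep in zip(values, mask) if keep)


def avg_if(values, mask):
    """Average the values whose mask entry is truthy (None if none match)."""
    selected = [v for v, keep in zip(values, mask) if keep]
    return sum(selected) / len(selected) if selected else None


def avg_ifs(values, *masks):
    """Average the values for which every mask is truthy."""
    combined = [all(flags) for flags in zip(*masks)]
    return avg_if(values, combined)


salaries = [50, 60, 70, 80]
is_engineer = [True, False, True, True]
is_senior = [False, False, True, True]

print(count_if(salaries, is_engineer))           # 3
print(avg_ifs(salaries, is_engineer, is_senior)) # 75.0
```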

- least_frequent(array|dataframe) - Least common value
- min_by(array, key), max_by(array, key) - Min/max by key function

- year(date) - Extract year
- month(date) - Extract month
- day(date) - Extract day
- hour(date) - Extract hour
- minute(date) - Extract minute
- second(date) - Extract second

- mktime(year, month, day, hour?, min?, sec?) - Create timestamp
- today() - Current date
- now() - Current timestamp

- systime() - System time in seconds
- systime_ns() - System time in nanoseconds
- systime_int() - System time as integer

- strftime(timestamp, format) - Format timestamp
- strflocaltime(timestamp, format) - Format local timestamp
- strptime(string, format) - Parse timestamp string

- localtime(timestamp) - Local time
- gmtime(timestamp) - GMT time

- date_diff(date1, date2, unit?) - Date difference
- truncate_date(date, unit) - Truncate date to unit
- truncate_time(time, unit) - Truncate time to unit

- start_of_month(date), end_of_month(date) - Month boundaries
- start_of_week(date), end_of_week(date) - Week boundaries
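The boundary functions can be pictured as snapping a date to the first or last day of its enclosing period. The Python sketch below shows that idea with the standard library. Treating Monday as the first day of the week is an assumption; the source does not say which convention dsq uses.

```python
import calendar
from datetime import date, timedelta


def start_of_month(d):
    """First day of the month containing d."""
    return d.replace(day=1)


def end_of_month(d):
    """Last day of the month containing d (leap years handled by calendar)."""
    last_day = calendar.monthrange(d.year, d.month)[1]
    return d.replace(day=last_day)


def start_of_week(d):
    """Monday of the week containing d (assumption: weeks start on Monday)."""
    return d - timedelta(days=d.weekday())


def end_of_week(d):
    """Sunday of the week containing d, under the same Monday-start assumption."""
    return start_of_week(d) + timedelta(days=6)


d = date(2024, 2, 15)  # a Thursday in a leap-year February
print(start_of_month(d), end_of_month(d))  # 2024-02-01 2024-02-29
print(start_of_week(d), end_of_week(d))    # 2024-02-12 2024-02-18
```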

- iif(condition, true_value, false_value) - Conditional expression
- iferror(value, fallback) - Error handling
- coalesce(values...) - Return first non-null value

- range(start, end, step?) - Generate number sequence
- generate_sequence(start, end, step?) - Generate sequences
- time_series_range(start, end, interval) - Time series sequences
- generate_uuidv4(), generate_uuidv7() - Generate UUIDs

- select(value) - Filter truthy values
- map(array, field|template) - Transform array elements
- filter(array, condition) - Filter array elements
- unnest(array) - Unnest nested structures
- group_concat(array, separator?) - Concatenate with grouping

- del(object, key) - Remove object key
- transform_keys(object, function) - Transform object keys

- url_parse(url) - Parse URL into components
- url_extract_domain(url) - Extract domain from URL
- url_extract_path(url) - Extract path from URL
- url_extract_query_string(url) - Extract query string
- url_extract_protocol(url) - Extract protocol
- url_extract_port(url) - Extract port number

- url_set_protocol(url, protocol) - Change URL protocol
- url_set_path(url, path) - Change URL path
- url_set_domain(url, domain) - Change URL domain
- url_set_domain_without_www(url) - Remove www from domain
- url_set_query_string(url, key, value) - Set query parameter
- url_set_port(url, port) - Change URL port

- url_strip_fragment(url) - Remove URL fragment
- url_strip_query_string(url) - Remove query string
- url_strip_port(url) - Remove port (if default)
- url_strip_port_if_default(url) - Remove default ports
- url_strip_protocol(url) - Remove protocol
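For readers less familiar with URL anatomy, the Python sketch below splits an example URL into the pieces the extract and strip functions refer to, using the standard `urllib.parse` module. The dictionary keys are hypothetical labels chosen to match the function names; the source does not specify the shape of dsq's `url_parse` output.

```python
from urllib.parse import urlsplit

url = "https://www.example.com:8443/docs/index.html?lang=en#intro"
parts = urlsplit(url)

# Hypothetical labels mirroring the url_extract_* function names
components = {
    "protocol": parts.scheme,
    "domain": parts.hostname,
    "port": parts.port,
    "path": parts.path,
    "query_string": parts.query,
    "fragment": parts.fragment,
}

# Dropping the fragment, comparable in spirit to url_strip_fragment
without_fragment = parts._replace(fragment="").geturl()

print(components)
print(without_fragment)  # https://www.example.com:8443/docs/index.html?lang=en
```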

- transliterate(text, from_script, to_script) - Script transliteration (e.g., Cyrillic to Latin)

# Sort array by field
dsq 'sort_by(.age)' people.csv
# Get unique values
dsq '.[] | unique' data.json
# Flatten nested arrays
dsq 'flatten' nested.json

# Split and join
dsq '.name | split(" ") | join("_")' data.csv
# Case conversion
dsq 'map({name: .name | titlecase})' names.json
# Base64 encoding
dsq '.url | base64_encode' urls.csv

# Calculate mean
dsq 'map(.salary) | mean' employees.csv
# Get percentiles
dsq '.values | percentile(0.95)' metrics.json
# Correlation
dsq 'correl(.x, .y)' data.csv

# Extract date components
dsq 'map({y: .date | year, m: .date | month})' events.csv
# Date formatting
dsq '.timestamp | strftime("%Y-%m-%d")' logs.json
# Date difference
dsq 'date_diff(.end_date, .start_date, "days")' periods.csv

# Select columns
dsq 'cut(["name", "age"])' people.csv
# Group and aggregate
dsq 'group_by(.department)' employees.csv
# Pivot table
dsq 'pivot("date", "category", "amount")' sales.csv

Many functions work across different data types:
# length works on strings, arrays, DataFrames
dsq '.name | length' # string length
dsq '.items | length' # array length
dsq '. | length' # DataFrame row count
# mean works on arrays and DataFrames
dsq '.scores | mean' # array mean
dsq '.salary | mean' # column mean

Errors and null values can be handled inline with iferror and coalesce:
# Use iferror for graceful fallback
dsq 'iferror(.value | tonumber, 0)' data.csv
# Use coalesce for null handling
dsq 'coalesce(.value1, .value2, 0)' data.csv
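
The two fallback patterns above can be sketched in Python to show the intended behavior: `iferror` substitutes a fallback when evaluation fails, and `coalesce` returns the first argument that is not null. This illustrates the semantics only. Passing a zero-argument callable to `iferror` is an artifact of Python's eager evaluation, not a claim about dsq's calling convention.

```python
def iferror(compute, fallback):
    """Return compute(), or fallback if it raises.

    `compute` is a zero-argument callable so the failure happens
    inside the try block.
    """
    try:
        return compute()
    except Exception:
        return fallback


def coalesce(*values):
    """Return the first value that is not None, or None if all are."""
    for value in values:
        if value is not None:
            return value
    return None


print(iferror(lambda: float("abc"), 0))  # 0 (parsing fails, fallback used)
print(iferror(lambda: float("3.5"), 0))  # 3.5
print(coalesce(None, None, 0))           # 0
```

Note that `coalesce` skips only nulls: a falsy value such as 0 or an empty string still counts as present.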