cts:collection-match(
  $pattern as xs:string,
  [$options as xs:string*],
  [$query as cts:query?],
  [$quality-weight as xs:double?],
  [$forest-ids as xs:unsignedLong*]
) as xs:string*
|
Summary:
Returns values from the collection lexicon
that match the specified wildcard pattern.
This function requires the collection-lexicon database configuration
parameter to be enabled. If the collection-lexicon database-configuration
parameter is not enabled, an exception is thrown.
|
Parameters:
$pattern
:
Wildcard pattern to match.
|
$options
(optional):
Options. The default is ().
Options include:
- "case-sensitive"
- A case-sensitive match.
- "case-insensitive"
- A case-insensitive match.
- "diacritic-sensitive"
- A diacritic-sensitive match.
- "diacritic-insensitive"
- A diacritic-insensitive match.
- "ascending"
- URIs should be returned in ascending order.
- "descending"
- URIs should be returned in descending order.
- "any"
- URIs from any fragment should be included.
- "document"
- URIs from document fragments should be included.
- "properties"
- URIs from properties fragments should be included.
- "locks"
- URIs from locks fragments should be included.
- "frequency-order"
- URIs should be returned ordered by frequency.
- "item-order"
- URIs should be returned ordered by item.
- "limit=N"
- Return no more than N URIs.
- "sample=N"
- Return only URIs from the first N fragments
selected by the
cts:query.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only URIs from the first N fragments
selected by the
cts:query.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
|
$query
(optional):
Only include URIs from fragments selected by the cts:query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations. If a string
is entered, the string is treated as a cts:word-query of the
specified string.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "frequency-order" or "item-order" may be specified
in the options parameter. If neither "frequency-order" nor "item-order"
is specified, then the default is "item-order".
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending" if "item-order" is
specified, and "descending" if "frequency-order" is specified.
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", or "score-simple"
options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", or "score-simple"
are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" is specified,
then the default is "checked".
If "sample=N" is not specified in the options parameter,
then all included URIs may be returned. If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then URIs from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
If neither "case-sensitive" nor "case-insensitive"
is present, $pattern is used to determine case sensitivity.
If $pattern contains no uppercase, it specifies "case-insensitive".
If $pattern contains uppercase, it specifies "case-sensitive".
If neither "diacritic-sensitive" nor "diacritic-insensitive"
is present, $pattern is used to determine diacritic sensitivity.
If $pattern contains no diacritics, it specifies "diacritic-insensitive".
If $pattern contains diacritics, it specifies "diacritic-sensitive".
|
Example:
cts:collection-match("collection*")
=> ("collection1", "collection2", ...)
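A fuller call might combine options with a constraining query. The following is a sketch only: the collection names and the word "budget" are hypothetical, and the database is assumed to have the collection lexicon enabled.

```xquery
xquery version "1.0-ml";

(: Hypothetical data: collections named like "reports/2023", "reports/2024".
   Assumes the collection lexicon is enabled on the database. :)
cts:collection-match(
  "reports/*",                 (: no uppercase, so case-insensitive by default :)
  ("descending", "limit=10"),  (: item order, reversed, at most 10 URIs :)
  cts:word-query("budget")     (: only fragments selected by this query :)
)
```

Because a $query is supplied and no fragment option is given, the "document" default described in the usage notes applies here.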
|
|
|
|
cts:collections(
  [$start as xs:string?],
  [$options as xs:string*],
  [$query as cts:query?],
  [$quality-weight as xs:double?],
  [$forest-ids as xs:unsignedLong*]
) as xs:string*
|
Summary:
Returns values from the collection lexicon.
This function requires the collection-lexicon database configuration
parameter to be enabled. If the collection-lexicon database-configuration
parameter is not enabled, an exception is thrown.
|
Parameters:
$start
(optional):
A starting value. Return only this value and following values.
If the value is not in the lexicon, then the function returns the values
beginning with the next value.
|
$options
(optional):
Options. The default is ().
Options include:
- "ascending"
- URIs should be returned in ascending order.
- "descending"
- URIs should be returned in descending order.
- "any"
- URIs from any fragment should be included.
- "document"
- URIs from document fragments should be included.
- "properties"
- URIs from properties fragments should be included.
- "locks"
- URIs from locks fragments should be included.
- "frequency-order"
- URIs should be returned ordered by frequency.
- "item-order"
- URIs should be returned ordered by item.
- "limit=N"
- Return no more than N URIs.
- "sample=N"
- Return only URIs from the first N fragments
selected by the
cts:query.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only URIs from the first N fragments
selected by the
cts:query.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
|
$query
(optional):
Only include URIs from fragments selected by the cts:query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations. If a string
is entered, the string is treated as a cts:word-query of the
specified string.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "frequency-order" or "item-order" may be specified
in the options parameter. If neither "frequency-order" nor "item-order"
is specified, then the default is "item-order".
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending" if "item-order" is
specified, and "descending" if "frequency-order" is specified.
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", or "score-simple"
options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", or "score-simple"
are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" is specified,
then the default is "checked".
If "sample=N" is not specified in the options parameter,
then all included URIs may be returned. If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then URIs from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
|
Example:
cts:collections("aardvark")
=> ("aardvark", "aardvarks", ...)
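To list the most frequently used collections, the "frequency-order" and "limit=N" options can be paired with cts:frequency. This is a minimal sketch, assuming the collection lexicon is enabled; the output format is only illustrative.

```xquery
xquery version "1.0-ml";

(: Assumes the collection lexicon is enabled. Passing () for $start
   begins at the first value in the lexicon. :)
for $uri in cts:collections((), ("frequency-order", "limit=5"))
return fn:concat($uri, " (", cts:frequency($uri), ")")
```

With "frequency-order" and no explicit direction, the usage notes above give "descending" as the default, so the most frequent collection URIs come first.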
|
|
|
|
cts:element-attribute-value-co-occurrences(
  $element-name-1 as xs:QName,
  $attribute-name-1 as xs:QName?,
  $element-name-2 as xs:QName,
  $attribute-name-2 as xs:QName?,
  [$options as xs:string*],
  [$query as cts:query?],
  [$quality-weight as xs:double?],
  [$forest-ids as xs:unsignedLong*]
) as element(cts:co-occurrence)*
|
Summary:
Returns value co-occurrences from the specified element or element-attribute
value lexicon(s).
Value lexicons are implemented using range indexes;
consequently this function requires a range index for each of the element
or element/attribute pairs specified in the function.
If there is not a range index configured for each of the specified
element or element/attribute pairs, then an exception is thrown.
|
Parameters:
$element-name-1
:
An element QName.
|
$attribute-name-1
:
An attribute QName or empty sequence.
The empty sequence specifies an element lexicon.
|
$element-name-2
:
An element QName.
|
$attribute-name-2
:
An attribute QName or empty sequence.
The empty sequence specifies an element lexicon.
|
$options
(optional):
Options. The default is ().
Options include:
- "ascending"
- Co-occurrences should be returned in ascending order.
- "descending"
- Co-occurrences should be returned in descending order.
- "any"
- Co-occurrences from any fragment should be included.
- "document"
- Co-occurrences from document fragments should be included.
- "properties"
- Co-occurrences from properties fragments should be included.
- "locks"
- Co-occurrences from locks fragments should be included.
- "frequency-order"
- Co-occurrences should be returned ordered by frequency.
- "item-order"
- Co-occurrences should be returned ordered by item.
- "fragment-frequency"
- Frequency should be the number of fragments with
an included co-occurrence.
This option is used with
cts:frequency.
- "item-frequency"
- Frequency should be the number of occurrences of
an included co-occurrence.
This option is used with
cts:frequency.
- "type=type"
- For both lexicons, use the type specified by type
(int, unsignedInt, long, unsignedLong, float, double, decimal,
dateTime, time, date, gYearMonth, gYear, gMonth, gDay,
yearMonthDuration, dayTimeDuration, string, or anyURI)
- "type-1=type"
- For the first lexicon, use the type specified by type
(int, unsignedInt, long, unsignedLong, float, double, decimal,
dateTime, time, date, gYearMonth, gYear, gMonth, gDay,
yearMonthDuration, dayTimeDuration, string, or anyURI)
- "type-2=type"
- For the second lexicon, use the type specified by type
(int, unsignedInt, long, unsignedLong, float, double, decimal,
dateTime, time, date, gYearMonth, gYear, gMonth, gDay,
yearMonthDuration, dayTimeDuration, string, or anyURI)
- "collation=URI"
- For both lexicons, use the collation specified by
URI.
- "collation-1=URI"
- For the first lexicon, use the collation specified by
URI.
- "collation-2=URI"
- For the second lexicon, use the collation specified by
URI.
- "timezone=TZ"
- Return timezone sensitive values (dateTime, time, date,
gYearMonth, gYear, gMonth, and gDay) adjusted to the timezone
specified by TZ.
Example timezones: Z, -08:00, +01:00.
- "ordered"
- Include co-occurrences only when the value from the first lexicon
appears before the value from the second lexicon.
Requires that word positions be enabled for both lexicons.
- "proximity=N"
- Include co-occurrences only when the values appear within
N words of each other.
Requires that word positions be enabled for both lexicons.
- "limit=N"
- Return no more than N co-occurrences.
- "sample=N"
- Return only co-occurrences from the first N fragments
selected by the
cts:query.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only co-occurrences from the first N fragments
selected by the
cts:query.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
|
$query
(optional):
Only include co-occurrences in fragments selected by the
cts:query.
The co-occurrences do not need to match the query, but they must occur in
fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations. If a string
is entered, the string is treated as a cts:word-query of the
specified string.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "frequency-order" or "item-order" may be specified
in the options parameter. If neither "frequency-order" nor "item-order"
is specified, then the default is "item-order".
Only one of "fragment-frequency" or "item-frequency" may be specified
in the options parameter. If neither "fragment-frequency" nor
"item-frequency" is specified, then the default is "fragment-frequency".
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending" if "item-order" is
specified, and "descending" if "frequency-order" is specified.
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", or "score-simple"
options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", or "score-simple"
are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" is specified,
then the default is "checked".
If "collation=URI" is not specified in the options parameter,
then the default collation is used. If a lexicon with that collation
does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then all included co-occurrences may be returned.
If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then co-occurrences from all fragments selected by the
$query parameter are included.
If a $query parameter is not present, then
"truncate=N" has no effect.
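As an illustration of a co-occurrence lookup, the sketch below pairs two attribute lexicons and reports how often each pair occurs. The element and attribute names are hypothetical, a string attribute range index is assumed on each pair, and each returned cts:co-occurrence element is assumed to hold its two values as cts:value children.

```xquery
xquery version "1.0-ml";

(: Hypothetical markup: <city name="..."/> and <person name="..."/> in the
   same fragments, each with a string attribute range index on @name. :)
for $co in cts:element-attribute-value-co-occurrences(
             xs:QName("city"), xs:QName("name"),
             xs:QName("person"), xs:QName("name"),
             ("frequency-order", "limit=10"))
return fn:concat(
  fn:string($co/cts:value[1]), " + ",
  fn:string($co/cts:value[2]), ": ",
  cts:frequency($co))
```

Since neither "fragment-frequency" nor "item-frequency" is given, the count reported by cts:frequency follows the "fragment-frequency" default described above.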
|
|
|
|
cts:element-attribute-value-match(
  $element-names as xs:QName*,
  $attribute-names as xs:QName*,
  $pattern as xs:anyAtomicType,
  [$options as xs:string*],
  [$query as cts:query?],
  [$quality-weight as xs:double?],
  [$forest-ids as xs:unsignedLong*]
) as xs:anyAtomicType*
|
Summary:
Returns values from the specified element-attribute value lexicon(s)
that match the specified pattern. Element-attribute value lexicons are
implemented using range indexes; consequently this function requires an
attribute range index for each of the element/attribute pairs specified
in the function. If there is not a range index configured for each of the
specified element/attribute pairs, then an exception is thrown.
|
Parameters:
$element-names
:
One or more element QNames.
|
$attribute-names
:
One or more attribute QNames.
|
$pattern
:
A pattern to match. The parameter type must match the lexicon type.
String parameters may include wildcard characters.
|
$options
(optional):
Options. The default is ().
Options include:
- "case-sensitive"
- A case-sensitive match.
- "case-insensitive"
- A case-insensitive match.
- "diacritic-sensitive"
- A diacritic-sensitive match.
- "diacritic-insensitive"
- A diacritic-insensitive match.
- "ascending"
- Values should be returned in ascending order.
- "descending"
- Values should be returned in descending order.
- "any"
- Values from any fragment should be included.
- "document"
- Values from document fragments should be included.
- "properties"
- Values from properties fragments should be included.
- "locks"
- Values from locks fragments should be included.
- "frequency-order"
- Values should be returned ordered by frequency.
- "item-order"
- Values should be returned ordered by item.
- "fragment-frequency"
- Frequency should be the number of fragments with
an included value.
This option is used with
cts:frequency.
- "item-frequency"
- Frequency should be the number of occurrences of
an included value.
This option is used with
cts:frequency.
- "type=type"
- Use the lexicon with the type specified by type
(int, unsignedInt, long, unsignedLong, float, double, decimal,
dateTime, time, date, gYearMonth, gYear, gMonth, gDay,
yearMonthDuration, dayTimeDuration, string, or anyURI)
- "collation=URI"
- Use the range index with the collation specified by
URI.
- "timezone=TZ"
- Return timezone sensitive values (dateTime, time, date,
gYearMonth, gYear, gMonth, and gDay) adjusted to the timezone
specified by TZ.
Example timezones: Z, -08:00, +01:00.
- "limit=N"
- Return no more than N values.
- "sample=N"
- Return only values occurring in the first N fragments
selected by the
cts:query; only values in fragments
satisfying the cts:query are returned, but any analytics
calculations (using cts:frequency, for example)
use all the lexicon values, not just the ones constrained by the
cts:query.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only values from the first N fragments
selected by the
cts:query; only values in fragments
satisfying the cts:query are returned, and only those
values are used in calculating any analytics (using
cts:frequency, for example).
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
|
$query
(optional):
Only include values in fragments selected by the cts:query.
The values do not need to match the query, but they must occur in
fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations. If a string
is entered, the string is treated as a cts:word-query of the
specified string.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "frequency-order" or "item-order" may be specified
in the options parameter. If neither "frequency-order" nor "item-order"
is specified, then the default is "item-order".
Only one of "fragment-frequency" or "item-frequency" may be specified
in the options parameter. If neither "fragment-frequency" nor
"item-frequency" is specified, then the default is "fragment-frequency".
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending" if "item-order" is
specified, and "descending" if "frequency-order" is specified.
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", or "score-simple"
options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", or "score-simple"
are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" is specified,
then the default is "checked".
If "collation=URI" is not specified in the options parameter,
then the default collation is used. If a range index with that collation
does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then all included values may be returned. If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then values from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
If neither "case-sensitive" nor "case-insensitive"
is present, $pattern is used to determine case sensitivity.
If $pattern contains no uppercase, it specifies "case-insensitive".
If $pattern contains uppercase, it specifies "case-sensitive".
If neither "diacritic-sensitive" nor "diacritic-insensitive"
is present, $pattern is used to determine diacritic sensitivity.
If $pattern contains no diacritics, it specifies "diacritic-insensitive".
If $pattern contains diacritics, it specifies "diacritic-sensitive".
When multiple element and/or attribute QNames are specified,
then all possible element/attribute QName combinations are used
to select the matching values.
|
Example:
cts:element-attribute-value-match(xs:QName("animals"),
xs:QName("name"),"aardvark*")
=> ("aardvark","aardvarks")
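The case-sensitivity rule in the usage notes can be overridden explicitly. The sketch below reuses the animals/@name lexicon from the example above. The word "mammal" is a hypothetical constraint, and a string attribute range index on animals/@name is assumed.

```xquery
xquery version "1.0-ml";

(: The pattern contains an uppercase letter, so it would default to a
   case-sensitive match; the explicit option overrides that default. :)
cts:element-attribute-value-match(
  xs:QName("animals"), xs:QName("name"),
  "Aard*",
  ("case-insensitive", "frequency-order", "limit=20"),
  cts:word-query("mammal"))   (: hypothetical constraining query :)
```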
|
|
|
|
cts:element-attribute-value-ranges(
  $element-names as xs:QName*,
  $attribute-names as xs:QName*,
  [$bounds as xs:anyAtomicType*],
  [$options as xs:string*],
  [$query as cts:query?],
  [$quality-weight as xs:double?],
  [$forest-ids as xs:unsignedLong*]
) as element(cts:range)*
|
Summary:
Returns value ranges from the specified element-attribute value lexicon(s).
Element-attribute value lexicons are implemented using range indexes;
consequently this function requires an attribute range index
for each of the element/attribute pairs specified in the function.
If there is not a range index configured for each of the specified
element/attribute pairs, then an exception is thrown.
The values are divided into buckets. The $bounds parameter specifies
the number of buckets and the size of each bucket.
All included values are bucketed, even those less than the lowest bound
or greater than the highest bound. An empty sequence for $bounds specifies
one bucket, a single value specifies two buckets, two values specify
three buckets, and so on.
If you have string values and you pass a $bounds parameter
as in the following call:
cts:element-attribute-value-ranges(xs:QName("myElement"), xs:QName("myAttribute"), ("f", "m"))
The first bucket contains string values that are less than the
string f, the second bucket contains string values greater than
or equal to f but less than m, and the third bucket
contains string values that are greater than or equal to m.
For each non-empty bucket, a cts:range element is returned.
Each cts:range element has a cts:minimum child
and a cts:maximum child. If a bucket is bounded, its
cts:range element will also have a
cts:lower-bound child if it is bounded from below, and
a cts:upper-bound element if it is bounded from above.
Empty buckets return nothing unless the "empties" option is specified.
|
Parameters:
$element-names
:
One or more element QNames.
|
$attribute-names
:
One or more attribute QNames.
|
$bounds
(optional):
A sequence of range bounds.
The types must match the lexicon type.
The values must be in strictly ascending order.
|
$options
(optional):
Options. The default is ().
Options include:
- "ascending"
- Ranges should be returned in ascending order.
- "descending"
- Ranges should be returned in descending order.
- "empties"
- Include fully-bounded ranges whose frequency is 0. These ranges
will have no minimum or maximum value. Only empty ranges that have
both their upper and lower bounds specified in the $bounds
parameter are returned;
any empty ranges that are less than the first bound or greater than the
last bound are not returned. For example, if you specify 4 bounds
and there are no results for any of the bounds, 3 elements are
returned (not 5 elements).
- "any"
- Values from any fragment should be included.
- "document"
- Values from document fragments should be included.
- "properties"
- Values from properties fragments should be included.
- "locks"
- Values from locks fragments should be included.
- "frequency-order"
- Ranges should be returned ordered by frequency.
- "item-order"
- Ranges should be returned ordered by item.
- "fragment-frequency"
- Frequency should be the number of fragments with
an included value.
This option is used with
cts:frequency.
- "item-frequency"
- Frequency should be the number of occurrences of
an included value.
This option is used with
cts:frequency.
- "type=type"
- Use the lexicon with the type specified by type
(int, unsignedInt, long, unsignedLong, float, double, decimal,
dateTime, time, date, gYearMonth, gYear, gMonth, gDay,
yearMonthDuration, dayTimeDuration, string, or anyURI)
- "collation=URI"
- Use the range index with the collation specified by
URI.
- "timezone=TZ"
- Return timezone sensitive values (dateTime, time, date,
gYearMonth, gYear, gMonth, and gDay) adjusted to the timezone
specified by TZ.
Example timezones: Z, -08:00, +01:00.
- "limit=N"
- Return no more than N ranges.
- "sample=N"
- Return only ranges for buckets with at least one value from the
first N fragments selected by the
cts:query.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only values from the first N fragments
selected by the
cts:query.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
|
$query
(optional):
Only include values in fragments selected by the cts:query.
The values do not need to match the query, but they must occur in
fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations. If a string
is entered, the string is treated as a cts:word-query of the
specified string.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "frequency-order" or "item-order" may be specified
in the options parameter. If neither "frequency-order" nor "item-order"
is specified, then the default is "item-order".
Only one of "fragment-frequency" or "item-frequency" may be specified
in the options parameter. If neither "fragment-frequency" nor
"item-frequency" is specified, then the default is "fragment-frequency".
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending" if "item-order" is
specified, and "descending" if "frequency-order" is specified.
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", or "score-simple"
options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", or "score-simple"
are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" is specified,
then the default is "checked".
If "collation=URI" is not specified in the options parameter,
then the default collation is used. If a range index with that collation
does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then ranges with all included values may be returned. If a
$query parameter is not present, then "sample=N"
has no effect.
If "truncate=N" is not specified in the options parameter,
then values from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
When multiple element and/or attribute QNames are specified,
then all possible element/attribute QName combinations are used
to select the matching values.
|
Example:
(: Run the following to load data for this example.
Make sure you have an int element attribute
range index on my-node/@number. :)
for $x in (1 to 10)
return
xdmp:document-insert(fn:concat("/doc", fn:string($x), ".xml"),
<root><my-node number="{$x}"/></root>) ;
(: The following is based on the above setup :)
cts:element-attribute-value-ranges(xs:QName("my-node"),
xs:QName("number"), (5, 10, 15, 20), "empties")
=>
<cts:range xmlns:cts="http://marklogic.com/cts"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<cts:minimum xsi:type="xs:int">1</cts:minimum>
<cts:maximum xsi:type="xs:int">4</cts:maximum>
<cts:upper-bound xsi:type="xs:int">5</cts:upper-bound>
</cts:range>
<cts:range xmlns:cts="http://marklogic.com/cts"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<cts:minimum xsi:type="xs:int">5</cts:minimum>
<cts:maximum xsi:type="xs:int">9</cts:maximum>
<cts:lower-bound xsi:type="xs:int">5</cts:lower-bound>
<cts:upper-bound xsi:type="xs:int">10</cts:upper-bound>
</cts:range>
<cts:range xmlns:cts="http://marklogic.com/cts"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<cts:minimum xsi:type="xs:int">10</cts:minimum>
<cts:maximum xsi:type="xs:int">10</cts:maximum>
<cts:lower-bound xsi:type="xs:int">10</cts:lower-bound>
<cts:upper-bound xsi:type="xs:int">15</cts:upper-bound>
</cts:range>
<cts:range xmlns:cts="http://marklogic.com/cts"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<cts:lower-bound xsi:type="xs:int">15</cts:lower-bound>
<cts:upper-bound xsi:type="xs:int">20</cts:upper-bound>
</cts:range>
<cts:range xmlns:cts="http://marklogic.com/cts"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<cts:lower-bound xsi:type="xs:int">20</cts:lower-bound>
</cts:range>
|
|
|
|
cts:element-attribute-values(
|
|
$element-names as xs:QName*,
|
|
$attribute-names as xs:QName*,
|
|
[$start as xs:anyAtomicType?],
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as xs:anyAtomicType* |
|
 |
Summary:
Returns values from the specified element-attribute value lexicon(s).
Element-attribute value lexicons are implemented using indexes;
consequently this function requires an attribute range index
for each of the element/attribute pairs specified in the function.
If there is not a range index configured for each of the specified
element/attribute pairs, then an exception is thrown.
|
Parameters:
$element-names
:
One or more element QNames.
|
$attribute-names
:
One or more attribute QNames.
|
$start
(optional):
A starting value. The parameter type must match the lexicon type.
If the parameter value is not in the lexicon, then the values are
returned beginning with the next value.
|
$options
(optional):
Options. The default is ().
Options include:
- "ascending"
- Values should be returned in ascending order.
- "descending"
- Values should be returned in descending order.
- "any"
- Values from any fragment should be included.
- "document"
- Values from document fragments should be included.
- "properties"
- Values from properties fragments should be included.
- "locks"
- Values from locks fragments should be included.
- "frequency-order"
- Values should be returned ordered by frequency.
- "item-order"
- Values should be returned ordered by item.
- "fragment-frequency"
- Frequency should be the number of fragments with
an included value.
This option is used with
cts:frequency.
- "item-frequency"
- Frequency should be the number of occurrences of
an included value.
This option is used with
cts:frequency.
- "type=type"
- Use the lexicon with the type specified by type
(int, unsignedInt, long, unsignedLong, float, double, decimal,
dateTime, time, date, gYearMonth, gYear, gMonth, gDay,
yearMonthDuration, dayTimeDuration, string, or anyURI)
- "collation=URI"
- Use the range index with the collation specified by
URI.
- "timezone=TZ"
- Return timezone sensitive values (dateTime, time, date,
gYearMonth, gYear, gMonth, and gDay) adjusted to the timezone
specified by TZ.
Example timezones: Z, -08:00, +01:00.
- "limit=N"
- Return no more than N values.
- "sample=N"
- Return only values from the first N fragments
selected by the
cts:query.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only values from the first N fragments
selected by the
cts:query.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
|
$query
(optional):
Only include values in fragments selected by the cts:query.
The values do not need to match the query, but they must occur in
fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations. If a string
is entered, the string is treated as a cts:word-query of the
specified string.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
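Because the optional parameters are positional, constraining the call to particular forests means supplying every argument that precedes $forest-ids. A minimal sketch of that, assuming a forest named "Documents" and a string attribute range index on animal/@name (both names are illustrative, not part of this reference):

```xquery
xquery version "1.0-ml";

(: Sketch only. Assumes a forest named "Documents" and a string
   attribute range index on animal/@name; both are illustrative. :)
cts:element-attribute-values(
  xs:QName("animal"), xs:QName("name"),
  (),                        (: $start: begin with the first value :)
  ("limit=10"),              (: $options :)
  (),                        (: $query: no query constraint :)
  1.0,                       (: $quality-weight: the documented default :)
  xdmp:forest("Documents"))  (: $forest-ids: restrict to one forest :)
```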
|
|
Usage Notes:
Only one of "frequency-order" or "item-order" may be specified
in the options parameter. If neither "frequency-order" nor "item-order"
is specified, then the default is "item-order".
Only one of "fragment-frequency" or "item-frequency" may be specified
in the options parameter. If neither "fragment-frequency" nor
"item-frequency" is specified, then the default is "fragment-frequency".
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending" if "item-order" is
specified, and "descending" if "frequency-order" is specified.
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", or "score-simple"
options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", or "score-simple"
are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" is specified,
then the default is "checked".
If "collation=URI" is not specified in the options parameter,
then the default collation is used. If a range index with that collation
does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then all included values may be returned. If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then values from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
When multiple element and/or attribute QNames are specified,
then all possible element/attribute QName combinations are used
to select the matching values.
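As an illustration of the combination rule, the following sketch passes two element QNames and two attribute QNames, which yields four element/attribute pairs. The QNames are hypothetical, and per the summary above the call presumably needs a range index for each pair:

```xquery
xquery version "1.0-ml";

(: Sketch only. The QNames are illustrative. Two element names and two
   attribute names give four element/attribute pairs, so this assumes
   a range index exists for each of:
   animal/@name, animal/@alias, plant/@name, plant/@alias. :)
cts:element-attribute-values(
  (xs:QName("animal"), xs:QName("plant")),
  (xs:QName("name"), xs:QName("alias")))
```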
|
Example:
cts:element-attribute-values(xs:QName("animal"),
xs:QName("name"),
"aardvark")
=> ("aardvark","aardvarks","aardwolf",...)
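The ordering and frequency options can be combined with cts:frequency to list the most common values first. This is a sketch under the same assumption as the example above (a string attribute range index on animal/@name); the limit of 10 is arbitrary:

```xquery
xquery version "1.0-ml";

(: Sketch only. Assumes a string attribute range index on animal/@name. :)
for $value in cts:element-attribute-values(
    xs:QName("animal"), xs:QName("name"),
    (),   (: no $start: begin at the first value :)
    ("frequency-order", "item-frequency", "limit=10"))
(: cts:frequency reports the count selected by "item-frequency" :)
return fn:concat($value, ": ", cts:frequency($value))
```

With "frequency-order" and no explicit direction, the usage notes above imply a descending order, so the most frequent values come first.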
|
|
|
|
cts:element-attribute-word-match(
|
|
$element-names as xs:QName*,
|
|
$attribute-names as xs:QName*,
|
|
$pattern as xs:string,
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as xs:string* |
|
 |
Summary:
Returns words from the specified element-attribute word lexicon(s) that
match a wildcard pattern. This function requires an element-attribute
word lexicon for each of the element/attribute pairs specified in the
function. If there is not an element-attribute word lexicon
configured for any of the specified element/attribute pairs, then
an exception is thrown.
|
Parameters:
$element-names
:
One or more element QNames.
|
$attribute-names
:
One or more attribute QNames.
|
$pattern
:
Wildcard pattern to match.
|
$options
(optional):
Options. The default is ().
Options include:
- "case-sensitive"
- A case-sensitive match.
- "case-insensitive"
- A case-insensitive match.
- "diacritic-sensitive"
- A diacritic-sensitive match.
- "diacritic-insensitive"
- A diacritic-insensitive match.
- "ascending"
- Words should be returned in ascending order.
- "descending"
- Words should be returned in descending order.
- "any"
- Words from any fragment should be included.
- "document"
- Words from document fragments should be included.
- "properties"
- Words from properties fragments should be included.
- "locks"
- Words from locks fragments should be included.
- "collation=URI"
- Use the lexicon with the collation specified by URI.
- "limit=N"
- Return no more than N words.
- "sample=N"
- Return only words from the first N fragments
selected by the
cts:query.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only words from the first N fragments
selected by the
cts:query.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
|
$query
(optional):
Only include words in fragments selected by the cts:query.
The words do not need to match the query, but the words must occur
in fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations. If a string
is entered, the string is treated as a cts:word-query of the
specified string.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending".
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", or "score-simple"
options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", or "score-simple"
are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" is specified,
then the default is "checked".
If "collation=URI" is not specified in the options parameter,
then the default collation is used. If a lexicon with that collation
does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then all included words may be returned. If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then words from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
If neither "case-sensitive" nor "case-insensitive"
is present, $pattern is used to determine case sensitivity.
If $pattern contains no uppercase, it specifies "case-insensitive".
If $pattern contains uppercase, it specifies "case-sensitive".
If neither "diacritic-sensitive" nor "diacritic-insensitive"
is present, $pattern is used to determine diacritic sensitivity.
If $pattern contains no diacritics, it specifies "diacritic-insensitive".
If $pattern contains diacritics, it specifies "diacritic-sensitive".
When multiple element and/or attribute QNames are specified,
then all possible element/attribute QName combinations are used
to select the matching words.
|
Example:
cts:element-attribute-word-match(xs:QName("animals"),
xs:QName("name"),"aardvark*")
=> ("aardvark","aardvarks")
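The usage notes say that case and diacritic sensitivity are inferred from $pattern unless an option states them explicitly. The sketch below uses a pattern containing an uppercase letter, which would otherwise imply "case-sensitive", and sets the sensitivity options explicitly instead. The element and attribute names are illustrative, and an element-attribute word lexicon on that pair is assumed:

```xquery
xquery version "1.0-ml";

(: Sketch only. Assumes an element-attribute word lexicon on
   animals/@name; the names are illustrative. :)
cts:element-attribute-word-match(
  xs:QName("animals"), xs:QName("name"),
  "Aardvark*",
  (: explicit options take the place of the pattern-derived defaults :)
  ("case-insensitive", "diacritic-insensitive", "limit=5"))
```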
|
|
|
|
cts:element-attribute-words(
|
|
$element-names as xs:QName*,
|
|
$attribute-names as xs:QName*,
|
|
[$start as xs:string?],
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as xs:string* |
|
 |
Summary:
Returns words from the specified element-attribute word lexicon(s).
This function requires an element-attribute word lexicon for each of the
element/attribute pairs specified in the function. If there is not an
element-attribute word lexicon configured for any of the specified
element/attribute pairs, then an exception is thrown. The words are
returned in collation order.
|
Parameters:
$element-names
:
One or more element QNames.
|
$attribute-names
:
One or more attribute QNames.
|
$start
(optional):
A starting word. Returns only this word and any following words
from the lexicon. If the parameter is not in the lexicon, then it
returns the words beginning with the next word.
|
$options
(optional):
Options. The default is ().
Options include:
- "ascending"
- Words should be returned in ascending order.
- "descending"
- Words should be returned in descending order.
- "any"
- Words from any fragment should be included.
- "document"
- Words from document fragments should be included.
- "properties"
- Words from properties fragments should be included.
- "locks"
- Words from locks fragments should be included.
- "collation=URI"
- Use the lexicon with the collation specified by URI.
- "limit=N"
- Return no more than N words.
- "sample=N"
- Return only words from the first N fragments
selected by the
cts:query.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only words from the first N fragments
selected by the
cts:query.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
|
$query
(optional):
Only include words in fragments selected by the cts:query.
The words do not need to match the query, but the words must occur
in fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations. If a string
is entered, the string is treated as a cts:word-query of the
specified string.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending".
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", or "score-simple"
options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", or "score-simple"
are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" is specified,
then the default is "checked".
If "collation=URI" is not specified in the options parameter,
then the default collation is used. If a lexicon with that collation
does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then all included words may be returned. If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then words from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
When multiple element and/or attribute QNames are specified,
then all possible element/attribute QName combinations are used
to select the matching words.
|
Example:
cts:element-attribute-words(xs:QName("animal"),
xs:QName("name"),
"aardvark")
=> ("aardvark","aardvarks","aardwolf",...)
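A $query argument restricts the words to those occurring in fragments selected by the query. The following sketch assumes an element-attribute word lexicon on animal/@name and a hypothetical collection named "mammals"; neither comes from this reference:

```xquery
xquery version "1.0-ml";

(: Sketch only. Assumes an element-attribute word lexicon on
   animal/@name and a collection named "mammals" (both illustrative). :)
cts:element-attribute-words(
  xs:QName("animal"), xs:QName("name"),
  (),                              (: no $start word :)
  ("descending", "limit=20"),      (: reverse collation order, at most 20 words :)
  cts:collection-query("mammals")) (: only fragments in this collection :)
```

Because a $query is supplied and no fragment option is given, the usage notes above imply the "document" default applies here.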
|
|
|
|
cts:element-value-co-occurrences(
|
|
$element-name-1 as xs:QName,
|
|
$element-name-2 as xs:QName,
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as element(cts:co-occurrence)* |
|
 |
Summary:
Returns value co-occurrences (that is, pairs of values, both of which appear
in the same fragment) from the specified element value lexicon(s). The
values are returned as an XML element with two children, each child
containing one of the co-occurring values. You can use
cts:frequency on each item returned to find how many times
the pair occurs.
Value lexicons are implemented using range indexes; consequently
this function requires an element range index for each element specified
in the function, and the range index must have range value positions
set to true. If there is not a range index configured for each
of the specified elements, or if range value positions are not
enabled for any of the range indexes, an exception is thrown.
|
Parameters:
$element-name-1
:
An element QName.
|
$element-name-2
:
An element QName.
|
$options
(optional):
Options. The default is ().
Options include:
- "ascending"
- Co-occurrences should be returned in ascending order.
- "descending"
- Co-occurrences should be returned in descending order.
- "any"
- Co-occurrences from any fragment should be included.
- "document"
- Co-occurrences from document fragments should be included.
- "properties"
- Co-occurrences from properties fragments should be included.
- "locks"
- Co-occurrences from locks fragments should be included.
- "frequency-order"
- Co-occurrences should be returned ordered by frequency.
- "item-order"
- Co-occurrences should be returned ordered by item.
- "fragment-frequency"
- Frequency should be the number of fragments with
an included co-occurrence.
This option is used with
cts:frequency.
- "item-frequency"
- Frequency should be the number of occurrences of
an included co-occurrence.
This option is used with
cts:frequency.
- "type=type"
- For both lexicons, use the type specified by type
(int, unsignedInt, long, unsignedLong, float, double, decimal,
dateTime, time, date, gYearMonth, gYear, gMonth, gDay,
yearMonthDuration, dayTimeDuration, string, or anyURI)
- "type-1=type"
- For the first lexicon, use the type specified by type
(int, unsignedInt, long, unsignedLong, float, double, decimal,
dateTime, time, date, gYearMonth, gYear, gMonth, gDay,
yearMonthDuration, dayTimeDuration, string, or anyURI)
- "type-2=type"
- For the second lexicon, use the type specified by type
(int, unsignedInt, long, unsignedLong, float, double, decimal,
dateTime, time, date, gYearMonth, gYear, gMonth, gDay,
yearMonthDuration, dayTimeDuration, string, or anyURI)
- "collation=URI"
- For both lexicons, use the collation specified by
URI.
- "collation-1=URI"
- For the first lexicon, use the collation specified by
URI.
- "collation-2=URI"
- For the second lexicon, use the collation specified by
URI.
- "timezone=TZ"
- Return timezone sensitive values (dateTime, time, date,
gYearMonth, gYear, gMonth, and gDay) adjusted to the timezone
specified by TZ.
Example timezones: Z, -08:00, +01:00.
- "ordered"
- Include co-occurrences only when the value from the first lexicon
appears before the value from the second lexicon.
Requires that word positions be enabled for both lexicons.
- "proximity=N"
- Include co-occurrences only when the values appear within
N words of each other.
Requires that word positions be enabled for both lexicons.
- "limit=N"
- Return no more than N co-occurrences.
- "sample=N"
- Return only co-occurrences from the first N fragments
selected by the
cts:query.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only co-occurrences from the first N fragments
selected by the
cts:query.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
|
$query
(optional):
Only include co-occurrences in fragments selected by the
cts:query.
The co-occurrences do not need to match the query, but they must occur in
fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations. If a string
is entered, the string is treated as a cts:word-query of the
specified string.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "frequency-order" or "item-order" may be specified
in the options parameter. If neither "frequency-order" nor "item-order"
is specified, then the default is "item-order".
Only one of "fragment-frequency" or "item-frequency" may be specified
in the options parameter. If neither "fragment-frequency" nor
"item-frequency" is specified, then the default is "fragment-frequency".
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending" if "item-order" is
specified, and "descending" if "frequency-order" is specified.
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", or "score-simple"
options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", or "score-simple"
are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" is specified,
then the default is "checked".
If "collation=URI" is not specified in the options parameter,
then the default collation is used. If a lexicon with that collation
does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then all included co-occurrences may be returned.
If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then co-occurrences from all fragments selected by the
$query parameter are included.
If a $query parameter is not present, then
"truncate=N" has no effect.
|
Example:
(: this query has the database fragmented on SPEECH and
finds SPEAKERs that co-occur in a SPEECH :)
cts:element-value-co-occurrences(
xs:QName("SPEAKER"),xs:QName("SPEAKER"),
("frequency-order","ordered"),
cts:document-query("hamlet.xml"))[1 to 3]
=>
<cts:co-occurrence xmlns:cts="http://marklogic.com/cts"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<cts:value xsi:type="xs:string">MARCELLUS</cts:value>
<cts:value xsi:type="xs:string">BERNARDO</cts:value>
</cts:co-occurrence>
<cts:co-occurrence xmlns:cts="http://marklogic.com/cts"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<cts:value xsi:type="xs:string">ROSENCRANTZ</cts:value>
<cts:value xsi:type="xs:string">GUILDENSTERN</cts:value>
</cts:co-occurrence>
<cts:co-occurrence xmlns:cts="http://marklogic.com/cts"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<cts:value xsi:type="xs:string">HORATIO</cts:value>
<cts:value xsi:type="xs:string">MARCELLUS</cts:value>
</cts:co-occurrence>
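As the summary notes, cts:frequency can be applied to each returned co-occurrence. A sketch of that, reusing the query from the example above and therefore assuming the same setup (hamlet.xml loaded, fragmented on SPEECH, and an element range index on SPEAKER with range value positions enabled):

```xquery
xquery version "1.0-ml";

(: Sketch only. Same assumptions as the example above. :)
for $pair in cts:element-value-co-occurrences(
    xs:QName("SPEAKER"), xs:QName("SPEAKER"),
    ("frequency-order", "ordered"),
    cts:document-query("hamlet.xml"))[1 to 3]
return fn:concat(
  fn:string($pair/cts:value[1]), " / ",
  fn:string($pair/cts:value[2]), ": ",
  cts:frequency($pair))  (: how many times the pair occurs :)
```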
|
|
|
|
cts:element-value-match(
|
|
$element-names as xs:QName*,
|
|
$pattern as xs:anyAtomicType,
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as xs:anyAtomicType* |
|
 |
Summary:
Returns values from the specified element value lexicon(s)
that match the specified wildcard pattern. Element value lexicons
are implemented using range indexes; consequently this function
requires an element range index for each element specified in the
function. If there is not a range index configured for each of the
specified elements, then an exception is thrown.
|
Parameters:
$element-names
:
One or more element QNames.
|
$pattern
:
A pattern to match. The parameter type must match the lexicon type.
String parameters may include wildcard characters.
|
$options
(optional):
Options. The default is ().
Options include:
- "case-sensitive"
- A case-sensitive match.
- "case-insensitive"
- A case-insensitive match.
- "diacritic-sensitive"
- A diacritic-sensitive match.
- "diacritic-insensitive"
- A diacritic-insensitive match.
- "ascending"
- Values should be returned in ascending order.
- "descending"
- Values should be returned in descending order.
- "any"
- Values from any fragment should be included.
- "document"
- Values from document fragments should be included.
- "properties"
- Values from properties fragments should be included.
- "locks"
- Values from locks fragments should be included.
- "frequency-order"
- Values should be returned ordered by frequency.
- "item-order"
- Values should be returned ordered by item.
- "fragment-frequency"
- Frequency should be the number of fragments with
an included value.
This option is used with
cts:frequency.
- "item-frequency"
- Frequency should be the number of occurrences of
an included value.
This option is used with
cts:frequency.
- "type=type"
- Use the lexicon with the type specified by type
(int, unsignedInt, long, unsignedLong, float, double, decimal,
dateTime, time, date, gYearMonth, gYear, gMonth, gDay,
yearMonthDuration, dayTimeDuration, string, or anyURI)
- "collation=URI"
- Use the range index with the collation specified by
URI.
- "timezone=TZ"
- Return timezone sensitive values (dateTime, time, date,
gYearMonth, gYear, gMonth, and gDay) adjusted to the timezone
specified by TZ.
Example timezones: Z, -08:00, +01:00.
- "limit=N"
- Return no more than N values.
- "sample=N"
- Return only values occurring in the first N fragments
selected by the
cts:query; only values in fragments
satisfying the cts:query are returned, but any analytics
calculations (using cts:frequency, for example)
use all the lexicon values, not just the ones constrained by the
cts:query.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only values from the first N fragments
selected by the
cts:query; only values in fragments
satisfying the cts:query are returned, and only those
values are used in calculating any analytics (using
cts:frequency, for example).
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
|
$query
(optional):
Only include values in fragments selected by the cts:query.
The values do not need to match the query, but they must occur in
fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations. If a string
is entered, the string is treated as a cts:word-query of the
specified string.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "frequency-order" or "item-order" may be specified
in the options parameter. If neither "frequency-order" nor "item-order"
is specified, then the default is "item-order".
Only one of "fragment-frequency" or "item-frequency" may be specified
in the options parameter. If neither "fragment-frequency" nor
"item-frequency" is specified, then the default is "fragment-frequency".
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending" if "item-order" is
specified, and "descending" if "frequency-order" is specified.
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", or "score-simple"
options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", or "score-simple"
is specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" is specified,
then the default is "checked".
If "collation=URI" is not specified in the options parameter,
then the default collation is used. If a range index with that collation
does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then all included values may be returned. If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then values from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
If neither "case-sensitive" nor "case-insensitive"
is present, $pattern is used to determine case sensitivity.
If $pattern contains no uppercase, it specifies "case-insensitive".
If $pattern contains uppercase, it specifies "case-sensitive".
If neither "diacritic-sensitive" nor "diacritic-insensitive"
is present, $pattern is used to determine diacritic sensitivity.
If $pattern contains no diacritics, it specifies "diacritic-insensitive".
If $pattern contains diacritics, it specifies "diacritic-sensitive".
|
Example:
cts:collection-match("aardvark*")
=> ("aardvark","aardvarks")
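The $options and $query parameters described above can be combined to restrict a collection match. The following is a minimal sketch, assuming the collection lexicon is enabled; the collection URI prefix "/animals/" and the word "nocturnal" are hypothetical and chosen only for illustration.

```xquery
xquery version "1.0-ml";

(: Hypothetical data: collection URIs that begin with "/animals/". :)
(: Return at most 10 matching collection URIs, considering only
   collections on fragments selected by the word query. :)
cts:collection-match(
  "/animals/*",
  ("case-sensitive", "limit=10"),
  cts:word-query("nocturnal"))
```

Because a $query is supplied and none of "any", "document", "properties", or "locks" is given, the call uses the "document" default described in the usage notes.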
|
|
|
|
cts:element-value-ranges(
|
|
$element-names as xs:QName*,
|
|
[$bounds as xs:anyAtomicType*],
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as element(cts:range)* |
|
Summary:
Returns value ranges from the specified element value lexicon(s).
Value lexicons are implemented using range indexes; consequently this
function requires an element range index for each element specified
in the function. If there is not a range index configured for each
of the specified elements, an exception is thrown.
The values are divided into buckets. The $bounds parameter specifies
the number of buckets and the size of each bucket.
All included values are bucketed, even those less than the lowest bound
or greater than the highest bound. An empty sequence for $bounds specifies
one bucket, a single value specifies two buckets, two values specify
three buckets, and so on.
If you have string values and you pass a $bounds parameter
as in the following call:
cts:element-value-ranges(xs:QName("myElement"), ("f", "m"))
The first bucket contains string values that are less than the
string f, the second bucket contains string values greater than
or equal to f but less than m, and the third bucket
contains string values that are greater than or equal to m.
For each non-empty bucket, a cts:range element is returned.
Each cts:range element has a cts:minimum child
and a cts:maximum child. If a bucket is bounded, its
cts:range element will also have a
cts:lower-bound child if it is bounded from below, and
a cts:upper-bound element if it is bounded from above.
Empty buckets return nothing unless the "empties" option is specified.
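The children of each cts:range element can be read with ordinary path expressions. The following is a minimal sketch, assuming a string element range index on the myElement element used in the call above; the label format is an arbitrary choice for illustration.

```xquery
xquery version "1.0-ml";

(: Assumes a string range index on the element myElement. :)
for $range in cts:element-value-ranges(xs:QName("myElement"), ("f", "m"))
return
  fn:concat(
    (: lower-bound is absent for the first bucket and upper-bound is
       absent for the last bucket; fn:string() of an empty sequence
       yields the empty string. :)
    "[", fn:string($range/cts:lower-bound), ", ",
    fn:string($range/cts:upper-bound), ") holds values from ",
    fn:string($range/cts:minimum), " to ",
    fn:string($range/cts:maximum),
    " (frequency ", cts:frequency($range), ")")
```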
|
Parameters:
$element-names
:
One or more element QNames.
|
$bounds
(optional):
A sequence of range bounds.
The types must match the lexicon type.
The values must be in strictly ascending order, otherwise an exception
is thrown.
|
$options
(optional):
Options. The default is ().
Options include:
- "ascending"
- Ranges should be returned in ascending order.
- "descending"
- Ranges should be returned in descending order.
- "empties"
- Include fully-bounded ranges whose frequency is 0. These ranges
will have no minimum or maximum value. Only empty ranges that have
both their upper and lower bounds specified in the $bounds
parameter are returned;
any empty ranges that are less than the first bound or greater than the
last bound are not returned. For example, if you specify 4 bounds
and there are no results for any of the bounds, 3 elements are
returned (not 5 elements).
- "any"
- Values from any fragment should be included.
- "document"
- Values from document fragments should be included.
- "properties"
- Values from properties fragments should be included.
- "locks"
- Values from locks fragments should be included.
- "frequency-order"
- Ranges should be returned ordered by frequency.
- "item-order"
- Ranges should be returned ordered by item.
- "fragment-frequency"
- Frequency should be the number of fragments with
an included value.
This option is used with
cts:frequency.
- "item-frequency"
- Frequency should be the number of occurrences of
an included value.
This option is used with
cts:frequency.
- "type=type"
- Use the lexicon with the type specified by type
(int, unsignedInt, long, unsignedLong, float, double, decimal,
dateTime, time, date, gYearMonth, gYear, gMonth, gDay,
yearMonthDuration, dayTimeDuration, string, or anyURI)
- "collation=URI"
- Use the lexicon with the collation specified by URI.
- "timezone=TZ"
- Return timezone sensitive values (dateTime, time, date,
gYearMonth, gYear, gMonth, and gDay) adjusted to the timezone
specified by TZ.
Example timezones: Z, -08:00, +01:00.
- "limit=N"
- Return no more than N ranges.
- "sample=N"
- Return only ranges for buckets with at least one value from the
first N fragments selected by the
cts:query.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only values from the first N fragments
selected by the
cts:query.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
|
$query
(optional):
Only include values in fragments selected by the cts:query.
The values do not need to match the query, but they must occur in
fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations. If a string
is entered, the string is treated as a cts:word-query of the
specified string.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "frequency-order" or "item-order" may be specified
in the options parameter. If neither "frequency-order" nor "item-order"
is specified, then the default is "item-order".
Only one of "fragment-frequency" or "item-frequency" may be specified
in the options parameter. If neither "fragment-frequency" nor
"item-frequency" is specified, then the default is "fragment-frequency".
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending" if "item-order" is
specified, and "descending" if "frequency-order" is specified.
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", or "score-simple"
options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", or "score-simple"
is specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" is specified,
then the default is "checked".
If "collation=URI" is not specified in the options parameter,
then the default collation is used. If a lexicon with that collation
does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then ranges with all included values may be returned. If a
$query parameter is not present, then "sample=N"
has no effect.
If "truncate=N" is not specified in the options parameter,
then values from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
|
Example:
(: Run the following to load data for this example.
Make sure you have an int element range index on
number. :)
for $x in (1 to 10)
return
xdmp:document-insert(fn:concat("/doc", fn:string($x), ".xml"),
<root><number>{$x}</number></root>) ;
(: The following is based on the above setup :)
cts:element-value-ranges(xs:QName("number"),
(5, 10, 15, 20), "empties")
=>
<cts:range xmlns:cts="http://marklogic.com/cts"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<cts:minimum xsi:type="xs:int">1</cts:minimum>
<cts:maximum xsi:type="xs:int">4</cts:maximum>
<cts:upper-bound xsi:type="xs:int">5</cts:upper-bound>
</cts:range>
<cts:range xmlns:cts="http://marklogic.com/cts"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<cts:minimum xsi:type="xs:int">5</cts:minimum>
<cts:maximum xsi:type="xs:int">9</cts:maximum>
<cts:lower-bound xsi:type="xs:int">5</cts:lower-bound>
<cts:upper-bound xsi:type="xs:int">10</cts:upper-bound>
</cts:range>
<cts:range xmlns:cts="http://marklogic.com/cts"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<cts:minimum xsi:type="xs:int">10</cts:minimum>
<cts:maximum xsi:type="xs:int">10</cts:maximum>
<cts:lower-bound xsi:type="xs:int">10</cts:lower-bound>
<cts:upper-bound xsi:type="xs:int">15</cts:upper-bound>
</cts:range>
<cts:range xmlns:cts="http://marklogic.com/cts"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<cts:lower-bound xsi:type="xs:int">15</cts:lower-bound>
<cts:upper-bound xsi:type="xs:int">20</cts:upper-bound>
</cts:range>
<cts:range xmlns:cts="http://marklogic.com/cts"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<cts:lower-bound xsi:type="xs:int">20</cts:lower-bound>
</cts:range>
|
Example:
(: this query has the database fragmented on SPEECH and
finds four ranges of SPEAKERs :)
cts:element-value-ranges(xs:QName("SPEAKER"),("F","N","S"));
=>
<cts:range xmlns:cts="http://marklogic.com/cts"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<cts:minimum xsi:type="xs:string">All</cts:minimum>
<cts:maximum xsi:type="xs:string">Danes</cts:maximum>
<cts:upper-bound xsi:type="xs:string">F</cts:upper-bound>
</cts:range>
<cts:range xmlns:cts="http://marklogic.com/cts"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<cts:minimum xsi:type="xs:string">First Ambassador</cts:minimum>
<cts:maximum xsi:type="xs:string">Messenger</cts:maximum>
<cts:lower-bound xsi:type="xs:string">F</cts:lower-bound>
<cts:upper-bound xsi:type="xs:string">N</cts:upper-bound>
</cts:range>
<cts:range xmlns:cts="http://marklogic.com/cts"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<cts:minimum xsi:type="xs:string">OPHELIA</cts:minimum>
<cts:maximum xsi:type="xs:string">ROSENCRANTZ</cts:maximum>
<cts:lower-bound xsi:type="xs:string">N</cts:lower-bound>
<cts:upper-bound xsi:type="xs:string">S</cts:upper-bound>
</cts:range>
<cts:range xmlns:cts="http://marklogic.com/cts"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<cts:minimum xsi:type="xs:string">Second Clown</cts:minimum>
<cts:maximum xsi:type="xs:string">VOLTIMAND</cts:maximum>
<cts:lower-bound xsi:type="xs:string">S</cts:lower-bound>
</cts:range>
|
Example:
(: this is the same query as above, but it returns the count
of SPEAKERs in each bucket :)
for $bucket in cts:element-value-ranges(xs:QName("SPEAKER"),("F","N","S"))
return cts:frequency($bucket);
=>
9602
11329
5167
4983
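The per-bucket counts can also be turned into relative shares. The following is a minimal sketch under the same assumptions as the SPEAKER examples (a string range index on SPEAKER); the output format and the rounding to whole percentages are illustrative choices, not part of the API.

```xquery
xquery version "1.0-ml";

let $ranges :=
  cts:element-value-ranges(xs:QName("SPEAKER"), ("F", "N", "S"))
(: Total frequency across all returned (non-empty) buckets. :)
let $total := fn:sum(for $r in $ranges return cts:frequency($r))
for $r in $ranges
return
  fn:concat(
    fn:string($r/cts:minimum), " to ", fn:string($r/cts:maximum), ": ",
    fn:round(100 * cts:frequency($r) div $total), "%")
```

If no buckets are returned, the for clause does not iterate, so the division by $total is never evaluated with a zero total.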
|
|
|
|
cts:element-values(
|
|
$element-names as xs:QName*,
|
|
[$start as xs:anyAtomicType?],
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as xs:anyAtomicType* |
|
Summary:
Returns values from the specified element value lexicon(s).
Value lexicons are implemented using range indexes; consequently this
function requires an element range index for each element specified
in the function. If there is not a range index configured for each
of the specified elements, an exception is thrown.
|
Parameters:
$element-names
:
One or more element QNames.
|
$start
(optional):
A starting value. The parameter type must match the lexicon type.
If the parameter value is not in the lexicon, then the values are
returned beginning with the next value.
|
$options
(optional):
Options. The default is ().
Options include:
- "ascending"
- Values should be returned in ascending order.
- "descending"
- Values should be returned in descending order.
- "any"
- Values from any fragment should be included.
- "document"
- Values from document fragments should be included.
- "properties"
- Values from properties fragments should be included.
- "locks"
- Values from locks fragments should be included.
- "frequency-order"
- Values should be returned ordered by frequency.
- "item-order"
- Values should be returned ordered by item.
- "fragment-frequency"
- Frequency should be the number of fragments with
an included value.
This option is used with
cts:frequency.
- "item-frequency"
- Frequency should be the number of occurrences of
an included value.
This option is used with
cts:frequency.
- "type=type"
- Use the lexicon with the type specified by type
(int, unsignedInt, long, unsignedLong, float, double, decimal,
dateTime, time, date, gYearMonth, gYear, gMonth, gDay,
yearMonthDuration, dayTimeDuration, string, or anyURI)
- "collation=URI"
- Use the lexicon with the collation specified by URI.
- "timezone=TZ"
- Return timezone sensitive values (dateTime, time, date,
gYearMonth, gYear, gMonth, and gDay) adjusted to the timezone
specified by TZ.
Example timezones: Z, -08:00, +01:00.
- "limit=N"
- Return no more than N values.
- "sample=N"
- Return only values occurring in the first N fragments
selected by the
cts:query; only values in fragments
satisfying the cts:query are returned, but any analytics
calculations (using cts:frequency, for example)
use all the lexicon values, not just the ones constrained by the
cts:query.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only values from the first N fragments
selected by the
cts:query; only values in fragments
satisfying the cts:query are returned, and only those
values are used in calculating any analytics (using
cts:frequency, for example).
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
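The "concurrent" option is typically useful when one query resolves several facets. The following is a minimal sketch, assuming range indexes exist on two hypothetical elements named color and size; the word "shirt" is likewise only an illustration.

```xquery
xquery version "1.0-ml";

(: Hypothetical elements "color" and "size", each with a range index. :)
let $q := cts:word-query("shirt")
let $opts := ("concurrent", "frequency-order", "limit=10")
(: Both lexicon calls are issued before either result is consumed,
   which gives the optimizer the chance to run them in parallel. :)
let $colors := cts:element-values(xs:QName("color"), (), $opts, $q)
let $sizes := cts:element-values(xs:QName("size"), (), $opts, $q)
return (
  for $c in $colors
  return fn:concat("color: ", $c, " (", cts:frequency($c), ")"),
  for $s in $sizes
  return fn:concat("size: ", $s, " (", cts:frequency($s), ")")
)
```

Since "concurrent" is only a hint, the results are the same with or without it; only the scheduling of the lexicon work may differ.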
|
$query
(optional):
Only include values in fragments selected by the cts:query.
The values do not need to match the query, but they must occur in
fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations. If a string
is entered, the string is treated as a cts:word-query of the
specified string.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "frequency-order" or "item-order" may be specified
in the options parameter. If neither "frequency-order" nor "item-order"
is specified, then the default is "item-order".
Only one of "fragment-frequency" or "item-frequency" may be specified
in the options parameter. If neither "fragment-frequency" nor
"item-frequency" is specified, then the default is "fragment-frequency".
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending" if "item-order" is
specified, and "descending" if "frequency-order" is specified.
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", or "score-simple"
options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", or "score-simple"
is specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" is specified,
then the default is "checked".
If "collation=URI" is not specified in the options parameter,
then the default collation is used. If a lexicon with that collation
does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then all included values may be returned. If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then values from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
|
Example:
cts:element-values(xs:QName("animal"),"aardvark")
=> ("aardvark","aardvarks","aardwolf",...)
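The optional parameters are positional, so supplying $forest-ids requires passing $query and $quality-weight as well. The following is a minimal sketch, assuming a string range index on animal; the collection name "mammals" and the forest name "Documents" are hypothetical.

```xquery
xquery version "1.0-ml";

(: Hypothetical names: collection "mammals", forest "Documents". :)
cts:element-values(
  xs:QName("animal"),
  "aardvark",                        (: $start :)
  "limit=10",                        (: $options :)
  cts:collection-query("mammals"),   (: $query :)
  1.0,                               (: $quality-weight, the default :)
  xdmp:forest("Documents"))          (: $forest-ids :)
```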
|
|
|
|
cts:element-word-match(
|
|
$element-names as xs:QName*,
|
|
$pattern as xs:string,
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as xs:string* |
|
Summary:
Returns words from the specified element word lexicon(s) that match
a wildcard pattern. This function requires an element word lexicon
configured for each of the specified elements in the function. If there
is not an element word lexicon configured for any of the specified
elements, an exception is thrown.
|
Parameters:
$element-names
:
One or more element QNames.
|
$pattern
:
Wildcard pattern to match.
|
$options
(optional):
Options. The default is ().
Options include:
- "case-sensitive"
- A case-sensitive match.
- "case-insensitive"
- A case-insensitive match.
- "diacritic-sensitive"
- A diacritic-sensitive match.
- "diacritic-insensitive"
- A diacritic-insensitive match.
- "ascending"
- Words should be returned in ascending order.
- "descending"
- Words should be returned in descending order.
- "any"
- Words from any fragment should be included.
- "document"
- Words from document fragments should be included.
- "properties"
- Words from properties fragments should be included.
- "locks"
- Words from locks fragments should be included.
- "collation=URI"
- Use the lexicon with the collation specified by URI.
- "limit=N"
- Return no more than N words.
- "sample=N"
- Return only words from the first N fragments
selected by the
cts:query.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only words from the first N fragments
selected by the
cts:query.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
|
$query
(optional):
Only include words in fragments selected by the cts:query.
The words do not need to match the query, but the words must occur
in fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations. If a string
is entered, the string is treated as a cts:word-query of the
specified string.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending".
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", or "score-simple"
options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", or "score-simple"
is specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" is specified,
then the default is "checked".
If "collation=URI" is not specified in the options parameter,
then the default collation is used. If a lexicon with that collation
does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then all included words may be returned. If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then words from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
If neither "case-sensitive" nor "case-insensitive"
is present, $pattern is used to determine case sensitivity.
If $pattern contains no uppercase, it specifies "case-insensitive".
If $pattern contains uppercase, it specifies "case-sensitive".
If neither "diacritic-sensitive" nor "diacritic-insensitive"
is present, $pattern is used to determine diacritic sensitivity.
If $pattern contains no diacritics, it specifies "diacritic-insensitive".
If $pattern contains diacritics, it specifies "diacritic-sensitive".
Only words that can be matched with element-word-query are included.
That is, only words present in immediate text node children of the
specified element as well as any text node children of child elements
defined in the Admin Interface as element-word-query-throughs or
phrase-throughs.
|
Example:
cts:element-word-match(xs:QName("animal"),"aardvark*")
=> ("aardvark","aardvarks")
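As the usage notes explain, a pattern containing uppercase characters implies "case-sensitive" unless an option says otherwise. The following is a minimal sketch, assuming an element word lexicon on animal, that overrides both inferred defaults explicitly.

```xquery
xquery version "1.0-ml";

(: "Aard*" contains an uppercase letter, so without the explicit
   option the match would be treated as case-sensitive. :)
cts:element-word-match(
  xs:QName("animal"),
  "Aard*",
  ("case-insensitive", "diacritic-insensitive"))
```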
|
|
|
|
cts:element-words(
|
|
$element-names as xs:QName*,
|
|
[$start as xs:string?],
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as xs:string* |
|
Summary:
Returns words from the specified element word lexicon. This function
requires an element word lexicon for each of the elements specified in the
function. If there is not an element word lexicon configured for any
of the specified elements, an exception is thrown. The words are
returned in collation order.
|
Parameters:
$element-names
:
One or more element QNames.
|
$start
(optional):
A starting word. Returns only this word and any following words
from the lexicon. If the parameter is not in the lexicon, then it
returns the words beginning with the next word.
|
$options
(optional):
Options. The default is ().
Options include:
- "ascending"
- Words should be returned in ascending order.
- "descending"
- Words should be returned in descending order.
- "any"
- Words from any fragment should be included.
- "document"
- Words from document fragments should be included.
- "properties"
- Words from properties fragments should be included.
- "locks"
- Words from locks fragments should be included.
- "collation=URI"
- Use the lexicon with the collation specified by URI.
- "limit=N"
- Return no more than N words.
- "sample=N"
- Return only words from the first N fragments
selected by the
cts:query.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only words from the first N fragments
selected by the
cts:query.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
|
$query
(optional):
Only include words in fragments selected by the cts:query.
The words do not need to match the query, but the words must occur
in fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations. If a string
is entered, the string is treated as a cts:word-query of the
specified string.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending".
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", or "score-simple"
options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", or "score-simple"
is specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" is specified,
then the default is "checked".
If "collation=URI" is not specified in the options parameter,
then the default collation is used. If a lexicon with that collation
does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then all included words may be returned. If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then words from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
Only words that can be matched with element-word-query are included.
That is, only words present in immediate text node children of the
specified element as well as any text node children of child elements
defined in the Admin Interface as element-word-query-throughs or
phrase-throughs.
|
Example:
cts:element-words(xs:QName("animal"),"aardvark")
=> ("aardvark","aardvarks","aardwolf",...)
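Because $start is inclusive, it can be combined with "limit=N" to page through a word lexicon. The following is a minimal sketch, assuming an element word lexicon on animal; the helper function local:next-page and the page size are hypothetical and not part of the API.

```xquery
xquery version "1.0-ml";

(: Hypothetical helper: returns up to $size words that follow $last. :)
declare function local:next-page(
  $last as xs:string,
  $size as xs:integer
) as xs:string*
{
  (: Request one extra word, since $last itself is returned first
     when it is present in the lexicon. :)
  let $words :=
    cts:element-words(xs:QName("animal"), $last,
      fn:concat("limit=", $size + 1))
  return
    if ($words[1] eq $last)
    then fn:subsequence($words, 2)
    else fn:subsequence($words, 1, $size)
};

local:next-page("aardvarks", 25)
```

The comparison of $words[1] with $last uses the default collation of the query, which may differ from the collation of the lexicon; treat the duplicate check as an approximation in that case.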
|
|
|
|
cts:field-word-match(
|
|
$field-names as xs:string*,
|
|
$pattern as xs:string,
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as xs:string* |
|
Summary:
Returns words from the specified field word lexicon(s) that match
a wildcard pattern. This function requires a field word lexicon
configured for each of the specified fields in the function. If there
is not a field word lexicon configured for any of the specified
fields, an exception is thrown.
|
Parameters:
$field-names
:
One or more field names.
|
$pattern
:
Wildcard pattern to match.
|
$options
(optional):
Options. The default is ().
Options include:
- "case-sensitive"
- A case-sensitive match.
- "case-insensitive"
- A case-insensitive match.
- "diacritic-sensitive"
- A diacritic-sensitive match.
- "diacritic-insensitive"
- A diacritic-insensitive match.
- "ascending"
- Words should be returned in ascending order.
- "descending"
- Words should be returned in descending order.
- "any"
- Words from any fragment should be included.
- "document"
- Words from document fragments should be included.
- "properties"
- Words from properties fragments should be included.
- "locks"
- Words from locks fragments should be included.
- "collation=URI"
- Use the lexicon with the collation specified by URI.
- "limit=N"
- Return no more than N words.
- "sample=N"
- Return only words from the first N fragments
selected by the
cts:query.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only words from the first N fragments
selected by the
cts:query.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
|
$query
(optional):
Only include words in fragments selected by the cts:query.
The words do not need to match the query, but the words must occur
in fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations. If a string
is entered, the string is treated as a cts:word-query of the
specified string.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending".
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", or "score-simple"
options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", or "score-simple"
are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" is specified,
then the default is "checked".
If "collation=URI" is not specified in the options parameter,
then the default collation is used. If a lexicon with that collation
does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then all included words may be returned. If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then words from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
If neither "case-sensitive" nor "case-insensitive"
is present, $pattern is used to determine case sensitivity.
If $pattern contains no uppercase, it specifies "case-insensitive".
If $pattern contains uppercase, it specifies "case-sensitive".
If neither "diacritic-sensitive" nor "diacritic-insensitive"
is present, $pattern is used to determine diacritic sensitivity.
If $pattern contains no diacritics, it specifies "diacritic-insensitive".
If $pattern contains diacritics, it specifies "diacritic-sensitive".
Only words that can be matched with field-word-query are included.
That is, only words present in immediate text node children of the
specified field as well as any text node children of child fields
defined in the Admin Interface as field-word-query-throughs or
phrase-throughs.
|
Example:
cts:field-word-match("animal","aardvark*")
=> ("aardvark","aardvarks")
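The optional parameters can be combined in a single call. The following
is a minimal sketch, assuming a field named "animal" with a field word
lexicon (as in the example above). The search term "burrow" and the
option values are illustrative only.

```xquery
xquery version "1.0-ml";

(: Sketch only: "animal" must be a field with a field word lexicon.
   The pattern contains no uppercase, so the match would default to
   case-insensitive; the option is spelled out here for clarity. :)
cts:field-word-match(
  "animal",
  "aard*",
  ("case-insensitive", "descending", "limit=10"),
  (: Only words in fragments selected by this query are included.
     Selection is unfiltered, as described for the $query parameter. :)
  cts:word-query("burrow"))
```

Because a $query is supplied and no fragment option is given, the
"document" default described in the usage notes applies.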
|
|
|
|
cts:field-words(
|
|
$field-names as xs:string*,
|
|
[$start as xs:string?],
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as xs:string* |
|
 |
Summary:
Returns words from the specified field word lexicon. This function
requires a field word lexicon for each of the fields specified in the
function. If there is not a field word lexicon configured for any
of the specified fields, an exception is thrown. The words are
returned in collation order.
|
Parameters:
$field-names
:
One or more field names.
|
$start
(optional):
A starting word. Returns only this word and any following words
from the lexicon. If the parameter is not in the lexicon, then it
returns the words beginning with the next word.
|
$options
(optional):
Options. The default is ().
Options include:
- "ascending"
- Words should be returned in ascending order.
- "descending"
- Words should be returned in descending order.
- "any"
- Words from any fragment should be included.
- "document"
- Words from document fragments should be included.
- "properties"
- Words from properties fragments should be included.
- "locks"
- Words from locks fragments should be included.
- "collation=URI"
- Use the lexicon with the collation specified by URI.
- "limit=N"
- Return no more than N words.
- "sample=N"
- Return only words from the first N fragments
selected by the
cts:query.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only words from the first N fragments
selected by the
cts:query.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
|
$query
(optional):
Only include words in fragments selected by the cts:query.
The words do not need to match the query, but the words must occur
in fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations. If a string
is entered, the string is treated as a cts:word-query of the
specified string.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending".
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", or "score-simple"
options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", or "score-simple"
are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" is specified,
then the default is "checked".
If "collation=URI" is not specified in the options parameter,
then the default collation is used. If a lexicon with that collation
does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then all included words may be returned. If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then words from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
Only words that can be matched with field-word-query are included.
That is, only words present in immediate text node children of the
specified field as well as any text node children of child fields
defined in the Admin Interface as field-word-query-throughs or
phrase-throughs.
|
Example:
cts:field-words("animal","aardvark")
=> ("aardvark","aardvarks","aardwolf",...)
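Since $field-names accepts a sequence, several field word lexicons can
be read in one call. The following is a minimal sketch. The second
field name "habitat" is hypothetical, and both fields are assumed to
have field word lexicons configured (otherwise an exception is thrown).

```xquery
xquery version "1.0-ml";

(: Sketch only: "habitat" is a hypothetical second field.
   Returns at most five words, starting at "aardvark" or at the
   next word in the lexicon if "aardvark" is not present. :)
cts:field-words(
  ("animal", "habitat"),
  "aardvark",
  ("ascending", "limit=5"))
```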
|
|
|
|
cts:frequency(
|
|
$value as item()
|
| ) as xs:integer |
|
 |
Summary:
Returns an integer representing the number of times a particular
value occurs in a value lexicon lookup (for example,
cts:element-values). When using the
fragment-frequency lexicon option, cts:frequency
returns the number of fragments in which the lexicon value occurs. When
using the item-frequency lexicon option,
cts:frequency returns the total number of times
the lexicon value occurs in each item.
|
Parameters:
$value
:
A value returned from a value lexicon lookup.
|
Usage Notes:
You must have a Range index configured to use the value lexicon APIs
(cts:element-values, cts:element-value-match,
cts:element-attribute-values, or
cts:element-attribute-value-match).
If the value specified is not from a value lexicon lookup,
cts:frequency returns a frequency of 0.
The frequency returned from cts:frequency is fragment-based
by default (using the default fragment-frequency option in the
lexicon API). If there are multiple occurrences of the value in any given
fragment, the frequency is still one per fragment when using
fragment-frequency. Therefore, if the value
returned is 13, it means that the value occurs in 13 fragments.
If you want the total frequency instead of the fragment-based frequency
(that is, the total number of occurrences of the value in the items specified
in the cts:query option of the lexicon API),
you must specify the item-frequency option to the lexicon
API call whose values are passed to cts:frequency. For example, the second
example below specifies the item-frequency option and a
cts:document-query in the lexicon
API, so the item frequency is how many times each speaker speaks in the
play (because the constraining query is a document query of hamlet.xml, which
contains the whole play).
|
Example:
<results>{
  let $x := cts:element-values(xs:QName("SPEAKER"), "", (),
              cts:document-query("/shakespeare/plays/hamlet.xml"))
  for $speaker in $x
  return
    (
    <result>
      <SPEAKER>{$speaker}</SPEAKER>
      <NUMBER-OF-SPEECHES>{cts:frequency($speaker)}</NUMBER-OF-SPEECHES>
    </result>
    )
}</results>
=> Returns the names of the speakers in Hamlet
with the number of fragments in which each one
speaks (the default fragment-frequency). If the
play is fragmented at the SCENE level, this is
the number of scenes in which each speaker speaks.
|
Example:
<results>{
  let $x := cts:element-values(xs:QName("SPEAKER"),
              "", "item-frequency",
              cts:document-query("/shakespeare/plays/hamlet.xml"))
  for $speaker in $x
  return
    (
    <result>
      <SPEAKER>{$speaker}</SPEAKER>
      <NUMBER-OF-SPEECHES>
        {cts:frequency($speaker)}
      </NUMBER-OF-SPEECHES>
    </result>
    )
}</results>
=> Returns the names of the speakers in Hamlet
with the total number of times they speak,
regardless of how the play is fragmented.
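cts:frequency can also be paired with the "frequency-order" lexicon
option to list the most frequent values first. The following is a
minimal sketch that reuses the SPEAKER range index and the hamlet.xml
URI from the examples above. The limit of 5 is an arbitrary choice.

```xquery
xquery version "1.0-ml";

(: Sketch only: assumes a range index on SPEAKER, as in the examples above.
   No frequency option is given, so the default fragment-frequency
   applies and each count is a number of fragments. :)
for $speaker in cts:element-values(
    xs:QName("SPEAKER"), "",
    ("frequency-order", "limit=5"),
    cts:document-query("/shakespeare/plays/hamlet.xml"))
return fn:concat($speaker, ": ", cts:frequency($speaker))
```

Adding "item-frequency" to the options would report total occurrences
instead of fragment counts.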
|
|
|
|
cts:uri-match(
|
|
$pattern as xs:string,
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as xs:string* |
|
 |
Summary:
Returns values from the URI lexicon
that match the specified wildcard pattern.
This function requires the uri-lexicon database configuration
parameter to be enabled. If the uri-lexicon database configuration
parameter is not enabled, an exception is thrown.
|
Parameters:
$pattern
:
Wildcard pattern to match.
|
$options
(optional):
Options. The default is ().
Options include:
- "case-sensitive"
- A case-sensitive match.
- "case-insensitive"
- A case-insensitive match.
- "diacritic-sensitive"
- A diacritic-sensitive match.
- "diacritic-insensitive"
- A diacritic-insensitive match.
- "ascending"
- URIs should be returned in ascending order.
- "descending"
- URIs should be returned in descending order.
- "any"
- URIs from any fragment should be included.
- "document"
- URIs from document fragments should be included.
- "properties"
- URIs from properties fragments should be included.
- "locks"
- URIs from locks fragments should be included.
- "frequency-order"
- URIs should be returned ordered by frequency.
- "item-order"
- URIs should be returned ordered by item.
- "limit=N"
- Return no more than N URIs.
- "sample=N"
- Return only URIs from the first N fragments
selected by the
cts:query.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only URIs from the first N fragments
selected by the
cts:query.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
|
$query
(optional):
Only include URIs from fragments selected by the cts:query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations. If a string
is entered, the string is treated as a cts:word-query of the
specified string.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "frequency-order" or "item-order" may be specified
in the options parameter. If neither "frequency-order" nor "item-order"
is specified, then the default is "item-order".
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending" if "item-order" is
specified, and "descending" if "frequency-order" is specified.
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", or "score-simple"
options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", or "score-simple"
are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" is specified,
then the default is "checked".
If "sample=N" is not specified in the options parameter,
then all included URIs may be returned. If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then URIs from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
If neither "case-sensitive" nor "case-insensitive"
is present, $pattern is used to determine case sensitivity.
If $pattern contains no uppercase, it specifies "case-insensitive".
If $pattern contains uppercase, it specifies "case-sensitive".
If neither "diacritic-sensitive" nor "diacritic-insensitive"
is present, $pattern is used to determine diacritic sensitivity.
If $pattern contains no diacritics, it specifies "diacritic-insensitive".
If $pattern contains diacritics, it specifies "diacritic-sensitive".
|
Example:
cts:uri-match("http://foo.com/*.html")
=> ("http://foo.com/bar.html", "http://foo.com/baz/bork.html", ...)
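A $query can narrow the match to URIs from selected fragments. The
following is a minimal sketch, assuming the URI lexicon is enabled.
The "/articles/" directory and the pattern are hypothetical.

```xquery
xquery version "1.0-ml";

(: Sketch only: "/articles/" is a hypothetical directory.
   The pattern contains no uppercase, so the match defaults to
   case-insensitive, per the usage notes above. :)
cts:uri-match(
  "/articles/*.xml",
  ("document", "limit=20"),
  cts:directory-query("/articles/", "infinity"))
```

A pattern that contains uppercase characters, such as "*.XML", would be
matched case-sensitively unless "case-insensitive" is passed explicitly.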
|
|
|
|
cts:uris(
|
|
[$start as xs:string?],
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as xs:string* |
|
 |
Summary:
Returns values from the URI lexicon.
This function requires the uri-lexicon database configuration
parameter to be enabled. If the uri-lexicon database configuration
parameter is not enabled, an exception is thrown.
|
Parameters:
$start
(optional):
A starting value. Returns only this value and any following values. If
the empty string is specified, returns all values. If the parameter is
not in the lexicon, then it returns the values beginning with the next
value.
|
$options
(optional):
Options. The default is ().
Options include:
- "ascending"
- URIs should be returned in ascending order.
- "descending"
- URIs should be returned in descending order.
- "any"
- URIs from any fragment should be included.
- "document"
- URIs from document fragments should be included.
- "properties"
- URIs from properties fragments should be included.
- "locks"
- URIs from locks fragments should be included.
- "frequency-order"
- URIs should be returned ordered by frequency.
- "item-order"
- URIs should be returned ordered by item.
- "limit=N"
- Return no more than N URIs.
- "sample=N"
- Return only URIs from the first N fragments
selected by the
cts:query.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only URIs from the first N fragments
selected by the
cts:query.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
|
$query
(optional):
Only include URIs from fragments selected by the cts:query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations. If a string
is entered, the string is treated as a cts:word-query of the
specified string.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "frequency-order" or "item-order" may be specified
in the options parameter. If neither "frequency-order" nor "item-order"
is specified, then the default is "item-order".
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending" if "item-order" is
specified, and "descending" if "frequency-order" is specified.
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", or "score-simple"
options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", or "score-simple"
are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" is specified,
then the default is "checked".
If "sample=N" is not specified in the options parameter,
then all included URIs may be returned. If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then URIs from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
|
Example:
cts:uris("http://foo.com/")
=> ("http://foo.com/", "http://foo.com/bar.html", ...)
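Because $start is inclusive, it can be combined with "limit=N" to walk
the URI lexicon one page at a time. The following is a minimal sketch,
assuming the URI lexicon is enabled. The page size of 100 is an
arbitrary choice.

```xquery
xquery version "1.0-ml";

(: Sketch only: read a first page, then a second page that starts at the
   last URI of the first. $start returns the starting value itself, so the
   second call asks for one extra item and drops the repeated first one. :)
let $page-size := 100
let $first-page := cts:uris("", ("document", fn:concat("limit=", $page-size)))
let $last := $first-page[fn:last()]
let $second-page :=
  fn:subsequence(
    cts:uris($last, ("document", fn:concat("limit=", $page-size + 1))),
    2)
return ($first-page, $second-page)
```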
|
|
|
|
cts:word-match(
|
|
$pattern as xs:string,
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as xs:string* |
|
 |
Summary:
Returns words from the word lexicon that match the wildcard pattern.
This function requires the word lexicon to be enabled. If the word
lexicon is not enabled, an exception is thrown.
|
Parameters:
$pattern
:
A wildcard pattern to match.
|
$options
(optional):
Options. The default is ().
Options include:
- "case-sensitive"
- A case-sensitive match.
- "case-insensitive"
- A case-insensitive match.
- "diacritic-sensitive"
- A diacritic-sensitive match.
- "diacritic-insensitive"
- A diacritic-insensitive match.
- "ascending"
- Words should be returned in ascending order.
- "descending"
- Words should be returned in descending order.
- "any"
- Words from any fragment should be included.
- "document"
- Words from document fragments should be included.
- "properties"
- Words from properties fragments should be included.
- "locks"
- Words from locks fragments should be included.
- "collation=URI"
- Use the lexicon with the collation specified by URI.
- "limit=N"
- Return no more than N words.
- "sample=N"
- Return only words from the first N fragments
selected by the
cts:query.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only words from the first N fragments
selected by the
cts:query.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
|
$query
(optional):
Only include words in fragments selected by the cts:query.
The words do not need to match the query, but the words must occur
in fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations. If a string
is entered, the string is treated as a cts:word-query of the
specified string.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending".
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", or "score-simple"
options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", or "score-simple"
are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" is specified,
then the default is "checked".
If "collation=URI" is not specified in the options parameter,
then the default collation is used. If a lexicon with that collation
does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then all included words may be returned. If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then words from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
If neither "case-sensitive" nor "case-insensitive"
is present, $pattern is used to determine case sensitivity.
If $pattern contains no uppercase, it specifies "case-insensitive".
If $pattern contains uppercase, it specifies "case-sensitive".
If neither "diacritic-sensitive" nor "diacritic-insensitive"
is present, $pattern is used to determine diacritic sensitivity.
If $pattern contains no diacritics, it specifies "diacritic-insensitive".
If $pattern contains diacritics, it specifies "diacritic-sensitive".
|
Example:
cts:word-match("aardvark*")
=> ("aardvark","aardvarks")
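The "collation=URI" option selects which word lexicon is consulted. The
following is a minimal sketch, assuming a word lexicon is configured
with the collation http://marklogic.com/collation/. If no lexicon with
that collation exists, an error is thrown, as the usage notes describe.

```xquery
xquery version "1.0-ml";

(: Sketch only: the collation URI is an assumption about how the
   database's word lexicon is configured. :)
cts:word-match(
  "aard*",
  ("collation=http://marklogic.com/collation/",
   "case-insensitive",
   "limit=10"))
```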
|
|
|
|
cts:words(
|
|
[$start as xs:string?],
|
|
[$options as xs:string*],
|
|
[$query as cts:query?],
|
|
[$quality-weight as xs:double?],
|
|
[$forest-ids as xs:unsignedLong*]
|
| ) as xs:string* |
|
 |
Summary:
Returns words from the word lexicon. This function requires the word
lexicon to be enabled. If the word lexicon is not enabled, an
exception is thrown. The words are returned in collation order.
|
Parameters:
$start
(optional):
A starting word. Returns only this word and any following words
from the lexicon. If the parameter is not in the lexicon, then it
returns the words beginning with the next word.
|
$options
(optional):
Options. The default is ().
Options include:
- "ascending"
- Words should be returned in ascending order.
- "descending"
- Words should be returned in descending order.
- "any"
- Words from any fragment should be included.
- "document"
- Words from document fragments should be included.
- "properties"
- Words from properties fragments should be included.
- "locks"
- Words from locks fragments should be included.
- "collation=URI"
- Use the lexicon with the collation specified by URI.
- "limit=N"
- Return no more than N words.
- "sample=N"
- Return only words from the first N fragments
selected by the
cts:query.
Only applies when a $query parameter is specified.
- "truncate=N"
- Include only words from the first N fragments
selected by the
cts:query.
Only applies when a $query parameter is specified.
- "score-logtfidf"
- Compute scores using the logtfidf method.
- "score-logtf"
- Compute scores using the logtf method.
- "score-simple"
- Compute scores using the simple method.
- "checked"
- Word positions should be checked when resolving the query.
- "unchecked"
- Word positions should not be checked when resolving the query.
- "concurrent"
- Perform the work concurrently in another thread. This is a hint
to the query optimizer to help parallelize the lexicon work, allowing
the calling query to continue performing other work while the lexicon
processing occurs. This is especially useful in cases where multiple
lexicon calls occur in the same query (for example, resolving many
facets in a single query).
|
$query
(optional):
Only include words in fragments selected by the cts:query.
The words do not need to match the query, but the words must occur
in fragments selected by the query.
The fragments are not filtered to ensure they match the query,
but instead selected in the same manner as
"unfiltered" cts:search operations. If a string
is entered, the string is treated as a cts:word-query of the
specified string.
|
$quality-weight
(optional):
A document quality weight to use when computing scores.
The default is 1.0.
|
$forest-ids
(optional):
A sequence of IDs of forests to which the search will be constrained.
An empty sequence means to search all forests in the database.
The default is ().
|
|
Usage Notes:
Only one of "ascending" or "descending" may be specified
in the options parameter. If neither "ascending" nor "descending"
is specified, then the default is "ascending".
Only one of "any", "document", "properties", or "locks"
may be specified in the options parameter.
If none of "any", "document", "properties", or "locks" are specified
and there is a $query parameter, then the default is "document".
If there is no $query parameter then the default is "any".
Only one of the "score-logtfidf", "score-logtf", or "score-simple"
options may be specified in the options parameter.
If none of "score-logtfidf", "score-logtf", or "score-simple"
are specified, then the default is "score-logtfidf".
Only one of the "checked" or "unchecked" options may be specified
in the options parameter.
If neither "checked" nor "unchecked" is specified,
then the default is "checked".
If "collation=URI" is not specified in the options parameter,
then the default collation is used. If a lexicon with that collation
does not exist, an error is thrown.
If "sample=N" is not specified in the options parameter,
then all included words may be returned. If a $query parameter
is not present, then "sample=N" has no effect.
If "truncate=N" is not specified in the options parameter,
then words from all fragments selected by the $query parameter
are included. If a $query parameter is not present, then
"truncate=N" has no effect.
|
Example:
cts:words("aardvark")
=> ("aardvark","aardvarks","aardwolf",...)
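The $query and $forest-ids parameters constrain which fragments
contribute words. The following is a minimal sketch, assuming the word
lexicon is enabled. The collection name "field-notes" and the forest
name "Documents" are hypothetical.

```xquery
xquery version "1.0-ml";

(: Sketch only: "field-notes" and "Documents" are hypothetical names.
   "truncate=1000" includes words from only the first 1000 fragments
   selected by the query; it has an effect only because $query is given. :)
cts:words(
  "aardvark",
  ("document", "truncate=1000", "limit=25"),
  cts:collection-query("field-notes"),
  1.0,                        (: quality weight, the documented default :)
  xdmp:forest("Documents"))   (: restrict the lookup to one forest :)
```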
|
|
|