search:search(
   $qtext as xs:string+,
   [$options as element(search:options)?],
   [$start as xs:unsignedLong?],
   [$page-length as xs:unsignedLong?]
) as element(search:response)
|
Summary:
This function parses and invokes a query according to
specified options, returning up to $page-length result nodes
starting from $start.
|
Parameters:
$qtext
:
The query text to
parse. This may be a sequence,
to accommodate more complex search UI. Multiple query texts
are combined with an AND operator.
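As a minimal sketch of passing several query texts at once, the following assumes a hypothetical UI with two search boxes; the strings are illustrative only. Per the description above, the two texts are combined with an AND operator.

```xquery
xquery version "1.0-ml";
import module namespace search = "http://marklogic.com/appservices/search"
  at "/MarkLogic/appservices/search/search.xqy";

(: Hypothetical input: one string per search box in the UI.
   Both query texts must match, because they are ANDed together. :)
search:search(("Vannevar Bush", "memex"))
```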
|
$options
(optional):
Options to define
the search grammar and control the search.
The following is a summary of the
XML structure of an options node:
-
<additional-query>
- An additional serialized
cts:query node,
which is combined using a
cts:and-query with the query resulting from
the specified $qtext. The query results
are constrained by the specified additional-query, but
any terms matching the additional-query are not
highlighted in the snippet result output.
For example, the
following options node constrains the results to the
directory named /my/directory/:
<options xmlns="http://marklogic.com/appservices/search">
<additional-query>{cts:directory-query("/my/directory/")}
</additional-query>
</options>
If you have multiple additional-query options,
they are combined using a cts:and-query.
-
<constraint>
- The outer wrapper element for a constraint definition.
Constraints are used to define facets which can be
returned as part of the search results.
The default is no defined constraints.
Each constraint element must have
an @name attribute (required),
which is the unique name of this constraint.
The name can then be used in the search
grammar to specify the constraint.
The constraint element can have zero
or more of the following elements:
-
<value>
- Specifies element or attribute values on
which to constrain.
For example:
<constraint name="my-value">
<value>
<element ns="my-namespace" name="my-localname"/>
</value>
</constraint>
<constraint name="my-attribute-value">
<value>
<attribute ns="" name="my-attribute"/>
<element ns="my-namespace" name="my-localname"/>
</value>
</constraint>
-
<word>
- Specifies the element, attribute, or field on
which to constrain by word.
For example:
<constraint name="name">
<word>
<element ns="http://widgets-r-us.com" name="name"/>
</word>
</constraint>
<constraint name="description">
<word>
<field name="my-field"/>
</word>
</constraint>
-
<collection>
- Specifies the collection on
which to constrain.
For example:
<constraint name="my-collection">
<collection prefix="http://server.com/my-collection/"/>
</constraint>
-
<range>
- Specifies the element or attribute on
which to constrain by range. There must
be a range index of the specified
type (and collation for string range
indexes) defined for the specified
element or attribute. Each
range element has a
type attribute, an
optional collation attribute
(for string range indexes),
an element child, and an optional
attribute child (for
attribute ranges). It may also have one
or more computed-bucket
children and/or one or more
facet-option children (to
pass options to the underlying lexicon
APIs).
For example:
<options xmlns="http://marklogic.com/appservices/search">
<constraint name="made">
<range type="xs:dateTime"><!-- requires a dateTime range index -->
<element ns="http://example.com" name="manufactured"/>
<attribute ns="" name="date"/>
<computed-bucket name="today" ge="P0D" lt="P1D"
anchor="now">Today</computed-bucket>
<computed-bucket name="30-days" ge="-P30D" lt="P1D"
anchor="now">Last 30 days</computed-bucket>
<computed-bucket name="60-days" ge="-P60D" lt="P1D"
anchor="now">Last 60 Days</computed-bucket>
<computed-bucket name="year" ge="-P1Y" lt="P1D"
anchor="now">Last Year</computed-bucket>
</range>
</constraint>
<constraint name="color">
<range type="xs:string">
<element ns="" name="bodycolor"/>
</range>
</constraint>
<constraint name="color-by-frequency">
<range type="xs:string" facet="true">
<element ns="" name="bodycolor"/>
<!-- the facet-option values are passed directly to the
underlying lexicon calls -->
<facet-option>frequency-order</facet-option>
<facet-option>descending</facet-option>
</range>
</constraint>
</options>
For range constraints with either bucket
or computed-bucket specifications, for maximum performance
and sortability, the buckets should be in a continuous order;
if the order is not
continuous (either ascending or descending), then the buckets are
returned in the order specified, regardless of any sorting
facet-option in the specification.
-
<custom>
- Specifies a custom constraint along with
the name of the function implementations
used to evaluate the custom constraint.
For example:
<constraint name="my-custom">
<custom facet="true">
<parse apply="my-parse-function"
ns="my-function-namespace" at="path-to-module.xqy"/>
<start-facet apply="my-start-function"
ns="my-function-namespace" at="path-to-module.xqy"/>
<finish-facet apply="my-finish-function"
ns="my-function-namespace" at="path-to-module.xqy"/>
</custom>
</constraint>
<!-- The start-facet and finish-facet elements can be omitted if
facet="false". When facet="true", the start-facet can be omitted
if you do not run in concurrent mode ("concurrent" option on the
lexicon functions).
-->
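As a sketch of how a constraint name is used in query text, the following assumes a string range index already exists on a bodycolor element in no namespace (as in the range example above) and that the default grammar is in effect, so that the text color:red applies the constraint named color. The search term "widget" is illustrative only.

```xquery
xquery version "1.0-ml";
import module namespace search = "http://marklogic.com/appservices/search"
  at "/MarkLogic/appservices/search/search.xqy";

(: Assumption: a string range index is configured on <bodycolor>. :)
let $options :=
  <options xmlns="http://marklogic.com/appservices/search">
    <constraint name="color">
      <range type="xs:string">
        <element ns="" name="bodycolor"/>
      </range>
    </constraint>
  </options>
return
  (: "widget" is an ordinary term; "color:red" invokes the constraint. :)
  search:search("widget color:red", $options)
```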
-
<term-option>
-
Specifies the options passed into the search
(for example, case-insensitive). There can be zero or
more term-option elements. By default,
the search uses the same default options as the
underlying cts:query constructors, and the
defaults change based on your index configuration.
You can use term-option elements as a child
of either the term
element or as a child of the
constraint
element.
Legal term option values are:
case-sensitive
case-insensitive
diacritic-sensitive
diacritic-insensitive
punctuation-sensitive
punctuation-insensitive
whitespace-sensitive
whitespace-insensitive
stemmed
unstemmed
wildcarded
unwildcarded
exact
lang=iso639code
For example:
<term-option>diacritic-insensitive</term-option>
-
<debug>
- Activates debugging mode. Additional report elements will
be present in the output. Set to
true to
activate. The default is false.
-
<default-suggestion-source>
Defines the content to be used as the default source of
suggestions (see
search:suggest). The
source may be expressed as a reference to an existing
named constraint, or as a collection, value,
word or word-lexicon element. Note that the use
of word-lexicon (the database-wide
word lexicon) is not recommended as best practice;
collection and range lexicons will yield the best
performance.
Each default-suggestion-source element
can optionally have an @collation
attribute, which specifies the
collation of the value lexicon used during
query evaluation. If no collation is specified,
then the query uses default collation for the
context in which the query is evaluated.
The default-suggestion-source element
can have zero or more of the following
child elements:
-
<collection>
- Specifies using the collection lexicon
for suggestions. For example:
<default-suggestion-source>
<collection/>
</default-suggestion-source>
-
<range>
- Specifies the element or attribute lexicon
to use for suggestions. For example:
<default-suggestion-source>
<range type="xs:string">
<element ns="my-namespace" name="my-localname"/>
<attribute ns="" name="my-attribute"/>
</range>
</default-suggestion-source>
-
<word>
- Specifies using the word lexicon for
suggestions. This option might
not scale well for a large database.
For example:
<default-suggestion-source>
<word/>
</default-suggestion-source>
<default-suggestion-source>
<word>
<field name="my-field"/>
</word>
</default-suggestion-source>
-
<forest>
- A single forest ID to pass into
cts:search.
To specify multiple forests, use multiple
forest elements in the options node. The value must be an xs:unsignedLong
type.
-
<grammar>
- Wrapper element for grammar definition. The default
grammar defines "Google-style" parsing.
The grammar element has a
quotation element that specifies
the quotation character with which to surround
phrases. The text between the quotation
characters is treated as a phrase.
You cannot specify a search that includes
the quotation character; for example, to
specify a search that includes the
double quotation character (the default
quotation character), modify your grammar to
use a different quotation character.
The grammar element can have
0 or more joiner elements and
0 or more starter elements.
The grammar element
should have one or more of each of the following
elements. If the grammar element is present but
empty, then the grammar does nothing, and the
search is parsed according to the
term option.
-
<joiner>
- Specifies what text to use to combine
terms together, and what is the
underlying
cts:query
constructor to use to join the terms
together. You specify the function to
call for the joiner with the
apply attribute, along with
optional ns (for the module
namespace) and
at (for the module path)
attributes. Additionally, the
strength attribute determines
the order of precedence over other
joiner elements, the optional
options attribute specifies
a space-separated list of options that
are passed through to the underlying
cts:query constructor,
and the element
attribute specifies the
cts:query
element name (for example,
cts:and-query).
-
<starter>
- Specifies what text to use to delimit
and group terms.
You specify the function to
call for the starter with the
apply attribute, along with
optional ns (for the module
namespace) and
at (for the module path)
attributes. Additionally, the
strength attribute determines
the order of precedence over other
starter elements, the
optional
options attribute specifies
a space-separated list of options that
are passed through to the underlying
cts:query constructor,
the element
attribute specifies the
cts:query
element name (for example,
cts:and-query), and the
delimiter attribute
specifies the string to use as a
delimiter for the starter.
The following is an example of a grammar
element.
<grammar>
<!-- return all results on empty qstring -->
<starter strength="30" apply="grouping" delim=")">(</starter>
<starter strength="40" apply="prefix"
element="cts:not-query">-</starter>
<starter strength="60" apply="quotation" delim='"'>"</starter>
<joiner strength="10" apply="infix"
element="cts:or-query">OR</joiner>
<joiner strength="20" apply="infix"
element="cts:and-query">AND</joiner>
<joiner strength="50" apply="constraint">:</joiner>
</grammar>
-
<operator>
- A named wrapper for one or more
state
elements, each representing a unique run-time
configuration option. For example, if an operator
with the name "sort" is defined, query text
[sort:foo] will select the state child
with the name "foo" at query runtime, using the option
specified on that state element.
Options affecting query
parsing (such as constraint, grammar,
term, empty) may not be configured
via operators.
An operator element can have one or more
state elements. Each state
element can have one of the
following elements:
additional-query
debug
forest
page-length
quality-weight
searchable-expression
sort-order
transform-results
In the following example, a search for
special:hello
constrains the search by the "hello world" query, and a
search for special:forest constrains the
search to the forest named "my-forest".
<operator name="special">
<state name="hello">
<additional-query>{cts:word-query("hello world")}
</additional-query>
</state>
<state name="forest">
<forest>{xdmp:forest("my-forest")}</forest>
</state>
</operator>
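The sort example mentioned above can be sketched as follows. The state names ("relevance" and "date") are hypothetical, and the date state assumes an xs:dateTime range index on the date attribute of the manufactured element, as in the range constraint example.

```xquery
xquery version "1.0-ml";
import module namespace search = "http://marklogic.com/appservices/search"
  at "/MarkLogic/appservices/search/search.xqy";

(: Assumption: a dateTime range index exists on manufactured/@date. :)
let $options :=
  <options xmlns="http://marklogic.com/appservices/search">
    <operator name="sort">
      <state name="relevance">
        <sort-order direction="descending">
          <score/>
        </sort-order>
      </state>
      <state name="date">
        <sort-order type="xs:dateTime" direction="descending">
          <element ns="http://example.com" name="manufactured"/>
          <attribute ns="" name="date"/>
        </sort-order>
      </state>
    </operator>
  </options>
return
  (: "sort:date" selects the state named "date" at query runtime. :)
  search:search("widget sort:date", $options)
```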
-
<page-length>
- Specifies the number of results per page. The default value
is 10.
The value must be an xs:unsignedInt
type.
-
<quality-weight>
- Specifies a weighting factor to use in the query.
The default value is 1.0.
The value must be an xs:double
type.
-
<return-constraints>
- Include original constraint definitions in the results. The
default is false.
The value must be an xs:boolean
type.
-
<return-facets>
- Include resolved facets in the results. The default
is
true.
The value must be an xs:boolean
type.
-
<return-metrics>
- Include statistics in the results. The default is
true.
The value must be an xs:boolean
type.
-
<return-qtext>
- Include the original query text in the results.
The default is
true.
The value must be an xs:boolean
type.
-
<return-query>
- Include the XML query representation in the results.
The default is
false.
The value must be an xs:boolean
type.
-
<return-results>
- Include search results in the output. (Use transform-results
to specify how each result should be formatted.) The
default is
true.
The value must be an xs:boolean
type.
-
<return-similar>
- Include with each search result a list of URLs of similar
documents in the database. The default is
false.
The value must be an xs:boolean
type.
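The return-* options can be combined in a single options node. As a sketch, the following turns off results and facets and asks only for the parsed query representation, which can be handy when checking how a query text is interpreted; the query text is illustrative.

```xquery
xquery version "1.0-ml";
import module namespace search = "http://marklogic.com/appservices/search"
  at "/MarkLogic/appservices/search/search.xqy";

(: Suppress results and facets; include the XML query representation. :)
search:search("Vannevar Bush",
  <options xmlns="http://marklogic.com/appservices/search">
    <return-results>false</return-results>
    <return-facets>false</return-facets>
    <return-query>true</return-query>
  </options>)
```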
-
<search-option>
- For advanced users, a single option to be passed in
with
cts:search calls (for example,
filtered, unfiltered,
score-logtfidf, and so on). To pass in
multiple options, specify multiple
search-option elements in the options
node. The default is no
additional options. For example:
<search-option>unfiltered</search-option>
<search-option>score-logtf</search-option>
-
<searchable-expression>
- An expression to be searched. Whatever expression is
specified is returned from the search. For example,
if you specify
//p, then p
elements that match the search criteria are returned.
The expression must be an inline fully
searchable XPath expression, and all necessary
namespaces must be declared using xmlns
attributes. For example:
<searchable-expression xmlns:ex="http:example.com"
xmlns:com="http://company.com">/ex:orders/com:company
</searchable-expression>
The default value is fn:collection(), which
searches all documents in the database.
-
<sort-order>
Set the default sort order. The first such element is
the primary sort order, the second secondary sort order,
and so on. The default is to sort by score, descending.
Note that the default scoring algorithm can be set
just like any other option with the option
named search-option. If you are sorting
by an element or an attribute, you must specify a
type attribute with a value corresponding
to the range index type of that element or
attribute (for example, xs:string,
xs:dateTime, and so on). If the
corresponding range index is of type
xs:string, then you can optionally
specify a collation attribute (otherwise
the collation of the query is used). To change the
sorting direction, specify an optional
direction attribute with a value of
descending (the default) or
ascending.
The sort-order element must have either a
single element child or a single
score child. If there is a
score child, it specifies to
sort by the score of the search result. If there is an
element child it can optionally have an
attribute sibling (to specify an attribute
of the preceding element). Both the
element and attribute
elements must have ns and name
attributes to specify the namespace and local-name of
the specified element and attribute. Additionally,
the sort-order element can have 0 or more
annotation elements (to add comments,
for example).
For example, the following specifies a primary
sort order using the element value for
my-element (which needs a string
range index with the specified collation), and
a secondary sort order of score ascending:
<sort-order type="xs:string"
collation="http://marklogic.com/collation/"
direction="ascending">
<element ns="my-namespace" name="my-element"/>
<annotation>some user comment can go here</annotation>
</sort-order>
<sort-order direction="ascending">
<score/>
</sort-order>
-
<suggestion-source>
Specifies a constraint source to override
a named constraint when using
search:suggest.
The suggestions are often used
for type-ahead suggestions in a search user interface.
If empty, no suggestions are
generated when that constraint is applied. Specifying an
alternate suggestion-source is useful in
cases where you have a named constraint to use for
searching and facets, but you might want to use a
slightly (or completely) different source for
type-ahead suggestions without needing to re-parse your
search terms.
Each suggestion source must have a name
attribute corresponding to a named constraint (one
suggestion-source per named constraint).
A suggestion-source can have one
of the following child elements:
collection,
range,
word, or
word-lexicon.
For example, the following overrides the
tag: prefix, using the range index
for the attribute shortname instead of
the one for name when using
search:suggest:
<constraint name="tag">
<range collation="http://marklogic.com/collation/"
type="xs:string" facet="true">
<element ns="my-namespace"
name="my-element"/>
<attribute ns="" name="name"/>
</range>
</constraint>
<suggestion-source name="tag">
<range collation="http://marklogic.com/collation/"
type="xs:string" facet="true">
<element ns="my-namespace"
name="my-element"/>
<attribute ns="" name="shortname"/>
</range>
</suggestion-source>
-
<term>
Specifies handling of empty searches and controls options
for how individual terms (that is, terms
not associated with a constraint) will
be represented when parsing the search.
To control how empty searches (that is, the empty
string passed into search:search) are
resolved, specify an empty child element
with an apply attribute. The value of the
apply attribute specifies the behavior
for empty searches: a value of all-results
specifies that empty searches return everything in the
database, a value of no-results (the
default) specifies that an empty search returns nothing.
Alternatively, you can create your own function to
handle empty searches. To specify your own
function, create a function that returns a
cts:query and specify the local-name of
your function in the apply attribute, the
namespace of the function library module in the
ns attribute, and the path to the
module in the at attribute.
Additionally, you can specify zero or more
term-option
elements to control the behavior of the search terms.
For example:
<term>
<empty apply="no-results" />
<term-option>diacritic-insensitive</term-option>
<term-option>unwildcarded</term-option>
</term>
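As a sketch of the other built-in behavior for empty searches, the following passes an empty string with the empty element set to all-results, so that, per the description above, the empty search matches everything in the database rather than nothing.

```xquery
xquery version "1.0-ml";
import module namespace search = "http://marklogic.com/appservices/search"
  at "/MarkLogic/appservices/search/search.xqy";

(: An empty query text; all-results overrides the no-results default. :)
search:search("",
  <options xmlns="http://marklogic.com/appservices/search">
    <term>
      <empty apply="all-results"/>
    </term>
  </options>)
```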
-
<transform-results>
Specifies a function to use to process a search result for
the snippet output.
The default is that each result is formatted using the
built-in default snippeting function.
Specify the local-name of the function to pass in as the
value of the apply attribute, the
namespace as the value of the ns attribute,
and the path to the module as the value of the
at attribute. You can pass in parameters
to the function by specifying zero or more
param child elements (the parameters are
passed in the order specified).
For example:
<transform-results apply="snippet" ns="my-namespace"
at="/my-library.xqy"/>
|
$start
(optional):
The
index of the first hit to return. If 0, treated as 1. If
greater than the number of results, no results will be
returned. The default is 1.
|
$page-length
(optional):
The maximum number of hits to return.
The default is 10. If the value is 0, no results are returned.
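A common use of $start and $page-length is paging. The following sketch computes the first hit of a given page; the variable names $page and $size are hypothetical, and the explicit xs:unsignedLong casts are a cautious choice to match the declared parameter types.

```xquery
xquery version "1.0-ml";
import module namespace search = "http://marklogic.com/appservices/search"
  at "/MarkLogic/appservices/search/search.xqy";

let $page := 3     (: hypothetical 1-based page number from the UI :)
let $size := 20    (: hits per page :)
(: Page 1 starts at hit 1, page 2 at hit 21, page 3 at hit 41. :)
let $start := ($page - 1) * $size + 1
return
  (: The empty sequence for $options selects the default options. :)
  search:search("Vannevar Bush", (),
    xs:unsignedLong($start), xs:unsignedLong($size))
```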
|
|
Usage Notes:
The search:search function returns a
<response> element, which has
a total attribute. The value of the
total attribute is an estimate, based
on the index resolution of the query, and it is not
filtered for accuracy. The accuracy of the index resolution
depends on the index configuration of the database, on the
query, and on the data being searched.
|
Example:
xquery version "1.0-ml";
import module namespace search = "http://marklogic.com/appservices/search"
at "/MarkLogic/appservices/search/search.xqy";
search:search("Vannevar Bush",
<options xmlns="http://marklogic.com/appservices/search">
<return-results>false</return-results>
<return-facets>true</return-facets>
</options>)
=>
<search:response total="1234" start="1" page-length="10" xmlns=""
xmlns:search="http://marklogic.com/appservices/search">
<search:facet name="date">
<search:facet-value value="today" count="1000">
Today</search:facet-value>
<search:facet-value value="yesterday" count="234">
Yesterday</search:facet-value>
<search:facet-value value="thismonth" count="1234">
This Month</search:facet-value>
</search:facet>
...
</search:response>
|
search:suggest(
   $qtext as xs:string+,
   [$options as element(search:options)?],
   [$limit as xs:unsignedInt?],
   [$cursor-position as xs:unsignedInt?],
   [$focus as xs:positiveInteger?]
) as xs:string*
|
Summary:
This function returns a sequence of suggested text
strings that match a wildcarded search for the
$qtext input, ready for use in a user
interface. Typically this is used for type-ahead
applications to provide the user
suggestions while entering terms in a search box.
|
Parameters:
$qtext
:
One or more strings
of query text. The first string in the list (or the
string corresponding to the position in the $focus
parameter value) is used to find matching suggestions
by performing a lexicon match query.
The other strings (if any) are parsed as a
cts:query, with the resulting queries
combined with a cts:and-query, and the
resulting cts:query is passed as a
constraining query to the lexicon match query, restricting
the suggestions to fragments that match the
cts:query. Typically, each item in the
sequence corresponds to a single text entry box in a
user interface.
|
$options
(optional):
Options to define the search
grammar and control the search. See description for
$options for
the function search:search. In particular,
the default-suggestion-source and
suggestion-source options are specific to
search:suggest.
|
$limit
(optional):
The maximum number of
suggestions to return. The default is 10.
|
$cursor-position
(optional):
The position of the cursor, from point of
origin, in the text box corresponding to the
$focus parameter. This is used to determine
on which part of the query text to perform a lexicon
match. The default is the string length of the
$focus string (all of the string).
|
$focus
(optional):
If there are multiple
$qtext strings, the index of the string
corresponding to the text box that has current
"focus" in the user interface (and therefore containing
a partial query text for completion). The
default is 1 (the first $qtext string).
|
|
Usage Notes:
On large databases, the performance of using a
word lexicon for suggestions will probably be slower than
using a value lexicon. This can be very application
specific, and in some cases the performance might be good,
but in general, value lexicons (range constraints) will
perform much better than word lexicons (word constraints)
with search:suggest. Therefore, Mark Logic
recommends using value lexicons for suggestions, not word
lexicons.
The performance of search:suggest is highly
data-dependent. The best performing suggestion sources
are value lexicons (range indexes) that use the
codepoint collation. Performance is also impacted based on
the number of matches, and it can help to design the
interaction between search:suggest and the UI
so that suggestions are given after a minimum of 3
characters are entered (that is, the lexicon match calls
will have at least 3 characters). Again, this is quite
data-dependent, so you should try it on a large data set
with your own data.
The output of search:suggest is a sequence of
query text strings, not a sequence of words. Each
query text string can include quoted text, such as
phrases. The output of search:suggest
is appropriate to pass into the first argument of
search:search, including any quoted phrases.
For example, if you have a suggestion that returns
multi-word phrases
(for example, from range element index values), then
the suggestion will quote the phrase.
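The notes above can be sketched together: only ask for suggestions once at least 3 characters have been typed, then pass a chosen suggestion straight into search:search. The $typed value and the choice of the first suggestion are hypothetical stand-ins for user input, and the options assume a string range index on the name attribute of the function element.

```xquery
xquery version "1.0-ml";
import module namespace search = "http://marklogic.com/appservices/search"
  at "/MarkLogic/appservices/search/search.xqy";

(: Assumption: a string range index exists on function/@name. :)
let $options :=
  <options xmlns="http://marklogic.com/appservices/search">
    <default-suggestion-source>
      <range collation="http://marklogic.com/collation/"
             type="xs:string" facet="true">
        <element ns="http://marklogic.com/xdmp/apidoc" name="function"/>
        <attribute ns="" name="name"/>
      </range>
    </default-suggestion-source>
  </options>
let $typed := "docu"   (: hypothetical partial text from the search box :)
let $suggestions :=
  (: Skip the lexicon match until at least 3 characters are entered. :)
  if (fn:string-length($typed) ge 3)
  then search:suggest($typed, $options)
  else ()
return
  (: A suggestion is query text, so it can be searched as-is,
     including any quoted phrases it contains. :)
  if (fn:exists($suggestions))
  then search:search($suggestions[1], $options)
  else ()
```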
|
Example:
xquery version "1.0-ml";
import module namespace search = "http://marklogic.com/appservices/search"
at "/MarkLogic/appservices/search/search.xqy";
let $options :=
<search:options xmlns="http://marklogic.com/appservices/search">
<default-suggestion-source>
<range collation="http://marklogic.com/collation/"
type="xs:string" facet="true">
<element ns="http://marklogic.com/xdmp/apidoc"
name="function"/>
<attribute ns="" name="name"/>
</range>
</default-suggestion-source>
</search:options>
return
search:suggest("docu", $options)
=> a sequence of strings representing query text:
document-add-collections
document-add-permissions
document-add-properties
document-checkin
document-checkout
|
Example:
xquery version "1.0-ml";
import module namespace search = "http://marklogic.com/appservices/search"
at "/MarkLogic/appservices/search/search.xqy";
let $options :=
<search:options xmlns="http://marklogic.com/appservices/search">
<default-suggestion-source>
<range collation="http://marklogic.com/collation/"
type="xs:string" facet="true">
<element ns="" name="hello"/>
</range>
</default-suggestion-source>
</search:options>
return
search:suggest("a", $options)
=> a sequence of strings representing query text:
"and that"
"and this"
|
Example:
xquery version "1.0-ml";
import module namespace search = "http://marklogic.com/appservices/search"
at "/MarkLogic/appservices/search/search.xqy";
search:suggest(("ta","foo"),(),5)
=> a sequence of strings representing query text:
tab
table
tadpole
tag
|
Example:
xquery version "1.0-ml";
import module namespace search = "http://marklogic.com/appservices/search"
at "/MarkLogic/appservices/search/search.xqy";
search:suggest(("table","foo"),(),(),5,2)
=> a sequence of strings representing query text:
food
fool
foolhardy
foolish
foolishness
|
Example:
xquery version "1.0-ml";
import module namespace search = "http://marklogic.com/appservices/search"
at "/MarkLogic/appservices/search/search.xqy";
(:
given a document created with the following:
xdmp:document-insert("/test.xml",
<root>
<my:my-element xmlns:my="my-namespace" shortname="fool"/>
<my:my-element xmlns:my="my-namespace" shortname="food"/>
<my:my-element xmlns:my="my-namespace" shortname="foolhardy"/>
<my:my-element xmlns:my="my-namespace" shortname="foolish"/>
<my:my-element xmlns:my="my-namespace" shortname="foolishness"/>
<my:my-element xmlns:my="my-namespace" name="foody"/>
</root>)
:)
let $options :=
<options xmlns="http://marklogic.com/appservices/search">
<constraint name="tag">
<range collation="http://marklogic.com/collation/"
type="xs:string" facet="true">
<element ns="my-namespace"
name="my-element"/>
<attribute ns="" name="name"/>
</range>
</constraint>
<suggestion-source ref="tag">
<range collation="http://marklogic.com/collation/"
type="xs:string" facet="true">
<element ns="my-namespace"
name="my-element"/>
<attribute ns="" name="shortname"/>
</range>
</suggestion-source>
</options>
return
search:suggest("tag:foo", $options)
=>
suggestions to complete tag: from the range index on the
"shortname" attribute (notice "foody" is not in the answer):
tag:food
tag:fool
tag:foolhardy
tag:foolish
tag:foolishness
|
|
|