http://en.wikipedia.org/wiki/Full_text_search
In text retrieval, full text search (also called free search text) refers to a technique for searching a computer-stored document or database. In a full text search, the search engine examines all of the words in every stored document as it tries to match search words supplied by the user. Full-text searching techniques became common in online bibliographic databases in the 1970s. Most Web sites and application programs (such as word processing software) provide full text search capabilities. Some Web search engines, such as AltaVista, employ full text search techniques, while others index only a portion of the Web pages examined by their indexing systems.
For more details on searching, see Search Engines.
Select the default Index Set to be used when displaying a full text query results list.
This feature is valuable when you know which Index Set your document is associated with.
Users who have only a single Index Set should set this default so that all results lists are displayed in the correct Index Set.
To allow a full text search of a document's content, check the box next to the item “Include document content in search”.
The search conditions used for full text searches are character strings composed of words, phrases, logical operators, and parentheses for grouping priority.
Aside from white space characters, which are considered to be delimiters in search strings, the special characters are the double quote (") used for delimiting phrases, the asterisk (*) used for prefix, postfix, and substring matching, and parentheses ( ) used for grouping. The reserved words are the logical operators OR, AND, NOT, and NEAR. The priority of the logical operators from lowest to highest is OR, AND, NOT, NEAR.
You can use parentheses to change the priority grouping.
For example, the query ‘Smith and Invoice or Receipt’ finds documents that contain both of the words “Smith” and “Invoice”, as well as documents that contain the word “Receipt”. Changing the priority with parentheses to ‘Smith and (Invoice or Receipt)’ causes the query to find only documents that contain the word “Smith” and at least one of the words “Invoice” or “Receipt”.
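The effect of operator priority and parentheses can be illustrated with a small evaluator. This is only a sketch under stated assumptions: the function name matches is hypothetical, it is not part of DocuXplorer, and it handles only AND, OR, and parentheses over single words (no NOT, NEAR, wildcards, or phrases).

```python
import re

def matches(query, doc_words):
    """Evaluate a query using AND, OR and parentheses against a set of words.

    Hypothetical helper for illustration only; OR has the lowest priority
    and AND binds more tightly, as described in the text.
    """
    tokens = re.findall(r"\(|\)|[^\s()]+", query)
    words = {w.lower() for w in doc_words}
    pos = 0

    def peek():
        return tokens[pos].lower() if pos < len(tokens) else None

    def take():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def parse_or():                      # lowest priority
        result = parse_and()
        while peek() == "or":
            take()
            right = parse_and()          # always consume the right operand
            result = result or right
        return result

    def parse_and():                     # higher priority than OR
        result = parse_term()
        while peek() == "and":
            take()
            right = parse_term()
            result = result and right
        return result

    def parse_term():
        token = take()
        if token == "(":                 # parentheses restart at the lowest priority
            result = parse_or()
            take()                       # consume the closing ")"
            return result
        return token.lower() in words

    return parse_or()
```

With a document that contains only the word “Receipt”, matches("Smith and Invoice or Receipt", {"receipt"}) is true, while matches("Smith and (Invoice or Receipt)", {"receipt"}) is false because “Smith” is missing.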
The NEAR operator (proximity operator) is similar to the AND operator in that both of its query values must be found. In addition, it requires that its query values be within a certain physical distance of each other for it to return a true result. The default distance for the NEAR operator is 8 words. To use a proximity value other than the default, specify the distance as a parameter to the NEAR operator.
For example, ‘medical near(15) doctor’ will evaluate to true for records where the word “medical” is within 15 words of “doctor”.
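The proximity test can be sketched as follows. The function name near is hypothetical, and the sketch assumes that “within N words” means the two word positions differ by at most N; the actual Search Engine may count distance differently.

```python
def near(text, first, second, distance=8):
    """Return True if both words occur and lie within `distance` words of each other.

    Illustrative only: the default of 8 comes from the text, while the exact
    way distance is counted is an assumption.
    """
    words = text.lower().split()
    first_positions = [i for i, w in enumerate(words) if w == first.lower()]
    second_positions = [i for i, w in enumerate(words) if w == second.lower()]
    return any(
        abs(i - j) <= distance
        for i in first_positions
        for j in second_positions
    )
```

If “medical” and “doctor” are ten words apart, near(text, "medical", "doctor") is false with the default distance, and near(text, "medical", "doctor", 15) is true.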
Search words that do not have logical operators specified are assumed to have an implied AND operator between them.
For example, the following two search conditions are equivalent:
‘mechanical engineer’
‘mechanical AND engineer’
And the following two are equivalent:
‘computer programmer or software developer’
‘(computer AND programmer) OR (software AND developer)’
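The implied AND can be made explicit with a short rewrite step. This is a sketch with a hypothetical function name, add_implied_and, and it assumes a search condition made of bare words and operators only (no parentheses, quoted phrases, or NEAR parameters).

```python
OPERATORS = {"and", "or", "not", "near"}

def add_implied_and(condition):
    """Insert an explicit AND between adjacent search words.

    Hypothetical helper: an AND is added only when neither the previous
    token nor the current token is a logical operator.
    """
    result = []
    for token in condition.split():
        previous_is_word = bool(result) and result[-1].lower() not in OPERATORS
        if previous_is_word and token.lower() not in OPERATORS:
            result.append("AND")
        result.append(token)
    return " ".join(result)
```

For ‘computer programmer or software developer’ this produces ‘computer AND programmer or software AND developer’; because AND has a higher priority than OR, that is the same as the parenthesized form given above.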
Individual search words that are not enclosed in double quotes can be searched for as exact matches, prefix matches, postfix matches, or substring matches. This behavior is controlled through the use of the asterisk (*) character.
For an exact match, simply specify the word in the search condition. The word will be matched if it is found exactly as given in the text.
For example, the search word “special” will match only “special” in the text (or differently cased versions such as “Special” or “SPECIAL” if the index is not case sensitive).
For a prefix match, place an asterisk at the end of the word to match all words that begin with the given characters.
For example, the search word “special*” will match “special”, “specialty”, “specialization”, etc.
For a postfix match, place an asterisk at the beginning of the word to match all words that end with the given characters.
For a substring match, place an asterisk at both the beginning and end of the word to match all words that contain the given characters.
For example, the search word “*lock*” will match the words “locker”, “antilock”, “blocking”, etc.
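The four kinds of match can be summarized in one small function. This is a sketch, not the Search Engine's implementation: the name word_matches is hypothetical, and the comparison is assumed to be case insensitive.

```python
def word_matches(pattern, word):
    """Match a single word against a search word that may carry asterisks.

    Assumption: case-insensitive comparison (a case sensitive index would
    skip the lower() calls).
    """
    pattern, word = pattern.lower(), word.lower()
    leading = pattern.startswith("*")
    trailing = pattern.endswith("*")
    core = pattern.strip("*")
    if leading and trailing:
        return core in word            # substring match: *lock*
    if trailing:
        return word.startswith(core)   # prefix match: special*
    if leading:
        return word.endswith(core)     # postfix match: *lock
    return word == core                # exact match: special
```

For instance, word_matches("special*", "specialty") and word_matches("*lock*", "blocking") are both true, while word_matches("special", "specialty") is false.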
Exact matches and prefix matches provide the best performance. Both postfix and substring matches require that the entire Full Text Search (FTS) index be scanned in order to satisfy the search.
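One way to see why prefix matches are cheaper is to assume the index keeps its terms in sorted order. That layout is purely an assumption for illustration (the source does not describe how the FTS index is stored), and both function names are hypothetical.

```python
import bisect

def prefix_lookup(sorted_terms, prefix):
    """Find terms starting with `prefix` by jumping into a sorted term list."""
    start = bisect.bisect_left(sorted_terms, prefix)
    found = []
    for term in sorted_terms[start:]:
        if not term.startswith(prefix):
            break                      # sorted order: no later term can match
        found.append(term)
    return found

def substring_lookup(sorted_terms, fragment):
    """Find terms containing `fragment`; every term has to be examined."""
    return [term for term in sorted_terms if fragment in term]
```

Under this assumption a prefix search touches only the matching run of terms, whereas a postfix or substring search cannot use the ordering and has to visit the whole list.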
Double quotes are used as phrase delimiters. In addition, special characters can be enclosed inside double quotes to ensure that they are unchanged by the Search Engine.
For example, to search for a parenthesis, it is necessary to enclose it in double quotes in order to keep it from being treated as a precedence operator in the search condition itself. To search for a physical double quote, use two of them in a row.
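The quoting rules can be sketched as a small tokenizer. The function name tokenize and the token labels are hypothetical, and the sketch assumes that a doubled double quote is written inside a quoted phrase to stand for one literal double quote.

```python
def tokenize(condition):
    """Split a search condition into (kind, text) tokens.

    Illustrative assumptions: quoted text becomes one PHRASE token, a doubled
    double quote inside a phrase yields one literal quote, and parentheses
    outside quotes are separate PAREN tokens.
    """
    tokens = []
    i = 0
    while i < len(condition):
        ch = condition[i]
        if ch.isspace():
            i += 1
        elif ch == '"':
            i += 1
            chars = []
            while i < len(condition):
                if condition[i] == '"':
                    if condition[i + 1:i + 2] == '"':
                        chars.append('"')      # doubled quote -> literal quote
                        i += 2
                        continue
                    break                      # closing quote
                chars.append(condition[i])
                i += 1
            i += 1                             # skip the closing quote
            tokens.append(("PHRASE", "".join(chars)))
        elif ch in "()":
            tokens.append(("PAREN", ch))
            i += 1
        else:
            start = i
            while (i < len(condition) and not condition[i].isspace()
                   and condition[i] not in '()"'):
                i += 1
            tokens.append(("WORD", condition[start:i]))
    return tokens
```

In this sketch a parenthesis inside double quotes comes back as a PHRASE token rather than a PAREN token, so it would be searched for instead of being treated as grouping.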
Single quotes are not treated as a special case by the Search Engine.
Spaces are considered to be delimiters in search conditions. If you have a special situation in which spaces are not delimiters in the text and are part of search words, then it may be necessary to enclose the search words in double quotes in order to preserve the spaces.
Multi-word phrases in queries match identical phrases in the text being searched.
For example, the search condition ‘“alpine skiing”’ will match only records that have the exact words “alpine” and “skiing” in that order in the text, with no other non-noise words between them.
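Phrase matching that skips noise words can be sketched like this. The function name phrase_matches and the noise word list are hypothetical; the real list of noise words depends on the index configuration, and this sketch does not handle punctuation.

```python
NOISE_WORDS = {"a", "an", "and", "of", "the"}  # hypothetical noise word list

def phrase_matches(phrase, text, noise_words=NOISE_WORDS):
    """Return True if the phrase words appear in order with only noise words between them."""
    def significant(s):
        return [w for w in s.lower().split() if w not in noise_words]

    target = significant(phrase)
    words = significant(text)
    n = len(target)
    # Slide a window over the non-noise words and compare it to the phrase.
    return n > 0 and any(
        words[i:i + n] == target for i in range(len(words) - n + 1)
    )
```

Here phrase_matches("alpine skiing", "alpine downhill skiing") is false because “downhill” is a non-noise word between the two phrase words, while a noise word in the same position would not prevent the match.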
If you use the NEAR operator with sub-expressions, you may need to use the optional form of the operator with the parentheses after it to avoid ambiguity in the expression.
For example, the condition “a near (b and c)” is not valid because the left parenthesis is interpreted as the opening of the NEAR proximity parameter. The query would need to be written as “a near() (b and c)”. The empty parentheses after the NEAR operator remove the ambiguity; because they are empty, the default proximity value is used. You can also specify a proximity value: “a near(25) (b and c)”.
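The ambiguity can be reproduced with a small routine that reads the proximity parameter. The function name near_distance is hypothetical, and raising ValueError is just one way a parser might reject the invalid form; the Search Engine's actual error behavior is not described here.

```python
import re

DEFAULT_DISTANCE = 8  # default NEAR distance stated in the text

def near_distance(condition):
    """Return the proximity value of the first NEAR operator in a condition.

    Sketch only: a parenthesis right after NEAR is always read as the
    start of the proximity parameter.
    """
    match = re.search(r"\bnear\b\s*", condition, re.IGNORECASE)
    if match is None:
        return None                         # no NEAR operator present
    rest = condition[match.end():]
    if not rest.startswith("("):
        return DEFAULT_DISTANCE             # plain form: a near b
    parameter = rest[1:].split(")", 1)[0].strip()
    if parameter == "":
        return DEFAULT_DISTANCE             # optional form: near()
    if not parameter.isdecimal():
        raise ValueError(f"invalid NEAR proximity parameter: {parameter!r}")
    return int(parameter)
```

With this sketch “a near() (b and c)” yields 8, “a near(25) (b and c)” yields 25, and “a near (b and c)” is rejected because “b and c” is read as the proximity parameter.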
The highest level of the hierarchy that can be searched in DocuXplorer is the Cabinet. If you know which Drawer or Folder the document is located in, you can further refine the search. Refining the search will result in fewer false document hits.
When creating a DocuXplorer Library, use the Cabinet object to group “like documents”, because index information is held at the Cabinet level. “Like documents” are all documents that you would need to include in a single search.
To full text search a PDF document, it must first have gone through the Extract Text process.
If the document is in a binary format such as a media file or graphic file, the file information is ignored by the indexing engine unless the document has been OCR’d in DocuXplorer.