A proximity search is a search option that looks for documents in which two or more matching keywords are found near each other; the proximity requirement may apply to all of the keywords or only to some of them. It is used as an advanced search option that, in addition to matching keywords or strings, takes into account the proximity relations between the keywords of the search query. In a search, the proximity can be defined as the number of words or the number of characters between the keywords.

Example: A proximity search for "red brick house" could match phrases such as "red house of brick" or "house made of red brick" without matching documents that merely contain the scattered words "house...brick...red" spread across a page.

Operators: Query operators such as NEAR, NOT NEAR, FOLLOWED BY, NOT FOLLOWED BY, SENTENCE or FAR are sometimes used to indicate a proximity-search limit between specified keywords, as in "brick NEAR house". Google allows ordered-proximity searching using one asterisk (

Importance: A proximity search is one method of lessening the search-engine problem of too many pages matching the search keywords: the more relevant pages can be selected as those in which the keywords occur in closer proximity to one another. A proximity search can also help reduce linkspam, because it does not match the many pages that contain dictionary lists or shotgun lists of thousands of words when the matching words do not occur in phrases near each other. Note that a proximity search can require that only some of the specified keywords, rather than all of them, be found near each other.
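The word-count definition of proximity can be illustrated with a minimal sketch. It is not taken from any particular search engine: the function names `tokenize` and `within_proximity`, the parameter `max_span`, and the choice to measure proximity as the size of the smallest window of consecutive words containing every keyword are all assumptions made for illustration.

```python
import re
from itertools import product


def tokenize(text):
    """Lowercase the text and split it into word tokens (hypothetical helper)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def within_proximity(text, keywords, max_span):
    """Return True if every keyword occurs inside some window of at most
    max_span consecutive words, in any order (assumed definition)."""
    tokens = tokenize(text)
    positions = []
    for keyword in keywords:
        hits = [i for i, token in enumerate(tokens) if token == keyword.lower()]
        if not hits:
            return False  # a keyword is missing entirely
        positions.append(hits)
    # Brute force: try every combination of one position per keyword and
    # check whether the span from first to last position fits the window.
    return any(
        max(combo) - min(combo) + 1 <= max_span
        for combo in product(*positions)
    )


keywords = ["red", "brick", "house"]
print(within_proximity("red house of brick", keywords, 5))
print(within_proximity("house made of red brick", keywords, 5))
scattered = "house " + "filler " * 20 + "brick " + "filler " * 20 + "red"
print(within_proximity(scattered, keywords, 5))
```

With a window of five words, the first two texts satisfy the check and the scattered text does not. The brute-force combination search is acceptable for short texts; a production index would more likely store word positions and use a sliding window.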
Proximity search of search engines

Implicit/automatic versus explicit proximity search: As of February 2006, almost all major Internet search engines except Exalead implement only an implicit proximity search. That is, they automatically rank higher those search results in which the user's keywords have a good "overall proximity score". If the search query contains only two keywords, this is no different from an explicit proximity search that puts a NEAR operator between the two keywords. However, if three or more keywords are present, it is often important for the user to specify which subsets of these keywords should occur close together in the search results. This is useful if the user wants to do a prior art search (e.g. finding an existing approach to complete a specific task, or finding a document that discloses a system exhibiting a procedural behavior collaboratively conducted by several components and the links between these components). For example, a search query of the form (keyword1 NEAR keyword2) (keyword1 NEAR keyword3) specifies that keyword1 and keyword2 must co-occur closely somewhere in a document, and so must keyword1 and keyword3. However, keyword2 and keyword3 need not occur close to each other anywhere in the document.

Proximity search within the Yahoo! and Google search engines is possible using full-word wildcards: the wildcard is an asterisk "

Google Asterisk: Using Google's asterisk-in-quotations approach to emulate a NEAR operator is a little cumbersome but does work (as of October 2006). For example, to specify a close co-occurrence (at most 2 words' distance) of "house" and "dog", the following search expression could be specified: "house

Note that the operator "OR" must be in capital letters. One asterisk allows a proximity of at most two words' distance between two search words. To span 7 intervening words, use 3 asterisks: "house

To span up to 11 intervening words in a Google search, use 4 asterisks, and so on.
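The explicit query form (keyword1 NEAR keyword2) (keyword1 NEAR keyword3) can be sketched as a conjunction of pairwise checks. This is only an illustration under stated assumptions: the helper names `word_positions`, `near` and `matches_all`, the default `max_distance` of 3, and the definition of distance as the difference between word indices (adjacent words have distance 1) are hypothetical and do not describe how any real search engine evaluates NEAR.

```python
import re


def word_positions(text):
    """Map each lowercased word to the list of word indices where it occurs."""
    positions = {}
    for index, word in enumerate(re.findall(r"[a-z0-9]+", text.lower())):
        positions.setdefault(word, []).append(index)
    return positions


def near(positions, first, second, max_distance):
    """True if some occurrence of `first` lies within max_distance words
    of some occurrence of `second`, in either order (assumed semantics)."""
    return any(
        abs(i - j) <= max_distance
        for i in positions.get(first.lower(), [])
        for j in positions.get(second.lower(), [])
    )


def matches_all(text, near_pairs, max_distance=3):
    """True if every (a NEAR b) pair in the query holds somewhere in the text."""
    positions = word_positions(text)
    return all(near(positions, a, b, max_distance) for a, b in near_pairs)


document = (
    "The pump drives the valve at startup. "
    + "unrelated " * 30
    + "Later the pump feeds the sensor."
)
query = [("pump", "valve"), ("pump", "sensor")]
print(matches_all(document, query))
print(near(word_positions(document), "valve", "sensor", 3))
```

In this sample document the query is satisfied because "pump" occurs near "valve" in one place and near "sensor" in another, even though "valve" and "sensor" are far apart. That is the distinction an implicit overall proximity score cannot express.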
See also edit distance, semantic proximity