This topic describes how document text and properties returned by filters are broken up into words and how common words are excluded.
A word-breaker DLL parses the text and textual properties returned by the filter DLL into words. The word-breaker DLL is language dependent. For a list of languages supported by Index Server, see the Index Server Web page.
Words that are not significant for searching are called noise words or stop words. Noise words are stored in the %systemroot%\system32 directory in noise word files (Noise.dat, by default). The noise word files are language dependent. The noise word file for a particular language is specified in the registry under the key:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\ContentIndex\Language\<language>\NoiseFile
For example, the noise word file for English_US is listed as the registry key:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\ContentIndex\Language\English_US\NoiseFile\noise.dat
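The lookup described above can be sketched in Python. This is only an illustration under stated assumptions: it assumes the key path is SYSTEM\CurrentControlSet\Control\ContentIndex\Language under HKEY_LOCAL_MACHINE, and that NoiseFile is a named value stored under the per-language key. The helper names noise_file_key and read_noise_file_name are hypothetical and are not part of Index Server.

```python
import sys

# Assumed base key path under HKEY_LOCAL_MACHINE, taken from the text above.
LANGUAGE_KEY = r"SYSTEM\CurrentControlSet\Control\ContentIndex\Language"


def noise_file_key(language):
    """Return the per-language registry subkey path (hypothetical helper)."""
    return LANGUAGE_KEY + "\\" + language


def read_noise_file_name(language):
    """Read the NoiseFile entry for a language. Works on Windows only.

    Assumes NoiseFile is a named value under the per-language key.
    """
    if sys.platform != "win32":
        raise OSError("the registry is only available on Windows")
    import winreg  # standard library, present only on Windows

    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, noise_file_key(language)) as key:
        value, _value_type = winreg.QueryValueEx(key, "NoiseFile")
    return value
```

On a Windows machine with Index Server installed, read_noise_file_name("English_US") would be expected to return the file name configured for that language, if the assumptions above hold.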
The noise word files can be edited with a text editor to either add new words or remove words that are not considered noise at a particular installation. Note that querying for noise words will not yield any hits.
Caution: Removing all noise words from the noise word files can significantly increase the size of indexes.
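The effect of noise words on indexing and querying can be illustrated with a small Python sketch. It is not how Index Server is implemented: the regular-expression word breaker stands in for a real language-dependent word-breaker DLL, the noise word file is assumed to hold whitespace-separated words, and all function names and sample documents are hypothetical.

```python
import re


def load_noise_words(text):
    """Parse noise word file contents (assumed whitespace-separated words)."""
    return {word.lower() for word in text.split()}


def break_words(text):
    """Very rough stand-in for a word breaker: lowercase alphanumeric runs."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_index(docs, noise_words):
    """Map each non-noise word to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for word in break_words(text):
            if word in noise_words:
                continue  # noise words are never stored in the index
            index.setdefault(word, set()).add(doc_id)
    return index


def query(index, word):
    """Return the sorted ids of documents containing the word."""
    return sorted(index.get(word.lower(), set()))


# Hypothetical sample data.
noise_words = load_noise_words("a\nthe\nof\nand\n")
docs = {
    "doc1": "The size of the index",
    "doc2": "Index Server and the filter",
}

filtered_index = build_index(docs, noise_words)
unfiltered_index = build_index(docs, set())

print(query(filtered_index, "index"))  # both documents match
print(query(filtered_index, "the"))    # a noise word yields no hits
print(len(filtered_index), len(unfiltered_index))  # fewer entries with filtering
```

With an empty noise word set, every common word gets its own index entry, which is the growth the caution above refers to.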
© 1997 by Microsoft Corporation. All rights reserved.