Search Server Express not indexing images (or other documents)

 
We've been hitting a wall when testing our upcoming FC.ImageSearch product. As always, we test products on clean, newly installed systems. When we ran the tests on Search Server Express (SSX), our test images simply would not be crawled. We double- and triple-checked that we had done all the required configuration steps, but it just wouldn't index the images. What was going wrong?
 
Well, one of the important steps in configuring search is to add the File Types to be crawled. For example, jpg files are NOT crawled by default, so this has to be configured.
 
File Types for Search
 
To cut many hours of testing short: it's actually a bug in SharePoint. When you add the File Types in SSX, the changes are stored in the Registry under
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\12.0\Search\Applications\{GUID}\Gather\Portal_Content\Extensions\ExtensionList
 
BUT, SSX is practically a WSS installation, so the Registry entries really have to go here:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\12.0\Search\Applications\{GUID}\Gather\Search\Extensions\ExtensionList
 
File Types SSX in Registry
 
The moment I had added a "jpg" key to the right location in the Registry and crawled the content again, the images showed up in the Crawl Log and the search results.
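
To check which of the two locations actually contains a given file type, a small read-only script can help. The following is only a sketch: the helper names are hypothetical, the path templates are copied from this post, and because the exact layout of entries under ExtensionList is not spelled out here, the lookup checks both subkey names and value data. The registry part runs on Windows only.

```python
# Hypothetical helpers; the two path templates are taken from the post.
MOSS_TEMPLATE = (
    r"SOFTWARE\Microsoft\Office Server\12.0\Search\Applications"
    r"\{GUID}\Gather\Portal_Content\Extensions\ExtensionList"
)
WSS_TEMPLATE = (
    r"SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\12.0\Search"
    r"\Applications\{GUID}\Gather\Search\Extensions\ExtensionList"
)


def extension_list_path(template, guid):
    """Fill the {GUID} placeholder with the search application's GUID."""
    return template.replace("{GUID}", guid)


def list_extension_entries(path):
    """Return (subkey names, value data) found under HKLM\\<path>.

    Windows only: winreg is imported lazily so the path helper above
    can still be used on other systems.
    """
    import winreg

    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, path) as key:
        n_subkeys, n_values, _ = winreg.QueryInfoKey(key)
        subkeys = [winreg.EnumKey(key, i) for i in range(n_subkeys)]
        values = [winreg.EnumValue(key, i)[1] for i in range(n_values)]
    return subkeys, values


def has_extension(path, ext):
    """True if ext appears as a subkey name or as value data (assumed layout)."""
    subkeys, values = list_extension_entries(path)
    names = [str(item).lower() for item in subkeys + values]
    return ext.lower() in names
```

For example, `has_extension(extension_list_path(WSS_TEMPLATE, guid), "jpg")` should tell you whether "jpg" is present in the WSS location, where `guid` is the GUID of your own search application.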
 
Although we wasted many hours on this, it still makes me feel good: it shows how much value our upcoming product will provide, since it eliminates a very long list of such issues.
 
Update (21-4-2010):
Well, as we continued our tests we realized that this still wasn't the whole story. If you're using SharePoint search, you're probably aware that many changes (such as adding a new ManagedProperty) require a Full Crawl to become effective, and IISRESET calls are bread and butter for SharePoint deployment anyway. However, getting the new file types recognized by the search engine involves yet another step: stopping and restarting the search service. In order, the steps are:
- add your file types
- IISRESET
- "net stop osearch" (on MOSS) and then "net start osearch"
- and then a Full Crawl of your content
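
The restart part of the steps above could be scripted roughly as follows. This is a minimal sketch, not a tested deployment script: the function names are hypothetical, and the service name "osearch" is the one given above for MOSS, so it is a parameter in case your installation uses a different name. Adding the file types and starting the Full Crawl are left as manual steps.

```python
import subprocess


def restart_commands(service="osearch"):
    """Commands for the IISRESET and search service restart steps, in order."""
    return [
        ["iisreset"],
        ["net", "stop", service],
        ["net", "start", service],
    ]


def run_steps(service="osearch", dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in restart_commands(service):
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the sequence if a command fails
            subprocess.run(cmd, check=True)
```

Calling `run_steps()` only prints the commands. `run_steps(dry_run=False)` actually runs them and will most likely need an elevated (administrator) command prompt on the server.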
 
 

Published: Mar-23-10