The SharePoint technical documentation leaves a lot to wish for in general, but Crawled Properties is probably one of the worst topics when it comes to documentation. On top of that, the functionality to implement is weak as well. Let me give you some examples:
When you google for "SharePoint CrawledProperty" you get a whopping 1500 search results (I'm not sarcastic, just comparing Google to Bing, read on!) and on Bing you find 122 results on the same search, with the majority being people asking for help, I'd say.
"Property Set ID - A GUID that identifies the property set for the crawled property. Doing a search for the GUID B725F130-47EF-101A-A5F1-02608C9EEBAC and filtering the results for the Property Name of 12 yields several links to related content. One such link, on MSDN
, provides a tremendous amount of information. This tells us that this property set is a System property and the propID of 12 is the file size"
While helpful to know, for me it simply shows how awkward the whole naming system is around Crawled Properties. But let's move on and get productive. We want to make things work here...
I'll use a real-world scenario to describe a typical problem around Crawled properties and how to solve it: assume you want to search on the "Date Picture Taken" column of the Picture library. After you put a fresh MOSS installation on a server, you can look up in the Search Administration of the Central Administration what Managed Property is available for that kind of search (only Managed Properties can be used in the search!). You'll find one entry:
A Managed Property (DatePictureTaken) already exists and has a mapped Crawled property called "ows_Image_x0020_CreateDate". While this looks good at first glance it will actually not help us at all. Why? Because the internal name for the "Date Picture Taken" column in the Picture library is "ImageCreateDate", so the CrawledProperty we'd need would be "ows_ImageCreateDate".
Just a quick explanation on the Crawled property naming convention: Crawled Properties are created automatically by the search crawler when it detects new columns that have a value. SharePoint columns always start with "ows_" (probably stands from "Office web server") concatenated with the internal name of the column.
So, let's have a look if there is another Crawled Property available with the right name. Under the Crawled Properties section we find this:
Hmm, there are even three such Crawled Properties: two with the same, wrong name (including the text/hex version of a blank in the name) and one with the right name... but it's got the wrong type: we need a DateTime field, not a Text field!
OK, time to upload some images that include metadata for the "Date Picture Taken" column. A (shameless ;-) plug for our FC.MetadataExtractor product here: it extracts that DatePictureTaken metadata (and more) by default, and puts it in the right column. After that, starting a Crawl will "nudge" the search crawler to recognize the Date Picture Taken column in the Picture library, because it now has data. And voila, we have the "Date Picture Taken" column linked to the right Crawled Property and in turn we can go ahead and link this Crawled Property to the existing "DatePictureTaken" Managed Property (done in the following screenshot with our FC.ImageSearch product, but you can also do that in the Search Administration section of the Central Administration).
But we're still not done: if we'd do a search now we'd get a really unfriendly error from the search service:
This is because we now have two Crawled Properties mapped to the Managed Property, but it can only take input from one at a time, so we need to change another setting for the Managed Property to "include values from a single crawled property...":
Yet another full crawl will help to push the Crawled Property data on to the newly linked Managed Property.
Now comes the fun! In the Search Configuration page of the FC.ImageSearch you can simply add the DatePictureTaken Managed Property that is now linked to the library's "Date Picture Taken" column and add it to the properties for the Advanced Search page:
Now, in the Advanced Search page you can select the "All Images" scope of the FC.ImageSearch solution and then choose from its Managed Properties the DatePictureTaken property:
...and that will get the successful search on our "Date Picture Taken" column:
Well, I still wonder why there are - out of the box - three ambiguous ImageCreateDate Crawled Properties. And what's more, deleting Crawled Properties is not possible in SharePoint, so you'll just have to ignore them.
But at least we got the search to work!