We have two simple scopes defined based on a field called “status” for the files in a document library. The field value is either “Active” or “Archived”.
A user complained that she couldn’t find a document in the “Active” documents scope. That document was indeed set to “Active” in the status field. It is a Microsoft Word 2010 document ending with “.docx”. I tried another search for an “Active” PDF document and couldn’t find it in the search results either. The same thing happens for “DOCM” and “JPG” files. “DOC” and “XLS” files work just fine.
Also, if I search for the documents at the site level, they do show up in the results. That means they are being indexed.
I looked at the scope definition and found there are 597 documents in the scope. However, I can see there are 635 documents in the document library marked as “Active”. The difference of 38 is the total count of the non-Office 2003 documents.
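To double-check which file types are dropping out of the scope, a comparison like the one above can be sketched in a few lines of Python. This is only a rough illustration: it assumes the file names of the “Active” documents in the library and the file names returned by the scope have already been exported to two plain lists. The function name `missing_by_extension` and the sample file names are made up for the example.

```python
from collections import Counter
from pathlib import PurePosixPath


def missing_by_extension(library_active, scope_results):
    """Count documents that are Active in the library but absent from the scope, per extension."""
    # Compare names case-insensitively.
    in_scope = {name.lower() for name in scope_results}
    missing = [name for name in library_active if name.lower() not in in_scope]
    # Group the missing documents by lower-cased file extension.
    return Counter(PurePosixPath(name).suffix.lower() for name in missing)


# Hypothetical sample data, not the real library contents.
library_active = ["plan.doc", "budget.xls", "report.docx", "manual.pdf", "macro.docm", "logo.jpg"]
scope_results = ["plan.doc", "budget.xls"]

gaps = missing_by_extension(library_active, scope_results)
for ext, count in sorted(gaps.items()):
    print(ext, count)
```

If every missing document falls under extensions such as “.docx”, “.pdf”, “.docm” and “.jpg”, while “.doc” and “.xls” never appear, that points at how the property is populated for those file types rather than at the individual documents.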
I tried removing the “Managed property” rule from the scope and leaving only the other rule (a folder-based rule), and the document showed up in the results.
Based on the above facts, I suspected there was something wrong with the managed property. I did some googling using the keywords “Managed Property PDF” and located this page:
“On further investigation, we found that the particular column with the issue was a “Date and time” type column. However, the managed property had been mapped to a crawled property with the correct name “Date of origination” but of type “Text”. This mostly worked, but not for pdf files. When we re-mapped the managed property to the crawled property with the name “ows_Date_x0020_of_x0020_origination” and type “Date and Time”, and did a fresh full crawl, it worked correctly.”
That was exactly what had happened to me. I went into the managed property and redid the mapping, adding the crawled property “ows_Status” as the new mapping. Then I performed a full crawl, and the problem went away.