Friday, October 30, 2009

Starting with version 8, the PDF iFilter is bundled with Adobe Reader. To get the iFilter, install Adobe Reader (8.0 or higher) on the WSS server that will do the indexing.

The procedure below will not work on 64-bit servers.

WSS needs to know which file extensions to index, so you need to change a few registry entries.

In regedit:
  • HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\12.0\Search\Applications\{ANYGUID}\Gather\Search\Extensions\ExtensionList

    Find the highest number in the list and add a new entry named with the next number, with PDF as its value.
  • HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\12.0\Search\Setup\ContentIndexCommon\Filters\Extension

    This is the list of file extensions, each with the class ID of the iFilter used to index it. If .pdf is not listed, add it. The .pdf key should have a multi-string value, to which you add the CLSID of the iFilter installed by Adobe Reader. In version 9, the iFilter file is called "AcroRDIF.dll".

    CLSID for version 9: {E8978DA6-047F-4E3D-9C78-CDBE46041603}
  • Add the Reader folder to the server's Path environment variable: open the Start menu, right-click My Computer and select Properties. On the Advanced tab, click the Environment Variables button, scroll down to the Path variable, select it and click Edit. Append ";C:\Program Files\Adobe\Reader 9.0\Reader" to the value, then click OK to apply and close.
  • Reboot the server. It may take some time until the PDF documents are indexed.

    You can also start the indexing yourself: run the search service (stsadm -o spsearch...) for WSS, or start a Full Crawl from the SSP for MOSS.
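The three edits above (the next ExtensionList entry, the CLSID in the multi-string value, and the Path addition) can be sketched in Python. This is only a sketch under assumptions: it works on plain dictionaries, lists and strings rather than touching the registry, the helper names are made up for illustration, and it assumes the ExtensionList entries are named with decimal numbers and that the extension is stored in lowercase. Use it to check what you are about to type into regedit, not as a replacement for it.

```python
from typing import Dict, List, Optional, Tuple

# CLSID quoted in the post for the Adobe Reader 9 iFilter.
ADOBE_9_IFILTER_CLSID = "{E8978DA6-047F-4E3D-9C78-CDBE46041603}"
# Reader folder quoted in the post.
READER_DIR = r"C:\Program Files\Adobe\Reader 9.0\Reader"


def next_extension_entry(extension_list: Dict[str, str],
                         ext: str = "pdf") -> Optional[Tuple[str, str]]:
    """Return the (name, value) pair to add to ExtensionList.

    Assumes entry names are decimal numbers. Returns None when the
    extension is already listed, so nothing needs to be added.
    """
    if any(value.lower() == ext.lower() for value in extension_list.values()):
        return None
    numbers = [int(name) for name in extension_list if name.isdigit()]
    highest = max(numbers, default=0)
    return str(highest + 1), ext


def add_clsid(multi_string: List[str], clsid: str) -> List[str]:
    """Return the multi-string value with the CLSID added once."""
    if any(item.lower() == clsid.lower() for item in multi_string):
        return list(multi_string)
    return list(multi_string) + [clsid]


def append_to_path(path_value: str, directory: str) -> str:
    """Append a folder to a ';'-separated Path value if it is missing."""
    parts = [p for p in path_value.split(";") if p]
    wanted = directory.rstrip("\\").lower()
    if any(p.rstrip("\\").lower() == wanted for p in parts):
        return path_value
    return ";".join(parts + [directory])
```

For example, with entries 1, 2 and 37 already in the list, next_extension_entry returns ("38", "pdf"), which is the name and value to add by hand.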

Wednesday, October 28, 2009

PDF search in SharePoint

There are two issues with PDF documents in SharePoint.

  1. Out of the box, you won't see a PDF icon in document libraries the way you see MS Office icons next to Office documents.
  2. Search looks at a PDF document's properties but not at its content.

PDF Icon:

  • Download the small icon file from Adobe.
  • Save the icon in the c:\Program Files\Common Files\Microsoft Shared\web server extensions\12\TEMPLATE\IMAGES folder. You can rename it to match the standard naming convention (for example, icpdf.gif, the name used in the mapping below).
  • Open (in notepad) the DOCICON.XML file located in the c:\Program Files\Common Files\Microsoft Shared\web server extensions\12\TEMPLATE\XML folder.
  • In the ByExtension section, add the tag <Mapping Key="pdf" Value="icpdf.gif" OpenControl=""/>
  • Save.
  • Reset IIS.
  • You should now see all PDF documents with the correct icon.
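If you prefer not to edit DOCICON.XML by hand, the mapping step can be sketched in Python with the standard xml.etree.ElementTree module. This is a rough sketch under assumptions: the function name is made up, the sample document is a trimmed-down stand-in and not the real file, and the code works on a string so you can try it safely before pointing it at the actual file on the server (keep a backup of the original).

```python
import xml.etree.ElementTree as ET


def add_pdf_mapping(docicon_xml: str, icon_file: str = "icpdf.gif") -> str:
    """Return the XML text with a pdf Mapping added to ByExtension.

    The text is returned unchanged if a pdf mapping already exists.
    """
    root = ET.fromstring(docicon_xml)
    by_extension = root.find("ByExtension")
    if by_extension is None:
        raise ValueError("no ByExtension section found")
    for mapping in by_extension.findall("Mapping"):
        if mapping.get("Key", "").lower() == "pdf":
            return docicon_xml  # already mapped, nothing to do
    ET.SubElement(
        by_extension,
        "Mapping",
        {"Key": "pdf", "Value": icon_file, "OpenControl": ""},
    )
    return ET.tostring(root, encoding="unicode")


# Hypothetical, shortened stand-in for DOCICON.XML (not the real file).
SAMPLE = """<DocIcons>
  <ByProgID/>
  <ByExtension>
    <Mapping Key="doc" Value="icdoc.gif" OpenControl="SharePoint.OpenDocuments"/>
  </ByExtension>
  <Default>
    <Mapping Value="icgen.gif"/>
  </Default>
</DocIcons>"""

updated = add_pdf_mapping(SAMPLE)
```

Running the function a second time on its own output changes nothing, so it is safe to repeat. You still need to save the file and reset IIS afterwards.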

I will post the instructions for PDF content search soon.

Saturday, October 17, 2009

How can we explain SharePoint?

I posted this article in another blog but thought it might be of interest to Excel experts as well. There is another post in this blog explaining the SharePoint-Excel interaction.

I always sympathize with SharePoint professionals when reading articles about their difficulties in answering the question: “What do you do for a living?” After thinking it over and over, I decided to rephrase the question. I asked myself, “What will eventually be the TLA (Three Letter Acronym) for SharePoint-like applications?”

I have never been satisfied with my own explanations of SharePoint, even with my latest strategy of asking the person about his or her IT knowledge before I formulate my response. I see people nodding their heads, but in most cases I feel I didn’t convey the message well enough.

Thinking about a TLA, the first thing that comes to mind is the equivalent ERP (Enterprise Resource Planning). The name is intuitively perceived as a package of related applications that support an enterprise’s operations end to end. But it took 10 to 15 years until this TLA became the standard everyone uses and understands. Previously, there had been MRP, Logistics, Shipping, Financial, HR and other disparate applications.

So what do we package here? SharePoint supports several relatively independent processes. It combines what used to be separate products: Portal, EDM (Enterprise Document Management), ECM (Enterprise Content Management), WCM (Web Content Management), team collaboration, activity tracking and even connectivity extensions to backend systems. We will probably see more Social Network support coming soon. Since companies usually need more than one of these applications, why not use the same tool and save on maintenance?

What would be a reasonable common denominator for all these processes? I think the phrase “Knowledge Sharing” is pretty close to what they do. Prefix it with an E for Enterprise, which is important, accurate, and also sells, and you get EKS: Enterprise Knowledge Sharing.

Spread the word…