ManifoldCF


Warning

Funnelback stopped shipping ManifoldCF as a packaged product as of v15. ManifoldCF must now be downloaded and installed separately.

How to use ManifoldCF connectors with Funnelback.

General

Logging

Manifold logging is output to:

$SEARCH_HOME/web/logs/manifoldcf*.log

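The log verbosity can be increased by adding the following line to the logging configuration file. This particular logger records the raw HTTP traffic, which is useful when debugging connector requests: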

$SEARCH_HOME/conf/mcf-logging.ini
log4j.logger.org.apache.http.wire=DEBUG

Enable PostgreSQL

By default, the ManifoldCF installation uses the Derby database packaged with it. This database is suitable for testing but not for production, as noted in the ManifoldCF documentation.

Enable PostgreSQL by altering the database connection properties in the mcf-properties.xml file:

$SEARCH_HOME/conf/mcf-properties.xml
<property name="org.apache.manifoldcf.databaseimplementationclass" value="org.apache.manifoldcf.core.database.DBInterfacePostgreSQL" />
<property name="org.apache.manifoldcf.database.name" value="manifoldcf" />
<property name="org.apache.manifoldcf.dbsuperusername" value="<postgres username>" />
<property name="org.apache.manifoldcf.dbsuperuserpassword" value="<postgres password>" />
<property name="org.apache.manifoldcf.database.maxhandles" value="100" />

Notes:

  • PostgreSQL must be configured to allow the DB user access via username and password (rather than the Linux identity). This can be set in the PostgreSQL pg_hba.conf file.
    The following line can be added:
    pg_hba.conf
    # TYPE  DATABASE    USER        CIDR-ADDRESS          METHOD
    local   all         <db username>                     password
    # the default will look something like
    #local   all         all                               ident
  • After updating the database connection settings, you must restart the funnelback-continuous service. Check /opt/funnelback/log/continuous-global.log for errors while it is starting up. Any database connection issues will appear in this file.
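Before restarting the service, it can help to confirm that the properties file actually contains the PostgreSQL settings listed above. The following is a minimal sketch, not part of ManifoldCF or Funnelback: the function name `check_postgres_settings` is hypothetical, and it assumes the file is well-formed XML whose `property` elements carry `name` and `value` attributes as in the snippet above.

```python
import xml.etree.ElementTree as ET

PREFIX = "org.apache.manifoldcf."
POSTGRES_CLASS = "org.apache.manifoldcf.core.database.DBInterfacePostgreSQL"

# Property names (without the common prefix) taken from the snippet above.
REQUIRED = (
    "databaseimplementationclass",
    "database.name",
    "dbsuperusername",
    "dbsuperuserpassword",
)


def check_postgres_settings(xml_text):
    """Return a list of problems; an empty list means the settings look complete."""
    root = ET.fromstring(xml_text)
    # Collect every <property name="..." value="..."/> regardless of the root element name.
    props = {p.get("name"): p.get("value") for p in root.iter("property")}

    problems = []
    for key in REQUIRED:
        if not props.get(PREFIX + key):
            problems.append("missing or empty: " + PREFIX + key)

    impl = props.get(PREFIX + "databaseimplementationclass")
    if impl and impl != POSTGRES_CLASS:
        problems.append("database implementation is " + impl + ", not PostgreSQL")
    return problems
```

To use it, read the contents of $SEARCH_HOME/conf/mcf-properties.xml into a string and pass it to the function. The check only inspects the file; it does not attempt a database connection, so the continuous-global.log file remains the place to look for connection errors.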

SharePoint

Commonly used SIDs

ManifoldCF retrieves many SIDs as part of the ACL for a particular page or document. Most of these map directly back to an Active Directory user, but some are generic "Global Access" type SIDs, listed at:

http://support.microsoft.com/kb/243330

SID notes:

The main two that you might encounter in SharePoint are SID: S-1-5-11, Name: Authenticated Users and SID: S-1-1-0, Name: Everyone.

How to identify? You just have to use a list to identify the values you want. There are also header files or similar in Windows that define them, e.g. .NET has the WellKnownSidType enumeration: http://msdn.microsoft.com/en-us/library/system.security.principal.wellknownsidtype(v=vs.110).aspx

Are they set in AD? Not exactly. The same values are used in Active Directory, but they aren't set in AD; they are standardised values as per MSDN, although the definitions have shifted slightly over time.

There are also values like SID: S-1-5-21-domain-513, Name: Domain Users, where the actual value depends on the domain but always ends with a specific group ID. These groups should work normally (i.e. they are just a group that users are in), so they shouldn't need special handling.

You might also see values like SID: S-1-0-0, Name: Nobody, SID: S-1-5-7, Name: Anonymous and potentially others. These probably won't appear as claims values, but directly as SIDs (I would have to check this). There are also special names like "SHAREPOINT\System", which is a special system account (I don't think it has a SID).

You can create an example scenario by creating an item (or list or site) and then, in the security settings, assigning read permissions to a group such as "Authenticated Users". If you then check the permissions on that item that you are receiving via the connector, you can see what format the values come across in.

Note that SIDs have a particular structure, so you should be able to output a summary of regular vs special SIDs, e.g. by their different lengths.
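The summary of regular vs special SIDs could be sketched along these lines. This is an illustration only: the function name `classify_sid` and the category labels are hypothetical, the lookup table contains only the SIDs mentioned in this article, and the pattern assumes domain SIDs take the form S-1-5-21-domain-RID as described above.

```python
import re

# Well-known SIDs named in this article; extend the table as needed.
WELL_KNOWN_SIDS = {
    "S-1-0-0": "Nobody",
    "S-1-1-0": "Everyone",
    "S-1-5-7": "Anonymous",
    "S-1-5-11": "Authenticated Users",
}

# Domain SIDs: S-1-5-21, one or more domain components, then a final relative ID.
DOMAIN_SID = re.compile(r"^S-1-5-21(?:-\d+)+-(\d+)$")

# Relative IDs that mean the same thing in every domain.
WELL_KNOWN_RIDS = {513: "Domain Users"}


def classify_sid(sid):
    """Return a short label describing what kind of SID this is."""
    sid = sid.strip().upper()
    if sid in WELL_KNOWN_SIDS:
        return "well-known: " + WELL_KNOWN_SIDS[sid]
    match = DOMAIN_SID.match(sid)
    if match:
        rid = int(match.group(1))
        if rid in WELL_KNOWN_RIDS:
            return "domain group: " + WELL_KNOWN_RIDS[rid]
        return "domain account or group"
    # Anything else, including names such as SHAREPOINT\System.
    return "unrecognised"
```

Running this over the ACL values received from the connector and counting the labels gives a quick picture of which entries need special handling and which are ordinary users or groups.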

Testing SOAP calls to the MCPermissions.asmx service

To enable the SharePoint crawl, the MCPermissions.asmx service must be installed within the SharePoint installation you are trying to crawl. The relevant WSP installer is packaged with ManifoldCF and is located in:

 $SEARCH_HOME/tools/apache-manifoldcf/sharepoint-integration/sharepoint-VERSION/MetaCarta.SharePoint.MCPermissionsService

After installation, you can view all the available methods by viewing http://<sharepoint site domain>/_vti_bin/MCPermissions.asmx

This API can easily be tested using SoapUI:

Download: http://www.soapui.org/

Use the WSDL to test: http://<sharepoint site domain>/_vti_bin/MCPermissions.asmx?wsdl
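If SoapUI is not available, the operations exposed by the service can also be listed from the WSDL with a short script. The following is a rough sketch rather than a supported tool: the function name `list_wsdl_operations` is hypothetical, and it assumes a standard WSDL 1.1 document that has already been saved to disk (for example from a browser, since many SharePoint sites require Windows authentication that a plain scripted request will not pass).

```python
import xml.etree.ElementTree as ET

# Standard WSDL 1.1 namespace.
WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"


def list_wsdl_operations(wsdl_xml):
    """Return the sorted, de-duplicated operation names declared in the WSDL's portType elements."""
    root = ET.fromstring(wsdl_xml)
    names = set()
    for port_type in root.iter("{%s}portType" % WSDL_NS):
        for operation in port_type.findall("{%s}operation" % WSDL_NS):
            name = operation.get("name")
            if name:
                names.add(name)
    return sorted(names)
```

Pass the contents of the saved WSDL file to the function and compare the result with the methods shown on the MCPermissions.asmx page. An empty list suggests that the file is not a WSDL document, for instance a login page returned in its place.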
