public class SolrConnector
extends org.apache.manifoldcf.agents.output.BaseOutputConnector
| Modifier and Type | Class and Description |
|---|---|
| protected class | SolrConnector.SpecPacker: This class handles Solr connector version string packing/unpacking/interpretation. |
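SpecPacker is described only as handling version string packing and unpacking. As a rough illustration of what delimiter-based packing can look like, here is a minimal sketch. The class name `PackSketch`, the `+` delimiter, the backslash escape, and both method signatures are assumptions made for this example; they are not the SpecPacker API or the inherited `pack`/`unpack` signatures.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of ManifoldCF: packs fields into one string
// by terminating each field with a delimiter and escaping special characters.
final class PackSketch {
    static final char DELIM = '+';   // assumed delimiter
    static final char ESC = '\\';    // assumed escape character

    // Append every field followed by DELIM, escaping DELIM and ESC inside fields.
    static String pack(List<String> fields) {
        StringBuilder sb = new StringBuilder();
        for (String field : fields) {
            for (int i = 0; i < field.length(); i++) {
                char c = field.charAt(i);
                if (c == DELIM || c == ESC) {
                    sb.append(ESC);
                }
                sb.append(c);
            }
            sb.append(DELIM);
        }
        return sb.toString();
    }

    // Reverse of pack: split on unescaped DELIM and drop the escape characters.
    static List<String> unpack(String packed) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < packed.length(); i++) {
            char c = packed.charAt(i);
            if (c == ESC && i + 1 < packed.length()) {
                current.append(packed.charAt(++i)); // take the escaped character literally
            } else if (c == DELIM) {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        List<String> original = List.of("text/html", "a+b", "");
        String packed = pack(original);
        System.out.println(unpack(packed).equals(original));
    }
}
```

Terminating every field with the delimiter, rather than separating fields with it, lets an empty trailing field survive the round trip.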
| Modifier and Type | Field and Description |
|---|---|
| static java.lang.String | _rcsid |
| protected java.lang.String | allowAttributeName: The allow attribute name |
| protected static boolean | allowCompression: Allow compression? Currently static |
| protected java.lang.String | collectionName: Collection name (non-empty only if SolrCloud) |
| protected java.lang.String | contentAttributeName |
| protected java.lang.String | createdDateAttributeName |
| protected java.lang.String | denyAttributeName: The deny attribute name |
| protected boolean | doCommits: Whether or not to commit |
| protected java.util.Set<java.lang.String> | excludedMimeTypes: Excluded mime types |
| protected java.lang.String | excludedMimeTypesString: Excluded mime types string |
| protected static long | EXPIRATION_INTERVAL: Idle connection expiration interval |
| protected long | expirationTime: Expiration |
| protected java.lang.String | fileNameAttributeName |
| protected java.lang.String | idAttributeName |
| protected java.util.Set<java.lang.String> | includedMimeTypes: Included mime types |
| protected java.lang.String | includedMimeTypesString: Included mime types string |
| protected java.lang.String | indexedDateAttributeName |
| static java.lang.String | INGEST_ACTIVITY: Ingestion activity |
| protected java.lang.Long | maxDocumentLength: The maximum document length |
| protected java.lang.String | mimeTypeAttributeName |
| protected java.lang.String | modifiedDateAttributeName |
| protected java.lang.String | originalSizeAttributeName |
| protected HttpPoster | poster: Local connection |
| static java.lang.String | REMOVE_ACTIVITY: Document removal activity |
| protected boolean | useExtractUpdateHandler: Use extracting update handler? |
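The EXPIRATION_INTERVAL and expirationTime fields, together with the "Local connection" held in poster, suggest an idle-timeout scheme for the session. The following is a minimal sketch of one way such a scheme could work, not the connector's actual code. The class name `IdleExpirationSketch`, the 300000 ms interval, the explicit `now` parameters, and the use of a plain `Object` in place of HttpPoster are all assumptions for illustration.

```java
// Hypothetical stand-in for a connector that drops its session after an idle period.
final class IdleExpirationSketch {
    // Assumed value; the real EXPIRATION_INTERVAL is not stated on this page.
    static final long EXPIRATION_INTERVAL_MS = 300_000L;

    private Object session;             // stands in for the HttpPoster connection
    private long expirationTime = -1L;  // time after which an idle session is dropped

    // Create the session if needed and push the expiration time forward.
    void getSession(long now) {
        if (session == null) {
            session = new Object();
        }
        expirationTime = now + EXPIRATION_INTERVAL_MS;
    }

    // Called periodically while idle: release the session once it has expired.
    void poll(long now) {
        if (session != null && now >= expirationTime) {
            session = null;
            expirationTime = -1L;
        }
    }

    // A connector with no live session does not count as connected.
    boolean isConnected() {
        return session != null;
    }

    public static void main(String[] args) {
        IdleExpirationSketch connector = new IdleExpirationSketch();
        connector.getSession(0L);
        connector.poll(EXPIRATION_INTERVAL_MS - 1);
        System.out.println(connector.isConnected());
        connector.poll(EXPIRATION_INTERVAL_MS);
        System.out.println(connector.isConnected());
    }
}
```

Passing the current time in as a parameter keeps the sketch deterministic; a real connector would more likely read the system clock.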
| Constructor and Description |
|---|
| SolrConnector(): Constructor. |
| Modifier and Type | Method and Description |
|---|---|
| int | addOrReplaceDocumentWithException(java.lang.String documentURI, org.apache.manifoldcf.core.interfaces.VersionContext pipelineDescription, org.apache.manifoldcf.agents.interfaces.RepositoryDocument document, java.lang.String authorityNameString, org.apache.manifoldcf.agents.interfaces.IOutputAddActivity activities): Add (or replace) a document in the output data store using the connector. |
| java.lang.String | check(): Test the connection. |
| boolean | checkLengthIndexable(org.apache.manifoldcf.core.interfaces.VersionContext outputDescription, long length, org.apache.manifoldcf.agents.interfaces.IOutputCheckActivity activities): Pre-determine whether a document's length is indexable by this connector. |
| boolean | checkMimeTypeIndexable(org.apache.manifoldcf.core.interfaces.VersionContext outputDescription, java.lang.String mimeType, org.apache.manifoldcf.agents.interfaces.IOutputCheckActivity activities): Detect if a mime type is indexable or not. |
| void | connect(org.apache.manifoldcf.core.interfaces.ConfigParams configParameters): Connect. |
| void | disconnect(): Close the connection. |
| java.lang.String[] | getActivitiesList(): Return the list of activities that this connector supports. |
| org.apache.manifoldcf.core.interfaces.VersionContext | getPipelineDescription(org.apache.manifoldcf.core.interfaces.Specification spec): Get an output version string, given an output specification. |
| protected void | getSession(): Set up a session |
| boolean | isConnected(): This method is called to assess whether this connector instance should actually be counted as being connected. |
| void | noteJobComplete(org.apache.manifoldcf.agents.interfaces.IOutputNotifyActivity activities): Notify the connector of a completed job. |
| void | outputConfigurationBody(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext, org.apache.manifoldcf.core.interfaces.IHTTPOutput out, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.ConfigParams parameters, java.lang.String tabName): Output the configuration body section. |
| void | outputConfigurationHeader(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext, org.apache.manifoldcf.core.interfaces.IHTTPOutput out, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.ConfigParams parameters, java.util.List<java.lang.String> tabsArray): Output the configuration header section. |
| protected static java.util.Set<java.lang.String> | parseMimeTypes(java.lang.String mimeTypes): Parse a mime type field into individual mime types in a hash |
| void | poll(): This method is periodically called for all connectors that are connected but not in active use. |
| java.lang.String | processConfigurationPost(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext, org.apache.manifoldcf.core.interfaces.IPostParameters variableContext, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.ConfigParams parameters): Process a configuration post. |
| void | removeDocument(java.lang.String documentURI, java.lang.String outputDescription, org.apache.manifoldcf.agents.interfaces.IOutputRemoveActivity activities): Remove a document using the connector. |
| void | viewConfiguration(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext, org.apache.manifoldcf.core.interfaces.IHTTPOutput out, java.util.Locale locale, org.apache.manifoldcf.core.interfaces.ConfigParams parameters): View configuration. |
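The summary above lists parseMimeTypes, which turns a mime type field into a set, and checkMimeTypeIndexable, which decides whether a mime type is indexable. A minimal sketch of how the two could fit together with the includedMimeTypes and excludedMimeTypes fields follows. It rests on several assumptions that this page does not confirm: the field holds one mime type per line, an empty field means "no restriction" (represented as null), matching is case-insensitive, and exclusion wins over inclusion. The class name `MimeTypeFilterSketch` and the `isIndexable` helper are hypothetical.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration of include/exclude filtering by mime type.
final class MimeTypeFilterSketch {
    // Assumption: one mime type per line; blank lines are ignored.
    // Returns null when the field is empty, meaning "no restriction".
    static Set<String> parseMimeTypes(String mimeTypes) {
        if (mimeTypes == null || mimeTypes.trim().isEmpty()) {
            return null;
        }
        Set<String> result = new LinkedHashSet<>();
        for (String line : mimeTypes.split("\\R")) {
            String type = line.trim().toLowerCase(Locale.ROOT);
            if (!type.isEmpty()) {
                result.add(type);
            }
        }
        return result;
    }

    // Assumption: a type must be in the include set (if any) and absent
    // from the exclude set (if any) to be indexable.
    static boolean isIndexable(String mimeType, Set<String> included, Set<String> excluded) {
        if (mimeType == null) {
            return included == null; // unknown type passes only when nothing is required
        }
        String type = mimeType.trim().toLowerCase(Locale.ROOT);
        if (included != null && !included.contains(type)) {
            return false;
        }
        return excluded == null || !excluded.contains(type);
    }

    public static void main(String[] args) {
        Set<String> included = parseMimeTypes("text/html\napplication/pdf\n");
        Set<String> excluded = parseMimeTypes("application/pdf");
        System.out.println(isIndexable("text/html", included, excluded));
        System.out.println(isIndexable("application/pdf", included, excluded));
    }
}
```

Parsing the string form once into a set keeps each per-document check to a constant-time lookup, which may be why the class holds both a string field and a set field for each list.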
Methods inherited from class org.apache.manifoldcf.agents.output.BaseOutputConnector:
checkDateIndexable, checkDocumentIndexable, checkURLIndexable, getFormCheckJavascriptMethodName, getFormPresaveCheckJavascriptMethodName, noteAllRecordsRemoved, outputSpecificationBody, outputSpecificationHeader, processSpecificationPost, requestInfo, viewSpecification
Methods inherited from class org.apache.manifoldcf.core.connector.BaseConnector:
clearThreadContext, deinstall, getConfiguration, install, outputConfigurationBody, outputConfigurationHeader, outputConfigurationHeader, pack, packFixedList, packList, packList, processConfigurationPost, setThreadContext, unpack, unpackFixedList, unpackList, viewConfiguration
public static final java.lang.String _rcsid
public static final java.lang.String INGEST_ACTIVITY
public static final java.lang.String REMOVE_ACTIVITY
protected HttpPoster poster
protected long expirationTime
protected java.lang.String allowAttributeName
protected java.lang.String denyAttributeName
protected java.lang.Long maxDocumentLength
protected java.lang.String includedMimeTypesString
protected java.util.Set<java.lang.String> includedMimeTypes
protected java.lang.String excludedMimeTypesString
protected java.util.Set<java.lang.String> excludedMimeTypes
protected java.lang.String idAttributeName
protected java.lang.String originalSizeAttributeName
protected java.lang.String modifiedDateAttributeName
protected java.lang.String createdDateAttributeName
protected java.lang.String indexedDateAttributeName
protected java.lang.String fileNameAttributeName
protected java.lang.String mimeTypeAttributeName
protected java.lang.String contentAttributeName
protected boolean useExtractUpdateHandler
protected static final boolean allowCompression
protected boolean doCommits
protected java.lang.String collectionName
protected static final long EXPIRATION_INTERVAL
public java.lang.String[] getActivitiesList()
Specified by: getActivitiesList in interface org.apache.manifoldcf.agents.interfaces.IOutputConnector
Overrides: getActivitiesList in class org.apache.manifoldcf.agents.output.BaseOutputConnector

public void connect(org.apache.manifoldcf.core.interfaces.ConfigParams configParameters)
Specified by: connect in interface org.apache.manifoldcf.core.interfaces.IConnector
Overrides: connect in class org.apache.manifoldcf.core.connector.BaseConnector
Parameters:
configParameters - is the set of configuration parameters, which in this case describe the target appliance, basic auth configuration, etc. (This formerly came out of the ini file.)

public void poll()
    throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Specified by: poll in interface org.apache.manifoldcf.core.interfaces.IConnector
Overrides: poll in class org.apache.manifoldcf.core.connector.BaseConnector
Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException

public boolean isConnected()
Specified by: isConnected in interface org.apache.manifoldcf.core.interfaces.IConnector
Overrides: isConnected in class org.apache.manifoldcf.core.connector.BaseConnector

public void disconnect()
    throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Specified by: disconnect in interface org.apache.manifoldcf.core.interfaces.IConnector
Overrides: disconnect in class org.apache.manifoldcf.core.connector.BaseConnector
Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException

protected void getSession()
    throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException

protected static java.util.Set<java.lang.String> parseMimeTypes(java.lang.String mimeTypes)
    throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException

public java.lang.String check()
    throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Specified by: check in interface org.apache.manifoldcf.core.interfaces.IConnector
Overrides: check in class org.apache.manifoldcf.core.connector.BaseConnector
Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException

public org.apache.manifoldcf.core.interfaces.VersionContext getPipelineDescription(org.apache.manifoldcf.core.interfaces.Specification spec)
    throws org.apache.manifoldcf.core.interfaces.ManifoldCFException,
           org.apache.manifoldcf.agents.interfaces.ServiceInterruption
Specified by: getPipelineDescription in interface org.apache.manifoldcf.agents.interfaces.IPipelineConnector
Overrides: getPipelineDescription in class org.apache.manifoldcf.agents.output.BaseOutputConnector
Parameters:
spec - is the current output specification for the job that is doing the crawling.
Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
org.apache.manifoldcf.agents.interfaces.ServiceInterruption

public boolean checkMimeTypeIndexable(org.apache.manifoldcf.core.interfaces.VersionContext outputDescription,
    java.lang.String mimeType,
    org.apache.manifoldcf.agents.interfaces.IOutputCheckActivity activities)
    throws org.apache.manifoldcf.core.interfaces.ManifoldCFException,
           org.apache.manifoldcf.agents.interfaces.ServiceInterruption
Specified by: checkMimeTypeIndexable in interface org.apache.manifoldcf.agents.interfaces.IPipelineConnector
Overrides: checkMimeTypeIndexable in class org.apache.manifoldcf.agents.output.BaseOutputConnector
Parameters:
outputDescription - is the document's output version.
mimeType - is the mime type of the document.
Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
org.apache.manifoldcf.agents.interfaces.ServiceInterruption

public boolean checkLengthIndexable(org.apache.manifoldcf.core.interfaces.VersionContext outputDescription,
    long length,
    org.apache.manifoldcf.agents.interfaces.IOutputCheckActivity activities)
    throws org.apache.manifoldcf.core.interfaces.ManifoldCFException,
           org.apache.manifoldcf.agents.interfaces.ServiceInterruption
Specified by: checkLengthIndexable in interface org.apache.manifoldcf.agents.interfaces.IPipelineConnector
Overrides: checkLengthIndexable in class org.apache.manifoldcf.agents.output.BaseOutputConnector
Parameters:
outputDescription - is the document's output version.
length - is the length of the document.
Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
org.apache.manifoldcf.agents.interfaces.ServiceInterruption

public int addOrReplaceDocumentWithException(java.lang.String documentURI,
    org.apache.manifoldcf.core.interfaces.VersionContext pipelineDescription,
    org.apache.manifoldcf.agents.interfaces.RepositoryDocument document,
    java.lang.String authorityNameString,
    org.apache.manifoldcf.agents.interfaces.IOutputAddActivity activities)
    throws org.apache.manifoldcf.core.interfaces.ManifoldCFException,
           org.apache.manifoldcf.agents.interfaces.ServiceInterruption,
           java.io.IOException
Specified by: addOrReplaceDocumentWithException in interface org.apache.manifoldcf.agents.interfaces.IPipelineConnector
Overrides: addOrReplaceDocumentWithException in class org.apache.manifoldcf.agents.output.BaseOutputConnector
Parameters:
documentURI - is the URI of the document. The URI is presumed to be the unique identifier which the output data store will use to process and serve the document. This URI is constructed by the repository connector which fetches the document, and is thus universal across all output connectors.
pipelineDescription - includes the description string that was constructed for this document by the getOutputDescription() method.
document - is the document data to be processed (handed to the output data store).
authorityNameString - is the name of the authority responsible for authorizing any access tokens passed in with the repository document. May be null.
activities - is the handle to an object that the implementer of a pipeline connector may use to perform operations, such as logging processing activity, or sending a modified document to the next stage in the pipeline.
Throws:
java.io.IOException - only if there's a stream error reading the document data.
org.apache.manifoldcf.core.interfaces.ManifoldCFException
org.apache.manifoldcf.agents.interfaces.ServiceInterruption

public void removeDocument(java.lang.String documentURI,
    java.lang.String outputDescription,
    org.apache.manifoldcf.agents.interfaces.IOutputRemoveActivity activities)
    throws org.apache.manifoldcf.core.interfaces.ManifoldCFException,
           org.apache.manifoldcf.agents.interfaces.ServiceInterruption
Specified by: removeDocument in interface org.apache.manifoldcf.agents.interfaces.IOutputConnector
Overrides: removeDocument in class org.apache.manifoldcf.agents.output.BaseOutputConnector
Parameters:
documentURI - is the URI of the document. The URI is presumed to be the unique identifier which the output data store will use to process and serve the document. This URI is constructed by the repository connector which fetches the document, and is thus universal across all output connectors.
outputDescription - is the last description string that was constructed for this document by the getOutputDescription() method above.
activities - is the handle to an object that the implementer of an output connector may use to perform operations, such as logging processing activity.
Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
org.apache.manifoldcf.agents.interfaces.ServiceInterruption

public void noteJobComplete(org.apache.manifoldcf.agents.interfaces.IOutputNotifyActivity activities)
    throws org.apache.manifoldcf.core.interfaces.ManifoldCFException,
           org.apache.manifoldcf.agents.interfaces.ServiceInterruption
Specified by: noteJobComplete in interface org.apache.manifoldcf.agents.interfaces.IOutputConnector
Overrides: noteJobComplete in class org.apache.manifoldcf.agents.output.BaseOutputConnector
Parameters:
activities - is the handle to an object that the implementer of an output connector may use to perform operations, such as logging processing activity.
Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
org.apache.manifoldcf.agents.interfaces.ServiceInterruption

public void outputConfigurationHeader(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext,
    org.apache.manifoldcf.core.interfaces.IHTTPOutput out,
    java.util.Locale locale,
    org.apache.manifoldcf.core.interfaces.ConfigParams parameters,
    java.util.List<java.lang.String> tabsArray)
    throws org.apache.manifoldcf.core.interfaces.ManifoldCFException,
           java.io.IOException
Specified by: outputConfigurationHeader in interface org.apache.manifoldcf.core.interfaces.IConnector
Overrides: outputConfigurationHeader in class org.apache.manifoldcf.core.connector.BaseConnector
Parameters:
threadContext - is the local thread context.
out - is the output to which any HTML should be sent.
parameters - are the configuration parameters, as they currently exist, for this connection being configured.
tabsArray - is an array of tab names. Add to this array any tab names that are specific to the connector.
Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
java.io.IOException

public void outputConfigurationBody(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext,
    org.apache.manifoldcf.core.interfaces.IHTTPOutput out,
    java.util.Locale locale,
    org.apache.manifoldcf.core.interfaces.ConfigParams parameters,
    java.lang.String tabName)
    throws org.apache.manifoldcf.core.interfaces.ManifoldCFException,
           java.io.IOException
Specified by: outputConfigurationBody in interface org.apache.manifoldcf.core.interfaces.IConnector
Overrides: outputConfigurationBody in class org.apache.manifoldcf.core.connector.BaseConnector
Parameters:
threadContext - is the local thread context.
out - is the output to which any HTML should be sent.
parameters - are the configuration parameters, as they currently exist, for this connection being configured.
tabName - is the current tab name.
Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
java.io.IOException

public java.lang.String processConfigurationPost(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext,
    org.apache.manifoldcf.core.interfaces.IPostParameters variableContext,
    java.util.Locale locale,
    org.apache.manifoldcf.core.interfaces.ConfigParams parameters)
    throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Specified by: processConfigurationPost in interface org.apache.manifoldcf.core.interfaces.IConnector
Overrides: processConfigurationPost in class org.apache.manifoldcf.core.connector.BaseConnector
Parameters:
threadContext - is the local thread context.
variableContext - is the set of variables available from the post, including binary file post information.
parameters - are the configuration parameters, as they currently exist, for this connection being configured.
Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException

public void viewConfiguration(org.apache.manifoldcf.core.interfaces.IThreadContext threadContext,
    org.apache.manifoldcf.core.interfaces.IHTTPOutput out,
    java.util.Locale locale,
    org.apache.manifoldcf.core.interfaces.ConfigParams parameters)
    throws org.apache.manifoldcf.core.interfaces.ManifoldCFException,
           java.io.IOException
Specified by: viewConfiguration in interface org.apache.manifoldcf.core.interfaces.IConnector
Overrides: viewConfiguration in class org.apache.manifoldcf.core.connector.BaseConnector
Parameters:
threadContext - is the local thread context.
out - is the output to which any HTML should be sent.
parameters - are the configuration parameters, as they currently exist, for this connection being configured.
Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
java.io.IOException
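
checkLengthIndexable is documented as pre-determining whether a document's length is indexable, and maxDocumentLength is declared as a boxed java.lang.Long. One plausible reading, sketched below, is that a null limit means "no limit" and that a document exactly at the limit is still accepted. Both points are assumptions, as are the class name `LengthCheckSketch` and the helper `isLengthIndexable`; the page itself does not state the comparison used.

```java
// Hypothetical illustration of a length pre-check against an optional limit.
final class LengthCheckSketch {
    // Assumption: a null limit disables the check; the limit itself is inclusive.
    static boolean isLengthIndexable(long length, Long maxDocumentLength) {
        return maxDocumentLength == null || length <= maxDocumentLength;
    }

    public static void main(String[] args) {
        Long limit = 1_000_000L; // assumed limit in bytes, for illustration only
        System.out.println(isLengthIndexable(999_999L, limit));
        System.out.println(isLengthIndexable(1_000_001L, limit));
        System.out.println(isLengthIndexable(Long.MAX_VALUE, null));
    }
}
```

Running such a check before fetching document content lets a pipeline skip oversized documents without reading their streams.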