camel-tika-kafka-connector sink configuration
Connector Description: Parse documents and extract metadata and text using Apache Tika.
When using camel-tika-kafka-connector as sink make sure to use the following Maven dependency to have support for the connector:
<dependency>
<groupId>org.apache.camel.kafkaconnector</groupId>
<artifactId>camel-tika-kafka-connector</artifactId>
<version>x.x.x</version>
<!-- use the same version as your Camel Kafka connector version -->
</dependency>
To use this Sink connector in Kafka connect you’ll need to set the following connector.class
connector.class=org.apache.camel.kafkaconnector.tika.CamelTikaSinkConnector
The camel-tika sink connector supports 8 options, which are listed below.
Name | Description | Default | Required | Priority |
---|---|---|---|---|
camel.sink.path.operation |
Operation type One of: [parse] [detect] |
null |
true |
HIGH |
camel.sink.endpoint.lazyStartProducer |
Whether the producer should be started lazy (on the first message). By starting lazy you can use this to allow CamelContext and routes to startup in situations where a producer may otherwise fail during starting and cause the route to fail being started. By deferring this startup to be lazy then the startup failure can be handled during routing messages via Camel’s routing error handlers. Beware that when the first message is processed then creating and starting the producer may take a little time and prolong the total processing time of the processing. |
false |
false |
MEDIUM |
camel.sink.endpoint.tikaConfig |
Tika Config |
null |
false |
MEDIUM |
camel.sink.endpoint.tikaConfigUri |
Tika Config Url |
null |
false |
MEDIUM |
camel.sink.endpoint.tikaParseOutputEncoding |
Tika Parse Output Encoding |
null |
false |
MEDIUM |
camel.sink.endpoint.tikaParseOutputFormat |
Tika Output Format. Supported output formats. xml: Returns Parsed Content as XML. html: Returns Parsed Content as HTML. text: Returns Parsed Content as Text. textMain: Uses the boilerpipe library to automatically extract the main content from a web page. One of: [xml] [html] [text] [textMain] |
"xml" |
false |
MEDIUM |
camel.component.tika.lazyStartProducer |
Whether the producer should be started lazy (on the first message). By starting lazy you can use this to allow CamelContext and routes to startup in situations where a producer may otherwise fail during starting and cause the route to fail being started. By deferring this startup to be lazy then the startup failure can be handled during routing messages via Camel’s routing error handlers. Beware that when the first message is processed then creating and starting the producer may take a little time and prolong the total processing time of the processing. |
false |
false |
MEDIUM |
camel.component.tika.autowiredEnabled |
Whether autowiring is enabled. This is used for automatic autowiring options (the option must be marked as autowired) by looking up in the registry to find if there is a single instance of matching type, which then gets configured on the component. This can be used for automatic configuring JDBC data sources, JMS connection factories, AWS Clients, etc. |
true |
false |
MEDIUM |
The camel-tika sink connector has no converters out of the box.
The camel-tika sink connector has no transforms out of the box.
The camel-tika sink connector has no aggregation strategies out of the box.