Tokenize
Since Camel 2.0
The tokenizer language is a built-in language in camel-core, which is most often used with the Split EIP to split a message using a token-based strategy.
The tokenizer language tokenizes text documents using a specified delimiter pattern. It can also tokenize XML documents, with limited capability. For truly XML-aware tokenization, the XML Tokenize language is recommended, as it offers faster, more efficient tokenization specifically for XML documents.
Tokenize Options
The Tokenize language supports 11 options, which are listed below.
Name | Default | Java Type | Description |
---|---|---|---|
token | | | Required. The (start) token to use as tokenizer, for example the new line token. You can use simple language as the token to support dynamic tokens. |
endToken | | | The end token to use as tokenizer if using start/end token pairs. You can use simple language as the token to support dynamic tokens. |
inheritNamespaceTagName | | | To inherit namespaces from a root/parent tag name when using XML. You can use simple language as the tag name to support dynamic names. |
headerName | | | Name of header to tokenize instead of using the message body. |
regex | | | If the token is a regular expression pattern. The default value is false. |
xml | | | Whether the input is XML messages. This option must be set to true if working with XML payloads. |
includeTokens | | | Whether to include the tokens in the parts when using pairs. The default value is false. |
group | | | To group N parts together, for example to split big files into chunks of 1000 lines. You can use simple language as the group to support dynamic group sizes. |
groupDelimiter | | | Sets the delimiter to use when grouping. If this has not been set then token will be used as the delimiter. |
skipFirst | | | To skip the very first element. |
trim | | | Whether to trim the value to remove leading and trailing whitespaces and line breaks. |
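To make the interplay of some of these options more concrete, the following plain-Java sketch models how token, skipFirst, group, and groupDelimiter could combine, and how endToken and includeTokens behave for start/end token pairs. It is an illustrative model based only on the option descriptions above, not Camel's implementation; the class `TokenizeSketch` and its methods are hypothetical names that do not exist in Camel.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Illustrative model only (hypothetical class, not part of Camel).
final class TokenizeSketch {

    // Splits text on a literal token, optionally skips the first part, then
    // joins every `group` consecutive parts with groupDelimiter (falling back
    // to the token itself when groupDelimiter is null).
    static List<String> tokenize(String text, String token, int group,
                                 String groupDelimiter, boolean skipFirst) {
        // Pattern.quote treats the token literally (the regex=false case).
        List<String> parts = Arrays.asList(text.split(Pattern.quote(token)));
        String delimiter = groupDelimiter != null ? groupDelimiter : token;
        int size = Math.max(group, 1);

        List<String> result = new ArrayList<>();
        for (int i = skipFirst ? 1 : 0; i < parts.size(); i += size) {
            int end = Math.min(i + size, parts.size());
            result.add(String.join(delimiter, parts.subList(i, end)));
        }
        return result;
    }

    // Extracts the text between each start/end token pair; includeTokens
    // decides whether the tokens themselves stay in each part.
    static List<String> tokenizePair(String text, String startToken,
                                     String endToken, boolean includeTokens) {
        if (startToken.isEmpty() || endToken.isEmpty()) {
            throw new IllegalArgumentException("tokens must not be empty");
        }
        List<String> result = new ArrayList<>();
        int from = 0;
        while (true) {
            int start = text.indexOf(startToken, from);
            if (start < 0) break;
            int end = text.indexOf(endToken, start + startToken.length());
            if (end < 0) break;
            result.add(includeTokens
                    ? text.substring(start, end + endToken.length())
                    : text.substring(start + startToken.length(), end));
            from = end + endToken.length();
        }
        return result;
    }

    public static void main(String[] args) {
        String body = "id,name\n1,Ada\n2,Linus\n3,Grace";
        // Skip the header line, then emit chunks of two lines each.
        System.out.println(tokenize(body, "\n", 2, null, true));
        // Pull out the content of each <item> element, without the tags.
        System.out.println(tokenizePair("<item>a</item><item>b</item>",
                "<item>", "</item>", false));
    }
}
```

In this model, skipping the first part is a simple way to drop a header line before grouping, and the pair variant works on plain text only, with no awareness of nesting or namespaces.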
Example
The following example takes a request from the direct:a endpoint, splits it into pieces on the new line token using a tokenize expression, and forwards each piece to direct:b:
```xml
<route>
  <from uri="direct:a"/>
  <split>
    <tokenize token="\n"/>
    <to uri="direct:b"/>
  </split>
</route>
```
And in Java DSL:
```java
from("direct:a")
    .split(body().tokenize("\n"))
    .to("direct:b");
```
For more examples see Split EIP.