This is a Splunk Modular Input Add-On for indexing messages from a Kafka broker or cluster of brokers that are managed by Zookeeper.
Kafka version 0.8.1.1 is used for the consumer and the testing of this Modular Input.
As this is a Modular Input, you can configure your Kafka inputs via Manager->Data Inputs->Kafka. The fields should be straightforward for anyone with basic experience of Kafka and Zookeeper.
Log entries and errors are written to $SPLUNK_HOME/var/log/splunk/splunkd.log
The default heap maximum is 64MB.
If you require a larger heap, you can change it in $SPLUNK_HOME/etc/apps/kafka_ta/bin/kafka.py on line 95.
You can declare custom JVM System Properties when setting up new input stanzas.
Note: these JVM System Properties apply to the entire JVM context, and therefore to every stanza you have set up.
The way in which the Modular Input processes received Kafka messages is entirely pluggable, so you can supply a custom implementation if you wish.
To do this, code an implementation of the com.splunk.modinput.kafka.AbstractMessageHandler class and package it as a jar.
Ensure that the necessary jars are in the $SPLUNK_HOME/etc/apps/kafka_ta/bin/lib directory.
If you don't need a custom handler then the default handler com.splunk.modinput.kafka.DefaultMessageHandler will be used.
This handler simply tries to convert the received byte array into a textual string for indexing in Splunk.
Code examples are on GitHub: https://github.com/damiendallimore/SplunkModularInputsJavaFramework/tree/master/kafka/src/com/splunk/modinput/kafka
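As a rough illustration of that byte-array-to-text step, here is a minimal Python sketch. It is not the Java handler's code: the function name `decode_message`, the UTF-8 default and the choice to replace undecodable bytes rather than fail are all assumptions made for the example.

```python
def decode_message(payload: bytes, charset: str = "utf-8") -> str:
    """Turn a raw message payload into text suitable for indexing.

    Hypothetical helper: the name, the default charset and the error
    policy are assumptions, not taken from the add-on's source.
    """
    # errors="replace" substitutes U+FFFD for bytes that are invalid in
    # the chosen charset, so a malformed message never raises here.
    return payload.decode(charset, errors="replace")


if __name__ == "__main__":
    print(decode_message(b"user=alice action=login"))
    print(decode_message("caf\u00e9".encode("latin-1"), charset="latin-1"))
```

Replacing bad bytes keeps the event flowing into the index at the cost of some fidelity; a stricter handler could reject the message instead.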
This project was initiated by Damien Dallimore, firstname.lastname@example.org
Added a new custom handler: com.splunk.modinput.kafka.CSVWithHeaderDecoderHandler
This allows you to convert CSV data (with or without a header) into KV or JSON before indexing.
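The CSV-to-KV/JSON transformation can be sketched in Python as follows. This only illustrates the idea and is not the handler's implementation or its configuration: the function names, the `header` parameter and the exact KV quoting style are assumptions.

```python
import csv
import io
import json


def csv_to_records(text, header=None):
    """Parse CSV text into a list of dicts.

    If `header` is None, the first non-empty row is treated as the header;
    otherwise the supplied column names are used (hypothetical parameter).
    """
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    if header is None:
        if not rows:
            return []
        header, rows = rows[0], rows[1:]
    return [dict(zip(header, row)) for row in rows]


def to_kv(record):
    """Render one record as key="value" pairs (assumed KV style)."""
    return " ".join(f'{key}="{value}"' for key, value in record.items())


def to_json(record):
    """Render one record as a JSON object."""
    return json.dumps(record)


if __name__ == "__main__":
    for rec in csv_to_records("host,status\nweb01,200\nweb02,503\n"):
        print(to_kv(rec))
        print(to_json(rec))
```

For data without a header row, the column names would be passed in explicitly, for example `csv_to_records("web01,200\n", header=["host", "status"])`.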
Example config you could pass to the custom message handler when you declare it
Better JSON handling for HEC output (hat tip to Tivo)
Better logging around HEC success/failure
Can now add a custom timestamp to the HEC payload
New custom handler (JSONBodyWithTimeExtraction) for extracting the timestamp from JSON messages received from Kafka and adding it to the HEC payload
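The timestamp extraction described above can be sketched in Python. The HEC event format accepts a top-level `time` field in epoch seconds alongside `event`; everything else here is an assumption for illustration, including the function name, the `timestamp` field name and the ISO 8601 input format. It is not the JSONBodyWithTimeExtraction handler's code.

```python
import json
from datetime import datetime


def build_hec_payload(raw_message, time_field="timestamp"):
    """Wrap a JSON Kafka message in an HEC event envelope.

    `time_field` is a hypothetical name for the message field that holds
    an ISO 8601 timestamp with an explicit UTC offset.
    """
    body = json.loads(raw_message)
    payload = {"event": body}
    if isinstance(body, dict) and body.get(time_field) is not None:
        # HEC expects "time" as seconds since the Unix epoch.
        parsed = datetime.fromisoformat(body[time_field])
        payload["time"] = parsed.timestamp()
    return json.dumps(payload)


if __name__ == "__main__":
    msg = '{"timestamp": "2015-06-01T12:00:00+00:00", "msg": "disk full"}'
    print(build_hec_payload(msg))
```

When the field is absent the sketch omits `time`, leaving the timestamp to be assigned at indexing time.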
Added support for optional output to Splunk via an HEC (HTTP Event Collector) endpoint
Added support for raw connection string format so that multiple Zookeeper hosts can be provided in a comma-delimited manner
Added chroot support for zookeeper connection strings
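For reference, a Zookeeper connection string lists comma-separated `host:port` pairs, optionally followed by a chroot path, for example `zk1:2181,zk2:2181,zk3:2181/kafka`. The Python sketch below splits such a string into its parts. The function name is hypothetical, the host names are placeholders, and the sketch does not handle IPv6 literals.

```python
def parse_zk_connect(connect):
    """Split a Zookeeper connection string into (hosts, chroot).

    Returns a list of (host, port) tuples and the chroot path, or None
    when no chroot is present. Hypothetical helper for illustration.
    """
    hosts_part, slash, chroot = connect.partition("/")
    hosts = []
    for entry in hosts_part.split(","):
        entry = entry.strip()
        if not entry:
            continue
        host, _, port = entry.partition(":")
        # 2181 is Zookeeper's default client port.
        hosts.append((host, int(port) if port else 2181))
    return hosts, ("/" + chroot if slash else None)


if __name__ == "__main__":
    print(parse_zk_connect("zk1:2181,zk2:2181,zk3:2181/kafka"))
```

The chroot applies to the whole ensemble, which is why it appears once at the end of the string rather than after each host.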
Enabled TLS1.2 support by default.
Made the core Modular Input Framework compatible with latest Splunk Java SDK
Please use a Java Runtime version 7+
If you need to use SSLv3, you can enable it in bin/kafka.py by setting SECURE_TRANSPORT to "ssl" instead of the default "tls":
SECURE_TRANSPORT = "tls"
#SECURE_TRANSPORT = "ssl"
You can now pass a charset name to the default handler (DefaultMessageHandler)
Initial beta release