Enterprise Kafka and Spark : Kerberos based Integration
In previous posts we have seen how to integrate Kafka with Spark streaming. However in a typical enterprise environment the connection between Kafka and spark should be secure. If you are using cloudera or hortonworks distribution of Hadoop/Spark, then most likely it will be Kerberos enabled connection. In this post we will see what are the configurations we have to make to enable kerberized kafka spark integration. Before getting into details of integration, i wanted to make it clear that I assume you are using cloudera/hortonworks/MapR provided kafka and spark kerberos enablled installation. we will see that as a developer what additional things has to be performed by you. For this we have to first select security protocol provided by kafka. if you see kafka documentation, security.protocol property is used for this. we can select following values 1. PLAINTEXT 2. SASL_PLAINTEXT 3. SASL_SSL We will select SASL_PLAINTEXT for kerberos. This property has to be set ...