// Run spark-shell with spark-sql-kafka-0-10_2.11 module
spark.readStream
.format("kafka")
.option("subscribe", "topic")
.option("kafka.bootstrap.servers", "localhost:9092")
.load
.writeStream
.format("console")
.start
KafkaSourceProvider
KafkaSourceProvider
is a StreamSourceProvider for KafkaSource.
KafkaSourceProvider
is a DataSourceRegister, too.
The short name of the data source is kafka
.
KafkaSourceProvider
requires the following options (that you can set using option
method):
-
Exactly one option for
subscribe
,subscribepattern
orassign
-
kafka.bootstrap.servers
(corresponds tobootstrap.servers
)
Note
|
KafkaSourceProvider is part of spark-sql-kafka-0-10 Library Dependency.
|
spark-sql-kafka-0-10
Library Dependency
The new structured streaming API for Kafka is part of the spark-sql-kafka-0-10
artifact. Add the following dependency to sbt project to use the streaming integration:
libraryDependencies += "org.apache.spark" %% "spark-sql-kafka-0-10" % "2.0.1"
Tip
|
|
Note
|
Replace 2.0.1 or 2.1.0-SNAPSHOT with available version as found at The Central Repository’s search.
|