Read Elasticsearch into a Spark DataFrame


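A sample PySpark read that loads an Elasticsearch index pattern into a Spark DataFrame through the elasticsearch-hadoop connector, over TLS with HTTP basic authentication. The variables username, password, and query must be defined before this statement runs.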

>>> df = spark.read \
...     .option("es.nodes", "10.10.10.10:9200") \
...     .option("es.net.ssl", "true") \
...     .option("es.net.ssl.cert.allow.self.signed", "false") \
...     .option("es.net.http.auth.user", username) \
...     .option("es.net.http.auth.pass", password) \
...     .option("es.net.ssl.truststore.location", "file:///home/hdfs/truststore.jks") \
...     .option("es.net.ssl.truststore.pass", "123456") \
...     .option("es.net.ssl.keystore.location", "file:///home/hdfs/truststore.jks") \
...     .option("es.net.ssl.keystore.pass", "123456") \
...     .option("es.net.ssl.protocol", "TLS") \
...     .option("es.resource", "index-2016*") \
...     .option("es.query", query) \
...     .option("es.input.json", "yes") \
...     .option("es.read.field.exclude", "transaction,tags,all_status,location") \
...     .format("org.elasticsearch.spark.sql") \
...     .load()
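
The sample above leaves username, password, and query undefined. The following is a minimal sketch of one way to supply them, not a definitive setup. The environment variable names ES_USER and ES_PASS, the match filter on a field called status, and the helper read_es are all hypothetical and chosen only for illustration. The es.query value is written as a query DSL JSON string, which is one of the forms the connector accepts.

```python
import json
import os

# Hypothetical credential source: these environment variable names are assumptions.
username = os.environ.get("ES_USER", "")
password = os.environ.get("ES_PASS", "")

# Hypothetical filter: the field name "status" and the value "ok" are placeholders.
query = json.dumps({"query": {"match": {"status": "ok"}}})

# A subset of the options from the sample above, collected in one dict.
es_options = {
    "es.nodes": "10.10.10.10:9200",
    "es.net.ssl": "true",
    "es.net.http.auth.user": username,
    "es.net.http.auth.pass": password,
    "es.resource": "index-2016*",
    "es.query": query,
    "es.read.field.exclude": "transaction,tags,all_status,location",
}


def read_es(spark, options):
    """Hypothetical helper: apply all options at once and load the DataFrame."""
    return (
        spark.read
        .format("org.elasticsearch.spark.sql")
        .options(**options)
        .load()
    )
```

With an active SparkSession named spark and the connector jar on the classpath, df = read_es(spark, es_options) would be equivalent to chaining the same .option() calls one by one. The truststore and keystore options from the sample can be added to the dict in the same way.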
