Spark cluster from Open data studio Zeppelin


Open data studio Apache Zeppelin integrates Spark 3.x out of the box. No extra installation or initialization steps are required.

https://user-images.githubusercontent.com/1540981/80290438-cf3bc180-86f9-11ea-8c1f-d2dedcd48a86.png

Launch and use the Spark interpreter. A Spark cluster will be created automatically.

Configure Spark executors:
%spark.conf
spark.executor.instances 3

Run the Spark API:
%spark
// 'sc' and 'spark' are automatically created
spark.read.json(...)
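
As a rough illustration of the two steps above, here is a minimal sketch of a %spark paragraph body that reads JSON and then checks the executor setting. The sample records and the names session, jsonLines, people and olderThan26 are hypothetical and not part of this project. It assumes a Spark session is already available, as it is inside Zeppelin; outside Zeppelin a master would have to be configured first.

```scala
import org.apache.spark.sql.SparkSession

// Inside a Zeppelin %spark paragraph, getOrCreate() returns the session
// that Zeppelin already created (the same object as 'spark').
val session = SparkSession.builder().getOrCreate()
import session.implicits._

// Hypothetical sample records, one JSON object per string.
val jsonLines = Seq(
  """{"name": "alice", "age": 30}""",
  """{"name": "bob", "age": 25}"""
).toDS()

// read.json also accepts a Dataset[String], so no file path is needed here.
val people = session.read.json(jsonLines)
people.printSchema()

// Keep only the rows with age above 26.
val olderThan26 = people.filter($"age" > 26)
olderThan26.show()

// None means the property was never set for this application.
val executorInstances =
  session.sparkContext.getConf.getOption("spark.executor.instances")
println(s"spark.executor.instances = $executorInstances")
```

If the %spark.conf paragraph shown earlier ran before the interpreter started, the last line would be expected to report the value 3; otherwise it reports None.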

See the Apache Zeppelin documentation for more details.