

Apache Spark officially includes Kubernetes support, and thereby you can run a Spark job on your own Kubernetes cluster. (Note that the Kubernetes scheduler is currently experimental.) Especially in Microsoft Azure, you can easily run Spark on cloud-managed Kubernetes, Azure Kubernetes Service (AKS). In this post, I'll show you a step-by-step tutorial for running Apache Spark on AKS.

The BDE Spark images can also be used in a Kubernetes environment. To deploy a simple Spark standalone cluster, apply the cluster template shipped with the big-data-europe/docker-spark repository using kubectl; check the template's README for further documentation. This will set up a Spark standalone cluster with one master and a worker on every available node, using the default namespace and resources.

The master is reachable in the same namespace at spark://spark-master:7077. The template also sets up a headless service so that Spark clients are reachable from the workers under the hostname spark-client.

To use spark-shell, issue:

```
kubectl run spark-base --rm -it --labels="app=spark-client" \
  --image bde2020/spark-base:3.2.0-hadoop3.2 \
  -- bash ./spark/bin/spark-shell --master spark://spark-master:7077 \
  --conf spark.driver.host=spark-client
```

To use spark-submit, issue:

```
kubectl run spark-base --rm -it --labels="app=spark-client" \
  --image bde2020/spark-base:3.2.0-hadoop3.2 \
  -- bash ./spark/bin/spark-submit --class CLASS_TO_RUN \
  --master spark://spark-master:7077 --deploy-mode client \
  --conf spark.driver.host=spark-client URL_TO_YOUR_APP
```

You can use your own image packed with Spark and your application, but when deployed, the driver must be reachable from the workers. One way to achieve this is to create a headless service for your pod and then pass --conf spark.driver.host=YOUR_HEADLESS_SERVICE whenever you submit your application.
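To make that last step concrete, here is a minimal sketch of such a headless service. The name my-spark-driver and the port are placeholders of mine, not something defined by the template:

```yaml
# Sketch only: a headless service that lets the workers resolve your driver pod.
# "my-spark-driver" is a hypothetical name; match it to your own pod's label.
apiVersion: v1
kind: Service
metadata:
  name: my-spark-driver
spec:
  clusterIP: None           # headless: DNS resolves directly to the pod IP
  selector:
    app: my-spark-driver    # must match the label on your driver pod
  ports:
    - name: driver-rpc
      port: 35000           # hypothetical; pin it with --conf spark.driver.port=35000
```

With this in place, run your driver pod with the label app=my-spark-driver and submit with --conf spark.driver.host=my-spark-driver (optionally pinning spark.driver.port to the port above). This mirrors what the cluster template does for the spark-client hostname, which is presumably why the kubectl run commands above attach the label app=spark-client: the template's headless service selects on that label so the workers can call back to the driver pod.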

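Since Spark's built-in Kubernetes support was mentioned at the start, here is for comparison what a native submission looks like, where Spark talks to the Kubernetes API directly instead of to a standalone master, following the pattern in the official Spark documentation. The API server address, image name, and jar path are placeholders:

```
# Native Spark-on-Kubernetes submission: the Kubernetes scheduler itself
# launches the driver and executor pods (this is the mode the experimental
# note above refers to).
./bin/spark-submit \
  --master k8s://https://<k8s-apiserver-host>:<k8s-apiserver-port> \
  --deploy-mode cluster \
  --name spark-pi \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.executor.instances=2 \
  --conf spark.kubernetes.container.image=<your-spark-image> \
  local:///path/to/examples.jar
```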

