Python dependencies in the Kubeflow Spark operator


Is there a way to use Python code packaged as a wheel (.whl), an .egg, or a plain .py file as a dependency in the Kubeflow Spark operator?

The manifest I have in mind would look something like this. The dependency would go under either jars or files; I presume files makes more sense:

apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: spark-pi-python
  namespace: default
spec:
  type: Python
  pythonVersion: "3"
  mode: cluster
  image: spark:3.5.3
  imagePullPolicy: IfNotPresent
  mainApplicationFile: local:///path/to/my/python/script.py
  deps:
    jars:
      - local:///path/to/python/functions.py
    files:
      - gs://path/to/python/functions.py
  sparkVersion: 3.5.3
  driver:
    cores: 1
    memory: 512m
    serviceAccount: spark-operator-spark
  executor:
    instances: 1
    cores: 1
    memory: 512m


Solution

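  • Python files can be used as dependencies by listing them under deps.pyFiles (see link). In the manifest below the files live on a PersistentVolumeClaim mounted at /mnt/spark on both the driver and the executor, so the local:// paths resolve inside every pod. This worked for me: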

    apiVersion: sparkoperator.k8s.io/v1beta2
    kind: SparkApplication
    metadata:
      name: view-creator-test
      namespace: default
    spec:
      type: Python
      pythonVersion: "3"
      mode: cluster
      image: spark:3.5.3
      imagePullPolicy: IfNotPresent
      mainApplicationFile: local:///path/to/my/python/script.py
      arguments: []
      sparkVersion: 3.5.3
      deps:
        pyFiles:
          - local:///mnt/spark/dependency_1.py
          - local:///mnt/spark/dependency_2.py
      driver:
        labels:
          version: 3.5.3
        cores: 1
        memory: 512m
        volumeMounts:
          - name: view-creator-volume
            mountPath: /mnt/spark
      executor:
        labels:
          version: 3.5.3
        instances: 1
        cores: 1
        memory: 512m
        volumeMounts:
          - name: view-creator-volume
            mountPath: /mnt/spark
      volumes:
        - name: view-creator-volume
          persistentVolumeClaim:
            claimName: view-creator-pvc
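
As a complement to the manifest above: spark-submit's --py-files option (which, as far as I understand, is what deps.pyFiles is translated to) accepts .zip and .egg archives as well as single .py files, so several modules can be shipped as one archive. Below is a minimal local sketch, assuming two made-up modules named after the entries above. It builds deps.zip and emulates Spark by putting the archive on sys.path, so it only shows that the archive is importable, not an actual cluster run.

```python
import importlib
import sys
import tempfile
import zipfile
from pathlib import Path


def build_deps_zip(module_sources: dict[str, str], zip_path: Path) -> Path:
    """Write each {filename: source} pair into a zip archive at zip_path."""
    with zipfile.ZipFile(zip_path, "w") as archive:
        for filename, source in module_sources.items():
            archive.writestr(filename, source)
    return zip_path


# Hypothetical module contents, standing in for dependency_1.py and dependency_2.py.
sources = {
    "dependency_1.py": "def double(x):\n    return 2 * x\n",
    "dependency_2.py": "def greet(name):\n    return 'hello ' + name\n",
}

workdir = Path(tempfile.mkdtemp())
deps_zip = build_deps_zip(sources, workdir / "deps.zip")

# Assumption: Spark places --py-files entries on the Python path.
# Emulate that locally by adding the archive to sys.path.
sys.path.insert(0, str(deps_zip))

dependency_1 = importlib.import_module("dependency_1")
dependency_2 = importlib.import_module("dependency_2")

print(dependency_1.double(21))
print(dependency_2.greet("spark"))
```

The resulting archive could then be listed under pyFiles in place of the individual files, for example as local:///mnt/spark/deps.zip (a hypothetical path on the mounted volume).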