How do I use custom Spark configuration files with my cluster?

Create custom versions of standard Spark configuration files such as spark-defaults.conf or spark-env.sh, place them together in a directory, and then create a configmap from that directory:

ls spark_config_dir
log4j.properties  metrics.properties  spark-defaults.conf  spark-env.sh
oc create configmap mysparkconfig --from-file=spark_config_dir

Those files will ultimately be written to the Spark configuration directory of cluster nodes, so their names must match valid Spark configuration file names.
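
For example, a spark-defaults.conf in that directory might set common Spark properties (the property values here are only illustrative):

cat spark_config_dir/spark-defaults.conf
spark.executor.memory    1g
spark.serializer         org.apache.spark.serializer.KryoSerializer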

To use the Spark configuration on the master and/or worker nodes of a cluster created with the oshinko CLI, specify the configmap with the --masterconfig and --workerconfig flags:

oshinko create mycluster --workerconfig=mysparkconfig --masterconfig=mysparkconfig
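
The two flags are independent, so the master and workers can also be given different configurations. For example, assuming two configmaps named mymasterconfig and myworkerconfig have been created as above:

oshinko create mycluster --workerconfig=myworkerconfig --masterconfig=mymasterconfig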

To group the master and worker Spark configurations together with other cluster settings, such as the number of workers, create a cluster configuration configmap that references the master and/or worker configmaps, then use it to create a cluster:

oc create configmap clusterconfig --from-literal=sparkworkerconfig=mysparkconfig \
                                  --from-literal=sparkmasterconfig=mysparkconfig \
                                  --from-literal=workercount=4
oshinko create mycluster --storedconfig=clusterconfig
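
The stored configuration can be checked with a standard oc command, for example:

oc describe configmap clusterconfig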

Note that the Advanced tab for cluster creation in the oshinko web UI also allows configmap names to be entered for the worker, master, or cluster configuration.

To use a cluster configuration on a Spark cluster created from an S2I workflow, specify the configmap using the OSHINKO_NAMED_CONFIG parameter when creating the app:

oc new-app --template=oshinko-python-spark-build-dc -p OSHINKO_NAMED_CONFIG=clusterconfig ...
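
The full list of parameters the template accepts can be shown with oc process (this assumes the template is available in the current project):

oc process --parameters oshinko-python-spark-build-dc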

To use a Spark configuration on the driver pod created by an S2I workflow, specify a configmap using the OSHINKO_SPARK_DRIVER_CONFIG parameter when creating the app:

oc new-app --template=oshinko-python-spark-build-dc -p OSHINKO_SPARK_DRIVER_CONFIG=mysparkconfig ...
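
The two parameters are independent, so a cluster configuration and a driver configuration can be combined in one invocation. For example, reusing the configmaps from above:

oc new-app --template=oshinko-python-spark-build-dc \
    -p OSHINKO_NAMED_CONFIG=clusterconfig \
    -p OSHINKO_SPARK_DRIVER_CONFIG=mysparkconfig ...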