I am trying to test spark-submit in standalone mode and am running the example task below:
spark-submit \
--class org.apache.spark.examples.SparkPi \
--master spark://MBP-49F32N-CSP.local:7077 \
--driver-memory 3g \
--executor-memory 3g \
--num-executors 2 \
--executor-cores 2 \
--conf spark.dynamicAllocation.enabled=false \
/opt/homebrew/Cellar/apache-spark/3.5.0/libexec/examples/jars/spark-examples_2.12-3.5.0.jar \
10
I can see logs generated under
/apache-spark/3.5.0/libexec/work
There I can see directories named with the application ID:
app-20230929175322-0003 app-20231003110238-0000
Inside app-20231003110238-0000 there are sub-directories 0 1 2 3 4 5, which are named after the executor IDs, and inside each of those directories I can see stderr and stdout files.
Is there any way to aggregate all executor logs under the application-id directory (e.g. app-20231003110238-0000), similar to how in YARN mode we can see all logs for an application with yarn logs -applicationId <application_ID>?
You can use a shell script to aggregate all the stdout and stderr logs, similar to the sketch below. The work directory path and the default application ID are assumptions taken from your question, so adjust them to your installation.
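#!/usr/bin/env bash
# Aggregate the stdout and stderr of every executor of one application
# into a single file. WORK_DIR and the default APP_ID are assumptions
# taken from the question; adjust them to your installation.

WORK_DIR="/opt/homebrew/Cellar/apache-spark/3.5.0/libexec/work"
APP_ID="${1:-app-20231003110238-0000}"   # application id, optionally passed as the first argument
OUT_FILE="${WORK_DIR}/${APP_ID}/aggregated.log"

: > "${OUT_FILE}"                        # start with an empty output file

# Each executor gets its own numbered sub-directory (0, 1, 2, ...)
for exec_dir in "${WORK_DIR}/${APP_ID}"/*/ ; do
  exec_id=$(basename "${exec_dir}")
  for f in stdout stderr; do
    if [ -f "${exec_dir}${f}" ]; then
      {
        echo "===== executor ${exec_id} : ${f} ====="
        cat "${exec_dir}${f}"
        echo
      } >> "${OUT_FILE}"
    fi
  done
done

echo "Aggregated logs written to ${OUT_FILE}"

Saved, for example, as aggregate_logs.sh and run as ./aggregate_logs.sh app-20231003110238-0000, this produces one combined file per application, roughly analogous to what yarn logs -applicationId prints in YARN mode.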