Spark Submit from SSH has different behavior

I have a single-node YARN cluster set up in an Ubuntu VM.

When I run spark-submit directly on the VM, everything works fine. When I launch the same command from another VM over SSH, the job fails because it uses the wrong IP address for the ResourceManager.

The command I run on the YARN VM:

/home/namenode/spark/bin/spark-submit --master yarn --class Main --deploy-mode cluster /home/namenode/jars/data-transformation-service_2.11-0.1.0-SNAPSHOT.jar

The result :

Connecting to ResourceManager at /192.168.1.110:8032

The job then completes successfully.

The command I run from another VM over SSH:

ssh [email protected] '/home/namenode/spark/bin/spark-submit --master yarn --class Main --deploy-mode cluster /home/namenode/jars/data-transformation-service_2.11-0.1.0-SNAPSHOT.jar'

The result :

22/10/26 15:16:12 INFO DefaultNoHARMFailoverProxyProvider: Connecting to ResourceManager at /0.0.0.0:8032
22/10/26 15:16:13 INFO Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

And it keeps retrying, over and over...

Do you have any idea how to fix this? Thank you.

There is 1 answer

Answered by Yohan

I finally solved my issue by using this yarn-site.xml:

<?xml version="1.0"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<configuration>

<!-- Site specific YARN configuration properties -->
 <property>
       <name>yarn.nodemanager.aux-services</name>
       <value>mapreduce_shuffle</value>
 </property>
 <property>
       <name>yarn.nodemanager.env-whitelist</name>
       <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
 </property>
 <property>
       <name>yarn.resourcemanager.address</name>
       <value>192.168.1.110:8032</value>
 </property>
 <property>
       <name>yarn.resourcemanager.hostname</name>
       <value>192.168.1.110</value>
 </property>
</configuration>
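
For context, 0.0.0.0:8032 is the address the YARN client falls back to when the configuration it loads sets neither yarn.resourcemanager.address nor yarn.resourcemanager.hostname, so the log from the SSH run suggests the client was not picking up those values. A quick way to check what a given yarn-site.xml actually defines is a small shell helper like the one below. This is only a sketch: get_yarn_property is a made-up name, and it assumes each <name> and <value> element sits on its own line, as in the file above.

```shell
# Hypothetical helper: print the <value> of one property from a
# Hadoop-style XML config file.
# Assumes <name> and <value> are each on their own line.
get_yarn_property() {
    conf_file="$1"   # path to yarn-site.xml
    prop_name="$2"   # e.g. yarn.resourcemanager.address
    awk -v name="$prop_name" '
        # Remember that we have seen the requested <name> element.
        index($0, "<name>" name "</name>") { found = 1; next }
        # The next <value> line belongs to that property: strip the tags.
        found && /<value>/ {
            sub(/.*<value>/, "")
            sub(/<\/value>.*/, "")
            print
            exit
        }
    ' "$conf_file"
}
```

Assuming HADOOP_CONF_DIR points at the directory holding the file above, `get_yarn_property "$HADOOP_CONF_DIR/yarn-site.xml" yarn.resourcemanager.address` should print 192.168.1.110:8032. Running `echo "$HADOOP_CONF_DIR"` both in a normal login and inside the quoted ssh command can also show whether the non-interactive session sees the same environment; whether that difference was the cause here is not confirmed in this thread.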