Why does my PySpark task not stop?


I use PySpark to process my data and write the result to Kafka like this:

(map_df.write
    .format("kafka")
    .option("kafka.bootstrap.servers", "xxxx:xx")
    .option("spark.kafka.producer.cache.timeout", "1m")
    .option("topic", "my-stream")
    .option("kafka.request.timeout.ms", 120000)
    .option("kafka.batch.size", 100)
    .save())
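
For context, here is a minimal, simplified skeleton of the job around that write. The app name, the input path, the map step, and the "value" column construction below are placeholders I wrote for this question, not my exact code:

from pyspark.sql import SparkSession

# Simplified skeleton of the job; names and paths are placeholders.
spark = SparkSession.builder.appName("write-to-kafka").getOrCreate()

# Placeholder for the real processing; the Kafka sink needs a string "value" column.
map_df = (spark.read.parquet("hdfs:///path/to/input")
               .selectExpr("to_json(struct(*)) AS value"))

# The same batch write as shown above (options abbreviated here).
(map_df.write
    .format("kafka")
    .option("kafka.bootstrap.servers", "xxxx:xx")
    .option("topic", "my-stream")
    .save())

# The script ends here; I expect the application to finish on its own,
# but it stays in the RUNNING state until the ApplicationMaster times out.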

But I find the task cannot stop until: "Application application_1697441116106_286188 failed 1 times (global limit =100; local limit is =1) due to ApplicationMaster for attempt appattempt_1697441116106_286188_000001 timed out."

I find that all the data has been sent to Kafka: kafka-console-consumer receives no new data after the stage finishes. Yet the application state stays RUNNING until the timeout.

So what is happening? Can anyone help me?

I also found a possibly useful message in the log: "encounter exception in TSDBSender":

[screenshot of the log showing the exception]

I expect the task to stop after all the data has been sent to Kafka.
