How to use a timeout function to return a partial dataframe as a response?


I have a big problem!

My process works like this: I query a database in BQ and get several IDs as results. Then, with the table ready, I iterate over the IDs and make one request per ID. But sometimes the code gets stuck and the request never reaches the server. This is bad, because it can happen on ID number 700, for example, when I already have the information for the other 699 IDs. When it crashes, I lose all of that information. I don't want that to happen.

Reading some posts here, I found a timeout function, but I can't get it to return the dataframe built from the requests that had already completed.

Here's the code:

1 - API script


import pandas as pd


class api_me_cargas():

    def clean_CargasPendentes(self, url, token, data_frame):
        df_ids = data_frame.copy()
        df_results = pd.DataFrame()  # accumulates the response for each ID

        for i, r in df_ids.iterrows():
            id = r['ID']
            # request() is defined elsewhere (not shown here)
            df_request_example = request(url, token, id=id)
            df_results = pd.concat([df_request_example, df_results])

        return df_results

2 - Timeout


import signal
from contextlib import contextmanager


class TimeoutException(Exception):
    pass


@contextmanager
def time_limit(seconds):
    def signal_handler(signum, frame):
        raise TimeoutException("Timed out!")
    signal.signal(signal.SIGALRM, signal_handler)
    signal.alarm(seconds)
    try:
        yield
    finally:
        signal.alarm(0)        

def timeout_signal(func):
    def wrapper(*args, **kwargs):
        successful_responses = []
        try:
            with time_limit(5):
                while True:
                    try:
                        resp = func(*args, **kwargs)
                        successful_responses.append(resp)
                    except TimeoutException:
                        print("Timed out!")
                        return successful_responses
        except TimeoutException as e:
            print("Timed out!")
            return successful_responses

    return wrapper 

3 - Data Flow

class run_get_cargas():

    def __init__(self):

        # Variables
        self.url = 'url_teste_teste'
        self.token = 'xxxxxxx'

        self.df_api = pd.DataFrame()
        self.df_etl = pd.DataFrame()
        self.main_functions = api_me_cargas()

    @timeout_signal
    def clear_protocolo(self):
        # self.df_bq holds the IDs from the BQ query (assignment not shown here)
        self.df_t = self.main_functions.clean_CargasPendentes(self.url, self.token, self.df_bq)

        return self.df_t

    def run_api(self):

        self.df_clean_CargasPendentes = self.clear_protocolo()

Summary: If any request for an ID in "clean_CargasPendentes", called from the "clear_protocolo" function, exceeds the 5-second limit, I would like to return the dataframe that was built up to that moment. For example, if I send 1000 requests and the first 967 are fine but request 968 exceeds the limit, I would like to get back the dataframe with the 967 completed requests.
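The behaviour described here could be sketched roughly as follows: the time limit is applied to each request separately, and the list of results lives inside the loop, so it survives a timeout. This is only a minimal sketch under assumptions, not a drop-in fix. `fetch_with_limit` and `fake_request` are hypothetical names, `fake_request` stands in for the real request helper, and `signal.setitimer` is used instead of `signal.alarm` only so that the demo can use a sub-second limit. Like the original, it relies on `SIGALRM`, so it works only on Unix and only in the main thread.

```python
import signal
import time
from contextlib import contextmanager


class TimeoutException(Exception):
    """Raised when a guarded block runs longer than allowed."""


@contextmanager
def time_limit(seconds):
    # Unix-only, main thread only: SIGALRM is not available on Windows.
    def signal_handler(signum, frame):
        raise TimeoutException("Timed out!")

    signal.signal(signal.SIGALRM, signal_handler)
    signal.setitimer(signal.ITIMER_REAL, seconds)  # accepts fractions of a second
    try:
        yield
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)  # cancel the pending timer


def fetch_with_limit(ids, fetch_one, seconds=5):
    """Return (results, timed_out_id).

    `results` holds every reply received in time; `timed_out_id` is the ID
    whose request exceeded the limit, or None if all requests finished.
    """
    results = []
    for id_ in ids:
        try:
            with time_limit(seconds):  # a fresh limit for every request
                results.append(fetch_one(id_))
        except TimeoutException:
            return results, id_
    return results, None


# Hypothetical stand-in for the real request helper: ID 968 hangs.
def fake_request(id_):
    if id_ == 968:
        time.sleep(2)
    return {"ID": id_}


rows, stuck_id = fetch_with_limit(range(960, 1000), fake_request, seconds=0.2)
print(len(rows), stuck_id)
```

If each call returns a dataframe rather than a dict, the collected pieces can be combined afterwards with `pd.concat(rows)`. Note that `pd.concat` raises a `ValueError` on an empty list, so that case needs a guard.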

My result is empty; it doesn't give me the dataframe I need.
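One possible reason for the empty result, judging from the code as posted: `successful_responses` is only appended to after the wrapped function returns. If the alarm fires while `clean_CargasPendentes` is still looping, the list is still empty and the dataframe being built inside the function is lost. A way around this is to keep the partial results where the loop can reach them and limit each request on its own. The sketch below does that with a worker thread instead of `SIGALRM`, which should also work on Windows and outside the main thread. `collect_until_timeout` and `fake_fetch` are hypothetical names, and `fake_fetch` is only a stand-in for the real request helper.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def collect_until_timeout(ids, fetch, per_request_timeout=5.0):
    """Call fetch(id) for each ID and stop at the first call that is too slow.

    Returns (results, failed_id). `results` holds everything fetched before
    the slow call; `failed_id` is None when every call finished in time.
    """
    results = []
    pool = ThreadPoolExecutor(max_workers=1)
    try:
        for id_ in ids:
            future = pool.submit(fetch, id_)
            try:
                results.append(future.result(timeout=per_request_timeout))
            except FutureTimeout:
                return results, id_
        return results, None
    finally:
        # Do not block on a call that may still be hanging.
        pool.shutdown(wait=False)


# Hypothetical stand-in for the real request helper: ID 3 is slow.
def fake_fetch(id_):
    if id_ == 3:
        time.sleep(1.0)
    return {"ID": id_, "status": "ok"}


partial, failed_id = collect_until_timeout(
    [1, 2, 3, 4], fake_fetch, per_request_timeout=0.2
)
print(len(partial), failed_id)
```

One caveat with this design: the timed-out call is abandoned, not killed, so its thread keeps running in the background until the call returns. If the HTTP client used inside the request helper accepts its own timeout argument, setting it there is usually the cleaner option.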
