I am using WSL Ubuntu with wandb (Weights and Biases) and MLflow. When my pipeline tries to connect to wandb, run initialization fails after 90 seconds. I have already tried several things: updating wandb to the latest version, logging in to wandb again, and setting longer timeouts just before running the script with "export WANDB_INIT_TIMEOUT=120" and "export WANDB_HTTP_TIMEOUT=60". In my main.py, I have also added the argument "entity=[user name as registered]" to wandb.init(). Please let me know how to solve this, thank you.
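For reference, the two timeout settings above could also be applied from Python instead of the shell. This is only a minimal sketch: the helper name configure_wandb_timeouts is hypothetical, and only the two variable names and their values come from my setup. The variables have to be set before wandb.init() is called, and child processes started afterwards inherit them.

```python
import os


def configure_wandb_timeouts(init_timeout=120, http_timeout=60):
    """Set the two wandb timeout variables in this process's environment.

    Hypothetical helper for illustration; it mirrors the two `export`
    commands and returns the values that were set.
    """
    os.environ["WANDB_INIT_TIMEOUT"] = str(init_timeout)
    os.environ["WANDB_HTTP_TIMEOUT"] = str(http_timeout)
    return {
        name: os.environ[name]
        for name in ("WANDB_INIT_TIMEOUT", "WANDB_HTTP_TIMEOUT")
    }


if __name__ == "__main__":
    # Print what any process started from here would inherit.
    print(configure_wandb_timeouts())
```

Printing the returned dictionary is a quick way to confirm the values are actually present in the environment that launches the pipeline.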
Below is the output from my command line:
(nyc_airbnb_dev) root@LAPTOP-1LJGH0LE:/mnt/c/users/nathan/build-ml-pipeline-for-short-term-rental-prices# mlflow run . -P steps="download"
2024/03/25 12:43:13 INFO mlflow.utils.conda: Conda environment mlflow-498808f8c125fa02022e7f09f7b0b2b49dc0ce8a already exists.
2024/03/25 12:43:13 INFO mlflow.projects.utils: === Created directory /tmp/tmpju35rja7 for downloading remote URIs passed to arguments of type 'path' ===
2024/03/25 12:43:13 INFO mlflow.projects.backend.local: === Running command 'source /root/miniconda3/bin/../etc/profile.d/conda.sh && conda activate mlflow-498808f8c125fa02022e7f09f7b0b2b49dc0ce8a 1>&2 && python main.py main.steps='download' $(echo '')' in run with ID 'c8dffebdcff94982adebb2b54847540e' ===
/mnt/c/users/nathan/build-ml-pipeline-for-short-term-rental-prices/main.py:24: UserWarning:
The version_base parameter is not specified.
Please specify a compatability version level, or None.
Will assume defaults for version 1.1
@hydra.main(config_name='config')
/mnt/c/users/nathan/build-ml-pipeline-for-short-term-rental-prices/main.py:24: UserWarning:
config_path is not specified in @hydra.main().
See https://hydra.cc/docs/1.2/upgrades/1.0_to_1.1/changes_to_hydra_main_config_path for more information.
@hydra.main(config_name='config')
/root/miniconda3/envs/mlflow-498808f8c125fa02022e7f09f7b0b2b49dc0ce8a/lib/python3.10/site-packages/hydra/_internal/hydra.py:119: UserWarning: Future Hydra versions will no longer change working directory at job runtime by default.
See https://hydra.cc/docs/1.2/upgrades/1.1_to_1.2/changes_to_job_working_dir/ for more information.
ret = run_job(
2024/03/25 12:43:15 INFO mlflow.projects.utils: === Fetching project from https://github.com/udacity/build-ml-pipeline-for-short-term-rental-prices#components/get_data into /tmp/tmp6gzzwnix ===
2024/03/25 12:43:18 INFO mlflow.utils.conda: Conda environment mlflow-f86f6f20a885af5bb1519172419f48251d34f00d already exists.
2024/03/25 12:43:18 INFO mlflow.projects.utils: === Created directory /tmp/tmp2myz4s8b for downloading remote URIs passed to arguments of type 'path' ===
2024/03/25 12:43:18 INFO mlflow.projects.backend.local: === Running command 'source /root/miniconda3/bin/../etc/profile.d/conda.sh && conda activate mlflow-f86f6f20a885af5bb1519172419f48251d34f00d 1>&2 && python run.py sample1.csv sample.csv raw_data 'Raw file as downloaded'' in run with ID '8a40040a5fc8409cb1e4880191a66a11' ===
wandb: Currently logged in as: kerhl8 (udacity_mlpipeline). Use wandb login --relogin to force relogin
wandb: Network error (SSLError), entering retry loop.
2024-03-25 12:44:51,174 Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(1, '[SSL: SSLV3_ALERT_BAD_RECORD_MAC] ssl/tls alert bad record mac (_ssl.c:2559)')': /api/4504800232407040/store/
2024-03-25 12:44:51,213 Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(1, '[SSL: SSLV3_ALERT_BAD_RECORD_MAC] ssl/tls alert bad record mac (_ssl.c:2559)')': /api/4504800232407040/store/
2024-03-25 12:44:51,252 Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(1, '[SSL: SSLV3_ALERT_BAD_RECORD_MAC] ssl/tls alert bad record mac (_ssl.c:2559)')': /api/4504800232407040/store/
Problem at: /root/miniconda3/envs/mlflow-f86f6f20a885af5bb1519172419f48251d34f00d/lib/python3.12/site-packages/wandb/sdk/wandb_init.py 848 getcaller
wandb: ERROR Run initialization has timed out after 90.0 sec.
wandb: ERROR Please refer to the documentation for additional information: https://docs.wandb.ai/guides/track/tracking-faq#initstarterror-error-communicating-with-wandb-process-
Traceback (most recent call last):
File "/tmp/tmp6gzzwnix/components/get_data/run.py", line 48, in <module>
go(args)
File "/tmp/tmp6gzzwnix/components/get_data/run.py", line 19, in go
run = wandb.init(job_type="download_file")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/mlflow-f86f6f20a885af5bb1519172419f48251d34f00d/lib/python3.12/site-packages/wandb/sdk/wandb_init.py", line 1185, in init
raise e
File "/root/miniconda3/envs/mlflow-f86f6f20a885af5bb1519172419f48251d34f00d/lib/python3.12/site-packages/wandb/sdk/wandb_init.py", line 1166, in init
run = wi.init()
^^^^^^^^^
File "/root/miniconda3/envs/mlflow-f86f6f20a885af5bb1519172419f48251d34f00d/lib/python3.12/site-packages/wandb/sdk/wandb_init.py", line 781, in init
raise error
wandb.errors.CommError: Run initialization has timed out after 90.0 sec.
Please refer to the documentation for additional information: https://docs.wandb.ai/guides/track/tracking-faq#initstarterror-error-communicating-with-wandb-process-
Error executing job with overrides: ["main.steps='download'"]
Traceback (most recent call last):
File "/mnt/c/users/nathan/build-ml-pipeline-for-short-term-rental-prices/main.py", line 40, in go
_ = mlflow.run(
File "/root/miniconda3/envs/mlflow-498808f8c125fa02022e7f09f7b0b2b49dc0ce8a/lib/python3.10/site-packages/mlflow/projects/__init__.py", line 354, in run
_wait_for(submitted_run_obj)
File "/root/miniconda3/envs/mlflow-498808f8c125fa02022e7f09f7b0b2b49dc0ce8a/lib/python3.10/site-packages/mlflow/projects/__init__.py", line 371, in _wait_for
raise ExecutionException(f"Run (ID '{run_id}') failed")
mlflow.exceptions.ExecutionException: Run (ID '8a40040a5fc8409cb1e4880191a66a11') failed
Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
2024/03/25 12:44:53 ERROR mlflow.cli: === Run (ID 'c8dffebdcff94982adebb2b54847540e') failed ===
I expected the artifact (with its name "sample.csv", etc.) to be uploaded to Weights and Biases.
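Since the log shows an SSLError before the timeout, one check that might help narrow this down is whether a plain TLS handshake works from inside WSL at all, independent of wandb. The following is a minimal sketch using only the Python standard library. The function name check_tls is hypothetical, and the host api.wandb.ai is an assumption (it does not appear in the log above), so substitute whatever host your setup actually talks to.

```python
import socket
import ssl


def check_tls(host, port=443, timeout=10.0):
    """Attempt a TLS handshake and return (ok, detail).

    On success, detail is the negotiated protocol version.
    On failure, detail is the exception type and message.
    """
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                return True, tls.version()
    except OSError as exc:  # ssl.SSLError and socket errors are OSError subclasses
        return False, f"{type(exc).__name__}: {exc}"


if __name__ == "__main__":
    # Assumed host; not taken from the log above.
    print(check_tls("api.wandb.ai"))
```

If this also fails with an SSL error when run in the same conda environment, the problem is probably in the WSL network path (for example a VPN, proxy, or antivirus on the Windows side) rather than in wandb or MLflow themselves.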