Google Cloud Batch + Workflow: how to connect to Google Cloud SQL?


I have a Workflow implemented on GCP that triggers a containerized batch job. (Here is the tutorial for something similar: https://cloud.google.com/workflows/docs/tutorials/batch-and-workflows). The service account that runs this job has the "Cloud SQL Client" role.

Still, I keep getting this error when trying to connect to the database (in Python, using psycopg2):

psycopg2.OperationalError: connection to server on socket "/cloudsql/PROJECT_NAME:us-central1:DB_NAME/.s.PGSQL.5432" failed: No such file or directory
    Is the server running locally and accepting connections on that socket?

Why?


There are 2 answers

Jonathan Hess

Jonathan from the Cloud SQL Connector team here. There are two ways to go about this.

First option: Configure your application to connect to your Cloud SQL instance over TCP using the Cloud SQL Connector for Python (see the Python connector example in the documentation). The connector uses the pg8000 database driver; unfortunately, it does not work with the psycopg2 library.
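
A minimal sketch of that first approach, assuming the cloud-sql-python-connector, pg8000, and SQLAlchemy packages, with placeholder instance, user, and database names:

from google.cloud.sql.connector import Connector
import sqlalchemy

# Placeholder values below; replace with your own instance connection name,
# database user, password, and database.
connector = Connector()

def getconn():
    # The connector handles authorization and encryption, so no proxy is needed.
    return connector.connect(
        "my-project:us-central1:my-instance",
        "pg8000",
        user="my-db-user",
        password="my-db-password",
        db="my-db",
    )

# A standard SQLAlchemy pool that creates its connections through the connector.
pool = sqlalchemy.create_engine("postgresql+pg8000://", creator=getconn)

with pool.connect() as conn:
    print(conn.execute(sqlalchemy.text("SELECT 1")).scalar())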

Second option: Add the Cloud SQL Auth Proxy container to your batch job.

This is a slightly modified job workflow definition from the Run a Batch Job guide:

Replace <INSTANCE_CONNECTION_NAME> with your instance connection name, something like my-project:us-central1:my-instance.

Replace <DB_PORT> with a TCP port appropriate for your database. The proxy container will listen on this port when it starts; your application should use its database driver to connect to localhost:<DB_PORT>.

In this configuration, your application will still need to authenticate to the database using a database username and password. If you want to use IAM authentication, you will need to adapt this based on the Cloud SQL Proxy guide.

main:
  params: [args]
  steps:
  # ... snip ...
  - createAndRunBatchJob:
      call: googleapis.batch.v1.projects.locations.jobs.create
      args:
        parent: ${"projects/" + projectId + "/locations/" + region}
        jobId: ${jobId}
        body:
          taskGroups:
            taskSpec:
              runnables:
              # The Cloud SQL Proxy container, opens a database port at
              # localhost:<DB_PORT> accessible to your application container
              - container:
                  imageUri: gcr.io/cloud-sql-connectors/cloud-sql-proxy:2.8.2
                  background: true
                  commands:
                  - "--structured-logs"
                  - "--port=<DB_PORT>"
                  - "<INSTANCE_CONNECTION_NAME>"

              # The application container; connects to the database with a standard
              # driver (e.g. psycopg2) at localhost:<DB_PORT>
              - container:
                  imageUri: ${imageUri}
                environment:
                  variables:
                    DB_HOST: "localhost"
                    DB_PORT: "<DB_PORT>"
            # Run 6 tasks on 2 VMs
            taskCount: 6
            parallelism: 2
          logsPolicy:
            destination: CLOUD_LOGGING
      result: createAndRunBatchJobResponse
# ... snip ...
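
On the application side, the connection through the proxy is then plain TCP. Here is a minimal psycopg2 sketch, assuming hypothetical DB_USER, DB_PASS, and DB_NAME environment variables in addition to the DB_HOST and DB_PORT variables set in the workflow above:

import os
import psycopg2

# DB_HOST/DB_PORT come from the workflow definition above; the proxy sidecar
# listens on localhost:<DB_PORT>. DB_USER, DB_PASS, and DB_NAME are placeholders.
conn = psycopg2.connect(
    host=os.environ["DB_HOST"],
    port=os.environ["DB_PORT"],
    user=os.environ["DB_USER"],
    password=os.environ["DB_PASS"],
    dbname=os.environ["DB_NAME"],
)

with conn, conn.cursor() as cur:
    cur.execute("SELECT 1")
    print(cur.fetchone())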

Let me know if this works for you.
lukeforehand

Jonathan,

This didn't work for me. Here is the YAML output from describing the job:

  taskSpec:
    environment:
      variables:
        POSTGRES_HOST: localhost
    runnables:
    - background: true
      container:
        commands:
        - /cloud_sql_proxy
        - -instances=myproject:myregion:myinstance=tcp:5432
        - -enable_iam_login
        - -structured_logs
        imageUri: gcr.io/cloudsql-docker/gce-proxy:1.28.1
    - container:
        commands:
        - load_postgres
        imageUri: myimage

And the logs:

proxy

message: "Listening on 127.0.0.1:5432 for myproject:myregion:myinstance"
Ready for new connections

client

connection to server at "localhost" (127.0.0.1), port 5432 failed

UPDATE:

The trick seemed to be to set:

taskGroups:
  taskCountPerNode: 1
  taskSpec:
    runnables:
    - container:
        options: "--network=host"