Postgresql_query won't run with 'unable to connect to database'

I have this TASK (reloading a postgres database with SQL) that I am sending to a couple of postgres servers.

    - name: reload postgres
      community.postgresql.postgresql_query:
        login_unix_socket: /tmp
        db: postgres
        query: "SELECT pg_reload_conf()"
      become: true
      become_user: postgres

about 40% of them come back with "unable to connect to database: fe_sendauth: no password supplied error.

the postgres user get’s its password locally from a ~/.pgpass file and running that SQL locally works.

Any idea, what the rootcause here might be?

Just to confirm, ansible is logging in with a user (such as ansible) and then using become to operate as the postgres user?

I can’t jump right to a root cause, but some thoughts

  • Are the 40% of failing servers always the same ones?
  • Have you confirmed the existence/permissions of ~/.pgpass on a working/not working system?

yes, precisely

yes, always the same one.

I took one as example.

  1. ~/.pgpass works fine locally
  2. even if I set localhost connections to trust in pg_hba.conf on that exact server I get the same "unable to connect to database: fe_sendauth: no password supplied error from ansible

my current workaround is to got through shell and pg_ctl -D $PGDATA reload as the postgres user.

Two other things I’d check

  • run the ansible tasks with with -v (anywhere from one to 5 v’s)
  • check out system logs to see if there’s anything indicating what might be happening.
1 Like

Any chance these are servers running SELinux and you’re hitting a labeling issue on those hosts?

2 Likes

Is the socket directory (/tmp) correct?

Is the database running before you run this command? Try a simple “SELECT 1” and see if that produces the same error.

1 Like
  1. yes the socket /tmp is correct
  2. a SELECT 1 query fails on the same machines with the same error

.. SELinux … you’re hitting a labeling issue on those hosts?

I compared a failing and a working hosts. Context (as per ls -laZ) is identical.

This then is an authentication problem with your database.

Running queries as postgres user should always be allowed in pg_hba.conf.