Postgresql_query won't run with 'unable to connect to database'

I have this TASK (reloading a postgres database with SQL) that I am sending to a couple of postgres servers.

    - name: reload postgres
      community.postgresql.postgresql_query:
        login_unix_socket: /tmp
        db: postgres
        query: "SELECT pg_reload_conf()"
      become: true
      become_user: postgres

about 40% of them come back with "unable to connect to database: fe_sendauth: no password supplied error.

the postgres user get’s its password locally from a ~/.pgpass file and running that SQL locally works.

Any idea, what the rootcause here might be?

Just to confirm, ansible is logging in with a user (such as ansible) and then using become to operate as the postgres user?

I can’t jump right to a root cause, but some thoughts

  • Are the 40% of failing servers always the same ones?
  • Have you confirmed the existence/permissions of ~/.pgpass on a working/not working system?

yes, precisely

yes, always the same one.

I took one as example.

  1. ~/.pgpass works fine locally
  2. even if I set localhost connections to trust in pg_hba.conf on that exact server I get the same "unable to connect to database: fe_sendauth: no password supplied error from ansible

my current workaround is to got through shell and pg_ctl -D $PGDATA reload as the postgres user.

Two other things I’d check

  • run the ansible tasks with with -v (anywhere from one to 5 v’s)
  • check out system logs to see if there’s anything indicating what might be happening.
1 Like

Any chance these are servers running SELinux and you’re hitting a labeling issue on those hosts?

2 Likes

Is the socket directory (/tmp) correct?

Is the database running before you run this command? Try a simple “SELECT 1” and see if that produces the same error.

1 Like
  1. yes the socket /tmp is correct
  2. a SELECT 1 query fails on the same machines with the same error

.. SELinux … you’re hitting a labeling issue on those hosts?

I compared a failing and a working hosts. Context (as per ls -laZ) is identical.

This then is an authentication problem with your database.

Running queries as postgres user should always be allowed in pg_hba.conf.

sorry, it took a while until I got the chance to dig into this a little deeper.

postgres via ansible - ERROR

> ansible -i ~/repos/ansible-inventory-eakte/ 'dbserver' -l rd5912 -m postgresql_query -a "login_unix_socket='/tmp' query='SHOW port'" --become --become-user postgres

[WARNING]: Invalid characters were found in group names but not replaced, use -vvvv to see details
[WARNING]: Database name has not been passed, used default database to connect to.
rd5912 | FAILED! => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python3"
    },
    "changed": false,
    "msg": "unable to connect to database: fe_sendauth: no password supplied\n"
}

Other targets in the same supposedly state work. Also these tests of connectivity to postgres on the server(s) that show the postgres via ansible issue are successful.

postgres via ssh - WORKS*

> ssh rd5912 'sudo -u postgres /opt/db/postgres/postgresql/bin/psql -qAtc "SHOW port ;"'
could not change directory to "/home/username": Permission denied
5432

.pgpass - EXISTS

> ssh rd5912 'sudo -u postgres cat /home/postgres/.pgpass'
localhost:5432:*:postgres:********************

psycopg2 - EXISTS

> ssh rd5912 'dnf list python3-psycopg2'
...
...
...
Installed Packages
python3-psycopg2.x86_64      2.8.6-6.el9       @rhel-9-for-x86_64-appstream-rpms

I am out of ideas. What else can I check/fix locally in order to make this work?

nice guess, but I compared the folder with the binaries as well as the content of data_directory between a failing and a working host. The SELinux context is identical, so it’s apparently not that.

I did not exaclty realize what that actually meant.
with such a line in pg_hba.conf everything is good

local all all 127.0.0.1/32 trust