I installed AWX a couple of weeks ago and have been using it without issues since. Yesterday I got an “Internal Server Error” from the GUI, and then noticed that the postgres container had crashed and was stuck in “Restarting” status.
The only output from “docker logs postgres” is the following, repeated over and over:
initdb: directory “/var/lib/postgresql/data” exists but is not empty
If you want to create a new database system, either remove or empty
the directory “/var/lib/postgresql/data” or run initdb
with an argument other than “/var/lib/postgresql/data”.
initdb: directory “/var/lib/postgresql/data” exists but is not empty
If you want to create a new database system, either remove or empty
the directory “/var/lib/postgresql/data” or run initdb
with an argument other than “/var/lib/postgresql/data”.
The files belonging to this database system will be owned by user “postgres”.
This user must also own the server process.
The database cluster will be initialized with locale “en_US.utf8”.
The default database encoding has accordingly been set to “UTF8”.
The default text search configuration will be set to “english”.
Data page checksums are disabled.
While troubleshooting, I upgraded AWX from GitHub and re-ran the installer, but the issue persists.
I also tried moving /tmp/pgdocker out of the way and then re-running the installer to see if it would properly generate an empty DB and run from that. That works, but it means I’m starting from scratch. And there’s nothing to say this won’t happen again, so I’d rather not start over before pinpointing the issue.
Any ideas on how I could troubleshoot this further?
We’ve had this reported a couple of times and I have yet to reproduce it locally (though I have no doubt of its severity). I need to figure out why the postgres container is insisting on re-running its init utility and then failing to start up when data already exists there.
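One thing that may help narrow this down, offered as a minimal sketch rather than a confirmed diagnosis: as far as I know, the official postgres image's entrypoint only skips initdb when it finds a non-empty PG_VERSION file in the data directory. If that file has gone missing while other files remain, initdb is attempted and refuses to run, which would match the log above. The script below checks a host directory for that condition. The function name diagnose_pgdata is hypothetical, and it assumes /tmp/pgdocker is the host directory mounted at /var/lib/postgresql/data.

```python
import os
import sys


def diagnose_pgdata(path):
    """Describe whether `path` looks like an initialized PostgreSQL data directory."""
    exists = os.path.isdir(path)
    entries = sorted(os.listdir(path)) if exists else []
    version_file = os.path.join(path, "PG_VERSION")
    # Assumption: the entrypoint treats a non-empty PG_VERSION as "already initialized".
    has_version = os.path.isfile(version_file) and os.path.getsize(version_file) > 0
    return {
        "exists": exists,
        "entries": entries,
        "has_pg_version": has_version,
        # Leftover files without PG_VERSION: initdb would be attempted and would
        # refuse because the directory is not empty.
        "initdb_would_fail": bool(entries) and not has_version,
    }


if __name__ == "__main__":
    # Assumed host path; pass a different one as the first argument if needed.
    target = sys.argv[1] if len(sys.argv) > 1 else "/tmp/pgdocker"
    for key, value in diagnose_pgdata(target).items():
        print(f"{key}: {value}")
```

Run it as root (the data files are normally not readable by other users) against the broken directory. If initdb_would_fail is True, the list of remaining entries shows what survived and what disappeared, which should help point at whatever is removing files.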
FYI, I had the same issue when I updated my environment, and I had to delete the /tmp/pgdocker folder to get it working again.
Since it was a lab environment I didn’t look into the issue further…
This is repeatedly happening to our dev instances. They will be working fine, and then all of a sudden you will see “A server error has occurred” when trying to access the UI. Every time, the cause has been the postgres container being stuck in “restarting” status.
Any troubleshooting advice? I have been unsuccessful in recovering the container once it gets in this state. As mentioned above, deleting /tmp/pgdocker means AWX re-initializes and you start from scratch. We’ve been having to roll back to stable snapshots each time this happens.
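In the meantime, rather than deleting the directory outright, it may be worth keeping a copy of the broken state so it can be inspected later. A minimal sketch, assuming the postgres container has been stopped first and the script runs as root; the function name preserve_data_dir and the destination /var/tmp/pgdocker-failed are hypothetical choices, not anything the AWX installer provides:

```python
import shutil
import sys
import time
from pathlib import Path


def preserve_data_dir(src, dest_root):
    """Copy `src` into a new timestamped directory under `dest_root` and return its path."""
    src = Path(src)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = Path(dest_root) / f"{src.name}-{stamp}"
    # copytree creates `dest` (and missing parents) and fails if it already exists,
    # so an earlier copy is never overwritten.
    shutil.copytree(src, dest, symlinks=True)
    return dest


if __name__ == "__main__":
    # Assumed paths; override them on the command line.
    source = sys.argv[1] if len(sys.argv) > 1 else "/tmp/pgdocker"
    target_root = sys.argv[2] if len(sys.argv) > 2 else "/var/tmp/pgdocker-failed"
    print(f"copied to {preserve_data_dir(source, target_root)}")
```

The copy is taken from the original, which is left in place, so it can be moved aside or removed afterwards as before.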
Sorry, my last message probably wasn’t very clear.
I started with a clean CentOS 7 install and deployed AWX after the 438 issue was closed. Previously, this issue would surface within a week: the UI would show “A server error has occurred”, and upon investigation the postgres container would be stuck restarting. This most recent install lasted over 2 weeks, but the same issue is back. Clearing the pg container, its data, etc. obviously wipes all data and settings, which is quite obnoxious.
You’ll continue to have the problem if you are using your existing pg container and data. I left an outline of a migration path in the PR linked from that issue’s closure.