configured stats_temp_directory does not get created after reboot

Bug #1749283 reported by Mario Splivalo
This bug affects 2 people
Affects                     Status        Importance  Assigned to     Milestone
postgresql-common (Debian)  Fix Released  Unknown
postgresql-common (Ubuntu)  Won't Fix     Medium      Mario Splivalo
  Xenial                    Won't Fix     Medium      Mario Splivalo
  Bionic                    Won't Fix     Medium      Mario Splivalo
resource-agents (Ubuntu)    Fix Released  Medium      Mario Splivalo
  Xenial                    Invalid       Undecided   Unassigned
  Bionic                    Invalid       Undecided   Unassigned

Bug Description

The default postgres installation in Ubuntu (and Debian) configures stats_temp_directory inside /var/run/postgresql:

$ grep stats_temp /etc/postgresql/10/main/postgresql.conf
stats_temp_directory = '/var/run/postgresql/10-main.pg_stat_tmp'

However, this directory is not created after reboot.

In most cases this is not a problem as systemd starts postgres via pg_ctlcluster, a "multiversion/cluster aware pg_ctl wrapper", and pg_ctlcluster will create missing directories before starting postgres.

But when something other than systemd starts postgres, this is a problem.
Specifically, when postgres is controlled by pacemaker (using the postgres resource agent for pacemaker), it is started using pg_ctl rather than the pg_ctlcluster wrapper. pg_ctl won't create missing directories, and therefore postgres fails to start.
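
For illustration only, a minimal sketch of the pre-start step that pg_ctl lacks. The helper name ensure_stats_dir is hypothetical, and the default mode is an assumption borrowed from the workaround posted later in this report. This is neither pg_ctlcluster's own code nor the upstream resource-agents patch:

```shell
# Hypothetical helper: create the stats temp directory if it is missing.
# $1 = directory path, $2 = mode (assumed default 2750, as in the workaround).
# Run it as the user that owns the cluster (normally postgres).
ensure_stats_dir() {
    dir="$1"
    mode="${2:-2750}"
    # Nothing to do when the directory already exists.
    [ -d "$dir" ] && return 0
    mkdir -p "$dir" && chmod "$mode" "$dir"
}

# Example call before "pg_ctl start" (path from the default Ubuntu config):
# ensure_stats_dir /var/run/postgresql/10-main.pg_stat_tmp
```

Calling the helper a second time is harmless, so it could run on every start rather than only after a reboot.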

The simplest solution for this issue is to have systemd recreate missing directories via the /usr/lib/tmpfiles.d/postgresql.conf file.

Currently only /var/run/postgresql and /var/log/postgresql are created using systemd-tmpfiles.
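
As a sketch of what such an entry could look like for a single cluster: the function name below is hypothetical, and the mode and ownership are assumptions taken from the workaround posted later in this report, not from any packaged tmpfiles.d file.

```shell
# Hypothetical helper: print a tmpfiles.d line for one cluster's stats directory.
# $1 = PostgreSQL major version, $2 = cluster name.
pg_stat_tmpfiles_line() {
    # "d" tells systemd-tmpfiles to create the directory if it does not exist.
    # Mode 2750 and postgres:postgres ownership are assumptions.
    printf 'd /var/run/postgresql/%s-%s.pg_stat_tmp 2750 postgres postgres\n' "$1" "$2"
}

pg_stat_tmpfiles_line 10 main
```

The printed line would go into a file under /etc/tmpfiles.d/ and be applied with "systemd-tmpfiles --create", which also runs at boot.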

Tags: cpe-onsite


Mario Splivalo (mariosplivalo) wrote :

This does not affect Trusty, because there stats_temp_directory is not configured (the setting is commented out):

ubuntu@pg-tru:~$ grep stats_temp /etc/postgresql/9.3/main/postgresql.conf
#stats_temp_directory = 'pg_stat_tmp'

Changed in postgresql-common (Ubuntu Xenial):
assignee: nobody → Mario Splivalo (mariosplivalo)
Changed in postgresql-common (Ubuntu Artful):
assignee: nobody → Mario Splivalo (mariosplivalo)
Changed in postgresql-common (Ubuntu Bionic):
assignee: nobody → Mario Splivalo (mariosplivalo)
Mario Splivalo (mariosplivalo) wrote :

I have created a PPA where I built proposed fixes for this package:

https://launchpad.net/~mariosplivalo/+archive/ubuntu/lp1749283

I am also attaching debdiffs here.

Mario Splivalo (mariosplivalo) wrote :
Mario Splivalo (mariosplivalo) wrote :
Changed in postgresql-common (Ubuntu Xenial):
status: New → In Progress
Changed in postgresql-common (Ubuntu Artful):
status: New → In Progress
Changed in postgresql-common (Ubuntu Bionic):
status: New → In Progress
tags: added: sts-sponsor
Changed in postgresql-common (Debian):
status: Unknown → New
Eric Desrochers (slashd)
tags: added: sts-sponsor-slashd
removed: sts-sponsor
Changed in postgresql-common (Ubuntu Xenial):
importance: Undecided → Medium
Eric Desrochers (slashd)
Changed in postgresql-common (Ubuntu Artful):
importance: Undecided → Medium
Changed in postgresql-common (Ubuntu Bionic):
importance: Undecided → Medium
Brian Murray (brian-murray) wrote :

The Stable Release Update process requires that the bug description include information about the bug's impact, a test case, and the regression potential. For this to be considered by the SRU team, that information will need to be added.

https://wiki.ubuntu.com/StableReleaseUpdates#SRU_Bug_Template

Mario Splivalo (mariosplivalo) wrote :

It looks like this is getting fixed in the resource agents in upstream:

https://github.com/ClusterLabs/resource-agents/pull/1102

So we might wait a bit to see if upstream will accept this change. If so, it will be merged in Debian and I'll create an SRU against the resource-agents package in Ubuntu.

Eric Desrochers (slashd)
tags: removed: sts-sponsor-slashd
Eric Desrochers (slashd)
Changed in resource-agents (Ubuntu):
assignee: nobody → Eric Desrochers (slashd)
Changed in postgresql-common (Ubuntu):
status: In Progress → Won't Fix
Changed in postgresql-common (Ubuntu Xenial):
status: In Progress → Won't Fix
Changed in postgresql-common (Ubuntu Artful):
status: In Progress → Won't Fix
Changed in postgresql-common (Ubuntu Bionic):
status: In Progress → Won't Fix
Changed in resource-agents (Ubuntu):
importance: Undecided → Medium
status: New → In Progress
Eric Desrochers (slashd)
Changed in resource-agents (Ubuntu):
assignee: Eric Desrochers (slashd) → Mario Splivalo (mariosplivalo)
Simon Quigley (tsimonq2) wrote :

Unsubscribing sponsors for now; Brian is right.

Changed in postgresql-common (Debian):
status: New → Fix Released
tags: added: cpe-onsite
Dmitrii Shcherbakov (dmitriis) wrote :

Subscribed ~field-high because this prevents the postgres process from starting when the RA start action is executed.

Workaround:

sudo bash -c 'echo "d /var/run/postgresql/10-main.pg_stat_tmp 2750 postgres postgres" > /etc/tmpfiles.d/10-main.pg_stat_tmp.conf'

sudo systemd-tmpfiles --create

Prerequisite for using pacemaker to manage postgres lifetime:

systemctl stop postgresql
systemctl disable postgresql.service
systemctl disable postgresql@10-main.service

Christian Ehrhardt  (paelzer) wrote :

@Mario / @Eric - just to check in whose court the ball is right now: is Mario or someone else in your org still (or again) working on this? If not, who do we need to bother to re-assign it properly?

Changed in resource-agents (Ubuntu Artful):
status: New → Invalid
Christian Ehrhardt  (paelzer) wrote :

FYI - resource-agents took the fix [1].
As mentioned in comment #6, this is what you were waiting for. It might have been lost while waiting, though, so please take this opportunity to update responsibilities and let Dmitrii know whether someone is working on resolving it, and who.

[1]: https://github.com/ClusterLabs/resource-agents/pull/1102

Michael Banck (mbanck) wrote :

Well, that fix got merged a year ago. resource-agents on bionic is at 4.1.0~rc1, I guess due to some unfortunate timing WRT the bionic freeze, but upstream has 4.1.1. Wouldn't it make sense to pull in either the above PR or the 4.1.1 changes (after review)?

Christian Ehrhardt  (paelzer) wrote :

Yeah Michael,
Making just the commit [1] an SRU where needed would have been my thought as well. But since it is assigned to Mario, I wanted to know if he is still (or again) working on this, to avoid doing the work twice. The discussion might also have continued somewhere else entirely, so getting feedback from him was the obvious next step.

@Dmitrii a few things to confirm, to get the ball rolling in case Mario won't respond.
- are you hitting this with pgsql "through the resource agent", or are you deploying just postgresql alone (to decide which package(s) actually need fixes)?
- If we provided a PPA, would you have an environment to easily test/verify the packages?

[1]: https://github.com/ClusterLabs/resource-agents/pull/1102/commits/00d888c00651eb178dc3a97d50359571963b5d44

Dmitrii Shcherbakov (dmitriis) wrote : Re: [Bug 1749283] Re: configured stats_temp_directory does not get created after reboot

This is easily reproducible when systemd is not used to manage postgres
lifetime. So, yes, quite easy to reproduce and verify if a fix is valid or
not.

On Mon., Mar. 25, 2019, 09:25 Christian Ehrhardt , <
<email address hidden>> wrote:

> Yeah Michael,
> Making just the commit [1] an SRU where needed would have been my thought
> as well. But since it is assigned to Mario I wanted to know if he is still
> (or again) working on this to avoid doing the work twice - also they might
> as well have ended up with discussions somewhere completely else so getting
> feedback from him was the obvious next step.
>
> @Dmitrii a few things to confirm, to get the ball rolling in case Mario
> won't respond.
> - are you hitting that at all in regard to pgsql "through resource agent"
> or are you deploying just postgresql alone (to decide which package(s)
> actually need fixes)?
> - If we would provide a PPA do you have an environment to easily
> test/verify them?
>
> [1]: https://github.com/ClusterLabs/resource-agents/pull/1102/commits/00d888c00651eb178dc3a97d50359571963b5d44

Changed in resource-agents (Ubuntu):
status: In Progress → Fix Released
Christian Ehrhardt  (paelzer) wrote :

With the default install it is obviously systemd managed.
There the config defaults to:
 $ grep stats_temp /etc/postgresql/10/main/postgresql.conf
    stats_temp_directory = '/var/run/postgresql/10-main.pg_stat_tmp'
and later the same with version bumped
 $ grep stats_temp /etc/postgresql/11/main/postgresql.conf
    stats_temp_directory = '/var/run/postgresql/11-main.pg_stat_tmp'

And, as identified before, each cluster is started through pg_ctlcluster as part of a templated service at /lib/systemd/system/postgresql@.service.
See /usr/share/doc/postgresql-common/README.systemd for more.

The above was mostly for myself, to context-switch back into what was said on this bug before.

@Dmitrii - with "systemd is not used to manage postgres lifetime" you meant the heartbeat setup discussed here, right? Or are there other cases affected? Because then the fix in resource-agents would obviously not help that much. OTOH postgres is meant to be used through systemd, so I'm not sure how much of a pgsql bug it would be then.

@Dmitrii - Since I don't know when/if Mario will respond, I created a PPA (Bionic+Xenial) for you to try. Feel free to test it and let me know if it would help your case.
=> https://launchpad.net/~paelzer/+archive/ubuntu/bug-1749283-resource-agent-to-start-pgsql

@Dmitrii - Finally if you would convert "easily reproducible when systemd is not used to manage postgres lifetime" into a small set of commands I'd appreciate that (maybe you have that already around?).

P.S. To get ahead, I have also added MPs for the (rather trivial) change so that it can be reviewed.

Christian Ehrhardt  (paelzer) wrote :

Note: I have seen your fixes to the Foundation Cloud for the same purpose. That was one more use case, but it is fixed now as well, right?

Changed in resource-agents (Ubuntu Xenial):
status: New → Confirmed
Changed in resource-agents (Ubuntu Bionic):
status: New → Confirmed
no longer affects: postgresql-common (Ubuntu Artful)
no longer affects: resource-agents (Ubuntu Artful)
Christian Ehrhardt  (paelzer) wrote :

The MPs are approved as well now, did your testing confirm that this would help your case?

Dmitrii Shcherbakov (dmitriis) wrote :

Hi Christian,

> did your testing confirm that this would help your case?

Sorry, have not had a chance to test this yet (delivering on-site).

> with "systemd is not used to manage postgres lifetime" you meant the heartbeat setup discussed here right?

Yes, that's right.

> I have seen your fixes to the Foundation Cloud for the same purpose, that was one more use case but that is fixed now as well right?

Yes, this is pretty much a workaround for FCE that was necessary to move forward. But it's always better to have it fixed up at the distro level.

Christian Ehrhardt  (paelzer) wrote :

> > did your testing confirm that this would help your case?
>
> Sorry, have not had a chance to test this yet (delivering on-site).
>

No problem, I just had to probe on the status.
I hope your delivery goes well!

>
> > I have seen your fixes to the Foundation Cloud for the same purpose, that was one more use case but that is fixed now as well right?
>
> Yes, this is pretty much a workaround for FCE that was necessary to move
> forward. But it's always better to have it fixed up at the distro level.

So to summarize,
- you have a workaround in place (in FCE) that gets you going for now.
- Once you have time you will test the PPA (and provide test steps for
the SRU if possible)
- once confirmed we will convert it into an SRU and fix it at the distro level.

Considering the above, I'd ask to reduce this from field-high to med/low
though, as a workaround is in place and the next steps to resolve it
in the long run are clear.

P.S. If you can come up with reduced test steps from what you
know from your heartbeat FCE setup, I can even do the testing.
And if there is no chance that you can come up with test steps, let us
know; we can then try to recreate it on our own (it just seems much
easier to base it on what you already have).

Joshua Powers (powersj) wrote :

@Dmitrii - I got Christian involved in this at your request, is it possible for you to confirm if the PPA contains a valid fix or get Christian a test case?

Mario Splivalo (mariosplivalo) wrote :

@Christian, I've uploaded test packages to this PPA: https://launchpad.net/~mariosplivalo/+archive/ubuntu/lp1762492

It looks like resource-agents was erroneously added to this bug. The end effect is the same, but this bug deals with a postgresql-common issue. As the proposed patch for this bug was rejected upstream (https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=890427#10), and as this is indeed a resource-agents issue, I created a bug that deals specifically with that: LP: #1762492.

I'll proceed with SRU there, if that's all right.

Christian Ehrhardt  (paelzer) wrote :

Following your request to ignore the resource-agents portion of this and look just at postgresql-common: I'm actually with Christoph on this.
In general we haven't had a delta on the postgres packages, because Debian maintenance of these packages is really good. If anything, we have short one-off changes; as an example, look at the current delta in Eoan, which is already accepted in Debian and will be synced away soon. And we should keep it this way. Differences are OK for SRUs if allowed under the SRU policy, but not going forward, e.g. in Eoan.

Also the statement in the Debian bug is slightly wrong "In Ubuntu, the attached patch was applied"; actually it was not yet applied anywhere - only discussed.
Given that, it isn't in Eoan yet and hence this needs that done before SRUs can be considered.

For Eoan IMHO you should clean up the patch to make it acceptable to Debian, have it applied there and then synced.

For example the SRUs can be version specific (as you know which version ships in which release). And therefore can be much simpler.

Mario Splivalo (mariosplivalo) wrote :

I agree - postgresql-common should not be touched, nor is there a need to create an SRU for it.
However, resource-agents should be fixed, and only for Xenial and Bionic. But, I felt it was 'cleaner' to do it in a separate bug, LP: #1762492.

So, this bug should be marked "Won't Fix" for postgresql-common (again, based on what Christoph said in https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=890427#10).

For resource-agents part, I'm not sure if it should be completely removed from this bug, or just marked 'Invalid'. That part is actually duplicate of LP: #1762492. Unless you have objections I'll continue resource-agents part there.

P.S. You are correct, the postgresql-common patch was never applied here - I just created a debdiff and waited for the upstream patch to be applied, thinking that the proper way to solve this issue was fixing postgresql-common.

So, to make it clear - postgresql-common is not to be fixed (there is nothing to be fixed), and this bug should be abandoned (as Debian did). resource-agents needs to be fixed, and that's being worked on in LP: #1762492. Is that good?

Christian Ehrhardt  (paelzer) wrote :

Thanks for the clarification Mario!
Now things fit together.
I'll mark the resource agent tasks invalid here per these comments to make sure no one is confused again.

TL;DR:
- not fixed in postgresql-common
- only fixed in resource-agents for Xenial/Bionic via SRU in bug 1762492

Changed in resource-agents (Ubuntu Xenial):
status: Confirmed → Invalid
Changed in resource-agents (Ubuntu Bionic):
status: Confirmed → Invalid