unable to reach redis service in non-ha deployments

Bug #1618510 reported by Pradeep Kilambi
Affects: tripleo
Status: Fix Released
Importance: Critical
Assigned to: Jiří Stránský

Bug Description

On the latest deployments, we see the telemetry services fail to reach the redis service. The service configuration in ceilometer/gnocchi/aodh is accurate, so the issue seems to be in how redis itself is exposed, perhaps the haproxy config is wrong.

summary: - unable to reach redis service on latest deployment
+ unable to reach redis service
Revision history for this message
Pradeep Kilambi (pkilambi) wrote : Re: unable to reach redis service

This is what Alex (akrzos) was seeing in build 08-29.1:

listen redis
  bind 172.16.0.12:6379 transparent
  balance first
  option tcp-check
  tcp-check send PING\r\n
  tcp-check expect string +PONG
  tcp-check send info\ replication\r\n
  tcp-check expect string role:master
  tcp-check send QUIT\r\n
  tcp-check expect string +OK
  server overcloud-controller-0 172.16.0.24:6379 check fall 5 inter 2000 rise 2
  server overcloud-controller-1 172.16.0.17:6379 check fall 5 inter 2000 rise 2
  server overcloud-controller-2 172.16.0.31:6379 check fall 5 inter 2000 rise 2

[root@overcloud-controller-0 ~]# grep "redis" /etc/gnocchi/gnocchi.conf
coordination_url = redis://:XXXX@172.16.0.12:6379/
[root@overcloud-controller-0 ~]# redis-cli -h 172.16.0.12 -p 6379
172.16.0.12:6379> ping
Error: Server closed the connection
172.16.0.12:6379> auth XXXX
Error: Server closed the connection
172.16.0.12:6379> exit
[root@overcloud-controller-0 ~]# redis-cli -h 172.16.0.17 -p 6379
172.16.0.17:6379> auth XXXX
OK
172.16.0.17:6379> ping
PONG
172.16.0.17:6379>

[root@overcloud-controller-1 ~]# netstat -tulpn | grep 6379
tcp 0 0 172.16.0.17:6379 0.0.0.0:* LISTEN 19512/redis-server
tcp 0 0 172.16.0.12:6379 0.0.0.0:* LISTEN 15449/haproxy
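
A plausible reading of this output (my interpretation, to be confirmed): redis has requirepass set, so the unauthenticated PING in haproxy's tcp-check gets a NOAUTH error instead of +PONG, all three backends get marked down, and haproxy simply closes connections to the VIP (172.16.0.12). Connecting to a backend directly and authenticating first works, as shown above. This can be checked by hand against one backend (password elided as XXXX, as in the report):

# PING without AUTH fails while requirepass is set, so the tcp-check never sees +PONG
redis-cli -h 172.16.0.24 -p 6379 PING
# Authenticating first (what the health check would need to do) makes PING return PONG
redis-cli -h 172.16.0.24 -p 6379 -a XXXX PING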

Changed in tripleo:
status: New → Triaged
importance: Undecided → Critical
assignee: nobody → Pradeep Kilambi (pkilambi)
milestone: none → newton-rc1
Revision history for this message
Pradeep Kilambi (pkilambi) wrote :

This is resolved with Dan's patch to set tripleo::haproxy::redis_password in the haproxy templates:

https://review.openstack.org/#/c/357344/
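
With tripleo::haproxy::redis_password set, the generated redis listen section should gain an AUTH step ahead of the PING in the health check, roughly like the sketch below (expected shape only, not the exact generated config; XXXX stands for the real password, and the remaining check and server lines stay as in the config pasted above):

listen redis
  bind 172.16.0.12:6379 transparent
  balance first
  option tcp-check
  tcp-check send AUTH\ XXXX\r\n
  tcp-check expect string +OK
  tcp-check send PING\r\n
  tcp-check expect string +PONG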

I'll move this to fix committed.

Changed in tripleo:
status: Triaged → Fix Committed
assignee: Pradeep Kilambi (pkilambi) → nobody
Revision history for this message
Pradeep Kilambi (pkilambi) wrote :

This still seems to be happening in the non-HA case, but in the HA case everything appears to work. For some reason we can't bind to the VIP in the non-HA case. For example:

http://logs.openstack.org/11/358511/3/check-tripleo/gate-tripleo-ci-centos-7-ovb-nonha/80c935f/logs/overcloud-controller-0/var/log/gnocchi/metricd.txt.gz

Changed in tripleo:
status: Fix Committed → Confirmed
summary: - unable to reach redis service
+ unable to reach redis service in non-ha deployments
Revision history for this message
Jiří Stránský (jistr) wrote :

The listening sockets look correct:

tcp 0 0 192.0.2.13:6379 0.0.0.0:* LISTEN 32378/redis-server
tcp 0 0 192.0.2.15:6379 0.0.0.0:* LISTEN 23915/haproxy

But the machine doesn't have the redis VIP assigned to any interface. It seems we should be telling keepalived to claim the redis VIP, but we're not doing so (see the ip output and the keepalived sketch below).

9: br-ex: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue state UNKNOWN
    link/ether fa:16:3e:2a:81:4d brd ff:ff:ff:ff:ff:ff
    inet 192.0.2.13/24 brd 192.0.2.255 scope global dynamic br-ex
       valid_lft 83265sec preferred_lft 83265sec
    inet 192.0.2.6/32 scope global br-ex
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fe2a:814d/64 scope link
       valid_lft forever preferred_lft forever
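
As noted above, keepalived has no entry for the redis VIP (192.0.2.15 in the netstat output), so nothing ever assigns that address to an interface in the non-HA case. A minimal sketch of the kind of vrrp_instance keepalived would need (interface name, VRID and priority here are illustrative, not taken from the actual templates):

vrrp_instance redis {
    state MASTER
    interface br-ex
    virtual_router_id 54
    priority 101
    virtual_ipaddress {
        192.0.2.15 dev br-ex
    }
}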

Changed in tripleo:
assignee: nobody → Jiří Stránský (jistr)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (master)

Fix proposed to branch: master
Review: https://review.openstack.org/364917

Changed in tripleo:
status: Confirmed → In Progress
Changed in tripleo:
assignee: Jiří Stránský (jistr) → Emilien Macchi (emilienm)
Changed in tripleo:
assignee: Emilien Macchi (emilienm) → Jiří Stránský (jistr)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-heat-templates (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/366128

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-heat-templates (master)

Reviewed: https://review.openstack.org/366128
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=280a70bfafeca67cd124733e898fb8c6f05a039e
Submitter: Jenkins
Branch: master

commit 280a70bfafeca67cd124733e898fb8c6f05a039e
Author: Jiri Stransky <email address hidden>
Date: Tue Sep 6 15:19:13 2016 +0200

    Set Redis VIP on all nodes

    Move Redis VIP from controller-only to all nodes so that we don't assume
    where Redis is deployed.

    Change-Id: I55f8d48e3e077951fbcc88158dd6f21a2fe5f457
    Related-Bug: #1618510
    Partially-Implements: blueprint custom-roles

Changed in tripleo:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (master)

Reviewed: https://review.openstack.org/364917
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=9d07e18cad5f212c6aea30c5cbdfd53e0694d808
Submitter: Jenkins
Branch: master

commit 9d07e18cad5f212c6aea30c5cbdfd53e0694d808
Author: Jiri Stransky <email address hidden>
Date: Tue Sep 6 15:21:25 2016 +0200

    Use Redis VIP when deploying with keepalived

    Previously we weren't creating Redis VIP in keepalived, causing Redis to
    be unusable in non-HA deployments. This is now fixed.

    Depends-On: I0bb37f6fb3eed022288b2dcfc7a88e8ff88a7ace
    Change-Id: I0ecfda1e6ad5567f6f58d60bf418bc91761833ab
    Closes-Bug: #1618510
