puppet log contains errors, but nailgun doesn't report a failure of the task

Bug #1508807 reported by Sergey Vasilenko
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Fuel for OpenStack
In Progress
Medium
Fuel Toolbox

Bug Description

Our CI are green for tasks, where pupet log contains non-recoverable errors:
http://paste.openstack.org/show/477035/

Changed in fuel:
milestone: none → 8.0
Changed in fuel:
assignee: nobody → Fuel CI team (fuel-ci)
status: New → Confirmed
Changed in fuel:
status: Confirmed → Incomplete
Revision history for this message
Bartłomiej Piotrowski (bpiotrowski) wrote :

Sergey, can you provide some details, e.g. used ISO, link to the job, diagnostic snapshot?

Revision history for this message
Sergey Vasilenko (xenolog) wrote :
Changed in fuel:
status: Incomplete → New
Changed in fuel:
assignee: Fuel CI team (fuel-ci) → Sergey Vasilenko (xenolog)
Changed in fuel:
assignee: Sergey Vasilenko (xenolog) → Fuel QA Team (fuel-qa)
summary: - puppet log contains errors, but CI greeen
+ puppet log contains errors, but test passes
Revision history for this message
Bartłomiej Piotrowski (bpiotrowski) wrote :

Reassigning once more after talking to Alexandra Fedorova and Nastya Urlapova – the problem is somewhere in nailgun, as the task should be marked as failed.

summary: - puppet log contains errors, but test passes
+ puppet log contains errors, but nailgun doesn't report a failure of the
+ task
Changed in fuel:
assignee: Fuel QA Team (fuel-qa) → Fuel Python Team (fuel-python)
Changed in fuel:
status: New → Confirmed
Revision history for this message
Alexander Kislitsky (akislitsky) wrote :

The issue appeared on node-1 at:

./node-1.test.domain.local/var/log/puppet.log:2015-10-22 03:38:16 +0000 Puppet (err): Could not prefetch neutron_router_interface provider 'neutron': Can't retrieve subnet-show because Neutron or Keystone API is not available.

At the same time in astute task is recognized as successful:

2015-10-22T03:38:17 debug: [628] Node 1(primary-controller) status: succeed
2015-10-22T03:38:17 debug: [628] Puppet completed within 170 seconds
2015-10-22T03:38:17 debug: [628] {"nodes"=>[{"uid"=>"1", "status"=>"ready", "role"=>"primary-controller"}]}
2015-10-22T03:38:17 info: [628] Task '{"priority"=>5800, "type"=>"puppet", "uids"=>["1"], "parameters"=>{"puppet_modules"=>"/etc/puppet/modules", "puppet_manifest"=>"/etc/puppet/modules/osnailyfacter/modular/openstack-network/openstack-network-controller.pp", "timeout"=>3600, "cwd"=>"/"}}' on node uid=1 ended successfully

tags: added: area-python
Revision history for this message
Alexander Kislitsky (akislitsky) wrote :

Bug doesn't fail bvt. Importance reduced to 'High'

Changed in fuel:
importance: Critical → High
tags: added: swarm-blocker
Changed in fuel:
assignee: Fuel Python Team (fuel-python) → Sergey Slipushenko (sslypushenko)
Revision history for this message
Sergey Slipushenko (sslypushenko) wrote :

This bug was found in ISO #30 https://product-ci.infra.mirantis.net/job/8.0.all/30/
To reproduce this bug, revert commit afb031a6f53c58ca9c17f29ac6d902c3445c5afb in fuel-lib and build custom ISO
It isn't reproduced in latest ISOs out of the box because of neutron fix.

tags: removed: swarm-blocker
Changed in fuel:
status: Confirmed → In Progress
Changed in fuel:
assignee: Sergey Slipushenko (sslypushenko) → Fuel Python Team (fuel-python)
status: In Progress → Confirmed
Dmitry Pyzhov (dpyzhov)
tags: added: tricky
Dmitry Pyzhov (dpyzhov)
tags: added: team-bugfix
Revision history for this message
Ihor Kalnytskyi (ikalnytskyi) wrote :

What the status of this bug? Why it's still opened if it was fixed in library?

Revision history for this message
Dmitry Pyzhov (dpyzhov) wrote :

This bug is about astute not seeing failed puppet. Puppet doesn't fail right now but we should investigate the issue in order to avoid hidden issues in future. Setting Medium priority.

Changed in fuel:
importance: High → Medium
Dmitry Pyzhov (dpyzhov)
Changed in fuel:
milestone: 8.0 → 9.0
Dmitry Pyzhov (dpyzhov)
tags: added: module-astute
Revision history for this message
Bug Checker Bot (bug-checker) wrote : Autochecker

(This check performed automatically)
Please, make sure that bug description contains the following sections filled in with the appropriate data related to the bug you are describing:

actual result

version

expected result

steps to reproduce

For more detailed information on the contents of each of the listed sections see https://wiki.openstack.org/wiki/Fuel/How_to_contribute#Here_is_how_you_file_a_bug

tags: added: need-info
Revision history for this message
Dmitry Pyzhov (dpyzhov) wrote :

It is a tricky issue that doesn't affect real deployments. Moving to 10.0.

Changed in fuel:
milestone: 9.0 → 10.0
Changed in fuel:
assignee: Fuel Python (Deprecated) (fuel-python) → Georgy Kibardin (gkibardin)
Revision history for this message
Georgy Kibardin (gkibardin) wrote :
Download full text (4.2 KiB)

It doesn't look like nailgun problem:

2015-10-22T04:14:37 info: [628] Run hook ---
priority: 1200
fail_on_error: true
type: shell
uids:
- '1'
parameters:
  retries: 3
  cmd: ruby /etc/puppet/modules/osnailyfacter/modular/astute/upload_cirros.rb
  timeout: 180
  interval: 20

2015-10-22T04:14:46 debug: [628] 280b6a02-9f04-4de8-9fec-94cc22ac7bd4: MC agent 'execute_shell_command', method 'execute', results:
{:sender=>"1",
 :statuscode=>0,
 :statusmsg=>"OK",
 :data=>
  {:stdout=>
    "/usr/bin/glance --os-image-api-version 1 image-create --name 'TestVM' --is-public 'true' --container-format='bare' --disk-format='qcow2' --min-ram='64' --file '/usr/share/cirros-testvm/cirros-x86_64-disk.img'\nImage 'TestVM' was uploaded from '/usr/share/cirros-testvm/cirros-x86_64-disk.img'\n+------------------+--------------------------------------+\n| Property | Value |\n+------------------+--------------------------------------+\n| checksum | ee1eca47dc88f4879d8a229cc70a07c6 |\n| container_format | bare |\n| created_at | 2015-10-22T04:14:43.000000 |\n| deleted | False |\n| deleted_at | None |\n| disk_format | qcow2 |\n| id | c7866d71-1903-46b0-81cf-29ab189cfb0c |\n| is_public | True |\n| min_disk | 0 |\n| min_ram | 64 |\n| name | TestVM |\n| owner | d0c77303dfbf4512986b2d9859716862 |\n| protected | False |\n| size | 13287936 |\n| status | active |\n| updated_at | 2015-10-22T04:14:46.000000 |\n| virtual_size | None |\n+------------------+--------------------------------------+\n",
   :stderr=>
    "/usr/lib/python2.7/dist-packages/urllib3/util/ssl_.py:90: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.\n InsecurePlatformWarning\n/usr/lib/python2.7/dist-packages/urllib3/connection.py:251: SecurityWarning: Certificate has no `subjectAltName`, falling back to check for a `commonName` for now. This feature is being removed by major browsers and deprecated by RFC 2818. (See https://github.com/shazow/urllib3/issues/497 for details.)\n SecurityWarning\n/usr/lib/python2.7/dist-packages/urllib3/util/ssl_.py:90: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.\n InsecurePlatformWarning\n/usr/lib/python2.7/dist-packages/urllib3/connection.py:251: Se...

Read more...

Changed in fuel:
status: Confirmed → In Progress
Revision history for this message
Georgy Kibardin (gkibardin) wrote :

Here is puppet run results:

2015-10-22T03:38:17 debug: [628] 280b6a02-9f04-4de8-9fec-94cc22ac7bd4: MC agent 'puppetd', method 'last_run_summary', results:
{:sender=>"1",
 :statuscode=>0,
 :statusmsg=>"OK",
 :data=>
  {:time=>
    {"config_retrieval"=>2.858058108,
     "cs_resource"=>19.785889719,
     "exec"=>10.225019944,
     "file"=>0.025000217999999998,
     "file_line"=>0.000464048,
     "filebucket"=>9.5792e-05,
     "neutron_agent_ovs"=>0.018477572999999997,
     "neutron_api_config"=>0.009214799,
     "neutron_config"=>0.40304204599999993,
     "neutron_dhcp_agent_config"=>0.014584379,
     "neutron_l3_agent_config"=>0.020192068,
     "neutron_metadata_agent_config"=>0.012914508,
     "neutron_network"=>5.604460546,
     "neutron_plugin_ml2"=>0.015927388,
     "neutron_router"=>3.203366055,
     "neutron_router_interface"=>0.923435177,
     "neutron_subnet"=>5.493685004,
     "notify"=>0.006195566,
     "nova_admin_tenant_id_setter"=>3.780444248,
     "nova_config"=>0.081388309,
     "package"=>37.85673805,
     "schedule"=>0.00048237799999999994,
     "service"=>20.407656023,
     "sysctl"=>0.007621518,
     "sysctl_runtime"=>0.12488828700000001,
     "total"=>110.87924175099998,
     "last_run"=>1445485097},
   :resources=>
    {"failed"=>0,
     "changed"=>211,
     "total"=>247,
     "restarted"=>2,
     "out_of_sync"=>211,
     "failed_to_restart"=>0,
     "scheduled"=>0,
     "skipped"=>0},
   :changes=>{"total"=>211},
   :events=>{"failure"=>0, "success"=>211, "total"=>211},
   :version=>{"config"=>1445484928, "puppet"=>"3.4.3"},
   :status=>"stopped",
   :running=>0,
   :enabled=>1,
   :idling=>0,
   :stopped=>1,
   :lastrun=>1445485097,
   :runtime=>0,
   :output=>"Currently stopped; last completed run 0 seconds ago"}}

2015-10-22T03:38:17 debug: [628] Node 1(primary-controller) status: succeed
2015-10-22T03:38:17 debug: [628] Puppet completed within 170 seconds
2015-10-22T03:38:17 debug: [628] {"nodes"=>[{"uid"=>"1", "status"=>"ready", "role"=>"primary-controller"}]}
2015-10-22T03:38:17 info: [628] Task '{"priority"=>5800, "type"=>"puppet", "uids"=>["1"], "parameters"=>{"puppet_modules"=>"/etc/puppet/modules", "puppet_manifest"=>"/etc/puppet/modules/osnailyfacter/modular/openstack-network/openstack-network-controller.pp", "timeout"=>3600, "cwd"=>"/"}}' on node uid=1 ended successfully

I don't see any trace of the failure here either.

Revision history for this message
Georgy Kibardin (gkibardin) wrote :

It looks like puppet just reports error happening in provider_class.prefetch(resources) and does nothing about that.

See https://github.com/puppetlabs/puppet/blob/3.4.1/lib/puppet/transaction.rb#L272 for details

2015-10-22 03:37:34 +0000 Puppet (err): Could not prefetch neutron_subnet provider 'neutron': Can't retrieve subnet-show because Neutron or Keystone API is not available.
/etc/puppet/modules/neutron/lib/puppet/provider/neutron.rb:153:in `get_neutron_resource_attrs'
/etc/puppet/modules/neutron/lib/puppet/provider/neutron_subnet/neutron.rb:24:in `block in instances'
/etc/puppet/modules/neutron/lib/puppet/provider/neutron_subnet/neutron.rb:23:in `collect'
/etc/puppet/modules/neutron/lib/puppet/provider/neutron_subnet/neutron.rb:23:in `instances'
/etc/puppet/modules/neutron/lib/puppet/provider/neutron_subnet/neutron.rb:43:in `prefetch'
/usr/lib/ruby/vendor_ruby/puppet/transaction.rb:277:in `prefetch'
/usr/lib/ruby/vendor_ruby/puppet/transaction.rb:167:in `prefetch_if_necessary'
/usr/lib/ruby/vendor_ruby/puppet/transaction.rb:67:in `block in evaluate'
/usr/lib/ruby/vendor_ruby/puppet/graph/relationship_graph.rb:116:in `call'
/usr/lib/ruby/vendor_ruby/puppet/graph/relationship_graph.rb:116:in `traverse'
/usr/lib/ruby/vendor_ruby/puppet/transaction.rb:108:in `evaluate'
/usr/lib/ruby/vendor_ruby/puppet/resource/catalog.rb:164:in `block in apply'
/usr/lib/ruby/vendor_ruby/puppet/util/log.rb:149:in `with_destination'
/usr/lib/ruby/vendor_ruby/puppet/transaction/report.rb:108:in `as_logging_destination'
/usr/lib/ruby/vendor_ruby/puppet/resource/catalog.rb:163:in `apply'
/usr/lib/ruby/vendor_ruby/puppet/configurer.rb:125:in `block in apply_catalog'
/usr/lib/ruby/vendor_ruby/puppet/util.rb:161:in `block in benchmark'
/usr/lib/ruby/1.9.1/benchmark.rb:295:in `realtime'
/usr/lib/ruby/vendor_ruby/puppet/util.rb:160:in `benchmark'
/usr/lib/ruby/vendor_ruby/puppet/configurer.rb:124:in `apply_catalog'
/usr/lib/ruby/vendor_ruby/puppet/configurer.rb:192:in `run'
/usr/lib/ruby/vendor_ruby/puppet/application/apply.rb:268:in `apply_catalog'
/usr/lib/ruby/vendor_ruby/puppet/application/apply.rb:218:in `main'
/usr/lib/ruby/vendor_ruby/puppet/application/apply.rb:146:in `run_command'
/usr/lib/ruby/vendor_ruby/puppet/application.rb:364:in `block (2 levels) in run'
/usr/lib/ruby/vendor_ruby/puppet/application.rb:470:in `plugin_hook'
/usr/lib/ruby/vendor_ruby/puppet/application.rb:364:in `block in run'
/usr/lib/ruby/vendor_ruby/puppet/util.rb:478:in `exit_on_fail'
/usr/lib/ruby/vendor_ruby/puppet/application.rb:364:in `run'
/usr/lib/ruby/vendor_ruby/puppet/util/command_line.rb:137:in `run'
/usr/lib/ruby/vendor_ruby/puppet/util/command_line.rb:91:in `execute'
/usr/bin/puppet:4:in `<main>'

tags: added: area-library
removed: area-python module-astute
Changed in fuel:
assignee: Georgy Kibardin (gkibardin) → Fuel Toolbox (fuel-toolbox)
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.