Failure to communicate with Glance server stores empty files in /var/lib/nova/instances

Bug #808990 reported by P Spencer Davis
This bug affects 1 person
Affects                   Status     Importance  Assigned to  Milestone
OpenStack Compute (nova)  Confirmed  Undecided   Unassigned

Bug Description

I have two nodes: one runs nova-api, nova-network, nova-volume, nova-compute and glance; the second runs only nova-compute. Both nodes run Ubuntu 11.04 server, installed from the ppa.launchpad repository. Both use the KVM hypervisor, and kvm-ok reports that virtualization is enabled in their BIOS. On the master node I can start instances and they run just fine, but when a VM is scheduled on the second node, I receive the following errors:

2011-07-11 08:53:38,013 INFO nova.virt.libvirt_conn [-] instance instance-00000002: Creating image
2011-07-11 08:53:38,034 DEBUG nova.utils [-] Attempting to grab semaphore "00000001" for method "call_if_not_exists"... from (pid=6846) inner /usr/lib/pymodules/python2.7/nova/utils.py:600
2011-07-11 08:53:38,036 ERROR nova.exception [-] Uncaught exception
(nova.exception): TRACE: Traceback (most recent call last):
(nova.exception): TRACE: File "/usr/lib/pymodules/python2.7/nova/exception.py", line 87, in _wrap
(nova.exception): TRACE: return f(*args, **kw)
(nova.exception): TRACE: File "/usr/lib/pymodules/python2.7/nova/virt/libvirt/connection.py", line 590, in spawn
(nova.exception): TRACE: block_device_mapping=block_device_mapping)
(nova.exception): TRACE: File "/usr/lib/pymodules/python2.7/nova/virt/libvirt/connection.py", line 815, in _create_image
(nova.exception): TRACE: project=project)
(nova.exception): TRACE: File "/usr/lib/pymodules/python2.7/nova/virt/libvirt/connection.py", line 751, in _cache_image
(nova.exception): TRACE: call_if_not_exists(base, fn, *args, **kwargs)
(nova.exception): TRACE: File "/usr/lib/pymodules/python2.7/nova/utils.py", line 613, in inner
(nova.exception): TRACE: retval = f(*args, **kwargs)
(nova.exception): TRACE: File "/usr/lib/pymodules/python2.7/nova/virt/libvirt/connection.py", line 749, in call_if_not_exists
(nova.exception): TRACE: fn(target=base, *args, **kwargs)
(nova.exception): TRACE: File "/usr/lib/pymodules/python2.7/nova/virt/libvirt/connection.py", line 762, in _fetch_image
(nova.exception): TRACE: images.fetch(image_id, target, user, project)
(nova.exception): TRACE: File "/usr/lib/pymodules/python2.7/nova/virt/images.py", line 44, in fetch
(nova.exception): TRACE: metadata = image_service.get(elevated, image_id, image_file)
(nova.exception): TRACE: File "/usr/lib/pymodules/python2.7/nova/image/glance.py", line 139, in get
(nova.exception): TRACE: image_meta, image_chunks = self.client.get_image(image_id)
(nova.exception): TRACE: File "/usr/lib/pymodules/python2.7/glance/client.py", line 98, in get_image
(nova.exception): TRACE: res = self.do_request("GET", "/images/%s" % image_id)
(nova.exception): TRACE: File "/usr/lib/pymodules/python2.7/glance/client.py", line 54, in do_request
(nova.exception): TRACE: headers, params)
(nova.exception): TRACE: File "/usr/lib/pymodules/python2.7/glance/common/client.py", line 148, in do_request
(nova.exception): TRACE: "server. Got error: %s" % e)
(nova.exception): TRACE: ClientConnectionError: Unable to connect to server. Got error: [Errno 111] ECONNREFUSED
(nova.exception): TRACE:
2011-07-11 08:53:38,037 ERROR nova.compute.manager [-] Instance '2' failed to spawn. Is virtualization enabled in the BIOS? Details: Unable to connect to server. Got error: [Errno 111] ECONNREFUSED

Looking in /var/lib/nova/instances/_base, I see 0000000# files that are zero size.
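
The ECONNREFUSED in the trace means a TCP connection was actively refused by whatever address the Glance client tried. As a rough way to see which host/port pairs accept connections from a given node, a probe along these lines might help. This is only a sketch: `can_connect` is a hypothetical helper, not part of Nova or Glance, and the trace itself does not say which address was tried.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper for illustration; it only tests reachability,
    not whether a Glance API is actually answering on that port.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Run on the compute node, `can_connect("172.16.1.13", 9292)` and `can_connect("127.0.0.1", 9292)` would show whether the Glance API port used later in this report is reachable on the master node and on the local box respectively.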

The nodes have dual NICs attached to a public 172.16.0.0/16 network and a private 10.0.0.0/8 network, and I was using http://dodeeric.louvrex.net/?p=225 as an install guide.

/etc/nova/nova.conf:

# RabbitMQ
--rabbit_host=172.16.1.13
# MySQL
--sql_connection=mysql://nova:nova@172.16.1.13/nova
# Networking
--network_manager=nova.network.manager.VlanManager
--vlan_interface=eth1
--public_interface=eth0
--network_host=172.16.1.13
--routing_source_ip=172.16.1.13
--fixed_range=10.0.0.0/8
--network_size=1024
--dhcpbridge_flagfile=/etc/nova/nova.conf
--dhcpbridge=/usr/bin/nova-dhcpbridge
# Virtualization
--libvirt_type=kvm
# Volumes
--iscsi_ip_prefix=172.16.1.13
--num_targets=100
# APIs
--auth_driver=nova.auth.dbdriver.DbDriver
--cc_host=172.16.1.13
--ec2_url=http://172.16.1.13:8773/services/Cloud
--s3_host=172.16.1.13
--s3_dmz=172.16.1.13
# Image service
--glance_host=172.16.1.13
--image_service=nova.image.glance.GlanceImageService
# Misc
--logdir=/var/log/nova
--state_path=/var/lib/nova
--lock_path=/var/lock/nova
--verbose
# VNC Console
--vnc_enabled=true
--vncproxy_url=http://172.16.1.13:6080
--vnc_console_proxy_url=http://172.16.1.13:6080

Jay Pipes (jaypipes) wrote :

Hi! Can you paste your glance configs, please? Also, what does this return:

curl http://172.16.1.13/

Cheers,
jay

P Spencer Davis (p-spencer-davis) wrote :

Curl from both nodes:

$># curl 172.16.1.13
<html><body><h1>It works!</h1>
<p>This is the default web page for this server.</p>
<p>The web server software is running but no content has been added, yet.</p>
</body></html>

from the compute only node:

 curl http://172.16.1.13:9191

{"images": [{"name": null, "container_format": "ami", "disk_format": "ami", "checksum": "f97a52be69e30a81f5d2c6953c3ae5da", "id": 4, "size": 1476395008}, {"name": null, "container_format": "aki", "disk_format": "aki", "checksum": "55c855957ae2ca4b476dabc26dbb1685", "id": 3, "size": 4592768}]}

curl http://172.16.1.13:9292

{"versions": [{"status": "CURRENT", "id": "v1.0", "links": [{"href": "http://0.0.0.0:9292/v1/", "rel": "self"}]}]}

I'm using the default glance config files, as per the guide I was using.

/etc/glance/glance-api.conf

[DEFAULT]
# Show more verbose log output (sets INFO log level output)
verbose = True

# Show debugging output in logs (sets DEBUG log level output)
debug = False

# Which backend store should Glance use by default is not specified
# in a request to add a new image to Glance? Default: 'file'
# Available choices are 'file', 'swift', and 's3'
default_store = file

# Address to bind the API server
bind_host = 0.0.0.0

# Port the bind the API server to
bind_port = 9292

# Address to find the registry server
registry_host = 0.0.0.0

# Port the registry server is listening on
registry_port = 9191

# Log to this file. Make sure you do not set the same log
# file for both the API and registry servers!
log_file = /var/log/glance/api.log

# ============ Filesystem Store Options ========================

# Directory that the Filesystem backend store
# writes image data to
filesystem_store_datadir = /var/lib/glance/images/

# ============ Swift Store Options =============================

# Address where the Swift authentication service lives
swift_store_auth_address = 127.0.0.1:8080/v1.0/

# User to authenticate against the Swift authentication service
swift_store_user = jdoe

# Auth key for the user authenticating against the
# Swift authentication service
swift_store_key = a86850deb2742ec3cb41518e26aa2d89

# Container within the account that the account should use
# for storing images in Swift
swift_store_container = glance

# Do we create the container if it does not exist?
swift_store_create_container_on_put = False

[pipeline:glance-api]
pipeline = versionnegotiation apiv1app

[pipeline:versions]
pipeline = versionsapp

[app:versionsapp]
paste.app_factory = glance.api.versions:app_factory

[app:apiv1app]
paste.app_factory = glance.api.v1:app_factory

[filter:versionnegotiation]
paste.filter_factory = glance.api.middleware.version_negotiation:filter_factory

/etc/glance/glance-registry.conf

[DEFAULT]
# Show more verbose log output (sets INFO log level output)
verbose = True

# Show debugging output in logs (sets DEBUG log level output)
debug = False

# Address to bind the registry server
bind_host = 0.0.0.0

# Port the bind the registry server to
bind_port = 9191

# Log to this file. Make sure you do not set the same log
# file for both the API and registry servers!
log_file = /var/log/glance/registry.log

#...


Jay Pipes (jaypipes) wrote :

What does:

curl http://172.16.1.13:9292/v1/images

produce?

Thanks,
jay

P Spencer Davis (p-spencer-davis) wrote :

from the compute-node:

 curl http://172.16.1.13:9292/v1/images
{"images": [{"name": null, "container_format": "ami", "disk_format": "ami", "checksum": "f97a52be69e30a81f5d2c6953c3ae5da", "id": 4, "size": 1476395008}, {"name": null, "container_format": "aki", "disk_format": "aki", "checksum": "55c855957ae2ca4b476dabc26dbb1685", "id": 3, "size": 4592768}]}

and from the master node:

curl http://172.16.1.13:9292/v1/images
{"images": [{"name": null, "container_format": "ami", "disk_format": "ami", "checksum": "f97a52be69e30a81f5d2c6953c3ae5da", "id": 4, "size": 1476395008}, {"name": null, "container_format": "aki", "disk_format": "aki", "checksum": "55c855957ae2ca4b476dabc26dbb1685", "id": 3, "size": 4592768}]}

Jay Pipes (jaypipes) wrote :

Looks like everything is working fine from the Glance side of things... I suspect that this might be a Nova issue. I've got someone installing on Windows that is running into a similar issue. I'll update this bug when I figure out some more...

P Spencer Davis (p-spencer-davis) wrote :

Alright, thanks for your time. I'll move it over to Nova Questions.

Jay Pipes (jaypipes) wrote : Re: [Bug 808990] Re: Creating zero sized instances on nova-compute node

No, I didn't mean to make you refile it :) Still looking for the answer, bear with me...

P Spencer Davis (p-spencer-davis) wrote :

From Nova questions:

On Mon, Jul 11, 2011 at 9:47 Vish Ishaya proposed the following answer:
The --glance_host and --glance_port flags were replaced with a single flag called --glance_api_servers. Try:

--glance_api_servers=172.16.1.13:9292

I changed nova.conf and now the instance image on the compute node is
larger, but still failing to run.
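
For anyone checking their own flagfile for the same problem, a rough sketch of a check is below. It assumes the `--name=value` layout of the nova.conf shown in the bug description and treats 9292 as the API port only because that is what Vish's suggestion and the glance-api.conf above use. Both function names are hypothetical, and this is plain text matching, not Nova's own flag parsing.

```python
def parse_flagfile(text):
    """Parse '--name=value' and bare '--name' lines into a dict.

    Blank lines, '#' comments and anything not starting with '--'
    are skipped. Bare flags such as '--verbose' map to True.
    """
    flags = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line.startswith("--"):
            continue
        name, sep, value = line[2:].partition("=")
        flags[name] = value if sep else True
    return flags


def glance_flag_warnings(flags, default_port=9292):
    """Warn when only the older glance_host/glance_port flags are set.

    default_port is an assumption taken from this thread, not a value
    read from Nova.
    """
    if "glance_api_servers" in flags:
        return []
    if "glance_host" not in flags:
        return []
    host = flags["glance_host"]
    port = flags.get("glance_port", default_port)
    return [
        "glance_host is set but glance_api_servers is not; "
        "consider --glance_api_servers=%s:%s" % (host, port)
    ]
```

Feeding it the contents of /etc/nova/nova.conf from the description would produce one warning suggesting `--glance_api_servers=172.16.1.13:9292`.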


P Spencer Davis (p-spencer-davis) wrote :

After changing the nova.conf file and deleting the contents of
/var/lib/nova/instances/ on the compute node, everything works.


Jay Pipes (jaypipes) wrote : Re: Creating zero sized instances on nova-compute node

OK, that makes sense. The images stored in /var/lib/nova/instances were zero-length because the Nova server was attempting to communicate with a non-existent Glance server locally on that box, and the image was unfortunately being stored with a zero length. When looking for an instance with an ID, the compute node searched for an image file, which it found, but the file had zero length. Clearing out the directory (which is the local cache) forced Nova to re-fetch the image from Glance.
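
Rather than clearing the whole cache, it may be enough to locate just the zero-length entries. The following is a minimal sketch of that idea; `find_empty_files` is a hypothetical helper, and whether removing only the empty files is sufficient for a given Nova version is an assumption, not something confirmed in this report.

```python
import os


def find_empty_files(cache_dir):
    """Return sorted paths of zero-length regular files under cache_dir."""
    empty = []
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            path = os.path.join(root, name)
            # Skip symlinks so only real cache files are reported.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            if os.path.getsize(path) == 0:
                empty.append(path)
    return sorted(empty)
```

Calling `find_empty_files("/var/lib/nova/instances")` on the compute node would list candidates such as the 0000000# files under _base; it only reports them and deletes nothing.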

Jay Pipes (jaypipes) wrote :

There is still a bug here, however. When a connection to Glance is refused (as in your original post), the Nova service writes an empty file to its cache, which is wrong.
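
One common way to avoid leaving an empty cache entry behind is to download into a temporary file and rename it into place only after the fetch succeeds. The sketch below illustrates that general pattern in modern Python; it is not Nova's code and not necessarily how this bug was or will be fixed. `fetch_to_cache` and its `fetch` callback are hypothetical names.

```python
import os
import tempfile


def fetch_to_cache(target, fetch):
    """Write the output of fetch(fileobj) to target.

    The data goes to a temporary file in the same directory first, so
    a failed fetch (for example a refused connection) leaves no file
    at target, empty or otherwise.
    """
    directory = os.path.dirname(target) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".partial-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            fetch(tmp)  # may raise; nothing has been written to target yet
        # Rename within one directory, so the cache entry appears all at once.
        os.replace(tmp_path, target)
    except BaseException:
        # Clean up the partial download, then let the error propagate.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

With this shape, a later cache lookup either finds a complete image or finds nothing and triggers a fresh fetch.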

I'll change the project to Nova and mark it confirmed.

summary: - Creating zero sized instances on nova-compute node
+ Failure to communicate with Glance server stores empty files in
+ /var/lib/nova/instances
affects: glance → nova
Changed in nova:
status: New → Confirmed