Machine Check Exception with Broadwell CPU

Bug #1509764 reported by brainsail
This bug affects 2 people
Affects | Status | Importance | Assigned to | Milestone
intel-microcode (Ubuntu) | Fix Released | Undecided | Tim Gardner |
Xenial | Fix Released | Undecided | Tim Gardner |

Bug Description

My system freezes and reboots 30 seconds later. This happens at random times, anywhere from shortly after boot to three hours after boot. My CPU is a Core i5 5675C. It happens with Ubuntu 15.04 and 15.10. I think it is the same problem discussed here: https://bugzilla.kernel.org/show_bug.cgi?id=103351
They have a microcode update as a fix there.
All Broadwell CPUs seem to be affected. It does not happen with Fedora 22.
There are two workarounds that help for some people:
1. boot with "processor.max_cstate=0 intel_idle.max_cstate=0 idle=poll"
2. Disable SpeedStep in the BIOS
Neither works for me.
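
For anyone trying the first workaround, here is a minimal Python sketch (not from the original report) that checks whether the three parameters actually reached the running kernel. It assumes the kernel command line is read from /proc/cmdline, as is usual on Linux; the names `missing_workaround_params` and `read_cmdline` are made up for illustration.

```python
from typing import List

# The three parameters quoted in workaround 1 above.
WORKAROUND_PARAMS = (
    "processor.max_cstate=0",
    "intel_idle.max_cstate=0",
    "idle=poll",
)


def missing_workaround_params(cmdline: str) -> List[str]:
    """Return the workaround parameters that are absent from cmdline."""
    present = set(cmdline.split())
    return [param for param in WORKAROUND_PARAMS if param not in present]


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line the running kernel was booted with."""
    with open(path) as handle:
        return handle.read()
```

Calling `missing_workaround_params(read_cmdline())` on the affected machine returns an empty list only if all three parameters were passed at boot.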

brainsail (robert-voigt)
description: updated
brainsail (robert-voigt) wrote :

I expected this to gain more attention. The current version of Ubuntu is unusable on all Broadwell CPUs. A fix is available in this repository:
https://github.com/bgw/bdw-ucode-update-tool.git
They talk about it in the kernel bug I linked to above. The fix works for people there and for me. Why don't you include it in linux-firmware if the license allows, or get permission if it does not?

Tim Gardner (timg-tpi) wrote :

Does installing intel-microcode fix the issue?

brainsail (robert-voigt) wrote :

intel-microcode was installed (by default?). It does not contain the fix. I wasn't aware of this package. It should be the one this bug is against.
That git repository above installs /lib/firmware/intel-ucode/06-47-01.initramfs, which fixed it for me.
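
The file name 06-47-01 follows the family-model-stepping naming (in hex) used under /lib/firmware/intel-ucode. As a rough sketch, not taken from the repository above, this is how a CPUID signature maps to such a name. It assumes the standard Intel encoding of the signature bits; the function names are illustrative only.

```python
from typing import Tuple


def decode_signature(sig: int) -> Tuple[int, int, int]:
    """Split a CPUID signature into (family, model, stepping)."""
    stepping = sig & 0xF
    base_model = (sig >> 4) & 0xF
    base_family = (sig >> 8) & 0xF
    ext_model = (sig >> 16) & 0xF
    ext_family = (sig >> 20) & 0xFF

    # The extended family is only added when the base family is 0xF.
    family = base_family + ext_family if base_family == 0xF else base_family
    # The extended model only counts for base families 0x6 and 0xF.
    if base_family in (0x6, 0xF):
        model = (ext_model << 4) | base_model
    else:
        model = base_model
    return family, model, stepping


def microcode_filename(sig: int) -> str:
    """Build the family-model-stepping name used in intel-ucode/."""
    family, model, stepping = decode_signature(sig)
    return f"{family:02x}-{model:02x}-{stepping:02x}"


print(microcode_filename(0x40671))  # prints 06-47-01
```

Signature 0x40671 is the one that decodes to 06-47-01, so that is presumably the signature of the CPU in this report.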

brainsail (robert-voigt) wrote :

intel-microcode contains /lib/firmware/intel-ucode/*, which needs to be updated
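
To tell whether an updated file was actually loaded, the running revision can be compared against the one you expect. A minimal sketch, assuming the "microcode" field that x86 Linux prints in /proc/cpuinfo; the helper names and the sample text are hypothetical, and the minimum revision must be supplied by the caller.

```python
from typing import Set


def microcode_revisions(cpuinfo_text: str) -> Set[int]:
    """Collect the distinct 'microcode' revisions listed in cpuinfo text."""
    revisions = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "microcode":
            # The value is printed in hex, e.g. "0x13".
            revisions.add(int(value.strip(), 16))
    return revisions


def needs_update(cpuinfo_text: str, minimum_revision: int) -> bool:
    """True if any core reports a revision below minimum_revision."""
    return any(rev < minimum_revision
               for rev in microcode_revisions(cpuinfo_text))


# Hypothetical excerpt in the layout of /proc/cpuinfo.
SAMPLE_CPUINFO = "processor\t: 0\nmicrocode\t: 0xe\nprocessor\t: 1\nmicrocode\t: 0xe\n"
print(needs_update(SAMPLE_CPUINFO, 0x13))  # prints True
```

On a real system the text would come from `open("/proc/cpuinfo").read()`.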

affects: linux-firmware (Ubuntu) → intel-microcode (Ubuntu)
Henrique de Moraes Holschuh (hmh) wrote : Re: [Bug 1509764] Re: Machine Check Exception with Broadwell CPU

On Thu, Nov 5, 2015, at 14:46, brainsail wrote:
> intel-microcode contains /lib/firmware/intel-ucode/*, which needs to be
> updated

Get the Canonical Intel platform team on this, please.

This is not a simple matter of updating a package: upstream DID NOT
release the required microcode updates yet on the usual Linux microcode
update channel, and there is no public information about the microcode
updates.

OTOH, apparently some microcode updates also require changes to the
BIOS. Without some help from @intel, (or, supposedly, access to NDA'ed
developer channel information), we don't know which ones have this sort
of dependency.

Hopefully, the Canonical platform team has contacts @intel to find out
which Haswell, Broadwell and Skylake microcode update revisions are safe
for the operating system to update to, and why Intel has not issued
Linux microcode updates since January 21st, 2015...

--
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique de Moraes Holschuh <email address hidden>

Keve Gabbert (keve-a-gabbert) wrote :

An updated microcode file will be released shortly (probably Monday) and will include an update for Broadwell (BDW).

Tim Gardner (timg-tpi) wrote :

Henrique - please let me know if/when you incorporate this update in your upstream packaging so I can sync to Ubuntu.

Changed in intel-microcode (Ubuntu Xenial):
assignee: nobody → Tim Gardner (timg-tpi)
status: New → In Progress
Henrique de Moraes Holschuh (hmh) wrote :

I am working on the updated package right now, and should upload it to Debian unstable in a few hours at most.

Henrique de Moraes Holschuh (hmh) wrote :

On Mon, 09 Nov 2015, Henrique de Moraes Holschuh wrote:
> I am working on the updated package right now, and should upload it to
> Debian unstable in a few hours at most.

The upload has been accepted into non-free unstable (Debian); it will be
pushed to the public Debian mirrors at the next dinstall pulse.


Launchpad Janitor (janitor) wrote :

This bug was fixed in the package intel-microcode - 3.20151106.1

---------------
intel-microcode (3.20151106.1) unstable; urgency=medium

  * New upstream microcode data file 20151106
    + New Microcodes:
      sig 0x000306f4, pf mask 0x80, 2015-07-17, rev 0x0009, size 14336
      sig 0x00040671, pf mask 0x22, 2015-08-03, rev 0x0013, size 11264
    + Updated Microcodes:
      sig 0x000306a9, pf mask 0x12, 2015-02-26, rev 0x001c, size 12288
      sig 0x000306c3, pf mask 0x32, 2015-08-13, rev 0x001e, size 21504
      sig 0x000306d4, pf mask 0xc0, 2015-09-11, rev 0x0022, size 16384
      sig 0x000306f2, pf mask 0x6f, 2015-08-10, rev 0x0036, size 30720
      sig 0x00040651, pf mask 0x72, 2015-08-13, rev 0x001d, size 20480
    * This massive Haswell + Broadwell (and related Xeons) update fixes
      several critical errata, including the high-hitting BDD86/BDM101/
      HSM153(?) which triggers an MCE and locks the processor core
      (LP: #1509764)
    * Might fix critical errata BDD51, BDM53 (TSX-related)
  * source: remove superseded upstream data file: 20150121
  * Add support for supplementary microcode bundles:
    + README.source: update and mention supplementary microcode
    + Makefile: support supplementary microcode
      Add support for supplementary microcode bundles, which (unlike .fw
      microcode override files) can be superseded by a higher revision
      microcode from the latest regular microcode bundle. Also, fix the
      "oldies" target to have its own exclude filter (IUC_OLDIES_EXCLUDE)
  * Add support for x32 arch:
    + README.source: mention x32
    + control,rules: enable building on x32 arch (Closes: #777356)
  * ucode-blacklist: add Broadwell and Haswell-E signatures
    Add a missing signature for Haswell Refresh (Haswell-E) to the "must
    be updated only by the early microcode update driver" list. There is
    at least one report of one of the Broadwell microcode updates disabling
    TSX-NI, so add them as well just in case

 -- Henrique de Moraes Holschuh <email address hidden> Mon, 09 Nov 2015 23:07:32 -0200
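
The "sig ..., pf mask ..., rev ..." lines in the changelog above are regular enough to parse. A small Python sketch, not part of the package, that extracts them and looks up the revision shipped for a given signature; the regular expression and the `MicrocodeEntry` type are assumptions based only on the line layout shown here.

```python
import re
from typing import Dict, NamedTuple


class MicrocodeEntry(NamedTuple):
    signature: int
    pf_mask: int
    date: str
    revision: int
    size: int


# Matches lines such as:
#   sig 0x00040671, pf mask 0x22, 2015-08-03, rev 0x0013, size 11264
ENTRY_RE = re.compile(
    r"sig (0x[0-9a-fA-F]+), pf mask (0x[0-9a-fA-F]+), "
    r"(\d{4}-\d{2}-\d{2}), rev (0x[0-9a-fA-F]+), size (\d+)"
)


def parse_entries(changelog: str) -> Dict[int, MicrocodeEntry]:
    """Map each CPU signature in the changelog text to its entry."""
    entries = {}
    for match in ENTRY_RE.finditer(changelog):
        sig, mask, date, rev, size = match.groups()
        entry = MicrocodeEntry(int(sig, 16), int(mask, 16), date,
                               int(rev, 16), int(size))
        entries[entry.signature] = entry
    return entries


# Two lines copied from the changelog above.
SAMPLE = """\
      sig 0x000306d4, pf mask 0xc0, 2015-09-11, rev 0x0022, size 16384
      sig 0x00040671, pf mask 0x22, 2015-08-03, rev 0x0013, size 11264
"""

entries = parse_entries(SAMPLE)
print(hex(entries[0x40671].revision))  # prints 0x13
```

Signature 0x40671 corresponds to the 06-47-01 file mentioned earlier in this report, so revision 0x13 is the one to look for on that CPU after the update.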

Changed in intel-microcode (Ubuntu Xenial):
status: In Progress → Fix Released
brainsail (robert-voigt) wrote :

Is there a way to update the release media soon? The installation does not finish before the MCE hits. A workaround is to put a Haswell CPU in the socket temporarily, or to install the hard drive in another machine.
