I have a rather annoying problem. My PC is rebooting a lot, without giving any BSODs. This happens when I'm moving large files, when playing certain games and sometimes when streaming videos (although this could be due to other reasons and just coincidence). There are no reboots when I leave the computer alone, and I can surf the web, watch videos on my HD, use Photoshop etc.

I recently replaced my power supply and RAM because the computer would not boot at all. I had these reboot problems before that, but they were rather rare. For a week after replacing the parts the reboots were also very rare, but now I can reproduce them very easily. For example, Borderlands now reboots the PC after a few seconds in the main menu. At first it would reboot about an hour into the game, then the time got shorter and shorter, and now I can't play it at all. I've monitored the temperatures of various components but can't find anything overheating. The GPU goes to 60C+, but that's hardly fatal. I can, however, play less hardware-intensive games.

I ran Prime95's in-place large FFTs test and I immediately got a BSOD, but even though I had "automatically restart" on BSODs unchecked, it still rebooted, so I don't know what the BSOD was about.

Any guesses on what could be causing it?

So:
- Computer reboots when moving large files.
- Computer reboots when playing hardware-intensive games.
- Reboots might also be linked to video streaming.
- BSOD when running Prime95.
- Rebooting has become a lot more frequent over time (think of a curve with time on x-axis and reboots on y-axis).

- Recently replaced power supply and RAM, but problems existed before this.
- Cannot see anything unusual in hardware temperatures.

Have you checked for resource conflicts? It seems that if you are banging the hardware a lot and get re-boots, that might be a place to start.

Can you let us know what your hardware set up is?
How about your OS?

Hey Kingston,
That's a tough problem alright & I sympathise.

Here's what I would do:
Next time it reboots, write down the time & go into the event viewer.
Look at all the error events that occur before and after the reboot time.
If there are no critical error events then you might be looking at a hardware problem.
CPU
Memory
Hard Drive
Mobo
Try the following software to test all your hardware.
http://www.ultimatebootcd.com/
If it fails or freezes/reboots during testing then you know the problem is almost certainly hardware.
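
The event-log check described above can be sketched in code. This is only a rough illustration, assuming the events have already been exported from the event viewer and parsed into dictionaries. The field names (`time`, `level`, `source`, `id`), the sample events, and the reboot time are all made up for the example and are not taken from this machine.

```python
from datetime import datetime, timedelta


def events_near(events, reboot_time, window_minutes=10, levels=("Error", "Warning")):
    """Return events of the given levels within +/- window_minutes of reboot_time, oldest first."""
    window = timedelta(minutes=window_minutes)
    nearby = [
        e for e in events
        if e["level"] in levels and abs(e["time"] - reboot_time) <= window
    ]
    return sorted(nearby, key=lambda e: e["time"])


# Hypothetical sample data, invented purely for illustration.
sample_events = [
    {"time": datetime(2009, 11, 2, 18, 5), "level": "Error", "source": "ExampleService", "id": 7000},
    {"time": datetime(2009, 11, 2, 19, 12), "level": "Information", "source": "ExampleApp", "id": 10},
    {"time": datetime(2009, 11, 2, 19, 20), "level": "Error", "source": "ExampleDriver", "id": 1001},
    {"time": datetime(2009, 11, 2, 19, 15), "level": "Warning", "source": "ExampleDisk", "id": 51},
]

# Hypothetical reboot time, as written down by hand after the crash.
reboot_time = datetime(2009, 11, 2, 19, 17)

for event in events_near(sample_events, reboot_time):
    print(event["time"], event["level"], event["source"], event["id"])
```

If nothing at Error or Warning level shows up inside the window, that fits the "probably hardware" reading suggested above, although it does not prove it.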

I'm convinced that video streaming causes reboots. It's happened too many times now for it to be pure coincidence.

I also got a bluescreen twice. One of them stayed up long enough for me to read the text. The cause was a machine check exception. This hopefully narrows the problem down somewhat (it's a rare occasion when you're happy about getting a BSOD!)

http://en.wikipedia.org/wiki/Machine_Check_Exception

Normal causes for MCE errors include overheating and/or incorrect hardware installation. Some specific manually induced causes could include:
Overclocking (naturally increases heat output)
Poorly fitted heatsink/computer fans (the same problem can happen with excessive dust in the CPU fan)
An overloaded internal or external power supply, which can be fixed by upgrading.

I haven't ever overclocked anything and I recently got a new and more powerful PSU, so I'll give re-seating the CPU heatsink a go. Seems unlikely that it's the cause though, as the CPU temps are always normal.

Windows also linked me to this page. Sounds strange as I've replaced RAM recently and my motherboard does support my CPU. Perhaps I should update my BIOS to be sure. There isn't any way to ensure that the computer won't reboot during the updating process, though, so I'll only do this if I have nothing else to go on.

Thanks for the reply. I checked device manager, but it doesn't show any conflicts.

How silly of me to forget to mention my hardware. I'm running XP SP3.

MSI P35 Neo
Intel C2D E6750
Nvidia 8800GTS 640 MB
3 x 1 GB Kingston ValueRAM 6400 CL5
Corsair HX620

I have a sound card but I removed it and switched to the on-board one until this problem is fixed.

Thanks for the mental support!

Event viewer doesn't show anything related to the time of rebooting. There is a Service Control Manager 7000 error regarding a service called SSPORT that pops up now and again. Apparently it's a Samsung Universal Print driver. I'll disable that service as I don't have the printer anymore.

Thanks for the link, I'll give the ultimate boot cd a try if re-seating the CPU heatsink doesn't help.

Re-seated CPU heatsink - still crashes.

Ran Memtest86+ off UBCD - no errors after 3 hours and 5 passes. I'll run it again overnight to be sure.

Ran Mersenne Prime test off UBCD - crashed after a few minutes.

BIOS had set the voltage of memory to 1.9, when the recommended value is 1.8. It also had the clock speeds a bit higher. I changed that, but the computer still crashes.

Haven't checked hard disk yet.

I'm using Prime95's blend test (in Windows) as a benchmark. So far, nothing has changed. It still crashes within a few seconds of starting.

I've only got one file in Windows' minidump folder. Here are the debugged contents:

Microsoft (R) Windows Debugger Version 6.11.0001.404 X86
Copyright (c) Microsoft Corporation. All rights reserved.


Loading Dump File [C:\Documents and Settings\Elias\Työpöytä\Mini110209-01.dmp]
Mini Kernel Dump File: Only registers and stack trace are available

Symbol search path is: SRV*c:\symbols*http://msdl.microsoft.com/download/symbols;[+] srv*DownstreamStore*http://msdl.microsoft.com/download/symbols
Executable search path is: 
Windows XP Kernel Version 2600 (Service Pack 3) MP (2 procs) Free x86 compatible
Product: WinNt, suite: TerminalServer SingleUserTS Personal
Built by: 2600.xpsp_sp3_gdr.090804-1435
Machine Name:
Kernel base = 0x804d7000 PsLoadedModuleList = 0x805634c0
Debug session time: Mon Nov  2 19:17:05.343 2009 (GMT+2)
System Uptime: 0 days 2:50:30.082
Loading Kernel Symbols
...............................................................
................................................................
........
Loading User Symbols
Loading unloaded module list
..................
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck 9C, {5, b8344050, b2000018, 2000e0f}

Unable to load image RtkHDAud.sys, Win32 error 0n2
*** WARNING: Unable to verify timestamp for RtkHDAud.sys
*** ERROR: Module load completed but symbols could not be loaded for RtkHDAud.sys
Probably caused by : RtkHDAud.sys ( RtkHDAud+9f13a )

Followup: MachineOwner
---------

1: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

MACHINE_CHECK_EXCEPTION (9c)
A fatal Machine Check Exception has occurred.
KeBugCheckEx parameters;
    x86 Processors
        If the processor has ONLY MCE feature available (For example Intel
        Pentium), the parameters are:
        1 - Low  32 bits of P5_MC_TYPE MSR
        2 - Address of MCA_EXCEPTION structure
        3 - High 32 bits of P5_MC_ADDR MSR
        4 - Low  32 bits of P5_MC_ADDR MSR
        If the processor also has MCA feature available (For example Intel
        Pentium Pro), the parameters are:
        1 - Bank number
        2 - Address of MCA_EXCEPTION structure
        3 - High 32 bits of MCi_STATUS MSR for the MCA bank that had the error
        4 - Low  32 bits of MCi_STATUS MSR for the MCA bank that had the error
    IA64 Processors
        1 - Bugcheck Type
            1 - MCA_ASSERT
            2 - MCA_GET_STATEINFO
                SAL returned an error for SAL_GET_STATEINFO while processing MCA.
            3 - MCA_CLEAR_STATEINFO
                SAL returned an error for SAL_CLEAR_STATEINFO while processing MCA.
            4 - MCA_FATAL
                FW reported a fatal MCA.
            5 - MCA_NONFATAL
                SAL reported a recoverable MCA and we don't support currently
                support recovery or SAL generated an MCA and then couldn't
                produce an error record.
            0xB - INIT_ASSERT
            0xC - INIT_GET_STATEINFO
                  SAL returned an error for SAL_GET_STATEINFO while processing INIT event.
            0xD - INIT_CLEAR_STATEINFO
                  SAL returned an error for SAL_CLEAR_STATEINFO while processing INIT event.
            0xE - INIT_FATAL
                  Not used.
        2 - Address of log
        3 - Size of log
        4 - Error code in the case of x_GET_STATEINFO or x_CLEAR_STATEINFO
    AMD64 Processors
        1 - Bank number
        2 - Address of MCA_EXCEPTION structure
        3 - High 32 bits of MCi_STATUS MSR for the MCA bank that had the error
        4 - Low  32 bits of MCi_STATUS MSR for the MCA bank that had the error
Arguments:
Arg1: 00000005
Arg2: b8344050
Arg3: b2000018
Arg4: 02000e0f

Debugging Details:
------------------

   NOTE:  This is a hardware error.  This error was reported by the CPU
   via Interrupt 18.  This analysis will provide more information about
   the specific error.  Please contact the manufacturer for additional
   information about this error and troubleshooting assistance.

   This error is documented in the following publication:

      - IA-32 Intel(r) Architecture Software Developer's Manual 
        Volume 3: System Programming Guide

   Bit Mask:

       MA                           Model Specific       MCA
    O  ID      Other Information      Error Code     Error Code
   VV  SDP ___________|____________ _______|_______ _______|______
   AEUECRC|                        |               |              |
   LRCNVVC|                        |               |              |
   ^^^^^^^|                        |               |              |
      6         5         4         3         2         1
   3210987654321098765432109876543210987654321098765432109876543210
   ----------------------------------------------------------------
   1011001000000000000000000001100000000010000000000000111000001111


VAL   - MCi_STATUS register is valid
        Indicates that the information contained within the IA32_MCi_STATUS
        register is valid.  When this flag is set, the processor follows the
        rules given for the OVER flag in the IA32_MCi_STATUS register when
        overwriting previously valid entries.  The processor sets the VAL 
        flag and software is responsible for clearing it.

UC    - Error Uncorrected
        Indicates that the processor did not or was not able to correct the 
        error condition.  When clear, this flag indicates that the processor
        was able to correct the error condition.

EN    - Error Enabled
        Indicates that the error was enabled by the associated EEj bit of the
        IA32_MCi_CTL register.

PCC   - Processor Context Corrupt
        Indicates that the state of the processor might have been corrupted
        by the error condition detected and that reliable restarting of the
        processor may not be possible.

BUSCONNERR - Bus and Interconnect Error   BUS{LL}_{PP}_{RRRR}_{II}_{T}_err
        These errors match the format 0000 1PPT RRRR IILL



   Concatenated Error Code:
   --------------------------
   _VAL_UC_EN_PCC_BUSCONNERR_20F

   This error code can be reported back to the manufacturer.
   They may be able to provide additional information based upon
   this error.  All questions regarding STOP 0x9C should be
   directed to the hardware manufacturer.

BUGCHECK_STR:  0x9C_GenuineIntel

CUSTOMER_CRASH_COUNT:  1

DEFAULT_BUCKET_ID:  INTEL_CPU_MICROCODE_ZERO

PROCESS_NAME:  Idle

LAST_CONTROL_TRANSFER:  from 80705bfb to 8053767a

STACK_TEXT:  
b8344028 80705bfb 0000009c 00000005 b8344050 nt!KeBugCheckEx+0x1b
b8344154 80700c52 b8340d70 00000000 00000000 hal!HalpMcaExceptionHandler+0xdd
b8344154 b33e413a b8340d70 00000000 00000000 hal!HalpMcaExceptionHandlerWrapper+0x4a
WARNING: Stack unwind information not available. Following frames may be wrong.
b84cf90c 00000000 00000000 00000000 000068bd RtkHDAud+0x9f13a


STACK_COMMAND:  kb

FOLLOWUP_IP: 
RtkHDAud+9f13a
b33e413a ??              ???

SYMBOL_STACK_INDEX:  3

SYMBOL_NAME:  RtkHDAud+9f13a

FOLLOWUP_NAME:  MachineOwner

MODULE_NAME: RtkHDAud

IMAGE_NAME:  RtkHDAud.sys

DEBUG_FLR_IMAGE_TIMESTAMP:  4acb21d2

FAILURE_BUCKET_ID:  0x9C_GenuineIntel_RtkHDAud+9f13a

BUCKET_ID:  0x9C_GenuineIntel_RtkHDAud+9f13a

Followup: MachineOwner
---------

I ran the HD diagnostic utility from UBCD on my hard drive. It found several errors.

C:539 H:1 S:1615 Error: Media error detected
C:1991 H:5 S: 634 Error: ECC error
C:1991 H:5 S: 638 Error: ECC error
C:1991 H:5 S: 639 Error: ECC error

Eureka? Could this be the cause of all my troubles? I'll get a new drive today and see if it fixes everything. Fingers crossed.

Way to go - hope that is it! Let us know

FUUUUU-

I replaced the hard drive. It crashed during Windows install. I tried again and I got it installed. However, the crashing is now even more frequent. I can be in Windows for about two minutes before it crashes.

I was able to save a minidump.

Microsoft (R) Windows Debugger Version 6.11.0001.404 X86
Copyright (c) Microsoft Corporation. All rights reserved.


Loading Dump File [L:\Mini110909-01.dmp]
Mini Kernel Dump File: Only registers and stack trace are available

Symbol search path is: SRV*c:\symbols*http://msdl.microsoft.com/download/symbols
Executable search path is: 
Windows XP Kernel Version 2600 (Service Pack 2) MP (2 procs) Free x86 compatible
Product: WinNt
Built by: 2600.xpsp_sp2_rtm.040803-2158
Machine Name:
Kernel base = 0x804d7000 PsLoadedModuleList = 0x805644a0
Debug session time: Mon Nov  9 22:26:37.890 2009 (GMT+2)
System Uptime: 0 days 0:01:33.640
Loading Kernel Symbols
...............................................................
.............................................
Loading User Symbols
Loading unloaded module list
...
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck 9C, {5, f7723050, b2000018, 2000e0f}

Probably caused by : Unknown_Image ( ANALYSIS_INCONCLUSIVE )

Followup: MachineOwner
---------

1: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

MACHINE_CHECK_EXCEPTION (9c)
A fatal Machine Check Exception has occurred.
KeBugCheckEx parameters;
    x86 Processors
        If the processor has ONLY MCE feature available (For example Intel
        Pentium), the parameters are:
        1 - Low  32 bits of P5_MC_TYPE MSR
        2 - Address of MCA_EXCEPTION structure
        3 - High 32 bits of P5_MC_ADDR MSR
        4 - Low  32 bits of P5_MC_ADDR MSR
        If the processor also has MCA feature available (For example Intel
        Pentium Pro), the parameters are:
        1 - Bank number
        2 - Address of MCA_EXCEPTION structure
        3 - High 32 bits of MCi_STATUS MSR for the MCA bank that had the error
        4 - Low  32 bits of MCi_STATUS MSR for the MCA bank that had the error
    IA64 Processors
        1 - Bugcheck Type
            1 - MCA_ASSERT
            2 - MCA_GET_STATEINFO
                SAL returned an error for SAL_GET_STATEINFO while processing MCA.
            3 - MCA_CLEAR_STATEINFO
                SAL returned an error for SAL_CLEAR_STATEINFO while processing MCA.
            4 - MCA_FATAL
                FW reported a fatal MCA.
            5 - MCA_NONFATAL
                SAL reported a recoverable MCA and we don't support currently
                support recovery or SAL generated an MCA and then couldn't
                produce an error record.
            0xB - INIT_ASSERT
            0xC - INIT_GET_STATEINFO
                  SAL returned an error for SAL_GET_STATEINFO while processing INIT event.
            0xD - INIT_CLEAR_STATEINFO
                  SAL returned an error for SAL_CLEAR_STATEINFO while processing INIT event.
            0xE - INIT_FATAL
                  Not used.
        2 - Address of log
        3 - Size of log
        4 - Error code in the case of x_GET_STATEINFO or x_CLEAR_STATEINFO
    AMD64 Processors
        1 - Bank number
        2 - Address of MCA_EXCEPTION structure
        3 - High 32 bits of MCi_STATUS MSR for the MCA bank that had the error
        4 - Low  32 bits of MCi_STATUS MSR for the MCA bank that had the error
Arguments:
Arg1: 00000005
Arg2: f7723050
Arg3: b2000018
Arg4: 02000e0f

Debugging Details:
------------------

   NOTE:  This is a hardware error.  This error was reported by the CPU
   via Interrupt 18.  This analysis will provide more information about
   the specific error.  Please contact the manufacturer for additional
   information about this error and troubleshooting assistance.

   This error is documented in the following publication:

      - IA-32 Intel(r) Architecture Software Developer's Manual 
        Volume 3: System Programming Guide

   Bit Mask:

       MA                           Model Specific       MCA
    O  ID      Other Information      Error Code     Error Code
   VV  SDP ___________|____________ _______|_______ _______|______
   AEUECRC|                        |               |              |
   LRCNVVC|                        |               |              |
   ^^^^^^^|                        |               |              |
      6         5         4         3         2         1
   3210987654321098765432109876543210987654321098765432109876543210
   ----------------------------------------------------------------
   1011001000000000000000000001100000000010000000000000111000001111


VAL   - MCi_STATUS register is valid
        Indicates that the information contained within the IA32_MCi_STATUS
        register is valid.  When this flag is set, the processor follows the
        rules given for the OVER flag in the IA32_MCi_STATUS register when
        overwriting previously valid entries.  The processor sets the VAL 
        flag and software is responsible for clearing it.

UC    - Error Uncorrected
        Indicates that the processor did not or was not able to correct the 
        error condition.  When clear, this flag indicates that the processor
        was able to correct the error condition.

EN    - Error Enabled
        Indicates that the error was enabled by the associated EEj bit of the
        IA32_MCi_CTL register.

PCC   - Processor Context Corrupt
        Indicates that the state of the processor might have been corrupted
        by the error condition detected and that reliable restarting of the
        processor may not be possible.

BUSCONNERR - Bus and Interconnect Error   BUS{LL}_{PP}_{RRRR}_{II}_{T}_err
        These errors match the format 0000 1PPT RRRR IILL



   Concatenated Error Code:
   --------------------------
   _VAL_UC_EN_PCC_BUSCONNERR_20F

   This error code can be reported back to the manufacturer.
   They may be able to provide additional information based upon
   this error.  All questions regarding STOP 0x9C should be
   directed to the hardware manufacturer.

BUGCHECK_STR:  0x9C_GenuineIntel

CUSTOMER_CRASH_COUNT:  1

DEFAULT_BUCKET_ID:  INTEL_CPU_MICROCODE_ZERO

LAST_CONTROL_TRANSFER:  from 80707bff to 80537832

STACK_TEXT:  
f7723028 80707bff 0000009c 00000005 f7723050 nt!KeBugCheckEx+0x1b
f7723154 80702c52 f771fd70 00000000 00000000 hal!HalpMcaExceptionHandler+0xdd
f7723154 00000000 f771fd70 00000000 00000000 hal!HalpMcaExceptionHandlerWrapper+0x4a


STACK_COMMAND:  kb

SYMBOL_NAME:  ANALYSIS_INCONCLUSIVE

FOLLOWUP_NAME:  MachineOwner

MODULE_NAME: Unknown_Module

IMAGE_NAME:  Unknown_Image

DEBUG_FLR_IMAGE_TIMESTAMP:  0

FAILURE_BUCKET_ID:  0x9C_GenuineIntel_ANALYSIS_INCONCLUSIVE

BUCKET_ID:  0x9C_GenuineIntel_ANALYSIS_INCONCLUSIVE

Followup: MachineOwner
---------

This crashlog and the one posted earlier have something in common:

BUSCONNERR - Bus and Interconnect Error   BUS{LL}_{PP}_{RRRR}_{II}_{T}_err
        These errors match the format 0000 1PPT RRRR IILL
Concatenated Error Code:
   --------------------------
   _VAL_UC_EN_PCC_BUSCONNERR_20F

   This error code can be reported back to the manufacturer.
   They may be able to provide additional information based upon
   this error.  All questions regarding STOP 0x9C should be
   directed to the hardware manufacturer.
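
To make the comparison between the two crash logs concrete, here is a small sketch that parses the `BugCheck` summary lines from both dumps and reports which arguments are identical. The helper name `parse_bugcheck` is my own; the two input lines are copied from the debugger output posted in this thread.

```python
import re

# Matches lines such as: BugCheck 9C, {5, b8344050, b2000018, 2000e0f}
BUGCHECK_RE = re.compile(r"BugCheck\s+([0-9A-Fa-f]+),\s*\{([^}]*)\}")


def parse_bugcheck(line):
    """Return (stop_code, (arg1, arg2, arg3, arg4)) as integers, or None if no match."""
    match = BUGCHECK_RE.search(line)
    if match is None:
        return None
    code = int(match.group(1), 16)
    args = tuple(int(a.strip(), 16) for a in match.group(2).split(","))
    return code, args


lines = [
    "BugCheck 9C, {5, b8344050, b2000018, 2000e0f}",  # Mini110209-01.dmp
    "BugCheck 9C, {5, f7723050, b2000018, 2000e0f}",  # Mini110909-01.dmp
]
parsed = [parse_bugcheck(line) for line in lines]

# 1-based argument positions whose value is the same in every dump.
same_args = [i + 1 for i in range(4) if len({p[1][i] for p in parsed}) == 1]
print("identical arguments:", same_args)
```

Only argument 2 differs. According to the parameter description in the dump, that is the address of the MCA_EXCEPTION structure. The bank number and both halves of the MCi_STATUS value are the same in both crashes.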

Any help in uncovering what all this means?
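
As a partial answer, the status value can be decoded by hand. The sketch below joins Arg3 and Arg4 into the 64-bit MCi_STATUS value and splits the low 16 bits using the `0000 1PPT RRRR IILL` format quoted in the dump. The flag bit positions follow the bit mask in the debugger output. The mnemonic tables (`PP`, `RRRR`, `II`, `LL`) are written from my reading of the machine-check chapter of the Intel manual the debugger cites, so treat them as an assumption and check them against the manual. The function name `decode_status` is my own.

```python
# Flag bits in the top of IA32_MCi_STATUS, as laid out in the debugger's bit mask.
FLAGS = {"VAL": 63, "OVER": 62, "UC": 61, "EN": 60, "MISCV": 59, "ADDRV": 58, "PCC": 57}

# Sub-field mnemonics for bus/interconnect errors (assumed, verify against the Intel manual).
PP = {0: "SRC", 1: "RES", 2: "OBS", 3: "generic"}
RRRR = {0: "ERR", 1: "RD", 2: "WR", 3: "DRD", 4: "DWR",
        5: "IRD", 6: "PREFETCH", 7: "EVICT", 8: "SNOOP"}
II = {0: "M", 1: "reserved", 2: "IO", 3: "other"}
LL = {0: "L0", 1: "L1", 2: "L2", 3: "LG"}


def decode_status(high, low):
    """Decode MCi_STATUS given its high and low 32-bit halves (Arg3 and Arg4)."""
    status = (high << 32) | low
    mca_code = status & 0xFFFF
    result = {
        "flags": [name for name, bit in FLAGS.items() if (status >> bit) & 1],
        "mca_code": mca_code,
        "model_specific_code": (status >> 16) & 0xFFFF,
    }
    # Bus/interconnect errors have the form 0000 1PPT RRRR IILL.
    if mca_code >> 11 == 0b00001:
        result["bus"] = {
            "PP": PP[(mca_code >> 9) & 0b11],
            "T": "TIMEOUT" if (mca_code >> 8) & 1 else "NOTIMEOUT",
            "RRRR": RRRR.get((mca_code >> 4) & 0xF, "unknown"),
            "II": II[(mca_code >> 2) & 0b11],
            "LL": LL[mca_code & 0b11],
        }
    return result


decoded = decode_status(0xB2000018, 0x02000E0F)
print(decoded["flags"])
print(decoded["bus"])
```

The flags come out as VAL, UC, EN and PCC, which matches the debugger's `_VAL_UC_EN_PCC` string. Every sub-field of the bus error decodes to its generic value, so the code itself confirms an uncorrected bus/interconnect error but does not say which component is at fault.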

Another minidump

Microsoft (R) Windows Debugger Version 6.11.0001.404 X86
Copyright (c) Microsoft Corporation. All rights reserved.


Loading Dump File [L:\Mini111009-01.dmp]
Mini Kernel Dump File: Only registers and stack trace are available

Symbol search path is: SRV*c:\symbols*http://msdl.microsoft.com/download/symbols
Executable search path is: 
Windows XP Kernel Version 2600 (Service Pack 2) MP (2 procs) Free x86 compatible
Product: WinNt, suite: TerminalServer SingleUserTS Personal
Built by: 2600.xpsp_sp2_rtm.040803-2158
Machine Name:
Kernel base = 0x804d7000 PsLoadedModuleList = 0x805644a0
Debug session time: Tue Nov 10 15:21:23.953 2009 (GMT+2)
System Uptime: 0 days 0:03:33.562
Loading Kernel Symbols
...............................................................
...................................................
Loading User Symbols
Loading unloaded module list
...........
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck 9C, {5, f7723050, b2000018, 2000e0f}

Probably caused by : intelppm.sys ( intelppm!AcpiC1Idle+12 )

Followup: MachineOwner
---------

1: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

MACHINE_CHECK_EXCEPTION (9c)
A fatal Machine Check Exception has occurred.
KeBugCheckEx parameters;
    x86 Processors
        If the processor has ONLY MCE feature available (For example Intel
        Pentium), the parameters are:
        1 - Low  32 bits of P5_MC_TYPE MSR
        2 - Address of MCA_EXCEPTION structure
        3 - High 32 bits of P5_MC_ADDR MSR
        4 - Low  32 bits of P5_MC_ADDR MSR
        If the processor also has MCA feature available (For example Intel
        Pentium Pro), the parameters are:
        1 - Bank number
        2 - Address of MCA_EXCEPTION structure
        3 - High 32 bits of MCi_STATUS MSR for the MCA bank that had the error
        4 - Low  32 bits of MCi_STATUS MSR for the MCA bank that had the error
    IA64 Processors
        1 - Bugcheck Type
            1 - MCA_ASSERT
            2 - MCA_GET_STATEINFO
                SAL returned an error for SAL_GET_STATEINFO while processing MCA.
            3 - MCA_CLEAR_STATEINFO
                SAL returned an error for SAL_CLEAR_STATEINFO while processing MCA.
            4 - MCA_FATAL
                FW reported a fatal MCA.
            5 - MCA_NONFATAL
                SAL reported a recoverable MCA and we don't support currently
                support recovery or SAL generated an MCA and then couldn't
                produce an error record.
            0xB - INIT_ASSERT
            0xC - INIT_GET_STATEINFO
                  SAL returned an error for SAL_GET_STATEINFO while processing INIT event.
            0xD - INIT_CLEAR_STATEINFO
                  SAL returned an error for SAL_CLEAR_STATEINFO while processing INIT event.
            0xE - INIT_FATAL
                  Not used.
        2 - Address of log
        3 - Size of log
        4 - Error code in the case of x_GET_STATEINFO or x_CLEAR_STATEINFO
    AMD64 Processors
        1 - Bank number
        2 - Address of MCA_EXCEPTION structure
        3 - High 32 bits of MCi_STATUS MSR for the MCA bank that had the error
        4 - Low  32 bits of MCi_STATUS MSR for the MCA bank that had the error
Arguments:
Arg1: 00000005
Arg2: f7723050
Arg3: b2000018
Arg4: 02000e0f

Debugging Details:
------------------

   NOTE:  This is a hardware error.  This error was reported by the CPU
   via Interrupt 18.  This analysis will provide more information about
   the specific error.  Please contact the manufacturer for additional
   information about this error and troubleshooting assistance.

   This error is documented in the following publication:

      - IA-32 Intel(r) Architecture Software Developer's Manual 
        Volume 3: System Programming Guide

   Bit Mask:

       MA                           Model Specific       MCA
    O  ID      Other Information      Error Code     Error Code
   VV  SDP ___________|____________ _______|_______ _______|______
   AEUECRC|                        |               |              |
   LRCNVVC|                        |               |              |
   ^^^^^^^|                        |               |              |
      6         5         4         3         2         1
   3210987654321098765432109876543210987654321098765432109876543210
   ----------------------------------------------------------------
   1011001000000000000000000001100000000010000000000000111000001111


VAL   - MCi_STATUS register is valid
        Indicates that the information contained within the IA32_MCi_STATUS
        register is valid.  When this flag is set, the processor follows the
        rules given for the OVER flag in the IA32_MCi_STATUS register when
        overwriting previously valid entries.  The processor sets the VAL 
        flag and software is responsible for clearing it.

UC    - Error Uncorrected
        Indicates that the processor did not or was not able to correct the 
        error condition.  When clear, this flag indicates that the processor
        was able to correct the error condition.

EN    - Error Enabled
        Indicates that the error was enabled by the associated EEj bit of the
        IA32_MCi_CTL register.

PCC   - Processor Context Corrupt
        Indicates that the state of the processor might have been corrupted
        by the error condition detected and that reliable restarting of the
        processor may not be possible.

BUSCONNERR - Bus and Interconnect Error   BUS{LL}_{PP}_{RRRR}_{II}_{T}_err
        These errors match the format 0000 1PPT RRRR IILL



   Concatenated Error Code:
   --------------------------
   _VAL_UC_EN_PCC_BUSCONNERR_20F

   This error code can be reported back to the manufacturer.
   They may be able to provide additional information based upon
   this error.  All questions regarding STOP 0x9C should be
   directed to the hardware manufacturer.

BUGCHECK_STR:  0x9C_GenuineIntel

CUSTOMER_CRASH_COUNT:  1

DEFAULT_BUCKET_ID:  INTEL_CPU_MICROCODE_ZERO

PROCESS_NAME:  Idle

LAST_CONTROL_TRANSFER:  from 80707bff to 80537832

STACK_TEXT:  
f7723028 80707bff 0000009c 00000005 f7723050 nt!KeBugCheckEx+0x1b
f7723154 80702c52 f771fd70 00000000 00000000 hal!HalpMcaExceptionHandler+0xdd
f7723154 f7519062 f771fd70 00000000 00000000 hal!HalpMcaExceptionHandlerWrapper+0x4a
f78aed50 804dd133 00000000 0000000e 005f0050 intelppm!AcpiC1Idle+0x12
f78aed54 00000000 0000000e 005f0050 00430050 nt!KiIdleLoop+0x10


STACK_COMMAND:  kb

FOLLOWUP_IP: 
intelppm!AcpiC1Idle+12
f7519062 6a00            push    0

SYMBOL_STACK_INDEX:  3

SYMBOL_NAME:  intelppm!AcpiC1Idle+12

FOLLOWUP_NAME:  MachineOwner

MODULE_NAME: intelppm

IMAGE_NAME:  intelppm.sys

DEBUG_FLR_IMAGE_TIMESTAMP:  41107b37

FAILURE_BUCKET_ID:  0x9C_GenuineIntel_intelppm!AcpiC1Idle+12

BUCKET_ID:  0x9C_GenuineIntel_intelppm!AcpiC1Idle+12

Followup: MachineOwner
---------

Now the cause is given as intelppm.sys, the CPU driver. This could be driver-related, as Windows only started crashing after I installed drivers. I was able to run in Safe Mode without crashes.

Could be the hardware too, of course.

I really want to update my BIOS, but I'm afraid it might crash during the update.

I ran Prime95 in Safe Mode. It resulted in a crash D:

So I guess it's down to the motherboard, graphics card or CPU.

sigh...

Updated BIOS. Didn't help.

So I figured it was either the CPU or the mobo. CPUs are less likely to fail, so I bought a new mobo.

Result: Success! The computer works perfectly now. I had almost given up.

Congrats! Does 20/20 hindsight allow you to go back through the dumps and find the cause? I searched through them but my limited knowledge gives me no insights.

I only learnt to do a basic debug during the course of this problem, so I'm definitely no expert. The BUSCONNERR (bus and interconnect error) would explain the crashes, but googling shows so many hits with different causes that it can't really be used to narrow anything down. The image names seem random, although they are all associated with drivers (the Realtek driver and the CPU driver), so I guess they are of little help.
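For anyone who wants to check the debugger's reading by hand, here is a minimal Python sketch that unpacks the status value from Arg3 (high 32 bits) and Arg4 (low 32 bits) of the dump above. The flag positions and the 0000 1PPT RRRR IILL layout are taken from the dump text and the Intel manual it cites; the function name `decode_mci_status` and the dictionary keys are my own, not anything WinDbg provides.

```python
# Sketch: decode the MCi_STATUS value reported by a STOP 0x9C bugcheck.
# Names below are illustrative assumptions, not part of any debugger API.

# Flag bits in the top byte of the 64-bit status register.
FLAGS = {63: "VAL", 62: "OVER", 61: "UC", 60: "EN",
         59: "MISCV", 58: "ADDRV", 57: "PCC"}


def decode_mci_status(high32, low32):
    """Combine Arg3 (high) and Arg4 (low) and split out the fields."""
    status = (high32 << 32) | low32

    # Collect the names of the flag bits that are set, highest bit first.
    flags = [name for bit, name in sorted(FLAGS.items(), reverse=True)
             if (status >> bit) & 1]

    mca_code = status & 0xFFFF            # bits 15..0: MCA error code
    model_code = (status >> 16) & 0xFFFF  # bits 31..16: model specific code

    result = {
        "status": status,
        "flags": flags,
        "mca_code": mca_code,
        "model_specific_code": model_code,
    }

    # Bus and interconnect errors match 0000 1PPT RRRR IILL.
    if mca_code & 0xF800 == 0x0800:
        result["bus_error"] = {
            "PP": (mca_code >> 9) & 0b11,
            "T": (mca_code >> 8) & 0b1,
            "RRRR": (mca_code >> 4) & 0b1111,
            "II": (mca_code >> 2) & 0b11,
            "LL": mca_code & 0b11,
        }
    return result
```

Calling `decode_mci_status(0xB2000018, 0x02000E0F)` gives the flags VAL, UC, EN and PCC, matching the concatenated code in the dump, with PP=3, T=0, RRRR=0, II=3 and LL=3. As far as I can tell from the manual's encoding tables those are all the "generic" values, which fits with the code not pointing at any one component.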

INTEL_CPU_MICROCODE_ZERO appears in each crash. With hindsight I googled with some extra keywords, and it turns out it points to a hardware failure of either the CPU or the motherboard, even though it can apparently be induced by software as well. With software ruled out, though, it's a pretty clear pointer to the source of the problem.
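The comparison across crashes can also be done mechanically. A rough sketch, assuming each `!analyze -v` output has been saved as text: it pulls out the `NAME:  value` lines and reports which fields have the same value in every dump. The function names and the default field list are assumptions of mine for illustration.

```python
import re
from collections import defaultdict

# Matches single-line fields such as "IMAGE_NAME:  intelppm.sys".
# Fields whose value starts on the next line (e.g. STACK_TEXT) are skipped.
FIELD_RE = re.compile(r"^([A-Z_]+):[ \t]+(\S.*?)[ \t]*$", re.MULTILINE)


def extract_fields(analysis_text):
    """Return a dict of the NAME: value pairs found in one analysis."""
    return {name: value for name, value in FIELD_RE.findall(analysis_text)}


def constant_fields(analyses,
                    names=("DEFAULT_BUCKET_ID", "IMAGE_NAME", "BUGCHECK_STR")):
    """Return the fields that are present with one identical value in every dump."""
    seen = defaultdict(set)
    for text in analyses:
        fields = extract_fields(text)
        for name in names:
            seen[name].add(fields.get(name))  # None if the field is missing
    return {name: next(iter(values))
            for name, values in seen.items()
            if len(values) == 1 and None not in values}
```

Fed the saved analyses from several crashes, anything that varies (like the image name here) drops out, and what is left is the part worth googling.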