My Computer Just Shut Itself Down

This problem comes up very often in the chat room, and this article exists to save repeating the same advice there.

It is intentionally informal, and I will add to it as I meet users whose problems it does not yet cover. Its main purpose is to serve as a troubleshooting guide, with links to other articles that explain how to set up testing, monitoring and so on.

= Record Keeping =
Keeping notes can be hard to do in the moment; you may feel frustrated and want to reboot immediately and carry on working.

Instead, make a note of what you were doing when the machine shut down or rebooted. Anything that drives the CPU to 100% points towards temperature as the cause. That includes programs that have historically caused X to hang, such as Flash. Were you in the middle of a large emerge? Had you recently upgraded any package you were running?
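To answer the "did you recently upgrade anything?" question without relying on memory, the Portage log can be scanned. The following is only a sketch: it assumes the log lives at /var/log/emerge.log and that completed merges are written as `<unix time>:  ::: completed emerge (N of M) <package> to <root>`. The function name `recent_merges` and the 48-hour window are made up for illustration.

```python
import re

# Assumed shape of a "merge finished" line in emerge.log:
#   1700000000:  ::: completed emerge (1 of 3) app-misc/foo-1.0 to /
COMPLETED = re.compile(r"^(\d+):\s+::: completed emerge \(\d+ of \d+\) (\S+) to ")


def recent_merges(lines, now, window_hours=48):
    """Return (unix_time, package) pairs for merges completed in the window.

    `lines` is any iterable of log lines, `now` is a Unix timestamp.
    """
    cutoff = now - window_hours * 3600
    found = []
    for line in lines:
        match = COMPLETED.match(line)
        if match and int(match.group(1)) >= cutoff:
            found.append((int(match.group(1)), match.group(2)))
    return found
```

To use it, open the log and pass the file object in, for example `recent_merges(open("/var/log/emerge.log", errors="replace"), time.time())` after `import time`. Reading the log may need root or membership of the portage group.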

= Temperature =
Was it during a compile? Is it a laptop? Can you feel heat on the case or by the fan?

If the answer to any of these is yes, then temperature may well be your problem. Modern computers try to shut themselves down before overheating causes permanent damage (which is better than melting!), and this might be what happened.
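On Linux, the kernel's thermal framework usually exposes the temperature at which it will force a shutdown as a "critical" trip point under /sys/class/thermal. The sketch below compares each zone's current reading with that trip point. It assumes the files `temp`, `trip_point_N_type` and `trip_point_N_temp` are present and hold millidegrees Celsius; not every machine or driver provides them, and the function name `critical_trip_points` is invented for this example.

```python
from pathlib import Path


def _read_text(path):
    """Return the stripped file contents, or None if it cannot be read."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def _read_millidegrees(path):
    """Convert a sysfs millidegree value to degrees Celsius, or None."""
    try:
        return int(_read_text(path)) / 1000.0
    except (TypeError, ValueError):
        return None


def critical_trip_points(base="/sys/class/thermal"):
    """Map zone name -> (current_c, critical_c); either value may be None."""
    results = {}
    for zone in sorted(Path(base).glob("thermal_zone*")):
        critical = None
        for type_file in zone.glob("trip_point_*_type"):
            if _read_text(type_file) == "critical":
                temp_name = type_file.name.replace("_type", "_temp")
                critical = _read_millidegrees(zone / temp_name)
        results[zone.name] = (_read_millidegrees(zone / "temp"), critical)
    return results
```

If the current figure sits close to the critical one while the machine is under load, a thermal shutdown becomes a much more likely explanation.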

Another victim of insufficient cooling can be your graphics card. Graphically intensive tasks will cause it to heat up, as will anything that uses it for parallel tasks (using CUDA or the like). Buggy drivers may be a cause as well.

Checking Temperature in BIOS
If you were present for the shutdown, you can usually check straight away: most BIOSes will tell you the processor temperature. The core or die temperature is the important figure. (Sometimes these figures are wrong.)

Checking Temperature in Linux
Read the lm_sensors guide for how to do this. (Again, sometimes this reports temperature inaccurately, so be warned.)
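If you want a quick look before lm_sensors is set up, the kernel's hwmon interface in sysfs can be read directly, provided a suitable sensor driver is already loaded. This is a rough sketch rather than a replacement for the guide: it assumes the usual layout of /sys/class/hwmon/hwmonN/ with a `name` file, `tempM_input` files in millidegrees Celsius and optional `tempM_label` files. The function name `hwmon_temperatures` is made up for this example.

```python
from pathlib import Path


def _read(path):
    """Return the stripped file contents, or None if it cannot be read."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def hwmon_temperatures(base="/sys/class/hwmon"):
    """Return a list of (chip, label, degrees_c) for each readable sensor."""
    readings = []
    for chip_dir in sorted(Path(base).glob("hwmon*")):
        chip = _read(chip_dir / "name") or chip_dir.name
        for input_file in sorted(chip_dir.glob("temp*_input")):
            raw = _read(input_file)
            if raw is None or not raw.lstrip("-").isdigit():
                continue  # unreadable or non-numeric sensor, skip it
            stem = input_file.name[: -len("_input")]  # e.g. "temp1"
            label = _read(chip_dir / (stem + "_label")) or stem
            readings.append((chip, label, int(raw) / 1000.0))
    return readings
```

Printing the result in a loop, for example `for chip, label, temp in hwmon_temperatures(): print(chip, label, temp)`, gives a crude snapshot. The same warning applies: the numbers are only as trustworthy as the sensor and its driver.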

Use Your Hand
Computers can burn you, so be careful. Follow the same guidelines that people use when opening a door during a fire.

If you are getting figures from the BIOS and lm_sensors that are clearly wrong, use a handheld thermometer. Make sure that it does not touch any of the computer's parts, and that the temperature does not go above the thermometer's maximum rating. (In the past, mercury thermometers used to explode when the pressure got too high. There's no article on the wiki for cleaning mercury out of a computer.)

= Failing Components =
This is harder to troubleshoot. I will update this article as I work through such a situation with users.