Maui Forums
[Solved] - Tower's 1st [no, 3rd] Hard-Reset since clean-reinstall.



RE: Tower's 1st [no, 3rd] Hard-Reset since clean-reinstall. - kdemeoz - 11th February 2017

This is now officially ridiculous. Another total system freeze. Even the SysTray clock froze [so does that imply that this time even the kernel had locked up?]. 7th Hard Reset needed since my 31/12/16 Maui clean reinstallation, & 3rd since changing from NVidia to Nouveau GPU driver. Expletives!

The irony this time was that the freeze occurred with most pgms closed, whilst i was running my weekly backup to my USB stick. 

As noted in my previous post, i had continued running with only the single keyboard & mouse plugged in, but as far as i can see now, tonight's latest freeze should logically eliminate the hypothesis of the freezes being caused by conflicts of dual keyboards & mice. I think that tomorrow [it's now very very late here] i will plug my 2nd keyboard & mouse back in [ergonomically beneficial, as previously explained].

I've not seen any new kernels in Update Manager, so in lieu of that step, i have now made the grub edit & update, & will reboot once this is posted. 
Code:
GRUB_CMDLINE_LINUX_DEFAULT="atkbd.reset i8042.nomux quiet splash"
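
After the reboot it may be worth confirming that the edited options actually reached the kernel. A minimal Python sketch of that check follows. The helper name `missing_params` and the example command line are made up for illustration, and it only does a plain whole-word comparison, so treat it as a rough check rather than anything official.

```python
def missing_params(cmdline, wanted):
    """Return the entries of `wanted` that are absent from a kernel command line."""
    # The kernel command line is a space-separated list of options.
    present = set(cmdline.split())
    return [param for param in wanted if param not in present]


# Invented example command line (not copied from the Tower in this thread).
example = "BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro i8042.nomux quiet splash"
print(missing_params(example, ["i8042.nomux", "quiet", "splash"]))
```

On a running system the first argument would normally come from `open("/proc/cmdline").read()`. An empty list means every option asked for is present; anything listed did not make it onto the command line, which usually points to a skipped grub update or a spelling difference.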



RE: Tower's 1st [no, 3rd] Hard-Reset since clean-reinstall. - kdemeoz - 14th February 2017

Another total system freeze [upon Resume again, except this time i was able to unlock the Lock Screen fine, only for it then all to freeze with my desktop showing]. 8th Hard Reset [no, REISUB worked this time, see below] needed since my 31/12/16 Maui clean reinstallation, & 4th since changing from NVidia to Nouveau GPU driver.

Interestingly this time after unplugging & reconnecting the k/bs & mice, they did begin functioning again [thanks Rocky], but everything else was still frozen. Having regained the k/bs, this time i was at least able to REISUB, so this time no Hard Reset was necessary.

As per the parallel thread https://forums.mauilinux.org/showthread.php?tid=24275&pid=41686#pid41686, i am now about to begin downloading the debs to do the process of installing kernel 4.9.9... desperately hoping it will cure these ridiculous repeat freezes.


RE: Tower's 1st [no, 3rd] Hard-Reset since clean-reinstall. - kdemeoz - 14th February 2017

Now running kernel 4.9.9 in Tower. Details here: https://forums.mauilinux.org/showthread.php?tid=24275&pid=41690#pid41690

Only time will tell if this has actually fixed the chronic freeze problems...


RE: Tower's 1st [no, 3rd] Hard-Reset since clean-reinstall. - kdemeoz - 19th February 2017

Another total system freeze. 9th Hard Reset needed since my 31/12/16 Maui clean reinstallation, 5th since changing from NVidia to Nouveau GPU driver, & 1st with Kernel 4.9.9.

Unlike most of these freezes, today's one was not during Resume from Suspend, but instead was virtually the same as the 12Feb one, ie: the freeze occurred with most pgms closed, whilst i was running my weekly backup to my USB stick [VeraCrypt + luckyBackup]. 

I presume the kernel itself again had not frozen, coz even though Plasmashell & Inputs froze [& annoyingly luckyBackup also froze thus ruining the in-progress backup], my Clementine Icecast internet music streaming continued unscathed. 

Damnit. 


RE: Tower's 1st [no, 3rd] Hard-Reset since clean-reinstall. - rocky7x - 19th February 2017

Could you please attach the syslog after the crash so we can see what happened? The same exercise as before ..


RE: Tower's 1st [no, 3rd] Hard-Reset since clean-reinstall. - kdemeoz - 19th February 2017

Hi Rocky, thanks. The "bad stuff" seems to be Feb 19 18:28:56 - Feb 19 18:40:28.

Attachment: syslog.zip (Size: 57.51 KB)


RE: Tower's 1st [no, 3rd] Hard-Reset since clean-reinstall. - rocky7x - 19th February 2017

Hi,

From what I can see, seems like a hardware issue with one of your CPUs (#2). Kernel detected a lockup of the CPU twice and after the second time, it got stuck, maybe the other CPU went into lockup as well. I'm certainly no expert on this, but seems to me like a hardware issue. We'll see what others think of it...
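
For anyone wanting to repeat this kind of check on their own log, here is a rough Python sketch that counts soft-lockup reports per CPU. It assumes the watchdog wording quoted later in this thread ("NMI watchdog: BUG: soft lockup - CPU#2 stuck for 23s!"). The function name `count_soft_lockups` and the sample lines are invented for illustration and are not taken from the attached syslog.

```python
import re
from collections import Counter

# Assumed message shape, based on the watchdog text quoted in this thread:
# "NMI watchdog: BUG: soft lockup - CPU#2 stuck for 23s!"
SOFT_LOCKUP = re.compile(r"soft lockup - CPU#(\d+) stuck for (\d+)s")


def count_soft_lockups(lines):
    """Return a Counter mapping CPU number -> number of soft-lockup reports."""
    counts = Counter()
    for line in lines:
        match = SOFT_LOCKUP.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# Invented sample lines, only to show the expected input format.
sample = [
    "Feb 19 18:28:56 tower kernel: [ 100.0] NMI watchdog: BUG: soft lockup - CPU#2 stuck for 23s! [example:123]",
    "Feb 19 18:29:24 tower kernel: [ 128.0] NMI watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [example:123]",
    "Feb 19 18:30:00 tower systemd[1]: Started example unit.",
]
print(count_soft_lockups(sample))
```

On an Ubuntu-based system the lines would typically come from `open("/var/log/syslog", errors="replace")`. A count concentrated on one CPU number is consistent with the reading above, but it does not by itself prove a hardware fault.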


RE: Tower's 1st [no, 3rd] Hard-Reset since clean-reinstall. - kdemeoz - 20th February 2017

Hi Rocky, & thanks.

Yes i also noticed that, & of course it worried me, but i'm out of my depth here wrt properly understanding & interpreting this re root cause & implications. Hence i did some research, searching for "NMI watchdog: BUG: soft lockup - CPU#2 stuck for 23s!", eg:
a. https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1530405
b. https://ubuntuforums.org/showthread.php?t=2205211
c. https://bbs.archlinux.org/viewtopic.php?id=220414
d. https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1441387
e. https://askubuntu.com/questions/875173/nmi-watchdog-bug-soft-lockup-cpu2-stuck-for-23s-plymouthd305

...& this indicated:
1. I am not alone; LOTS of people have this problem.
2. Most reports seem for Ubuntu 16.04, but some predate it with 15.04, & even 14.04 (including many derivatives of all these).
3. A wide range of kernels are mentioned, from which i intuit that (a) all kernels are crap, or (b) more likely IMO the kernel is not the root cause[?]
4. No single specific user-software program is associated with these failures.
5. Other opinions [guesses?] on possible root causes span Nvidia GPU drivers, Nouveau GPU drivers [thus by inference cancelling each other out & implying All Is Lost, Return to Slide-rules], USB3 being enabled in BIOS, Hyperthreading being enabled/ not enabled in BIOS [sigh], systemd, PSU, bulging MoB capacitors, overclocking / not overclocking, moon-phases, wind-direction...

I wonder about a need/possibility of flashing a BIOS-update [if available], but the idea rather terrifies me as (a) i've never done that in Linux [it seems complicated], only in Windows in a previous Lappy [was simple & easy], (b) with my luck there'll be a power-blackout during the flashing, thus turning my Tower into a [pile of] brick[s], (c) at least one other report mentions that doing a BIOS upgrade did not fix the problem.

I have a growing feeling that i'm going to be stuck with these occasional, random, freezes, well into the future... For my equanimity, maybe i need to just adopt a Kubrickesque philosophy, viz: Dr. Strangelove Or: How I Learned To Stop Worrying And Love The Bomb.


RE: Tower's 1st [no, 3rd] Hard-Reset since clean-reinstall. - kdemeoz - 20th February 2017

leszek, do you have any further comments pls? As best i can now understand, my multiple Tower freezes seem NOT to be caused by kernels or Maui [Ubuntu or Plasma], but also NOT by my hardware [cpu] being faulty [given all the many posts i sampled; surely we cannot ALL have bad cpu's?]. So where does this leave me to turn? Is there now officially nothing left to attempt? If not, that means i simply must accept a permanent state of random freezes.


RE: Tower's 1st [no, 3rd] Hard-Reset since clean-reinstall. - leszek - 20th February 2017

It seems to me to be a bug in your hardware, and the kernel has no workaround for it.