Hi,
I have been running Ubuntu on an old Dell Latitude E6410 for a number of years and since the start it has been exhibiting crashes. The symptoms are always the same:
- It invariably happens when I am browsing online in Firefox
- The LED will start to flicker, indicating disk access (it's an old hard disk, not an SSD)
- This flickering will become more intense until it is pretty much constant, and as it intensifies the mouse becomes more and more unresponsive
- Initially the mouse will move the cursor but it won't respond to clicks (e.g. when I try clicking on the X to close the browser, as I'm almost certain it is something the browser is doing that is causing this constant disk access)
- I have mapped the Ctrl-Alt-M key combo to xkill, and IF I am quick enough when the flashing starts it will respond to the xkill command, I can kill the browser, and the problem is resolved.
- 9 times out of 10, though, I will not be quick enough and then the machine is basically frozen; it will stay that way for hours, apparently trying to access the disk, and will be unresponsive to keystrokes or mouse clicks. The only way to resolve the problem at this point is to power cycle the laptop.
This crashing happens about once a week. I'm currently running 18.04.6 LTS, but this has been going on for years (back as far as 12.04.6 LTS, I think), so I'm sure there is no point in upgrading to a newer revision of Ubuntu.
So I have two questions:
1. Does anyone know how I would debug this to figure out what the problem is?
2. If not, is there any way of running Firefox in some sort of 'sandbox' mode such that when it crashes I can easily open a terminal and just kill Firefox? As it is, when it goes crazy it seems to lock up the whole system. Is there any way I can run it that stops it taking over all the system resources? This was one of the major advantages of Linux vs Windows when I used to run Linux years ago: if there was a problem with a program in Windows it took the whole system down with a BSOD, whereas with Linux I would just open a terminal and issue the ps and kill <pid> commands. It seems now Linux (or Ubuntu at any rate) has contracted the craptastic Windows behaviour.
On the subject of trying to debug it, I did create a simple shell script (crashLogging.sh, attached) to try to see what is going on when the problem arises. I have also attached its outputs (log1.txt from when the problem happens, log2.txt as a baseline). It shows that ~100% of the CPU time is spent in 'wa, IO-wait : time waiting for I/O completion'. Does this give any further clue to the root cause of the problem, or can anyone suggest any further experiments to debug it? Doing a grep on the Cpu activity you can see that all of a sudden the wa % ramps up to near 100% and then just stays there indefinitely:
log2.txt shows a baseline (20 captures of the top command when there is no problem):
grep "Cpu" log2.txt
%Cpu(s): 12.3 us, 2.2 sy, 0.0 ni, 84.9 id, 0.6 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu(s): 20.4 us, 4.0 sy, 0.0 ni, 75.6 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu(s): 7.4 us, 2.3 sy, 0.0 ni, 89.5 id, 0.8 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu(s): 3.1 us, 1.2 sy, 0.0 ni, 95.7 id, 0.0 wa, 0.0 hi, 0.1 si, 0.0 st
%Cpu(s): 1.9 us, 0.8 sy, 0.0 ni, 97.1 id, 0.2 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu(s): 4.7 us, 1.3 sy, 0.0 ni, 93.7 id, 0.2 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu(s): 4.2 us, 1.2 sy, 0.0 ni, 94.6 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu(s): 5.6 us, 2.3 sy, 0.0 ni, 92.0 id, 0.1 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu(s): 6.4 us, 1.8 sy, 0.0 ni, 91.7 id, 0.0 wa, 0.0 hi, 0.1 si, 0.0 st
%Cpu(s): 3.8 us, 1.6 sy, 0.0 ni, 94.6 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu(s): 6.2 us, 1.9 sy, 0.0 ni, 91.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu(s): 25.9 us, 4.0 sy, 0.0 ni, 67.2 id, 2.6 wa, 0.0 hi, 0.3 si, 0.0 st
%Cpu(s): 16.1 us, 4.2 sy, 0.0 ni, 77.3 id, 2.4 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu(s): 15.7 us, 3.4 sy, 0.0 ni, 78.3 id, 2.5 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu(s): 10.5 us, 2.8 sy, 0.0 ni, 81.2 id, 5.5 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu(s): 21.5 us, 4.2 sy, 0.0 ni, 73.4 id, 0.7 wa, 0.0 hi, 0.2 si, 0.0 st
%Cpu(s): 16.6 us, 3.8 sy, 0.0 ni, 79.5 id, 0.0 wa, 0.0 hi, 0.1 si, 0.0 st
%Cpu(s): 28.7 us, 4.2 sy, 0.0 ni, 65.1 id, 1.8 wa, 0.0 hi, 0.1 si, 0.0 st
%Cpu(s): 8.1 us, 2.6 sy, 0.0 ni, 89.1 id, 0.1 wa, 0.0 hi, 0.1 si, 0.0 st
%Cpu(s): 12.2 us, 2.8 sy, 0.0 ni, 84.8 id, 0.1 wa, 0.0 hi, 0.1 si, 0.0 st
Then about 80s into log1 the CPU 'wa' % starts to ramp up, indicating the start of the problem:
grep "Cpu" log1.txt
%Cpu(s): 12.3 us, 2.2 sy, 0.0 ni, 84.9 id, 0.6 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu(s): 9.5 us, 2.2 sy, 0.0 ni, 88.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu(s): 40.4 us, 5.8 sy, 0.0 ni, 52.2 id, 1.6 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu(s): 36.2 us, 6.9 sy, 0.0 ni, 49.3 id, 7.4 wa, 0.0 hi, 0.2 si, 0.0 st
%Cpu(s): 28.1 us, 5.0 sy, 0.1 ni, 43.4 id, 23.4 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu(s): 16.7 us, 3.7 sy, 0.0 ni, 39.8 id, 39.8 wa, 0.0 hi, 0.1 si, 0.0 st
%Cpu(s): 4.1 us, 1.3 sy, 0.0 ni, 35.8 id, 58.8 wa, 0.0 hi, 0.1 si, 0.0 st
%Cpu(s): 2.3 us, 1.5 sy, 0.0 ni, 34.1 id, 62.0 wa, 0.0 hi, 0.1 si, 0.0 st
%Cpu(s): 0.7 us, 0.7 sy, 0.0 ni, 1.8 id, 96.8 wa, 0.0 hi, 0.1 si, 0.0 st
%Cpu(s): 1.2 us, 1.7 sy, 0.0 ni, 10.1 id, 87.1 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu(s): 0.8 us, 1.5 sy, 0.0 ni, 5.8 id, 91.9 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu(s): 0.6 us, 1.1 sy, 0.0 ni, 4.7 id, 93.6 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu(s): 1.0 us, 1.4 sy, 0.0 ni, 17.2 id, 80.3 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu(s): 0.3 us, 0.9 sy, 0.0 ni, 0.2 id, 98.6 wa, 0.0 hi, 0.1 si, 0.0 st
%Cpu(s): 0.9 us, 1.7 sy, 0.0 ni, 0.1 id, 97.2 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu(s): 0.6 us, 3.2 sy, 0.0 ni, 5.9 id, 90.3 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu(s): 0.9 us, 1.5 sy, 0.0 ni, 6.8 id, 90.8 wa, 0.0 hi, 0.0 si, 0.0 st
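In case it makes the logs easier to look at, here is a rough sketch of pulling just the wa figure out of each sample instead of reading the whole line. The function name wa_column is made up for illustration and is not part of the attached crashLogging.sh; it assumes the log contains top summary lines in the same comma-separated format as above.

```shell
#!/bin/sh
# Hypothetical helper (not from crashLogging.sh): print only the 'wa'
# percentage from every "%Cpu(s)" summary line in a log of top output.
wa_column() {
    # $1 is the log file, e.g. log1.txt
    grep "Cpu" "$1" | awk -F',' '{
        for (i = 1; i <= NF; i++) {
            if ($i ~ / wa$/) {          # field looks like " 96.8 wa"
                split($i, parts, " ")   # parts[1] = number, parts[2] = "wa"
                print parts[1]
            }
        }
    }'
}
```

Running wa_column log1.txt should then give one number per capture, which makes the point where IO-wait takes off easier to spot.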
Does anyone have any suggestions, or is there any other information in these logs that would explain what is going on? Thanks,
Usjes