Complete freeze in different Distros

Graphics:
Device-1: NVIDIA TU117 [GeForce GTX 1650] vendor: eVga.com. driver: nvidia v: 535.104.05 arch: Turing pcie: speed: 8 GT/s lanes: 8 ports: active: none off: HDMI-A-1 empty: DP-1,DP-2 bus-ID: 01:00.0 chip-ID: 10de:1f82
Display: x11 server: X.Org v: 1.21.1.7 with: Xwayland v: 22.1.8 compositor: kwin_x11 driver: X: loaded: nvidia unloaded: fbdev,modesetting,nouveau,vesa gpu: nvidia,nvidia-nvswitch display-ID: :0 screens: 1
Screen-1: 0 s-res: 1920x1080 s-dpi: 92
Monitor-1: HDMI-A-1 mapped: HDMI-0 note: disabled model: Samsung LF24T35 res: 1920x1080 dpi: 92 diag: 606mm (23.9")
API: OpenGL v: 4.6.0 NVIDIA 535.104.05 renderer: NVIDIA GeForce GTX 1650/PCIe/SSE2 direct-render: Yes
this is the section i am referring to where the radeon card isn't listed. neither are the usual radeon or amdgpu drivers.

for comparison, this is from my intel/nvidia system where both Devices are listed along with their drivers and other info:
Code:
inxi -Fnxxz
<snip>
Graphics:  Device-1: Intel 4th Gen Core Processor Integrated Graphics vendor: Lenovo driver: i915 
           v: kernel bus-ID: 00:02.0 chip-ID: 8086:0416 
           Device-2: NVIDIA GK107GLM [Quadro K1100M] vendor: Lenovo driver: nouveau v: kernel 
           bus-ID: 01:00.0 chip-ID: 10de:0ff6 
           Display: x11 server: X.Org 1.20.11 compositor: xfwm4 driver: loaded: modesetting 
           unloaded: fbdev,vesa resolution: 1280x720~60Hz s-dpi: 96 
           OpenGL: renderer: Mesa DRI Intel HD Graphics 4600 (HSW GT2) v: 4.5 Mesa 20.3.5 
           compat-v: 3.0 direct render: Yes
<snip>
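One way to cross-check this outside of inxi is to ask the PCI bus directly which graphics devices exist and which kernel driver is bound to each. A minimal sketch, assuming the pciutils package (which provides lspci) is installed; gpu_lines is just an illustrative helper name:

```shell
# Keep only the graphics devices from `lspci -nnk` output, plus the three
# lines that follow each one (typically subsystem, driver in use, modules).
gpu_lines() {
    grep -E -A3 'VGA compatible controller|3D controller|Display controller'
}

# Typical use on the affected machine:
#   lspci -nnk | gpu_lines
```

If a second GPU exists but is disabled in the BIOS, it would normally not show up here either, which would match what inxi and gpu-manager report.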
 


This is what I got from the gpu-manager log

Code:
log_file: /var/log/gpu-manager.log
last_boot_file: /var/lib/ubuntu-drivers-common/last_gfx_boot
new_boot_file: /var/lib/ubuntu-drivers-common/last_gfx_boot
can't access /opt/amdgpu-pro/bin/amdgpu-pro-px
Looking for nvidia modules in /lib/modules/6.4.12-060412-generic/kernel
Looking for nvidia modules in /lib/modules/6.4.12-060412-generic/updates/dkms
Found nvidia.ko module in /lib/modules/6.4.12-060412-generic/updates/dkms/nvidia.ko.zst
Looking for amdgpu modules in /lib/modules/6.4.12-060412-generic/kernel
Looking for amdgpu modules in /lib/modules/6.4.12-060412-generic/updates/dkms
Is nvidia loaded? yes
Was nvidia unloaded? no
Is nvidia blacklisted? no
Is intel loaded? no
Is radeon loaded? no
Is radeon blacklisted? no
Is amdgpu loaded? no
Is amdgpu blacklisted? no
Is amdgpu versioned? no
Is amdgpu pro stack? no
Is nouveau loaded? no
Is nouveau blacklisted? yes
Is nvidia kernel module available? yes
Is amdgpu kernel module available? no
Vendor/Device Id: 10de:1f82
BusID "PCI:1@0:0:0"
Is boot vga? yes
Skipping "/dev/dri/card0", driven by "nvidia-drm"
Skipping "/dev/dri/card0", driven by "nvidia-drm"
Skipping "/dev/dri/card0", driven by "nvidia-drm"
Skipping "/dev/dri/card0", driven by "nvidia-drm"
Does it require offloading? no
last cards number = 1
Has amd? no
Has intel? no
Has nvidia? yes
How many cards? 1
Has the system changed? No
Takes 0ms to wait for nvidia udev rules completed.
Single card detected
Nothing to do

And this is what I got from the previous commands you sent me (see attached file, sorry but I was afraid of not making it in time to copy and paste it)
 

Attachment: 3.jpg
@KGIII :-

I'm not sure what some folks are doing, but I decided to look and see how long a CMOS battery is expected to last - and people are saying around 3 years. I don't recall replacing one within the past decade (on my own devices). The CMOS and RTC are very low-power these days. I have no idea how they go through them that quickly.
Nah, people aren't going through them that quickly. What many people do is replace them whenever they have cause to open their machine for any other reason, just because they can. Lithium CR2032s should last around a decade. Many would rather replace a battery more often than necessary than risk things going wrong when there's no real excuse.

You know yourself, they're no hassle to access, especially on a desktop rig. And they're pretty cheap, and absolutely everybody & his dog sells them...


Mike. ;)
 
Has amd? no
Has intel? no
Has nvidia? yes
How many cards? 1
as your previous question indicated, i can't say that not identifying one gpu would cause the issues you are having. that being said, i do find it quite odd. i have seen some systems where one gpu driver (my nouveau driver sometimes does this) didn't load at boot, but can't recall one where the device itself doesn't seem to be listed at all.

since the system is so unstable, it may be hard to look through other logs or otherwise try to troubleshoot while booted into it. i know your original post said the keyboard doesn't respond, but when the system is frozen have you tried getting to another tty with Ctrl+Alt+F1 (or F2 - F7 if F1 doesn't work) and logging into a text console?

if that works, you might be able to troubleshoot from there. if not, i would try grabbing logs from /var/log/ like boot.log, kern.log and syslog. kern.log and syslog should be timestamped so you could see if there were issues at or before the freeze. with boot.log you would be looking to see if most lines have an OK. it's been a while since i saw the alternative so i'm not sure if it says FAILED or something like that. note that not all lines have an OK like the snippet below:
Code:
sudo cat /var/log/boot.log
<snip>
[  OK  ] Reached target Sound Card.
         Starting GNOME Display Manager...
         Starting Hold until boot process finishes up...
[  OK  ] Started User Login Management.
[  OK  ] Started Unattended Upgrades Shutdown.
[  OK  ] Started Dispatcher daemon for systemd-networkd.
[  OK  ] Started GNOME Display Manager.
[  OK  ] Started Disk Manager.
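To avoid reading the whole file, the lines that are not OK can be filtered out directly. A minimal sketch, assuming the failure markers are the [FAILED] and [DEPEND] tags that systemd normally prints during boot; boot_problems is a hypothetical helper name:

```shell
# Print only the boot.log lines that report a failed unit or a failed dependency.
# $1 is the log file, e.g. /var/log/boot.log (reading it may need root).
boot_problems() {
    grep -E 'FAILED|DEPEND' "$1"
}

# Typical use:
#   boot_problems /var/log/boot.log
```

No output simply means that no line in the file carried either tag.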
 
since the system is so unstable, it may be hard to look through other logs or otherwise try to troubleshoot while booted into it. i know your original post said the keyboard doesn't respond, but when the system is frozen have you tried getting to another tty with Ctrl+Alt+F1 (or F2 - F7 if F1 doesn't work) and logging into a text console?

I tried but it didn't work, nothing responds. Even pressing Caps Lock doesn't turn the little light on the keyboard on or off, for example.

i would try grabbing logs from /var/log/ like boot.log, kern.log and syslog. kern.log and syslog should be timestamped so you could see if there were issues at or before the freeze. with boot.log you would be looking to see if most lines have an OK. it's been a while since i saw the alternative so i'm not sure if it says FAILED or something like that. note that not all lines have an OK like the snippet below:
sudo cat /var/log/boot.log
Thank you! I will check that out and report back
 
Alright, I got the logs. They are awfully long, I doubt the forum would let me post that many characters in here.

Everything inside boot.log says OK, and at the start of each session it says, for example:

Code:
/dev/sdb2: clean, 289247/7299072 files, 5087329/29172736 blocks

I suppose that's normal.

Inside kern.log I'm not sure what I'm looking for, and it's really extensive. I'm not sure if it's OK to attach the txt here, so let me know what I should keep an eye out for.

^ same with syslog

Now I'm going to do the CMOS battery thing. Sadly, it's behind the video card, so I will have to take the card out.
 
Nah, people aren't going through them that quickly. What many people are doing is to replace them when they have cause to open their machine for any other reason, just because they can.

I dunno, that's what Google said and tons of links confirmed the idea that they'd last 3 years.

I have no idea how they could possibly go through the battery that quickly. Try as I might, I don't think I've replaced my own battery in at least a decade. (I have replaced someone else's battery.)
 
here's hoping your work with the cmos battery pays off and you don't have to worry about the logs because that can be tough to sort through. in a situation like yours, it might be helpful to know the time when the system froze. if you look at the messages right before that, do you see any that have the words error or warning in them?

you could do a general search through them for the words "error" and "warning" (without the quotes), but unfortunately even systems that start and run fine will have some of those. on other forums people would post longer output like that to a site like https://pastebin.com/

a couple of disclaimers about that: most people probably won't want to wade through thousands of lines of log files. in addition, those messages may contain info like your ip address, which it might be good to replace with something like <redacted>.
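The search and the redaction mentioned above can be sketched as two small shell helpers. This is only a sketch: the function names are made up for illustration, and the IPv4 pattern is deliberately crude, so it will also replace other dotted numbers such as four-part version strings.

```shell
# List lines containing "error" or "warning" in any letter case, with line numbers.
# $1 is a log file such as /var/log/kern.log or /var/log/syslog.
find_problems() {
    grep -n -i -E 'error|warning' "$1"
}

# Read text on stdin and replace anything shaped like an IPv4 address.
redact_ips() {
    sed -E 's/([0-9]{1,3}\.){3}[0-9]{1,3}/<redacted>/g'
}

# Typical use before pasting to a pastebin:
#   find_problems /var/log/kern.log | redact_ips > kern-filtered.txt
```

The line numbers printed by grep make it easier to go back to the full log and read what came just before a suspicious entry.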

your original post says you've had trouble with Mint Cinnamon, Mint Mate and Kubuntu. have you tried anything based on fedora (i have read nobara is supposed to be easier to use with nvidia) or arch (something like endeavour os) just to see if they behave differently? even if only running them live. did you have issues with the ones listed when running them live?
 
here's hoping your work with the cmos battery pays off and you don't have to worry about the logs because that can be tough to sort through. in a situation like yours, it might be helpful to know the time when the system froze. if you look at the messages right before that, do you see any that have the words error or warning in them?

Thank you for your wishes! I just reseated my CMOS battery and the BIOS reset was successful. Now I've logged into Linux and I'm writing from there. So far, so good, but I will keep it running and doing stuff to see what happens.

In the meantime, I got into KSystemLog and saw some lines that got my attention:

Code:
25/8/23 21:34    kded5    kf.bluezqt: PendingCall Error: "Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken."

25/8/23 21:34    dbus-daemon    [system] Failed to activate service 'org.bluez': timed out (service_start_timeout=25000ms)

25/8/23 21:34    audit    AVC apparmor="DENIED" operation="unlink" class="file" profile="snap.firefox.firefox" name="/dev/char/195:255" pid=1789 comm="Renderer" requested_mask="d" denied_mask="d" fsuid=1000 ouid=0
25/8/23 21:34    audit    AVC apparmor="DENIED" operation="unlink" class="file" profile="snap.firefox.firefox" name="/dev/char/195:0" pid=1789 comm="Renderer" requested_mask="d" denied_mask="d" fsuid=1000 ouid=0
25/8/23 21:34    audit    AVC apparmor="DENIED" operation="unlink" class="file" profile="snap.firefox.firefox" name="/dev/char/195:0" pid=1789 comm="Renderer" requested_mask="d" denied_mask="d" fsuid=1000 ouid=0
25/8/23 21:34    audit    AVC apparmor="DENIED" operation="unlink" class="file" profile="snap.firefox.firefox" name="/dev/char/195:0" pid=1789 comm="Renderer" requested_mask="d" denied_mask="d" fsuid=1000 ouid=0
25/8/23 21:34    audit    AVC apparmor="DENIED" operation="unlink" class="file" profile="snap.firefox.firefox" name="/dev/char/195:0" pid=1789 comm="Renderer" requested_mask="d" denied_mask="d" fsuid=1000 ouid=0
25/8/23 21:34    audit    AVC apparmor="DENIED" operation="unlink" class="file" profile="snap.firefox.firefox" name="/dev/char/195:255" pid=1789 comm="Renderer" requested_mask="d" denied_mask="d" fsuid=1000 ouid=0
25/8/23 21:34    audit    AVC apparmor="DENIED" operation="unlink" class="file" profile="snap.firefox.firefox" name="/dev/char/195:254" pid=1789 comm="Renderer" requested_mask="d" denied_mask="d" fsuid=1000 ouid=0
25/8/23 21:34    audit    AVC apparmor="DENIED" operation="unlink" class="file" profile="snap.firefox.firefox" name="/dev/char/195:0" pid=1789 comm="Renderer" requested_mask="d" denied_mask="d" fsuid=1000 ouid=0
25/8/23 21:34    audit    AVC apparmor="DENIED" operation="unlink" class="file" profile="snap.firefox.firefox" name="/dev/char/195:0" pid=1789 comm="Renderer" requested_mask="d" denied_mask="d" fsuid=1000 ouid=0
25/8/23 21:34    audit    AVC apparmor="DENIED" operation="unlink" class="file" profile="snap.firefox.firefox" name="/dev/char/195:254" pid=1789 comm="Renderer" requested_mask="d" denied_mask="d" fsuid=1000 ouid=0

(Just copied and pasted them in no particular order)

your original post says you've had trouble with Mint Cinnamon, Mint Mate and Kubuntu. have you tried anything based on fedora (i have read nobara is supposed to be easier to use with nvidia) or arch (something like endeavour os) just to see if they behave differently? even if only running them live. did you have issues with the ones listed when running them live?

If nothing else works, I might try. I like the look and feel of both Mint and Kubuntu, so I'm hoping I can make this work. But I will not give up, if my home is not in these Distros, I will find one.

I will report back.
 
25/8/23 21:34 dbus-daemon [system] Failed to activate service 'org.bluez': timed out (service_start_timeout=25000ms)
bluez has to do with bluetooth. i get a line very similar to that on every boot because i have mine hard blocked with a switch. as far as apparmor goes, there is some info here: https://wiki.ubuntu.com/AppArmor

in my experience users generally would never have to adjust anything to do with apparmor. i would only consider those to be an issue if they are happening all the time you have firefox open and are absolutely flooding your logs.
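To judge whether the denials really are flooding the log, they can be counted per profile. A rough sketch, assuming the audit lines have been saved to a text file (the file name below is hypothetical) and look like the ones quoted earlier in the thread:

```shell
# Count apparmor="DENIED" lines per AppArmor profile, most frequent first.
# $1 is a text file containing the log lines.
denials_by_profile() {
    grep 'apparmor="DENIED"' "$1" \
        | grep -o 'profile="[^"]*"' \
        | sort | uniq -c | sort -rn
}

# Typical use:
#   denials_by_profile saved-log.txt
```

A handful of denials per Firefox session is one thing; thousands per minute would be worth a closer look.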
 
It was short-lived. It did take longer to freeze, but it eventually did. I'm back where I started.
 
Reporting back. I switched to openSUSE Tumbleweed, installed and everything. This time the system froze, but I can still use the mouse and keyboard, and I'm able to get into and out of the terminal using Ctrl+Alt+F1 and Ctrl+Alt+F2 respectively.

What should I do now? How can I work this out?
 
Good morning.
I am now thinking this may be a hardware service problem: either a component breaking down or a bad connection/dry joint.
Use smartctl to test your SSD and old hard drive [read up on how to use it before you start].
Remove the motherboard slot components [RAM, sound/graphics cards, network cards, etc.], check that the pins look OK and put them back; unplug the hard-drive cables and reconnect them [removing and reconnecting these parts can clear up a poor contact].
Have you dropped the unit or accidentally knocked the CPU heat sink? If so, you may have broken the thermal joint between it and the CPU [this could cause overheating, leading to freezing and, if it gets too hot, a complete failure].

just a few thoughts.
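As a starting point for the smartctl step, a minimal sketch follows. It assumes the smartmontools package is installed and that the drive is /dev/sda, which is only a placeholder (lsblk shows the real device names); smart_verdict is a made-up helper that merely looks for the usual pass strings in the health report.

```shell
# Read `smartctl -H` output on stdin and summarise the overall health line.
# ATA and NVMe drives normally report "test result: PASSED";
# SCSI/SAS drives report "SMART Health Status: OK".
smart_verdict() {
    if grep -q -E 'test result: PASSED|SMART Health Status: OK'; then
        echo "health: passed"
    else
        echo "health: not confirmed, read the full report"
    fi
}

# Typical use (needs root):
#   sudo smartctl -H /dev/sda | smart_verdict
#   sudo smartctl -t short /dev/sda     # start a short self-test
#   sudo smartctl -l selftest /dev/sda  # read the result a few minutes later
```

A passed health check does not rule out a failing drive, so the self-test log and the full `smartctl -a` report are still worth reading.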

 
I'm able to get into and out of the terminal using CTRL ALT F1 and CTRL ALT F2 respectively.

What should I do now? How can I work this out?
you could check

systemctl status

to see if it lists any failed units. i just look at the very beginning of the output where it says "Failed: # units". after that i press the letter Q to stop that process. another one might be

journalctl -e

that should jump to the end of the system journal, which is like all of the log files in one place, so you see the most recent entries. similarly the Q key will return you to a regular command prompt.

the formatting for commands can make l's (lowercase L's) look like 1's. in the two above commands, those are lowercase l's. without formatting the first words are systemctl and journalctl.
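Since the machine has to be rebooted after each freeze, the journal of the previous boot is usually the more interesting one. A small sketch, assuming the journal is stored persistently (otherwise journalctl has no earlier boots to show); failed_units is a hypothetical helper that pulls the number out of the systemctl status header:

```shell
# Extract N from the "Failed: N units" line of `systemctl status` output.
failed_units() {
    sed -n -E 's/^ *Failed: ([0-9]+) units.*/\1/p'
}

# Typical use:
#   systemctl status --no-pager | failed_units
#   systemctl --failed           # list the failed units themselves
#   journalctl -b -1 -e          # previous boot, jump to the last entries
#   journalctl -b -1 -p warning  # previous boot, warnings and worse only
```

The last entries of the previous boot are the ones written just before the freeze, so they are the first place to look for GPU or driver messages.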
 
Good morning.
I am now thinking this may be a hardware service problem: either a component breaking down or a bad connection/dry joint.
Use smartctl to test your SSD and old hard drive [read up on how to use it before you start].
Remove the motherboard slot components [RAM, sound/graphics cards, network cards, etc.], check that the pins look OK and put them back; unplug the hard-drive cables and reconnect them [removing and reconnecting these parts can clear up a poor contact].
Have you dropped the unit or accidentally knocked the CPU heat sink? If so, you may have broken the thermal joint between it and the CPU [this could cause overheating, leading to freezing and, if it gets too hot, a complete failure].

just a few thoughts.

Good day sir! I removed all slot components and all of them are looking good. The pc is in top shape, I'm very careful with that. I have not dropped the unit, and the temperature readings are normal, both in Linux and Windows, so no overheating.

you could check

systemctl status

to see if it lists any failed units. i just look at the very beginning of the output where it says "Failed: # units". after that i press the letter Q to stop that process. another one might be

journalctl -e

that should jump to the end of the system journal, which is like all of the log files in one place, so you see the most recent entries. similarly the Q key will return you to a regular command prompt.

the formatting for commands can make l's (lowercase L's) look like 1's. in the two above commands, those are lowercase l's. without formatting the first words are systemctl and journalctl.
systemctl status shows 0 failed units.
journalctl -e shows more stuff, but I wasn't able to copy and paste it here quickly; the system froze again, and this time Ctrl+Alt+F1 didn't respond.

What I'm seeing with openSUSE is that, as far as I've tried, it only freezes a few seconds after loading Firefox. I uninstalled Firefox and tried Falkon, and the same thing happens.

I will continue to test openSUSE without opening a browser to see if it freezes. In the other distros it froze anywhere, but that doesn't seem to be the case now. On the other hand, I'll get my hands on another USB stick and perform another install, so I can rule out my USB stick as the culprit.

I will report back. Thank you all for your help, I really appreciate it.
 
Reporting back.

I tried with a newer USB stick. I installed openSUSE Tumbleweed KDE and it froze right away while I was moving the welcome popup around. I tried another fresh install, this time with GNOME, just to see if that had something to do with it. It lasted longer: I left it idle and it didn't freeze, until I opened Firefox, came to this forum and tried to log in. As soon as I was about to press login, complete freeze again.

Should I just give up? Is my system incompatible? Windows 11 works perfectly; I have long gaming sessions and no problems at all, so my RAM and video card are clearly OK, but I really want to be a Linux user, darn it!

Thank you all again for your assistance. I will continue to try things if you have ideas, but I do not know what to search for anymore.
 
did you ever test fedora, nobara or endeavour os to try the other branches of the linux family tree? in addition, debian is what ubuntu is based on, so it could be worth at least a live test. some users here (myself included) have had good luck with mx linux (https://mxlinux.org/), which is slightly different from most of the above in that it doesn't use systemd by default.
 
Is this machine a Lenovo V55t-15API business tower?

 
did you ever test either fedora or nobara or endeavour os to try the other branches of the linux family tree? in addition, debian is pre-ubuntu so it could be worth at least a live test. some users here (myself included) have had good luck with mx linux (https://mxlinux.org/) which is slightly different from most of the above in that it doesn't use systemd by default.

I did not, but I will try MX Linux. I see it offers XFCE, KDE and Fluxbox. In your opinion, which should I choose, considering my problem?

Is this machine a Lenovo V55t-15API business tower?
Hmm no it's not, just a regular build I guess
 
