Startup issues with Linux Mint

Yeah, I erased the previous Windows installation and installed Linux instead of running a dual-boot set-up. I was tired of tip-toeing around all the Microsoft nonsense.

How do I go about switching the Kernel to 6.8?
 


Sorry I had to find my notes...
https://www.linux.org/threads/what-a-difference-a-kernel-makes.58728/page-2 post 21.
 
When done...open the Terminal and run this command...
Code:
uname -r

You'll see the release of the running kernel...it should now start with 6.8.


Hope this fixes your problem...fingers crossed.
 
I confess I'm a little at a loss on how to do the kernel cmdline on bootup part.
This thread appears to me to have gone off the rails a bit.

The problem AIUI from post #1 is that the machine takes multiple restarts, but then works. The other mentioned issue was the ACPI error messages.

To deal with the ACPI error messages first: as @GatorsFan mentioned in post #2, they can usually be ignored because they don't usually interfere with the functioning of a system. If you want more information on that, check out this post: https://www.linux.org/threads/acpi-...resolve-symbol-yadda-yadda.58522/#post-281615, and follow the links in it to get some idea about the status of ACPI messages and what you can do about them, which is not a lot unless you are into disassembling, modifying and recompiling the ACPI tables (check out the kernel docs). Since you mention that the machine does eventually get to work, that corroborates the notion that the ACPI messages can be ignored.

On the main issue of the machine not functioning properly and needing restarts, another sort of investigation is needed: first is to look at the logs.

Fortunately the logs shown in post #7 do show what the problem might be, again identified by @GatorsFan in post #16. That error message needs to be investigated because the kernel is telling the user what has crashed the operating system:
Code:
kernel: x86/split lock detection: #AC: crashing the kernel on kernel split_locks and warning on user-space split_locks
@GatorsFan proposed a solution in post #16.

Did you, @Omnifitense, try out that proposal?

It needs to be tried because it directly addresses what the logs have shown to be the issue at this point in time.

Some instructions for adding the kernel option (boot parameter) can be found here: https://linuxconfig.org/how-to-set-kernel-boot-parameters-on-linux.

If you do not try out booting with the suggested kernel parameter, you will not know whether it resolves the issue.

Further on that issue, there is an alternative kernel option that can be tried, which is:
split_lock_detect=warn. In this case the user would receive a warning about the split lock problem. So, if the split lock is indeed the problem, the system shouldn't stop as a result of it; rather, it should continue to operate, but probably in a degraded fashion, meaning slower or more sluggishly. The slowdown is what the split lock problem is known to cause.
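
For anyone unsure how to make a boot parameter permanent, here is a minimal sketch. It assumes the system boots through GRUB and keeps its settings in /etc/default/grub, which is the usual arrangement on Linux Mint. The helper name add_kernel_param is made up for illustration; it appends one parameter to the GRUB_CMDLINE_LINUX_DEFAULT line of whatever file it is given, so it can be tried on a copy first.

```shell
# add_kernel_param FILE PARAM  (hypothetical helper, not part of any package)
# Appends PARAM to the GRUB_CMDLINE_LINUX_DEFAULT line in FILE, unless that
# line already contains it. Uses GNU sed's -i; PARAM must not contain | or &.
add_kernel_param() {
    file=$1
    param=$2
    # Nothing to do if the parameter is already on the line.
    if grep -q -- "^GRUB_CMDLINE_LINUX_DEFAULT=.*$param" "$file"; then
        return 0
    fi
    # Insert the parameter just before the closing double quote.
    sed -i "s|^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"|\1 $param\"|" "$file"
}

# Example on a copy, so nothing on the system changes:
#   cp /etc/default/grub /tmp/grub.test
#   add_kernel_param /tmp/grub.test split_lock_detect=warn
#   diff /etc/default/grub /tmp/grub.test
```

After changing the real file (as root, and with a backup), run sudo update-grub and reboot; cat /proc/cmdline then shows whether the parameter is active. To try a parameter for a single boot only, press e on the GRUB menu entry, add it to the end of the line starting with linux, and boot with Ctrl+X or F10.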

There is no point in changing the kernel before trying to deal with the current kernel's error message. It's really only reasonable to consider other approaches after the kernel's error message has been dealt with. Later kernels are generally better equipped to deal with issues than earlier kernels, though one needs to be aware that some kernels do have unexpected bugs; these are usually reported in the Linux community. A quick check online using AI shows that split lock issues have arisen with many kernels, including 6.8 and 6.17.
 
Since the OP is getting nowhere fast...the more ideas to help solve the problem the better.
New Kernel problems especially 6.14/6.17 are known...I experienced one myself...which was fixed by rolling back the Kernel to 6.8.

I'm running 5 versions of Mint...Cinnamon 22.1...22.3...Mate 22.3...Mint xfce 22.1 and Cinnamon 22.3 on my spare SSD with zero Boot problems or any other problems and I'm no expert.

Linux Mint is easy to install and run...apart from the occasional Kernel problem...I think it's either hardware or the user that's causing problems through inexperience.
Sometimes it's better to do a clean install and get the beginner to follow simple instructions...than going on for days...weeks trying to work out what has happened if it's not hardware...but that's me.
 
I did try the Split Locks thing, and it didn't do anything. I agree that at this point it may be a hardware issue.
 
If it may be hardware, then one can check the hardware. It's often the case that intermittent issues like those described in post #1 are hardware related.

It's worth upgrading to the latest possible state before checking hardware to see whether the problems still persist. That should include the latest firmware as well as the kernel and packages, and also the latest BIOS update.

If one doesn't have the system in the latest updated state, then one will be trying to fix problems that may already have been fixed by those updates. Hence, if problems persist despite the system being currently upgraded, some of the following can be looked at, mostly using root or sudo privileges. Note that none of the following changes anything, it just gathers information. It's all run in a terminal which is an efficient way to do this because the output will appear on screen immediately to be read or copied for pasting.

First check the logs for errors:
Code:
journalctl -b -x -p 3
The options mean: -b (this boot); -x (explanation of errors); -p (priority of message 3=error). Note the errors since each really needs an explanation, though not all are necessarily issues for the functioning of a system.

Check for missing firmware:
Code:
dmesg | grep -i firmware

Check for missing microcode:
Code:
dmesg | grep -i microcode

Check for cpu vulnerabilities:
Code:
lscpu
Observe the section on "Vulnerabilities" at the bottom of the output. If there are no issues, each entry reads something like "Not affected" or "Mitigation".
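
As far as I know, the kernel exposes the same information under /sys/devices/system/cpu/vulnerabilities, one file per issue. A rough sketch for listing only the entries that report "Vulnerable" follows; the function name list_vulnerable and its optional directory argument are assumptions for illustration, not an existing tool.

```shell
# list_vulnerable [DIR]  (hypothetical helper)
# Prints "name: status" for every file in DIR whose content contains
# "Vulnerable". DIR defaults to the kernel's sysfs directory.
list_vulnerable() {
    dir=${1:-/sys/devices/system/cpu/vulnerabilities}
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        status=$(cat "$f")
        case $status in
            *Vulnerable*) printf '%s: %s\n' "$(basename "$f")" "$status" ;;
        esac
    done
}

# Usage:
#   list_vulnerable
```

If it prints nothing, no entry contains the word "Vulnerable", which matches the "Not affected" or "Mitigation" case described above.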

Check the drives:
Code:
smartctl -x /dev/<device>
One can get the device name from the output of lsblk, for example:
Code:
[~]$ lsblk
NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
sr0          11:0    1  1024M  0 rom
nvme0n1     259:0    0 465.8G  0 disk
├─nvme0n1p1 259:1    0   476M  0 part /boot/efi
├─nvme0n1p2 259:2    0  14.9G  0 part [SWAP]
└─nvme0n1p3 259:3    0 450.4G  0 part /
The device name is the entry whose TYPE is "disk", prefixed with /dev/, so in this case it would be:
/dev/nvme0n1
In the output of the smartctl command check the "Critical Warning", the temperatures, and whether the drive has: PASSED.
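
To pick out the whole-disk devices without reading the tree by eye, something like the following may help. It is only a sketch: disks_from_lsblk is a made-up name, and it expects the two-column NAME TYPE output that lsblk -dn -o NAME,TYPE produces (-d skips partitions, -n drops the header).

```shell
# disks_from_lsblk  (hypothetical helper)
# Reads "NAME TYPE" pairs on stdin and prints the /dev path of each
# entry whose type is "disk".
disks_from_lsblk() {
    awk '$2 == "disk" { print "/dev/" $1 }'
}

# Usage on a real system:
#   lsblk -dn -o NAME,TYPE | disks_from_lsblk
#   sudo smartctl -x /dev/nvme0n1
```

With the lsblk example above, the only disk is nvme0n1, so the first command would print /dev/nvme0n1; sr0 is skipped because its type is rom.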

Check the memory with the package memtest86+. One can run it from a live disk or a rescue disk, which is how it's done here. On BIOS systems one can install the package and run it on the next boot by selecting it from the grub menu. It doesn't always appear in the grub menu on UEFI systems. I think it's easier to run from a live disk.

Check the filesystem. This has to be done on an unmounted filesystem. Using a live disk to run fsck is safe, and is the way it's done here. To check the various partitions, get their names from the lsblk output above. For example, the root partition is the one mounted at /, so its name is /dev/nvme0n1p3. To check the filesystem on it run:
Code:
fsck /dev/nvme0n1p3
Each partition can be checked. The output on screen will show if the filesystem is "clean" or something else.
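
Since fsck should not be run on a mounted filesystem, a small guard can be put in front of it. This is a sketch under assumptions: is_mounted is a hypothetical helper, and it only compares the device name against the first column of /proc/mounts, so a partition mounted under another name (a mapper device, for instance) would not be caught.

```shell
# is_mounted DEVICE [MOUNTS_FILE]  (hypothetical helper)
# Succeeds if DEVICE is listed as a mount source in MOUNTS_FILE,
# which defaults to /proc/mounts.
is_mounted() {
    dev=$1
    mounts=${2:-/proc/mounts}
    awk -v d="$dev" '$1 == d { found = 1 } END { exit !found }' "$mounts"
}

# Usage from a live session (device name taken from the lsblk example):
#   if is_mounted /dev/nvme0n1p3; then
#       echo "still mounted, unmount it first"
#   else
#       sudo fsck /dev/nvme0n1p3
#   fi
```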

Check the overall temperatures in the system by running:
Code:
sensors
The sensors command is in the package: lm-sensors. Install it if it's not installed.
In particular check that the temperatures are within the ranges shown in the output.

There are other checks, but the above is a basic start to gather info.
 
Did you try rolling back the Kernel ?
 
No missing firmware or microcode.

The vulnerabilities on lscpu returned:

Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Spectre v2: Mitigation; Enhanced / Automatic IBRS; IBPB conditional; PBRSB-eIBRS Not affected; BHI BHI_DIS_S
Vmscape: Mitigation; IBPB before exit to userspace

smartctl would not let me run it, 'Permission Denied'.

I'll try running the others at a later date.
 
Permission Denied just means that you have to run smartctl commands with root privileges. :)
 
After the tweaking above, I was able to run smartctl. Everything seemed to be in order apart from one thing, right at the end (no other errors logged):
Read Self-test Log failed: Invalid Field in Command (0x4002)
 
If you didn't run a self-test, there is no issue to be concerned about. There's also a known bug in smartmontools version 7.4 with NVMe drives, but that's not necessarily an issue either in determining the overall health of the drive.

Two lines to check are "Media and Data Integrity Errors", which is 0 on a healthy drive, and "Available Spare", which is 100% on a healthy drive. If those values are present in the output, then the likelihood is that there is no issue, given that there's no issue elsewhere in the output.
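
To pull just those fields out of a long smartctl report, a filter along these lines could be used. It is a sketch only: nvme_health_lines is a made-up name, and it assumes the field labels appear at the start of a line followed by a colon, as they do in the NVMe health section I have seen.

```shell
# nvme_health_lines  (hypothetical helper)
# Reads smartctl output on stdin and keeps only the "Critical Warning",
# "Available Spare" and "Media and Data Integrity Errors" lines.
nvme_health_lines() {
    grep -E '^(Critical Warning|Available Spare|Media and Data Integrity Errors):'
}

# Usage:
#   sudo smartctl -x /dev/nvme0n1 | nvme_health_lines
```

The colon in the pattern keeps the separate "Available Spare Threshold" line out of the result.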
 
This sounds exactly like my Dell laptop when I put my Linux drive in it. From what I was told it's a secure boot thing. I am running Zorin on it and it takes multiple boots to get it running. I never disabled secure boot. Once I get my new bottom case, I am going to install the Linux drive, load up 18.1, disable secure boot and see how it goes.
 
From this response, it seems like the problem may have been that secure boot wasn't disabled when I first installed Linux.

Luckily, I'm able to get around it by not restarting, just logging off and entering sleep mode every night.
 
Let's see the entire picture. Please paste the output of sudo dmesg.
 
The whole entry was too long to be posted, so I had to attach a txt file.

You pasted a document formatted by LibreOffice Writer instead of a plain text file. Also, the log doesn't start from the beginning.

Create it directly from the terminal. Run:

sudo dmesg > dmesg.txt

to generate it.
 

