Hi,
I am using Ubuntu 24.04.1 on a ThinkPad with a Core Ultra 7 155H processor. For the past 3-4 weeks my system has been freezing/lagging periodically for just an instant. I checked the System Monitor (screenshot attached) and found that one of the threads is periodically being used at ~65% (the curve is sinusoidal in nature). I couldn't find the program that is doing this. It happens even when I'm not using any programs and only background processes, maybe Dropbox sync among them, are running. It is the same even if the system is in Airplane Mode.
Can anybody help me find the cause of the issue and solve it? It is becoming very irritating, as everything is affected, even scrolling a PDF or watching YouTube.
Thanks and regards,
Saumyen
Thanks for providing the link to the video in post #4. Seeing the problem, and knowing from post #1 that it's generalised on the system, led me to the following observations.
It looks like a jitter issue. Jitter is fluctuation in a transmission signal (or display image): some offset in space and time from the normal behaviour of a regular signal. In network transmission it would refer to a bit arriving slightly before or after its clock cycle. In a GUI app, such an offset in the delivery of what are called "scheduling-clock interrupts" to the cpu causes jitter, which looks to me very much like what is seen in the video linked in post #4.
Basically, the cpu gets sent periodic signals, called ticks, every few milliseconds, and these are used to manage it. Here is a quote from my notes:
The tick is a periodic timer interrupt executing on each CPU at a frequency ranging from 100 to 1000 Hz, though some architectures propose fancier values. It performs many jobs:
- Run expired general purpose timer callbacks
- Elapse posix CPU timers and run those that have expired
- Timekeeping: maintain internal clock (jiffies) and external clock (gettimeofday())
- Scheduler: maintain internal state, fairness and priorities (task preemption)
- Maintain global load average
- Maintain perf events, etc…
I'm sorry I don't have a reference for that, because I was remiss in not recording it with my notes some years ago.
Nevertheless, the kernel docs have info here:
In that kernel doc, there are a number of kernel options that one can try to control the matter, and there is also a link to a test on GitHub that can be downloaded to investigate the issue. Compiling it requires the usual build tools, e.g. the build-essential package on Debian and similar systems.
Bear in mind that there's quite a bit of reading to do to become familiar with the processes involved.
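Before getting into compiling that test, a rough user-space check can at least show whether wake-ups are arriving late. The following is only a minimal sketch, not the test linked from the kernel doc: it asks for a fixed sleep repeatedly and records how much longer than requested each sleep took. The function names, the 4 ms period and the sample count are all assumptions chosen for illustration.

```python
import statistics
import time


def measure_wakeup_latency(period_s=0.004, samples=250):
    """Sleep for period_s repeatedly; return how late each wake-up was, in microseconds."""
    overshoots_us = []
    for _ in range(samples):
        start = time.perf_counter()
        time.sleep(period_s)
        elapsed = time.perf_counter() - start
        overshoots_us.append((elapsed - period_s) * 1e6)
    return overshoots_us


def summarise(overshoots_us):
    """Return the median, 99th percentile and maximum overshoot, in microseconds."""
    ordered = sorted(overshoots_us)
    p99 = ordered[min(len(ordered) - 1, int(0.99 * len(ordered)))]
    return {
        "median_us": statistics.median(ordered),
        "p99_us": p99,
        "max_us": ordered[-1],
    }


if __name__ == "__main__":
    # Hypothetical defaults: 250 sleeps of 4 ms take roughly one second.
    print(summarise(measure_wakeup_latency()))
```

Run it once while the machine is behaving and once while the stutter is visible. A much larger p99 or maximum in the second run would be consistent with late wake-ups, although Python's own overhead is included in the figures.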
There's some more information on the subject here:
This blog post is the second in a technical series by SUSE Lab...
www.suse.com
For comparison's sake, here are some of the relevant configurations in the kernel on a machine here that runs perfectly well:
Code:
CONFIG_NO_HZ_COMMON=y
# CONFIG_HZ_PERIODIC is not set
# CONFIG_NO_HZ_IDLE is not set
CONFIG_NO_HZ_FULL=y
# CONFIG_NO_HZ is not set
# CONFIG_HZ_100 is not set
CONFIG_HZ_250=y
# CONFIG_HZ_300 is not set
# CONFIG_HZ_1000 is not set
CONFIG_HZ=250
CONFIG_MACHZ_WDT=m
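To compare with your own kernel, the same options can be pulled out of its config file. Below is a minimal sketch, assuming the Debian/Ubuntu convention of /boot/config-<kernel release>; the function names are made up for the illustration. It also converts CONFIG_HZ into the tick period, e.g. 250 Hz corresponds to a tick every 4 ms.

```python
import platform
from pathlib import Path


def tick_settings(config_text):
    """Extract the enabled tick-related options (CONFIG_HZ*, CONFIG_NO_HZ*) from config text."""
    settings = {}
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as "# CONFIG_X is not set" and are skipped here.
        if line.startswith("CONFIG_") and "=" in line:
            key, value = line.split("=", 1)
            if key == "CONFIG_HZ" or key.startswith(("CONFIG_HZ_", "CONFIG_NO_HZ")):
                settings[key] = value
    return settings


def tick_period_ms(settings):
    """Tick period in milliseconds implied by CONFIG_HZ."""
    return 1000 / int(settings["CONFIG_HZ"])


if __name__ == "__main__":
    # Assumption: Debian/Ubuntu-style config location; other distros may differ.
    path = Path("/boot") / f"config-{platform.release()}"
    try:
        found = tick_settings(path.read_text())
    except OSError:
        print(f"could not read {path}")
    else:
        print(found)
        if "CONFIG_HZ" in found:
            print(f"tick period: {tick_period_ms(found)} ms")
```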
In relation to the cpu frequency activity that you proposed as the issue, it is not unusual for the activity of a multi-core cpu to fluctuate quite a bit, as @MikeWalsh pointed out in post #2. This can be readily shown with the i7z command on Intel cpus such as the one mentioned in post #1.
For example, in the following snapshot of i7z output from a terminal on a computer here that functions perfectly well, core 7 has almost double the actual frequency of the other cores. If one lets the i7z output run continuously on screen, that higher frequency is seen to subside and move around as other cpu cores are used. The i7z command needs to be run as root:
Code:
$ sudo i7z
Cpu speed from cpuinfo 2495.00Mhz
cpuinfo might be wrong if cpufreq is enabled. To guess correctly try estimating via tsc
Linux's inbuilt cpu_khz code emulated now
True Frequency (without accounting Turbo) 2495 MHz
CPU Multiplier 25x || Bus clock frequency (BCLK) 99.80 MHz
Socket [0] - [physical cores=14, logical cores=20, max online cores ever=14]
TURBO ENABLED on 14 Cores, Hyper Threading ON
Max Frequency without considering Turbo 2594.80 MHz (99.80 x [26])
Max TURBO Multiplier (if Enabled) with 1/2/3/4/5/6 Cores is 48x/48x/47x/47x/45x/45x
Real Current Frequency 1628.95 MHz [99.80 x 16.32] (Max of below)
Core [core-id] :Actual Freq (Mult.) C0% Halt(C1)% C3 % C6 % Temp VCore
Core 1 [0]: 814.03 (8.16x) 2.65 94.3 0 4.81 23 0.7723
Core 2 [2]: 820.96 (8.23x) 1 98.7 0 1 24 0.7723
Core 3 [4]: 823.77 (8.25x) 1 98.8 0 1 24 0.7723
Core 4 [6]: 797.78 (7.99x) 1 99.8 0 0 24 0.7723
Core 5 [8]: 824.49 (8.26x) 1 98.7 0 1 24 0.7720
Core 6 [10]: 817.38 (8.19x) 1.23 93.2 0 6.36 26 0.7720
Core 7 [12]: 1628.95 (16.32x) 2.89 1.19 0 96.9 26 0.7720
Core 8 [13]: 798.67 (8.00x) 1 0.11 0 99.9 26 0.7709
Core 9 [14]: 799.11 (8.01x) 1 1.4 0 98.4 27 0.7709
Core 10 [15]: 798.20 (8.00x) 1 0.97 0 99 27 0.7709
Core 11 [16]: 804.01 (8.06x) 1 1.13 0 98.8 23 0.7709
Core 12 [17]: 801.02 (8.03x) 1 1.2 0 98.7 23 0.7709
Core 13 [18]: 796.85 (7.98x) 1 1.28 0 98.6 23 0.7720
Core 14 [19]: 796.83 (7.98x) 1 1.91 0 98 24 0.7720
Ctrl+C to exit
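If i7z isn't installed, a cruder view of the same fluctuation is available from /proc/cpuinfo. This is a minimal sketch, assuming an x86 machine whose /proc/cpuinfo has a "cpu MHz" line for each logical CPU (not every architecture reports it). The function name is hypothetical, and the numbers are per logical CPU rather than per physical core as in i7z.

```python
from pathlib import Path


def core_frequencies(cpuinfo_text):
    """Map each logical CPU number to its reported frequency in MHz."""
    freqs = {}
    cpu = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines between CPU entries
        key = key.strip()
        if key == "processor":
            cpu = int(value)
        elif key == "cpu MHz" and cpu is not None:
            freqs[cpu] = float(value)
    return freqs


if __name__ == "__main__":
    source = Path("/proc/cpuinfo")
    if source.exists():
        for cpu, mhz in sorted(core_frequencies(source.read_text()).items()):
            print(f"cpu {cpu}: {mhz:.2f} MHz")
    else:
        print("/proc/cpuinfo not available on this system")
```

Running it a few times in a row should show the higher frequency moving between CPUs, much as it does in the continuous i7z display.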
The upshot of the observation is that spikes in cpu frequency are not necessarily part of the freezing issue. Rather, as mentioned earlier, it looks like a jitter issue related to the scheduling of the ticks, so that avenue of investigation may be of some use. Of course I can't say with any certainty, but it's my best hunch.
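One rough way to start on that avenue is to watch how many local timer interrupts each CPU actually receives per second. The sketch below assumes an x86 kernel, where /proc/interrupts carries a "LOC:" row (local timer interrupts); the function names are invented for the example. That row counts more than just the scheduling-clock tick, so treat the rates as an indication only.

```python
import time
from pathlib import Path


def local_timer_counts(interrupts_text):
    """Return {cpu name: local timer interrupt count} from /proc/interrupts text."""
    lines = interrupts_text.splitlines()
    if not lines:
        return {}
    cpu_names = lines[0].split()  # header row, e.g. CPU0 CPU1 ...
    for line in lines[1:]:
        fields = line.split()
        if fields and fields[0] == "LOC:":
            # zip stops at the last CPU column, dropping the trailing description.
            return {name: int(count) for name, count in zip(cpu_names, fields[1:])}
    return {}


def timer_rates(before, after, interval_s):
    """Interrupts per second for each CPU present in both samples."""
    return {
        cpu: (after[cpu] - before[cpu]) / interval_s
        for cpu in before
        if cpu in after
    }


if __name__ == "__main__":
    source = Path("/proc/interrupts")
    if source.exists():
        first = local_timer_counts(source.read_text())
        time.sleep(1.0)
        second = local_timer_counts(source.read_text())
        for cpu, rate in timer_rates(first, second, 1.0).items():
            print(f"{cpu}: {rate:.0f} local timer interrupts/s")
    else:
        print("/proc/interrupts not available on this system")
```

On a tickless kernel, idle CPUs would be expected to show rates well below CONFIG_HZ, so it is the comparison between a quiet moment and a stuttering one that is of interest.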
EDIT: Another approach to this problem could be to investigate whether the GPU is implicated. One way of doing that is to run the machine in text mode for longer than the aberrant behaviour normally takes to appear. If it doesn't occur, then it may be worth looking into the GPU in greater detail, in particular the drivers. Just a thought.