Hey all. I have an embedded board with a quad-core ARM Cortex-A53 that's running kernel 5.4.74. We build the kernel ourselves, and we patch it with the latest 5.4.74-rt42 patches from here (https://mirrors.edge.kernel.org/pub/linux/kernel/projects/rt/5.4/older/).
We have an instability in our system. If it is left running long enough (30+ minutes), processes that are blocked in the nanosleep() system call are never woken up. It's not just our own user apps that get stuck either; it's any process in the OS that calls nanosleep(). The "ping" program gets stuck in a sleep, and even the "sleep" command run from bash gets stuck.
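For anyone who wants to watch for the same symptom, here is a rough sketch of a watchdog that reports when a sleeping command fails to return. The function name `sleep_returns` and the 5 second deadline are made up for illustration. It busy-polls the monotonic clock instead of doing a timed wait, on the assumption that a timed wait in the watchdog could hang in the same way as the process it is watching.

```python
import subprocess
import time


def sleep_returns(cmd, deadline_s=5.0):
    """Return True if cmd exits within deadline_s seconds, else False.

    Hypothetical helper. The wait is a busy poll on time.monotonic()
    rather than a timed wait, so the watchdog itself does not rely on
    a timer wakeup to notice that the child is stuck.
    """
    proc = subprocess.Popen(cmd)
    start = time.monotonic()
    while proc.poll() is None:
        if time.monotonic() - start > deadline_s:
            # The child is overdue: kill it and reap it so it does not linger.
            proc.kill()
            proc.wait()
            return False
    return True
```

Calling `sleep_returns(["sleep", "0.1"])` in a loop and logging the first `False` would give a rough timestamp for when the system enters the bad state. The busy poll burns a core while it waits, so this is only meant as a debugging aid.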
We've noticed that the kernel timer_list on CPU 0 accumulates negative timer entries while the system is in this state. As far as I can tell, a negative entry means the timer is long past its expiry time. It looks as though the timer queues are not being serviced quickly enough, or possibly not at all.
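A minimal sketch of how the negative entries could be counted per CPU, assuming the text comes from /proc/timer_list and that each timer has a line of the form `# expires at A-B nsecs [in X to Y nsecs]` under a `cpu: N` header. The function name `count_overdue` is hypothetical, and treating the second bracketed number as the latest allowed expiry is my reading of the format, not something I have confirmed.

```python
import re
from collections import Counter

# Assumed line formats in /proc/timer_list output.
CPU_RE = re.compile(r"^cpu:\s*(\d+)")
EXPIRES_RE = re.compile(r"\[in (-?\d+) to (-?\d+) nsecs\]")


def count_overdue(timer_list_text):
    """Return {cpu: count} of timers whose expiry window is already past."""
    overdue = Counter()
    cpu = None
    for line in timer_list_text.splitlines():
        m = CPU_RE.match(line.strip())
        if m:
            # Start of a new per-CPU section.
            cpu = int(m.group(1))
            continue
        m = EXPIRES_RE.search(line)
        # Assumption: the second number is the latest allowed expiry
        # relative to now, so a negative value means the timer is overdue.
        if m and cpu is not None and int(m.group(2)) < 0:
            overdue[cpu] += 1
    return dict(overdue)
```

Feeding it the contents of /proc/timer_list (reading that file usually needs root) every few seconds and logging the result would show whether the overdue count only ever grows on CPU 0 or whether the other cores are affected too.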
If we fall back to the normal 5.4.74 kernel (i.e. don't apply the rt patch), this issue disappears altogether. I'm wondering where else we can look for a potential solution, or what next steps would help debug this instability. Thanks!