ssh stops responding, process count and load average are going through the roof.

donvito8

As in the subject line.
Red Hat 7: one server stops responding to ssh from time to time.
The server still answers pings and is quite responsive, but the number of processes keeps growing (e.g. 1900), and so does the load average: 1596.82, 1589.61, 1571.18

What I noticed is that /var/log/messages is flooded with:

kernel: audit: audit_backlog=65536 > audit_backlog_limit=65535
kernel: audit: audit_lost=2112805915 audit_rate_limit=0 audit_backlog_limit=65535
kernel: audit: backlog limit exceeded

Any idea how to troubleshoot this?
 


You can change the backlog limit if you want. That may make a difference.
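For anyone landing here later, a rough sketch of what changing the limit looks like on RHEL 7. `auditctl -s` and `auditctl -b` are the standard audit tools; the two helper functions and the "double it" rule are just illustrative assumptions, not a recommended value. Note that the log above already shows a limit of 65535, so raising it may only delay the flood if something is generating audit events faster than auditd can drain them.

```shell
#!/bin/sh
# Sketch only: inspect and raise the kernel audit backlog limit (run as root).

# Pull the most recent audit_backlog_limit value out of kernel log lines on stdin.
backlog_limit_from_log() {
    sed -n 's/.*audit_backlog_limit=\([0-9][0-9]*\).*/\1/p' | tail -n 1
}

# Print the auditctl command that would double a given limit (does not run it).
# Doubling is an arbitrary choice made for this example.
suggest_backlog_cmd() {
    echo "auditctl -b $(( $1 * 2 ))"
}

# Typical use on the affected host:
#   auditctl -s            # shows current backlog, lost count and backlog_limit
#   grep audit_backlog_limit /var/log/messages | backlog_limit_from_log
#   auditctl -b 131070     # takes effect immediately, lost on reboot
#   # To persist: put "-b 131070" in /etc/audit/rules.d/audit.rules,
#   # then run "augenrules --load".
```

The helpers only read text and print a suggestion, so they are safe to try before touching the running audit configuration.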

I'd also look at top/htop to see if there's anything that stands out as a huge consumer of resources, be it CPU or RAM. I'd lean towards CPU but there could be a RAM bottleneck, so I'd check that too.
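If top/htop are awkward to read with that many processes, a non-interactive check may be easier. The sketch below is one way to do it with plain `ps`; the `count_states` helper name is made up for this example. On Linux the load average counts tasks in uninterruptible sleep (state D) as well as runnable ones, so a huge load with modest CPU use often shows up as a pile of D-state processes.

```shell
#!/bin/sh
# Sketch only: summarise process states from "ps" output.

# Count processes by the first letter of their state (R, S, D, Z, ...).
# Reads one state string per line on stdin, e.g. from "ps -eo stat=".
count_states() {
    awk '{ n[substr($1, 1, 1)]++ } END { for (s in n) print s, n[s] }' | sort
}

# Usage on a live host:
#   ps -eo stat= | count_states
#   ps -eo pid,stat,pcpu,pmem,comm --sort=-pcpu | head -n 15   # top CPU users
#   ps -eo pid,stat,pcpu,pmem,comm --sort=-pmem | head -n 15   # top RAM users
```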

Edit: This is more than just getting started or whatnot, so I've moved it to the Red Hat sub-forum.
 
KGill,

Thank you for your answer. Unfortunately I cannot run top/htop because the server is remote and ssh is not available, but I still keep receiving notifications from Nagios. CPU is stable, memory is even decreasing. Open files: 90,000+

I can only imagine that the server is stuck in a loop: it keeps opening new files and starting new processes but does not close any of them for some reason.
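Once there is a shell on the box (console, KVM, or after a reboot), something like the following could show which process is holding all those files. It is only a sketch: `top_fd_holders` and the `PROC_ROOT` override are names invented for this example, and it needs root to see other users' descriptors (unreadable ones are counted as 0).

```shell
#!/bin/sh
# Sketch only: list the processes holding the most open file descriptors.

# Print "count pid command" lines, highest count first.
# $1 = how many lines to show (default 10).
# PROC_ROOT can point at a different tree; it defaults to /proc.
top_fd_holders() {
    root="${PROC_ROOT:-/proc}"
    for dir in "$root"/[0-9]*; do
        [ -d "$dir/fd" ] || continue
        count=$(ls "$dir/fd" 2>/dev/null | wc -l | tr -d ' ')
        name=$(cat "$dir/comm" 2>/dev/null)
        echo "$count ${dir##*/} $name"
    done | sort -rn | head -n "${1:-10}"
}

# Usage on a live host (as root):
#   top_fd_holders 5
#   cat /proc/sys/fs/file-nr   # allocated handles, unused handles, system maximum
```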

One of the developers can still log in through the Java KVM and work with no problems.

I am thinking about a script that checks sshd and starts/restarts it if it is not active.
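A minimal sketch of such a watchdog, assuming systemd and the `sshd` unit name as on RHEL 7. The function name, the `SYSTEMCTL` override and the cron path in the comments are hypothetical. One caveat: if sshd is still running but the host is too overloaded to service logins, `systemctl is-active` will still report it as active and this script will do nothing.

```shell
#!/bin/sh
# Sketch only: restart a service if systemd reports it as not active.

# $1 = unit name (default: sshd).
# SYSTEMCTL can name a stand-in command; it defaults to the real systemctl.
check_and_restart() {
    unit="${1:-sshd}"
    ctl="${SYSTEMCTL:-systemctl}"
    if "$ctl" is-active --quiet "$unit"; then
        echo "$unit active"
    else
        echo "$unit inactive, restarting"
        "$ctl" restart "$unit"
    fi
}

# Usage (as root), e.g. from a script run by cron every minute:
#   check_and_restart sshd
# Example crontab line, with a made-up script path:
#   * * * * * /usr/local/sbin/sshd-watchdog.sh
```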
 
You can't reboot the server and get SSH access before it goes haywire?

At the same time, assuming you have enough time, you could try changing the backlog limit. (That's something you can Google, from what I see.)

This sounds like a bit of an issue. There will also be (hopefully) responses from people more adept than I am.
 
