Recv-Q hung state or full

Discussion in 'Linux Networking' started by linbeg, Sep 4, 2013.

  1. linbeg

    linbeg New Member

    Messages:
    8
    Likes Received:
    1
    Trophy Points:
    3
    Recently we have been facing issues with Recv-Q getting hung: the receive buffer gets stuck at some point, which in turn increases the CPU usage of the process that uses that socket. Please help me work out what needs to be checked when this happens.

    ryanvade likes this.
  2. grim76

    grim76 Active Member Staff Writer

    Messages:
    177
    Likes Received:
    48
    Trophy Points:
    28
    Do you have error logs, or messages in the logs that point to this conclusion?
    ryanvade likes this.
  3. linbeg

    linbeg New Member

    Hi Grim,

    I used the ss command to find this:

    ss dst 110.160.23.11
    State Recv-Q Send-Q Local Address:port Peer Address:port
    ESTAB 0 0 10.50.1.11:48298 110.160.23.11:1110
    ESTAB 0 0 10.50.1.11:52109 110.160.23.11:1110
    ESTAB 1181589 0 10.50.1.11:40343 110.160.23.11:1110
    ESTAB 0 0 10.50.1.11:48362 110.160.23.11:1110
    ESTAB 0 0 10.50.1.11:40529 110.160.23.11:1110
    ESTAB 0 0 10.50.1.11:52101 110.160.23.11:1110
    ESTAB 0 0 10.50.1.11:35219 110.160.23.11:1110
    ESTAB 0 0 10.50.1.11:52122 110.160.23.11:1110
    ESTAB 0 0 10.50.1.11:52113 110.160.23.11:1110
    ESTAB 0 0 10.50.1.11:54278 110.160.23.11:1110
    ESTAB 30788 0 10.50.1.11:60268 110.160.23.11:1110
    ESTAB 0 0 10.50.1.11:40528 110.160.23.11:1110
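    With many connections to the same peer, the backlogged sockets are easy to miss by eye. A minimal sketch, assuming the five-column layout shown above, that picks out the rows with a non-zero Recv-Q. The names SocketRow, parse_ss and backlogged are made up for illustration and are not part of any tool.

```python
from typing import List, NamedTuple


class SocketRow(NamedTuple):
    state: str
    recv_q: int
    send_q: int
    local: str
    peer: str


def parse_ss(text: str) -> List[SocketRow]:
    """Parse the first five columns of `ss` output, skipping the header."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have at least five fields and numeric queue sizes;
        # the header ("Recv-Q", "Send-Q") fails the isdigit() check.
        if len(fields) < 5 or not (fields[1].isdigit() and fields[2].isdigit()):
            continue
        rows.append(SocketRow(fields[0], int(fields[1]), int(fields[2]),
                              fields[3], fields[4]))
    return rows


def backlogged(rows: List[SocketRow], threshold: int = 0) -> List[SocketRow]:
    """Return the rows whose Recv-Q is above the threshold (in bytes)."""
    return [row for row in rows if row.recv_q > threshold]


# A few rows copied from the output in this post.
SAMPLE = """\
State Recv-Q Send-Q Local Address:port Peer Address:port
ESTAB 0 0 10.50.1.11:48298 110.160.23.11:1110
ESTAB 1181589 0 10.50.1.11:40343 110.160.23.11:1110
ESTAB 30788 0 10.50.1.11:60268 110.160.23.11:1110
"""

if __name__ == "__main__":
    for row in backlogged(parse_ss(SAMPLE)):
        print(row.local, row.recv_q)
```

    Running this periodically and comparing the values for the same local port would show whether a queue is draining slowly or not draining at all.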
  4. ryanvade

    ryanvade Administrator Staff Member Staff Writer

    Messages:
    1,226
    Likes Received:
    413
    Trophy Points:
    83
    What command exactly are you trying? Perhaps adding --verbose can show you what is going on.
  5. linbeg

    linbeg New Member

    I was running "ss dst IP". The issue occurs at random. I will check with the --verbose option when it occurs again. In the meantime, can you give me a couple more things to check if the issue recurs?
  6. linbeg

    linbeg New Member

    Also, we have not tuned the TCP parameters on our servers, since Linux servers are capable of auto-tuning themselves. I found this configuration for the receive buffer size:

    net.core.rmem_max = 131071
    net.core.rmem_default = 124928
    net.ipv4.tcp_rmem = 4096 87380 4194304

    After googling several websites, I found that the net.ipv4.tcp_rmem (max) value should not be greater than the net.core.rmem_max value. However, in the settings above:

    tcp_rmem max value is 4194304
    rmem_max value is 131071

    Will this have any impact?
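    To make the comparison concrete, here is a rough sketch that reads the three settings in the form quoted above and reports how the two maxima relate. It rests on my understanding of the socket(7) and tcp(7) man pages, which is an assumption to verify on your kernel: net.core.rmem_max limits what an application may request with SO_RCVBUF, while the third tcp_rmem value bounds the automatically tuned TCP receive buffer. The function names and result keys are invented for this example.

```python
def parse_sysctl(text):
    """Parse `key = value` lines, as printed by sysctl, into a dict of token lists."""
    settings = {}
    for line in text.splitlines():
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.split()
    return settings


def receive_limits(settings):
    """Compare the tcp_rmem maximum with net.core.rmem_max."""
    # tcp_rmem holds three values: min, default, max.
    _tcp_min, _tcp_default, tcp_max = (int(v) for v in settings["net.ipv4.tcp_rmem"])
    rmem_max = int(settings["net.core.rmem_max"][0])
    return {
        # Assumed meaning: upper bound for autotuned TCP receive buffers.
        "autotune_ceiling": tcp_max,
        # Assumed meaning: upper bound for explicit SO_RCVBUF requests.
        "setsockopt_ceiling": rmem_max,
        "tcp_max_exceeds_rmem_max": tcp_max > rmem_max,
    }


# The values quoted in this post.
SAMPLE = """\
net.core.rmem_max = 131071
net.core.rmem_default = 124928
net.ipv4.tcp_rmem = 4096 87380 4194304
"""

if __name__ == "__main__":
    print(receive_limits(parse_sysctl(SAMPLE)))
```

    If that reading is right, the two limits apply to different situations, so a tcp_rmem maximum above rmem_max is not by itself a misconfiguration.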
  7. ryanvade

    ryanvade Administrator Staff Member Staff Writer

    I do not believe it will cause any issues, however I am not sure.
  8. linbeg

    linbeg New Member

    Well, I'm stuck, with nowhere to go. I guess I need to check more if the issue recurs. What would be the best TCP configuration for a 16G server, or would the default tuning done by Linux be better?
  9. grim76

    grim76 Active Member Staff Writer

    The Send-Q and Recv-Q can be high while the system is transferring data, for example when streaming data to a server. You can make adjustments to the buffers, but keep in mind that it may add some latency to network communications.
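    For an established TCP socket, Recv-Q counts bytes the kernel has received but the application has not yet read, so it grows whenever the reader falls behind. A small Linux-only sketch that reproduces this on the loopback interface: one side sends, the other deliberately does not read, and the FIONREAD ioctl reports the unread byte count. The helper names and the 5000-byte payload are arbitrary choices for the demonstration.

```python
import fcntl
import socket
import struct
import termios
import time


def unread_bytes(sock):
    """Return how many received bytes are waiting to be read on this socket."""
    raw = fcntl.ioctl(sock.fileno(), termios.FIONREAD, struct.pack("i", 0))
    return struct.unpack("i", raw)[0]


def wait_for_unread(sock, expected, timeout=2.0):
    """Poll until `expected` bytes are queued or the timeout expires."""
    deadline = time.monotonic() + timeout
    while unread_bytes(sock) < expected and time.monotonic() < deadline:
        time.sleep(0.01)
    return unread_bytes(sock)


def demo(payload_size=5000):
    """Return (bytes queued while nobody reads, bytes queued after reading)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(1)
    sender = socket.create_connection(listener.getsockname())
    receiver, _ = listener.accept()
    try:
        sender.sendall(b"x" * payload_size)
        # The receiver has not called recv(), so the data sits in its queue.
        queued = wait_for_unread(receiver, payload_size)
        remaining = payload_size
        while remaining:
            chunk = receiver.recv(remaining)
            if not chunk:
                break
            remaining -= len(chunk)
        # Once the application has read everything, the queue is empty again.
        return queued, unread_bytes(receiver)
    finally:
        for s in (sender, receiver, listener):
            s.close()


if __name__ == "__main__":
    print(demo())
```

    A queue that stays large and never shrinks therefore points at the reading side: the process is not calling recv() on that socket, whatever the buffer sizes are.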

    Tuning is an art and you are going to have to play with the settings to get them right for what you are doing. This is the hard part of being a sysadmin.
    ryanvade likes this.
  10. ryanvade

    ryanvade Administrator Staff Member Staff Writer

  11. linbeg

    linbeg New Member

    The issue recurred. Below is the output for the ephemeral port on which Recv-Q is high; it looks like it got stuck and the application is not able to read.

    ~$ sudo ss -emoi src 112.213.11.100:59511
    State Recv-Q Send-Q Local Address:port Peer Address:port
    ESTAB 1510228 0 112.213.11.100:59511 109.160.55.100:1110 uid:700 ino:37515225 sk:ffff8800148dc780
    mem:(r1657624,w0,f1256,t0) ts sack cubic wscale:2,7 rto:270 rtt:42.5/7.5 ato:100 cwnd:4 ssthresh:3 send 1.0Mbps rcv_rtt:308.75 rcv_space:1149054
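    The extended output is dense, so here is a sketch that pulls out the two receive-side numbers: the mem:(r,w,f,t) counters and rcv_space. I am assuming that r is the memory currently allocated for received data, which is how I read this older ss format; newer ss versions print a different skmem:(...) field, so the pattern would need adjusting. The function name and dictionary keys are hypothetical.

```python
import re

# Matches the older `mem:(r<n>,w<n>,f<n>,t<n>)` field printed by `ss -m`.
MEM_RE = re.compile(r"mem:\(r(\d+),w(\d+),f(\d+),t(\d+)\)")
RCV_SPACE_RE = re.compile(r"rcv_space:(\d+)")


def parse_extended(text):
    """Extract the memory counters and rcv_space from `ss -emoi` output."""
    info = {"mem": None, "rcv_space": None}
    mem = MEM_RE.search(text)
    if mem:
        info["mem"] = dict(zip("rwft", map(int, mem.groups())))
    space = RCV_SPACE_RE.search(text)
    if space:
        info["rcv_space"] = int(space.group(1))
    return info


# The two output lines from this post.
SAMPLE = (
    "ESTAB 1510228 0 112.213.11.100:59511 109.160.55.100:1110 "
    "uid:700 ino:37515225 sk:ffff8800148dc780\n"
    "mem:(r1657624,w0,f1256,t0) ts sack cubic wscale:2,7 rto:270 "
    "rtt:42.5/7.5 ato:100 cwnd:4 ssthresh:3 send 1.0Mbps "
    "rcv_rtt:308.75 rcv_space:1149054\n"
)

# net.core.rmem_max as quoted earlier in the thread.
RMEM_MAX = 131071

if __name__ == "__main__":
    info = parse_extended(SAMPLE)
    print(info)
    print("receive memory above rmem_max:", info["mem"]["r"] > RMEM_MAX)
```

    On these numbers the receive-side allocation is well above the rmem_max of 131071 quoted earlier, which suggests the buffer was sized by autotuning rather than capped by that setting.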
  12. grim76

    grim76 Active Member Staff Writer

    I think you need to look at more than your Recv-Q. You are going to want to look at your system overall. You could have problems with disk I/O, not enough RAM, bad application code, misconfigured NIC teaming, or a litany of other things.

    You are going to want to start with what the application is doing when this happens and work your way back until you find the problem. This problem may also be a symptom of problems on the other side of the connection.

    More information about the problem would be a good thing. Right now we don't have much to work with.

    Here are some things that will be helpful:
    1. Amount of RAM in the machine
    2. Is the machine virtual or physical
    3. Does the machine host any other applications
    4. Amount of hard drive space in the machine
    5. How is that hard drive space configured (File system, LVM, etc....)
    6. Recent changes to the system (Patching, updates, new software, etc....)
    7. Recent changes to the application (Patching, updates, new versions...)
    8. Was the application ever working properly
    9. If the application was working properly then what changed right before you noticed it wasn't working.
    10. Does the system show a high load during normal operations

    This is just a list of questions to get started with.

    I apologize if this post comes off a bit rough. Sometimes asking the tough questions is the only way to come to a solution.
  13. linbeg

    linbeg New Member

    Hi Grim,

    I understand :) . I was backtracking, and a few changes were made at the application end over the last couple of months. I'm going to make a few changes on the application side and let you know if the issue still exists.

    Also, these applications have been running for a long time on the same configuration, but these issues have only cropped up in the last couple of months, which makes me wonder what changes were carried out during this period.

    Thanks Grim, will get back to you soon.

    Cheers :) .
  14. Tejashree

    Tejashree New Member

    Messages:
    1
    Likes Received:
    0
    Trophy Points:
    1
    I am getting the same issue after we moved our servers from physical machines to virtual machines. Send-Q is not a problem, but the Recv-Q size is not decreasing; as a result we are losing a lot of hits from the other side because our application is not accepting any data.
  15. grim76

    grim76 Active Member Staff Writer

    Please start a new thread so that your issue can properly be addressed.
