UDP receive buffer error
We are running the OpenSIPS SIP proxy in a high-traffic environment; it uses UDP. We sometimes see RX errors or overrun errors on the interface. I have set rmem_max to 16M, but I am still seeing errors; this is what I see in netstat. Any idea how to fix this?
We have 40 CPUs and 64 GB of memory on the system, so it is not a resource issue.
One more thing: we are running tcpdump on it and capturing all SIP traffic. Do you think tcpdump can cause that issue?
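The interface-level RX/overrun counters mentioned above can be read directly from sysfs. A minimal sketch follows; it queries the loopback device `lo` so it runs anywhere, but on the proxy you would substitute the interface actually carrying the SIP traffic (or use `ip -s link show eth0`, where `eth0` is a placeholder name):

```shell
# Per-interface RX error/overrun counters are exported under sysfs.
# "lo" is used here only so the sketch is self-contained; replace it
# with the interface that carries the SIP traffic.
iface=lo
rx_errors=$(cat /sys/class/net/$iface/statistics/rx_errors)
rx_over=$(cat /sys/class/net/$iface/statistics/rx_over_errors)
echo "rx_errors=$rx_errors rx_over_errors=$rx_over"
```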
netstat -su
Udp:
27979570 packets received
2727 packets to unknown port received.
724419 packet receive errors
41731936 packets sent
322 receive buffer errors
0 send buffer errors
InCsumErrors: 55
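For scale, the counters above can be reduced to a single error rate. A quick awk sketch, with the sample numbers hard-coded from the netstat -su output in this question:

```shell
# Compute the UDP receive-error rate from the counters shown above.
rate=$(awk '
  /packets received$/     { rx = $1 }    # total datagrams delivered
  /packet receive errors/ { err = $1 }   # includes buffer overflows
  END { printf "error rate: %.2f%%\n", 100 * err / rx }
' <<'EOF'
27979570 packets received
2727 packets to unknown port received.
724419 packet receive errors
41731936 packets sent
322 receive buffer errors
0 send buffer errors
EOF
)
echo "$rate"
```

Roughly 2.6% of received datagrams hit a receive error, which is a lot for SIP signalling.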
dropwatch -l kas
846 drops at tpacket_rcv+5f (0xffffffff815e46ff)
3 drops at tpacket_rcv+5f (0xffffffff815e46ff)
4 drops at unix_stream_connect+2ca (0xffffffff815a388a)
552 drops at tpacket_rcv+5f (0xffffffff815e46ff)
503 drops at tpacket_rcv+5f (0xffffffff815e46ff)
4 drops at unix_stream_connect+2ca (0xffffffff815a388a)
1557 drops at tpacket_rcv+5f (0xffffffff815e46ff)
6 drops at tpacket_rcv+5f (0xffffffff815e46ff)
1203 drops at tpacket_rcv+5f (0xffffffff815e46ff)
2051 drops at tpacket_rcv+5f (0xffffffff815e46ff)
1940 drops at tpacket_rcv+5f (0xffffffff815e46ff)
541 drops at tpacket_rcv+5f (0xffffffff815e46ff)
221 drops at tpacket_rcv+5f (0xffffffff815e46ff)
745 drops at tpacket_rcv+5f (0xffffffff815e46ff)
389 drops at tpacket_rcv+5f (0xffffffff815e46ff)
568 drops at tpacket_rcv+5f (0xffffffff815e46ff)
651 drops at tpacket_rcv+5f (0xffffffff815e46ff)
622 drops at tpacket_rcv+5f (0xffffffff815e46ff)
377 drops at tpacket_rcv+5f (0xffffffff815e46ff)
1 drops at tcp_rcv_state_process+1b0 (0xffffffff81550f70)
577 drops at tpacket_rcv+5f (0xffffffff815e46ff)
9 drops at tpacket_rcv+5f (0xffffffff815e46ff)
135 drops at tpacket_rcv+5f (0xffffffff815e46ff)
217 drops at tpacket_rcv+5f (0xffffffff815e46ff)
358 drops at tpacket_rcv+5f (0xffffffff815e46ff)
211 drops at tpacket_rcv+5f (0xffffffff815e46ff)
337 drops at tpacket_rcv+5f (0xffffffff815e46ff)
54 drops at tpacket_rcv+5f (0xffffffff815e46ff)
105 drops at tpacket_rcv+5f (0xffffffff815e46ff)
27 drops at tpacket_rcv+5f (0xffffffff815e46ff)
42 drops at tpacket_rcv+5f (0xffffffff815e46ff)
249 drops at tpacket_rcv+5f (0xffffffff815e46ff)
1080 drops at tpacket_rcv+5f (0xffffffff815e46ff)
1932 drops at tpacket_rcv+5f (0xffffffff815e46ff)
1476 drops at tpacket_rcv+5f (0xffffffff815e46ff)
681 drops at tpacket_rcv+5f (0xffffffff815e46ff)
840 drops at tpacket_rcv+5f (0xffffffff815e46ff)
1076 drops at tpacket_rcv+5f (0xffffffff815e46ff)
1021 drops at tpacket_rcv+5f (0xffffffff815e46ff)
1 drops at tcp_rcv_state_process+1b0 (0xffffffff81550f70)
294 drops at tpacket_rcv+5f (0xffffffff815e46ff)
186 drops at tpacket_rcv+5f (0xffffffff815e46ff)
313 drops at tpacket_rcv+5f (0xffffffff815e46ff)
257 drops at tpacket_rcv+5f (0xffffffff815e46ff)
132 drops at tpacket_rcv+5f (0xffffffff815e46ff)
1 drops at tcp_rcv_state_process+1b0 (0xffffffff81550f70)
343 drops at tpacket_rcv+5f (0xffffffff815e46ff)
282 drops at tpacket_rcv+5f (0xffffffff815e46ff)
191 drops at tpacket_rcv+5f (0xffffffff815e46ff)
303 drops at tpacket_rcv+5f (0xffffffff815e46ff)
96 drops at tpacket_rcv+5f (0xffffffff815e46ff)
223 drops at tpacket_rcv+5f (0xffffffff815e46ff)
1 drops at tcp_rcv_state_process+1b0 (0xffffffff81550f70)
183 drops at tpacket_rcv+5f (0xffffffff815e46ff)
Tags: linux, performance, udp
Why are you running tcpdump in a production system with a high use?
– Rui F Ribeiro
Apr 16 '16 at 11:52
we are capturing sip.pcap for troubleshooting.
– Satish
Apr 16 '16 at 13:50
I would suggest doing port mirroring at switch level and capturing traffic with tcpdump in another machine.
– Rui F Ribeiro
Apr 16 '16 at 15:33
I like your idea, but we are using a cloud/SDN network, so it's no fun: sometimes you don't know where your machine is running, alongside a bunch of other services.
– Satish
Nov 30 at 14:51
PS, where possible, please include the kernel version you report your results from (or version of whatever software your results come from). The results might look different on some future kernel.
– sourcejedi
Nov 30 at 18:08
edited Apr 16 '16 at 3:47
asked Apr 16 '16 at 3:07 by Satish
1 Answer
Look up tpacket_rcv: it's in af_packet.c. AF_PACKET means tcpdump.
It looks like someone else saw a problem triggered by tcpdump, without resolution: https://groups.google.com/forum/#!msg/mechanical-sympathy/qLqYTouygTE/rq9XSBxgqiMJ
But I'm suspicious about the drops in tpacket_rcv. Probably it means the function exited via the label ring_is_full. It sounds like packets are not reaching tcpdump because its buffer is overflowing. However, I don't think this means the packet is (necessarily) being dropped completely; it can still reach the UDP socket. It suggests your dropwatch transcript didn't cover any of the UDP drops shown in the counters. I don't think AF_PACKET drops are being counted as UDP drops just because they're both datagram sockets. Unfortunately, it looks like these tp_drops are not shown by netstat.
I would want to run dropwatch and filter out the tpacket_rcv lines with grep -v, just long enough to see an increase in your UDP receive error counter.
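That suggestion can be sketched as a pipeline. In a live session it would be dropwatch -l kas piped through grep -v; here a few sample lines from the transcript above are fed in so the sketch is runnable:

```shell
# Drop the tpacket_rcv (capture-socket) lines, keeping only drops at
# other kernel sites. Live usage: dropwatch -l kas | grep -v tpacket_rcv
filtered=$(grep -v 'tpacket_rcv' <<'EOF'
846 drops at tpacket_rcv+5f (0xffffffff815e46ff)
4 drops at unix_stream_connect+2ca (0xffffffff815a388a)
1 drops at tcp_rcv_state_process+1b0 (0xffffffff81550f70)
552 drops at tpacket_rcv+5f (0xffffffff815e46ff)
EOF
)
echo "$filtered"
```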
I think rmem_max for UDP will only help if the app tries to raise its receive buffers. There are no real search results for "opensips rmem", so I would try raising rmem_default instead. But I would really hope those drops would be shown as "receive buffer errors" if that were the problem...
And yet, they're not UDP checksum errors either...
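If raising the default is worth a try, a minimal sketch follows; the 8 MiB value is purely illustrative, not something this answer has validated:

```shell
# Inspect the current defaults (readable without root).
cat /proc/sys/net/core/rmem_default
cat /proc/sys/net/core/rmem_max

# Raise the default receive buffer for *new* sockets (needs root).
# rmem_max only caps what an app may request itself via SO_RCVBUF;
# rmem_default is what a socket gets when the app asks for nothing.
sysctl -w net.core.rmem_default=8388608

# To persist across reboots, put the same key in a sysctl.d fragment:
#   echo 'net.core.rmem_default = 8388608' > /etc/sysctl.d/90-udp.conf
```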
There's another tunable called netdev_max_backlog. Apparently the corresponding overflow counter is the second column of /proc/net/softnet_stat (there's one row per CPU). But that backlog comes before the packet is fed to the UDP stack, so overflowing it shouldn't affect the UDP statistics...
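Reading that column can be sketched as below. The counters in /proc/net/softnet_stat are hexadecimal, one row per CPU; the two sample rows here are made up for illustration:

```shell
# Print the softnet backlog drop count (column 2, hex) for each CPU.
# On a live system, feed /proc/net/softnet_stat instead of the here-doc.
drops=$(
  cpu=0
  while read -r _ dropped _; do
    # printf interprets the 0x-prefixed value as a hex integer.
    printf 'cpu%d dropped %d\n' "$cpu" "0x$dropped"
    cpu=$((cpu + 1))
  done <<'EOF'
0000272f 00000001 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
000034d2 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
EOF
)
echo "$drops"
```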
Whelp, that's all I can think of now, it's a bit mysterious :(.
EDIT:
There is a setting in the SIP proxy server related to buffer size; that size was too small to handle a high pps rate. After adjusting it we are seeing good results. Drops still occur, but at very, very low counts. – Satish
@Satish thank you! That's useful to know. I.e. that overflow of socket buffers may appear as "packet receive errors" etc.; it is not very obvious. And of course it is useful for anyone who has the exact same problem as you. I have edited my answer to include your comment.
– sourcejedi
Nov 30 at 18:03
edited Nov 30 at 17:54
answered Apr 16 '16 at 10:34 by sourcejedi
Thanks for contributing an answer to Unix & Linux Stack Exchange!