[DCOS_OSS-1591] Long running Connections to VIPs timeout and cause service failures. Created: 25/Aug/17 Updated: 10/Feb/21 Resolved: 10/Feb/21 |
|
Status: | Resolved |
Project: | DC/OS |
Component/s: | networking |
Affects Version/s: | DC/OS 1.10.2 |
Fix Version/s: | None |
Type: | Task | Priority: | Medium |
Reporter: | Jeffrey Zampieron | Assignee: | Vinod Kone (Inactive) |
Resolution: | Won't Do |
Labels: | add_team:20200603, documentation, issuetype:improvement |
Remaining Estimate: | Not Specified | ||
Time Spent: | Not Specified | ||
Original Estimate: | Not Specified |
Team: |
Description |
In my setup, we use a VIP to connect to pgpool from a set of Java services. Anyone running microservices on DC/OS with database connections over VIPs will most likely hit this failure. It is mentioned at the bottom of the FAQ at https://dcos.io/docs/1.9/networking/load-balancing-vips/virtual-ip-addresses/, but that entry does not include a detailed discussion of the configuration needed to control or mitigate the situation. I found a number of adjustments to be necessary to avoid the problem. A FAQ documentation update is probably sufficient, but I'm not familiar enough with the details of minuteman to write it completely. This issue may also require a change to the DC/OS installer so that it sets the proper TCP keepalive settings.
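As background for the FAQ discussion: the usual application-level mitigation is to enable TCP keepalive on the client socket with an idle time below the IPVS connection timeout (900 s by default), so the load balancer never classifies the connection as idle and drops it. A minimal Python sketch; the VIP hostname, port, and timer values below are illustrative assumptions, not taken from this ticket:

import socket

# Hypothetical VIP name and port; substitute the l4lb address your service uses.
VIP_HOST = "pgpool.marathon.l4lb.thisdcos.directory"
VIP_PORT = 5432

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Enable keepalive probes on this connection.
sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

# Linux-specific knobs: start probing after 600 s of idle time (well below the
# 900 s IPVS idle timeout), probe every 60 s, give up after 5 failed probes.
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 600)
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 60)
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 5)

sock.connect((VIP_HOST, VIP_PORT))

The same effect can usually be had without code changes by lowering net.ipv4.tcp_keepalive_time on the client hosts, provided the application enables SO_KEEPALIVE at all.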
Comments |
Comment by deric (Inactive) [ 24/Mar/18 ] |
I'm having the same issue with TCP connections. Is there any progress, or any debugging output needed to investigate this issue? The kernel keepalive defaults on my agents are:

$ cat /proc/sys/net/ipv4/tcp_keepalive_time
7200
$ cat /proc/sys/net/ipv4/tcp_keepalive_intvl
75
$ cat /proc/sys/net/ipv4/tcp_keepalive_probes
9
Comment by deric (Inactive) [ 24/Mar/18 ] |
To demonstrate the issue I've created a simple publisher/subscriber service over ZeroMQ. The publisher app definition:

{
  "id": "/zmq/pub",
  "backoffFactor": 1.15,
  "backoffSeconds": 1,
  "cmd": "python3 pub.py -v",
  "container": {
    "portMappings": [
      {
        "containerPort": 6500,
        "hostPort": 0,
        "labels": { "VIP_0": "/zmq/pub:6500" },
        "protocol": "tcp",
        "servicePort": 10123,
        "name": "pub"
      }
    ],
    "type": "DOCKER",
    "volumes": [],
    "docker": {
      "image": "deric/pub-sub-sleep:latest",
      "forcePullImage": false,
      "privileged": false,
      "parameters": []
    }
  },
  "cpus": 0.1,
  "disk": 0,
  "instances": 1,
  "maxLaunchDelaySeconds": 3600,
  "mem": 128,
  "gpus": 0,
  "networks": [ { "mode": "container/bridge" } ],
  "requirePorts": false,
  "upgradeStrategy": { "maximumOverCapacity": 1, "minimumHealthCapacity": 1 },
  "killSelection": "YOUNGEST_FIRST",
  "unreachableStrategy": { "inactiveAfterSeconds": 0, "expungeAfterSeconds": 0 },
  "healthChecks": [],
  "fetch": [],
  "constraints": []
}

and the subscriber:

{
  "id": "/zmq/sub",
  "backoffFactor": 1.15,
  "backoffSeconds": 1,
  "cmd": "python3 sub.py --host tcp://zmqpub.marathon.l4lb.thisdcos.directory:6500",
  "container": {
    "type": "DOCKER",
    "volumes": [],
    "docker": {
      "image": "deric/pub-sub-sleep:latest",
      "forcePullImage": true,
      "privileged": false,
      "parameters": []
    }
  },
  "cpus": 0.1,
  "disk": 0,
  "env": {},
  "instances": 1,
  "maxLaunchDelaySeconds": 3600,
  "mem": 128,
  "gpus": 0,
  "networks": [ { "mode": "host" } ],
  "portDefinitions": [],
  "requirePorts": false,
  "upgradeStrategy": { "maximumOverCapacity": 1, "minimumHealthCapacity": 1 },
  "killSelection": "YOUNGEST_FIRST",
  "unreachableStrategy": { "inactiveAfterSeconds": 0, "expungeAfterSeconds": 0 },
  "healthChecks": [],
  "fetch": [],
  "constraints": []
}

Between messages, the publisher exponentially increases its sleep interval:

I0324 15:02:29.154968 28033 executor.cpp:160] Starting task zmq_pub.5ee5560e-2f74-11e8-bd7f-fe0d65b1eb4a
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
[INFO] Current libzmq version is 4.1.6
[INFO] Current pyzmq version is 17.0.0
[INFO] Pushing messages to: tcp://*:6500
[DEBUG] msg #0, now sleeping for 2 s
[DEBUG] msg #1, now sleeping for 4 s
[DEBUG] msg #2, now sleeping for 12 s
[DEBUG] msg #3, now sleeping for 32 s
[DEBUG] msg #4, now sleeping for 86 s
[DEBUG] msg #5, now sleeping for 235 s
[DEBUG] msg #6, now sleeping for 638 s
[DEBUG] msg #7, now sleeping for 1735 s
[DEBUG] msg #8, now sleeping for 4716 s

The subscriber joined late and therefore missed the first few messages:

I0324 15:02:11.469764 26516 exec.cpp:162] Version: 1.4.0
I0324 15:02:11.472103 26524 exec.cpp:237] Executor registered on agent b2156e36-4853-4fb1-ad88-948ddfd39ff8-S7
I0324 15:02:11.472849 26531 executor.cpp:120] Registered docker executor on 195.201.82.178
I0324 15:02:11.472980 26526 executor.cpp:160] Starting task zmq_sub.50bd118d-2f74-11e8-bd7f-fe0d65b1eb4a
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
[INFO] Current libzmq version is 4.1.6
[INFO] Current pyzmq version is 17.0.0
[INFO] Subscribing to messages from: tcp://zmqpub.marathon.l4lb.thisdcos.directory:6500
[INFO] HERE
[INFO] [1] b'msg #4, now sleeping for 86 s'
[INFO] [2] b'msg #5, now sleeping for 235 s'
[INFO] [3] b'msg #6, now sleeping for 638 s'
[INFO] [4] b'msg #7, now sleeping for 1735 s'

Nonetheless, the last message ([DEBUG] msg #8, now sleeping for 4716 s) never arrived at the subscriber.
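The pub.py/sub.py scripts are not reproduced in this ticket; the following is only a rough sketch of a publisher with an exponentially growing sleep interval that would produce output like the log above (pyzmq assumed), not the actual deric/pub-sub-sleep code:

import time
import zmq

context = zmq.Context()
pub = context.socket(zmq.PUB)
pub.bind("tcp://*:6500")

sleep = 2.0
for i in range(20):
    # Each message announces how long the socket will now sit idle, so gaps
    # longer than the IPVS idle timeout (900 s) are easy to spot on the subscriber.
    msg = "msg #%d, now sleeping for %d s" % (i, int(sleep))
    pub.send_string(msg)
    print("[DEBUG] " + msg)
    time.sleep(sleep)
    sleep *= 2.72  # roughly matches the intervals seen in the logs

Once the sleep interval exceeds the IPVS idle timeout, the VIP silently drops the connection state, which is why message #8 never reaches the subscriber.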
Comment by Deepak Goel [ 24/Mar/18 ] |
deric TCP keepalive has to be enabled by the application. Can you take a tcpdump and check whether your application is using keepalive? Most probably it is not. Usually, there are two ways to handle long-lived connections:
1. Enable keepalive at the application level, or
2. Increase the idle connection timeout value in IPVS (the default is 900 seconds).
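As an alternative to capturing traffic, the same check can be made from inside the application: a connected socket reports its keepalive state through getsockopt. A small sketch using the plain Python socket API (the helper name and the Linux-only option checks are illustrative):

import socket

def describe_keepalive(sock):
    # Report whether keepalive is enabled on a connected socket and, on Linux,
    # the idle/interval/count values currently in effect for it.
    enabled = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE)
    print("SO_KEEPALIVE:", bool(enabled))
    if hasattr(socket, "TCP_KEEPIDLE"):
        print("TCP_KEEPIDLE: ", sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE))
        print("TCP_KEEPINTVL:", sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL))
        print("TCP_KEEPCNT:  ", sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT))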
Comment by deric (Inactive) [ 27/Mar/18 ] |
@dgoel The application wasn't using keepalive; in the case of ZeroMQ the parameter is called TCP_KEEPALIVE_IDLE. After setting TCP_KEEPALIVE_IDLE=1000 I've managed to sustain an open TCP connection for at least 26 hours without sending a single message on the channel. Here's the source code I've used for testing the keepalive settings. Where is the "idle connection timeout value in IPVS" setting? I wasn't able to find it anywhere. I assume it should be part of the dcos-navstar service configuration.
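For reference, in pyzmq these keepalive knobs map onto socket options that must be set before connect. A minimal sketch for the subscriber side of the test service above; the idle value of 1000 is the one reported to work in this comment, while the interval and count values are illustrative assumptions:

import zmq

context = zmq.Context()
sub = context.socket(zmq.SUB)

# Enable TCP keepalive on the underlying connection and start probing
# after 1000 s of idle time.
sub.setsockopt(zmq.TCP_KEEPALIVE, 1)
sub.setsockopt(zmq.TCP_KEEPALIVE_IDLE, 1000)
sub.setsockopt(zmq.TCP_KEEPALIVE_INTVL, 60)  # illustrative
sub.setsockopt(zmq.TCP_KEEPALIVE_CNT, 5)     # illustrative

sub.setsockopt(zmq.SUBSCRIBE, b"")
sub.connect("tcp://zmqpub.marathon.l4lb.thisdcos.directory:6500")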
Comment by deric (Inactive) [ 27/Mar/18 ] |
Oh, you probably mean:

ipvsadm -l --timeout
Timeout (tcp tcpfin udp): 900 120 300

Funny that I've managed to keep the connection alive with keepalive=1000; it appears the connection isn't timed out immediately. According to the Docker knowledge base, the keepalive interval can be modified with:

sysctl -w net.ipv4.tcp_keepalive_time=600

IMHO these settings are independent; overriding the IPVS timeout should be done via:

ipvsadm --set 3600 120 300

Would it be possible to pass the --persistent flag from the Marathon config?
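Both workarounds mentioned here are host-level changes on each agent, so they could be scripted from a bootstrap or config-management step. A rough sketch only, assuming root privileges, ipvsadm installed on the agent, and the timeout values quoted in this thread:

import subprocess

def tune_agent():
    # Lower the kernel keepalive idle time below the IPVS idle timeout,
    # as suggested by the Docker knowledge base article cited above.
    subprocess.run(["sysctl", "-w", "net.ipv4.tcp_keepalive_time=600"], check=True)

    # Raise the IPVS tcp/tcpfin/udp timeouts (values from this comment).
    subprocess.run(["ipvsadm", "--set", "3600", "120", "300"], check=True)

if __name__ == "__main__":
    tune_agent()

Note that neither change persists across reboots, so it would have to be reapplied on boot.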
Comment by Deepak Goel [ 27/Mar/18 ] |
We currently don't have an automated way of changing IPVS settings.