[DCOS_OSS-231] minuteman repeatedly crashing complaining 'iptables: command not found' Created: 11/Jul/16  Updated: 09/Nov/18  Resolved: 11/Jul/16

Status: Resolved
Project: DC/OS
Component/s: networking
Affects Version/s: 1.7-open
Fix Version/s: None

Type: Bug
Priority: Medium
Reporter: Enzo Wang
Assignee: Sargun Dhillon (Inactive)
Resolution: Won't Do  
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

dcos version: 1.7-open
marathon version: 1.1.1
OS: SLES 12.1
Kernel: 3.12.59-60.45
Docker: 1.10.3 with direct-lvm
iptables: iptables-1.4.21-2.10.x86_64
ipset: ipset-6.21.1-3.31.x86_64
libipset: libipset3-6.21.1-3.31.x86_64



 Description   

I've installed DC/OS on a 10-node bare-metal cluster running SLES 12.1.

All functionality seems to work except for VIPs.

I found that the minuteman service is constantly being restarted by systemd.

Here are some logs:
error.log:

2016-07-12 00:08:01.077 [error] <0.988.0>@minuteman_network_sup:load_rule:83 Unknown response: {ok,"/bin/sh: line 1: iptables: command not found\n"}
2016-07-12 00:08:01.078 [error] <0.988.0> Supervisor minuteman_sup had child undefined started with minuteman_network_sup:start_link() at undefined exit with reason {'EXIT',{iptables_fail,[{minuteman_network_sup,load_rule,1,[{file,"/pkg/src/minuteman/_build/prod/lib/minuteman/src/minuteman_network_sup.erl"},{line,84}]},{lists,foreach,2,[{file,"lists.erl"},{line,1337}]},{minuteman_network_sup,start_link,0,[{file,"/pkg/src/minuteman/_build/prod/lib/minuteman/src/minuteman_network_sup.erl"},{line,46}]},{supervisor,do_start_child,2,[{file,"supervisor.erl"},{line,343}]},{supervisor,start_children,3,[{file,"supervisor.erl"},{line,326}]},{supervisor,init_children,...},...]}} in context start_error
2016-07-12 00:08:01.079 [error] <0.986.0> CRASH REPORT Process <0.986.0> with 0 neighbours exited with reason: {{shutdown,{failed_to_start_child,minuteman_network_sup,{'EXIT',{iptables_fail,[{minuteman_network_sup,load_rule,1,[{file,"/pkg/src/minuteman/_build/prod/lib/minuteman/src/minuteman_network_sup.erl"},{line,84}]},{lists,foreach,2,[{file,"lists.erl"},{line,1337}]},{minuteman_network_sup,start_link,0,[{file,"/pkg/src/minuteman/_build/prod/lib/minuteman/src/minuteman_network_sup.erl"},{line,46}]},{supervisor,do_start_child,2,[{file,"supervisor.erl"},{line,343}]},{supervisor,start_children,3,[{...},...]},...]}}}},...} in application_master:init/4 line 134

crash.log:

2016-07-12 00:08:01 =SUPERVISOR REPORT====
     Supervisor: {local,minuteman_sup}
     Context:    start_error
     Reason:     {'EXIT',{iptables_fail,[{minuteman_network_sup,load_rule,1,[{file,"/pkg/src/minuteman/_build/prod/lib/minuteman/src/minuteman_network_sup.erl"},{line,84}]},{lists,foreach,2,[{file,"lists.erl"},{line,1337}]},{minuteman_network_sup,start_link,0,[{file,"/pkg/src/minuteman/_build/prod/lib/minuteman/src/minuteman_network_sup.erl"},{line,46}]},{supervisor,do_start_child,2,[{file,"supervisor.erl"},{line,343}]},{supervisor,start_children,3,[{file,"supervisor.erl"},{line,326}]},{supervisor,init_children,2,[{file,"supervisor.erl"},{line,292}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,328}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,240}]}]}}
     Offender:   [{pid,undefined},{id,minuteman_network_sup},{mfargs,{minuteman_network_sup,start_link,[]}},{restart_type,permanent},{shutdown,5000},{child_type,supervisor}]

2016-07-12 00:08:01 =CRASH REPORT====
  crasher:
    initial call: application_master:init/4
    pid: <0.986.0>
    registered_name: []
    exception exit: {{{shutdown,{failed_to_start_child,minuteman_network_sup,{'EXIT',{iptables_fail,[{minuteman_network_sup,load_rule,1,[{file,"/pkg/src/minuteman/_build/prod/lib/minuteman/src/minuteman_network_sup.erl"},{line,84}]},{lists,foreach,2,[{file,"lists.erl"},{line,1337}]},{minuteman_network_sup,start_link,0,[{file,"/pkg/src/minuteman/_build/prod/lib/minuteman/src/minuteman_network_sup.erl"},{line,46}]},{supervisor,do_start_child,2,[{file,"supervisor.erl"},{line,343}]},{supervisor,start_children,3,[{file,"supervisor.erl"},{line,326}]},{supervisor,init_children,2,[{file,"supervisor.erl"},{line,292}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,328}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,240}]}]}}}},{minuteman_app,start,[normal,[]]}},[{application_master,init,4,[{file,"application_master.erl"},{line,134}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,240}]}]}
    ancestors: [<0.985.0>]
    messages: [{'EXIT',<0.987.0>,normal}]
    links: [<0.985.0>,<0.760.0>]
    dictionary: []
    trap_exit: true
    status: running
    heap_size: 1598
    stack_size: 27
    reductions: 241
  neighbours:

I am wondering why it says iptables was not found.

Could the reason be that SLES is not well supported by DC/OS? Must I reinstall DC/OS on RHEL/CentOS to get everything up and running?



 Comments   
Comment by Enzo Wang [ 11/Jul/16 ]

In addition, the health check in the DC/OS GUI reports 'Layer 4 Load Balancer' as healthy, which is not true.

Comment by Sargun Dhillon (Inactive) [ 11/Jul/16 ]

SLES is unsupported by DC/OS.

Comment by Enzo Wang [ 11/Jul/16 ]

I read that RHEL/CentOS are better suited to DC/OS.

Since almost everything of ours runs on SLES, I would like to continue the evaluation if this problem is not that hard to fix.

Could you shed some light on this issue?

Comment by Sargun Dhillon (Inactive) [ 11/Jul/16 ]

Can you install iptables and ipset? They're dependencies of DC/OS.

Comment by Enzo Wang [ 11/Jul/16 ]

As mentioned, the environment is:

dcos version: 1.7-open
marathon version: 1.1.1
OS: SLES 12.1
Kernel: 3.12.59-60.45
Docker: 1.10.3 with direct-lvm
iptables: iptables-1.4.21-2.10.x86_64
ipset: ipset-6.21.1-3.31.x86_64
libipset: libipset3-6.21.1-3.31.x86_64

I also checked /opt/mesosphere/packages; ipset and iptables are not there. I don't know whether that is related.
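A package can be installed and still be invisible to a service, because `/bin/sh` only searches the directories on the PATH that the service was started with. The following is a minimal sketch of how that could be checked, not something taken from this ticket: `missing_commands` is a hypothetical helper, and the PATH string in the example is an assumed value for illustration, not the PATH that minuteman actually runs with.

```python
import shutil


def missing_commands(commands, search_path):
    """Return the commands that do not resolve to an executable on search_path.

    search_path is a colon-separated directory list, in the same form as $PATH.
    """
    return [cmd for cmd in commands if shutil.which(cmd, path=search_path) is None]


if __name__ == "__main__":
    # Assumed PATH for illustration only; substitute the PATH from the
    # environment of the service that is failing.
    assumed_service_path = "/opt/mesosphere/bin:/usr/bin:/bin"
    print(missing_commands(["iptables", "ipset"], assumed_service_path))
```

If a command is listed as missing here but `rpm -q` shows its package as installed, the binary most likely lives in a directory that is not on the service's PATH.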

Comment by Enzo Wang [ 11/Jul/16 ]

That was the issue. After I manually created the soft link, the issue was fixed! Thanks.
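The ticket does not record which soft link was created. As a rough sketch of the general idea only, the hypothetical helper below links an existing binary into a directory that is already on the service's PATH. The paths in the usage comment (`/usr/sbin/iptables`, `/usr/bin`) are assumptions, not values confirmed in this thread, and the call would need root privileges.

```python
import os


def link_into_dir(target, dest_dir):
    """Create dest_dir/<basename of target> as a symlink to target, if absent.

    Returns the path of the link. Raises FileNotFoundError if target is missing.
    An existing entry at the link path is left untouched.
    """
    if not os.path.exists(target):
        raise FileNotFoundError(target)
    link = os.path.join(dest_dir, os.path.basename(target))
    if not os.path.lexists(link):
        os.symlink(target, link)
    return link


# Example call (assumed paths, run as root):
#   link_into_dir("/usr/sbin/iptables", "/usr/bin")
```

The helper refuses to overwrite anything already present at the link path, so it is safe to run more than once. The same would presumably be needed for ipset if it is also missing from the service's PATH.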
