[DCOS_OSS-1574] Navstar is unhealthy on CoreOS 1465+ Created: 18/Aug/17  Updated: 09/Nov/18  Resolved: 22/Sep/17

Status: Resolved
Project: DC/OS
Component/s: navstar
Affects Version/s: DC/OS 1.9.2
Fix Version/s: DC/OS 1.9.5, DC/OS 1.10.0, DC/OS 1.11.0

Type: Bug Priority: Blocker
Reporter: Deepak Goel Assignee: Sergey Urbanovich (Inactive)
Resolution: Done  
Labels: networking
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
is duplicated by DCOS_OSS-1428 Navstar failing to start after kernel... Resolved
Relates
Team: DELETE Networking Team
Watchers:
Avinash Sridharan (Inactive), Deepak Goel, Jan-Philip Gehrcke (Inactive), Marco Monaco, Sergey Urbanovich (Inactive), System Administrator
Sprint: Networking Team 1.11 Sprint 1, Infinity 1.11 Sprint 3
Story Points: 5

 Description   
Aug 18 20:50:02 vm-d7z9.c.massive-bliss-781.internal systemd[1]: Started Navstar: A distributed systems & network overlay orchestration engine.
Aug 18 20:50:02 vm-d7z9.c.massive-bliss-781.internal navstar-env[18115]: Exec: /opt/mesosphere/packages/navstar--8be35702ce7ed25d47b7322a674ffabc615326ec/navstar/erts-8.2.2/bin/erlexec -noshell -noinput +Bd -boot /opt/mesosphere/packages/navstar--8be35702ce7ed25d47b7322a674ffabc615326ec/navstar/releases/0.1.0/navstar -mode embedded -boot_var ERTS_LIB_DIR /opt/mesosphere/packages/navstar--8be35702ce7ed25d47b7322a674ffabc615326ec/navstar/lib -config /tmp/sys.navstar@10.138.0.4.config -args_file /tmp/vm.navstar@10.138.0.4.args -pa -- foreground
Aug 18 20:50:02 vm-d7z9.c.massive-bliss-781.internal navstar-env[18115]: Root: /opt/mesosphere/packages/navstar--8be35702ce7ed25d47b7322a674ffabc615326ec/navstar
Aug 18 20:50:02 vm-d7z9.c.massive-bliss-781.internal navstar-env[18115]: /opt/mesosphere/packages/navstar--8be35702ce7ed25d47b7322a674ffabc615326ec/navstar
Aug 18 20:50:03 vm-d7z9.c.massive-bliss-781.internal navstar-env[18115]: Setup running ...
Aug 18 20:50:03 vm-d7z9.c.massive-bliss-781.internal navstar-env[18115]: Directories verified. Res = {[ok],[]}
Aug 18 20:50:03 vm-d7z9.c.massive-bliss-781.internal navstar-env[18115]: Setup finished processing hooks ...
Aug 18 20:50:03 vm-d7z9.c.massive-bliss-781.internal navstar-env[18115]: [warning] <0.1153.0>@netlink_codec:nl_dec_nla:551 nl_dec_nla: unable to decode pay load of <<210,70,0,0>>
Aug 18 20:50:03 vm-d7z9.c.massive-bliss-781.internal navstar-env[18115]: [warning] <0.1153.0>@gen_netlink_client:terminate:239 Terminating, due to: function_clause, in state: wait_for_responses_rtnl, with state data: {state,#Port<0.1021>,34,18130,2,{<0.1182.0>,#Ref<0.0.8.894>},[],1,0}
Aug 18 20:50:03 vm-d7z9.c.massive-bliss-781.internal navstar-env[18115]: [warning] <0.1184.0>@netlink_codec:nl_dec_nla:551 nl_dec_nla: unable to decode pay load of <<210,70,0,0>>
Aug 18 20:50:03 vm-d7z9.c.massive-bliss-781.internal navstar-env[18115]: [warning] <0.1184.0>@gen_netlink_client:terminate:239 Terminating, due to: function_clause, in state: wait_for_responses_rtnl, with state data: {state,#Port<0.1056>,36,18130,2,{<0.1211.0>,#Ref<0.0.8.971>},[],1,0}
Aug 18 20:50:03 vm-d7z9.c.massive-bliss-781.internal navstar-env[18115]: [warning] <0.1213.0>@netlink_codec:nl_dec_nla:551 nl_dec_nla: unable to decode pay load of <<210,70,0,0>>
Aug 18 20:50:03 vm-d7z9.c.massive-bliss-781.internal navstar-env[18115]: [warning] <0.1213.0>@gen_netlink_client:terminate:239 Terminating, due to: function_clause, in state: wait_for_responses_rtnl, with state data: {state,#Port<0.1070>,34,18130,2,{<0.1219.0>,#Ref<0.0.8.1000>},[],1,0}
Aug 18 20:50:03 vm-d7z9.c.massive-bliss-781.internal navstar-env[18115]: [warning] <0.1221.0>@netlink_codec:nl_dec_nla:551 nl_dec_nla: unable to decode pay load of <<210,70,0,0>>
Aug 18 20:50:03 vm-d7z9.c.massive-bliss-781.internal navstar-env[18115]: [warning] <0.1221.0>@gen_netlink_client:terminate:239 Terminating, due to: function_clause, in state: wait_for_responses_rtnl, with state data: {state,#Port<0.1082>,34,18130,2,{<0.1227.0>,#Ref<0.0.8.1029>},[],1,0}
Aug 18 20:50:03 vm-d7z9.c.massive-bliss-781.internal navstar-env[18115]: [warning] <0.1229.0>@netlink_codec:nl_dec_nla:551 nl_dec_nla: unable to decode pay load of <<210,70,0,0>>
Aug 18 20:50:03 vm-d7z9.c.massive-bliss-781.internal navstar-env[18115]: [warning] <0.1229.0>@gen_netlink_client:terminate:239 Terminating, due to: function_clause, in state: wait_for_responses_rtnl, with state data: {state,#Port<0.1094>,34,18130,2,{<0.1235.0>,#Ref<0.0.8.1058>},[],1,0}
Aug 18 20:50:03 vm-d7z9.c.massive-bliss-781.internal navstar-env[18115]: [warning] <0.1237.0>@netlink_codec:nl_dec_nla:551 nl_dec_nla: unable to decode pay load of <<210,70,0,0>>
Aug 18 20:50:03 vm-d7z9.c.massive-bliss-781.internal navstar-env[18115]: [warning] <0.1237.0>@gen_netlink_client:terminate:239 Terminating, due to: function_clause, in state: wait_for_responses_rtnl, with state data: {state,#Port<0.1106>,34,18130,2,{<0.1243.0>,#Ref<0.0.8.1087>},[],1,0}
Aug 18 20:50:03 vm-d7z9.c.massive-bliss-781.internal navstar-env[18115]: [warning] <0.949.0>@lashup_hyparview_events:handle_info:156 Received unknown info: {'EXIT',<0.1024.0>,shutdown}, in state: {state,<0.958.0>,#Ref<0.0.8.341>,[],[]}
Aug 18 20:50:05 vm-d7z9.c.massive-bliss-781.internal navstar-env[18115]: {"Kernel pid terminated",application_controller,"{application_terminated,navstar_overlay,shutdown}"}
Aug 18 20:50:05 vm-d7z9.c.massive-bliss-781.internal navstar-env[18115]: Kernel pid terminated (application_controller) ({application_terminated,navstar_overlay,shutdown})
Aug 18 20:50:05 vm-d7z9.c.massive-bliss-781.internal navstar-env[18115]: [1B blob data]
Aug 18 20:50:06 vm-d7z9.c.massive-bliss-781.internal navstar-env[18115]: Crash dump is being written to: erl_crash.dump...done
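
Note on the payload in the warnings above: if the four bytes <<210,70,0,0>> are read as an unsigned 32-bit integer in little-endian order (an assumption; netlink uses host byte order and the host is presumably x86), they decode to 18130, the same number that appears as the fourth field of the gen_netlink_client state data. The log does not show which netlink attribute carried this payload, so this is only an interpretation. A minimal sketch, where the helper name decode_u32_le is hypothetical:

```python
import struct


def decode_u32_le(payload: bytes) -> int:
    """Interpret exactly four bytes as a little-endian unsigned 32-bit integer."""
    if len(payload) != 4:
        raise ValueError("expected exactly 4 bytes, got %d" % len(payload))
    (value,) = struct.unpack("<I", payload)
    return value


# Bytes copied from the nl_dec_nla warning: <<210,70,0,0>>
payload = bytes([210, 70, 0, 0])
print(decode_u32_le(payload))  # 210 + 70 * 256 = 18130
```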



 Comments   
Comment by Deepak Goel [ 18/Aug/17 ]

https://github.com/mesosphere/gen_netlink/pull/17
https://github.com/dcos/navstar/pull/92
https://github.com/dcos/dcos/pull/1872

Comment by Jan-Philip Gehrcke (Inactive) [ 24/Aug/17 ]

https://github.com/dcos/dcos/pull/1872 just landed in DC/OS w/o a merge train.

Comment by Avinash Sridharan (Inactive) [ 18/Sep/17 ]

Re-opening this since it needs to be backported to 1.9.5.

Comment by Marco Monaco [ 19/Sep/17 ]

Are we able to do the backport by EOD? Otherwise, can we please update the fixVersion if this will not make it into 1.9.5?

Comment by Avinash Sridharan (Inactive) [ 19/Sep/17 ]

Deepak Goel ?

Comment by Deepak Goel [ 19/Sep/17 ]

Backport PR: https://github.com/dcos/dcos/pull/1939

Comment by Jan-Philip Gehrcke (Inactive) [ 22/Sep/17 ]

This now also landed in 1.9.5 via train 227.

Comment by Jan-Philip Gehrcke (Inactive) [ 22/Sep/17 ]

Deepak Goel can you confirm that this patch is now included in all three of the following branches?

https://github.com/dcos/dcos/blob/1.10/packages/navstar/buildinfo.json
https://github.com/dcos/dcos/blob/1.9/packages/navstar/buildinfo.json
https://github.com/dcos/dcos/blob/master/packages/navstar/buildinfo.json

If you can confirm, please close the ticket.

Comment by Avinash Sridharan (Inactive) [ 22/Sep/17 ]

Sergey Urbanovich, since Deepak Goel is on PTO, can you confirm and close this out?

Comment by Sergey Urbanovich (Inactive) [ 22/Sep/17 ]

[x] master

[x] 1.10

[x] 1.9

Comment by Jan-Philip Gehrcke (Inactive) [ 22/Sep/17 ]

Nice! Since this means that the relevant functionality landed in the DC/OS repo(s), we can now actually close the ticket. Doing so.

Generated at Thu Dec 02 13:29:37 CST 2021 using JIRA 7.8.4#78004-sha1:5704c55c9196a87d91490cbb295eb482fa3e65cf.