[DCOS_OSS-4156] Can't load DC/OS dashboard with ansible-dcos on an on-premises server from localhost Created: 20/Sep/18  Updated: 09/Nov/18  Resolved: 25/Oct/18

Status: Resolved
Project: DC/OS
Component/s: adminrouter, dcos-net, dcos-net-spartan, exhibitor, marathon, mesos, mesos-dns, mesos-module, networking
Affects Version/s: DC/OS 1.11.0
Fix Version/s: None

Type: Bug Priority: Medium
Reporter: dnguyen-fnx Assignee: Jan Repnak (Inactive)
Resolution: Invalid  
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PNG File dcos-exhibitor.png     PNG File dcos-marathon.png     PNG File dcos-mesos-dns.png     PNG File dcos-mesos-master.png     PNG File dcos-net.png     PNG File dcos-net.service.png     HTML File inventory    
Team: DELETE Cluster Ops Team

 Description   

Hello,

I use ansible-dcos to deploy Mesosphere DC/OS from my local machine to on-premises bare-metal servers (1 master, 1 boot, 1 private agent, 1 public agent).
However, I am unable to connect to the DC/OS dashboard (172.21.3.51) from my local machine.
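
When a dashboard cannot be reached, a first step that often helps is to check from the local machine whether the master accepts TCP connections at all. The sketch below is only an illustration and is not taken from ansible-dcos: `port_is_open` and `check_ports` are hypothetical helper names, and ports 80 and 443 are assumed to be where Admin Router listens on the master.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


def check_ports(host, ports=(80, 443)):
    """Map each port to whether it accepted a TCP connection.

    The default ports are an assumption (HTTP and HTTPS on the master).
    """
    return {port: port_is_open(host, port) for port in ports}
```

Calling `check_ports("172.21.3.51")` from the local machine would show whether the problem lies at the network level (all ports closed or filtered) or further up in the DC/OS stack.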
 
Below you will find the inventory file and the logs for marathon, admin-router, mesos-master, exhibitor, net, and mesos-dns.
 
Inventory

[defaults]
master1.dcos ansible_ssh_host=172.21.3.51
agent1.dcos ansible_ssh_host=172.21.3.52
public1.dcos ansible_ssh_host=172.21.3.53
boot.dcos ansible_ssh_host=172.21.3.55

[masters]
master1.dcos

[agents]
agent1.dcos

[agent_publics]
public1.dcos

[bootstraps]
boot.dcos

[datacenter:children]
masters
agents
agent_publics
bootstraps

[datacenter:vars]
ansible_ssh_user=root
ansible_ssh_pass=azertyuiop
ansible_sudo_pass=azertyuiop
username=root
dcos_iaas_target='onprem'
dcos_ip_detect_interface='enp0s25'
dcos_bootstrap_ip='10.10.10.15'
dcos_master_list=['10.10.10.11']
dcos_resolvers=['8.8.4.4', '8.8.8.8']
dcos_dns_search='None'
dcos_exhibitor_address='masterlb.internal'
dcos_master_address='masterlb.external'
dns_search=[master1.dcos, agent1.dcos, public1.dcos, boot.dcos]
Note: I am not sure whether dns_search and dcos_dns_search are configured correctly.
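
One thing that can be checked mechanically in an inventory like the one above is whether the addresses given in `dcos_bootstrap_ip` and `dcos_master_list` also appear as `ansible_ssh_host` values. Here the hosts use 172.21.3.x while those two variables use 10.10.10.x. That may be intentional (hosts can have several interfaces), so the sketch below only flags the difference and does not claim it is the cause of the problem. The function names are hypothetical and not part of ansible-dcos.

```python
import ast
import re


def parse_inventory(text):
    """Return (host_addresses, variables) from an INI-style inventory."""
    hosts, variables = {}, {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("["):
            continue  # skip blank lines and group headers
        match = re.match(r"(\S+)\s+ansible_ssh_host=(\S+)$", line)
        if match:
            hosts[match.group(1)] = match.group(2)
        elif "=" in line:
            key, value = line.split("=", 1)
            variables[key.strip()] = value.strip()
    return hosts, variables


def find_address_mismatches(text):
    """List (variable, address) pairs not used by any ansible_ssh_host."""
    hosts, variables = parse_inventory(text)
    known = set(hosts.values())
    mismatches = []
    for key in ("dcos_bootstrap_ip", "dcos_master_list"):
        if key not in variables:
            continue
        # Values are quoted strings or lists of quoted strings.
        value = ast.literal_eval(variables[key])
        addresses = [value] if isinstance(value, str) else list(value)
        for address in addresses:
            if address not in known:
                mismatches.append((key, address))
    return mismatches
```

Passing the inventory text to `find_address_mismatches` returns the variable and address pairs that have no matching host entry. An empty list means the addresses are consistent with each other.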

marathon

sept. 20 13:28:14 gzlaac01 marathon[29131]: [2018-09-20 13:28:14,089] INFO 127.0.0.1 - - [20/Sep/2018:11:28:14 +0000] "GET //127.0.0.1:8080/v2/apps?embed=apps.tasks&label=DCOS_SERVICE_NAME HTTP/1.1" 200 11 "-" "Master Admin Router" (mesosphere.chaos.http.ChaosRequestLog:qtp614565258-54)
sept. 20 13:28:14 gzlaac01 marathon[29131]: [2018-09-20 13:28:14,090] INFO 127.0.0.1 - - [20/Sep/2018:11:28:14 +0000] "GET //127.0.0.1:8080/v2/leader HTTP/1.1" 200 29 "-" "Master Admin Router" (mesosphere.chaos.http.ChaosRequestLog:qtp614565258-53)
sept. 20 13:28:14 gzlaac01 marathon[29131]: [2018-09-20 13:28:14,506] INFO No match for:aaf548a9-6b43-46b0-834c-3bdec27c9bc2-O65 from:172.21.3.53 reason:No offers wanted (mesosphere.marathon.core.matcher.manager.impl.OfferMatcherManagerActor:marathon-akka.actor.default-dispatcher-12)
sept. 20 13:28:39 gzlaac01 marathon[29131]: [2018-09-20 13:28:39,100] INFO 127.0.0.1 - - [20/Sep/2018:11:28:39 +0000] "GET //127.0.0.1:8080/v2/apps?embed=apps.tasks&label=DCOS_SERVICE_NAME HTTP/1.1" 200 11 "-" "Master Admin Router" (mesosphere.chaos.http.ChaosRequestLog:qtp614565258-53)
sept. 20 13:28:39 gzlaac01 marathon[29131]: [2018-09-20 13:28:39,102] INFO 127.0.0.1 - - [20/Sep/2018:11:28:39 +0000] "GET //127.0.0.1:8080/v2/leader HTTP/1.1" 200 29 "-" "Master Admin Router" (mesosphere.chaos.http.ChaosRequestLog:qtp614565258-47)

 
admin-router

sept. 20 13:31:34 gzlaac01 adminrouter.sh[29590]: 2018/09/20 13:31:34 [info] 29599#0: *913 [lua] cache.lua:259: store_leader_data(): marathon leader is local, context: ngx.timer
sept. 20 13:31:34 gzlaac01 adminrouter.sh[29590]: 2018/09/20 13:31:34 [info] 29599#0: *913 [lua] cache.lua:274: store_leader_data(): marathon leader cache has been successfully updated, context: ngx.timer
sept. 20 13:31:34 gzlaac01 adminrouter.sh[29590]: 2018/09/20 13:31:34 [info] 29599#0: *913 [lua] cache.lua:456: refresh_needed(): Cache `mesos_leader_last_refresh` expired. Refresh., context: ngx.timer
sept. 20 13:31:34 gzlaac01 adminrouter.sh[29590]: 2018/09/20 13:31:34 [info] 29599#0: *913 [lua] cache.lua:259: store_leader_data(): mesos leader is local, context: ngx.timer
sept. 20 13:31:34 gzlaac01 adminrouter.sh[29590]: 2018/09/20 13:31:34 [info] 29599#0: *913 [lua] cache.lua:274: store_leader_data(): mesos leader cache has been successfully updated, context: ngx.timer
sept. 20 13:31:34 gzlaac01 adminrouter.sh[29590]: 2018/09/20 13:31:34 [info] 29599#0: *913 [lua] cache.lua:574: Created recursive timer for cache updating., context: ngx.timer
sept. 20 13:31:35 gzlaac01 adminrouter.sh[29590]: 2018/09/20 13:31:35 [info] 29598#0: *917 [lua] cache.lua:499: refresh_cache(): Executing cache refresh triggered by timer, context: ngx.timer
sept. 20 13:31:35 gzlaac01 adminrouter.sh[29590]: 2018/09/20 13:31:35 [info] 29598#0: *917 [lua] cache.lua:574: Created recursive timer for cache updating., context: ngx.timer

mesos-master

(marathon) at scheduler-10a32715-48f6-466f-85c2-1e176e8bdc9c@172.21.3.51:15101
sept. 20 13:32:59 gzlaac01 mesos-master[28928]: I0920 13:32:59.882798 28947 http.cpp:1185] HTTP GET for /master/state-summary from 172.21.3.51:58628 with User-Agent='python-requests/2.18.4'
sept. 20 13:33:00 gzlaac01 mesos-master[28928]: I0920 13:33:00.757025 28938 master.cpp:8832] Sending 1 offers to framework aaf548a9-6b43-46b0-834c-3bdec27c9bc2-0000 (metronome) at scheduler-0560034c-da55-4e56-ad4d-9198c4053258@172.21.3.51:15201
sept. 20 13:33:00 gzlaac01 mesos-master[28928]: I0920 13:33:00.758522 28952 master.cpp:5511] Processing DECLINE call for offers: [ aaf548a9-6b43-46b0-834c-3bdec27c9bc2-O69 ] for framework aaf548a9-6b43-46b0-834c-3bdec27c9bc2-0000 (metronome) at scheduler-0560034c-da55-4e56-ad4d-9198c4053258@172.21.3.51:15201
sept. 20 13:33:00 gzlaac01 mesos-master[28928]: I0920 13:33:00.758709 28952 master.cpp:10757] Removing offer

 
exhibitor

gzlaac01 java[28265]: [myid:] INFO [Thread-4193:NIOServerCnxn@1044] - Closed socket connection for client /127.0.0.1:41686 (no session established for client)
sept. 20 13:34:01 gzlaac01 java[28265]: [myid:] INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@192] - Accepted socket connection from /127.0.0.1:41694
sept. 20 13:34:01 gzlaac01 java[28265]: [myid:] INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@883] - Processing ruok command from /127.0.0.1:41694
sept. 20 13:34:01 gzlaac01 java[28265]: [myid:] INFO [Thread-4194:NIOServerCnxn@1044] - Closed socket connection for client /127.0.0.1:41694 (no session established for client)
sept. 20 13:34:01 gzlaac01 java[28265]: [myid:] INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@192] - Accepted socket connection from /127.0.0.1:41696
sept. 20 13:34:01 gzlaac01 java[28265]: [myid:] INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@883] - Processing srvr command from /127.0.0.1:41696
sept. 20 13:34:01 gzlaac01 java[28265]: [myid:] INFO [Thread-4195:NIOServerCnxn@1044] - Closed socket connection for client /127.0.0.1:41696 (no session established for client)
sept. 20 13:34:03 gzlaac01 java[28265]: [myid:] INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@192] - Accepted socket connection from /127.0.0.1:41700
sept. 20 13:34:03 gzlaac01 java[28265]: [myid:] INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@883] - Processing ruok command from /127.0.0.1:41700
sept. 20 13:34:03 gzlaac01 java[28265]: [myid:] INFO [Thread-4196:NIOServerCnxn@1044] - Closed socket connection for client /127.0.0.1:41700 (no session established for client)
sept. 20 13:34:03 gzlaac01 java[28265]: [myid:] INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@192] - Accepted socket connection from /127.0.0.1:41702
sept. 20 13:34:03 gzlaac01 java[28265]: [myid:] INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@883] - Processing srvr command from /127.0.0.1:41702
sept. 20 13:34:03 gzlaac01 java[28265]: [myid:] INFO [Thread-4197:NIOServerCnxn@1044] - Closed socket connection for client /127.0.0.1:41702 (no session established for client)
sept. 20 13:34:05 gzlaac01 java[28265]: [myid:] INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@192] - Accepted socket connection from /127.0.0.1:41708
sept. 20 13:34:05 gzlaac01 java[28265]: [myid:] INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@883] - Processing ruok command from /127.0.0.1:41708
sept. 20 13:34:05 gzlaac01 java[28265]: [myid:] INFO [Thread-4198:NIOServerCnxn@1044] - Closed socket connection for client /127.0.0.1:41708 (no session established for client)
sept. 20 13:34:05 gzlaac01 java[28265]: [myid:] INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@192] - Accepted socket connection from /127.0.0.1:41710
sept. 20 13:34:05 gzlaac01 java[28265]: [myid:] INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@883] - Processing srvr command from /127.0.0.1:41710
sept. 20 13:34:05 gzlaac01 java[28265]: [myid:] INFO [Thread-4199:NIOServerCnxn@1044] - Closed socket connection for client /127.0.0.1:41710 (no session established for client)
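
The `ruok` and `srvr` entries above are ZooKeeper four-letter-word commands arriving from 127.0.0.1 on port 2181, which is consistent with a local health check polling ZooKeeper (presumably Exhibitor's own monitoring, though the log does not say so). The same probe can be sent by hand. This is a minimal sketch with hypothetical helper names. It assumes the server closes the connection after replying, as ZooKeeper does for these commands.

```python
import socket


def exchange(conn, command):
    """Send a four-letter command on a connected socket and read until EOF."""
    conn.sendall(command.encode("ascii"))
    chunks = []
    while True:
        data = conn.recv(4096)
        if not data:
            break  # the peer closed the connection, so the reply is complete
        chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def four_letter_word(host, command, port=2181, timeout=2.0):
    """Open a connection to ZooKeeper and return the reply to `command`."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        return exchange(conn, command)
```

On the master, `four_letter_word("127.0.0.1", "ruok")` should return `imok` if ZooKeeper is running in a non-error state. The `srvr` command returns a short status report including the server mode.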

 
net

sept. 20 12:27:38 gzlaac01 dcos-net-setup.py[31720]: RTNETLINK answers: File exists
sept. 20 12:27:39 gzlaac01 bootstrap[31725]: [INFO] Clearing proxy environment variables
sept. 20 12:27:39 gzlaac01 bootstrap[31725]: [INFO] PID 28265 has command line [b'/opt/mesosphere/active/java/usr/java/bin/java', b'-Dzookeeper.log.dir=/var/lib/dcos/exhibitor/zookeeper', b'-Dzookeeper.root.logger=INFO,CONSOLE', b'-cp', b'/opt/mesosphere/active/exhibitor/usr/zookeeper/bin/../build/classes:/opt/mesosphere/active/exhibitor/usr/zookeeper/bin/../build/lib/*.jar:/opt/mesosphere/active/exhibitor/usr/zookeeper/bin/../lib/slf4j-log4j12-1.6.1.jar:/opt/mesosphere/active/exhibitor/usr/zookeeper/bin/../lib/slf4j-api-1.6.1.jar:/opt/mesosphere/active/exhibitor/usr/zookeeper/bin/../lib/netty-3.10.5.Final.jar:/opt/mesosphere/active/exhibitor/usr/zookeeper/bin/../lib/log4j-systemd-journal-appender-1.3.2.jar:/opt/mesosphere/active/exhibitor/usr/zookeeper/bin/../lib/log4j-jna-4.2.2.jar:/opt/mesosphere/active/exhibitor/usr/zookeeper/bin/../lib/log4j-1.2.16.jar:/opt/mesosphere/active/exhibitor/usr/zookeeper/bin/../lib/jline-0.9.94.jar:/opt/mesosphere/active/exhibitor/usr/zookeeper/bin/../zookeeper-3.4.10.jar:/opt/mesosphere/active/exhibitor/usr/zookeeper/bin/../src/java/lib/*.jar:/var/lib/dcos/exhibitor/conf:', b'-Djna.tmpdir=/var/lib/dcos/exhibitor/tmp', b'-Dcom.sun.management.jmxremote', b'-Dcom.sun.management.jmxremote.local.only=false', b'org.apache.zookeeper.server.quorum.QuorumPeerMain', b'/var/lib/dcos/exhibitor/conf/zoo.cfg']
sept. 20 12:27:39 gzlaac01 bootstrap[31725]: [INFO] PID file hasn't been modified. ZK still seems to be at that PID.
sept. 20 12:27:39 gzlaac01 bootstrap[31725]: [INFO] Shortcut succeeeded, assuming local zk is in good config state, not waiting for quorum.
sept. 20 12:27:39 gzlaac01 bootstrap[31725]: [DEBUG] bootstrapping dcos-net
sept. 20 12:27:39 gzlaac01 systemd[1]: Started DC/OS Net: A distributed systems & network overlay orchestration engine.
sept. 20 12:27:39 gzlaac01 dcos-net-env[31732]: Exec: /opt/mesosphere/packages/dcos-net--a7a87864e6e6dff9fdee69bf838e290742bce438/dcos-net/erts-9.2/bin/erlexec -noshell -noinput +Bd -boot /opt/mesosphere/packages/dcos-net--a7a87864e6e6dff9fdee69bf838e290742bce438/dcos-net/releases/0.0.1/dcos-net -mode embedded -boot_var ERTS_LIB_DIR /opt/mesosphere/packages/dcos-net--a7a87864e6e6dff9fdee69bf838e290742bce438/dcos-net/lib -config /tmp/sys.config -args_file /tmp/vm.args -pa -- foreground
sept. 20 12:27:39 gzlaac01 dcos-net-env[31732]: Root: /opt/mesosphere/packages/dcos-net--a7a87864e6e6dff9fdee69bf838e290742bce438/dcos-net
sept. 20 12:27:39 gzlaac01 dcos-net-env[31732]: /opt/mesosphere/packages/dcos-net--a7a87864e6e6dff9fdee69bf838e290742bce438/dcos-net

 
dcos-mesos-dns

sept. 20 12:15:56 gzlaac01 ping[28888]: PING ready.spartan (127.0.0.1) 56(84) bytes of data.
sept. 20 12:15:56 gzlaac01 bootstrap[28891]: [INFO] Clearing proxy environment variables
sept. 20 12:15:56 gzlaac01 bootstrap[28891]: [INFO] PID 28265 has command line [b'/opt/mesosphere/active/java/usr/java/bin/java', b'-Dzookeeper.log.dir=/var/lib/dcos/exhibitor/zookeeper', b'-Dzookeeper.root.logger=INFO,CONSOLE', b'-cp', b'/opt/mesosphere/active/exhibitor/usr/zookeeper/bin/../build/classes:/opt/mesosphere/active/exhibitor/usr/zookeeper/bin/../build/lib/*.jar:/opt/mesosphere/active/exhibitor/usr/zookeeper/bin/../lib/slf4j-log4j12-1.6.1.jar:/opt/mesosphere/active/exhibitor/usr/zookeeper/bin/../lib/slf4j-api-1.6.1.jar:/opt/mesosphere/active/exhibitor/usr/zookeeper/bin/../lib/netty-3.10.5.Final.jar:/opt/mesosphere/active/exhibitor/usr/zookeeper/bin/../lib/log4j-systemd-journal-appender-1.3.2.jar:/opt/mesosphere/active/exhibitor/usr/zookeeper/bin/../lib/log4j-jna-4.2.2.jar:/opt/mesosphere/active/exhibitor/usr/zookeeper/bin/../lib/log4j-1.2.16.jar:/opt/mesosphere/active/exhibitor/usr/zookeeper/bin/../lib/jline-0.9.94.jar:/opt/mesosphere/active/exhibitor/usr/zookeeper/bin/../zookeeper-3.4.10.jar:/opt/mesosphere/active/exhibitor/usr/zookeeper/bin/../src/java/lib/*.jar:/var/lib/dcos/exhibitor/conf:', b'-Djna.tmpdir=/var/lib/dcos/exhibitor/tmp', b'-Dcom.sun.management.jmxremote', b'-Dcom.sun.management.jmxremote.local.only=false', b'org.apache.zookeeper.server.quorum.QuorumPeerMain', b'/var/lib/dcos/exhibitor/conf/zoo.cfg']
sept. 20 12:15:56 gzlaac01 bootstrap[28891]: [INFO] PID file hasn't been modified. ZK still seems to be at that PID.
sept. 20 12:15:56 gzlaac01 bootstrap[28891]: [INFO] Shortcut succeeeded, assuming local zk is in good config state, not waiting for quorum.
sept. 20 12:15:56 gzlaac01 bootstrap[28891]: [DEBUG] bootstrapping dcos-mesos-dns
sept. 20 12:15:56 gzlaac01 systemd[1]: Started Mesos DNS: domain name based service discovery.

 
Can you help me, please?

Thank you in advance.
 



 Comments   
Comment by Automation Bot [ 20/Sep/18 ]

JIRA automation rule triggered: Team field was updated based on the assignee

Comment by Dominik Dary (Inactive) [ 24/Oct/18 ]

Lee Hambley, assigning this to your team.

Comment by Lee Hambley (Inactive) [ 25/Oct/18 ]

dnguyen-fnx, the projects provided under dcos-labs are not officially supported.

I asked our internal stakeholders who contribute to that project to check for any obvious mistakes and the consensus seemed to be that you were using an outdated version or a fork.

I suggest trying an up-to-date version; you may also have more luck seeking help via the project's GitHub issue tracker.
