Details
- Type: Task
- Status: Resolved
- Priority: Medium
- Resolution: Done
- Affects Version/s: None
- Fix Version/s: Marathon 1.4.12, Marathon 1.5.7, Marathon 1.6.322, DC/OS 1.10.6
- Component/s: marathon-docs
- Labels: (none)
- Epic Link: (none)
- Sprint: Marathon Sprint 1.11-12
- Story Points: 0
Description
Hi,
I have a Marathon cluster with three Mesos agents:
root@gradyhost1:~# curl gradyhost1:8080/v2/info | python -m json.tool
{
    "elected": true,
    "event_subscriber": null,
    "frameworkId": "fc4877a0-98ee-4e52-858d-afbe3b6a66e8-0000",
    "http_config": {
        "assets_path": null,
        "http_port": 8080,
        "https_port": 8443
    },
    "leader": "gradyhost1.eng.platformlab.ibm.com:8080",
    "marathon_config": {
        "checkpoint": true,
        "executor": "//cmd",
        "failover_timeout": 604800,
        "features": [],
        "framework_name": "marathon",
        "ha": true,
        "hostname": "gradyhost1.eng.platformlab.ibm.com",
        "leader_proxy_connection_timeout_ms": 5000,
        "leader_proxy_read_timeout_ms": 10000,
        "local_port_max": 20000,
        "local_port_min": 10000,
        "master": "zk://gradyhost1.eng.platformlab.ibm.com:2181,gradyhost2.eng.platformlab.ibm.com:2181,gradyhost3.eng.platformlab.ibm.com:2181/mesos",
        "mesos_leader_ui_url": "http://gradyhost1.eng.platformlab.ibm.com:5050/",
        "mesos_role": null,
        "mesos_user": "root",
        "reconciliation_initial_delay": 15000,
        "reconciliation_interval": 600000,
        "task_launch_timeout": 300000,
        "task_reservation_timeout": 20000,
        "webui_url": null
    },
    "name": "marathon",
    "version": "1.1.1",
    "zookeeper_config": {
        "zk": "zk://gradyhost1.eng.platformlab.ibm.com:2181,gradyhost2.eng.platformlab.ibm.com:2181,gradyhost3.eng.platformlab.ibm.com:2181/marathon",
        "zk_max_versions": 25,
        "zk_session_timeout": 10000,
        "zk_timeout": 10000
    }
}

root@gradyhost1:~# ps -ef | grep mesos-slave
root  1203 32490  0 08:25 pts/3  00:00:00 grep --color=auto mesos-slave
root 27575 27560  0 06:58 ?      00:00:11 mesos-slave --work_dir=/var/log/mesos --containerizers=mesos,docker --attributes=rack:rack-1 --master=zk://gradyhost1.eng.platformlab.ibm.com:2181,gradyhost2.eng.platformlab.ibm.com:2181,gradyhost3.eng.platformlab.ibm.com:2181/mesos

root@gradyhost2:~# ps -ef | grep mesos-slave
root  1169  1155  0 06:58 ?      00:00:09 mesos-slave --work_dir=/var/log/mesos --containerizers=mesos,docker --attributes=rack:rack-2 --master=zk://gradyhost1.eng.platformlab.ibm.com:2181,gradyhost2.eng.platformlab.ibm.com:2181,gradyhost3.eng.platformlab.ibm.com:2181/mesos
root  5034  4984  0 08:08 pts/2  00:00:00 grep --color=auto mesos-slave

root@gradyhost3:~# ps -ef | grep mesos-slave
root 14565 14551  0 06:58 ?      00:00:08 mesos-slave --work_dir=/var/log/mesos --containerizers=mesos,docker --attributes=rack:rack-3 --master=zk://gradyhost1.eng.platformlab.ibm.com:2181,gradyhost2.eng.platformlab.ibm.com:2181,gradyhost3.eng.platformlab.ibm.com:2181/mesos
root 18069 18049  0 08:08 pts/2  00:00:00 grep --color=auto mesos-slave
I have tried the GROUP_BY constraint, but it does not behave as I expected. My test steps were as follows:
Create the application /test-group-by with the following definition:
# curl -X POST -H "Content-type: application/json" gradyhost1:8080/v2/apps -d '{
  "id": "test-group-by",
  "cmd": "sleep 6000",
  "instances": 6,
  "constraints": [["rack", "GROUP_BY", "2"]]
}'
According to my understanding of the GROUP_BY constraint description in the official Marathon documentation, those six instances should be distributed evenly across two Mesos agents. However, the result was different:
# curl gradyhost1:8080/v2/apps/test-group-by | jq '. | .app.tasks[].host'
"gradyhost1.eng.platformlab.ibm.com"
"gradyhost3.eng.platformlab.ibm.com"
"gradyhost2.eng.platformlab.ibm.com"
"gradyhost3.eng.platformlab.ibm.com"
"gradyhost2.eng.platformlab.ibm.com"
"gradyhost1.eng.platformlab.ibm.com"
The tasks are distributed evenly across all three Mesos agents, which is not what I expected. Have I misunderstood something?
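For context on why the fix landed in marathon-docs rather than the scheduler: as the documentation was clarified to say, the numeric argument to GROUP_BY is the minimum number of distinct attribute values Marathon should expect, not a cap. Marathon balances tasks across every attribute value it actually sees in offers, so with three racks offering resources, six instances land two per rack. The simplified Python model below illustrates that acceptance rule; it is an illustrative sketch, not Marathon's actual Scala implementation, and the helper name `group_by_accepts` is invented for this example.

```python
from collections import Counter

def group_by_accepts(counts, value, min_groups):
    """Simplified GROUP_BY rule: accept an offer carrying attribute
    `value` if launching there keeps per-value task counts balanced.
    `min_groups` (the GROUP_BY argument) only pads the comparison with
    hypothetical empty groups; it never excludes an attribute value
    that has actually appeared in an offer."""
    groups = set(counts) | {value}
    group_counts = [counts.get(v, 0) for v in groups]
    # Pad with empty groups if fewer than min_groups values seen so far.
    group_counts += [0] * max(0, min_groups - len(group_counts))
    return counts.get(value, 0) <= min(group_counts)

# Simulate 6 instances under ["rack", "GROUP_BY", "2"] while all three
# racks keep sending offers round-robin, as in the cluster above.
counts = Counter()
launched = 0
for rack in ["rack-1", "rack-2", "rack-3"] * 4:
    if launched == 6:
        break
    if group_by_accepts(counts, rack, min_groups=2):
        counts[rack] += 1
        launched += 1

print(dict(counts))  # two tasks on each of the three racks, not just two racks
```

Under this model, tasks would only concentrate on two racks if offers arrived from just two distinct rack values; with offers from exactly two racks, the same rule yields three tasks per rack.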
Attachments
Issue Links
- relates to MARATHON-2348 "No way to spread tasks evenly on a cluster?" (Resolved)