
heron submit hangs at "Creating job WordCountTopology" after submitting a topology


I am trying to build a Heron cluster using Apache Mesos, Apache Aurora, ZooKeeper and HDFS. After setting it up, I submitted the WordCountTopology, but the command stops at "Creating job WordCountTopology" and prints nothing further. The output is as follows:

yitian@ubuntu:~/.heron/conf/aurora$ heron submit aurora/yitian/devel --config-path ~/.heron/conf ~/.heron/examples/heron-api-examples.jar com.twitter.heron.examples.api.WordCountTopology WordCountTopology
[2018-02-13 06:58:30 +0000] [INFO]: Using cluster definition in /home/yitian/.heron/conf/aurora
[2018-02-13 06:58:30 +0000] [INFO]: Launching topology: 'WordCountTopology'
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/yitian/.heron/lib/uploader/heron-dlog-uploader.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/yitian/.heron/lib/statemgr/heron-zookeeper-statemgr.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.JDK14LoggerFactory]
[2018-02-13 06:58:31 -0800] [INFO] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager: Starting Curator client connecting to: heron01:2181  
[2018-02-13 06:58:31 -0800] [INFO] org.apache.curator.framework.imps.CuratorFrameworkImpl: Starting  
[2018-02-13 06:58:31 -0800] [INFO] org.apache.curator.framework.state.ConnectionStateManager: State change: CONNECTED  
[2018-02-13 06:58:31 -0800] [INFO] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager: Directory tree initialized.  
[2018-02-13 06:58:31 -0800] [INFO] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager: Checking existence of path: /home/yitian/heron/state/topologies/WordCountTopology  
[2018-02-13 06:58:34 -0800] [INFO] com.twitter.heron.uploader.hdfs.HdfsUploader: The destination directory does not exist. Creating it now at URI '/home/yitian/heron/topologies/aurora'  
[2018-02-13 06:58:37 -0800] [INFO] com.twitter.heron.uploader.hdfs.HdfsUploader: Uploading topology package at '/tmp/tmpvYzRv7/topology.tar.gz' to target HDFS at '/home/yitian/heron/topologies/aurora/WordCountTopology-yitian-tag-0--8268125700662472072.tar.gz'  
[2018-02-13 06:58:41 -0800] [INFO] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager: Created node for path: /home/yitian/heron/state/topologies/WordCountTopology  
[2018-02-13 06:58:41 -0800] [INFO] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager: Created node for path: /home/yitian/heron/state/packingplans/WordCountTopology  
[2018-02-13 06:58:41 -0800] [INFO] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager: Created node for path: /home/yitian/heron/state/executionstate/WordCountTopology  
[2018-02-13 06:58:41 -0800] [INFO] com.twitter.heron.scheduler.aurora.AuroraLauncher: Launching topology in aurora  
[2018-02-13 06:58:41 -0800] [INFO] com.twitter.heron.scheduler.utils.SchedulerUtils: Updating scheduled-resource in packing plan: WordCountTopology  
[2018-02-13 06:58:41 -0800] [INFO] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager: Deleted node for path: /home/yitian/heron/state/packingplans/WordCountTopology  
[2018-02-13 06:58:41 -0800] [INFO] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager: Created node for path: /home/yitian/heron/state/packingplans/WordCountTopology  
INFO] Creating job WordCountTopology

Heron Tracker shows:

status  "success"
executiontime   0.00007081031799316406
message ""
version "0.17.1"
result  {}

Heron UI shows nothing: [screenshot]

The Aurora scheduler is running as follows: [screenshot]

The cluster has two hosts:

  1. The master, named heron01, runs the Mesos master, ZooKeeper and the Aurora scheduler.
  2. The slave, named heron02, runs the Mesos slave, the Aurora observer and the executor.

I can open the observer (heron02:1338) and the executor (heron02:5051) in a browser. I do not know where I made a mistake. The cluster configuration is too complex to show in full here, so I have documented it on my blog. I apologize that the blog is written in Chinese, but I believe the contents of the configuration files are still understandable. Thank you very much for your help.


Solution

  • This problem was caused by insufficient cluster resources. When the Aurora scheduler assigns instances to worker nodes in the Heron cluster and no worker node has enough free resources for an instance, that instance stays pending until a worker node with sufficient resources becomes available. The problem was solved by increasing the RAM of the worker node in the Heron cluster.
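
To make the reasoning concrete, here is a minimal Python sketch of the resource-fit check described above. It is only an illustration, not Aurora's or Mesos's actual scheduling code: the `Resources` class, the `place_instance` function, and all the numbers are hypothetical, and only the host name heron02 comes from the question.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Resources:
    """Hypothetical resource bundle; fields and units are assumptions."""
    cpu: float    # cores
    ram_mb: int   # megabytes of RAM
    disk_mb: int  # megabytes of disk

    def fits_in(self, free: "Resources") -> bool:
        # A request fits only if every dimension is covered.
        return (self.cpu <= free.cpu
                and self.ram_mb <= free.ram_mb
                and self.disk_mb <= free.disk_mb)


def place_instance(request: Resources,
                   free_by_node: Dict[str, Resources]) -> Optional[str]:
    """Return the first node that can hold the request, or None (pending)."""
    for node, free in free_by_node.items():
        if request.fits_in(free):
            return node
    return None


# Made-up numbers: one instance asks for more RAM than the worker offers.
request = Resources(cpu=1.0, ram_mb=2048, disk_mb=1024)
before = {"heron02": Resources(cpu=2.0, ram_mb=1024, disk_mb=10240)}
after = {"heron02": Resources(cpu=2.0, ram_mb=4096, disk_mb=10240)}

print(place_instance(request, before))  # None: the instance stays pending
print(place_instance(request, after))   # heron02: placed after adding RAM
```

With a single worker node, one undersized dimension (RAM here) is enough to keep every instance pending, which would match the submit command never progressing past "Creating job WordCountTopology".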