java maven hadoop hadoop-yarn apache-twill

How to package and run a Twill sample application


I'm trying to use Apache Twill to build a YARN application. The slides from the Twill presentation mention using the maven-bundle-plugin to package the hello world sample.

To package the hello world sample, I first tried building the jar with mvn assembly:assembly -DdescriptorId=jar-with-dependencies. I then tried adding the following to pom.xml and running mvn clean install:

<build>
  <plugins>
    <plugin>
      <groupId>org.apache.felix</groupId>
      <artifactId>maven-bundle-plugin</artifactId>
      <version>2.5.3</version>
      <extensions>true</extensions>
      <configuration>
        <instructions>
          <Bundle-SymbolicName>${pom.groupId}.${pom.artifactId}</Bundle-SymbolicName>
          <Bundle-Name>${pom.artifactId}</Bundle-Name>
          <Bundle-Version>1.0.0</Bundle-Version>
          <Private-Package>org.wso2.mbp.helloworld</Private-Package>
          <Bundle-Activator>org.wso2.mbp.helloworld.Activator</Bundle-Activator>
          <Embed-Dependency>*;scope=compile|runtime</Embed-Dependency>
          <Embed-Transitive>true</Embed-Transitive>
          <Import-Package>
            org.apache.twill.*,
            org.osgi.framework,
            *;resolution:=optional
          </Import-Package>
        </instructions>
      </configuration>
    </plugin>
  </plugins>
</build>

How are Twill applications packaged, and how do I then run them on Hadoop?


Solution

  • For packaging, you can use the maven-bundle-plugin. I usually configure it like this in pom.xml:

    <build>
      <plugins>
        <plugin>
          <groupId>org.apache.felix</groupId>
          <artifactId>maven-bundle-plugin</artifactId>
          <version>2.3.7</version>
          <extensions>true</extensions>
          <configuration>
            <instructions>
              <Embed-Dependency>*;inline=false;groupId=!org.apache.hadoop</Embed-Dependency>
              <Embed-Transitive>true</Embed-Transitive>
              <Embed-Directory>lib</Embed-Directory>
            </instructions>
          </configuration>
          <executions>
            <execution>
              <phase>package</phase>
              <goals>
                <goal>bundle</goal>
              </goals>
            </execution>
          </executions>
        </plugin>
      </plugins>
    </build>
    

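    Then run MAVEN_OPTS="-Xmx512m" mvn clean package. That should create a .jar file under the target directory, with the dependencies embedded as jars under lib/ (Hadoop's own artifacts are left out by the groupId=!org.apache.hadoop filter). If you list the contents of the jar file with "jar -tf", you should see something like this: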

    my/package/HelloWorld.class
    my/package/HelloWorld$HelloWorldRunnable.class
    lib/twill-api-0.3.0-incubating.jar
    lib/twill-core-0.3.0-incubating.jar
    lib/..
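
The same layout check can be scripted instead of read by eye. The following is only a minimal sketch using the standard java.util.jar API; the class name BundleLayoutCheck and the method embeddedJars are hypothetical helpers for illustration and are not part of Twill or the maven-bundle-plugin.

```java
import java.io.IOException;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.stream.Collectors;

// Hypothetical helper (not part of Twill): inspects a bundle jar built by the
// maven-bundle-plugin and reports which dependency jars were embedded under lib/.
public class BundleLayoutCheck {

    // Returns the names of all entries under lib/ that end in .jar.
    static List<String> embeddedJars(String bundlePath) throws IOException {
        try (JarFile jar = new JarFile(bundlePath)) {
            return jar.stream()
                    .map(JarEntry::getName)
                    .filter(name -> name.startsWith("lib/") && name.endsWith(".jar"))
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java BundleLayoutCheck <path-to-bundle.jar>");
            System.exit(1);
        }
        List<String> jars = embeddedJars(args[0]);
        jars.forEach(System.out::println);
        // Assumption: the Twill jars keep their default artifact names (twill-*.jar).
        if (jars.stream().noneMatch(name -> name.startsWith("lib/twill-"))) {
            System.err.println("warning: no Twill jars found under lib/");
        }
    }
}
```

If the warning appears, the Embed-Dependency instruction probably did not match the Twill artifacts, and the launch step would fail with missing classes.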
    

    To launch the application, make sure you are on a host that can access the Hadoop cluster where you plan to launch the app. Copy the jar to that host (for example with scp), unpack it into a directory, and run shell commands like these from inside the expanded jar directory:

    $> export HADOOP_CP=`hadoop classpath`
    $> java -cp .:lib/*:$HADOOP_CP my.package.HelloWorld 
    

    The main() method of HelloWorld should then be able to interact with ZooKeeper and YARN and start the app in the cluster.
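
If the launch needs to be driven from a program rather than typed by hand, the two shell commands can be reproduced with ProcessBuilder. This is a rough sketch under a few assumptions: the class name LaunchCommand and the method build are made up for illustration, the hadoop command is on the PATH, and the program is run from the expanded jar directory. The lib/* entry is passed through literally, since the java launcher expands class path wildcards itself.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical launcher (not part of Twill): mirrors the two shell commands.
public class LaunchCommand {

    // Builds the equivalent of: java -cp .:lib/*:$HADOOP_CP <mainClass>
    static List<String> build(String hadoopClasspath, String mainClass) {
        String classpath = String.join(File.pathSeparator, ".", "lib/*", hadoopClasspath.trim());
        return Arrays.asList("java", "-cp", classpath, mainClass);
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        // Step 1: capture the output of `hadoop classpath`.
        Process hadoop = new ProcessBuilder("hadoop", "classpath")
                .redirectError(ProcessBuilder.Redirect.INHERIT)
                .start();
        String hadoopCp;
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(hadoop.getInputStream(), StandardCharsets.UTF_8))) {
            hadoopCp = reader.lines().collect(Collectors.joining());
        }
        if (hadoop.waitFor() != 0) {
            throw new IOException("`hadoop classpath` exited with a non-zero status");
        }

        // Step 2: start the application's main class with the combined class path.
        // "my.package.HelloWorld" is the placeholder name used in the answer.
        Process app = new ProcessBuilder(build(hadoopCp, "my.package.HelloWorld"))
                .inheritIO()
                .start();
        System.exit(app.waitFor());
    }
}
```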