Latest MRUnit dependency in the Cloudera repository


I could not find the latest MRUnit (1.1.0) in the Cloudera repository. The only version available there is 0.8.0-incubating. Here is my pom.xml:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.ma.hadoop</groupId>
    <artifactId>MapReduce</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <properties>
        <hadoop.version>2.3.0-cdh5.1.2</hadoop.version>
        <hive.version>0.12.0-cdh5.1.2</hive.version>
        <mrunit.version>0.8.0-incubating</mrunit.version>
    </properties>
    <dependencies>
        <!-- For unit testing -->
        <dependency>
            <groupId>org.apache.mrunit</groupId>
            <artifactId>mrunit</artifactId>
            <version>${mrunit.version}</version>
        </dependency>
        <!-- This is sufficient for all -->
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>${hadoop.version}</version>
        </dependency>
    </dependencies>
    <build>
        <finalName>Mapred</finalName>
        <pluginManagement>
            <plugins>
                <plugin>
                    <groupId>org.codehaus.mojo</groupId>
                    <artifactId>exec-maven-plugin</artifactId>
                    <version>1.2.1</version>
                </plugin>
            </plugins>
        </pluginManagement>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.1</version>
                <configuration>
                    <source>${jdk.version}</source>
                    <target>${jdk.version}</target>
                </configuration>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-jar-plugin</artifactId>
                <version>2.4</version>
                <configuration>
                    <outputDirectory>${basedir}</outputDirectory>
                </configuration>
            </plugin>
        </plugins>
    </build>
    <repositories>
        <repository>
            <id>maven-hadoop</id>
            <name>Hadoop Releases</name>
            <url>https://repository.cloudera.com/content/repositories/releases/</url>
        </repository>
        <repository>
            <id>cloudera-repos</id>
            <name>Cloudera Repos</name>
            <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
        </repository>
    </repositories>
</project>

If I change the version to 1.1.0, Eclipse reports an "artifact not found" error on the mrunit dependency in the pom file.

I tried adding the Maven Central repository:

<repository>
    <id>central</id>
    <url>http://repo1.maven.org/maven2/</url>
</repository>

Eclipse downloads the jar into .m2, but I still get the "artifact not found" error, and the unit test will not compile. Can someone please tell me the safe way to use the latest MRUnit with the Cloudera repository?

Thanks, Amit


Solution

  • In your mrunit dependency declaration:

    <dependency>
       <groupId>org.apache.mrunit</groupId>
       <artifactId>mrunit</artifactId>
       <version>${mrunit.version}</version>
    </dependency>
    

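    You should add <classifier>hadoop2</classifier> to specify which Hadoop version you want to build against; the classifier value is either hadoop1 or hadoop2. MRUnit 1.1.0 is published with these classifiers rather than as a plain jar, which is why a dependency declared without one is reported as not found.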

    So, since you use Hadoop 2.x, change the dependency in pom.xml to this:

    <dependency>
        <groupId>org.apache.mrunit</groupId>
        <artifactId>mrunit</artifactId>
        <version>${mrunit.version}</version>
        <classifier>hadoop2</classifier>
    </dependency>
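
Putting the pieces together, the relevant parts of the POM might look roughly like the sketch below. It assumes that MRUnit 1.1.0 is resolved from Maven Central, which Maven consults by default, so no extra repository entry should be needed alongside the Cloudera ones. The `<scope>test</scope>` line is not part of the answer above; it is an assumption that MRUnit is used only from unit tests.

```xml
<!-- Sketch only: fragments to merge into the existing pom.xml -->
<properties>
    <hadoop.version>2.3.0-cdh5.1.2</hadoop.version>
    <!-- Assumption: 1.1.0 comes from Maven Central, not the Cloudera repository -->
    <mrunit.version>1.1.0</mrunit.version>
</properties>

<dependencies>
    <dependency>
        <groupId>org.apache.mrunit</groupId>
        <artifactId>mrunit</artifactId>
        <version>${mrunit.version}</version>
        <!-- hadoop2 matches the Hadoop 2.x client used in this project -->
        <classifier>hadoop2</classifier>
        <!-- Assumption: MRUnit is needed only by unit tests -->
        <scope>test</scope>
    </dependency>
</dependencies>
```

After changing the POM in Eclipse, refreshing the Maven project configuration is usually needed before the error marker on the dependency clears.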