As far as we know, each Cosmos user in the FIWARE Lab (cosmos.lab.fiware.org) has a maximum of 5GB available in HDFS.
Nevertheless, we are getting a DSQuotaExceededException when running our MapReduce Hadoop jobs, even though the data generated by the job doesn't exceed the 5GB quota.
If we monitor the HDFS usage during the execution of the map-reduce job, we get the following output:
Command: "while true; do date; hadoop fs -count -q . ; sleep 20; done"
Format: DATE QUOTA REMAINING_QUOTA SPACE_QUOTA REMAINING_SPACE_QUOTA DIR_COUNT FILE_COUNT CONTENT_SIZE FILE_NAME
jue jul 28 18:50:12 CEST 2016 none inf 5368709120 1197734302 19 46 1389627219 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:50:34 CEST 2016 none inf 5368709120 2678747494 16 26 895957142 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:50:57 CEST 2016 none inf 5368709120 2678747494 16 26 895957142 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:51:20 CEST 2016 none inf 5368709120 2678747494 16 26 895957142 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:51:44 CEST 2016 none inf 5368709120 2678747494 16 26 895957142 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:52:07 CEST 2016 none inf 5368709120 2678747494 16 26 895957142 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:52:28 CEST 2016 none inf 5368709120 1198032544 22 35 1389528792 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:52:50 CEST 2016 none inf 5368709120 1197738517 19 39 1389625814 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:53:11 CEST 2016 none inf 5368709120 2678747494 16 27 895957142 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:53:35 CEST 2016 none inf 5368709120 2678747494 16 27 895957142 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:53:59 CEST 2016 none inf 5368709120 2678747494 16 27 895957142 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:54:22 CEST 2016 none inf 5368709120 2678747494 16 27 895957142 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:54:46 CEST 2016 none inf 5368709120 2678747494 16 27 895957142 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:55:09 CEST 2016 none inf 5368709120 2477420902 17 28 895957142 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:55:31 CEST 2016 none inf 5368709120 1197738514 19 39 1389625815 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:55:55 CEST 2016 none inf 5368709120 1197738514 20 48 1389625815 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:56:17 CEST 2016 none inf 5368709120 2678747506 16 28 895957138 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:56:40 CEST 2016 none inf 5368709120 2678747506 16 28 895957138 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:57:04 CEST 2016 none inf 5368709120 2678747506 16 28 895957138 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:57:28 CEST 2016 none inf 5368709120 2678747506 16 28 895957138 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:57:51 CEST 2016 none inf 5368709120 2678747506 16 28 895957138 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:58:13 CEST 2016 none inf 5368709120 1198032556 16 37 1389528788 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:58:34 CEST 2016 none inf 5368709120 1197738742 19 40 1389625760 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:58:56 CEST 2016 none inf 5368709120 2678747494 16 29 895957142 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:59:20 CEST 2016 none inf 5368709120 2678747494 16 29 895957142 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 18:59:43 CEST 2016 none inf 5368709120 2678747494 16 29 895957142 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 19:00:07 CEST 2016 none inf 5368709120 2678747494 16 29 895957142 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 19:00:31 CEST 2016 none inf 5368709120 2678747494 16 29 895957142 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 19:00:54 CEST 2016 none inf 5368709120 1076586601 22 38 1228684181 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 19:01:18 CEST 2016 none inf 5368709120 1197724648 19 41 1389630437 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 19:01:41 CEST 2016 none inf 5368709120 1197724648 19 41 1389630437 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 19:02:05 CEST 2016 none inf 5368709120 1197724648 19 41 1389630437 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 19:02:29 CEST 2016 none inf 5368709120 1197724648 19 41 1389630437 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 19:02:53 CEST 2016 none inf 5368709120 1197724648 19 41 1389630437 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 19:03:14 CEST 2016 none inf 5368709120 364004107 19 46 1667537284 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 19:03:36 CEST 2016 none inf 5368709120 197959591 20 48 1722885456 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 19:03:57 CEST 2016 none inf 5368709120 201060881 18 44 1722549413 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 19:04:19 CEST 2016 none inf 5368709120 201060881 18 44 1722549413 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 19:04:40 CEST 2016 none inf 5368709120 201060881 18 44 1722549413 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 19:05:02 CEST 2016 none inf 5368709120 201060881 18 44 1722549413 hdfs://cosmosmaster-gi/user/rbarriuso
jue jul 28 19:05:23 CEST 2016 none inf 5368709120 201060881 18 44 1722549413 hdfs://cosmosmaster-gi/user/rbarriuso
After a while the execution finishes with this exception:
16/07/28 19:03:11 INFO mapred.JobClient: Task Id : attempt_201604111313_157784_r_000006_0, Status : FAILED
org.apache.hadoop.hdfs.protocol.DSQuotaExceededException: org.apache.hadoop.hdfs.protocol.DSQuotaExceededException: The DiskSpace quota of /user/rbarriuso is exceeded: quota=5368709120 diskspace consumed=5.0g
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:95)
    at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:57)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:3778)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3640)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2400(DFSClient.java:2846)
    at org.apache.ha...
As you can see at the end of the log above, the HDFS usage settles at 1,722,549,413 bytes with 201,060,881 bytes of quota remaining (according to hadoop fs -count -q), which doesn't add up to the 5GB of available user space.
Moreover, the space used doesn't match the drop in remaining quota.
How is the remaining quota space calculated?
Is there any way to avoid the DSQuotaExceededException?
Thanks in advance.
You have to take into account the replication factor that HDFS applies to all data. By default it is 3, so every byte you store is charged three times against the space quota and your effective quota is 5GB/3 (roughly 1.67GB). The quota can be increased by contacting the admin (me :)) via email.
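
To check that arithmetic against the log in the question, here is a minimal Python sketch. The function names are made up for illustration, and it assumes every file under the directory uses the same replication factor of 3, which may not hold for temporary job files.

```python
# Sketch: relation between logical file size and HDFS space quota.
# Assumption: all files share one replication factor (3, the HDFS default).

SPACE_QUOTA = 5368709120  # SPACE_QUOTA column of `hadoop fs -count -q`
REPLICATION = 3           # assumed uniform replication factor


def effective_quota(space_quota: int, replication: int) -> int:
    """Logical bytes that fit in the quota when each byte is stored `replication` times."""
    return space_quota // replication


def remaining_space_quota(space_quota: int, content_size: int, replication: int) -> int:
    """Quota left after charging each logical byte once per replica."""
    return space_quota - content_size * replication


if __name__ == "__main__":
    # Effective capacity in logical bytes
    print(effective_quota(SPACE_QUOTA, REPLICATION))  # 1789569706
    # CONTENT_SIZE taken from the last sample in the log
    print(remaining_space_quota(SPACE_QUOTA, 1722549413, REPLICATION))  # 201060881
```

The second value is exactly the REMAINING_SPACE_QUOTA reported in the last samples of the log, so the numbers do add up once replication is counted. Some earlier samples differ from this formula by a couple of MB; my guess is that a few files there had a different replication factor or were still being written.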