I am working with bed files and I want to subset rows that fall in a specific size range. I'm only interested in rows where "chromEnd - chromStart" is between 140 and 160.
For example, for the following bed file I want to subset the second and the fifth rows (10229 - 10082 = 147 and 65133 - 64976 = 157):
chr1 10061 10229 A00327:118:HNV2VDMXX:1:1316:4779:23265 12 +
chr1 10082 10229 A00327:118:HNV2VDMXX:1:2488:28519:18662 30 +
chr1 49486 49880 A00327:118:HNV2VDMXX:1:2412:2564:16517 12 +
chr1 54472 54800 A00327:118:HNV2VDMXX:1:1304:1633:32095 30 +
chr1 64976 65133 A00327:118:HNV2VDMXX:1:1488:3739:12038 30 +
chr1 75240 75547 A00327:118:HNV2VDMXX:1:2370:12102:12524 30 +
chr1 106775 107146 A00327:118:HNV2VDMXX:1:1324:32696:22169 31 +
Is there any possible way to subset these rows?
Many ways, but I really like awk:
awk '{ s=$3-$2 } s >= 140 && s <= 160 { print }' input.bed > output.bed
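
If you expect to change the range, here is a minimal sketch of the same filter with the bounds passed in through awk's -v option. The variable names min and max are my own choice, and the heredoc only recreates the example rows from the question so the snippet can be run on its own:

```shell
# Recreate the example rows from the question.
cat > input.bed <<'EOF'
chr1 10061 10229 A00327:118:HNV2VDMXX:1:1316:4779:23265 12 +
chr1 10082 10229 A00327:118:HNV2VDMXX:1:2488:28519:18662 30 +
chr1 49486 49880 A00327:118:HNV2VDMXX:1:2412:2564:16517 12 +
chr1 54472 54800 A00327:118:HNV2VDMXX:1:1304:1633:32095 30 +
chr1 64976 65133 A00327:118:HNV2VDMXX:1:1488:3739:12038 30 +
chr1 75240 75547 A00327:118:HNV2VDMXX:1:2370:12102:12524 30 +
chr1 106775 107146 A00327:118:HNV2VDMXX:1:1324:32696:22169 31 +
EOF

# min and max are illustrative names set with -v; both bounds are inclusive.
# A pattern with no action prints the matching line, so { print } can be dropped.
awk -v min=140 -v max=160 '{ s = $3 - $2 } s >= min && s <= max' input.bed > output.bed

# Show what was kept (the second and fifth rows of the example).
cat output.bed
```

Real bed files are normally tab-delimited. awk's default field splitting handles tabs and spaces alike, and because the whole line is printed unchanged, the original delimiters are preserved in output.bed.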