According to the AWS billing dashboard, I am seeing a high cost for "EC2: NAT Gateway - Data Processed". Is there a way to get to the bottom of this and find out which instance, user, S3 bucket, or EMR cluster is responsible?
A NAT Gateway lives in a specific VPC, so its usage can be scoped to that VPC. In addition, a NAT Gateway is the target of routes in one or more route tables in that VPC.
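As a rough sketch of how the route table link can be inspected, the Python below takes route table descriptions in the shape returned by the EC2 `describe_route_tables` call and lists the subnets explicitly associated with a table that routes to a given NAT Gateway. The function names `subnets_using_nat` and `fetch_route_tables` and the gateway ID in the usage comment are made up for illustration, and `fetch_route_tables` assumes boto3 is installed with credentials configured.

```python
def subnets_using_nat(route_tables, nat_gateway_id):
    """Return sorted subnet IDs whose route table has a route to the NAT gateway."""
    subnet_ids = set()
    for table in route_tables:
        routes = table.get("Routes", [])
        if not any(r.get("NatGatewayId") == nat_gateway_id for r in routes):
            continue
        for assoc in table.get("Associations", []):
            # The main-table association has no SubnetId; it covers every subnet
            # without an explicit association, so those need a separate check.
            if "SubnetId" in assoc:
                subnet_ids.add(assoc["SubnetId"])
    return sorted(subnet_ids)


def fetch_route_tables(nat_gateway_id):
    """Fetch route tables that contain a route targeting the NAT gateway."""
    import boto3  # imported lazily so the helper above works without AWS access

    ec2 = boto3.client("ec2")
    paginator = ec2.get_paginator("describe_route_tables")
    pages = paginator.paginate(
        Filters=[{"Name": "route.nat-gateway-id", "Values": [nat_gateway_id]}]
    )
    return [table for page in pages for table in page["RouteTables"]]


# Hypothetical usage (the ID is a placeholder):
# print(subnets_using_nat(fetch_route_tables("nat-0123456789abcdef0"),
#                         "nat-0123456789abcdef0"))
```

If the matching route table is the VPC's main route table, subnets with no explicit association also use it, so the list above may be incomplete in that case.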
Using the route tables, you can identify the subnets whose resources are sending traffic through the NAT Gateway. If you have multiple NAT Gateways, CloudWatch publishes per-gateway metrics (for example BytesOutToDestination and BytesInFromDestination) that give you a breakdown of the bytes in and out for each one.
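To compare gateways, something like the following could sum the byte metrics per NAT Gateway over the last week. It is only a sketch: the helper names are invented for this example, the seven-day window and daily period are arbitrary choices, and `fetch_totals` assumes boto3 with working credentials. The namespace `AWS/NATGateway`, the `NatGatewayId` dimension, and the metric names are the ones CloudWatch uses for NAT Gateways.

```python
from datetime import datetime, timedelta, timezone

NAT_METRICS = (
    "BytesOutToDestination",
    "BytesInFromDestination",
    "BytesInFromSource",
    "BytesOutToSource",
)


def nat_metric_request(nat_gateway_id, metric_name, days=7, now=None):
    """Build keyword arguments for CloudWatch get_metric_statistics."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/NATGateway",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "NatGatewayId", "Value": nat_gateway_id}],
        "StartTime": end - timedelta(days=days),
        "EndTime": end,
        "Period": 86400,  # one datapoint per day
        "Statistics": ["Sum"],
    }


def total_bytes(datapoints):
    """Add up the Sum statistic across the returned datapoints."""
    return sum(point["Sum"] for point in datapoints)


def fetch_totals(nat_gateway_ids):
    """Return {gateway_id: {metric_name: total_bytes}} for each gateway."""
    import boto3  # imported lazily so the helpers above work without AWS access

    cloudwatch = boto3.client("cloudwatch")
    totals = {}
    for nat_id in nat_gateway_ids:
        totals[nat_id] = {}
        for metric in NAT_METRICS:
            response = cloudwatch.get_metric_statistics(
                **nat_metric_request(nat_id, metric)
            )
            totals[nat_id][metric] = total_bytes(response["Datapoints"])
    return totals
```

The gateway with the largest totals is the one whose subnets are worth looking at first.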
From there, you could enable VPC Flow Logs on the selected subnets and then analyse the traffic that is occurring, perhaps using Athena to query the logs.
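Before setting up Athena, a small sample of flow log records can be summarised locally to see which source addresses move the most bytes. The sketch below assumes the records use the default (version 2) flow log format with space-separated fields; `bytes_by_source` is an illustrative name, and a custom log format would need a different field list. It does the same grouping one would express in SQL as a sum of bytes grouped by source address.

```python
from collections import Counter

# Field order of the default VPC flow log record format.
FIELDS = (
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
)


def bytes_by_source(lines):
    """Sum bytes per source address, largest first, from default-format records."""
    totals = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) != len(FIELDS):
            continue  # not a default-format record
        record = dict(zip(FIELDS, parts))
        # Header lines and NODATA/SKIPDATA records carry no byte count.
        if record["log_status"] != "OK":
            continue
        totals[record["srcaddr"]] += int(record["bytes"])
    return totals.most_common()
```

The private source addresses at the top of the result can then be matched to network interfaces, and from there to the instances or clusters that own them.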