I've got a problem i have been trying to work out I have googled a few things that are similar to what i want to do but can't work out exactly how to do it,
I have around 250M ip address's and i want to look that up against the maxmind geolite2 data so that i can tell what country each IP-address originates from,
I have imported all the data into my Redshift cluster with talend,
table a has ID and 'ipaddress' ie 10.0.0.5
table b (maxmind) has country name and IP range as expressed as 10.0.0.0/24
how could i use Redshift SQL to match these two considering the size of my source data source?
edit: heres the link to the geolite2 data https://dev.maxmind.com/geoip/geoip2/geolite2/
You could try using Amazon Redshift's ability to Import Custom Python Library Modules to load the netaddr library. Then, you could use the library within a User Defined Function written in Python.
See also: IP Range to CIDR conversion in Python?