Why Bulk GeoIP Lookup

When a developer needs the location of a single IP address, the Apility.io Geolocation API returns it in hundredths of a second. That is ideal for web applications or apps that issue a lookup alongside other logic. For applications that process data in bulk, however, sending one request to the server for every IP address is not feasible. In this case, the Bulk GeoIP Lookup service of the geolocation API comes to the rescue!

The Bulk GeoIP Lookup API endpoint

We keep our API design simple and straightforward. To resolve a set of IP addresses with the Bulk GeoIP Lookup service, pass the IP addresses as a comma-separated list in the URL path. Let’s use the curl command for the examples:

Request:

$ curl -H "X-Auth-Token: UUID" -X GET "https://api.apility.net/geoip_batch/118.70.124.34,8.8.8.8,9.9.9.9"

And the response object:

{  
   "response":[  
      {  
         "ip":"9.9.9.9",
         "geoip":{  
            "continent_names":{  
               "es":"Europa",
               "zh-CN":"\u6b27\u6d32",
               "de":"Europa",
               "en":"Europe",
               "pt-BR":"Europa",
               "fr":"Europe",
               "ru":"\u0415\u0432\u0440\u043e\u043f\u0430",
               "ja":"\u30e8\u30fc\u30ed\u30c3\u30d1"
            },
            "hostname":"",
            "postal":"",
            "country_names":{  
               "es":"Francia",
               "zh-CN":"\u6cd5\u56fd",
               "de":"Frankreich",
               "en":"France",
               "pt-BR":"Fran\u00e7a",
               "fr":"France",
               "ru":"\u0424\u0440\u0430\u043d\u0446\u0438\u044f",
               "ja":"\u30d5\u30e9\u30f3\u30b9\u5171\u548c\u56fd"
            },
            "region_names":{  

            },
            "city_geoname_id":-1,
            "continent_geoname_id":6255148,
            "address":"9.9.9.9",
            "latitude":48.8582,
            "country_geoname_id":3017382,
            "country":"FR",
            "longitude":2.3387000000000002,
            "region_geoname_id":-1,
            "time_zone":"Europe/Paris",
            "city_names":{  

            },
            "city":"",
            "as":{  
               "name":"QUAD9-AS-1 - Quad9",
               "country":"US",
               "asn":"19281",
               "networks":[  
                  "9.9.9.0/24",
                  "149.112.112.0/24",
                  "149.112.149.0/24"
               ],
               "ip":"9.9.9.9"
            },
            "accuracy_radius":1000,
            "region":"",
            "continent":"EU"
         }
      },
      {  
         "ip":"8.8.8.8",
         "geoip":{  
            "continent_names":{  
               "es":"Norteam\u00e9rica",
               "zh-CN":"\u5317\u7f8e\u6d32",
               "de":"Nordamerika",
               "en":"North America",
               "pt-BR":"Am\u00e9rica do Norte",
               "fr":"Am\u00e9rique du Nord",
               "ru":"\u0421\u0435\u0432\u0435\u0440\u043d\u0430\u044f \u0410\u043c\u0435\u0440\u0438\u043a\u0430",
               "ja":"\u5317\u30a2\u30e1\u30ea\u30ab"
            },
            "hostname":"",
            "postal":"",
            "country_names":{  
               "es":"Estados Unidos",
               "zh-CN":"\u7f8e\u56fd",
               "de":"USA",
               "en":"United States",
               "pt-BR":"Estados Unidos",
               "fr":"\u00c9tats-Unis",
               "ru":"\u0421\u0428\u0410",
               "ja":"\u30a2\u30e1\u30ea\u30ab\u5408\u8846\u56fd"
            },
            "region_names":{  

            },
            "city_geoname_id":-1,
            "continent_geoname_id":6255149,
            "address":"8.8.8.8",
            "latitude":37.751,
            "country_geoname_id":6252001,
            "country":"US",
            "longitude":-97.822,
            "region_geoname_id":-1,
            "time_zone":"",
            "city_names":{  

            },
            "city":"",
            "as":{  
               "name":"GOOGLE - Google LLC",
               "country":"US",
               "asn":"15169",
               "networks":[  
                  "8.8.4.0/24",
                  "8.8.8.0/24",
                  ...
                  "216.252.222.0/24"
               ],
               "ip":"8.8.8.8"
            },
            "accuracy_radius":1000,
            "region":"",
            "continent":"NA"
         }
      },
      {  
         "ip":"118.70.124.34",
         "geoip":{  
            "continent_names":{  
               "es":"Asia",
               "zh-CN":"\u4e9a\u6d32",
               "de":"Asien",
               "en":"Asia",
               "pt-BR":"\u00c1sia",
               "fr":"Asie",
               "ru":"\u0410\u0437\u0438\u044f",
               "ja":"\u30a2\u30b8\u30a2"
            },
            "hostname":"",
            "postal":"",
            "country_names":{  
               "es":"Vietnam",
               "zh-CN":"\u8d8a\u5357",
               "de":"Vietnam",
               "en":"Vietnam",
               "pt-BR":"Vietn\u00e3",
               "fr":"Vietnam",
               "ru":"\u0412\u044c\u0435\u0442\u043d\u0430\u043c",
               "ja":"\u30d9\u30c8\u30ca\u30e0"
            },
            "region_names":{  
               "en":"Thanh Pho Ha Noi"
            },
            "city_geoname_id":1581130,
            "continent_geoname_id":6255147,
            "address":"118.70.124.34",
            "latitude":21.0333,
            "country_geoname_id":1562822,
            "country":"VN",
            "longitude":105.85,
            "region_geoname_id":1581129,
            "time_zone":"Asia/Ho_Chi_Minh",
            "city_names":{  
               "es":"Han\u00f3i",
               "de":"Hanoi",
               "en":"Hanoi",
               "pt-BR":"Han\u00f3i",
               "fr":"Hano\u00ef",
               "ru":"\u0425\u0430\u043d\u043e\u0439",
               "ja":"\u30cf\u30ce\u30a4"
            },
            "city":"Hanoi",
            "as":{  
               "name":"FPT-AS-AP The Corporation for Financing & Promoting Technology",
               "country":"VN",
               "asn":"18403",
               "networks":[  
                  "1.52.0.0/14",
                  "1.52.0.0/20",
                  ...
                  "210.245.127.0/24"
               ],
               "ip":"118.70.124.34"
            },
            "accuracy_radius":20,
            "region":"Thanh Pho Ha Noi",
            "continent":"AS"
         }
      }
   ]
}

The response object contains an array in which each element holds an IP address and its GeoIP object. Developers read this array and extract the information for each IP address. A process can therefore iterate over a list of IP addresses in a database or file and send them to the API in groups of a hundred per request.
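The grouping and extraction steps described above can be sketched in bash. This is a minimal sketch, not part of the service: the function names batch_ips and extract_locations are hypothetical, the extraction step assumes the jq tool is installed, and the field names are taken from the sample response shown earlier.

```shell
#!/bin/bash
# Hypothetical helper: print one comma-separated line per batch of IPs.
# Usage: batch_ips FILE [SIZE]   (SIZE defaults to 100)
batch_ips() {
    local file=$1 size=${2:-100} batch='' count=0 line
    while IFS='' read -r line || [[ -n "$line" ]]; do
        [[ -z "$line" ]] && continue      # skip blank lines
        batch=${batch:+$batch,}$line      # append, comma-separated
        count=$((count + 1))
        if (( count == size )); then      # batch is full: emit and reset
            printf '%s\n' "$batch"
            batch='' count=0
        fi
    done < "$file"
    # Emit the last, partially filled batch (if any).
    if [[ -n "$batch" ]]; then printf '%s\n' "$batch"; fi
}

# Hypothetical helper (assumes jq is installed): read a bulk response on
# stdin and print one tab-separated line per IP: address, country, city.
extract_locations() {
    jq -r '.response[] | [.ip, .geoip.country, .geoip.city] | @tsv'
}

# Possible usage, one bulk request per batch (UUID is a placeholder):
#   batch_ips tor-exit.lst 100 | while read -r ips; do
#       curl -s -H "X-Auth-Token: UUID" \
#           "https://api.apility.net/geoip_batch/$ips" | extract_locations
#   done
```

Keeping the batching separate from the request makes it easy to change the group size or to swap in a different extraction filter.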

But how much time can a process or application save by using the Bulk GeoIP Lookup endpoint instead of the standard GeoIP Lookup request?

How Apility.io tackles network latency

We ran this test on a laptop running Mac OS X, connected to a symmetric 300 Mbit/s fiber network delivered by Telefónica de España. Apility.io has a set of servers deployed worldwide on different cloud service providers: Google, AWS, DigitalOcean, Hetzner, and others. Each server is a satellite, and together they form a constellation. A DNS service with a latency-based resolution algorithm always returns the server closest to the origin. This process is automatic and transparent to the end user.

If we ping the server from our network:

$ ping api.apility.net
PING satellite-XXXXXXX-XX.apility.net (XXX.XXX.XXX.XXX): 56 data bytes
64 bytes from XXX.XXX.XXX.XXX: icmp_seq=0 ttl=45 time=65.528 ms
64 bytes from XXX.XXX.XXX.XXX: icmp_seq=1 ttl=45 time=66.324 ms
64 bytes from XXX.XXX.XXX.XXX: icmp_seq=2 ttl=45 time=66.001 ms
64 bytes from XXX.XXX.XXX.XXX: icmp_seq=3 ttl=45 time=65.869 ms

The ping is around 66 ms. Latency from a residential network is much higher than latency from a data center network, and far higher still than latency within the same data center.

We have developed two scripts. The first performs 100 requests, each with a single IP address. The second performs a single request with 100 IP addresses.

Script 1: One IP address per request, 100 times

This script reads a file with 100 IP addresses and performs one request for each IP address:

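#!/bin/bash
# Read the file given as the first argument line by line
# (the || clause also handles a last line without a trailing newline).
while IFS='' read -r line || [[ -n "$line" ]]; do
    # One GeoIP request per IP address; the response body is discarded.
    curl -s -X GET "http://api.apility.net/geoip/$line?token=YOUR_API_KEY" > /dev/null
done < "$1"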

Then we launch the script with the time command:

$ time ./test-100request.sh tor-exit.lst 

real	0m15.548s
user	0m0.673s
sys	0m0.599s

It took 15,548 milliseconds to process 100 IP addresses. That’s an average of 155 ms per IP address, including network latency. Not bad, but probably not fast enough if you want to process thousands of IP addresses.

Script 2: 100 IP addresses per request

This script reads a file with 100 IP addresses, concatenates them, and then performs a single request for all of them:

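#!/bin/bash
# Join every IP address in the file into one comma-separated string.
IPS=''
while IFS='' read -r line || [[ -n "$line" ]]; do
    IPS=$IPS,$line
done < "$1"
# ${IPS:1} drops the leading comma before sending the single bulk request.
curl -s -X GET "http://api.apility.net/geoip_batch/${IPS:1}?token=YOUR_API_KEY" > /dev/null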

Then we launch the script with the time command:

$ time ./test-1bulkrequest.sh tor-exit.lst 

real	0m1.260s
user	0m0.015s
sys	0m0.017s

It took 1,260 milliseconds to process 100 IP addresses. That’s an average of about 12.6 ms per IP address, including network latency.

We have saved a staggering 92% of the processing time!
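For readers who want to check the arithmetic, here is a small sketch that derives the saving and the speed-up factor from the two wall-clock times measured above (15548 ms and 1260 ms). The function name savings is hypothetical, and the calculation relies only on awk.

```shell
#!/bin/bash
# Hypothetical helper: given the per-request total and the bulk total
# (both in milliseconds), print the percentage of time saved and the
# speed-up factor.
savings() {
    awk -v single="$1" -v bulk="$2" 'BEGIN {
        printf "saved=%.1f%% speedup=%.1fx\n",
            (single - bulk) / single * 100, single / bulk
    }'
}

# Times measured in the two runs above.
savings 15548 1260   # prints: saved=91.9% speedup=12.3x
```

The 91.9% figure rounds to the 92% quoted above, and the same numbers mean the bulk request was roughly twelve times faster in this test.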

Conclusions

To wrap up, the Bulk GeoIP Lookup service can clearly help you build more efficient and performant applications and services. It still lags behind a solution with a local GeoIP database. On the other hand, our service saves your team the hassle of keeping that database up to date and running smoothly. The Bulk GeoIP Lookup service is a good trade-off between performance and maintainability.