
Posted on Sat 08 January 2022 under DevOps and Networking

Where is every IP Address?

IPinfo builds and sells IPv4 and IPv6 address metadata. This is available by API, as a file download or as a Snowflake dataset. Given an IP address, the service returns that IP's physical location and ownership information. It also reports whether the address is used as a VPN or Tor endpoint, whether it is owned by a hosting company and which domain names have been pointed at it.

For example, this site's IPv4 address 146.185.174.209 is listed as being owned by DigitalOcean, a hosting company, and as located in their data centre in Amsterdam with the domain marksblogg.com pointed at it.

$ curl -sS 'ipinfo.io/146.185.174.209?token=obscured'
{
  "ip": "146.185.174.209",
  "hostname": "www.marksblogg.com",
  "city": "Amsterdam",
  "region": "North Holland",
  "country": "NL",
  "loc": "52.3740,4.8897",
  "org": "AS14061 DigitalOcean, LLC",
  "postal": "1012",
  "timezone": "Europe/Amsterdam",
  "asn": {
    "asn": "AS14061",
    "name": "DigitalOcean, LLC",
    "domain": "digitalocean.com",
    "route": "146.185.160.0/20",
    "type": "hosting"
  },
  "company": {
    "name": "Digital Ocean, Inc.",
    "domain": "digitalocean.com",
    "type": "hosting"
  },
  "privacy": {
    "vpn": false,
    "proxy": false,
    "tor": false,
    "relay": false,
    "hosting": true,
    "service": ""
  },
  "abuse": {
    "address": "DigitalOcean, LLC, 101 Avenue of the Americas, 10th Floor, New York, NY, 10013, United States of America",
    "country": "US",
    "email": "abuse@digitalocean.com",
    "name": "Abuse Department",
    "network": "146.185.168.0/21",
    "phone": "+13478756044"
  },
  "domains": {
    "ip": "146.185.174.209",
    "total": 1,
    "domains": [
      "marksblogg.com"
    ]
  }
}

The above API response was put together from 15 different data sources, including IPinfo's own global "Probe Network".

The above information is useful for companies trying to detect e-commerce fraud, maintain sanctions compliance, understand their clients' infrastructure, and personalise and optimise their offerings.
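
As a rough illustration of how such a check might consume the response, here is a minimal Python sketch. The `record` dictionary is a trimmed copy of the response shown earlier, and the helper names `privacy_flags` and `is_hosted` are my own, not part of any IPinfo library.

```python
# Trimmed copy of the API response shown earlier in this post.
record = {
    "ip": "146.185.174.209",
    "country": "NL",
    "asn": {"asn": "AS14061", "type": "hosting"},
    "company": {"name": "Digital Ocean, Inc.", "type": "hosting"},
    "privacy": {
        "vpn": False,
        "proxy": False,
        "tor": False,
        "relay": False,
        "hosting": True,
        "service": "",
    },
}


def privacy_flags(rec):
    """Return the names of privacy fields whose value is exactly True."""
    privacy = rec.get("privacy", {})
    return sorted(name for name, value in privacy.items() if value is True)


def is_hosted(rec):
    """True when the ASN or company type is reported as 'hosting'."""
    types = (rec.get("asn", {}).get("type"), rec.get("company", {}).get("type"))
    return "hosting" in types


print(privacy_flags(record))  # ['hosting']
print(is_hosted(record))      # True
```

What a business does with these flags (block, review or simply log) is a policy decision that this sketch leaves open.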

For customers with large sets of network logs, having a local IP address database allows for good data locality when enriching datasets via batch processes. For customers with existing systems, the API is very easy to integrate and has 14 programming language bindings as of this writing.
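
To give a feel for the local-database approach, below is a minimal sketch using only Python's standard library. The two rows in `ROWS` are hypothetical stand-ins for a downloaded file, which would have far more rows and columns. The sketch assumes IPv4-only, non-overlapping ranges.

```python
import bisect
import ipaddress

# Hypothetical rows: (CIDR, label). The first CIDR is the route from the
# API response above; the second is a reserved documentation range.
ROWS = [
    ("146.185.160.0/20", "AS14061 DigitalOcean, LLC"),
    ("192.0.2.0/24", "documentation range (TEST-NET-1)"),
]


def build_index(rows):
    """Turn CIDR rows into (start, end, label) tuples sorted by start."""
    entries = []
    for cidr, label in rows:
        net = ipaddress.ip_network(cidr)
        entries.append((int(net.network_address), int(net.broadcast_address), label))
    entries.sort()
    starts = [entry[0] for entry in entries]
    return starts, entries


def lookup(index, ip):
    """Binary-search for the range containing ip; None if no range matches."""
    starts, entries = index
    value = int(ipaddress.ip_address(ip))
    pos = bisect.bisect_right(starts, value) - 1
    if pos >= 0 and entries[pos][1] >= value:
        return entries[pos][2]
    return None


index = build_index(ROWS)
for ip in ["146.185.174.209", "8.8.8.8"]:
    print(ip, "->", lookup(index, ip))
```

Building the index once and reusing it across a whole log file is what gives the batch job its data locality: no network round trip is needed per row.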

In May, I signed a 3-month deal with IPinfo to provide consulting services. In this post, I'll describe how their probe network, which identifies the physical location of almost every IPv4 address on the planet, operates.

How IPinfo Got Started

Ben Dowling, the founder of IPinfo, began looking into ways to detect IP address locations in 2013, shortly after he arrived in the US from the UK and began working for Facebook. In 2016, after a stint as the CTO of Calm, a meditation app, he decided to start IPinfo, which builds and sells IP address metadata.

Ben's first customer was Tesla and from there he managed to win deals with Cloudflare, Microsoft and Tencent to name a few. Today IPinfo supplies data to over 400K businesses around the world. In July, their API crossed the 40B queries per month milestone.

The Probe Network

IPinfo uses a wide variety of sources of information on IP addresses. In the autumn of 2020, they set out to add a new data source for IPv4 locations. This source was called the "Probe Network" and would run an ICMP census across the entire publicly addressable IPv4 space every few weeks.

To conduct the census, zmap is run from four regions and collects a list of all IPv4 addresses that respond to any one of four ping types. That list is then given to 90+ servers around the world. Each server pings those addresses using scamper and records the ping times.

The ping times from every server are then put together and calculations are made as to the likely location of each IPv4 address.
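
I can't share IPinfo's actual calculations, but the first step of combining measurements can be sketched as follows. The probe names, target address and ping times in `MEASUREMENTS` are made up for illustration. Keeping the minimum ping time per probe is my own simplification: the fastest observation is the one least affected by queueing and congestion.

```python
from collections import defaultdict

# Hypothetical measurements: (probe_city, target_ip, rtt_ms).
MEASUREMENTS = [
    ("Amsterdam", "192.0.2.10", 3.1),
    ("Amsterdam", "192.0.2.10", 2.4),
    ("Mumbai", "192.0.2.10", 121.0),
    ("Dubai", "192.0.2.10", 98.5),
]


def min_rtt_by_probe(measurements):
    """Group by target and keep the lowest ping time seen from each probe."""
    best = defaultdict(dict)
    for probe, target, rtt in measurements:
        current = best[target].get(probe)
        if current is None or rtt < current:
            best[target][probe] = rtt
    return dict(best)


def nearest_probe(per_probe):
    """Return the (probe, rtt_ms) pair with the lowest ping time."""
    return min(per_probe.items(), key=lambda item: item[1])


combined = min_rtt_by_probe(MEASUREMENTS)
print(nearest_probe(combined["192.0.2.10"]))  # ('Amsterdam', 2.4)
```

The probe with the lowest ping time is only a crude hint at where a target sits. A real system would weigh many probes together.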

Uncovering the Truth

Network devices can fake long ping responses but they can't fake short ones. If an IPv4 address block has WHOIS information stating it is located in the US but servers in Dubai, Mumbai and Istanbul see ping times of less than 30ms, it is unlikely those addresses are being used in the US. ICMP responses from those three locations to the continental US should take at least 100ms.
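
The reasoning above can be sketched as a speed-of-light check. This is a generic technique and not necessarily how IPinfo implements it. The figure of roughly 200 km per millisecond for light in fibre is an approximation I'm assuming, the city coordinates are approximate, and the function names are my own.

```python
import math

FIBRE_KM_PER_MS = 200.0   # assumption: light in fibre covers ~200 km per ms
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def max_distance_km(rtt_ms):
    """Furthest a target can be if the round trip took rtt_ms."""
    return (rtt_ms / 2.0) * FIBRE_KM_PER_MS


def claim_is_plausible(probe, claimed, rtt_ms):
    """False when the claimed location is too far away for the ping time."""
    return haversine_km(*probe, *claimed) <= max_distance_km(rtt_ms)


ISTANBUL = (41.0082, 28.9784)   # approximate coordinates
NEW_YORK = (40.7128, -74.0060)  # approximate coordinates

print(round(max_distance_km(30)))                      # 3000
print(claim_is_plausible(ISTANBUL, NEW_YORK, 30))      # False
```

A 30ms ping caps the distance at about 3,000 km, far short of the distance from Istanbul to New York. The physical bound is a floor: real routes are longer than the great circle, so observed ping times sit above it.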

The above scenario has uncovered IP addresses being covertly used in countries under US sanctions while posing as being in other parts of the world.

Producing the Dataset

IPinfo stores all of its collected data in Google Cloud's BigQuery. As of July, a little more than 300 TB of data was being stored there. This data is synthesized down to CSV, JSON, MMDB and XML files which are MBs in size, refreshed every few weeks and distributed to both IPinfo's customers and IPinfo's API servers.

IPinfo has gone to great lengths to minimise the queries they execute against BigQuery and as a result, they spend around $4,500 / month on the service. This is one of the lowest costs per TB per month I've seen for a dataset of this size and level of importance to a business.
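
For reference, the cost per TB works out as follows, using only the two figures quoted in this post. Both are approximate, so treat the result as a ballpark.

```python
monthly_spend_usd = 4500  # "around $4,500 / month"
stored_tb = 300           # "a little more than 300 TB", so treat this as a floor

cost_per_tb_month = monthly_spend_usd / stored_tb
print(f"~${cost_per_tb_month:.2f} per TB per month")  # ~$15.00 per TB per month
```

Since the stored volume is slightly above 300 TB, the true figure is a little under $15.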

Expanding the Network

After IPinfo signed agreements with nine cloud providers, the probe network now has a fleet of 90+ VMs. Expanding this further has been an ongoing challenge as few providers have VMs located in a wide variety of geographies.

To add to this, IPinfo has had to reach out to each cloud provider's head of security ahead of time to get sign-off for the probe network. If you are running software that can ping the entire internet in less than a day, it may look as though you're running large DDoS attacks. Thankfully, the security teams have been understanding thus far.
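
To see why this traffic stands out, here is a back-of-the-envelope calculation of the sustained packet rate needed to cover the IPv4 space in one day. Using all 2^32 addresses is an upper bound I'm assuming for simplicity, since reserved and private ranges would not be scanned. The `probes_per_address` parameter is my own addition to show the effect of sending several ping types.

```python
SECONDS_PER_DAY = 24 * 60 * 60
IPV4_ADDRESSES = 2 ** 32  # upper bound; reserved ranges would be skipped


def required_pps(addresses, seconds, probes_per_address=1):
    """Average packets per second needed to probe every address in time."""
    return addresses * probes_per_address / seconds


print(round(required_pps(IPV4_ADDRESSES, SECONDS_PER_DAY)))     # 49710
print(round(required_pps(IPV4_ADDRESSES, SECONDS_PER_DAY, 4)))  # 198841
```

Tens of thousands of packets per second leaving a single VM, addressed to every network on the internet, is the kind of pattern a cloud provider's abuse tooling is built to catch.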

Thank you for taking the time to read this post. I offer both consulting and hands-on development services to clients in North America and Europe. If you'd like to discuss how my offerings can help your business, please contact me via LinkedIn.

Copyright © 2014 - 2024 Mark Litwintschik. This site's template is based off a template by Giulio Fidente.