I get a lot of questions from clients and colleagues on “the relationship between identity and the Internet.” Usually they want to know what others can learn about them from their activities on the Internet, or how they can separate their personal identities from those activities to secure their privacy, reputation, or even safety.
This blog post offers a simplified introduction to the foundation of identity on the Internet: IP addresses and domain names. These two constructs underpin your every interaction on the Internet, from turning on your mobile device to chatting with your friends or paying your bills online. Understanding exactly what they are and how they are governed is necessary for knowing the baseline of what can be learned about you from your activities on the Internet, even if you were to visit a website from a new phone and immediately destroy the phone.
Identity on the Internet involves much more than a pair of identifiers, of course. Your social media presence tells a much richer story about who you are than some IP address your ISP assigned you. The thing is, you don’t need a social media profile to interact with much of the Internet. When you do need one, you can fake much of the information to safeguard your privacy. What you can’t escape is your IP address. You might be savvy and hop on proxies, VPNs, or Tor. But you will only ever be adding layers of indirection to the trail that leads back to your device and to you, rather than eliminating the trail altogether. The best thing you can do to safeguard your privacy is to fully understand the system in which you interact and to judge the risks of your actions based upon your vantage point within that system.
So, to help you fully understand that system, let’s begin with the basics.
What constitutes identity on a network?
First, we must understand that a “network” is a group of computers, mobile devices, or other machines that communicate with one another under a common language. Each machine must have its own unique address at which it receives information from others on the network, just as your house has its own unique address at which it receives mail. Without addresses, it would be impossible to route information to the intended party. Therefore, the foundational elements of identity on a network are the addresses that enable the sharing of information among parties.
On the Internet, there are two systems that machines use to address one another. The Internet Protocol (IP) is the standard system for addressing machines on the Internet. The Domain Name System (DNS) is the standard system for assigning human-readable “domain names” or “hostnames” to IP addresses. Ultimately, machines are addressed by IP addresses. DNS enables machines to translate domain names to those numerical addresses.
How do the names and addresses work?
Whenever you visit google.com, your Web browser looks up one of the IP addresses assigned to it using the DNS protocol. Once your browser has this address, it can communicate with google.com. Several IPv4 addresses are assigned to google.com. Here’s one of them as of the time of this blog post:
Google also addresses itself on the newer IPv6: 2607:f8b0:4000:807::1004
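To make the difference between the two address formats concrete, here is a minimal sketch using Python's standard ipaddress module. It parses the IPv6 address quoted above. The IPv4 address in it, 192.0.2.1, is a reserved documentation address that I chose as a stand-in; it is not a real Google address.

```python
import ipaddress

# The IPv6 address quoted above for google.com.
addr = ipaddress.ip_address("2607:f8b0:4000:807::1004")

print(addr.version)        # IP version of this address
print(addr.max_prefixlen)  # number of bits in an address of this version
print(addr.exploded)       # the "::" shorthand written out in full

# 192.0.2.1 is a reserved documentation address, used here only as a
# stand-in to show the size of an IPv4 address.
example_v4 = ipaddress.ip_address("192.0.2.1")
print(example_v4.version, example_v4.max_prefixlen)
```

An IPv4 address is 32 bits long and an IPv6 address is 128 bits long, which is why the newer format has room for vastly more addresses.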
You have an IP address, too. Your Internet Service Provider assigned it to your Internet Access Point, which might be a modem or a mobile device, most likely using the Dynamic Host Configuration Protocol (DHCP). You can see your IP address by Googling what is my ip. Google can tell you this because you send your IP address to them every time you interact with their website. The same applies to every other service you interact with on the Internet.
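If you want to see why a service always knows the address you connect from, here is a small sketch that runs entirely on your own machine. The "service" and the "visitor" are both stand-ins that I made up for illustration, and they talk over the loopback address 127.0.0.1 rather than the real Internet.

```python
import socket

# A stand-in "service" listening on the loopback interface.
# Port 0 asks the operating system to pick any free port.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
server_port = server.getsockname()[1]

# A stand-in "visitor" connecting to the service.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(("127.0.0.1", server_port))

# accept() hands the service the visitor's address along with the connection.
connection, (visitor_ip, visitor_port) = server.accept()
print(f"The service sees a visitor at {visitor_ip}:{visitor_port}")

connection.close()
client.close()
server.close()
```

The visitor never announces its address in a message. The service learns it simply by accepting the connection, which is the same reason any website can tell you your IP address.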
When you communicate on the Internet, your device sends and receives information to and from other machines on the Internet in the form of packets. Packets are analogous to envelopes in the mailing system: they contain a message to be delivered, as well as a small “header” that explains how to route the message to the intended party and back again. Your device writes its own IP address and the IP address of the intended recipient on each of those headers in the same way you would write a return address and a recipient’s address on an envelope.
Here’s an illustration of an IPv4 packet header for clarity:
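For readers who prefer code to diagrams, the sketch below builds and then reads back a basic 20-byte IPv4 header using Python's struct module. The function names build_header and parse_header are my own, the two addresses are reserved documentation addresses, and the checksum is left at zero, so treat this as an illustration of the layout rather than a packet you could actually send.

```python
import ipaddress
import struct

# Basic IPv4 header without options: 20 bytes in network byte order.
IPV4_HEADER = struct.Struct("!BBHHHBBH4s4s")


def build_header(source: str, destination: str, payload_length: int) -> bytes:
    """Pack a simplified IPv4 header (hypothetical helper for illustration)."""
    version_and_ihl = (4 << 4) | 5  # version 4, header length of 5 32-bit words
    return IPV4_HEADER.pack(
        version_and_ihl,
        0,                    # DSCP/ECN
        20 + payload_length,  # total length of header plus payload
        0,                    # identification
        0,                    # flags and fragment offset
        64,                   # time to live
        6,                    # protocol number 6 means TCP
        0,                    # checksum, left as zero in this sketch
        ipaddress.IPv4Address(source).packed,
        ipaddress.IPv4Address(destination).packed,
    )


def parse_header(raw: bytes) -> dict:
    """Unpack the first 20 bytes of a packet into named fields."""
    (version_and_ihl, _, total_length, _, _,
     ttl, protocol, _, src, dst) = IPV4_HEADER.unpack(raw[:20])
    return {
        "version": version_and_ihl >> 4,
        "header_bytes": (version_and_ihl & 0x0F) * 4,
        "total_length": total_length,
        "ttl": ttl,
        "protocol": protocol,
        "source": str(ipaddress.IPv4Address(src)),
        "destination": str(ipaddress.IPv4Address(dst)),
    }


header = build_header("192.0.2.10", "198.51.100.7", payload_length=100)
fields = parse_header(header)
print(fields)
```

The source and destination fields are the "return address" and "recipient's address" of the envelope analogy, and they travel with every single packet.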
If you’re curious about how packets actually “ship” from your device to a recipient, do some research on the Internet Backbone, which includes Autonomous Systems and the Border Gateway Protocol (BGP). That system is largely out of scope for this blog post. However, it’s highly relevant for anyone concerned about privacy on the Internet, because every packet you send or receive through this system has the potential to be observed and collected by intermediate parties.
Who governs names and addresses?
There is an interesting history behind who governs the registration and assignment of IP addresses and domain names.
ICANN, an American non-profit organization, is responsible for the global coordination of IP address assignments and the Domain Name System (DNS). Its work grew out of IANA, a function that began in the early 1970s with a single man named Jon Postel, a.k.a. the czar of socket numbers, who also contributed significantly to the Internet Engineering Task Force (IETF). Postel began this work when the Internet was still ARPANET, before the commercialization of what we now know as the Internet. ICANN itself was formed in 1998 and has overseen the IANA functions since then.
As ARPANET grew, it became difficult for people to keep track of the assignments of IP addresses, domain names, and other identifiers on the greater “inter-network.” IANA served to standardize those identifiers and their means of distribution. Eventually, as the commercial Internet grew, it was imperative that people followed the policies and procedures IANA established, because everyone had become invested in those policies and procedures. Without those there would be no shared understanding on how to address one another, and the Internet would fail as a telecommunications infrastructure.
How does ICANN govern names and addresses?
Wait a minute. How does ICANN, a non-profit organization, even have the authority to govern the assignment of IP addresses and domain names around the world? How are its policies enforced?
Let’s take a step back and observe how names and addresses are governed today. This will help us understand why the status quo persists.
IP address governance
IP addresses are not a tangible resource. You are not given an IP address in the same way you are given a computer. You are assigned an IP address by your Internet Service Provider in the same way your home is assigned a mailing address by your municipal government. IP addresses are nothing more than a finite set of numbers, standardized by the IETF in RFC 791 (IPv4) and RFC 2460 (IPv6). These are technical standards that simply describe a protocol, or “way of doing something,” and how it should be implemented.
ICANN writes policies that govern how those numbers are distributed worldwide to Internet Service Providers (ISPs) and ultimately to consumers and organizations. ICANN designates (or “allocates”) very large portions of the IP address pool among five Regional Internet Registries (RIRs): ARIN, APNIC, LACNIC, RIPE NCC, and AFRINIC. The RIRs then allocate their shares of the IP address space among the ISPs that have registered as members within their respective regions. Finally, the ISPs assign either individual IP addresses or small blocks of IP addresses to their customers, such as households or businesses.
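The nesting of these allocations can be sketched with Python's ipaddress module. Every block below is hypothetical: I borrowed the private range 10.0.0.0/8 as a stand-in so that no real registry, ISP, or customer is implied, and the block sizes are picked only to keep the arithmetic easy to follow.

```python
import ipaddress

# Hypothetical allocations, carved out of private address space so that
# they do not describe any real registry, ISP, or customer.
registry_block = ipaddress.ip_network("10.0.0.0/8")      # stand-in for an RIR's share
isp_block = ipaddress.ip_network("10.20.0.0/16")         # stand-in for an ISP's allocation
customer_block = ipaddress.ip_network("10.20.30.0/24")   # stand-in for a business customer
customer_address = ipaddress.ip_address("10.20.30.40")   # one machine's address

# Each level sits entirely inside the level above it.
print(isp_block.subnet_of(registry_block))
print(customer_block.subnet_of(isp_block))
print(customer_address in customer_block)

# How many addresses each level has to hand out.
for label, block in [("registry", registry_block),
                     ("ISP", isp_block),
                     ("customer", customer_block)]:
    print(f"{label}: {block} holds {block.num_addresses} addresses")
```

Because each block is a slice of the one above it, anyone who knows the allocations can trace an individual address upward to the ISP and registry responsible for it.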
Domain name governance
The Domain Name System (DNS) is managed slightly differently. Unlike IP addresses, there is virtually no limit to the number of domain names that can exist, and there is also an expectation that domain names represent brands and trademarks. Consequently, there is more of a bottom-up approach to distribution in which consumers register for domain names with an ICANN-accredited registrar such as GoDaddy. When you register a domain name, the nameservers for your domain name are cataloged in the registry database for the Top-Level Domain (TLD) of your domain name: .com, .net, .org and so on. Altogether, when you request to visit www.example.net in your Web browser, your device (or, more often, a DNS resolver acting on its behalf) must iteratively find the .net nameserver, then the example.net nameserver, and finally ask that nameserver for the IP address of www.example.net. If none of these nameservers’ addresses has been cached, the lookup starts at one of the DNS Root Servers, whose IP addresses are typically stored in the resolving software itself.
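Here is a toy model of that chain of lookups. Nothing in it touches the network: the SERVERS table, the server labels, and the answer 192.0.2.80 (a reserved documentation address) are all made up for illustration, and I assume the example.net nameserver answers for www.example.net directly. Real DNS has caching, multiple record types, and many redundant servers that this sketch ignores.

```python
# Hypothetical delegation data standing in for real nameservers.
SERVERS = {
    "root": {"referrals": {"net.": "net-tld"}, "answers": {}},
    "net-tld": {"referrals": {"example.net.": "example-ns"}, "answers": {}},
    "example-ns": {"referrals": {}, "answers": {"www.example.net.": "192.0.2.80"}},
}


def resolve(name: str, start: str = "root"):
    """Follow referrals from the root until some server knows the answer."""
    server = start
    path = [server]
    while True:
        zone = SERVERS[server]
        if name in zone["answers"]:
            return zone["answers"][name], path
        # Pick the most specific zone this server can refer us to.
        matches = [suffix for suffix in zone["referrals"]
                   if name == suffix or name.endswith("." + suffix)]
        if not matches:
            raise LookupError(f"{server} cannot help with {name}")
        server = zone["referrals"][max(matches, key=len)]
        path.append(server)


address, path = resolve("www.example.net.")
print(address)
print(" -> ".join(path))
```

Each hop in the printed path is another party that learns which name you were looking for, which is worth keeping in mind when thinking about privacy.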
Breaking from the status quo
You might ask, “What’s to stop me from defecting from the system and writing software that forges the IP address of the packets leaving my device?”
Well, nothing’s to stop you from forging your own IP address. You just won’t be able to communicate if you do. Think of it as faking the return address on an envelope. The recipient won’t know any better than to send a response to the faked address. If ISPs were to change their address assignments arbitrarily, their customers would be unable to communicate with the rest of the Internet. Think of the billions of dollars that would be lost if online services were suddenly to become unreachable. Something like this actually happened when Pakistan Telecom briefly caused a global outage of YouTube by advertising an incorrect route to the service in an attempt to block it from the country.
ICANN has authority because the world is entrenched in the product of its policies. Everyone follows the governance of ICANN because everyone can expect everyone else to do the same. If some new organization were to spring up and declare an entirely new form of Internet, they would be laughed at. Who would join an Internet that nobody else is using?
This is why people care so much about IP addresses in the context of online privacy. IP addresses are the backbone of identity on the Internet, without which it would be impossible to communicate on the Internet. Manipulating an IP address to protect one’s identity, while not impossible, is challenging, expensive, and imperfect.
How do names and addresses link to personal identity?
You know now that IP addresses (and to a lesser extent domain names) are the foundational elements of identity on the Internet. And to communicate on the Internet, you must have an IP address that was assigned by an ISP, because the assignments of an ISP are legitimated through peering on the Internet Backbone.
ISPs must inevitably tie customers’ accounts to customers’ IP addresses so they can manage billing and services for those customers. The essential link between network identity and personal identity is a customer’s account, which may include names, mailing addresses, email addresses, telephone numbers, credit card numbers, checking account numbers, and anything else you might disclose to obtain service.
Generally, ISPs will relinquish this information only to law enforcement agencies, in response to a legal demand such as a subpoena concerning the individual.
A system does exist for sharing information on the registrants of IP addresses and domain names with the general public. That system is called Whois, and it operates slightly differently for IP and DNS. Its basic function is to enable anyone to obtain the contact information for the registrant of an IP address block or a domain name from a Whois server, such as the IANA Whois Server.
Whois records for IP addresses almost never disclose the identity of the end user. Typically they disclose the administrative, technical, and billing points of contact for the registrant of the IP address block to which an individual IP address belongs, rather than the assignee of the individual address. Usually the registrant of a block is either an ISP or a multinational corporation, government agency, university, or other large organization with enough funds to register whole blocks of IP addresses. Whois records are not directly useful for identifying a person behind an IP address, but they can direct you to the organization that should know that person.
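If you are curious what a Whois lookup looks like under the hood, the protocol is very simple: open a TCP connection to port 43, send the name or address followed by a line break, and read the reply until the server hangs up. The sketch below assumes the IANA server is reachable at whois.iana.org. The function names are my own, and the lookup is defined but not executed here because it needs a live network connection.

```python
import socket


def build_whois_query(subject: str) -> bytes:
    """A Whois request is just the subject followed by CRLF."""
    return subject.strip().encode("ascii") + b"\r\n"


def whois(subject: str, server: str = "whois.iana.org",
          port: int = 43, timeout: float = 10.0) -> str:
    """Send one query and return the server's plain-text reply.

    The default server name is an assumption; other registries and
    registrars run their own Whois servers.
    """
    with socket.create_connection((server, port), timeout=timeout) as conn:
        conn.sendall(build_whois_query(subject))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


# Example usage (requires network access):
# print(whois("example.net"))
```

The reply is free-form text, and a server like IANA's will usually point you to the more specific registry that holds the detailed record.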
Whois records for domain names may disclose the identity of the end user, but only if the end user registered the domain name. Unlike IP addresses, domain names are cheap and can be registered by consumers. Many people (including myself) register domain names for their small businesses or personal blogs. These people are likely to register using their personal contact information, because that is their only means of contact. Because of this, many domain name registrars will sell a Whois privatization service that masks their customers’ contact information. Corporations and other large organizations might not need this service, but individuals should pay the small fee to safeguard their privacy.
It’s worth mentioning that contact information in Whois records isn’t always reliable. Most registrars do not enforce the accuracy of contact information, and indeed, many Whois records contain bogus contact information. Since 2012, ICANN has been drafting significant changes to Whois policies to enhance the accuracy of Whois records, which could have a profound effect on Internet privacy.
Private IP addresses
A challenge in deriving personal identities from IP addresses is the fact that many public IP addresses serve many people at once. The Internet Protocol distinguishes between public and private IP addresses. IP-connected devices typically rely on at least one of each. ISPs assign public IP addresses to gateway devices, such as modems. Gateway devices assign private IP addresses to the devices on their local network or “intranet.” Devices outside the local network cannot see these addresses or send information from their own local network to a private address in another local network.
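You can check which kind of address you are looking at with Python's ipaddress module. The sample addresses below are my own picks: three from the ranges reserved for private networks and one widely used public DNS resolver address. The helper name describe is hypothetical.

```python
import ipaddress


def describe(text: str) -> str:
    """Label an address as private or public (hypothetical helper)."""
    address = ipaddress.ip_address(text)
    return "private" if address.is_private else "public"


samples = ["192.168.1.20", "10.0.0.5", "172.16.4.1", "8.8.8.8"]
labels = {text: describe(text) for text in samples}

for text, label in labels.items():
    print(f"{text} is {label}")
```

If the address your own device reports falls in a private range, then the address the rest of the Internet sees for you belongs to your gateway, not to your device.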
Everyone working in your office typically shares the same public IP address and has their own private IP address. A single public IP address could serve as the identity for hundreds or thousands of people. An investigator attempting to track down an individual based on activity by a public IP address must learn not only what organization has been assigned the public IP address, but how that organization assigns addresses internally. The investigator will only succeed if the organization has logged which machines were assigned which private IP addresses at the time of the suspicious activity, and if the investigator can get the organization to divulge those logs.
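To show why those logs matter, here is a minimal sketch of the lookup an investigator would depend on. The Lease record, the machine names, and the timestamps are all invented for illustration; real organizations keep this information in whatever format their network equipment produces, if they keep it at all.

```python
from datetime import datetime
from typing import NamedTuple, Optional


class Lease(NamedTuple):
    """One hypothetical log entry: who held a private address, and when."""
    private_ip: str
    machine: str
    start: datetime
    end: datetime


# Invented log entries. The same private address is reused by two machines.
LEASES = [
    Lease("192.168.1.20", "laptop-a",
          datetime(2024, 1, 1, 8, 0), datetime(2024, 1, 1, 12, 0)),
    Lease("192.168.1.20", "laptop-b",
          datetime(2024, 1, 1, 13, 0), datetime(2024, 1, 1, 17, 0)),
]


def machine_at(private_ip: str, moment: datetime) -> Optional[str]:
    """Return the machine that held the address at that moment, if logged."""
    for lease in LEASES:
        if lease.private_ip == private_ip and lease.start <= moment < lease.end:
            return lease.machine
    return None


print(machine_at("192.168.1.20", datetime(2024, 1, 1, 14, 0)))
print(machine_at("192.168.1.20", datetime(2024, 1, 1, 12, 30)))
```

The second lookup falls in a gap between leases, so it finds nothing. Without a log entry covering the exact moment in question, the trail ends at the organization's front door.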
This has been only a brief introduction to identity on the Internet. There are many other facets to this topic, and they will need to be covered in future blog posts. The intent of this blog post is to describe the two systems that lay the foundation of identity on the public Internet, IP and DNS, and to describe the implications of those systems for privacy and identity. I also hope this blog post inspires you to learn more about the architecture and governance of the Internet, which have set the framework for privacy in the digital age. Keep asking how things work down to the smallest detail, and you will be better prepared to make smart decisions on privacy and risk on the Internet.
Republished with permission from blog.davemoore.cc.