
Naming

Mike Wyer

What’s in a name?

Naming is (famously) one of the hardest problems in Tech, though it’s hard everywhere. As with UI/UX design, reliability at scale, or incident management, people who don’t understand the problem well enough assume that it’s simpler than it really is, or that they can do it well.

That’s not to cast shade on folks who are not experienced SREs; it’s just one of those areas where non-experts’ biases and blind spots cause them to consistently underestimate the challenges involved. No doubt this is true of many other specialisms too, and I am living in blissful ignorance of the complexity of glass blowing, or play-by-play commentary of professional wrestling.

But I want to dig a bit deeper into the science (and art) of naming within Tech, because it matters. And doing it well is hard. If we have terminology to analyse and describe what makes a bad name versus what makes a good name, that gives us some predictive power to evaluate names and naming schemes before publishing them.

And is this whole article just a cry for help from someone forced to suffer the indignities of UUIDs as identifiers? Maybe. It is why Rule 7 exists.

Since that feels important, let’s get a few bad practices out of the way first, then maybe we can dig deeper to understand what makes them bad, and how they can be improved.

Bad patterns

Using UUIDs as portable / public identifiers
They are hard to transcribe from a screenshot and provide no information about which system they came from. If you have multiple UUIDs flying around, it can be really hard to track down the owning component. Operationally, plain UUIDs are only useful as opaque, private, internal identifiers. If you insist on using them, also provide a registry where all UUIDs across your system can be resolved to their origin. And consider supporting aliases or slugs so humans don’t have to deal with UUIDs directly.
Generating random names for components in a stack
When they start throwing errors or generating log lines, you will have no way to discover which stack they belong to.
Re-using existing project names.
There are always more cool names out there. Don’t confuse people.
Using cute names for projects without telling people what they mean.
What do you mean you don’t remember Popples, which turned inside out, which is why this is an awesome name for a debugger?
Spending so long on picking a cool name and logo you forget to scope out the actual project.
Ironically, the example this is taken from was meant to be a delivery tool. There was a good 12 months after the excitement of announcing the name before any working code was produced. A good name does not always mean a good project.
Using . and other non-DNS safe characters in cloud component names so they cannot be resolved using DNS.
You might not think you want DNS resolution of your components. Your operators may disagree. Sometimes having a DNS-queryable namespace can be really useful.
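Using - (hyphen) in project names that might ever have a Python implementation.
This is completely Python’s fault, but still: a hyphen is not valid in a Python module or package name, so the import name ends up different from the project name. Sometimes you have external constraints that aren’t going away any time soon. It helps to be aware of them.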
Embedding unnecessary geographic or temporal data in the name (or domain).
davedb-us-east-1a may be running in US East 1 A today, but what if they want to run it across multiple availability zones? Or move to us-central-2 to get access to new hardware? If the locality is in the domain name, that adds to the complexity of generating SSL certificates if it ever needs to be served from somewhere else.
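
To make the UUID advice above concrete, here is a minimal sketch of a registry that records where each UUID came from and gives it a human-friendly alias. Everything in it is an assumption for illustration: the `IdRegistry` class, its `register` and `resolve` methods, and the in-memory storage are hypothetical, and a real registry would need persistence and access control.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class Origin:
    system: str  # owning component (hypothetical example: "billing")
    kind: str    # what sort of entity the UUID identifies
    alias: str   # human-friendly slug that can be read out on a call


class IdRegistry:
    """Hypothetical in-memory registry mapping UUIDs and aliases to their origin."""

    def __init__(self) -> None:
        self._by_id: dict[uuid.UUID, Origin] = {}
        self._by_alias: dict[str, uuid.UUID] = {}

    def register(self, system: str, kind: str, alias: str) -> uuid.UUID:
        """Issue a new UUID and record which system owns it."""
        if alias in self._by_alias:
            raise ValueError(f"alias {alias!r} is already taken")
        new_id = uuid.uuid4()
        self._by_id[new_id] = Origin(system, kind, alias)
        self._by_alias[alias] = new_id
        return new_id

    def resolve(self, key: str) -> tuple[uuid.UUID, Origin]:
        """Accept either an alias or a UUID string and return (UUID, origin)."""
        found = self._by_alias.get(key)
        if found is None:
            found = uuid.UUID(key)  # raises ValueError if key is neither form
        return found, self._by_id[found]
```

With something like this in place, an operator holding only a UUID from a screenshot can ask the registry which system owns it, and most humans never need to handle the UUID at all because the alias resolves to the same entity.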

Breaking it down

Naming seems simple, until you realise it isn’t.

A name can have multiple qualities / properties / attributes (eg. unique, ASCII word-characters only, contains the letters “bbq”). The utility of a particular name can depend very heavily on which qualities were considered when the name was chosen. Conversely, ignoring or violating some of these qualities can produce names which are confusing, unhelpful, and increase the cognitive load of operators.

Here are some properties a name might have:

Unambiguous
cannot be mistaken for another (similar) name.
Identifying
refers to a specific entity.
Unique
at most one entity has that name.
Discoverable
can be looked up somewhere.
Descriptive
describes the purpose or usage.
Classifying
what kind of thing is this?
Locating
where (logically or physically) is this thing located?
Obscure
identify without describing.
Unguessable
picked from a search space which is too expensive to exhaustively query.
Predictable
the name can be inferred from other qualities (eg type + location + purpose).
Humorous
puns can make work more fun.
Historical
needed to interact with legacy systems.
Pronounceable
handy if you ever need to communicate the name on a voice call.
Memorable
can be remembered by a human.
Short-term Memorable
can be accurately remembered and repeated by a human during an incident.
Portable
can be used across multiple systems.
Constrained
meets syntactic or semantic constraints (eg URL- or DNS-safe characters, ASCII, UTF-8, less than 12 characters, is not the name of a species of fish).
Durable
continues to be useful over time. This could also be phrased as “Does not contain the word ‘new’ or ‘next-generation’ or any seasonal reference”.
Scalable
still satisfies the necessary attributes when there are thousands or millions of instances. Is picked from a namespace which allows for enough names to be generated to satisfy future requirements.
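
A few of the properties above (Constrained and Durable in particular) can be checked mechanically. The sketch below is one possible set of checks, not a complete test suite: the DNS rule is the usual single-label form (1 to 63 characters, letters, digits and hyphens, no leading or trailing hyphen, lowercase by my own convention), the Python rule uses the standard `str.isidentifier()` and `keyword.iskeyword()`, and the list of short-lived words is a made-up starting point.

```python
import keyword
import re

# A single DNS label: 1-63 characters, lowercase letters, digits and hyphens,
# not starting or ending with a hyphen. Lowercase-only is a convention chosen here.
DNS_LABEL = re.compile(r"[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?")

# Hypothetical red-flag words for the "Durable" property; extend to taste.
SHORT_LIVED_WORDS = {"new", "next", "nextgen", "temp", "tmp"}


def is_dns_label(name: str) -> bool:
    """Constrained: usable as one label in a DNS name."""
    return DNS_LABEL.fullmatch(name) is not None


def is_python_importable(name: str) -> bool:
    """Constrained: usable as a Python module or package name."""
    return name.isidentifier() and not keyword.iskeyword(name)


def looks_durable(name: str) -> bool:
    """Durable: no word in the name that will age badly."""
    words = re.split(r"[-_.]", name.lower())
    return not any(word in SHORT_LIVED_WORDS for word in words)


def check_name(name: str) -> dict[str, bool]:
    """Run every check and report the result per property."""
    return {
        "dns_label": is_dns_label(name),
        "python_importable": is_python_importable(name),
        "durable": looks_durable(name),
    }
```

For example, `check_name("new-router")` passes the DNS check but fails the other two, while a plain lowercase word such as `gateway` passes all three. Properties like Memorable or Pronounceable still need a human.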

And I’m sure there are more. I’d be happy to add more properties to the list (with attribution) if folks add comments with suggestions.

It’s also important to be aware of how the name will be used. Different attributes are useful in different circumstances. Here are a few examples to make the point…

Project Naming

When starting a new project, or making significant changes to an existing one, it can help to give that undertaking a name. Some folks find it inspiring to give their project an impressive name that will strike fear into their competitors and evoke a sense of awe and pride in those lucky individuals fortunate enough to get to work on it. Others like to pick a name from Greek mythology, for a bit of supernatural and elitist kudos. And in some cases you just need a name without any connotations whatsoever, like Operation Mincemeat.

When I was revising the Incident Management process at Gousto, we called the new process IMAG (Incident Management At Gousto, heavily inspired by the same initiative at Google). That gave us a short and specific name we could use to say “We are launching IMAG on 1st of July” and “Team Carrots has completed their IMAG training” and “Please follow IMAG for incidents in future”.

I would suggest that a good project name is

  1. Unique
  2. Discoverable
  3. Pronounceable
  4. Memorable
  5. Durable

Depending on the situation, it can help to use a descriptive name. If you choose a non-descriptive name, the other qualities need to be very strong, and you have to do the work of over-communicating the project name and purpose.

Nobody knows what the project Osiris does from the name alone, so this is only a good name if the project is referred to as (say) “Osiris (our new routing backend)” until the old routing backend is decommissioned and there is only Osiris.

I have not yet seen any project called “New X” which was the last and final iteration of X. What’s the oldest bridge in Paris? The Pont Neuf (“New Bridge”). So people have been getting this wrong for centuries, but that’s no reason to continue doing it!

When I was at Google, I think there were three independent projects called Expresso. Leaving aside any spelling concerns, that still leaves the problem of having to refer to “Expresso the new routing engine” as opposed to “Expresso the config language”. I think it was Perry Lorrier (one of the finest SREs it was ever my privilege to learn from) who circulated some naming guidelines which included the phrase “If you don’t own the go/PROJ_NAME link for your project, then your project is called something else, whether you realise it or not”.

That clarified the de facto canonical namespace and discovery method for project names.
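
The rule in that quote amounts to a first-come, first-served namespace with a single lookup path. As a rough sketch (the `GoLinks` class, its methods, the team names, and the example URLs are all hypothetical, and this is not how any particular go/ link service is implemented):

```python
from typing import Optional


class GoLinks:
    """Hypothetical first-come, first-served registry of go/NAME links."""

    def __init__(self) -> None:
        # name -> (owner, target URL)
        self._links: dict[str, tuple[str, str]] = {}

    def claim(self, name: str, owner: str, url: str) -> None:
        """Register go/NAME, refusing names somebody else already owns."""
        key = name.lower()
        if key in self._links:
            current_owner, _ = self._links[key]
            raise ValueError(f"go/{key} already belongs to {current_owner}")
        self._links[key] = (owner, url)

    def owner(self, name: str) -> Optional[str]:
        """Who owns go/NAME, or None if it is unclaimed."""
        entry = self._links.get(name.lower())
        return entry[0] if entry else None

    def resolve(self, name: str) -> Optional[str]:
        """Where go/NAME points, or None if it is unclaimed."""
        entry = self._links.get(name.lower())
        return entry[1] if entry else None


links = GoLinks()
links.claim("expresso", "routing-team", "https://example.com/routing/expresso")
try:
    links.claim("Expresso", "config-team", "https://example.com/config/expresso")
except ValueError as err:
    print(err)  # the second project finds out early that the name is taken
```

The useful part is the failure: the second team learns the name is taken at the moment they try to claim it, rather than months later in a confused meeting.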

Side-note: trot.to is more respectful of the history and origin of go/ links than some other providers (who have chosen to trademark a term they didn’t create and also attack open-source solutions).

Looks like trot.to’s open source offering has been withdrawn as it didn’t lead to more paid SaaS uptake.

Given that there are great open source L7 routing/proxy solutions (like Envoy), there should be a decent open source solution for managing the config side of things.

That said, the basic functionality is pretty straightforward and used as an example in the GCP docs/tutorials.

There are also browser-based implementations (like glinks).

Resource / Component Naming

When we are naming components of a system which are going to be deployed to some tech stack somewhere, the desirable attributes are quite different.

Sometimes our platform will dictate some constraints (eg the length and contents of identifiers for cloud storage buckets, or project names, etc).

The other big factor is how the name is going to be used. What other systems, processes, people, or roles will need to be able to use, understand, resolve, or discover the name? What constraints do those systems have?

If this resource is part of a project and the project name changes, can/should the resource name also change? If not, ensure the resource naming is independent of the project name.

What information (if any) does the name need to convey? If the resource is already strongly typed, with a well-defined and easily discoverable location, then there is no point putting that information in the name of the resource.

When dealing with multiple similar stacks, where all the resources are strongly typed and located, it makes far more sense to name them by purpose or role, eg gateway rather than europe-west2-vpc-0-gateway. That way all the IaC (infrastructure-as-code) config in the stack can refer to just gateway and you don’t need to copy/paste/edit all the names and identifiers every time you deploy the stack to a new location.
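
Here is a small sketch of that idea, in Python rather than any real IaC tool so that it stays runnable. The template, the resource types, the `Deployment` class and the address format are all invented for illustration. The point is that the location lives in the deployment context, not in the resource names.

```python
from dataclasses import dataclass

# Resource names describe role only; nothing here mentions a region or VPC.
# The resource types on the right are hypothetical placeholders.
STACK_TEMPLATE = {
    "gateway": "network/gateway",
    "frontend": "compute/service",
    "orders-db": "database/postgres",
}


@dataclass(frozen=True)
class Deployment:
    location: str  # eg "europe-west2"
    stack: str     # eg "vpc-0"

    def resources(self) -> dict[str, str]:
        """Every deployment gets the same role names, whatever its location."""
        return dict(STACK_TEMPLATE)

    def address(self, role: str) -> str:
        """Derive a fully qualified address only when one is actually needed."""
        if role not in STACK_TEMPLATE:
            raise KeyError(role)
        return f"{self.location}/{self.stack}/{role}"


london = Deployment(location="europe-west2", stack="vpc-0")
iowa = Deployment(location="us-central1", stack="vpc-0")
```

Both deployments expose a resource called `gateway`, and the longer form (`london.address("gateway")`) can still be generated on demand for logs or dashboards without being baked into the name.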

A consistent, predictable / generated naming scheme can be used to isolate independent layers of the stack.

If the network layer deploys a gateway called gateway, then higher levels of the stack only need to care that there is a gateway called gateway without knowing any specific unique deployment identifier (ARN or UUID). If you are using Terraform, for example, that means the application layer can just specify “gateway” as its gateway as a static string in the config, rather than having a Terraform dependency on the outputs of the lower stack (which would be an example of fragile coupling). It’s reasonable to have a requirement on the network stack that it provisions a gateway called “gateway” without having to link the two layers any more tightly than that.
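
The contract between the two layers can then be as small as a set of agreed names. The sketch below illustrates this in Python, not Terraform: `provision_network`, the identifier format it returns, and the config keys are hypothetical stand-ins for whatever your tooling provides.

```python
# Application layer config: the gateway is referenced by its agreed name,
# a static string, not by an ARN, UUID, or output of the network stack.
APP_CONFIG = {"service": "orders", "gateway": "gateway"}

# The only contract between the layers: names the network layer must provide.
REQUIRED_FROM_NETWORK = {"gateway"}


def provision_network(location: str) -> dict[str, str]:
    """Hypothetical network layer: maps each name to a deployment-specific ID."""
    return {"gateway": f"gw-{location}-0001"}  # made-up identifier format


def missing_names(provided: dict[str, str]) -> set[str]:
    """Names the application layer relies on that the network layer lacks."""
    return REQUIRED_FROM_NETWORK - set(provided)


for location in ("europe-west2", "us-central1"):
    network = provision_network(location)
    # The deployment-specific ID changes per location; the name does not.
    assert missing_names(network) == set()
    assert APP_CONFIG["gateway"] in network
```

The application config is identical in every location. Only the network layer knows the deployment-specific identifier, and the check on names is the entire interface between the two.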

Conclusion

Or at least, the part where I stop trying to solve an impossible problem in a general way. Suffice to say:

  1. Names matter.
  2. Think about usage before deciding on a name.
  3. Consider the attributes the name needs to have, and test your names against those attributes.
  4. Generally, an internal identifier / primary key is a bad name for referring to that entity as it only satisfies a few specific attributes.