numids: yearstamp numeric unique ids too

this is a followup to @/yseq but for random numeric ids.

consider the unique ids that are used in urls such as in reddit urls or the youtube video ids. these are strings of alphanumeric characters. that gives great flexibility but strings come with some performance downsides in most programming languages. an int64 id in comparison is pretty easy to use, fast, and doesn't generate pressure on the garbage collector. and if a user ever needs to enter an id manually somewhere on a keypad, digits are always easier to type than strings (example: credit card numbers or bank account ids). i have a soft spot for int64 ids and prefer using them over strings in most cases.

there's a small caveat to that: javascript's number type is a 64 bit float rather than an int64, so it represents integers exactly only up to 2^53. to ensure javascript never garbles the id, it's best to keep the id value less than 2^50 or so. that should be still good enough for most cases. and there's no need to worry about accidentally generating a naughty word with integers.
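a quick way to see the limit: python's float is the same ieee 754 double that javascript uses for its numbers, so a round trip through float shows which ids would survive. this is only a sketch, the helper name survives_float64 is made up for the illustration.

```python
# javascript numbers are ieee 754 doubles, the same format as python's float.
MAX_SAFE = 2**53 - 1  # the value of javascript's Number.MAX_SAFE_INTEGER


def survives_float64(n: int) -> bool:
    """report whether n comes back unchanged after a trip through a double."""
    return int(float(n)) == n


print(survives_float64(2**50))      # True
print(survives_float64(2**53 + 1))  # False
```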

on the flipside int64 ids can have a high rate of collisions if ids are generated at a high rate. so relying on int64 might be a bit risky, but for posts and userids in small forums or for issue tracker ids it's more than enough. another downside could be that int64 ids are more "guessable" but this probably doesn't matter much for forum post or issue tracker ids.
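to put rough numbers on the collision risk, here is a sketch using the usual birthday approximation. it assumes the ids are drawn uniformly at random, and the function name is made up for the illustration.

```python
import math


def collision_probability(n_ids: int, space: int) -> float:
    """approximate chance that n_ids uniformly random ids drawn from
    `space` possible values contain at least one duplicate."""
    # birthday approximation: 1 - exp(-n(n-1) / 2N)
    return -math.expm1(-n_ids * (n_ids - 1) / (2 * space))


# 1000 ids drawn from the 5 digit space (90'000 values) vs the 2^50 space.
print(collision_probability(1000, 90_000))
print(collision_probability(1000, 2**50))
```

the first number is close to 1 and the second is tiny, which is why short ids need a uniqueness check with a retry while long ones almost never trigger it.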

id length

how big should the id be?

i really love short ids. if the id is short, i can even remember it. e.g. if in my project a contentious issue has a memorable 4 digit id, i might remember it and look it up directly via id rather than always searching for it.

context: i love to type urls from memory perfectly. i never rely on autocompletion or history completion. i have relatively good memory for this. some websites handle this quite well thanks to their simple url structure. some are terrible. but if i create a website, i want it to have a simple url structure.

keep the id length short if the system doesn't generate a lot of ids. but do vary the length. some ids should be 5 digits long, some 7 digits. this way nobody can rely on a specific length. furthermore the id length can simply grow if there are many collisions during generation. this way the system handles an increased id pressure gracefully.
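a minimal sketch of such a generator. the names and the retry count are arbitrary picks for the illustration, and the in-memory set stands in for whatever storage does the uniqueness check in a real system.

```python
import random


def new_id(used: set[int], min_digits: int = 5, max_tries: int = 3) -> int:
    """pick an unused random id, growing the length after repeated collisions."""
    # vary the length: with the defaults an id starts at 5, 6 or 7 digits.
    digits = min_digits + random.randrange(3)
    while True:
        lo = 10 ** (digits - 1)
        for _ in range(max_tries):
            candidate = random.randrange(lo, 10 * lo)
            if candidate not in used:
                used.add(candidate)
                return candidate
        # too many collisions at this length, so move on to longer ids.
        digits += 1


used: set[int] = set()
print([new_id(used) for _ in range(5)])
```

for robot generated ids the same function could be called with a larger min_digits so that they stay out of the short id space.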

perhaps distinguish id length for humans and robots. if an alerting system creates automated tickets, give those tickets long ids. this way robots don't eat up the short id space that humans prefer.

yearstamping

in @/yseq i explained my love for putting some date information into the ids. the same can be done here too. append the last two year digits to the end of the id. so an id like 12323 means it's an id from 2023. or use the last 3 digits if worried about the year 2100 problem. e.g. 123023 for an id from 2023.

it needs to be a suffix because the id length is variable. putting it at the end means both the generation and extraction of this piece of data remains trivial programmatically.
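the suffix variant boils down to a multiplication and a modulo. a sketch with made up function names, where width is the number of year digits to keep:

```python
def stamp_year(random_part: int, year: int, width: int = 2) -> int:
    """append the last `width` digits of the year to the random part."""
    scale = 10**width
    return random_part * scale + year % scale


def split_year(stamped: int, width: int = 2) -> tuple[int, int]:
    """return (random part, year digits) of a yearstamped id."""
    return divmod(stamped, 10**width)


print(stamp_year(123, 2023))           # 12323
print(stamp_year(123, 2023, width=3))  # 123023
print(split_year(12323))               # (123, 23)
```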

yearstamping also reduces the chance for collisions. a new id can only collide with other ids from this year. this can make the uniqueness check a bit faster.

it also allows the administrators to operate on old ids easily. for instance they can use a glob like "*23" to select all ids from 2023 for archiving.
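the same selection can be sketched with python's fnmatch module. the helper name is hypothetical and the ids are assumed to carry a two digit year suffix.

```python
from fnmatch import fnmatch


def ids_from_year(ids: list[int], year: int) -> list[int]:
    """select the ids whose two digit suffix matches the given year."""
    pattern = f"*{year % 100:02d}"
    return [i for i in ids if fnmatch(str(i), pattern)]


print(ids_from_year([12323, 98724, 4523, 230124], 2023))  # [12323, 4523]
```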

edit after some experience: prepend the timestamp if possible. it's much easier to read the stamp for human eyes. and also allows easy sorting too.
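a sketch of the prefix variant. it keeps the id as a string here, which is an assumption of this example rather than a requirement: that way a plain string sort groups the ids by year even though their lengths differ.

```python
def prefix_year(random_part: int, year: int) -> str:
    """put the two digit year in front of the random part."""
    return f"{year % 100:02d}{random_part}"


ids = [prefix_year(871, 2025), prefix_year(12345, 2023), prefix_year(42, 2024)]
print(sorted(ids))  # ['2312345', '2442', '25871']
```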

daystamping

in case you are doing full alphanumeric ids, then you can easily daystamp too. you can encode the current month and day with two characters. use "abc def ghi jkl" for the months, "1234567 89abcde fghijkl mnopqrs tuv" for the day of the month. so "25k3" would encode 2025-11-03 or "24bt" would encode 2024-02-29. the year part is always obvious because this scheme uses letters for the month.
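the two alphabets translate directly into lookup strings. a sketch with made up function names; the century default in the parser is an assumption since the stamp only carries two year digits.

```python
import datetime

MONTHS = "abcdefghijkl"                   # a = january ... l = december
DAYS = "123456789abcdefghijklmnopqrstuv"  # 1 = 1st ... v = 31st


def daystamp(d: datetime.date) -> str:
    """encode a date as two year digits, a month letter and a day character."""
    return f"{d.year % 100:02d}{MONTHS[d.month - 1]}{DAYS[d.day - 1]}"


def parse_daystamp(s: str, century: int = 2000) -> datetime.date:
    """decode the first four characters of a daystamped id."""
    year = century + int(s[:2])
    month = MONTHS.index(s[2]) + 1
    day = DAYS.index(s[3]) + 1
    return datetime.date(year, month, day)


print(daystamp(datetime.date(2025, 11, 3)))  # 25k3
print(parse_daystamp("25k3"))                # 2025-11-03
```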

alternatively you can use the full A-Za-z character set which works out to 52 characters. combining two such characters works out to 52*52 = 2'704 numbers. instead of the 4 character daystamp you can use a 2 character monthstamp and the ids will last for 2'704/12 = 225 years. the only downside is that they won't be human readable.
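one way this could look: count the months since some base year and write that number in base 52. the base year 2025 and the function names are assumptions of this sketch. uppercase comes first in the alphabet so that a byte-wise string sort follows chronological order.

```python
import string

ALPHABET = string.ascii_uppercase + string.ascii_lowercase  # 52 characters
BASE_YEAR = 2025  # hypothetical first year of the scheme


def monthstamp(year: int, month: int) -> str:
    """encode a year and month as two base 52 characters."""
    n = (year - BASE_YEAR) * 12 + (month - 1)
    if not 0 <= n < 52 * 52:
        raise ValueError("date outside of the representable range")
    return ALPHABET[n // 52] + ALPHABET[n % 52]


def parse_monthstamp(s: str) -> tuple[int, int]:
    """decode a two character monthstamp into (year, month)."""
    n = ALPHABET.index(s[0]) * 52 + ALPHABET.index(s[1])
    years, months = divmod(n, 12)
    return BASE_YEAR + years, months + 1


print(monthstamp(2025, 11))    # AK
print(parse_monthstamp("AK"))  # (2025, 11)
```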

comparison to yseq

the downside of @/yseq is that the id length must remain static if the users want to use it to compare events chronologically via the less-than operator over the id numbers. no such length restriction on random ids because such comparison intentionally doesn't make sense unless two ids are chronologically far away. with sequential ids users often try to farm sequential ids to grab the round or nice numbers. no such incentive with random numbers.

go with the random ids unless the ids need to be able to express a chronological relationship between them. use an int50 id if you don't expect to need many ids (e.g. less than a million per year) for javascript compatibility.

letterbase (added on 2026-06-06)

if you need to stick to letters for some reason, then i recommend using letterbase like this:

012 345 678 9012
abc def ghi jklm

suppose you have a YYM prefixed id. convert YY into two letterbase digits. the month fits into a single digit because letterbase goes beyond the decimal digits: use c for february because it's month 2, m for december because it's month 12. 2026-July would be encoded as cgh (from 26-7), 2090-December would be jam, 2050-June would be fag, 2100-January could be kab.

once you learn to read letterbase, then these ids become just as readable as decimal numbers.
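a sketch of the YYM encoding with made up function names. counting the years from 2000 is an assumption taken from the kab example, where the first digit goes past 9 for the year 2100.

```python
LETTERS = "abcdefghijklm"  # a = 0, b = 1 ... j = 9, k = 10, l = 11, m = 12


def letterbase_yym(year: int, month: int) -> str:
    """encode a year and month as three letterbase digits."""
    offset = year - 2000
    if not (0 <= offset < 130 and 1 <= month <= 12):
        raise ValueError("date outside of the representable range")
    tens, ones = divmod(offset, 10)
    return LETTERS[tens] + LETTERS[ones] + LETTERS[month]


def parse_letterbase_yym(s: str) -> tuple[int, int]:
    """decode three letterbase digits into (year, month)."""
    tens, ones, month = (LETTERS.index(c) for c in s[:3])
    return 2000 + tens * 10 + ones, month


print(letterbase_yym(2026, 7))      # cgh
print(letterbase_yym(2100, 1))      # kab
print(parse_letterbase_yym("jam"))  # (2090, 12)
```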

edits

2024-03-22: originally i argued that weekstamping should be done at the end. now i recommend to weekstamp the beginning. then a simple string sort gives a semi-sorted order of ids which is neat.
2025-04-06: added the monthstamping section.
2025-11-03: replaced the week and monthstamping sections with the much cleaner daystamping one and added a recommendation to prepend the id.
2026-06-06: added the letterbase section.

Published on 2024-03-01, last modified on 2026-06-06.
