Generate unique words based on computer time

Updated Gribouillis 0 Tallied Votes 502 Views Share

This snippet defines a function returning new identifiers created from reading the computer's time.

#!/usr/bin/env python
# -*-coding: utf8-*-
# Title: newword.py
# Author: Gribouillis for the python forum at www.daniweb.com
# Created: 2015 March 21
# License: Public Domain
# Use this code freely.
# History:
#   0.3.1 bugfix
#   0.3.0 The 3 first letters of words now indicate the day the word
#         was created. Words can now have more than 11 characters
#         (currently 3 + 11 = 14) which should allow a better time
#         granularity if a faster machine is used. Generated words
#         will be ordered until year 2354
#   0.2.2 bugfix
#   0.2.1 sort letters to generate ordered words
#   0.2.0 bugfix for python 3
#   0.1.0 first version
'''This module defines a function to generate unique words
'''
from __future__ import (absolute_import, division,
                        print_function, unicode_literals)

__version__ = '0.3.1'

class new_word(object):
    from struct import pack, unpack
    from string import ascii_letters
    ascii_letters = str('').join(sorted(ascii_letters))
    import time
    from datetime import datetime
    origin = datetime(year=1970, month=1, day=1)
    day_length = 24 * 3600.0
    
    if hasattr(time, 'perf_counter'):
        perf_counter = time.perf_counter
    else:
        import timeit
        perf_counter = timeit.default_timer
    
    def int_from_float(self, x):
        u, v = self.unpack(">LL", self.pack(">d", x))
        return (u << 32) | v
    
    def __init__(self):
        self.day_init()
        self.result = ''
    
    def day_init(self):
        now = self.datetime.now()
        today = self.datetime(year=now.year, month=now.month, day=now.day)
        days = (today - self.origin).days
        self.prefix = self.days_prefix(days)
        self.offset = (now - today).total_seconds()
        self.zero = self.perf_counter()
    
    def days_prefix(self, n):
        L = [None, None, None]
        for i in range(3):
            n, r = divmod(n, 52)
            L[2-i] = self.ascii_letters[r]
        return ''.join(L)
    
    def new_int(self):
        while True:
            c = self.perf_counter()
            offset = self.offset + (c - self.zero)
            if offset > self.day_length:
                self.day_init()
            elif offset > self.offset:
                self.zero = c
                self.offset = offset
                return self.int_from_float(offset)
    
    def gen_letter(self):
        ivalue = self.new_int()
        for i in range(11):
            ivalue, r = divmod(ivalue, 52)
            yield self.ascii_letters[r]
    
    def new_word(self, nchar=14):
        """new_word(nchar=14) --> a new word containing only ascii letters
        
        The word is a unique str instance, based on the machine time. Unlikely
        collisions can occur if the computer's clock is reset to a time in
        the past.
        
        new in version 0.3.0:
            argument nchar indicates maximum number of characters (up to 14)
        
        Example:
        
            >>> new_word()
            'GLAgSFyTODBfRd'
            
        Remark:
        
            The output can be reverse-engineered to discover when the
            word was generated, according to the generating computer's clock.
        """
        while True:
            result = (self.prefix + str('').join(self.gen_letter())[::-1])[:nchar]
            if result > self.result:
                self.result = result
                return result

new_word = new_word().new_word

if __name__ == '__main__':
    L = [] 
    for i in range(5):
        L. append(new_word())
    for w in L:
        print(w)

""" my output -->
GLAgSGaBhFOkxy
GLAgSGaBhFqvOe
GLAgSGaBhGGbKu
GLAgSGaBhGSVMm
GLAgSGaBhGcKLK
"""
Gribouillis 1,391 Programming Explorer Team Colleague

Version 0.2.0 is a minor bugfix for python >= 3.3!

Gribouillis 1,391 Programming Explorer Team Colleague

Version 0.2.1 sorts ascii letters to generate ordered words (at least as long as generated words have 11 characters). This is particularly useful if this code is used to generate file names in a directory.

ddanbe 2,724 Professional Procrastinator Featured Poster

Nice snippet. :)
Is there any particular reason for 11 characters?

Gribouillis 1,391 Programming Explorer Team Colleague

is there any particular reason for 11 characters?

No, it is due to the granularity of floating numbers on a 64 bits processor :)

Gribouillis 1,391 Programming Explorer Team Colleague

@ddanbe I thought again about your question, and it occured to me that I was losing some granularity by converting a number of seconds since the epoch into a string. Indeed, the new function perf_counter() in python can potentially measure very small fractions of a second.

So, version 0.3.0 implements a new strategy: the number of seconds since the beginning of the day is converted to an 11 characters string and a 3 characters prefix is prepended for the day, which means that the function can work until year 2354.

Theoretically, having more characters means that more unique names can be generated in a very short time. Of course this code is actually slow, because it is in pure python, but it could be a first step towards a fast compiled function to do this :)

Thank you for your support !

Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.