cdata-structuresdictionary

Quick Way to Implement Dictionary in C


One of the things which I miss while writing programs in C is a dictionary data structure. What's the most convenient way to implement one in C? I am not looking for performance, but ease of coding it from scratch. I don't want it to be generic either -- something like char*int will do. But I do want it to be able to store an arbitrary number of items.

This is intended more as an exercise. I know that there are 3rd party libraries available which one can use. But consider for a moment, that they don't exist. In such a situation what's the quickest way you can implement a dictionary satisfying the above requirements.


Solution

  • Section 6.6 of The C Programming Language presents a simple dictionary (hashtable) data structure. I don't think a useful dictionary implementation could get any simpler than this. For your convenience, I reproduce the code here.

    struct nlist { /* table entry: */
        struct nlist *next; /* next entry in chain */
        char *name; /* defined name */
        char *defn; /* replacement text */
    };
    
    #define HASHSIZE 101
    static struct nlist *hashtab[HASHSIZE]; /* pointer table */
    
    /* hash: form hash value for string s */
    unsigned hash(char *s)
    {
        unsigned hashval;
        for (hashval = 0; *s != '\0'; s++)
          hashval = *s + 31 * hashval;
        return hashval % HASHSIZE;
    }
    
    /* lookup: look for s in hashtab */
    struct nlist *lookup(char *s)
    {
        struct nlist *np;
        for (np = hashtab[hash(s)]; np != NULL; np = np->next)
            if (strcmp(s, np->name) == 0)
              return np; /* found */
        return NULL; /* not found */
    }
    
    char *strdup(char *);
    /* install: put (name, defn) in hashtab */
    struct nlist *install(char *name, char *defn)
    {
        struct nlist *np;
        unsigned hashval;
        if ((np = lookup(name)) == NULL) { /* not found */
            np = (struct nlist *) malloc(sizeof(*np));
            if (np == NULL || (np->name = strdup(name)) == NULL)
              return NULL;
            hashval = hash(name);
            np->next = hashtab[hashval];
            hashtab[hashval] = np;
        } else /* already there */
            free((void *) np->defn); /*free previous defn */
        if ((np->defn = strdup(defn)) == NULL)
           return NULL;
        return np;
    }
    
    char *strdup(char *s) /* make a duplicate of s */
    {
        char *p;
        p = (char *) malloc(strlen(s)+1); /* +1 for ’\0’ */
        if (p != NULL)
           strcpy(p, s);
        return p;
    }
    

    Note that if the hashes of two strings collide, it may lead to an O(n) lookup time. You can reduce the likelihood of collisions by increasing the value of HASHSIZE. For a complete discussion of the data structure, please consult the book.