c linux memory-management

Where does `getpwuid` allocate memory from?


Edited on Oct 23

I would like to understand where the function getpwuid allocates memory from. I have some sample code that prints the username for the user id passed to the program.

I read the manual page for getpwuid and it says:

The return value may point to a static area, and may be overwritten by subsequent calls to getpwent(3), getpwnam(), or getpwuid(). (Do not pass the returned pointer to free(3).)

I read that the static area in the memory layout of the process contains the text, initialized data and uninitialized data. But the returned address is not in any of these regions (as far as I can understand - from looking at the boundary of these regions from etext, edata and end).

I have the following questions:

  1. I'm unable to understand who is allocating memory for the username string (and the six other fields in struct passwd). Who is responsible for freeing it?
  2. How can the compiler possibly know how long the username, password and other fields are going to be, so it can allocate the memory statically?
  3. Why is pwd at 0x7f2b3c7aba60, and why is pwd->pw_name at 0x5646b174e2a0?
#include <pwd.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>

extern char etext, edata, end;

char* userNameFromId(uid_t uid)
{
    struct passwd *pwd;
    pwd = getpwuid(uid);
    if (pwd == NULL)   /* check before dereferencing: no such user, or an error */
        return NULL;
    printf("pwd is located at             %10p\n", (void *)pwd);
    printf("pw_name is located at         %10p\n", (void *)pwd->pw_name);
    return pwd->pw_name;
}

int main(int argc, char** argv)
{
    uid_t u;
    char* endptr = NULL;
    char* name;
    if(argc!=2){
        printf("Usage: %s [user_id]\n", argv[0]);
        return -1;
    }
    u = strtol(argv[1], &endptr, 10);
    if(*endptr!='\0') {
        printf("%s is not a number\nUsage: %s [user_id]\n", argv[1], argv[0]);
        return -1;
    }
    name = userNameFromId(u);
    if(name == NULL) {
        printf("No user was found with the given id: %s\n", argv[1]);
        return -1;
    }
    printf("program text ends before      %10p\n", &etext);
    printf("initialized data ends before  %10p\n", &edata);
    printf("uninitializd data ends before %10p\n", &end);
    printf("name is located at            %10p\n", &name);
    printf("program break is located at   %10p\n", sbrk(0));
    printf("User name for id %d is %s\n", u, name);
    
    FILE *file = fopen("/proc/self/maps", "r");
    if (file == NULL) {
        perror("Error opening file");
        return -1;
    }
    char buffer[1024];
    while (fgets(buffer, sizeof(buffer), file) != NULL) {
        printf("%s", buffer);
    }
    fclose(file);
    return 0;
}

Upon executing this program, it prints something like the following:

pwd is located at             0x7f2b3c7aba60
pw_name is located at         0x5646b174e2a0
program text ends before      0x5646b1272555
initialized data ends before  0x5646b1275010
uninitializd data ends before 0x5646b1275018
name is located at            0x7ffe9895c0a0
program break is located at   0x5646b176f000
User name for id 1000 is rragavendrak
5646b1271000-5646b1272000 r--p 00000000 08:20 7068                       /home/rranjithkuma/a.out
5646b1272000-5646b1273000 r-xp 00001000 08:20 7068                       /home/rranjithkuma/a.out
5646b1273000-5646b1274000 r--p 00002000 08:20 7068                       /home/rranjithkuma/a.out
5646b1274000-5646b1275000 r--p 00002000 08:20 7068                       /home/rranjithkuma/a.out
5646b1275000-5646b1276000 rw-p 00003000 08:20 7068                       /home/rranjithkuma/a.out
5646b174e000-5646b176f000 rw-p 00000000 00:00 0                          [heap]
7f2b3c587000-7f2b3c58a000 rw-p 00000000 00:00 0 
7f2b3c58a000-7f2b3c5b2000 r--p 00000000 08:20 2282                       /usr/lib/x86_64-linux-gnu/libc.so.6
7f2b3c5b2000-7f2b3c747000 r-xp 00028000 08:20 2282                       /usr/lib/x86_64-linux-gnu/libc.so.6
7f2b3c747000-7f2b3c79f000 r--p 001bd000 08:20 2282                       /usr/lib/x86_64-linux-gnu/libc.so.6
7f2b3c79f000-7f2b3c7a0000 ---p 00215000 08:20 2282                       /usr/lib/x86_64-linux-gnu/libc.so.6
7f2b3c7a0000-7f2b3c7a4000 r--p 00215000 08:20 2282                       /usr/lib/x86_64-linux-gnu/libc.so.6
7f2b3c7a4000-7f2b3c7a6000 rw-p 00219000 08:20 2282                       /usr/lib/x86_64-linux-gnu/libc.so.6
7f2b3c7a6000-7f2b3c7b3000 rw-p 00000000 00:00 0 
7f2b3c7b8000-7f2b3c7ba000 rw-p 00000000 00:00 0 
7f2b3c7ba000-7f2b3c7bc000 r--p 00000000 08:20 2279                       /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
7f2b3c7bc000-7f2b3c7e6000 r-xp 00002000 08:20 2279                       /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
7f2b3c7e6000-7f2b3c7f1000 r--p 0002c000 08:20 2279                       /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
7f2b3c7f2000-7f2b3c7f4000 r--p 00037000 08:20 2279                       /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
7f2b3c7f4000-7f2b3c7f6000 rw-p 00039000 08:20 2279                       /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
7ffe9893d000-7ffe9895e000 rw-p 00000000 00:00 0                          [stack]
7ffe989b1000-7ffe989b5000 r--p 00000000 00:00 0                          [vvar]
7ffe989b5000-7ffe989b7000 r-xp 00000000 00:00 0                          [vdso]

Here is some information about my system:

rragavendrak@DESKTOP-JJOG9GH:~$ uname -a
Linux DESKTOP-JJOG9GH 5.15.153.1-microsoft-standard-WSL2 #1 SMP Fri Mar 29 23:14:13 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

Solution

    1. I'm unable to understand who is allocating memory for the username string (and the six other fields in struct passwd). Who is responsible for freeing it?

    The library used is responsible for any allocations and deallocations needed.

    Use ldd --version to see which implementation of the C standard library your system is using. Most distributions use either glibc or musl (with glibc-based distributions in the vast majority).

    glibc's getpwuid uses malloc and realloc in combination with a static struct passwd.

    The actual code is full of macros but it looks something like this after preprocessing:

    static char *buffer;
    
    struct passwd *getpwuid(uid_t uid) {
        static size_t buffer_size;
        static struct passwd resbuf;
        struct passwd *result;
    
        if (buffer == NULL) { // first call to getpwuid, allocate 1024 bytes
            buffer_size = 1024;
            buffer = (char *)malloc(buffer_size);
        }
    
        while (
            buffer != NULL &&
            (__getpwuid_r(uid, &resbuf, buffer, buffer_size, &result) == ERANGE)) 
        {
            // not enough space in buffer, realloc:
            char *new_buf;
            buffer_size *= 2;
            new_buf = (char *)realloc(buffer, buffer_size);
            if (new_buf == NULL) {
                free(buffer);
                ((*__errno_location()) = (ENOMEM));
            }
            buffer = new_buf;
        }
    
        if (buffer == NULL) result = NULL;
    
        return result;
    }
    

    The __getpwuid_r function uses buffer to store the strings that resbuf points to, provided buffer_size (initially 1024) is large enough. If it is not, __getpwuid_r fails with ERANGE. The while loop in getpwuid then doubles buffer_size, calls realloc, and calls __getpwuid_r again, repeating until the call succeeds or realloc fails.
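
    The same retry loop can be written in application code with the public POSIX function getpwuid_r, in which case the caller owns the buffer instead of the library. The following is only a minimal sketch under that assumption: the helper name user_name_dup and the choice to return a malloc'd copy of the name are illustrative and do not appear in any library.

```c
#include <errno.h>
#include <pwd.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: returns a malloc'd copy of the user name for uid,
 * or NULL if there is no such user or an error occurred.
 * The caller owns the returned string and must free() it. */
char *user_name_dup(uid_t uid)
{
    size_t size = 1024;            /* same starting size as the sketch above */
    char *buf = malloc(size);
    struct passwd pw;
    struct passwd *result = NULL;
    char *name = NULL;
    int rc;

    if (buf == NULL)
        return NULL;

    /* Retry with a doubled buffer for as long as the entry does not fit. */
    while ((rc = getpwuid_r(uid, &pw, buf, size, &result)) == ERANGE) {
        char *bigger;

        size *= 2;
        bigger = realloc(buf, size);
        if (bigger == NULL) {
            free(buf);
            return NULL;
        }
        buf = bigger;
    }

    /* rc == 0 with result == NULL means "no matching entry". */
    if (rc == 0 && result != NULL) {
        size_t len = strlen(result->pw_name) + 1;

        name = malloc(len);
        if (name != NULL)
            memcpy(name, result->pw_name, len);
    }

    free(buf);                     /* pw's strings lived in buf; name is a copy */
    return name;
}
```

    Because the string storage belongs to the caller here, nothing is overwritten by later lookups, at the cost of one extra copy and an explicit free().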

    The strings pointed to by resbuf (and result) are all stored on the heap, in the area buffer points to. The same area is reused for every call to getpwuid until you query a uid whose strings do not fit in buffer_size bytes, at which point the buffer is reallocated to a larger size. When the program exits, buffer_size is therefore the largest size that was needed during the run.

    There's a weak_alias (buffer, FREEMEM_NAME) in the source that may indicate that it will actually call free when the program exits - but that's not very important. What's important is that only one area is ever allocated and it is reused for all calls, so it will not leak and run out of memory no matter how many times you call getpwuid.
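
    One observable consequence of this reuse is that consecutive calls typically return the same struct passwd address. The sketch below checks that; the helper name getpwuid_reuses_struct is made up for illustration, and POSIX only says the result may point to a static area, so a return value of 1 is what glibc and musl are expected to give rather than a guarantee.

```c
#include <pwd.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical helper: returns 1 if two consecutive getpwuid() calls hand
 * back the same struct passwd address, 0 if the addresses differ, and -1
 * if either lookup fails. */
int getpwuid_reuses_struct(uid_t first, uid_t second)
{
    struct passwd *a = getpwuid(first);
    struct passwd *b;

    if (a == NULL)
        return -1;

    /* This call may overwrite the data 'a' points to; only the address
     * of 'a' is used from here on, never its contents. */
    b = getpwuid(second);
    if (b == NULL)
        return -1;

    return a == b;
}
```

    This is also why a pw_name pointer that must outlive the next getpwuid, getpwnam or getpwent call should be copied first.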


    I know of only a few Linux distributions that use musl instead of glibc. Although musl's code is very different from glibc's version, it uses the same approach, where line points at the heap-allocated string storage and size holds the number of bytes currently allocated:

    static char *line;
    static struct passwd pw;
    static size_t size;
    
    struct passwd *getpwuid(uid_t uid)
    {
            struct passwd *res;
            __getpw_a(0, uid, &pw, &line, &size, &res);
            return res;
    }
    

    Here __getpw_a internally calls __nscd_query to get the length of all the struct passwd strings and then calculates the exact length needed to store them. If the size is too small, it will call realloc to make room for this entry. At the end of the program run, only one heap allocated buffer with the max size needed during the program run will be left.


    2. How can the compiler possibly know how long the username, password and other fields are going to be, so it can allocate the memory statically?

    It can't, in general. If the standard C library in use is written for an environment where there is a known upper limit on the length of each string, it can use a static char buffer[sum_of_the_longest_strings_allowed] for storage. Knowing the longest strings allowed at compile time may however not be possible when the database that actually stores the information is located "elsewhere" (like NIS or LDAP). The getpwuid implementation would have to know the longest lengths permitted by all the possible backend systems, which may be hard on anything but embedded systems.
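
    To illustrate the fixed-limit variant, here is a toy sketch that is not taken from any real libc. The limit TOY_NAME_MAX, the struct toy_passwd type, the in-memory toy_db table and the function toy_getpwuid are all hypothetical stand-ins.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical compile-time limit on the length of a user name. */
#define TOY_NAME_MAX 32

struct toy_passwd {
    char *name;
    unsigned uid;
};

/* Hypothetical stand-in for the account database. */
static const struct {
    unsigned uid;
    const char *name;
} toy_db[] = {
    { 0, "root" },
    { 1000, "alice" },
};

/* Returns a pointer to static storage that every call overwrites, or NULL
 * if uid is unknown or the stored name exceeds the fixed limit. */
struct toy_passwd *toy_getpwuid(unsigned uid)
{
    static char name_buf[TOY_NAME_MAX + 1];   /* sized at compile time */
    static struct toy_passwd result;
    size_t i;

    for (i = 0; i < sizeof toy_db / sizeof toy_db[0]; i++) {
        if (toy_db[i].uid != uid)
            continue;
        if (strlen(toy_db[i].name) > TOY_NAME_MAX)
            return NULL;                      /* would not fit */
        strcpy(name_buf, toy_db[i].name);     /* length checked above */
        result.name = name_buf;
        result.uid = uid;
        return &result;
    }
    return NULL;
}
```

    With this design both the struct and the strings live in static storage, so no heap allocation is involved, but any entry longer than the built-in limit simply cannot be returned.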

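    3. Why is pwd at 0x7f2b3c7aba60, and why is pwd->pw_name at 0x5646b174e2a0?

    pwd is pointing at a static struct passwd (resbuf in the above implementation) while pwd->pw_name is pointing inside the heap allocated buffer (where static char *buffer points in the implementation). Your /proc/self/maps output agrees with this: 0x5646b174e2a0 lies inside the 5646b174e000-5646b176f000 [heap] mapping, and 0x7f2b3c7aba60 lies inside the anonymous rw-p mapping 7f2b3c7a6000-7f2b3c7b3000 that directly follows libc.so.6's writable data, which is most likely where the library keeps its own static variables. The etext, edata and end symbols only describe the segments of your executable, not those of the shared libraries it loads, so a static object inside libc falls outside the range they mark.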
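
    To check which mapping an address belongs to without reading the listing by eye, something along these lines could be used. It is a rough, Linux-only sketch: the helper name mapping_name is made up, and it assumes each line of /proc/self/maps fits in the 1024-byte line buffer.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (Linux only): finds the /proc/self/maps entry that
 * contains addr and copies its name, e.g. "[heap]" or a library path, into
 * name. Anonymous mappings yield an empty string.
 * Returns 0 on success, -1 if no mapping matched or the file is
 * unavailable. */
int mapping_name(const void *addr, char *name, size_t name_size)
{
    FILE *maps;
    char line[1024];
    uintptr_t target = (uintptr_t)addr;
    int found = -1;

    if (name_size == 0)
        return -1;

    maps = fopen("/proc/self/maps", "r");
    if (maps == NULL)
        return -1;

    while (fgets(line, sizeof line, maps) != NULL) {
        unsigned long start, end;
        char path[512] = "";

        /* Line format: start-end perms offset dev inode [pathname] */
        if (sscanf(line, "%lx-%lx %*s %*s %*s %*s %511[^\n]",
                   &start, &end, path) < 2)
            continue;

        if (target >= start && target < end) {
            snprintf(name, name_size, "%s", path);
            found = 0;
            break;
        }
    }

    fclose(maps);
    return found;
}
```

    Calling it with pwd and with pwd->pw_name from the program in the question would be expected to report an empty name (anonymous mapping) for the former and [heap] for the latter on a glibc system.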