The objcopy tool makes it easy to embed arbitrary files into an ELF executable:
objcopy --add-section program.file1=file1.dat \
--add-section program.file2=file2.dat \
program program+files
It seems to me that it should be possible for program+files to access file1 and file2 programmatically without opening and reading any external files. However, there seems to be no easy way to obtain this information from within the running program.
The files were added as named sections of the ELF executable. However, Linux only loads the segments described by the ELF program header table. The added sections are not covered by any of those segments, since they are not necessary for execution.
So while it is possible to obtain a pointer to the currently running program's ELF header, it is pointless since the sections were not loaded at all.
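#include <elf.h>       // Elf64_Ehdr, Elf64_Shdr
#include <stdint.h>    // uintptr_t
#include <sys/auxv.h>  // getauxval, AT_PHDR
// assumes the program headers share a 4 KiB page with the ELF header
uintptr_t address = getauxval(AT_PHDR) & -4096;
Elf64_Ehdr *elf = (Elf64_Ehdr *) address;
// dangling pointer, sections aren't loaded by the OS
Elf64_Shdr *sections = (Elf64_Shdr *) ((unsigned char *) elf + elf->e_shoff);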
My intention was to search the sections by name at runtime, find the ones prefixed by "program." and compute pointers to them so that my code can use them like ordinary memory blocks.
I can't use predefined symbols for this because I want to support an arbitrary number of embedded files, including no embedded file at all. I need to look up these sections at runtime.
Linux will only load segments marked with PT_LOAD. Can these sections be placed in PT_LOAD segments somehow? objcopy does not seem to have the ability to edit the program header table and add new PT_LOAD segments. How would one go about doing that?
My intention was to search the sections by name at runtime, find the ones prefixed by "program." and compute pointers to them so that my code can use them like ordinary memory blocks.
You can find the program on disk (using /proc/self/exe), mmap it [1], decode the section headers (see this answer) and then compute pointers to the sections of interest and use them as you wish.
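The steps above could be sketched in C roughly as follows. This is only an illustration under stated assumptions: a 64-bit ELF file, a Linux system with /proc mounted, and section headers that have not been stripped. The names elf_map, embedded, elf_map_open and elf_find_sections are made up for this example and are not part of any library. For simplicity it maps the whole file rather than only the part that holds the wanted sections.

```c
#include <assert.h>
#include <elf.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper types: a read-only file mapping and one located section. */
struct elf_map  { const unsigned char *base; size_t size; };
struct embedded { const char *name; const void *data; size_t size; };

/* Map a 64-bit ELF file read-only. Returns 0 on success, -1 on failure. */
int elf_map_open(const char *path, struct elf_map *m)
{
    struct stat st;
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0 || (size_t) st.st_size < sizeof(Elf64_Ehdr)) {
        close(fd);
        return -1;
    }
    void *p = mmap(NULL, (size_t) st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping outlives the descriptor */
    if (p == MAP_FAILED)
        return -1;
    m->base = p;
    m->size = (size_t) st.st_size;
    if (memcmp(m->base, ELFMAG, SELFMAG) != 0 || m->base[EI_CLASS] != ELFCLASS64) {
        munmap(p, m->size);
        return -1;
    }
    return 0;
}

/* Store up to max sections whose names start with prefix in out.
 * Returns the number stored. Extended section numbering is not handled. */
size_t elf_find_sections(const struct elf_map *m, const char *prefix,
                         struct embedded *out, size_t max)
{
    assert(m && prefix && out);
    const Elf64_Ehdr *eh = (const Elf64_Ehdr *) m->base;
    size_t plen = strlen(prefix), found = 0;

    /* Reject files whose section header table is missing or out of bounds. */
    if (eh->e_shoff == 0 || eh->e_shoff > m->size ||
        eh->e_shentsize != sizeof(Elf64_Shdr) ||
        (m->size - eh->e_shoff) / sizeof(Elf64_Shdr) < eh->e_shnum ||
        eh->e_shstrndx >= eh->e_shnum)
        return 0;

    const Elf64_Shdr *sh = (const Elf64_Shdr *) (m->base + eh->e_shoff);
    const Elf64_Shdr *str = &sh[eh->e_shstrndx];
    if (str->sh_offset > m->size || str->sh_size > m->size - str->sh_offset)
        return 0;
    const char *names = (const char *) (m->base + str->sh_offset);

    for (size_t i = 0; i < eh->e_shnum && found < max; i++) {
        if (sh[i].sh_type == SHT_NOBITS || sh[i].sh_name >= str->sh_size)
            continue; /* no file contents, or a bad name offset */
        if (sh[i].sh_offset > m->size || sh[i].sh_size > m->size - sh[i].sh_offset)
            continue; /* contents fall outside the file */
        const char *name = names + sh[i].sh_name;
        if (strncmp(name, prefix, plen) != 0)
            continue;
        out[found].name = name;
        out[found].data = m->base + sh[i].sh_offset;
        out[found].size = sh[i].sh_size;
        found++;
    }
    return found;
}
```

A program would call elf_map_open("/proc/self/exe", &m) once and then elf_find_sections(&m, "program.", files, count). The returned pointers stay valid for as long as the mapping is kept, and a result of zero covers the "no embedded file at all" case.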
Can these sections be placed in PT_LOAD segments somehow?
No: that would require rebuilding parts of the executable which are not possible to rebuild without re-linking the entire program.
Update:
If you don't care all that much about the memory usage of your program, you could modify the last LOAD segment to "cover" the entire program+files, and then you can skip the separate mmap: the files would already be in memory.
You just need to increase the .p_filesz and .p_memsz such that phdr.p_offset + phdr.p_filesz == file_size.
The price is that you'll cause data that normally isn't loaded into memory (e.g. section header, debug sections (if any)) to occupy memory. But with demand paging, the price could be very small -- nothing should access these "extra" memory regions, and so nothing should cause them to be paged in.
P.S. I know of no standard utility that can update .p_filesz etc., but it's pretty easy to write such a patcher in C or in Python.
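A minimal sketch of such a patcher in C might look like the following. The function names extend_last_load and patch_file are invented for this example, and it assumes a 64-bit ELF file. One caveat the sketch handles conservatively: if the last LOAD segment has a zero-filled tail (p_memsz larger than p_filesz, typically .bss), growing p_filesz would back that region with file contents instead of zeros, so the sketch refuses to patch in that case rather than guess.

```c
#include <elf.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Grow the PT_LOAD segment with the highest file offset so that it ends at
 * the end of the image. Returns 0 on success, -1 if the image is not a
 * 64-bit ELF file, has no PT_LOAD segment, or the segment has a zero-filled
 * tail (p_memsz != p_filesz) that the change would replace with file data. */
int extend_last_load(unsigned char *image, size_t size)
{
    if (size < sizeof(Elf64_Ehdr) || memcmp(image, ELFMAG, SELFMAG) != 0 ||
        image[EI_CLASS] != ELFCLASS64)
        return -1;

    Elf64_Ehdr *eh = (Elf64_Ehdr *) image;
    if (eh->e_phentsize != sizeof(Elf64_Phdr) || eh->e_phoff > size ||
        (size - eh->e_phoff) / sizeof(Elf64_Phdr) < eh->e_phnum)
        return -1; /* program header table is out of bounds */

    Elf64_Phdr *ph = (Elf64_Phdr *) (image + eh->e_phoff);
    Elf64_Phdr *last = NULL;
    for (size_t i = 0; i < eh->e_phnum; i++)
        if (ph[i].p_type == PT_LOAD && (!last || ph[i].p_offset > last->p_offset))
            last = &ph[i];

    if (!last || last->p_offset > size || last->p_memsz != last->p_filesz)
        return -1;

    /* Now phdr.p_offset + phdr.p_filesz == file_size. */
    last->p_filesz = size - last->p_offset;
    last->p_memsz = last->p_filesz;
    return 0;
}

/* Patch the file at path in place through a shared, writable mapping. */
int patch_file(const char *path)
{
    struct stat st;
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0 || st.st_size <= 0) {
        close(fd);
        return -1;
    }
    size_t size = (size_t) st.st_size;
    void *image = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);
    if (image == MAP_FAILED)
        return -1;
    int rc = extend_last_load(image, size);
    munmap(image, size);
    return rc;
}
```

After patching, a byte at file offset X inside the extended segment should appear at the segment's load address plus X - p_offset, so the running program can reach the embedded data without a separate mmap.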
[1] You don't have to mmap the entire program, just the part of it which contains the desired section(s).