path, zsh

Sanitizing PATH in case of duplicates


I have set

typeset -aU path

which helps avoid duplicates in my PATH. A duplicate is not added if I append it to the end of my PATH:

path+=~/bin
path+=~/foo
path+=~/bin

After this, my PATH contains only one copy of my bin directory.

If I put a directory at the start of the PATH (which is rarely done, but sometimes necessary), i.e.

PATH=~/bin:$PATH

I will end up with an additional copy of my bin directory in my PATH. Is it possible to also automatically remove duplicate directories in this case?

I can force this manually with the help of an auxiliary array, i.e.

temppath=($path)
path=($temppath)

but I wonder if there is a simpler way to do this.


Solution

  • It looks like the -U (unique) attribute can be set separately for each variable in a pair of tied parameters like path and PATH. The two names essentially act as distinct interfaces to a shared bit of data, and the behavior changes when different interfaces are used to set and retrieve that data.

    By setting -U for both path and PATH, the shell should remove duplicates no matter how entries are added:

    typeset -U PATH path
    

    This is all that's absolutely necessary, since by default the -T (tied) and -a (array) attributes for path/PATH are already set. For a new variable that behaves similarly, the declaration could look like this:

    typeset -aUT LD_LIBRARY_PATH ld_library_path
    

    Testing:

    => typeset -aUT PATH path
    => PATH=/usr/bin
    => path+=~/bin
    => typeset -p path
    typeset -aUT PATH path=( /usr/bin /Users/me/bin )
    => path+=~/foo
    => path+=~/bin
    => typeset -p path
    typeset -aUT PATH path=( /usr/bin /Users/me/bin /Users/me/foo )
    => PATH=~/bin:$PATH
    => typeset -p path
    typeset -aUT PATH path=( /Users/me/bin /usr/bin /Users/me/foo )
    => path=(~/foo $path)
    => typeset -p path
    typeset -aUT PATH path=( /Users/me/foo /Users/me/bin /usr/bin )
    

    With -U set differently for PATH and path:

    => path=()
    => typeset -U path; typeset +U PATH
    => typeset -p PATH path
    typeset -T PATH path=(  )
    typeset -aUT PATH path=(  )
    => PATH=/foo:/foo:/foo
    => typeset -p PATH path
    typeset -T PATH path=( /foo /foo /foo )
    typeset -aUT PATH path=( /foo /foo /foo )
    => path+=/bar; path+=/bar; path+=/bar
    => typeset -p PATH path
    typeset -T PATH path=( /foo /bar )
    typeset -aUT PATH path=( /foo /bar )
    => PATH=/foo:$PATH; PATH=/foo:$PATH;
    => typeset -p PATH path
    typeset -T PATH path=( /foo /foo /foo /bar )
    typeset -aUT PATH path=( /foo /foo /foo /bar )
    
    => path=()
    => typeset +U path; typeset -U PATH
    => typeset -p PATH path
    typeset -UT PATH path=(  )
    typeset -aT PATH path=(  )
    => path+=/alice; path+=/alice; path+=/alice
    => typeset -p PATH path
    typeset -UT PATH path=( /alice /alice /alice )
    typeset -aT PATH path=( /alice /alice /alice )
    => PATH=$PATH:/bob; PATH=$PATH:/bob; PATH=$PATH:/bob; 
    => typeset -p PATH path
    typeset -UT PATH path=( /alice /bob )
    typeset -aT PATH path=( /alice /bob )
    

    Bonus: path=($^path(-/N)) can be used to remove non-existent directories, as well as empty and invalid entries, from path:

    => typeset -p path   
    typeset -aUT PATH path=( /Users/me/foo /Users/me/bin /etc/hosts '' /usr/bin )
    => path=($^path(-/N))
    => typeset -p path   
    typeset -aUT PATH path=( /Users/me/bin /usr/bin )
    

    Another version, from @Stephane Chazelas in the comments: path=( ${(M)^path:#/*}(N-/) ). This eliminates all relative entries from the path, not just empty ones (which are interpreted as ., the current directory). That is useful because relative entries are also a security risk.

    macOS sometimes adds pointless entries via /etc/paths.d; this is a way to clean them out.