After creating a TAGS file for my project (find . -name "*.py" | xargs etags) I can use M-. to jump to the definition of a function. That's great. But if I want the definition of a global constant -- say, x = 3 -- Emacs does not know where to find it.
Is there any way to explain to Emacs where constants, not just functions, are defined? I don't need this for anything defined within a function (or a for-loop or whatnot), just global ones.
Previous incarnations of this question used "top-level" instead of "global", but with @Thomas's help I realized that's imprecise. What I meant by a global definition is anything a module defines. Thus in
import m
if m.foo:
    def f():
        x = 3
        return x
    y, z = 1, 2
else:
    def f():
        x = 4
        return x
    y, z = 2, 3
del(z)
the things defined by the module are f and y, despite the sites of those definitions being indented to the right. x is a local variable, and z's definition is deleted before the end of the module.
I believe that a sufficient rule to capture all global assignments would be to simply ignore assignments inside def blocks (noting that the def keyword itself might be indented at any level) and otherwise parse for any symbol to the left of = (noting that there might be more than one, because Python supports tuple assignments).
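The rule above can be sketched with Python's own ast module instead of text matching. This is only a rough illustration, not something etags does: the helper names (module_level_names, _names) are made up, class bodies are skipped as well on the assumption that class attributes are not module globals, and del is handled in source order, which is a crude approximation of what actually runs.

```python
import ast


def _names(target):
    """Yield the plain names bound by an assignment or del target."""
    if isinstance(target, ast.Name):
        yield target.id
    elif isinstance(target, (ast.Tuple, ast.List)):
        for element in target.elts:
            yield from _names(element)
    elif isinstance(target, ast.Starred):
        yield from _names(target.value)


def module_level_names(source):
    """Map each name assigned outside any def/class to its line numbers."""
    found = {}

    def visit(statements):
        for stmt in statements:
            if isinstance(stmt, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
                # Separate scope; `global` declarations inside are not handled.
                continue
            if isinstance(stmt, ast.Assign):
                for target in stmt.targets:
                    for name in _names(target):
                        found.setdefault(name, []).append(stmt.lineno)
            elif isinstance(stmt, ast.AnnAssign) and stmt.value is not None:
                for name in _names(stmt.target):
                    found.setdefault(name, []).append(stmt.lineno)
            elif isinstance(stmt, ast.Delete):
                for target in stmt.targets:
                    for name in _names(target):
                        found.pop(name, None)
            # Descend into if/for/while/with/try blocks, at any depth.
            for field in ("body", "orelse", "finalbody"):
                visit(getattr(stmt, field, []))
            for handler in getattr(stmt, "handlers", []):
                visit(handler.body)

    visit(ast.parse(source).body)
    return found


SAMPLE = """\
import m
if m.foo:
    def f():
        x = 3
        return x
    y, z = 1, 2
else:
    def f():
        x = 4
        return x
    y, z = 2, 3
del(z)
"""

print(module_level_names(SAMPLE))  # {'y': [6, 11]}
```

On the example module this reports y at both of its definition sites and drops x (local) and z (deleted). Names bound by for loops, with ... as, or import are outside the rule and are not collected.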
Etags does not seem to be able to produce such information for Python files, which you can easily verify by running it on a trivial test file:
x = 3

def fun():
    pass
Running etags test.py produces a TAGS file with the following contents:
/tmp/test.py,13
def fun(3,7
As you can see, x is completely absent in this file, so Emacs has no chance of finding it.
The etags man page informs us that there is an option --globals:
--globals Create tag entries for global variables in Perl and Makefile. This is the default in C and derived languages.
However, this seems to be one of those sad cases where the documentation is out of sync with the implementation, as this option does not seem to exist. (etags -h does not list it either, only --no-globals - probably because --globals is the default, as it says above.)
However, even if --globals is the default, the documentation snippet says it applies only to Perl, Makefiles, C, and derived languages. We can check whether this is the case by creating another trivial test file, this time for C:
int x = 3;

void fun() {
}
And indeed, running etags test.c produces the following TAGS file:
/tmp/test.c,26
int x 1,0
void fun(3,12
You see that x is correctly identified for C. So it seems that global variables are simply not supported by etags for Python.
However, because of Python's use of whitespace, it is not too hard to identify global variable definitions in source files - you can basically grep for all lines that don't start with whitespace but contain a = sign (of course, there are exceptions).
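To make the heuristic and its exceptions concrete, here is a small Python sketch of the same idea. The pattern and the function name are hypothetical and deliberately stricter than a plain grep: only a single identifier at column 0 followed by one = counts.

```python
import re

# Hypothetical pattern: an identifier at column 0 followed by a single "=".
# "(?!=)" rejects "==" so comparisons are not mistaken for assignments.
ASSIGNMENT = re.compile(r"^([A-Za-z_][A-Za-z0-9_]*)\s*=(?!=)")


def unindented_assignments(text):
    """Return (name, line_number) pairs for unindented `name = ...` lines."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        match = ASSIGNMENT.match(line)
        if match:
            hits.append((match.group(1), lineno))
    return hits


SAMPLE = """\
x = 3
# y = 4
if x == 3:
    z = 5
a, b = 1, 2
def fun(k=1):
    pass
"""

print(unindented_assignments(SAMPLE))  # [('x', 1)]
```

The sample shows the exceptions: the indented z (which is still a module global) and the tuple assignment a, b are both missed, while comments, comparisons, and keyword defaults are correctly ignored.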
So, I wrote the following script to do that, which you can use as a drop-in replacement for etags, as it calls etags internally:
#!/bin/bash
# make sure that some input files are provided, or else there's
# nothing to parse
if [ $# -eq 0 ]; then
    # the following message is just a copy of etags' error message
    echo "$(basename "${0}"): no input files specified."
    echo " Try '$(basename "${0}") --help' for a complete list of options."
    exit 1
fi
# extract all non-flag parameters as the actual filenames to consider
TAGS2="TAGS2"
argflags=($(etags -h | grep '^-' | sed 's/,.*$//' | grep ' ' | awk '{print $1}'))
files=()
skip=0
for arg in "${@}"; do
    # the variable 'skip' signals arguments that should not be
    # considered as filenames, even though they don't start with a
    # hyphen
    if [ ${skip} -eq 0 ]; then
        # arguments that start with a hyphen are considered flags and
        # thus not added to the 'files' array
        if [ "${arg:0:1}" = '-' ]; then
            if [ "${arg:0:9}" = "--output=" ]; then
                TAGS2="${arg:9}2"
            else
                # however, since some flags take a parameter, we also
                # check whether we should skip the next command line
                # argument: the arguments for which this is the case are
                # contained in 'argflags'
                for argflag in "${argflags[@]}"; do
                    if [ "${argflag}" = "${arg}" ]; then
                        # we need to skip the next 'arg', but in case the
                        # current flag is '-o' we should still look at the
                        # next 'arg' so as to update the path to the
                        # output file of our own parsing below
                        if [ "${arg}" = "-o" ]; then
                            # the next 'arg' will be etags' output file
                            skip=2
                        else
                            skip=1
                        fi
                        break
                    fi
                done
            fi
        else
            files+=("${arg}")
        fi
    else
        # the current 'arg' is not an input file, but it may be the
        # path to the etags output file
        if [ "${skip}" = 2 ]; then
            TAGS2="${arg}2"
        fi
        skip=0
    fi
done
# create a separate TAGS file specifically for global variables
for file in "${files[@]}"; do
    # find all lines that are not indented, are not comments or
    # decorators, and contain a '=' character, then turn them into
    # TAGS format, except that the filename is prepended
    grep -P -Hbn '^[^[@# \t].*=' "${file}" | sed -E 's/([0-9]+):([0-9]+):([^= \t]+)\s*=.*$/\3\x7f\1,\2/'
done |\
# count the bytes of each entry - this is needed for the TAGS
# specification
while IFS= read -r line; do
    echo "$(printf '%s\n' "$line" | sed 's/^.*://' | wc -c):$line"
done |\
# turn the information above into the correct TAGS file format
awk -F: '
    BEGIN { filename=""; numlines=0 }
    {
        if (filename != $2) {
            if (numlines > 0) {
                print "\x0c\n" filename "," bytes+1
                # print in input order; "for (i in lines)" has no defined order
                for (i = 0; i < numlines; i++) {
                    print lines[i]
                }
            }
            filename=$2
            numlines=0
            bytes=0
        }
        lines[numlines++] = $3;
        bytes += $1;
    }
    END {
        if (numlines > 0) {
            print "\x0c\n" filename "," bytes+1
            for (i = 0; i < numlines; i++)
                print lines[i]
        }
    }' > "${TAGS2}"
# now run the actual etags, instructing it to include the global
# variables information
if ! etags -i "${TAGS2}" "${@}"; then
    # if etags failed to create the TAGS file, also delete the TAGS2
    # file
    /bin/rm -f "${TAGS2}"
fi
Store this script on your $PATH using a convenient name (I suggest something like etags+) and then call it like so:
find . -name "*.py" | xargs etags+
Besides creating a TAGS file, the script also creates a TAGS2 file for all global variable definitions, and adds a line to the original TAGS file that references the latter.
From the perspective of Emacs, there's no difference in usage.
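For anyone who wants to see what the script is assembling, here is a minimal Python sketch of the TAGS layout as I understand it from the etags output shown earlier: a form feed line, a "filename,size" header, then one line per tag with a DEL character (0x7f) between the source text and the "line,byte offset" pair. The function names are made up for illustration.

```python
def tags_section(filename, entries):
    """Build one TAGS section for `filename`.

    `entries` is an iterable of (text, line, byte_offset) tuples, where
    `text` is the part of the source line that identifies the tag.
    """
    body = "".join(
        f"{text}\x7f{line},{offset}\n" for text, line, offset in entries
    )
    size = len(body.encode("utf-8"))  # byte count of the entry lines
    return f"\x0c\n{filename},{size}\n{body}"


def include_section(path):
    """Build the section that points Emacs at another tags file."""
    return f"\x0c\n{path},include\n"


section = tags_section("/tmp/test.py", [("def fun(", 3, 7)])
print(repr(section))
print(repr(include_section("TAGS2")))
```

With the single entry from the Python test file this yields a size of 13, matching the header etags wrote for test.py. The script, by comparison, writes its byte total plus one. The include section is the kind of line that etags -i adds to the main TAGS file.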