
Getting the list of classes in a given Python file


My goal is to fetch the list of classes defined in a given Python file.

Following this link, I have implemented the following:

File b.py:

import imp
import inspect

module = imp.load_source("__inspected__", './a.py')
class_members = inspect.getmembers(module, inspect.isclass)
for class_name, class_obj in class_members:
    print(class_name)

File a.py:

from c import CClass


class MyClass:
    name = 'Edgar'

    def foo(self, x):
        print(x)

File c.py:

c_var = 2

class CClass:
    name = 'Anna'

I have two issues with this implementation. First, as mentioned in the post, classes from imported modules are printed as well, and I can't work out how to exclude them. Second, it looks like the imp module is deprecated in favour of importlib, but the documentation seems sketchy and I can't figure out how to refactor my solution. Any hints?


Solution

  • To use importlib the way you're using imp, see "Python 3.4: How to import a module given the full path?", which gives you something like the following:

    import importlib.machinery
    import inspect
    
    module = importlib.machinery.SourceFileLoader("a", './a.py').load_module()
    class_members = inspect.getmembers(module, inspect.isclass)
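
    Note that load_module() has itself been deprecated in favour of exec_module(). A minimal sketch of the same step using importlib.util follows; the helper name load_module_from_path and the throwaway sample_a.py file are made up for the demo, so adapt the path to your own a.py:

```python
import importlib.util
import inspect
import tempfile
from pathlib import Path


def load_module_from_path(name, path):
    """Load a source file as a module without the deprecated load_module()."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the file's top-level code
    return module


# Demo against a temporary file standing in for a.py (hypothetical name).
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample_a.py"
    sample.write_text("class MyClass:\n    name = 'Edgar'\n")
    module = load_module_from_path("sample_a", str(sample))
    class_names = [name for name, _ in inspect.getmembers(module, inspect.isclass)]

print(class_names)
```

    As with imp.load_source, this executes the file, so any imports inside it must be resolvable.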
    

    Solution #1: Look up class statements in the Abstract Syntax Tree (AST).

    Basically you can parse the file so that you can get the class declaration statements.

    import ast
    
    def get_classes(path):
        with open(path) as fh:
            root = ast.parse(fh.read(), path)
        classes = []
        # Only top-level statements are inspected, so nested classes are skipped
        for node in ast.iter_child_nodes(root):
            if isinstance(node, ast.ClassDef):
                classes.append(node.name)
        return classes
        
    for c in get_classes('a.py'):
        print(c)
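
    One thing to be aware of: ast.iter_child_nodes(root) only visits top-level statements, so classes nested inside other classes or functions are not reported. If you want those too, ast.walk is one option. A small sketch, where the SOURCE string and the name get_all_classes are invented for illustration:

```python
import ast

# Hypothetical source text standing in for the contents of a.py.
SOURCE = '''
from c import CClass

class MyClass:
    class Inner:
        pass

def factory():
    class Local:
        pass
'''


def get_all_classes(source):
    """Return the names of every class statement, nested ones included."""
    tree = ast.parse(source)
    return [node.name for node in ast.walk(tree) if isinstance(node, ast.ClassDef)]


# Top-level classes only, as in the function above.
top_level = [
    node.name
    for node in ast.iter_child_nodes(ast.parse(SOURCE))
    if isinstance(node, ast.ClassDef)
]
# Every class statement in the file; ast.walk makes no ordering promise, so sort.
everything = sorted(get_all_classes(SOURCE))

print(top_level)
print(everything)
```

    Either way the file is only parsed, never executed, so the imported CClass never shows up.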
    

    Solution #2: Look at the imports and filter out names brought in by from ... import statements.

    This is more in line with your current approach, but is a little jankier. You can collect the names imported by the file you're looking at, picking out the from ... import statements (Python easy way to read all import statements from py module), and then make sure none of those names show up in the output:

    import ast
    from collections import namedtuple
    
    Import = namedtuple("Import", ["module", "name", "alias"])
    
    def get_imports(path):
        with open(path) as fh:
            root = ast.parse(fh.read(), path)
    
        for node in ast.iter_child_nodes(root):
            if not isinstance(node, ast.ImportFrom):
                # We ignore direct imports and every other statement
                continue
            # node.module is None for relative imports such as "from . import x"
            module = node.module.split('.') if node.module else []
            for n in node.names:
                yield Import(module, n.name.split('.'), n.asname)
    
    # Names bound in a.py by "from ... import ..." statements
    imported = set()
    for imported_item in get_imports('a.py'):
        imported.add(imported_item.alias or imported_item.name[0])
    

    Then you can just filter out the imported things you saw.

    for class_name, class_obj in class_members:
        if class_name not in imported:
            print(class_name)
    

    Note that this currently doesn't distinguish between imported classes and imported functions, but this should work for now.
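
    A different way to sidestep that limitation, if you are loading the module anyway, is to compare each class's __module__ attribute with the name of the module you loaded. This is a sketch under assumptions rather than a drop-in fix: the file names demo_a.py and demo_c.py and the helper local_classes are made up, and the demo writes the files to a temporary directory so it can run on its own.

```python
import importlib.util
import inspect
import sys
import tempfile
from pathlib import Path


def local_classes(module):
    """Names of classes defined in `module` itself, not imported into it."""
    return [
        name
        for name, obj in inspect.getmembers(module, inspect.isclass)
        if obj.__module__ == module.__name__
    ]


# Hypothetical stand-ins for c.py and a.py from the question.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "demo_c.py").write_text("class CClass:\n    name = 'Anna'\n")
    Path(tmp, "demo_a.py").write_text(
        "from demo_c import CClass\n\n\nclass MyClass:\n    name = 'Edgar'\n"
    )
    sys.path.insert(0, tmp)  # so that "from demo_c import ..." resolves
    try:
        spec = importlib.util.spec_from_file_location(
            "demo_a", str(Path(tmp, "demo_a.py"))
        )
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        found = local_classes(module)
    finally:
        sys.path.remove(tmp)

print(found)
```

    The comparison relies on the name passed to spec_from_file_location, because that becomes the module's __name__ and therefore the __module__ of every class defined in the file.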