Package epydoc :: Module docparser

Module docparser

Extract API documentation about Python objects by parsing their source code.

DocParser is a processing class that reads the Python source code for one or more modules, and uses it to create APIDoc objects containing the API documentation for the variables and values defined in those modules.

DocParser can be subclassed to extend the set of source code constructions that it supports.

Classes
ParseError An exception that is used to signify that docparser encountered syntactically invalid Python code while processing a Python source file.

Functions
    Module parser
ValueDoc parse_docs(filename=None, name=None, context=None, is_script=False)
Generate the API documentation for a specified object by parsing Python source files, and return it as a ValueDoc.
  _parse_package(package_dir)
If the given directory is a package directory, then parse its __init__.py file (and the __init__.py files of all ancestor packages); and return its ModuleDoc.
  handle_special_module_vars(module_doc)
  _module_var_toktree(module_doc, name)
    Module Lookup
  _find(name, package_doc=None)
Return the API documentation for the object whose name is name.
  _is_submodule_import_var(module_doc, var_name)
Return true if var_name is the name of a variable in module_doc that just contains an imported_from link to a submodule of the same name.
  _find_in_namespace(name, namespace_doc)
  _get_filename(identifier, path=None)
    File tokenization loop
  process_file(module_doc)
Read the given ModuleDoc's file, and add variables corresponding to any objects defined in that file.
  add_to_group(container, api_doc, group_name)
    Shallow parser
  shallow_parse(line_toks)
Given a flat list of tokens, return a nested tree structure (called a token tree), whose leaves are identical to the original list, but whose structure reflects the structure implied by the grouping tokens (i.e., parentheses, braces, and brackets).
    Line processing
  process_line(line, parent_docs, prev_line_doc, lineno, comments, decorators, encoding)
  process_control_flow_line(line, parent_docs, prev_line_doc, lineno, comments, decorators, encoding)
  process_import(line, parent_docs, prev_line_doc, lineno, comments, decorators, encoding)
  process_from_import(line, parent_docs, prev_line_doc, lineno, comments, decorators, encoding)
  _process_fromstar_import(src, parent_docs)
Handle a statement of the form 'from <src> import *'.
  _import_var(name, parent_docs)
Handle a statement of the form 'import <name>'.
  _import_var_as(src, name, parent_docs)
Handle a statement of the form 'import src as name'.
  _add_import_var(src, name, container)
Add a new imported variable named name to container, with imported_from=src.
  _global_name(name, parent_docs)
If the given name is package-local (relative to the current context, as determined by parent_docs), then convert it to a global name.
  process_assignment(line, parent_docs, prev_line_doc, lineno, comments, decorators, encoding)
  lhs_is_instvar(lhs_pieces, parent_docs)
  rhs_to_valuedoc(rhs, parent_docs)
  get_lhs_parent(lhs_name, parent_docs)
  process_one_line_block(line, parent_docs, prev_line_doc, lineno, comments, decorators, encoding)
The line handler for single-line blocks, such as 'def f(x): return x*2'.
  process_multi_stmt(line, parent_docs, prev_line_doc, lineno, comments, decorators, encoding)
The line handler for semicolon-separated statements, such as 'x=1; y=2; z=3'.
  process_del(line, parent_docs, prev_line_doc, lineno, comments, decorators, encoding)
The line handler for delete statements, such as 'del x, y.z'.
  process_docstring(line, parent_docs, prev_line_doc, lineno, comments, decorators, encoding)
The line handler for bare string literals.
  process_funcdef(line, parent_docs, prev_line_doc, lineno, comments, decorators, encoding)
The line handler for function declaration lines, such as 'def f(a, b=22, (c,d)):'.
  apply_decorator(decorator_name, func_doc)
  init_arglist(func_doc, arglist)
  process_classdef(line, parent_docs, prev_line_doc, lineno, comments, decorators, encoding)
The line handler for class declaration lines, such as 'class Foo(Bar, Baz):'.
  find_base(name, parent_docs)
    Parsing
  dotted_names_in(elt_list)
Return a list of all simple dotted names in the given expression.
  parse_name(elt, strip_parens=False)
If the given token tree element is a name token, then return that name as a string.
  parse_dotted_name(elt_list, strip_parens=True)
  split_on(elt_list, split_tok)
  parse_funcdef_arg(elt)
If the given tree token element contains a valid function definition argument (i.e., an identifier token or nested list of identifiers), then return a corresponding string identifier or nested list of string identifiers.
  parse_classdef_bases(elt)
If the given tree token element contains a valid base list (that contains only dotted names), then return a corresponding list of DottedNames.
  parse_dotted_name_list(elt_list)
If the given list of tree token elements contains a comma-separated list of dotted names, then return a corresponding list of DottedName objects.
  parse_string(elt_list)
  parse_string_list(elt_list)
    Variable Manipulation
  set_variable(namespace, var_doc, preserve_docstring=False)
Add var_doc to namespace.
  del_variable(namespace, name)
    Name Lookup
VariableDoc or None lookup_name(identifier, parent_docs)
Find and return the documentation for the variable named by the given identifier.
  lookup_variable(dotted_name, parent_docs)
  lookup_value(dotted_name, parent_docs)
Find and return the documentation for the value contained in the variable with the given name in the current namespace.
    Docstring Comments
  add_docstring_from_comments(api_doc, comments)
    Tree tokens
  pp_toktree(elts, spacing='normal', indent=0)
    Helper Functions
  get_module_encoding(filename)
  _get_module_name(filename, package_doc)
Return (dotted_name, is_package)
  flatten(lst, out=None)

Variables
_moduledoc_cache A cache of ModuleDocs that we've already created.
    Configuration Constants: Control Flow
PARSE_TRY_BLOCKS Should the contents of try blocks be examined?
PARSE_EXCEPT_BLOCKS Should the contents of except blocks be examined?
PARSE_FINALLY_BLOCKS Should the contents of finally blocks be examined?
PARSE_IF_BLOCKS Should the contents of if blocks be examined?
PARSE_ELSE_BLOCKS Should the contents of else and elif blocks be examined?
PARSE_WHILE_BLOCKS Should the contents of while blocks be examined?
PARSE_FOR_BLOCKS Should the contents of for blocks be examined?
    Configuration Constants: Imports
IMPORT_HANDLING What should docparser do when it encounters an import statement?
IMPORT_STAR_HANDLING When docparser encounters a 'from m import *' statement, and is unable to parse m (either because IMPORT_HANDLING='link', or because parsing failed), how should it determine the list of identifiers exported by m?
DEFAULT_DECORATOR_BEHAVIOR When DocParser encounters an unknown decorator, what should it do to the documentation of the decorated function?
BASE_HANDLING What should docparser do when it encounters a base class that was imported from another module?
    Configuration Constants: Comment docstrings
COMMENT_DOCSTRING_MARKER The prefix used to mark comments that contain attribute docstrings for variables.
    Configuration Constants: Grouping
START_GROUP_MARKER The prefix used to mark a comment that starts a group.
END_GROUP_MARKER The prefix used to mark a comment that ends a group.
    Line processing
CONTROL_FLOW_KEYWORDS A list of the control flow keywords.

Function Details

parse_docs(filename=None, name=None, context=None, is_script=False)

Generate the API documentation for a specified object by parsing Python source files, and return it as a ValueDoc. The object to generate documentation for may be specified using the filename parameter or the name parameter. (It is an error to specify both a filename and a name, or to specify neither.)
Parameters:
  • filename - The name of the file that contains the Python source code for a package, module, or script. If filename is specified, then parse_docs will return a ModuleDoc describing its contents.
  • name - The fully-qualified Python dotted name of any value (including packages, modules, classes, and functions). DocParser will automatically figure out which module(s) it needs to parse in order to find the documentation for the specified object.
  • context - The API documentation for the package that contains filename. If no context is given, then filename is assumed to contain a top-level module or package. It is an error to specify a context if the name argument is used.
Returns: ValueDoc
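
The argument rules described above (exactly one of filename or name, and no context together with name) can be sketched as a standalone check. This is only an illustration: the helper name check_parse_docs_args and the choice of ValueError are assumptions, and this page does not say which exception parse_docs itself raises.

```python
def check_parse_docs_args(filename=None, name=None, context=None):
    """Hypothetical helper mirroring the documented argument rules."""
    # Exactly one of filename / name must be given.
    if filename is None and name is None:
        raise ValueError("expected a filename or a name")
    if filename is not None and name is not None:
        raise ValueError("expected a filename or a name, not both")
    # A context is only meaningful alongside a filename.
    if name is not None and context is not None:
        raise ValueError("context cannot be combined with name")
    # Report which way the target object was specified.
    return "filename" if filename is not None else "name"
```

With these rules, a call that passes only a filename or only a name is accepted; every other combination listed above is rejected.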

_parse_package(package_dir)

If the given directory is a package directory, then parse its __init__.py file (and the __init__.py files of all ancestor packages); and return its ModuleDoc.

handle_special_module_vars(module_doc)


_module_var_toktree(module_doc, name)


_find(name, package_doc=None)

Return the API documentation for the object whose name is name. package_doc, if specified, is the API documentation for the package containing the named object.

_is_submodule_import_var(module_doc, var_name)

Return true if var_name is the name of a variable in module_doc that just contains an imported_from link to a submodule of the same name. (I.e., is a variable created when a package imports one of its own submodules.)

_find_in_namespace(name, namespace_doc)


_get_filename(identifier, path=None)


process_file(module_doc)

Read the given ModuleDoc's file, and add variables corresponding to any objects defined in that file. In particular, read and tokenize module_doc.filename, and process each logical line using process_line().

add_to_group(container, api_doc, group_name)


shallow_parse(line_toks)


Given a flat list of tokens, return a nested tree structure (called a token tree), whose leaves are identical to the original list, but whose structure reflects the structure implied by the grouping tokens (i.e., parentheses, braces, and brackets). If the parentheses, braces, and brackets do not match, or are not balanced, then raise a ParseError.

Assign some structure to a sequence of structure (group parens).
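
The grouping described for shallow_parse can be illustrated with a small stack-based sketch. It is not the module's implementation: tokens are simplified to plain strings (this page does not show the real token representation), and ParseErrorSketch is a stand-in name for ParseError.

```python
class ParseErrorSketch(Exception):
    """Stand-in for docparser.ParseError (assumed name, for illustration)."""

OPEN_TO_CLOSE = {"(": ")", "[": "]", "{": "}"}
CLOSERS = set(OPEN_TO_CLOSE.values())

def shallow_parse_sketch(tokens):
    """Nest a flat list of string tokens according to its brackets."""
    root = []
    stack = [root]   # stack[-1] is the list currently being filled
    expected = []    # closing brackets we are still waiting for
    for tok in tokens:
        if tok in OPEN_TO_CLOSE:
            # Start a new nested group that begins with the opener.
            group = [tok]
            stack[-1].append(group)
            stack.append(group)
            expected.append(OPEN_TO_CLOSE[tok])
        elif tok in CLOSERS:
            # The closer must match the most recent unclosed opener.
            if not expected or expected.pop() != tok:
                raise ParseErrorSketch("unbalanced %r" % tok)
            stack.pop().append(tok)
        else:
            stack[-1].append(tok)
    if expected:
        raise ParseErrorSketch("missing %r" % expected[-1])
    return root
```

In this sketch the grouping tokens stay inside their group, so flattening the result gives back the original token list.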

process_line(line, parent_docs, prev_line_doc, lineno, comments, decorators, encoding)

Returns:
new-doc, decorator..?

process_control_flow_line(line, parent_docs, prev_line_doc, lineno, comments, decorators, encoding)


process_import(line, parent_docs, prev_line_doc, lineno, comments, decorators, encoding)


process_from_import(line, parent_docs, prev_line_doc, lineno, comments, decorators, encoding)


_process_fromstar_import(src, parent_docs)

Handle a statement of the form:
>>> from <src> import *

If IMPORT_HANDLING is 'parse', then first try to parse the module <src>, and copy all of its exported variables to parent_docs[-1].

Otherwise, try to determine the names of the variables exported by <src>, and create a new variable for each export. If IMPORT_STAR_HANDLING is 'parse', then the list of exports is found by parsing <src>; if it is 'introspect', then the list of exports is found by importing and introspecting <src>.

_import_var(name, parent_docs)

Handle a statement of the form:
>>> import <name>

If IMPORT_HANDLING is 'parse', then first try to find the value by parsing; and create an appropriate variable in parentdoc.

Otherwise, add a variable for the imported variable. (More than one variable may be created for cases like 'import a.b', where we need to create a variable 'a' in parentdoc containing a proxy module, and a variable 'b' in the proxy module.)

_import_var_as(src, name, parent_docs)

Handle a statement of the form:
>>> import src as name

If IMPORT_HANDLING is 'parse', then first try to find the value by parsing; and create an appropriate variable in parentdoc.

Otherwise, create a variable with its imported_from attribute pointing to the imported object.

_add_import_var(src, name, container)

Add a new imported variable named name to container, with imported_from=src.

_global_name(name, parent_docs)

If the given name is package-local (relative to the current context, as determined by parent_docs), then convert it to a global name.

process_assignment(line, parent_docs, prev_line_doc, lineno, comments, decorators, encoding)


lhs_is_instvar(lhs_pieces, parent_docs)


rhs_to_valuedoc(rhs, parent_docs)


get_lhs_parent(lhs_name, parent_docs)


process_one_line_block(line, parent_docs, prev_line_doc, lineno, comments, decorators, encoding)

The line handler for single-line blocks, such as:
>>> def f(x): return x*2
This handler calls process_line twice: once for the tokens up to and including the colon, and once for the remaining tokens. The comment docstring is applied to the first line only.
Returns:
None
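
As a rough illustration of the split described for process_one_line_block, the sketch below divides a token tree at its first top-level colon. The function name is hypothetical, and tokens are simplified to strings, with nested lists standing for bracketed groups. Colons inside brackets (such as slices) are skipped because they live inside a nested list.

```python
def split_one_line_block_sketch(toktree):
    """Split a token tree at its first top-level colon.

    Returns (head, body): head runs up to and including the colon,
    body holds the remaining tokens.
    """
    for index, elt in enumerate(toktree):
        # Nested groups are lists, so they never compare equal to ":".
        if elt == ":":
            return toktree[:index + 1], toktree[index + 1:]
    raise ValueError("no top-level colon found")
```

Each half could then be handed to the line processor in turn, as the description above says.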

process_multi_stmt(line, parent_docs, prev_line_doc, lineno, comments, decorators, encoding)

The line handler for semicolon-separated statements, such as:
>>> x=1; y=2; z=3
This handler calls process_line once for each statement. The comment docstring is not passed on to any of the sub-statements.
Returns:
None
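
The per-statement splitting that process_multi_stmt relies on can be sketched as follows. This is an assumption-laden illustration: the helper name is made up, and tokens are plain strings rather than whatever token objects the parser uses.

```python
def split_statements_sketch(tokens, separator=";"):
    """Split a flat token list into one sub-list per statement."""
    statements = [[]]
    for tok in tokens:
        if tok == separator:
            # A separator closes the current statement and opens a new one.
            statements.append([])
        else:
            statements[-1].append(tok)
    # Drop empty pieces, e.g. the one left by a trailing semicolon.
    return [stmt for stmt in statements if stmt]
```

For the example line above, this yields three token lists, one each for x=1, y=2, and z=3.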

process_del(line, parent_docs, prev_line_doc, lineno, comments, decorators, encoding)

The line handler for delete statements, such as:
>>> del x, y.z
This handler calls del_variable for each dotted variable in the variable list. The variable list may be nested. Complex expressions in the variable list (such as x[3]) are ignored.
Returns:
None

process_docstring(line, parent_docs, prev_line_doc, lineno, comments, decorators, encoding)

The line handler for bare string literals. If prev_line_doc is not None, then the string literal is added to that APIDoc as a docstring. If it already has a docstring (from comment docstrings), then the new docstring will be appended to the old one.

process_funcdef(line, parent_docs, prev_line_doc, lineno, comments, decorators, encoding)

The line handler for function declaration lines, such as:
>>> def f(a, b=22, (c,d)):
This handler creates and initializes a new VariableDoc containing a RoutineDoc, adds the VariableDoc to the containing namespace, and returns the RoutineDoc.

apply_decorator(decorator_name, func_doc)


init_arglist(func_doc, arglist)


process_classdef(line, parent_docs, prev_line_doc, lineno, comments, decorators, encoding)

The line handler for class declaration lines, such as:
>>> class Foo(Bar, Baz):
This handler creates and initializes a new VariableDoc containing a ClassDoc, adds the VariableDoc to the containing namespace, and returns the ClassDoc.

find_base(name, parent_docs)


dotted_names_in(elt_list)

Return a list of all simple dotted names in the given expression.

parse_name(elt, strip_parens=False)

If the given token tree element is a name token, then return that name as a string. Otherwise, raise ParseError.
Parameters:
  • strip_parens - If true, and elt is a single name enclosed in parentheses, then return that name.

parse_dotted_name(elt_list, strip_parens=True)


Bug: does not handle 'x.(y).z'

split_on(elt_list, split_tok)


parse_funcdef_arg(elt)

If the given tree token element contains a valid function definition argument (i.e., an identifier token or nested list of identifiers), then return a corresponding string identifier or nested list of string identifiers. Otherwise, raise a ParseError.

parse_classdef_bases(elt)

If the given tree token element contains a valid base list (that contains only dotted names), then return a corresponding list of DottedNames. Otherwise, raise a ParseError.

Bug: Does not handle either of:

   - class A( (base.in.parens) ): pass
   - class B( (lambda:calculated.base)() ): pass

parse_dotted_name_list(elt_list)

If the given list of tree token elements contains a comma-separated list of dotted names, then return a corresponding list of DottedName objects. Otherwise, raise ParseError.

parse_string(elt_list)


parse_string_list(elt_list)


set_variable(namespace, var_doc, preserve_docstring=False)

Add var_doc to namespace. If namespace already contains a variable with the same name, then discard the old variable. If preserve_docstring is true, then keep the old variable's docstring when overwriting a variable.
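
The overwrite behaviour can be pictured with a toy model. Everything here is an assumption made for illustration: the namespace is a plain dict and VarSketch is a minimal stand-in for a VariableDoc, neither of which reflects the real APIDoc classes.

```python
class VarSketch(object):
    """Hypothetical stand-in for a VariableDoc: a name plus a docstring."""
    def __init__(self, name, docstring=None):
        self.name = name
        self.docstring = docstring

def set_variable_sketch(namespace, var, preserve_docstring=False):
    """Store var in the namespace dict, replacing any same-named entry."""
    old = namespace.get(var.name)
    # Optionally carry the old variable's docstring over to the new one.
    if preserve_docstring and old is not None:
        var.docstring = old.docstring
    # The old variable, if any, is discarded by the overwrite.
    namespace[var.name] = var
```

In this model, rebinding a documented name with preserve_docstring set keeps the earlier docstring attached to the new variable.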

del_variable(namespace, name)


lookup_name(identifier, parent_docs)

Find and return the documentation for the variable named by the given identifier.
Returns: VariableDoc or None

lookup_variable(dotted_name, parent_docs)


lookup_value(dotted_name, parent_docs)

Find and return the documentation for the value contained in the variable with the given name in the current namespace.

add_docstring_from_comments(api_doc, comments)


pp_toktree(elts, spacing='normal', indent=0)


get_module_encoding(filename)


See Also: PEP 263
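
For readers unfamiliar with PEP 263, the sketch below shows the kind of lookup it defines: an encoding declaration comment on the first or second line of a source file. This is a simplified illustration and not necessarily how get_module_encoding works. The function name and the default value are assumptions, and byte-order marks are ignored.

```python
import re

# Pattern given in PEP 263 for the encoding declaration comment.
CODING_RE = re.compile(r"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")

def declared_encoding_sketch(source_lines, default="ascii"):
    """Return the encoding declared in the first two lines, else default."""
    # PEP 263 only looks at the first two lines of the file.
    for line in source_lines[:2]:
        match = CODING_RE.match(line)
        if match:
            return match.group(1)
    return default
```

A declaration on the third line or later is ignored, which matches the two-line limit in the PEP.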

_get_module_name(filename, package_doc)

Return (dotted_name, is_package)

flatten(lst, out=None)

Parameters:
  • lst - The nested list that should be flattened.
Returns:
a flat list containing the leaves of the given nested list.
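
A minimal sketch of a flatten function with this signature is given below. It is written from the description alone and may differ from the module's own code. In particular, treating only list instances as nested containers is an assumption.

```python
def flatten_sketch(lst, out=None):
    """Return a flat list of the leaves of a nested list."""
    # out accumulates leaves across recursive calls.
    if out is None:
        out = []
    for item in lst:
        if isinstance(item, list):
            flatten_sketch(item, out)
        else:
            out.append(item)
    return out
```

Defaulting out to None and creating the list inside the function avoids sharing one mutable default between unrelated calls.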

Variables Details

_moduledoc_cache

A cache of ModuleDocs that we've already created. _moduledoc_cache is a dictionary mapping from filenames to ValueDoc objects.
Type:
dict
Value:
{'/home/edloper/data/projects/epydoc/src/epydoc/__init__.py': <ModuleDoc epydoc>,
 '/home/edloper/data/projects/epydoc/src/epydoc/apidoc.py': <ModuleDoc epydoc.apidoc>,
 u'/home/edloper/data/projects/epydoc/src/epydoc/checker.py': <ModuleDoc epydoc.checker>,
 u'/home/edloper/data/projects/epydoc/src/epydoc/cli.py': <ModuleDoc epydoc.cli>,
...

PARSE_TRY_BLOCKS

Should the contents of try blocks be examined?
Value:
True                                                                   
      

PARSE_EXCEPT_BLOCKS

Should the contents of except blocks be examined?
Value:
True                                                                   
      

PARSE_FINALLY_BLOCKS

Should the contents of finally blocks be examined?
Value:
True                                                                   
      

PARSE_IF_BLOCKS

Should the contents of if blocks be examined?
Value:
True                                                                   
      

PARSE_ELSE_BLOCKS

Should the contents of else and elif blocks be examined?
Value:
True                                                                   
      

PARSE_WHILE_BLOCKS

Should the contents of while blocks be examined?
Value:
False                                                                  
      

PARSE_FOR_BLOCKS

Should the contents of for blocks be examined?
Value:
False                                                                  
      

IMPORT_HANDLING

What should docparser do when it encounters an import statement?
  • 'link': Create VariableDoc objects with imported_from pointers to the source object.
  • 'parse': Parse the imported file, to find the actual documentation for the imported object. (This will fall back to the 'link' behavior if the imported file can't be parsed, e.g., if it's a builtin.)
Value:
'link'                                                                 
      

IMPORT_STAR_HANDLING

When docparser encounters a 'from m import *' statement, and is unable to parse m (either because IMPORT_HANDLING='link', or because parsing failed), how should it determine the list of identifiers exported by m?
  • 'ignore': ignore the import statement, and don't create any new variables.
  • 'parse': parse it to find a list of the identifiers that it exports. (This will fall back to the 'ignore' behavior if the imported file can't be parsed, e.g., if it's a builtin.)
  • 'introspect': import the module and introspect it (using dir) to find a list of the identifiers that it exports. (This will fall back to the 'ignore' behavior if the imported file can't be parsed, e.g., if it's a builtin.)
Value:
'parse'                                                                
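
The 'introspect' option can be pictured with the following sketch, which imports a module and lists the names it would likely export. It is an illustration under assumptions: the function name is hypothetical, and honouring __all__ before falling back to the public names from dir() is a guess at reasonable behaviour, since the text above only mentions dir.

```python
import importlib

def star_exports_sketch(module_name):
    """List names a module would likely export; [] if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except Exception:
        # Mirrors the documented fallback to the 'ignore' behaviour.
        return []
    # Assumption: honour __all__ when present, else public names from dir().
    declared = getattr(module, "__all__", None)
    if declared is not None:
        return list(declared)
    return [name for name in dir(module) if not name.startswith("_")]
```

Note that introspection runs the imported module's top-level code, which is one reason a parser-based option may be preferred.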
      

DEFAULT_DECORATOR_BEHAVIOR

When DocParser encounters an unknown decorator, what should it do to the documentation of the decorated function?
  • 'transparent': leave the function's documentation as-is.
  • 'opaque': replace the function's documentation with an empty ValueDoc object, reflecting the fact that we have no knowledge about what value the decorator returns.
Value:
'opaque'                                                               
      

BASE_HANDLING

What should docparser do when it encounters a base class that was imported from another module?
  • 'link': Create a ValueDoc with a proxy_for pointer to the base class.
  • 'parse': Parse the file containing the base class, to find the actual documentation for it. (This will fall back to the 'link' behavior if the imported file can't be parsed, e.g., if it's a builtin.)
Value:
'link'                                                                 
      

COMMENT_DOCSTRING_MARKER

The prefix used to mark comments that contain attribute docstrings for variables.
Value:
'#: '                                                                  
      

START_GROUP_MARKER

The prefix used to mark a comment that starts a group. This marker should be followed (on the same line) by the name of the group. Following a start-group comment, all variables defined at the same indentation level will be assigned to this group name, until the parser reaches the end of the file, a matching end-group comment, or another start-group comment at the same indentation level.
Value:
'#{'                                                                   
      

END_GROUP_MARKER

The prefix used to mark a comment that ends a group. See START_GROUP_MARKER.
Value:
'#}'                                                                   
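
To show how the three comment markers fit together, here is a small sketch that classifies comment lines by prefix, followed by a sample of source lines using them. The marker values are the ones listed on this page. The classifier itself, its name, and the sample content are hypothetical.

```python
COMMENT_DOCSTRING_MARKER = "#: "
START_GROUP_MARKER = "#{"
END_GROUP_MARKER = "#}"

def classify_comment_sketch(comment):
    """Classify one comment line by its marker prefix."""
    text = comment.strip()
    if text.startswith(COMMENT_DOCSTRING_MARKER):
        return ("docstring", text[len(COMMENT_DOCSTRING_MARKER):])
    if text.startswith(START_GROUP_MARKER):
        # The group name follows the marker on the same line.
        return ("start_group", text[len(START_GROUP_MARKER):].strip())
    if text.startswith(END_GROUP_MARKER):
        return ("end_group", "")
    return ("plain", text)

# Sample source lines: a group containing one documented variable.
SAMPLE = [
    "#{ Tuning constants",
    "#: Maximum number of retries.",
    "MAX_RETRIES = 3",
    "#}",
]
KINDS = [classify_comment_sketch(line)[0]
         for line in SAMPLE if line.startswith("#")]
```

In the sample, MAX_RETRIES would belong to the group "Tuning constants" and take its docstring from the preceding "#: " comment.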
      

CONTROL_FLOW_KEYWORDS

A list of the control flow keywords. If a line begins with one of these keywords, then it should be handled by process_control_flow_line.
Value:
['if', 'elif', 'else', 'while', 'for', 'try', 'except', 'finally']
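
The routing rule for control flow lines can be sketched as a lookup on the first token of a line. The keyword list is copied from the value above. The mapping of other keywords to handlers is an assumption inferred from the handler names on this page, and the real dispatch logic may differ.

```python
CONTROL_FLOW_KEYWORDS = ["if", "elif", "else", "while", "for",
                         "try", "except", "finally"]

def pick_handler_sketch(first_token):
    """Name the handler a line might be routed to, by its first token."""
    if first_token in CONTROL_FLOW_KEYWORDS:
        return "process_control_flow_line"
    # Assumption: other keywords map to the handlers listed on this page.
    other = {"import": "process_import", "from": "process_from_import",
             "def": "process_funcdef", "class": "process_classdef",
             "del": "process_del"}
    # None means this sketch does not cover the line type.
    return other.get(first_token)
```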