feat: microgen - add AST analysis utilities #2292

chalmerlowe · 2025-09-15T13:35:13Z

Follows PR #2291 and should be merged only after that PR is merged.

This PR adds the CodeAnalyzer class, which is a node visitor that traverses an AST and extracts structured information about classes, methods, and their arguments.

Includes:

generate.py file
the CodeAnalyzer class
several helper functions

Migrates the empty __init__.py file to the microgenerator package.

Introduces the CodeAnalyzer class and helper functions for parsing Python code using the ast module. This provides the foundation for understanding service client structures.
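As a rough illustration of the approach described above, here is a minimal sketch of an `ast.NodeVisitor` that collects classes, methods, and argument annotations. `MiniAnalyzer`, the `structure` list, and the dictionary keys other than `class_name` are hypothetical names chosen for this example; the real CodeAnalyzer in generate.py is likely more thorough.

```python
import ast

# Sample input; the class and method names here are illustrative only.
SOURCE = '''
class Client:
    def get_dataset(self, dataset_id: str, timeout: float = 10.0):
        pass

    def list_tables(self, dataset_id):
        pass
'''


class MiniAnalyzer(ast.NodeVisitor):
    """Hypothetical, simplified stand-in for the CodeAnalyzer in this PR."""

    def __init__(self):
        self.structure = []
        self._current_class_info = None

    def visit_ClassDef(self, node):
        class_info = {"class_name": node.name, "methods": []}
        self.structure.append(class_info)
        # Track the enclosing class while visiting its body, then restore it.
        previous = self._current_class_info
        self._current_class_info = class_info
        self.generic_visit(node)
        self._current_class_info = previous

    def visit_FunctionDef(self, node):
        # Only record functions defined inside a class body.
        if self._current_class_info is not None:
            args = [
                {
                    "name": arg.arg,
                    "type": ast.unparse(arg.annotation) if arg.annotation else None,
                }
                for arg in node.args.args
            ]
            self._current_class_info["methods"].append(
                {"method_name": node.name, "args": args}
            )
        # generic_visit is deliberately not called here, so definitions
        # nested inside a function body are ignored in this sketch.


analyzer = MiniAnalyzer()
analyzer.visit(ast.parse(SOURCE))
```

After the visit, `analyzer.structure` holds one dictionary per class, each with a list of its methods and their positional arguments. `ast.unparse` requires Python 3.9 or later.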

chalmerlowe · 2025-09-18T12:42:51Z

For clarity:

The GitHub Actions are being used to help ensure that unit tests pass.
The KOKORO tests are failing. This is a known problem and will be dealt with in a separate PR. It should not affect merging into the autogen dev branch.

Linchin · 2025-09-18T19:16:57Z

scripts/microgenerator/generate.py

+
+    def _add_attribute(self, attr_name: str, attr_type: str | None = None):
+        """Adds a unique attribute to the current class context."""
+        if self._current_class_info:


Should we raise an error if self._current_class_info turns out to be false, instead of doing nothing silently?

chalmerlowe · 2025-09-19

The current structure of the code is this:

    def visit_Assign(self, node):
        # ... other logic ...
        if self._current_class_info:
            # ... determine attr_name, attr_type ...
            self._add_attribute(attr_name, attr_type)
        self.generic_visit(node)

    def visit_AnnAssign(self, node):
        # ... other logic ...
        if self._current_class_info:
            # ... determine attr_name, attr_type ...
            self._add_attribute(attr_name, attr_type)
        self.generic_visit(node)

Given this structure:

The if self._current_class_info: check inside _add_attribute is redundant. It
will always be True because the callers guarantee it.

There is no need to raise an error for a missing class context within _add_attribute, as the situation is prevented by the calling methods.

I removed the check inside of _add_attribute. This makes _add_attribute cleaner and more focused, relying on the contract established by its callers. I updated the docstring to reflect this assumption.
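For readers who want to see the caller contract end to end, the following is a small runnable sketch, not the code from generate.py. `AttributeCollector`, the `classes` list, and the dictionary layout are assumptions made for illustration; only `_add_attribute`, `_current_class_info`, `visit_Assign`, and `visit_AnnAssign` come from the discussion above.

```python
import ast


class AttributeCollector(ast.NodeVisitor):
    """Hypothetical visitor showing the guard living in the callers."""

    def __init__(self):
        self.classes = []
        self._current_class_info = None

    def visit_ClassDef(self, node):
        info = {"class_name": node.name, "attributes": []}
        self.classes.append(info)
        previous, self._current_class_info = self._current_class_info, info
        self.generic_visit(node)
        self._current_class_info = previous

    def _add_attribute(self, attr_name, attr_type=None):
        """Adds a unique attribute to the current class context.

        Assumes the caller has already verified that a class context exists.
        """
        attributes = self._current_class_info["attributes"]
        if not any(attr["name"] == attr_name for attr in attributes):
            attributes.append({"name": attr_name, "type": attr_type})

    def visit_Assign(self, node):
        # The caller owns the class-context check.
        if self._current_class_info:
            for target in node.targets:
                if isinstance(target, ast.Name):
                    self._add_attribute(target.id)
        self.generic_visit(node)

    def visit_AnnAssign(self, node):
        if self._current_class_info and isinstance(node.target, ast.Name):
            self._add_attribute(node.target.id, ast.unparse(node.annotation))
        self.generic_visit(node)


SOURCE = '''
TIMEOUT = 30

class Job:
    state = "PENDING"
    retries: int = 3
    state = "DONE"
'''

collector = AttributeCollector()
collector.visit(ast.parse(SOURCE))
```

The module-level `TIMEOUT` assignment never reaches `_add_attribute` because both callers check the class context first, and the repeated `state` assignment is recorded once. As a simplification, this sketch would also record plain local variables assigned inside methods.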

Linchin · 2025-09-18T19:28:06Z

scripts/microgenerator/generate.py

+
+            all_class_keys.append(key)
+
+            # Skip filling details if not needed for the dictionary.


This check seems redundant

chalmerlowe · 2025-09-19

Good catch. Removed.

Linchin · 2025-09-18T19:30:30Z

scripts/microgenerator/generate.py

+    # Determine if the path is a file or directory and process accordingly
+    if os.path.isfile(path) and path.endswith(".py"):
+        structure, _, _ = parse_file(path)
+        process_structure(structure)


For my education, why is file_name omitted here?

chalmerlowe · 2025-09-19

Thanks for asking! Seems like a reasonable question.

The list_code_objects() function is designed to return a well-structured dictionary with clear keys, whether it is analyzing a single file or an entire directory.

This library can be run automatically OR interactively. It also has the ability to be run against a single file OR a directory full of files. Looking at both cases:

if os.path.isfile(path) and path.endswith(".py"):

This block executes when the user provides a path to a single Python file.

process_structure(structure) is called without the file_name argument.

Why? Since we are only analyzing one file, any class name found is unique to
that file. There's no need to disambiguate it with the filename in the output
keys. The key in process_structure will just be class_info["class_name"].

elif os.path.isdir(path):

This block executes when the user provides a path to a directory.

The code iterates through all .py files within that directory.

process_structure(structure, file_name=os.path.basename(file_path)) is called
for each file.

Why? When scanning multiple files, it's possible to encounter classes with the
same name in different files. To prevent these from clobbering each other in
the results dictionary and to make the output clear, the file_name is used to
make the key unique. The key in process_structure becomes
f"{class_info['class_name']} (in {file_name})".

In essence: The file_name argument is only provided when processing a directory to
ensure that class names in the output dictionary are unique, even if the same class
name appears in multiple files. When processing a single file, this disambiguation
is not necessary.
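The key-building rule described above can be sketched as follows. This is an illustration under assumptions rather than the code in generate.py: `make_key`, `collect`, and the `structures_by_file` input are hypothetical helpers; only the two key formats are taken from the explanation.

```python
def make_key(class_info, file_name=None):
    """Build the results-dictionary key for one class (hypothetical helper)."""
    if file_name:
        # Directory mode: qualify the class name with its file.
        return f"{class_info['class_name']} (in {file_name})"
    # Single-file mode: the class name alone is already unique.
    return class_info["class_name"]


def collect(structures_by_file, disambiguate):
    """Merge per-file class lists into one dictionary (hypothetical helper)."""
    results = {}
    for file_name, structure in structures_by_file.items():
        for class_info in structure:
            key = make_key(class_info, file_name if disambiguate else None)
            results[key] = class_info
    return results


# Two files that both define a class named Client.
structures = {
    "client.py": [{"class_name": "Client"}],
    "legacy_client.py": [{"class_name": "Client"}],
}

without_file_name = collect(structures, disambiguate=False)
with_file_name = collect(structures, disambiguate=True)
```

Without the file name, the second `Client` overwrites the first and only one entry survives. With it, both entries are kept under distinct keys.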

chalmerlowe added 12 commits September 11, 2025 12:03

chore: removes old proof of concept

b9d4a04

removes old __init__.py

5b4d538

Adds two utility files to handle basic tasks

132c571

Adds a configuration file for the microgenerator

90b224e

Removes unused comment

e071eab

chore: adds noxfile.py for the microgenerator

dc72a98

feat: microgen - adds two init file templates

7318f0b

feat: adds _helpers.py.js template

07910c5

Updates with two usage examples

dc54c99

feat: adds two partial templates for creating method signatures

28de5f8

feat: Add microgenerator __init__.py

c457754

Migrates the empty __init__.py file to the microgenerator package.

feat: Add AST analysis utilities

595e59f

Introduces the CodeAnalyzer class and helper functions for parsing Python code using the ast module. This provides the foundation for understanding service client structures.

chalmerlowe requested review from a team as code owners September 15, 2025 13:35

chalmerlowe requested review from logachev and removed request for a team September 15, 2025 13:35

product-auto-label bot added the "size: l" label (Pull request size is large.) Sep 15, 2025

blunderbuss-gcf bot assigned agrawal-siddharth Sep 15, 2025

product-auto-label bot added the "api: bigquery" label (Issues related to the googleapis/python-bigquery API.) Sep 15, 2025

chalmerlowe mentioned this pull request Sep 15, 2025

feat: microgen - adds source file gathering functions #2293

Merged

1 task

agrawal-siddharth assigned shollyman and unassigned agrawal-siddharth Sep 15, 2025

chalmerlowe changed the title ~~feat: Add AST analysis utilities~~ feat: microgen - add AST analysis utilities Sep 15, 2025

chalmerlowe added this to the µgen PoC milestone Sep 16, 2025

chalmerlowe self-assigned this Sep 16, 2025

Base automatically changed from feat/migrate-init to feat-adds-method-partials September 16, 2025 14:22

chalmerlowe unassigned shollyman Sep 16, 2025

Linchin reviewed Sep 18, 2025

Linchin reviewed Sep 18, 2025

Linchin reviewed Sep 18, 2025

Base automatically changed from feat-adds-method-partials to autogen September 18, 2025 19:33

chalmerlowe added 4 commits September 18, 2025 15:38

Merge branch 'autogen' into feat/add-ast-utilities

da97d1d

updates comment and removes redundant check.

b57aa5c

removes redundant show_attribute and show_methods check

6c5f4b0

Merge branch 'autogen' into feat/add-ast-utilities

9f50afa

Linchin approved these changes Sep 22, 2025

chalmerlowe merged commit 5259312 into autogen Sep 22, 2025
25 checks passed

chalmerlowe deleted the feat/add-ast-utilities branch September 22, 2025 18:28
