Add JImport model for structured Java import declarations #150

Merged
sinha108 merged 1 commit into codellm-devkit:main from tylerstennett:feat/java-import-model
Mar 5, 2026

@tylerstennett
Contributor

These changes introduce the JImport Pydantic model, which captures structured Java import metadata (path, is_static, is_wildcard) that previous versions discarded, and add a backward-compatible import_declarations field to JCompilationUnit. They also bump the codeanalyzer-java version from 2.3.3 to 2.3.7.

Motivation and Context

Previously, Java imports were stored as plain strings (List[str]), losing information about whether an import was static or used wildcard syntax. When two imports share the same path (e.g., import static Foo.bar and import Foo.bar.*), the string-only representation makes them indistinguishable. The new JImport model preserves this metadata.
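
To illustrate the ambiguity, here is a minimal sketch of the three fields. It uses a stdlib dataclass as a stand-in so it runs without dependencies; the actual JImport is a Pydantic model and may differ in defaults and validation. The to_source helper is hypothetical and not part of this PR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JImport:
    """Stand-in for the PR's Pydantic model; only the three named fields."""

    path: str
    is_static: bool = False
    is_wildcard: bool = False


def to_source(imp: JImport) -> str:
    """Render a declaration back to Java source (hypothetical helper)."""
    static = "static " if imp.is_static else ""
    wildcard = ".*" if imp.is_wildcard else ""
    return f"import {static}{imp.path}{wildcard};"


# The two declarations from the example above share the path "Foo.bar".
static_import = JImport(path="Foo.bar", is_static=True)
wildcard_import = JImport(path="Foo.bar", is_wildcard=True)

# The legacy List[str] view keeps only the path, so both entries look identical.
legacy_view = [static_import.path, wildcard_import.path]
```

With only legacy_view the two entries cannot be told apart, while the structured objects compare as different and render back to distinct import statements.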

How Has This Been Tested?

Model-layer tests (tests/models/java/test_java_models.py):

  • test_jcompilationunit_supports_legacy_import_list: verifies that plain-string imports are accepted and that import_declarations is automatically constructed with correct defaults (is_static=False, is_wildcard=False).
  • test_jcompilationunit_supports_structured_import_list: verifies that dict-based structured imports populate both imports and import_declarations, preserving is_static and is_wildcard flags.
  • test_jcompilationunit_uses_imports_when_import_declarations_is_empty: covers the edge case where import_declarations is present but empty, ensuring the validator falls back to the imports field.
  • test_jcompilationunit_prefers_non_empty_import_declarations: confirms that when both fields are provided and import_declarations is non-empty, the structured data takes precedence and imports is re-derived from it.
  • test_jcompilationunit_imports_round_trip_through_dump_apis: exercises model_dump() and model_dump_json() followed by re-parsing, ensuring no data is lost or mutated across serialization boundaries.
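
The precedence rules these tests describe can be sketched as a single normalization function. This is an approximation under stated assumptions, not the validator from the PR: normalize_imports and _coerce are hypothetical names, and a stdlib dataclass stands in for the Pydantic JImport model.

```python
from dataclasses import asdict, dataclass
from typing import Dict, List, Optional, Sequence, Tuple, Union


@dataclass(frozen=True)
class JImport:
    """Stand-in for the PR's Pydantic model."""

    path: str
    is_static: bool = False
    is_wildcard: bool = False


RawImport = Union[str, Dict[str, object], JImport]


def _coerce(entry: RawImport) -> JImport:
    """Accept a legacy string, a structured dict, or a ready-made JImport."""
    if isinstance(entry, JImport):
        return entry
    if isinstance(entry, str):
        # Legacy schema: only the path is known, so both flags default to False.
        return JImport(path=entry)
    return JImport(
        path=str(entry["path"]),
        is_static=bool(entry.get("is_static", False)),
        is_wildcard=bool(entry.get("is_wildcard", False)),
    )


def normalize_imports(
    imports: Optional[Sequence[RawImport]] = None,
    import_declarations: Optional[Sequence[RawImport]] = None,
) -> Tuple[List[str], List[JImport]]:
    """Return (imports, import_declarations) kept consistent with each other."""
    # A non-empty import_declarations wins; an empty or missing one falls back
    # to the imports field.
    source = import_declarations if import_declarations else (imports or [])
    declarations = [_coerce(entry) for entry in source]
    # The string list is always re-derived from the structured declarations.
    return [d.path for d in declarations], declarations
```

For example, normalize_imports(["java.util.List"]) yields one declaration with both flags False, and passing both arguments with a non-empty import_declarations ignores the contents of imports. Because the string list is always re-derived, the two fields cannot drift apart in this sketch.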

Codeanalyzer-layer tests (tests/analysis/java/test_jcodeanalyzer.py):

  • test_init_japplication_supports_legacy_import_schema: round-trips a v2.3.6-style JSON payload through _init_japplication and checks that both import fields are correctly populated.
  • test_init_japplication_supports_structured_import_schema: same flow with a v2.3.7-style structured payload, verifying is_static=True is preserved.
  • test_check_existing_analysis_file_level_accepts_legacy_import_schema / ..._structured_import_schema: writes temporary analysis.json files in each format and asserts that the cache-compatibility check passes for both.
  • test_check_existing_analysis_file_level_rejects_invalid_json: writes malformed JSON and confirms the checker returns False instead of raising, exercising the new JSONDecodeError/OSError handling.
  • test_init_codeanalyzer_reuses_legacy_cache_when_compatible: constructs a JCodeanalyzer with a pre-written legacy cache file and patches subprocess.run to confirm the backend is never invoked, validating end-to-end cache reuse with the old schema.
  • test_source_analysis_imports_disambiguate_static_and_wildcard: analyses a snippet containing import static Foo.bar and import Foo.bar.* and asserts that the two declarations are distinguishable by their is_static / is_wildcard flags despite sharing the same path (this previously would have been impossible).

Breaking Changes

No. The existing imports: List[str] field on JCompilationUnit is preserved and continues to be populated automatically. Users who only read imports will see no change. The new import_declarations: List[JImport] field is additive. Cached analysis.json files using the old string-based schema remain loadable without regeneration.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update

Checklist

  • I have read the Codellm-Devkit Documentation
  • My code follows the repository's style guidelines
  • New and existing tests pass locally
  • I have added appropriate error handling
  • I have added or updated documentation as needed

Additional context

  • The treesitter_java.py get_all_imports method is annotated with a note that it is currently unused and does not return JImport objects. It should be updated if it gets wired into a real code path.
  • check_existing_analysis_file_level now catches json.JSONDecodeError and OSError instead of letting a corrupt or unreadable cache file crash the initialization flow. Returning False in these cases matches the method's boolean return type and lets the analysis be regenerated.
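
The error handling described in the last note can be sketched roughly as follows. The function name cache_is_loadable is hypothetical, and the check here only confirms that the file parses to a JSON object; the real method presumably also inspects the cached analysis contents.

```python
import json
import tempfile
from pathlib import Path


def cache_is_loadable(path: Path) -> bool:
    """Return False, rather than raising, when the cache cannot be read or parsed."""
    try:
        with open(path, "r", encoding="utf-8") as handle:
            data = json.load(handle)
    except (json.JSONDecodeError, OSError):
        # Malformed JSON, or a missing or unreadable file: report "not usable"
        # so the caller can regenerate the analysis.
        return False
    # Assumption for this sketch: a usable cache is a JSON object at top level.
    return isinstance(data, dict)


with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "analysis.json"
    good.write_text('{"ok": true}', encoding="utf-8")
    bad = Path(tmp) / "broken.json"
    bad.write_text("{not valid json", encoding="utf-8")
    missing = Path(tmp) / "missing.json"

    results = {
        "good": cache_is_loadable(good),
        "bad": cache_is_loadable(bad),
        "missing": cache_is_loadable(missing),
    }
```

A missing file raises FileNotFoundError, which is a subclass of OSError, so it takes the same False path as malformed JSON.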

@sinha108 merged commit 6090cbb into codellm-devkit:main on Mar 5, 2026