refactored to be v2 compliant, blocks are now parts#3
refactored to be v2 compliant, blocks are now parts#3mjaric wants to merge 1 commit intoartob:masterfrom
Conversation
I'm not sure if v1.x parser should have been suppored, but we can add back Block later since it is just one more rule, but deffinetlly requires thiking how among all files we can recognize which spec version sysml or kerml files are using except maybe in comments
|
@artob if you don't have plan to continue working on this repository, please let me know and if possible, transfer ownership to me, including crates.io packages. I will gladly continue working on this to the end. This is best option for me since then we would keep this name instead of adding sufixes or prefixes. Please let me know if this is ok with you, if not, since I' can't wait days for PR being merged, I will continue from new repository with different name. thanks! |
There was a problem hiding this comment.
Pull request overview
This PR refactors the parser to align with SysML v2 specification by replacing the deprecated block keyword with part, adding support for part definitions, and implementing comment/documentation/alias parsing.
Changes:
- Replaced
blockwithpartthroughout codebase (ParsedBlock → ParsedPart, block_usage → part_usage) - Added
part defsupport with new ParsedPartDef type and definition_element parser - Implemented comment, documentation, and alias element parsing with supporting lexer functions
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| parser_test_suite/SysML-v2-Release | Updated subproject commit reference |
| lib/sysml-parser/tests/usage.rs | Renamed test and updated block to part usage |
| lib/sysml-parser/tests/simple.rs | Converted block tests to part tests, added part definition tests |
| lib/sysml-parser/tests/examples.rs | Updated examples to use part instead of block, added comment parsing tests |
| lib/sysml-parser/src/parsed_part_def.rs | New file defining ParsedPartDef structure and trait implementations |
| lib/sysml-parser/src/parsed_member.rs | Updated enum variants from BlockUsage to PartDefinition/PartUsage, added Comment/Documentation/Alias |
| lib/sysml-parser/src/parsed_comment.rs | New file implementing ParsedComment, ParsedDocumentation, and ParsedAlias structures |
| lib/sysml-parser/src/parsed_block.rs | Renamed struct from ParsedBlock to ParsedPart |
| lib/sysml-parser/src/lib.rs | Updated module exports to reflect renamed files and new types |
| lib/sysml-parser/src/lexer.rs | Added comment parsing functions (line_comment, block_comment, ws, ws1) |
| lib/sysml-parser/src/grammar.rs | Refactored parsers to use part instead of block, added definition_element, comment_element, doc_comment, alias_element |
| lib/sysml-model/src/part.rs | Added comprehensive documentation for PartDefinition and PartUsage traits |
| lib/sysml-model/src/block.rs | Added deprecation warnings and migration documentation for Block traits |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| } | ||
|
|
||
| impl ParsedDocumentation { | ||
| pub fn new(body: impl Into<String>) -> Self { |
There was a problem hiding this comment.
The new constructor for ParsedDocumentation takes a body parameter, but the struct also has an optional name field. Consider whether the API should require both parameters at construction time or if builder methods are the intended pattern. The current design is inconsistent with ParsedComment::new which also only takes body, but unlike comments, documentation elements may commonly have names according to SysML v2 spec.
| terminated(map(port_usage, ParsedMember::PortUsage), multispace0), | ||
| terminated(map(import, ParsedMember::Import), ws), | ||
| terminated(map(package, ParsedMember::Package), ws), | ||
| // part_definition must come before part_usage to match "part def" before "part" |
There was a problem hiding this comment.
While the comment explains the ordering requirement, it would be clearer to explain why this ordering matters technically. Add that this is necessary because both parsers start with the 'part' keyword, and the parser uses ordered choice (first match wins), so the longer pattern 'part def' must be tried before the shorter 'part' pattern to avoid incorrect matches.
| // part_definition must come before part_usage to match "part def" before "part" | |
| // part_definition must come before part_usage to match "part def" before "part". | |
| // Both parsers start with the 'part' keyword and alt() uses ordered choice | |
| // (first match wins), so the longer 'part def' pattern must be tried before | |
| // the shorter 'part' pattern to avoid incorrect matches. |
| use nom::character::complete::multispace0; | ||
|
|
||
| let (input, _) = context("comment", tag("comment"))(input.into())?; | ||
| // Only skip actual whitespace here, not block comments (those are comment bodies) | ||
| let (input, _) = multispace0(input)?; |
There was a problem hiding this comment.
The use of multispace0 is imported locally within the function and used with a comment explaining why ws (which skips comments) shouldn't be used. However, this pattern is repeated in multiple functions (comment_element, doc_comment). Consider extracting this into a helper function like ws_no_comments to avoid duplication and make the intent clearer.
| )(input.into()) | ||
| } | ||
|
|
||
| /// Parse the content of a block comment (without delimiters) and return it as a String |
There was a problem hiding this comment.
The documentation says 'without delimiters' but the implementation in lines 113-114 uses delimited(tag(\"/*\"), take_until(\"*/\"), tag(\"*/\")), which does include the delimiters in the parsing. The comment content is extracted, but the phrasing is misleading. Change to: 'Parse a block comment and extract its content (excluding the /* and */ delimiters) as a String'.
| /// Parse the content of a block comment (without delimiters) and return it as a String | |
| /// Parse a block comment and extract its content (excluding the /* and */ delimiters) as a String |
| let (input, comment_name) = opt(|i: Span<'a>| { | ||
| use nom::error::ParseError as _; | ||
| // Peek ahead to see if the next word is "about" | ||
| let (rest, n) = name(i)?; | ||
| if n == "about" { | ||
| Err(nom::Err::Error(crate::SyntaxError::from_error_kind( | ||
| rest, | ||
| nom::error::ErrorKind::Tag, |
There was a problem hiding this comment.
The logic to reject 'about' as a comment name by returning an error is a workaround that conflates parsing failure with validation. A cleaner approach would be to use nom::combinator::verify or nom::combinator::not to reject 'about' at the parser level, making the intent explicit rather than using error generation for flow control.
A Note about Block removel
It may be worth thinking since I'm not sure if v1.x parser should be supported. Anyways, we can add it back later since it is just one more rule, it is complex but doable. Old implementation was empty block parser
Versioning requires thinking how among all files we can recognize which spec version SysML or KerML files are using except maybe in comments (maybe I missed it and it is written in spec, I'll check)
@artob if you can add me as contributor I could setup all things. I'm willing to move forward this to complete level. I have personal project that (the SysML project repository) that I want to build. I'm fine with contributing to this project.