Conversation
phschaad
left a comment
There was a problem hiding this comment.
Some questions and minor corrections - after that will take closer look at remaining parts
Co-authored-by: Philipp Schaad <schaad.phil@gmail.com>
| import typing | ||
|
|
||
|
|
||
| class TaskletType(Enum): |
|
|
||
|
|
||
| def _token_split(string_to_check: str) -> Set[str]: | ||
| """ |
There was a problem hiding this comment.
not the correct (Sphinx) docstring format we use in the rest of dace (apply everywhere)
| def _token_split(string_to_check: str) -> Set[str]: | ||
| """ | ||
| Splits a string into a set of tokens, keeping delimiters, and returns all tokens. | ||
| The input string is split on empty space and brackets (` `, `(`, `)`, `[`, `]`). |
There was a problem hiding this comment.
double backticks (``) for code comments
| if isinstance(node.op, ast.USub): | ||
| return f"-{node.operand.value}" | ||
| elif isinstance(node.op, ast.UAdd): | ||
| return str(node.operand.value) |
| return lhs_str, rhs_str | ||
|
|
||
|
|
||
| def _extract_non_connector_syms_from_tasklet(node: dace.nodes.Tasklet, state) -> typing.Set[str]: |
There was a problem hiding this comment.
missing type hints and direct inclusion of typing.
Since dace is Python 3.10+, use set[str] instead of typing
| n_in = len(in_conns) | ||
| n_out = len(out_conns) | ||
|
|
||
| assert n_out <= 1, "Only support tasklets with at most 1 output in this pass" |
| assert isinstance(node, dace.nodes.Tasklet) | ||
| code: CodeBlock = node.code | ||
| assert code.language == dace.dtypes.Language.Python | ||
| code_str: str = code.as_string |
There was a problem hiding this comment.
it's inefficient that you already have an AST object and you convert it to a string and then reparse it. Explain your reasoning please.
| lhs_data = state.sdfg.arrays[lhs_data_name] | ||
|
|
||
| # Assignment operators it will return op <- `=` and always populate `rhs1` | ||
| if code_str == f"{lhs} = {rhs}" or code_str == f"{lhs} = {rhs};": |
There was a problem hiding this comment.
I would not trust an expression like this (e.g., what about a = (b)?). Why not use isinstance(ast.Assign/AnnAssign)?
| info_dict.update({"type": ttype, "constant1": c1, "constant2": None, "op": op}) | ||
| return info_dict | ||
|
|
||
| raise NotImplementedError("Unhandled case in detect tasklet type") |
There was a problem hiding this comment.
Shouldn't this just return a cannot-classify-tasklet or "other" TaskletType result?
| return node | ||
|
|
||
|
|
||
| def rewrite_boolean_functions_to_boolean_ops(src: str) -> str: |
alexnick83
left a comment
There was a problem hiding this comment.
Very well written, but I feel that it is a bit overengineered considering the currently limited supported cases. Please find some comments and questions below, but I will probably need to have another look.
| Each pattern represents a specific combination of input types (arrays, scalars, symbols) | ||
| and operation types (assignment, binary operation, unary operation). | ||
|
|
||
| Note: inside a tasklet you always have scalars, it is about he connector types |
There was a problem hiding this comment.
| Note: inside a tasklet you always have scalars, it is about he connector types | |
| Note: inside a tasklet you always have scalars, it is about the connector types |
| ARRAY_ARRAY_ASSIGNMENT: Direct array-to-array copy (e.g., a = b) | ||
| ARRAY_SYMBOL_ASSIGNMENT: Symbol/constant assignment to array (e.g., a = sym) | ||
| ARRAY_SCALAR_ASSIGNMENT: Scalar variable assignment to array (e.g., a = scl) | ||
| SCALAR_ARRAY_ASSIGNMENT: Array assignment to scalar variable (e.g., scl = a) |
There was a problem hiding this comment.
Is this a valid case? Even when, e.g., you assign A[i] to a scalar b, isn't the source connector a scalar?
There was a problem hiding this comment.
I see now, way below (line 600+), that you are looking at the data type, not the connector type. Maybe this should be clarified in the note above (line 26).
| ---------- | ||
| string_to_check : str | ||
| The string to split into tokens. | ||
| pattern_str : str |
There was a problem hiding this comment.
But the parameter is not actually kept in the signature
| # Split while keeping delimiters | ||
| tokens = re.split(r'(\s+|[()\[\]])', string_to_check) | ||
|
|
||
| # Replace tokens that exactly match src |
There was a problem hiding this comment.
Comment doesn't make sense (copied from token_match?)
| if isinstance(node, ast.Constant): | ||
| return str(node.value) | ||
| elif isinstance(node, ast.UnaryOp) and isinstance(node.operand, ast.Constant): | ||
| if isinstance(node.op, ast.USub): |
There was a problem hiding this comment.
Do we need the bitwise inversion? In that case, I would evaluate the result and convert it to a string.
|
|
||
| found = op | ||
|
|
||
| code_rhs = src.split(" = ")[-1].strip() |
There was a problem hiding this comment.
Wouldn't it be better to convert to AST, find the assignment, and then take the value?
| """ | ||
| tdict = dict() | ||
| for ie in state.in_edges(tasklet): | ||
| if ie.data is not None: |
There was a problem hiding this comment.
Maybe worth testing what happens when the input comes from another tasklet ... we used to have some "hidden" scalar descriptors for that, but I am not up to date on that.
| For function calls, uses AST parsing to extract arguments in order. | ||
| For operators, splits the code by the operator symbol. | ||
| """ | ||
| code_rhs = code_str.split(" = ")[-1].strip() |
| n_out = len(out_conns) | ||
|
|
||
| assert n_out <= 1, "Only support tasklets with at most 1 output in this pass" | ||
| lhs = next(iter(node.out_connectors.keys())) if n_out == 1 else None |
There was a problem hiding this comment.
Very nitpicky, but why not just lhs = out_conns[0] if n_out == 1 else None?
|
|
||
| assert n_out == 1 | ||
|
|
||
| if n_in == 1: |
There was a problem hiding this comment.
Given the many restrictions on supported tasklets, the following feels somewhat too long. If I understand correctly, you accept codes with a single expression (assignment). The RHS may be either an operand or a unary/binary operator. The operands can only be connector names, symbols, or other expressions that evaluate to a constant. You also have a special case where the binary operation is applied to the same connector and interpreted as a unary op. Are you going to support more generalized tasklets in the future?
The analysis returns a dictionary containing the left- and right-hand-side operands of a given tasklet. It returns the operator (or function), the constants involved, and whether they are data (array or scalar) or constants (hardcoded constants or symbols), etc.
I have implemented the analysis for the auto-vectorization to determine whether to use vector-vector intrinsics or vector-scalar intrinsics, etc. It only accepts tasklets with at most 2 rhs operands. SplitTasklets that I have merged before can be used to ensure that all supported operators and a subset of functions (such as min, sin, cos, etc.) are used. Python tasklets have 2 rhs operands.
The analysis was also useful for a student trying to implement blocked-FP numbers in DaCe, so I think it will have further use beyond auto-vectorization.