31d Defining behaviour: Container types and assignment#104

Open
ngjunsiang wants to merge 6 commits into main from release

@ngjunsiang ngjunsiang commented Oct 26, 2022

Hear me out: what if containers had types?

Suppose every container had a type attribute. A Student object would have Student.type == "Student". An Array might have type ARRAY, or ARRAY[1:3] OF INTEGER.

I imagine the Container protocol would look something like this:
https://github.com/nyjc-computing/pseudo-9608/blob/3bdf489b985911caa08996a570ed59d90a1d6190/pseudocode/lang/object.py#L146-L159

Why do we want this, or even need this? I'm looking at the part of the 9608 pseudocode reference that requires support for statements like these:

DECLARE NoughtsAndCrosses : ARRAY[1:3,1:3] OF STRING
DECLARE SavedGame : ARRAY[1:3,1:3] OF STRING
DECLARE i : INTEGER
DECLARE j : INTEGER

FOR i <- 1 TO 3
    FOR j <- 1 TO 3
        IF MOD(i, 2) = MOD(j, 2)
          THEN
            NoughtsAndCrosses[i, j] <- "X"
          ELSE
            NoughtsAndCrosses[i, j] <- "O"
        ENDIF
    ENDFOR
ENDFOR

SavedGame <- NoughtsAndCrosses

FOR i <- 1 TO 3
    FOR j <- 1 TO 3
        OUTPUT SavedGame[i, j]
    ENDFOR
ENDFOR

TYPE Student
    DECLARE Surname : STRING
    DECLARE FirstName : STRING
    DECLARE YearGroup : INTEGER
ENDTYPE

DECLARE Pupil1 : Student
DECLARE Pupil2 : Student
Pupil1.Surname <- "Johnson"
Pupil1.FirstName <- "Leroy"
Pupil1.YearGroup <- 6
Pupil2 <- Pupil1

OUTPUT Pupil2.Surname
OUTPUT Pupil2.FirstName
OUTPUT Pupil2.YearGroup

tl;dr Pseudo needs to support Array and Object assignments. Currently, this works:

$ pseudo main.pseudo 
X
O
X
O
X
O
X
O
X
Johnson
Leroy
6

We lean heavily on Python's object model to make this work; internally, within the global frame, the Pupil1 and Pupil2 names point to the same Object, while the NoughtsAndCrosses and SavedGame names point to the same Array.
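
To make the aliasing concrete, here is a minimal Python sketch. The `Array` class and the `frame` dict are hypothetical stand-ins for illustration, not the project's actual classes; the point is only that rebinding a name leaves both names pointing at one object.

```python
class Array:
    """Toy array keyed by (row, column) tuples; a stand-in, not the real Array."""

    def __init__(self):
        self.data = {}


# Hypothetical global frame: a plain dict mapping names to containers.
frame = {
    "NoughtsAndCrosses": Array(),
    "SavedGame": Array(),
}

# Naive assignment: rebind the name, so both names share one Python object.
frame["SavedGame"] = frame["NoughtsAndCrosses"]

# A later write through one name is visible through the other.
frame["NoughtsAndCrosses"].data[(2, 2)] = "X"
same_object = frame["SavedGame"] is frame["NoughtsAndCrosses"]
seen_through_alias = frame["SavedGame"].data[(2, 2)]
print(same_object, seen_through_alias)
```

As long as nothing is mutated after the assignment, this sharing is invisible, which is why the program above appears to work.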

But when we change the code a little:

DECLARE NoughtsAndCrosses : ARRAY[1:3,1:3] OF STRING
DECLARE SavedGame : ARRAY[1:4,1:4] OF STRING
DECLARE i : INTEGER
DECLARE j : INTEGER

FOR i <- 1 TO 3
    FOR j <- 1 TO 3
        IF MOD(i, 2) = MOD(j, 2)
          THEN
            NoughtsAndCrosses[i, j] <- "X"
          ELSE
            NoughtsAndCrosses[i, j] <- "O"
        ENDIF
    ENDFOR
ENDFOR

SavedGame <- NoughtsAndCrosses
NoughtsAndCrosses[2, 2] <- "O"

FOR i <- 1 TO 3
    FOR j <- 1 TO 3
        OUTPUT SavedGame[i, j]
    ENDFOR
ENDFOR

TYPE Student
    DECLARE Surname : STRING
    DECLARE FirstName : STRING
    DECLARE YearGroup : INTEGER
ENDTYPE

DECLARE Pupil1 : Student
DECLARE Pupil2 : Student
Pupil1.Surname <- "Johnson"
Pupil1.FirstName <- "Leroy"
Pupil1.YearGroup <- 6
Pupil2 <- Pupil1
Pupil1.FirstName <- "LEEROOOOYYYYYY"

OUTPUT Pupil2.Surname
OUTPUT Pupil2.FirstName
OUTPUT Pupil2.YearGroup

We get this:

$ pseudo main.pseudo 
X
O
X
O
O
O
X
O
X
Johnson
LEEROOOOYYYYYY
6

That's not quite how things are supposed to work in Pseudo. DECLAREing a name means that memory is set aside for its data, so Array and Object assignment should copy the data rather than share a reference to it. After the assignment, mutations to Pupil1 and NoughtsAndCrosses should not be reflected in the copies.

Besides, Pseudo does not support assignment between arrays of different sizes, yet the assignment to SavedGame (declared as ARRAY[1:4,1:4]) went through without complaint.

We'll fix all this in this chapter.
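
As a rough sketch of the behaviour we are aiming for (not the implementation this PR ends up with), assignment could check that the two arrays agree on shape and element type, then copy the data. `ToyArray` and `assign_array` are hypothetical names used only for this illustration.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class ToyArray:
    """Hypothetical array: index ranges, element type, and a cell dict."""
    ranges: tuple  # e.g. ((1, 3), (1, 3)) for ARRAY[1:3,1:3]
    elemtype: str
    data: dict = field(default_factory=dict)


def assign_array(target: ToyArray, source: ToyArray) -> None:
    """Copy source's cells into target, rejecting mismatched array types."""
    if (target.ranges, target.elemtype) != (source.ranges, source.elemtype):
        raise TypeError("cannot assign arrays with different ranges or types")
    # Deep copy, so later writes to source do not show up in target.
    target.data = copy.deepcopy(source.data)


board = ToyArray(((1, 3), (1, 3)), "STRING", {(2, 2): "X"})
saved = ToyArray(((1, 3), (1, 3)), "STRING")
assign_array(saved, board)
board.data[(2, 2)] = "O"  # mutate the original after the assignment
print(saved.data[(2, 2)])  # the copy still holds the old value

bigger = ToyArray(((1, 4), (1, 4)), "STRING")
try:
    assign_array(bigger, board)
    rejected = False
except TypeError:
    rejected = True
print(rejected)
```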

@ngjunsiang

Along the way we fix a little bug in error reporting: In the scanning phase, tokens are just strings and not full-fledged Tokens yet, and therefore do not have line and column information. The scanner also would not have returned the list of scanned lines yet.

@ngjunsiang

Designing container types

How would this data go through our execution pipeline?

  1. The parser picks up the name, index ranges and element type, bundling them into a Declare/DeclareStmt
  2. The resolver declares the name and type (ARRAY) in the frame, and uses the typesystem to create an array typedvalue with the typedvalue of the member elements
  3. The interpreter ... uses the array

This means that:

  • The TypedValues are already "baked in" at the resolver stage. This makes them hard to change at runtime, since the interpreter isn't handling any declarations.
  • Metadata for the array (its index ranges and element type) is contained in the Declare Expr, then passed to the Array. But the TypedValue for the Array has no knowledge of the metadata, making typechecking difficult for the resolver unless it carries out the instantiation.

We need clearer separation of responsibility here.

@ngjunsiang

Design draft

Intention: The resolver should only be responsible for checking the logic of operations. It should not be using up memory bringing Containers into actual existence; at most its work would be done in the typesystem to register custom types.

The interpreter is responsible for actually instantiating containers.
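
For a sense of what "instantiating" might involve on the interpreter side, here is a small sketch that expands index ranges into actual cells. The function name `instantiate_array` and the dict-of-tuples storage are assumptions made for this example, not the project's design.

```python
from itertools import product


def instantiate_array(ranges, default=None):
    """Allocate one cell per index combination; bounds are inclusive."""
    axes = [range(lo, hi + 1) for lo, hi in ranges]
    return {index: default for index in product(*axes)}


# ARRAY[1:3,1:3] expands to nine cells, from (1, 1) to (3, 3).
cells = instantiate_array(((1, 3), (1, 3)))
print(len(cells))
```

This is the memory-consuming step that the resolver should never need to perform.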

For the resolver to still do its job without bringing Arrays and Objects into existence, it must have access to array metadata, without relying on the Declare Expr sticking around after that.

The logical place to stash this metadata is in the TypedValue. That means, in addition to the type and value attributes, TypedValues will now pick up a metadata attribute as well.

@ngjunsiang

ngjunsiang commented Oct 26, 2022

The new TypedValue

So ... what is a TypedValue exactly?

  • It serves as a wrapper for PyLiterals and PseudoValues, which are never stored directly in a Frame or Container, but always wrapped in a TypedValue
  • It contains type information for its contents, which the resolver uses to check the logic of operations so that the interpreter does not need to carry out runtime type-checking for them again.

In our new design, the TypedValue must still be able to serve these purposes without actually holding an Array or Object (remember that those will only be instantiated in the interpreter). That means the new TypedValue's metadata attribute must hold:

  • array index ranges and element types
  • object members and types
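
A hedged sketch of how that could look: the class names `ArrayMetadata` and `ObjectMetadata`, their fields, and the `can_assign` helper are all assumptions for illustration. The idea is that the resolver can compare two declarations using metadata alone, with no Array or Object in existence.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional, Tuple, Union


@dataclass
class ArrayMetadata:
    """Hypothetical: inclusive index ranges plus the element type name."""
    ranges: Tuple[Tuple[int, int], ...]
    elemtype: str


@dataclass
class ObjectMetadata:
    """Hypothetical: member name -> type name."""
    members: Dict[str, str] = field(default_factory=dict)


@dataclass
class TypedValue:
    type: str
    value: Any = None  # stays None until the interpreter instantiates
    metadata: Optional[Union[ArrayMetadata, ObjectMetadata]] = None


def can_assign(target: TypedValue, source: TypedValue) -> bool:
    """Resolver-side check: same type name and identical metadata."""
    return target.type == source.type and target.metadata == source.metadata


board = TypedValue("ARRAY", metadata=ArrayMetadata(((1, 3), (1, 3)), "STRING"))
saved = TypedValue("ARRAY", metadata=ArrayMetadata(((1, 3), (1, 3)), "STRING"))
bigger = TypedValue("ARRAY", metadata=ArrayMetadata(((1, 4), (1, 4)), "STRING"))
print(can_assign(saved, board), can_assign(bigger, board))
```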

@ngjunsiang

A quick prototype of the *Metadata classes and the TypedValue's metadata attribute:

https://github.com/nyjc-computing/pseudo-9608/blob/45c8043ea5be497f44d600bda76edf81a1a26fb6/pseudocode/lang/object.py#L54-L71

@ngjunsiang

ngjunsiang commented Oct 27, 2022

Once I comment out the line, the code fails: there is already code that relies on value being the second argument. Also, if we want both default arguments and __slots__, we'll have to write __init__ ourselves. With that done, we're back to working code, and TypedValue now has an (empty) metadata attribute:

https://github.com/nyjc-computing/pseudo-9608/blob/0e31cb5acd480cbc4e38c219a623ff60f6357931/pseudocode/lang/object.py#L63-L75
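
The linked revision is the real code; the following is only a sketch of the general pattern. A class-level default that shares a name with an entry in `__slots__` raises a ValueError when the class is created, so the defaults have to live in a hand-written `__init__` instead. The attribute names follow the discussion above; everything else is assumed.

```python
class TypedValue:
    """Sketch: slotted wrapper with defaults supplied by __init__."""

    __slots__ = ("type", "value", "metadata")

    def __init__(self, type, value=None, metadata=None):
        # value stays the second positional argument, so existing
        # calls like TypedValue("INTEGER", 6) keep working.
        self.type = type
        self.value = value
        self.metadata = metadata

    def __repr__(self):
        return f"TypedValue({self.type!r}, {self.value!r}, {self.metadata!r})"


tv = TypedValue("INTEGER", 6)
print(tv)

# __slots__ means no per-instance __dict__, so stray attributes are rejected.
try:
    tv.extra = 1
    stray_allowed = True
except AttributeError:
    stray_allowed = False
print(stray_allowed)
```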

@ngjunsiang

ngjunsiang commented Oct 27, 2022

On second thought, I'd rather lose __slots__ and keep the code terse for now:

https://github.com/nyjc-computing/pseudo-9608/blob/c5498ac570085c906083bf96028056362e56728e/pseudocode/lang/object.py#L63-L70
