I have 2 frames that contain network packet arguments (source IP, destination IP and so on), one frame captured at the source, one at the destination. In my python software (now rewriting in haskell), I mapped the 2 dataframes via a hash of the packet.
Which gives:
-- generate a column with a hash of other columns
addHash :: FrameFiltered Packet -> Frame (Record '[PacketHash] )
addHash aframe =
fmap (addHash') (frame)
where
frame = fmap toHashablePacket (ffFrame aframe)
addHash' row = Col (hashWithSalt 0 row) :& RNil
-- here frame1 and frame2 have the same type
mergeTcpConnectionsFromKnownStreams frame1 frame2 =
mergedFrame
where
mergedFrame = innerJoin @'[PacketHash] ( hframe1) ( hframe2)
hframe1 = zipFrames (addHash aframe1) frame1
hframe2 = zipFrames (addHash aframe1) frame2
It compiles and it seems to run but after the innerJoin, there should be several columns with the same name. Doesn't that break the API somewhat ? how can I select the source IP between the 2 sourceIP present in the merged dataframe for instance ?
I have 2 frames that contain network packet arguments (source IP, destination IP and so on), one frame captured at the source, one at the destination. In my python software (now rewriting in haskell), I mapped the 2 dataframes via a hash of the packet.
Which gives:
It compiles and it seems to run but after the innerJoin, there should be several columns with the same name. Doesn't that break the API somewhat ? how can I select the source IP between the 2 sourceIP present in the merged dataframe for instance ?