Thanks for your question. The point cloud obtained from the sensor should be transformed to the global frame (where Z+ is the normal vector of the horizontal plane, e.g., the tabletop or the ground) since the output from our pre-trained network is constrained to a predefined view space in the global frame.
Originally posted by @psc0628 in #2 (comment)
Taking into account that the point cloud should be transformed to the global frame: does it matter where the object center (which coincides with the hemisphere center) is located with respect to the global frame?
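To make the question concrete, here is a minimal sketch of the transform being discussed: a rigid sensor-to-global transform followed by an optional re-centering at the object/hemisphere center. The function name, the `R`/`t` parameters, and the re-centering step are assumptions for illustration, not the repository's actual API.

```python
import numpy as np

def to_global_frame(points, R, t, object_center=None):
    """Transform an Nx3 sensor-frame point cloud into the global frame
    (Z+ = normal of the horizontal plane), then optionally re-center it
    at the object/hemisphere center.

    R (3x3) and t (3,) are the sensor-to-global rotation and translation,
    assumed known (e.g., from extrinsic calibration or plane fitting)."""
    pts = points @ R.T + t            # rigid transform into the global frame
    if object_center is not None:
        pts = pts - object_center     # put the hemisphere center at the origin
    return pts

# Toy example: identity rotation, shift up by 1 along Z,
# then re-center at the (hypothetical) object center.
pts = np.array([[0.0, 0.0, 0.0],
                [1.0, 0.0, 0.0]])
R = np.eye(3)
t = np.array([0.0, 0.0, 1.0])
center = np.array([0.0, 0.0, 1.0])
out = to_global_frame(pts, R, t, object_center=center)
```

If the predefined view space is defined relative to the hemisphere center, then re-centering as above would make the network input translation-invariant; whether that is what the pre-trained network expects is exactly the question being asked here.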