Symbol report verification report #2544
Conversation
The documentation generated from this pull request is available at: docu-html
See inline comments. Proposal: take over the failure modes and mitigations from https://public-docs.ferrocene.dev/main/certification/core/safety-plan/tools.html#code-coverage
| Result
| ~~~~~~
| Symbol report and blanket do not require qualification for use in safety-related software development according to ISO 26262.
As stated by masc2023: without additional measures I cannot agree to this result.
Reworked table
| Safety evaluation
| -----------------
| This section outlines the safety evaluation of symbol report and blanket for its use within the S-CORE project. This evaluation assumes that the Rust compiler is
| qualified and output of coverage data in `.profraw` format is correct. Due to that, we solely focus on post processing that is done by symbol report and blanket only.
Will the coverage based on the `.profraw` be used and measured somewhere else? This can have an impact on the tool classification, which is why I am asking.
No, these tools are the only coverage measurement for Rust.
If there is no check by other tools, the tool is used to argue safety, and manual checking is not feasible to guarantee 100% correct operation, the tool confidence level should decrease.
I would like to see more about the available testing and quality measures used to operate the tool, to strengthen the argument for why the likelihood of errors is reduced by the process and quality rigor applied to the tool.
@pahmann @aschemmel-tech @masc2023 I tried to resolve the comments, please recheck.
| - Overreporting, could result in testing gap.
| - yes
| - | Likelihood of such an error low due to wide usage of the tool (many S-CORE modules and other projects like ferrocene)
|   | Additionally, every new tool release is tested by running tests in prepared integration testsuite to detect such errors. (PROPOSAL POINT)
I would also add the expectation of the tester/user, so that a failure can be detected.
If we have this as a proposal point, it will be a measure against the low confidence. I would say that we cannot simply claim that detection is sufficient just because we plan to write a test for this in the future.
So we need to evaluate this report assuming the future work is done. This is required for Ferrocene and gives us an estimate of the cost we need to put into this. So we shall agree that this point is needed and sufficient, and we will request it from them.
| - A function is not being considered, although it is part of the certified subset
| - yes
| - | `symbol-report` is developed to use exactly the same information as the compiler
|   | Additionally, every new tool release is tested by running tests in prepared integration testsuite to detect such errors. (PROPOSAL POINT)
In this case as well, the user's expectation will in some cases lead to failure detection.
| ----------------------------
| Installation
| ~~~~~~~~~~~~
| | To add the Code coverage to your project or module follow guidelines in WIP
Is there a link for WIP available? If there is nothing at the moment, then add an issue to follow up on that and link it here.
done
| - Overreporting, could result in testing gap.
| - yes
| - | Likelihood of such an error low due to wide usage of the tool (many S-CORE modules and other projects like ferrocene)
|   | Additionally, every new tool release is tested by running tests in prepared integration testsuite to detect such errors. (PROPOSAL POINT)
What does "Proposal Point" mean? An additional measure to reduce the risk? Then add an issue and link it here so that we can follow up, as it then supports the argumentation for why the rating is high.
If "Proposal Point" is an additional measure, update the table with "additional measure needed: yes".
It means that this can be a mitigation if the safety colleagues think it is good to have/needed.
Compared to the first draft, the new version has improved. Still, it does not feel mature and reads more like an early state of evaluation. Based on the available information, I do not have the confidence that tool qualification is not needed.
Remark: Is there any information regarding the TCL from the sources at Ferrocene? They have a bunch of public information on safety, as they claim.
| - False-positive: A function is reported as covered, although it is not covered
| - Overreporting, could result in testing gap.
| - yes
| - | Likelihood of such an error low due to wide usage of the tool (many S-CORE modules and other projects like ferrocene)
The argument is very weak. It is a proven-in-use argumentation, under the assumption that the use case is identical and the usage as well. The mere fact that there are tool configuration options already weakens your argument. Otherwise we could simply say: "Oh, there are no issues in Linux and the likelihood is low, because it is used in so many devices."
I split this up, as it was not enough, and added info to the additional-measure column.
PandaeDo
left a comment
@alekseyborisyukvalidas Could you please support Pawel with the verification report?
| - Overcounting: Total number of functions is too low
| - A function is not being considered, although it is part of the certified subset
| - yes
| - `symbol-report` is developed to use exactly the same information as the compiler
Maybe you need to adapt here like in the cells above: NO
| - Line that can be executed not being reported as executable
| - Underreporting, code that should be tested may not being tested
| - yes
| - `blanket` warns if a function has no executable line
Maybe you need to adapt here like in the cells above: NO, because you argue an additional measure is needed.
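For context, the `blanket` warning quoted in this row could in principle work along these lines (a hypothetical Python sketch for illustration only; the data structure and names are assumptions, not blanket's real implementation, which is a Rust tool):

```python
# Hypothetical sketch of the warning described above: flag any function
# for which the coverage data lists no executable lines at all.
# The input layout (function name -> list of executable line numbers)
# is an assumption for illustration, not blanket's real data model.
def warn_no_executable_lines(functions: dict[str, list[int]]) -> list[str]:
    """Return one warning per function whose executable-line list is empty."""
    return [
        f"warning: {name} has no executable line"
        for name, lines in functions.items()
        if not lines
    ]

coverage = {
    "parse_header": [10, 11, 14],
    "unused_stub": [],  # would stay invisible in coverage without a warning
}
print(warn_no_executable_lines(coverage))
```

Such a warning turns silent underreporting into a visible signal, which is exactly the kind of detection measure discussed in this thread.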
| - yes
| - `blanket` warns if a function has no executable line
| - yes
| - **Yes**. Additionally, every new tool release is tested by running tests in prepared integration testsuite to detect such errors.
Maybe in addition you could add a comment note which explains in a few sentences what this integration testsuite does, because you mention it in all the argumentations.
Without this explanation it is not possible to judge if it is sufficient. It is also not clear who will develop and run this "integration testsuite". Ferrocene states for this Malfunction "(Future work) End-to-end test that ensures the correct lines are being reported as executable" which could be interpreted that they would implement/run such test.
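One kind of check such an integration testsuite could contain might look like the following (a hypothetical Python sketch; the JSON report layout and the fixture function names are assumptions for illustration, not the actual symbol-report/blanket output format):

```python
# Hypothetical sketch of one integration-testsuite check: compare the
# coverage report produced by the tool under test against a known ground
# truth for a prepared fixture crate. The JSON layout and the names are
# assumptions for illustration, not the real symbol-report format.
import json

# Ground truth for the fixture: which functions the fixture's tests execute.
EXPECTED = {
    "fixture::covered_fn": True,
    "fixture::uncovered_fn": False,
}

def check_report(report_json: str) -> list[str]:
    """Return a list of over-/underreporting findings (empty list = pass)."""
    reported = {f["name"]: f["covered"] for f in json.loads(report_json)["functions"]}
    findings = []
    for name, expected_covered in EXPECTED.items():
        if name not in reported:
            findings.append(f"missing function: {name}")  # undercounting
        elif reported[name] and not expected_covered:
            findings.append(f"overreported as covered: {name}")  # testing-gap risk
        elif not reported[name] and expected_covered:
            findings.append(f"underreported as uncovered: {name}")
    return findings

# Simulated tool output with one deliberate overreporting error injected.
sample = json.dumps({"functions": [
    {"name": "fixture::covered_fn", "covered": True},
    {"name": "fixture::uncovered_fn", "covered": True},
]})
print(check_report(sample))
```

A testsuite built from such fixtures with deliberately injected errors would directly target the over-/underreporting malfunctions listed in the table, and documenting it along these lines would make the sufficiency argument judgeable.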
aschemmel-tech
left a comment
see inline comments
| - high
| * - 3
| - Overcounting: Total number of functions is too low
| - A function is not being considered, although it is part of the certified subset
Is this a use case for S-CORE at all? What would we need the "Total number of functions" for? The relevant failure for S-CORE is described in Malfunction number 1 already, could be removed in my opinion.
| - **Yes**. Every new tool release is tested by running tests in prepared integration testsuite to detect such errors.
| - high
| * - 4
| - Undercounting: Total number of functions is too high
Is this a use case for S-CORE at all? What would we need the "Total number of functions" for? The relevant failure for S-CORE is described in Malfunction number 2 already, could be removed in my opinion.
| - Overcounting: Total number of functions is too low
| - A function is not being considered, although it is part of the certified subset
| - yes
| - `symbol-report` is developed to use exactly the same information as the compiler
symbol-report is mentioned as a mitigation measure which checks if the compiler has an error - it is unclear what compiler error this refers to.
| Safety evaluation
| -----------------
| This section outlines the safety evaluation of `symbol report` and `blanket` for its use within the S-CORE project. This evaluation assumes that the Rust compiler is
I see that this is intended to check whether we need tool qualification for the tools symbol report and blanket developed by Ferrocene. But actually we would need an evaluation of the full toolchain, including the -Cinstrument-coverage functionality and the llvm-profparser. Ferrocene's argumentation that these are "widely used" is not enough to consider them "safe enough" - so maybe the "integration testsuite" should also cover these tools.
So, finally, if we do not come to a common agreement, we would need tool qualification for the complete chain, as we request it here:
https://eclipse-score.github.io/process_description/main/process_areas/tool_management/guidance/tool_management_guideline.html#gd_guidl__tool_qualification
The tool qualification shall be based on the method "validation of the software tool".