[multibody] Add implementation of CenicIntegrator#24063
[multibody] Add implementation of CenicIntegrator#24063SeanCurtis-TRI merged 1 commit intoRobotLocomotion:masterfrom
Conversation
rpoyner-tri
left a comment
There was a problem hiding this comment.
+a:@jwnimmer-tri for feature review.
@rpoyner-tri made 1 comment.
Reviewable status: LGTM missing from assignee jwnimmer-tri(platform), needs at least two assigned reviewers.
rpoyner-tri
left a comment
There was a problem hiding this comment.
FYI @vincekurtz
@rpoyner-tri made 1 comment.
Reviewable status: LGTM missing from assignee jwnimmer-tri(platform), needs at least two assigned reviewers.
|
@sherm1 please have a look at this later once you're back at your computer. |
jwnimmer-tri
left a comment
There was a problem hiding this comment.
@jwnimmer-tri reviewed 3 files and all commit messages, made 1 comment, and resolved 1 discussion.
Reviewable status: 1 unresolved discussion, LGTM missing from assignee jwnimmer-tri(platform), needs at least two assigned reviewers.
a discussion (no related file):
Why "notes: none"? It seems to me like this is the first official release of CENIC? Or if not, what's the roadmap to the first official release?
d3cd86a to
c1b6562
Compare
rpoyner-tri
left a comment
There was a problem hiding this comment.
@rpoyner-tri made 1 comment.
Reviewable status: 1 unresolved discussion, LGTM missing from assignee jwnimmer-tri(platform), needs at least two assigned reviewers.
a discussion (no related file):
Previously, jwnimmer-tri (Jeremy Nimmer) wrote…
Why "notes: none"? It seems to me like this is the first official release of CENIC? Or if not, what's the roadmap to the first official release?
At the very least, we should make sure we are erroring on wrong answers. The only items I know of in that category are #24061 and #23921 . The first can probably be covered by an error message patch, and the second (for now) by a @warning comment in the class docs.
Probably everything else (bindings, feature completeness) can be covered by caveats in a release notes announcement.
jwnimmer-tri
left a comment
There was a problem hiding this comment.
Checkpoint on non-test code.
@jwnimmer-tri reviewed 2 files and all commit messages, and made 22 comments.
Reviewable status: 21 unresolved discussions, LGTM missing from assignee jwnimmer-tri(platform), needs at least two assigned reviewers.
a discussion (no related file):
Previously, rpoyner-tri (Rick Poyner (rico)) wrote…
At the very least, we should make sure we are erroring on wrong answers. The only items I know of in that category are #24061 and #23921 . The first can probably be covered by an error message patch, and the second (for now) by a
@warningcomment in the class docs.Probably everything else (bindings, feature completeness) can be covered by caveats in a release notes announcement.
Omitting from release notes is mostly unrelated to correctness questions. Even if not release-noted, users could still try to use this code. If the code is wrong, then at least we need disclaimers in its documentation saying so (e.g., the class overview) in this PR. Leaving it non-noted doesn't actually solve the problem. And then once those caveats are documented, I don't see why we shouldn't release-note it, so that users who don't need the broken output ports can still start to try it out?
a discussion (no related file):
Working
Evaluate test coverage.
multibody/cenic/cenic_integrator.cc line 96 at r2 (raw file):
template <typename T> void CenicIntegrator<T>::DoResetStatistics() { stats_ = {};
If I delete this line of code, no tests fail. (And since this line was forgotten in the original ale/vince implementation, I do think we need to regression-test it.)
multibody/cenic/cenic_integrator.cc line 127 at r2 (raw file):
double working_accuracy = this->get_target_accuracy(); if (isnan(working_accuracy)) working_accuracy = kDefaultAccuracy;
nit I'm not sure why this accuracy setup code (and the default value) isn't part of our constructor? I don't see a reason to repeat this every time. If we set a target in the constructor, then we simply have:
const double working_accuracy = this->get_target_accuracy();
this->set_accuracy_in_use(working_accuracy);Code quote:
if (isnan(working_accuracy)) working_accuracy = kDefaultAccuracy;multibody/cenic/cenic_integrator.cc line 156 at r2 (raw file):
const ContinuousState<T>& dx_state) const { using std::isnan; const VectorBase<T>& dgq =
nit What is this word?
Code quote:
dgqmultibody/cenic/cenic_integrator.cc line 160 at r2 (raw file):
.get_substate(plant_subsystem_index_) .get_generalized_position(); const T x_norm = dgq.CopyToVector().template lpNorm<Eigen::Infinity>();
nit I'm not sure what the "x" here is for. This is only a norm of positions, not positions+velocities.
Code quote:
x_normmultibody/cenic/cenic_integrator.cc line 161 at r2 (raw file):
.get_generalized_position(); const T x_norm = dgq.CopyToVector().template lpNorm<Eigen::Infinity>(); if (isnan(x_norm)) return std::numeric_limits<T>::quiet_NaN();
nit This line seems like a no-op?
Code quote:
if (isnan(x_norm)) return std::numeric_limits<T>::quiet_NaN();multibody/cenic/cenic_integrator.cc line 189 at r2 (raw file):
scratch_.actuation_feedback; IcfLinearFeedbackGains<T>& external_feedback_storage = scratch_.external_feedback;
nit Typically, we keep the scratch names the same as the locals they are pre-allocating for.
Suggestion:
IcfLinearFeedbackGains<T>& actuation_feedback_storage =
scratch_.actuation_feedback_storage;
IcfLinearFeedbackGains<T>& external_feedback_storage =
scratch_.external_feedback_storage;multibody/cenic/cenic_integrator.cc line 226 at r2 (raw file):
// Solve for the full step x_{t+h}. We'll need this regardless of whether // error control is enabled or not. VectorX<T>& v_guess = scratch_.v;
nit Typically, we keep the scratch names the same as the locals they are pre-allocating for.
Suggestion:
scratch_.v_guessmultibody/cenic/cenic_integrator.cc line 240 at r2 (raw file):
// We're using error control, and will compare with two half-sized steps. // First half-step to (t + h/2) uses the average of v_t and v_{t+1} as the
typo
Suggestion:
v_t and v_{t+h}multibody/cenic/cenic_integrator.cc line 306 at r2 (raw file):
"CenicIntegrator: ICF solver only supports T = double."); } else if (!solver_.SolveWithGuess(model, tolerance, &data_)) { throw std::runtime_error("CenicIntegrator: optimization failed.");
Since DoStep is allowed to return false on failure, we should have a comment here explaining why we've decided to throw instead of returning a status.
multibody/cenic/cenic_integrator.cc line 358 at r2 (raw file):
const Context<T>& context = this->get_context(); const Context<T>& plant_context = plant().GetMyContextFromRoot(context); const VectorX<T>& q0 = plant().GetPositions(plant_context);
nit Assigning a VectorBlock view to a const VectorX& introduces a needless copy.
Code quote:
const VectorX<T>& q0 = plant().GetPositions(plant_context);multibody/cenic/cenic_integrator.cc line 374 at r2 (raw file):
const Joint<T>& joint = plant().get_joint(joint_index); if (joint.type_name() == "quaternion_floating") {
This hard-coded predicate means that if we ever add a ball_quaternion joint down the road, CENIC will become broken in a way that's unlikely to be noticed until a frustrated user debugs a complicated simulation.
It seems to me like proper separation of concerns is to ask MbP to normalize a configuration vector, using a new MbP API like plant.NormalizePositions(&q);. Given our speed-up assumption of "small h", however, that might be impractice.
Perhaps the MbP could offer an API that reports which indices of q are quaternions. There's a bunch of places throughout Drake and Anzu that already scan for that, so a consolidated method would be nice.
multibody/cenic/cenic_integrator.h line 193 at r1 (raw file):
external_systems_linearizer_; /* Pre-allocated objects used to formulate and solve the optimization problem.
typo
Suggestion:
Preallocatedmultibody/cenic/cenic_integrator.h line 207 at r1 (raw file):
T time_at_last_solve_{NAN}; /* Pre-allocated scratch space for intermediate calculations. */
typo
Suggestion:
Preallocatedmultibody/cenic/cenic_integrator.h line 159 at r2 (raw file):
void DoInitialize() final; bool DoStep(const T& h) final;
The layout ordering of these overrides is not consistent with their declaration order in integrator_base.h. Unless there is a clear pattern here about why this order is helpful, we should stick with the superclass choice of ordering, for consistency.
Code quote:
T CalcStateChangeNorm(
const systems::ContinuousState<T>& dx_state) const final;
void DoResetStatistics() final;
std::vector<systems::NamedStatistic> DoGetStatisticsSummary() const final;
void DoInitialize() final;
bool DoStep(const T& h) final;multibody/cenic/cenic_integrator.h line 165 at r2 (raw file):
@param model The ICF model for the convex problem min_v ℓ(v; q₀, v₀, h). @param v_guess The initial guess for the MbP plant velocities. @param x_next The output continuous state, includes state for both the
nit consistency
Suggestion:
@param[out] x_nextmultibody/cenic/cenic_integrator.h line 132 at r2 (raw file):
actuation_feedback; /* Non-limited external forces, τ = −Ke⋅v + be. */ contact_solvers::icf::internal::IcfLinearFeedbackGains<T> external_feedback;
nit These comments have stale notation, compared to IcfExternalSystemLinearizer. Check that class's overview and reflect that notation here:
(1) Either we should write τ̃(v) = ... as an exact definition, or write τ(v) ≈ ... as an approximation (with the appropriate RHS for whichever form we're using).
(2) The u and e should be subscripts.
Code quote:
/* Linearized external system gains (sized to the plant's num_velocities):
Torque-limited actuation, τ = clamp(−Ku⋅v + bu). */
contact_solvers::icf::internal::IcfLinearFeedbackGains<T>
actuation_feedback;
/* Non-limited external forces, τ = −Ke⋅v + be. */
contact_solvers::icf::internal::IcfLinearFeedbackGains<T> external_feedback;multibody/cenic/cenic_integrator.h line 143 at r2 (raw file):
/* Gets a reference to the ICF builder used to set up the convex problem. */ contact_solvers::icf::internal::IcfBuilder<T>& builder() {
I am unclear what this private method is buying us. Syntactically, the three call sites builder().Foo(...) could just as simply be builder_->Foo(...).
If we are worried about null checks ever tripping, then it should be a DEMAND since we need it enabled in production. I think it would be clearer, though, to just put one single null check atop DoStep instead of every time we touch it inside of that method? Then we can ditch this method.
multibody/cenic/cenic_integrator.h line 193 at r2 (raw file):
external_systems_linearizer_; /* Pre-allocated objects used to formulate and solve the optimization problem.
I find this phrasing a bit confusing. The mention of "preallocated" make these sound a bit like scratch storage, but they are actually stateful (at least some of them).
All member fields of any class are always "preallocated" in the sense that they are constructed during our class constructor. What is special happening here where we need to call that out?
In other words, I'm not 100% sure what the comment is trying to get at.
Code quote:
Pre-allocated objects used to formulate and solve the optimization problem.multibody/cenic/cenic_integrator.h line 220 at r2 (raw file):
std::unique_ptr<systems::DiagramContinuousState<T>> x_next_half_1_; /* x_{t+h/2+h/2}. */ std::unique_ptr<systems::DiagramContinuousState<T>> x_next_half_2_;
It seems to me like these three fields are simply preallocated scratch storage. If that's true, why aren't they part of struct Scratch?
Code quote:
/* Intermediate states for error control, which compares a single large
step (x_next_full_) to the result of two smaller steps (x_next_half_2_). */
/* x_{t+h}. */
std::unique_ptr<systems::DiagramContinuousState<T>> x_next_full_;
/* x_{t+h/2}. */
std::unique_ptr<systems::DiagramContinuousState<T>> x_next_half_1_;
/* x_{t+h/2+h/2}. */
std::unique_ptr<systems::DiagramContinuousState<T>> x_next_half_2_;c1b6562 to
e2387a9
Compare
rpoyner-tri
left a comment
There was a problem hiding this comment.
@rpoyner-tri made 1 comment and resolved 5 discussions.
Reviewable status: 16 unresolved discussions, LGTM missing from assignee jwnimmer-tri(platform), needs at least two assigned reviewers.
multibody/cenic/cenic_integrator.h line 220 at r2 (raw file):
Previously, jwnimmer-tri (Jeremy Nimmer) wrote…
It seems to me like these three fields are simply preallocated scratch storage. If that's true, why aren't they part of
struct Scratch?
Done.
rpoyner-tri
left a comment
There was a problem hiding this comment.
@rpoyner-tri resolved 2 discussions.
Reviewable status: 14 unresolved discussions, LGTM missing from assignee jwnimmer-tri(platform), needs at least two assigned reviewers.
jwnimmer-tri
left a comment
There was a problem hiding this comment.
@jwnimmer-tri reviewed 2 files and all commit messages, made 2 comments, and resolved 1 discussion.
Reviewable status: 14 unresolved discussions, LGTM missing from assignee jwnimmer-tri(platform), needs at least two assigned reviewers.
multibody/cenic/cenic_integrator.cc line 127 at r2 (raw file):
Previously, jwnimmer-tri (Jeremy Nimmer) wrote…
nit I'm not sure why this accuracy setup code (and the default value) isn't part of our constructor? I don't see a reason to repeat this every time. If we set a target in the constructor, then we simply have:
const double working_accuracy = this->get_target_accuracy(); this->set_accuracy_in_use(working_accuracy);
(And in that case, we probably don't need a named local, either.)
this->set_accuracy_in_use(this->get_target_accuracy());multibody/cenic/cenic_integrator.cc line 231 at r3 (raw file):
VectorX<T>& v_guess = scratch_.v; v_guess = plant().GetVelocities(plant_context); ComputeNextContinuousState(model_at_x0_, v_guess, scratch_.x_next_full.get());
nit When using scratch space, we always alias it into a local variable reference (e.g., v_guess above) so that it still looks / feels like a local.
Ditto throughout.
Suggestion:
&x_next_fulle2387a9 to
e3afb3b
Compare
rpoyner-tri
left a comment
There was a problem hiding this comment.
@rpoyner-tri made 1 comment and resolved 10 discussions.
Reviewable status: 4 unresolved discussions, LGTM missing from assignee jwnimmer-tri(platform), needs at least two assigned reviewers.
multibody/cenic/cenic_integrator.cc line 374 at r2 (raw file):
Previously, jwnimmer-tri (Jeremy Nimmer) wrote…
This hard-coded predicate means that if we ever add a
ball_quaternionjoint down the road, CENIC will become broken in a way that's unlikely to be noticed until a frustrated user debugs a complicated simulation.It seems to me like proper separation of concerns is to ask MbP to normalize a configuration vector, using a new MbP API like
plant.NormalizePositions(&q);. Given our speed-up assumption of "small h", however, that might be impractice.Perhaps the MbP could offer an API that reports which indices of
qare quaternions. There's a bunch of places throughout Drake and Anzu that already scan for that, so a consolidated method would be nice.
SpanningForest has the answers we want. It will take a minute to figure out how to use them correctly.
rpoyner-tri
left a comment
There was a problem hiding this comment.
TODO: write some release notes announcement text and attach it here.
-(release notes: none) +(release notes: feature) +(release notes: announce)
@rpoyner-tri made 1 comment.
Reviewable status: 4 unresolved discussions, LGTM missing from assignee jwnimmer-tri(platform), needs at least two assigned reviewers.
rpoyner-tri
left a comment
There was a problem hiding this comment.
@rpoyner-tri resolved 1 discussion.
Reviewable status: 3 unresolved discussions, LGTM missing from assignee jwnimmer-tri(platform), needs at least two assigned reviewers.
jwnimmer-tri
left a comment
There was a problem hiding this comment.
@jwnimmer-tri reviewed 3 files and all commit messages, made 18 comments, and resolved 1 discussion.
Reviewable status: 18 unresolved discussions, needs at least two assigned reviewers.
a discussion (no related file):
Previously, jwnimmer-tri (Jeremy Nimmer) wrote…
Working
Evaluate test coverage.
Done
multibody/cenic/test/cenic_integrator_test.cc line 409 at r4 (raw file):
The tolerance is loose because the initial conditions need to satisfy the constraint exactly.
I don't understand this sentence at all. The configuration exactly satisfies the constraint. The tolerance could be zero, I think? Why should it be loose?
(After simulating, I do buy that we need a loose tolerance per the stiff spring.)
multibody/cenic/test/cenic_integrator_test.cc line 6 at r4 (raw file):
#include <memory> #include <string> #include <tuple>
nit Unused <tuple>?
multibody/cenic/test/cenic_integrator_test.cc line 17 at r4 (raw file):
#include "drake/systems/analysis/simulator.h" #include "drake/systems/analysis/simulator_config_functions.h" #include "drake/systems/analysis/simulator_print_stats.h"
nit Unused simulator_print_stats. (Ditto in BUILD.)
multibody/cenic/test/cenic_integrator_test.cc line 22 at r4 (raw file):
#include "drake/systems/framework/diagram_builder.h" #include "drake/systems/primitives/constant_vector_source.h" #include "drake/systems/primitives/linear_system.h"
nit Unused linear_system.
multibody/cenic/test/cenic_integrator_test.cc line 59 at r4 (raw file):
)"""; using common::MaybePauseForUser;
BTW Don't we usually put the using statements ahead of everything else (i.e., using first then constants after)?
multibody/cenic/test/cenic_integrator_test.cc line 113 at r4 (raw file):
#pragma GCC diagnostic push #pragma GCC diagnostic ignored "-Wdeprecated-declarations" simulator_->set_publish_every_time_step(true);
We must not use deprecated API in new unit tests.
multibody/cenic/test/cenic_integrator_test.cc line 121 at r4 (raw file):
} std::shared_ptr<Meshcat> meshcat_;
IIUC the meshcat_ field is always available. Thus, it should be grouped into the // Always available. section of member fields. (In any case, it needs to be in some section, so that its lifetime is clear.)
multibody/cenic/test/cenic_integrator_test.cc line 117 at r4 (raw file):
integrator_ = &simulator_->reset_integrator<CenicIntegrator<double>>(); integrator_->set_maximum_step_size(0.1); integrator_->set_fixed_step_mode(false);
The default for fixed_step_mode is supposed to be false. If it were true, that would be a bug in the DUT. At best, we should EXPECT_EQ that it's false. Setting it to false gives the impression that it might not have been false, which would be a problem, not something to gloss over.
multibody/cenic/test/cenic_integrator_test.cc line 203 at r4 (raw file):
for (multibody::JointIndex i : plant_.GetJointIndices()) { multibody::Joint<double>& joint = plant_.get_mutable_joint(i); joint.set_default_damping_vector(Eigen::VectorXd::Zero(1));
nit We have a using for this; leverage it. Ditto throughout.
Suggestion:
VectorXdmultibody/cenic/test/cenic_integrator_test.cc line 252 at r4 (raw file):
// Set up a PID controller. const Eigen::Vector2d q_nom(-M_PI / 4, M_PI / 2);
nit q_nom is dead code.
multibody/cenic/test/cenic_integrator_test.cc line 389 at r4 (raw file):
// Applied actuation should be clamped to effort limits. // TODO(vincekurtz): report this in plant.get_net_actuation_output_port().
Does this indicate a defect in MbP's output when using CENIC (in which case it should be, at minimum, filed as an issue and cited here with the number instead of a username)?
multibody/cenic/cenic_integrator.h line 110 at r4 (raw file):
/** Sets the convex solver tolerances and iteration limits. */ void SetSolverParameters(
This public function is never called by the unit test.
multibody/cenic/cenic_integrator.h line 104 at r4 (raw file):
/** Gets the current convex solver tolerances and iteration limits. */ const contact_solvers::icf::IcfSolverParameters& get_solver_parameters()
This public function is never called by the unit test.
multibody/cenic/cenic_integrator.h line 156 at r4 (raw file):
void DoResetStatistics() final; std::vector<systems::NamedStatistic> DoGetStatisticsSummary() const final;
This function is never called during unit tests.
multibody/cenic/cenic_integrator.cc line 130 at r4 (raw file):
if (isnan(this->get_initial_step_size_target())) { if (isnan(this->get_maximum_step_size())) throw std::logic_error(
This error condition is never reached by unit tests.
multibody/cenic/cenic_integrator.cc line 129 at r4 (raw file):
// Set the initial time step and accuracy. if (isnan(this->get_initial_step_size_target())) { if (isnan(this->get_maximum_step_size()))
nit Multi-line if bodies (i.e., the throw here) require curly braces per GSG.
jwnimmer-tri
left a comment
There was a problem hiding this comment.
@jwnimmer-tri made 1 comment.
Reviewable status: 19 unresolved discussions, needs at least two assigned reviewers.
a discussion (no related file):
Should we have test coverage that specifically calls CenicIntegrator::get_error_estimate and checks the return value for numerical correctness?
I believe we are testing it indirectly, but not in a way that would necessarily identify too-many-steps due to an e.g. scale error in the error math.
CC @vincekurtz
rpoyner-tri
left a comment
There was a problem hiding this comment.
@rpoyner-tri made 1 comment.
Reviewable status: 19 unresolved discussions, needs at least two assigned reviewers.
multibody/cenic/cenic_integrator.h line 104 at r4 (raw file):
Previously, jwnimmer-tri (Jeremy Nimmer) wrote…
This public function is never called by the unit test.
it is invoked at all by the integrator, but it is never tested as a feature. /working
e3afb3b to
7ec730b
Compare
rpoyner-tri
left a comment
There was a problem hiding this comment.
@rpoyner-tri made 4 comments and resolved 9 discussions.
Reviewable status: 10 unresolved discussions, needs at least two assigned reviewers.
multibody/cenic/cenic_integrator.h line 156 at r4 (raw file):
Previously, jwnimmer-tri (Jeremy Nimmer) wrote…
This function is never called during unit tests.
Done.
multibody/cenic/cenic_integrator.cc line 130 at r4 (raw file):
Previously, jwnimmer-tri (Jeremy Nimmer) wrote…
This error condition is never reached by unit tests.
Done.
multibody/cenic/test/cenic_integrator_test.cc line 113 at r4 (raw file):
Previously, jwnimmer-tri (Jeremy Nimmer) wrote…
We must not use deprecated API in new unit tests.
Done.
multibody/cenic/test/cenic_integrator_test.cc line 117 at r4 (raw file):
Previously, jwnimmer-tri (Jeremy Nimmer) wrote…
The default for
fixed_step_modeis supposed to be false. If it weretrue, that would be a bug in the DUT. At best, we shouldEXPECT_EQthat it's false. Setting it tofalsegives the impression that it might not have beenfalse, which would be a problem, not something to gloss over.
Done.
jwnimmer-tri
left a comment
There was a problem hiding this comment.
@jwnimmer-tri reviewed 3 files and all commit messages, made 3 comments, and resolved 3 discussions.
Reviewable status: 9 unresolved discussions, needs at least two assigned reviewers.
multibody/cenic/test/cenic_integrator_test.cc line 113 at r4 (raw file):
Previously, rpoyner-tri (Rick Poyner (rico)) wrote…
Done.
With the vis_config.publish_period = 1e9; setting (combined with this change to no longer publish every time step), the meshcat recordings and "pause for user" stuff is now completely useless, since the only thing in the recording is the t0 initial condition.
(My vote would be to purge all visualization code here, but if you want to try to fix it that's OK by me too.)
multibody/cenic/test/cenic_integrator_test.cc line 164 at r5 (raw file):
/* Checks that the integrator complains when no initial step size is configured. */
nit This comment doesn't describe this test case.
Code quote:
/* Checks that the integrator complains when no initial step size is
configured. */multibody/cenic/test/cenic_integrator_test.cc line 186 at r5 (raw file):
for (const auto& [key, value] : later_stats) { SCOPED_TRACE(fmt::format("later stat for '{}'", key)); if (key == "integrator_num_steps_taken") {
These checks for specific numbers could end up dead code and nobody would notice.
(If we had const expected_stats = { ...}; map of wanted values, then we could use that map to check the expectations here, and also then check that all of the expected stats were accounted for while looping over later_stats.)
rpoyner-tri
left a comment
There was a problem hiding this comment.
@rpoyner-tri made 1 comment.
Reviewable status: 9 unresolved discussions, needs at least two assigned reviewers.
a discussion (no related file):
Previously, jwnimmer-tri (Jeremy Nimmer) wrote…
Should we have test coverage that specifically calls
CenicIntegrator::get_error_estimateand checks the return value for numerical correctness?I believe we are testing it indirectly, but not in a way that would necessarily identify too-many-steps due to an e.g. scale error in the error math.
CC @vincekurtz
/cc @joemasterjohn
Previously, rpoyner-tri (Rick Poyner (rico)) wrote…
Hmm, seems like a good idea to have some sort of check, but it's not obvious to me how to do so cleanly. Short of a brute-force "compute the two half-steps manually and check that the error estimate matches", which I'm not sure would add a ton of value |
rpoyner-tri
left a comment
There was a problem hiding this comment.
@rpoyner-tri made 1 comment.
Reviewable status: 9 unresolved discussions, needs at least two assigned reviewers.
multibody/cenic/test/cenic_integrator_test.cc line 113 at r4 (raw file):
Previously, jwnimmer-tri (Jeremy Nimmer) wrote…
With the
vis_config.publish_period = 1e9;setting (combined with this change to no longer publish every time step), the meshcat recordings and "pause for user" stuff is now completely useless, since the only thing in the recording is the t0 initial condition.(My vote would be to purge all visualization code here, but if you want to try to fix it that's OK by me too.)
I'm feeling like the usability of viz at all with continuous-time systems is at stake here. I threw together a very quick sketch of a possible solution: #24072 . I'll not claim it's the best plan, but it could work with adequate documentation, and maybe some deeper fixes to MeshcatAnimation (to account for non-fixed-rate frames).
I found a bug in this exercise (one place? two places?) : some of the checks for valid publish_period don't currently exclude zero. They will fail eventually, but not fast.
I'm happy to call the above a separate project, and rip out the viz here if you think that's a better workflow.
jwnimmer-tri
left a comment
There was a problem hiding this comment.
@jwnimmer-tri made 2 comments.
Reviewable status: 9 unresolved discussions, needs at least two assigned reviewers.
a discussion (no related file):
Previously, vincekurtz (Vince Kurtz) wrote…
Hmm, seems like a good idea to have some sort of check, but it's not obvious to me how to do so cleanly. Short of a brute-force "compute the two half-steps manually and check that the error estimate matches", which I'm not sure would add a ton of value
One of the tests already here (EnergyConservation) is a "trend" test -- we run with accuracy of 1e-3 and then again with 1e-5 and confirm that the numerics got tighter. We could imagine doing the same thing for the reported error, also? The error should diminish when we use a smaller step size. That wouldn't directly check that the error math is correct, but it might add more confidence than we have now?
multibody/cenic/test/cenic_integrator_test.cc line 113 at r4 (raw file):
... possible solution ...
Yes -- per the deprecation PR, systems that want per-step publish semantics should offer that as a system-specific option instead of forcing the simulator to do it, which is what that PR does.
I'm feeling like the usability of viz at all with continuous-time systems is at stake here.
Visualization already works okay so long as the user doesn't set a nonsense publish period like we do here (of a billion seconds). Because this an integrator unit test and we are checking step details very precisely, we don't want the visualizer to add extra jitter, thus the weird viz configuration. So it's primarily just a question for this unit test and only this unit test.
See also #23758.
Anyway, since I don't think this particular visualization helps developers when debugging test failures here at all ever, that's why I suggested removing it. If anyone would like to argue why it's helpful here, then we can pursue making it work prior to landing this test code. But if nobody really needs it, then "toss it" is the best answer.
jwnimmer-tri
left a comment
There was a problem hiding this comment.
@jwnimmer-tri made 4 comments.
Reviewable status: 25 unresolved discussions, LGTM missing from assignee SeanCurtis-TRI(platform).
multibody/cenic/cenic_integrator.h line 111 at r12 (raw file):
Previously, SeanCurtis-TRI (Sean Curtis) wrote…
BTW I would've expected this (and the corresponding call on
IcfSolver) to pass by value (with the requisitestd::move). After all, the parameters are ultimately simply captured by the solver.
The fields of the struct are all PODs, so the move syntax hijinks would just be a lot of extra syntax with no upside.
In https://drake.mit.edu/styleguide/cppguide.html#Pass_By_Value note the key is cheap to move. A params struct is nearly never "cheap to move", since it will have many separate fields (rather than, say, a heap pointer like a container would).
multibody/cenic/cenic_integrator.cc line 77 at r12 (raw file):
Previously, SeanCurtis-TRI (Sean Curtis) wrote…
nit: Not included in tests.
If we change this to return false, all unit tests in this directory fail. That seems like enough testing is already in place.
multibody/cenic/cenic_integrator.cc line 85 at r12 (raw file):
Previously, SeanCurtis-TRI (Sean Curtis) wrote…
nit: Not included in tests.
If we change this to 1 or 3, then DoublePendulum.JointLimits fails. That seems like enough testing is already in place.
(Adding a unit test for EXPECT_EQ(dut.get...(), 2) is just copy-pasting the implementation into the test, which is useless.)
multibody/cenic/cenic_integrator.cc line 239 at r12 (raw file):
This questions is probably outside the domain of this PR.
Yes.
rpoyner-tri
left a comment
There was a problem hiding this comment.
@rpoyner-tri resolved 4 discussions.
Reviewable status: 21 unresolved discussions, LGTM missing from assignee SeanCurtis-TRI(platform).
33c6157 to
4ddf812
Compare
rpoyner-tri
left a comment
There was a problem hiding this comment.
some fixes; some discussion. No serious test fixes yet.
@rpoyner-tri made 4 comments and resolved 10 discussions.
Reviewable status: 11 unresolved discussions, LGTM missing from assignee SeanCurtis-TRI(platform).
multibody/cenic/cenic_integrator.h line 41 at r12 (raw file):
Previously, SeanCurtis-TRI (Sean Curtis) wrote…
BTW It kind of feels like something more should be said about this integrator's relationship with "external" systems. For example, those systems do not benefit from error control.
Such error control has been proposed (#23921). Listing a missing feature doesn't really feel like it belongs in a "benefits" section. Are you proposing a separate "limitations" section, where we list all of the open issues?
multibody/cenic/cenic_integrator.h line 56 at r12 (raw file):
Previously, SeanCurtis-TRI (Sean Curtis) wrote…
BTW The plural of "Fields" feels out of place. Yes, it's the title of the paper. The paper introduces the idea of "irrotational contact fields", but the optimization problem doesn't contain multiple fields. Depending on the approximation of contact, one can produce different scalar fields that are individual ICFs, but the optimization problem, at any given moment, only deals with a single field.
Sticking with plural "Fields" would make the most sense in reading if it had a (TM) mark (which, of course, we don't actually have).
I dunno. When I see the formulation " Term Of Art (TOA)", I assume we are swallowing the term whole and not quibbling about grammar.
multibody/cenic/cenic_integrator.cc line 195 at r12 (raw file):
Previously, SeanCurtis-TRI (Sean Curtis) wrote…
nit: I'm probably late to this show, but this relates to my previous note about the const reference and mutable reference both appearing in this function.
We're passing a mutable reference to
LinearizeExternalSystems()so it has some scratch pad to compute the derivatives of external systems. The documentation for that method suggests that "all mutations are undone prior to returning".Two questions:
- I can't find a test in icf_external_systems_linearizer_test.cc that actually affirms that property.
- Are we concerned at all what that does to the caching in the context? Twiddling the state of the plant context will cause its continuous state to get dirty (and dirtiness to cascade down stream). Presumably that's not a concern because everything down stream is either already dirty or should be dirty so no harm done. But, if that's the argument, it seems that using the actual context as scratch space is such an anti-pattern that the justification should be captured in code. (And it well may be and I just haven't read far enough to find that justification.)
I can't give justification for external_systems_linearizer details. Seems to me that if it were a mere matter of needing scratch space, IcfExternalSystemsLinearizer could have done that at ctor time. It may have something to do with function signatures deeper in the call stack?
\cc nimmer for external_systems_linearizer curator knowledge.
jwnimmer-tri
left a comment
There was a problem hiding this comment.
@jwnimmer-tri reviewed 4 files and all commit messages, made 2 comments, and resolved 2 discussions.
Reviewable status: 9 unresolved discussions, LGTM missing from assignee SeanCurtis-TRI(platform).
multibody/cenic/cenic_integrator.h line 41 at r12 (raw file):
Previously, rpoyner-tri (Rick Poyner (rico)) wrote…
Such error control has been proposed (#23921). Listing a missing feature doesn't really feel like it belongs in a "benefits" section. Are you proposing a separate "limitations" section, where we list all of the open issues?
Remember that Sean's "BTW" is a lighter / more tentative discussion than you and I typically use it for. It's just a passing thought. If we don't think anything is missing here (and I don't think anything is), or if addressing it would slow down something important, we just move ahead. Let's just do that?
multibody/cenic/cenic_integrator.cc line 195 at r12 (raw file):
Are we concerned at all what that does to the caching in the context?
No.
In fact, it is not even possible to use different storage. We must perturb the actual whole diagram state in order to linearize the external systems, since their inputs back into us need to be a function of the perturbed state. It is an intrinsic part of integration to use the actual integrator state for these calculations.
I don't think there is an "anti-pattern" here that needs justification in a comment, nor am I sure what else we could do to have forestalled this review discussion with another kind of comment here or elsewhere.
jwnimmer-tri
left a comment
There was a problem hiding this comment.
@jwnimmer-tri made 1 comment.
Reviewable status: 10 unresolved discussions, LGTM missing from assignee SeanCurtis-TRI(platform).
multibody/cenic/cenic_integrator.cc line 148 at r13 (raw file):
// Replaces the default norm (weighted infinity norm on the entire diagram // state) with an infinity norm on just plant positions. This is motivated by // Sec. V.E of [Kurtz // and Castro, 2025].
typo
Suggestion:
Kurtz and Castro, 2025
SeanCurtis-TRI
left a comment
There was a problem hiding this comment.
@SeanCurtis-TRI reviewed 2 files and all commit messages, and made 7 comments.
Reviewable status: 10 unresolved discussions, LGTM missing from assignee SeanCurtis-TRI(platform).
multibody/cenic/cenic_integrator.h line 41 at r12 (raw file):
Previously, jwnimmer-tri (Jeremy Nimmer) wrote…
Remember that Sean's "BTW" is a lighter / more tentative discussion than you and I typically use it for. It's just a passing thought. If we don't think anything is missing here (and I don't think anything is), or if addressing it would slow down something important, we just move ahead. Let's just do that?
I hadn't realized my BTWs have a different tone than yours. But Jeremy is correct in his interpretation of my tone.
I was pointing out, in passing, that the language implies something that isn't true. If that's transient, or too weak an implication for the code author to care about, I leave that up to you.
multibody/cenic/cenic_integrator.h line 56 at r12 (raw file):
Previously, rpoyner-tri (Rick Poyner (rico)) wrote…
I dunno. When I see the formulation " Term Of Art (TOA)", I assume we are swallowing the term whole and not quibbling about grammar.
When are we not quibbling about grammar? :) But, as noted above, it's a BTW. I can understand why someone might prefer the plural (although, I'll never be convinced that it's right.)
multibody/cenic/cenic_integrator.h line 111 at r12 (raw file):
Previously, jwnimmer-tri (Jeremy Nimmer) wrote…
The fields of the struct are all PODs, so the move syntax hijinks would just be a lot of extra syntax with no upside.
In https://drake.mit.edu/styleguide/cppguide.html#Pass_By_Value note the key is cheap to move. A params struct is nearly never "cheap to move", since it will have many separate fields (rather than, say, a heap pointer like a container would).
Fair.
I'd counter with the fact that it's harmless to do pass by value. And if the struct ever added non-POD members, the API would already be conditioned. But, as noted before, btw.
multibody/cenic/cenic_integrator.cc line 77 at r12 (raw file):
Previously, jwnimmer-tri (Jeremy Nimmer) wrote…
If we change this to
return false, all unit tests in this directory fail. That seems like enough testing is already in place.
Then the defect is that maintainers of the code are obliged to jump through hoops to determine that.
If, at the very least, the name of the function appeared in the test function (even if only in documentation that makes the observation you make), then naive attempts to do a reality check are easily satisfied.
The problem with absence is one can never tell if it's an oversight or an intention.
(Same note applies below.)
multibody/cenic/cenic_integrator.cc line 85 at r12 (raw file):
Previously, jwnimmer-tri (Jeremy Nimmer) wrote…
If we change this to
1or3, thenDoublePendulum.JointLimitsfails. That seems like enough testing is already in place.(Adding a unit test for
EXPECT_EQ(dut.get...(), 2)is just copy-pasting the implementation into the test, which is useless.)
Deferring to the previous conversation.
multibody/cenic/cenic_integrator.cc line 195 at r12 (raw file):
Previously, jwnimmer-tri (Jeremy Nimmer) wrote…
Are we concerned at all what that does to the caching in the context?
No.
In fact, it is not even possible to use different storage. We must perturb the actual whole diagram state in order to linearize the external systems, since their inputs back into us need to be a function of the perturbed state. It is an intrinsic part of integration to use the actual integrator state for these calculations.
I don't think there is an "anti-pattern" here that needs justification in a comment, nor am I sure what else we could do to have forestalled this review discussion with another kind of comment here or elsewhere.
Oh, I know it needs the whole Diagram context. That's clear from its purpose. I don't know that that means "not possible". Certainly, unwieldy and awkward. If I were to offer up a straw man, I'd say the linearizer could have a diagram context and copy state and manipulate. After all, to do its job, it's the context state that matters. But, like I said, I can appreciate that's undesirable.
Perhaps the characterization of the input context as "scratch" space for the linearizer is a difference in semantics. For me, I apply the term because of two facts a) the linearizer changes its values to perform computations and b) it promises to restore the context to its original input configuration. One assumes that is with the intent that the caller can't tell the context has been messed with. The fact that that promise isn't under test is troubling to me (if you agree, it need not be addressed in this PR, obvious).
And, no one touched on the subject of cache. Just looking at the design I wondered if there was an adverse effect on the caching system. Again, I haven't reasoned it through yet. I wanted to find out if someone already has.
multibody/cenic/cenic_integrator.cc line 239 at r12 (raw file):
Previously, jwnimmer-tri (Jeremy Nimmer) wrote…
This questions is probably outside the domain of this PR.
Yes.
My "probably" was intended to be a polite nicety not an indicator of whether or not this topic was outside the scope of this PR. I'll be more direct in the future.
Nevertheless, the question stands, this doesn't make sense in the larger context of the code as I read it.
jwnimmer-tri
left a comment
There was a problem hiding this comment.
@jwnimmer-tri made 2 comments.
Reviewable status: 10 unresolved discussions, LGTM missing from assignee SeanCurtis-TRI(platform).
multibody/cenic/cenic_integrator.cc line 195 at r12 (raw file):
And, no one touched on the subject of cache. Just looking at the design I wondered if there was an adverse effect on the caching system. Again, I haven't reasoned it through yet. I wanted to find out if someone already has.
I haven't tried to reason about it, but neither am I troubled by it. My goal is to land correct code first and see if this whole project is useful at all in the first place. Later if we love how well it's working, we can profile and optimize the parts that are slow in applications we care about.
Also, in broad strokes, the point of the integrator is to change the state (by advancing it), so it's not obvious why changing the state more than once while computing the ultimate value would be a cache invalidation problem.
The fact that that promise isn't under test is troubling to me (if you agree, it need not be addressed in this PR).
Please go ahead and open an add-a-test PR that would satisfy you. We are all cleaning up the mess left by our departed colleagues; more hands will make it lighter work.
multibody/cenic/cenic_integrator.cc line 239 at r12 (raw file):
Previously, SeanCurtis-TRI (Sean Curtis) wrote…
My "probably" was intended to be a polite nicety not an indicator of whether or not this topic was outside the scope of this PR. I'll be more direct in the future.
Nevertheless, the question stands, this doesn't make sense in the larger context of the code as I read it.
My only explanation is that this seems to be how every other integrator works and nobody has complained so far. (Or are you saying CENIC is doing something different?) Ideally, yes, it would all be more clear, but I'm not sure that it's worth investing time trying to "solve" this?
SeanCurtis-TRI
left a comment
There was a problem hiding this comment.
@SeanCurtis-TRI made 2 comments.
Reviewable status: 10 unresolved discussions, LGTM missing from assignee SeanCurtis-TRI(platform).
multibody/cenic/cenic_integrator.cc line 195 at r12 (raw file):
Previously, jwnimmer-tri (Jeremy Nimmer) wrote…
And, no one touched on the subject of cache. Just looking at the design I wondered if there was an adverse effect on the caching system. Again, I haven't reasoned it through yet. I wanted to find out if someone already has.
I haven't tried to reason about it, but neither am I troubled by it. My goal is to land correct code first and see if this whole project is useful at all in the first place. Later if we love how well it's working, we can profile and optimize the parts that are slow in applications we care about.
Also, in broad strokes, the point of the integrator is to change the state (by advancing it), so it's not obvious why changing the state more than once while computing the ultimate value would be a cache invalidation problem.
The fact that that promise isn't under test is troubling to me (if you agree, it need not be addressed in this PR).
Please go ahead and open an add-a-test PR that would satisfy you. We are all cleaning up the mess left by our departed colleagues; more hands will make it lighter work.
I'll look into the "add-a-test" PR.
As for caching -- worst case it leads to some redundant calculations, but, by definition, it shouldn't harm correctness.
multibody/cenic/cenic_integrator.cc line 239 at r12 (raw file):
Previously, jwnimmer-tri (Jeremy Nimmer) wrote…
My only explanation is that this seems to be how every other integrator works and nobody has complained so far. (Or are you saying CENIC is doing something different?) Ideally, yes, it would all be more clear, but I'm not sure that it's worth investing time trying to "solve" this?
In my original post, I'd done the research to answer the question is "CENIC is doing something different?" The answer is yes. But I hadn't communicated that...at all. The answer is yes; no other integrator implementation sets the time in the context. This is handled in IntegratorBase's NVI implemtnation. (Note: ImplicitIntegrator keeps that assertion from being 100% true -- it sets the time in order to compute a Jacobian, but for this conversation, it's true enough).
I strongly suspect that setting it here is harmless because the value probably gets stomped on by IntegratorBase. But that's not a justification for keeping the code.
4ddf812 to
177228c
Compare
rpoyner-tri
left a comment
There was a problem hiding this comment.
@rpoyner-tri made 1 comment and resolved 1 discussion.
Reviewable status: 10 unresolved discussions, LGTM missing from assignee SeanCurtis-TRI(platform).
multibody/cenic/cenic_integrator.cc line 239 at r12 (raw file):
Previously, SeanCurtis-TRI (Sean Curtis) wrote…
In my original post, I'd done the research to answer the question is "CENIC is doing something different?" The answer is yes. But I hadn't communicated that...at all. The answer is yes; no other integrator implementation sets the time in the context. This is handled in
IntegratorBase's NVI implemtnation. (Note:ImplicitIntegratorkeeps that assertion from being 100% true -- it sets the time in order to compute a Jacobian, but for this conversation, it's true enough).I strongly suspect that setting it here is harmless because the value probably gets stomped on by
IntegratorBase. But that's not a justification for keeping the code.
I'll study this and try some things.
177228c to
fc9dfcc
Compare
rpoyner-tri
left a comment
There was a problem hiding this comment.
@rpoyner-tri made 2 comments and resolved 4 discussions.
Reviewable status: 6 unresolved discussions, LGTM missing from assignee SeanCurtis-TRI(platform).
multibody/cenic/test/cenic_integrator_test.cc line 186 at r12 (raw file):
Previously, SeanCurtis-TRI (Sean Curtis) wrote…
nit: The expected keys have nine key values. Only six are explicitly called out here. The ones that aren't called out are assumed to be their initial values based solely on their type. This is a very surprising methodology:
- Why aren't we testing for all nine?
- Why do those three values not change from their initial value? This test simply seems to suggest that those stats are duds. Surely, the test should do enough so that all of the stats produce some new value based on work done?
If the test is justified as is, then the justification should be captured because I would argue it can't be inferred.
Updated. PTAL
multibody/cenic/test/cenic_integrator_test.cc line 187 at r12 (raw file):
Previously, SeanCurtis-TRI (Sean Curtis) wrote…
btw: It seems the only reason this scoped trace is here, is because six of the stat keys are glossed over. If all nine were explicitly addressed, then the trace would be completely redundant.
Even so, it's not redundant. What do you think happens when a maintainer misspells a key string?
fc9dfcc to
72d5222
Compare
rpoyner-tri
left a comment
There was a problem hiding this comment.
@rpoyner-tri made 1 comment and resolved 5 discussions.
Reviewable status: 1 unresolved discussion, LGTM missing from assignee SeanCurtis-TRI(platform).
multibody/cenic/test/cenic_integrator_test.cc line 407 at r12 (raw file):
Previously, SeanCurtis-TRI (Sean Curtis) wrote…
BTW What's the motivation for these tests on joint limits and efforts (and possibly coupling in the future)? This really feels like a property of the ICF builder and solver. The integrator doesn't seem to have any code that is explicitly aware of joint (let alone limits or effort).
I'm ok wit leaving them, unless you think they are obviously brittle.
72d5222 to
e31ef62
Compare
rpoyner-tri
left a comment
There was a problem hiding this comment.
@rpoyner-tri made 1 comment and resolved 1 discussion.
Reviewable status: LGTM missing from assignee SeanCurtis-TRI(platform).
multibody/cenic/cenic_integrator.cc line 239 at r12 (raw file):
Previously, rpoyner-tri (Rick Poyner (rico)) wrote…
I'll study this and try some things.
No test fails when I delete the SetTime() call. 🤷
rpoyner-tri
left a comment
There was a problem hiding this comment.
All discussions addressed; PTAL
@rpoyner-tri made 1 comment.
Reviewable status: LGTM missing from assignee SeanCurtis-TRI(platform).
|
Previously, rpoyner-tri (Rick Poyner (rico)) wrote…
Eeek! My bad on this one, looks like the |
jwnimmer-tri
left a comment
There was a problem hiding this comment.
@jwnimmer-tri reviewed 2 files and all commit messages.
Reviewable status: LGTM missing from assignee SeanCurtis-TRI(platform).
rpoyner-tri
left a comment
There was a problem hiding this comment.
All done but for Sean L G T M. Is it time?
@rpoyner-tri made 1 comment.
Reviewable status:complete! all files reviewed, all discussions resolved (waiting on @rpoyner-tri).
rpoyner-tri
left a comment
There was a problem hiding this comment.
@rpoyner-tri made 1 comment.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @rpoyner-tri).
a discussion (no related file):
Ugh. Still waiting for Sean stamp. My comment previous was accepted by the software in error.
SeanCurtis-TRI
left a comment
There was a problem hiding this comment.
@SeanCurtis-TRI reviewed 3 files and all commit messages, and made 5 comments.
Reviewable status: 3 unresolved discussions.
multibody/cenic/test/cenic_integrator_test.cc line 187 at r12 (raw file):
Previously, rpoyner-tri (Rick Poyner (rico)) wrote…
Even so, it's not redundant. What do you think happens when a maintainer misspells a key string?
Given the new spelling, it's certainly not redundant now. But the problem it solves could have equally been solved with (in the else block):
GTEST_FAIL() << "Unrecognized statistic name: '" << key << "'.";Nevertheless, I'm marking myself as satisified.
multibody/cenic/test/cenic_integrator_test.cc line 407 at r12 (raw file):
Previously, rpoyner-tri (Rick Poyner (rico)) wrote…
I'm ok wit leaving them, unless you think they are obviously brittle.
I don't seem them as particularly "brittle". I just thought they were redundant. My value was minimizing unnecessary CI cost. But I'm content to defer to author.
multibody/cenic/test/cenic_integrator_test.cc line 166 at r17 (raw file):
"integrator_smallest_adapted_step_size_taken"}; auto validate_empty_stats = [&](const std::vector<NamedStatistic> stats,
nit: typo
Suggestion:
auto validate_empty_stats = [&](const std::vector<NamedStatistic>& statsmultibody/cenic/test/cenic_integrator_test.cc line 215 at r17 (raw file):
EXPECT_NEAR(std::get<double>(value), 0.003819, 1e-6); } else { ASSERT_TRUE(false);
BTW There's a GTEST_FAIL() for this purpose.
e31ef62 to
b82a2d4
Compare
rpoyner-tri
left a comment
There was a problem hiding this comment.
@rpoyner-tri resolved 3 discussions.
Reviewable status:complete! all discussions resolved, LGTM from assignees jwnimmer-tri(platform),SeanCurtis-TRI(platform).
SeanCurtis-TRI
left a comment
There was a problem hiding this comment.
@SeanCurtis-TRI reviewed 1 file and all commit messages.
Reviewable status:complete! all discussions resolved, LGTM from assignees jwnimmer-tri(platform),SeanCurtis-TRI(platform).
This change is