Describe the suggested improvement
The shared tests in Retry.cs are flaky. I have observed many failures on Azure Service Bus, Amazon SQS, and the Learning Transport. They occur more often on Linux than on Windows, but I have seen them on both.
The tests were very hard to get to fail locally. I tried adding the `FixtureLifeCycle` attribute to restrict the tests to `LifeCycle.InstancePerTestCase`. That made local failures more likely, but I could not reproduce the NullReferenceException shown below.
I suspect that leftover messages in a shared queue are being processed in a non-deterministic order. Changing the endpoints for each variant of the test to use unique endpoint names might help.
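As a rough illustration of the unique-name idea, here is a minimal sketch of a helper that derives an endpoint name from the fixture, the test method, and the test case arguments. `TestEndpointNames` and its `For` method are hypothetical; they are not part of this repository or of NServiceBus, and the sketch does not deal with transport-specific limits on queue name length or allowed characters.

```csharp
using System;
using System.Linq;

// Hypothetical helper for illustration only; not an existing API.
public static class TestEndpointNames
{
    // Joins fixture, test name, test case arguments, and a role into one name,
    // so each variant of a parameterized test gets its own endpoint name.
    public static string For(string fixture, string test, string role, params object[] args)
    {
        var parts = new[] { fixture, test }
            .Concat(args.Select(a => a?.ToString() ?? "null"))
            .Append(role);

        return string.Join(".", parts);
    }
}
```

With this, `TestEndpointNames.For("Retry", "Should_work", "Sender", false)` and the same call with `true` produce different names, so the two `Should_work` variants would no longer share a queue. Whether that removes the flakiness is still only a guess.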
Additional Context
From https://github.com/Particular/NServiceBus.MessagingBridge/actions/runs/11506603214/attempts/1?pr=542
🔴 AcceptanceTests.SQS (.NETCoreApp,Version=v8.0)
| ✓ Passed | ✘ Failed | ↷ Skipped | ∑ Total | ⧗ Elapsed |
|---|---|---|---|---|
| 12 | 1 | — | 18 | 1m21s |
- Retry (1 failed)
- 🟥 Should_log_warn_when_best_effort_ReplyToAddress_fails
Bridge should log warnings when ReplyToAddress cannot be translated for failed message and retry.
Expected: 1
But was: 0
at Retry.Should_log_warn_when_best_effort_ReplyToAddress_fails() in /_/src/AcceptanceTests/Shared/Retry.cs:line 109
at NUnit.Framework.Internal.TaskAwaitAdapter.GenericAdapter`1.BlockUntilCompleted()
at NUnit.Framework.Internal.MessagePumpStrategy.NoMessagePumpStrategy.WaitForCompletion(AwaitAdapter awaiter)
at NUnit.Framework.Internal.AsyncToSyncAdapter.Await(Func`1 invoke)
at NUnit.Framework.Internal.Commands.TestMethodCommand.RunTestMethod(TestExecutionContext context)
at NUnit.Framework.Internal.Commands.TestMethodCommand.Execute(TestExecutionContext context)
at NUnit.Framework.Internal.Commands.BeforeAndAfterTestCommand.<>c__DisplayClass1_0.<Execute>b__0()
at NUnit.Framework.Internal.Commands.DelegatingTestCommand.RunTestMethodInThreadAbortSafeZone(TestExecutionContext context, Action action)
1) at Retry.Should_log_warn_when_best_effort_ReplyToAddress_fails() in /_/src/AcceptanceTests/Shared/Retry.cs:line 109
at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1.AsyncStateMachineBox`1.ExecutionContextCallback(Object s)
🔴 AcceptanceTests.ASB (.NETCoreApp,Version=v8.0)
| ✓ Passed | ✘ Failed | ↷ Skipped | ∑ Total | ⧗ Elapsed |
|---|---|---|---|---|
| 11 | 2 | — | 13 | 41s |
- Retry (2 failed)
- 🟥 Should_work(False)
System.NullReferenceException : Object reference not set to an instance of an object.
at Retry.Should_work(Boolean translateReplyToAdressForFailedMessages) in /_/src/AcceptanceTests/Shared/Retry.cs:line 55
at NUnit.Framework.Internal.TaskAwaitAdapter.GenericAdapter`1.BlockUntilCompleted()
at NUnit.Framework.Internal.MessagePumpStrategy.NoMessagePumpStrategy.WaitForCompletion(AwaitAdapter awaiter)
at NUnit.Framework.Internal.AsyncToSyncAdapter.Await(Func`1 invoke)
at NUnit.Framework.Internal.Commands.TestMethodCommand.RunTestMethod(TestExecutionContext context)
at NUnit.Framework.Internal.Commands.TestMethodCommand.Execute(TestExecutionContext context)
at NUnit.Framework.Internal.Commands.BeforeAndAfterTestCommand.<>c__DisplayClass1_0.<Execute>b__0()
at NUnit.Framework.Internal.Commands.DelegatingTestCommand.RunTestMethodInThreadAbortSafeZone(TestExecutionContext context, Action action)
- 🟥 Should_work(True)
Multiple failures or warnings in test:
1) Expected: True
But was: False
2) Expected: True
But was: False
at Retry.Should_work(Boolean translateReplyToAdressForFailedMessages) in /_/src/AcceptanceTests/Shared/Retry.cs:line 46
at NUnit.Framework.Internal.TaskAwaitAdapter.GenericAdapter`1.BlockUntilCompleted()
at NUnit.Framework.Internal.MessagePumpStrategy.NoMessagePumpStrategy.WaitForCompletion(AwaitAdapter awaiter)
at NUnit.Framework.Internal.AsyncToSyncAdapter.Await(Func`1 invoke)
at NUnit.Framework.Internal.Commands.TestMethodCommand.RunTestMethod(TestExecutionContext context)
at NUnit.Framework.Internal.Commands.TestMethodCommand.Execute(TestExecutionContext context)
at NUnit.Framework.Internal.Commands.BeforeAndAfterTestCommand.<>c__DisplayClass1_0.<Execute>b__0()
at NUnit.Framework.Internal.Commands.DelegatingTestCommand.RunTestMethodInThreadAbortSafeZone(TestExecutionContext context, Action action)
1) at Retry.<>c__DisplayClass0_0.<Should_work>b__0() in /_/src/AcceptanceTests/Shared/Retry.cs:line 48
at NUnit.Framework.Assert.Multiple(TestDelegate testDelegate)
at Retry.Should_work(Boolean translateReplyToAdressForFailedMessages) in /_/src/AcceptanceTests/Shared/Retry.cs:line 46
at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1.AsyncStateMachineBox`1.ExecutionContextCallback(Object s)
2) at Retry.<>c__DisplayClass0_0.<Should_work>b__0() in /_/src/AcceptanceTests/Shared/Retry.cs:line 49
at NUnit.Framework.Assert.Multiple(TestDelegate testDelegate)
at Retry.Should_work(Boolean translateReplyToAdressForFailedMessages) in /_/src/AcceptanceTests/Shared/Retry.cs:line 46
at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1.AsyncStateMachineBox`1.ExecutionContextCallback(Object s)
From https://github.com/Particular/NServiceBus.MessagingBridge/actions/runs/11298392810, where the failures also occurred on Windows: both `Should_work` variants failed on both Windows and Linux.
🔴 AcceptanceTests.ASB (.NETCoreApp,Version=v8.0)
| ✓ Passed | ✘ Failed | ↷ Skipped | ∑ Total | ⧗ Elapsed |
|---|---|---|---|---|
| 11 | 2 | — | 13 | 40s |
- Retry (2 failed)
- 🟥 Should_work(False)
System.NullReferenceException : Object reference not set to an instance of an object.
at Retry.Should_work(Boolean translateReplyToAdressForFailedMessages) in /_/src/AcceptanceTests/Shared/Retry.cs:line 55
at NUnit.Framework.Internal.TaskAwaitAdapter.GenericAdapter`1.BlockUntilCompleted()
at NUnit.Framework.Internal.MessagePumpStrategy.NoMessagePumpStrategy.WaitForCompletion(AwaitAdapter awaiter)
at NUnit.Framework.Internal.AsyncToSyncAdapter.Await(Func`1 invoke)
at NUnit.Framework.Internal.Commands.TestMethodCommand.RunTestMethod(TestExecutionContext context)
at NUnit.Framework.Internal.Commands.TestMethodCommand.Execute(TestExecutionContext context)
at NUnit.Framework.Internal.Commands.BeforeAndAfterTestCommand.<>c__DisplayClass1_0.<Execute>b__0()
at NUnit.Framework.Internal.Commands.DelegatingTestCommand.RunTestMethodInThreadAbortSafeZone(TestExecutionContext context, Action action)
- 🟥 Should_work(True)
Multiple failures or warnings in test:
1) Expected: True
But was: False
2) Expected: True
But was: False
at Retry.Should_work(Boolean translateReplyToAdressForFailedMessages) in /_/src/AcceptanceTests/Shared/Retry.cs:line 46
at NUnit.Framework.Internal.TaskAwaitAdapter.GenericAdapter`1.BlockUntilCompleted()
at NUnit.Framework.Internal.MessagePumpStrategy.NoMessagePumpStrategy.WaitForCompletion(AwaitAdapter awaiter)
at NUnit.Framework.Internal.AsyncToSyncAdapter.Await(Func`1 invoke)
at NUnit.Framework.Internal.Commands.TestMethodCommand.RunTestMethod(TestExecutionContext context)
at NUnit.Framework.Internal.Commands.TestMethodCommand.Execute(TestExecutionContext context)
at NUnit.Framework.Internal.Commands.BeforeAndAfterTestCommand.<>c__DisplayClass1_0.<Execute>b__0()
at NUnit.Framework.Internal.Commands.DelegatingTestCommand.RunTestMethodInThreadAbortSafeZone(TestExecutionContext context, Action action)
1) at Retry.<>c__DisplayClass0_0.<Should_work>b__0() in /_/src/AcceptanceTests/Shared/Retry.cs:line 48
at NUnit.Framework.Assert.Multiple(TestDelegate testDelegate)
at Retry.Should_work(Boolean translateReplyToAdressForFailedMessages) in /_/src/AcceptanceTests/Shared/Retry.cs:line 46
at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1.AsyncStateMachineBox`1.ExecutionContextCallback(Object s)
2) at Retry.<>c__DisplayClass0_0.<Should_work>b__0() in /_/src/AcceptanceTests/Shared/Retry.cs:line 49
at NUnit.Framework.Assert.Multiple(TestDelegate testDelegate)
at Retry.Should_work(Boolean translateReplyToAdressForFailedMessages) in /_/src/AcceptanceTests/Shared/Retry.cs:line 46
at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1.AsyncStateMachineBox`1.ExecutionContextCallback(Object s)