ca2cf06 to e49c2f6
    var fileStream = File.Open(lockOptions.LockFile, FileMode.OpenOrCreate, FileAccess.ReadWrite, fileShareMode);
    return new LockHandle(fileStream);
}
catch (IOException e)
Catching IOException is possibly a little broad here. We could perform additional checks, but that could get fragile if we rely on the error message string.
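One way to narrow the catch without touching the message string is to inspect the exception's HResult. The sketch below is only an illustration: `LockErrors` and `IsLockContention` are hypothetical names, and the two codes are the Windows sharing-violation and lock-violation values. Other platforms may report different values, so a real check would need per-platform handling.

```csharp
using System;
using System.IO;

public static class LockErrors
{
    // Win32 error codes, carried in the low 16 bits of HResult on Windows.
    const int ErrorSharingViolation = 32; // ERROR_SHARING_VIOLATION
    const int ErrorLockViolation = 33;    // ERROR_LOCK_VIOLATION

    // Returns true when the exception looks like "the file is held by someone else".
    // Assumption: Windows-style HResults; this is not a cross-platform check.
    public static bool IsLockContention(IOException e)
    {
        var code = e.HResult & 0xFFFF;
        return code == ErrorSharingViolation || code == ErrorLockViolation;
    }
}
```

It could then be used as an exception filter, for example `catch (IOException e) when (LockErrors.IsLockContention(e))`, so that unrelated I/O failures still surface.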
    LockType.Exclusive => FileShare.None,
    LockType.Shared => FileShare.ReadWrite,
I opted to use a custom enum to map to the appropriate FileShare value used in the underlying lock. This keeps the intended usage clearer.
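For readers without the full diff, a minimal sketch of how such a mapping might sit around the `File.Open` call shown earlier. The `FileLock` class and `TryAcquire` helper are hypothetical and not taken from this PR; only the two enum-to-`FileShare` arms come from the hunk above.

```csharp
using System;
using System.IO;

public enum LockType { Exclusive, Shared }

public static class FileLock
{
    // Exclusive holders deny all sharing; shared holders allow other shared holders.
    public static FileShare ToFileShare(LockType lockType) => lockType switch
    {
        LockType.Exclusive => FileShare.None,
        LockType.Shared => FileShare.ReadWrite,
        _ => throw new ArgumentOutOfRangeException(nameof(lockType))
    };

    // Hypothetical helper: returns the open stream (which holds the lock until
    // disposed), or null when another holder prevents the file from being opened.
    public static FileStream TryAcquire(string lockFile, LockType lockType)
    {
        try
        {
            return File.Open(lockFile, FileMode.OpenOrCreate, FileAccess.ReadWrite, ToFileShare(lockType));
        }
        catch (IOException)
        {
            return null;
        }
    }
}
```

The lock is released by disposing the returned stream, which is why wrapping it in a handle type with `using` fits naturally.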
static readonly TimeSpan RetryInitialDelay = TimeSpan.FromMilliseconds(10);
static readonly TimeSpan RetryMaxDelay = TimeSpan.FromMilliseconds(500);
We don't need to use a polling strategy in Tentacle's ScriptIsolationMutex.
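The two constants in this hunk suggest a capped delay between polling attempts. As a rough illustration only, the sketch below assumes the delay doubles on each attempt; the doubling factor and the `RetryDelays.DelayFor` name are assumptions, and only the 10 ms and 500 ms bounds come from the diff.

```csharp
using System;

public static class RetryDelays
{
    static readonly TimeSpan RetryInitialDelay = TimeSpan.FromMilliseconds(10);
    static readonly TimeSpan RetryMaxDelay = TimeSpan.FromMilliseconds(500);

    // attempt is zero-based. The delay doubles each attempt (an assumption)
    // and is capped at RetryMaxDelay.
    public static TimeSpan DelayFor(int attempt)
    {
        if (attempt < 0) throw new ArgumentOutOfRangeException(nameof(attempt));

        // Clamp the exponent so very large attempt numbers cannot overflow.
        var factor = Math.Pow(2, Math.Min(attempt, 30));
        var milliseconds = Math.Min(
            RetryInitialDelay.TotalMilliseconds * factor,
            RetryMaxDelay.TotalMilliseconds);
        return TimeSpan.FromMilliseconds(milliseconds);
    }
}
```

With these bounds the cap is reached on the seventh attempt, after which every wait is 500 ms.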
static ResiliencePipeline<ILockHandle> BuildLockAcquisitionPipeline(LockOptions lockOptions)
{
    var builder = new ResiliencePipelineBuilder<ILockHandle>()
        .AddFallback(
During testing I found that if the retry configuration was just right (or wrong, depending on your point of view), it would occasionally throw the exception from acquiring the lock, but mostly the timeout exception. The fallback is intended to normalise these exceptions.
I believe it currently has the side effect of hiding other types of exceptions.
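One possible way to normalise the two expected failures while letting everything else through is to handle only the known exception types. The sketch below deliberately does not use the resilience pipeline from the diff; it is plain C#, `LockRejectedException` and `LockAcquisition.Normalise` are hypothetical names, and `IOException` and `TimeoutException` stand in for whatever the lock attempt and the timeout actually throw.

```csharp
using System;
using System.IO;

// Hypothetical single exception type that callers can rely on.
public sealed class LockRejectedException : Exception
{
    public LockRejectedException(string message, Exception inner) : base(message, inner) { }
}

public static class LockAcquisition
{
    // Runs the acquisition and converts only the expected failures into
    // LockRejectedException. Any other exception propagates unchanged.
    public static T Normalise<T>(Func<T> acquire)
    {
        try
        {
            return acquire();
        }
        catch (Exception e) when (e is IOException || e is TimeoutException)
        {
            throw new LockRejectedException("Could not acquire the lock in time.", e);
        }
    }
}
```

Keeping the original exception as the inner exception preserves the detail that the normalisation would otherwise hide.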
{
    try
    {
        using var _ = Isolation.Enforce();
I had to add these in multiple spots (see the other occurrences below). There is a risk here if an additional entry point is added and this isn't included.
Maybe there is a better common entry point/wrapper we can use?
Overview
This change provides an implementation to enforce script isolation within Calamari.
Currently, Tentacle provides script isolation by maintaining a series of in-process locks.
For SSH, this is achieved by performing the locking on the Octopus Instance running the script. This introduces issues for multi-node clusters as the lock is limited to a single node. Our execution scaling strategies rely on being able to create additional nodes to process service bus messages, so this needs to be addressed to support SSH targets and workers.
This implementation uses a file locking approach to share the lock across multiple processes.
Result
When the appropriate environment variables are supplied, Calamari will acquire an appropriate lock before running the relevant command.
The relevant environment variables are:
CalamariScriptIsolationLevel, which defines the script isolation level Calamari should enforce. This must be one of two values (case insensitive): FullIsolation to run the command independent of all other commands, and NoIsolation to allow the command to run concurrently with other NoIsolation commands.
CalamariScriptIsolationMutexName, which defines the mutex name to use. Only other commands running with the same mutex name will be isolated. This name should be valid to use in a file name.
CalamariScriptIsolationMutexTimeout, which defines the timeout to use when waiting for the mutex. A timeout value should be greater than 10ms and less than 1 day. Providing a value outside of this range will remove the timeout and either reduce the number of retries or retry indefinitely.
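A minimal sketch of how the three variables might be read and validated, under several assumptions that are not confirmed by this PR: the timeout is taken to be a TimeSpan string such as 00:00:30, the timeout variable name is assumed to share the CalamariScriptIsolation prefix of the other two, isolation is simply skipped when the level or mutex name is missing or invalid, and `IsolationEnvironment.TryRead` is a hypothetical name.

```csharp
using System;
using System.Globalization;

public enum IsolationLevel { FullIsolation, NoIsolation }

public static class IsolationEnvironment
{
    static readonly TimeSpan MinTimeout = TimeSpan.FromMilliseconds(10);
    static readonly TimeSpan MaxTimeout = TimeSpan.FromDays(1);

    // getVariable abstracts Environment.GetEnvironmentVariable so the logic is testable.
    // Returns false when isolation is not (validly) configured.
    public static bool TryRead(
        Func<string, string> getVariable,
        out IsolationLevel level,
        out string mutexName,
        out TimeSpan? timeout)
    {
        level = default;
        mutexName = null;
        timeout = null; // null means "no timeout"

        var rawLevel = getVariable("CalamariScriptIsolationLevel");
        var rawName = getVariable("CalamariScriptIsolationMutexName");
        if (string.IsNullOrWhiteSpace(rawLevel) || string.IsNullOrWhiteSpace(rawName))
            return false;

        // Case-insensitive parse; IsDefined rejects numeric strings such as "7".
        if (!Enum.TryParse(rawLevel, true, out level) || !Enum.IsDefined(typeof(IsolationLevel), level))
            return false;

        mutexName = rawName;

        // Only values strictly inside (10ms, 1 day) are kept; anything else removes the timeout.
        var rawTimeout = getVariable("CalamariScriptIsolationMutexTimeout");
        if (TimeSpan.TryParse(rawTimeout, CultureInfo.InvariantCulture, out var parsed)
            && parsed > MinTimeout && parsed < MaxTimeout)
        {
            timeout = parsed;
        }

        return true;
    }
}
```

Passing `Environment.GetEnvironmentVariable` as `getVariable` would read the real process environment.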
TentacleHome is used as the location to store the files used for the lock. This directory resolves to something like /home/username/.octopus/machine-slug. This means that if the same machine is registered with a different slug, it will use independent isolation locks.
Current Caveats
The Isolation.Enforce() and Isolation.EnforceAsync() methods are called in three separate places. Ideally these would be consolidated, and there might still be a spot missing.
Miscellaneous