Pretty simple: just chuck things like compiled regexen (often found on their own, or within rich objects like boto2 clients) or thread locks (often found within database client connections, for example) into your config somewhere, then eagerly await some frustrating `TypeError`s bubbling up from the Python standard library's `copy.py`.
At its heart, this is due to how we `clone` both configs and contexts in a few spots in Invoke's task execution code, in order to prevent hard-to-debug, unintentional state bleed between tasks, subroutines, etc. Cloning uses `copy.deepcopy` heavily, since there's no other good way to copy arbitrary nested objects with unknown contents.
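
A minimal sketch of the failure mode, assuming a plain nested `dict` as a stand-in for an Invoke config (the `config` contents and the `try_clone` helper are hypothetical, for illustration only). A thread lock is used because it reliably refuses to deepcopy on current Python versions:

```python
import copy
import threading

# Hypothetical config: a plain dict standing in for an Invoke config object.
config = {"db": {"host": "localhost", "lock": threading.Lock()}}


def try_clone(cfg):
    """Deep-copy cfg, returning the TypeError instead of raising it."""
    try:
        return copy.deepcopy(cfg)
    except TypeError as exc:
        # deepcopy falls back to the pickle protocol for unknown objects,
        # and lock objects cannot be pickled.
        return exc


result = try_clone(config)
print(type(result).__name__, result)
```

One non-copyable leaf anywhere in the tree is enough to make the whole `deepcopy` call fail, which is why the problem shows up far away from the code that put the object into the config.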
So what could we do to prevent/minimize these issues?
- Stop cloning altogether
  - I still believe arbitrary mutation will lead to nasty, hard-to-track bugs...
  - We legitimately need at least some cloning for things like parameterized tasks (think `fab -H h1,h2,h3`)
    - Though IIRC it's feasible to expect we could limit the cloning in that case to be far more specific; I forget the details.
  - But it would certainly fix this issue.
- Attempt to `deepcopy`, and fall back to straight-up regular copying or no copying if we encounter a `TypeError` during the `deepcopy`.
  - This feels real bad because it'd be silently inconsistent (even if it logs, it'll still confuse the hell out of people)
- Attempt to write our own, "better" `deepcopy` (probably by extending `config.merge_dicts`) that crawls the entire tree and does a "regular" copy of leaf values instead of a wholesale `deepcopy`, and refuses to copy anything it doesn't understand.
  - Feels like reinventing `deepcopy`: we'd need to do everything it does (including attempting to call `__deepcopy__` and such), with the sole difference being an attempt at "granular" `TypeError` exception handling.
  - But it would let us basically say "anything that isn't throwing a `TypeError` on `deepcopy` is getting copied, the rest are up to you", which is at least better than "there was a `TypeError` somewhere so you're now not getting any copying whatsoever"
  - Just...not actually a whole lot better, as it's still inconsistent.
- Allow users to opt out of the deepcopying/cloning when they encounter this issue and are willing to take on the burden of worrying about state bleed between tasks
  - I worry it would mean those corner cases where we really need cloning would then always break if this was turned on; or, conversely, that if we made them exempt, they'd still be enough to trigger the core issue here
- Allow users to selectively opt out of deepcopying specific keys (i.e. a blacklist of sorts)
  - Sidesteps the all-or-nothing nature of the other solutions
  - EDIT: this does require the "reimplement `deepcopy`, kinda" approach from above; `clone()` is currently implemented as "`deepcopy` all the cached source configs as well as the core config", and it's `merge()` that uses `merge_dicts`. So it's still a moderate amount of work.
  - Less elegant / looks like an API/functionality wart
    - But it's not like there is an elegant solution to this besides "just never deepcopy, have fun debugging why nested task invocations are causing changes in behavior higher up the stack"
- Ye olde "educate users" approach: instruct them to find other ways of passing around non-deepcopyable objects, or to (ugh) monkeypatch the objects in question so they either deepcopy happily or skip deepcopying explicitly.
  - Not great, obviously
  - But much less work, and less extra code on our end to develop its own bugs
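
To make the last two options concrete, here is a rough sketch of each, again assuming a plain nested `dict` in place of a real config. The names `Uncopied` and `clone_except` are hypothetical and are not part of Invoke; the only standard-library behavior relied on is that `copy.deepcopy` honors an object's `__deepcopy__` method.

```python
import copy
import threading


class Uncopied:
    """Hypothetical user-side wrapper: "skip deepcopying explicitly".

    deepcopy calls __deepcopy__, which hands back the same wrapper, so the
    wrapped object ends up shared between clones instead of copied.
    """

    def __init__(self, value):
        self.value = value

    def __deepcopy__(self, memo):
        return self


def clone_except(data, skip_paths, _path=()):
    """Hypothetical per-key opt-out: deep-copy a nested dict, but share the
    values found at the given key paths (tuples of keys) by reference.
    """
    if _path in skip_paths:
        return data  # opted out: same object in original and clone
    if isinstance(data, dict):
        return {
            key: clone_except(value, skip_paths, _path + (key,))
            for key, value in data.items()
        }
    # Everything else (leaves, lists, ...) still gets a real deepcopy.
    return copy.deepcopy(data)


lock = threading.Lock()
config = {"db": {"hosts": ["h1", "h2"], "lock": lock}}

# Library-side opt-out: the lock is shared, the hosts list is copied.
cloned = clone_except(config, {("db", "lock")})

# User-side wrapper: a stock deepcopy now succeeds and shares the lock.
wrapped = copy.deepcopy({"db": {"hosts": ["h1"], "lock": Uncopied(lock)}})
```

The sketch only walks dicts, so an opted-out object nested inside a list would still hit `deepcopy`, and it skips the memo/cycle handling that `deepcopy` does. Those gaps are what the "reinventing `deepcopy`" worry above is about.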