Refactor FST algorithms to be exclusively non-mutating instance methods #56
Overview
Refactor (once more) with the following major breaking changes:
- `algorithms`, which now only contains algorithms such as `dijkstra` that take an FST and compute some value (and do not return a new FST).
- `FST` class again. All methods are instance methods that return a new FST and do not mutate the original FST. See below for rationale.
- The `State` and `Transition` implementations have been extracted to separate modules.

Rationale
PyFoma historically had both mutating and non-mutating methods, which was formalized in the prior refactor. However, this can introduce confusion, particularly for methods like `fst1.compose(fst2)`, where it is unintuitive that `fst1` will be mutated.

Mutating methods are already discouraged and possibly being deprecated in comparable frameworks such as Pandas and PyTorch. They can lead to silent logic errors due to unintentional mutations, particularly in environments such as Jupyter notebooks where cells may be run several times.
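To make the notebook hazard concrete, here is a minimal sketch. `ToyFST`, `concatenate_inplace`, and `concatenate` are hypothetical stand-ins invented for illustration, not PyFoma's API; the toy models an FST as the list of strings it accepts. Each `cell_*` function plays the role of a notebook cell that gets run twice.

```python
class ToyFST:
    """Hypothetical stand-in for an FST: just the list of strings it accepts."""

    def __init__(self, strings):
        self.strings = list(strings)

    def concatenate_inplace(self, other):
        # Mutating style: overwrites self with the result.
        self.strings = [a + b for a in self.strings for b in other.strings]
        return self

    def concatenate(self, other):
        # Non-mutating style: builds a new object and leaves self untouched.
        return ToyFST(a + b for a in self.strings for b in other.strings)


def cell_mutating(fst1, fst2):
    # A "cell" written against the mutating method.
    return list(fst1.concatenate_inplace(fst2).strings)


def cell_pure(fst1, fst2):
    # The same "cell" written against the non-mutating method.
    return fst1.concatenate(fst2).strings


a, b = ToyFST(["x"]), ToyFST(["y"])
first_run = cell_mutating(a, b)    # ['xy']
second_run = cell_mutating(a, b)   # ['xyy'], because a was changed by the first run

c, d = ToyFST(["x"]), ToyFST(["y"])
pure_runs = [cell_pure(c, d), cell_pure(c, d)]  # both runs give ['xy']
```

With the mutating variant, re-running the cell silently changes the answer; with the non-mutating variant, the cell is idempotent.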
Furthermore, this refactor removes the dynamically-created methods, simplifying the implementation and making it easier to use things like static type checkers.
The counterargument is that mutation can be desirable in some cases when minimizing memory usage is critical. In these cases, the following pattern is appropriate:
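A minimal sketch of what such a pattern could look like, assuming it amounts to rebinding the same name to the returned FST so that the previous object becomes unreachable. `ToyFST` and its `union` method are hypothetical stand-ins, not PyFoma's classes; the weak reference is only there to observe that the old object is released.

```python
import gc
import weakref


class ToyFST:
    """Hypothetical stand-in for an FST; not PyFoma's actual class."""

    def __init__(self, transitions=()):
        self.transitions = frozenset(transitions)

    def union(self, other):
        # Non-mutating: returns a new object and leaves self untouched.
        return ToyFST(self.transitions | other.transitions)


fst = ToyFST({("a", "b")})
old_ref = weakref.ref(fst)  # observe the original without keeping it alive

# Rebind the same name to the result instead of mutating in place.
fst = fst.union(ToyFST({("c", "d")}))

gc.collect()  # not needed under CPython's reference counting; harmless elsewhere
original_freed = old_ref() is None
```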
This does not entirely avoid creating copies, but it does avoid keeping these copies in memory. Furthermore, many of the mutating methods previously depended on `become` anyway, so this approach is equivalent in many cases.