diff --git a/Guides/Trim.md b/Guides/Trim.md
new file mode 100644
index 00000000..cd5a7416
--- /dev/null
+++ b/Guides/Trim.md
@@ -0,0 +1,114 @@
+# Trim
+
+[[Source](https://github.com/apple/swift-algorithms/blob/main/Sources/Algorithms/Trim.swift) |
+ [Tests](https://github.com/apple/swift-algorithms/blob/main/Tests/SwiftAlgorithmsTests/TrimTests.swift)]
+
+Returns a `SubSequence` formed by discarding all elements at the start and end of the collection
+which satisfy the given predicate.
+
+This example uses `trimming(where:)` to get a substring without the white space at the beginning and end of the string.
+
+```swift
+let myString = " hello, world "
+print(myString.trimming(where: \.isWhitespace)) // "hello, world"
+
+let results = [2, 10, 11, 15, 20, 21, 100].trimming(where: { $0.isMultiple(of: 2) })
+print(results) // [11, 15, 20, 21]
+```
+
+## Detailed Design
+
+A new method is added to `BidirectionalCollection`:
+
+```swift
+extension BidirectionalCollection {
+
+  public func trimming(where predicate: (Element) throws -> Bool) rethrows -> SubSequence
+}
+```
+
+This method requires `BidirectionalCollection` for an efficient implementation which visits as few elements as possible.
+
+A less-efficient implementation is _possible_ for any `Collection`, which would involve always traversing the
+entire collection. This implementation is not provided, as it would mean developers of generic algorithms who forget
+to add the `BidirectionalCollection` constraint will receive that inefficient implementation:
+
+```swift
+func myAlgorithm<Input>(input: Input) where Input: Collection {
+
+  let trimmedInput = input.trimming(where: { ... }) // Uses least-efficient implementation.
+}
+
+func myAlgorithm2<Input>(input: Input) where Input: BidirectionalCollection {
+
+  let trimmedInput = input.trimming(where: { ... }) // Uses most-efficient implementation.
+}
+```
+
+Swift provides the `BidirectionalCollection` protocol for marking types which support reverse traversal,
+and generic types and algorithms which want to make use of that should add it to their constraints.
+
+### Complexity
+
+Calling this method is O(_n_).
+
+### Naming
+
+The name `trim` has precedent in other programming languages. Another popular alternative might be `strip`.
+
+| Example usage | Languages |
+|-|-|
+| ''String''.Trim([''chars'']) | C#, VB.NET, Windows PowerShell |
+| ''string''.strip(); | D |
+| (.trim ''string'') | Clojure |
+| ''sequence'' [ predicate? ] trim | Factor |
+| (string-trim '(#\Space #\Tab #\Newline) ''string'') | Common Lisp |
+| (string-trim ''string'') | Scheme |
+| ''string''.trim() | Java, JavaScript (1.8.1+), Rust |
+| Trim(''String'') | Pascal, QBasic, Visual Basic, Delphi |
+| ''string''.strip() | Python |
+| strings.Trim(''string'', ''chars'') | Go |
+| LTRIM(RTRIM(''String'')) | Oracle SQL, T-SQL |
+| string:strip(''string'' [,''option'', ''char'']) | Erlang |
+| ''string''.strip or ''string''.lstrip or ''string''.rstrip | Ruby |
+| trim(''string'') | PHP, Raku |
+| [''string'' stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]] | Objective-C/Cocoa |
+| ''string'' withBlanksTrimmed ''string'' withoutSpaces ''string'' withoutSeparators | Smalltalk |
+| string trim ''$string'' | Tcl |
+| TRIM(''string'') or TRIM(ADJUSTL(''string'')) | Fortran |
+| TRIM(''string'') | SQL |
+| String.trim ''string'' | OCaml 4+ |
+
+Note: This is an abbreviated list from Wikipedia. [Full table](https://en.wikipedia.org/wiki/Comparison_of_programming_languages_(string_functions)#trim)
+
+The standard library includes a variety of methods which perform similar operations:
+
+- Firstly, there are `dropFirst(Int)` and `dropLast(Int)`. These return slices but do not support user-defined predicates.
+  If the collection's `count` is less than the number of elements to drop, they return an empty slice.
+- Secondly, there is `drop(while:)`, which also returns a slice and is equivalent to a 'left-trim' (trimming from the head but not the tail).
+  If the entire collection is dropped, this method returns an empty slice.
+- Thirdly, there are `removeFirst(Int)` and `removeLast(Int)` which do not return slices and actually mutate the collection.
+  If the collection's `count` is less than the number of elements to remove, this method triggers a runtime error.
+- Lastly, there are the `popFirst()` and `popLast()` methods, which work like `removeFirst()` and `removeLast()`,
+  except they do not trigger a runtime error for empty collections.
+
+The closest neighbours to this function would be the `drop` family of methods. Unfortunately, unlike `dropFirst(Int)`,
+the name `drop(while:)` does not specify which end(s) of the collection it operates on. Moreover, one could easily
+mistake code such as:
+
+```swift
+let result = myString.drop(while: \.isWhitespace)
+```
+
+With a lazy filter that drops _all_ whitespace characters regardless of where they are in the string.
+Besides that, the root `trim` leads to clearer, more concise code, which is more aligned with other programming
+languages:
+
+```swift
+// Does `result` contain the input, trimmed of certain elements?
+// Or does this code mutate `input` in-place and return the elements which were dropped?
+let result = input.dropFromBothEnds(where: { ... })
+
+// No such ambiguity here.
+let result = input.trimming(where: { ... })
+```
diff --git a/README.md b/README.md
index 6bc15bc8..9b45d576 100644
--- a/README.md
+++ b/README.md
@@ -32,6 +32,7 @@ Read more about the package, and the intent behind it, in the [announcement on s
 - [`chunked(by:)`, `chunked(on:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/Chunked.md): Eager and lazy operations that break a collection into chunks based on either a binary predicate or when the result of a projection changes.
 - [`indexed()`](https://github.com/apple/swift-algorithms/blob/main/Guides/Indexed.md): Iterate over tuples of a collection's indices and elements.
+- [`trimming(where:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/Trim.md): Returns a slice by trimming elements from a collection's start and end.
 
 ## Adding Swift Algorithms as a Dependency
 
diff --git a/Sources/Algorithms/Trim.swift b/Sources/Algorithms/Trim.swift
new file mode 100644
index 00000000..eb16d5da
--- /dev/null
+++ b/Sources/Algorithms/Trim.swift
@@ -0,0 +1,50 @@
+//===----------------------------------------------------------------------===//
+//
+// This source file is part of the Swift Algorithms open source project
+//
+// Copyright (c) 2020 Apple Inc. and the Swift project authors
+// Licensed under Apache License v2.0 with Runtime Library Exception
+//
+// See https://swift.org/LICENSE.txt for license information
+//
+//===----------------------------------------------------------------------===//
+
+extension BidirectionalCollection {
+
+  /// Returns a `SubSequence` formed by discarding all elements at the start and end of the collection
+  /// which satisfy the given predicate.
+  ///
+  /// This example uses `trimming(where:)` to get a substring without the white space at the
+  /// beginning and end of the string:
+  ///
+  /// ```
+  /// let myString = " hello, world "
+  /// print(myString.trimming(where: \.isWhitespace)) // "hello, world"
+  /// ```
+  ///
+  /// - parameters:
+  ///    - predicate: A closure which determines if the element should be omitted from the
+  ///      resulting slice.
+  ///
+  /// - complexity: `O(n)`, where `n` is the length of this collection.
+  ///
+  @inlinable
+  public func trimming(
+    where predicate: (Element) throws -> Bool
+  ) rethrows -> SubSequence {
+
+    // Consume elements from the front.
+    let sliceStart = try firstIndex { try predicate($0) == false } ?? endIndex
+    // sliceEnd is the index _after_ the last index to match the predicate.
+    var sliceEnd = endIndex
+    while sliceStart != sliceEnd {
+      let idxBeforeSliceEnd = index(before: sliceEnd)
+      guard try predicate(self[idxBeforeSliceEnd]) else {
+        return self[sliceStart.. 10 }
+    XCTAssertEqual(results_noheadmatch, [1, 3, 5, 7, 9])
+  }
+
+  func testBothEndsMatch() {
+    // Both ends match, some string of >1 elements do not (return that string).
+    let results = [2, 10, 11, 15, 20, 21, 100].trimming { $0.isMultiple(of: 2) }
+    XCTAssertEqual(results, [11, 15, 20, 21])
+  }
+
+  func testEverythingMatches() {
+    // Everything matches (trim everything).
+    let results_allmatch = [1, 3, 5, 7, 9, 11, 13, 15].trimming { _ in true }
+    XCTAssertEqual(results_allmatch, [])
+  }
+
+  func testEverythingButOneMatches() {
+    // Both ends match, one element does not (trim all except that element).
+    let results_one = [2, 10, 12, 15, 20, 100].trimming { $0.isMultiple(of: 2) }
+    XCTAssertEqual(results_one, [15])
+  }
+}
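
Review note: because part of the `Trim.swift` hunk above is cut off, the following is a minimal standalone sketch of the two-ended trim that the guide and tests describe. It is an illustration under assumptions, not the code from this PR: the method name `sketchTrimming(where:)` is hypothetical (chosen so it cannot be mistaken for the package's `trimming(where:)`), and the loop structure is only one plausible way to visit as few elements as possible on a `BidirectionalCollection`.

```swift
extension BidirectionalCollection {
  /// Hypothetical stand-in for the PR's `trimming(where:)`; the name
  /// `sketchTrimming` is an assumption and is not part of swift-algorithms.
  func sketchTrimming(
    where predicate: (Element) throws -> Bool
  ) rethrows -> SubSequence {
    // Advance past every leading element that satisfies the predicate.
    var start = startIndex
    while start != endIndex, try predicate(self[start]) {
      formIndex(after: &start)
    }
    // Walk backwards from the end, stopping at the first element that does
    // not satisfy the predicate and never crossing `start`.
    var end = endIndex
    while end != start {
      let beforeEnd = index(before: end)
      guard try predicate(self[beforeEnd]) else { break }
      end = beforeEnd
    }
    // If everything matched, `start == end` and the slice is empty.
    return self[start..<end]
  }
}

// Mirrors the inputs used in the tests above.
let bothEnds = [2, 10, 11, 15, 20, 21, 100].sketchTrimming { $0.isMultiple(of: 2) }
print(Array(bothEnds)) // [11, 15, 20, 21]

let everything = [1, 3, 5, 7, 9].sketchTrimming { _ in true }
print(everything.isEmpty) // true
```

Each element is tested at most once, so the sketch stays within the O(_n_) bound stated in the guide; elements in the interior of the retained slice are never inspected.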