Swift: Sorting togetherness with Zip2Sequence and Stable Sort (Xcode 7.2; updated Swift 3, Xcode 8)


Let's face it, of all the types in Swift, Zip2Sequence receives very little press. This doesn't mean its role in the evolution of Swift has been forgotten at Apple. It just means that generally its role is  fulfilled using other approaches (or maybe being at the end of the alphabet us bloggers just haven't addressed its purpose yet).

The basics

As the name suggests, a Zip2Sequence is created by zipping together two sequences.
let firstArray = [1,2,3,4,5,6,7,8,9,10]
let secondArray = ["A","B","C","D","E","F","G","H","I","J"]
// first way of instantiating
let zipped1 = Zip2Sequence(firstArray,secondArray)
// second way of instantiating
let zipped2 = zip(firstArray, secondArray)
Note: The two sequences preserve their original types, the number of zipped together pairs is as long as the shortest of the zipped sequences, and if one sequence is longer than the other then the final values of the longer array are omitted.

What is it good for?

We can do many of the things that we expect from working with a sequence including employing higher-order functions like map, flatMap, reduce, split, and so on. We can also use a Zip2Sequence to instantiate an Array, providing us with an array of tuples.
Array(zipped2) // [(.0 1, .1 "A"), (.0 2, .1 "B"), (.0 3, .1 "C"), (.0 4, .1 "D"), (.0 5, .1 "E"), (.0 6, .1 "F"), (.0 7, .1 "G"), (.0 8, .1 "H"), (.0 9, .1 "I"), (.0 10, .1 "J")]

Getting sorted

What I'm most interested in in this post is to discuss the way in which Zip2Sequence can sort two arrays based on the reordering of one array. The kind of thing we see in spreadsheets where we have two or more columns and one of those determines the reordering of all the others so that rows stay together in the process of reordering. While I've started to look at this before (here and here), what we have with Zip2Sequence is the potential for a cleaner and arguably more functional approach that is more reliable.

Let's suppose first of all that we have a SortType so that we don't confuse our greater than and less than
enum SortType {
    case Ascending
    case Descending
}
We can then write an extension for arrays with elements of comparable type:
extension Array where Element: Comparable {
    func secondarySort<A>(arr2:[A], type:SortType = .Ascending) -> [A] {
        // instantiate a Zip2Sequence instance
        let zipped = zip(self, arr2)
        
        let sortedZip = zipped.sort({
            if type == .Ascending {return $0.0 < $1.0}
            else {return $0.0 > $1.0}
        })
        return sortedZip.map({$0.1})
    }
}
This can be implemented like so:
let testArray1 = [3,2,4,1,5]
let testArray2 = ["W","I","F","S","T"]

testArray1.secondarySort(testArray2) // ["S", "I", "W", "F", "T"]
In the sorting of a Zip2Sequence $0.0 is the current value in the first sequence and $1.0 is the look behind value in the first sequence, while .1 values are from the second sequence. So if the current item is ordered before the look behind item in an ascending sort then $0.0 will need to be less than the look behind [or previous] value $0.1.

Note: I've called this a secondary sort for want of another term. I've not yet come across an exact term for this type of sorting. (If you know of one please post below.)

More functional less mutable

Now we can imagine re-using the same array to order numerous other arrays, but is it sensible? After all the original array might be mutable. So how about we build an enum that makes things a bit more stable:
enum SortType {
    case Ascending
    case Descending
}
enum SortError:ErrorType {
    case ArrayLengthsNotEqual
}

extension RangeReplaceableCollectionType where Generator.Element: Comparable {
    func secondarySort<A>(arr2:[A], type:SortType = .Ascending, stableSort:Bool = true) -> [A] {
        // instantiate a Zip2Sequence instance
        let zipped = zip(self, arr2)
        
     
            let sortedZip = zipped.sort({
                if type == .Ascending {return $0.0 < $1.0}
                else {return $0.0 > $1.0}
            })
            return sortedZip.map({$0.1})
     

    }
}

enum SortBy {
    
    case IntegerArray([Int]), DoubleArray([Double]), FloatArray([Float]), CGFloatArray([CGFloat]), StringArray([String])
    
    init?<S: SequenceType>(arr: S) {
        if arr is [Int] {
            self = SortBy.IntegerArray(arr as! [Int])
        }
        else if arr is [Double] {
            self = SortBy.DoubleArray(arr as! [Double])
        }
        else if arr is [Float] {
            self = SortBy.FloatArray(arr as! [Float])
        }
        else if arr is [CGFloat] {
            self = SortBy.CGFloatArray(arr as! [CGFloat])
        }
        else if arr is [String] {
            self = SortBy.StringArray(arr as! [String])
        }
        else {
            return nil
        }
        
    }
    
    
    func sort<A>(arr:Array<A>, type:SortType = .Ascending, stableSort:Bool = true) throws -> Array<A> {
        switch self {
        case IntegerArray(let i):
            guard i.count == arr.count else {
                throw SortError.ArrayLengthsNotEqual
            }
            return i.secondarySort(arr, type: type, stableSort:stableSort)
            
        case DoubleArray(let d):
            guard d.count == arr.count else {
                throw SortError.ArrayLengthsNotEqual
            }
            return d.secondarySort(arr, type: type, stableSort:stableSort)
        case FloatArray(let f):
            guard f.count == arr.count else {
                throw SortError.ArrayLengthsNotEqual
            }
            return f.secondarySort(arr, type: type, stableSort:stableSort)
        case CGFloatArray(let cg):
            guard cg.count == arr.count else {
                throw SortError.ArrayLengthsNotEqual
            }
            return cg.secondarySort(arr, type: type, stableSort:stableSort)
        case StringArray(let str):
            guard str.count == arr.count else {
                throw SortError.ArrayLengthsNotEqual
            }
            return str.secondarySort(arr, type: type, stableSort:stableSort)}
    }
    
}

// data
let arr1 = [1,3,9,3,4,5,6,7,8]
let arr2 = ["A","D","B","X","C","D","E","F","G"]
let arr3 = [2.3,2.3,0.7,8.9,2.3,7.9,1.4,6.4,15.7]

// implementatation
// first create your sorter using an Array of type Int, Double, Float, CGFloat or String
if let sorter = SortBy(arr: arr3) {
    do {

        let sortedArray1 = try sorter.sort(arr1)
        let sortedArray2 = try sorter.sort(arr2)
        let sortedArray3 = try sorter.sort(arr3)
        
    }
    catch SortError.ArrayLengthsNotEqual {
        print("Array lengths are not equal")
    }
    catch {
        print("other error")
    }
}

Stability

The observant will notice that I've introduced a parameter called stableSort: and that it's a Bool that I'm not currently utilising. The reason for this is because as soon as I posted code similar to the above on Gist, I was awakened to the lack of stable sort in this solution. And to solve this, I decided as an initial solution to borrow recent code from AirspeedVelocity (read more about stable sorting there):
// stable sort taken from AirspeedVelocity blog
// http://airspeedvelocity.net/2016/01/10/writing-a-generic-stable-sort/
extension RangeReplaceableCollectionType
    where
    Index: RandomAccessIndexType,
    SubSequence.Generator.Element == Generator.Element,
Index.Distance == Index.Stride {
    public mutating func stableSortInPlace(
        isOrderedBefore: (Generator.Element,Generator.Element)->Bool
        ) {
            let N = self.count
            
            var aux: [Generator.Element] = []
            aux.reserveCapacity(numericCast(N))
            
            func merge(lo: Index, _ mid: Index, _ hi: Index) {
                
                aux.removeAll(keepCapacity: true)
                
                var i = lo, j = mid
                while i < mid && j < hi {
                    if isOrderedBefore(self[j],self[i]) {
                        aux.append(self[j])
                        j += 1
                    }
                    else {
                        aux.append(self[i])
                        i += 1
                    }
                }
                aux.appendContentsOf(self[i..<mid])
                aux.appendContentsOf(self[j..<hi])
                self.replaceRange(lo..<hi, with: aux)
            }
            
            var sz: Index.Distance = 1
            while sz < N {
                for lo in startIndex.stride(to: endIndex-sz, by: sz*2) {
                    merge(lo, lo+sz, lo.advancedBy(sz*2,limit: endIndex))
                }
                sz *= 2
            }
    }
}
The observant will also have noted that I shifted away from an Array extension to a RangeReplaceableCollectType to get everything working in the above version of my code. This was to enable me to include the stable sort code. And here is the extension rewritten to include stable sort:
extension RangeReplaceableCollectionType where Generator.Element: Comparable {
    func secondarySort<A>(arr2:[A], type:SortType = .Ascending, stableSort:Bool = true) -> [A] {
        // instantiate a Zip2Sequence instance
        let zipped = zip(self, arr2)
        
        if stableSort {
            var arr = Array(zipped)
            if type == .Ascending {arr.stableSortInPlace({$0.0 < $1.0})}
            else {arr.stableSortInPlace({$0.0 > $1.0})}
            return arr.map({$0.1})
        }
        else {
            let sortedZip = zipped.sort({
                if type == .Ascending {return $0.0 < $1.0}
                else {return $0.0 > $1.0}
            })
            return sortedZip.map({$0.1})
        }
        
    }
}
If you want to simply work with two arrays ordering and forget the enums then these two pieces of code are what you need.

Complete code

So finally we have everything working in on place for you to cut and paste into a playground (also on Gist):
import UIKit
import Foundation
// stable sort taken from AirspeedVelocity blog
// http://airspeedvelocity.net/2016/01/10/writing-a-generic-stable-sort/
extension RangeReplaceableCollectionType
    where
    Index: RandomAccessIndexType,
    SubSequence.Generator.Element == Generator.Element,
Index.Distance == Index.Stride {
    public mutating func stableSortInPlace(
        isOrderedBefore: (Generator.Element,Generator.Element)->Bool
        ) {
            let N = self.count
            
            var aux: [Generator.Element] = []
            aux.reserveCapacity(numericCast(N))
            
            func merge(lo: Index, _ mid: Index, _ hi: Index) {
                
                aux.removeAll(keepCapacity: true)
                
                var i = lo, j = mid
                while i < mid && j < hi {
                    if isOrderedBefore(self[j],self[i]) {
                        aux.append(self[j])
                        j += 1
                    }
                    else {
                        aux.append(self[i])
                        i += 1
                    }
                }
                aux.appendContentsOf(self[i..<mid])
                aux.appendContentsOf(self[j..<hi])
                self.replaceRange(lo..<hi, with: aux)
            }
            
            var sz: Index.Distance = 1
            while sz < N {
                for lo in startIndex.stride(to: endIndex-sz, by: sz*2) {
                    merge(lo, lo+sz, lo.advancedBy(sz*2,limit: endIndex))
                }
                sz *= 2
            }
    }
}
// my code starts here

enum SortType {
    case Ascending
    case Descending
}
enum SortError:ErrorType {
    case ArrayLengthsNotEqual
}

extension RangeReplaceableCollectionType where Generator.Element: Comparable {
    func secondarySort<A>(arr2:[A], type:SortType = .Ascending, stableSort:Bool = true) throws -> [A] {
        
        guard self.count as? Int == arr2.count else {
            throw SortError.ArrayLengthsNotEqual
        }
        // instantiate a Zip2Sequence instance
        let zipped = zip(self, arr2)
        
        if stableSort {
            var arr = Array(zipped)
            if type == .Ascending {arr.stableSortInPlace({$0.0 < $1.0})}
            else {arr.stableSortInPlace({$0.0 > $1.0})}
            return arr.map({$0.1})
        }
        else {
            let sortedZip = zipped.sort({
                if type == .Ascending {return $0.0 < $1.0}
                else {return $0.0 > $1.0}
            })
            return sortedZip.map({$0.1})
        }
        
    }
    
    
    func secondarySortInPlace<A>(inout arr2:[A], type:SortType = .Ascending, stableSort:Bool = true) throws {    
        
            arr2 = try self.secondarySort(arr2)
    }
}
enum SortBy {
    
    case IntegerArray([Int]), DoubleArray([Double]), FloatArray([Float]), CGFloatArray([CGFloat]), StringArray([String])
    
    init?<S: SequenceType>(arr: S) {
        if arr is [Int] {
            self = SortBy.IntegerArray(arr as! [Int])
        }
        else if arr is [Double] {
            self = SortBy.DoubleArray(arr as! [Double])
        }
        else if arr is [Float] {
            self = SortBy.FloatArray(arr as! [Float])
        }
        else if arr is [CGFloat] {
            self = SortBy.CGFloatArray(arr as! [CGFloat])
        }
        else if arr is [String] {
            self = SortBy.StringArray(arr as! [String])
        }
        else {
            return nil
        }
        
    }
    
    
    func sort<A>(arr:Array<A>, type:SortType = .Ascending, stableSort:Bool = true) throws -> Array<A> {
        switch self {
        case IntegerArray(let i):
            return try i.secondarySort(arr, type: type, stableSort:stableSort)
        case DoubleArray(let d):
            return try d.secondarySort(arr, type: type, stableSort:stableSort)
        case FloatArray(let f):
            return try f.secondarySort(arr, type: type, stableSort:stableSort)
        case CGFloatArray(let cg):
            return try cg.secondarySort(arr, type: type, stableSort:stableSort)
        case StringArray(let str):
            return try str.secondarySort(arr, type: type, stableSort:stableSort)}
    }
    
    func sortInPlace<A>(inout arr:Array<A>, type:SortType = .Ascending, stableSort:Bool = true) throws  {
        switch self {
        case IntegerArray(let i):
            try i.secondarySortInPlace(&arr)
        case DoubleArray(let d):
            try d.secondarySortInPlace(&arr, type: type, stableSort:stableSort)
        case FloatArray(let f):
            try f.secondarySortInPlace(&arr, type: type, stableSort:stableSort)
        case CGFloatArray(let cg):
            try cg.secondarySortInPlace(&arr, type: type, stableSort:stableSort)
        case StringArray(let str):
            try str.secondarySortInPlace(&arr, type: type, stableSort:stableSort)}
    }
    
}

Your implementation can then be as follows using sort:
// data
var arr1 = [1,3,9,3,4,5,6,7,8]
var arr2 = ["A","C","B","X","C","D","E","F","G"]
var arr3 = [2.3,5.2,0.7,8.9,10.6,7.9,1.4,6.4,15.7]

// implementatation
// first create your sorter,  the ordering array (used to create the SortBy instance) must contain Int, Float, Double, CGFloat or String
if let sorter = SortBy(arr: arr3) {
    do {
        arr1 = try sorter.sort(arr1)
        arr2 = try sorter.sort(arr2)
        arr3 = try sorter.sort(arr3)     
    }
    catch SortError.ArrayLengthsNotEqual {
        print("Array lengths are not equal")
    }
   catch {
        print("other error")
   }
}
Or like this with sortInPlace:
// data
var arr1 = [1,3,9,3,4,5,6,7,8]
var arr2 = ["A","C","B","X","C","D","E","F","G"]
var arr3 = [2.3,5.2,0.7,2.3,10.6,7.9,2.3,6.4,15.7]

// implementation
// first create your sorter,  the ordering array (used to create the SortBy instance) must contain Int, Float, Double, CGFloat or String
if let sorter = SortBy(arr: arr3) {
    do {
        try sorter.sortInPlace(&arr3)
        try sorter.sortInPlace(&arr2)
        try sorter.sortInPlace(&arr1)
    }
    catch SortError.ArrayLengthsNotEqual {
        print("Array lengths are not equal")
    }
    catch {
        print("other error")
    }
}
Thanks to Al Skipp for encouraging some code clean-up here.

Conclusion

Zip2Sequence has been a small part of something much larger but it has been an important pivot in working out a way of sorting multiple arrays based on the reordering of a single one.

Swift 3, Xcode 8

import UIKit

// stable sort taken from AirspeedVelocity blog
// http://airspeedvelocity.net/2016/01/10/writing-a-generic-stable-sort/

extension RangeReplaceableCollection
    where
    Index: Strideable,
    SubSequence.Iterator.Element == Iterator.Element,
IndexDistance == Index.Stride {


    public mutating func stableSortInPlace(
        isOrderedBefore: @escaping (Iterator.Element,Iterator.Element)->Bool
        ) {
        let N = self.count
        
        var aux: [Iterator.Element] = []
        aux.reserveCapacity(numericCast(N))
        
        func merge(lo: Index, mid: Index, hi: Index) {
            aux.removeAll(keepingCapacity: true)
            
            var i = lo, j = mid
            while i < mid && j < hi {
                if isOrderedBefore(self[j],self[i]) {
                    aux.append(self[j])
                    j = (j as! Int + 1) as! Self.Index
                }
                else {
                    aux.append(self[i])
                    i = (i as! Int + 1) as! Self.Index
                }
            }
            aux.append(contentsOf:self[i..<mid])
            aux.append(contentsOf:self[j..<hi])
            self.replaceSubrange(lo..<hi, with: aux)
        }
        
        var sz: IndexDistance = 1
        while sz < N {
            for lo in stride(from:startIndex, to: endIndex-sz, by: sz*2) {
                 // see https://swift.org/migration-guide/
                if let hiVal = self.index(lo, offsetBy:sz*2, limitedBy: endIndex) {
                merge(lo:lo, mid: lo+sz, hi:hiVal)
                }

            }
            sz *= 2
        }
    }
}
// my code starts here

enum SortType {
    case Ascending
    case Descending
}
enum SortError:Error {
    case ArrayLengthsNotEqual
}

extension Array where Element: Comparable {
    func secondarySort<A>(arr2:[A], type:SortType = .Ascending, stableSort:Bool = true) throws -> [A] {
        
        guard self.count  == arr2.count else {
            throw SortError.ArrayLengthsNotEqual
        }
        // instantiate a Zip2Sequence instance
        let zipped = zip(self, arr2)
        
        if stableSort {
            var arr = zipped.map{(a, b) in (a,b)}
            if type == .Ascending {arr.stableSortInPlace(isOrderedBefore: {$0.0 < $1.0})}
            else {arr.stableSortInPlace(isOrderedBefore: {$0.0 > $1.0})}
            return arr.map({$0.1})
        }
        else {
            let sortedZip = zipped.sorted(by: {
                if type == .Ascending {return $0.0 < $1.0}
                else {return $0.0 > $1.0}
            })
            return sortedZip.map({$0.1})
        }
        
    }
    
    
    func secondarySortInPlace<A>( arr2:inout [A], type:SortType = .Ascending, stableSort:Bool = true) throws {
        
        arr2 = try self.secondarySort(arr2: arr2)
    }
}
enum SortBy {
    
    case IntegerArray([Int]), DoubleArray([Double]), FloatArray([Float]), CGFloatArray([CGFloat]), StringArray([String])
    
    init?<S: Sequence>(arr: S) {
        if arr is [Int] {
            self = SortBy.IntegerArray(arr as! [Int])
        }
        else if arr is [Double] {
            self = SortBy.DoubleArray(arr as! [Double])
        }
        else if arr is [Float] {
            self = SortBy.FloatArray(arr as! [Float])
        }
        else if arr is [CGFloat] {
            self = SortBy.CGFloatArray(arr as! [CGFloat])
        }
        else if arr is [String] {
            self = SortBy.StringArray(arr as! [String])
        }
        else {
            return nil
        }
        
    }
    
    
    func sort<A>(arr:Array<A>, type:SortType = .Ascending, stableSort:Bool = true) throws -> Array<A> {
        switch self {
        case .IntegerArray(let i):
            return try i.secondarySort(arr2: arr, type: type, stableSort:stableSort)
        case .DoubleArray(let d):
            return try d.secondarySort(arr2: arr, type: type, stableSort:stableSort)
        case .FloatArray(let f):
            return try f.secondarySort(arr2: arr, type: type, stableSort:stableSort)
        case .CGFloatArray(let cg):
            return try cg.secondarySort(arr2: arr, type: type, stableSort:stableSort)
        case .StringArray(let str):
            return try str.secondarySort(arr2: arr, type: type, stableSort:stableSort)}
    }
    
    func sortInPlace<A>(arr:inout Array<A>, type:SortType = .Ascending, stableSort:Bool = true) throws  {
        switch self {
        case .IntegerArray(let i):
            try i.secondarySortInPlace(arr2: &arr)
        case .DoubleArray(let d):
            try d.secondarySortInPlace(arr2: &arr, type: type, stableSort:stableSort)
        case .FloatArray(let f):
            try f.secondarySortInPlace(arr2: &arr, type: type, stableSort:stableSort)
        case .CGFloatArray(let cg):
            try cg.secondarySortInPlace(arr2: &arr, type: type, stableSort:stableSort)
        case .StringArray(let str):
            try str.secondarySortInPlace(arr2: &arr, type: type, stableSort:stableSort)}
    }
    
}

// data
var arr1 = [1,3,9,3,4,5,6,7,8]
var arr2 = ["A","C","B","X","C","D","E","F","G"]
var arr3 = [2.3,5.2,0.7,8.9,10.6,7.9,1.4,6.4,15.7]

// implementatation
// first create your sorter,  the ordering array (used to create the SortBy instance) must contain Int, Float, Double, CGFloat or String
if let sorter = SortBy(arr: arr3) {
    do {
        arr1 = try sorter.sort(arr: arr1)
        arr2 = try sorter.sort(arr: arr2)
        arr3 = try sorter.sort(arr: arr3)
    }
    catch SortError.ArrayLengthsNotEqual {
        print("Array lengths are not equal")
    }
    catch {
        print("other error")
    }
}



Further Reading

'[swift-evolution] [Manifesto] Completing Generics' (Douglas Gregor, Apple)

'A Little Respect for AnySequence' (Rob Napier)

'Writing a Generic Stable Sort' (AirspeedVelocity)

'Sequence Performance' (Big O Note-Taking)

'I would like to "join" two sequences in to a sequence of tuples' (Stackoverflow)

Further thoughts

As a quick point: we can work on creating a Zip_Sequence type with three sequences and above as proposed by Douglas Gregor, and here is some of my own rough work towards that.
struct Zip3Sequence<A: RangeReplaceableCollectionType, B: RangeReplaceableCollectionType, C:RangeReplaceableCollectionType where A.Generator.Element: Comparable> {
    var zipped:[(C.Generator.Element, B.Generator.Element, A.Generator.Element)]
    
    init(let sequence1:C, var sequence2:B, var sequence3:A) {
        var array: Array<(C.Generator.Element, B.Generator.Element, A.Generator.Element)> = []
        for a in sequence1.enumerate() {
            if let s2 = sequence2.first,
                s3 = sequence3.first {
            let tup = (a.element,s2, s3)
                    array.append(tup)
                   sequence2.removeFirst()
                   sequence3.removeFirst()
            }
        }
        zipped = array
    }
   
}

let array1 = [3,2,1]
let array2 = ["A","C","B"]
let array3 = [1.3,2.4,5.6]
Zip3Sequence(sequence1: array1, sequence2: array2, sequence3: array3)

Swift 3 version

struct Zip3Sequence where A.Iterator.Element: Comparable {
    var zipped:[(C.Generator.Element, B.Generator.Element, A.Generator.Element)]
    
    init(sequence1:C, sequence2:B, sequence3:A) {
        let sequence1 = sequence1
        var sequence2 = sequence2
        var sequence3 = sequence3
        var array: Array<(C.Generator.Element, B.Generator.Element, A.Generator.Element)> = []
        for a in sequence1.enumerated() {
            if let s2 = sequence2.first,
                let s3 = sequence3.first {
                let tup = (a.element,s2, s3)
                array.append(tup)
                sequence2.removeFirst()
                sequence3.removeFirst()
            }
        }
        zipped = array
    }
    
}

let array1 = [3,2,1]
let array2 = ["A","C","B"]
let array3 = [1.3,2.4,5.6]
let zp = Zip3Sequence(sequence1: array1, sequence2: array2, sequence3: array3)
But this is something for another time, and I make no claims for the utility of this code.


Comments