Swift: A pure Swift method for returning ranges of a String instance (updated for Xcode 6.3.1, Swift 1.2, Xcode 7, Swift 2 and Xcode 8.0, Swift 3)


A selection of algorithms that work with Strings

Often familiarity makes us turn back to the Cocoa Framework, but Swift has a wealth of algorithms that we can use with String. These include:
var str = "Hello Swift!"
var a = "Hello World!"

distance(str.startIndex, str.endIndex) // string length
count(str) // string length

str[advance(str.startIndex, 4)] // get character at index 4
str[advance(str.startIndex, 4)...advance(str.startIndex, 8)] // get characters in range index 4 to 8

last(str) // retrieve last letter
first(str) // retrieve first letter
dropFirst(str) // remove first letter
dropLast(str) // remove last letter

filter(str, {!contains("aeiou", $0)}) // remove vowels

indices(str) // retrieve the Range value for string

isEmpty(str) // test whether there is anything in the string
minElement(indices(str)) // first index

str.substringToIndex(advance(minElement(indices(str)), 5)) // returns string up to the 5th character
str.substringFromIndex(advance(minElement(indices(str)), 5)) // returns string from the 5th character

min("antelope","ant") // returns the alphabetical first
max("antelope","ant") // returns the alphabetical last

prefix(str, 5) // returns first 5 characters
reverse(str) // return reverse array of Characters

suffix(str, 5) // returns last 5 characters
swap(&str, &a) // swaps two strings for the value of one another

Update: Swift 2 (string algorithms)

var str = "Hello Swift!"
var aStr = "Hello World!"

str.startIndex // first index
str.endIndex // end index

str.startIndex.distanceTo(str.endIndex) // string length
str.characters.count // string length

str[str.startIndex.advancedBy(4)] // get character at index 4
str[str.startIndex.advancedBy(4)...str.startIndex.advancedBy(8)] // get characters in range index 4 to 8

str.characters.last // retrieve last letter
str.characters.first // retrieve first letter

str.removeAtIndex(str.characters.indices.first!) // remove first letter
str.removeAtIndex(str.characters.indices.last!) // remove last letter

let first = str.characters.dropFirst()
String(first) // dropFirst

let last = str.characters.dropLast()
String(last) // dropLast

"aeiou".characters.contains("a") // contains character
str.characters.filter{!"aeiou".characters.contains($0)} // remove vowels

str.characters.indices // retrieve the Range value for string

str.isEmpty // test whether there is anything in the string

str.startIndex.advancedBy(5) // advance index
str.substringToIndex(str.startIndex.advancedBy(5)) // returns string up to the 5th character
str.substringFromIndex(str.startIndex.advancedBy(5)) // returns string from the 5th character

min("antelope","ant") // returns the alphabetical first
max("antelope","ant") // returns the alphabetical last

str.characters.prefix(5) // returns first 5 characters
str.characters.reverse() // return reverse array of Characters

str.characters.suffix(5) // returns last 5 characters
swap(&str, &aStr) // swaps two strings for the value of one another

Reversing a string

Things start to get interesting when you start to combine these algorithms, like so:
var str = "hello"
var revStr = ""
for _ in str {
    revStr.append(last(str)!)
    str = dropLast(str)
}
revStr // "olleh"

Update: Swift 2 (reversing a string)

var str = "Hello Swift!"
String(str.characters.reverse()) // "!tfiwS olleH"
In Swift 2 this process for reversing a string reduces to a single expression.

rangesOfString:

It is also possible to go further and remove the need for Cocoa Framework methods like rangeOfString. Here I'm doing something very similar by retrieving all the ranges of a search string within a string:
extension String {
    func rangesOfString(findStr:String) -> [Range<String.Index>] {
        var arr = [Range<String.Index>]()
        var startInd = self.startIndex
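        // an empty search string has no first character, so return early
        if isEmpty(findStr) { return arr }
        // check first that the first character of search string exists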
        if contains(self, first(findStr)!) {
            // if so set this as the place to start searching
            startInd = find(self,first(findStr)!)!
        }
        else {
            // if not return empty array
            return arr
        }
        var i = distance(self.startIndex, startInd)
        while i<=count(self)-count(findStr) {
            if self[advance(self.startIndex, i)..<advance(self.startIndex, i+count(findStr))] == findStr {
                arr.append(Range(start:advance(self.startIndex, i),end:advance(self.startIndex, i+count(findStr))))
                i = i+count(findStr)
            }
            else {
                i++
            }
        }
        return arr
    }
} // try further optimisation by jumping to next index of first search character after every find


"a very good hello, hello".rangesOfString("hello")
This uses a String extension written entirely in Swift with no added Cocoa.

Update: Swift 2 implementation (rangesOfString:)

Note that not only are instance methods now used as opposed to functions, but find() is replaced by indexOf().
extension String {
    func rangesOfString(findStr:String) -> [Range<String.Index>] {
        var arr = [Range<String.Index>]()
        var startInd = self.startIndex
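        // an empty search string has no first character, so return early
        if findStr.isEmpty { return arr }
        // check first that the first character of search string exists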
        if self.characters.contains(findStr.characters.first!) {
            // if so set this as the place to start searching
            startInd = self.characters.indexOf(findStr.characters.first!)!
        }
        else {
            // if not return empty array
            return arr
        }
        var i = self.startIndex.distanceTo(startInd)
        while i<=self.characters.count-findStr.characters.count {
            if self[self.startIndex.advancedBy(i)..<self.startIndex.advancedBy(i+findStr.characters.count)] == findStr {
                arr.append(Range(start:self.startIndex.advancedBy(i),end:self.startIndex.advancedBy(i+findStr.characters.count)))
                i = i+findStr.characters.count
            }
            else {
                i++
            }
        }
        return arr
    }
} // try further optimisation by jumping to next index of first search character after every find


"a very good hello, hello".rangesOfString("hello")

Update: Swift 3 implementation (rangesOfString:)

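Note that in Swift 3 indexOf() becomes index(of:), distanceTo() and advancedBy() give way to the String methods distance(from:to:) and index(_:offsetBy:), Range(start:end:) is replaced by the ..< operator, and i++ becomes i += 1.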
extension String {
    func rangesOfString(findStr:String) -> [Range<String.Index>] {
        var arr = [Range<String.Index>]()
        var startInd = self.startIndex
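        // an empty search string has no first character, so return early
        if findStr.isEmpty { return arr }
        // check first that the first character of search string exists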
        if self.characters.contains(findStr.characters.first!) {
            // if so set this as the place to start searching
            startInd = self.characters.index(of:findStr.characters.first!)!
        }
        else {
            // if not return empty array
            return arr
        }
        var i = self.distance(from:startIndex, to: startInd)
        while i<=self.characters.count-findStr.characters.count {
            if self[self.index(startIndex,offsetBy:i)..<self.index(startIndex, offsetBy: i+findStr.characters.count)] == findStr {
                arr.append(self.index(startIndex,offsetBy:i)..<self.index(startIndex, offsetBy:i+findStr.characters.count))
                i = i+findStr.characters.count
            }
            else {
                i += 1
            }
        }
        return arr
    }
} // try further optimisation by jumping to next index of first search character after every find



let a = "a very good hello, hello"
for sub in a.rangesOfString(findStr: "hello") {
    a.substring(with: sub) // "hello"
}
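
The closing comment on the extension suggests jumping to the next occurrence of the first search character after every find. Here is a minimal sketch of that idea, assuming current Swift (5.x), which postdates the versions covered in this post. The name rangesSketch(of:) is hypothetical. It also walks String.Index values directly rather than recomputing each index from an integer offset.

```swift
extension String {
    // Hypothetical helper for illustration; not part of the standard library.
    func rangesSketch(of findStr: String) -> [Range<String.Index>] {
        var result = [Range<String.Index>]()
        // An empty search string has no first character to look for.
        guard let firstChar = findStr.first else { return result }
        let length = findStr.count
        var searchStart = startIndex
        // Jump straight to the next occurrence of the first search character.
        while let candidate = self[searchStart...].firstIndex(of: firstChar) {
            // Stop once fewer than `length` characters remain.
            guard let end = index(candidate, offsetBy: length, limitedBy: endIndex) else { break }
            if self[candidate..<end] == findStr {
                result.append(candidate..<end)
                searchStart = end                      // continue after the match (non-overlapping)
            } else {
                searchStart = index(after: candidate)  // move past the false start
            }
        }
        return result
    }
}

let sample = "a very good hello, hello"
let found = sample.rangesSketch(of: "hello")
// Character offsets of each match from the start of the string
let offsets = found.map { sample.distance(from: sample.startIndex, to: $0.lowerBound) } // [12, 19]
```

Like the extensions above, this returns non-overlapping matches, so "hellohello" yields two ranges.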
Now let's suppose we wanted to identify ranges of strings so that we might add an attribute to an NSMutableAttributedString, and that the string in question contained emoji. Immediately we have an issue: an NSString is UTF-16 and counts each emoji as the number of UTF-16 code units within it, whereas a Swift String counts each emoji as a single character.
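
To make the mismatch concrete, here is a small sketch assuming current Swift (5.x) syntax. The variable names are only illustrative.

```swift
let kiss = "😘"

// A Swift String counts user-perceived characters (grapheme clusters).
let characterCount = kiss.count        // 1

// NSString-style lengths count UTF-16 code units; this emoji needs a surrogate pair.
let utf16Count = kiss.utf16.count      // 2

// Everything after the emoji is therefore shifted by one in UTF-16 terms.
let text = "a😘b"
let bIndex = text.firstIndex(of: "b")!
let characterOffset = text.distance(from: text.startIndex, to: bIndex)             // 2
let utf16Offset = text.utf16.distance(from: text.utf16.startIndex, to: bIndex)     // 3
```

An NSRange built from characterOffset would point one code unit too early, which is why the ranges need to be measured in UTF-16.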

The simplest solution is to cast to an NSString before searching for substrings, so that NSRanges are returned based on UTF-16 measurement. But if we wanted to work in Swift and return ranges that remain compatible when transformed into NSRange values, then we should do the following:
extension String {
    func rangesOfUTF16String(findStr:String) -> [Range<String.UTF16View.Index>] {
        var arr = [Range<String.UTF16View.Index>]()
        var startInd = self.utf16.startIndex
        // an empty search string has no first code unit, so return early
        if findStr.isEmpty { return arr }
        // check first that the first UTF-16 code unit of search string exists
        if self.utf16.contains(findStr.utf16.first!) {
            // if so set this as the place to start searching
            startInd = self.utf16.index(of:findStr.utf16.first!)!
        }
        else {
            // if not return empty array
            return arr
        }
        var i = self.utf16.distance(from:self.utf16.startIndex, to: startInd)
    
        while i<=self.utf16.count-findStr.utf16.count {
            if String(self.utf16[self.utf16.index(self.utf16.startIndex,offsetBy:i)..<self.utf16.index(self.utf16.startIndex, offsetBy: i+findStr.utf16.count)]) == findStr {
                arr.append(self.utf16.index(self.utf16.startIndex,offsetBy:i)..<self.utf16.index(self.utf16.startIndex, offsetBy:i+findStr.utf16.count))
                i = i+findStr.utf16.count
            }
            else {
                i += 1
            }
        }
        return arr
    }
    
} // try further optimisation by jumping to next index of first search character after every find



import UIKit

let a = "a very good hello,😘 hello"
let attrString = NSMutableAttributedString(string: a)

for sub in a.rangesOfUTF16String(findStr: "hello") {
    // convert the UTF-16 index range into an NSRange (location and length in UTF-16 code units)
    let loc = attrString.string.utf16.distance(from: attrString.string.utf16.startIndex, to: sub.lowerBound)
    let len = attrString.string.utf16.distance(from: sub.lowerBound, to: sub.upperBound)
    let nsRange = NSRange(location: loc, length: len)

    attrString.addAttribute(NSForegroundColorAttributeName, value: UIColor.red, range: nsRange)
}
attrString
This leverages UTF16View so that the range values stay correct when they are converted to NSRange.
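
For comparison, the simpler route mentioned earlier, searching through NSString so that NSRange values come back directly, might look like the sketch below. It assumes current Swift (5.x) with Foundation, and the function name nsRanges(of:in:) is hypothetical.

```swift
import Foundation

// Hypothetical helper: collects non-overlapping NSRange matches by searching via NSString.
func nsRanges(of findStr: String, in text: String) -> [NSRange] {
    let nsText = NSString(string: text)
    var result = [NSRange]()
    var searchRange = NSRange(location: 0, length: nsText.length)
    while searchRange.length > 0 {
        let found = nsText.range(of: findStr, options: [], range: searchRange)
        // NSNotFound signals that no further match exists (also the case for an empty search string).
        if found.location == NSNotFound { break }
        result.append(found)
        // Continue searching after the end of this match.
        let next = found.location + found.length
        searchRange = NSRange(location: next, length: nsText.length - next)
    }
    return result
}

let emojiText = "a very good hello,😘 hello"
let emojiRanges = nsRanges(of: "hello", in: emojiText)
// Locations are measured in UTF-16 code units, so the emoji counts as two: 12 and 21.
```

The second match sits at character offset 20 but UTF-16 location 21, which is the value that attribute methods on NSMutableAttributedString expect.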

Comments

  1. Excellence! This gave me the clarity I needed to write some substring extensions. Great job.

  2. rangesOfString() above assumes literal string equivalence. This ignores canonical equivalence: http://en.wikipedia.org/wiki/Unicode_equivalence

    1. Is it possible for you to provide an example of an input where this falls down so that I might test and explore revisions?

  3. How does your rangeOfString extension compare in terms of performance to the native ObjC/Cocoa one?

    1. A great question. I shall revisit and test, then publish results.

  4. Your code has a bug on this line:
    i++
    It should be:
    else {i++}

    Your code fails on this case
    "hellohello".rangesOfString("hello")

    1. You are absolutely right. Thanks for pointing this out. I've updated the code accordingly.

  5. rangesOfString is a lifesaver, thanks!

    1. Thanks for your comment, I'm glad it's helped you.

  6. First, thanks for making this post, always appreciated. Second, I noticed some Swift 2.0 inconsistencies. I'm running the latest Xcode beta (7.1). I find that myString.characters.prefix(idx) works while prefix(myString.characters, 5) seems to be deprecated. Greg

    1. Thanks Greg, I've now updated all code to work with the release version of Xcode 7. Note: I haven't added any of the newly available String methods.

  7. Thank you! Great help. I especially appreciate the updates for Swift 2 .

  8. Can you explain how I could use this?
    rangeOfCharacter(from:options:range:)

    1. To quote the docs, it "Finds and returns the range in the String of the first character from a given character set found in a given range with given options." For example, if you pass this method a .decimalDigits character set then it will return the range of the first decimal digit. If you had a string like "Table 12", the method would return the range of the "1".

    2. Here's a coded example:

      import Foundation

      func removePrecedingZeroes(str: String) -> String? {
          // everything that may start the result: non-zero decimal digits and "."
          var aSet = CharacterSet.decimalDigits
          aSet.remove("0")
          aSet.insert(".")
          // search the str parameter rather than the global aString
          if let start = str.rangeOfCharacter(from: aSet)?.lowerBound {
              return str.substring(from: start)
          }
          return nil
      }

      var aString = "000000000000123454"
      removePrecedingZeroes(str: aString) // "123454"

  9. I'm getting a "No '..<' candidates produce the expected contextual result type 'Range'" in these lines of code -> arr.append(self.index(startIndex,offsetBy:i)..<self.index(startIndex, offsetBy:i+findStr.characters.count)) in Swift 3.

    Thanks in advance!
