Swift: A pure Swift method for returning ranges of a String instance (updated for Xcode 6.3.1, Swift 1.2, Xcode 7, Swift 2 and Xcode 8.0, Swift 3)
A selection of algorithms that work with Strings
Often familiarity makes us turn back to the Cocoa Framework, but Swift has a wealth of algorithms that we can use with String. These include:

distance(str.startIndex, str.endIndex) // string length
count(str) // string length
str[advance(str.startIndex, 4)] // get character at index 4
str[advance(str.startIndex, 4)...advance(str.startIndex, 8)] // get characters in range index 4 to 8
last(str) // retrieve last letter
first(str) // retrieve first letter
dropFirst(str) // remove first letter
dropLast(str) // remove last letter
filter(str, {!contains("aeiou", $0)}) // remove vowels
indices(str) // retrieve the Range value for string
isEmpty(str) // test whether there is anything in the string
minElement(indices(str)) // first index
str.substringToIndex(advance(minElement(indices(str)), 5)) // returns string up to the 5th character
str.substringFromIndex(advance(minElement(indices(str)), 5)) // returns string from the 5th character
min("antelope","ant") // returns the alphabetical first
max("antelope","ant") // returns the alphabetical last
prefix(str, 5) // returns first 5 characters
reverse(str) // return reverse array of Characters
suffix(str, 5) // returns last 5 characters
swap(&str, &a) // swaps two strings for the value of one another
Update: Swift 2 (string algorithms)
var str = "Hello Swift!"
var aStr = "Hello World!"
str.startIndex // first index
str.endIndex // end index
str.startIndex.distanceTo(str.endIndex) // string length
str.characters.count // string length
str[str.startIndex.advancedBy(4)] // get character at index 4
str[str.startIndex.advancedBy(4)...str.startIndex.advancedBy(8)] // get characters in range index 4 to 8
str.characters.last // retrieve last letter
str.characters.first // retrieve first letter
str.removeAtIndex(str.characters.indices.first!) // remove first letter
str.removeAtIndex(str.characters.indices.last!) // remove last letter
let first = str.characters.dropFirst()
String(first) // dropFirst
let last = str.characters.dropLast()
String(last) // dropLast
"aeiou".characters.contains("a") // contains character
str.characters.filter{!"aeiou".characters.contains($0)} // remove vowels
str.characters.indices // retrieve the Range value for string
str.isEmpty // test whether there is anything in the string
str.startIndex.advancedBy(5) // advance index
str.substringToIndex(str.startIndex.advancedBy(5)) // returns string up to the 5th character
str.substringFromIndex(str.startIndex.advancedBy(5)) // returns string from the 5th character
min("antelope","ant") // returns the alphabetical first
max("antelope","ant") // returns the alphabetical last
str.characters.prefix(5) // returns first 5 characters
str.characters.reverse() // return reverse array of Characters
str.characters.suffix(5) // returns last 5 characters
swap(&str, &aStr) // swaps two strings for the value of one another
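For readers on a newer toolchain, here is a rough sketch of how a few of the operations above might look. It assumes Swift 5 or later, which this post predates, where String is itself a collection of Characters and the .characters view is no longer needed. The constant names are only illustrative.

```swift
// Assumes Swift 5 or later; the lists above target Swift 1.2 and Swift 2.
let str = "Hello Swift!"

let length = str.count                                     // 12
let fifth = str[str.index(str.startIndex, offsetBy: 4)]    // "o", the character at offset 4
let withoutFirst = String(str.dropFirst())                 // "ello Swift!"
let withoutVowels = str.filter { !"aeiou".contains($0) }   // "Hll Swft!"
let firstFive = String(str.prefix(5))                      // "Hello"
let lastFive = String(str.suffix(5))                       // "wift!"
let backwards = String(str.reversed())                     // "!tfiwS olleH"
```

dropFirst, prefix and suffix return a Substring that shares storage with the original, which is why the sketch wraps them in String(...) before keeping them.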
Reversing a string
Things start to get interesting when you start to combine these algorithms, as in this process for reversing a string:

var str = "hello"
var revStr = ""
for i in str {
    revStr.append(last(str)!)
    str = dropLast(str)
}
revStr // "olleh"

Update: Swift 2 (reversing a string)

var str = "Hello Swift!"
String(str.characters.reverse()) // "!tfiwS olleH"
rangesOfString:
It is also possible to go further and remove the need for Cocoa Framework methods like rangeOfString. Here I'm doing something very similar: retrieving every range of a search string, using a String extension written entirely in Swift with no added Cocoa.

extension String {
    func rangesOfString(findStr:String) -> [Range<String.Index>] {
        var arr = [Range<String.Index>]()
        var startInd = self.startIndex
        // check first that the first character of search string exists
        if contains(self, first(findStr)!) {
            // if so set this as the place to start searching
            startInd = find(self,first(findStr)!)!
        } else {
            // if not return empty array
            return arr
        }
        var i = distance(self.startIndex, startInd)
        while i<=count(self)-count(findStr) {
            if self[advance(self.startIndex, i)..<advance(self.startIndex, i+count(findStr))] == findStr {
                arr.append(Range(start:advance(self.startIndex, i),end:advance(self.startIndex, i+count(findStr))))
                i = i+count(findStr)
            } else {
                i++
            }
        }
        return arr
    }
}
// try further optimisation by jumping to next index of first search character after every find
"a very good hello, hello".rangesOfString("hello")
Update: Swift 2 implementation (rangesOfString:)
Note that not only are instance methods now used as opposed to functions, but find() is replaced by indexOf().

extension String {
    func rangesOfString(findStr:String) -> [Range<String.Index>] {
        var arr = [Range<String.Index>]()
        var startInd = self.startIndex
        // check first that the first character of search string exists
        if self.characters.contains(findStr.characters.first!) {
            // if so set this as the place to start searching
            startInd = self.characters.indexOf(findStr.characters.first!)!
        } else {
            // if not return empty array
            return arr
        }
        var i = self.startIndex.distanceTo(startInd)
        while i<=self.characters.count-findStr.characters.count {
            if self[self.startIndex.advancedBy(i)..<self.startIndex.advancedBy(i+findStr.characters.count)] == findStr {
                arr.append(Range(start:self.startIndex.advancedBy(i),end:self.startIndex.advancedBy(i+findStr.characters.count)))
                i = i+findStr.characters.count
            } else {
                i++
            }
        }
        return arr
    }
}
// try further optimisation by jumping to next index of first search character after every find
"a very good hello, hello".rangesOfString("hello")
Update: Swift 3 implementation (rangesOfString:)
Note that in Swift 3 indexOf() becomes index(of:), advancedBy() becomes index(_:offsetBy:), distanceTo() becomes distance(from:to:), and i++ must be written as i += 1.

extension String {
    func rangesOfString(findStr:String) -> [Range<String.Index>] {
        var arr = [Range<String.Index>]()
        var startInd = self.startIndex
        // check first that the first character of search string exists
        if self.characters.contains(findStr.characters.first!) {
            // if so set this as the place to start searching
            startInd = self.characters.index(of:findStr.characters.first!)!
        } else {
            // if not return empty array
            return arr
        }
        var i = self.distance(from:startIndex, to: startInd)
        while i<=self.characters.count-findStr.characters.count {
            if self[self.index(startIndex,offsetBy:i)..<self.index(startIndex, offsetBy: i+findStr.characters.count)] == findStr {
                arr.append(self.index(startIndex,offsetBy:i)..<self.index(startIndex, offsetBy:i+findStr.characters.count))
                i = i+findStr.characters.count
            } else {
                i += 1
            }
        }
        return arr
    }
}
// try further optimisation by jumping to next index of first search character after every find
let a = "a very good hello, hello"
for sub in a.rangesOfString(findStr: "hello") {
    a.substring(with: sub)
}

Now let's suppose we wanted to identify ranges of strings so that we might add an attribute to an NSMutableAttributedString, and that the string in question contained emoji. Immediately we have an issue, because an NSString is UTF-16 and measures an emoji by the number of UTF-16 code units within it, whereas a Swift String counts an emoji as a single character.
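To make the mismatch concrete, here is a small sketch. It assumes Swift 5 or later (in Swift 3 you would write kiss.characters.count instead of kiss.count), and the constant names are only illustrative.

```swift
// Assumes Swift 5 or later, where String is itself a collection of Characters.
let kiss = "😘"                       // U+1F618, outside the Basic Multilingual Plane

let characterCount = kiss.count       // 1: a single user-perceived character
let utf16Count = kiss.utf16.count     // 2: a surrogate pair, the unit NSString measures in

// The same position has a different offset in each view once an emoji precedes it.
let sample = "hi😘!"
let bang = sample.firstIndex(of: "!")!
let characterOffset = sample.distance(from: sample.startIndex, to: bang)            // 3
let utf16Offset = sample.utf16.distance(from: sample.utf16.startIndex, to: bang)    // 4
```

Any NSRange built from Character offsets would therefore drift by one position for every such emoji that comes before it.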
The simplest solution is to cast to an NSString before searching for substrings, so that NSRanges are returned based on UTF-16 measurement. But if we wanted to work in Swift and return ranges that would be compatible when transformed into NSRange values, then we should do the following:
import UIKit

extension String {
    func rangesOfUTF16String(findStr:String) -> [Range<String.UTF16View.Index>] {
        var arr = [Range<String.UTF16View.Index>]()
        var startInd = self.utf16.startIndex
        // check first that the first character of search string exists
        if self.characters.contains(findStr.characters.first!) {
            // if so set this as the place to start searching
            startInd = self.utf16.index(of:findStr.utf16.first!)!
        } else {
            // if not return empty array
            return arr
        }
        var i = self.utf16.distance(from:self.utf16.startIndex, to: startInd)
        while i<=self.utf16.count-findStr.utf16.count {
            if String(self.utf16[self.utf16.index(self.utf16.startIndex,offsetBy:i)..<self.utf16.index(self.utf16.startIndex, offsetBy: i+findStr.utf16.count)]) == findStr {
                arr.append(self.utf16.index(self.utf16.startIndex,offsetBy:i)..<self.utf16.index(self.utf16.startIndex, offsetBy:i+findStr.utf16.count))
                i = i+findStr.utf16.count
            } else {
                i += 1
            }
        }
        return arr
    }
}
// try further optimisation by jumping to next index of first search character after every find
let a = "a very good hello,😘 hello"
var attrString = NSMutableAttributedString(string: a)
for sub in a.rangesOfUTF16String(findStr: "hello") {
    // measure location and length in UTF-16 code units, as NSRange expects
    let loc = attrString.string.utf16.distance(from: attrString.string.utf16.startIndex, to: sub.lowerBound)
    let len = attrString.string.utf16.distance(from: sub.lowerBound, to: sub.upperBound)
    let nsRange = NSRange(location: loc, length: len)
    attrString.addAttribute(NSForegroundColorAttributeName, value: UIColor.red, range: nsRange)
}
attrString

This leverages UTF16View to make sure our range values are correct for the purposes of transformation.
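On newer toolchains the same idea can be sketched more briefly. The following assumes Swift 5 or later and uses Foundation's NSRange(_:in:) initializer to do the UTF-16 conversion. The method name ranges(of:) is hypothetical and is not part of the code above; it searches Character by Character and returns non-overlapping matches, as rangesOfString does.

```swift
import Foundation

extension String {
    /// Hypothetical modern counterpart to rangesOfString (an assumption, not the post's code).
    func ranges(of target: String) -> [Range<String.Index>] {
        var found = [Range<String.Index>]()
        let targetLength = target.count
        // An empty search string has no meaningful matches.
        guard targetLength > 0 else { return found }

        var position = startIndex
        while position < endIndex {
            // Compare Character by Character from the current position.
            if self[position...].starts(with: target) {
                let end = index(position, offsetBy: targetLength)
                found.append(position..<end)
                position = end                      // skip past the match so matches never overlap
            } else {
                position = index(after: position)
            }
        }
        return found
    }
}

let text = "a very good hello,😘 hello"
// NSRange(_:in:) expresses each range in UTF-16 code units.
let nsRanges = text.ranges(of: "hello").map { NSRange($0, in: text) }
let locations = nsRanges.map { $0.location }        // [12, 21]
let lengths = nsRanges.map { $0.length }            // [5, 5]
```

The second location is 21 rather than 20 because the emoji before it occupies two UTF-16 code units. This sketch is still a naive scan, so it makes no claim to be faster than the versions above.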
Excellence! This gave me the clarity I needed to write some substring extensions. Great job.
rangesOfString() above assumes literal string equivalence. This ignores canonical equivalence: http://en.wikipedia.org/wiki/Unicode_equivalence
Is it possible for you to provide an example of an input where this falls down so that I might test and explore revisions?
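One kind of input that may be worth testing is a pair of strings that are canonically equivalent but encoded differently. The sketch below assumes Swift 5 or later and only demonstrates the mismatch itself; it does not run the extensions from the post, and the constant names are illustrative.

```swift
// Assumes Swift 5 or later.
let precomposed = "caf\u{E9}"       // é as the single scalar U+00E9
let decomposed = "cafe\u{301}"      // e followed by U+0301 COMBINING ACUTE ACCENT

// Swift compares Strings and Characters using canonical equivalence.
let equalAsStrings = precomposed == decomposed                            // true
let sameCharacterCount = precomposed.count == decomposed.count            // true, both are 4

// The underlying UTF-16 code units differ in both content and length.
let sameCodeUnits = Array(precomposed.utf16) == Array(decomposed.utf16)   // false
let utf16Lengths = (precomposed.utf16.count, decomposed.utf16.count)      // (4, 5)
```

As far as I can tell, a search that compares fixed-length windows of UTF-16 code units, as rangesOfUTF16String does, would miss a decomposed "é" when given the precomposed form, and the forced unwrap of index(of:) could even trap. The Character-based versions compare with ==, so they look less exposed to this.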
How does your rangeOfString extension compare in terms of performance to the native ObjC/Cocoa one?
A great question. I shall revisit and test, then publish results.
Your code has a bug on this line:
i++
It should be:
else {i++}
Your code fails on this case
"hellohello".rangesOfString("hello")
You are absolutely right. Thanks for pointing this out. I've updated the code accordingly.
rangesOfString is a lifesaver, thanks!
Thanks for your comment, I'm glad it's helped you.
First, thanks for making this post, always appreciated. Second, I noticed some Swift 2.0 inconsistencies. I'm running the latest Xcode beta (7.1). I find that myString.characters.prefix(idx) works while prefix(myString.characters, 5) seems to be deprecated. Greg
Thanks Greg, I've now updated all code to work with the release version of Xcode 7. Note: I haven't added any of the newly available String methods.
Thank you! Great help. I especially appreciate the updates for Swift 2.
Can you explain how I could use rangeOfCharacter(from:options:range:)?
To quote the docs, it "Finds and returns the range in the String of the first character from a given character set found in a given range with given options." For example, if you pass this method a .decimalDigits character set then it will return the range of the first decimal digit. If you had a string like "Table 12", the method would return the range of the "1".
Here's a coded example:
import Foundation
func removePrecedingZeroes(str:String) -> String? {
    var aSet = CharacterSet.decimalDigits
    aSet.remove("0")
    aSet.insert(".")
    // find the first character that is a non-zero digit or a decimal point
    if let index = str.rangeOfCharacter(from: aSet)?.lowerBound {
        return str.substring(from: index)
    }
    return nil
}
var aString = "000000000000123454"
removePrecedingZeroes(str: aString) // "123454"
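The "Table 12" case described above can be sketched in the same way. This assumes Swift 5 or later with Foundation imported, and the constant names are only illustrative.

```swift
import Foundation

let label = "Table 12"
// Range of the first character that belongs to the decimal digit set, if there is one.
let digitRange = label.rangeOfCharacter(from: .decimalDigits)

let firstDigit = digitRange.map { String(label[$0]) }                        // Optional("1")
let offset = digitRange.map {
    label.distance(from: label.startIndex, to: $0.lowerBound)                // Optional(6)
}
```

Only the "1" is covered by the returned range. The "2" that follows is not included, because the method stops at the first matching character.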
I'm getting a "No '..<' candidates produce the expected contextual result type 'Range'" in these lines of code -> arr.append(self.index(startIndex,offsetBy:i)..<self.index(startIndex, offsetBy:i+findStr.characters.count)) in Swift 3.
Thanks in advance!