Swift and EPUB: Font obfuscation (updated)


There are some basic ingredients required to perform EPUB font obfuscation:
  • first we need to be able to generate a SHA-1 digest from a unique identifier string,
  • second we need to transform the hexadecimal string (40 characters in length) into a UInt8 byte array of 20 bytes,
  • third we need to apply the IDPF algorithm, which is provided in pseudo code.

Generating a SHA-1 Digest

The first obstacle is the generation of a SHA-1 digest from a String, because while there is a framework called CommonCrypto, there are some hoops to jump through when using it with Swift. To avoid these hoops I turned to a JavaScript library called CryptoJS and ran it in a virtual machine thanks to the JavaScriptCore framework.

It sounds like more work but in fact there's not much more than a few lines of code to get this working:
import Foundation
import JavaScriptCore

public struct Crypto {
    // Returns the SHA-1 digest of str as a 40-character hexadecimal string,
    // or nil if sha1.js cannot be loaded from the bundle
    public static func sha1(str:String) -> String? {
        if let url = NSBundle.mainBundle().URLForResource("sha1", withExtension: "js"),
            js = String(contentsOfURL: url, encoding: NSUTF8StringEncoding, error: nil) {

            // First a context and JS virtual machine are created
            let context = JSContext(virtualMachine: JSVirtualMachine())

            // Next we send the context the script, which defines CryptoJS
            context.evaluateScript(js)

            // Generate the digest as a JSValue. Note that str is interpolated into
            // the script, so it must not contain a single quote or a backslash
            let digest:JSValue = context.evaluateScript("CryptoJS.SHA1('\(str)')")
            return digest.toString()
        }
        else {
            return nil
        }
    }
}
And the only thing to do in order to get this up and running in a Playground, aside from adding the code, is to copy the CryptoJS sha1.js file into the Resources folder.
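For comparison, here is a rough sketch of the same digest computed in plain Swift, with no JavaScript dependency. It is not the approach used in this post. It is written in current Swift syntax rather than the syntax of the listings here, and the names sha1Digest and sha1Hex are hypothetical. Treat it as an illustration of the algorithm; a vetted library is the safer choice in real projects.

```swift
/// Hypothetical helper: SHA-1 digest of a byte array, returned as 20 bytes.
func sha1Digest(_ message: [UInt8]) -> [UInt8] {
    // Rotate a 32-bit word left by n bits.
    func rotl(_ x: UInt32, _ n: UInt32) -> UInt32 {
        return (x << n) | (x >> (32 - n))
    }

    var h: [UInt32] = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0]

    // Padding: one 0x80 byte, zeros up to 56 mod 64, then the bit length (64-bit, big-endian).
    var padded = message
    let bitLength = UInt64(message.count) * 8
    padded.append(0x80)
    while padded.count % 64 != 56 { padded.append(0) }
    for shift in stride(from: 56, through: 0, by: -8) {
        padded.append(UInt8(truncatingIfNeeded: bitLength >> UInt64(shift)))
    }

    // Process the padded message in 64-byte blocks.
    for start in stride(from: 0, to: padded.count, by: 64) {
        var w = [UInt32](repeating: 0, count: 80)
        for i in 0..<16 {
            for j in 0..<4 {
                w[i] = (w[i] << 8) | UInt32(padded[start + i * 4 + j])
            }
        }
        for i in 16..<80 {
            w[i] = rotl(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1)
        }

        var a = h[0], b = h[1], c = h[2], d = h[3], e = h[4]
        for i in 0..<80 {
            let f: UInt32
            let k: UInt32
            switch i {
            case 0..<20:
                f = (b & c) | (~b & d)
                k = 0x5A827999
            case 20..<40:
                f = b ^ c ^ d
                k = 0x6ED9EBA1
            case 40..<60:
                f = (b & c) | (b & d) | (c & d)
                k = 0x8F1BBCDC
            default:
                f = b ^ c ^ d
                k = 0xCA62C1D6
            }
            let temp = rotl(a, 5) &+ f &+ e &+ k &+ w[i]
            e = d
            d = c
            c = rotl(b, 30)
            b = a
            a = temp
        }
        h[0] = h[0] &+ a
        h[1] = h[1] &+ b
        h[2] = h[2] &+ c
        h[3] = h[3] &+ d
        h[4] = h[4] &+ e
    }

    // Emit the five state words as big-endian bytes.
    var digest = [UInt8]()
    for word in h {
        for shift in stride(from: 24, through: 0, by: -8) {
            digest.append(UInt8(truncatingIfNeeded: word >> UInt32(shift)))
        }
    }
    return digest
}

/// Hypothetical helper: SHA-1 of a string's UTF-8 bytes as lowercase hexadecimal.
func sha1Hex(_ string: String) -> String {
    let hexDigits = Array("0123456789abcdef")
    var hex = ""
    for byte in sha1Digest(Array(string.utf8)) {
        hex.append(hexDigits[Int(byte >> 4)])
        hex.append(hexDigits[Int(byte & 0x0F)])
    }
    return hex
}
```

Since sha1Digest already returns the 20 raw bytes, a version built this way would not need the hexadecimal round trip at all; sha1Hex is only there to compare results against CryptoJS.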

String to Byte Array: Updated

In order to convert the String to a byte array I originally borrowed some code from StackOverflow and combined it with stepping through the string two characters at a time. But it wasn't the most efficient approach, because I was converting to NSData and then back to bytes. So I stripped out that code and replaced it with this:
 
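func sha1data(str:String) -> [UInt8] {
    var keydata = ""
    var dataArray = [UInt8]()
    if let crypto = Crypto.sha1(str) {
        keydata = crypto
        // Consume the hexadecimal digest two characters at a time
        for _ in stride(from: 0, to: count(keydata), by: 2) {
            // Build a string such as "0x3f" from the next two characters
            var hex = "0x\(first(keydata)!)"
            keydata.removeAtIndex(keydata.startIndex)
            hex.append(first(keydata)!)
            keydata.removeAtIndex(keydata.startIndex)
            // strtod understands the 0x prefix and returns the value as a Double
            dataArray.append(UInt8(strtod(hex,nil)))
        }
    }
    // Empty if no digest could be generated
    return dataArray
}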
which was made possible by the observation, on StackOverflow once again, that strtod can convert a hexadecimal string to a Double. I'm not sure this is the cleanest code I can arrive at, but it strips out a good number of lines that I previously had, along with some unnecessary computations.
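For readers on a current Swift toolchain, the same conversion can be sketched without strtod by reading each character's hexadecimal value directly. This is a minimal sketch in current Swift syntax rather than the syntax of the listing above; the function name bytes(fromHex:) is hypothetical.

```swift
/// Hypothetical helper: converts a hexadecimal string, such as a 40-character
/// SHA-1 digest, into bytes. Returns nil if the length is odd or if any
/// character is not a hexadecimal digit.
func bytes(fromHex hex: String) -> [UInt8]? {
    let chars = Array(hex)
    guard chars.count % 2 == 0 else { return nil }

    var result = [UInt8]()
    result.reserveCapacity(chars.count / 2)
    for i in stride(from: 0, to: chars.count, by: 2) {
        // Each pair of characters is one byte: high nibble, then low nibble.
        guard let high = chars[i].hexDigitValue,
              let low = chars[i + 1].hexDigitValue else { return nil }
        result.append(UInt8(high * 16 + low))
    }
    return result
}

let sample = bytes(fromHex: "0aff")   // [10, 255]
```

Returning an optional makes a malformed digest visible to the caller instead of crashing on a force-unwrap.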

Now with the initial hurdle passed we can approach the IDPF's font obfuscation algorithm.

Pseudo code converted to Swift

The pseudo code linked to above is straightforward enough, although as with all pseudo code I've encountered there was a certain sense that it made assumptions that weren't always clear. But I confess here that I've not worked at a byte level before and have only the knowledge that I've gathered from writing the Bytes for Beginners series of posts to assist me, so no doubt a good deal of the fault here was mine.
public func obfuscateFontIDPF(data:NSData, key:String) -> NSData {
    // The 20-byte SHA-1 digest of the key string
    let keyData = sha1data(key)
    // Copy the font data into a byte array
    var arr = [UInt8](count: data.length, repeatedValue: 0)
    data.getBytes(&arr, length: data.length)
    var destination = [UInt8]()
    // XOR the first 1040 bytes (52 passes over the 20 key bytes) with the key
    var outer = 0
    while outer < 52 && arr.isEmpty == false {
        var inner = 0
        while inner < 20 && arr.isEmpty == false {
            let sourceByte = arr.removeAtIndex(0)      // Assumes read advances file position
            let keyByte = keyData[inner]
            destination.append(sourceByte ^ keyByte)
            inner++
        }
        outer++
    }
    // Any bytes beyond the first 1040 are copied across unchanged
    destination.extend(arr)
    return NSData(bytes: &destination, length: count(destination)*sizeof(UInt8))
}
But despite these limitations, which in particular made me uncertain whether using UInt8 arrays throughout was the right thing to do, I successfully wrote this code and managed to de-obfuscate fonts in an EPUB created by InDesign. (Because XOR is its own inverse, the same function both obfuscates and de-obfuscates.) This I take as a sign of success because there is no accidental way to restore an obfuscated font. If it returns from its mangled state into something that clearly works then the work is decisively done.
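The round trip can be illustrated with a small sketch on plain byte arrays. It is written in current Swift syntax rather than the syntax of the listing above. The function name xorObfuscate and the sample key and data are made up for the example; a real key would be the 20-byte SHA-1 digest.

```swift
/// Hypothetical restatement of the obfuscation step on plain byte arrays:
/// XORs the first 1040 bytes with the key, cycling through the key bytes,
/// and leaves the rest of the data untouched.
func xorObfuscate(_ font: [UInt8], key: [UInt8]) -> [UInt8] {
    precondition(!key.isEmpty, "key must not be empty")
    var result = font
    let limit = min(font.count, 1040)   // 52 passes over a 20-byte key
    for i in 0..<limit {
        result[i] ^= key[i % key.count]
    }
    return result
}

// Made-up sample data, for illustration only.
let key = (1...20).map { UInt8($0) }
let font = [UInt8](repeating: 0xAB, count: 2000)

let obfuscated = xorObfuscate(font, key: key)
let restored = xorObfuscate(obfuscated, key: key)   // applying it twice restores the input
```

Indexing into the array, rather than removing the first element on every pass, avoids shifting the remaining bytes each time.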

Troublesome troubleshooting

An EPUB uses its dc:identifier (a unique identifier) to create the key for obfuscation, but things are a bit foggy as to whether the prefix that InDesign uses, i.e. urn:uuid:, is part of the key. This is because Adobe used to have their own non-IDPF approach. But I can confirm that the prefix is part of the key.
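One further detail, as I read the IDPF specification: whitespace (U+0020, U+0009, U+000D and U+000A) is removed from the identifier before the digest is taken, while the prefix stays in place. A minimal sketch of that preparation step in current Swift syntax follows; the function name obfuscationKeyString is hypothetical and the identifier is a made-up example.

```swift
/// Hypothetical helper: prepares a dc:identifier value for hashing by removing
/// the four XML whitespace characters. The urn:uuid: prefix is kept.
func obfuscationKeyString(from identifier: String) -> String {
    let whitespace: Set<Unicode.Scalar> = [" ", "\t", "\r", "\n"]
    // Work on Unicode scalars so that a "\r\n" pair is removed as well.
    var scalars = String.UnicodeScalarView()
    for scalar in identifier.unicodeScalars where !whitespace.contains(scalar) {
        scalars.append(scalar)
    }
    return String(scalars)
}

// Made-up identifier, for illustration only.
let prepared = obfuscationKeyString(from: " urn:uuid: 1234-abcd\r\n")
// prepared is "urn:uuid:1234-abcd"
```

The resulting string is what would be passed on to the SHA-1 step.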

Pipes, FileHandles, Tasks and Streams

In this project I've simply loaded NSData into a byte array, ignoring the length of the file, but I am aware that the classes NSStream, NSPipe, NSFileHandle and NSTask exist. (At least NSTask exists on OS X.) I am also aware that it is perhaps not the best thing to load all of the data into memory at the same time.

These classes are therefore something that I will consider looking into, to work out how to read from one file and write the altered data to another in chunks rather than loading it all into memory at once.
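As a rough idea of what that might look like, here is a sketch using FileHandle (the current name for NSFileHandle) in current Swift syntax. It is an untested direction rather than part of this project: the function name obfuscateFile, its parameters and the default chunk size are all assumptions made for the example.

```swift
import Foundation

/// Hypothetical helper: copies the file at `source` to `destination` in chunks,
/// XORing only the first 1040 bytes with `key` (cycling through the key bytes).
func obfuscateFile(from source: URL, to destination: URL,
                   key: [UInt8], chunkSize: Int = 4096) throws {
    precondition(!key.isEmpty && chunkSize > 0)

    // FileHandle(forWritingTo:) needs an existing file, so create an empty one first.
    guard FileManager.default.createFile(atPath: destination.path, contents: nil) else {
        throw CocoaError(.fileWriteUnknown)
    }
    let input = try FileHandle(forReadingFrom: source)
    defer { input.closeFile() }
    let output = try FileHandle(forWritingTo: destination)
    defer { output.closeFile() }

    var offset = 0   // absolute position of the current chunk within the file
    while true {
        var chunk = [UInt8](input.readData(ofLength: chunkSize))
        if chunk.isEmpty { break }   // end of file

        // Only bytes whose absolute offset is below 1040 are altered.
        let limit = min(chunk.count, max(0, 1040 - offset))
        for i in 0..<limit {
            chunk[i] ^= key[(offset + i) % key.count]
        }
        output.write(Data(chunk))
        offset += chunk.count
    }
}
```

Tracking the absolute offset keeps the key aligned even when a chunk boundary falls inside the first 1040 bytes, so memory use stays at one chunk regardless of the size of the font.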

Conclusion

If you'd like to play with this code, you'll find it as part of the Swiftography repo.

