What does the demise of bitcode mean for the future of application security?
Andrew Whaley, senior technical director at Promon, explains the problem with Apple’s Xcode update.
TechRepublic published this article on July 28. The original article can be found here.
For app developers, Low-Level Virtual Machine bitcode has been a staple of Apple’s toolchain and the Android Native Development Kit for the past seven years. With the release of the Xcode 14 beta, which is set to become the standard for iOS and macOS development later this year, Apple has deprecated the option to build bitcode apps.
For the application security industry, which has largely designed and integrated its approach to code obfuscation around bitcode, this has far-reaching ramifications. Unless security vendors adapt, in the not-too-distant future many apps may face a gaping hole in their security.
What is code obfuscation?
Code obfuscation is a powerful technique for protecting code and an essential part of application security products. The idea behind obfuscation is to modify an executable file so that it is no longer transparent to a hacker but still remains fully functional.
When done effectively, obfuscation makes reverse-engineering a program extremely difficult and is therefore used to protect sensitive intellectual property. For instance, obfuscation can be used to hide an algorithm that a company doesn’t want competitors to understand. Most notably, it is used to protect security code.
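To make the idea concrete, here is a deliberately simple sketch of one obfuscation trick: storing a string in encoded form so that it does not appear in plain text, and decoding it only at run time. This is a toy illustration, not how any particular product works. Commercial obfuscators transform compiled code rather than Python strings, and the key, URL and function names below are hypothetical.

```python
# Toy illustration only: real obfuscators operate on compiled code.
KEY = 0x5A  # hypothetical single-byte key, chosen arbitrarily


def encode(secret: str) -> bytes:
    """XOR every byte so the plain text no longer appears in the stored data."""
    return bytes(b ^ KEY for b in secret.encode("utf-8"))


def decode(blob: bytes) -> str:
    """Reverse the XOR at run time, restoring the original string."""
    return bytes(b ^ KEY for b in blob).decode("utf-8")


# What ships in the program is the encoded form, not the readable URL.
# (In a real build the encoding would happen ahead of time.)
API_ENDPOINT = encode("https://example.invalid/license-check")


def license_url() -> str:
    # The program still behaves exactly as before: callers get the real URL.
    return decode(API_ENDPOINT)
```

The stored bytes no longer contain the readable URL, yet `license_url()` returns it unchanged. A single-byte XOR is trivial to undo, which is why real products layer many stronger transformations.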
In the field of app shielding, we use a number of tools to enforce a safe environment for apps to operate within. This includes things like hook detection, anti-debug and anti-tampering, all of which are ironically vulnerable to tampering or removal unless well hidden. Obfuscation is therefore used to protect these tools.
Obfuscation can be applied at three different levels: the source level, the native binary level and, by far the most dominant approach, the intermediate level. Between the source language and the native code, many compilers have an intermediate layer where optimizations are done.
Low-Level Virtual Machine is the best-known example of this. LLVM is a set of compiler and toolchain technologies that can be used to develop a front-end for any programming language and a back-end for any instruction set architecture. LLVM is useful because it allows compilers such as Clang or rustc to target different back-ends such as Linux on x86_64, armv7, iOS and Windows. If an obfuscator can operate at this level, it’s the easiest to build and maintain because it’s not tied to either the front-end compiler language or the back-end machine instruction set.
However, there is one downside: It is often tied to the toolchain. For apps on iOS and macOS, those obfuscating at the intermediate level are subject to any changes or major overhauls to Apple’s integrated development environment, such as those in Xcode 14.
What is bitcode?
Bitcode is a serialized version of LLVM’s Intermediate Representation.
A large reason for LLVM’s popularity in app development, and therefore bitcode’s, is that it’s open source and available to everybody. This has led to many vendors creating obfuscators that operate on bitcode. The advantage for them is that they too can target many back-ends and typically several front-ends. The fact that the LLVM libraries provide all the APIs necessary for manipulating the bitcode has further contributed to its dominance.
Apple has previously made use of bitcode within its toolchain because it had several CPU architectures to support, such as Intel, arm32 and arm64. Apple has even mandated in some cases that apps be submitted in bitcode format rather than machine code. This has allowed Apple to do the final stage of lowering to machine code for the particular device the app is installed on.
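As a small illustration of bitcode being a concrete file format, the sketch below checks whether a blob of bytes begins with one of the magic numbers described in the LLVM bitcode documentation: the raw stream marker `'BC' 0xC0DE`, or the wrapper header `0x0B17C0DE` stored in little-endian order. The function name is hypothetical, and the check only inspects the first four bytes. It does not validate the rest of the file.

```python
# Magic numbers from the LLVM bitcode file format.
RAW_BITCODE_MAGIC = b"BC\xc0\xde"       # 'B', 'C', 0xC0, 0xDE
WRAPPER_MAGIC = b"\xde\xc0\x17\x0b"     # 0x0B17C0DE, little-endian on disk


def looks_like_bitcode(data: bytes) -> bool:
    """Return True if the data starts with a known bitcode magic number."""
    return data[:4] in (RAW_BITCODE_MAGIC, WRAPPER_MAGIC)
```

A tool that operates on bitcode could use a check like this as a first sanity test before handing the file to the LLVM libraries for parsing.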
How is bitcode affected by future Xcode releases?
Apple has now reached a point where all of its new hardware uses arm64 and no longer requires the flexible back-ends provided by LLVM. Notably, at the WWDC 2022 keynote, there was mention of being able to better optimize purely for that architecture, which hints that the LLVM intermediate layer may no longer be used for that purpose in the future.
This has led to a major overhaul in the form of the Xcode 14 beta, where Apple has deprecated the option to build bitcode apps. Developers for iOS and macOS can still build with bitcode, with a warning, but this option will be removed later. Essentially, it is no longer as easy to produce bitcode apps.
Why does this matter, and who’s impacted?
Future Xcode releases may now prevent security vendors from using bitcode. Obfuscation vendors typically take two approaches to bitcode obfuscation that will be impacted differently.
The first approach is app obfuscation, where the obfuscator acts on the whole app in bitcode format, post-build, as an IPA or xcarchive file. This is a great approach because it means that the obfuscator doesn’t need to be tightly integrated into the toolchain and obfuscations can work on the whole app rather than on one module at a time.
The second is a toolchain-integrated approach where the obfuscator replaces or patches components in the Apple toolchain to ensure that it gets called during the build process. This can cause maintenance problems, but the integration is typically lightweight, achieved by creating wrappers around the existing Clang compiler.
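A wrapper of this kind can be sketched in a few lines. The example below builds the command a wrapper might run in place of the original compiler call, adding an obfuscation pass plugin to compile steps and leaving other invocations untouched. It is a sketch under stated assumptions, not any vendor’s implementation: the compiler path and plugin path are hypothetical, and it assumes a Clang build that accepts the upstream `-fpass-plugin=` option, which may not hold for the compiler shipped with Xcode.

```python
from typing import List

REAL_CLANG = "/usr/bin/clang"                 # assumed location of the real compiler
OBF_PLUGIN = "/opt/obf/libObfPass.dylib"      # hypothetical obfuscation pass plugin


def wrap_command(args: List[str],
                 real_clang: str = REAL_CLANG,
                 plugin: str = OBF_PLUGIN) -> List[str]:
    """Return the command line the wrapper would run instead of plain clang."""
    cmd = [real_clang]
    # Only compile steps (-c) generate IR that a pass can transform;
    # link-only invocations are forwarded unchanged.
    if "-c" in args:
        cmd.append(f"-fpass-plugin={plugin}")
    cmd.extend(args)
    return cmd
```

A real wrapper script would be installed where the build system expects the compiler and would finish by replacing itself with the real one, for example via `os.execv(cmd[0], cmd)`. Because it depends on the toolchain’s internal behavior, any change to the compiler’s options or plugin support can break it.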
The first approach is now effectively deprecated. Products that rely on it are likely to continue working (with warnings) for at least another year. However, this method will probably be blocked in Xcode 15 or 16.
The second approach could also be on shaky ground going forward, since we don’t know whether Apple will remove LLVM or prevent access to it in the compiler at some point, potentially even without warning. All vendors that currently use an LLVM-based obfuscator for iOS and macOS app protection will be impacted by this change.
What does this mean for the future of application security?
Ultimately, LLVM will become less useful and possibly disappear altogether as Apple seeks to leverage its unified architecture for CPU, GPU and ML accelerators. Xcode 14 already contains toolchain components competing with LLVM for this. If LLVM disappears, then going forward, Apple’s platforms could become much harder to protect and therefore fewer vendors will have products available to do that.
It’s entirely possible that this shake-up will compromise the security of many of the apps on the App Store. Whether this happens or not will depend on the adaptability of security vendors. Those using a toolchain-integrated approach will be fine for the time being, but they run the risk that this approach could be closed off without warning in the future.
What is likely is that we will see an increase in the native binary based approach to obfuscation. The key difference is that this approach directly manipulates the built machine code. There aren’t many obfuscators that currently use this method, as it’s particularly difficult to do and may need to support lots of binary formats and CPU instruction sets.
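To give a sense of the format problem, the sketch below shows only the very first step a binary-level tool faces: working out which container format it has been handed by looking at the magic number at the start of the file. The table and function name are illustrative, and the list is far from complete. Every format identified would then need its own parser, and every CPU architecture its own instruction decoder and rewriter.

```python
from typing import Optional

# Leading bytes of a few common executable formats (not an exhaustive list).
MAGIC_TABLE = [
    (b"\xcf\xfa\xed\xfe", "Mach-O 64-bit (little-endian)"),
    (b"\xce\xfa\xed\xfe", "Mach-O 32-bit (little-endian)"),
    # Note: Java class files share this magic, so a real tool must look further.
    (b"\xca\xfe\xba\xbe", "Mach-O universal (fat) binary"),
    (b"\x7fELF", "ELF"),
    (b"MZ", "PE (Windows)"),
]


def identify_format(header: bytes) -> Optional[str]:
    """Return a format name based on the leading magic bytes, or None."""
    for magic, name in MAGIC_TABLE:
        if header.startswith(magic):
            return name
    return None
```

For example, passing the first few bytes of an arm64 iOS executable would be expected to match the 64-bit Mach-O entry, while an Android native library would match ELF.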
In any case, while the future of code obfuscation may be uncertain, one thing is for sure — app developers will need to take a proactive approach, watching security vendors and planning accordingly if they want to ensure their apps remain secure.
Andrew Whaley is the Senior Technical Director at Promon, a Norwegian app security company. With his vast experience in penetration testing, application hardening, code obfuscation, cryptography and blockchain, Andrew leads Promon’s R&D team in enhancing the company’s core product suite with new security capabilities.