Protocol checks
Example:
// 👇🏻 Protocol definition
protocol CustomLoggable {
    var customLogString: String { get }
}
func log(value: Any) {
    if let value = value as? CustomLoggable { // 👈🏻 protocol check
        ...
    }
}
Whenever possible, this check is optimized away at build time, in the compiler
However, the compiler doesn’t always have enough information, so this check often needs to happen at runtime, with the help of protocol check metadata
With this metadata, the runtime knows whether this particular object really does conform to the protocol
Part of the metadata is built at compile time, but a lot can only be built at launch time, particularly when using Swift generics
in apps that rely heavily on Swift, this can add up to half the launch time
New this year:
protocol checks are pre-computed as part of the dyld closure for the app executable and any dylib it uses at launch (i.e. this is done at app installation time)
enabled even for existing apps when running on iOS 16, tvOS 16, or watchOS 9.
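A minimal sketch of the kind of cast that triggers a runtime protocol check. The `User` type and the `describe(_:)` function are hypothetical names added for illustration; only `CustomLoggable` comes from the example above. Because the parameter is typed `Any`, the conformance generally cannot be decided at build time, so the `as?` cast has to be resolved at runtime.

```swift
// Protocol from the example above.
protocol CustomLoggable {
    var customLogString: String { get }
}

// Hypothetical conforming type, used only for illustration.
struct User: CustomLoggable {
    let name: String
    var customLogString: String { "User(\(name))" }
}

// Hypothetical helper: returns the custom log string when the value
// conforms to CustomLoggable, otherwise a default description.
func describe(_ value: Any) -> String {
    if let loggable = value as? CustomLoggable { // protocol check
        return loggable.customLogString
    }
    return String(describing: value)
}
```

Calling `describe(User(name: "Ada"))` takes the conforming branch, while `describe(42)` falls through to the default description, since `Int` does not conform to `CustomLoggable`.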
(ObjC) Message send
with the new compilers and linker in Xcode 14, message send calls are up to 8 bytes smaller, down from 12 bytes, on ARM64
binaries up to 2% smaller overall
enabled by Xcode 14, even when targeting an older OS release as deployment target
Defaults to balanced performance and size optimization
Opt into optimizing for size only using the -Wl,-objc_stubs_small linker flag
How it’s done: Selector stubs
Prior:
whenever we interact with ObjC objects, we almost always end up needing an instruction to call objc_msgSend, even when doing property accesses
this is because at compile time, we don’t know which method to call, therefore we ask the ObjC runtime to find the right method using objc_msgSend
NSCalendar *cal = [self makeCalendar]; // 👈🏻 compiler emits bl _objc_msgSend
NSDateComponents* dateComponents = [[NSDateComponents alloc] init]; // 2x 👈🏻 compiler emits bl _objc_msgSend
dateComponents.year = 2022; // 👈🏻 compiler emits bl _objc_msgSend
dateComponents.month = 6; // 👈🏻 compiler emits bl _objc_msgSend
dateComponents.day = 6; // 👈🏻 compiler emits bl _objc_msgSend
NSDate *theDate = [cal dateFromComponents:dateComponents]; // 👈🏻 compiler emits bl _objc_msgSend
return theDate;
to tell the runtime which method to call, we have to pass a selector to these objc_msgSend calls. This needs two more instructions to prepare the selector, and each of these instructions takes 4 bytes on ARM64.
Example:
// ObjC
NSDate *theDate = [cal dateFromComponents:dateComponents];
// 👇🏻 compiler emits
adrp x1, [selector "dateFromComponents"] // 👈🏻 4 bytes on ARM64
ldr x1, [x1, selector "dateFromComponents"] // 👈🏻 4 bytes on ARM64
bl _objc_msgSend
in conclusion, for each objc_msgSend call, we’re using 12 bytes (4B for bl _objc_msgSend, 8B for the selector)
What’s new:
for any given selector, it’s always the same code, example from above:
adrp x1, [selector "dateFromComponents"] // 👈🏻 always the same instruction for dateFromComponents
ldr x1, [x1, selector "dateFromComponents"] // 👈🏻 always the same instruction for dateFromComponents
bl _objc_msgSend
we can share this same code and only emit it once per selector instead of at every message send
we do this via a helper function, and call that function instead
this helper function is called a Selector stub
Therefore we go from this:
// ObjC
NSDate *theDate = [cal dateFromComponents:dateComponents];
// 👇🏻 compiler emits
adrp x1, [selector "dateFromComponents"]
ldr x1, [x1, selector "dateFromComponents"]
bl _objc_msgSend
to this:
// ObjC
NSDate *theDate = [cal dateFromComponents:dateComponents];
// 👇🏻 compiler emits
bl _objc_msgSend$dateFromComponents
// Where the _objc_msgSend$dateFromComponents Selector stub is defined ONCE per program
// Selector stub:
_objc_msgSend$dateFromComponents:
adrp x1, [selector "dateFromComponents"]
ldr x1, [x1, selector "dateFromComponents"]
b _objc_msgSend
However, the branch to _objc_msgSend still has to go through a Call stub to reach the actual message send code, so our machine code makes two jumps (one to the Selector stub, one to the Call stub)
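The size effect of sharing the selector setup can be estimated with some back-of-the-envelope arithmetic. The sketch below is a rough model, not a measurement: it counts only the instructions shown in the listings above (3 per inlined call site, 1 per call site plus a 3-instruction Selector stub per unique selector), and it assumes no padding or alignment, which a real linker may add. The function names and the sample numbers are hypothetical.

```swift
/// Size of one ARM64 instruction in bytes, as stated in the notes above.
let instructionSize = 4

/// Bytes spent when every call site inlines the selector setup:
/// adrp + ldr + bl = 3 instructions per call site.
func inlineSelectorBytes(callSites: Int) -> Int {
    return callSites * 3 * instructionSize
}

/// Bytes spent with Selector stubs: one bl per call site, plus one shared
/// stub per unique selector (adrp + ldr + b = 3 instructions).
/// Assumption: stubs carry no padding; real stubs may be larger.
func selectorStubBytes(callSites: Int, uniqueSelectors: Int) -> Int {
    let callSiteBytes = callSites * 1 * instructionSize
    let stubBytes = uniqueSelectors * 3 * instructionSize
    return callSiteBytes + stubBytes
}

// Hypothetical program: 1000 message sends using 100 distinct selectors.
let before = inlineSelectorBytes(callSites: 1000)
let after = selectorStubBytes(callSites: 1000, uniqueSelectors: 100)
print("before: \(before) bytes, after: \(after) bytes, saved: \(before - after) bytes")
```

Under this model a selector that is sent only once costs slightly more than before (16 bytes instead of 12), so the savings come from selectors that are sent from two or more call sites.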
Depending on the optimization we want, we can merge the Selector and Call stubs (making only one jump) or keep them separate. This is the difference between the default behavior (balanced performance and size optimization) and the size-optimized behavior (-Wl,-objc_stubs_small linker flag):
// Separate selector and symbol stubs
// Optimize for Size
// Enable using -Wl,-objc_stubs_small
bl _objc_msgSend$dateFromComponents // 👈🏻 jump 1
// Selector stub
_objc_msgSend$dateFromComponents:
adrp x1, [selector "dateFromComponents"]
ldr x1, [x1, selector "dateFromComponents"]
b _objc_msgSend // 👈🏻 jump 2
// Call stub
_objc_msgSend:
adrp ...
ldr ...
br ...
vs.
// Combined selector and symbol stubs
// Balanced Size/Performance
// Enabled by default
bl _objc_msgSend$dateFromComponents // 👈🏻 jump 1
// Where the _objc_msgSend$dateFromComponents Selector stub is defined ONCE per program
// Selector stub:
_objc_msgSend$dateFromComponents:
adrp x1, [selector "dateFromComponents"]
ldr x1, [x1, selector "dateFromComponents"]
adrp ...
ldr ...
br ...
Retain and release
retain/release calls are now up to 4 bytes smaller, down from 8 on ARM64
binaries up to 2% smaller overall
enabled by new compilers in Xcode 14
requires new runtime support, hence your deployment target must be iOS 16, tvOS 16, or watchOS 9
How it’s done
thanks to ARC (automatic reference counting), our code compiles into a lot of retain/release calls
whenever we make a copy of a pointer to an object, we need to increment its retain count to keep it live
we do that by calling into the runtime, using objc_retain
when our variables go out of scope, we then need to decrement the retain count using objc_release
NSCalendar *cal = [self makeCalendar]; // 👈🏻 bl _objc_retain
NSDateComponents* dateComponents = [[NSDateComponents alloc] init]; // 👈🏻 bl _objc_retain
dateComponents.year = 2022;
dateComponents.month = 6;
dateComponents.day = 6;
NSDate *theDate = [cal dateFromComponents:dateComponents]; // 👈🏻 bl _objc_retain
return theDate;
// 👈🏻 bl _objc_release
// 👈🏻 bl _objc_release
// 👈🏻 bl _objc_release
in actuality, the compiler will do some optimizations that will avoid some of these calls
the objc_retain/objc_release functions are just plain C functions
they take a single argument, the object to be retained or released
with ARC, the compiler inserts calls to these C functions, passing the appropriate object pointers
Because of that, these calls have to respect the C calling convention, defined by our platform ABI (Application Binary Interface)
this results in extra move instructions just for passing the pointer in the right register
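The retain/release bookkeeping described above can be observed from Swift, which applies the same ARC model (native Swift classes go through Swift's own runtime entry points rather than objc_retain/objc_release, so this is an analogy, not the exact calls from the listing). The sketch below is illustrative only: `LiveCounter`, `Resource`, `Owner`, and `demoRetainRelease` are hypothetical names, and references are held in stored properties so that each release happens at a well-defined point.

```swift
/// Hypothetical counter tracking how many Resource instances are alive.
final class LiveCounter {
    var count = 0
}

/// Hypothetical object whose lifetime we want to observe.
final class Resource {
    private let counter: LiveCounter
    init(counter: LiveCounter) {
        self.counter = counter
        counter.count += 1
    }
    deinit { counter.count -= 1 } // runs when the last strong reference goes away
}

/// Hypothetical owner holding one strong reference.
final class Owner {
    var resource: Resource?
    init(_ resource: Resource?) { self.resource = resource }
}

/// Records the number of live Resource instances after each step.
func demoRetainRelease() -> [Int] {
    let counter = LiveCounter()
    var observed: [Int] = []

    let first = Owner(Resource(counter: counter)) // one strong reference
    let second = Owner(first.resource)            // copying the pointer retains the object
    observed.append(counter.count)

    first.resource = nil                          // release: still held by `second`
    observed.append(counter.count)

    second.resource = nil                         // release: count reaches zero, deinit runs
    observed.append(counter.count)
    return observed
}
```

The object stays alive as long as at least one owner still references it, and is deallocated as soon as the last reference is dropped.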
Optimization:
By specializing retain/release with a custom calling convention, the system can opportunistically use the right variant depending on where the object pointer already is, meaning we don’t need the extra move instructions
skipping these move instructions saves the 4 bytes
Quicker Autorelease elision
enabled by ObjC runtime changes (automatically happens when running iOS 16, tvOS 16, watchOS 9, or macOS 13)
with additional compiler changes, the app binary is also smaller (requires a deployment target of iOS 16 or tvOS 16)
What is Autorelease elision?
When we call a method that returns a value, we call retain on the returned object
in the method’s body, we don’t call release on the returned object, as that would release the object from memory
instead we call autorelease, so the method caller can retain it
// 👇🏻 retain call
myValue = [[someInstance aMethodReturningAValue] retain];
-(MyType *)aMethodReturningAValue {
    ...
    return [newValue autorelease]; // 👈🏻 we don't release, we autorelease, so the caller (above)
                                   // can retain it before the instance gets released.
}
autorelease tells the runtime that we’re returning an object that will immediately be retained
thanks to some improvements in both the compiler and the ObjC runtime, the overhead of autorelease is now lower, and with clever use of pointers it is faster as well
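To make the cost that elision avoids more concrete, here is a toy model of the two return paths. It is a deliberate simplification and not how the ObjC runtime is implemented: `ToyObject`, `ToyPool`, and both functions are hypothetical, and the "operations" counter only tallies the bookkeeping steps named in the notes (autorelease, retain, and the later release by the pool).

```swift
/// Toy stand-in for an object with a manual retain count (not a runtime type).
final class ToyObject {
    var retainCount = 1 // +1 from creation
}

/// Toy autorelease pool: remembers objects that still owe one release.
final class ToyPool {
    private var pending: [ToyObject] = []

    func autorelease(_ object: ToyObject) -> ToyObject {
        pending.append(object)
        return object
    }

    func drain() {
        for object in pending { object.retainCount -= 1 }
        pending.removeAll()
    }
}

/// Return path without elision: autorelease in the callee, retain in the
/// caller, and a release when the pool drains later.
func returnWithoutElision() -> (retainCount: Int, operations: Int) {
    let pool = ToyPool()
    var operations = 0

    let returned = pool.autorelease(ToyObject()) // callee: autorelease
    operations += 1
    returned.retainCount += 1                    // caller: retain
    operations += 1
    pool.drain()                                 // later: the pool releases
    operations += 1

    return (returned.retainCount, operations)
}

/// Return path with elision: ownership is handed straight to the caller,
/// so the autorelease/retain pair is skipped in this model.
func returnWithElision() -> (retainCount: Int, operations: Int) {
    let object = ToyObject()
    return (object.retainCount, 0)
}
```

Both paths leave the caller owning the object with the same retain count; the elided path simply gets there without the intermediate bookkeeping.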
