Feature or enhancement
Proposal:
Thanks to Matt's work on borrowed LOAD_FAST, we can now eliminate reference counting trivially in the JIT.
Reference counting is expensive: Matt found that eliminating 90% of refcounts in LOAD_FAST meant a 2-3% speedup in the interpreter, so the JIT should see a noticeable speedup too.
The other problem is that reference counts block register allocation/TOS caching, as they force spills to the stack more often.
This issue has the potential to speed up JIT benchmarks by several percent.
How to contribute.
Now that reference tracking in the JIT optimizer is in place, the first thing we need to do is to convert ops to make the decref an explicit uop.
For escaping decref ops (i.e., ops whose decref could run the GC), we would need to refactor them so their decrefs are eliminated via specialization of pops. For example, the following op (which is not an escaping op, but is used purely for demonstration!):
    macro(BINARY_OP_ADD_INT) =
        _GUARD_TOS_INT + _GUARD_NOS_INT + unused/5 + _BINARY_OP_ADD_INT;
becomes
    macro(BINARY_OP_ADD_INT) =
        _GUARD_TOS_INT + _GUARD_NOS_INT + unused/5 + _BINARY_OP_ADD_INT + _POP_TOP_INT + _POP_TOP_INT;
Previously, _BINARY_OP_ADD_INT's stack effect looked like this: (left, right -- res). The new version should look like (left, right -- res, left, right).
So for all the decref specializations, we would just need the matching _POP_X variant! This means no explosion of uops and their decref variants. We just specialize _POP_X to _POP_X_NO_DECREF in the JIT, which keeps things clean.
These are open for contributors to take:
Once that's done, we can think about further ops.
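To make the idea concrete, here is a toy Python model of the pop specialization described above. It is only a sketch, not CPython's actual optimizer (which is written in C): the `Ref` class, the `borrowed` flag, and the `specialize_pops` function are hypothetical names invented for illustration, and the `_NO_DECREF` suffix simply follows the `_POP_X_NO_DECREF` naming used in this issue.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """Hypothetical symbolic stack entry, as a reference-tracking pass might see it."""
    name: str
    borrowed: bool  # True if the trace holds no owned reference to this value


def specialize_pops(trace, stack):
    """Rewrite each _POP_X whose operand is borrowed into _POP_X_NO_DECREF."""
    stack = list(stack)
    out = []
    for uop in trace:
        if uop == "_BINARY_OP_ADD_INT":
            # New-style stack effect: (left, right -- res, left, right).
            right = stack.pop()
            left = stack.pop()
            stack += [Ref("res", borrowed=False), left, right]
            out.append(uop)
        elif uop.startswith("_POP_"):
            top = stack.pop()
            # A borrowed operand needs no decref, so use the no-decref variant.
            out.append(uop + "_NO_DECREF" if top.borrowed else uop)
        else:
            # Guards and other uops leave the symbolic stack untouched in this toy model.
            out.append(uop)
    return out


# Assumed scenario: `left` came from a borrowed LOAD_FAST, `right` is owned.
trace = [
    "_GUARD_TOS_INT",
    "_GUARD_NOS_INT",
    "_BINARY_OP_ADD_INT",
    "_POP_TOP_INT",
    "_POP_TOP_INT",
]
stack = [Ref("left", borrowed=True), Ref("right", borrowed=False)]
optimized = specialize_pops(trace, stack)
print(optimized)
```

In this model `_BINARY_OP_ADD_INT` itself never needs a no-decref twin. Only the trailing pops are rewritten, which is why the number of uop variants stays small.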
Has this already been discussed elsewhere?
No response given
Links to previous discussion of this feature:
No response
Linked PRs
_CALL_TYPE_1 #135818
_CALL_TUPLE_1 #135860
_CALL_BUILTION_O #136056
_CALL_STR_1 #136070
_CALL_ISINSTANCE #136077
_CALL_LEN #136104
_CALL_BUILTIN_O #142695
_STORE_ATTR_SLOT #142729
_STORE_ATTR_INSTANCE_VALUE #142759
_STORE_ATTR_WITH_HINT #142767
_LOAD_ATTR_INSTANCE_VALUE #142769
BINARY_OP_SUBSCR_STR_INT #142844
UNPACK_SEQUENCE family #142949
_UNPACK_SEQUENCE_TWO_TUPLE #142952
_LOAD_ATTR_WITH_HINT #143062
_BINARY_OP_SUBSCR_TUPLE_INT #143094
IS_OP #143171
_COMPARE_OP_X #143186
_LOAD_ATTR_SLOT #143320
TO_BOOL_STR #143417
_CONTAINS_{OP|OP_SET|OP_DICT} #143731
_BINARY_OP_SUBSCR_LIST_SLICE #144659
MATCH_CLASS #144821
MAKE_FUNCTION in the JIT #144963