That leaves two choice:
1. One can hand-out thread ids in advance (i.e. assign one to each possible callback) during a stage when one knows the code will have only one thread running
2. Use randomized mutual exclusion algorithms
Personally, I think option #1 is very hard to implement correctly, so option #2 is probably the most feasible. Basically, you randomly generate a thread id for yourself at any time (in fact, you can randomly generate a thread id every time you want to enter a mutex). Then, you can run a symmetric mutual exclusion protocol using atomic RW registers.
I found a paper that seems to give one such algorithm:
"Tight Bounds for Shared Memory Symmetric Mutual Exclusion Problems" by Eugene Styer and Gary L. Peterson.