Wingolog blog posts are always such a joy to read (for me at least). The combination of low-level fundamental stuff presented in a simple and accessible way that gets the core idea across is a rare quality.
An immediate question I have is whether one could extend compacting GCs to also "sort" or "cluster" the live objects for better cache locality in some way (in a practical sense, I mean, where the extra runtime overhead is worth it). Also, what kind of clustering would actually work in that case?
I'm guessing that naively tracking how often each object is accessed within a certain window of time or number of memory accesses wouldn't work, since the most frequently accessed objects could be used at completely different times - some sort of "accessed together" algorithm makes more sense. But then you get into combinatorial questions, which have an awkward tendency to explode.
Maybe something really stupid would work:
1. Use a single global integer counter that increases by one with each heap memory access.
2. On each access, store the counter's current value in the accessed object's GC metadata as a sort of "time stamp".
3. During compacting, "sort" all living objects by their time stamp (this is probably really hard).
4. After compacting, reset the counter and the time stamps.
Worst case, this would put each object right next to at least two objects that were accessed around the same time.
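For concreteness, here is a minimal sketch of steps 1-4 in C. Everything in it is hypothetical (the `Object` layout, `note_access`, the barrier that would call it), and it glosses over the hard part, which is copying objects in sorted order while fixing up all the pointers to them:

```c
#include <stdint.h>
#include <stdlib.h>

static uint64_t access_clock = 0;   /* step 1: one global counter */

typedef struct Object {
    uint64_t last_access;           /* step 2: "time stamp" in GC metadata */
    /* ... payload ... */
} Object;

/* Conceptually invoked on every heap access, e.g. via a read barrier. */
static inline void note_access(Object *o) {
    o->last_access = ++access_clock;
}

/* Order two Object pointers by their time stamps. */
static int by_timestamp(const void *a, const void *b) {
    uint64_t ta = (*(Object *const *)a)->last_access;
    uint64_t tb = (*(Object *const *)b)->last_access;
    return (ta > tb) - (ta < tb);
}

/* Step 3: before copying, sort the survivor list by time stamp; the
   collector would then copy objects to the tospace in this order and
   fix up references. Step 4 (resetting the clock and stamps) follows. */
static void order_survivors(Object **live, size_t n) {
    qsort(live, n, sizeof *live, by_timestamp);
}
```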
I presume that this question has been researched to death - does anyone have any suggestions?
This approach makes sense if you have access to either all available memory or a fixed quantity of memory, half of which would go to the fromspace and the other half to the tospace. What I don't understand is: When should the heap be made bigger? What about smaller?
Making it bigger requires a stop-the-world pause and moving some objects.
You'll practically never need to make it smaller, but if you did, it would require the same effort.
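To make that concrete, here is a rough sketch assuming a Cheney-style semispace collector; the `Heap` layout and function names are made up, and the actual copy and pointer fixup are elided. Since every collection is already a stop-the-world copy into the tospace, growing (or shrinking) amounts to allocating differently sized semispaces before the copy:

```c
/* Hypothetical sketch: resizing a semispace heap at collection time. */
#include <stddef.h>
#include <stdlib.h>

typedef struct {
    char  *fromspace;       /* where objects currently live */
    char  *tospace;         /* target of the next copy */
    size_t semispace_size;
} Heap;

void collect_and_resize(Heap *h, size_t new_semispace_size) {
    /* The world is stopped for the copy anyway, so resizing is cheap. */
    char *new_from = malloc(new_semispace_size);
    char *new_to   = malloc(new_semispace_size);

    /* ... Cheney scan: copy live objects from h->fromspace into new_from,
       updating every reference as objects move (elided) ... */

    free(h->fromspace);
    free(h->tospace);
    h->fromspace      = new_from;   /* survivors now live here */
    h->tospace        = new_to;
    h->semispace_size = new_semispace_size;
}
```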
But it's good to see someone finally explaining the simplest and fastest kind of GC: because the objects are compacted, related objects usually end up on the same cache line. MMTk is way underused.
If you like reading about GC implementation, then keep reading wingo's blog. He's currently being funded by NLnet to finish the Whippet GC library.