Robert Cooper and Keith Marzullo. 1991. Consistent detection of global predicates. In Proceedings of the 1991 ACM/ONR workshop on Parallel and distributed debugging (PADD ’91). ACM, New York, NY, USA, 167-174.
Adve, Sarita V., and Hans-J. Boehm. “Memory models: a case for rethinking parallel languages and hardware.” Communications of the ACM 53.8 (2010): 90-101.
Arora, Nimar S., Robert D. Blumofe, and C. Greg Plaxton. “Thread scheduling for multiprogrammed multiprocessors.” Theory of Computing Systems 34.2 (2001): 115-144.
Blelloch, Guy E., Phillip B. Gibbons, and Yossi Matias. “Provably efficient scheduling for languages with fine-grained parallelism.” Journal of the ACM (JACM) 46.2 (1999): 281-321.
Blumofe, Robert D., and Charles E. Leiserson. “Scheduling multithreaded computations by work stealing.” Journal of the ACM (JACM) 46.5 (1999): 720-748.
Gast, Nicolas, Mohammed Khatiri, Denis Trystram, and Frederic Wagner. “A new analysis of work stealing with latency.” (2018).
Herlihy, Maurice, and Zhiyu Liu. “Well-structured futures and cache locality.” ACM SIGPLAN Notices. Vol. 49. No. 8. ACM, 2014.
V. Kumar, K. Murthy, V. Sarkar and Y. Zheng, “Optimized Distributed Work-Stealing,” 2016 6th Workshop on Irregular Applications: Architecture and Algorithms (IA3), Salt Lake City, UT, 2016, pp. 74-77.
Muller, Stefan K., and Umut A. Acar. “Latency-hiding work stealing: Scheduling interacting parallel computations with work stealing.” Proceedings of the 28th ACM Symposium on Parallelism in Algorithms and Architectures. ACM, 2016.
Sangho Lee, Teresa Johnson, and Easwaran Raman. 2014. Feedback directed optimization of TCMalloc. In Proceedings of the workshop on Memory Systems Performance and Correctness (MSPC ’14). ACM, New York, NY, USA, Article 3, 8 pages.
Ghemawat, Sanjay, and Paul Menage. “TCMalloc: Thread-caching malloc.” (2009).
Elias, Diego, et al. “Experimental and theoretical analyses of memory allocation algorithms.” Proceedings of the 29th Annual ACM Symposium on Applied Computing. ACM, 2014.
Manghwani, Rahul, and Tao He. “Scalable memory allocation.” (2011): 1-13.
Chernikov, Andrey N., et al. “Experience with memory allocators for parallel mesh generation on multicore architectures.” 10th International Conference on Numerical Grid Generation in Computational Field Simulations, Forth, Crete, Greece. 2007.
Karlin, Ian, and Mike Collette. “Strong scaling bottleneck identification and mitigation in Ares.” Nuclear Explosive Code Development Conference Proceedings (NECDC14). 2015.
Macêdo, Rivalino Matias Taís Borges Autran, and Lúcio Borges. “A Comparative Study on Memory Allocators in Multicore and Multithreaded Applications.”
Svilen Kanev, Sam Likun Xi, Gu-Yeon Wei, and David Brooks. 2017. Mallacc: Accelerating Memory Allocation. SIGOPS Oper. Syst. Rev. 51, 2 (April 2017), 33-45.
Rebecca Smith and Scott Rixner. 2017. A policy-based system for dynamic scaling of virtual machine memory reservations. In Proceedings of the 2017 Symposium on Cloud Computing (SoCC ’17). ACM, New York, NY, USA, 282-294.
Bradley C. Kuszmaul. 2015. SuperMalloc: a super fast multithreaded malloc for 64-bit machines. SIGPLAN Not. 50, 11 (June 2015), 41-55.
Bacon, David F., Perry Cheng, and V. T. Rajan. “A unified theory of garbage collection.” ACM SIGPLAN Notices 39.10 (2004): 50-68.
Paul R. Wilson. 1992. Uniprocessor Garbage Collection Techniques. In Proceedings of the International Workshop on Memory Management (IWMM ’92), Yves Bekkers and Jacques Cohen (Eds.). Springer-Verlag, London, UK, 1-42.
Pekka P. Pirinen. 1998. Barrier techniques for incremental tracing. In Proceedings of the 1st international symposium on Memory management (ISMM ’98). ACM, New York, NY, USA, 20-25.
T. Yuasa. 1990. Real-time garbage collection on general-purpose machines. J. Syst. Softw. 11, 3 (March 1990), 181-198.
Edsger W. Dijkstra, Leslie Lamport, A. J. Martin, C. S. Scholten, and E. F. M. Steffens. 1978. On-the-fly garbage collection: an exercise in cooperation. Commun. ACM 21, 11 (November 1978), 966-975.