- stop trying to use the backend methods in the cell logger, as they caused too many deadlock issues; for now just use PyTorch directly if available, and remove the pointless abstraction (needs more cleanup)
- completely rewrite general RAM reporting: tracemalloc can't track non-Python memory allocations, so the reported general RAM numbers were completely invalid at times. Switch to the same approach as the pynvml tracking, with a thread that monitors peak memory usage.
- simplify the logic of how GPU peak memory is calculated
- fix broken tests now that RAM is reported correctly
- improve logging format
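
The thread-based peak tracking mentioned above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the class name `PeakMemoryMonitor`, the `read_used_bytes` callable, and the polling interval are all hypothetical. The idea is that a background thread samples current usage at a fixed interval and keeps the maximum it has seen.

```python
import threading


class PeakMemoryMonitor:
    """Poll a memory reader in a background thread and remember the maximum."""

    def __init__(self, read_used_bytes, interval=0.001):
        # read_used_bytes: hypothetical zero-argument callable that returns
        # the number of bytes currently in use (RAM or GPU memory).
        self._read = read_used_bytes
        self._interval = interval
        self._stop = threading.Event()
        self._thread = None
        self.peak = 0

    def _run(self):
        # Sample until stop() is requested, keeping the largest value seen.
        while not self._stop.is_set():
            self.peak = max(self.peak, self._read())
            self._stop.wait(self._interval)

    def start(self):
        self.peak = self._read()  # baseline sample before the thread starts
        self._stop.clear()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()
        # One last sample so a spike right before stop() is not missed.
        self.peak = max(self.peak, self._read())
        return self.peak
```

For general RAM the reader could be something like `lambda: psutil.Process().memory_info().rss` (assuming psutil is installed), and for GPU memory something like `lambda: pynvml.nvmlDeviceGetMemoryInfo(handle).used`. Note that polling can miss spikes shorter than the sampling interval, so the reported peak is a lower bound.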