pixeldata threads/repetitions for big TIFF pyramid workload
Posted: Mon Mar 02, 2015 7:18 pm
Hi,
We are bringing an OMERO instance into production use soon. It's on decent (virtualized) hardware with 128GB RAM available, 32 cores and fast 20TB storage for the repository. I'm a bit confused about the best configuration settings to use for good pixeldata pyramid performance. Most of our imports will be large 50MPixel+ TIFF files, so there is a lot of pyramid creation happening.
At present I have omero.pixeldata.threads = 8 set. During pyramid creation I'm seeing CPU usage via 'top' of up to 1200% (more than 8 threads/processes would account for), yet the Pixeldata-0 log file only ever prints messages that seem to come from up to 5 threads:
2015-03-02 13:08:29,174 INFO [ ome.io.nio.PixelsService] (2-thread-1) Pyramid creation for Pixels:793 1/819 (0%).
2015-03-02 13:08:52,518 INFO [ ome.io.nio.PixelsService] (2-thread-4) Pyramid creation for Pixels:803 82/819 (9%).
2015-03-02 13:08:54,320 INFO [ ome.io.nio.PixelsService] (2-thread-5) Pyramid creation for Pixels:815 82/819 (9%).
2015-03-02 13:08:55,224 INFO [ ome.io.nio.PixelsService] (2-thread-2) Pyramid creation for Pixels:827 82/819 (9%).
2015-03-02 13:08:55,978 INFO [ ome.io.nio.PixelsService] (2-thread-1) Pyramid creation for Pixels:793 82/819 (9%).
2015-03-02 13:09:15,767 INFO [ ome.io.nio.PixelsService] (2-thread-4) Pyramid creation for Pixels:803 163/819 (19%).
2015-03-02 13:09:17,576 INFO [ ome.io.nio.PixelsService] (2-thread-5) Pyramid creation for Pixels:815 163/819 (19%).
2015-03-02 13:09:18,780 INFO [ ome.io.nio.PixelsService] (2-thread-2) Pyramid creation for Pixels:827 163/819 (19%).
2015-03-02 13:09:20,772 INFO [ ome.io.nio.PixelsService] (2-thread-1) Pyramid creation for Pixels:793 163/819 (19%).
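To double-check which worker threads actually appear in the log, a quick tally along the following lines can help. This is only a minimal sketch: the regular expression is my assumption about the line format based on the excerpt above, and the function name tally_threads is made up for illustration, not part of OMERO.

```python
import re
from collections import Counter

# Assumed line format, inferred from the excerpt above:
#   "... (2-thread-N) Pyramid creation for Pixels:ID x/y (z%)."
LINE_RE = re.compile(
    r"\((?P<thread>[\w-]+)\)\s+Pyramid creation for Pixels:(?P<pixels>\d+)"
)

def tally_threads(lines):
    """Count pyramid log messages per thread and collect the Pixels IDs each thread reported."""
    counts = Counter()
    pixels_by_thread = {}
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # skip lines that are not pyramid progress messages
        thread = match["thread"]
        counts[thread] += 1
        pixels_by_thread.setdefault(thread, set()).add(int(match["pixels"]))
    return counts, pixels_by_thread

# First five lines of the excerpt above, used as sample input.
SAMPLE = [
    "2015-03-02 13:08:29,174 INFO [ ome.io.nio.PixelsService] (2-thread-1) Pyramid creation for Pixels:793 1/819 (0%).",
    "2015-03-02 13:08:52,518 INFO [ ome.io.nio.PixelsService] (2-thread-4) Pyramid creation for Pixels:803 82/819 (9%).",
    "2015-03-02 13:08:54,320 INFO [ ome.io.nio.PixelsService] (2-thread-5) Pyramid creation for Pixels:815 82/819 (9%).",
    "2015-03-02 13:08:55,224 INFO [ ome.io.nio.PixelsService] (2-thread-2) Pyramid creation for Pixels:827 82/819 (9%).",
    "2015-03-02 13:08:55,978 INFO [ ome.io.nio.PixelsService] (2-thread-1) Pyramid creation for Pixels:793 82/819 (9%).",
]

if __name__ == "__main__":
    counts, pixels_by_thread = tally_threads(SAMPLE)
    for thread in sorted(counts):
        print(thread, counts[thread], sorted(pixels_by_thread[thread]))
```

On the excerpt this reports four distinct thread names (numbered 1, 2, 4 and 5), each working on a single Pixels ID.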
We thought that, given our expected workload (very large batches of TIFFs uploaded by multiple users concurrently), setting omero.pixeldata.threads below the number of cores and omero.pixeldata.repetitions > 1 would be useful, since it might share the processing more evenly between pending images from different users by running multiple pixeldata pyramid batches when there's a backlog. I'm not sure how repetitions actually works, though. RAM doesn't seem to be a concern given the 128GB available; the processes always show usage well under the allocated heap size.
I'd be grateful for any pointers on the best settings for this scenario.
Thanks,
Dave Trudgian