I started testing the in-place import functionality using hard links. It worked great with just a few files, but then I got ambitious and started importing a set of 8 ScanR-acquired plates (with 700 "wells"/images each) using a shell script as follows:
~omero_user/OMERO.server/bin/omero -s localhost -u sudar import -- --transfer=ln -r 51 -n LI8V00101_A01_04132015_PC3_PBS LI8V00101_A01_04132015_PC3_PBS_001
~omero_user/OMERO.server/bin/omero -s localhost -u sudar import -- --transfer=ln -r 51 -n LI8V00101_A02_04132015_PC3_IGF LI8V00101_A02_04132015_PC3_IGF_001
~omero_user/OMERO.server/bin/omero -s localhost -u sudar import -- --transfer=ln -r 51 -n LI8V00101_A03_04132015_PC3_bFGF LI8V00101_A03_04132015_PC3_bFGF_001
~omero_user/OMERO.server/bin/omero -s localhost -u sudar import -- --transfer=ln -r 51 -n LI8V00101_A04_04132015_PC3_10FBS LI8V00101_A04_04132015_PC3_10FBS_001
~omero_user/OMERO.server/bin/omero -s localhost -u sudar import -- --transfer=ln -r 51 -n LI8V00101_B01_04132015_PC3_PBS LI8V00101_B01_04132015_PC3_PBS_001
~omero_user/OMERO.server/bin/omero -s localhost -u sudar import -- --transfer=ln -r 51 -n LI8V00101_B02_04132015_PC3_bFGF LI8V00101_B02_04132015_PC3_bFGF_001
~omero_user/OMERO.server/bin/omero -s localhost -u sudar import -- --transfer=ln -r 51 -n LI8V00101_B03_04132015_PC3_10FBS LI8V00101_B03_04132015_PC3_10FBS_001
~omero_user/OMERO.server/bin/omero -s localhost -u sudar import -- --transfer=ln -r 51 -n LI8V00101_B04_04132015_PC3_EGF LI8V00101_B04_04132015_PC3_EGF_001
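For reference, the same eight commands can be written as a loop that stops at the first plate that fails instead of carrying on to the next one. This is only a sketch: the variable names OMERO_BIN and DRY_RUN and the function name import_all are made up for illustration, the flags are copied unchanged from the commands above, and it defaults to printing the commands rather than running them.

```shell
#!/bin/sh
# Assumption: `omero` is on PATH; otherwise set OMERO_BIN to the full
# path used in the commands above.
: "${OMERO_BIN:=omero}"
# Dry-run by default so pasting this does not start an import.
DRY_RUN=${DRY_RUN:-1}

# Plate names exactly as passed to -n above; each plate directory is
# the name with _001 appended.
PLATES="LI8V00101_A01_04132015_PC3_PBS
LI8V00101_A02_04132015_PC3_IGF
LI8V00101_A03_04132015_PC3_bFGF
LI8V00101_A04_04132015_PC3_10FBS
LI8V00101_B01_04132015_PC3_PBS
LI8V00101_B02_04132015_PC3_bFGF
LI8V00101_B03_04132015_PC3_10FBS
LI8V00101_B04_04132015_PC3_EGF"

import_all() {
    for name in $PLATES; do
        if [ "$DRY_RUN" = 1 ]; then
            # Only print the command that would be run.
            echo "$OMERO_BIN -s localhost -u sudar import -- --transfer=ln -r 51 -n $name ${name}_001"
        else
            # Stop at the first failing plate rather than continuing.
            "$OMERO_BIN" -s localhost -u sudar import -- --transfer=ln -r 51 \
                -n "$name" "${name}_001" || return 1
        fi
    done
}

import_all
```

Running it with DRY_RUN=0 would perform the actual imports.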
The first plate went fine, and then halfway through the second plate I got this error:
2015-04-24 19:40:51,830 166484 [ main] INFO .importer.transfers.HardlinkFileTransfer - Transferring /data/share/lincs_user/LI8V001/LI8V00101/LI8V00101_A02_04132015_PC3_IGF_001/data/--W00257--P00001--Z00000--T00000--Alexa 555.tif...
2015-04-24 19:40:51,862 166516 [ main] INFO ormats.importer.cli.LoggingImportMonitor - FILE_UPLOAD_STARTED: /data/share/lincs_user/LI8V001/LI8V00101/LI8V00101_A02_04132015_PC3_IGF_001/data/--W00257--P00001--Z00000--T00000--Alexa 555.tif
2015-04-24 19:40:51,884 166538 [ main] ERROR ome.formats.importer.cli.ErrorHandler - FILE_EXCEPTION: /data/share/lincs_user/LI8V001/LI8V00101/LI8V00101_A02_04132015_PC3_IGF_001/data/--W00257--P00001--Z00000--T00000--Alexa 555.tif
omero.ResourceError: null
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[na:1.7.0_75]
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) ~[na:1.7.0_75]
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[na:1.7.0_75]
at java.lang.reflect.Constructor.newInstance(Constructor.java:526) ~[na:1.7.0_75]
at java.lang.Class.newInstance(Class.java:379) ~[na:1.7.0_75]
at IceInternal.BasicStream.createUserException(BasicStream.java:2615) ~[ice.jar:na]
at IceInternal.BasicStream.access$300(BasicStream.java:12) ~[ice.jar:na]
at IceInternal.BasicStream$EncapsDecoder10.throwException(BasicStream.java:3099) ~[ice.jar:na]
at IceInternal.BasicStream.throwException(BasicStream.java:2077) ~[ice.jar:na]
at IceInternal.Outgoing.throwUserException(Outgoing.java:538) ~[ice.jar:na]
at omero.api._RawFileStoreDelM.close(_RawFileStoreDelM.java:466) ~[blitz.jar:na]
at omero.api.RawFileStorePrxHelper.close(RawFileStorePrxHelper.java:1877) ~[blitz.jar:na]
at omero.api.RawFileStorePrxHelper.close(RawFileStorePrxHelper.java:1839) ~[blitz.jar:na]
at ome.formats.importer.transfers.AbstractExecFileTransfer.checkLocation(AbstractExecFileTransfer.java:118) ~[blitz.jar:na]
at ome.formats.importer.transfers.AbstractExecFileTransfer.transfer(AbstractExecFileTransfer.java:63) ~[blitz.jar:na]
at ome.formats.importer.ImportLibrary.uploadFile(ImportLibrary.java:430) [blitz.jar:na]
at ome.formats.importer.ImportLibrary.importImage(ImportLibrary.java:503) [blitz.jar:na]
at ome.formats.importer.ImportLibrary.importCandidates(ImportLibrary.java:287) [blitz.jar:na]
at ome.formats.importer.cli.CommandLineImporter.start(CommandLineImporter.java:245) [blitz.jar:na]
at ome.formats.importer.cli.CommandLineImporter.main(CommandLineImporter.java:858) [blitz.jar:na]
2015-04-24 19:40:51,888 166542 [ main] ERROR ome.formats.importer.ImportLibrary - Error on import: Cannot find path. Deleted? /data/OMERO/ManagedRepository/sudar_2/2015-04/24/19-38-50.012/data/--W00257--P00001--Z00000--T00000--Alexa 555.tif
omero.ResourceError: null
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[na:1.7.0_75]
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) ~[na:1.7.0_75]
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[na:1.7.0_75]
at java.lang.reflect.Constructor.newInstance(Constructor.java:526) ~[na:1.7.0_75]
at java.lang.Class.newInstance(Class.java:379) ~[na:1.7.0_75]
at IceInternal.BasicStream.createUserException(BasicStream.java:2615) ~[ice.jar:na]
at IceInternal.BasicStream.access$300(BasicStream.java:12) ~[ice.jar:na]
at IceInternal.BasicStream$EncapsDecoder10.throwException(BasicStream.java:3099) ~[ice.jar:na]
at IceInternal.BasicStream.throwException(BasicStream.java:2077) ~[ice.jar:na]
at IceInternal.Outgoing.throwUserException(Outgoing.java:538) ~[ice.jar:na]
and it aborted this plate. It then also aborted each of the remaining plates, in every case on the first file it encountered in the plate and with exactly the same error message.
When looking at master.err, I saw the following messages at almost the same time (though about 2 minutes later?), which appear to coincide with the aborts:
Apr 24, 2015 7:42:17 PM java.util.prefs.FileSystemPreferences checkLockFile0ErrorCode
WARNING: Could not lock User prefs. Unix error code 24.
Apr 24, 2015 7:42:17 PM java.util.prefs.FileSystemPreferences syncWorld
WARNING: Couldn't flush user prefs: java.util.prefs.BackingStoreException: Couldn't get file lock.
Apr 24, 2015 7:42:24 PM java.util.prefs.FileSystemPreferences checkLockFile0ErrorCode
WARNING: Could not lock User prefs. Unix error code 24.
Apr 24, 2015 7:42:47 PM java.util.prefs.FileSystemPreferences checkLockFile0ErrorCode
WARNING: Could not lock User prefs. Unix error code 24.
Apr 24, 2015 7:42:47 PM java.util.prefs.FileSystemPreferences syncWorld
WARNING: Couldn't flush user prefs: java.util.prefs.BackingStoreException: Couldn't get file lock.
Apr 24, 2015 7:45:17 PM java.util.prefs.FileSystemPreferences checkLockFile0ErrorCode
WARNING: Could not lock User prefs. Unix error code 24.
Apr 24, 2015 7:45:17 PM java.util.prefs.FileSystemPreferences syncWorld
WARNING: Couldn't flush user prefs: java.util.prefs.BackingStoreException: Couldn't get file lock.
Apr 24, 2015 7:45:47 PM java.util.prefs.FileSystemPreferences checkLockFile0ErrorCode
WARNING: Could not lock User prefs. Unix error code 24.
Apr 24, 2015 7:45:47 PM java.util.prefs.FileSystemPreferences syncWorld
WARNING: Couldn't flush user prefs: java.util.prefs.BackingStoreException: Couldn't get file lock.
Apr 24, 2015 7:45:49 PM java.util.prefs.FileSystemPreferences checkLockFile0ErrorCode
WARNING: Could not lock User prefs. Unix error code 24.
Apr 24, 2015 7:48:17 PM java.util.prefs.FileSystemPreferences checkLockFile0ErrorCode
WARNING: Could not lock User prefs. Unix error code 24.
Apr 24, 2015 7:48:17 PM java.util.prefs.FileSystemPreferences syncWorld
WARNING: Couldn't flush user prefs: java.util.prefs.BackingStoreException: Couldn't get file lock.
Apr 24, 2015 7:48:47 PM java.util.prefs.FileSystemPreferences checkLockFile0ErrorCode
WARNING: Could not lock User prefs. Unix error code 24.
Apr 24, 2015 7:48:47 PM java.util.prefs.FileSystemPreferences syncWorld
WARNING: Couldn't flush user prefs: java.util.prefs.BackingStoreException: Couldn't get file lock.
Apr 24, 2015 7:54:17 PM java.util.prefs.FileSystemPreferences checkLockFile0ErrorCode
WARNING: Could not lock User prefs. Unix error code 24.
Apr 24, 2015 7:54:17 PM java.util.prefs.FileSystemPreferences syncWorld
WARNING: Couldn't flush user prefs: java.util.prefs.BackingStoreException: Couldn't get file lock.
This is a CentOS 6.6 server running OMERO 5.1 and the filesystem is on local disk under ZFS.
Any pointers as to what is likely causing this?
Unix error code 24 (EMFILE) seems to point to "Too many open files", so should I increase the ulimit (or is OMERO 5.1 leaking file descriptors)? My ulimit and file-max settings are:
[inplace_user@lincs LI8V00101]$ ulimit -Hn
4096
[inplace_user@lincs LI8V00101]$ ulimit -Sn
1024
[inplace_user@lincs LI8V00101]$ cat /proc/sys/fs/file-max
6544206
Or might there be an issue with ZFS not providing proper file locking?
When I came back after dinner and re-ran the plate that had been the first to fail, it imported just fine. So it wasn't anything specific to that plate.
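On the descriptor-limit question: limits are per process, so the ulimit values above, taken in an interactive shell, are not necessarily the ones the server's Java processes are running with. A minimal sketch, assuming Linux /proc, of how the open-descriptor count of a given process could be compared with its own soft limit (the function name fd_usage is made up for illustration, and reading another user's process typically requires running as that user or as root):

```shell
#!/bin/sh
# Print the number of open file descriptors and the soft "Max open files"
# limit for one PID. Linux-only: reads /proc/<pid>/fd and /proc/<pid>/limits.
fd_usage() {
    pid="$1"
    # One directory entry per open descriptor.
    open=$(ls "/proc/$pid/fd" 2>/dev/null | wc -l | tr -d ' ')
    # The limits line reads: "Max open files  <soft>  <hard>  files".
    soft=$(awk '/^Max open files/ {print $4}' "/proc/$pid/limits")
    echo "pid=$pid open=$open soft_limit=$soft"
}

# Example: inspect the current shell. For the server, pass the PID of
# the relevant java process instead.
fd_usage "$$"
```

If the open count of a server process climbs towards its soft limit during an import, that would point at the limit (or at a leak) rather than at the filesystem.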
Thanks for any advice.
- Damir