Serious problem with OMERO's DropBox

Postby Sethur » Wed Dec 17, 2014 1:54 pm

Hi,

after a user complained to me about her images being declared as corrupt after download from the OMERO server via OMERO.web, I found a serious issue regarding the everyday lab workflow involving the DropBox: Apparently, many users stick to the following routine:

1) Take some images
2) "Save As" (e.g.) to create a LIF file on the DropBox folder (mounted via CIFS)
3) Meanwhile, the LIF file gets uploaded to OMERO
4) Take some more images into the same container
5) "Save" the LIF file again
6) The OMERO server detects a change in the size of the LIF file and re-uploads it under the same name (the first X images are now duplicated)
7) The user repeats steps 4) to 6) an arbitrary number of times and ends up with a confusing-looking set of images.

To resolve this problem, OMERO could detect whether the file that has just been saved is only an update to a previous version (i.e. the first images are absolutely identical) and only add the differences. This would probably be hard to implement, especially since the original RAW image file would have to be discarded, too, with every update.
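A minimal sketch of the byte-level part of such a check, assuming the server remembered the size and SHA-1 of the file at the time of the first import. The function names are hypothetical and not part of OMERO, and whether a re-saved LIF really keeps its earlier bytes unchanged is an assumption I have not verified:

```python
import hashlib
import os


def sha1_of_prefix(path, length, chunk_size=1 << 20):
    """Return the SHA-1 hex digest of the first `length` bytes of `path`."""
    digest = hashlib.sha1()
    remaining = length
    with open(path, "rb") as handle:
        while remaining > 0:
            chunk = handle.read(min(chunk_size, remaining))
            if not chunk:  # file is shorter than `length`
                break
            digest.update(chunk)
            remaining -= len(chunk)
    return digest.hexdigest()


def is_append_only_update(old_size, old_sha1, new_path):
    """True if `new_path` still starts with the bytes seen at the first import.

    `old_size` and `old_sha1` are assumed to have been recorded when the
    file was first imported (hypothetical bookkeeping).
    """
    if os.path.getsize(new_path) < old_size:
        return False
    return sha1_of_prefix(new_path, old_size) == old_sha1
```

This only answers "was data appended?". It says nothing about how to merge the new images into the existing import, which is the hard, format-specific part.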

An alternative would be to educate our users that they should only save or copy the absolute LAST version of their experiment onto the DropBox folder for upload, but they will probably not easily understand why this is necessary and therefore continue making these mistakes.

There is also the question of why some of the "intermediate" LIF files one of our users downloaded for analysis in the LASAF software were reported as corrupt. I will look into this in more detail. I suspect that the DropBox thread uploaded only parts of those files in some cases (i.e. it left out the last few bytes).

Cheers,

Tristan

Re: Serious problem with OMERO's DropBox

Postby jmoore » Wed Dec 17, 2014 2:53 pm

Hi Tristan,

if you find out which sequence of events is leading to the corruption, we'd certainly look into taking steps to prevent that from happening, but in general the workflow you're outlining is simply not supported. From "Using DropBox":
[DropBox] is ... intended as a write-once system.


I'd highly suggest the "last-only" workflow that you describe. Anything else will lead to a (perhaps partial) duplication in OMERO.

For your debugging, one possibility of what's happened is that one import was kicked off and running when a modification occurred which could easily cause what you're seeing. But not knowing the timings involved, that's really just a wild guess.

The warning above goes on to say: "Modifying an image after it has been imported may result in that modified image also being imported depending on the operating system and how the image was modified." Depending on what you find out, we can certainly expand that to include corrupting on-going imports as well.

Cheers,
~Josh

Re: Serious problem with OMERO's DropBox

Postby Sethur » Thu Dec 18, 2014 9:31 am

Hi Josh,

thanks for the quick reply. Although it would be technically possible to check for updates if the file name didn't change and a re-upload was triggered in a certain time-frame, it's hard to implement that for all different file formats and merge all the changes that could have happened in the raw data file as well as replacing the old file with the new. For now, I will advise our users to use the DropBox only after they finished a certain experiment. Regarding the ideas you had about the corruption problem: Yes, I think that is easily possible. In one case where corruption occurred, the corrupted file was saved only 15 seconds before the next update.
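For anyone who wants to enforce the "only after the experiment is finished" rule rather than rely on user discipline, here is a rough sketch: users save to a staging folder, and a small script copies a file into the DropBox folder only once it has stopped changing. Everything here is an assumption on my side (the staging folder, the function names, the quiet period); it is not an OMERO feature.

```python
import os
import shutil
import time


def wait_until_stable(path, quiet_seconds=60.0, poll_seconds=5.0, sleep=time.sleep):
    """Block until size and mtime of `path` are unchanged for `quiet_seconds`."""

    def snapshot():
        info = os.stat(path)
        return info.st_size, info.st_mtime_ns

    last = snapshot()
    quiet_for = 0.0
    while quiet_for < quiet_seconds:
        sleep(poll_seconds)
        current = snapshot()
        if current == last:
            quiet_for += poll_seconds
        else:
            # The file changed again: restart the quiet period.
            last = current
            quiet_for = 0.0


def publish_when_finished(source, dropbox_dir, **wait_options):
    """Copy `source` into `dropbox_dir` once it has stopped changing."""
    wait_until_stable(source, **wait_options)
    target = os.path.join(dropbox_dir, os.path.basename(source))
    shutil.copy2(source, target)
    return target
```

The quiet period has to be longer than the typical gap between two "Save" clicks, so with saves only 15 seconds apart a value of several minutes is probably more realistic than the default above. A file that is saved again after the copy would still trigger a second import.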

Cheers,

Tristan

Re: Serious problem with OMERO's DropBox

Postby Sethur » Fri Dec 19, 2014 9:19 am

PS: After digging a little into the cause of the corrupted data, I found that there is another dimension to this issue:

- Our DropBox import is done via in-place hard-linking
- If a user deletes a file on the DropBox, the link still resides in the ManagedRepository, so this is no problem
- However: if a user changes a file in the DropBox, a re-upload is triggered, i.e. in this case another hard link to the same file is created.
- As a consequence, a user might have OMERO db entries that point to a raw data file which no longer corresponds to these entries. In the best case scenario, there are only more images in the file than originally during the time of the import. In the worst case scenario, the user has meanwhile deleted images which were previously in that raw data file (and maybe added new ones as well).
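The hard-link behaviour described in the list above can be reproduced outside OMERO with a few lines. This is only an illustration on a local temp folder: the file names are made up, and it assumes the acquisition software modifies the file in place instead of writing a new file and renaming it.

```python
import os
import tempfile


def hardlink_demo():
    """Show what a hard-linked copy sees after an in-place change and a delete."""
    with tempfile.TemporaryDirectory() as root:
        dropbox_file = os.path.join(root, "dropbox.lif")      # name the user sees
        repo_link = os.path.join(root, "managed_repo.lif")    # name the import created

        with open(dropbox_file, "wb") as handle:
            handle.write(b"image-1;")
        os.link(dropbox_file, repo_link)  # second name for the same data
        same_inode = os.stat(dropbox_file).st_ino == os.stat(repo_link).st_ino

        # In-place modification: visible through both names.
        with open(dropbox_file, "ab") as handle:
            handle.write(b"image-2;")
        with open(repo_link, "rb") as handle:
            after_append = handle.read()

        # Deleting the DropBox name leaves the repository name intact.
        os.remove(dropbox_file)
        with open(repo_link, "rb") as handle:
            after_delete = handle.read()

    return same_inode, after_append, after_delete
```

Calling `hardlink_demo()` shows that the repository link contains the appended data even though nothing was "imported" again, while deleting the DropBox name does not affect it.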

I'm still unclear, however, why an export of any of the hardlinks (to the same file) might lead to a corrupted LIF (it can be opened alright in ImageJ but not in LASAF). Is there anything the OMERO.server does to the raw data files before export?

Could you also give me some insights on how OMERO would behave if there is a raw data file associated with an import that has more images than necessary with the original images still present?

Cheers,

Tristan

Re: Serious problem with OMERO's DropBox

Postby cblackburn » Mon Jan 05, 2015 11:19 am

Hi Tristan,

Happy new year! (And many apologies for the delay due to the holidays.)

Thanks for the further information on your users' workflow.

Sethur wrote: I'm still unclear, however, why an export of any of the hardlinks (to the same file) might lead to a corrupted LIF (it can be opened alright in ImageJ but not in LASAF). Is there anything the OMERO.server does to the raw data files before export?


I'm not certain at the moment what the cause is here. Are you seeing this corruption in all exported LIFs based on that one raw data file in the simple case when a user is progressively adding images to a file? Or is this a more complex case of images being removed from the file?

I would have expected the final DropBox-triggered import to represent a coherent Fileset and so produce a valid exported LIF when *that* imported image is exported. Would it be possible for you to test a simple case, a LIF with just one update following its initial import?

Sethur wrote:Could you also give me some insights on how OMERO would behave if there is a raw data file associated with an import that has more images than necessary with the original images still present?


This would certainly be format-dependent. I'll take a look at the LIF format and get back to you on this as soon as possible.

Cheers,

Colin

Re: Serious problem with OMERO's DropBox

Postby cblackburn » Tue Jan 06, 2015 10:28 am

Hi Tristan,

Following up on this,

cblackburn wrote:Hi Tristan,
Sethur wrote:Could you also give me some insights on how OMERO would behave if there is a raw data file associated with an import that has more images than necessary with the original images still present?


This would certainly be format-dependent. I'll take a look at the LIF format and get back to you on this as soon as possible.


After checking with colleagues, from the structure of the files we think it's safe to expect that images are added serially. This should mean that any already-imported images will open okay. However, the rendering settings may be incorrect if the originally imported file was missing planes from any of the images.

Cheers,

Colin