
Make Dropbox watch a user's subfolder

PostPosted: Mon Mar 03, 2014 2:32 pm
by Sethur
Hi,

I tried to enable Dropbox in our setup, where every user gets a Samba 3 PDC-enabled account on every microscope Windows PC. My aim is to watch a certain subdirectory of each user's shared home folder.

In order to watch a user-specific directory, I set up the relevant parts of etc/grid/templates.xml as follows:

Code:
<property name="omero.fs.importUsers"  value="default"/>
<property name="omero.fs.watchDir"  value="/srv/home"/>


This seems to watch /srv/home/foobar when user foobar is logged on. I would, however, prefer to watch only a single directory (and all subfolders) inside the user's home directory, e.g. /srv/home/foobar/Dropbox

This way, the users still have most of their home directory for normal use. Is there any way to achieve this? The OMERO documentation is relatively sparse on this topic.

Best,

Tristan

Re: Make Dropbox watch a user's subfolder

PostPosted: Tue Mar 04, 2014 3:56 pm
by cblackburn
Hi Tristan,

It is possible to set up DropBox for multiple users. This is described in our 4.4.10 documentation here:
http://www.openmicroscopy.org/site/support/omero4/sysadmins/dropbox.html#advanced-use
with an example here:
http://www.openmicroscopy.org/site/support/omero4/sysadmins/dropbox.html#example

If you have upgraded to OMERO 5, there are equivalent pages in the documentation there. So the two key lines for your case with users 'foo' and 'bar' would be something like:
Code:
...
<property name="omero.fs.importUsers"     value="foo;bar"/>
<property name="omero.fs.watchDir"        value="/srv/home/foo/Dropbox;/srv/home/bar/Dropbox"/>
...

One thing this doesn't make clear, other than by the example, is that each configuration variable must contain a semi-colon separated list of the same length as the number of users.
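
The equal-length rule can be checked before the configuration is loaded. The following is a minimal stand-alone sketch, not part of OMERO: the helper names `split_list` and `check_lengths` are hypothetical, and it assumes the properties have already been read into a plain dictionary.

```python
def split_list(value):
    """Split a semicolon-separated property value into its entries."""
    return [item.strip() for item in value.split(";")]


def check_lengths(properties, users_key="omero.fs.importUsers"):
    """Return the names of properties whose list length differs from the user list."""
    n_users = len(split_list(properties[users_key]))
    return [
        name
        for name, value in properties.items()
        if len(split_list(value)) != n_users
    ]


# The two properties from the example above: two users, two directories.
properties = {
    "omero.fs.importUsers": "foo;bar",
    "omero.fs.watchDir": "/srv/home/foo/Dropbox;/srv/home/bar/Dropbox",
}
mismatched = check_lengths(properties)
```

An empty `mismatched` list means every property has one entry per user; any name it contains points at a list that is too short or too long.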

This could clearly be improved by allowing a single default value to be used for each user, for those variables that do not need to be set per-user. Also, in your use case, the directories for each user are similar enough to each other that a templated configuration would be more useful. I have created tickets for these two improvements, see [1] and [2] below. If you wish to be cc'ed on either of these tickets to see their progress then let me know.

If you have any other questions or suggestions please let us know.

Regards,

Colin
[1] http://trac.openmicroscopy.org.uk/ome/ticket/12038
[2] http://trac.openmicroscopy.org.uk/ome/ticket/12039

Re: Make Dropbox watch a user's subfolder

PostPosted: Thu Mar 06, 2014 12:00 pm
by Sethur
Hi Colin,

thanks for the reply. I should have stated that I had already read the available documentation for the OMERO Dropbox facility, so I knew that there was a way to set it up for each individual user by using semicolon-separated lists.

This is - in our case - not a good option, though, as you already pointed out. We frequently add new users to our LDAP directory, each of which would require a stop and reload of the server accompanied by the respective change of the configuration.

I would have expected that you could do something like:

<property name="omero.fs.importUsers" value="default"/>
<property name="omero.fs.watchDir" value="/srv/home/%user%/Dropbox"/>

but this is apparently not yet possible. So, yes, please include me in the ticket. I would also probably be able to implement this myself, if you point me in the right direction regarding the location in the source (maybe I could then make this available as a pull request).

Regards,

Tristan

Edit: Just saw that the second ticket uses exactly my suggestion from above. I think this would be a good solution.
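
Until templated values are supported, the template proposed in this post can be expanded outside OMERO into the semicolon-separated form that DropBox already accepts. This is only a sketch under that assumption: `%user%` is the placeholder suggested above rather than an existing OMERO feature, and `expand_watch_dirs` is a hypothetical helper.

```python
def expand_watch_dirs(template, users, placeholder="%user%"):
    """Expand a per-user path template into a semicolon-separated list."""
    return ";".join(template.replace(placeholder, user) for user in users)


users = ["foo", "bar"]
import_users = ";".join(users)
watch_dirs = expand_watch_dirs("/srv/home/%user%/Dropbox", users)
```

The two resulting strings have one entry per user, in the same order, which is what the list-based configuration requires.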

Re: Make Dropbox watch a user's subfolder

PostPosted: Thu Mar 06, 2014 12:11 pm
by Sethur
PS: Another thing, which I think is not yet possible with Dropbox, but would be very helpful if implemented, is the option of having local files deleted once they have been successfully uploaded to the OMERO server. This greatly reduces data redundancy and lets the user know when he can expect his files to show up in OMERO. Before the file is deleted, it would be easy to implement a checksum check to prevent deletion of files that have been corrupted during upload.

If you think deleting files is a bad idea, you could still implement a way to tag files that have already been uploaded. There are several ways to do this, e.g. creating a text file with a list of already uploaded files and checksums, renaming the file to something containing _UPLOADED_TO_OMERO_ON_DATE_, creating a file with upload information for every uploaded file, etc. When using this approach, sys admins can write scripts themselves that take care of uploaded files to reduce data redundancy and free up space.
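
The first tagging option, a text file listing uploaded files and their checksums, might look like the sketch below. It is a stand-alone illustration and not how DropBox works: the manifest format, the choice of SHA-1, and the function names are all assumptions.

```python
import hashlib
import os
import tempfile


def sha1_of(path, chunk_size=1 << 20):
    """Compute the SHA-1 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_upload(manifest_path, file_path):
    """Append 'checksum  path' for an uploaded file to the manifest."""
    checksum = sha1_of(file_path)
    with open(manifest_path, "a") as manifest:
        manifest.write("{}  {}\n".format(checksum, file_path))
    return checksum


def is_recorded(manifest_path, file_path):
    """True if the file's current checksum and path are listed in the manifest."""
    if not os.path.exists(manifest_path):
        return False
    expected = "{}  {}".format(sha1_of(file_path), file_path)
    with open(manifest_path) as manifest:
        return any(line.rstrip("\n") == expected for line in manifest)


# Demonstration in a temporary directory with a stand-in image file.
workdir = tempfile.mkdtemp()
image = os.path.join(workdir, "image.tif")
with open(image, "wb") as handle:
    handle.write(b"pixel data")
manifest = os.path.join(workdir, "uploaded.txt")

before = is_recorded(manifest, image)
record_upload(manifest, image)
after = is_recorded(manifest, image)
```

A cleanup script could then delete only those files for which `is_recorded` returns true, so a file that changed after upload would be left alone.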

Re: Make Dropbox watch a user's subfolder

PostPosted: Fri Mar 07, 2014 1:16 pm
by cblackburn
Hi Tristan,

Thanks for your replies.

I have cc'ed you on ticket 12039. However, I'm not sure this is as simple a job as it first looked to me, as it also needs to address new users in some reasonable way. The current configuration settings are handled in:

https://github.com/openmicroscopy/openmicroscopy/blob/develop/components/tools/OmeroFS/fsDropBox.py

One other possible solution worth pursuing might be using symlinked or mounted directories under the DropBox folder. However, the effectiveness and configuration of this would depend on the OS your server is deployed on as each file monitoring service is a little different in its capabilities. Which OS are you using? Note that we have not tested either of these solutions.

Another might be to use a (semi-) automated method of setting the DropBox config by pulling information from LDAP and pushing it to
Code:
bin/omero config set
Though this would be simplified by ticket 12038 being resolved.
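
As a rough sketch of the semi-automated approach, the snippet below turns a dictionary of property values into `bin/omero config set` argument lists. The LDAP lookup is deliberately left out: the `users` list stands in for whatever such a query would return, and `config_set_commands` is a hypothetical helper, not an OMERO function.

```python
def config_set_commands(properties, omero="bin/omero"):
    """Build one 'omero config set KEY VALUE' argument list per property."""
    return [
        [omero, "config", "set", key, value]
        for key, value in sorted(properties.items())
    ]


# Stand-in for the user names an LDAP query would return (assumption).
users = ["foo", "bar"]
properties = {
    "omero.fs.importUsers": ";".join(users),
    "omero.fs.watchDir": ";".join(
        "/srv/home/{}/Dropbox".format(user) for user in users
    ),
}
commands = config_set_commands(properties)
```

Each entry of `commands` is an argument list that could be handed to `subprocess.check_call` on the server; the sketch only builds the lists and does not run anything.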

On this:
Sethur wrote:PS: Another thing, which I think is not yet possible with Dropbox, but would be very helpful if implemented, is the option of having local files deleted once they have been successfully uploaded to the OMERO server. This greatly reduces data redundancy and lets the user know when he can expect his files to show up in OMERO. Before the file is deleted, it would be easy to implement a checksum check to prevent deletion of files that have been corrupted during upload.

We have been working on something very much like this for OMERO 5. You can see the ticket detailing the work here:
http://trac.openmicroscopy.org.uk/ome/ticket/11573

We would welcome any comments on this.

Cheers,

Colin

Re: Make Dropbox watch a user's subfolder

PostPosted: Wed Mar 26, 2014 1:12 pm
by Sethur
Hi Colin,

thanks for your detailed answer. Our OMERO machine is running on Linux (Ubuntu 13.10). Regarding your proposed solutions:

a) I'm not sure how to achieve the proposed symlink solution without having a user specific "importUsers" property (i.e. something other than "default"). Having a user specific dropbox config that lists all users and their corresponding watched directories would not require symlinking anymore.

b) Doesn't changing the omero config require a restart of the server? If not, then pushing the config to omero at regular intervals would indeed be a solution, although not a very elegant one.

c) I had a look at the undocumented in-place import feature (thanks for the hint!), but this raises the question of how OMERO reacts when a user deletes an original file (inside his home directory) at the end of its usefulness. OMERO would probably not react by automatically purging the database entries for this file, and we would end up with a corrupted database as soon as users (who have full write permissions, of course) start to accidentally or purposely delete files.

Cheers,

Tristan

Re: Make Dropbox watch a user's subfolder

PostPosted: Thu Mar 27, 2014 2:49 pm
by cblackburn
Hi Tristan,

Sethur wrote:a) I'm not sure how to achieve the proposed symlink solution without having a user specific "importUsers" property (i.e. something other than "default"). Having a user specific dropbox config that lists all users and their corresponding watched directories would not require symlinking anymore.

I was thinking something along the lines of:
Code:
jrs-macbookpro-25031:~ cblackburn$ ls -l /Users/cblackburn/omero-dropbox
lrwxr-xr-x  1 cblackburn  staff  43 27 Mar 13:46 /Users/cblackburn/omero-dropbox -> /Users/cblackburn/var/omero51/DropBox/colin

i.e. each user has a symlink to their default DropBox location. Then the system only needs to watch the default locations. Of course this will depend on your system and on user administration.
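
Creating such links for many users could be scripted. The sketch below is untested against a real OMERO installation and makes several assumptions: the per-user folders live under a single DropBox root, the link is named `omero-dropbox` as in the listing above, and `link_dropbox` is a hypothetical helper. It runs in a temporary directory so that nothing on the real system is touched.

```python
import os
import tempfile


def link_dropbox(dropbox_root, home_root, user, link_name="omero-dropbox"):
    """Create <home_root>/<user>/<link_name> pointing at <dropbox_root>/<user>."""
    target = os.path.join(dropbox_root, user)
    link = os.path.join(home_root, user, link_name)
    os.makedirs(target, exist_ok=True)
    os.makedirs(os.path.dirname(link), exist_ok=True)
    # Leave an existing link alone so the function can be re-run safely.
    if not os.path.islink(link):
        os.symlink(target, link)
    return link


# Demonstration with stand-in directories for the DropBox root and home folders.
base = tempfile.mkdtemp()
link = link_dropbox(
    os.path.join(base, "DropBox"), os.path.join(base, "home"), "colin"
)
```

Whether files written through the link are picked up depends on the file monitoring service of the server's OS, as noted earlier in the thread.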

Sethur wrote:b) Doesn't changing the omero config require a restart of the server? If not, then pushing the config to omero at regular intervals would indeed be a solution, although not a very elegant one.


It isn't necessary to restart the whole OMERO server; you can restart the various Ice servers independently.
Code:
bin/omero admin ice server stop DropBox

should be sufficient as the default configuration is for DropBox to automatically restart if it is stopped. This should then pick up any changed DropBox configuration.

Sethur wrote:c) I had a look at the undocumented in-place import feature (thanks for the hint!), but this raises the question of how OMERO reacts when a user deletes an original file (inside his home directory) at the end of its usefulness. OMERO would probably not react by automatically purging the database entries for this file, and we would end up with a corrupted database as soon as users (who have full write permissions, of course) start to accidentally or purposely delete files.


The DropBox system was conceived and implemented before the current OMERO 5 import workflow and so to some extent we are only just starting to look at how we can take advantage of the new import methods that are becoming available.

A typical imagined workflow might be to use in-place import with hard links. This way the user deleting files from their own DropBox folder will not be deleting the imported file from the ManagedRepository. So a full delete would require the user to both delete their local file and also actively delete the imported image via OMERO, which would delete the ManagedRepository link.
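
The hard-link behaviour this workflow relies on can be seen in a few lines. The file names are made up and nothing here touches OMERO; the sketch only shows that removing one name of a hard-linked file leaves the data reachable through the other.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
user_copy = os.path.join(workdir, "dropbox_image.tif")   # stands in for the user's file
repo_copy = os.path.join(workdir, "managed_image.tif")   # stands in for the repository entry

with open(user_copy, "wb") as handle:
    handle.write(b"pixel data")

# Give the same file data a second name.
os.link(user_copy, repo_copy)
links_before = os.stat(repo_copy).st_nlink

# The user deletes their own name for the file.
os.remove(user_copy)
links_after = os.stat(repo_copy).st_nlink

with open(repo_copy, "rb") as handle:
    surviving = handle.read()
```

Note that `os.link` raises `OSError` when the two paths are on different file systems, so this only works if the watched folder and the repository share one.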

However, you raise an interesting idea. DropBox can monitor file deletes, although it doesn't do this by default at present. This would allow for some sort of "in-place" delete where the file being deleted could trigger an OMERO delete. This does raise some serious data security issues and so we have added it to our scoping document as something to look at and consider for a future release.

Cheers,

Colin

Re: Make Dropbox watch a user's subfolder

PostPosted: Wed Apr 02, 2014 2:43 pm
by Sethur
Hi Colin,

thanks for the detailed info! I was considering the symlink solution, but this introduces security issues with Samba since you need to enable wide links and this together with unix extensions allows users to break out of the provided shares.

However, I realized that you can make Samba provide a second set of user specific shares (that target user specific locations), so I will just provide a second share specifically for OMERO.dropbox now.

Regarding the Dropbox delete problem:

Hard links would unfortunately require the files to reside on the same physical partition as the main OMERO data folder. But even without having an in-place import, you could implement a delete option in several ways:

a) If the user deletes files, optionally ask him/her on the next OMERO.web or insight login if those files should be deleted in OMERO as well, then proceed with triggering an OMERO delete action or not.

b) Better: Just trigger the delete action without asking, but implement a versioned file system like the "real" Dropbox, so that users can easily undo mistakes. Administrators would then ideally have an option of controlling how long into the past deleted data is kept.

Option b would have the additional benefit of making backups much easier. One would just have to back up all changes to the data directory; the option of going back in time would already be included.

Cheers,

Tristan

Re: Make Dropbox watch a user's subfolder

PostPosted: Thu Apr 03, 2014 2:17 pm
by cblackburn
Hi Tristan,

Sethur wrote:thanks for the detailed info! I was considering the symlink solution, but this introduces security issues with Samba since you need to enable wide links and this together with unix extensions allows users to break out of the provided shares.

However, I realized that you can make Samba provide a second set of user specific shares (that target user specific locations), so I will just provide a second share specifically for OMERO.dropbox now.


It's great that you've found a solution that works for your site. As you can appreciate, in-place import comes with its own set of limitations depending on the network. We'd be interested to know how your solution works for you as you start to use it.

Sethur wrote:Regarding the Dropbox delete problem:

Hard links would unfortunately require the files to reside on the same physical partition as the main OMERO data folder. But even without having an in-place import, you could implement a delete option in several ways:

a) If the user deletes files, optionally ask him/her on the next OMERO.web or insight login if those files should be deleted in OMERO as well, then proceed with triggering an OMERO delete action or not.

b) Better: Just trigger the delete action without asking, but implement a versioned file system like the "real" Dropbox, so that users can easily undo mistakes. Administrators would then ideally have an option of controlling how long into the past deleted data is kept.

Option b would have the additional benefit of making backups much easier. One would just have to back up all changes to the data directory; the option of going back in time would already be included.


Thanks for the suggestions. I'll certainly add your comments to our scoping documents as something to consider as we move forward with increasing the functionality of delete and integrating it with DropBox and in-place import.

Cheers,

Colin