VM Implementation of OMERO

Posted: Thu Aug 07, 2014 2:47 pm
by brossetti
Hello All,

I am currently working to install OMERO at my home institution. I have been working with my IT department to develop an implementation of OMERO that works within our current infrastructure. I would like to share my plan in hopes of receiving feedback from the OMERO community concerning any potential issues or faults that I may have overlooked. Although I am not a sysadmin, I have been making a conscious effort to learn the skills needed for this project. With that in mind, please forgive any ambiguities in the details below.

Goal
Create an image storage server for data generated by 15-35 users. The system should support images generated by the following equipment: Zeiss LSM 780, Zeiss Observer.Z1 (Zen Blue), Zeiss Evo SEM, and JEOL TEM. The system should be geared towards two primary user groups -- one group generating time lapse z-stacks and the other group generating spectral images.

Proposed Solution
Run OMERO as a virtual machine (not the available virtual appliance) and use NAS for data storage and backup. One NAS will be used as the binary repository and an identical NAS will be used for data backup.

Hardware/Software
VM: VMware vSphere
OS: CentOS 6
NAS (Primary Storage): Synology RS2414RP+ with 12x 4TB WD-Red HDD in Raid 6 (48TB Total)
NAS (Backup): Synology RS2414RP+ with 12x 4TB WD-Red HDD in Raid 6 (48TB Total)
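As a quick sanity check on the capacity figures above: RAID 6 reserves the equivalent of two disks for parity, so 12x 4TB gives 48TB raw but less usable space. The following is a minimal sketch of that arithmetic only; it ignores filesystem overhead and the TB/TiB difference, and the variable names are illustrative.

```shell
#!/bin/sh
# Back-of-the-envelope RAID 6 capacity estimate (sketch; ignores filesystem overhead).
set -eu

DISKS=12      # drives per NAS, from the hardware list
DISK_TB=4     # capacity per drive in TB
PARITY=2      # RAID 6 uses two disks' worth of space for parity

RAW_TB=$((DISKS * DISK_TB))
USABLE_TB=$(((DISKS - PARITY) * DISK_TB))

echo "raw: ${RAW_TB} TB, usable under RAID 6: ${USABLE_TB} TB"
```

On these numbers each NAS would offer roughly 40TB of usable space rather than the full 48TB.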

The VM can be provisioned with up to 132GB RAM and 12 core 2.9GHz processors; however, we will only be given a small subset of these resources.

Questions
1. What resources should I allocate to the VM? The OMERO.server system requirements page gives several contradictory recommendations. For example, the OMERO.server recommendation is for 8GB RAM, but the page later gives the following suggestions:

You are probably going to hit a hard ceiling between 4 and 6GB for JVM size … I would surely doubt a large deployment using more than a few GBs of RAM…


… 16, 24 or 32GB of RAM would be ideal for your OMERO server. If you have a separate database server more than 16GB of RAM may not be of much benefit to you at all.


The same applies for the CPU recommendation. The 25-50 user recommendation is for a quad core, but the page later states:
Summary: Depending on hardware layout 2 x 4, 2 x 6 system core count should be more than enough.


I may be misinterpreting this information; however, I suspect other users will encounter similar confusion.

2. When creating the volume for the binary repository, should I create one large 48TB volume, or should I separate it into smaller volumes? The example given on the Server Binary Repository page identifies the volume as

$ bin/omero config set omero.data.dir /mnt/really_big_disk/OMERO


The name “really_big_disk” makes me think that I should partition into one 48TB volume.

3. (Question 2 Continued) Is it possible to split the binary repository over several volumes? To be honest, I am still confused about what is held in the binary repository and how it is accessed by OMERO.server. There is a section stating:

Your repository is not:
• the “database”
• the directory where your OMERO.server binaries are
• the directory where your OMERO.client (OMERO.insight, OMERO.editor or OMERO.importer) binaries are
• your PostgreSQL data directory


Unfortunately, there is no section that describes in layman’s terms what the repository IS.

Thank you all in advance for your help as I work through this installation.

Cheers,
Blair

Re: VM Implementation of OMERO

Posted: Fri Aug 08, 2014 10:26 am
by PaulVanSchayck
Dear Blair,

Using a NAS for storage is possible. I had trouble with CIFS/SMB, whereas NFS gave good performance. I kept the /OMERO directory local and symlinked only the Files, ManagedRepository, Pixels and Thumbnails directories to the NAS mount point. This is for performance reasons: the frequently accessed cache and lock files then stay on local disk. For me, Thumbnails is also small enough that I think I'll move it back to local storage, but I haven't done that yet. PostgreSQL is running locally.
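The symlink layout described above could be sketched roughly as follows. This is only an illustration, not Paul's actual setup: both directories are stand-ins created under a temporary directory so the script is safe to run anywhere, and on a real server they would be the local OMERO data directory and the NFS mount point.

```shell
#!/bin/sh
# Sketch: keep the OMERO data directory local, symlink the large directories to the NAS.
# All paths are hypothetical stand-ins under a temp directory; remove "$WORK" when done.
set -eu

WORK=$(mktemp -d)
OMERO_DIR="$WORK/OMERO"        # stand-in for the local /OMERO directory
NAS_MOUNT="$WORK/nas/omero"    # stand-in for the NFS mount point
mkdir -p "$OMERO_DIR" "$NAS_MOUNT"

# Only these four directories live on the NAS; everything else
# (cache files, lock files) stays on local disk.
for d in Files ManagedRepository Pixels Thumbnails; do
    mkdir -p "$NAS_MOUNT/$d"
    ln -s "$NAS_MOUNT/$d" "$OMERO_DIR/$d"
done

ls -l "$OMERO_DIR"
```

On a live system the server would presumably need to be stopped and any existing directory contents moved to the NAS before the links are created.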

Regarding using multiple mounts, splitting the ManagedRepository (those are original files) and Pixels (the processed pixel pyramids) is a good option. But I wouldn't do that unless necessary.

Regarding CPU power: are you planning to do more than storage, i.e. run image analysis or image processing scripts? If not, at least 4 cores will keep the user interfaces responsive while pixels are processed. More cores will reduce waiting times. Using virtualization is a good option anyway.

Kind regards,

Paul

Re: VM Implementation of OMERO

Posted: Fri Aug 08, 2014 11:18 am
by mtbc
Blair, thank you very much for your constructive comments: we will improve the documentation in time for the next release. I've had a chat with others here about your questions: here's a first run, and others will chime in as need be.

1. 8GB is a good starting point, especially as your virtualization environment should allow you to easily increase that if necessary, but more is always good. In particular, if the database server is running in the same virtual machine, it can be good to increase available memory enough for some of the database to be cached in RAM; many production databases are less than 10GB in total. With our 5.0.3 release, there is plenty of support for managing memory: see https://www.openmicroscopy.org/site/sup ... mance.html and give it a try.

2. I don't know of any problem with using one large 48TB volume. At that size, it certainly sounds external to your machine, so do bear the information on https://www.openmicroscopy.org/site/sup ... ote-shares in mind. (I am glad that Paul has some real-world advice on remote mounts.)

The binary repository is where much of your attention needs to be focused: ensuring that the OMERO server processes have good, fast access to the repository. The idea of using a "really_big_disk" in the prefix brings us to,

3. Yes, you can split the binary repository over several volumes. If you do use any formats that require pyramids to be built, those are located in /OMERO/Pixels/*_pyramid.

The managed repository, by default /OMERO/ManagedRepository/ but configurable via omero.managed.dir, principally stores the image files uploaded at import time, as well as files such as the logs of the import processes. Within it, were you to brush close to the 48TB someday, you could mount another volume and change the omero.fs.repo.path configuration (which controls the directories below /OMERO/ManagedRepository/) to create import paths within that new mount point, and new imports would go onto that disk.
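A rough sketch of what that change might look like follows. It is a dry run that only prints the command rather than executing it. The subdirectory name `volume2` is hypothetical, and the path template is an assumption for illustration, so check the server documentation for the template your release actually uses before changing anything.

```shell
#!/bin/sh
# Dry-run sketch: prints the configuration command instead of running it.
set -eu

# Hypothetical: a new volume mounted at /OMERO/ManagedRepository/volume2
NEW_SUBDIR="volume2"
# Assumed path template for illustration only; verify against your server's documentation.
TEMPLATE='%user%_%userId%/%year%-%month%/%day%/%time%'

plan() {
    # Prefix the template so that new imports land inside the new mount point.
    echo "bin/omero config set omero.fs.repo.path '$NEW_SUBDIR/$TEMPLATE'"
}

plan
```

Files already imported stay where they are; only the paths generated for new imports would change.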

For our largest server at Dundee, we instead use LVM to grow the storage underneath the server without any downtime.

Cheers,
Mark

Re: VM Implementation of OMERO

Posted: Fri Aug 08, 2014 5:49 pm
by brossetti
Paul and Mark,

Thank you both for your insightful comments and recommendations. It will take me a bit of time to parse through the details. I'm sure that I will be back with more questions soon :D

Thanks!
Blair

Re: VM Implementation of OMERO

Posted: Thu Aug 21, 2014 6:26 pm
by brossetti
Hello,

I've had a chance to look a bit deeper into some of the recommendations here. I am wondering if there is a way to use LVM with NFS mount points. LVM is an attractive abstraction that will prove useful when we outgrow our current NAS, but I have not found a way to make this scenario work. Perhaps there is an entirely different method that would still allow for increasing the volume size?

Thanks!
Blair

Re: VM Implementation of OMERO

Posted: Fri Aug 22, 2014 9:56 am
by kennethgillen
brossetti wrote: Hello,

Hi Blair.

brossetti wrote: Perhaps there is an entirely different method that would still allow for increasing the volume size?


I have to say, I've never mixed LVM or other technologies with NFS mounts. It may well be worth talking to your IT Department and Synology reseller to discuss the expansion options for increasing the storage from the NAS itself.

All the best,

Kenny

Re: VM Implementation of OMERO

Posted: Fri Aug 22, 2014 12:20 pm
by rleigh
LVM and NFS are essentially separate. LVM is used to manage storage, NFS to serve the storage to other hosts. So if you are using an NFS server, you can use LVM on the NFS server to manage the storage which it is serving, provided the operating system supports it and you have set it up. But on the NFS client, you can't use LVM with the remote NFS mounts; it's for local storage only. You can certainly use it to manage the local storage of the NFS client machine though.
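To make the server-side case concrete, here is a dry-run sketch of growing an LVM-backed export on a generic Linux NFS server. It only prints the commands. The device, volume group and logical volume names are hypothetical, and an appliance such as a Synology NAS would normally handle this through its own management interface instead.

```shell
#!/bin/sh
# Dry-run sketch: prints the LVM commands an NFS *server* admin might run to grow storage.
# Nothing is executed; all names below are hypothetical.
set -eu

NEW_DISK="/dev/sdX"   # hypothetical newly added disk
VG="vg_omero"         # hypothetical volume group backing the export
LV="lv_repo"          # hypothetical logical volume holding the repository

plan() {
    echo "pvcreate $NEW_DISK"                       # register the disk with LVM
    echo "vgextend $VG $NEW_DISK"                   # add it to the volume group
    echo "lvextend -r -l +100%FREE /dev/$VG/$LV"    # grow the LV and resize its filesystem
}

plan
```

The NFS clients, including the OMERO VM, simply see a larger export afterwards; no LVM configuration is involved on their side.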

Re: VM Implementation of OMERO

Posted: Fri Aug 22, 2014 2:03 pm
by brossetti
Thanks Kenny and Roger!

I did a bit of poking around and everything I found agreed with your statements. There is an option to group mount points using something like Gluster or Ceph, but I bet there would be a bit of a performance hit in those situations. For now, I should have more storage than I need. There are expansion options for our NAS, so that would likely be the first choice should we bump up against our storage limit.

Thanks!
Blair

Re: VM Implementation of OMERO

Posted: Wed May 13, 2015 7:51 am
by bulwynkl
If you have the spare NICs, setting up a network that is dedicated to disk traffic only might mitigate any contention issues, allow you to use jumbo frames, and may improve disk access performance over NFS.

maybe...
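A dry-run sketch of what a dedicated storage network with jumbo frames might involve on the OMERO VM is given below. It only prints the commands. The interface name, NAS address, export path and mount point are all hypothetical, and the switch and the NAS would need to be configured for the same MTU.

```shell
#!/bin/sh
# Dry-run sketch: prints commands for a dedicated NFS network with jumbo frames.
# Nothing is executed; interface, address and paths are hypothetical.
set -eu

IFACE="eth1"              # hypothetical NIC on the storage-only network
NAS_IP="10.0.0.2"         # hypothetical NAS address on that network
EXPORT="/volume1/omero"   # hypothetical NFS export on the NAS
MOUNT_POINT="/mnt/nas"    # hypothetical local mount point
MTU=9000
# Largest ICMP payload that fits in one frame: MTU minus 20-byte IPv4 and 8-byte ICMP headers.
PAYLOAD=$((MTU - 28))

plan() {
    echo "ip link set dev $IFACE mtu $MTU"
    # Ping with fragmentation prohibited to confirm jumbo frames pass end to end.
    echo "ping -M do -s $PAYLOAD -c 3 $NAS_IP"
    echo "mount -t nfs $NAS_IP:$EXPORT $MOUNT_POINT"
}

plan
```

If the ping step fails while a normal ping succeeds, some device on the path is probably still using the default MTU.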