
Scaling Omero

Posted: Mon Feb 18, 2013 10:17 pm
by swewa
Hi

We are planning to upgrade the OMERO setup at our university. We estimate 300-400 users in total, with 20-100 simultaneous users. We have quite a variety of file formats: lif, tiff, nd2, svs and so on. The average file size is 80-200 MB, and we sometimes have files as large as 5 GB.

Our current setup is as follows:
OMERO 4.4.4 with Ice 3.3 running on a Red Hat virtual machine (single node, 2 x 2.2 GHz vCPU, 8 GB memory)
PostgreSQL 9.0.3 on a database server (separate box)
NFS-mounted drive for data

The server is not heavily used yet, but we expect usage to grow in the next few months.
We are seeking your expertise on scaling our OMERO setup in the following areas:
1. recommended hardware spec (cpu, memory, disk space)
2. clustering solution
3. load balancing / failover
4. Will OMERO benefit from a GPU if the server has one?
5. Can we manage the load better if we cluster multiple omero servers (vm)?
6. documentation for clustering solution, load balancing
7. documentation for performance tuning of omero

If you require more information, please let me know.
Thanks in advance.

regards,

Re: Scaling Omero

Posted: Tue Feb 19, 2013 10:10 pm
by jmoore
swewa wrote: Hi


Hi,

1. recommended hardware spec (cpu, memory, disk space)


This is a very broad question. Can you give us a general outline? Have you read through https://www.openmicroscopy.org/site/support/omero4/sysadmins/system-requirements.html ?

2. clustering solution


The primary server cannot currently make much use of clustering. Did you have something particular in mind?

3. load balancing / failover


Using a secondary blitz server can definitely be useful for failover or a hot-swap (deploying a new version with no down time). The primary constraint is really the memory that the secondary JVM will use.

4. Will OMERO benefit from a GPU if the server has one?


No, I'm sorry, it won't. The current Java libraries we use make no use of the GPU.

5. Can we manage the load better if we cluster multiple omero servers (vm)?


There are two bottlenecks to using multiple OMERO instances:
  • the database, since each Blitz server will need to connect independently
  • the filesystem, since each Blitz server will need to access the same shared ${omero.data.dir} location
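
In practice, this means every node is configured against the same database and the same binary repository. A minimal sketch of what that could look like; the host name, database name, user and mount path below are placeholders, not values from this thread, and should be replaced with your own:

```shell
# Run on EACH Blitz node, from the OMERO install directory.
# All values are placeholders (assumptions), not taken from this thread.
bin/omero config set omero.db.host db.example.org   # the one shared PostgreSQL server
bin/omero config set omero.db.name omero
bin/omero config set omero.db.user omero
bin/omero config set omero.data.dir /mnt/omero-data # same shared mount on every node

# Print the node's configuration to confirm the settings.
bin/omero config get
```

Since both settings point at a single shared resource, adding nodes adds load on the database and the shared filesystem rather than spreading it out.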

6. documentation for clustering solution, load balancing
7. documentation for performance tuning of omero


The extent of the documentation is linked to from this FAQ: namely Server/Clustering.

NB: Please be sure to see the warnings about NFS under OMERO.server binary repository as well as on the forums.

Cheers,
~Josh.

Re: Scaling Omero

Posted: Wed Feb 20, 2013 5:47 am
by swewa
Hi Josh

Thanks for the information. I had a look at the hardware spec.

1. The general outline is that we expect 20-100 simultaneous users, so I guess we can use a spec similar to or better than the recommended one listed here https://www.openmicroscopy.org/site/sup ... ments.html

I tested with one SVS file and noticed that generating thumbnails took a while to complete. The logs mentioned pyramid creation for the pixels (rendering?). During that time, the CPU load was 70-90% on one core. I have not seen this issue with other file formats (yet). I will do more tests and observe.

6. For clustering, how many servers can we use? Is it an active-active or an active-passive setup? Do we need a hardware load balancer in front if we want an active-active environment?
To configure clustering, I suppose that

    bin/omero config set omero.cluster.redirector configRedirector

should be set on both servers, and

    bin/omero node backup start

should be run on the backup node? And are the rest of the configs the same for all servers?

I guess I will have to do some benchmark tests to get an idea of where the current bottlenecks are.

thanks and regards,

Re: Scaling Omero

Posted: Wed Feb 20, 2013 7:45 am
by jmoore
swewa wrote: Hi Josh


Hi.

I tested with one SVS file and noticed that generating thumbnails took a while to complete. The logs mentioned pyramid creation for the pixels (rendering?). During that time, the CPU load was 70-90% on one core. I have not seen this issue with other file formats (yet). I will do more tests and observe.


Yes, in OMERO 4.4, generating the pyramid file for SVS is extremely time consuming (JPEG2000 compression, etc). The next major release will include a feature (called "FS") which will make this step unnecessary.
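
To check whether pyramid generation is what is keeping a core busy, one option is to look for it in the server logs while watching per-core CPU load. A small sketch, assuming the logs live in var/log under the OMERO install directory; the helper name pyramid_logs is made up for illustration:

```shell
# List the log files that mention pyramid generation (case-insensitive).
# $1: log directory; defaults to var/log relative to the OMERO install (assumption).
pyramid_logs() {
  grep -il 'pyramid' "${1:-var/log}"/*.log
}

# Example (run from the OMERO install directory):
#   pyramid_logs
# In another terminal, `top` (then press 1) shows the load per CPU core.
```

If the busy period lines up with pyramid messages in the logs, the slow thumbnails are most likely waiting on that step rather than on the rendering itself.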

6. For clustering, how many servers can we use? Is it an active-active or an active-passive setup? Do we need a hardware load balancer in front if we want an active-active environment?


Assuming the servers share omero.data.dir as I mentioned, they can be active-active, but any sessions which are running on one will be lost if that server goes down. (Think "sticky sessions")

To configure clustering, I supposed...


There is to my knowledge no one using a clustered configuration externally. So 1) it may be a bit bumpy getting started but 2) I'm looking forward to your feedback!

I guess I will have to do some benchmark tests to get an idea of where the current bottlenecks are.


Agreed, and it's quite possible that clustering the OMERO server itself will not be the most immediate win. Will most of the operations be READ? Will most of it be coming from the web? If so, we've found that balancing (and caching!) at the web layer is a good place to start.
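
One quick way to see whether caching at the web layer is paying off is to time repeated requests for the same page. A rough sketch using curl; the function name time_requests and the example URL are placeholders, not part of OMERO:

```shell
# Print the total time in seconds for each of N sequential GET requests.
# $1: URL to fetch; $2: number of requests (default 10).
time_requests() {
  url="$1"
  count="${2:-10}"
  i=0
  while [ "$i" -lt "$count" ]; do
    # -s: quiet, -o /dev/null: discard the body, -w: print only the timing.
    curl -s -o /dev/null -w '%{time_total}\n' "$url"
    i=$((i + 1))
  done
}

# Example (placeholder URL, substitute a page from your own OMERO.web):
#   time_requests "https://omero.example.org/webclient/" 10
```

If the later requests are not noticeably faster than the first, the cache is probably not being hit for that page.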

But definitely, your mileage will vary.
~Josh