Open Microscopy Environment

Posted: **Mon May 13, 2019 7:54 am**

Dear forum readers,
I have a problem importing Perkin Elmer Operetta data with the command-line in-place import, using a bulk import driven by a YAML file.
My YAML file looks like this:

continue: "true"
transfer: "ln_s"
checksum_algorithm: "File-Size-64"
logprefix: "logs/"
output: "yaml"
path: "/OMERO/ManagedRepository/ipimp-54474.tsv"
columns:
- target
- path

My approach was to generate a file list from the exported Operetta images folder, which contains the images as well as the metadata in the Index.idx.xml file.
My file list contains 3360 images and the Index.idx.xml file.
My screen consisted of 10 wells with 16 fields each, 7 z-slices and 3 colors, so 3360 image files is the correct number (10 × 16 × 7 × 3), and the imported images are displayed correctly in the OMERO web GUI.
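
The expected file count follows directly from the plate layout. A minimal sketch of that check, using only the numbers quoted above (the function name is illustrative, and one index file per plate is assumed):

```python
def expected_file_count(wells, fields, z_slices, channels, index_files=1):
    """Return (image_files, total_files) for one exported plate."""
    # One image file per well, field, z-slice and channel.
    image_files = wells * fields * z_slices * channels
    # The export additionally contains the metadata index (Index.idx.xml).
    return image_files, image_files + index_files


images, total = expected_file_count(wells=10, fields=16, z_slices=7, channels=3)
print(images, total)  # 3360 3361
```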
When I start the import using:

Code:
/home/omero/OMERO.server/bin/omero import --bulk /OMERO/ManagedRepository/bulki-54474.yml --skip upgrade

OMERO starts to import my images correctly. I get a dataset that contains the images, as well as plate layouts of the imported images / plate.
What is strange is that the importer does not stop once the dataset is imported, but imports the files repeatedly. I manually canceled the import after 4 days. By then I had 22080 files imported in my dataset folder, and the plate had been imported 138 times (names run from 1 to 138).
Here is an image from my omero web interface:
[Image: Screen Shot 2019-05-13 at 09.46.06.png]
Inside my imported dataset I have entries called "Index.idx.xml [Well x, Field x]", each of which contains many duplicated images.
Am I doing something wrong? Do I just have to point the importer to the Index.idx.xml file instead of to both the metadata and the images? Can I check somehow how this import was generated?
Thanks for the help,
Alex

Posted: **Mon May 13, 2019 9:19 am**

Hi Alex,

This could be either an import issue or an issue with the file format, or rather with its reader.
To narrow it down a bit, could you post the content of the ipimp-54474.tsv file (or, better, upload it to http://qa.openmicroscopy.org.uk/qa/upload/ if it's a larger file)? The output of 'import -f' would also be interesting. The '-f' option is a kind of dry run: it won't kick off the import, but it will list in detail what would happen (which files would be imported, into how many plates, etc.).
If you could capture and send us the output of the following command, that'd be great:

Code:
omero import -f --bulk /OMERO/ManagedRepository/bulki-54474.yml

Kind Regards,
Dominik

Posted: **Mon May 13, 2019 10:22 am**

Hi Dominik,
thanks for the fast response. I uploaded both the ipimp-54474.tsv file and the console output for the command

Code:
omero import -f --bulk /OMERO/ManagedRepository/bulki-54474.yml

Please note that the console output is not complete; the last entries are repeated again and again.
Thanks
Alex

Posted: **Mon May 13, 2019 4:36 pm**

Hi Dominik,
just one additional piece of information. The last command has still not finished, and if I check the filesets with

Code:
bin/omero fs sets

I get the following list:

Code:
 #  | Id  | Prefix                           | Images | Files | Transfer
----+-----+----------------------------------+--------+-------+----------
 0  | 713 | AlexR_2/2019-05/13/09-02-40.066/ | 160    | 3046  | ln_s
 1  | 712 | AlexR_2/2019-05/13/08-24-01.442/ | 160    | 3046  | ln_s
 2  | 711 | AlexR_2/2019-05/13/07-45-20.546/ | 160    | 3046  | ln_s
 3  | 710 | AlexR_2/2019-05/13/07-06-41.235/ | 160    | 3046  | ln_s
 4  | 709 | AlexR_2/2019-05/13/06-28-04.421/ | 160    | 3046  | ln_s
 5  | 708 | AlexR_2/2019-05/13/05-49-26.910/ | 160    | 3046  | ln_s
 6  | 707 | AlexR_2/2019-05/13/05-10-42.893/ | 160    | 3046  | ln_s
 7  | 706 | AlexR_2/2019-05/13/04-32-05.787/ | 160    | 3046  | ln_s
 8  | 705 | AlexR_2/2019-05/13/03-52-49.194/ | 160    | 3046  | ln_s
 9  | 704 | AlexR_2/2019-05/13/03-13-55.541/ | 160    | 3046  | ln_s
 10 | 703 | AlexR_2/2019-05/13/02-35-08.014/ | 160    | 3046  | ln_s
 11 | 702 | AlexR_2/2019-05/13/01-56-28.709/ | 160    | 3046  | ln_s
 12 | 701 | AlexR_2/2019-05/13/01-17-38.558/ | 160    | 3046  | ln_s
 13 | 700 | AlexR_2/2019-05/13/00-38-49.877/ | 160    | 3046  | ln_s
 14 | 699 | AlexR_2/2019-05/12/23-59-58.226/ | 160    | 3046  | ln_s
 15 | 698 | AlexR_2/2019-05/12/23-21-17.378/ | 160    | 3046  | ln_s
 16 | 697 | AlexR_2/2019-05/12/22-42-37.295/ | 160    | 3046  | ln_s
 17 | 696 | AlexR_2/2019-05/12/22-03-46.220/ | 160    | 3046  | ln_s
 18 | 695 | AlexR_2/2019-05/12/21-25-04.230/ | 160    | 3046  | ln_s
 19 | 694 | AlexR_2/2019-05/12/20-46-15.442/ | 160    | 3046  | ln_s
 20 | 693 | AlexR_2/2019-05/12/20-07-28.382/ | 160    | 3046  | ln_s
 21 | 692 | AlexR_2/2019-05/12/19-28-33.157/ | 160    | 3046  | ln_s
 22 | 691 | AlexR_2/2019-05/12/18-49-43.550/ | 160    | 3046  | ln_s
 23 | 690 | AlexR_2/2019-05/12/18-10-58.112/ | 160    | 3046  | ln_s
 24 | 689 | AlexR_2/2019-05/12/17-32-16.646/ | 160    | 3046  | ln_s

I do not know if this is helpful.
I can also upload the corresponding log file, but it is 32 MB.
Best wishes
Alex

Posted: **Tue May 14, 2019 9:10 am**

Hi Alex,

I think the problem is that you list every image file separately in the tsv file. Each line is a separate import, so the whole plate gets imported once per line.
Only the first line is needed:

Code:
SAMHD1-SC35 /mnt/CCHL-User/Alex/03-Microscopy/2018/2018-08-09-Operetta_SamHD1_SC35_AGS/plate01_SAMHD1_SC35__2018-08-09T13_50_53-Measurement1/Images/Index.idx.xml

You can point to the 'Index.idx.xml' file or actually just to the 'Images' directory itself. The importer will figure out automatically which image files to import. And as you're importing plates, you should remove the "Dataset:name:" prefix from the target.
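
If the file list has already been generated, it can be reduced to the index entries instead of being rebuilt. A minimal sketch, assuming the tsv has exactly two tab-separated columns (target, path) as declared in the bulk YAML; the function name, the sample paths and the image file name are hypothetical:

```python
def keep_index_rows(tsv_text, index_name="Index.idx.xml"):
    """Keep only the rows whose path column points at the plate index file."""
    kept = []
    for line in tsv_text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        # Split into the two assumed columns: target <TAB> path.
        target, _, path = line.partition("\t")
        path = path.strip()
        if path.endswith("/" + index_name):
            kept.append(f"{target}\t{path}")
    return "\n".join(kept) + ("\n" if kept else "")


# Hypothetical two-line file list: one image file and the index file.
sample = (
    "SAMHD1-SC35\t/data/plate01/Images/r01c01f01p01-ch1sk1fk1fl1.tiff\n"
    "SAMHD1-SC35\t/data/plate01/Images/Index.idx.xml\n"
)
print(keep_index_rows(sample), end="")
```

With one plate per export folder this leaves a single line per plate, so each plate is imported once.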

You could look at an example from IDR:
The files at https://github.com/IDR/idr0037-vigilant ... er/screenA (idr0037-screenA-bulk.yml and idr0037-screenA-plates.tsv) show how http://idr.openmicroscopy.org/webclient ... creen-2051 was imported. Also see the option "exclude: "clientpath"" in the bulk.yml. Although it's commented out in this example, you could use it to prevent accidentally importing image files multiple times. With that option you'd get a warning if you try to import a single image file that has already been imported previously as part of a plate.
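
As an illustration of where that option would sit, the sketch below renders a bulk file that reuses the settings from the first post and adds the exclude line. This is only a sketch under stated assumptions: the function name and the tsv path "plates.tsv" are hypothetical, and the placement of the exclude key at the top level is assumed from the description above rather than checked against the IDR file.

```python
def render_bulk_yaml(tsv_path, exclude_clientpath=True):
    """Render a bulk-import YAML file as text (keys taken from this thread)."""
    lines = [
        'continue: "true"',
        'transfer: "ln_s"',
        'checksum_algorithm: "File-Size-64"',
        'logprefix: "logs/"',
        'output: "yaml"',
    ]
    if exclude_clientpath:
        # Assumed placement: a top-level key next to the other import options.
        lines.append('exclude: "clientpath"')
    lines += [f'path: "{tsv_path}"', "columns:", "- target", "- path"]
    return "\n".join(lines) + "\n"


print(render_bulk_yaml("plates.tsv"), end="")  # "plates.tsv" is a placeholder
```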

Kind Regards,
Dominik

Posted: **Wed May 15, 2019 8:48 am**

Dear Dominik,
thanks for the suggestion. This worked well. What I realised from my previous import is that the import of the full directory resulted in a large number of huge images. If I check with

Code:
bin/omero fs images

 #  | Image | Name                             | FS  | # Files | Size
----+-------+----------------------------------+-----+---------+--------
 0  | 22259 | Index.idx.xml [Well 8, Field 16] | 713 | 3046    | 5.4 GB
 1  | 22258 | Index.idx.xml [Well 8, Field 15] | 713 | 3046    | 5.4 GB
 2  | 22257 | Index.idx.xml [Well 8, Field 14] | 713 | 3046    | 5.4 GB
 3  | 22256 | Index.idx.xml [Well 8, Field 13] | 713 | 3046    | 5.4 GB
 4  | 22255 | Index.idx.xml [Well 8, Field 12] | 713 | 3046    | 5.4 GB
 5  | 22254 | Index.idx.xml [Well 8, Field 11] | 713 | 3046    | 5.4 GB
 6  | 22253 | Index.idx.xml [Well 8, Field 10] | 713 | 3046    | 5.4 GB
 7  | 22252 | Index.idx.xml [Well 8, Field 9]  | 713 | 3046    | 5.4 GB
 8  | 22251 | Index.idx.xml [Well 8, Field 8]  | 713 | 3046    | 5.4 GB
 9  | 22250 | Index.idx.xml [Well 8, Field 7]  | 713 | 3046    | 5.4 GB
 10 | 22249 | Index.idx.xml [Well 8, Field 6]  | 713 | 3046    | 5.4 GB

I see the images imported from the Index.idx.xml listed with a size of 5.4 GB each, although I would expect each image (7 z-planes and 3 colors per field, with a single frame being 1.5 MB) to be only approx. 32 MB.
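
The estimate can be written out as a small calculation. The sketch below uses only the numbers from this post and the listing above; the function name is illustrative, and 1 GB is taken as 1000 MB. It also prints the average size per file in the fileset. If that average comes out near the size of a single frame, it would hint that the Size column describes the whole fileset rather than one image, but that reading is a guess and is not confirmed anywhere in this thread.

```python
def expected_field_size_mb(z_planes, channels, frame_mb):
    """Expected size of one field: one frame per z-plane and channel."""
    return z_planes * channels * frame_mb


expected_mb = expected_field_size_mb(z_planes=7, channels=3, frame_mb=1.5)
reported_mb = 5.4 * 1000   # "5.4 GB" from the Size column, assuming 1 GB = 1000 MB
files_in_fileset = 3046    # "# Files" column of the same listing

print(expected_mb)                     # 31.5
print(reported_mb / files_in_fileset)  # average MB per file in the fileset
```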
Any idea how this was generated?
By the way, how can I delete the files from my OMERO database?
Thanks
Alex

Posted: **Wed May 15, 2019 9:56 am**

The easiest way would be to use the 'delete' command. For the various options see:

Code:
./omero delete --help

With respect to the image sizes: I have to check myself what the 'fs' command actually reports there. It might not be what you'd expect (or it might be a bug).
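
When many duplicate plates have to be removed, the command lines can be generated rather than typed. A minimal sketch that only builds the command strings and does not run anything; the plate IDs 101 and 102 are placeholders, the function name is illustrative, and the --dry-run and --report options are assumed to be available in your version (check the help output above before relying on them):

```python
import shlex


def build_delete_commands(object_type, ids, dry_run=True):
    """Build one 'omero delete' command line per object ID."""
    commands = []
    for obj_id in ids:
        args = ["omero", "delete", f"{object_type}:{obj_id}"]
        if dry_run:
            # Assumed options: report what would be deleted without deleting.
            args += ["--dry-run", "--report"]
        commands.append(shlex.join(args))
    return commands


# Placeholder IDs; replace them with the real plate IDs from the web client.
for cmd in build_delete_commands("Plate", [101, 102]):
    print(cmd)
```

Reviewing the dry-run output first, then repeating with dry_run=False, keeps an accidental deletion of the one good import less likely.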

Regards,
Dominik

Posted: **Wed May 15, 2019 11:23 am**

I've been playing around a bit with the 'fs' command and I'm not sure the 'fs images' command is the right tool to use if you want to check the import. The following might be more useful for checking the size of an imported plate, if you have the ID (copied from the web client, for example):

Code:
./omero fs usage --report --human-readable Plate:2052

To check which files have been imported for a particular plate:

Code:
./omero fs ls 8751

You need the Fileset ID for the previous command, which you can get by running the following on one of the images in the plate:

Code:
./omero obj get Image:47739

Kind Regards,
Dominik

Bulk import of Perkin Elmer Operetta data
