Is folder import into a screen a thing?

Posted: Tue Apr 02, 2019 7:57 am
by StephanJanosch
Hi All,

I am in charge of some research data management activities here at MPI-CBG.de. I was approached with 1 TB of screening images and thought of trying OMERO, with the goal of using IDR as a repository for that screen.

I started really simple:

Code:
TestScreen/
├── Plate_1
│   ├── AssayPlate_Aurora_3AUR_C03_T0001F001L01A01Z01C01.tif
│   ├── AssayPlate_Aurora_3AUR_C03_T0001F001L01A02Z01C02.tif
│   ├── AssayPlate_Aurora_3AUR_C03_T0001F002L01A01Z01C01.tif
│   └── AssayPlate_Aurora_3AUR_C03_T0001F002L01A02Z01C02.tif
└── Plate_2
    ├── AssayPlate_Aurora_3AUR_C03_T0001F003L01A01Z01C01.tif
    ├── AssayPlate_Aurora_3AUR_C03_T0001F003L01A02Z01C02.tif
    ├── AssayPlate_Aurora_3AUR_C03_T0001F004L01A01Z01C01.tif
    └── AssayPlate_Aurora_3AUR_C03_T0001F004L01A02Z01C02.tif


And I failed to upload this tiny example structure into a screen via OMERO.insight.

I documented the failure in this screencast: https://cloud.mpi-cbg.de/index.php/s/EARtsLKSduvLjnG

The images end up as orphaned images. They don't go into the selected screen!

Did I misunderstand something about screens?
Do I need to use the CLI in order to get images in plates into OMERO?
How about well positions?

Sorry for these naive questions, but the learning curve feels quite steep. I will watch some tutorial videos now: https://www.youtube.com/channel/UCyySB9 ... auQ/videos ;-)

Thanks,
Stephan

Re: Is folder import into a screen a thing?

Posted: Tue Apr 02, 2019 8:02 am
by StephanJanosch
Just some additional info:

Screens are often done in plates. 96-well plates or 384-well plates. Each well has a coordinate (e.g. B4) and can be imaged multiple times.

So you might end up with 12 images per well if you image a well with 2 channels at 6 positions. For a 96-well plate, this makes 1152 images per plate. :idea:
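
A quick sketch of that arithmetic in Python, assuming one image file per well, channel and position (the function name is only illustrative):

```python
def images_per_plate(wells: int, channels: int, positions: int) -> int:
    """Count image files when every well is imaged at every position
    in every channel, with one file per (well, channel, position)."""
    return wells * channels * positions


print(images_per_plate(1, channels=2, positions=6))    # 12 images for one well
print(images_per_plate(96, channels=2, positions=6))   # 1152 for a 96-well plate
print(images_per_plate(384, channels=2, positions=6))  # 4608 for a 384-well plate
```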

Re: Is folder import into a screen a thing?

Posted: Tue Apr 02, 2019 2:44 pm
by sbesson
Hi Stephan

StephanJanosch wrote:I am in charge of some research data management activities here at MPI-CBG.de. I was approached with 1 TB of screening images and thought of trying OMERO, with the goal of using IDR as a repository for that screen.


This certainly matches what IDR does routinely.

StephanJanosch wrote:The images end up as orphaned images. They don't go into the selected screen!


What this implies is that the dataset was not detected as High-Content Screening data. Instead each TIFF was detected and imported as a separate image and ended up in the Orphaned images.

StephanJanosch wrote:Did I misunderstand something about screens?


The primary requirement for having a fileset imported as HCS is that the underlying imaging data must be expressed in an HCS format, i.e. including the Screen/Plate/Well layout metadata in a form that can be read as such by Bio-Formats. Bio-Formats supports many proprietary HCS file formats generated by typical high-throughput instruments - see here for the table summarizing the expected structure.

It might be interesting to know more about the instrument which generated the original data you are using as a test.

StephanJanosch wrote:Do I need to use the CLI in order to get images in plates into OMERO?


The CLI should not be a requirement for importing HCS data. OMERO.insight has the ability to import such datasets. In the case of High-Content Screening, using the CLI might be advantageous for a few reasons:

  • it is possible to test whether a dataset will be detected as High-Content Screening using the following command:
    Code:
    bin/omero import -f /path/to/folder
    This will report the groups of filesets detected for import, as well as whether each of these filesets is of HCS/SPW type.
  • HCS datasets are typically distributed over a large number of files (several tens to hundreds of thousands). Using the in-place import functionality combined with the parallel upload option introduced in OMERO 5.4.8 can substantially reduce import times.

StephanJanosch wrote:How about well positions?


If the dataset is detected as HCS by Bio-Formats, the library should also be able to parse well metadata like positions from the file format when possible. Post import, this metadata will be available in the database.

StephanJanosch wrote:Screens are often done in plates. 96-well plates or 384-well plates. Each well has a coordinate (e.g. B4) and can be imaged multiple times.

So you might end up with 12 images per well if you image a well with 2 channels at 6 positions. For a 96-well plate, this makes 1152 images per plate.


This order of magnitude matches the upper end of the HCS datasets we have received in IDR so far, i.e. 384-well plates with up to 30 multidimensional well samples - see here for an example.

Best,
Sebastien

Re: Is folder import into a screen a thing?

Posted: Wed Apr 03, 2019 1:06 pm
by StephanJanosch
Thanks for your answer, Sebastien!

sbesson wrote:The primary requirement for having a fileset imported as HCS is that the underlying imaging data must be expressed in an HCS format, i.e. including the Screen/Plate/Well layout metadata in a form that can be read as such by Bio-Formats. Bio-Formats supports many proprietary HCS file formats generated by typical high-throughput instruments - see here for the table summarizing the expected structure.


I was aware that Bio-Formats was somehow closely related to OMERO, but it was new to me that the integration of the two is that tight.

sbesson wrote:It might be interesting to know more about the instrument which generated the original data you are using as a test.


I will learn about this tomorrow.

sbesson wrote:If the dataset is detected as HCS by Bio-Formats, the library should also be able to parse well metadata like positions from the file format when possible. Post import, this metadata will be available in the database.


I will play around with https://docs.openmicroscopy.org/bio-formats/5.8.2/users/comlinetools/mkfake.html to see what Bio-Formats expects.

Unfortunately our light microscopy facility has almost no experience with Bio-Formats itself, because in-house we use Fiji a lot, on a per-file basis.

I will report back as soon as I know more.

Thanks,
Stephan

Re: Is folder import into a screen a thing?

Posted: Thu Apr 04, 2019 11:03 am
by sbesson
Hi Stephan

StephanJanosch wrote:I was aware that Bio-Formats was somehow closely related to OMERO, but it was new to me that the integration of the two is that tight.


In terms of relationship, Bio-Formats is the library used by OMERO to read the metadata and pixel data from all original data expressed in proprietary file formats.
Although both libraries are released independently of each other, they are both maintained by the OME consortium and the development of new features makes heavy use of this integration.

StephanJanosch wrote:I will play around with https://docs.openmicroscopy.org/bio-for ... kfake.html to see what Bio-Formats expects.


The internal fake format is certainly a good way to test HCS import workflow with mock data. We also have some representative datasets from a few typical HCS formats available publicly.

StephanJanosch wrote:Unfortunately our light microscopy facility has almost no experience with Bio-Formats itself, because in-house we use Fiji a lot, on a per-file basis.


This might not be incompatible, since Fiji is an ImageJ distribution with a predefined set of plugins for life sciences, notably including Bio-Formats. In other words, if you are opening proprietary file formats via Fiji, it is very likely you have been using Bio-Formats under the hood.

Best,
Sebastien

Re: Is folder import into a screen a thing?

Posted: Thu Apr 11, 2019 9:23 am
by StephanJanosch
Hi, I am back with more information.

1) Microscope.

The images were taken with a Yokogawa Cell Voyager 7000
https://www.yokogawa.com/solutions/prod ... s/cv7000s/
Sadly this microscope is not in the lists.

It produces some XML-based metadata files, but I have not had time to check these out yet.

Maybe I can get a complete smaller dataset for you guys for "the collection". ;)

Here is some of the structure, without the TIFFs.
Code:
├── 006RB151017A-ARSE=Test_6thRun_20151020_173512
│   └── 006RB151014A-ARSE
│       ├── 1038407001_Aurora_3AURDontUse_00019299B.200.EB.ULB.wpp
│       ├── 141212_RNABP_Screen.mes
│       ├── 3AUR-Aurora.wpi
│       ├── ImageCorrectionResult.icr
│       ├── MeasurementData.mlf
│       ├── MeasurementDetail.mrf
│       ├── Thumbs.db
│       ├── back_crosstalk_parameter.xml
│       └── back_geometry_parameter.xml
├── 006RB151017A-ATP=Test_6thRun_20151020_211201
│   └── 006RB151014A-ATP
│       ├── 1038407001_Aurora_3AURDontUse_00019299B.200.EB.ULB.wpp
│       ├── 141212_RNABP_Screen.mes
│       ├── 3AUR-Aurora.wpi
│       ├── ImageCorrectionResult.icr
│       ├── MeasurementData.mlf
│       ├── MeasurementDetail.mrf
│       ├── back_crosstalk_parameter.xml
│       └── back_geometry_parameter.xml
├── 006RB151017A-HEAT=Test_6thRun_20151020_195943
│   └── 006RB151014A-HEAT
│       ├── 1038407001_Aurora_3AURDontUse_00019299B.200.EB.ULB.wpp
│       ├── 141212_RNABP_Screen.mes
│       ├── 3AUR-Aurora.wpi
│       ├── ImageCorrectionResult.icr
│       ├── MeasurementData.mlf
│       ├── MeasurementDetail.mrf
│       ├── back_crosstalk_parameter.xml
│       └── back_geometry_parameter.xml



2) mkfake - no image channels

I used mkfake to get an understanding of how an HCS file structure should look. But sadly there is no way of specifying the channel number. We have separate TIFF files per channel. Now I have a knowledge gap: how do I tell Bio-Formats that our file names follow a pattern?

I don't want to spread .pattern files everywhere (https://docs.openmicroscopy.org/bio-for ... ht=pattern)
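
To make the naming pattern concrete, here is a minimal Python sketch that splits file names like the ones in my first post into their parts and groups the per-channel files by well and field. This is plain Python, not anything Bio-Formats does. The regular expression is derived only from those example names, and reading the tokens as F = field and C = channel (and the well as row letter plus column number) is my assumption; the T, L, A and Z tokens are kept as raw numbers. `parse_name` and `group_by_field` are hypothetical helper names.

```python
import re
from collections import defaultdict

# Pattern derived from names such as
# AssayPlate_Aurora_3AUR_C03_T0001F001L01A01Z01C01.tif (assumed layout).
NAME_RE = re.compile(
    r"^(?P<prefix>.+)_(?P<well>[A-Z]\d{2})"
    r"_T(?P<T>\d+)F(?P<F>\d+)L(?P<L>\d+)A(?P<A>\d+)Z(?P<Z>\d+)C(?P<C>\d+)\.tif$"
)


def parse_name(filename):
    """Split one file name into prefix, well and the numeric tokens."""
    m = NAME_RE.match(filename)
    if m is None:
        raise ValueError(f"unexpected file name: {filename}")
    info = {key: int(m[key]) for key in "TFLAZC"}
    well = m["well"]
    info["prefix"] = m["prefix"]
    info["well"] = well
    info["row"] = ord(well[0]) - ord("A")  # zero-based row index, A -> 0
    info["column"] = int(well[1:]) - 1     # zero-based column index, 01 -> 0
    return info


def group_by_field(filenames):
    """Map (well, field) -> {channel: filename}."""
    groups = defaultdict(dict)
    for name in filenames:
        info = parse_name(name)
        groups[(info["well"], info["F"])][info["C"]] = name
    return dict(groups)


names = [
    "AssayPlate_Aurora_3AUR_C03_T0001F001L01A01Z01C01.tif",
    "AssayPlate_Aurora_3AUR_C03_T0001F001L01A02Z01C02.tif",
    "AssayPlate_Aurora_3AUR_C03_T0001F002L01A01Z01C01.tif",
    "AssayPlate_Aurora_3AUR_C03_T0001F002L01A02Z01C02.tif",
]
for key, channels in group_by_field(names).items():
    print(key, sorted(channels))
```

With the four Plate_1 names this yields two groups, one per field of well C03, each holding channels 1 and 2.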

3) metadata files

I also found: https://idr.openmicroscopy.org/about/screens.html

I guess we need to learn about these.

Is there maybe a tiny HCS example dataset around, where only 2 plates with 2-3 wells and 2 fields and only gene information per well and maybe cell count per well are shown?

I could really use a teaching example with no more than 20 files and a bare-minimum metadata file set (study file, library file with just gene IDs and result file with just cell count).

Still learning,
Stephan

Re: Is folder import into a screen a thing?

Posted: Thu Apr 11, 2019 10:15 am
by jmoore
Hi Stephan,

StephanJanosch wrote:The images were taken with a Yokogawa Cell Voyager 7000
https://www.yokogawa.com/solutions/prod ... s/cv7000s/
Sadly this microscope is not in the lists.


It's available in the 6.0 version of Bio-Formats: https://docs.openmicroscopy.org/bio-formats/6.0.1/formats/cv7000.html, which will be included in the upcoming OMERO 5.5. The hope is certainly that it will "just work".


StephanJanosch wrote:Maybe I can get a complete smaller dataset for you guys for "the collection". ;)


Always welcome!


StephanJanosch wrote:2) mkfake - no image channels

I used mkfake to get an understanding of how an HCS file structure should look. But sadly there is no way of specifying the channel number. We have separate TIFF files per channel. Now I have a knowledge gap: how do I tell Bio-Formats that our file names follow a pattern?


Good point. And I tried adding a .ini file as a workaround, but that didn't work either. I'll need to discuss with others the best way to make this happen.


StephanJanosch wrote:3) metadata files

Is there maybe a tiny HCS example dataset around, where only 2 plates with 2-3 wells and 2 fields and only gene information per well and maybe cell count per well are shown?


That's definitely possible. But since I'm coming a bit late to this discussion, you're looking for an OME-TIFF example with such a layout in which every TIFF is a separate channel?

Trying briefly, I'd do something like this:

Code:
/tmp/fake $ bfconvert -option ometiff.companion test.companion.ome "simple-plate&plates=2&plateAcqs=2&plateRows=2&plateCols=2&fields=2&sizeC=2.fake" test_c%c_s%s.ome.tiff
simple-plate&plates=2&plateAcqs=2&plateRows=2&plateCols=2&fields=2&sizeC=2.fake
FakeReader initializing simple-plate&plates=2&plateAcqs=2&plateRows=2&plateCols=2&fields=2&sizeC=2.fake
[Simulated data] -> test_c%c_s%s.ome.tiff [OME-TIFF]
   Series 0: converted 2/2 planes (100%)
   Series 1: converted 2/2 planes (100%)
   Series 2: converted 2/2 planes (100%)
   Series 3: converted 2/2 planes (100%)
...
[done]
5.121s elapsed (2.203125+57.453125ms per plane, 752ms overhead)

/tmp/fake $ ls -ltra *.tiff | head
-rw-r--r--  1 jamoore  wheel  267350 Apr 11 12:13 test_c0_s0.ome.tiff
-rw-r--r--  1 jamoore  wheel  267350 Apr 11 12:13 test_c1_s0.ome.tiff
-rw-r--r--  1 jamoore  wheel  267350 Apr 11 12:13 test_c0_s1.ome.tiff
-rw-r--r--  1 jamoore  wheel  267350 Apr 11 12:13 test_c1_s1.ome.tiff
-rw-r--r--  1 jamoore  wheel  267350 Apr 11 12:13 test_c0_s2.ome.tiff
-rw-r--r--  1 jamoore  wheel  267350 Apr 11 12:13 test_c1_s2.ome.tiff
-rw-r--r--  1 jamoore  wheel  267350 Apr 11 12:13 test_c0_s3.ome.tiff
-rw-r--r--  1 jamoore  wheel  267350 Apr 11 12:13 test_c1_s3.ome.tiff
-rw-r--r--  1 jamoore  wheel  267350 Apr 11 12:13 test_c0_s4.ome.tiff
-rw-r--r--  1 jamoore  wheel  267350 Apr 11 12:13 test_c1_s4.ome.tiff


But I imagine we can point you to something more realistic as well.
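
In the meantime, note that the long input name above is just a base name followed by `&key=value` pairs and the `.fake` suffix. A minimal Python sketch of building such a name and creating a placeholder file for it follows. `fake_name` is a hypothetical helper, not part of Bio-Formats, and the sketch assumes that an empty file with that name is enough for the fake reader.

```python
import tempfile
from pathlib import Path


def fake_name(basename, **params):
    """Join a base name and key=value pairs into a .fake file name."""
    parts = [basename] + [f"{key}={value}" for key, value in params.items()]
    return "&".join(parts) + ".fake"


name = fake_name(
    "simple-plate",
    plates=2, plateAcqs=2, plateRows=2, plateCols=2, fields=2, sizeC=2,
)
print(name)

# Create an empty placeholder file with that name in a scratch directory
# (assumption: the file content itself is not read).
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / name
    path.touch()
    print(path.exists())
```

The printed name is the same string that was passed to bfconvert above.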
~Josh

Re: Is folder import into a screen a thing?

Posted: Thu Apr 11, 2019 2:04 pm
by StephanJanosch
jmoore wrote:Hi Stephan,

It's available in the 6.0 version of Bio-Formats: https://docs.openmicroscopy.org/bio-formats/6.0.1/formats/cv7000.html, which will be included in the upcoming OMERO 5.5. The hope is certainly that it will "just work".

:shock: I need to check that out! Thanks for pointing me in the right direction. Somehow I was reading the 5.8 version.

jmoore wrote:Good point. And I tried adding a .ini file as a workaround, but that didn't work either. I'll need to discuss with others the best way to make this happen.

Thanks!
.ini-file? Something new for me to look up.

jmoore wrote:That's definitely possible. But since I'm coming a bit late to this discussion, you're looking for an OME-TIFF example with such a layout in which every TIFF is a separate channel?



This was not exactly what I was looking for (that command line 8-) :oops: ). I was looking for a generic HCS screen example, including these metadata text files, which fits on one screen. Something like this:

Code:
.
├── fakescreen.fake
│   ├── Plate000
│   │   ├── Run000
│   │   │   ├── WellA000
│   │   │   │   ├── Field000.fake
│   │   │   │   ├── Field001.fake
│   │   │   │   └── Field002.fake
│   │   │   └── WellA001
│   │   │       ├── Field000.fake
│   │   │       ├── Field001.fake
│   │   │       └── Field002.fake
│   │   └── Run001
│   │       ├── WellA000
│   │       │   ├── Field000.fake
│   │       │   ├── Field001.fake
│   │       │   └── Field002.fake
│   │       └── WellA001
│   │           ├── Field000.fake
│   │           ├── Field001.fake
│   │           └── Field002.fake
│   ├── Plate001
│   │   ├── Run000
│   │   │   ├── WellA000
│   │   │   │   ├── Field000.fake
│   │   │   │   ├── Field001.fake
│   │   │   │   └── Field002.fake
│   │   │   └── WellA001
│   │   │       ├── Field000.fake
│   │   │       ├── Field001.fake
│   │   │       └── Field002.fake
│   │   └── Run001
│   │       ├── WellA000
│   │       │   ├── Field000.fake
│   │       │   ├── Field001.fake
│   │       │   └── Field002.fake
│   │       └── WellA001
│   │           ├── Field000.fake
│   │           ├── Field001.fake
│   │           └── Field002.fake
│   ├── idr0000-screenA-library.txt
│   └── idr0000-screenA-processed.txt
└── idr0000-study_HCS.txt

I mixed and matched mkfake and the metadata template here. No clue if it would work like that.
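
For what it's worth, the image part of that tree can be recreated with a few lines of Python. This is only a sketch of the layout shown above, with empty placeholder files. It does not create the three .txt metadata files, and I have not checked that Bio-Formats or OMERO accept the result. `make_fake_screen` is a made-up helper name.

```python
import tempfile
from pathlib import Path


def make_fake_screen(root, plates=2, runs=2, wells=2, fields=3):
    """Create fakescreen.fake/PlateNNN/RunNNN/WellANNN/FieldNNN.fake
    as empty files and return the list of created field files."""
    screen = Path(root) / "fakescreen.fake"
    created = []
    for p in range(plates):
        for r in range(runs):
            for w in range(wells):
                well_dir = (
                    screen / f"Plate{p:03d}" / f"Run{r:03d}" / f"WellA{w:03d}"
                )
                well_dir.mkdir(parents=True, exist_ok=True)
                for f in range(fields):
                    path = well_dir / f"Field{f:03d}.fake"
                    path.touch()  # empty placeholder, no pixel data
                    created.append(path)
    return created


with tempfile.TemporaryDirectory() as tmp:
    files = make_fake_screen(tmp)
    print(len(files))  # 2 plates x 2 runs x 2 wells x 3 fields = 24
```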

Something where it gets really clear how the content of these .txt files relates to single-channel TIFFs per field. Now I have found the right analogy: the minimal "hello world" of a fully equipped HCS screen which Bio-Formats can understand, and which a human can take in at a glance as well.

Thanks for joining in Josh!

Re: Is folder import into a screen a thing?

Posted: Fri Apr 12, 2019 9:45 am
by jmoore
StephanJanosch wrote:.ini-file? Something new for me to look up.


See https://docs.openmicroscopy.org/bio-formats/6.0.1/developers/generating-test-images.html#generating-test-images

"...may be accompanied by an INI-style companion file. A companion file must use the same basename as the fake file and be suffixed with .ini"

~J.