EmbryoMiner: A new framework for interactive knowledge discovery in large-scale cell tracking data of developing embryos

State-of-the-art light-sheet and confocal microscopes allow recording of entire embryos in 3D and over time (3D+t) for many hours. Fluorescently labeled structures can be segmented and tracked automatically in these terabyte-scale 3D+t images, resulting in thousands of cell migration trajectories that provide detailed insights into large-scale tissue reorganization at the cellular level. Here we present EmbryoMiner, a new interactive open-source framework suitable for in-depth analyses and comparisons of entire embryos, including an extensive set of trajectory features. Starting at the whole-embryo level, the framework can be used to iteratively focus on a region of interest within the embryo, to investigate and test specific trajectory-based hypotheses and to extract quantitative features from the isolated trajectories. Thus, the new framework provides a valuable new way to quantitatively compare corresponding anatomical regions in different embryos that were manually selected based on biological prior knowledge. As a proof of concept, we analyzed 3D+t light-sheet microscopy images of zebrafish embryos, showcasing potential user applications of the new framework.

1. Download and install the EmbryoMiner toolbox from https://sourceforge.net/projects/scixminer/files/Extension%20packages/ by extracting the embryominer.zip archive to the application specials folder of your SciXMiner installation. This extension package contains all required components of the interactive knowledge discovery framework EmbryoMiner.
5. The *.batch file automatically loads the demo projects as well as the visualization windows. Batch files can be opened in a standard text editor if you are interested in understanding or changing the code.
6. Further information is provided in the readme files associated with each of the examples.
The application examples comprise an overview of the visualization possibilities, interactive selection capabilities, application of data mining methods to cell tracking data, track filtering as well as the import of tracking data generated by other tools.
The software was tested and compiled on Microsoft Windows 10 using MATLAB 2017b. Support for the other major operating systems is planned for the next release.
Notes: To get an impression of how to interact with SciXMiner and to get used to its workflow, have a look at the supplementary S1-S5 Videos, which mirror the application examples provided with the software.
A detailed description of all functions of the graphical user interface of EmbryoMiner is provided in the file scixminer tracking help.pdf that is part of the tracking toolbox.
We tested our framework with entire zebrafish embryos and found that it remains highly responsive up to at least 25,000 objects per frame. Thus, all examples presented in this paper could be performed on a standard desktop computer. We note that the responsiveness for larger data sets that include 3D volume rendering may decrease or may require a more powerful workstation with sufficient memory. As most of the analyses are constrained to a specific region of interest, however, spatiotemporal filtering of the data sets prior to the actual analysis can be used to reduce the number of tracks to a feasible range.
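The spatiotemporal pre-filtering mentioned above can be illustrated outside of SciXMiner. The following is a minimal sketch, not the filter implemented in EmbryoMiner: it assumes that tracks are available as lists of (frame, x, y, z) tuples, and the function name `filter_tracks` and its parameters are hypothetical.

```python
from typing import Dict, List, Tuple

# One detection: (frame, x, y, z). This layout is an illustrative assumption,
# not the internal SciXMiner project format.
Point = Tuple[int, float, float, float]
Track = List[Point]


def filter_tracks(
    tracks: Dict[int, Track],
    t_range: Tuple[int, int],
    x_range: Tuple[float, float],
    y_range: Tuple[float, float],
    z_range: Tuple[float, float],
) -> Dict[int, Track]:
    """Keep only tracks with at least one detection inside the region of interest."""

    def inside(point: Point) -> bool:
        t, x, y, z = point
        return (
            t_range[0] <= t <= t_range[1]
            and x_range[0] <= x <= x_range[1]
            and y_range[0] <= y <= y_range[1]
            and z_range[0] <= z <= z_range[1]
        )

    return {tid: pts for tid, pts in tracks.items() if any(inside(p) for p in pts)}


tracks = {
    1: [(0, 10.0, 10.0, 5.0), (1, 11.0, 10.5, 5.0)],    # inside the region
    2: [(0, 90.0, 80.0, 40.0), (1, 91.0, 81.0, 41.0)],  # outside in space
    3: [(5, 12.0, 9.0, 5.5)],                            # outside in time
}
roi = filter_tracks(tracks, (0, 2), (0.0, 50.0), (0.0, 50.0), (0.0, 20.0))
print(sorted(roi))  # [1]
```

Whether a track is kept when any detection or only when all detections fall inside the region is a design choice; the sketch uses the more permissive "any" criterion.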
The seed detection and segmentation presented in the main text was implemented in the open-source software tool XPIWIT [1,2], a platform-independent application that was implemented in C++ on the basis of the Insight Toolkit [3].
XPIWIT is applicable to large-scale 3D images, features a graphical user interface and XML pipelines for processing the data (download and installation instructions available from https://bitbucket.org/jstegmaier/xpiwit/downloads). Extracted segments or seed points can be imported and tracked with SciXMiner using the menu item EmbryoMiner → Import / Export / Convert → Import (XPIWIT) CSV Files.
The menu item EmbryoMiner → Import / Export / Convert → Import (XPIWIT) CSV Files can also be used as a generic importer for CSV files. Each time point is required to have a separate CSV file with a single detection per row.
The first row contains the specifiers for each of the columns. Columns are separated by a semicolon (";") and floating point values use a dot (".") as the decimal separator. The import script assumes that columns 3, 4, 5 contain the x, y, z locations of each object and are named "xpos", "ypos", "zpos". Empty rows should have the same number of columns, with all entries filled with the "NaN" specifier. If required, the provided locations will be automatically tracked using a nearest neighbor tracking algorithm and converted into a SciXMiner project (MATLAB *.mat files). The imported project will be saved one level above the folder that contained the CSV files.
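To make the CSV layout described above concrete, the following minimal sketch writes the detections of one time point in that form. Only the semicolon separator, the dot as decimal separator, the "NaN" filler and the names and positions of "xpos", "ypos", "zpos" are taken from the description; the names of columns 1 and 2 ("id", "size") and the helper functions are hypothetical placeholders.

```python
import csv
import io
import math
from typing import Sequence

# Columns 1 and 2 are placeholders (assumption); columns 3-5 must be
# named "xpos", "ypos", "zpos" as stated in the text.
HEADER = ["id", "size", "xpos", "ypos", "zpos"]


def format_value(value) -> str:
    """Render a value with '.' as decimal separator and 'NaN' for missing data."""
    if isinstance(value, float) and math.isnan(value):
        return "NaN"
    return repr(value) if isinstance(value, float) else str(value)


def write_frame(rows: Sequence[Sequence[float]], handle) -> None:
    """Write one time point: a header row, then one detection per row."""
    writer = csv.writer(handle, delimiter=";", lineterminator="\n")
    writer.writerow(HEADER)
    for row in rows:
        # Pad short or empty rows with NaN so every row has the same width.
        padded = list(row) + [math.nan] * (len(HEADER) - len(row))
        writer.writerow([format_value(v) for v in padded])


buffer = io.StringIO()
write_frame([(1, 120, 12.5, 40.25, 7.0), ()], buffer)
print(buffer.getvalue())
```

In practice each time point would be written to its own file inside a common folder rather than to an in-memory buffer; the file naming scheme expected by the importer is not specified here and is left open.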
SciXMiner projects can be exported to CSV (one CSV file per time point) via the menu command EmbryoMiner → Import / Export / Convert → Export Project as CSV Files.
To use the 3D volume viewer feature of the framework, image data sets have to be converted to the *.mha format first. We provide an importer script that automatically converts *.tif images to the appropriate format in the menu item EmbryoMiner → Import / Export / Convert → Convert 3D Tiffs to MHA for Volume Rendering. The script can also be used to generate maximum intensity projections along all major axes, if enabled in the settings dialog. Once a project is loaded and the VTK Visualization was started properly using EmbryoMiner → VTK Visualization → Start/Restart, 3D visualizations and 2D maximum intensity projections can be added via EmbryoMiner → VTK Visualization → Add ... Projection and by selecting all 3D volume files or the desired maximum intensity projection file. We tested the new 3D viewer with the largest available data sets of the Cell Tracking Challenge (1272 x 603 x 125 px, 8 bit, 50 time points) and the BioEmergences data sets (512 x 512 x 104 px, 8 bit, 360 time points) and scrolling in time was still possible and highly responsive even on a standard desktop workstation (Intel Core i7-6700 CPU @ 3.4GHz, 64GB memory, NVidia Quadro K620 GPU). For even larger images, however, we note that a workstation with a sufficiently powerful GPU might be required. Alternatively, strategies like down-sampling or cropping may be used to maintain the interactivity on less powerful computers.
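The maximum intensity projection and the down-sampling strategy mentioned above can be sketched in a few lines. This is an illustration only and not the conversion script shipped with EmbryoMiner: it assumes a small volume stored as nested lists indexed as volume[z][y][x], projects along z only, and uses the hypothetical helper names `max_projection_z` and `downsample`.

```python
from typing import List

Volume = List[List[List[int]]]  # assumed layout: volume[z][y][x]


def max_projection_z(volume: Volume) -> List[List[int]]:
    """Maximum intensity projection along z, returned as image[y][x]."""
    # zip(*volume) groups the rows sharing the same y across all z planes;
    # zip(*rows) then groups the voxels sharing the same x across z.
    return [[max(column) for column in zip(*rows)] for rows in zip(*volume)]


def downsample(volume: Volume, step: int) -> Volume:
    """Keep every step-th voxel along each axis (plain subsampling, no smoothing)."""
    return [[row[::step] for row in plane[::step]] for plane in volume[::step]]


volume = [
    [[1, 2, 3, 4], [5, 6, 7, 8]],  # z = 0
    [[9, 0, 1, 2], [3, 9, 5, 1]],  # z = 1
]
print(max_projection_z(volume))  # [[9, 2, 3, 4], [5, 9, 7, 8]]
print(downsample(volume, 2))     # [[[1, 3]]]
```

For real image stacks an array library would be the natural choice; plain subsampling as shown here can introduce aliasing, so smoothing before down-sampling may be preferable.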
The software features importers for the cell tracking algorithms TGMM [4], BioEmergences [5], TrackMate [6] and algorithms that produce output in the Cell Tracking Challenge format [7,8]. All third-party importers are available in the SciXMiner menu item EmbryoMiner → Import / Export / Convert → .... The procedure is self-explanatory and the window titles of the file open dialogs guide you through the import process.
If you have any problems or questions related to the trajectory visualization framework, please do not hesitate to write us an email (benjamin.schott@kit.edu or johannes.stegmaier@partner.kit.edu), ideally including a detailed description of how to reproduce the respective error.