A High-Content Assay Enables the Automated Screening and Identification of Small Molecules with Specific ALDH1A1-Inhibitory Activity

Aldehyde dehydrogenase enzymes (ALDHs) have a broad spectrum of biological activities through the oxidation of both endogenous and exogenous aldehydes. Increased expression of ALDH1A1 has been identified in a wide-range of human cancer stem cells and is associated with cancer relapse and poor prognosis, raising the potential of ALDH1A1 as a therapeutic target. To facilitate quantitative high-throughput screening (qHTS) campaigns for the discovery, characterization and structure-activity-relationship (SAR) studies of small molecule ALDH1A1 inhibitors with cellular activity, we show herein the miniaturization to 1536-well format and automation of a high-content cell-based ALDEFLUOR assay. We demonstrate the utility of this assay by generating dose-response curves on a comprehensive set of prior art inhibitors as well as hundreds of ALDH1A1 inhibitors synthesized in house. Finally, we established a screening paradigm using a pair of cell lines with low and high ALDH1A1 expression, respectively, to uncover novel cell-active ALDH1A1-specific inhibitors from a collection of over 1,000 small molecules.


Introduction
The superfamily of human Aldehyde dehydrogenase (ALDH) enzymes comprises 19 putative functional isozymes that catalyze the NAD(P)+-dependent oxidation of an aldehyde to its corresponding carboxylic acid [1,2]. ALDHs have a surprisingly broad spectrum of biological activities through the metabolism of both endogenous and exogenous aldehydes. For instance, they are involved in the biosynthesis and metabolism of the developmental regulator retinoic acid and the neurotransmitters GABA and dopamine, as well as in cellular homeostasis via the elimination of reactive aldehydes that arise as by-products of oxidative stress [3][4][5]. From a therapeutic point of view, ALDH activity is important in alcohol metabolism through aldehyde detoxification and to cancer drug resistance through the metabolism of chemotherapeutics a1111111111 a1111111111 a1111111111 a1111111111 a1111111111 such as cyclophosphamides [3,[6][7][8]. We focused our efforts on ALDH1A1, which in conjunction with two other cytosolic isozymes, ALDH1A2 and ALDH1A3, comprise the ALDH1A subfamily. Unbalanced ALDH1A1 activity has been linked to insulin resistance, obesity and inflammation [9][10][11][12]. Additionally, increased expression and activity of ALDH1A1 has been identified in a wide-range of human cancer stem cells and are associated with cancer relapse and poor prognosis [13,14]. Given the significant physiological and pathological roles of ALDH1A1, there has been an interest in the development of small molecule inhibitors, not only as chemical tools to better understand the biological role of this protein but also for potential clinical applications [15,16].
To date, most of the high-throughput technologies supporting the identification of small molecule modulators of ALDH1A1 activity constitute in vitro biochemical assays which, although robust and sensitive, do not study the enzyme in its native cellular state nor provide information of inhibitor's cell permeability and toxicity. The evident need for complementary cellular approaches was recently addressed by Ming et al., where the authors adapted the commercially available, low-throughput flow cytometry-based ALDEFLUOR assay into a medium-throughput (96-well) imaging-based assay to assess ALDH1A1 inhibitors in hepatocarcinoma cell lines [17]. While a valuable starting point, this assay format is still not suitable to assess the cellular activity of compound libraries of >100 molecules in dose-response typically required to support systematic and thorough medicinal chemistry efforts.
Here, we have optimized, fully automated, and miniaturized a 1,536-well high-content ALDEFLUOR assay suitable to support quantitative high-throughput screening (qHTS) for the discovery, characterization and profiling of ALDH1A1 small molecule inhibitors. We show robust and reproducible assay performance in 5 μL volume and demonstrate the utility of this assay by generating 11-and 16-point dose response curves on a comprehensive set of prior art inhibitors (Validation Set), as well as an in-house library of over 300 proprietary ALDH1A1 inhibitor analogs, in cell lines displaying different ALDH expression levels. Finally, we implemented a dual cell-based phenotypic screening paradigm to directly uncover novel and selective ALDH1A1 inhibitors with cellular activity from large compound collections, a process that bypasses the initial biochemical screen and subsequent counterscreens for target specificity.

Results
Miniaturization and optimization of a 1,536-well imaging-based ALDEFLUOR assay The ALDEFLUOR assay is used to identify and isolate living cells on the basis of ALDH activity. This assay takes advantage of the conversion of the fluorescent ALDH substrate BODIPY-aminoacetaldehyde (BAAA), which freely diffuses in and out of cells, into the negatively charged product BODIPY-aminoacetate (BAA), which is specifically retained inside cells thereby enhancing their fluorescence [18]. Although primarily dependent on ALDH1A1, the ALDEFLUOR assay reportedly detects activity from other subfamily members, namely ALDH1A2 and ALDH1A3, as well as mitochondrial ALDH2 [13,19,20]. 4-N,N-diethylaminobenzaldehyde (DEAB), a compound that inhibits multiple ALDH family members but displays the highest potency against 1A1, is frequently used as control for the assay [21]. To support our screening campaign for the identification of novel ALDH1A1 inhibitors for oncology research as well as other applications, we first sought to miniaturize and automate the previously established 96-well imaging-based ALDEFLUOR assay by Ming, et al. [17], to a 1,536-well format that could be implemented on an online robotic screening system to assess the cellular activity of thousands of ALDH1A1 inhibitors in a dose-response manner (qHTS).
High expression of ALDH1A1 has been reported in several tumor and cancer cell lines [13,15]. To establish our assay, we selected a pair of cell lines from three different human cancer types, each pair consisting of a cell line with high and low reported ALDH1A1 levels [22][23][24]. Specifically, we tested glioma (LN-18 and LN-229), pancreatic (MIA PaCa-2 and PANC-1) and colon cancer cell lines (HT-29 and SW480). We confirmed by Western blotting that both MIA PaCa-2 and HT-29 express high levels of ALDH1A1, while their counterparts PANC-1 and SW480, present undetectable ALDH1A1 protein levels ( Fig 1A). However, we were not able to detect significant ALDH1A1 expression in either glioma cell line, even though earlier reports indicated that LN-18 had high ALDH1A1 protein levels [22]. Our Western blot analysis also shows that both glioma cell lines express low levels of ALDH1A2; ALDH1A3 is mainly expressed in LN-229; and ALDH3A1 protein was not detected in any of the cell lysates ( Fig  1A). ALDH2 expression was detected in all cell lines, with the exception of PANC-1. However, we should bring into question the specificity of the ALDH2 antibody used in this study. Our antibody cross-reactivity analysis shown in S1 Fig indicates that this particular ALDH2 antibody can recognize multiple ALDH isozymes. Limited cross-reactivity was seen for the remaining antibodies.
To miniaturize the assay into a 1536-well format amenable to qHTS, we chose MIA PaCa-2 and HT-29 because of their high levels of ALDH1A1 (Fig 1A). Utilizing the 8-step protocol described by Ming et al. as a starting point [17], we developed a "semi-automated" protocol described in S1 Table. Briefly, we plated 1,000 cells in a volume of 5 μL/well into black-optical quality clear bottom 1,536-well plates, and allowed cells to attach overnight. Culture media was subsequently removed by inverting and centrifuging plates using a plate adaptor as previously described [25] and replaced with 5 μL of a solution of either 500 or 100 nM BAAA substrate in ALDEFLUOR buffer. To identify cells based on nuclear staining, we also included 0.5 nM Hoechst 33342 in the above buffer. Twenty-three nL of DEAB or DMSO vehicle (final assay concentration of DMSO 0.5%) was transferred via Wako Pintool. After incubating the plate for 30 minutes at 37˚C, 5% CO 2 , 85% RH, the remaining substrate was removed by centrifugation as above and replaced with 3 μL of ALDEFLUOR buffer. Fluorescence was subsequently imaged using the IN Cell Analyzer 2200 high-content widefield fluorescence based imager and the resulting images were quantified using the IN Cell Investigator v1.6.2 analysis software (see Experimental Procedures for analysis details). These initial experiments in 1,536-well format revealed that ALDH1A1 activity can be detected in cell lines expressing high levels of ALDH1A1 such as MIA PaCa-2 and HT-29, but not low-expressing lines such as LN-18 and PANC-1 cells (Fig 1B and S2A Fig). In addition, these experiments indicated that fluorescence intensity correlated with the substrate concentration used in the assay, since cells treated with 500 nM substrate display higher intensity in the FITC channel compared to those treated with 300, 100 and 50 nM substrate (S2B Fig).
To optimize the 1,536-well assay, we systematically tested assay signal-to-background (S:B) and robustness (Z' factor) for three main parameters: BAAA substrate concentration (50-500 nM), number of MIA PaCa-2 cells (500-2,000 cells/well) and DEAB inhibitor incubation time (30-120 minutes). The observed DEAB's IC 50 increased with increasing cell numbers and also with longer reaction times. In contrast, no clear correlation was observed between DEAB's IC50 and substrate concentration. Overall, the IC 50 obtained for DEAB ranged from 0.6 to 1.5 μM with an MSR [26] of 2.2 (Fig 2A-2C). The best S:B and Z' values were observed with higher number of cells/well and/or higher BAAA substrate concentrations (Fig 3). We found that inhibitor incubation time did not significantly influence assay performance, although incubation times of 30-60 minutes seem to be optimal for the cell line tested (Fig 3), and that it was crucial to remove background fluorescence due to remaining BAAA substrate in the supernatant before imaging (step 6 of semi-automated protocol in S1 Table). Although the IC 50 for DEAB was similar with and without step 6, we found that removing background fluorescence due to the remaining BAAA substrate before imaging improves assay signal window and robustness, even when low substrate concentrations are used (S3 Fig).

Validation set
To validate our optimized assay, we assembled a set of 20 previously reported ALDH inhibitors, with varying degrees of target specificity (Table 1 and S4 Fig) [15,16,27]. For instance, we included inhibitors like Bay-11-7085, Disulfiram and Aldi-2 because of their activity towards multiple ALDH isoforms, and inhibitors like NCT-501 and A37 with reported selectivity towards ALDH1A1. We also mined the patent literature to find three previously unpublished ALDH1A1 inhibitors, UM 673A, UM 673B, and Compound 5. In addition, we included two ALDH1A1 inhibitors (PubChem CID 725345 and 2929292) previously identified from our screening efforts (PubChem AID 1030). Furthermore, we incorporated Daidzin and CVT-10216, which preferentially inhibit ALDH2, and CB7 which specifically targets ALDH3A1, to ensure coverage of other ALDH isoforms. We initially set out to confirm the reported potency and specificity of this validated set of inhibitors using an in vitro fluorescence-based enzymatic assay [28]. Briefly, the assay measures the dehydrogenase activity of recombinant human ALDH1A1, ALDH1A2, ALDH1A3, ALDH2 or ALDH3A1 using NAD(P) + and either propionaldehyde or benzaldehyde as substrates. An orthogonal resorufin-based assay format was also run with isolated ALDH1A1 to rule out potential false positives due to intrinsic compound fluorescence [29]. Both assay formats yielded comparable IC 50 for each inhibitor (R 2 = 0.86; S5A Fig).
We found that these inhibitors displayed varying degrees of potency and specificity, which generally agreed with their previously reported activity (Table 1). However, for UM 673A, Pargyline, Daidzin, and CVT-10216, the IC 50 values obtained in our biochemical assay were 10-to 80-fold higher than their reported IC 50 . In addition, we observed an IC 50 15-to 60-fold lower than published for Aldi-2. We then tested the validation set in the 1,536-well ALDEFLUOR imaging-based assay using both MIA PaCa-2 and HT-29 cells. Eight inhibitors in this validation set (Kyneurine, Citral, Daidzin, DEAB, Disulfiram, Gossypol, Molinate and Pargyline) were recently tested by Marcato and colleagues for their ability to reduce ALDEFLUOR fluorescence in breast cancer cells that express ALDH1A3 [43]. The authors found that Citral and DEAB significantly reduced ALDEFLUOR fluorescence (determined by flow cytometry) at concentrations >10 μM. However, it is important to note that most of the inhibitors in the validation set have either not been previously tested or have no reported IC 50 value using the ALDEFLUOR cellular assay and thus we have no prior data set(s) to compare to. The assay was run according to the protocol described in S1 Table, with each compound tested as a 16-point dilution series, with concentrations ranging from 45.8 μM to 1.4 nM. Compounds displaying high quality concentration response curves and >50% efficacy, were considered active [44]. Under these conditions, 10 compounds exhibited inhibitory activity in both cell lines, with potencies varying from~0.8 to~28 μM. The most potent compounds were Compound 5, DEAB, and NCT-501. Importantly, compound potency was similar between cell lines (R 2 = 0.85, S5B Fig; Table 1). When comparing inhibitor potency between biochemical and cell based-assays we found that compound IC 50 is higher when inhibitors are tested in the cellular environment as opposed to in the isolated enzyme (Table 1). Only two compounds, Pargyline and Gossypol, were active in HT-29 but inactive in MIA PaCa-2 cells.
Of note is the compound Citral. In the work by Ming et al., the authors utilized Citral as one of the control compounds to validate and optimize different assay parameters such as number of cells/well and substrate removal steps [17]. Under these different conditions, the reported IC 50 is in the range of~7.6 to 60.1 μM [17]. In our 1,536-well assay conditions, the IC 50 for Citral is~27 μM in both MIA PaCa-2 and HT-29 cells, which agrees with their 96-well assay.
The remaining 8 compounds were inactive in both cell lines. Not surprisingly, compounds like Kynurenine and Molinate, which have low potency (>10 μM) against the purified enzyme also tested negative in our cell-based assays. Other compounds, like CID 2929292, with potent inhibitory activity against isolated ALDH1A1, tested negative in the cellular assay. These results could indicate poor compound cell permeability. Among compounds inactive in the cell-based assay, we also find ALDH3A1-specific inhibitors. Not only do the cell lines tested here not express detectable levels of ALDH3A1 protein (Fig 1A) but the ALDEFLUOR assay reportedly does not detect ALDH3A1 activity [45]. Correspondingly, CB7, a specific and potent inhibitor for ALDH3A1 developed by the Hurley group, tested negative in this cellular assay. Finally, we find that ALDH2-specific inhibitors Daidzin and CVT-10216, also tested negative in the cellular assay. However, as mentioned above, the potency of these two compounds in the in vitro assay was in the single and double-digit μM range, respectively, and did not match reported potencies (Table 1). In addition, the observed expression of ALDH2 in both MIA PaCa-2 and HT-29 cells is questionable due to lack of antibody specificity (S1 Fig). Overall, the results from the biochemical and cellular assays show that Compound 5 and NCT-501 are selective and potent against isolated ALDH1A1 as well as ALDH1A1-expressing cells. Our results identified Disulfiram, Bay 11-7085, Aldi-2, CID 725345, and DEAB as pan-ALDH inhibitors with potent biochemical activity and exhibiting cellular activity in MIA-PaCa2 and HT-29 cells.

Assay automation
In order to support large-scale quantitative screening, we modified the assay protocol to fit our automated robotic platform, containing robotic arms for transporting 1,536-well plates along with dispensers, washers, incubators and detectors. The main difference between the two protocols is the automated removal of media and substrate via aspiration (S1 Table) as opposed to inverting and centrifuging the plate applied in the non-automated screening [25]. Using MIA PaCa-2 cells and following the automated protocol-1 described in S1 Table, the assay yielded a S:B of 3.25±0.64 and Z' of 0.24±0.03. To determine if the assay statistics could be improved, we added an extra wash step before imaging to further decrease background signal (protocol described in automated protocol-2 inS1 Table). Although this extra step did not alter assay signal window (S:B of 3.5) it did improve assay robustness (Z' of 0.53±0.05). We then screened the validation set using the above fully automated protocol-2 with two washes and compared inhibitor activity with that obtained using the semi-automated format described in S1 Table. The IC 50 for control DEAB was similar in both the semi-and fully automated formats (Fig 4A; IC 50 range of 0.14 to 0.70 μM). Almost all compounds that showed no inhibitory activity in the semi-automated assay, also showed no activity in the fully automated format. The only discrepancy was Pargyline, which exhibited no activity in the semi-automated protocol but reduced BAA signal with an IC 50 of~5 μM in the automated protocol. Potencies for the remaining 11 compounds with BAA-reducing activity correlated well (R 2 = 0.78, p<0.0002) between protocol formats (Fig 4B).
To demonstrate the power of this automated cellular assay, we tested a collection of 379 ALDH1A1 inhibitors developed as part of an in-house campaign (manuscript in preparation). This collection contains inhibitors with potencies ranging from 13 nM to 14 μM activity in ALDH1A1 biochemical assays. We tested each compound as an 11-point, 1:3 dilution series (final concentration range 0.775 nM to 45.8 μM) against both MIA PaCa-2 and HT-29 cells. A total of 288 compounds demonstrated inhibitory activity in both cell lines and with IC 50 values ranging from 0.02 to 30 μM (high quality concentration response curves and >50% efficacy). Inhibitor IC 50 values correlated well in both cell lines with an R 2 = 0.57 and p<0.0001 ( Fig  4C). When comparing IC 50 values between ALDH1A1 enzymatic and cell-based assays, similar to the validation set, compound potency is right-shifted in the cell-based assay (Fig 4D and 4E, R 2 values of 0.31 and 0.33, respectively). The median potency difference between biochemical and MIA PaCa-2 and HT-29 assays were 38-and 17-fold, respectively. Of the remaining 91 compounds, 24 reduced BAA intensity in only one of the cell lines, and 67 compounds did not reduce BAA intensity in either of the cell lines tested. Importantly, these inactive compounds showed potency ranges of 0.07 to 34.6 μM in biochemical settings, indicating that our assay identifies analogs with good biochemical activity but poor cell permeability and/or intracellular target engagement. The 1,536-well high-content ALDEFLUOR assay detects ALDH1A1specific inhibitors We next sought to test if the fully automated high-content assay could be used to detect ALD-H1A1-specific inhibitors. To this end, we compared the activity of control DEAB and the ALDH1A1-specific inhibitors NCT-501 and Compound 5 in cells with high (MIAPaCa2 and HT-29) versus low (LN-229) ALDH1A1 protein levels. Importantly, LN-229 displays detectable levels of BAA fluorescence, likely due to high expression levels of ALDH1A3 and possibly ALDH2 (Fig 1A and S6A Fig).

A dual cell-based screen identifies ALDH1A1-specific inhibitors
The above findings prompted us to speculate that a dual cell-based screening paradigm using a pair of cell lines with high and low ALDH1A1 expression could be implemented as a phenotypic imaging-based screen to identify novel ALDH1A1 inhibitors from large compound collections. We reasoned that compounds that reduce BAA intensity exclusively in MIA PaCa-2 cells but are inactive in LN-229 cells, are ALDH1 inhibitors specific for the 1A1 isozyme. This paradigm would directly identify cell-permeable inhibitors, in a process that bypasses the initial biochemical screen and subsequent counterscreens for target specificity.
To validate our hypothesis, we screened a collection of bioactive molecules containing 1,279 unique compounds in dose response (7 concentration points ranging from 2.  Table (no extra wash step included). A total of 6 compounds reduced BAA activity in MIA PaCa-2 but not in LN-229. We were able to source 5 of the 6 compounds for retesting, at an 11-point dose response with a concentration range of 0.77 nM to 45.8 μM. Among these, we confirmed the MIA PaCa-2-specific activity of the thromboxane receptor antagonist L-670596 (PubChem CID 129360) with an IC 50 of~10 μM (Fig 6A and Table 2). The herbicide Propachlor (PubChem CID 4931), showed a >50-fold difference in potency between MIA PaCa2 and LN-229 cells (IC 50 of 0.3 vs. 16 μM, respectively; Fig 6B and Table 2). The remaining three compounds retested were either inactive or showed no potency difference between cell lines. In vitro enzymatic assays validated L-670596 as a specific ALDH1A1 inhibitor. Similarly, Proprachlor is a much more potent inhibitor of ALDH1A1 vs. the other isozymes tested ( Table 2). Neither Proprachlor nor L-670596 have previously been shown to possess anti-aldehyde hydrogenase activity.

Discussion
Here we developed and demonstrated a fully automated 1,536-well high-content ALDEFLUOR assay that can be scaled to a robotic screening platform to evaluate large compound libraries and in high-density dose-response (i.e. 7 or 11-points). The imaging time required per plate is 20 min. By fully automating the assay, where the only user intervention is supplying plated cells, one can run up to 72 plates (or~10,000 compounds at 11-point dose response) per day when only one plate reader is used. The high-throughput offered by our assay is especially advantageous when using low-abundant cells, such as patient-derived cells. With similar number of cells needed per plate format (9.6 x 10 5 vs. 1.5x10 6 for 96-and 1,536-well format, respectively), one can screen 17 times more compounds at 11-point dose range in the 1,536-well format compared to the earlier 96-well format assay [17]. In addition, our miniaturized format is cost-effective in terms of compound volumes and ALDEFLUOR reagent usage, since it allows one to screen 7 times more compounds with the same amount of reagent used in the 96-well format.
We implemented our high-content assay to quantitatively benchmark the potential of prior art compounds to reduce ALDH1A1 activity in a cellular context. To our knowledge, this is the first study in which a comprehensive set of ALDH inhibitors is tested in parallel in both biochemical and cell-based assays. Moreover, our assay provides the throughput necessary to support medicinal chemistry efforts during lead optimization processes. For some of the prior art compounds, we observed IC 50 values that were higher or lower than previously published results, which could be attributed to differences in assay format, buffer components, reaction progress, substrate concentration, and/or compound quality (i.e. purity).
In drug discovery programs, a frequent practice consists of first assaying compounds in a biochemical assay and then advancing selected molecules to biochemical counter-assays (to assess target specificity) as well as to a cellular format to identify cell-permeable structures. The dual cell-based screening paradigm presented here allows the direct identification of ALDH1A1-specific (cell type-specific) inhibitors with cellular permeability and potency. This set-up offers an attractive alternative approach to the discovery of ALDH1A1 inhibitors as it enables one to expedite the identification of cellular active lead compounds from large collections.
The assay presented herein represents an important advance for expediting ALDH1A1 drug discovery campaigns, complementing in vitro biochemical assays, providing the throughput necessary to support SAR efforts and also offering the alternative to be used as a primary phenotypic screen to discover ALDH1A1 inhibitors from large collections.

1,536-well enzymatic assays
Human ALDH1A1 and ALDH3A1 were expressed and purified as described elsewhere [46,47]. Human ALDH2 was purchased from Abcam (ab87415). Human ALDH1A2 and ALDH1A3 were purchased from MyBioSource (MBS1005929; San Diego, CA) and Thermo-Fisher (11636H07E50), respectively. The inhibitory activity of compounds against ALDHs was measured according to protocols described previously [28] To account for artifacts due to intrinsic compound fluorescence in the above spectrum, compounds were also tested in an orthogonal resorufin assay format with optics of excitation 525 nm/emission 598 nm as described before [29]. The same parameters as the NADH assay were used for the resorufin assay, with the addition of (final) 133 μg/mL diaphorase and 37 μM resazurin to the assay solution. Same controls were used in both formats.

1,536-well high-content imaging ALDEFLUOR assay
The ALDEFLUOR kit was purchased from STEMCELL Technologies (Vancouver, Canada; #01700). Cells (5 μL) were dispensed into black, optical quality (cyclic olefin copolymer) clear bottom, medium binding TC treated 1,536-well plates (Aurora Microplates, Whitefish MT) using a Multidrop Combi dispenser (ThermoFisher) and incubated overnight (37˚C, 5% CO 2 , 85% RH). Media was subsequently removed by centrifuging plates upside down using a plate adaptor to collect media. A solution of BAAA substrate (STEMCELL Technologies) and Hoechst 33342 (ThermoFisher, final concentrations of 500 nM and 0.5 nM, respectively) in ALDEFLUOR buffer (STEMCELL Technologies #01700) was dispensed onto cells using a Multidrop Combi followed by immediate transfer (23 nL) of compound or control solutions using a Wako Pin-tool (final percentage of DMSO in the cell plates was 0.5%). Unless otherwise noted, all compounds were assayed as 16-point dilutions spanning a final concentration range of 1.4 nM to 47.8 μM. The neutral and positive assay controls were DMSO and DEAB (4.6 μM), respectively. Cells were incubated for 30 minutes or the indicated amount of time at 37˚C, 5% CO 2 , 85% RH to allow the conversion of BAAA into BAA. Supernatant was subsequently removed by centrifugation as described above, then ALDEFLUOR buffer (3 μl) was dispensed by Multidrop Combi before imaging on an IN Cell 2200 (GE Healthcare). The above assay was modified for an online robotic screening system. After cell plating and overnight incubation, 4 μL of media were removed using a 64-tip metal aspirator head on a Wako aspirator station, leaving 1 μL remaining in the well, followed by a 4 μL dispense of BAAA and Hoechst 33342 in ALDEFLUOR buffer, for a final concentration of 500 nM and 0.5 nM, respectively. Immediately following the dispense, 23 nL of compound or control solutions were transferred using a Wako Pin-tool. Cells were incubated for 30 minutes at 37˚C, 5% CO 2 , 85% RH, followed by a 4 μL media removal using the Wako aspirator, and a subsequent 3 μL addition of ALDEFLUOR buffer. Plates were then immediately read on the IN Cell 2200 as described below.

Image acquisition and analysis
For images captured on the IN Cell 2200 widefield automated microscope, a 10x 0.45 NA Plan Apo objective lens was used to capture the entire well of the 1,536-well plate using standard DAPI (390/18x, 432/48m) and FITC (475/28x, 525/48m) filter sets at 50 msec and 100 msec exposures, respectively. Images were subsequently analyzed using IN Cell Investigator v1.6.2 analysis software's canned Multi-Target Analysis algorithm (GE Healthcare). Hoechst stained nuclei were identified using top hat segmentation with a minimum area of 75 μm 2 and sensitivity of 93. BAA-retaining cells captured via FITC channel, were identified using multiscale top hat segmentation with a minimum area of 100 μm 2 and a sensitivity setting of 16. Several data measures were collected and the most robust measure for ALDH activity was found to be the integrated intensity (Intensity x Area) of the FITC channel. Data were plotted using Graph-Pad Prism software (GraphPad, San Diego, CA), with sigmoidal dose-response (variable slope) fitting.

qHTS data analysis and statistics
Data from each assay were normalized plate-wise to corresponding intra-plate controls (neutral control DMSO and positive control DEAB 4.6 μM or otherwise noted) as described previously [48]. The same controls were also used for the calculation of the Z' factor for each assay. The Z' factor, a measure of assay quality control, was determined as previously described [49]. Percent activity was derived using in-house software [50]. Dose-response curves were classified as described previously [44]. All concentration-response curves were fitted as before and IC 50 were calculated with the GraphPad Prism software.

Acknowledgments
This work was supported by the intramural research program of the National Center for Advancing Translational Sciences. We thank Carleen Klumpp-Thomas for assistance with assay automation and the compound management group (Paul Shinn, Danielle Bougie, Crystal McKnight, Misha Itkin, and Zina Itkin) for sourcing, quality control, formatting, and plating all compounds.

Author Contributions
Conceptualization: AY AS NJM.