PS Poly: A chain tracing algorithm to determine persistence length and categorize complex polymers by shape

  • Elizabeth A. Conley,

    Roles Conceptualization, Formal analysis, Investigation, Methodology, Software, Validation, Writing – original draft, Writing – review & editing

    Affiliation Department of Physics and Astronomy, University of Missouri-Columbia, Columbia, Missouri United States of America

  • Creighton M. Lisowski,

    Roles Formal analysis, Methodology, Software, Validation, Writing – original draft, Writing – review & editing

    Affiliation Department of Physics and Astronomy, University of Missouri-Columbia, Columbia, Missouri United States of America

  • Katherine G. Schaefer,

    Roles Formal analysis, Investigation, Methodology, Validation, Writing – original draft, Writing – review & editing

    Affiliation Department of Physics and Astronomy, University of Missouri-Columbia, Columbia, Missouri United States of America

  • Harrison C. Davison,

    Roles Investigation, Methodology, Software, Validation, Writing – original draft, Writing – review & editing

    Affiliation Department of Physics and Astronomy, University of Missouri-Columbia, Columbia, Missouri United States of America

  • Julie E. Baguio,

    Roles Methodology, Software

    Affiliation Department of Physics and Astronomy, University of Missouri-Columbia, Columbia, Missouri United States of America

  • Ioan Kosztin,

    Roles Conceptualization, Investigation, Methodology, Software, Supervision, Validation, Writing – original draft, Writing – review & editing

    Affiliation Department of Physics and Astronomy, University of Missouri-Columbia, Columbia, Missouri United States of America

  • Gavin M. King

    Roles Conceptualization, Methodology, Writing – original draft, Writing – review & editing

    kinggm@missouri.edu

    Affiliations Department of Physics and Astronomy, University of Missouri-Columbia, Columbia, Missouri United States of America, Department of Biochemistry, University of Missouri-Columbia, Columbia, Missouri United States of America, Materials Science and Engineering Institute, University of Missouri-Columbia, Columbia, Missouri United States of America

Abstract

The fundamental molecules of life are polymers. Prominent examples include nucleic acids and proteins, both of which exhibit a large array of mechanical properties and three-dimensional shapes. The bending rigidity of individual polymers is quantified by the persistence length. The shape of a polymer, dictated by the topology of the polymer backbone (a line trace through the center of the polymer along the contour path), is also an important characteristic. Common biomolecular architectures include linear, cyclic (ring-like), and branched structures; combinations of these can also exist, as in complex polymer networks. Determining persistence length and shape is highly informative about polymer function and stability in biological environments. Here we demonstrate Persistence length Shape Polymer (PS Poly), a near-fully automated algorithm designed to extract key physical attributes from single-molecule images acquired in physiologically relevant fluid conditions via atomic force microscopy. The algorithm, which involves image reduction via skeletonization followed by end point and branch point detection, is capable of rapidly analyzing thousands of polymers with subpixel precision. Algorithm outputs were verified by analysis of deoxyribonucleic acid, a very well characterized macromolecule. The method was further demonstrated by application to candidalysin, a recently discovered and complex virulence factor from Candida albicans. Candidalysin forms polymers of highly variable shape and contour length and represents the first peptide toxin identified in a human fungal pathogen. PS Poly is a robust and general algorithm. It can be used to extract fundamental information about polymer backbone stiffness, architecture, and, more generally, polymerization mechanisms.

1. Introduction

Knowledge of cellular function and dysfunction (disease) has advanced through developing a detailed understanding of many semi-flexible polymeric molecules. A prime example is the recently discovered peptide toxin candidalysin (CL), which is the virulence factor secreted by the fungus C. albicans [1]. CL forms loops in solution which can then embed into membranes and form pores that damage host cells [2]. When establishing the molecular basis of a disease, such as invasive candidiasis which emanates from C. albicans and has a high mortality rate [3], characterizing polymer mechanical properties and topology (shape) provides significant insight. For example, bending rigidity, quantified by the persistence length, sheds light on the expected diameter of loops formed by CL. This geometric information can be used to predict what size molecules might be able to pass through the CL pores in host cell membranes during C. albicans infection. Additionally, when studying the kinetics of polymer loop formation (cyclization) or branching, which usually involves secondary polymerization interfaces [4,5], it is informative to separate and quantify polymers by architecture so that distinct reactions can be isolated. Such analyses can be used to construct kinetic models of looping and branching, to determine under what conditions polymer cyclization occurs, and to explore how conversion of linear polymers to looped polymers can be controlled.

Atomic Force Microscopy (AFM) is a powerful single molecule imaging technique employed in micro/nanoscale biophysical investigations and has been used to shed light on polymer persistence length, lp [6–9]. Analysis typically requires a stack of AFM image data containing many individual polymers. To perform lp and other calculations such as radius of gyration, it is necessary to extract the coordinates along the chain contour, or “backbone”, for each polymer in the analysis. Once these coordinates are obtained, a polymer physics model such as the worm-like chain (WLC) can be used to deduce the persistence length through calculating mean-square end-to-end distances or correlations of backbone tangent vectors [10].

Existing software for polymer detection and characterization in AFM image data suffers from limitations [11–21]. For example, popular tools including Easyworm [11], Skan [21], and AutoSmarTrace [19] do not simultaneously provide robust feature extraction, skeletonization, shape classification, and mechanical property calculations such as persistence length. While Easyworm is suitable for analysis of simple, unbranched polymer chains, it does not determine polymer architecture, requires significant manual input, and is only available to the MATLAB community, which limits its accessibility and adaptability for open-source workflows. AutoSmarTrace, also MATLAB-dependent, requires manual intervention for thresholding and feature identification, reducing reproducibility across diverse polymer morphologies. While Skan provides Python-based skeletonization and network analysis, its focus on general image processing necessitates extensive customization for polymer-specific applications, such as persistence length quantification. These limitations underscore the need for an open-source, flexible, modular, and easily extensible solution for the automated detection and analysis of polymer structures recorded through AFM imaging.

PS Poly, introduced here, attempts to fill these gaps and is well suited for studies of complex polymerization processes such as those underlying the host cell attack mechanism of Candida albicans [2,22,23]. PS Poly is designed for polymer backbone isolation with sub-pixel precision, automated persistence length and radius of gyration calculation, and architecture categorization (e.g., linear, looped, branched, branched-looped). A workflow of the algorithm is shown (Fig 1). The program is open-source with code written in both Python and Igor Pro 7 (WaveMetrics, Inc.) [24] and is near-fully automatic, requiring only basic information from the user such as the pixel resolution (nm/pixel) of the source images. PS Poly eliminates manual input post-thresholding, and outputs quantitative metrics such as total polymerized length and branch point coordinates. Persistence length results were verified by comparison to established values and were robust to moderate levels of added noise. The use of a Python framework may enhance accessibility, automation, open-source development, and modularity. Thus, the algorithm represents a step toward a standardized tool for AFM-based polymer analysis.

Fig 1. Program Overview.

Images are processed and reduced to polymer backbone coordinates, then individual features are separated based upon architecture. Usually, only measurements of linear polymers are considered when calculating persistence length, lp.

https://doi.org/10.1371/journal.pone.0341464.g001

2. Methods

In this section we first describe the algorithm as it was originally written in Igor Pro 7, subsequently referred to as PS-Poly. We then turn attention to the Python implementation, PsPolypy. Though written in different languages (the C-like Igor Pro scripting language and Python) and slightly different in methodology, both versions of the algorithm have effectively the same functionality. The methods used for calculating physical attributes such as polymer persistence length and uncertainty are also discussed in this section. Finally, we describe the techniques used for acquiring AFM images of CL in near-native conditions.

2.1. Igor Implementation: PS-Poly

Here we describe the Igor Pro 7 implementation, PS-Poly. Briefly, in this algorithm the images are skeletonized, a convolutional filter is used to identify endpoints, and a pathfinding algorithm is used to determine the coordinates of all linear particles used for persistence length analysis. To separate features by shape, filters were developed to identify branch points and distinguish branches from cyclic or looped polymers. These steps are described below. We assume that image preprocessing such as background subtraction has already been applied to the raw AFM image data prior to PS-Poly analysis.

2.1.1. Particle segmentation & polymer backbone isolation.

The program begins by loading full-field images. The next step, isolating the polymer backbone, is illustrated in Fig 2. Images are first thresholded, either automatically, by employing Otsu’s method [25], or manually, with a user-defined threshold on pixel intensity, which is proportional to the topographical height, z, of the polymer in the AFM image. In either case, a binary “mask” image is created in which all values above and below the threshold are set to 1 and 0, respectively. Then, if upscaling is desired, a copy of this mask is made with a higher pixel density: the dimensions of the expanded mask are those of the original multiplied by a specified scaling factor, and each pixel in the original mask is replaced by a block of scaling × scaling pixels in the expanded mask. This allows the result to be obtained with subpixel accuracy, which can be valuable for characterizing short polymers. Finally, a “skeleton” of the mask is created through a surface thinning algorithm which eliminates layers of the image until only single-pixel-linewidth traces remain (Fig 2D) [26].
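The thresholding and mask expansion steps can be sketched in a few lines. This is a minimal illustration in plain Python lists, not the Igor Pro code: the function names `threshold_mask` and `upscale_mask` and the toy height values are hypothetical, and the thinning step is omitted.

```python
def threshold_mask(image, threshold):
    """Return a binary mask: 1 where the pixel value exceeds the threshold, else 0."""
    return [[1 if value > threshold else 0 for value in row] for row in image]


def upscale_mask(mask, scale):
    """Replace every pixel of the mask by a scale x scale block of the same value."""
    upscaled = []
    for row in mask:
        wide_row = [value for value in row for _ in range(scale)]
        # Repeat the widened row `scale` times (as copies, so rows stay independent).
        upscaled.extend(list(wide_row) for _ in range(scale))
    return upscaled


# Toy "height" image (arbitrary units); values are illustrative only.
heights = [
    [0.1, 0.2, 0.1],
    [0.9, 1.1, 0.2],
    [0.1, 1.0, 0.1],
]
mask = threshold_mask(heights, threshold=0.5)
big_mask = upscale_mask(mask, scale=2)
```

With a scaling factor of 2, the 3 × 3 mask becomes a 6 × 6 mask in which each original pixel occupies a 2 × 2 block.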

Fig 2. Polymer backbone isolation procedure.

The first steps to perform PS-Poly calculations on an AFM image of candidalysin are shown. Scale bars (yellow) are 100 nm. (A) Raw AFM image with topographic height, z, shown in greyscale. (B) Mask created from the AFM data. (C) New mask that has been expanded to a higher pixel density. (D) Skeleton created from the expanded mask. (E) Each molecule is registered as a unique object (pink circles).

https://doi.org/10.1371/journal.pone.0341464.g002

2.1.2. Acquiring polymer coordinates.

To obtain separate lists of coordinates for each molecule, we begin by looping through each pixel on a duplicate of the thinned image. Once a 1-valued pixel is found, that coordinate is stored. Then a flood-fill algorithm fills in all 1-valued pixels which are continuous with that coordinate. This process continues in a loop until the duplicated image is entirely 0-valued, and the resulting list of coordinates corresponds to exactly one “seed” pixel per molecule. Then, depth-first search (DFS) is applied in a square around the seed pixel [27]. DFS is used to identify connected pixels and provides detailed information about each connected component. The DFS algorithm explores all possible paths stemming from one input coordinate until a path is found to another input coordinate. The implementation of DFS in this program returns 1 if a path is found between the two coordinates, and 0 if there is no possible path. The search radius is incremented with each loop iteration, and the loop breaks once all locations continuous with the seed pixel are found. We found that applying DFS, as opposed to checking for continuity with all of the 1-valued pixels in the image, reduces computational time significantly.
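The seed-finding loop described above can be illustrated as follows. This is a sketch under stated assumptions: pixels are taken to be continuous when they share an edge or a vertex (8-connectivity), the flood fill erases pixels to 0 on the duplicate, and the name `find_seed_pixels` is hypothetical. The subsequent radius-limited DFS is not shown.

```python
def find_seed_pixels(skeleton):
    """Return one (row, col) seed per 8-connected object in a binary image."""
    work = [list(row) for row in skeleton]  # duplicate, so the input is untouched
    n_rows, n_cols = len(work), len(work[0])
    seeds = []
    for r in range(n_rows):
        for c in range(n_cols):
            if work[r][c] != 1:
                continue
            seeds.append((r, c))
            # Flood fill: erase every pixel continuous with this seed.
            stack = [(r, c)]
            work[r][c] = 0
            while stack:
                y, x = stack.pop()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < n_rows and 0 <= nx < n_cols and work[ny][nx] == 1:
                            work[ny][nx] = 0
                            stack.append((ny, nx))
    return seeds


# Two separate single-pixel-wide objects: a diagonal and a vertical line.
skeleton = [
    [1, 0, 0, 0, 0],
    [0, 1, 0, 0, 1],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 0, 1],
]
seeds = find_seed_pixels(skeleton)
```

Because the image is scanned in row-major order, each seed is the first pixel of its object encountered in that order.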

2.1.3. Sorting polymers by architecture.

Polymers are sorted based on their architecture: Linear, Looped, Branched, and Branched-Looped. Examples of these four primary polymer classes are shown in AFM images of CL (Fig 3). This sorting first requires polymer termination point identification, achieved through convolutional filtering.

Fig 3. Primary polymer types.

Examples of the different particle types identified through PS-Poly. Raw AFM image data of CL is shown next to processed skeleton images for Linear, Looped, Branched, and Branched-Looped polymers. The scale bar spans 50 nm and applies to all images.

https://doi.org/10.1371/journal.pone.0341464.g003

2.1.4. Polymer end point determination.

The algorithm loops through the image, cropping 9 × 9 pixel areas surrounding each central pixel test point. Fig 4A demonstrates the pixel grids that are created for every 1-valued pixel in the image. There are 16 possible endpoint configurations. This comes about because there are 8 possible neighboring pixels that share either a common edge or vertex with the central pixel. For one neighbor there are 8 possible combinations. For 2 neighbors, there are also 8 possible endpoint combinations. The pixel grids corresponding to the 16 possible endpoint configurations are shown in Fig 4B. Each pixel grid is compared with each of the 16 endpoint grids and if any one of them is an exact match, then that coordinate is considered an endpoint.
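A template-matching endpoint test along these lines might look like the sketch below. Two assumptions are made because Fig 4B is not reproduced here: the comparison uses the 3 × 3 neighborhood implied by the eight-neighbor argument (the central pixel plus its 8 neighbors), and the eight two-neighbor endpoint patterns are taken to be those in which the two neighbors touch each other. The names `RING`, `build_endpoint_templates`, and `find_endpoints` are hypothetical.

```python
# The 8 neighbors of a pixel, listed clockwise starting from the top-left corner.
RING = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def build_endpoint_templates():
    """Build the 16 assumed endpoint patterns as tuples of 8 neighbor values."""
    templates = set()
    for i in range(8):
        one = [0] * 8
        one[i] = 1  # a single neighbor: 8 patterns
        templates.add(tuple(one))
        two = [0] * 8
        two[i] = two[(i + 1) % 8] = 1  # two touching neighbors: 8 patterns (assumed)
        templates.add(tuple(two))
    return templates


def find_endpoints(skeleton):
    """Return (row, col) of every skeleton pixel whose neighborhood matches a template."""
    templates = build_endpoint_templates()
    n_rows, n_cols = len(skeleton), len(skeleton[0])

    def pixel(r, c):
        # Pixels outside the image are treated as background.
        return skeleton[r][c] if 0 <= r < n_rows and 0 <= c < n_cols else 0

    endpoints = []
    for r in range(n_rows):
        for c in range(n_cols):
            if skeleton[r][c] != 1:
                continue
            neighborhood = tuple(pixel(r + dr, c + dc) for dr, dc in RING)
            if neighborhood in templates:  # exact match with one template
                endpoints.append((r, c))
    return endpoints


skeleton = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 1, 0],
]
endpoints = find_endpoints(skeleton)
```

Storing the templates in a set makes the exact-match test a single lookup per pixel rather than 16 separate comparisons.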

Fig 4. Convolutional filtering for end point determination.

(A) shows 9 x 9 pixel grids that are created from points i, ii, and iii on a skeletonized polymer. (B) shows the sixteen different shapes that were used in the convolutional filter that finds endpoint coordinates. Shapes are clustered via number of neighbors.

https://doi.org/10.1371/journal.pone.0341464.g004

2.1.5. Branch point identification.

Following endpoint detection, branch points are identified through a filter that works by creating an array corresponding to neighboring pixel values that circle about a central pixel of interest (Fig 5A). Each path comprises a clockwise 10-pixel-long sweep. A unique path is counted for each 1 value that is abutted by 0 values (Fig 5A, blue triangles). We note that the last two pixels in the path are repeats of the first two, allowing evaluation of the starting point of the array. The array is examined for unique paths. If three or more unique paths are found, then the pixel of interest is determined to be a branch point. The algorithm is also prevented from overcounting branch points, as not all points with three neighbors are true branch points (Fig 5B).
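One possible reading of this filter is sketched below. It assumes that a "unique path" is a 1-valued neighbor with 0-valued pixels on both sides of it in the clockwise sweep, and that the two repeated pixels at the end of the 10-pixel array exist so that every neighbor can be compared with both of its ring neighbors. The actual PS-Poly rule may differ in detail, and the function names are hypothetical.

```python
# The 8 neighbors of a pixel, listed clockwise starting from the top-left corner.
RING = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def count_unique_paths(skeleton, r, c):
    """Count 1-valued neighbors of (r, c) that are flanked by 0s in the sweep."""
    n_rows, n_cols = len(skeleton), len(skeleton[0])

    def pixel(y, x):
        # Pixels outside the image are treated as background.
        return skeleton[y][x] if 0 <= y < n_rows and 0 <= x < n_cols else 0

    ring = [pixel(r + dr, c + dc) for dr, dc in RING]
    sweep = ring + ring[:2]  # 10 values: the last two repeat the first two
    # Indices 1..8 visit every neighbor exactly once, each with both of its
    # ring neighbors available in the sweep.
    return sum(
        1
        for i in range(1, 9)
        if sweep[i] == 1 and sweep[i - 1] == 0 and sweep[i + 1] == 0
    )


def find_branch_points(skeleton):
    """Return (row, col) of every skeleton pixel with three or more unique paths."""
    return [
        (r, c)
        for r, row in enumerate(skeleton)
        for c, value in enumerate(row)
        if value == 1 and count_unique_paths(skeleton, r, c) >= 3
    ]


# T-shaped junction: only the pixel at (2, 2) should be reported.
skeleton = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0],
]
branch_points = find_branch_points(skeleton)
```

In this example the pixel at (1, 2) has four 1-valued neighbors, but three of them are adjacent to one another in the sweep, so it is not counted as a branch point. This mirrors the overcounting case described for Fig 5B.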

Fig 5. Branch point determination.

(A) Each pixel in the image is centered in a 3 x 3 grid and a string of 1’s and 0’s begins at any point neighboring the centrally located reference pixel. In this example three unique paths (blue triangles) are found, identifying the reference pixel as a branch point. (B) Technique for prevention of branch point overcounting. Test point D has 3 neighboring 1-valued pixels, each of which corresponds to a unique path. Test point C also has 3 neighboring 1-valued pixels, but it does not exhibit three unique paths and is thus not a branch point. The non-unique path is marked by the orange triangle.

https://doi.org/10.1371/journal.pone.0341464.g005

2.1.6. Architecture calling, overlapped identification, and total polymer length determination.

If a polymer has exactly two endpoints and no branch points, it is considered to be linear. If there are no endpoints and no branch points, then it is considered to be a loop. Further, polymers that branch and loop are separated from those which branch but do not loop. We further differentiated true branch points from segments where polymers drape over themselves as they adsorb to the imaging surface. These overlapped polymers were identified via co-localization of branch points with topographically high points along the polymer backbone (Fig 6).

Fig 6. Process for determining overlapped polymers.

(A) Cartoon showing an example of how an overlapped polymer may form via twisting upon surface adsorption. (B) AFM image of CL polymer. The height is shown in greyscale, the lateral scale bar (yellow) is 20 nm. (C) Skeleton created from the data in panel B shows a potential branch point junction (arrow) comprising the intersection of 4 branches. (D) New skeleton that incorporates height information. (E) Skeleton created at a threshold of 1.5 times the average height of the polymer distal from the junction identifies the polymer as overlapped.

https://doi.org/10.1371/journal.pone.0341464.g006

The total polymerized length is found by applying a pathfinding algorithm which sums all pixels in each feature. For overlapped particles, the length is computed by adding the length of the original skeleton with the skeleton made at a threshold of 1.5 times the average height of all pixels on the skeleton with incorporated height information. If over 80% of the polymer is above 1.5 times the average height of the polymer backbone, then it is sorted separately as a noise particle. Such features could be aggregates or other artifacts. Polymers with high points that are not overlapped are still sorted by their architecture and stored in a separate folder.
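The height-based sorting can be illustrated with a short sketch. It assumes that the reference height is an average taken over polymer backbones in the image (as in the Fig 8 caption) rather than over the particle itself, and the function name and label strings are hypothetical.

```python
def classify_by_height(backbone_heights, reference_height, factor=1.5, noise_fraction=0.8):
    """Label a particle from the heights (e.g., in nm) sampled along its skeleton.

    reference_height is assumed to be an image-wide average backbone height.
    """
    cutoff = factor * reference_height
    n_high = sum(1 for height in backbone_heights if height > cutoff)
    fraction_high = n_high / len(backbone_heights)
    if fraction_high > noise_fraction:  # "over 80%" of the polymer is high
        return "noise"
    if n_high > 0:
        return "has_high_points"
    return "normal"


flat = classify_by_height([3.5] * 10, reference_height=3.5)
blob = classify_by_height([6.0] * 9 + [3.0], reference_height=3.5)
draped = classify_by_height([3.5, 3.5, 7.0, 3.5], reference_height=3.5)
```

Particles labeled as having high points would then be examined further to decide whether the high point coincides with a branch point, which is the signature of an overlapped polymer.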

2.2. Python Implementation: PsPolypy

2.2.1. Preprocessing, Image Loading, & Upscaling.

We assume that any image preprocessing, including background subtraction, has already been applied. PsPolypy takes as input a list of full-field images that contain polymer particles. The images may differ in pixel dimensions but must share the same real-space resolution, measured in nanometers per pixel. Each image is first converted to normalized grayscale. Optionally, users can upscale the images using interpolation of a user-selected order, as implemented in scikit-image [28,29]. For a user-defined magnification factor, the pixel dimensions of the upscaled image are multiplied by that factor and the real-space resolution (nm/pixel) is divided by it. This optional step allows finer image details to be analyzed.

2.2.2. Particle Segmentation.

Each image in the set undergoes particle segmentation through a multi-step process. Initially, Otsu thresholding is applied to create a binary mask. This mask then undergoes connected-component labeling, segmenting it into distinct regions, each corresponding to a unique polymer particle with a well-defined bounding box. To ensure complete particle representation, any region whose bounding box touches the edge of the full-field image is discarded, as these particles may be partially cut off. The original image and binary mask are then cropped according to the bounding boxes of all remaining regions. Finally, a list of particle objects is created, with each object containing the cropped image of an individual particle and its corresponding cropped binary mask.
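For readers unfamiliar with Otsu thresholding, the idea is to choose the gray level that maximizes the variance between the background and foreground classes. The dependency-free sketch below illustrates this for a normalized grayscale image; it is not the PsPolypy or scikit-image implementation, and the function name, bin count, and toy image are assumptions.

```python
def otsu_threshold(image, n_bins=256):
    """Return the threshold in [0, 1] that maximizes the between-class variance."""
    pixels = [value for row in image for value in row]
    histogram = [0] * n_bins
    for value in pixels:
        index = min(int(value * n_bins), n_bins - 1)
        histogram[index] += 1

    total = len(pixels)
    total_sum = sum(i * count for i, count in enumerate(histogram))
    best_bin, best_variance = 0, -1.0
    weight_low, sum_low = 0, 0.0
    for i, count in enumerate(histogram):
        weight_low += count
        sum_low += i * count
        weight_high = total - weight_low
        if weight_low == 0 or weight_high == 0:
            continue
        mean_low = sum_low / weight_low
        mean_high = (total_sum - sum_low) / weight_high
        # Between-class variance, up to a constant factor of 1 / total**2.
        variance = weight_low * weight_high * (mean_low - mean_high) ** 2
        if variance > best_variance:
            best_bin, best_variance = i, variance
    # Pixels in bins <= best_bin are background; return the upper edge of that bin.
    return (best_bin + 1) / n_bins


# Toy normalized image with a dim background and a bright feature.
image = [
    [0.1, 0.1, 0.9],
    [0.1, 0.9, 0.9],
    [0.1, 0.1, 0.1],
]
threshold = otsu_threshold(image)
mask = [[1 if value > threshold else 0 for value in row] for row in image]
```

The connected-component labeling and edge-touching filter that follow thresholding are not shown here.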

2.2.3. Skeletonization.

The skeleton of each polymer particle (Fig 7, green) is obtained by applying the skeletonize method from scikit-image to the corresponding binary mask. The resulting skeleton is then analyzed using Skan [21], which automatically determines the topology (e.g., linear, branched, looped or cyclic) and geometric features (e.g., end-to-end distance, contour length) of the particle. Each skeleton is thereby decomposed into a set of paths, and the steps below operate on these paths individually. This skeletal analysis captures the main features of the particle’s structure and morphology.

Fig 7. Skeletonization and interpolation of different polymer topologies.

Examples of linear, looped, and branched polymers are shown along with the skeleton (green) and interpolation (red).

https://doi.org/10.1371/journal.pone.0341464.g007

2.2.4. Classification of polymer architecture.

Polymers are classified into one of six categories based on their skeleton’s structure: Linear, Branched, Looped, Branched-Looped, Overlapped, or Unknown. If the skeleton contains a single path with distinct endpoints, the particle is classified as Linear (see Fig 7). If the skeleton has a single path where the start and end points coincide, the particle is classified as Looped. For skeletons with multiple paths, each crossing is checked to determine whether it is a branch junction or an overlap, the latter defined by a height at least 1.5 times the average polymer backbone height distal to the crossing. Polymers with at least one overlap are classified as Overlapped (as shown in Fig 6). For the remaining skeletons, if no combination of paths forms a cycle (as determined using the NetworkX graph representation of the skeleton) [30], the particle is classified as Branched. Conversely, if multiple paths are present and at least one cycle exists, the particle is classified as Branched-Looped. If none of the above criteria are met, the particle is categorized as Unknown. After classification, users may select only the particle types they wish to analyze further.
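The decision sequence above can be sketched as a small function. This is an illustration, not the PsPolypy code: each path is reduced to a hypothetical (start node, end node) pair, the overlap test is supplied as a precomputed flag, and a union-find cycle check is swapped in for the NetworkX graph analysis to keep the example dependency-free.

```python
def has_cycle(paths):
    """Union-find cycle check on an undirected multigraph given as (start, end) pairs."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for start, end in paths:
        root_a, root_b = find(start), find(end)
        if root_a == root_b:  # joining two already-connected nodes closes a cycle
            return True
        parent[root_a] = root_b
    return False


def classify(paths, has_overlap=False):
    """Return the architecture label for a skeleton described by its paths."""
    if len(paths) == 0:
        return "Unknown"
    if len(paths) == 1:
        start, end = paths[0]
        return "Looped" if start == end else "Linear"
    if has_overlap:
        return "Overlapped"
    return "Branched-Looped" if has_cycle(paths) else "Branched"


linear = classify([("a", "b")])
looped = classify([("a", "a")])
branched = classify([("j", "a"), ("j", "b"), ("j", "c")])
branched_looped = classify([("j", "a"), ("j", "j")])
overlapped = classify([("j", "a"), ("j", "b"), ("j", "c"), ("j", "d")], has_overlap=True)
```

The union-find check treats a path that starts and ends at the same junction, or two paths joining the same pair of nodes, as a cycle, which is the behavior needed to separate Branched from Branched-Looped particles.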

2.2.5. Interpolate Skeletons.

Each path longer than 3 pixels within each particle’s digitized skeleton undergoes cubic B-spline interpolation using SciPy’s splev function [31]. The coordinates along the interpolated skeleton path are given by x(s) and y(s), where s is the distance along the contour. The interpolated paths are sampled at user-defined intervals along the contour. The interpolated skeleton provides a more precise representation of the particle compared to the original (digitized) skeleton (Fig 7, compare red curve with green pixels). For each sampling point along the interpolated path, both the position of the point and the unit tangent vector to the path are recorded.

2.2.6. Mean end-to-end distance.

The mean-square end-to-end distance of the particles, ⟨R²⟩, as a function of contour length is determined as follows. First, for each path, a symmetric distance matrix is constructed by calculating the Euclidean distance between all pairs of sampled points along the contour. Distances between points separated by the same lag n (that is, by n sampling intervals along the contour) lie on the n-th superdiagonal of this matrix. Finally, for each lag n, ⟨R²⟩ is calculated as the mean of the squares of all n-th superdiagonal elements over all paths. The uncertainty of ⟨R²⟩ is estimated by calculating the standard error of the mean (SEM).
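The superdiagonal averaging can be illustrated without building the full matrix, since the element in row i and column j of the upper triangle belongs to lag j − i. The sketch below assumes each path is a list of (x, y) points sampled at equal contour steps; the function name is hypothetical, and the SEM treats all samples at a given lag as independent.

```python
import math


def mean_square_distance_by_lag(paths):
    """Return {lag: (mean of R^2, SEM of R^2, number of samples)}.

    paths: list of paths, each a list of (x, y) points at equal contour steps.
    """
    samples = {}
    for points in paths:
        n = len(points)
        for i in range(n):
            for j in range(i + 1, n):  # upper triangle; lag = j - i
                dx = points[j][0] - points[i][0]
                dy = points[j][1] - points[i][1]
                samples.setdefault(j - i, []).append(dx * dx + dy * dy)

    result = {}
    for lag, values in sorted(samples.items()):
        count = len(values)
        mean = sum(values) / count
        if count > 1:
            variance = sum((v - mean) ** 2 for v in values) / (count - 1)
            sem = math.sqrt(variance / count)
        else:
            sem = float("nan")  # SEM is undefined for a single sample
        result[lag] = (mean, sem, count)
    return result


# A perfectly straight path with unit spacing: R^2 at lag n is exactly n^2.
straight = [(float(i), 0.0) for i in range(5)]
r2 = mean_square_distance_by_lag([straight])
```

Multiplying the lag by the sampling interval converts it to a contour length in nanometers.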

2.2.7. Mean orientation correlation function.

The orientation (or tangent-tangent) correlation of polymer particles is defined by the dot product of the unit tangent vectors to the interpolated path at two points along the contour, which equals cos θ, where θ is the angle between the two tangents. The mean orientation correlation, ⟨cos θ⟩, as a function of path length is calculated by first constructing, for each path, the correlation matrix of dot products between all pairs of tangent vectors. For each lag n, ⟨cos θ⟩ is calculated as the mean of all n-th superdiagonal elements over all paths. As with ⟨R²⟩, the uncertainty of ⟨cos θ⟩ is estimated through the corresponding SEM.
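A minimal sketch of the tangent-tangent averaging follows. Note one substitution: tangents are estimated here by finite differences between consecutive sampled points, whereas the text records them from the interpolated spline. Points are assumed to be distinct and equally spaced along the contour, and the function names are hypothetical.

```python
import math


def unit_tangents(points):
    """Unit tangent of each segment between consecutive sampled points."""
    tangents = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        length = math.hypot(x1 - x0, y1 - y0)
        tangents.append(((x1 - x0) / length, (y1 - y0) / length))
    return tangents


def mean_tangent_correlation_by_lag(paths):
    """Return {lag: mean of t_i . t_(i+lag)} pooled over all paths."""
    samples = {}
    for points in paths:
        tangents = unit_tangents(points)
        for i in range(len(tangents)):
            for j in range(i, len(tangents)):  # diagonal and upper triangle
                dot = tangents[i][0] * tangents[j][0] + tangents[i][1] * tangents[j][1]
                samples.setdefault(j - i, []).append(dot)
    return {lag: sum(values) / len(values) for lag, values in sorted(samples.items())}


# Points on a unit circle: each segment turns by alpha relative to the previous one,
# so the correlation at lag n should equal cos(n * alpha).
alpha = 0.1
arc = [(math.cos(k * alpha), math.sin(k * alpha)) for k in range(10)]
correlation = mean_tangent_correlation_by_lag([arc])
```

The circular arc is a convenient check because the expected correlation is known in closed form at every lag.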

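2.3. Persistence Length (lp) Calculations

In both the Python and Igor Pro implementations, the persistence length, lp, of the polymer particles was estimated by fitting either ⟨R²⟩ or ⟨cos θ⟩ to the corresponding expression from the worm-like chain (WLC) model, which is commonly used to describe semi-flexible polymers [10]. For polymers equilibrated onto 2D surfaces, as is typical in AFM image data, the expressions are [9,11,32,33]

$$\langle R^2(L) \rangle = 4 l_p L \left[ 1 - \frac{2 l_p}{L} \left( 1 - e^{-L/(2 l_p)} \right) \right]$$

and

$$\langle \cos\theta(L) \rangle = e^{-L/(2 l_p)},$$

where L is the contour length separating two points on the polymer backbone.

In PsPolypy, these nonlinear fits are performed using the feature-rich lmfit Python library [34].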
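To make the one-parameter fit concrete, the sketch below codes the standard 2D WLC expressions for ⟨R²⟩ and ⟨cos θ⟩ and recovers lp from synthetic data. It is a stand-in only: a golden-section search over the weighted chi-square replaces lmfit to keep the example dependency-free, the search bounds are arbitrary assumptions, and the function names are hypothetical.

```python
import math


def wlc_mean_square_distance_2d(contour_length, lp):
    """<R^2> of a worm-like chain equilibrated in 2D."""
    if contour_length == 0:
        return 0.0
    x = contour_length / (2.0 * lp)
    # expm1(-x) equals exp(-x) - 1 and is accurate for small x.
    return 4.0 * lp * contour_length * (1.0 + math.expm1(-x) / x)


def wlc_tangent_correlation_2d(contour_length, lp):
    """<cos(theta)> of a worm-like chain equilibrated in 2D."""
    return math.exp(-contour_length / (2.0 * lp))


def fit_lp(lengths, values, sems, model, lp_min=0.1, lp_max=1000.0, tol=1e-6):
    """Weighted least-squares estimate of lp by golden-section search.

    Assumes the chi-square has a single minimum between lp_min and lp_max.
    """
    def chi_square(lp):
        return sum(((y - model(x, lp)) / s) ** 2 for x, y, s in zip(lengths, values, sems))

    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    low, high = lp_min, lp_max
    while high - low > tol:
        a = high - ratio * (high - low)
        b = low + ratio * (high - low)
        if chi_square(a) < chi_square(b):
            high = b  # the minimum lies in [low, b]
        else:
            low = a  # the minimum lies in [a, high]
    return 0.5 * (low + high)


# Synthetic, noise-free data generated with lp = 12 nm over a 10-30 nm window.
lengths = [10.0 + 2.0 * i for i in range(11)]
sems = [1.0] * len(lengths)
r2_values = [wlc_mean_square_distance_2d(L, 12.0) for L in lengths]
cos_values = [wlc_tangent_correlation_2d(L, 12.0) for L in lengths]
lp_from_r2 = fit_lp(lengths, r2_values, sems, wlc_mean_square_distance_2d)
lp_from_cos = fit_lp(lengths, cos_values, sems, wlc_tangent_correlation_2d)
```

A full analysis would also report the parameter uncertainty, which this sketch omits.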

2.4. Radius of Gyration (Rg) Calculations

The radius of gyration was calculated as a metric for characterizing polymer compactness. For a set of N polymer coordinates ri = [xi, yi], the radius of gyration is defined as

$$R_g = \sqrt{ \frac{1}{N} \sum_{i=1}^{N} \left| \mathbf{r}_i - \mathbf{r}_{cm} \right|^2 }$$

where rcm is the polymer center of mass. In our case, the center of mass is equivalent to the geometric center as we assume all polymer subunits have equal mass. Rg was computed from the coordinates of all interpolated skeleton paths for each particle; for particles with multiple branches, coordinates from each branch were included. In the WLC model (in 2D), the dependence of Rg on L and lp is given by the square root of

$$\langle R_g^2 \rangle = \frac{2 l_p L}{3} - 4 l_p^2 + \frac{16 l_p^3}{L} - \frac{32 l_p^4}{L^2} \left( 1 - e^{-L/(2 l_p)} \right)$$

This expression offers another avenue for estimating lp from our measurements and thus testing the applicability of the WLC model [35].
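As a worked illustration, the sketch below computes Rg from a list of coordinates and evaluates the 2D WLC prediction. The WLC expression used here is the standard Kratky-Porod result with the 3D decay length replaced by 2lp; it is assumed, not confirmed, to match the form used by the authors. The function names are hypothetical.

```python
import math


def radius_of_gyration(points):
    """Rg of equally weighted (x, y) points about their geometric center."""
    n = len(points)
    x_cm = sum(x for x, _ in points) / n
    y_cm = sum(y for _, y in points) / n
    mean_square = sum((x - x_cm) ** 2 + (y - y_cm) ** 2 for x, y in points) / n
    return math.sqrt(mean_square)


def wlc_radius_of_gyration_2d(contour_length, lp):
    """Rg of a 2D-equilibrated worm-like chain (assumed Kratky-Porod form)."""
    L = contour_length
    p = 2.0 * lp  # decay length of the tangent correlation in 2D
    mean_square = (
        p * L / 3.0
        - p ** 2
        + 2.0 * p ** 3 / L
        + 2.0 * p ** 4 / L ** 2 * math.expm1(-L / p)  # expm1(-x) = -(1 - exp(-x))
    )
    return math.sqrt(mean_square)


two_points = radius_of_gyration([(0.0, 0.0), (2.0, 0.0)])
stiff = wlc_radius_of_gyration_2d(10.0, 100.0)  # L << lp: close to a rigid rod
flexible = wlc_radius_of_gyration_2d(1000.0, 1.0)  # L >> lp: close to a random coil
```

The two limits give a quick sanity check: a stiff chain approaches the rigid-rod value L/√12, and a long flexible chain approaches the square root of 2·lp·L/3.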

2.5. Model Fitting and Uncertainty Analysis

Each data set consists of three quantities per point: the length along the paths, the mean of either ⟨R²⟩ or ⟨cos θ⟩ at that length, and the corresponding standard error of the mean (SEM). Data were fitted to the one-parameter (lp) WLC model discussed above using the lmfit Python package. The fitting process employed weighted least-squares minimization, with weights derived from the SEM of each point. The best-fit lp was determined by minimizing the chi-square statistic, with its uncertainty derived from the covariance matrix of the fit, scaled by the reduced chi-squared. The 95% confidence interval (CI) for lp was calculated from this uncertainty and represents the range within which the true parameter value is likely to lie with 95% probability. To visualize the model’s predictive capability, 95% prediction bands were computed around the best-fit WLC curve from three contributions: the model uncertainty propagated from the uncertainty in lp, the standard deviation of the residuals, and the measurement uncertainty (SEM). The resulting plots display the original data points with SEM error bars, the best-fit curve, and the 95% prediction bands, illustrating both the uncertainty in the model and the expected range for new observations.

2.6. Atomic force microscopy imaging of CL

CL was purchased from Peptide 2.0 (Chantilly, VA) in powder, then hydrated to 100 μM in MilliQ water. The stock solution was stored in 10 μL aliquots at −80 °C. Imaging was performed as previously described [2]. Briefly, a CL aliquot was thawed and diluted to 330 nM in the imaging buffer (10 mM Hepes, 150 mM NaCl, pH 7.3). Ninety microliters was added to freshly cleaved mica disks and incubated for 10 minutes at room temperature (~25 °C). The samples were washed by exchanging 90 μL of imaging buffer five times to remove any particles in solution or loosely bound particles. Samples were imaged in the imaging buffer using biolever mini tips (Olympus, k ~ 0.1 N/m, fo ~ 30 kHz in fluid) in tapping mode (Cypher, Asylum Research, Santa Barbara, CA). Throughout imaging, the tip sample force magnitude was kept to ≤100 pN, a regime in which minimal protein distortion is expected. Prior to algorithm implementation, images were flattened using commercial AFM software (Asylum Research).

3. Results and discussion

Double stranded DNA represents a convenient benchmark for polymer chain tracing algorithms as its persistence length has been well characterized. In an analysis of four AFM images containing 206 linear strands of DNA, with data from Hennan et al. [8], PS-Poly yielded a DNA persistence length of 48 ± 3 nm. This agrees, within the margin of error, with the widely accepted value of around 50 nm for double stranded DNA (Table 1). The persistence length of candidalysin was determined in an analogous manner and found to be 12.1 ± 0.3 nm using seven images containing 670 linear polymers. Using the Easyworm software [11], the persistence length of candidalysin was previously found to be 12 ± 3 nm [2], which is in good agreement with our algorithm.

Table 1. Comparison of PS-Poly to previous work. Results were obtained via the end-to-end distance method for persistence length.

https://doi.org/10.1371/journal.pone.0341464.t001

To complement persistence length analysis, polymer architecture was also determined. In the process of categorizing polymer feature shapes, the coordinates of all endpoints, branch points, and three-dimensional overlaps are stored in the output as well as the total length of each feature, total polymerized length for each image, and total polymerized length for all images. An AFM image of CL and the resulting PS-Poly architecture outputs are shown (Fig 8).

Fig 8. The output of PS-Poly for shape categorization.

(A) Input AFM image of Candidalysin. The scale bar is 200 nm and the greyscale spans 12 nm. (B) Output for shape categorization is shown in tabular format. Two types of artifacts are identified, high points, defined as any point on a particle that is above 1.5 times the average height of the polymer backbone in the image and noise particles, defined as any particle in which 80% or more of the pixels are above 1.5 times the average height of the polymers in the image.

https://doi.org/10.1371/journal.pone.0341464.g008

After establishing the overall agreement between our algorithm output and previous work, we analyzed the potential errors and robustness of the persistence length calculations. CL was employed as this polymer represents a general and complex test case. AFM image data revealed contour lengths ranging from individual CL subunits, which appear as punctate features of dimension roughly equivalent to the AFM tip radius (~5 nm), to well over 100 nm. However, CL polymers that are both Linear and long were rare. This is because the longer a CL polymer becomes, the more likely it is to become Looped or Branched or both (i.e., Branched-Looped) [22]. To handle these variations, in our CL lp analysis we restricted the fitting window to contour lengths, L, between 10 and 30 nm. The justification for excluding data with (a) L < 10 nm and (b) L > 30 nm is that (a) the WLC model best describes semi-flexible polymers with L comparable to or larger than lp, which in our case is larger than 10 nm; and (b) at higher contour lengths, the poor sampling of long CL polymers in the images makes the data analysis less robust. We additionally note that almost all CL polymers exhibited a constant maximum height of ~3.5 nm above the substrate surface, indicative of a stable flat-lying orientation equilibrated in the plane of the substrate. Only linear particles were used when fitting to the WLC model. When considering only one end-to-end distance per particle, PsPolypy returned lp = 12.0 ± 0.9 nm (Fig 9). On the other hand, when considering multiple segments per particle, as described in the Methods section (which is the default), we obtained a more precise result (Fig 10B). In both cases, the fits were of good quality, as indicated by the goodness-of-fit statistics and the 95% prediction bands.

Fig 9. Limiting the contour length fitting window provides robust persistence length calculations for CL.

Plot of mean-square end-to-end distance versus contour length for CL polymers. The persistence length was found to be 12.0 ± 0.9 nm. A 95% confidence interval (CI, teal shaded region) and 95% prediction band (pink shaded region) are shown.

https://doi.org/10.1371/journal.pone.0341464.g009

Fig 10. Persistence length calculations appear stable in the face of noise.

(A) Image sequence showing the addition of noise. The second-row images are detailed views of the regions indicated in the upper images (red rectangles). The added noise levels for each plot are indicated and range between 0 and 0.03 (std). Calculations of persistence length using different methods: (B) end-to-end distance and (C) tangent-tangent correlation. As before, only data within the 10–30 nm contour length window were included in the fits. The expected range for new observations is indicated by the 95% prediction bands.

https://doi.org/10.1371/journal.pone.0341464.g010

We next challenged the algorithm by repeating the persistence length calculations with noise added to the raw image data. Fig 10A shows an AFM image of CL with increasing amounts of white Gaussian noise added, quantified by the standard deviation (std). The polymer persistence length obtained from these data is shown for both the end-to-end distance (R2) method (Fig 10B) and the tangent-tangent correlation (TTC) method (Fig 10C). Despite the added noise, the calculated lp remains close to the nominal value of 12 nm for the R2 method and 14 nm for the TTC method. The slightly larger value returned by the TTC method is most likely due to its sensitivity to the precise form of the interpolated skeletons, an interpretation consistent with the relatively large 95% prediction band in Fig 10C. Overall, the algorithm appears robust to a moderate amount of random pixel noise, as is typically encountered in experimental settings.
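To make the TTC method concrete, the sketch below computes the tangent-tangent correlation along an ordered skeleton and extracts lp from its decay. It is an illustration under stated assumptions rather than the PsPolypy code: the skeleton points are assumed to be evenly spaced along the contour, the chain is assumed to follow the 2D WLC relation ⟨cos θ(s)⟩ = exp(−s/(2 lp)), and the helper names `tangent_correlation` and `fit_lp_ttc` are hypothetical.

```python
import numpy as np

def tangent_correlation(points, max_lag):
    """Mean cos(theta) between unit tangents separated by 1..max_lag segments.

    `points` is an (N, 2) array of evenly spaced skeleton coordinates.
    """
    points = np.asarray(points, dtype=float)
    seg = np.diff(points, axis=0)
    tangents = seg / np.linalg.norm(seg, axis=1, keepdims=True)
    corr = np.empty(max_lag)
    for lag in range(1, max_lag + 1):
        # Dot product of every tangent with the tangent `lag` segments ahead.
        dots = np.sum(tangents[:-lag] * tangents[lag:], axis=1)
        corr[lag - 1] = dots.mean()
    return corr

def fit_lp_ttc(s, corr):
    """Least-squares line through the origin for ln<cos theta> vs s (2D WLC)."""
    s = np.asarray(s, dtype=float)
    y = np.log(np.asarray(corr, dtype=float))  # requires corr > 0
    slope = np.sum(s * y) / np.sum(s * s)
    return -1.0 / (2.0 * slope)

# Ideal decay for lp = 14 nm, sampled at an assumed 0.5 nm arc-length step.
s = 0.5 * np.arange(1, 41)
model_corr = np.exp(-s / (2.0 * 14.0))
print(f"recovered lp = {fit_lp_ttc(s, model_corr):.2f} nm")
```

Because each tangent is estimated from neighbouring skeleton points, small changes in how the skeleton is interpolated shift every tangent direction, which is one way to understand why the TTC estimate is more sensitive to the skeleton shape than the end-to-end distance estimate.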

A summary plot displaying outputs is provided in Fig 11. Results from the R2 and TTC methods are shown in Fig 11A and B, respectively. The tables contain additional information. We observe that as the image noise increases, the number of features detected as Linear decreases. This not only results in lower statistical weight of the calculation but also in reduced quality of the fits. We also note that the lp calculations with  nm approach the fundamental limit of the method. The uninterpolated contour length in pixels () of a polymer must be pixels to meaningfully contribute to an AFM-based calculation. When path interpolation is applied, the contour must contain pixels, where is the B-spline interpolation order.
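The minimum point count for spline interpolation can be illustrated with SciPy. The sketch below is an assumption-laden example, not the PsPolypy routine: it uses `scipy.interpolate.splprep`, which needs more data points than the spline degree `k`, and the helper name `interpolate_path` is hypothetical. The exact thresholds applied by PsPolypy are not reproduced here.

```python
import numpy as np
from scipy.interpolate import splprep, splev

def interpolate_path(points, k=3, n_out=100):
    """Resample an ordered (N, 2) pixel path with an interpolating B-spline.

    Returns None when the path has too few points for degree k
    (splprep needs more than k points).
    """
    points = np.asarray(points, dtype=float)
    if len(points) <= k:
        return None
    # s=0 forces the spline through every input point.
    tck, _ = splprep([points[:, 0], points[:, 1]], k=k, s=0)
    u_fine = np.linspace(0.0, 1.0, n_out)
    x, y = splev(u_fine, tck)
    return np.column_stack([x, y])

# Eight points on a quarter circle of radius 10 (arbitrary units).
phi = np.linspace(0.0, np.pi / 2.0, 8)
coarse = np.column_stack([10.0 * np.cos(phi), 10.0 * np.sin(phi)])

fine = interpolate_path(coarse, k=3)
print(fine.shape)                        # (100, 2)
print(interpolate_path(coarse[:3], k=3)) # None: three points cannot support k=3
```

Short skeletons therefore drop out of the analysis before any fitting takes place, which contributes to the loss of statistical weight at the low end of the contour length range.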

Fig 11. Persistence length output in the presence of noise.

Persistence length results, as a function of added image noise, obtained using the (A) R2 or (B) TTC method. Rows in the tables are defined as follows: noise std = noise level added to each image; All = total number of polymer features detected; Linear = number of features identified as Linear; 10 < L < 30 = number of Linear features exhibiting contour lengths within the 10–30 nm fitting window; lp = value of persistence length; 95% CI = confidence interval; the remaining row reports a statistical goodness-of-fit parameter.

https://doi.org/10.1371/journal.pone.0341464.g011

3.1. Radius of gyration analysis

To complement the polymer bending stiffness analysis, we also examined the dependence of the radius of gyration, Rg, on the contour length, L, for candidalysin polymers. The results for linear and branched polymers are shown in Fig 12. In both cases, the data are in overall agreement with the WLC model predictions based on the persistence lengths, lp. This agreement provides further support for the applicability of the 2D WLC model to describe CL polymers adsorbed onto the substrate surface for AFM investigation and also speaks to the quality of the solvent (aqueous buffer solution).
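A minimal sketch of this comparison follows. It assumes that Rg is computed from equally weighted skeleton coordinates and that the 2D WLC prediction takes the Benoit-Doty form with the tangent decay length P = 2 lp, i.e., ⟨Rg²⟩ = PL/3 − P² + 2P³/L − (2P⁴/L²)(1 − exp(−L/P)). Both function names are hypothetical, and the package may compute these quantities differently.

```python
import numpy as np

def radius_of_gyration(points):
    """Rg of equally weighted skeleton coordinates, shape (N, 2)."""
    points = np.asarray(points, dtype=float)
    centered = points - points.mean(axis=0)
    return float(np.sqrt(np.mean(np.sum(centered**2, axis=1))))

def wlc_rg2_2d(L, lp):
    """Mean-square Rg of a 2D WLC (Benoit-Doty form with P = 2*lp, assumed)."""
    L = np.asarray(L, dtype=float)
    P = 2.0 * lp
    return (P * L / 3.0 - P**2 + 2.0 * P**3 / L
            - 2.0 * P**4 / L**2 * (1.0 - np.exp(-L / P)))

# Model prediction for a few contour lengths at lp = 12 nm.
for L in (10.0, 20.0, 30.0):
    print(f"L = {L:4.0f} nm -> Rg = {np.sqrt(wlc_rg2_2d(L, 12.0)):.2f} nm")

# Sanity check on a straight 30 nm rod: Rg should be close to L / sqrt(12).
rod = np.column_stack([np.linspace(0.0, 30.0, 1001), np.zeros(1001)])
print(f"rod Rg = {radius_of_gyration(rod):.2f} nm")
```

Because `radius_of_gyration` only needs the set of skeleton coordinates, the same measurement applies to branched particles, whereas the model curve depends on lp through a single parameter and so involves no additional fitting.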

Fig 12. Radius of gyration analysis.

The dependence of the radius of gyration (Rg) on contour length (L) for linear (top) and branched (bottom) candidalysin polymers. The WLC model predictions based on the persistence lengths (lp) obtained from fitting the corresponding end-to-end distance data are overlaid.

https://doi.org/10.1371/journal.pone.0341464.g012

3.2. Computational performance evaluation

The runtime of PS Poly in Igor Pro 7 (64-bit) was compared to that of PsPolypy by processing a dataset of nine AFM images containing 779 particles on similar hardware. PS Poly processed the dataset in 892 seconds on a CPU with an average frequency of approximately 4.2 GHz, whereas PsPolypy processed it in 3.2 seconds on a CPU with an average frequency of approximately 4.4 GHz. Although direct speedup comparisons are complicated by variations in CPU utilization, these results indicate that the Python implementation reduces runtime by more than two orders of magnitude (roughly 280-fold). In addition to its superior calculation speed, the open-source, object-oriented design of the Python implementation gives users greater flexibility to fine-tune the algorithm to specific needs.
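The arithmetic behind this comparison is short enough to spell out. The clock-scaled figure below is only a crude normalization that assumes runtime scales inversely with CPU frequency, which ignores core count, memory, and utilization; the variable names are illustrative.

```python
# Reported runtimes and average CPU frequencies for the 779-particle dataset.
t_igor_s, f_igor_ghz = 892.0, 4.2
t_python_s, f_python_ghz = 3.2, 4.4
n_particles = 779

raw_speedup = t_igor_s / t_python_s
# Assumption: runtime is inversely proportional to clock frequency.
clock_scaled = (t_igor_s * f_igor_ghz) / (t_python_s * f_python_ghz)

print(f"raw: {raw_speedup:.0f}x, clock-scaled: {clock_scaled:.0f}x")
# raw: 279x, clock-scaled: 266x
print(f"per particle: {t_igor_s / n_particles:.2f} s vs "
      f"{1e3 * t_python_s / n_particles:.1f} ms")
```

Either way, the ratio falls between two and three orders of magnitude, so the conclusion does not hinge on the small difference in clock speed.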

4. Conclusions and future outlook

PS Poly is an algorithm designed to calculate persistence length, radius of gyration, and architecture from single-molecule image data of complex polymers. When implemented as a Python package, the algorithm is modular and object-oriented, making it straightforward to maintain and to extend its features and scope. While it can batch process stacks of AFM images, PsPolypy can also be easily used for customized workflows because most of the attributes and methods are directly available to the user. PS Poly currently provides two WLC model-based methods for calculating persistence length (R2 and TTC) from the skeletonized representation of the polymer particles; additional methods can be added in the future, as needed. Furthermore, its automated shape detection can be used to provide insight into radius of gyration differences and polymerization mechanisms, as we have recently shown [22]. The benefits of automating this process include reduced human bias, time saved by the user, and the related improvement in the statistical weight of the data set. The current implementations of PS Poly use Otsu's method for image thresholding, which works effectively for high signal-to-noise images such as those typically acquired in AFM. However, other segmentation methods, such as watershed algorithms and machine or deep learning approaches, can also be added to future versions. For best results, it is important to reduce noise in the input images before running the program. As we showed, the algorithm is robust to moderate levels of random pixel noise, but it will break down if higher noise levels are encountered. While PS Poly was developed to analyze AFM image data, it has the potential to generalize to other imaging modalities.

References

  1. Moyes DL, Wilson D, Richardson JP, Mogavero S, Tang SX, Wernecke J, et al. Candidalysin is a fungal peptide toxin critical for mucosal infection. Nature. 2016;532(7597):64–8. pmid:27027296
  2. Russell CM, Schaefer KG, Dixson A, Gray ALH, Pyron RJ, Alves DS, et al. The Candida albicans virulence factor candidalysin polymerizes in solution to form membrane pores and damage epithelial cells. Elife. 2022;11:e75490. pmid:36173096
  3. Mayer FL, Wilson D, Hube B. Candida albicans pathogenicity mechanisms. Virulence. 2013;4(2):119–28. pmid:23302789
  4. Haque FM, Grayson SM. The synthesis, properties and potential applications of cyclic polymers. Nat Chem. 2020;12(5):433–44. pmid:32251372
  5. Voit BI, Lederer A. Hyperbranched and highly branched polymer architectures--synthetic strategies and major characterization aspects. Chem Rev. 2009;109(11):5924–73. pmid:19785454
  6. Müller DJ, Dumitru AC, Lo Giudice C, Gaub HE, Hinterdorfer P, Hummer G, et al. Atomic Force Microscopy-Based Force Spectroscopy and Multiparametric Imaging of Biomolecular and Cellular Systems. Chem Rev. 2021;121(19):11701–25. pmid:33166471
  7. Weaver DR, King GM. Atomic Force Microscopy Reveals Complexity Underlying General Secretory System Activity. Int J Mol Sci. 2022;24(1):55. pmid:36613499
  8. Heenan PR, Perkins TT. Imaging DNA Equilibrated onto Mica in Liquid Using Biochemically Relevant Deposition Conditions. ACS Nano. 2019;13(4):4220–9. pmid:30938988
  9. Rivetti C, Guthold M, Bustamante C. Scanning force microscopy of DNA deposited onto mica: equilibration versus kinetic trapping studied by statistical polymer chain analysis. J Mol Biol. 1996;264(5):919–32. pmid:9000621
  10. Rubinstein M, Colby R. Polymer Physics. New York: Oxford University Press Inc. 2003.
  11. Lamour G, Kirkegaard JB, Li H, Knowles TP, Gsponer J. Easyworm: an open-source software tool to determine the mechanical properties of worm-like chains. Source Code Biol Med. 2014;9:16. pmid:25093038
  12. Beton JG, Moorehead R, Helfmann L, Gray R, Hoogenboom BW, Joseph AP, et al. TopoStats - A program for automated tracing of biomolecules from AFM images. Methods. 2021;193:68–79. pmid:33548405
  13. Usov I, Mezzenga R. FiberApp: An Open-Source Software for Tracking and Analyzing Polymers, Filaments, Biomacromolecules, and Fibrous Objects. Macromolecules. 2015;48(5):1269–80.
  14. Konrad SF, Vanderlinden W, Frederickx W, Brouns T, Menze BH, De Feyter S, et al. High-throughput AFM analysis reveals unwrapping pathways of H3 and CENP-A nucleosomes. Nanoscale. 2021;13(10):5435–47. pmid:33683227
  15. Wiggins PA, van der Heijden T, Moreno-Herrero F, Spakowitz A, Phillips R, Widom J, et al. High flexibility of DNA on short length scales probed by atomic force microscopy. Nat Nanotechnol. 2006;1(2):137–41. pmid:18654166
  16. Faas FGA, Rieger B, van Vliet LJ, Cherny DI. DNA deformations near charged surfaces: electron and atomic force microscopy views. Biophys J. 2009;97(4):1148–57. pmid:19686663
  17. Brangwynne CP, Koenderink GH, Barry E, Dogic Z, MacKintosh FC, Weitz DA. Bending dynamics of fluctuating biopolymers probed by automated high-resolution filament tracking. Biophys J. 2007;93(1):346–59. pmid:17416612
  18. Zhang H, Söderholm N, Sandblad L, Wiklund K, Andersson M. DSeg: A Dynamic Image Segmentation Program to Extract Backbone Patterns for Filamentous Bacteria and Hyphae Structures. Microsc Microanal. 2019;25(3):711–9. pmid:30894244
  19. Schneider M, Al-Shaer A, Forde NR. AutoSmarTrace: Automated chain tracing and flexibility analysis of biological filaments. Biophys J. 2021;120(13):2599–608. pmid:34022242
  20. Graham JS, McCullough BR, Kang H, Elam WA, Cao W, De La Cruz EM. Multi-platform compatible software for analysis of polymer bending mechanics. PLoS One. 2014;9(4):e94766. pmid:24740323
  21. Nunez-Iglesias J, Blanch AJ, Looker O, Dixon MW, Tilley L. A new Python library to analyse skeleton images confirms malaria parasite remodelling of the red blood cell membrane skeleton. PeerJ. 2018;6:e4312. pmid:29472997
  22. Schaefer KG, Russell CM, Pyron RJ, Conley EA, Barrera FN, King GM. Polymerization mechanism of the Candida albicans virulence factor candidalysin. J Biol Chem. 2024;300(6):107370. pmid:38750794
  23. Lin J, Miao J, Schaefer KG, Russell CM, Pyron RJ, Zhang F, et al. Sulfated glycosaminoglycans are host epithelial cell targets of the Candida albicans toxin candidalysin. Nat Microbiol. 2024;9(10):2553–69. pmid:39285260
  24. KingGavinM. https://github.com/KingGavinM/
  25. Houssein EH, Mohamed GM, Ibrahim IA, Wazery YM. An efficient multilevel image thresholding method based on improved heap-based optimizer. Sci Rep. 2023;13(1):9094. pmid:37277531
  26. Zhang TY, Suen CY. A fast parallel algorithm for thinning digital patterns. Commun ACM. 1984;27(3):236–9.
  27. Even S, Even G. Graph Algorithms. Cambridge University Press. 2011.
  28. van der Walt S, Schönberger JL, Nunez-Iglesias J, Boulogne F, Warner JD, Yager N, et al. scikit-image: image processing in Python. PeerJ. 2014;2:e453. pmid:25024921
  29. Van Rossum G, Drake FL. Python 3 Reference Manual. Scotts Valley, CA: CreateSpace. 2009.
  30. Hagberg AA, Schult DA, Swart PJ. In: Proceedings of the 7th Python in Science Conference, Pasadena, CA USA, 2008. 11–5.
  31. Virtanen P, Gommers R, Oliphant TE, Haberland M, Reddy T, Cournapeau D, et al. SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat Methods. 2020;17(3):261–72. pmid:32015543
  32. Gutjahr P, Lipowsky R, Kierfeld J. Persistence length of semiflexible polymers and bending rigidity renormalization. Europhys Lett. 2006;76(6):994–1000.
  33. Schöbl S, Sturm S, Janke W, Kroy K. Persistence-length renormalization of polymers in a crowded environment of hard disks. Phys Rev Lett. 2014;113(23):238302. pmid:25526167
  34. Newville M, Stensitzki T, Allen DB, Ingargiola AL. LMFIT: Non-Linear Least-Square Minimization and Curve-Fitting for Python. 2014.
  35. Baschnagel J, Meyer H, Wittmer J, Kulić I, Mohrbach H, Ziebert F, et al. Semiflexible Chains at Surfaces: Worm-Like Chains and beyond. Polymers (Basel). 2016;8(8):286. pmid:30974563