A unified spatiotemporal–geometry framework for target classification and localisation in dual-static passive radar

  • Hongmin Wang ,

    Roles Conceptualization, Data curation, Investigation, Resources, Software, Validation, Visualization, Writing – original draft

    wanghongmin0107@163.com

    Affiliations School of Mechatronic Engineering and Engineering Training Center, Xi’an Technological University, Xi'an, Shaanxi, China, Engineering Training Center, Xi'an Technological University, Xi'an, Shaanxi, China

  • Zhiyong Lei,

    Roles Conceptualization, Methodology, Project administration, Supervision, Writing – review & editing

    Affiliation School of Electronic and Information Engineering, Xi’an Technological University, Xi'an, Shaanxi, China

  • Xing Liu

    Roles Formal analysis, Resources, Supervision, Validation, Writing – review & editing

    Affiliation School of Electronic and Information Engineering, Xi’an Technological University, Xi'an, Shaanxi, China

Abstract

Passive radar exploits ambient broadcast signals and requires no dedicated transmitter, making it attractive for covert surveillance and target monitoring. A fundamental difficulty arises at low signal-to-noise ratio (SNR) or when targets move slowly: the class decision (static vs. dynamic) and the geometry-based position estimate are solved in two independent steps by most existing methods, which can lead to inconsistent outputs. We propose a joint spatiotemporal–geometry framework for a dual-static passive radar operating on DVB-T broadcast signals at 650 MHz. The framework combines a spatiotemporal encoder with dilated convolutions and cross-attention, and a Cramér–Rao-weighted Levenberg–Marquardt bistatic solver. The two components are coupled through an iterative optimisation loop: the encoder class probability steers a physics-consistent velocity penalty inside the solver, while the updated solver state feeds back into the next class decision. Unlike prior joint methods that either operate on sequential tracks or incorporate physics only at training time, the proposed framework enforces the exact bistatic delay and Doppler equations as hard constraints at every test-time iteration while the encoder class probability actively steers the geometry penalty within the same optimisation loop. Across 500 Monte Carlo trials per SNR point and five independent evaluation seeds, the proposed method achieves a mean classification accuracy of 93.7 ± 0.8% with a weighted F1-score of 0.937 ± 0.007. The mean localisation error at −6 dB SNR is 1.15 ± 0.09 km, a 28.1% reduction compared with a geometry-only baseline. The joint optimisation converges in a mean of 4.1 ± 0.8 outer iterations. A sensitivity analysis confirms that all results are stable across a factor-of-two variation in any single hyperparameter. 
Within the simulated dual-static passive radar environment considered in this study, the proposed iterative approach consistently outperforms seven evaluated baseline methods in both classification accuracy and localisation error.

Introduction

Passive radar uses broadcast signals such as DVB-T or FM radio instead of sending its own signal [1–3]. Because of this, passive radar does not need a dedicated transmitter and does not reveal its own presence. These properties make passive radar attractive for surveillance and target monitoring. In the last decade, many researchers have worked on passive radar for target detection, classification, and localisation.

A fundamental difficulty is that the received echo is often weak, especially when the target is small or moves slowly. At low SNR, the delay and Doppler estimates from the cross-correlator become noisy and the localisation error becomes large [4,5]. A further complication is that slowly moving targets produce only a small bistatic Doppler shift of a few hertz, which is hard to separate from clutter [6]. Classical geometry methods do not address this well because they depend on clean delay and Doppler measurements.

Deep learning methods have shown strong results for radar target classification. Several studies use convolutional neural networks (CNN) and recurrent networks to learn temporal patterns from micro-Doppler features [7–9]. However, most of these methods do not incorporate a geometry constraint, so the class output can be inconsistent with the actual bistatic measurement. To make this concrete: if the encoder outputs “static” with probability 0.61 for a slow-moving target at −6 dB SNR, the non-iterative variant applies the static velocity penalty and collapses the velocity estimate to near zero, producing a position error of 2.1 km. The proposed iterative method revises the class to “dynamic” by iteration 2 and recovers a 1.2 km position error, a 43% improvement on this challenging case (see also Section).

Algorithm unrolling connects classical signal processing with deep learning [10]. Attention and transformer mechanisms have also been applied to radar classification tasks [11,12], showing that temporal attention can improve performance. But these methods still treat classification and localisation as separate problems.

The key gap in the literature is that no existing method for dual-static passive radar actively enforces consistency between the class decision and the geometry estimate at every test-time iteration. Some recent work begins to address this by combining learning and geometry [13–15], but these works either use a one-step coupling or do not apply the exact bistatic geometry equations as hard constraints. Yu et al. [16] adopt iterative tracking-and-classification but operate on monostatic sequential tracks and do not use bistatic delay–Doppler as hard constraints. In contrast, our method uses an iterative loop in which the encoder and the solver update each other at every step, with the bistatic equations enforced as hard constraints throughout.

The main contributions of this paper are:

  • We propose a joint spatiotemporal–geometry framework for dual-static passive radar in which the exact bistatic delay and Doppler equations are enforced as hard constraints at every test-time iteration, and the encoder class probability actively steers the geometry penalty within the same optimisation loop. To the best of our knowledge, this specific instantiation for the bistatic passive radar measurement model has not been previously reported.
  • We design a lightweight spatiotemporal encoder (47,362 parameters) using dilated convolutions and cross-attention that provides a calibrated class probability to the geometry solver across N = 20 coherent processing intervals (CPIs).
  • We derive a class-dependent velocity penalty from a Bayesian MAP regularisation of the velocity estimate under a Laplacian prior, and embed it into a Cramér–Rao-weighted Levenberg–Marquardt solver for bistatic passive radar.
  • We provide a comprehensive evaluation including sensitivity analysis, ablation at −6 dB SNR, generalisation tests across SNR and baseline, and comparison with seven baselines including CRLB-weighted geometry, an unrolled solver, a transformer-based end-to-end model, and an indicative re-implementation of the method of Yu et al. [16].

The remainder of the paper is organised as follows. The next section reviews related work. We then describe the system model and proposed method, followed by the simulation and training setup. Results and discussion are presented together, with sub-sections covering classification, localisation, ablation, convergence, Doppler consistency, sensitivity analysis, generalisation analysis, and limitations. The final section concludes.

Related work

Classical bistatic geometry methods

Griffiths and Baker [1] describe passive coherent location systems and show that geometry methods work well at high SNR. Malanowski and Kulpa [4] compare localisation methods for multistatic passive radar. Ho and Xu [5] give an algebraic solution for moving-source location using TDOA and FDOA measurements, showing that localisation error is highly sensitive to measurement noise. Howland et al. [2] study FM-based bistatic radar and show that broadcast signals can cause delay estimation errors. These classical works confirm that geometry methods are effective at high SNR but provide no mechanism to leverage learning when conditions are poor.

Deep learning for radar classification

Gurbuz and Amin [7] show that deep networks give strong results for human motion recognition from radar Doppler features. Kim and Moon [8] classify targets from micro-Doppler images using a CNN. Clemente et al. [9] review micro-Doppler signatures for classification. Geng et al. [12] survey deep learning for radar and identify the separation of classification and physics-based estimation as a common limitation. Ritchie et al. [6] classify micro-drones in a multistatic passive radar system.

Attention and transformer methods

Vaswani et al. [17] introduced the attention mechanism that underlies transformer architectures. Wu et al. [11] apply attention networks to radar human activity recognition and show improved accuracy at low SNR. Yu et al. [16] propose a joint tracking and classification approach for manoeuvring sources; they adopt an iterative bidirectional coupling between a classifier and a physics-based tracker, which is conceptually related to our framework. However, their method is designed for monostatic radar and operates on sequential tracks via a Kalman filter; it does not use bistatic range or Doppler as hard constraints, nor does it enforce the bistatic delay–Doppler equations at each test iteration as we do.

Algorithm unrolling and physics-informed networks

Monga et al. [10] survey algorithm unrolling for signal and image processing. Papageorgiou et al. [18] apply a deep network for direction-of-arrival estimation at low SNR and show that physics knowledge can be built into the network structure. Temiz et al. [19] study improved localisation in a hybrid multistatic radar network. Colone et al. [20] review passive radar challenges and data-driven approaches.

End-to-end deep learning for localisation

Zhang et al. [13] use an encoder–decoder network to estimate target position from range-Doppler maps. Xu et al. [14] apply a physics-informed neural network (PINN) for TDOA-based localisation, adding a delay residual penalty to the loss. Wang et al. [15] propose an end-to-end trainable network for joint position and class estimation in monostatic radar. These end-to-end methods typically require large labelled training sets and do not apply the bistatic geometry equations as hard constraints at test time.

Summary and motivation

The separation of geometry-based localisation and learning-based classification is a recurring limitation in the literature. While several methods adopt some form of coupling between learning and geometry [14–16], they differ from our approach in key ways: (i) methods such as [16] use iterative coupling but operate on sequential monostatic tracks without bistatic constraints; (ii) methods such as [14,15] embed physics in the training loss but do not enforce it at test time. In contrast, (i) we use an iterative feedback loop so that the class estimate influences the solver penalty and the solver output influences the next class decision, and (ii) we use the exact bistatic delay and Doppler equations from [1,4,5] as hard constraints at every test iteration. The encoder augments the geometry with a class prior rather than replacing it.

Materials and methods

Dual-static passive geometry

We consider a dual-static passive radar with one transmitter of opportunity at fixed position T and two static receivers at positions R1 and R2. The target is at position x(t) with velocity v(t). Both receivers collect the target echo. The transmitter and receiver positions are assumed known; the system is synchronised in time and frequency via the direct path signal.

For receiver i, the bistatic range is

(1)

and the bistatic delay is

(2)

The bistatic Doppler shift for a target with velocity v is

(3)

where the unit vectors point from the target toward the transmitter and receiver, respectively. For a static target (v = 0), Eq (3) gives zero Doppler [1]. The noisy measurement vector after cross-correlation is

(4)

In 2D, the four unknowns (x, y, vx, vy) are matched by four measurements. Two bistatic ellipses can intersect at two points (ghost-target ambiguity); we resolve this by choosing the intersection whose predicted Doppler from Eq (3) has the smallest residual with the measured Doppler [5]. When the measured Doppler is near zero (i.e., ) at very low SNR, the Doppler discriminant becomes unreliable; in that case the system falls back to the intersection with the smaller geometry cost Jgeo, as noted in the Limitations section. Across 2,000 test samples at −6 dB SNR, the Doppler discriminant is reliable in 94.1% of cases; the geometry-cost fallback is triggered in the remaining 5.9%. In the fallback cases, the ghost-selection error rate is 8.3%, compared with 3.1% when the Doppler discriminant is used. This confirms that low-SNR ghost resolution is a genuine source of localisation error and motivates the future integration of a multi-hypothesis tracker to resolve ambiguity over consecutive snapshots, as discussed further in the Limitations section.
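As a minimal sketch of the geometry relations referred to as Eqs (1)–(3), the snippet below evaluates bistatic range, delay, and Doppler for one receiver. It assumes the standard forms: range as the sum of the target-to-transmitter and target-to-receiver distances, delay as range divided by c, and Doppler as the projection of v onto the sum of the two unit vectors divided by the wavelength. The function names and the sign convention (closing targets give positive Doppler) are illustrative assumptions; the positions are the mid-field geometry quoted with the localisation results.

```python
import math

C = 299_792_458.0          # speed of light, m/s
FC = 650e6                 # DVB-T carrier frequency from the text, Hz
WAVELENGTH = C / FC        # about 0.461 m


def _unit(frm, to):
    """Unit vector pointing from `frm` toward `to`, plus the distance."""
    dx, dy = to[0] - frm[0], to[1] - frm[1]
    d = math.hypot(dx, dy)
    return (dx / d, dy / d), d


def bistatic_measurements(x, v, tx, rx):
    """Return (range_m, delay_s, doppler_hz) for one receiver.

    Assumed forms (not copied from the paper's equations):
      range   = |x - T| + |x - R_i|
      delay   = range / c
      doppler = (1 / lambda) * v . (u_T + u_R)
    where u_T, u_R point from the target toward transmitter and receiver.
    """
    u_t, d_t = _unit(x, tx)
    u_r, d_r = _unit(x, rx)
    rng = d_t + d_r
    doppler = (v[0] * (u_t[0] + u_r[0]) + v[1] * (u_t[1] + u_r[1])) / WAVELENGTH
    return rng, rng / C, doppler


# Mid-field geometry quoted in the localisation results (metres).
TX, R1, R2 = (-4000.0, 5000.0), (-750.0, 0.0), (750.0, 0.0)
TARGET = (0.0, 1000.0)

static = bistatic_measurements(TARGET, (0.0, 0.0), TX, R1)   # zero Doppler
moving = bistatic_measurements(TARGET, (3.0, 0.0), TX, R1)   # 3 m/s along +x
```

With these assumptions, a 3 m/s target moving away from both the transmitter and R1 yields a negative Doppler of a few hertz, bounded in magnitude by 2|v|/λ.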

Signal model and feature extraction

Each receiver has a reference channel and a surveillance channel. The surveillance signal is

(5)

The cross-ambiguity function (CAF) per CPI is

(6)

From the CAF peak for each receiver and CPI, we extract five features: estimated delay, estimated Doppler, peak amplitude ai(n), peak width, and local energy ei(n). Over N = 20 intervals and two receivers, the feature sequence comprises 20 × 2 × 5 = 200 values. All features are normalised by z-score over the training set.
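The CAF peak search can be illustrated with a brute-force sketch on a small synthetic signal. The discrete form used here (integer sample delays, Doppler on a DFT-bin grid) and all names and values are assumptions for illustration; only the peak location and amplitude are extracted, not the peak width or local energy.

```python
import cmath
import random


def caf_peak(ref, surv, max_delay, doppler_bins):
    """Brute-force cross-ambiguity search on an integer delay / DFT-bin grid.

    Assumed discrete form:
        chi(d, k) = sum_n surv[n] * conj(ref[n - d]) * exp(-j*2*pi*k*n/N)
    Returns (delay_samples, doppler_bin, peak_magnitude).
    """
    n_samples = len(ref)
    best = (0, 0, -1.0)
    for d in range(max_delay + 1):
        for k in doppler_bins:
            acc = 0j
            for n in range(d, n_samples):
                phase = cmath.exp(-2j * cmath.pi * k * n / n_samples)
                acc += surv[n] * ref[n - d].conjugate() * phase
            if abs(acc) > best[2]:
                best = (d, k, abs(acc))
    return best


# Synthetic check: unit-modulus random-phase reference; the echo is delayed
# by 3 samples, shifted by 2 DFT bins, and attenuated (illustrative values).
rng = random.Random(0)
N = 128
ref = [cmath.exp(2j * cmath.pi * rng.random()) for _ in range(N)]
TRUE_DELAY, TRUE_BIN = 3, 2
surv = [0j] * N
for n in range(TRUE_DELAY, N):
    surv[n] = 0.1 * ref[n - TRUE_DELAY] * cmath.exp(2j * cmath.pi * TRUE_BIN * n / N)

delay_hat, bin_hat, peak = caf_peak(ref, surv, max_delay=6, doppler_bins=range(-4, 5))
```

A practical implementation would use FFT-based correlation rather than this triple loop; the brute-force form is kept only because it mirrors the definition directly.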

Spatiotemporal encoder

The spatiotemporal encoder converts the feature sequence into a global descriptor. The encoder is intentionally lightweight (47,362 parameters total) because its role is to provide a calibrated class prior to the geometry solver, not to replace it. It uses three dilated convolution layers [21] to capture patterns at multiple time scales, followed by a four-head cross-attention module [17] to learn which CPI intervals are most informative. Table 1 shows the full architecture.

Table 1. Spatiotemporal encoder architecture. All parameter counts are exact and verified against the implementation.

https://doi.org/10.1371/journal.pone.0350515.t001

We also tested larger variants with 128 and 256 filters (179,970 and 718,722 parameters respectively). Accuracy improved marginally to 94.1% and 94.3%, while CPU inference time increased from 0.8 ms to 3.1 ms and 11.4 ms per sample. We retain 64 filters as the best accuracy-to-complexity trade-off for near-real-time embedded deployment alongside the LM solver.

The per-interval vectors and global descriptor are

(7)

The encoder and classifier are trained jointly by minimising cross-entropy over M training samples:

(8)

Geometry-based localisation

To estimate position x and velocity v, we minimise a weighted geometry cost

(9)

with Cramér–Rao-derived weights

(10)

where is the linear SNR at receiver i and T is the CPI duration. These weights are derived from the diagonal entries of the Fisher Information Matrix (FIM) of the bistatic measurement model under Gaussian noise (see Supplementary Note S1 for the full derivation).

Supplementary Note S1: Derivation of CRLB-based weighting (summary)

The Fisher Information Matrix (FIM) for the bistatic measurement vector under independent Gaussian noise with variance $\sigma_{\tau,i}^2$ for delays and $\sigma_{f,i}^2$ for Doppler [4,5] is diagonal with entries $1/\sigma_{\tau,i}^2$ and $1/\sigma_{f,i}^2$. The optimal weighted least-squares cost uses these inverse variances as weights, which directly yields Eq. (10). The full position-space CRLB is $(J^{\top} W J)^{-1}$, where J is the Jacobian of the measurement vector with respect to (x, v) and W is the diagonal weight matrix. The scalar CRLB reported in Table 6 is the square root of the trace of the position block of this matrix, evaluated at the representative mid-field geometry.

Compared with uniform weighting, CRLB-weighting allocates more influence to high-SNR measurements and reduces localisation error by approximately 11.8% at 0 dB SNR (see Table 7).
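The effect of inverse-variance weighting can be seen in a scalar illustration. The sketch below is not Eq. (10) itself: it fuses two independent measurements of the same quantity with hypothetical variances and compares the variance of the uniform and inverse-variance weighted means.

```python
def fused_variance(variances, weights):
    """Variance of the weighted mean sum(w_i * z_i) / sum(w_i) for
    independent measurements z_i with the given variances."""
    total = sum(weights)
    return sum((w / total) ** 2 * s2 for w, s2 in zip(weights, variances))


# Hypothetical per-receiver measurement variances (arbitrary units):
# receiver 1 has high SNR, receiver 2 has low SNR.
variances = [1.0, 9.0]

uniform = fused_variance(variances, [1.0, 1.0])                         # 2.5
inverse_var = fused_variance(variances, [1.0 / s2 for s2 in variances])  # 0.9
```

The inverse-variance weights give the low-SNR receiver one ninth of the influence of the high-SNR receiver, which is the mechanism by which CRLB-derived weights reduce the localisation error relative to uniform weighting.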

A class-dependent velocity penalty is

(11)

This penalty can be motivated from a Bayesian MAP perspective: assuming a zero-mean Laplacian prior on v for static targets and a lower-truncated Laplacian prior for dynamic targets, the MAP estimate of v given the class c leads exactly to the quadratic penalty in Eq. (11). The two penalty coefficients correspond to the inverse prior variance and are set empirically (defaults marked with † in Table 10); a sensitivity analysis is provided in the Sensitivity analysis of hyperparameters section. The total localisation cost is

(12)
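One plausible shape for a class-dependent velocity penalty is sketched below. The hinge-quadratic form and both weights are assumptions made for illustration and are not taken from Eq. (11); only the 2.5 m/s minimum speed threshold is the default quoted in the sensitivity analysis.

```python
import math

V_MIN = 2.5      # m/s, default minimum speed threshold quoted in the text
LAMBDA_S = 1.0   # hypothetical static-class weight (not from the paper)
LAMBDA_D = 1.0   # hypothetical dynamic-class weight (not from the paper)


def velocity_penalty(v, cls):
    """Assumed hinge-quadratic form of a class-dependent velocity penalty.

    static : lambda_s * |v|^2                 (pulls the speed toward zero)
    dynamic: lambda_d * max(0, v_min - |v|)^2 (penalises speeds below v_min)
    """
    speed = math.hypot(v[0], v[1])
    if cls == "static":
        return LAMBDA_S * speed ** 2
    if cls == "dynamic":
        return LAMBDA_D * max(0.0, V_MIN - speed) ** 2
    raise ValueError(f"unknown class: {cls!r}")
```

Under this form a dynamic target faster than the threshold incurs no penalty, so the geometry cost alone determines its velocity estimate.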

Joint iterative optimisation

The full joint cost connects the encoder and the solver:

(13)

The encoder term biases the solver toward the class to which the encoder assigns the higher probability. The balance weight controls the encoder influence. Iteration stops when the change in Jtot falls below the convergence tolerance or when the iteration count reaches its maximum.

The procedure is:

  1. Compute and from the classifier.
  2. Set .
  3. Initialise (x(0), v(0)) from the bistatic ellipse intersection; resolve ghost ambiguity via Doppler [5].
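  4. For each outer iteration:
    (a) Update (x, v) by minimising Jloc for the current class [Levenberg–Marquardt with analytical Jacobian].
    (b) Update the class label by minimising Jtot over the two candidate classes [evaluate both].
    (c) If the stopping criterion is met: stop.
  5. Return the estimated class label, position, and velocity.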

This procedure differs from earlier pipeline methods in that the class label can change during iteration and the solver output at each step feeds back into the next class decision. Regarding convergence: step (a) is guaranteed to reduce Jloc at each call because Levenberg–Marquardt damping ensures a descent direction [22]. Step (b) evaluates only two candidates, so it either reduces or maintains Jtot. Together, the sequence is non-increasing and bounded below by zero, guaranteeing convergence to a stationary point. Convergence to the global optimum is not guaranteed due to the non-convexity of Jtot; local optima can occur near receiver-baseline geometries or when the class posterior is near 0.5 (see the Convergence behaviour section).
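The alternation between steps (a) and (b) can be illustrated on a toy one-dimensional problem in which the only unknown is the target speed s and the "geometry" is a single noisy speed measurement m. This is a sketch under stated assumptions, not the bistatic solver: the total cost is assumed to be the data term plus the class penalty minus a weighted log-probability, step (a) is solved in closed form instead of by Levenberg–Marquardt, and all weights are hypothetical. The 0.61 static probability and the 2.5 m/s threshold are the values quoted in the text.

```python
import math


def step_a(m, w, cls, lam_s, lam_d, v_min):
    """Closed-form minimiser of w*(s-m)^2 + penalty(s, cls) over speed s."""
    if cls == "static":
        return w * m / (w + lam_s)
    if m >= v_min:
        return m
    return (w * m + lam_d * v_min) / (w + lam_d)


def total_cost(s, m, w, cls, p, lam_s, lam_d, v_min, mu):
    """Assumed J_tot = data term + class penalty - mu * log p(cls)."""
    pen = lam_s * s ** 2 if cls == "static" else lam_d * max(0.0, v_min - s) ** 2
    return w * (s - m) ** 2 + pen - mu * math.log(p[cls])


def joint_loop(m, p, w=1.0, lam_s=1.0, lam_d=1.0, v_min=2.5, mu=0.5,
               eps=1e-6, k_max=10):
    cls = max(p, key=p.get)            # initial class: most probable label
    history = []
    for _ in range(k_max):
        s = step_a(m, w, cls, lam_s, lam_d, v_min)                 # step (a)
        cls = min(p, key=lambda c: total_cost(s, m, w, c, p,       # step (b)
                                              lam_s, lam_d, v_min, mu))
        history.append(total_cost(s, m, w, cls, p, lam_s, lam_d, v_min, mu))
        if len(history) > 1 and abs(history[-2] - history[-1]) < eps:
            break                                                  # step (c)
    return cls, s, history


# Encoder says "static" with probability 0.61, but the measured speed is 3 m/s.
cls_hat, speed_hat, costs = joint_loop(m=3.0, p={"static": 0.61, "dynamic": 0.39})
```

In this toy case the initial static label shrinks the speed estimate, the class comparison in step (b) then switches the label to dynamic, and the next solver step recovers the measured speed. The recorded cost never increases because each step minimises the same total cost over one block of variables.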

Fig 1 illustrates the dual-static geometry and the full processing chain.

Fig 1. System overview.

Dual-static passive radar geometry: transmitter T (red), receivers R1 and R2 (blue, green), target x(t) (orange) with velocity v(t), bistatic range paths, and a representative bistatic ellipse. Processing chain from surveillance-channel input through CAF feature extraction, spatiotemporal encoder, classifier, and geometry-aware joint iterative optimisation to the output class label , position , and velocity .

https://doi.org/10.1371/journal.pone.0350515.g001

Simulation setup

We use a DVB-T signal at 650 MHz as the waveform of opportunity [2]. Table 2 lists the main simulation parameters.

Table 2. Simulation parameters. All values match config.py in the released codebase.

https://doi.org/10.1371/journal.pone.0350515.t002

All SNR values refer to the per-CPI output SNR of the cross-correlator, the standard passive radar definition [1]. The linear SNR is used in Eq (10). The minimum detectable velocity (MDV) is

(14)
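The MDV of 0.288 m/s quoted below is consistent with the common Doppler-resolution form λ/(2NT), where λ is the carrier wavelength and NT is the total observation time. The snippet assumes that form (it is not copied from Eq (14)) and checks it against the quoted figures.

```python
C = 299_792_458.0     # speed of light, m/s
FC = 650e6            # carrier frequency from the text, Hz
N_CPI = 20            # number of coherent processing intervals
T_CPI = 0.040         # CPI duration in seconds (40 ms)

wavelength = C / FC                   # about 0.461 m
t_obs = N_CPI * T_CPI                 # 0.8 s observation window
mdv = wavelength / (2.0 * t_obs)      # assumed Doppler-resolution form
slow_dynamic_upper = 3.0 * mdv        # upper edge of the slow-dynamic regime
```

Rounded to three decimals these evaluate to 0.288 m/s and 0.865 m/s, matching the slow-dynamic speed range stated for the simulator.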

In addition to CRLB-derived Gaussian measurement noise, the simulator includes four realism components that go beyond idealised white Gaussian noise. First, multipath delay perturbations are modelled as a one-sided exponential delay spread with RMS spread 50 ns. Second, correlated Doppler fluctuations across CPIs are generated using an AR(1) process with correlation coefficient , modelling slow non-stationarity of the DVB-T waveform over the 0.8 s observation window [23]. Third, per-CPI receiver timing jitter is added as common-mode oscillator drift with standard deviation 5 ns. Fourth, clutter-induced amplitude modulation is applied as a Rayleigh-fading factor multiplied by a log-normal envelope with  Neper ( dB). To stress the classifier near the MDV, 10% of dynamic training and test samples are generated in a slow-dynamic regime with speeds between the MDV (0.288 m/s) and three times that value (0.865 m/s).
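The correlated Doppler fluctuation component can be sketched as a stationary AR(1) sequence across the 20 CPIs. The correlation coefficient and standard deviation below are hypothetical placeholders, and the function names are illustrative.

```python
import math
import random


def ar1_sequence(n, rho, sigma, rng):
    """Stationary AR(1): x[k] = rho * x[k-1] + sqrt(1 - rho^2) * sigma * e[k],
    with e[k] ~ N(0, 1), so every x[k] has standard deviation sigma."""
    x = [rng.gauss(0.0, sigma)]
    scale = sigma * math.sqrt(1.0 - rho * rho)
    for _ in range(n - 1):
        x.append(rho * x[-1] + scale * rng.gauss(0.0, 1.0))
    return x


def lag1_correlation(x):
    """Sample lag-1 autocorrelation of a sequence."""
    mean = sum(x) / len(x)
    num = sum((a - mean) * (b - mean) for a, b in zip(x, x[1:]))
    den = sum((a - mean) ** 2 for a in x)
    return num / den


RHO = 0.7   # hypothetical correlation coefficient, for illustration only

# One 20-CPI observation window of Doppler perturbations (Hz, illustrative).
doppler_jitter = ar1_sequence(20, RHO, sigma=1.0, rng=random.Random(42))

# A long run is used only to check that the lag-1 correlation is near RHO.
long_run = ar1_sequence(20000, RHO, sigma=1.0, rng=random.Random(0))
```

Scaling the innovation by the square root of 1 − ρ² keeps the marginal variance constant, so the perturbation level does not depend on the chosen correlation.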

Training procedure

We train the encoder on 6,000 labelled simulation samples: 3,600 static and 2,400 dynamic. We use 80% for training (4,800 samples) and 20% for validation (1,200 samples). The test set is separate, with 2,000 samples (1,200 static, 800 dynamic). We use the Adam optimiser with learning rate 1.5 × 10−4 and batch size 32. No weight decay or L2 regularisation is applied. Training stops when validation loss does not improve for 20 consecutive epochs. Dataset splits are generated once with a fixed master seed (SEED = 42) and stored to disk in data/splits.npz; every run reloads the same splits to ensure reproducibility. The test set is never used during training or validation.
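The stopping rule described above (no validation improvement for 20 consecutive epochs) can be written as a small helper. This is a generic patience-based sketch with an illustrative function name, not the released training code.

```python
def early_stopping_epoch(val_losses, patience=20):
    """Return the 1-based epoch at which training stops, or None if the
    loss history ends before `patience` epochs pass without improvement."""
    best = float("inf")
    stale = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, stale = loss, 0      # new best: reset the patience counter
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return None


# Illustrative history: improvement for 3 epochs, then a plateau.
history = [1.0, 0.9, 0.8] + [0.85] * 30
stop_epoch = early_stopping_epoch(history)   # stops 20 epochs after epoch 3
```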

A dataset-size ablation (Supplementary Table S3 in S1 File) confirms that accuracy saturates at approximately 5,000 samples (93.5%), with 6,000 giving 93.7%, suggesting the current size is near-sufficient for this task. Because the encoder provides a class prior rather than a direct localisation output, it requires fewer samples than a full end-to-end localisation network.

Practical hyperparameter setting.

The recommended procedure for deploying the framework in a new system is: (1) set (giving  m/s for the current system), which provides a 0.5 m/s safety margin above the MDV; (2) set and start with , verifying on a small held-out validation set; (3) set as a default, reducing toward 0.2 if the encoder posterior is systematically overconfident (calibration ECE > 0.05). These rules follow directly from the MAP interpretation and sensitivity analysis and do not require exhaustive grid search.
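The calibration check in step (3) relies on the expected calibration error (ECE). A minimal sketch using equal-width confidence bins is given below; the bin count and function name are assumptions, and other binning schemes give slightly different values.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width-bin ECE: sum over bins of
    (fraction of samples in bin) * |bin accuracy - bin mean confidence|."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)   # conf == 1.0 goes in last bin
        bins[idx].append((conf, ok))
    n = len(confidences)
    ece = 0.0
    for members in bins:
        if members:
            mean_conf = sum(c for c, _ in members) / len(members)
            accuracy = sum(1 for _, ok in members if ok) / len(members)
            ece += len(members) / n * abs(accuracy - mean_conf)
    return ece


# Overconfident toy example: 95% confidence but only 3 of 4 correct.
ece_overconfident = expected_calibration_error([0.95] * 4, [True, True, True, False])
```

In this toy example the ECE is 0.2, well above the 0.05 threshold mentioned in the procedure, which would indicate an overconfident posterior.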

Comparison baselines

We compare the proposed method with seven baselines:

  • Geometry-only. Bistatic geometry solver with uniform weights; no encoder, no class penalty.
  • CW-Geo. CRLB-weighted geometry solver (Eq. (10)) without any encoder or class penalty. This isolates the contribution of the CRLB weighting from the learning component.
  • Temporal-only. Encoder and classifier only; no geometry solver (position output unavailable).
  • Non-iterative. Encoder gives a single class label; geometry solver runs once with that class. No feedback loop.
  • PINN-Loc [14]. A physics-informed neural network that incorporates a bistatic delay residual as a penalty term in the training loss. The penalty encourages the predicted target position to remain consistent with the measured bistatic delay geometry. No Doppler physics penalty is used. The model is trained end-to-end and predicts position directly from features.
  • E2E-DL [15]. An end-to-end trainable network that jointly predicts class and position from the feature sequence, without using the bistatic equations as hard constraints. It uses a shared encoder followed by two separate output heads.
  • Trans-Loc. A full transformer encoder (4 layers, d = 64, 4 heads, ∼186k parameters) replacing the dilated-conv encoder, with the same two output heads as E2E-DL. This tests whether a larger transformer architecture closes the gap.
  • Unroll. A two-layer unrolled Levenberg–Marquardt solver following [10], in which the damping parameter is learned per layer. This baseline tests whether learned unrolling alone explains our gains.

In addition, we provide an indicative comparison with an adaptation of the joint tracking-and-classification method of Yu et al. [16] to the single-snapshot bistatic geometry. Because that method is designed for monostatic sequential tracks, the adaptation required non-trivial design choices; results are labelled “Yu-adapted” and accompanied by a footnote noting the adaptation limitations.

PINN-Loc and E2E-DL use the same feature sequence input as the proposed method and are trained for 200 epochs with the same Adam settings. Both baselines were re-tuned via a grid search over learning rate and, for PINN-Loc, physics penalty weight . The best PINN-Loc configuration (, ) matches the originally used settings; results are therefore unchanged.

Results

Classification performance

Fig 2 and Table 3 show the confusion matrix for the proposed method evaluated on the 2,000-sample test set. The overall accuracy is (1130 + 744)/2000 = 93.7%.

Table 3. Confusion matrix for dynamic–static classification. Overall accuracy = 93.7%.

https://doi.org/10.1371/journal.pone.0350515.t003

Fig 2. Confusion matrix for dynamic-static classification.

Overall accuracy = 93.7%, weighted F1 = 0.937, .

https://doi.org/10.1371/journal.pone.0350515.g002

Table 4 shows the per-class F1-scores computed directly from the confusion matrix.

Table 4. Per-class precision, recall, and F1-score.

https://doi.org/10.1371/journal.pone.0350515.t004
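The per-class metrics can be recomputed from the confusion matrix. The sketch below assumes that 1130 is the number of correctly classified static samples (of 1,200) and 744 the number of correctly classified dynamic samples (of 800), so the off-diagonal counts follow by subtraction.

```python
# Rows are true classes, columns are predicted classes. Diagonal counts and
# class sizes come from the text; off-diagonal counts are derived from them.
confusion = {
    "static":  {"static": 1130, "dynamic": 70},
    "dynamic": {"static": 56,   "dynamic": 744},
}


def per_class_metrics(cm, cls):
    """Return (precision, recall, F1) for one class."""
    tp = cm[cls][cls]
    fn = sum(cm[cls].values()) - tp
    fp = sum(cm[other][cls] for other in cm if other != cls)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


total = sum(sum(row.values()) for row in confusion.values())
accuracy = sum(confusion[c][c] for c in confusion) / total

# Support-weighted F1: each class F1 weighted by its share of true samples.
weighted_f1 = sum(
    sum(confusion[c].values()) / total * per_class_metrics(confusion, c)[2]
    for c in confusion
)
```

Under this assignment both the accuracy and the support-weighted F1 round to 0.937, in agreement with the reported values.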

Most errors occur for dynamic targets moving near 3 m/s at −6 dB SNR, where the bistatic Doppler is only approximately 3–8 Hz—close to the noise floor. At −6 dB SNR, the proposed method achieves 93.1% overall accuracy. The residual 6.9% error rate is concentrated in the slow-dynamic sub-class (speed 3.0–3.5 m/s), where the bistatic Doppler shift is indistinguishable from measurement noise given the 40 ms CPI and MDV of 0.288 m/s. This represents an irreducible information-theoretic limitation at the current CPI length. The proposed method reduces but does not eliminate classification errors in this regime, as discussed further in the Limitations section. Table 5 shows classification accuracy broken down by speed sub-class at −6 dB SNR for the proposed method and the non-iterative baseline.

Table 5. Classification accuracy by speed sub-class at −6 dB SNR.

https://doi.org/10.1371/journal.pone.0350515.t005

The iterative method provides the largest gain in the slow-dynamic sub-class (+6.2 pp), confirming that the bidirectional feedback is most beneficial precisely where the class posterior is most uncertain. However, the residual error in the slow-dynamic regime (32.6% misclassification at 3.0–3.5 m/s) remains high, as expected from the information-theoretic argument above.

We also tested SMOTE oversampling and cost-sensitive cross-entropy (class weights inversely proportional to class frequency) to mitigate the 3:2 static-to-dynamic imbalance. SMOTE gave a weighted F1 of 0.938 vs. 0.937 for the baseline; cost-sensitive training gave 0.936. These results are essentially identical, so we retain the simpler unweighted training.

Localisation performance

Table 6 and Fig 3 show the mean position error for all methods across SNR. The Cramér–Rao lower bound (CRLB) is derived directly from the Fisher Information Matrix (FIM) of the joint measurement vector under Gaussian noise assumptions. The FIM is evaluated at a representative mid-field geometry: target at (0, 1000) m, receiver baseline 1.5 km (R1 at (−750, 0) m, R2 at (750, 0) m), transmitter at (−4000, 5000) m. No empirical calibration or scaling is applied.

Table 6. Mean localisation error (km) vs. SNR for all methods and the FIM-derived CRLB. Values are means; standard deviations across five evaluation seeds are typically 0.05–0.10 km and do not change the method ranking. New baselines CW-Geo, Unroll, Trans-Loc, and Yu-adapted are highlighted in blue.

https://doi.org/10.1371/journal.pone.0350515.t006

Fig 3. Mean localisation error vs. SNR.

Geometry-only baseline (red dashed), PINN-Loc (purple dash-dot), E2E-DL (orange dotted), proposed joint method (blue solid), and FIM-derived CRLB (green dotted). Shaded band shows ±95% CI across five evaluation seeds. CRLB is computed at representative mid-field geometry (1.5 km baseline, target at (0, 1 km)). Each point is averaged over 500 Monte Carlo trials per seed.

https://doi.org/10.1371/journal.pone.0350515.g003

The proposed method achieves the lowest localisation error at all SNR levels. Values in Table 6 are means; standard deviations across five evaluation seeds are typically 0.05–0.10 km and do not change the method ranking. At −6 dB, the proposed method achieves 1.15 ± 0.09 km, a 28.1% reduction compared with the geometry-only baseline (1.60 ± 0.11 km). The CW-Geo baseline achieves 1.44 km at −6 dB, confirming that CRLB weighting alone provides a 10.0% improvement; the proposed method provides a further 20.1% reduction, demonstrating the additional benefit of the joint iterative framework beyond weighting. PINN-Loc improves over the geometry-only baseline because the delay residual penalty helps regularise the estimate at training time, but it does not apply the bistatic equations as hard constraints at test time. Trans-Loc achieves 1.38 km at −6 dB, slightly outperforming PINN-Loc but significantly worse than the proposed method, showing that a larger transformer architecture does not substitute for hard geometric constraints at low SNR. The unrolled solver (Unroll) achieves 0.88 km at 0 dB, worse than the proposed method (0.78 km) because it lacks the class-dependent penalty and encoder feedback.
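The percentage reductions quoted above follow from the −6 dB errors by simple relative differences, as the short check below shows. The helper name is illustrative.

```python
def relative_reduction(baseline, improved):
    """Percentage reduction of `improved` relative to `baseline`."""
    return 100.0 * (baseline - improved) / baseline


# Mean localisation errors at -6 dB SNR, in km, as quoted in the text.
geometry_only, cw_geo, proposed = 1.60, 1.44, 1.15

total_gain = relative_reduction(geometry_only, proposed)     # about 28.1 %
weighting_gain = relative_reduction(geometry_only, cw_geo)   # about 10.0 %
joint_gain = relative_reduction(cw_geo, proposed)            # about 20.1 %
```

Note that the 10.0% and 20.1% figures are each relative to a different baseline, so they compound rather than add to give the overall 28.1%.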

Ablation study

Table 7 compares all seven baselines and the proposed method at 0 dB and −6 dB SNR. All values are means across 500 Monte Carlo trials and five independent evaluation seeds.

Table 7. Ablation study at 0 dB and −6 dB SNR. All values are means across 500 Monte Carlo trials and five independent evaluation seeds. N/A indicates the method does not produce the relevant output. Lower panel shows results at −6 dB SNR to evaluate performance in the most challenging regime.

https://doi.org/10.1371/journal.pone.0350515.t007

Three findings emerge from Table 7. First, the temporal-only classifier gives 89.3% accuracy; adding the geometry constraint raises this to 93.7% (+4.4 pp). The CW-Geo baseline reduces localisation error from 1.10 km to 0.97 km relative to uniform-weight geometry, confirming that CRLB weighting alone accounts for an 11.8% improvement; the proposed method achieves a further 19.6% reduction to 0.78 km. Second, the non-iterative combination yields 91.8% accuracy and 0.89 km localisation error. The proposed iterative method improves both metrics, confirming that the iterative feedback is responsible for the gain rather than the combination alone. Third, neither the larger Trans-Loc model (0.95 km) nor the unrolled solver (0.88 km) matches the proposed method (0.78 km) at 0 dB SNR, and the gap widens at −6 dB (1.38 km and 1.42 km vs. 1.15 km). PINN-Loc and E2E-DL both give worse localisation than the proposed method, showing that using the exact bistatic equations as hard constraints is more effective than incorporating them approximately in the training loss.

Convergence behaviour

Fig 4 and Table 8 show how the total cost Jtot changes over iterations for a representative test case.

Table 8. Convergence of Jtot for a representative test case.

https://doi.org/10.1371/journal.pone.0350515.t008

Fig 4. Convergence of Jtot during joint optimisation for a representative test case.

https://doi.org/10.1371/journal.pone.0350515.g004

Across all 2,000 test samples, the mean number of outer iterations to full cost convergence () is 4.1 with standard deviation 0.8. In 94% of cases, the class label becomes stable by iteration 2, although the optimisation may continue for additional iterations before reaching full cost convergence. No divergence is observed in any test case. Convergence is slower (mean 6.8 iterations) in two edge-case regimes: (i) targets near the receiver baseline ( m) where the bistatic Jacobian is near-singular; (ii) targets with speed near  m/s where the class posterior is near 0.5 and the class label oscillates between iterations. In both regimes the algorithm still converges (no divergence observed) because Jtot is non-increasing, but the LM damping parameter becomes large, slowing step (a). In deployment, a tighter tolerance or larger may be advisable for these geometries.

Doppler consistency

Fig 5 shows the measured and predicted bistatic Doppler for both receivers over 10 time steps for a dynamic target. Across all 2,000 test samples, the mean absolute Doppler error (MADE) is 1.18 Hz for R1 and 1.09 Hz for R2; the root-mean-square error (RMSE) is 1.61 Hz (R1) and 1.48 Hz (R2); and the 95th-percentile error is 3.2 Hz (R1) and 2.9 Hz (R2). For the representative example shown in Fig 5, the mean absolute Doppler error is 1.2 Hz for R1 and 1.1 Hz for R2. Near-zero Doppler at R2 in steps 2–3 is a geometric effect: the target velocity is nearly perpendicular to the bistatic bisector of R2 at that moment.

Fig 5. Bistatic Doppler consistency.

Measured (solid) and predicted (dashed) bistatic Doppler for R1 (blue) and R2 (red) over 10 time steps. Population-level statistics across 2,000 test samples: MADE = 1.18/1.09 Hz, RMSE = 1.61/1.48 Hz (R1/R2).

https://doi.org/10.1371/journal.pone.0350515.g005
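The three Doppler error statistics reported above, and the geometric reason for a near-zero Doppler, can be sketched as follows. This is a minimal illustration and not the evaluation code: the sign convention (closing targets give positive Doppler), the coordinates, and the function names are assumptions, and the 650 MHz carrier is taken from the system description. The bistatic Doppler is modelled as the negative rate of change of the bistatic range divided by the wavelength, so it vanishes when the velocity is perpendicular to the sum of the two line-of-sight unit vectors, which points along the bistatic bisector.

```python
import numpy as np

C = 299_792_458.0          # speed of light (m/s)
WAVELENGTH_M = C / 650e6   # about 0.461 m at 650 MHz

def bistatic_doppler_hz(target, velocity, tx, rx, wavelength=WAVELENGTH_M):
    """Bistatic Doppler from the rate of change of |target-tx| + |target-rx|."""
    u_tx = (target - tx) / np.linalg.norm(target - tx)
    u_rx = (target - rx) / np.linalg.norm(target - rx)
    range_rate = velocity @ (u_tx + u_rx)  # m/s, positive when receding
    return -range_rate / wavelength

def doppler_error_stats(measured_hz, predicted_hz):
    """Mean absolute error, RMSE and 95th-percentile absolute error (Hz)."""
    err = np.abs(np.asarray(measured_hz, float) - np.asarray(predicted_hz, float))
    return {
        "made_hz": float(err.mean()),
        "rmse_hz": float(np.sqrt(np.mean(err ** 2))),
        "p95_hz": float(np.percentile(err, 95)),
    }

# Hypothetical geometry (metres) and a 10 m/s velocity perpendicular to the bisector.
tx = np.array([-10_000.0, 0.0])
rx = np.array([750.0, 0.0])
target = np.array([2_000.0, 6_000.0])
b = (target - tx) / np.linalg.norm(target - tx) + (target - rx) / np.linalg.norm(target - rx)
v_perp = 10.0 * np.array([b[1], -b[0]]) / np.linalg.norm(b)
fd_perp = bistatic_doppler_hz(target, v_perp, tx, rx)

# Small synthetic example of the error statistics (values in Hz).
stats = doppler_error_stats([10.0, -5.0, 0.0, 20.0], [11.0, -7.0, 0.0, 17.0])
```

For any error sample the RMSE is at least as large as the mean absolute error, which is consistent with the reported 1.61 Hz versus 1.18 Hz for R1 and 1.48 Hz versus 1.09 Hz for R2.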

Sensitivity analysis of hyperparameters

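The proposed method has four main hyperparameters: the balance weight, the two penalty weights, and the minimum speed threshold. We vary each parameter separately while holding the others at their defaults (marked with † in Tables 9–11; the default minimum speed threshold is 2.5 m/s). All tests use 0 dB SNR and 500 Monte Carlo trials.

Table 9 shows results for the balance weight.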

Table 9. Sensitivity to the balance weight. Default value marked with †.

https://doi.org/10.1371/journal.pone.0350515.t009

When the balance weight is too small (0.1), the encoder has little influence and the method approaches the non-iterative baseline. When it is too large (2.0), the encoder dominates and localisation degrades. Across the range 0.2–1.0, accuracy varies by less than 1.0%, showing robustness to this parameter.

Table 10 shows results for the two penalty weights.

Table 10. Sensitivity to the penalty weights. Default marked with †.

https://doi.org/10.1371/journal.pone.0350515.t010

Table 11 shows results for the minimum speed threshold.

Table 11. Sensitivity to the minimum speed threshold. Default marked with †.

https://doi.org/10.1371/journal.pone.0350515.t011

When the minimum speed threshold is set to 3.0 m/s, which equals the minimum true dynamic speed in the simulation, the penalty forces slow dynamic targets above their true speed, causing a positive velocity bias that hurts both accuracy and localisation. The default value of 2.5 m/s provides a 0.5 m/s safety margin. Overall, varying any hyperparameter by a factor of two from its default changes classification accuracy by less than 1.0% and localisation error at 0 dB by less than 0.06 km.

Generalisation analysis

To assess robustness beyond the training distribution, we conducted three additional experiments.

Cross-SNR generalisation.

The encoder was trained only on SNR  dB and evaluated at all SNR levels including −6 dB. Accuracy at −6 dB degraded from 93.7% to 91.6%, a 2.1 percentage-point reduction, confirming some distribution shift but acceptable robustness for low-SNR deployment.

Baseline length variation.

The trained model (fixed encoder; geometry solver recomputes CRLB weights at each baseline) was tested with receiver baselines of 1.0, 1.5, and 2.0 km. Localisation error at 0 dB changed from 0.86 km (1.0 km baseline), to 0.78 km (1.5 km, default), to 0.73 km (2.0 km), showing that the geometry solver adapts automatically through the CRLB-derived weights. No retraining was needed.

Joint hyperparameter variation.

Supplementary Table S2 in S1 File provides a 5 × 4 grid search over the balance weight and the penalty ratio. The default operating point lies in a stable interior region; no tested combination outperforms the default by more than 0.4 percentage points. The surface is smooth with a single broad maximum, confirming that tuning the two parameters independently produces near-optimal results.

Discussion

Within the simulated dual-static passive radar environment considered in this study, the proposed joint iterative method consistently outperforms all seven evaluated baseline methods in both classification accuracy and localisation error. The improvement is most pronounced at low SNR, where learning the temporal pattern over 20 CPIs gives the solver a more reliable class prior.

The comparison with all seven baselines shows that applying the bistatic equations as hard constraints at test time is more effective than incorporating them only in the training loss, using a larger transformer architecture, or learning the solver step sizes. This result suggests that the physical model should not be replaced by a learned approximation in a well-defined geometry setting. The role of deep learning in our method is to provide a class prior and a compact temporal summary, not to replace the geometry.

The sensitivity analysis confirms that the method is robust: the four hyperparameters can be set by simple physical reasoning (see the practical guide in the Training Procedure section), and results change little within a factor-of-two range. The joint hyperparameter grid (Supplementary Table S2 in S1 File) shows that the default operating point is a stable interior maximum, so independent per-parameter tuning produces near-optimal results without exhaustive search.

Limitations and scope

Simulation only.

This study uses only simulated data. Real DVB-T signals include OFDM guard intervals, pilot tones, and real multipath effects from buildings and terrain that can bias the CAF peak. Specifically: (i) OFDM guard-interval artefacts raise the CAF side-lobe floor, degrading delay resolution by approximately 15 ns; (ii) transceiver asynchronisation introduces a common bias in all delay measurements, which the LM solver partly absorbs through its position update but which a dedicated calibration step would address; (iii) dense multipath in urban environments can produce spurious CAF peaks that our single-peak extraction would misidentify. Suitable public datasets for future validation include the KASSPER dataset and measurement campaigns conducted at University College London [1,6] with DVB-T or DAB signals. We expect the encoder to require retraining on real data, while the geometry solver will not change.

Single-target assumption.

The current method assumes one target per observation window. Extending to multiple targets requires a data association module and a more complex joint cost; methods such as the global nearest-neighbour (GNN) filter or joint probabilistic data association (JPDA) would be needed.

Baseline comparison scope.

The paper compares with seven baselines, including CRLB-weighted geometry, an unrolled solver, a transformer model, and an indicative adaptation of Yu et al. [16]. A fully faithful re-implementation of [16] in the bistatic single-snapshot setting was not feasible without significant design choices that we cannot guarantee are faithful to the original method; the indicative result is therefore labelled accordingly. The classification method of Ritchie et al. [6,24] was not included as a direct numerical baseline because that work addresses a different task (micro-Doppler feature extraction for drone type classification) using a multistatic geometry that is structurally different from the dual-static geometry considered here. Ritchie et al. do not estimate target position; their output is a classification label only, so there is no localisation metric to compare against. Adapting the method to produce a comparable localisation output would require design choices beyond the scope of this paper; it is cited in Related Work as an example of passive radar classification methodology.

Slow-target regime.

At −6 dB SNR with target speed 3.0–3.5 m/s, the class posterior is near 0.5 and the residual misclassification rate is approximately 6.9%. This is an irreducible information-theoretic limitation at the current CPI length of 40 ms and cannot be fully resolved by the proposed framework.
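A back-of-envelope calculation helps to see why this speed range is difficult. The sketch below uses two standard approximations that are not taken from the paper: the Doppler resolution of a single CPI is roughly the reciprocal of its duration, and the bistatic Doppler magnitude is bounded above by 2v/λ (reached only in the monostatic-like limit with the velocity along the bisector). The 650 MHz carrier and the 40 ms CPI come from the text; the function names are illustrative.

```python
C = 299_792_458.0  # speed of light (m/s)

def wavelength_m(carrier_hz: float) -> float:
    """Carrier wavelength in metres."""
    return C / carrier_hz

def doppler_resolution_hz(cpi_s: float) -> float:
    """Approximate Doppler resolution of one CPI (reciprocal of its duration)."""
    return 1.0 / cpi_s

def max_bistatic_doppler_hz(speed_mps: float, lam_m: float) -> float:
    """Upper bound on |Doppler| for a target moving at speed_mps."""
    return 2.0 * speed_mps / lam_m

lam = wavelength_m(650e6)
resolution = doppler_resolution_hz(40e-3)
bound_slow = max_bistatic_doppler_hz(3.0, lam)
bound_fast = max_bistatic_doppler_hz(3.5, lam)

print(f"lambda = {lam:.3f} m, resolution = {resolution:.1f} Hz")
print(f"|fD| <= {bound_slow:.1f} Hz at 3.0 m/s, {bound_fast:.1f} Hz at 3.5 m/s")
```

Under these assumptions, even the largest possible Doppler shift of a 3.0–3.5 m/s target stays below one single-CPI Doppler resolution cell, so a slow dynamic target is hard to separate from a static one within one CPI.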

Class imbalance.

The test set has a 3:2 static-to-dynamic ratio. We use weighted F1-score to account for this imbalance. The method should be evaluated under more extreme imbalance conditions in future work.
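For reference, the weighted F1-score averages the per-class F1 values with weights proportional to each class's share of the true labels. The sketch below is a plain-Python illustration of that definition on a small synthetic example with the same 3:2 ratio; the label strings and the prediction pattern are invented for the example and are not results from this study.

```python
def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores."""
    total = len(y_true)
    score = 0.0
    for c in sorted(set(y_true)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        support = tp + fn  # number of true samples of class c
        score += (support / total) * f1
    return score

# Synthetic example: 6 static and 4 dynamic targets (3:2 ratio).
y_true = ["static"] * 6 + ["dynamic"] * 4
y_pred = ["static"] * 5 + ["dynamic"] + ["dynamic"] * 3 + ["static"]

score = weighted_f1(y_true, y_pred)
print(f"weighted F1 = {score:.3f}")
```

In this example the static class has F1 = 10/12 and the dynamic class has F1 = 6/8, so the weighted score is 0.6 × 0.833 + 0.4 × 0.75 = 0.8.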

Conclusion

We proposed a joint spatiotemporal–geometry framework for target classification and localisation in dual-static passive radar. The core idea is an iterative optimisation loop that connects a deep learning encoder with a bistatic geometry solver. In each iteration, the encoder provides a class probability that modifies the velocity penalty in the solver; the solver in turn returns an updated position and velocity that informs the next class decision. This bidirectional feedback distinguishes the proposed method from all earlier pipeline and end-to-end approaches.

The key advantage over end-to-end approaches such as PINN-Loc and E2E-DL is that the exact bistatic delay and Doppler equations are enforced as hard constraints at every test-time iteration. At −6 dB SNR, the proposed method reduces localisation error by 28.1% compared with a geometry-only baseline and outperforms PINN-Loc by 18.4%.

Within the simulated dual-static passive radar environment considered in this study, and across 500 Monte Carlo trials per SNR point averaged over five independent evaluation seeds, the proposed method achieves 93.7 ± 0.8% classification accuracy with weighted F1-score 0.937 ± 0.007. The iterative optimisation converges in a mean of 4.1 ± 0.8 outer iterations. A sensitivity analysis confirms that results change by less than 1.0% in accuracy and less than 0.06 km in localisation error when any single hyperparameter is varied by a factor of two. Generalisation tests confirm acceptable robustness to cross-SNR distribution shift and baseline variation. Comparison with seven baselines, including CRLB-weighted geometry, an unrolled solver, and a full transformer model, confirms that the proposed framework provides gains beyond any single component in isolation.

In future work, we plan to (i) validate on real DVB-T passive radar data, (ii) extend the framework to multiple targets with a data association module, (iii) integrate the class-aware solver with an extended Kalman filter for continuous tracking, and (iv) evaluate the framework on manoeuvring targets with non-constant velocity, where a constant-velocity approximation within each CPI may introduce modelling error. A natural extension is to add a motion-model classification head (constant velocity vs. constant turn) to the encoder, enabling the velocity penalty to be conditioned on both target class and motion model.

Supporting information

S1 File. S1 Table. Joint hyperparameter grid search and dataset size ablation.

Table S2 provides results of a 5 × 4 joint grid search over balance weight and penalty ratio evaluated at SNR = 0 dB with 500 Monte Carlo trials. Table S3 provides validation accuracy of the spatiotemporal encoder as a function of training set size with the static-to-dynamic ratio fixed at 3:2.

https://doi.org/10.1371/journal.pone.0350515.s001

(DOCX)

References

  1. Griffiths HD, Baker CJ. Passive coherent location radar systems. Part 1: Performance prediction. IEE Proc, Radar Sonar Navig. 2005;152(3):153–9.
  2. Howland PE, Maksimiuk D, Reitsma G. FM radio based bistatic radar. IEE Proc, Radar Sonar Navig. 2005;152(3):107–15.
  3. Chetty K, Smith GE, Woodbridge K. Through-the-Wall Sensing of Personnel Using Passive Bistatic WiFi Radar at Standoff Distances. IEEE Trans Geosci Remote Sensing. 2012;50(4):1218–26.
  4. Malanowski M, Kulpa K. Two Methods for Target Localization in Multistatic Passive Radar. IEEE Trans Aerosp Electron Syst. 2012;48(1):572–80.
  5. Ho KC, Xu W. An Accurate Algebraic Solution for Moving Source Location Using TDOA and FDOA Measurements. IEEE Trans Signal Process. 2004;52(9):2453–63.
  6. Ritchie M, Fioranelli F, Borrion H, Griffiths H. Multistatic micro‐Doppler radar feature extraction for classification of unloaded/loaded micro‐drones. IET Radar Sonar & Navi. 2017;11(1):116–24.
  7. Gurbuz SZ, Amin MG. Radar-Based Human-Motion Recognition With Deep Learning: Promising Applications for Indoor Monitoring. IEEE Signal Process Mag. 2019;36(4):16–28.
  8. Kim Y, Moon T. Human Detection and Activity Classification Based on Micro-Doppler Signatures Using Deep Convolutional Neural Networks. IEEE Geosci Remote Sensing Lett. 2016;13(1):8–12.
  9. Clemente C, Balleri A, Woodbridge K, Soraghan JJ. Developments in target micro-Doppler signatures analysis. EURASIP J Adv Signal Process. 2013;2013:47.
  10. Monga V, Li Y, Eldar YC. Algorithm Unrolling: Interpretable, Efficient Deep Learning for Signal and Image Processing. IEEE Signal Process Mag. 2021;38(2):18–44.
  11. Huan S, Wu L, Zhang M, Wang Z, Yang C. Radar Human Activity Recognition with an Attention-Based Deep Learning Network. Sensors (Basel). 2023;23(6):3185. pmid:36991896
  12. Geng Z, Yan H, Zhang J, Zhu D. Deep-Learning for Radar: A Survey. IEEE Access. 2021;9:141800–18.
  13. Zhang Z, Xiaojing M, Zheng Y, Peng W, Jiaxiang Z. Enhanced range doppler mapping algorithm for passive GNSS based radar aerial target detection. Sci Rep. 2025;15(1):41893. pmid:41290811
  14. Zhang Z, Huang Z, Wang C, Jiang Q. Reconstructing Spatial Localization Error Maps via Physics-Informed Tensor Completion for Passive Sensor Systems. Sensors (Basel). 2026;26(2):597. pmid:41600393
  15. Wang Y, Li T, Chen S. End-to-end deep learning for joint target detection and localisation in monostatic radar. IEEE Transactions on Aerospace and Electronic Systems. 2023;59(3):2587–99.
  16. Yu W, Yu H, Du J, Zhang M, Wang D. A deep learning algorithm for joint direct tracking and classification of manoeuvring sources. IET Radar Sonar & Navi. 2022;16(7):1198–211.
  17. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, et al. Attention is all you need. In: Proc NeurIPS: Long Beach, CA. 2017.
  18. Papageorgiou G, Sellathurai M, Eldar Y. Deep Networks for Direction-of-Arrival Estimation in Low SNR. IEEE Trans Signal Process. 2021;69:3714–29.
  19. Temiz A, Chetty K, Inggs MR. Improved target localisation in multi-waveform, multi-band hybrid multistatic radar networks. Signal Process. 2022;190:108318.
  20. Colone F, Filippini F, Pastina D. Passive Radar: Past, Present, and Future Challenges. IEEE Aerosp Electron Syst Mag. 2023;38(1):54–69.
  21. Yu F, Koltun V. Multi-scale context aggregation by dilated convolutions. In: Proc ICLR: San Juan, PR, 2016. arXiv:1511.07122
  22. Moré JJ. The Levenberg-Marquardt algorithm: Implementation and theory. Lecture Notes in Mathematics. Springer Berlin Heidelberg. 1978. 105–16. https://doi.org/10.1007/bfb0067700
  23. Vishwakarma S, Li W, Tang C, Woodbridge K, Adve R, Chetty K. SimHumalator: An Open-Source End-to-End Radar Simulator for Human Activity Recognition. IEEE Aerosp Electron Syst Mag. 2022;37(3):6–22.
  24. Zhou X, Li G, Varshney P. Micro-Doppler classification for ground moving targets with data augmentation using a generative adversarial network. IET Radar Sonar Navig. 2022;16(3):552–64.