Peer Review History

Original SubmissionAugust 5, 2022
Decision Letter - Sathishkumar V E, Editor

PONE-D-22-21523SGAT: Shuffle and Graph Attention based Siamese Networks for Visual TrackingPLOS ONE

Dear Dr. Wang,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Oct 29 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Sathishkumar V E

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at 

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and 

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Please note that PLOS ONE has specific guidelines on code sharing for submissions in which author-generated code underpins the findings in the manuscript. In these cases, all author-generated code must be made available without restrictions upon publication of the work. Please review our guidelines at https://journals.plos.org/plosone/s/materials-and-software-sharing#loc-sharing-code and ensure that your code is shared in a way that follows best practice and facilitates reproducibility and reuse.

[Note: HTML markup is below. Please do not edit.]

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: In this paper, the authors proposed a tacking algorithm for visual tracking using shuffle and graph matching attention mechanism. Correlations between the spatial and channel-wise information to highlight the target region is explored. The results are compared with benchmark datasets. The paper is well written. The interpretation and description of the experimental results are also explained clearly. However, this manuscript have some weak points, it should be further improved before consider for publication. Some of my observations are

1. Abstract is very general. It has to be elaborated with characteristics of the results obtained.

2. No need to give specification for cl and reg in figure 1. It has to be explained through literature.

3. In table 1 also expansion for GM, SA are not needed. Expansions need to be given during first refereed place.

4. In table 1, what Success represents? What about accuracy?

5. Figure 3, more explanation is needed like the reason for SGAT algorithm’s best performance compared to all other methods

6. Figure 4 is not clear

7. Why the authors choose the threshold values 0.5 and 0.75. Justification is required

8. The need for evaluation based on AO is required

9. What is the need for representing success rate as SR? Uniformity in representing evaluation measures are not followed.

10. Table 3 need to be elaborated. It is mentioned thatUAV123 is used for evaluation. Why other state-of-the-art datasets are not used for evaluation?

11. In some places, success is represented as AUC. Need to maintain uniformity.

12. Justification is required for fig 9.

13. Recent references need to be included.

Reviewer #2: In this paper, the authors propose a shuffle attention based Siamese tracker. The idea makes sense and the paper is easy to follow. Extensive experimental results demonstrate that the proposed method achieves good performance. However, there are several problems and questions of the paper should be solved.

(1) In the abstract, the authors should introduce the core idea of the proposed method and point out the advantages of the method.

(2) The motivation is not clearly in the introduction. What's the problem of this paper solved?

(3) Why the shuffle attention is better than the original channel and spatial attentions in tracking ?

(4) There are several Siamese trackers and attentions should be discussed to enrich the related work, such as Learning dual-margin model for visual tracking, Learning Deep Multi-Level Similarity for Thermal Infrared Object Tracking, Hierarchical spatial-aware siamese network for thermal infrared object tracking, and Learning dual-level deep representation for thermal infrared tracking.

(5) What's the GM in Fig.1. There are missing the introduction about this.

(6) How to divide the group in the shuffle attention module? The authors should explain the reason and conduct an ablation study.

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

Revision 1

Paper No.: No.: PONE-D-22-21523

Title: SGAT: Shuffle and Graph Attention based Siamese Networks for Visual

Tracking

Authors: Jun Wang, Limin Zhang, Wenshuang Zhang, Yuanyun Wang*, Chengzhi

Deng

Dear Editor:

We would like to thank the reviewers and you for your great efforts in helping us to

improve the quality of the paper. After carefully considering the reviewers’ comments

and suggestions, we have significantly revised the paper with more details and

descriptions.

A detailed summary of the revisions and some specific comments/responses are given

in the following.

In short, we feel that we have addressed all crucial concerns of the reviewers.

However,if you have any questions or further requirements, please do not hesitate to

contact us.

Best regards,

Yuanyun Wang

October, 3, 2022

A Response and Summary of the Revisions: No.: PONE-D-22-21523

Authors’ Responses:

*****************

Associate Editor

*****************

(1)Please ensure that your manuscript meets PLOS ONE's style requirements,

including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main

_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_a

uthors_affiliations.pdf

RESPONSE: Thanks. We have carefully read the guidelines and style templates.

Then, we have modified the manuscript throughout the paper according to the

templates.

(2) Please note that PLOS ONE has specific guidelines on code sharing for

submissions in which author-generated code underpins the findings in the manuscript.

In these cases, all author-generated code must be made available without restrictions

upon publication of the work. Please review our guidelines at

https://journals.plos.org/plosone/s/materials-and-software-sharing#loc-sharing-code a

nd ensure that your code is shared in a way that follows best practice and facilitates

reproducibility and reuse.

RESPONSE: Thanks for pointing this out. We will provide the appropriate code in

three months.

*****************

Reviewer #1

*****************

(1) Abstract is very general. It has to be elaborated with characteristics of the results

obtained.

RESPONSE: Thanks for the comments. We have re-organized the abstract in the

revised manuscript with more details according to the results obtained.

...In this paper, we propose a novel tracking algorithm for feature extraction of target

templates and search region images through convolutional neural networks and

shuffle attention, and computes the similarity between the template and a search

region through a graph attention matching. ...

…Extensive experiments demonstrate that the proposed tracking algorithm achieves

excellent tracking results on multiple challenging benchmarks. Compared with other

state-of-the-art methods, the proposed tracking algorithm achieves excellent tracking

performance. …

(2) No need to give specification for cl and reg in figure 1. It has to be explained

through literature.

RESPONSE: Thanks for the valuable suggestion. We removed the branches cls and

reg in the revised manuscript as shown in Fig 1. We have explained both the

classification and regression branches in Section 2, i.e., related works. In addition, this

prediction head is the usual way of mainstream trackers, including SiamRPN[12] and

SiamCAR[16], etc.

SiamRPN has the classification (cl) and regression (reg) branches. The classification

branch distinguishes the target from the surrounding background. And the regression

branch refines the target location.

(3) In table 1 also expansion for GM, SA are not needed. Expansions need to be given

during first refereed place.

RESPONSE: Thanks for the comments. We give the expansions of GM and SA

during first refereed place.

In Table 1, we demonstrate the effectiveness of GM by ablation experiments. For

more details, GM and SA are explained in Page 4 at the first refereed place, please see

Section 3 and Section 4.2. In addition, we fixed the GM constant and validated the

effectiveness of the SA module.

…2) the shuffle attention mechanism model (SA Unit), which reconstructs the basis

features to focus on the target region and suppress the background interference

through the spatial and channel-wise transformation; 3) graph attention matching

(GM), which computes the similarity between the target template and a search region,

and joints classification and regression branches to locate the target position in the

current frame. …

(4) In table 1, what Success represents? What about accuracy?

RESPONSE: Thanks for pointing this out. In the evaluation metric, we explain the

relevant definitions of success and accuracy. Among them, precision and accuracy

denote the same concept. For more details, please see the Section 4.1.

…The precision is evaluated by the center location error (CLE) between the predicted

location and the ground truth location. The precision plots are drawn in according to

the frame percentages of CLE under the specified thresholds. Besides, the success rate

is defined as the intersection over union (IoU) between the predicted bounding boxes

and the ground truth. Meanwhile, when the IoU exceeds a certain threshold, it is

considered to track the target accurately, and the success plot is drawn by the frame

percentage. …

(5) Figure 3, more explanation is needed like the reason for SGAT algorithm’s best

performance compared to all other methods.

RESPONSE: Thanks for the suggestions. We have added some sentences for analysis

of superior performance in Fig 3, in the revised manuscript. For more details, please

see the Section 4.3.

…since similarity learning based on graph matching effectively exploits the structured

information, the SGAT algorithm achieves the best results in the success rate and

different attributes. …

(6) Figure 4 is not clear.

RESPONSE: Thanks for pointing this out. We have drawn the figure again. And we

have added some corresponding descriptions for Figure 4. For more details, please see

the Fig 4 and Section 4.3.

…Here, the x-axis represents the 10th power of the tracking speed and the y-axis

represents the success rate. For example, when x is taken as 2, the tracking speed is

200 frames per second. …

(7) Why the authors choose the threshold values 0.5 and 0.75. Justification is

required.

RESPONSE: Thanks. In the paper associated with the GOT-10k dataset, the authors

explicitly state the use of thresholds of 0.5 and 0.75, and subsequent mainstream

trackers use this threshold for fair comparisons. The provision that all target trackers

use the same training and testing sets provided by the dataset ensures a fair

comparison of all trackers. GOT-10K training and testing sets are non-overlapping.

After uploading the tracking results to the GOT-10K official website, the website

automatically analyzes the tracking results. The assessment metrics provided include

mean overlap rate (AO) and success rate (SR). SR0.5 indicates the rate of successful

tracking frames with an overlap of more than 0.5, while SR0.75 indicates the rate of

successful tracking frames with an overlap of more than 0.75. For more details, please

see Section 4.3 in Page 9.

(8) The need for evaluation based on AO is required.

RESPONSE: Thanks for pointing this out.

We performed AO-based evaluation of the trackers on the GOT-10k dataset in Table

2 in Section 4.3. Please see the corresponding explain and analysis in Section 4.3

AO represents the average overlap between all estimated bounding boxes and

ground-truth boxes. By using AO as an evaluation index, we can further evaluate the

tracking performance of our tracker. Meantime, we used success rate for evaluation

on OTB2015, LaSOT and UAV123 datasets, which is IoU-based method.

(9) What is the need for representing success rate as SR? Uniformity in representing

evaluation measures are not followed.

RESPONSE: Thanks for the suggestions. We have revised all SR to success rate

throughout the revised manuscript.

(10) Table 3 need to be elaborated. It is mentioned that UAV123 is used for

evaluation. Why other state-of-the-art datasets are not used for evaluation?

RESPONSE: Thanks for pointing this out.

In UAV test set, the main challenge factors are occlusion and small targets, and most

images have low resolution attributes. To the best of our knowledge, the

state-of-the-art trackers are usually compared on the UAV dataset. Additionally, we

also evaluate our tracker in OTB-100, GOT-10k and LaSOT. Extensive experimental

results demonstrate that the proposed tracker has excellent performance on multiple

benchmarks including OTB-100, GOT-10k, UAV123 and LaSOT, and outperforms

many SOTA trackers. For more details, please see the Section 4.3.

(11) In some places, success is represented as AUC. Need to maintain uniformity.

RESPONSE: Thanks for suggestions. We revised the manuscript to maintain a

uniform as success rate.

(12) Justification is required for fig 9.

RESPONSE: Thanks for suggestions.

Figure 9 shows the inability of our tracker to perform accurate localization in some

complex scenarios, and we discuss it in more detail in Limitations.

... As shown in Fig 9, in complex tracking environment, trackers may occur tracking

drift and tracking failure. In some extreme scenarios, the SGAT cannot complete the

target tracking task well. For example, after the 106th frame in the sequence soccer

and the 143rd frame in the sequence bird1, when there are many similar target

interferences and long-term occlusion in the scene, the SGAT will lose the target and

lead to tracking failure, which is also an urgent problem faced by the existing trackers.

Next, we will focus on the following two aspects: 1) modeling the target using

spatial-temporal context information to ensure that the target can be located when

occlusion occurs; 2) Adding a learnable memory unit to alleviate the problem that the

target often disappears in long-term tracking. …

(13) Recent references need to be included.

RESPONSE: Thanks for comments. We added and discussed some key references in

the revised manuscript, for example, Ref. [25], [26], [55], [56].

…In recent years, trackers based on Siamese network attract incremental attention for

their leading performance [25-28]. …

…Fan et al. [55] propose a dual-margin model for accuracy and robust visual tracking,

which formulated the target state prediction problem as a dual-margin model

including an intra-object margin and an inter-object margin. Li et al. [56] propose a

thermal infrared tracker based on a hierarchical spatially-aware twin network that

regards the infrared tracking problem as a similarity verification task. …

[25] Hui, Le, et al. 3D Siamese Transformer Network for Single Object Tracking on

Point Clouds. arXiv preprint arXiv:2207.11995. 2022.

[26] Tang F, Ling Q. Ranking-Based Siamese Visual Tracking. IEEE Conference

on Computer Vision and Pattern Recognition. 2022: 8741-8750.

[55] Fan N, Li X, Zhou Z, Liu Q, He Z. Learning dual-margin model for visual

tracking. Neural Networks. 2021: 344-354.

[56] Li X, Liu Q, Fan N, He Z, Wang H. Hierarchical Spatial-aware Siamese Network

for Thermal Infrared Object Tracking. Knowledge-Based Systems. 2019: 71-81.

*****************

Reviewer #2

*****************

(1) In the abstract, the authors should introduce the core idea of the proposed method

and point out the advantages of the method.

RESPONSE: Thanks for the comments.

We added more details in the abstract, including the core ideas of the paper and the

advantages of the proposed approach.

...In this paper, we propose a novel tracking algorithm for feature extraction of target

templates and search region images. Based on convolutional neural networks and

shuffle attention, the tracking algorithm computes the similarity between the template

and a search region through a graph attention matching. The proposed tracking

algorithm exploits the correlations between the spatial and channel-wise information

to highlight the target region. Moreover, the graph matching can greatly alleviate the

influences of appearance variations such as partial occlusions. …

(2) The motivation is not clearly in the introduction. What's the problem of this paper

solved?

RESPONSE: Thanks for pointing this out. We further explain the motivation of

design our tracking algorithm and describe the problem solved.

a. Most Siamese-based trackers use the features of the last convolution layer or

cascaded multi-layers as the target representations of the template and the search

region, which do not effectively use the structured and part-level information. To

address this problem, we propose to combine the advantages of CNN and shuffle

attention for feature representation of target templates and search region images.

b. Both the cross-correlation and depth cross-correlation take the template features as

a whole for linear matching on the search regions, so that the adjacent sliding

windows produce a similar response. To solve this problem, we introduced graph

matching approach for similarity learning to mine more structured information.

Our tracking algorithm effectively uses the structured and part-level information and

exploits structured and part-level information, which greatly alleviate the influences

of appearance variations such as fast motion and partial occlusions.

For more details, please see the Section 1.

(3) Why the shuffle attention is better than the original channel and spatial attentions

in tracking?

RESPONSE: Thanks.

The original spatial and channel attention does not take full advantage of the

correlational attention between space and channel, making it less efficient, e.g.,

CBAM. The shuffle attention by dividing into different blocks along the channel is a

lighter and more efficient way of integrating spatial and channel attention.

(4) There are several Siamese trackers and attentions should be discussed to enrich the

related work, such as Learning dual-margin model for visual tracking, Learning Deep

Multi-Level Similarity for Thermal Infrared Object Tracking, Hierarchical

spatial-aware siamese network for thermal infrared object tracking, and learning

dual-level deep representation for thermal infrared tracking.

RESPONSE: Thanks. We have updated the literature and further enriched the related

work as suggested.

…In recent years, trackers based on Siamese network attract incremental attention for

their leading performance [25-28]. …

…Fan et al. [55] propose a dual-margin model for accuracy and robust visual tracking,

which formulated the target state prediction problem as a dual-margin model

including an intra-object margin and an inter-object margin. Li et al. [56] propose a

thermal infrared tracker based on a hierarchical spatially-aware twin network that

regards the infrared tracking problem as a similarity verification task. …

[27] Fan N, Li X, Zhou Z, et al. Learning dual-margin model for visual tracking.

Neural Networks. 2021: 344-354.

[28] Liu Q, Li X, He Z, et al. Learning deep multi-level similarity for thermal infrared

object tracking. IEEE Transactions on Multimedia. 2020: 2114-2126.

[55] Fan N, Li X, Zhou Z, Liu Q, He Z. Learning dual-margin model for visual

tracking. Neural Networks. 2021: 344-354.

[56] Li X, Liu Q, Fan N, He Z, Wang H. Hierarchical Spatial-aware Siamese Network

for Thermal Infrared Object Tracking. Knowledge-Based Systems. 2019: 71-81.

(5) What's the GM in Fig 1. There are missing the introduction about this.

RESPONSE: Thanks for pointing this out. We describe the details of the GM in

section 3.3.

…we learn a graph attention matching (GM) based similarity measuring instead of

cross-correlation. By decomposing the target template and search region features into

multiple grids, and then computing the similarity of different template and search

region grids, which greatly alleviate the challenging of pose variations of target. …

... we assume 1 × 1 × C grid of the feature map as a node. For node i on the template

and node j in the search region, the correlation scores are. …

(6) How to divide the group in the shuffle attention module? The authors should

explain the reason and conduct an ablation study.

RESPONSE: Thanks for the comments. We have added some sentences to descript

how to divide the group in the shuffle attention module.

…As shown in Figure 2, the designed deep model effectively exploit the correlations

between the spatial and channel-wise to highlight the target region without extra

overhead. …

…Channel transformation focuses on 'what' is important in an input image. The

typical channel attention is SE block, which can effectively capture the correlation

between channels. However, SE blocks usually increase the number of parameters of

the model, which is not accord with the principle of lightweight design in tracking

tasks. To generate channel weights efficiently, the spatial dimension of an input

feature map is usually compressed, and adopt average-pooling to integrate spatial

information. Based on prior information, we adopt a novel channel transformation

method that resizes the channel-wise block through global average pooling. The

channel-wise block is obtained as follows. …

…As a supplement to channel-wise transformation, spatial transformation aims to

locate 'where' is an important region. To effectively carry out spatial transformation,

the max-pooling and average-pooling are usually used to deal with input feature along

channel dimension. In this paper, the specific implementation steps are as follows:

firstly, group normalization (GN) is used to preprocess the spatial features. Then,

linear transformation and activation function are combined to enhance the ability of

feature representation and suppress the interference of background region. The

transformed spatial features are as follows. …

…In shuffle attention module, we divide the basis features into multiple sub-features

along the channel dimensions. The shuffle unit reconstructs each sub-feature by

spatial and channel-wise transformations. Finally, the sub-features are combined by

using the dependence along channel dimensions. …For more details, please see the

Section 3.2 and 4.2.

Attachments
Attachment
Submitted filename: AuthorsReply_Revised.pdf
Decision Letter - Sathishkumar V E, Editor

PONE-D-22-21523R1SGAT: Shuffle and Graph Attention based Siamese Networks for Visual TrackingPLOS ONE

Dear Dr. Wang,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

 Please include a reference to your own previously published work, https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0273690 and clarify the motivations for this manuscript, in light of this previous work.

Please submit your revised manuscript by Nov 25 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.
If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Hanna Landenmark

Staff Editor, PLOS ONE

on behalf of 

Sathishkumar V E

Academic Editor, PLOS ONE

Journal Requirements:

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

Additional Editor Comments (if provided):

[Note: HTML markup is below. Please do not edit.]

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: Authors addressed all the comments specified by the reviewer. So, the paper can be accepted at this stage

Reviewer #2: The response solves my doubts.

There are two same references [27] and [55] and there is missing a related reference 'Learning dual-level deep representation for thermal infrared tracking'.

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

Revision 2

Paper No.: No.: PONE-D-22-21523R1

Title: SGAT: Shuffle and Graph Attention based Siamese Networks for Visual

Tracking

Authors: Jun Wang, Limin Zhang, Wenshuang Zhang, Yuanyun Wang, Chengzhi

Deng

Dear Editor:

We would like to thank the reviewers and you for your great efforts in helping us to

improve the quality of the paper. After carefully considering the reviewers’ comments

and suggestions, we have revised the paper with some details and references.

A detailed summary of the revisions and some specific comments/responses are given

in the following.

In short, we feel that we have addressed all crucial concerns of the reviewers.

However,if you have any questions or further requirements, please do not hesitate to

contact us.

Best regards,

Yuanyun Wang

October,13, 2022

Response and Summary of the Revisions: No.: PONE-D-22-21523R1

*****************

Associate Editor

*****************

(1) Please include a reference to your own previously published work,

https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0273690 and clarify

the motivations for this manuscript, in light of this previous work.

RESPONSE: Thanks. We included the precious work [22] in Reference.

[22] Yuanyun W, Wenshuang Z, Limin Z, Jun W. Siamese network with a depthwise

over-parameterized convolutional layer for visual tracking. PLOS ONE. 2022;1-21.

And, we added some sentences to clarity the motivation in this manuscript and the

difference between this and the previous work.

Different from the previous work [22], we design a novel feature extraction network

based on GoogleNet to exploit correlations of the spatial and channel-wise

information. Additionally, in order to alleviate the influences of appearance variations,

we use a different similarity computing to obtain more accurate score maps. Inspired

by above-mentioned works, in this paper, we propose a novel tracking algorithm

based on shuffle attention mechanism and graph matching in Siamese network. The

shuffle attention mechanism in the backbone network reconstructs the basic features

extracted from CNN, and makes the feature representation focusing on the regions of

interest through spatial and channel-wise transformations. Different from the

cross-correlation based similarity learning, the part-to-part graph attention matching

further improves the tracking robustness in complex scenes, such as occlusion.

*****************

Reviewer #1

*****************

(1) Authors addressed all the comments specified by the reviewer. So, the paper can

be accepted at this stage.

RESPONSE: Thanks for your comment.

*****************

Reviewer #2

*****************

(1) There are two same references [27] and [55] and there is missing a related

reference 'Learning dual-level deep representation for thermal infrared tracking'.

RESPONSE: Thanks for pointing this out. We removed the same reference, and have

included the key reference [3] in Introduction, Page 1. Some details are as follows:

[3] Liu Q, Yuan D, Fan N, et al. Learning dual-level deep representation for thermal

infrared tracking. IEEE Transactions on Multimedia. 2022;1-8.

Visual tracking [1-3] is a fundamental research topic in computer vision. It aims to

estimate target states in subsequent frames by given the initial state in the first frame.

It is widely used in various applications, such as video surveillance [4],

human-computer interaction [5], augmented reality [6], and so on. Recently,

Convolutional Neural Network (CNN) is successfully used in visual tracking. Deep

trackers [7,8] achieve robust tracking performance and real-time tracking speed.

However, due to complicated appearance variations, visual tracking is still a

challenging task.

Attachments
Attachment
Submitted filename: AuthorsReply_Revised.pdf
Decision Letter - Sathishkumar V E, Editor

SGAT: Shuffle and Graph Attention based Siamese Networks for Visual Tracking

PONE-D-22-21523R2

Dear Dr. Wang,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Sathishkumar V E

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: All the comments have been addressed properly by the authors. The paper can be accepted at this stage.

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

**********

Formally Accepted
Acceptance Letter - Sathishkumar V E, Editor

PONE-D-22-21523R2

SGAT: Shuffle and Graph Attention based Siamese Networks for Visual Tracking

Dear Dr. Wang:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Sathishkumar V E

Academic Editor

PLOS ONE

Open letter on the publication of peer review reports

PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.

We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.

Learn more at ASAPbio .