Peer Review History

Original Submission October 7, 2019
Decision Letter - Jordi Paniagua, Editor

PONE-D-19-28022

Migrant mobility flows characterized with digital data

PLOS ONE

Dear Mr. MAZZOLI,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Both reviewers consider that your manuscript has merits to be considered for publication. However, they also observe some minor issues that should be addressed. Please respond to all reviewers' comments.

We would appreciate receiving your revised manuscript by Jan 25 2020 11:59PM. When you are ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter.

To enhance the reproducibility of your results, we recommend that if applicable you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). This letter should be uploaded as separate file and labeled 'Response to Reviewers'.
  • A marked-up copy of your manuscript that highlights changes made to the original version. This file should be uploaded as separate file and labeled 'Revised Manuscript with Track Changes'.
  • An unmarked version of your revised paper without tracked changes. This file should be uploaded as separate file and labeled 'Manuscript'.

Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

We look forward to receiving your revised manuscript.

Kind regards,

Jordi Paniagua

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

http://www.journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and http://www.journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2.  Please remove your figures from within your manuscript file, leaving only the individual TIFF/EPS image files, uploaded separately.  These will be automatically included in the reviewers’ PDF.

3. We suggest you thoroughly copyedit your manuscript for language usage, spelling, and grammar. If you do not know anyone who can help you do this, you may wish to consider employing a professional scientific editing service.

Whilst you may use any professional scientific editing service of your choice, PLOS has partnered with both American Journal Experts (AJE) and Editage to provide discounted services to PLOS authors. Both organizations have experience helping authors meet PLOS guidelines and can provide language editing, translation, manuscript formatting, and figure formatting to ensure your manuscript meets our submission guidelines. To take advantage of our partnership with AJE, visit the AJE website (http://learn.aje.com/plos/) for a 15% discount off AJE services. To take advantage of our partnership with Editage, visit the Editage website (www.editage.com) and enter referral code PLOSEDIT for a 15% discount off Editage services.  If the PLOS editorial team finds any language issues in text that either AJE or Editage has edited, the service provider will re-edit the text for free.

Upon resubmission, please provide the following:

  • The name of the colleague or the details of the professional service that edited your manuscript
  • A copy of your manuscript showing your changes by either highlighting them or using track changes (uploaded as a *supporting information* file)
  • A clean copy of the edited manuscript (uploaded as the new *manuscript* file)

4. Please clarify whether there was any ethical oversight over the study, and whether the authors had access to any identifying information.

5. We note that Figures 2 and 5 in your submission contain [map/satellite] images which may be copyrighted. All PLOS content is published under the Creative Commons Attribution License (CC BY 4.0), which means that the manuscript, images, and Supporting Information files will be freely available online, and any third party is permitted to access, download, copy, distribute, and use these materials in any way, even commercially, with proper attribution. For these reasons, we cannot publish previously copyrighted maps or satellite images created using proprietary data, such as Google software (Google Maps, Street View, and Earth). For more information, see our copyright guidelines: http://journals.plos.org/plosone/s/licenses-and-copyright.

We require you to either (1) present written permission from the copyright holder to publish these figures specifically under the CC BY 4.0 license, or (2) remove the figures from your submission:

1.    You may seek permission from the original copyright holder of Figure(s) [#] to publish the content specifically under the CC BY 4.0 license. 

We recommend that you contact the original copyright holder with the Content Permission Form (http://journals.plos.org/plosone/s/file?id=7c09/content-permission-form.pdf) and the following text:

“I request permission for the open-access journal PLOS ONE to publish XXX under the Creative Commons Attribution License (CCAL) CC BY 4.0 (http://creativecommons.org/licenses/by/4.0/). Please be aware that this license allows unrestricted use and distribution, even commercially, by third parties. Please reply and provide explicit written permission to publish XXX under a CC BY license and complete the attached form.”

Please upload the completed Content Permission Form or other proof of granted permissions as an "Other" file with your submission.

In the figure caption of the copyrighted figure, please include the following text: “Reprinted from [ref] under a CC BY license, with permission from [name of publisher], original copyright [original copyright year].”

2.    If you are unable to obtain permission from the original copyright holder to publish these figures under the CC BY 4.0 license or if the copyright holder’s requirements are incompatible with the CC BY 4.0 license, please either i) remove the figure or ii) supply a replacement figure that complies with the CC BY 4.0 license. Please check copyright information on all replacement figures and update the figure caption with source information. If applicable, please specify in the figure caption text when a figure is similar but not identical to the original image and is therefore for illustrative purposes only.

The following resources for replacing copyrighted map figures may be helpful:

USGS National Map Viewer (public domain): http://viewer.nationalmap.gov/viewer/

The Gateway to Astronaut Photography of Earth (public domain): http://eol.jsc.nasa.gov/sseop/clickmap/

Maps at the CIA (public domain): https://www.cia.gov/library/publications/the-world-factbook/index.html and https://www.cia.gov/library/publications/cia-maps-publications/index.html

NASA Earth Observatory (public domain): http://earthobservatory.nasa.gov/

Landsat: http://landsat.visibleearth.nasa.gov/

USGS EROS (Earth Resources Observatory and Science (EROS) Center) (public domain): http://eros.usgs.gov/#

Natural Earth (public domain): http://www.naturalearthdata.com/


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: This paper analyzes a relevant social issue, namely migration during a humanitarian crisis. It is a valuable contribution from the computational social science perspective, and shows that data from social media can help track migration events in real time. The paper is clearly written and the methodology is well described, so I think that it can be a valuable contribution for PLOS ONE.

However, I have a couple of minor issues and doubts which I suggest that the authors address:

1) In Fig. 1 and Fig. 5, I think that the correlation values in terms of R^2 are not very informative when the distribution of values seems to obey a power law. This R^2 is probably dominated by the first 2 or 3 countries. I suggest either removing these indications, justifying their usage, or computing them on a logarithmic scale.

2) It would be interesting to point out that this methodology does not fit well for some countries in the Southern Cone which are farther from Venezuela, like Chile (~290k "official" migrants vs ~800k? estimated) and Argentina (~130k "official" vs ~600k? estimated). This might be related to higher Twitter penetration in those countries, which promotes its usage by migrants; to the correlation between distance travelled and socio-economic status; or to other factors that make the upscaling work incorrectly for those countries. I think that this issue deserves to be briefly discussed.

3) I did not understand the following phrase in the Discussion: "We could have used a stricter criterion and request two or more tweets abroad but this does not affect the average flows (the upscaling factors absorb it), although it notably enhances the statistical fluctuations".

As far as I understand, the scaling factor is computed as the ratio between Venezuela population and the amount of residents (TUV's). The criterion for detecting a migration situation does not affect any of the previous quantities. In this sense, if we apply a stricter "migration criterion" then the upscaled migration amount will be affected as well.

Reviewer #2: The authors propose a novel method for assessing and studying the phenomenon of migration using geolocated Twitter data. The authors apply the method to the Venezuelan migration crisis, showing that they are able to estimate the number of migrants in certain years. The estimates are compatible with those found by international organizations. Moreover, they provide a way to study in detail the geographic distribution of migration routes.

I find the idea of using Twitter data for migrations quite appealing, despite the limitations this kind of data might have (even though the authors provide a discussion of such limitations in the conclusions). Researchers interested in migration patterns do not always have access to private data from mobile phone companies, and surveys made by international organizations might not have the desired level of detail for certain studies.

Hence, I would recommend the article for publication with minor revisions I am certain the authors will be able to address easily:

- On page 9 the authors propose a way to estimate the fluxes crossing the Venezuelan border. However, I was not able to understand precisely how this is done (maybe due to my limited comprehension ability). Is it estimated by counting the number of vectors crossing the line? Are the authors able to follow the trajectory of an individual and hence assess whether he or she crosses the border? Please rephrase this more clearly in the text.

- On page 9 the authors distinguish between migration patterns on land and by airplane. Of course the latter belong to less disadvantaged individuals, but the flux might still be relevant for migration studies. Do the authors think it would be possible to identify this kind of migration as well?

- In the discussion section the authors state that the work has proven to be helpful for humanitarian agencies. Do they mean it has already been applied by these agencies for some of their studies? In this case, I would add a reference if available. Otherwise, I would state that the method "could be useful" or "have potential use for" these agencies.

- In general the work is interesting due to the fact the data used is publicly available. I would discuss a little bit about possible comparisons with private data in order to further validate the method.

Once these minor comments have been addressed, in my opinion the work will be ready for publication.

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files to be viewed.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org. Please note that Supporting Information files do not need this step.

Revision 1

Answers to reviewers' comments:

First of all, we would like to thank the reviewers for the positive and constructive comments.

Reviewer #1:

This paper analyzes a relevant social issue, namely migration during a humanitarian crisis. It is a valuable contribution from the computational social science perspective, and shows that data from social media can help track migration events in real time. The paper is clearly written and the methodology is well described, so I think that it can be a valuable contribution for PLOS ONE.

However, I have a couple of minor issues and doubts which I suggest that the authors address:

1) In Fig. 1 and Fig. 5, I think that the correlation values in terms of R^2 are not very informative when the distribution of values seems to obey a power law. This R^2 is probably dominated by the first 2 or 3 countries. I suggest either removing these indications, justifying their usage, or computing them on a logarithmic scale.

The log-log scale in the representation is only a matter of convenience, since the data points span several orders of magnitude. Both figures show an identity relation, as we are comparing expected versus estimated flows. The important question is whether the points fall on the diagonal and, consequently, the goodness-of-fit analysis must be done on a linear scale to make sense. We cannot expect any scaling law other than linearity, because that would imply that the model does not reproduce the numbers observed in official data well and that a systematic bias has been introduced. We added a note to the captions of Figs. 1 and 5 to clarify this point.
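To illustrate the distinction discussed in this exchange, the following Python sketch computes a coefficient of determination against the identity line y = x, once on the raw values and once on their base-10 logarithms. The flow values and the function name `r2_identity` are hypothetical and are not taken from the manuscript. The exact definition of R^2 used by the authors is not stated here, so this is only one plausible reading.

```python
import math


def r2_identity(observed, estimated, log_scale=False):
    """R^2 of estimated vs. observed values measured against the line y = x.

    Hypothetical helper for illustration; not the authors' code.
    """
    if log_scale:
        observed = [math.log10(v) for v in observed]
        estimated = [math.log10(v) for v in estimated]
    mean_obs = sum(observed) / len(observed)
    # Residuals are taken with respect to the identity line, not a fitted line.
    ss_res = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Made-up flows spanning several orders of magnitude (not data from the paper).
official = [1_000_000, 500_000, 100_000, 10_000, 1_000]
estimated = [1_050_000, 480_000, 120_000, 30_000, 4_000]

r2_linear = r2_identity(official, estimated)
r2_log = r2_identity(official, estimated, log_scale=True)
print(f"linear scale: {r2_linear:.3f}, log scale: {r2_log:.3f}")
```

With these made-up numbers the linear-scale value is driven mostly by the largest flows, while the log-scale value is more sensitive to relative errors in the small ones.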

2) It would be interesting to point out that this methodology does not fit well for some countries in the Southern Cone which are farther from Venezuela, like Chile (~290k "official" migrants vs ~800k? estimated) and Argentina (~130k "official" vs ~600k? estimated). This might be related to higher Twitter penetration in those countries, which promotes its usage by migrants; to the correlation between distance travelled and socio-economic status; or to other factors that make the upscaling work incorrectly for those countries. I think that this issue deserves to be briefly discussed.

We thank the reviewer for noticing this distortion. This is indeed an important observation. We think that two factors may be at play. On the one hand, as the reviewer states, the upscaling factor is probably affected by the different Twitter penetration rate in these countries. From a social perspective, migrants moving to other countries may conform to the local culture in the way social platforms are used. However, this change of habits takes time and, in some cases, even generations. On the other hand, the difference could be this large because Venezuelan migrants are misrepresented in the official statistics. Note what happened in Brazil, where the official numbers from the UN organizations are well below those from the Federal Police and ours (Table 2). This is why it is so important to get information from additional sources. As Figure 2 shows, the migration routes from Venezuela extend as far as Santiago and Buenos Aires. We added a paragraph mentioning this issue in the Discussion section.

3) I did not understand the following phrase in the Discussion: "We could have used a stricter criterion and request two or more tweets abroad but this does not affect the average flows (the upscaling factors absorb it), although it notably enhances the statistical fluctuations". As far as I understand, the scaling factor is computed as the ratio between Venezuela population and the amount of residents (TUV's). The criterion for detecting a migration situation does not affect any of the previous quantities. In this sense, if we apply a stricter "migration criterion" then the upscaled migration amount will be affected as well.

As the reviewer observed, it is true that “the scaling factor is computed as the ratio between Venezuela population and the amount of residents (TUV's)”. The sentence "We could have used a stricter criterion and request two or more tweets abroad but this does not affect the average flows (the upscaling factors absorb it), although it notably enhances the statistical fluctuations" is intended to say that if we use a stricter criterion to classify people abroad as migrants, we should be consistent in the classification of residents as well. That is, if we require two consecutive tweets abroad in a specific country, then for consistency we should also require residents to tweet at least twice in that year to be considered active. However, doing so notably reduces the data sample and increases the statistical fluctuations, hence we discarded this option. To make the text clearer, we added this reflection to the above sentence.
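The consistency argument can be made concrete with a small Python sketch. It assumes, as stated in the exchange, that the upscaling factor is the population divided by the number of users classified as residents, and it applies the same minimum-tweet threshold to both the resident and the migrant classification. The function name `upscaled_migrants`, the per-user tweet counts, and the population figure are all hypothetical.

```python
def upscaled_migrants(users, population, min_tweets=1):
    """Return (estimated migrants, number of sampled migrants).

    users: list of (tweets_at_home, tweets_abroad) pairs, one per user.
    The same min_tweets threshold is used for residents and for migrants,
    which is the consistency requirement described in the response.
    Hypothetical helper for illustration; not the authors' code.
    """
    residents = [u for u in users if u[0] >= min_tweets]
    migrants = [u for u in residents if u[1] >= min_tweets]
    factor = population / len(residents)  # upscaling factor
    return factor * len(migrants), len(migrants)


# Made-up sample: (tweets in the home country, tweets abroad) per user.
sample = [(3, 2), (1, 1), (5, 0), (2, 0), (1, 0), (4, 3), (2, 1), (1, 0)]

for k in (1, 2):
    estimate, n_sampled = upscaled_migrants(sample, population=800, min_tweets=k)
    print(f"min_tweets={k}: estimate={estimate:.0f} from {n_sampled} sampled migrants")
```

Raising the threshold shrinks both the resident and the migrant samples, so the factor grows while the count it multiplies falls. The estimate then rests on fewer users, which is the loss of statistics the authors cite as their reason for keeping the one-tweet criterion.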

Reviewer #2:

The authors propose a novel method for assessing and studying the phenomenon of migration using geolocated Twitter data. The authors apply the method to the Venezuelan migration crisis, showing that they are able to estimate the number of migrants in certain years. The estimates are compatible with those found by international organizations. Moreover, they provide a way to study in detail the geographic distribution of migration routes. I find the idea of using Twitter data for migrations quite appealing, despite the limitations this kind of data might have (even though the authors provide a discussion of such limitations in the conclusions). Researchers interested in migration patterns do not always have access to private data from mobile phone companies, and surveys made by international organizations might not have the desired level of detail for certain studies. Hence, I would recommend the article for publication with minor revisions I am certain the authors will be able to address easily:

- On page 9 the authors propose a way to estimate the fluxes crossing the Venezuelan border. However, I was not able to understand precisely how this is done (maybe due to my limited comprehension ability). Is it estimated by counting the number of vectors crossing the line? Are the authors able to follow the trajectory of an individual and hence assess whether he or she crosses the border? Please rephrase this more clearly in the text.

We thank the reviewer for pointing out this ambiguity about which quantities our methods estimate. In the earlier part of the manuscript, a previously classified Venezuelan resident was defined as a migrant if at least one of their tweets was detected in a second country, as proof of border crossing (Fig. 1 and Table 2). We added this explanation to the Validation of external flows subsection, from line 246 onward. From line 291 onward, by contrast, we introduce a different method, which requires stricter sampling. Here we want to assess whether migrants moving on the ground crossed a specific line along their trajectory. The results of this new measure are shown in Figure 3 and Table 3. To be sure that they crossed the dashed line, we require at least one tweet on each side of it. The vectorial depiction is a way to characterize the general direction of movement, but it is not used to count the number of crossings. We added a clarification regarding this measure from line 293 onward.
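The "one tweet on each side" criterion can be sketched in Python as follows. This is a simplified illustration under strong assumptions: the reference line is treated as an infinite straight line through two points, coordinates are treated as planar (x, y) pairs, and the names `side_of_line` and `crossed_line` are hypothetical. The geometry of the dashed line actually used in the manuscript is not described in this exchange.

```python
def side_of_line(point, line_start, line_end):
    """Return +1 or -1 depending on the side of the line, 0 if on the line."""
    (x, y), (x1, y1), (x2, y2) = point, line_start, line_end
    # Sign of the 2D cross product between the line direction and the point offset.
    cross = (x2 - x1) * (y - y1) - (y2 - y1) * (x - x1)
    return (cross > 0) - (cross < 0)


def crossed_line(tweet_points, line_start, line_end):
    """True if the user has at least one tweet strictly on each side of the line."""
    sides = {side_of_line(p, line_start, line_end) for p in tweet_points}
    return 1 in sides and -1 in sides


# Made-up planar coordinates; the reference line runs along x = 0.
line_a, line_b = (0.0, 0.0), (0.0, 10.0)
user_who_crossed = [(-1.0, 5.0), (1.0, 5.0)]
user_who_stayed = [(1.0, 5.0), (2.0, 3.0)]

print(crossed_line(user_who_crossed, line_a, line_b))
print(crossed_line(user_who_stayed, line_a, line_b))
```

Counting users rather than displacement vectors matches the authors' statement that the vectorial depiction only characterizes the direction of movement.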

- On page 9 the authors distinguish between migration patterns on land and by airplane. Of course the latter belong to less disadvantaged individuals, but the flux might still be relevant for migration studies. Do the authors think it would be possible to identify this kind of migration as well?

It is possible to detect air trips by finding tweets from the same user in two faraway places, separated by a time interval compatible with flight speed (between 300 and 900 km/h). We found a few such cases in our data, but not enough for proper statistics. One can always use more relaxed criteria, such as assuming that tweets from distant locations are footprints of air travel regardless of the time between them. However, this can lead to false positives, such as people who traveled on the ground by car or bus and never tweeted along the route.
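A minimal Python sketch of the speed-based criterion follows. It uses the standard haversine great-circle distance and the 300 to 900 km/h window quoted in the response. The function names, the tweet tuple layout (latitude, longitude, Unix timestamp in seconds), and the example coordinates are assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def looks_like_flight(tweet_a, tweet_b, min_kmh=300.0, max_kmh=900.0):
    """True if the implied speed between two tweets falls in the flight window.

    Each tweet is a hypothetical (lat, lon, unix_seconds) tuple.
    """
    (lat1, lon1, t1), (lat2, lon2, t2) = tweet_a, tweet_b
    hours = abs(t2 - t1) / 3600.0
    if hours == 0:
        return False  # simultaneous tweets give no usable speed
    speed = haversine_km(lat1, lon1, lat2, lon2) / hours
    return min_kmh <= speed <= max_kmh


# Approximate coordinates of Caracas and Santiago; timestamps are made up.
caracas = (10.48, -66.90, 0)
santiago_7h_later = (-33.45, -70.67, 7 * 3600)
santiago_3d_later = (-33.45, -70.67, 72 * 3600)

print(looks_like_flight(caracas, santiago_7h_later))
print(looks_like_flight(caracas, santiago_3d_later))
```

The second pair shows why the window matters: the same two locations three days apart imply a speed compatible with ground travel, so the pair is not flagged.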

- In the discussion section the authors state that the work has proven to be helpful for humanitarian agencies. Do they mean it has already been applied by these agencies for some of their studies? In this case, I would add a reference if available. Otherwise, I would state that the method "could be useful" or "have potential use for" these agencies.

It should be noted that some of the authors belong to UNICEF; specifically, they are based in Brasilia and New York. Some of them are operational staff, and our results and data were discussed during the decision-making process regarding the Venezuelan crisis in Brazil and other nearby countries. This statement was included by them during the writing process, and the rest of the authors have no reason to consider it false.

In particular, the insights from this research helped UNICEF to keep a broader vision of the scale of the migration problem beyond the border with Venezuela, which was the case before. Based on this, the UNICEF team moved into looking at (i) how to integrate the humanitarian response into our regular program of cooperation, especially our Municipal Seal of Approval; and (ii) expanding the reach of an AI-inspired project on xenophobia beyond the State of Roraima, close to the border.

- In general the work is interesting due to the fact the data used is publicly available. I would discuss a little bit about possible comparisons with private data in order to further validate the method.

We have added a paragraph in the Discussion section commenting on the possible comparisons that one could make with other data sources like private data.

Once these minor comments have been addressed, in my opinion the work will be ready for publication.

Attachments
Submitted filename: answers_reviewers.pdf
Decision Letter - Jordi Paniagua, Editor

Migrant mobility flows characterized with digital data

PONE-D-19-28022R1

Dear Dr. MAZZOLI,

We are pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it complies with all outstanding technical requirements.

Within one week, you will receive an e-mail containing information on the amendments required prior to publication. When all required modifications have been addressed, you will receive a formal acceptance letter and your manuscript will proceed to our production department and be scheduled for publication.

Shortly after the formal acceptance letter is sent, an invoice for payment will follow. To ensure an efficient production and billing process, please log into Editorial Manager at https://www.editorialmanager.com/pone/, click the "Update My Information" link at the top of the page, and update your user information. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to enable them to help maximize its impact. If they will be preparing press materials for this manuscript, you must inform our press team as soon as possible and no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

With kind regards,

Jordi Paniagua

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #2: All my previous comments have been sufficiently addressed. Therefore, I recommend this article for publication.

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #2: No

Formally Accepted
Acceptance Letter - Jordi Paniagua, Editor

PONE-D-19-28022R1

Migrant mobility flows characterized with digital data

Dear Dr. Mazzoli:

I am pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please notify them about your upcoming paper at this point, to enable them to help maximize its impact. If they will be preparing press materials for this manuscript, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

For any other questions or concerns, please email plosone@plos.org.

Thank you for submitting your work to PLOS ONE.

With kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Jordi Paniagua

Academic Editor

PLOS ONE

Open letter on the publication of peer review reports

PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.

We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.

Learn more at ASAPbio.