Firms’ challenges and social responsibilities during Covid-19: A Twitter analysis

This paper offers insights on the major issues and challenges firms face in the Covid-19 pandemic and their concerns for Corporate Social Responsibility (CSR) themes. To do so, we investigate large Italian firms’ discussions on Twitter in the first nine months of the pandemic. Specifically, we ask: How is firms’ Twitter discussion developing during the Covid-19 pandemic? Which CSR dimensions and topics do firms discuss? To what extent do they resonate with the public? We downloaded Twitter posts by the accounts of large Italian firms, and we built the bipartite network of accounts and hashtags. Using an entropy-based null model as a benchmark, we projected the information contained in the network into the accounts layers, identifying a network of accounts. We find that the network is composed of 13 communities and accounts at the core of the network focus on environmental sustainability, digital innovation, and safety. Firms’ ownership type does not seem to influence the conversation. While the relevance of CSR hashtags and stakeholder engagement is relatively small, peculiarities arise in some communities. Overall, our paper highlights the contribution of online social networks and complex networks methods for management and strategy research, showing the role of online social media in understanding firms’ issues, challenges, and responsibilities, with common narratives naturally emerging from data.


Introduction
The Covid-19 pandemic had severe effects on firms and forced them to rethink the way they do their businesses, coping with what has been called "the new normal" [1].
Management research concerning the effects of Covid-19 on firms' performances mainly focused on specific sectors or stock market prices, also considering the impact of CSR activities. A broad area concerned the hospitality and tourism industry [2][3][4], food and beverage businesses [5], banking [6], and the manufacturing sectors [7]. Some papers focused on the consequences of firms' practices on their performances (for example [8], studied the effects of remote working on small businesses). Other streams of literature explored the effects of Covid-19 on firms' performances considering the stock market [9,10] and the impact of CSR activities [11][12][13][14]. Some papers focus on niches, considering the performance of the stock market in specific sectors, as the travel and leisure industry [15] or the hospitality sector [16]. current challenges and issues firms are facing. This can be useful for policy makers and managers to orient their strategies and decision making [41] and to understand how firms' and users' perceptions of CSR themes vary, reflecting different beliefs of what responsibilities firms have towards society [39].

Firms, social media and the Covid-19 pandemic
In the last 20 years, Internet changed the ways and the speed with which firms communicate and relate to their stakeholders. At first, the social dimension of website communication was focused on blogs, which allowed firms to communicate their "personality" and incorporate a relationship strategy based on connectivity and dialogue with users. Blogs started dialogic online relationships between firms and their public. When they first appeared around 15 years ago, social media were aimed at social networking with friends, family and colleagues [42]. They opened new paths and ways for firms to communicate with their stakeholders, shifting from a hierarchical one-to-many communication to a many-to-many collaborative type of communication [33]. Thus, social media became platforms for firms to share advertising, marketing and public relationship strategies [42].
Among social media, Twitter is popular for business communication purposes [28], and is the one with the highest adoption rates among large firms [30]. It is increasingly used in academic research, with most contributions arising from the computer and information science, and communications, while developing also in business and economics [43]. While Twitter research is mostly focused on the marketing [44] and financial fields [45], it is increasingly being used in the management and strategy fields, for example to monitor firms' CSR dissemination and stakeholder engagement [46], the emerging trends in technology [26] or to understand the issues and challenges firms are facing on specific themes [23].
This literature mostly studied firms' communication on Twitter with traditional methodologies. On the data collection side, manual collection methods [25] prevail, while on the data analysis side, manual labelling [27], and linear regressions [28][29][30] are the majority. However, some studies are experimenting with new approaches based on network methods, with a prevailing focus to understand CSR communication and stakeholder engagement on social media [24,33].
There is a prolific new research line which focuses on the impacts of Covid-19 on firms and their reactions (among others: [18,22,[47][48][49]) and there is an extended research about the Covid-19 as perceived on Twitter (see, for instance, [50][51][52][53][54][55][56]), but, surprisingly, researchers just started to explore firms' conversation on Covid-19 on Twitter. To the best of our knowledge, only one paper [23] has studied firms' discussion during Covid-19, focusing on the pandemic's effects on supply chains. However, the pandemic impacted multiple business dimensions, including firms' financial resources and business models [18], with first reactions ranging from increasing servitization, digitalization [17,20,22], and cooperation [57]. Managers and entrepreneurs had to rethink the way they do their businesses, learning to face a totally unexpected crisis [58] and to cope with a "new normal" [1]. To the best of our knowledge, there is no paper currently addressing how firms' conversation on social media is evolving during Covid-19, thus providing insights on the current issues and challenges firms are facing [24].
Thus, we pose the following explorative research question:

RQ 1: How is firms' Twitter discussion developing during the Covid-19 pandemic?
For completeness, in Appendix A we briefly review the Covid-19 situation during the data collection.

Firms, CSR dissemination and stakeholder engagement
Although the concept of CSR has no universal meaning and has been widely defined [59], it generally refers to the relationships and responsibilities a business has towards society, interpreted as the communities of stakeholders a firm interacts with [40,59,60], going beyond what is required by law [61]. What CSR concerns in practical terms depends on the historical and cultural context a firm is settled in, and it can also reflect the issues a firm is facing in a particular period [60], which it have been changing and increasing fast after the start of globalization [40].
While the advent of the Internet pushed firms to disseminate their social reporting online [60], social media opened a new channel for firms to disseminate their actions and interact with their stakeholders [28].
Two main theories explain why firms are present on online social networks: legitimacy theory and stakeholder theory. On the one hand, legitimacy theory [62,63] argues that firms behave consistently with their society's expectations and values. These are not fixed, and vary in space and time. Although legitimacy theory has been linked to CSR by many authors, it is not necessarily bounded to CSR or stakeholders' expectations. Following this view, firms use online social networks to legitimize their presence within their society [27]. On the other hand, following stakeholder theory [64], firms should follow the expectations of their stakeholders and develop relationships with them, thus creating long-term value. Following this perspective, firms use online social networks to interact with their stakeholders, sharing their strategies and results [65].
Most research finds that firms' social media posts about CSR topics are a minority. Considering together firms' Facebook posts, Tweets and Youtube videos in a 3-months period in 2014 [27], show that CSR posts are only 7% of all posts, thus suggesting that firms do not use Twittter to disseminate their CSR activities. Focusing on the top 50 companies from the Fortune list of 2010 [66], find similar results, highlighting that firms mostly post about topics nonrelated to CSR (among others, posts include marketing purposes, promotion of products and services), with only a small percentage of posts regarding CSR. This is also consistent with [67], who finds that in general business accounts 14.5% of tweets are about CSR topics, while they rise in CSR-dedicated Twitter pages, with more than 70% tweets on average about CSR issues. However, there are some conflicting results. For example, taking Spanish banks after the euro zone crisis, CSR appears as a material topic discussed on Twitter [28].
Few papers use network (visualization) methods, thus showing which communities around CSR topics arise. For example, selecting firms with a Twitter account dedicated to CSR and using network and machine learning methods [33], finds that firms and users create several independent communities around CSR themes, instead of a unique and connected network. Combining structural topic modeling and network methods [68], explore how the Twitter discussion around the hashtag #CSR evolves, resulting in 20 different topics (the most prevalent one being company strategy, followed by community charity, CSR teams, business ethics, . . .), most of them related to one or more topics, and seven without any significant correlation with others. The paper concludes that Twitter is used as a mean to share about multiple CSR dimensions, some of them are related and they change over time. Following a similar approach, starting from the hashtag #sustainability [24], adopt social networks methods, finding top themes associated with #sustainability (namely, Innovation, Environment, Climate Change, Corporate Social Responsibility, Technology, and Energy) and 6 communities (Environmental Sustainability, Sustainability Awareness, Renewable Energy and Climate Change, Innovative Technology, Green Architecture, and Food Sustainability).
Overall, existing research seem to agree that CSR communication on Twitter is a minority and results are quite fragmented. However, there are mixed findings and data-driven methods have been rarely used. Also following [67] call for a more in-depth study on the topics firms address on Twitter when communicating on CSR, we pose our second explorative research question:

RQ 2: Which CSR dimensions and topics do firms discuss?
While the aforementioned results relate to one-way communication on CSR on Twitter, a few papers also question the extent to which users interact with firms on Twitter, as a measure of stakeholder engagement.
Being a separate but related concept to CSR, stakeholder engagement can be interpreted under many different theoretical perspectives and has been variously defined. Following [69], "stakeholder engagement is understood as practices the organisation undertakes to involve stakeholders in a positive manner in organisational activities" (p. 315). Social media are increasingly used as a tool to measure stakeholder engagement, with the assumption that social media users are part of the firm's stakeholders [70]. Although some preliminary evidence shows that Twitter is less used as a mean to engage stakeholders compared to Facebook [71], it is still considered an ideal tool for firms to engage with their stakeholders, with measures like retweeting activity that can be interpreted as an implicit endorsement of the content of the message [31]. Papers exploring firms' use of Twitter for engaging their stakeholders use different approaches and findings are quite mixed. In general, it seems that social media are not used in their full capacity to interact and engage with stakeholders on CSR topics. This is what both [66] and [27] found, the former highlighting that firms interact very little on CSR topics, the latter finding that firms' posts concerning stakeholder engagement are only the 0.22% of total messages. This is somewhat consistent with [67], who finds that non-CSR tweets have a higher interactivity than CSR ones. However [31], find opposite results, with CSR messages gaining a greater audience reaction than non-CSR messages, in particular when certain CSR topics (e.g. environment or education) are discussed and when the post uses a hashtag related to the topic. However, popularity was measured as a binary variable (retweeted message vs not retweeted message), not taking into account the extent to which a message is retweeted. Similarly, focusing on specific CSR dimensions, a few papers study how stakeholder engagement varies depending on the type of CSR communication. For example, using the very broad categorization of core CSR (information directly linked to the firm's core business) versus supplementary CSR (information about actions detached from the firm's core business) [28], find that different categories of stakeholders react differently to firm's CSR tweets depending on the content that is shared. However, results do not allow to further distinguish which CSR dimensions resonate most.
As the literature shows very mixed results in stakeholder engagement on CSR topics on Twitter, we pose the following research question:

Data set
We selected Italy for our study as it is one of the countries first and more severely hit by the pandemic [18]. Although large firms are a minority in European businesses [72], they represent a suitable sample for our research aim for a number of reasons. First, large firms have a greater impact on society than small and medium firms [73,74], they generate more news, are more visible to the general public and are more likely to attract feedbacks from their consumers or clients [75]. Second, large firms have more resources (e.g. human, financial), which makes them more likely to invest in social and environmental reporting and dissemination [65]. Third, they usually have more stakeholders demanding information than small and medium sized firms [73]. For these reasons, large firms are a suitable context to explore firms' challenges and issues with online social media, as well as their dissemination of CSR activities and stakeholder engagement.
We relied on AIDA (https://aida.bvdinfo.com/), the Italian section of Bureau Van Dijk (https://www.bvdinfo.com/), to select firms. AIDA is a database containing financial and commercial historical data from approximately 540,000 firms operating in Italy. It relies on official data retrieved from the Italian Registry of Companies and the Italian Chambers of Commerce. AIDA contains data at the firm level, which include information about the firm's characteristics (e.g. location, industry, the year it was founded), the ownership and governance structure (e.g. the name of each shareholder and board member, the respective ownership share) and financial data (balance sheet, profit and loss accounts data, and ratios).
We selected all Italian active firms with 250 or more employees, resulting in 3.870 firms. We downloaded data about firms' characteristics (e.g. their names, location, ownership type, ATECO code and website) and financial data (e.g. total assets and revenues). ATECO is the classification of economic activities used by the Italian Institute of Statistics (ISTAT). It is the national translation of NACE, which is the classification of economic activities in the European Union. It stands for "Statistical classification of economic activities in the European Community" and is derived from the French "Nomenclature statistique des activités économiques dans la Communauté européenne". The latest classification is NACE Rev. 2, which was implemented from 2007. As at the time of download (April 2020) not every firm's financial data was updated with information on 2019, the financial data we downloaded refer to 2018. We manually searched the firms' websites as indicated in AIDA database, checking in each website if a Twitter account was available. For each website, we searched the home page and "contacts" sections. We found that 936 companies out of 3870 (24%) displayed a Twitter account in their website. A few of them indicated multiple Twitter accounts on their websites.

Data collection and first cleaning.
To collect all messages written by the selected accounts, we made use of the GET statuses/user_timeline Twitter API. Out of the original 936 accounts, 6 accounts were not active in the period examined, while 6 were not found (probably they were deactivated). The final data set was composed by 1.6 M of tweets.
The data set was cleaned further: we focused on firms writing tweets in the period from the 1st of March, 2020, to 17th of November, 2020, which includes the beginning of the Covid-19 pandemic. In order to focus on the proper set, we selected firms that wrote a message before and after the 1st of March, 2020, to explore their activity over a longer period. Literally, the GET statuses/user_timeline API permits to download the last circa 3200 messages written by the account: if the account is extremely active, these 3200 will cover just an extremely limited period and thus cannot be considered for the definition of a narrative over the entire period under analysis. For more details, visit Twitter developer webpage. It is the case, for instance, of the accounts of publishing houses owning newspapers: the Twitter accounts were used as the accounts of the newspapers, thus for communicating the latest piece of news. Other examples include public transport accounts (messages referred to traffic alerts), internet and mobile service companies (those accounts were either used to advertise the latest promotions or for the customer service) or football teams (commenting the results of the match). Selecting accounts which tweeted before and after the 1st of March 2020 also permits to get rid of those accounts that were active in the past, but did not contribute to the discussion during the pandemic. The resulting firms in our data set are a total of 417 different active accounts and 917864 different messages. Even though our research is mostly based on publicly available online data (i.e. Twitter messages), we believe it is ethically appropriate not to disclose the firms' names, usernames and quotes [76].

Hashtag cleaning.
For each firm in the data set, we extracted the hashtag used, in order to study similarities in the communication strategies. Only 401 accounts out of 417 (i.e. 96%) used at least one hashtag in the period under study.
We find frequent mistakes in typing the hashtags: to avoid considering as different two hashtag referring to the same subject, we implemented the Edit distance, as implemented by the py_stringmatching python module [77] (more details in the hashtag data cleaning can be found in the Appendix B). After this cleaning, 11475 different hashtags resulted in the data set.

CSR hashtags.
To recognise hashtags related to CSR issues, the widely known classification of sustainability proposed by [78] was used. It distinguishes between three dimensions: (i) the environmental dimension (i.e. attention of the company towards environmental issues), (ii) the social dimension (i.e. the relationship between business and society-consumers, employees and stakeholders in general), (iii) the economic dimension (i.e. socio-economic or financial aspects, including describing CSR in terms of business operations). In order to select relevant keywords, we adapted and integrated previous approaches. As [79] distinguish the environmental and social dimension, using keywords from the GRI (Global Reporting Initiative) 2010 standard, we updated this list with the latest GRI 2019 standards (Italian translation), also including the economic dimension. The GRI is is an independent, international organization aimed at developing voluntary reporting guidelines for CSR. The GRI Standards are among the world's most widely used standards for CSR reporting. For more information, please visit: https://www.globalreporting.org/. To create a comprehensive list of CSR keywords, we also integrated keywords from existing research [27,67] and added them to their relevant CSR area. To do this, the first and fourth authors discussed the category for each item, and placed it in the category they both agreed on.

The bipartite semantic network and its validated projection
We represented the system as a bipartite network, i.e. a network in which the nodes are divided into two disjointed classes (called layers) and edges are not allowed between nodes of the same group. One layer represents the different accounts, while the second layer represents the various hashtags. A hashtag and an account are connected if, in the period analysed, the given account used the selected hashtag at least once.
We use the strategy described in [38] to project the information contained in the original system on the layer of the accounts and detect similarly communicating firms. In a nutshell, this approach consists of 3 main steps: first, we consider the network of the co-occurrences of links, as they are observed in the real system. In our case, since we are interested in projecting the information contained in our system onto the company layer, for each couple of users, the co-occurrences are the number of hashtags used by both accounts. Second, we define a proper null-model as a benchmark. In the present application, we use a maximum entropy nullmodel, i.e. a model maximally random, but for the information contained in some constraints [36,37]. In particular, we can discount the information contained in the degree sequence, as in the Bipartite configuration Model (BiCM, [35]; more details on the null-model can be found in the Appendix C.1). In this way, in our analysis we are considering the attitude of the various accounts in using hashtags (that is, the degree of the various accounts) and the number of users using the given hashtag (i.e. the degree of the different hashtags). Finally, we compare the observed co-occurrences with the expectations of the null-model: if those are statistically significant, i.e. much higher than the ones predicted by the model and such that the disagreement cannot be explained by the constraint of the null-model, they are then validated. We then put an edge in the projection connecting the two accounts whose co-occurrences are statistically significant. The result of the projection is a monopartite undirected network of accounts in which a connection indicates non trivial similarities in terms of the usage of hashtags. The entire approach is described in more detail in the Appendix C.2.
For the implementation of the BiCM, we used the recently released python module NEMtropy, presented in [80].

Descriptive statistics
Before considering the accounts that have been validated by the projection techniques described in the section above, we first present some features of the data set, that includes financial data from AIDA and online data from Twitter.
Over the entire set of firms, the correlations among financial and online data remain limited: in Fig 1 we show the related Spearman correlation matrix (due to the non Gaussian distribution of the various quantities, Pearson correlation is not justified). The correlations between quantities related to the companies' activity on online social networks and those regarding their financial performance are in general weak and rarely significant. Actually, even among Twitter quantities, the correlations are quite weak, with a few exceptions: the number of likes per message and the number of retweets per message (0.64), the number of followers and the number of messages (0.57), and the number of followers and the number of likes per message (0.57). On the left panel the correlation matrix for the entire data set: all correlations between online and financial quantities are weak. This changes once we focus on specific ATECO codes, due to their peculiarities: on the right panel the same correlation matrix for sector 62 (Computer programming, consultancy and related activitie). A relatively strong correlation is present between the number of followers and the total assets, for instance. This behaviour may be due to the importance of the communications in online social networks for this sector, which increases with a firm's resources. https://doi.org/10.1371/journal.pone.0254748.g001

PLOS ONE
Firms' challenges and social responsibilities during  Interestingly enough, once we focus on a single ATECO code, as in the right panel of Fig 1 for code 62 (Computer programming, consultancy and related activities), the situation appears slightly different. There are significant and relatively strong correlations between, for instance, the number of followers and the total assets (0.60), and the total number of messages and the total assets (0.55). This difference is probably due to the specific sector: being related to information technologies, the appearance of these companies in online social networks is an important part of their marketing strategies and depends on a firm's size (as measured in the total assets).
Focusing on Twitter data, the retweeting activity is always the most frequent one, for instance representing nearly the 80% of the actions on the platform during the Covid-19 Italian debate [81]. Interestingly enough, probably due to the different role of firms' Twitter accounts, the retweeting activity in the present data set is extremely limited: just the 22.3% of the messages collected are retweets, while the others are original messages. A similar behaviour was detected for verified users [81]: the authors interpreted these findings maintaining that verified users are the main drivers for the development of a debate. Here, analogously, firms participate to the discussion by introducing new arguments, even if the accounts are not necessarily verified: the frequency of verified accounts in our data set is quite limited, close to 17%.
The most frequently used hashtag in the dataset is "covid", which appears 4106 times. "Coronavirus" is the third most used hashtag, appearing 2120 times. The other most frequently used hashtags are mostly related to public utilities themes and digitalization.
We remind that we just selected firms' accounts, not keywords, to download the messages we analysed. Nevertheless, almost all accounts used hashtags related to pandemic, due to the obvious impact that it had on every activity in the period.

Validated network of Twitter accounts
The result of the procedure described in Sec. 3.2 is a monopartite network of Twitter accounts composed by 80 firms and 135 links, in which links indicate a non trivial similarity in the usage of hashtags (see Fig 2). Interestingly enough, while being heavy users of messages and hashtags and having a great number of friends, the validated accounts are not among the most popular ones, see Fig 3. Indeed, the most popular users, i.e. those with more than a million followers, are luxury brands; those accounts wrote many messages, but they limit the usage of hashtags to their merchandise, probably in order to focus on the exclusiveness of their products. In this sense, let us remark that the validated nodes are those that contribute to the formation of common narratives, shared among various accounts; these extremely popular users, instead, are marking their originality and do not intervene substantially to shape common discussions.
Even in the case of the firms in the validated network, there are limited correlations between online and financial data, see the left panel of Fig 4: the only relatively strong (and significant) one is between the number of followers and the total assets (0.51). Nevertheless, when we focus on a specific ATECO code, correlations can be much stronger and mix the online data from Twitter with the financial ones: in the right panel of Fig 4 we observe quite a strong correlation between the number of messages and the total assets (0.73) or between the number of likes per message and the total assets (0.75). A similar behaviour was observed in the analysis of the entire data set, see Fig 1; the correlations on the 62 ATECO code, were, nevertheless, a little weaker than the ones observed here.
By running the Louvain algorithm [82], we identified 13 communities of accounts, identified by nodes of different colors in Fig 2. These groups show a certain homogeneity in terms of companies' sectors (for more details, see the tables in Figs 5 and 6). In particular, the greatest communities in the core of the network mostly include computer programming, consultancy and related activities (in dodger blue), electricity, gas, steam and air conditioning supply (in purple), employment (in sky blue), activities of head offices and management consultancy (in violet) and human health (in yellow). Instead, the ownership type (in term of the Global Ultimate Owner, or GUO) does not seem to influence the conversation. As defined by Bureau Van Dijk, the GUO is the individual or entity at the top of the corporate ownership structure, thus indicating the highest parent company: the GUO is represented in Fig 2 as different shapes of the nodes and no significant pattern can be detected. This is surprising, as some kinds of firms communicate in a different way, especially about CSR and environmental issues. For example, family firms are found to communicate on CSR differently from non-family firms [73] and publicly owned firms disclose less on environmental topics (contrary to the expectations [65,83]). Although ownership does not seem to influence the conversation, it is worth noting that

PLOS ONE
Firms' challenges and social responsibilities during Covid-19 the dialogue on environmental sustainability is mostly carried out by firms active in the public utilities sector.

4.2.1
The usage of hashtags in the validated network: Firms' discussion during the pandemic. Remarkably, among the 13 communities detected, 10 of them have a Covid-related theme in their main hashtags (e.g. "coronavirus", "covid", "iorestoacasa", the latter one meaning "stayhome"), meaning that firms' discussion is strongly related to the pandemic. The core of the network is made of 5 interconnected communities. Based on their most frequently used hashtags, we named them "Digital Transformation" (dodger blue), "Remote Working" (sky blue), "Digitalization" (violet), "Environmental Sustainability" (purple) and "Safety" (yellow), which emerged as the main themes that firms discussed at the beginning of the Covid-19 pandemic. The names assigned to all communities can be found in the table of Fig 7. In the "Environmental Sustainability" community (in purple), the most important hashtags are "sostenibilità" (meaning "sustainability"), "covid", "innovazione" ("innovation"), "greendeal", "energia" ("energy"), thus showing that the environmental themes are relevant and linked to the innovation ones, see the table in Fig 8. This community is mostly formed by firms managing national infrastructures, and public utilities. Three communities capture the digital innovation debate: "Digital Transformation", "Remote Working" and "Digitalization" Interestingly enough, the validated nodes are not the most popular, i.e. those with the highest number of followers (top right panel). In order to check the most popular accounts, we focused on accounts with more than 10 6 followers. In fact, the number of their messages is extremely limited, while their use of hashtags is extremely focused on their activities. This may be related to their strategies, to remark the exclusiveness of their products. More details can be found in the main text. https://doi.org/10.1371/journal.pone.0254748.g003
These results show that the digital innovation debate is central in firms' discussion on Twitter, with themes related to digitalization and the introduction of new tools as remote working, webinars, cloud services. This is consistent with the first results of firms' reactions to the Covid-19 crisis [17]: the needs to social distancing and to shelter at home pushed towards an acceleration of the digital transformation, which was ongoing before the pandemic, thus changing firms' business models and strategies. On one hand, this appeared in an increase in the demand for technology products (e.g. laptops) and services (e.g. cloud computers, digital  Fig 1), but for the one between the number of followers and the total assets (0.51). Again, the situation changes once we focus on a specific ATECO code and it is even more striking than for the entire data set (see left panel of Fig 1). In fact, the same correlation matrix for sector 62 (Computer programming, consultancy and related activitie) shows a strong correlation, for instance, between the total number of messages and the total assets (0.73) and between the number of Likes per message and the total assets (0.75). In fact, the validated accounts are those that follow a common strategy in the usage of hashtags and the importance of their appearance online increases relatively to the dimension of the company. https://doi.org/10.1371/journal.pone.0254748.g004

PLOS ONE
Firms' challenges and social responsibilities during Covid-19 services). On the other hand, many jobs went remote. While digital transformation processes were already ongoing [84], the pandemic highlighted the need to leverage technology to overcome the challenges and environmental uncertainty, and accelerated a trend towards different life styles, increasing the importance of technology in our economy and society [20]. Smaller communities are described in details in the Appendix D.

CSR dissemination and stakeholder engagement.
In this second part of the analysis, we study CSR dissemination and stakeholder engagement in the communities previously identified.
Before starting, let us make a few remarks. First, the number of CSR hashtags is extremely limited, being of 30 different words, compared to a set of 6036 different hashtag used by the validated accounts, resulting respectively as the 0.17% (environmental dimension), 0.28% (social dimension) and 0.05% (economic dimension). Despite these small numbers, it is crucial to notice that, when we also consider the repetitions (i.e. the number of times an account used

PLOS ONE
Firms' challenges and social responsibilities during Covid-19 the various hashtags), the fractions are quite different, that is respectively the 1.17%, 0.42% and 0.07%. In this sense, even if the set of keywords we are considering is quite limited, firms' accounts are particularly inclined to use it: if the number of repetitions per hashtag were constant, we would not observe a change in its percentage when considering their total presence. In this sense, also considering the repetitions, the environmental dimension seems particularly popular among the validated accounts.
CSR usage in the validated network. At the community level, 10 communities out of 13 use CSR hashtags (see the table of Fig 9), with most accounts in each community using CSR hashtags. Environmental and social themes are prevalent, with their relevance differing depending on the community. In the "Environmental Sustainability" community (in purple), all accounts (14) use hashtags related to CSR. This community is mainly composed of public utilities companies, and highlight that the sustainability debate is prevalent in these kinds of firms, and it is joint with the innovation debate. In this community, all the 14 accounts use hashtags related to the environmental dimension. Hashtags related to the social dimension are also tweeted by most accounts (8 out of 14), while hashtags related to the economic dimension are a minority (tweeted by 2 out of 14 accounts).

PLOS ONE
Firms' challenges and social responsibilities during Covid -19 In the "Digital Transformation" community (in dodger blue), 9 accounts out of 16 use hashtags related to CSR. Among them, 6 discuss environmental themes, 8 social ones and 2 economic ones. This shows that digital transformation themes are more connected to social aspects than environmental ones-among the most frequently used hashtags, words like "formazione" ("training") and "istruzione" ("education") appear. Considered together with the prevalent hashtags of the community ("smartworking", "cloud", "covid", . . .), it seems that firms are discussing the learning process associated with the new technologies, which is consistent with the period of observation.
In the "Remote working" community (in sky blue), 10 out of 11 accounts use CSR hashtags. Again, the social dimension is prevalent (9 accounts use social hashtags, in contrast to 6 accounts that use environmental hashtags and 2 that use economic hashtags). The hashtags related to the social dimension include training themes ("formazione", "istruzione", meaning "training" and "education", respectively), but are more varied than the previous communitythey concern work-related themes, as "corruzione"("corruption"), "discriminazione" ("discrimination") and "diversity". This is partly intuitive, as this community is mainly composed by recruitment agencies, consultancies, and telecommunication companies.
In the "Digitalization" community (in violet), 7 out of 9 accounts use CSR hashtags. Consistently with the previous communities, accounts that use hashtags related to the social dimension are more (7) than the ones posting about the environmental dimension (5), while the economic dimension is still the minor one (2). Again, the "Safety" community (in yellow) has 2 accounts tweeting about the social dimension and 1 about the environmental dimension, and no accounts tweeting about the economic dimension.
These results show that CSR dissemination is not evenly distributed among communities and firms. The prevalent CSR dimension firms disseminate on Twitter is contingent to the community they belong to, which is somehow related to the firms' sectors-showing that

PLOS ONE
Firms' challenges and social responsibilities during Covid-19 different types of firms emphasize different dimensions of CSR. The communities focused on digital innovation ("Digital Transformation", "Remote Working", "Digitalization") and safety are more concerned on the social dimension of CSR. In these communities, the environmental dimension is present, but it is less relevant compared to the social dimension, which is somehow surprising, as the environmental dimension is usually the one managers put more attention on [39]. Last, the economic dimension of CSR is overlooked in all communities. We do

PLOS ONE
Firms' challenges and social responsibilities during  not know if the higher relevance of the social dimension of CSR in the digital innovation and safety communities is an effect of the pandemic or if this also happens in non-crisis times. One option is that the pandemic pushed firms to see the social dimension as the most relevant one, and it became prevalent compared to the environmental one. This would confirm CSR as a concept evolving depending on the present circumstances [60]. In any case, it is worth noting that, differently from the main literature [39], in some communities the social dimension is more relevant than the environmental one. Further research could investigate this trend, checking if these communities mostly communicate on social CSR themes over a wider time span, also considering non-crisis times.
The analysis of the smallest communities concerns for CSR themes can be found in the Appendix E.
However, when considering the dissemination of the CSR dimensions focusing on hashtags as a unit of analysis, the relevance of CSR changes. The table in Fig 10 shows the occurrence of CSR hashtags among the hashtags in the validated network. Overall, CSR hashtags are less then 2% of all hashtags, with hashtags related to the environmental dimension being the majority (1.17%) compared to the social dimension (0.42%) and the economic one (0.07%). On one hand, this is consistent with previous literature, which argues that the CSR dimension is overlooked in firms' social media posts [27,66,67]. However, a methodological note is needed. As the hashtags we included in the CSR dimensions are only 30 in total, representing the 0.5% of the total hashtags in the (validated) network, it is easily predictable that they will represent a minority. Actually, their occurrence in the validated network is higher then expected (1.67%), thus highlighting that firms use the hashtags related to CSR more frequently then the others. Further research should widen the hashtags considered as related to the CSR dimensions-as for now, we believe that the hashtags we used as representative for the CSR dimension do not entirely capture the phenomenon. In any case, some communities show that specific CSR dimensions are more prevalent than others. It is the case of the "Environmental Sustainability" community (in purple), where hashtags related to the environmental

PLOS ONE
Firms' challenges and social responsibilities during  dimension are the 3.51% of the total. Firms in the "Remote Working" community (in sky blue), instead, use more hashtags related to the social dimension, which account for the 1.12% of the total. Again, the "Digitalization" community shows a higher prevalence of the environmental dimension (1.62%).
Stakeholder engagement on CSR dimensions. We measure stakeholder (user) engagement as the number of retweets and likes per hashtag. We base on the idea that likes and retweets are positive countersignals that reflect to what extent the firm's message resonates with online stakeholders. In particular, as retweets enlarge the number of users who visualize the message, they are considered stronger and favorable implicit endorsements [31].
The table in Fig 11 shows stakeholder engagement on all posts and CSR dimensions. Results show that posts with hashtags related to the three CSR dimensions are retweeted and liked less then the rest of the posts, thus confirming that social media are not fully exploited to interact and engage on CSR themes [27,66,67].
In the entire data set, on average, each message containing a hashtag concerning CSR is retweeted 2.99 times, against an average of 5.39 retweets for all the hashtags. However, a closer look at the communities and CSR dimensions highlights some peculiarities. While the number of retweets for the environmental dimension is more or less stable in the communities, the "Mobility" community (in white) has a higher number of retweets in the social dimension (4.86 on average). With a few exceptions, the social dimension has on average a higher number of retweets compared to the other CSR dimensions and to the dataset.
In terms of the number of likes per hashtag, in the entire dataset each message containing a CSR hashtag obtains on average 2.59 likes, against an average of 14.83 likes for all the hashtags. When we look at the various communities, again the number of likes in the environmental dimension is quite stable, while social hashtags show a few different trends, with the "Mobility" (in pale yellow) and "Environmental Sustainability" (in purple) communities having a higher number of likes (5.03 and 3.13 respectively) and the "Restart" one (in pink) having a lower number of likes than the average (1.31). Thus, it seems that stakeholder engagement shows

PLOS ONE
Firms' challenges and social responsibilities during Covid-19 similar trends with both measures (retweets and likes), while the extent to which users interact is unevenly distributed among dimensions and communities.
Overall, these findings show that firms' dissemination on CSR dimensions and stakeholder engagement are not homogeneous and vary depending on the communities. Thus, studies enquiring on firms' CSR concerns and stakeholder engagement should consider this when formulating their research questions and methods. These findings also show that network methods, allowing firms' discussions to emerge from data without any a priori hypothesis are an effective way to show firms' and users' different attitudes towards the CSR dimensions.

Conclusion
Our paper presents large Italian firms' discussion on Twitter during the first 9 months of the Covid-19 pandemic. Specifically, our explorative research questions followed three lines: 1) firms' general Twitter discussion during the Covid-19 pandemic; 2) the CSR dimensions being discussed; 3) stakeholder engagement on CSR themes. First, we show that the discussion is formed of 13 communities of firms, with Covid-19 themes appearing in 10 of them. The core of the network, which reflects firms' major challenges during the pandemic, is composed of five communities (i.e. "Digital Transformation", "Remote Working", "Digitalization", "Environmental Sustainability" and "Safety"). Firms' ownership type does not seem to affect the debate. Second, we show that 10 communities out of 13 use CSR hashtags. While the environmental and the social dimensions are the prevalent ones, the economic dimension is generally overlooked. Moreover, the social dimension is more relevant than the environmental one in the three digital innovation communities and the safety one. Third, stakeholder engagement on CSR themes is limited, and unevenly distributed among communities and themes.
This work has several implications. Today, an increasing number of firms has a social media account. They can be used for a number of reasons, which include sharing information about products, decisions, strategies [85], and for accountability on the firms' economic, environmental, and social activities. During the pandemic, the quantity of data being shared online increased, with online social networks strengthening their role as a mean of communication [86]. First, we show that online social media are a tool to capture firms' issues, challenges and concerns, highlighting groups of firms that are facing similar challenges in a timely manner. Doing so, we advance management research, showing that online social media can be used to have an understanding of firms' strategic dimension, without a-priori lenses, complementing traditional methodologies (e.g. surveys). Thus, we show the potential of social media data for management and strategy research, developing a new research dimension which overcomes the traditional uses of online social media data in research (which mainly explored the relationship of firms' social media with stock market volatility [87], product sales [88], financial performances [89], or CSR concerns [24,32,33] and stakeholder engagement [71,90,91]). Second, we conceptualize firms' discussion using complex network analysis. We apply a quantitative method that captures firms' communities of discussion by creating the bipartite network of accounts and hashtags, then projecting the information on the layer of accounts. Doing so, we show that common narratives are naturally emerging from data, with firms forming communities of discussion on Twitter. Thus, we integrate a new methodology (i.e. complex network analysis) in management research, widening the methods and focus of current research, that mostly uses manual labelling [27], and linear regressions [28][29][30] and centers on specific themes [23,24] or CSR issues [24,32,33]). Our method, being data-driven and with no a-priori hypotheses, allows themes and communities to freely emerge from data. Doing so, our results support legitimacy theory [62,63], as firms' CSR messages and stakeholders' interactions are a minority. However, we challenge the traditional view which bases legitimacy on CSR values: we argue that firms are on social networks to follow their society's values, but not necessary to follow expectations related to CSR. Results also contribute to the literature on firms and social media, showing that digital innovation, environmental sustainability and safety are the main challenges firms are facing during Covid-19; to the CSR literature [27,65], highlighting that the relevance of the CSR dimensions varies depending on the community; and to the stakeholder engagement literature [71,90,91], confirming that Twitter is overlooked as a tool to interact on CSR issues, with peculiarities arising in some communities. On the practical side, our research contributes to provide an analytics tool based on firms' Twitter data to support managers, entrepreneurs and policy makers when designing their strategies and decision making. Thus, we propose the use of social media as a support tool for strategymaking, adding to the current uses in businesses, which include product design, relations management and marketing [92]. Finally, applied to the CSR field, a network-based analysis of Twitter represents an alternative way to understand the managerial perceptions of CSR themes, which are the responsibilities managers believe a business should pursue towards society [39].
Some further questions remain to be addressed. First, our analyses focus on large firms in only one country, thus not considering small and medium businesses, which form the backbone of Italian and European economies, and other national contexts. An extension of this research will also include SMEs, to allow a wider understanding of firm's discussion on Twitter. In fact, due to the selection of large firms, firms' sectors are not evenly distributed, neither is firms' geographical location. A larger set of firms' account will overcome this limitation.
Second, the examined list of CSR hashtags is quite limited and, even if those hashtags are quite popular, we run the risk of capturing just a portion of the entire CSR debate; in further research we will expand this list, increasing our coverage of all the variations of the CSR discussion.
Starting from the mid of April, due to a little slowing down in the number of contagions (Al Jazeera, 20th April, 2020.), the government allowed some re-openings, while on the 4th of May a progressively weakening of the lockdown measures, called "Phase 2", started (Repubblica.it, 22nd April 2020).
With the beginning of the summer, the lockdown limitations were further reduced: citizens were allowed to move from region to region for summer vacations and various not-strictly essential activities as cinemas, theatres and even discotheques re-opened: the diffusion of the virus appeared under control (Gazzetta Ufficiale, Decreto del Presidente del Consiglio dei Ministri, 11 giugno 2020).
Nevertheless, during the autumn, confirming several epidemic forecasts, a second wave of Covid-19 contagions appeared: on the 13rd of October a government decree imposed a new lockdown, limiting again the activities of restaurants, cinemas, and theatres (Gazzetta Ufficiale, Decreto-legge 7 ottobre 2020, n. 125). Some regions imposed curfews to dissuade citizens to gather for the nightlife, due to a local increase of the contagions (Wikipedia.it).
On the 26th of October, restaurants and similar free-time activities were closed (Repubblica.it, 25th October 2020). Since the 3rd of November, the government has developed a system of colours to mark each region, depending on the spread of the virus. For example, in white regions the spread of the virus is under control, thus the limitations to various activities are quite weak (Ansa.it, 4th November 2020). In yellow, orange and red regions restrictions progressively increase, based on the local epidemic scenario. Travelling between different regions is prohibited, while within orange and red regions it is extremely limited. During the winter vacations, even moving inside the same municipality was extremely restricted (Gazzetta Ufficiale, Decreto-legge 2 dicembre 2020, n. 158).

B Hashtag cleaning
In order to overcome typos and to recognise properly the same subjects expressed by different hashtags, we implemented the edit distance, as said in the main text. Before implementing the edit distance, we first detect words containing numbers, which may represent dates or anniversaries: in this sense, we delete numbers and compare the non-numeric characters of the hashtags, in order to focus on the relevant information. The rationale is that we are interested in focusing on the fact that for instance, a firm is celebrating its anniversary to stress its robustness, but not on how many years they are celebrating. We do not consider in the comparison possible acronyms, i.e. words that were not recognise to be Italian or English, i.e. the two main languages for the company communication in Italy: it was done in order to avoid considering the random match of different acronyms.
Hashtags with a relative edit distance smaller than 0.20 were considered equal. The value of 0.20 was chosen after manually testing a sample of 200 couple of hashtags with edit distance lower than 0.20: the 90.9% of identifications were correct. We tried other thresholds, but the performances were worse. The identified hashtags were subsequently merged, after a manual check.

C Entropy-based null-models for bipartite networks
In the present section of the appendix we present more details about the the entropy-based null-model used for the analysis of the bipartite network between the layer of firm accounts and the one of (cleaned) hashtags, i.e. the Bipartite Configuration Model (BiCM [35]) mentioned in Section 3.2; a general review on the wider argument of entropy-based null-models for the analysis of real networks can be found in [36]. After the introduction of the BiCM in subsection C.1, we will show in subsection C.2 how the null-model can be used as a benchmark to validate the projection on one of the two layers, as proposed in [38]. Let us finally remark that the exact implementation of the null-model was performed via the python module NEMtropy, presented in [80].
C.1 Bipartite configuration model. Let us start from a real bipartite network G � Bi and call the two layers > and ?; we refer to nodes of the layer > and ? using respectively with Latin and Greek indices. A bipartite network can be represented via its biadjacency matrix, i.e. a rectangular |>|×|?|-matrix M whose the generic entry m iα is 1 if i 2 > and α 2 ? are connected and zero otherwise. The degree of nodes i and α can be written in terms of the biadjacency matrix as k i = ∑ α2? m iα and k α = ∑ i2> m iα , respectively.
The main idea is to build a network null-model that has some topological properties (in the case of the BiCM, the degree sequence) equal, on average, to the ones observed in the real system and is completely random for the remaining ones, as the statistical ensembles in Statistical Mechanics [36]. Thus, let us call G Bi the set (the ensemble) of all bipartite graphs with the same number of nodes, respectively in > and ?, as in the real network G � Bi and let us define the Shannon entropy for the ensemble [93]: where P(G Bi ) is the probability of the representative G Bi 2 G Bi . In order to consider the nullmodel as maximally random, but for the constraints, we have to maximise the entropy in Eq 1 constraining the average value of the degree sequence. We can obtain this result by using Lagrangian multipliers, i.e. by finding the maximum of where h. . .i are averages over the ensemble, z, η i and θ α are the Lagrangian multipliers and � indicates values measured over the real network. The maximisation of the entropy respect to the probability returns the functional forms of the probability in terms of the Lagrangian multipliers z, η i and θ α . Interestingly enough, when the constraints are linear in the adjacency matrix (as in the case of the degree sequence), the probability for the entire graph factorises in terms of probability per links, i.e.
where p iα is the probability of observing a link in m iα . The Lagrangian multiplier z is simply the normalisation of the probability per graph P(G Bi ). Instead, to finally obtain the value of η i and θ α , we have to impose that the average degree sequence is equal to the one observed in the real system, i.e. ( Interestingly enough, it has been shown that the system of Eq 2 is analogous to the maximisation of the likelihood of the observed system [94,95].
C.2 Validated projection. We can use the BiCM to study the similarity of the connections of the nodes on one of the layers and detect the significant similarities. Consider two nodes i, j 2 >: such similarity can be captured by the co-occurrences of links towards the opposite layer, or V-motifs [35]: The method presented in [38] proposes to compare, for each couple of nodes i, j 2 >, the observed V-motifs with the theoretical distribution obtained using the BiCM probabilities. The distribution of V-motifs is a Poisson binomial one, i.e. the generalization of a binomial distribution in which each event has a different probability; in the present application, due to the sparsity of our network, it can be safely approximate the Poisson binomial distribution with a Poisson one [96].
Once we have the theoretical distribution of the V-motifs we can state the statistical significance of their observation: for each couple i, j 2 >, we can associate a p-value related to the observed V � ij . We finally need a multiple test hypothesis to state the statistical significance of the p-values: the False Discovery Rate (FDR) procedure [97] is considered as one of the most effective, due to its control on the False Positive rate. All rejected hypotheses by the FDR indicate that the relative V-motifs cannot be explained by the theoretical distribution, that is, they contain more information than those of the constraints. Following this rationale, we can define a validated projection network as a monopartite network for nodes on layer > where a link between i and j is present if the related V � ij is statistically significant.

E CSR usage in the smaller communities
The remaining communities-which are smaller-all show some interest in CSR themes. The economic dimension of CSR is absent, except in the "Mobility" community, where two accounts use the hashtag "responsabilità" ("responsibility"). In all the minor communities the environmental dimension is bigger than the social one, with "sostenibilità" ("sustainability") as the recurring hashtag. As far as specific hashtags are concerned, these are varied depending on the community (see the table of Fig 13). "Sostenibilità" ("sustainability") is the more common

PLOS ONE
Firms' challenges and social responsibilities during Covid-19 hashtag in the "environmental" area, which appears in all the 10 communities. The two biggest communities ("Digital Transformation" and "Environmental Sustainability") include "stakeholder" as their most frequently tweeted hashtags. This means that firms are specifically addressing their stakeholders [64]. "Csr" and "responsabilità" ("responsibility") are the only hashtags used in the "economic" area, with the latter one appearing in only three communities. As the concept of responsibility is connected to the idea of accountability, the low presence of this hashtag shows that Twitter is not used as a mean to report firm's economic and financial results. This is consistent with previous research, which maintains that firms use Twitter mostly for marketing purposes and to promote their products and service, rather than for the sharing news about CSR [66].