Modeling Periodic Impulsive Effects on Online TV Series Diffusion

Background
Online broadcasting substantially affects the production, distribution, and profit of TV series. In addition, online word-of-mouth significantly affects the diffusion of TV series. Because on-demand streaming rates are the most important factor that influences the earnings of online video suppliers, streaming statistics and forecasting trends are valuable. In this paper, we investigate the effects of periodic impulsive stimulation and pre-launch promotion on on-demand streaming dynamics. We consider imbalanced audience feverish distribution using an impulsive susceptible-infected-removed (SIR)-like model. In addition, we perform a correlation analysis of online buzz volume based on Baidu Index data.

Methods
We propose a PI-SIR model to evolve audience dynamics and translate them into on-demand streaming fluctuations, which can be observed and comprehended by online video suppliers. Six South Korean TV series datasets are used to test the model. We develop a coarse-to-fine two-step fitting scheme to estimate the model parameters, first by fitting inter-period accumulation and then by fitting inner-period feverish distribution.

Results
We find that audience members display similar viewing habits. That is, they seek new episodes every update day but fade away. This outcome means that impulsive intensity plays a crucial role in on-demand streaming diffusion. In addition, the initial audience size and online buzz are significant factors. On-demand streaming fluctuation is highly correlated with online buzz fluctuation.

Conclusion
To stimulate audience attention and interpersonal diffusion, it is worthwhile to invest in promotion near update days. Strong pre-launch promotion is also a good marketing tool to improve overall performance. It is not advisable for online video providers to promote several popular TV series on the same update day. Inter-period accumulation is a feasible forecasting tool to predict the future trend of the on-demand streaming amount. The buzz in public social communities also represents a highly correlated analysis tool to evaluate the advertising value of TV series.


Introduction
TV series, which consist of a limited number of episodes, are a popular type of entertainment welcomed by all ages. TV series were traditionally broadcast on television until House of Cards, a popular American political drama TV series first broadcast in 2013, was released in full through the Netflix streaming service [1]. In China, several major online video-streaming service suppliers buy the broadcast rights of TV series and sell inserted commercial advertisements, which audiences must view. To optimize their advertisement schedules, it is important for video-on-demand (VOD) suppliers to record on-demand streaming statistics and forecast corresponding trends.
As social network services (SNSs) have become widespread over the past few years, audiences have become accustomed to discussing and sharing the plots of series and posting valence (i.e., a positive or a negative sentiment) in online social communities. With the help of the behavioral e-footprints of audiences, VOD suppliers have the opportunity to learn the collective behaviors of human dynamics through modern computational social science approaches [2,3]. Several studies suggest that the buzz or word-of-mouth volume in social networks can be used to forecast a movie's box-office performance [4,5].
According to the Bass model, a traditional, concise innovation diffusion model proposed by Bass in 1969 [6] to forecast the first adoption of durable products, there are two types of agents: innovators and imitators. Each type has an effect on the adoption diffusion process. Theoretically, the Bass model consists of a simple non-linear differential equation of the adoption function F(t) with respect to time t:

$$\frac{\partial F(t)}{\partial t} = [p + qF(t)][1 - F(t)],$$

where p is the coefficient of innovation from external influences and q is the coefficient of imitation from internal influences. This equation has a counterpart in epidemic-spread theory, whereby q corresponds to the spreading rate in the standard susceptible-infected (SI) equation of the infection function I(t):

$$\frac{\partial I(t)}{\partial t} = qI(t)[1 - I(t)],$$

and p arises if we assume that susceptible agents are also infected independently and randomly from outside. Thus, the Bass model can be regarded as an imported SI model.
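As a quick numerical check of the Bass equation above, the following minimal sketch compares its well-known closed-form solution for F(0) = 0 with a forward-Euler integration of the same differential equation. The function names and the values of p and q are illustrative assumptions, not estimates from this paper.

```python
import math


def bass_closed_form(t, p, q):
    """Closed-form solution of dF/dt = (p + q*F) * (1 - F) with F(0) = 0."""
    e = math.exp(-(p + q) * t)
    return (1.0 - e) / (1.0 + (q / p) * e)


def bass_euler(t_end, p, q, dt=1e-3):
    """Forward-Euler integration of the same equation, for comparison."""
    f = 0.0
    for _ in range(int(round(t_end / dt))):
        f += dt * (p + q * f) * (1.0 - f)
    return f


if __name__ == "__main__":
    p, q = 0.03, 0.38  # hypothetical coefficients, chosen only for illustration
    for t in (1.0, 5.0, 10.0, 20.0):
        print(t, bass_closed_form(t, p, q), bass_euler(t, p, q))
```

Setting p = 0 in the right-hand side (with a nonzero initial value) gives the plain SI equation, which is the sense in which the Bass model behaves like an SI model with an extra external infection term.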
A large body of literature has adopted biological SI-family models to study peer-wise cascade diffusion phenomena across several fields, including physics, sociology, economics, and information science [7]. In particular, the theoretical result that the critical threshold for restraining epidemic outbreaks tends to zero in scale-free complex networks [8] has inspired researchers to investigate the collective features of human dynamics in social networks, such as rumor diffusion [9][10][11], information sharing [12,13], and word-of-mouth contagion [14].
Most of these studies assume that the epidemic-like diffusion occurs in a closed system. That is, during the initial stage, only a few infected individuals act as origins who trigger diffusion, and the system takes no external influences into account [15]. However, Myers et al. find that only 71% of Twitter's tweet volume is attributed to network diffusion. The remaining 29% is due to events and factors external to the network [16]. Stated simply, the literature can distinguish agents who are randomly infected by outside origins from those who are infected interpersonally [17,18].
In comparison, TV series exert another type of imported influence on the diffusion process. TV series update periods are fixed, which can be viewed as a periodic impulsive stimulation of the entire audience. These circumstances resemble the impulsive vaccination of the susceptible-infected-removed (SIR)-like epidemic model in biology [19][20][21]. However, impulsive vaccination suppresses epidemic diffusion, whereas TV series updates stimulate diffusion.
On the other hand, audiences have no chance to view the episodes they missed in the conventional television broadcasting mode, whereas they can review episodes as many times as they want in the online streaming mode. The online platform hosts so many videos that audiences face choice overload. Li et al. introduce the concept of view scope to model the user information-processing capability under information overload in Facebook-like [22,23] and Twitter-like social networks [24]. Su et al. investigate the incomplete reading behavior of microblog users encountering massive messages by improving the traditional epidemic model with a reading rate [25]. Wang et al. model the nonredundant information transmission behavior in social networks [26]. However, our study finds that periodic impulsive stimulation significantly influences the online streaming behavior of TV series audiences because of the lock-in effect. Therefore, we develop a periodic impulsive-susceptible-infected-recovered (PI-SIR) model to analyze the periodic impulsive effects on online TV series diffusion and test this model using six popular South Korean TV series.

Analysis
The basic model
Referring to the basic deterministic SIR epidemic diffusion model proposed by McKendrick and Kermack [27], we denote S(t) as the number of agents who have not yet been attracted by the TV series at time t, I(t) as the number of agents who have adopted the TV series and might share information with or spread valence to their friends in social networks or through physical contacts, and R(t) as the number of agents who have lost interest in the TV series and given up watching.
Zhou et al. demonstrate that the social network has a heterogeneous topological structure with scale-free distribution of node degrees [28], which means that the minority of the population has larger connectivity and influence than the remaining majority. Wang et al. find the heterogeneous network structure has a significant influence on the epidemic threshold and final size [29]. However, in modern society, people live in a complicated communication network in which they can keep in touch with one another online or offline.
Since it is difficult to reconstruct the influence relationships among audience members, we simplify by assuming that mixed communication networks have a homogeneous network structure for the duration of a limited TV-series broadcast, as the Bass model does. That is, for a social network with average degree $\langle k \rangle$, the average spreading rate (ASR), denoted as λ, is defined in terms of $\langle k \rangle$ under the mean-field approximation [8,30]. We also denote β as the average removed rate (ARR). In addition, based on the Bass model [6], we consider that certain audiences are persuaded by mass media advertising and promotion to start watching with an average probability α, the external influence rate (EIR). Therefore, the differential equations that correspond to the general imported SIR model are as follows:

$$\begin{aligned}
\frac{\partial S(t)}{\partial t} &= -\alpha S(t) - \lambda I(t)S(t), \\
\frac{\partial I(t)}{\partial t} &= \alpha S(t) + \lambda I(t)S(t) - \beta I(t), \\
\frac{\partial R(t)}{\partial t} &= \beta I(t),
\end{aligned}$$

where S(t) + I(t) + R(t) = 1. We let S(0) = 1, I(0) = R(0) = 0 be the initial condition.
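To make the behavior of the general imported SIR model concrete, here is a minimal forward-Euler sketch of the three equations with the stated initial condition S(0) = 1, I(0) = R(0) = 0. The function name, step size, and parameter values are assumptions chosen for illustration only; they are not the estimates reported in Table 2.

```python
def simulate_imported_sir(alpha, lam, beta, t_end, dt=0.01):
    """Integrate dS/dt = -aS - lIS, dI/dt = aS + lIS - bI, dR/dt = bI.

    Returns a list of (t, S, I, R) tuples, starting from S=1, I=R=0.
    """
    s, i, r = 1.0, 0.0, 0.0
    trajectory = [(0.0, s, i, r)]
    for n in range(1, int(round(t_end / dt)) + 1):
        newly_infected = alpha * s + lam * i * s  # external + interpersonal
        newly_removed = beta * i
        s -= dt * newly_infected
        i += dt * (newly_infected - newly_removed)
        r += dt * newly_removed
        trajectory.append((n * dt, s, i, r))
    return trajectory


if __name__ == "__main__":
    # Hypothetical rates: weak external influence, moderate spreading.
    for t, s, i, r in simulate_imported_sir(0.01, 0.5, 0.05, t_end=50.0)[::1000]:
        print(round(t, 1), round(s, 4), round(i, 4), round(r, 4))
```

Because the same flows are subtracted from one compartment and added to another at each step, S + I + R stays at 1 up to rounding, which mirrors the constraint stated with the equations.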
According to the theory of impulsive differential equations [31], an impulsive version of the SIR model with a fixed time interval τ is as follows:

where parameter μ is the coefficient of impulse intensity (CII). Substituting S(t) + I(t) + R(t) = 1 into Eq 1, after rearrangement, we obtain:

Obviously, for a popular TV series, if the removed rate β ≈ 0, then R(t) ≈ 0. Therefore, I(t) follows an S-shaped curve, with I(∞) → 1 and S(∞) → 0. In contrast, for an unpopular TV series, the effect of β cannot be ignored. Therefore, the endemic status is I(∞) → 0, S(∞) → 0, and R(∞) → 1. In fact, because no TV series has infinite duration, most I(t) curves are J-shaped during the initial limited broadcast.
The online on-demand streaming statistics are easier to trace and record than the variation in audience size. If we suppose that the m episodes newly updated at time $\tau_k$ are viewed by I(t) agents evenly within the following period τ, the on-demand streaming amount (ODSA) V(t) at time t has a shape similar to I(t) and is linear in I(t):

As shown in Fig 1(a), the solid line is the general SIR diffusion curve derived from Eqs 5 and 6, whereas the piecewise broken curve is the general SIR diffusion considering the impulsive stimulation at time $\tau_k$ of Eq 4. Because the impulsive effect results in a step increment of I(t), which changes the initial conditions of the next period, the subsequent segment of the impulsive SIR curve will borrow the corresponding segment of the general SIR curve.
However, the actual online ODSA curve of a TV series looks more like Fig 1(b). This zigzagging curve differs from the stepwise impulsive diffusion curve in Fig 1(a). Possible explanations include the following:
1. Most audience members, including faithful fans and the newly recruited, are eager for the new episodes and finish watching at every update time, which results in the local sharp peaks (marked by small round circles).
2. A small fraction of the audience cannot keep pace with new episodes and delays viewing to the next few days, which results in sharp declines after every update time (marked by small triangles).
3. Promotion and buzz climb again before the next update time as viewers attempt to guess the plots and attract new viewers to join the audience, which results in the slowly climbing segment after the valley.

PI-SIR model
Therefore, we propose a PI-SIR model for online TV series diffusion. Fig 2 illustrates one single-period diffusion process from $\tau_k$ to $\tau_{k+1}$.
Because the audience that stopped watching cannot be tracked during non-update time periods, we count the removed population only at every update time (Eq 9). Therefore, $R(\tau_k)$ is a constant during $(\tau_k, \tau_{k+1})$, and $S(t) + I(t) = 1 - R(\tau_k)$. In addition, we assume that every new audience member views the previous episodes, whereas the deposited audience views only the currently updated episodes.
In sum, there are two groups of newly infected agents who view all the m(k+1) previous episodes in the following period:
1. At update time $\tau_k$, $\mu S(\tau_k^-) I(\tau_k^-)$ susceptible agents transition to infected ones.
2. During $\tau_k$ to $\tau_{k+1}$, $\int_0^{\tau^-} [\alpha S(\tau_k + i) + \lambda I(\tau_k + i) S(\tau_k + i)]\,di$ susceptible agents transition to infected ones, expressed as a continuous time integral.
Otherwise, at update time $\tau_k$, the deposited audience $I(\tau_k^-) - \beta I(\tau_k^-)$ remains, which views only the newly updated m episodes during the following period. Therefore, we obtain the inter-period accumulation (IPA) of the on-demand streaming amount, $V(\tau_k, \tau)$, from $\tau_k$ to $\tau_{k+1}$:

According to Fig 1(b), we define a concave feverish function $f(i)$ (Fig 2), which satisfies $\int_0^{\tau^-} f(i)\,di = 1$, to fit the actual zigzag curve. Finally, the ODSA function V(t) is:

Unlike in biological epidemics, the initial audience is generally not equal to 0 because of commercial pre-launch promotion. That is,

According to the coarse-to-fine strategy, we first inspect the inter-period variation of $V(\tau_k, \tau)$ with respect to k. Using the variable transformation $i = t - \tau_k$ and substituting Eq 7 into the integral term of Eq 10 yields,

This result is substituted into Eq 10,

Referring to Eq 8, we have

This result is substituted into Eq 13 to yield,

Fig 2. One single-period diffusion process from $\tau_k$ to $\tau_{k+1}$. At update time $\tau_k$, the impulsive intensity $\mu S(\tau_k^-) I(\tau_k^-)$ indicates the newly recruited audience, while the removed population $\beta I(\tau_k^-)$ is the audience that stopped watching, which is counted only at every update time. During non-update periods, external influence αS(t) and interpersonal spreading λS(t)I(t) occur in every time unit. A concave feverish function is used to fit the imbalance of the on-demand streaming distribution.
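The single-period process described in this section can be sketched numerically. The code below is one reading of the verbal description, not the paper's exact equations: at each update time it applies the impulsive recruitment μS(τ_k⁻)I(τ_k⁻) and the removal βI(τ_k⁻), then lets external influence and interpersonal spreading act continuously while R stays constant. The IPA is computed on a per-capita scale by weighting new viewers with m(k+1) episodes and the deposited audience with m episodes. The function name, the zero-based period index k, the Euler step, and all parameter values are assumptions for illustration.

```python
def simulate_pi_sir(alpha, lam, beta, mu, i0, tau=7, m=2, periods=8, dt=0.01):
    """Sketch of a periodic impulsive SIR process (per-capita scale).

    Returns (ipa, (S, I, R)), where ipa[k] approximates the inter-period
    accumulation of streaming for the period starting at update k.
    """
    s, i, r = 1.0 - i0, i0, 0.0  # assumed nonzero initial audience i0
    ipa = []
    steps = int(round(tau / dt))
    for k in range(periods):
        # Impulse at update time tau_k, using the values just before it.
        recruited = mu * s * i
        dropped = beta * i
        deposited = i - dropped  # audience that stays and watches m new episodes
        s -= recruited
        i += recruited - dropped
        r += dropped
        # Continuous diffusion until the next update; R is held constant.
        joined = 0.0
        for _ in range(steps):
            flow = dt * (alpha * s + lam * i * s)
            s -= flow
            i += flow
            joined += flow
        # New viewers catch up on all m*(k+1) episodes released so far.
        ipa.append(m * (k + 1) * (recruited + joined) + m * deposited)
    return ipa, (s, i, r)


if __name__ == "__main__":
    ipa, final_state = simulate_pi_sir(0.002, 0.05, 0.1, 0.5, i0=0.01)
    print([round(v, 4) for v in ipa], final_state)
```

Multiplying each IPA value by a potential audience size would convert it to view counts; that scaling is left out here because it does not change the shape of the curve.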

Data collection
In China, Youku.com (www.youku.com), Tudou.com (tv.tudou.com), and iQIYI.com (www.iqiyi.com) are the three leading online video-on-demand suppliers and account for more than 90% of the market. Youku and Tudou merged on Aug. 23rd, 2012. These two companies publish and update their online streaming statistics data. Over the past few years, South Korean TV series have entered the Chinese market and rapidly gained influence. We selected six TV series to test our PI-SIR model: The Masters Sun (Masters), The Heirs (Heirs), My Love From the Stars (Love), Inspiring Age (Age), Kaputori, and Modern Farmer (Farmer). All of the series are broadcast online in mainland China. Therefore, we gather daily ODSA from the Youku and Tudou websites (http://index.youku.com and http://top.iqiyi.com, respectively). The basic information for these datasets is listed in Table 1.
In addition, to perform a correlation test of online buzz and on-demand streaming behaviors, we use Baidu Index (BI) (http://index.baidu.com/) to collect longitudinal online buzz data. BI is provided by the predominant Chinese search engine company Baidu.com and enables users to look up the search volume and trends of certain hot keywords and phrases.

Fitting
Based on the datasets, we define an inter-period error function of IPA to estimate the PI-SIR model parameters,

where D is the length of the broadcast duration, U is the number of updates and is equal to the floor of $D/\tau$, $V(\tau_k, \tau)$ is the actual value, and $V'(\tau_k, \tau)$ is the fitted value. Based on a report on China's online video streaming market, the total number of potential audience members was about $N = 4.0 \times 10^8$ in 2013.

A two-step fitting scheme is used to seek out the estimated parameters. In the first step, we obtain the estimated values $[\hat{\alpha}, \hat{\lambda}, \hat{\beta}, \hat{\mu}]$ as the arg min of the inter-period error function. Based on the samples, the remaining constants are established as τ = 7, m = 2. We take the initial audience to be $I^-(0) = V(0)/m$. The estimated parameters $\hat{\alpha}$, $\hat{\lambda}$, $\hat{\beta}$, and $\hat{\mu}$ are listed in Table 2.

Then, before the second step, we use cosine similarity to investigate whether the inner-period quasi-parabolic curve segments have similar characteristics, where

As listed in Table 3, nearly all of the six sim values are larger than 0.9, and the $V_s$ values are very close to 0 (except Farmer's), which implies that the ODSA of every period might exhibit the same characteristics from the perspective of collective behavior. Therefore, we need only one set of estimated parameters $\hat{f}(\cdot)$ to fit the feverish function in the second step, which is based on the ratio $V(\tau_k + i)/V(\tau_k, \tau)$, where $V(\tau_k + i)$ and $V(\tau_k, \tau)$ are the actual values and $V'(\tau_k + i)$ and $V'(\tau_k, \tau)$ are the fitted values. The results for $\hat{f}(\cdot)$ are listed in Table 2. Figs 3-8 show the two-step fitting results for the six datasets.

Finally, the Pearson correlation coefficient is used to investigate the correlation between word-of-mouth and on-demand streaming,

$$\mathrm{Corr}(V(t), B(t)) = \frac{\langle V(t) - \bar{V},\ B(t) - \bar{B} \rangle}{\lVert V(t) - \bar{V} \rVert \cdot \lVert B(t) - \bar{B} \rVert},$$

where B(t) is the BI temporal series that corresponds to the broadcasting periods and $\bar{V}$ and $\bar{B}$ are the mean values of V(t) and B(t), respectively. The results of Corr are listed in Table 3.
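The similarity and correlation measures used in this section are simple to compute. The sketch below implements cosine similarity, the Pearson correlation coefficient written as the cosine of the mean-centered series (matching the Corr definition), and a helper that turns a daily ODSA series into per-period shares V(τ_k + i)/V(τ_k, τ). The helper assumes that V(τ_k, τ) is the sum of the daily values within the period and that only complete periods are used; the function names and the sample numbers are hypothetical.

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length numeric sequences."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def pearson_corr(v, b):
    """Pearson correlation as the cosine similarity of centered series."""
    v_bar = sum(v) / len(v)
    b_bar = sum(b) / len(b)
    return cosine_similarity([a - v_bar for a in v], [c - b_bar for c in b])


def inner_period_shares(daily, tau=7):
    """Split a daily series into complete periods and normalize each one.

    Each returned list holds V(tau_k + i) / V(tau_k, tau) for i = 0..tau-1,
    assuming V(tau_k, tau) is the period total. A trailing partial period
    is dropped.
    """
    shares = []
    for start in range(0, len(daily) - tau + 1, tau):
        period = daily[start:start + tau]
        total = sum(period)
        shares.append([value / total for value in period])
    return shares


if __name__ == "__main__":
    daily_views = [9, 6, 3, 2, 2, 3, 4, 12, 8, 4, 3, 3, 4, 5]  # made-up data
    first, second = inner_period_shares(daily_views, tau=7)
    print(cosine_similarity(first, second))
```

Comparing the share vectors of different periods with `cosine_similarity` is one way to check whether the inner-period shape repeats, which is the property the single set of feverish-function parameters relies on.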

Findings and Discussion
Our comprehensive analysis of the six datasets yields five findings as follows.

1. Audiences have similar viewing habits. Comparing $\hat{f}(1)$ to $\hat{f}(7)$ in Table 2, we find that the feverish distributions within an update period of the six different types of TV series fluctuate periodically, which means that audience attention is imbalanced under the external impulsive stimulation.
2. The on-demand streaming fluctuation is highly correlated with the online buzz fluctuation. According to an empirical criterion for the Pearson correlation coefficient (PCC), the first five Corr values in Table 3 are larger than 0.5, which indicates high correlation. Visual confirmation in Figs 9-13 supports this finding. The explanation might be that today's audience is accustomed to searching for information, posting messages, discussing plots, and exchanging ideas in online social communities while watching TV series. However, the Corr of Farmer is close to 0, which indicates no correlation. By investigating the two curves in Fig 14, we find that the mismatched and heavily weighted initial peaks of the BI and ODSA curves might bring Corr down, although the ODSA fluctuations follow those of BI with a slight lag.
3. Impulsive intensity plays a crucial role. Comparing the EIRs, ARRs, ASRs, and CIIs of Love and Heirs in Table 2, we find that the CII of Love is approximately 118 times larger than that of Heirs, while the EIR, ARR, and ASR of the two series are similar. Although the initial ODSA and initial BI of Love listed in Table 4 are less than those of Heirs, the ODSA and BI of Love climb much faster than those of Heirs. The ODSA and BI of Love start to exceed those of Heirs from the 5th week, which can be deduced from the IPA curves shown in Fig 15, the BI accumulation curves shown in Fig 16, the ODSA fluctuation curves shown in Fig 17, and the BI fluctuation curves shown in Fig 18. Finally, according to Table 4, the total ODSA of Love is approximately 3 times larger than that of Heirs, and the BI of Love is approximately 2 times larger than that of Heirs.
4. Audience activity degree indicates the trend of the streaming rate. We can divide the six series into two groups. The first group includes Masters, Heirs, and Love. The second group includes Age, Kaputori, and Farmer. As shown in Fig 16, the BI accumulation values of the first group are tens or hundreds of times larger than those of the second group. This indicates that the audience activity degree of the first group is much larger than that of the second group. Accordingly, comparing the EIRs, ARRs, ASRs, and CIIs in Table 2, we find that the removed rates of the second group are several times larger than those of the first group. High ARRs depress the rise of the ODSA curves and can even drive the trend down (see the IPA curves in Fig 15).
5. The initial value is also important. Comparing the EIRs, ARRs, ASRs, and CIIs of Heirs and Masters in Table 2, we find that the ASR of Masters is approximately 6.7 times larger than that of Heirs, while the other three parameters are equal. This outcome explains why the two ODSA curves exhibit a similar shape (Fig 17). However, the ODSA curve of Masters climbs slightly more than that of Heirs at the tail because of the larger ASR value. As listed in Table 4, the total ODSA of Heirs is approximately 4 times larger than that of Masters and the total BI of Heirs is approximately 4.5 times larger than that of Masters. A significant reason for this gap is the initial value. Also according to Table 4, the initial ODSA of Heirs is approximately 5.6 times larger than that of Masters, while the initial BI of Heirs is approximately 6.9 times larger than that of Masters. In addition, the initial values of Age, Kaputori, and Farmer are much smaller than those of the other three series. Therefore, their total ODSA and BI are also lower.

Conclusion
According to our PI-SIR model, the online on-demand streaming amount of TV series fluctuates with respect to the periodic impulsive stimulation and climbs as word-of-mouth diffuses. Because the audience can review previous episodes whenever and wherever they choose in the context of online streaming, a feature that traditional TV broadcasts cannot provide, the word-of-mouth effect accumulates to amplify the audience rating. Our analysis results for six different types of South Korean TV series reveal that impulsive intensity on update days and pre-launch promotion have stronger impacts on the total on-demand streaming amount than other parameters. The implication of these results for management is that it is worthwhile to invest in promotion near update days to stimulate audience attention and interpersonal diffusion. In addition, strong pre-launch promotion seems to be a good marketing tool to improve overall performance. Our research also reveals that it is not advisable for online video providers to promote several popular TV series on the same update day because of the imbalanced distribution of audience attention. The technical implication of our research is that inter-period accumulation is a feasible forecasting tool to predict the future trend of the on-demand streaming amount. In addition, the buzz in public social communities is a highly correlated analysis tool to evaluate the advertising value of TV series.

Limitations
Our model is not well suited to fitting sudden, emergency situations because differential equations tend to smooth nonlinear trends. Today, with big data available everywhere, online video-on-demand streaming providers would like to develop technologies to trace user habits using cookies or other behavioral footprints. In this research, we could not obtain more detailed information than the open records of on-demand streaming amounts. In future research, big data and more intelligent technologies can be expected to fill this gap.