Data quality assessment and associated factors in the health management information system among health centers of Southern Ethiopia

Background
A well-designed health management information system (HMIS) is necessary for improving the effectiveness and efficiency of health services. It also helps to produce quality information and to support evidence-based monitoring and adjustment of policy implementation and resource use. However, evidence shows that data quality is poor and that data are not used for program decisions in Ethiopia, especially at the lower levels of the health care system, and this remains a major challenge.

Method
A facility-based cross-sectional study design was employed. A total of 18 health centers were selected by simple random sampling, and 302 health professionals were selected by the lottery method from the selected health centers. After the tools were pretested, data were collected by health professionals who were experienced and trained in HMIS tasks. Data quality was assessed using the accuracy, completeness and timeliness dimensions. Seven indicators from the national priority areas were selected to assess data accuracy, and monthly reports were used to assess completeness and timeliness. SPSS version 20 was used for quantitative data analysis: descriptive statistics, and binary logistic regression to identify candidate variables.

Result
A total of 291 respondents participated in the study, a response rate of 96%. Overall average data quality was 82.5%. The accuracy, completeness and timeliness dimensions were 76%, 83.3% and 88.4%, respectively, which was lower than the national target. About 52.2% of respondents were trained on HMIS, 62.5% had supervisory visits as per the standard and only 55.3% received written feedback. Only 11% of facilities had assigned health information technicians. Level of confidence [AOR = 1.75, 95% CI (0.99, 3.11)], filling registration or tally completely [AOR = 3.4, 95% CI (1.3, 8.7)], data quality check, supervision [AOR = 1.7, 95% CI (0.92, 2.63)] and training [AOR = 1.89, 95% CI (1.03, 3.45)] were significantly associated with data quality.

Conclusion
This study found that the overall data quality was lower than the national target. Over-reporting of all indicators was observed in all facilities. Major improvements in supervision quality and training are needed to increase the confidence of individuals in performing HMIS activities.



Bahailu Balcha, MSc
Amene Abebe

Response to reviewers

First of all, we would like to thank the academic editor for giving us adequate time to revise the manuscript and address all the concerns of the reviewers and the journal requirements. Since receiving the reports of the academic editor and the expert reviewers, we, the authors of this manuscript, have worked extensively and given due attention to every concern raised so that each is well addressed. Thank you so much!

A. Point-by-point response to the academic editor

1. We have checked our manuscript again for fulfillment of PLOS ONE's style requirements, including the naming of files, and it has been written accordingly. Thank you!
2. Regarding language usage, grammar and spelling errors, we have rewritten every statement throughout the manuscript, correcting all grammatical, spelling and punctuation errors in consultation with English language professionals. Thank you!
3. Following the academic editor's recommendation to list the participating health facilities in the methods or in a supplementary file, the participating health institutions are included as a supplementary file. Thank you!
4. As requested by the academic editor, we have included the full survey questionnaire (in the original language and the English version) as supporting information. Thank you!
5. Regarding data availability, all the data used are included in the manuscript. Thank you!
6. Following the academic editor's recommendation regarding publicity of the funding source, we have removed the statement from the acknowledgment section in the current version. Thank you!
7. Regarding inclusion of the author Berhan Tessaw in the online submission, we have added this author online (via edit submission) and also added one author (Amene Abebe) who was missed in the first submission. Thank you!
8. The comment about referring to Tables 1-5 in the text of the manuscript is well accepted, and we have corrected the text so that the tables are referred to sequentially as they appear in the manuscript. Thank you!
9. Regarding the check for retracted articles cited in the manuscript, we have checked the references and no retracted reference is used. Thank you!
10. Regarding the requirement for an ORCID iD, the corresponding author and some of the co-authors (such as Amene Abebe) already have an ORCID iD. Thank you!
11. The comment about text overlapping with a previous publication is correct. However, it was not deliberate: the publications on which I was a co-author were published in 2020, whereas the current study was done in 2018. Nevertheless, we have rephrased the overlapping text in the current version of the manuscript. Thank you!

B. Point-by-point response to reviewer one

1. Concerning the reviewer's comment about revision of the document, its form and the statistical part: we have rewritten the document and corrected all errors in the write-up and in the reporting of statistical output.
2. The reviewer's comment about shortening the introduction section of the manuscript is well accepted. We have reduced the introduction to one page and a paragraph, and restructured the content into what is known in the existing literature, what is lacking and what the study aimed to do. Thank you!

C. Point-by-point response to reviewer two

First of all, we would like to express our gratitude to the reviewer for recognizing the topic as an important area of research and for taking an interest in it. The reviewer's concerns are addressed point by point as follows:

1. Based on the reviewer's recommendation to rephrase the title, we have rephrased the title as suggested. Thank you!
2. The comment about writing abbreviations in full at least once before using them is well accepted. We have written every abbreviation in full at least once before using it in the manuscript file. Thank you!
3. The reviewer's comment about removing unnecessary spaces between words and punctuation is right. We have removed all the unnecessary spaces and also checked and corrected punctuation errors throughout the document. Thank you!
4. The reviewer's comments regarding the structure of the introduction section in the previous version of the manuscript have been accepted. In the current version we restructured the contents of the introduction into three sections, "introduction", "significance of the study" and "aim of the study", and we removed all the repeated concepts that unnecessarily lengthened the introduction. Thank you so much!
5. The reviewer's comment that the comma, as used in English, is not used in mathematics is right. However, we searched throughout the manuscript and could not find such an error. Thank you!
6. The reviewer's comment about inserting references to support ideas is accepted. We have cited references as needed throughout the manuscript. Thank you!
7. Regarding the reviewer's comment on the unnecessary repetition of words like "the zone" and "the region", the repetitions have been removed and corrected. Thank you!
8. The reviewer's comment about correcting the way dates are written in the methods section is accepted. We have corrected the dates in the current version. Thank you!
9. The reviewer's comment regarding rephrasing or inclusion of the map of the study area is right. We have rephrased and restructured the description of the study setting, study design and study period and made it clearer and more precise. Thank you!
10. The sentence in the methods section which was written as (The study utilized Facility based …..health workers involved in RHIS activities) has been reformulated. Thank you!
11. The reviewer's recommendation to restructure and combine the repeated points in the methods section is well accepted. In the current version we have removed the repeated points and combined those that express the same idea.

Introduction
The health management information system (HMIS) is one of the six building blocks of the health system; it integrates data collection, processing, reporting and use of information. Globally, the restructuring of health information systems has been an important trend since its declaration as an essential health care strategy at the Alma-Ata conference on primary health care in 1978 [1,2,3]. Developing countries have also launched reforms to improve and expand health information systems as a component of health system reform [4]. The HMIS is a major source of information for monitoring and adjusting policy implementation and resource use in Ethiopia [5,6]. The Health Sector Transformation Plan (HSTP) of Ethiopia considers the information revolution as one of its four transformation agendas, which involves advancement in methods from data collection to the use of information for decisions [7,8].
Data that are accurate, complete and delivered to users on time are important for healthcare planning, management and decision making. However, data quality is frequently assessed only as a component of the effectiveness or performance of the health information system (HIS), so data quality assessment is hidden within these broader scopes. This may lead to neglect of data management and, in turn, to unawareness of data quality problems [9]. In Ethiopia, problems of data quality and reliability mean that data do not adequately guide program decisions. Poor data quality at the lower administrative or peripheral levels (woredas and health facilities), which are the source of most of the data used for decision making in the health sector, remains a challenge, as reported in the 2016 annual report of the health sector transformation plan [10].
An assessment of HMIS data quality and information use found content completeness, reporting timeliness and accuracy of 39%, 73% and 76%, respectively.
Existing evidence shows that in Ethiopia, including SNNPR (South Nation Nationality People Region), a low level of data quality, below the national standard, has been reported as a gap.
The data accuracy level for health centers was 36.22%, which was much lower than the national target. This is due to many factors, such as lack of training, lack of decisions based on supervision, lack of feedback, infrequent data quality assurance and limited skills of health professionals [6,7,11,12].
Although the 2016 annual HSTP performance report of SNNPR shows improvements in HMIS performance in the region, data quality remains a challenge, especially for indicators related to HIV/AIDS, TB (tuberculosis) and ANC (antenatal care) [13].

Study setting, study design and study period
This study was conducted in Hadiya Zone, which is located in the Southern Nations, Nationalities and Peoples' Regional State of Ethiopia. Hadiya Zone comprises 10 districts, 2 town

For Accuracy dimensions
A sample of 18 health centers was selected to assess data quality. According to the national HMIS information use and data quality manual, seven to nine data elements from each health center are sufficient to assess data accuracy [16]. Data elements were selected randomly from the top-priority indicators at national level, and seven data elements were therefore verified in each of the 18 selected health centers: antenatal care fourth visit, institutional deliveries, pentavalent third dose, PMTCT coverage, tuberculosis cure rate, confirmed malaria cases and contraceptive acceptance rate. Documents for two randomly selected months, September and November, were reviewed to check the consistency of the selected data elements. The accuracy of each data element was determined by its accuracy ratio: the number recounted from the source document or register divided by the number reported to the next level. An accuracy ratio below 0.90 (90%) indicates over-reporting and a ratio above 1.10 (110%) indicates under-reporting.
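The accuracy-ratio rule described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' analysis code: the function names `accuracy_ratio` and `classify_accuracy` and the example counts are hypothetical, and only the 0.90 and 1.10 cut-offs come from the text.

```python
# Illustrative sketch only: function names and example counts are assumptions;
# the 0.90 / 1.10 cut-offs are the ones stated in the text.
LOWER_LIMIT = 0.90
UPPER_LIMIT = 1.10


def accuracy_ratio(recounted: int, reported: int) -> float:
    """Recounted value from the source register divided by the reported value."""
    if reported <= 0:
        raise ValueError("reported count must be positive")
    return recounted / reported


def classify_accuracy(ratio: float) -> str:
    """Map an accuracy ratio to the reporting category used in the study."""
    if ratio < LOWER_LIMIT:
        return "over-reporting"   # more was reported than the register shows
    if ratio > UPPER_LIMIT:
        return "under-reporting"  # less was reported than the register shows
    return "acceptable"


# Hypothetical example: 80 visits recounted in the register, 100 reported upward.
ratio = accuracy_ratio(recounted=80, reported=100)
print(f"{ratio:.2f} -> {classify_accuracy(ratio)}")
```

A ratio below 1 means the report contains more events than the register supports, which is why low ratios are labelled over-reporting.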

For Completeness and Timeliness
Content completeness was assessed as the proportion of data elements filled in on the reporting formats for the selected months. A tolerance level of 90% was used in grading health centers, meaning that each health center was expected to complete at least 90% of the data elements on the report formats. All data elements of the HMIS reports for the two months were reviewed to assess content completeness. Timeliness was assessed as the proportion of reports delivered by the deadline out of the reports expected for the two selected months. A tolerance of 90% was also used in grading health centers on timeliness.
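As a rough sketch of how these two proportions and the 90% grading threshold could be computed, consider the Python below. The function names, the use of `None` to mark a blank cell, and the example report are assumptions made for illustration; the study itself reviewed paper report formats.

```python
from typing import Optional, Sequence

TOLERANCE = 0.90  # grading threshold stated in the text


def content_completeness(cells: Sequence[Optional[int]]) -> float:
    """Share of report cells that are filled; None marks a blank cell, 0 counts as filled."""
    if not cells:
        raise ValueError("report has no cells")
    filled = sum(1 for value in cells if value is not None)
    return filled / len(cells)


def timeliness(on_time_reports: int, expected_reports: int) -> float:
    """Reports delivered by the deadline divided by reports expected."""
    if expected_reports <= 0:
        raise ValueError("expected_reports must be positive")
    return on_time_reports / expected_reports


def meets_tolerance(score: float) -> bool:
    """True when a completeness or timeliness score reaches the 90% threshold."""
    return score >= TOLERANCE


# Hypothetical report with 20 cells, one of them left blank.
report_cells = [12, 0, 7, 5, 3, 9, 1, 0, 2, 4, 6, 0, None, 8, 1, 3, 0, 5, 2, 7]
completeness = content_completeness(report_cells)
print(f"completeness {completeness:.2f}, within tolerance: {meets_tolerance(completeness)}")
```

Treating an explicit zero as filled mirrors the variable definition later in the paper, where a cell counts as blank only if "zero" is not indicated.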

Sample size and sampling procedure
The sample size was calculated using the single population proportion formula with the following assumptions: 75% of people were capable of performing HIS tasks in Eastern Ethiopia [9], a desired precision of 5%, and a 95% confidence level. This gave a sample size of 288; adding a 5% contingency for non-response gave a final sample size of 302.
For health facility assessments, WHO recommends selecting 10%-50% of facilities, depending on the available funds and human resources, to obtain a representative sample. Based on this suggestion, 30% of the 61 health centers in the zone were selected [17], giving a total of 18 health centers chosen by simple random sampling. The calculated sample of respondents was allocated proportionally to each health center, and health professionals were then selected randomly by the lottery method within each selected health center. Health centers that had been functional for more than one year were included, whereas health workers with less than six months of experience were excluded.
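The sample size arithmetic can be reproduced with the standard single population proportion formula, n = z² p (1 - p) / d². The sketch below is illustrative: the function name is hypothetical, z = 1.96 is the usual value for a 95% confidence level, and rounding to the nearest whole number is an assumption chosen because it matches the figures reported in the text (many textbooks round up instead).

```python
def single_proportion_sample_size(p: float, d: float, z: float = 1.96) -> float:
    """Single population proportion formula: n = z^2 * p * (1 - p) / d^2."""
    return (z ** 2) * p * (1 - p) / (d ** 2)


# Assumptions stated in the text: p = 75%, precision d = 5%, 95% confidence level.
raw_n = single_proportion_sample_size(p=0.75, d=0.05)

base_n = round(raw_n)            # nearest whole respondent (assumed rounding rule)
final_n = round(base_n * 1.05)   # add the 5% contingency for non-response
facilities = round(0.30 * 61)    # 30% of the 61 health centers in the zone

print(base_n, final_n, facilities)
```

Under these assumptions the sketch gives 288 respondents before the contingency, 302 after it, and 18 health centers, in line with the numbers reported above.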

Data collection instrument and procedures
The data collection tools were adapted from the PRISM (Performance of Routine Information System Management) assessment tools version 3.1 and the HMIS user's guideline. The tool was adapted to fit the local context and mainly contains questions to assess the accuracy, completeness and timeliness of HMIS data. A self-administered structured questionnaire covering background information of the respondents and the organizational, behavioural and technical determinants of data quality in health centers was used [16,18]. The tool was pretested before the actual data collection period on 5% of the sample size, and those participants were not included in the actual data collection.
The collected data were checked for completeness, coded, entered into Epi Info version 7 and then exported to SPSS version 20 for processing and analysis. Incomplete, inconsistent and invalid data were cleaned before, during and after data entry to maximize data quality. Percentages, frequency distribution tables and figures were used to describe the study variables.
Binary logistic regression was used to identify associations between data quality problems and the factors studied. Bivariable analysis was conducted, and variables with p < 0.25 were selected as candidates for multivariable analysis. Finally, variables with p < 0.05 in the multivariable analysis were considered significant. The overall data quality was calculated by summing the completeness, timeliness and accuracy scores and taking their average.

Data quality management
To ensure data quality, the questionnaires were adapted from standard tools and then translated into Amharic. Data collectors were trained on sampling procedures, interview techniques and the data collection process, and were supervised by the principal investigator. The questionnaire was pretested for understandability on 5% of the sample, drawn from other health centers that were not included in the actual data collection.
Inconsistent and incomplete data were managed accordingly before data entry into the computer software.

Variable measurement
Data accuracy was measured by dividing the number recounted from the source document by the number in the report submitted to the next level. Based on a 10% tolerance, data accuracy was classified as over-reporting (<0.90 or 90%), within the acceptable limit (0.90-1.10 or 90%-110%) or under-reporting (>1.10 or 110%).
Content completeness was measured by the number of cells of the report form left blank without indicating "zero". A report was considered complete if at least 90% of its cells were filled.
Report timeliness was measured as the number of reports delivered to the facility head by the deadline over the number of reports expected.
Level of knowledge: a health professional was said to be knowledgeable if they scored above the respondents' mean score on the knowledge questions.
Confidence level or self-efficacy was measured on a scale of 0-100, from no confidence (0) to full confidence (100) in performing HMIS tasks.

Ethics approval and consent to participate
Ethical approval for this study was obtained from the research ethics committee of the School of Public Health, Addis Ababa University; a permission letter was written for AA, RHB, Hadiya zone health office, woreda health office and health centers. Informed written consent was then obtained from the participants after the purpose, procedures, benefits and risks of the study, and their right to decide whether to participate, had been explained. After informed consent was obtained, the right of respondents to refuse to answer a few or all of the questions was respected.

Characteristics of respondents
A total of 291 respondents participated in the study, a response rate of 96%. Eleven health center heads (3.8%), 137 department heads (47%), 15 HMIS focal persons (5.2%) and 128 nurses (44%) participated. Most respondents (71.1%) were aged 21-30 years, and 62.5% were male. Regarding level of education, 190 (65.3%) were level four diploma holders and 101 (34.7%) were bachelor degree holders. About 56.7% of the respondents were nurses, with a maximum experience of 10 years and an average experience of 5 years (Table 1).

General structure and capability of HMIS
All health centers had assigned HMIS focal persons, who are responsible for reviewing and aggregating numbers prior to submission to the next level. In 11 health centers, the assigned HMIS focal persons were also engaged in other responsibilities such as service provision. Only 11% of facilities had assigned HIT professionals.
Only 4 health centers were using functional computer software, and all of these had rules to prevent unauthorized changes to data (passwords). All 18 health centers had established performance monitoring teams (Table 2).
Table 2 (surviving rows): rules to prevent unauthorized changes to data, 4 of 18 health centers (22%); performance monitoring team established, 18 of 18 health centers (100%).

Record keeping
All health centers kept copies of reports. A count of the report copies for a one-year period showed that between 10 and 12 monthly reports were kept. Across all health centers assessed, copies were kept for 96% of the monthly reports sent to the next level.

Completeness of data
Content completeness was assessed by checking, in two months of service delivery reports, whether the required data elements in the report form were filled in. Overall content completeness was 83.3%.

Timeliness of data
Timeliness of the HMIS data was assessed by checking whether the reports from the health facilities were received by the facility head by the predetermined reporting deadline.
Overall timeliness was 88.42%. About 55.5% of facilities were within the 90% tolerance level.
Based on the three dimensions of data quality (accuracy, completeness and timeliness), the overall data quality of the health centers was 82.5%.

HMIS process
Concerning participation in HMIS activities, 87.3% of the respondents took part in aggregating or compiling data from the registers. More than half of the respondents (57.7%) reported that they conduct data quality checks, but the frequency varied: about 51.8% conduct data quality checks on a monthly basis. Overall, 86.9% of the respondents reported that they fill in the registration or tally sheet completely (Table 3).

Self-efficacy
The confidence level of health professionals in performing HMIS tasks was assessed on a scale of 0 to 100. The average score obtained for the seven questions was expressed as a percentage. Confidence was relatively higher in checking data accuracy and calculating percentages (66%) and relatively lower in explaining findings from bar charts (56%). The average confidence level of respondents in performing HMIS activities was 63%.

Organizational factors
Regarding  COR-Crude odds ratio, AOR-Adjusted odds ratio

Discussion
Quality of data is a key factor in generating reliable health information that enables monitoring progress and making decisions for continuous improvement [7]. However, the quality of data in the zone in terms of accuracy, completeness and timeliness was 76%, 83.3% and 88.4%, respectively. The overall data quality of the zone was 82.5%, which was below the national target.
All decisions in the health system depend on the availability of timely, accurate and complete information. However, the study found a data accuracy of 76%. This finding was comparable with an assessment done in Ethiopia, which reported a data accuracy level of 76% [12]. In contrast, a baseline assessment done in SNNPR found a data accuracy of 36.22% at health centers, which was lower than in the current study [6]. This may be due to the seven-year gap between the studies. Of the 18 health centers, 8 (44%) were within the acceptable level of data tolerance. This finding is supported by a study done in India, in which 63% of facilities were not within the acceptable limit of data accuracy [19].
Discrepancies between the registers and the report formats were observed in all facilities.
Tendencies toward over-reporting in all indicators and under-reporting in some indicators were observed. The finding was similar to an evaluation done in Tigray region [20]. This may be due to incompleteness of data, not understanding the definition of cases or data elements, or data not falling within the reporting period [16].
Data were over-reported in all facilities. ANC4 and PMTCT data were over-reported by 14 health centers (78%). This is supported by a national assessment done by EPHI, in which over-reporting was observed in ANC and FP services among the indicators assessed. That study showed that only 30% of the ANC data reported matched the source document, whereas in this study about 88% of ANC4 data matched. The difference may be because the EPHI study was nationwide, so the inclusion of many institutions probably increased the inclusion of facilities with low data quality. Delivery data were over-reported by about 8%, which was similar to the EPHI finding of over-reporting of >10% [21].
About 11% of health centers under-reported TB service data and confirmed malaria cases.
Of the seven indicators assessed, only three (42.8%) were within the 10% acceptable level. ANC4 data were over-reported by about 19% (exceeding the 10% tolerance level), followed by CAR, Penta3 and PMTCT data, which were over-reported by 16%, 15% and 14%, respectively. About 39% of health centers over-reported delivery data. This was also comparable with the EPHI national assessment, in which 20% of public facilities over-reported Penta3 data by more than 10%, and PMTCT data, at 88%, were the best-matched among all indicators [21]. This may be because these are top-priority indicators at national level that are expected to perform well, which might lead facilities to over-report; it may also be due to manual entry of data. Under the new information revolution, every facility is expected to use electronic HMIS, but only four of the studied facilities used functional electronic HMIS software (database).
Regarding content completeness, the study found 83.3% completeness based on a 90% tolerance, which was slightly higher than in a study conducted at Ayder referral hospital (78.6%) and in a systematic review conducted in Ethiopia [12,22]. The result was comparable with a previous study on HMIS utilization in the same setting (82.8%) [23].
Another dimension of data quality was timeliness, measured by whether facilities received case teams' reports by the predetermined deadlines. Overall timeliness was 88.4% based on a 90% tolerance, which was higher than in a study done in SNNPR (77%) [6,12]. The result was also better than in a previous study in the same setting, in which only 59.6% of reports were submitted within the recommended time period [14].
Content completeness and timeliness were lower than in studies done in Tigray region and Rwanda, where 100% of facilities met the 90% data tolerance [20,24]. Possible reasons include health workers' lack of knowledge about the implications of incomplete data on report formats and of late reporting, and less emphasis on data quality during supervision.
The odds of data quality were higher among health workers who filled in the source document (registration or tally) completely than among those who did not [AOR 3.4, 95% CI (1.3, 8.7)]. Similar findings were reported in studies done in Jimma and Bahir Dar town [25,26]. Incomplete filling may be due to tools or formats that are hard to understand (complex), the use of untrained workers, or a shortage of training support on the forms and registers, which makes it difficult to register all relevant information correctly and makes retrieval of these data troublesome.
Concerning supervision, regular supportive supervision with feedback is key to addressing quality issues, as it helps to improve the overall performance of HMIS and especially data quality [27]. More than half (62.5%) of the health centers in this study had been supervised by their respective higher level as per the standard in the last two quarters. The result was supported by previous studies conducted in Dire Dawa and SNNPR [6,11]. Even though the result was comparable with earlier studies, about 37.2% of health centers were not supervised regularly. Regular supervision is one of the most important mechanisms for improving data quality, and the lack of a regular system of supportive supervision undermines the importance and quality of data collection. Therefore, without regular and program-specific supportive supervision it is difficult to achieve information transformation.
Regarding training, continuous training on HIS activities is important to create awareness and to have trained, skilled staff who are confident and motivated to perform HIS tasks [25]. This study found that about 52% of health workers were trained on HMIS activities. This finding was comparable with other studies in Dire Dawa and South Africa, where 52.7% and 58%, respectively, were not trained on HMIS activities [26,28]. All health workers who participate in data collection in the various sections of healthcare need continuous capacity building to conduct quality reviews of RHIS at every stage and to understand in depth the stages where data quality problems can occur [27,28]. In this study all focal persons and department heads were trained on HMIS activities, but other service providers who were not trained were involved in the HMIS process. This may affect the quality of data.
The odds of data quality were higher among health workers who were confident enough to perform HMIS activities than among those who were not confident [AOR = 1.75, 95% CI (0.99, 3.11)]. The result was supported by studies conducted in SNNPR and South Africa [6,26].
This factor is also suggested by WHO/MEASURE Evaluation as one determinant of data quality [18].
The association may be related to the complexity of the formats or tools: if data collection forms are complex to fill in, this affects the confidence and motivation of data collectors [18].
Concerning data quality checks, good data management requires data quality checks at all stages.
Checking data quality is the responsibility of all health workers participating in data management [29]. In this study about 57.7% of health workers checked data quality, and 51.8% did so on a monthly basis. This is supported by literature from WHO/MEASURE Evaluation and by a study done in Kenya, in which about 63% of respondents checked data quality but the frequency of the checks varied from one respondent to another, the most common response being quarterly (22%) [18,23,29].

Conclusions
Overall data quality across the three dimensions was 82.5%, which is lower than the national target of 85% for data accuracy. Over-reporting of data was observed at all facilities. About 39% of health centers over-reported delivery data. About 9% of ANC4 data were over-reported (>10% tolerance level), followed by 6%, 5% and 4% of CAR, Penta3 and PMTCT data, respectively.
Decisions made using inaccurate, incomplete or late-reported data can affect health system performance. Supervision, training, HIT professionals, written feedback and procedural manuals were found to be inadequate. The major factors affecting data quality were filling registration or tally completely, training, supervision, data quality checks and confidence level. Computerized HMIS databases should be distributed to facilities that are not yet using them, as this will help improve data accuracy and report timeliness and reduce the burden on data collectors.

Consent for publication: Not applicable
Competing interests: The authors declare that they have no competing interests.

Background
The need for organized, accessible, timely and accurate data for health decision making has become a growing concern. In response, the Ethiopian FMOH has undertaken an extensive reform and redesign of the national HMIS. The reform has taken major steps to address the deficiencies in routine health data that limited the quality of care, planning and management systems, and decision making by managers at all levels of the health care system (7).
Therefore, the health sector transformation plan (HSTP) identified the information revolution as one of its four transformation agendas, which involves advancing methods from data collection through to the use of information for decisions. The focus of the information revolution is not only on advancing methods but also on changing the culture of, and attitudes toward, information use. It aims to improve health system efficiency and effectiveness through the guiding principles of standardizing recording and reporting forms, integration, simplification, human resource development and ICT applications (7,8).
Data that are accurate, complete and delivered on time to users are an important aspect of healthcare planning, management and decision making. However, data quality is frequently assessed only as a component of the effectiveness or performance of the HIS, so data quality assessment is hidden within these scopes. This may lead to neglect of data management and, in turn, to a lack of awareness of data quality problems (9).
In Ethiopia, data quality and reliability problems mean that data do not adequately guide program decisions. Poor data quality at the lower administrative or peripheral levels (woredas and health facilities), which are the source of most of the data used for decision making in the health sector, remains a challenge, as reported in the 2016 annual report of the health sector transformation plan (10).
Lack of accuracy, timeliness and completeness in HIS reporting remains a weakness, and delays make it harder to use data as the basis for informed decision making in health care planning and management (11).
An assessment of HMIS data quality and information use found content completeness, reporting timeliness and accuracy of 39%, 73% and 76%, respectively (12).
Existing evidence shows that in Ethiopia, including SNNPR (Southern Nations, Nationalities and Peoples' Region), a low level of data quality, below the national standard, has been reported as a gap. Data accuracy at health centers was 36.22%, which is much lower than the national target. This is attributed to many factors, such as lack of training, lack of decisions based on supervision, lack of feedback, infrequent data quality assurance and limited skills of health professionals (6,7,11,12). Thus, this study aimed to assess the level of data quality and the factors associated with data quality in the area.

Methods
This study was conducted in Hadiya Zone which is found in the Southern, Nations, Nationalities

Sample size determination
For the accuracy dimension, a sample of 18 health centers was selected to assess data quality. According to the national HMIS information use and data quality manual, seven to nine data elements from each health center are sufficient to assess data accuracy (16).
Data elements were selected randomly from the top-priority national indicators, and seven data elements were verified at each of the 18 selected health centers. Documents from two randomly selected months, September and November, were reviewed to check the consistency of the selected data elements. The accuracy of each data element was determined by its accuracy ratio: the figure recounted from the source document or register divided by the figure reported to the next level. An accuracy ratio below 0.90 (90%) indicates over-reporting, and a ratio above 1.10 (110%) indicates under-reporting.
The seven data elements selected were antenatal care fourth visit (ANC4), institutional deliveries, pentavalent third dose (Penta3), PMTCT coverage, tuberculosis cure rate, confirmed malaria cases and contraceptive accepters rate (CAR).
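The accuracy ratio and its 10% tolerance band can be sketched in a few lines of Python. This is only an illustration of the rule described above: the function names and the example counts are hypothetical and are not taken from the study data.

```python
def accuracy_ratio(recounted: float, reported: float) -> float:
    """Figure recounted from the source register divided by the reported figure."""
    if reported == 0:
        raise ValueError("reported count must be non-zero")
    return recounted / reported


def classify_accuracy(ratio: float, tolerance: float = 0.10) -> str:
    """Classify a ratio against the tolerance band (0.90-1.10 by default)."""
    if ratio < 1 - tolerance:
        return "over-reporting"   # the report shows more than the register
    if ratio > 1 + tolerance:
        return "under-reporting"  # the report shows less than the register
    return "acceptable"


# Hypothetical (recounted, reported) counts for three data elements.
examples = {"ANC4": (80, 100), "Penta3": (95, 100), "Malaria": (120, 100)}
for name, (recounted, reported) in examples.items():
    ratio = accuracy_ratio(recounted, reported)
    print(f"{name}: ratio={ratio:.2f} -> {classify_accuracy(ratio)}")
```

A ratio exactly on the boundary (0.90 or 1.10) is treated here as acceptable, which matches the "0.90-1.10" acceptable limit given under variable measurement.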

For Completeness and Timeliness
Content completeness was assessed as the proportion of data elements filled in on the reporting formats for the selected months. A tolerance level of 90% was used in grading health centers, meaning that each health center was expected to complete at least 90% of the data elements on the report formats. All data elements in two months of HMIS reports were reviewed to assess the content completeness of reports.
Timeliness was assessed as the proportion of expected reports that were delivered by the deadline in the two selected months. A tolerance level of 90% was also used in grading health centers.
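As a rough sketch of how these two scores and the 90% grading rule could be computed, the following Python functions take simple counts as input. The function names and the example counts are assumptions made for illustration only.

```python
def content_completeness(filled_cells: int, total_cells: int) -> float:
    """Percentage of report-form data elements that were filled in."""
    return 100.0 * filled_cells / total_cells


def report_timeliness(on_time_reports: int, expected_reports: int) -> float:
    """Percentage of expected reports delivered by the deadline."""
    return 100.0 * on_time_reports / expected_reports


def meets_tolerance(score: float, tolerance: float = 90.0) -> bool:
    """A health center is graded as acceptable at or above the tolerance level."""
    return score >= tolerance


# Hypothetical health center: 450 of 500 cells filled, 10 of 12 reports on time.
completeness = content_completeness(450, 500)
timeliness = report_timeliness(10, 12)
print(f"completeness={completeness:.1f}% ok={meets_tolerance(completeness)}")
print(f"timeliness={timeliness:.1f}% ok={meets_tolerance(timeliness)}")
```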

Sample size for the self-administered questionnaire
The sample size was calculated using the single population proportion formula under the following assumptions: 75% of people capable of performing HIS tasks, from a study in Eastern Ethiopia (9); a desired precision of 5%; and a 95% confidence level. This gives a sample size of 288; after adding a 5% contingency for non-response, the final sample size was 302.
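The arithmetic behind these figures can be reproduced with the usual single population proportion formula, n = Z^2 p(1 - p) / d^2. The sketch below assumes Z = 1.96 for 95% confidence and rounding to the nearest whole number; the paper does not state its rounding rule, so that part is an assumption.

```python
def single_proportion_sample_size(p: float, d: float, z: float = 1.96) -> float:
    """Unrounded sample size for estimating a single proportion.

    p: expected proportion, d: desired precision (margin of error),
    z: standard normal value for the confidence level (1.96 for 95%).
    """
    return z ** 2 * p * (1 - p) / d ** 2


raw_n = single_proportion_sample_size(p=0.75, d=0.05)
base_n = round(raw_n)            # assumed rounding to the nearest integer
final_n = round(base_n * 1.05)   # add a 5% contingency for non-response

print(f"raw={raw_n:.2f}, base={base_n}, final={final_n}")
```

With p = 0.75 and d = 0.05 the unrounded value is about 288.1, which agrees with the reported 288 and, after the 5% contingency, with the final sample of 302.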

Sampling procedures
Health professionals for the self-administered questionnaire were selected using a simple random sampling technique.
WHO recommends that assessments of health facilities select 10%-50% of facilities, depending on the available funds and human resources, to obtain a representative sample (17). Following this suggestion, 30% of the 61 health centers in the zone were selected, giving a total of 18 health centers chosen by simple random sampling. The calculated sample size for the self-administered questionnaire was allocated proportionally to each health center; health professionals involved in HMIS activities, from daily registration in the source document through to the final report, were then selected randomly.
Health centers that had been functional for more than one year were included. Health professionals from health centers that had been functional for less than one year and health workers with less than six months of experience were excluded.
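Proportional allocation of a fixed sample across health centers can be illustrated as follows. The paper does not describe how fractional allocations were rounded, so this sketch assumes a largest-remainder rule so that the allocations add up exactly to the target; the health center names and staff counts are hypothetical.

```python
def proportional_allocation(staff_counts: dict, total_sample: int) -> dict:
    """Allocate total_sample across units in proportion to their staff counts.

    Fractions are resolved with the largest-remainder rule (an assumption)
    so that the allocations sum exactly to total_sample.
    """
    population = sum(staff_counts.values())
    exact = {k: total_sample * v / population for k, v in staff_counts.items()}
    allocation = {k: int(x) for k, x in exact.items()}  # round down first
    leftover = total_sample - sum(allocation.values())
    # Give the remaining units to the largest fractional parts.
    by_remainder = sorted(exact, key=lambda k: exact[k] - allocation[k], reverse=True)
    for k in by_remainder[:leftover]:
        allocation[k] += 1
    return allocation


# Hypothetical numbers of eligible health professionals per health center.
staff = {"HC-A": 40, "HC-B": 26, "HC-C": 34}
print(proportional_allocation(staff, total_sample=30))
```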

Data collection instrument and procedures
Data collection tools were adapted from the PRISM assessment tools version 3.1 and the HMIS user's guideline. The tool was adapted to fit the local context and mainly contains questions to assess the accuracy, completeness and timeliness of HMIS data. A self-administered structured questionnaire covered the background information of respondents and the organizational, behavioural and technical determinants of data quality in health centers (16,18).
The tool was pretested before the actual data collection period on 5% of the sampled health professionals, who were not included in the actual data collection.
The collected data were checked for completeness, coded, entered into Epi Info version 7 and then exported to SPSS version 20 for processing and descriptive analysis. Incomplete, inconsistent and invalid data were cleaned before, during and after data entry to maximize data quality. Percentages, frequency distribution tables and figures were used to describe the study variables.
Binary logistic regression was used to identify associations between data quality problems and the factors.
Bivariable analysis was conducted, and variables with p < 0.25 were selected as candidate variables for multivariable analysis. Finally, variables with p < 0.05 in the multivariable analysis were considered significant.
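The two-stage selection rule can be sketched independently of any statistics package. The snippet below only shows the thresholding logic; the factor names and p-values are placeholders, not results from this study, and fitting the models themselves (done in SPSS in the paper) is outside the sketch.

```python
def screen_candidates(bivariable_p: dict, threshold: float = 0.25) -> list:
    """Keep variables whose bivariable p-value is below the screening threshold."""
    return [name for name, p in bivariable_p.items() if p < threshold]


def significant_factors(multivariable_p: dict, alpha: float = 0.05) -> list:
    """Keep variables whose multivariable p-value is below alpha."""
    return [name for name, p in multivariable_p.items() if p < alpha]


# Placeholder p-values for illustration only.
bivariable = {"factor_a": 0.01, "factor_b": 0.20, "factor_c": 0.40}
candidates = screen_candidates(bivariable)
print(candidates)  # factor_c is dropped at the screening stage

# Placeholder p-values from a multivariable model fitted on the candidates.
multivariable = {"factor_a": 0.03, "factor_b": 0.12}
print(significant_factors(multivariable))
```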
The overall data quality was calculated by averaging the completeness, timeliness and accuracy scores.
The dependent variable were HMIS data quality while the following factors were included in the model as

Data quality management
To ensure data quality, the questionnaires were adapted from standard tools and translated into Amharic. Data collectors were trained on sampling procedures, interview techniques and the data collection process, and were supervised by the principal investigator. The questionnaire was pretested for understandability on 5% of the sample, drawn from other health centers that were not included in the actual data collection. Inconsistent and incomplete data were resolved before entry into the computer software.

Variables measurement
Data accuracy: measured as the number recounted from the source document divided by the number in the report submitted to the next level. Based on a 10% tolerance, data accuracy was classified as over-reporting (<0.90 or 90%), acceptable limit (0.90-1.10 or 90%-110%) or under-reporting (>1.10 or 110%).
Content completeness: measured from the number of cells in the report form that were left blank without indicating "zero". A report was considered complete if at least 90% of its cells were filled.
Report timeliness: measured as the number of reports delivered to the facility head by the deadline divided by the number of reports expected.
Level of knowledge: a health professional was considered knowledgeable if their score on the knowledge questions was above the respondents' mean score.
Confidence level or self-efficacy: measured on a scale of 0-100, from no confidence (0) to full confidence (100), to perform HMIS tasks.

Ethics approval and consent to participate
Ethical approval for this study was obtained from the research ethics committee of the School of Public Health, Addis Ababa University, and permission letters were written to AARHB, the Hadiya zone health office, woreda health offices and health centers. Written informed consent was obtained from the participants after the purpose, procedures, benefits and risks of the study, and their right to decide whether to participate, had been explained. After consent was obtained, the right of respondents to refuse to answer some or all of the questions was respected.

General structure and capability of HMIS
All health centers had assigned HMIS focal persons responsible for reviewing and aggregating numbers before submission to the next level. In 11 health centers, the HMIS focal person was also engaged in other responsibilities such as service provision. Only 11% of facilities had assigned HIT professionals.
According to the findings, only 4 of the 18 health centers (22%) were using functional computer software, and all of these had rules (passwords) to prevent unauthorized changes to data. All 18 health centers (100%) had established a performance monitoring team.

Record keeping
All health centers kept copies of reports. Over a one-year period, the number of monthly report copies kept per health center ranged from 10 to 12. Overall, copies were kept of 96% of the monthly reports sent to the next level.

Accuracy of data
A total of 18 health centers were studied for data quality by accuracy, completeness and timeliness dimensions.

Completeness of data
Content completeness was assessed by checking two months of service delivery reports to see whether the required data elements in the report form had been filled in. Overall content completeness was 83.3%.

Timeliness of data
Timeliness of HMIS data was assessed by checking whether reports from the health facilities were received by the facility head by the predetermined deadline of the reporting period. Overall timeliness was 88.42%, and about 55.5% of facilities were within the 90% tolerance level.
Based on the three dimensions of data quality (accuracy, completeness and timeliness), the overall data quality of the health centers was 82.5%.
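The reported overall figure is consistent with an unweighted mean of the three dimension scores. The following minimal check assumes equal weights for the three dimensions, which the paper does not state explicitly.

```python
from statistics import mean

# Dimension scores as reported in the results (percent).
dimension_scores = {"accuracy": 76.0, "completeness": 83.3, "timeliness": 88.4}

overall = mean(list(dimension_scores.values()))
print(f"overall data quality = {overall:.2f}%")
```

The mean comes to roughly 82.6%, in line with the 82.5% reported once rounding or truncation is taken into account.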

HMIS process
Regarding participation in HMIS activities, 87.3% of respondents participated in aggregating or compiling data from registers. More than half of the respondents (57.7%) reported that they conduct data quality checks, but the frequency varied among respondents, with about 51.8% conducting checks on a monthly basis. Overall, 86.9% of the respondents reported that they fill in registers or tally sheets completely.

Self efficacy
The confidence of health professionals in performing HMIS tasks was assessed on a scale of 0 to 100, and the average score across the seven questions was expressed as a percentage. Confidence was relatively higher for checking data accuracy and calculating percentages (66%) and lower for explaining findings from bar charts (56%). The average confidence level of respondents in performing HMIS activities was 63%.

Organizational factors
Regarding training status, 52.2% of all respondents reported that they had received training on HMIS.

Discussion
Quality of data is a key factor in generating reliable health information that enables monitoring of progress and decision making for continuous improvement (7). However, data quality in the zone in terms of accuracy, completeness and timeliness was 76%, 83.3% and 88.4%, respectively. The overall data quality of the zone was 82.5%, which is below the national target of 85% (5).
All decisions in the health system depend on the availability of timely, accurate and complete information.
However, this study found a data accuracy of 76%. This is comparable with an assessment done in Ethiopia, which reported a data accuracy level of 76% (12). In contrast, the baseline assessment done in SNNPR found a data accuracy of 36.22% at health centers, which is lower than in the current study (6). The difference may be due to the seven-year gap between the studies. Of the 18 health centers, 8 (44%) were within the acceptable level of data tolerance. This finding is in line with a study done in India, in which 63% of facilities were not within the acceptable limit of data accuracy (19).
Discrepancies between the registers and the report formats were observed in all facilities. There was a tendency to over-report across all indicators and to under-report some indicators. This finding is similar to an evaluation done in Tigray region (20). It may be due to incomplete data, misunderstanding of the definitions of cases or data elements, or data that do not fall within the reporting period (16).
Data were over-reported in all facilities. ANC4 and PMTCT data were over-reported by 14 health centers (78%). This is supported by a national assessment done by EPHI, in which over-reporting was observed in ANC and FP services. That study showed that only 30% of reported ANC data matched the source documents, whereas in this study about 88% of ANC4 data matched. The difference may arise because the EPHI study was nationwide, so including many institutions probably increased the inclusion of facilities with low data quality. Delivery data were over-reported by about 8%, which is similar to the EPHI finding of over-reporting >10% (21).
About 11% of health centers under-reported TB service data and confirmed malaria cases. Of the seven indicators assessed, only three (42.8%) were within the 10% acceptable level. About 19% of ANC4 data were over-reported (beyond the 10% tolerance level), followed by 16%, 15% and 14% of CAR, Penta3 and PMTCT data, respectively. About 39% of health centers over-reported delivery data. This is also comparable with the EPHI national assessment, in which 20% of public facilities over-reported Penta3 data by more than 10% and PMTCT data were the best-matched among all indicators (88%) (21). This may be because the indicators are among the top-priority national indicators, on which facilities are expected to perform well, which might lead facilities to over-report; it may also be due to manual data entry. Under the new information revolution every facility is expected to use electronic HMIS, but only four of the studied facilities used functional electronic HMIS software (database).
Regarding content completeness, the result was 83.3% based on a 90% tolerance, which is slightly higher than in a study conducted in Ayder referral hospital (78.6%) and in a systematic review conducted in Ethiopia (12,22). The result is comparable with a previous study on HMIS utilization in the same setting (82.8%) (23).
Another dimension of data quality is timeliness, measured by whether facilities receive case teams' reports by the predetermined deadlines. Overall timeliness was 88.4% based on a 90% tolerance, which is higher than the 77% found in a study done in SNNPR (6,12). It is also better than in a previous study in the same setting, in which only 59.6% of reports were submitted within the recommended time period (14).
Content completeness and timeliness were lower than in studies done in Tigray region and Rwanda, where 100% of facilities met the 90% data tolerance (20,24). Possible reasons include health workers' lack of knowledge about the implications of incomplete data on report formats and of late reporting, and limited emphasis on data quality during supervision.
The odds of good data quality were higher among health workers who filled in the source document (register or tally sheet) completely than among those who did not [AOR = 3.4, 95% CI (1.3, 8.7)]. Similar findings were reported in studies done in Jimma and Bahir Dar town (25,26). Incomplete filling may be due to the complexity of the tools and formats, the use of untrained workers, or a shortage of training support on the forms and registers, which makes it difficult to record all relevant information correctly and makes later retrieval of the data troublesome.
Concerning supervision, regular supportive supervision with feedback is key to addressing quality issues, as it helps to improve the overall performance of HMIS and especially data quality (27). More than half (62.5%) of the health centers in this study had been supervised by their respective higher level as per the standard in the last two quarters. This result is supported by previous studies conducted in Dire Dawa and SNNPR (6,11).
Even though the result is comparable with earlier studies, about 37.2% of health centers were not supervised regularly. Regular supervision is one of the most important mechanisms for improving data quality.
The lack of a regular system of supportive supervision undermines the importance and quality of data collection. Without regular, program-specific supportive supervision, it is therefore difficult to achieve information transformation.
Regarding training, continuous training on HIS activities is important to create awareness and to build trained, skilled human resources who are confident and motivated to perform HIS tasks (25). This study found that about 52% of health workers had been trained in HMIS activities. This is comparable with studies done in Dire Dawa and South Africa, in which 52.7% and 58% of health workers, respectively, were not trained in HMIS activities (26,28). All health workers who participate in data collection in the various sections of healthcare need continuous capacity building to conduct quality reviews of RHIS at every stage and to understand in depth the stages at which data quality problems can occur (27,28). In this study, all focal persons and department heads had been trained in HMIS activities, but other service providers who were involved in the HMIS process had not been trained.
The odds of good data quality were higher among health workers who were confident in performing HMIS activities than among those who were not [AOR = 1.75, 95% CI (0.99, 3.11)]. This result is supported by studies conducted in SNNPR and South Africa (6,26). This factor is also identified by WHO MEASURE Evaluation as a determinant of data quality (18). The association may be related to the complexity of the formats and tools: if data collection forms are complex to fill in, the confidence and motivation of data collectors are affected (18).
Concerning data quality checks, good data management requires checking data quality at all stages. Checking data quality is the responsibility of all health workers involved in data management (30). In this study, about 57.7% of health workers checked data quality, with about 51.8% doing so on a monthly basis. This is supported by literature from WHO MEASURE Evaluation and by a study done in Kenya, in which about 63% of respondents checked data quality; the frequency of the checks varied from one respondent to another, and the most commonly reported frequency was quarterly (22%) (18,23,29).

Conclusion and recommendations
Conclusion
Overall data quality across the three dimensions was 82.5%, which is lower than the national target of 85% for data accuracy. Over-reporting of data was observed at all facilities. About 39% of health centers over-reported delivery data. About 9% of ANC4 data were over-reported (beyond the 10% tolerance level), followed by 6%, 5% and 4% of CAR, Penta3 and PMTCT data, respectively. Decisions based on inaccurate, incomplete or late data can affect health system performance. Supervision, training, HIT professionals, written feedback and procedural manuals were all found to be inadequate. The major factors affecting data quality were complete filling of registers or tally sheets, training, supervision, data quality checks and confidence level. A computerized HMIS database should be distributed to facilities that are not yet using one, as it will help to improve data accuracy and report timeliness and reduce the burden on data collectors.