Big data in global health: improving health in low- and middle-income countries

Ishan Timilsina
Project & Operations Manager


There are many challenges to the full-scale implementation of big data systems in low- and middle-income countries. The collection of information from individuals – a prerequisite for any big data approach – is fraught with ethical, regulatory and technological issues. Given the increasing complexity of the field, the protection of individuals and populations must move from purpose-specific consent to emphasize appropriate use, risk assessment and risk minimization. The anonymization of data must be robust, monitored and enforced. Appropriate use must remain coherent with evolving societal values. Furthermore, the big data approach can amplify the existing difficulties associated with health-care delivery in settings with scarce resources. In such settings, it may be impossible for front-line health workers to extend their remit to the non-essential collection of data. Some policy-makers view the big data approach simply as a distraction for low- and middle-income countries. Others consider big data to be a critical milestone on the path towards the improvement of such countries (Box 1).

Box 1. Differing views on the big data approach

In low- and middle-income countries, the future could go well or badly for the big data approach.

Dystopian views: In the worst-case scenario, big data would be an expensive distraction driven by high-income countries, focused on disease-specific outcomes and unintelligible to those who most need data access. The assimilation of fragmented data – which cannot be readily shared or compared – could undermine the relatively fragile global health community. Breaches of data security could threaten personal safety and lead to discrimination, genocide and other violence. The global health community could oversee the spending of huge amounts of money on big data, with potentially little to show for the investment. In brief, the big data approach could be associated with:
– the diversion of focus and resources away from interventions that are more needed;
– poor data governance – with databases held by private companies, frequent leaks and no recourse for citizens;
– the offloading of consent through poorly designed consent systems, which could threaten the safety of individuals;
– a lack of interoperability, with balkanized information systems that cannot be aggregated; and
– information that is poorly presented and analysed, considered ineligible or not credible.

Utopian views: Conversely, the big data era could represent a major and beneficial turning point in the improvement of global health. Decision-makers in low- and middle-income countries could develop a “demand-side” platform to identify the information they need most. Partnerships formed with academia, industry, governments, international organizations and the non-profit sector could help develop innovative solutions. Although this idealized approach is optimistic, it is no less ambitious than achieving the Millennium Development Goals, eradicating polio or controlling malaria. The development of a “best-case” model for deploying big data may help us achieve all of these targets. In brief, the big data approach could be associated with:
– health data that are owned by patients;
– robust governance processes that have been developed to ensure respect of values and principles in the use of data, with an emphasis on risk minimization;
– data that are aggregated automatically, with little effort and decreasing cost;
– interoperability standards that allow data to be seamlessly pooled and connected;
– laws that, while establishing adequate safeguards, allow the sharing and pooling of anonymized data in real-time; and
– data that are presented in a usable format to patients, health-care providers, entrepreneurs and policy-makers.

Even in the best of cases, threats to the privacy of personal health information will remain. This concern is amplified when the information relates to individuals in vulnerable populations and communities. Even basic health data – e.g. on ethnicity, reproductive health, sexually transmitted infections, diseases with a genetic basis and risk exposures for disease – can be misused and lead to discrimination and reductions in personal safety. Any electronic database can be hacked. The risk of accidental or intentional breaches of data security may be particularly high in settings with high levels of illiteracy and corruption that are undergoing rapid technological transition. In many such settings, legislation supporting the privacy and security of information services is frequently underdeveloped and rarely enforced.

Even in high-income countries, purpose-specific informed consent is increasingly being rendered meaningless by high levels of complexity in the ways that collected data are – or might be – used. Privacy protection is a right and the preservation of public trust is a necessity. However, as the full potential of the big data approach to improve health becomes clearer, there is also a right for populations to reap all of the potential benefits of such an approach. The use of anonymized data for the greater good of populations needs to be incorporated into the process of risk minimization. There is an increasing need for traditional consent protocols to be replaced by – or supplemented with – transparent and effective processes for data governance. The values and concerns of the target populations need to be translated into best practices that balance the benefits and risks of data use. Concerns persist about data sharing and appropriate use.

The promise of big data is tempered by the weak health systems and limited governance structures to be found in most low-income countries. Many of the countries in greatest need of health metrics struggle to collect statistics on births and deaths. The epidemiological data collected in these countries are of variable reliability, have often only been collected at small sentinel sites and are rarely digitized. Improvements in the provision of food, water and sanitation remain the top priorities for over two billion people. In many low-income countries, data collection may only be possible at the expense of tangible health services. As reported by the United Nations, “it is important to recognize that big data and real-time analytics are no modern panacea for age-old development challenges”. However, as the cost of aggregating and coordinating resources and services electronically decreases, the big data approach may deliver large benefits to low- and middle-income countries. The more limited the resources for interventions, the more important the targeting and focusing of interventions become.

The tension between vertical, disease-specific programmes and horizontal, health-system-focused approaches remains unresolved. The big data approach fits a horizontal programme better than a vertical programme and could potentially improve the control and treatment of all human disease. At the moment, global health remains driven by disease-specific interests, and disease-specific advocacy groups may well head the queue for big data – risking further fragmentation of the health community.

The next step

The role of big data collection – whether it is perceived as a tool or a threat – remains unclear. For positive outcomes, informed, reflective and resourced stewardship of data is critical. At the moment, the structures for global health governance remain relatively fragile. In 2009, the United Nations established the Global Pulse initiative. “Its mission is to accelerate discovery, development and scaled adoption of big data innovation for sustainable development and humanitarian action.” Unfortunately, the current data protection standards for Global Pulse are badly outdated as they are grounded in guidelines – for the regulation of computerized personal data files – that were published in 1990.

Some guidance on the collection and use of health data was provided within the World Economic Forum’s Global health data charter, as part of the Forum’s vision of “better data for better health”. For health data, the charter identified eight key challenges and highlighted several enabling activities. The expansive scope of big data requires the cooperation of multiple stakeholders. Universities, professional societies, government agencies and research-driven companies are examples of organizations that could develop and operate data systems to support health care. A clear governance and decision-making framework is needed to inform each stakeholder of its accountability and responsibility for each process. There needs to be transparency in addressing and troubleshooting any issues until major decisions are made. Issues often persist for lack of clear agreements on who should resolve them and how they should be resolved. In an emerging field such as big data, where protocols are still being developed, governance plays a major role in assuring stakeholders that there is a system for resolving issues.

However, the global health community has a patchy record on the governance of technological developments. Optimizing the application of big data will involve much more than confidentiality safeguards and minimum standards. A broad effort to establish interoperability standards is imperative to maximize the benefits of big data. Global health governance needs to move from a reactive model to a proactive, norm-forming approach.


In the field of health-care delivery, the big data approach may represent a major milestone – facilitating the development of learning systems of care and enabling more precise management of individuals to improve the health of entire populations. Sheer size increases both the potential risks and potential benefits of the approach. Although the approach may have the most value in low-resource settings, it is also most vulnerable to fragmentation and misuse in such settings. Collaborative governance, careful analysis and technical partnerships are needed to minimize the risks. The complexities should not be underestimated. In low- and middle-income countries, the shepherding of the transition from paper records to petabytes of digital storage provides another opportunity for global health institutions to offer useful governance.


Wyber, R., Vaillancourt, S., Perry, W., Mannava, P., Folaranmi, T. and Anthony Celi, L., 2015. Big Data In Global Health: Improving Health In Low- And Middle-Income Countries. [ebook] NCBI. Available at: <https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4339829/>

Image credit: Shirish Suwal – Unsplash
