Self-Assembly Required: A Real-Time Platform for Global Pulse
By Robert Kirkpatrick
In preparation for our first Pulse Camp, I'd like to share a few thoughts on the technology aspects of Global Pulse. Specifically, I'm going to provide a brief background on those aspects of the project that bear directly on our upcoming efforts to develop an open standards-based reference architecture, and then offer a snapshot of some of our thinking about users and requirements for the platform. This isn't prescriptive, and our discussions over the three days of Pulse Camp may take us in a different direction, but I thought it might be helpful to ground our discussions in some of this raw material. If you're already caught up on the context, feel free to skip ahead. I'm providing it here for those Pulse Camp participants who are generously taking time out of their busy professional lives to help us design the reference architecture for our platform and may not have had time to catch up on the context of this Initiative.
Today's post is about requirements. Tomorrow I'll expand on our Open Architecture strategy. For now, I'll just note that whatever software components Global Pulse creates will be free and open source. An open source platform approach will help us maximize sustainability, appropriateness, and potential to harness innovation from all around the world. At the same time, we're looking to design an open standards-based plug-in architecture for the platform that supports integration of a wide range of useful existing applications and services that provide capabilities such as mapping, social networking, analysis, visualization and collaboration. Many of the most powerful tools out there today are not open source, and they do not need to be open source to play a role in helping leaders around the world protect the vulnerable from harm.
Imagine the following scenario:
- A global crisis begins to unfold – let's say it involves increasing fuel prices, decreasing food supply, and massive job losses. Global shocks begin to impact households in countries around the world.
- In some cases, families are able to adapt successfully to increased hardship by taking on extra work, cutting back on non-essential expenses, selling household items they do not need, and switching to less costly foodstuffs and forms of transportation. They have had to make sacrifices, but for them it's not a question of survival, and there is no great risk of significant harm in terms of health, nutrition, education, and livelihood. When the crisis ends, they will eventually bounce back. This is resilience.
- Yet the most vulnerable households have little or no cushion to fall back on: they have no savings, own no possessions they can easily do without, and barely earn enough to feed and clothe their families. Perhaps they work in the informal sector at jobs that provide no benefits. Perhaps they live in communities several days' travel from health clinics, or in areas prone to flooding, drought, or disease. Their condition is far more fragile, and their capacity to cope is limited. When the crisis ends, these households will be in far worse shape than they were at the beginning, and they will not bounce back.
- When food prices increase and the primary wage earner loses a job, these households must change their behavior in order to survive, and they are forced to make truly painful sacrifices. Some will pull their children out of school to work in the market, and many of those children will never complete their education. Others will cut back on the quality and number of meals, resulting in malnutrition. Others will forgo medical care to save money, with predictable results. Others will risk their lives to migrate across national borders for work. Still others will cut down the last remaining trees in their community to sell as firewood, or sell the family cow. Each of these tradeoffs may have both short-term impacts and long-term, complex consequences playing out in multiple sectors – poor health, food insecurity, lack of education, chronically low income, increased vulnerability to natural disasters, etc.
The scenario above is pretty close to what we believe happened during the compound food, fuel and financial crisis of the past three years. At the height of the crisis, leaders knew they needed to act fast to prevent long-term harm to vulnerable communities. Specifically, they needed to make adjustments to social safety nets targeted at the tradeoffs that different vulnerable populations were making. Here, a policy response would need to provide parents with incentives to keep their children in school. There, it would need to implement food subsidies. To intervene rapidly and effectively to protect development gains, leaders needed to know which communities were vulnerable, which kinds of impacts they were vulnerable to, and which coping strategies they were adopting in response.
The Information Gap
But there was a problem. As leaders prepared to invest significant resources, they were certain the crisis was impacting some households disproportionately, so they began to ask for information about what was happening at the household level. Who was being affected, and how? What we found, however, as we reached out to UN agencies and partner organizations, is that hard evidence on the real-time impacts of the crisis at the household level was pretty much non-existent. The statistical data available was several years out of date and was of no use. All we had to go on were anecdotal reports. Without actionable evidence, leaders in many cases had to target resources based on hearsay and guesswork. This information gap can have long-lasting consequences: finding out today that child malnutrition set in two years ago is two years too late to prevent harm that can last a lifetime.
The Global Pulse initiative was created to fill this information gap. Global Pulse must allow local, national and global leaders in times of crisis to use real-time information to protect vulnerable populations through agile, targeted policy decisions. The global network of capacity-building Pulse Labs, where this technology will be ported to local languages, adapted to local needs, and put to work, is being created to learn how to help leaders at every level – local, national, and global -- detect, characterize, and investigate patterns in collective behavior that could represent incipient impacts and emerging vulnerabilities.
By "leaders" here we mean decision-makers in local communities, national governments and global organizations such as the UN, but we also mean the teams of experts in our country-level Pulse Labs – including those from academe, UN agencies, NGOs, local media, or the private sector -- working to support them. Global and national leaders want simple, clear reports they can take in at a glance, but the process of gathering, fusing, analyzing, filtering, and visualizing information to support the decision-making process will require a great deal of behind-the-scenes work by a multi-disciplinary team of experts drawing on work within diverse communities of practice. Finally, leaders in local communities will need to report on what they are hearing and seeing and stay in the loop about the bigger picture as it becomes clear, even if all they have for communications is a simple mobile phone.
A Multi-Level Challenge
The long-term vision for Global Pulse is implied by the name: a kind of global nervous system integrating across diverse information streams to allow leaders to detect, characterize and respond to fluctuations in human wellbeing – in real time. Yet from the early days of this initiative, it was clear that such a massive system could not be built purely from the top down. Timely and effective collective action at the global level will only emerge from intergovernmental, inter-organizational, interdisciplinary information sharing…which in turn requires a baseline capacity for early detection and coordinated response by individual governments…which in turn requires that communities generate the information their governments need to detect the impacts of crisis.
We've been thinking about this multi-level challenge for some months now, and one truth that has become self-evident is that unless we tap into principles of emergence, we will drown in complexity. We have to create a decision-support platform out of building blocks that can self-organize into a global network linking local communities, government ministries, and the global community. We've been thinking for a while about what those building blocks might be, but I'm getting ahead of myself. Let's first talk concretely about requirements we already know of.
As a decision support platform, the Global Pulse platform must certainly incorporate the standard capabilities of aggregating and fusing structured, semi-structured, and unstructured information from a variety of sources, allowing users to capture and share information about uncertainty, subjecting the resulting data to analysis, visualizing it in various ways, and alerting users when certain events and trends are detected. Ideally, most of these capabilities could be based on existing tools wrapped to fit into a plug-in architecture. Yet the platform must do more than this. Our platform must support three different kinds of interaction:
- It must allow government to collect information about, elicit feedback from, and broadcast information to, local communities in real-time. This information might include structured, semi-structured and unstructured text-based information as well as data from a variety of sensors. Features here include structured data collection, bounded crowdsourcing, and crowdfeeding. For communities to generate information useful for government, likewise, they must be able to search for information available on risks and available resources, subscribe to feeds of information, and report on what they are seeing and hearing.
- It must allow national government personnel from different ministries to collaborate securely with one another as well as with outside experts from UN agencies and academic institutions to fuse multiple streams of information, tag, rate, filter and cluster items in the data stream, analyze concerning patterns as they emerge, implement appropriate policy responses, and monitor the impact of responses over time. Here we are talking about collaborative team-based monitoring, analysis, and evaluation.
- It must allow teams around the world to share selected data, metadata and models with other teams, with trusted networks, or with the global community as a whole. The idea here is that organizations working around the world to protect the vulnerable would be able to maintain a certain level of shared awareness of what kinds of information others consider useful and credible, how they are characterizing the patterns they are seeing, what populations they are concerned about, and what impacts they are monitoring for. When the amount of shared information reaches a critical mass, it will begin to influence the decision processes of different groups working within the network on a global scale. So here we are talking about cross-team, inter-organizational social sharing intended to lower barriers to global collective action.
Potential Information Sources
Another challenge is that we're going to have to settle quickly on a lingua franca for representing data. Perhaps we might develop a reference ontology that organizations could use to decorate their own data sets so that other systems can consume them? Below are a few of the types of data that the Global Pulse platform would likely need to be able to aggregate, overlay, fuse, and analyze.
- UN Early Warning Systems. The UN has some 39 early warning systems related to specific sectors, and much of this data is available in close to real time, so we would expect to provide the Global Pulse network with feeds of data fused from different UN sources.
- UN Open Data Repository. The UN already has a substantial statistical data repository at http://data.un.org.
- Rapid Data Collection. All over the world, governments, UN agencies, NGOs and private sector organizations are collecting structured data via mobile phones. Some of this information is available publicly, and much of it could be made available to Pulse Labs in the countries where it is collected.
- Macroeconomic Data. For any given country, there is a great deal of macroeconomic data available on imports, exports, employment, productivity, and the behavior of markets. While this information tends to obscure the differential impacts of crisis on specific population groups, it is tremendously useful for understanding context, particularly since much of it is available weekly or monthly.
- Satellite Imagery. Remote sensing can potentially tell us a great deal about what is happening to a population, both in terms of impacts of crises and adoption of various coping strategies.
- Information Exhaust. All around the world, people are increasingly generating information exhaust as a by-product of using online or mobile services or participating in programs run by government, NGOs, and UN agencies. Any time someone buys or sells a product, transfers money, asks a question from an information service, tops up a SIM card, cashes in a food voucher, applies for a job, or searches for the price of fertilizer, he or she creates a trail of data. There are certainly issues here with privacy, sovereignty, and intellectual property, but with the proper framework to anonymize, aggregate, and otherwise obfuscate as required, real-time information exhaust could be monitored for anomalies that might reveal when vulnerable populations are in trouble.
- Media Mining. Google News, Twitter, Facebook, and the plethora of equivalent sites and services in developing countries can be mined for key words and phrases that can both reveal emerging events and help leaders understand how those events are being perceived.
- Citizen Reporting. National government personnel could elicit reports directly from trusted leaders within the affected communities as a first step in investigating patterns of concern initially revealed by passive monitoring techniques.
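To make the idea of a lingua franca a little more concrete, here is a minimal sketch of what a shared record shape might look like, so that data from very different sources can be overlaid for one region. Everything in it is an assumption made for illustration: the `Observation` class, its field names, the `overlay` helper, and the sample values are invented here and are not part of any Global Pulse specification.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class Observation:
    """One data point from any source, reduced to a shared shape (hypothetical)."""
    source: str        # e.g. "mobile_survey" or "food_vouchers"
    indicator: str     # what is being measured
    region: str        # where it applies
    observed_on: date  # when it was observed
    value: float
    tags: frozenset = field(default_factory=frozenset)


def overlay(observations, region):
    """Group one region's observations by indicator, each series sorted by date."""
    grouped = {}
    for obs in observations:
        if obs.region == region:
            grouped.setdefault(obs.indicator, []).append(obs)
    for series in grouped.values():
        series.sort(key=lambda o: o.observed_on)
    return grouped


# Made-up sample values, for illustration only.
sample = [
    Observation("mobile_survey", "food_price_index", "northwest", date(2010, 3, 8), 1.31),
    Observation("mobile_survey", "food_price_index", "northwest", date(2010, 3, 1), 1.25),
    Observation("food_vouchers", "vouchers_redeemed", "northwest", date(2010, 3, 8), 412.0),
    Observation("mobile_survey", "food_price_index", "south", date(2010, 3, 8), 1.02),
]

northwest = overlay(sample, "northwest")
```

A reference ontology would go much further than this (units, provenance, uncertainty), but even a small shared shape lets tools written by different teams consume one another's feeds.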
The kind of scenario we imagine the platform supporting might run as follows:
- Several months after the onset of a complex global crisis, personnel at the Pulse Lab decide to assign elevated food security risk scores to the entire geographic area containing a certain low-income agrarian population, due to a combination of food security monitoring indicators tracked by a UN food security early warning system at UN Headquarters:
a. Total rainfall is well below the amount required by crops grown in the area, and
b. Oil prices are rising (a potential predictor of increases in the price of certain foods, because fuel prices determine the price of transporting fertilizer needed for crops)
- Several months later, Global Pulse software automatically picks up a more worrisome pattern – this time through real-time monitoring at the national level:
a. A sharp uptick in food prices reported through an ongoing mobile phone text message-based survey conducted by a UN agency;
b. Increased inquiries about the price of fertilizer and rainfall predictions submitted through a mobile phone-based question-and-answer service;
c. Increases in the proportion of mobile phone accounts using pre-paid cards, decreased international calls, and decreased use of IP-based mobile services;
d. Increased rate of cashing in food vouchers distributed by an NGO.
- Based on this new information, personnel at the Pulse Lab share their findings online through the Global Pulse network with local and international experts familiar with the population. They aren't able to share all of their raw data, as some of it is restricted by various agreements with the private sector and NGOs, but they do share descriptions of the population, tags, comments, and descriptions of the data sources they are using, and the indicators they are tracking.
- It turns out that a team in a neighboring country's government is observing comparable changes in communities on their side of the border. This second team hasn't seen changes in the use of food vouchers, as no similar program exists there, but they have seen an increase in livestock sales in several communities that do not traditionally sell livestock.
- Based on this mounting but circumstantial evidence, the first team proposes two hypotheses online related to the community of concern: an event hypothesis ("food shortages") and a vulnerability (future impact) hypothesis ("moderate malnutrition"). A search of information about previous events suggests several coping strategies this population is likely to adopt in response to food shortages: they are likely to pull their girls out of school to work in the market, and they will begin migrating across a nearby border to find jobs.
- Another search provides contact information for teachers and operators of radio stations that cover the population. Pulse Lab staff broadcast text messages to radio station operators to inform them of their concerns about food shortages and sensitize them to listen for calls and messages from their audience that could indicate households are cutting back on consumption of food or migrating for work. Radio hosts reply, agreeing to stay alert, make inquiries when appropriate during call-in shows, and send weekly text messages into the Global Pulse system summarizing their observations. Teachers are likewise alerted to be on the lookout for decreases in girls' attendance.
- A few weeks later, the Pulse Lab team monitoring text messages from radio operators begins to receive reports that families of migrant workers throughout the area being monitored, along with tea producers in the northwest district, are unable to afford food and are coping by eating less well and less often, that children are going to school hungry, and that family members are contemplating migrating to the capital to find work. The Global Pulse system, meanwhile, automatically detects an increase in employment-related inquiries through the question-and-answer service mentioned earlier, specifically referencing cities in the neighboring country. The software suggests to Pulse Lab staff that these findings might constitute additional evidence supporting their current hypotheses.
- Based on this mounting body of evidence, the Pulse Lab recommends that the government initiate a verification step, and a team is dispatched to perform a rapid household-level impact assessment. Their findings confirm that affordability of food has become quite problematic and families have begun to make difficult trade-offs to get by.
- Pulse Lab staff use the system to broadcast text messages to key contacts working in the affected communities, such as local government, NGOs, radio station operators, teachers and community health workers, alerting them to the situation and notifying them that additional food vouchers and school feeding programs will be initiated in their community. Several citizens query public information feeds via SMS on their mobile phones and subscribe to receive further updates automatically.
- As the response gets underway, Pulse Lab staff monitor the population through a combination of proxy indicators, citizen reporting, and mobile surveys to evaluate the effectiveness of the response. They also compare notes with their counterparts in the neighboring country. They begin working on a project to train the Global Pulse system to monitor reports from citizen reporters automatically, looking for key words and phrases indicative of the coping strategies being employed. The next time a crisis hits, the team will have more free time to focus on the aspects of analysis that cannot be automated.
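The last step of the scenario, scanning incoming citizen reports for key words and phrases tied to coping strategies, could start out as something very simple. The sketch below is only an illustration under stated assumptions: the phrase lists, the strategy labels, and the `tag_report` function are hypothetical, and a real deployment would need local-language vocabularies and far more robust matching.

```python
import re

# Hypothetical phrase lists; real ones would be built with local experts.
COPING_PHRASES = {
    "reduced_meals": ["skipping meals", "eating less", "one meal a day"],
    "school_withdrawal": ["out of school", "stopped attending"],
    "migration": ["left for the capital", "crossing the border"],
}


def tag_report(text, phrases=COPING_PHRASES):
    """Return the sorted coping strategies whose phrases appear in the report."""
    # Lowercase and collapse whitespace so simple substring matching works.
    normalized = re.sub(r"\s+", " ", text.lower())
    return sorted(
        strategy
        for strategy, terms in phrases.items()
        if any(term in normalized for term in terms)
    )


report = "Many families are Eating  less and two girls were taken out of school."
print(tag_report(report))
```

Tags produced this way would be suggestions for analysts to confirm or reject, not conclusions; the point is to free up time for the parts of analysis that cannot be automated.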
Socially-Meshed Intelligent Workspaces?
What kind of core object could serve as the basic building block of a system to address the scenario described above? To date, what we've found most suitable conceptually is the idea of a distributed system where government users collaborate in secure workspaces to collect, analyze, and visualize information and then share information selectively outside the workspaces through a social network. We know that governments will likely only share information more broadly when their own work is going well, so the workspace environment has to be secure and useful in its own right while providing all sorts of exciting features such as social sharing, search, community learning, modeling, and contextual awareness to provide incentives to reach out more broadly across workspace boundaries. Might such secure yet socially linked workspaces facilitate self-assembly into a global network? Could such a design help the work of collaborating teams fuse into global collective action?
Here are a few thoughts on what kinds of features might make this work.
- Secure Team-Based Analysis.
a. Users would be able to create a new workspace easily, invite colleagues into the workspace, subscribe the workspace to various sources of information relevant to one or more vulnerable populations, and use a variety of software tools within the workspace to interact securely with one another and monitor the population for evidence of the early impacts of crises.
b. These tools might include applications for data fusion, mapping, analysis, visualization, and annotating reports, data sets, or information sources with social metadata.
c. Data in these workspaces will likely need to be encrypted on disk and over the wire, because some of the data governments will be using will be of a sensitive nature.
- Agent-Augmented Reasoning.
a. Workspaces could allow users to develop, refine, import and export a variety of models for events that users are keen to detect.
b. Machine learning agents could be used to observe how users gather and characterize evidence and learn to automate low-level tasks related to classification and filtering.
- Multi-Channel Communications.
a. Workspaces could be configured to interface with SMS gateways, IVR systems, and mobile phone-based applications and other communication channels to allow interaction with networks of trusted community leaders (such as local officials, teachers, radio station hosts, community health workers).
b. Users could use these features to collect periodic reports, elicit specific reports when concerning patterns are detected, and broadcast risk communication.
- Social Collective Reasoning.
a. Users' account profiles could be linked to one another via one or more social graphs.
b. Social metadata could be selectively shared through a user's social network, such that users are able to see how others they trust are rating a particular information source or characterizing a trend.
c. The social network might also be used for sharing raw data sets, reports, alerts, maps, visualizations, models, hypotheses, and evidence that specific policy responses did or did not work.
d. Users might selectively share with a variety of different audiences, including other trusted individuals, everyone in their organization, everyone in any organization cross-certified with their organization, everyone in their community of practice, or the global community as a whole.
e. The user interface would need to support workspace-based interactions while also providing a kind of peripheral awareness of decision-making trends within the broader network.
f. Organizations might choose to limit what kinds of user-initiated sharing would be allowed.
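As a rough illustration of how selective sharing and organizational limits might interact, the sketch below treats the audiences listed above as levels running from narrowest to broadest and lets an organization cap how far its users may share. This is a simplification assumed for the example: real audiences would not nest this neatly (a community of practice is not strictly wider than a set of cross-certified organizations), and the `Audience` levels, `effective_audience`, and `can_view` are hypothetical names.

```python
from enum import IntEnum


class Audience(IntEnum):
    """Sharing audiences, ordered narrowest to broadest (an assumed ordering)."""
    TRUSTED_INDIVIDUALS = 1
    ORGANIZATION = 2
    CROSS_CERTIFIED = 3
    COMMUNITY_OF_PRACTICE = 4
    GLOBAL = 5


def effective_audience(requested, org_limit):
    """Cap the audience a user asked for at the organization's limit."""
    return min(requested, org_limit)


def can_view(shared_with, viewer_relationship):
    """A viewer sees an item if their relationship falls inside its audience."""
    return viewer_relationship <= shared_with


# A user asks to share globally, but the organization allows at most
# sharing with cross-certified organizations.
granted = effective_audience(Audience.GLOBAL, Audience.CROSS_CERTIFIED)
```

Keeping the cap separate from the user's request means an organization can loosen its policy later without users having to re-share anything.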
Users might also want to:
- Select specific populations to monitor across any combination of features (location, age, sex, ethnicity, income level, etc.) in order to monitor that population for changes in collective behavior that might indicate the impacts of crisis or emerging vulnerabilities.
- Layer risk factors both geospatially and temporally to target closer monitoring.
- Compare patterns in current real-time data to a historical baseline to determine whether observed phenomena constitute an anomaly.
- Correlate an anomalous trend in real-time information with current and historical contextual information in order to decide whether further investigation is warranted.
- Compare the current context with the impact of past events in similar contexts in order to assess the vulnerability of the country population to event X.
- Explore the change in indicators after a certain event (shock, onset of coping, or implementation of policy response) in order to assess the current impact on a particular population.
- Compare current coping mechanisms in a population with other coping mechanisms in a similar context in order to target behavior-change messaging effectively.
- Receive reports from analysts regarding trends and patterns that are supported by indicators in order to design appropriate interventions and policies.
- Comment, flag, share and discuss particular trends in indicators, events, patterns, and hypotheses in order to make a decision on next steps.
- Alert key observers (community leaders, local NGOs, local media) of findings in order to sensitize them to what they should be on the lookout for and elicit follow-up reports from them confirming or refuting hypotheses.
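Several of the items above come down to comparing a current reading with a historical baseline. One common way to do that, shown here purely as a sketch, is a z-score test: flag a value when it sits more than a chosen number of standard deviations from the baseline mean. The function name `is_anomalous`, the threshold of 3.0, and the sample figures are assumptions for illustration; real indicators would also need to account for seasonality and trend.

```python
from statistics import mean, stdev


def is_anomalous(baseline, current, threshold=3.0):
    """Flag `current` if it lies more than `threshold` standard deviations
    from the mean of the historical `baseline` values."""
    if len(baseline) < 2:
        raise ValueError("need at least two baseline values")
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: any departure at all is unusual.
        return current != mu
    return abs(current - mu) / sigma > threshold


# Made-up weekly counts of price inquiries, for illustration only.
weekly_inquiries = [100, 102, 98, 101, 99]
spike = is_anomalous(weekly_inquiries, 110)
normal = is_anomalous(weekly_inquiries, 101)
```

An anomaly flagged this way is only a prompt for further investigation and correlation with contextual information, not a finding in itself.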