Democratizing data reporting from the ground up

Building a dust sensor network

As the Trump administration continues to curtail funding for longstanding federal agencies like the EPA, citizens have taken it upon themselves to gather and contextualize the data those agencies formerly compiled. Scientists, academics and individuals have banded together at “save the data” events to protect existing environmental data, for fear it may be deleted or made inaccessible. This trend raises the question: Where will future data come from?

One possibility is for citizens and journalists to gather the data themselves.

In our research, we spoke with David Unger, an environmental journalist. According to Unger, the field of journalism is known for being slow to adopt new technology, and this includes emerging means and applications of data collection. Unger believes journalists should move away from “old-fashioned” data gathering and toward networked communications that can deliver more data at a faster rate. Even though gathering environmental data is “probably not the highest of priorities” under the Trump administration, Unger said, he has seen community activists take a more prominent role in their own data collection.

SensorGrid aims to be an open-source, inexpensive method for collecting and understanding environmental data locally. Knight Lab developed SensorGrid with the underlying goals of promoting citizen science and democratizing data collection. The technology builds on the emergence of DIY Arduino-based hardware to provide easily customizable and deployable environmental sensing nodes that come ready to communicate with each other.

The current landscape of environmental data projects

In considering the purpose and direction of the SensorGrid project, we looked for similar and related projects and considered the role that SensorGrid might play in this broader ecosystem of options. The following table summarizes the projects that we found:

| Project | Cost | Sensors | Connectivity | Notes |
| --- | --- | --- | --- | --- |
| Array of Things (aka AoT) | n/a (not a consumer product. See Waggle for tech. availability) | Temperature; Humidity; Barometric pressure; Vibration; Sound intensity; Magnetometer; Nitrogen Dioxide; Ozone; Carbon Monoxide; Hydrogen Sulfide; Sulfur Dioxide; Light/IR; Pedestrian & vehicle traffic | Chicago utilities | "Smart city" style project managed in coordination with city of Chicago utilities. Broad set of sensors, but municipally managed. Not user deployable or manageable. Locations limited to pre-select municipal locations. C.f. e.g. Caren Cooper’s Citizen Science for problems with pre-selected locations of monitoring air quality |
| Waggle | $1400/node | See AoT above | WiFi or Cellular | System that AoT is built with. Standard set of sensors - not modular, thus costing space and power consumption. This appears to be a mature system with a broad set of sensors built-in but lacks modularity and requires either cellular or wifi connectivity. High cost |
| Smart Citizen | Unknown. Out of stock at time of writing | Temperature; Humidity; Light; Sound; Carbon Monoxide; Nitrogen Dioxide | WiFi | Seems to be aimed primarily at each user deploying a single or small number of nodes to contribute to a larger set of community nodes. The democratizing aspect of user deployed sensors (vs. municipally deployed sensors of AoT) is appealing. However, this project seems to lack the features needed for self-contained ad-hoc deployed data acquisition projects |
| Air Quality Egg | $280 per node. Nodes are dedicated to data type (e.g. CO2, NO2/CO, O3/SO2, etc) | CO2; NO2; CO; O3; SO2; Particulates; Volatile organic compounds | WiFi | Similar to Smart Citizen, this project seems to focus on users making a small number of node deployments to contribute to a larger community data set |
| DustDuino | Unknown (DIY solution) | Dust particulates (PM10, PM2.5) uses Shinyei PPD42NS | Direct to PC? | A single-node DIY Arduino-based solution. There appears to be some useful information here. The Shinyei may be a good alternative to the Sharp sensor used in the current study, in no small part due to Public Lab’s statement that DustDuino measures PM10 and PM2.5 (Sharp’s PM detection is unknown to us at this time), and that it has been used in studies demonstrating correlation with other measurement methods. |
| TelosB | 988 Euro for starter kit with nodes and data concentrator | Temperature; Humidity; Light; CO; CO2; PIR; Magnetic; Microphone | Ethernet via 802.15.4-Ethernet data concentrator | Aiming at industrial solutions and research science, the TelosB mote system seems to be highly adaptable to a broad set of distributed sensor needs. The entry point to simple ad hoc data acquisition, however, is not clear. TelosB seems to be targeting high-end users able to make significant engineering investments up front. It is not clear if they are or will be targeting a more non-technical user base for ad hoc deployments. Requires an ethernet connected data concentrator |
Table 1. Alternative Solutions

Table 1 is probably not an exhaustive set of existing technologies, but it is representative of much of what is available. Our observations from reviewing these solutions fall under two topics: data ownership and barriers to entry.

Data ownership

Many systems have some emphasis on a notion of democratization or citizen participation in data. However, existing approaches have some problems.

Projects like Array of Things “democratize” by making the collected data available to the public. However, deployment is under municipal control, potentially restricting data collection ability and missing important data that could be had under full user-driven deployments.

Projects like Smart Citizen and Air Quality Egg are, in a sense, more democratized than Array of Things in that they shift the burden of deployment to the citizens, supporting a much stronger element of public control of the system as a whole. These projects, however, are oriented toward citizen contributions to larger data sets. This is important work, and could produce valuable data over wide and dispersed environmental areas. It is simply a different approach from SensorGrid, which aims to give full control of data acquisition to the user. It is arguable that these projects take a more “citizen science” oriented tack than SensorGrid does, given their efforts to drive public participation in providing data. However, for the use cases of campus-wide and neighborhood-wide data collection with a good distribution of acquisition nodes, the connectivity burden of these projects could be an impediment compared to the distributed peer-to-peer approach of SensorGrid.

Barriers to entry

Most of these systems have a high initial setup cost. Also, lack of modularity means adoption of a particular technology will lock the user into the sensors supported by that technology. SensorGrid does not yet support a large number of sensors, but aims to be open and adaptable such that sensor availability can continually grow. A single SensorGrid node can be built for under $100 and is sufficient for gathering data at that location via a connected computer. Two nodes are sufficient for remote data harvesting.

Another barrier to entry for existing technologies is the level of technical expertise required to deploy and use the system. TelosB seems to be the system most similar to SensorGrid’s approach. However, this system seems to be aimed at technical users. It remains to be determined if TelosB will attempt more turn-key approaches on par with SensorGrid.

SensorGrid aims to be both affordable and technically accessible. Separation of electrical assembly/programming from modular assembly/configuration will help to support the goal of technical accessibility. Once nodes have been electronically assembled and boot-loaded, users who are not comfortable with the engineering aspects of the system can write a simple configuration file with no soldering or coding required.

Focusing on dust

A project still in its infancy, SensorGrid has a number of outstanding technical challenges to address. These include power management, data transmission protocol improvements, enclosure considerations, and sensor integration. Each of these challenges has been touched upon to some extent in early stages of development, but each still requires further work and fine-tuning.

To address these challenges through a concrete implementation, we decided to focus on dust detection for this quarter. More specifically, we investigated particulate matter (PM) concentration, which is a key problem area with respect to health and climate related issues (see, for example, Kelly et al., Environmental Pollution, 221).

PM levels are typically noted based on the size of the particles in micrometers, such as PM2.5 or PM10 (2.5 micrometers or 10 micrometers, respectively). In a conversation with Kari Lydersen, a Medill faculty member, we discovered that the PM size distinction is necessary because different-sized particles cause different effects. “Dust is a little bit misleading,” Lydersen said. That’s because, while the public may think of dust as a respiratory problem, the finer particles – sometimes the result of diesel exhaust – can penetrate the blood-brain barrier.

The technical details

The Dust Sensor

In particular, we worked on integrating the Sharp GP2Y1010AU0F dust sensor into SensorGrid. The Sharp is a commonly available air particulate sensor that is popular in DIY sensor projects, and we drew on several online tutorials in our work.

The Sharp sensor is shown in the figure below.

The Sharp GP2Y1010AU0F Dust Sensor

According to the sensor’s datasheet, features of the sensor include:

  • Compact packaging

  • Low power consumption

  • Dust presence detectable by single-pulse photometry

  • Can distinguish smoke from dust

Notably absent from the datasheet is any information about particulate matter size to be detected. The datasheet does indicate an ability to detect very fine particulate matter like cigarette smoke, without quantifying “very fine.”

Also notable: the ability to distinguish smoke from dust relies on the pulse pattern of the output voltage. However, there is no indication in the datasheet of what those patterns are.

For the scope of this quarter-long project, we focused on single-shot periodic sampling of dust levels. Given some of the complexities and unknowns about the specifications of this particular sensor, a more thorough investigation would examine the tradeoffs and potential necessity of a more continuous approach to data sampling. Furthermore, such exploration would play a role in further refining the definition of SensorGrid’s use case scope.

Sensor Connectivity

The Dust Sensor Adapter from DFRobot

It is worth noting that the sensor does not necessarily come with the required proprietary connector. SparkFun makes a connector available which requires wire crimping.

Digikey does not seem to have the correct connector available. The connector that was recommended by their shopping cart recommender system was the wrong connector.

DFRobot is the best resource with respect to connectivity of this sensor. They provide a pre-crimped connector with the sensor. However, DFRobot has gone one better: they have made a breakout/connector combo dust sensor adapter that eliminates the need for additional analogue circuitry when integrating the sensor into an Arduino-based circuit. For us, the simplified connectivity and streamlined design workflow easily justify the extra cost of $3.90 for this component.

System Assembly

Electronics assembly

To take advantage of the Feather’s modular design, we integrated the dust sensor into the system by creating a connectivity board from a Feather single proto. Because of the size of the sensor and its external connectivity wiring, the sensor is not directly integrated into the proto board; rather, a simple pinout configuration was assembled to make sensor connectivity as simple as possible.

The figures below show the top and bottom of the connectivity assembly.

Sharp dust sensor connectivity board SensorGrid modular integration

To minimize wiring and soldering on the connectivity assembly, we took advantage of the power and ground busses on the proto board and the way these respective pinouts appear on the DFRobot sensor adapter.

Modular assembly

Once the sensor connectivity proto board is assembled, modular assembly is a simple matter of taking advantage of the stackable Feather system. The figure below shows a completed sensor node, with the dust sensor attached.

Assembled dust sensor node

Enclosure

3D printed Stevenson enclosure

We needed weather-resistant enclosures that could accommodate the sensor and the hardware while allowing air to pass through.

The sensor measures particulate matter optically: particles that enter the sensing chamber interact with the light from an internal LED, and the sensor registers the change. So a falling raindrop entering the chamber could register as a particle and skew the data. At the same time, the only way to measure air quality is to measure the air.

Early in the research phase, we came across the Array of Things, a Chicago-based sensor project that plans to deploy 500 air-quality sensors across the city by the end of next year. These nodes are enclosed by a Stevenson shield, which is a series of stacked discs that resembles a beehive, according to Array of Things press manager Rob Mitchum.

Unlike SensorGrid, the Array of Things nodes contain 25 to 30 sensors and are not intended for use by citizens, but the design seemed like it would work. We obtained open source files for 3-D printing, but each piece required 10 to 12 hours to complete. Future models may benefit from a more DIY approach. We found multiple tutorials using plastic discs available at hardware stores, and an educational version from the EPA uses a plastic children’s toy.

Considerations and lessons learned

Code change complexities

While the code changes for sensor integration were simple, the need for changes at all is not ideal for modularity. This raises the question of whether, in time, SensorGrid could have standardized implementations according to broad integration type. Such type variations would include input resistance changes, pulse width modulation inputs, and in the case of the Sharp sensor, linear input voltage changes over a range of input values.

Most analog voltage-based sensors will operate on a full voltage range (0 to 3.3 V), which maps to a raw read value of 0 to 1023 on a 10-bit analog input. To some extent it might make sense to transmit the raw read value and calculate its meaning in data processing, rather than on the microcontroller. This would provide a mechanism for abstract consistency across sensors of this type. However, there are control considerations that are not yet clear. The Sharp dust sensor requires a low-value control input during sampling to turn on the internal LED. Not all sensors will require the same control mechanism. Other differences might include delay times required for getting proper readings. Furthermore, as discussed with respect to dust type reading and sampling methodology, it is not yet clear that the current single-shot sampling method is the best approach for this particular sensor.
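The idea of transmitting the raw reading and interpreting it later can be sketched in a few lines of Python. This is a minimal illustration rather than SensorGrid's actual processing code: it assumes a 10-bit reading (0 to 1023) against a 3.3 V reference, and the function names, along with the calibration slope and offset, are hypothetical placeholders that would have to come from real calibration work.

```python
ADC_MAX = 1023   # assumption: 10-bit analog-to-digital converter
V_REF = 3.3      # assumption: 3.3 V analog reference


def raw_to_voltage(raw, adc_max=ADC_MAX, v_ref=V_REF):
    """Convert a raw analog reading to volts."""
    if not 0 <= raw <= adc_max:
        raise ValueError(f"raw reading {raw} is outside 0-{adc_max}")
    return raw * v_ref / adc_max


def voltage_to_density(volts, slope, offset):
    """Apply a linear calibration to a voltage.

    slope and offset are hypothetical calibration constants; the result
    is clamped at zero so noise near the baseline never goes negative.
    """
    return max(0.0, slope * volts + offset)
```

Keeping the conversion in data processing means the same raw values could be reinterpreted later if a better calibration becomes available, without reflashing any nodes.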

Accuracy and calibration

There remains uncertainty around the accuracy and utility of the Sharp sensor. Lack of specification for the sensor means that without further testing, we cannot be sure of the sensor’s sensitivity to PM size and concentration, nor can we be sure of the consistency of readings across sensors or their correlation with more well established methods of particulate measurement.
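One way such a comparison could be approached is sketched below: given paired readings from the Sharp sensor and from a reference instrument, compute how strongly they correlate and fit a straight line between them. This is an illustrative sketch only; the function names are our own, and no real measurements are implied.

```python
from math import sqrt


def _centered_sums(xs, ys):
    """Return the means and centered sums needed by both functions."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return mx, my, sxx, syy, sxy


def pearson_r(xs, ys):
    """Pearson correlation between sensor and reference readings."""
    _, _, sxx, syy, sxy = _centered_sums(xs, ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)


def linear_fit(xs, ys):
    """Least-squares slope and intercept mapping xs onto ys."""
    mx, my, sxx, _, sxy = _centered_sums(xs, ys)
    if sxx == 0:
        raise ValueError("cannot fit a line to a constant input series")
    slope = sxy / sxx
    return slope, my - slope * mx
```

A correlation near 1 would suggest the sensor tracks the reference well, and the fitted slope and intercept could then serve as calibration constants. Running the same comparison across several Sharp units would indicate how consistent individual sensors are.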

Conclusion

A number of outstanding paths of inquiry remain for SensorGrid to reach the point of being field-ready for real environmental data acquisition. The future of SensorGrid work will fall into these major tiers of investigation, each of which has significant and interesting work to be done:

  • Sensor Type
  • Power management
  • Enclosures
  • Data management

The present work has focused on a particular sensor type, but by doing the work to integrate that sensor into the system, we cut through aspects of each of these tiers of investigation.

For this project, we focused on particulate matter sensing with the Sharp dust sensor. Wang et al. have shown the Sharp sensor to have a high degree of linearity with known reference sensing tools and to be sufficiently sensitive for obtaining useful data points about environmental pollution. We have shown that integrating the Sharp sensor into SensorGrid is a straightforward task, and the resulting integration is now available for simple modular assembly, configuration and deployment.

While progress has been made on integrating the Sharp dust sensor into SensorGrid, the question of whether this particular sensor brings needed functionality to the SensorGrid ecosystem remains to be answered.

Here are some outstanding questions and further points of exploration that remain with respect to the Sharp dust sensor:

  • Address the issue of voltage signal patterns and related particulate type. Integrate findings into SensorGrid’s use case scope definition. I.e. address the following:

    • Are continuous samples required for an accurate picture of dust? How continuous?

    • If so, does this dust sensor fit into SensorGrid’s “low periodicity” use case requirement? Does any? What other data types might have the same problem?

    • Would a sample average approach be sufficient? E.g. every data point is actually an average of samples taken over a designated time period or maybe since the last data point

    • Would a periodic-continuous approach be sufficient? E.g. take a sample every second but only for 1 minute of every hour.

  • Test the sensor for its sensitivity to different PM sizes

  • Compare multiple sensors for a better sense of accuracy

  • Test in controlled dust environments to better understand the need for calibration and/or controls such as humidity, dust adhesion, etc.

  • Detailed integration documents. Should we determine that this sensor is worth including in the SensorGrid ecosystem (based on the above outstanding explorations), a detailed integration document should be produced for users
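The sample-average and periodic-continuous ideas raised in the questions above can be sketched roughly as follows. This is a hedged illustration in Python, not SensorGrid firmware: the function names are hypothetical, and `read_sample` stands in for whatever routine actually reads the sensor.

```python
from statistics import mean


def average_samples(read_sample, count):
    """Take `count` readings and return their mean as a single data point."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return mean(read_sample() for _ in range(count))


def burst_sample_times(period_s, burst_s, interval_s, total_s):
    """Return sample times (in seconds) for a periodic-continuous schedule.

    Samples are taken every `interval_s` seconds during the first
    `burst_s` seconds of each `period_s`-second period, up to `total_s`.
    """
    times = []
    for start in range(0, total_s, period_s):
        for offset in range(0, burst_s, interval_s):
            t = start + offset
            if t < total_s:
                times.append(t)
    return times
```

For the schedule mentioned above (one sample per second for 1 minute of every hour), `burst_sample_times(3600, 60, 1, 7200)` yields 120 sample times over two hours. Each burst could then be reduced to one transmitted data point with `average_samples`, which may help keep the approach within a low-periodicity reporting model.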

In conclusion, the Sharp GP2Y1010AU0F is an affordable and accessible dust sensor that is simple to integrate into SensorGrid. The sensor’s flexible input voltage requirements and its straightforward voltage-level reading method make quick work of this integration. However, lack of specification and uncertainty around the sensor’s application to the use case at hand mean that much more data needs to be gathered in order to begin to understand the Sharp’s applicability to SensorGrid. Ideally, a controlled testing environment, outside the scope of the current work, would be used to determine some of these aspects of applicability.

About the authors

Holly Kane

Victoria Cabales

David Wallach

Rachel Inderhees

About the project

Environmental Reporting with Sensors

Sensor journalism uses sensors to collect information about our environment. It opens new possibilities for journalists, enabling them to collect and process data that might otherwise be unavailable, or available only at a coarser level of detail.