Scientific evidence for health supplements: an interactive, generative data visualization

An overwhelming number of natural products and nutraceuticals vie for our attention. Each is associated with a variety of claimed health benefits, often without any reference to the experimental evidence (if any) supporting those claims – or with reference only to dubious, poorly controlled studies in backwater journals. I don’t spend a lot of time following these compounds, but occasionally one gets mentioned often enough that it breaks through into the literature (e.g., resveratrol, green tea, carnitine/lipoate, or other supplements) and I discuss it here.

If only because of the size of the heap, I nonetheless still suspect that there’s a pony in there somewhere; I’ve often wished I had the time to do a comprehensive literature review of my own, so that I could identify the compounds whose associated claims are supported by the best evidence. Now it looks like I can start wishing for something else, because someone did it for me.

At the (amazing) blog Information is Beautiful, David McCandless and Andy Perkins have assembled a “generative data-visualisation of all the scientific evidence for popular health supplements”. In David’s words:

I’m a bit of a health nut. Keeping fit. Streamlining my diet. I plan to live to the age of 150 in fact. But I get frustrated by constant, conflicting reports and studies about health supplements.

Is Vitamin C worth taking or not? Does Echinacea kill colds? Am I missing out not drinking litres of Goji juice, wheatgrass extract and flaxseed oil every day?

In an effort to give myself a quick reference guide, I dove into the scientific evidence and created a visualization for my book. And then worked with the awesome Andy Perkins on a further interactive, generative “living image”.

The image itself is dynamic in two respects: it responds to user input about what information is desired, and it responds to the introduction of new data. It is based on the information in a spreadsheet, which can be updated (with new compounds, or with new information about compounds already mentioned), altering the visual rendering of the dynamic image. You can play with the image here; I’ve attached a still snapshot below.
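To make the idea of a spreadsheet-driven image concrete, here is a minimal sketch in Python of how rows of data might be turned into vertical positions, with a user-chosen filter applied first. This is my own illustration, not the creators' code: the column names, the sample rows, and the `load_rows` and `layout` helpers are all hypothetical.

```python
import csv
import io

# Hypothetical spreadsheet contents; the column names and values are
# illustrative only, not taken from the real data set.
SHEET = """supplement,condition,evidence
alpha,colds,4.5
beta,general health,1.0
gamma,heart disease,3.0
"""

def load_rows(text):
    """Parse the CSV text into a list of dicts with a numeric evidence score."""
    rows = list(csv.DictReader(io.StringIO(text)))
    for row in rows:
        row["evidence"] = float(row["evidence"])
    return rows

def layout(rows, height=600, min_evidence=0.0):
    """Map each visible row to a vertical pixel position.

    y = 0 is the top of the image, so the strongest evidence lands at the top.
    min_evidence stands in for a user-controlled filter.
    """
    visible = [r for r in rows if r["evidence"] >= min_evidence]
    if not visible:
        return {}
    top = max(r["evidence"] for r in visible)
    return {
        r["supplement"]: round(height * (1 - r["evidence"] / top))
        for r in visible
    }

positions = layout(load_rows(SHEET))                    # everything in the sheet
filtered = layout(load_rows(SHEET), min_evidence=2.0)   # user hides weak evidence
print(positions)
print(filtered)
```

Editing a row in `SHEET`, or adding a new one, changes the output of `layout` without touching any drawing logic, which is the sense in which such an image is "generative".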

The rendering is imperfect (as also discussed elsewhere): more reliable claims are near the top, and more dubious claims are near the bottom, but this positioning is the result of a single variable, “evidence,” which may be based largely on a citation count. This is a problem because not all citations that mention a compound should be weighted equally; furthermore, it’s not clear how conflicting claims end up getting counted. The abstraction of a complex body of data into a single number unquestionably involves some judgment calls that could be made differently – that’s not necessarily a lethal criticism, but the process should be as transparent as possible.
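A toy example shows how much those judgment calls can matter. The sketch below scores the same set of invented studies three ways; none of these is claimed to be the method behind the visualization, and the `direction` and `quality` fields are assumptions of mine.

```python
# Hypothetical studies of a single compound; every value here is invented.
# direction: +1 supports the claim, -1 contradicts it.
# quality: a subjective 0-1 rating of study design.
STUDIES = [
    {"direction": +1, "quality": 0.2},
    {"direction": +1, "quality": 0.3},
    {"direction": +1, "quality": 0.2},
    {"direction": -1, "quality": 0.9},
]

def citation_count(studies):
    """Count every study equally, ignoring what it found."""
    return len(studies)

def net_count(studies):
    """Supporting studies minus contradicting ones, all weighted equally."""
    return sum(s["direction"] for s in studies)

def quality_weighted(studies):
    """Let each study count in proportion to its quality rating."""
    return sum(s["direction"] * s["quality"] for s in studies)

print(citation_count(STUDIES))    # raw count: looks well studied
print(net_count(STUDIES))         # net count: looks mildly supported
print(quality_weighted(STUDIES))  # weighted: the one strong study dominates
```

With these made-up numbers the raw count and the net count are both positive, while the quality-weighted score is negative: the same literature lands in the upper or lower part of the image depending on a choice the viewer never sees.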

On a visual level, the image is attractive, but color is mostly a wasted variable: position along the color spectrum is synonymous with height — except in the case of orange, which indicates a compound with “low evidence, promising results”. The orange compounds are still assigned an evidentiary weight, according to an algorithm I can’t fathom; this is particularly confusing at both ends: beta-glucan is in the “high evidence” position, which seems to contradict the label’s definition (“low evidence”); whereas noni and astragalus are in the “no evidence” position, raising questions about how there could be “promising results”.

The strength of the project, however, is that it can evolve; the creators are already enthusiastically updating it. So far the changes (as detailed in this log) are content-oriented; one hopes that the methodology of generative data visualization will also enjoy improvements as time goes by.