
Turning diamonds’ defects into long-term 3-D data storage


With the amount of data storage required for our daily lives growing and growing, and currently available technology being almost saturated, we’re in desperate need of a new method of data storage. The standard magnetic hard disk drive (HDD) – like what’s probably in your laptop computer – has reached its limit, holding a maximum of a few terabytes. Standard optical disk technologies, like compact disc (CD), digital video disc (DVD) and Blu-ray disc, are restricted by their two-dimensional nature – they just store data in one plane – and also by a physical law called the diffraction limit, based on the wavelength of light, that constrains our ability to focus light to a very small volume.

And then there’s the lifetime of the memory itself to consider. HDDs, as we’ve all experienced in our personal lives, may last only a few years before things start to behave strangely or just fail outright. DVDs and similar media are advertised as having a storage lifetime of hundreds of years. In practice this may be cut down to a few decades, assuming the disk is not rewritable. Rewritable disks degrade on each rewrite.

Without better solutions, we face financial and technological catastrophes as our current storage media reach their limits. How can we store large amounts of data in a way that’s secure for a long time and can be reused or recycled?

In our lab, we’re experimenting with a perhaps unexpected memory material you may even be wearing on your ring finger right now: diamond. On the atomic level, these crystals are extremely orderly – but sometimes defects arise. We’re exploiting these defects as a possible way to store information in three dimensions.

Focusing on tiny defects

One approach to improving data storage has been to continue in the direction of optical memory, but extend it to multiple dimensions. Instead of writing the data to a surface, write it to a volume; make your bits three-dimensional. The data are still limited by the physical inability to focus light to a very small space, but you now have access to an additional dimension in which to store the data. Some methods also polarize the light, giving you even more dimensions for data storage. However, most of these methods are not rewritable.

Here’s where the diamonds come in.

The orderly structure of a diamond, but with a vacancy and a nitrogen replacing two of the carbon atoms.
Zas2000

A diamond is supposed to be a pure well-ordered array of carbon atoms. Under an electron microscope it usually looks like a neatly arranged three-dimensional lattice. But occasionally there is a break in the order and a carbon atom is missing. This is what is known as a vacancy. Even further tainting the diamond, sometimes a nitrogen atom will take the place of a carbon atom. When a vacancy and a nitrogen atom are next to each other, the composite defect is called a nitrogen vacancy, or NV, center. These types of defects are always present to some degree, even in natural diamonds. In large concentrations, NV centers can impart a characteristic red color to the diamond that contains them.

This defect is having a huge impact in physics and chemistry right now. Researchers have used it to detect the unique nuclear magnetic resonance signatures of single proteins and are probing it in a variety of cutting-edge quantum mechanical experiments.

Nitrogen vacancy centers have a tendency to trap electrons, but the electron can also be forced out of the defect by a laser pulse. For many researchers, the defects are interesting only when they’re holding on to electrons. So for them, the fact that the defects can release the electrons, too, is a problem.

But in our lab, we instead look at these nitrogen vacancy centers as a potential benefit. We think of each one as a nanoscopic “bit.” If the defect has an extra electron, the bit is a one. If it doesn’t have an extra electron, the bit is a zero. This electron yes/no, on/off, one/zero property opens the door for turning the NV center’s charge state into the basis for using diamonds as a long-term storage medium.

Starting from a blank ensemble of NV centers in a diamond (1), information can be written (2), erased (3), and rewritten (4).
Siddharth Dhomkar and Carlos A. Meriles, CC BY-ND

Turning the defect into a benefit

Previous experiments with this defect have demonstrated some properties that make diamond a good candidate for a memory platform.

First, researchers can selectively change the charge state of an individual defect so it either holds an electron or not. We’ve used a green laser pulse to assist in trapping an electron and a high-power red laser pulse to eject an electron from the defect. A low-power red laser pulse can help check if an electron is trapped or not. If left completely in the dark, the defects maintain their charged/discharged status virtually forever.
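
The write, erase and read steps described above can be mimicked with a toy model. This is an illustration only, not the software used in the experiments: the class `NVCenter`, its method names and the helper functions are all hypothetical, and every laser pulse is idealized as always succeeding.

```python
class NVCenter:
    """Toy model of one nitrogen vacancy center used as a bit (hypothetical names)."""

    def __init__(self):
        # Blank state: no extra electron, so the bit reads as 0.
        self.has_electron = False

    def green_pulse(self):
        """Green laser pulse: assists in trapping an electron (bit becomes 1)."""
        self.has_electron = True

    def strong_red_pulse(self):
        """High-power red pulse: ejects the electron (bit becomes 0)."""
        self.has_electron = False

    def weak_red_pulse(self):
        """Low-power red pulse: reports the charge state without changing it."""
        return 1 if self.has_electron else 0


def write_bits(bits):
    """Start from blank centers and charge only those that should hold a 1."""
    centers = [NVCenter() for _ in bits]
    for center, bit in zip(centers, bits):
        if bit:
            center.green_pulse()
    return centers


def read_bits(centers):
    """Read every center with a low-power red pulse."""
    return [center.weak_red_pulse() for center in centers]


if __name__ == "__main__":
    memory = write_bits([1, 0, 1, 1])
    print(read_bits(memory))       # the written pattern
    memory[0].strong_red_pulse()   # erase the first bit
    memory[1].green_pulse()        # rewrite the second bit
    print(read_bits(memory))       # the modified pattern
```

In the dark nothing in this model changes, which mirrors the way the real defects keep their charge state between pulses.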

The NV centers can encode data on various levels.
Siddharth Dhomkar and Carlos A. Meriles, CC BY-ND

Our method is still diffraction limited, but is 3-D in the sense that we can charge and discharge the defects at any point inside of the diamond. We also present a sort of fourth dimension. Since the defects are so small and our laser is diffraction limited, we are technically charging and discharging many defects in a single pulse. By varying the duration of the laser pulse in a single region we can control the number of charged NV centers and consequently encode multiple bits of information.
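
The multi-level idea can be sketched in a few lines. Everything numerical here is an assumption made for illustration: the function names are hypothetical, the fraction of charged NV centers is assumed to grow linearly with pulse duration up to a made-up maximum of 100 microseconds, and readout is treated as noise-free. The real response of a diamond to a laser pulse is not specified in the text.

```python
def pulse_duration_for(value, bits_per_region=2, max_duration_us=100.0):
    """Pick a pulse duration that represents `value` (assumed linear mapping)."""
    levels = 2 ** bits_per_region
    if not 0 <= value < levels:
        raise ValueError("value does not fit in this many bits")
    return max_duration_us * value / (levels - 1)


def charged_fraction(duration_us, max_duration_us=100.0):
    """Idealized response: fraction of charged centers, clipped to the range 0..1."""
    return min(max(duration_us / max_duration_us, 0.0), 1.0)


def decode_value(fraction, bits_per_region=2):
    """Snap a measured charged fraction to the nearest encoded level."""
    levels = 2 ** bits_per_region
    return round(fraction * (levels - 1))


if __name__ == "__main__":
    for value in range(4):
        duration = pulse_duration_for(value)
        measured = charged_fraction(duration)
        print(value, round(duration, 1), decode_value(measured))
```

With four distinguishable charge levels, one laser spot carries two bits instead of one; more levels would carry more bits, as long as the levels can still be told apart on readout.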

Though one could use natural diamonds for these applications, we use artificially lab-grown diamonds. That way we can efficiently control the concentration of nitrogen vacancy centers in the diamond.

All these improvements add up to about a 100-fold enhancement in bit density relative to current DVD technology. That means we can encode all the information from a DVD into a diamond that takes up about one percent of the space.

Past just charge, to spin as well

If we could get beyond the diffraction limit of light, we could improve storage capacities even further. We have one novel proposal on this front.

A human cell, imaged on the right with a super-resolution microscope.
Dr. Muthugapatti Kandasamy, CC BY-NC-ND

Nitrogen vacancy centers have also been used in what is called super-resolution microscopy to image things that are much smaller than the wavelength of light. However, since the super-resolution technique works on the same principles of charging and discharging the defect, it would unintentionally alter the pattern that one wants to encode. Therefore, we can’t use it as is for memory storage; we’d need to back up the already written data somehow during a read or write step.

Here we propose an idea we call charge-to-spin conversion: we temporarily encode the charge state of the defect in the spin state of the defect’s host nitrogen nucleus. Spin is a fundamental property of any elementary particle; it’s similar to its charge, and can be imagined as having a very tiny magnet permanently attached to it.

While the charges are being adjusted to read or write the information as desired, the previously written information is well protected in the nitrogen spin state. Once the charges have been encoded, the information can be converted back from the nitrogen spin to the charge state through another mechanism, which we call spin-to-charge conversion.
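
The backup-and-restore logic of this proposal can be shown with a toy model. It is only a sketch of the bookkeeping, not of the physics: each defect is represented as a dictionary with hypothetical `charge` and `spin` fields, the conversions are assumed to be perfect copies, and the disturbance caused by a super-resolution step is modeled as random scrambling of the charges.

```python
import random


def charge_to_spin(defects):
    """Back up each defect's charge state in its nitrogen nuclear spin."""
    for defect in defects:
        defect["spin"] = defect["charge"]


def spin_to_charge(defects):
    """Restore each defect's charge state from the nuclear spin backup."""
    for defect in defects:
        defect["charge"] = defect["spin"]


def disturb_charges(defects, rng):
    """Stand-in for a super-resolution step: charges are altered, spins are not."""
    for defect in defects:
        defect["charge"] = rng.randint(0, 1)


if __name__ == "__main__":
    pattern = [1, 0, 1, 1, 0]
    defects = [{"charge": bit, "spin": 0} for bit in pattern]

    charge_to_spin(defects)                      # protect the written data
    disturb_charges(defects, random.Random(0))   # charges get scrambled
    spin_to_charge(defects)                      # bring the data back

    print([defect["charge"] for defect in defects] == pattern)
```

The point of the model is that the spin copy is untouched while the charges change, so the original pattern survives no matter how the charges were disturbed in between.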

With these advanced protocols, the storage capacity of a diamond would surpass what existing technologies can achieve. This is just a beginning, but these initial results give us a potential way of storing huge amounts of data in a brand new way. We’re looking forward to transforming this beautiful quirk of physics into a vastly useful technology.

The Conversation

Siddharth Dhomkar, Postdoctoral Associate in Physics, City College of New York and Jacob Henshaw, Teaching Assistant in Physics, City College of New York

Big data’s ‘streetlight effect’: where and how we look affects what we see


Big data offers us a window on the world. But large and easily available datasets may not show us the world we live in. For instance, epidemiological models of the recent Ebola epidemic in West Africa using big data consistently overestimated the risk of the disease’s spread and underestimated the local initiatives that played a critical role in controlling the outbreak.

Researchers are rightly excited about the possibilities offered by the availability of enormous amounts of computerized data. But there’s reason to stand back for a minute to consider what exactly this treasure trove of information really offers. Ethnographers like me use a cross-cultural approach when we collect our data because family, marriage and household mean different things in different contexts. This approach informs how I think about big data.

We’ve all heard the joke about the drunk who is asked why he is searching for his lost wallet under the streetlight, rather than where he thinks he dropped it. “Because the light is better here,” he said.

This “streetlight effect” is the tendency of researchers to study what is easy to study. I use this story in my course on Research Design and Ethnographic Methods to explain why so much research on disparities in educational outcomes is done in classrooms and not in students’ homes. Children are much easier to study at school than in their homes, even though many studies show that knowing what happens outside the classroom is important. Nevertheless, schools will continue to be the focus of most research because they generate big data and homes don’t.

The streetlight effect is one factor that prevents big data studies from being useful in the real world – especially studies analyzing easily available user-generated data from the Internet. Researchers assume that this data offers a window into reality. It doesn’t necessarily.

Looking at WEIRDOs

Based on the number of tweets following Hurricane Sandy, for example, it might seem as if the storm hit Manhattan the hardest, not the New Jersey shore. Another example: the since-retired Google Flu Trends, which in 2013 tracked online searches relating to flu symptoms to predict doctor visits, but gave estimates twice as high as reports from the Centers for Disease Control and Prevention. Without checking facts on the ground, researchers may fool themselves into thinking that their big data models accurately represent the world they aim to study.

The problem is similar to the “WEIRD” issue in many research studies. Harvard professor Joseph Henrich and colleagues have shown that findings based on research conducted with undergraduates at American universities – whom they describe as “some of the most psychologically unusual people on Earth” – apply only to that population and cannot be used to make any claims about other human populations, including other Americans. Unlike the typical research subject in psychology studies, they argue, most people in the world are not from Western, Educated, Industrialized, Rich and Democratic societies, i.e., WEIRD.

Twitter users are also atypical compared with the rest of humanity, giving rise to what our postdoctoral researcher Sarah Laborde has dubbed the “WEIRDO” problem of data analytics: most people are not Western, Educated, Industrialized, Rich, Democratic and Online.

Context is critical

Understanding the differences between the vast majority of humanity and that small subset of people whose activities are captured in big data sets is critical to correct analysis of the data. Considering the context and meaning of data – not just the data itself – is a key feature of ethnographic research, argues Michael Agar, who has written extensively about how ethnographers come to understand the world.

What makes research ethnographic? It is not just the methods. It starts with fundamental assumptions about the world, the first and most important of which is that people see and experience the world in different ways, giving them different points of view. Second, these differences result from growing up and living in different social and cultural contexts. This is why WEIRD people are not like any other people on Earth.

The task of the ethnographer, then, is to translate the point of view of the people they study into the point of view of their audience. Discovering other points of view requires that ethnographers go through multiple rounds of data collection and analysis and incorporate concepts from the people they study into the development of their theoretical models. The results are models that are good representations of the world, something analyses of big data frequently struggle to achieve.

Here is an example from my own research with mobile pastoralists. When I tried to make a map of my study area in the Logone Floodplain of Cameroon, I assumed that places had boundaries, like the one separating Ohio from Michigan. Only later, after multiple interviews and observations, did I learn that it is better to think of places in the floodplain as points in an open system, like Columbus and Ann Arbor, without any boundary between them. Imagine that!

Don’t get me wrong: I think big data is great. In our interdisciplinary research projects studying the ecology of infectious diseases and regime shifts in coupled human and natural systems, we are building our own big data sets. Of course, they are not as big as those generated by Twitter or Google users, but big enough that the analytical tools of complexity theory are useful to make sense of the data because the systems we study are more than the sum of their parts.

Moreover, we know what the data represents, how it was collected and what its limitations are. Understanding the context and meaning of the data allows us to check our findings against our knowledge of the world and validate our models. For example, we have collected data on livestock movements using a combination of surveys and GPS technology in Cameroon to build computer models and examine the impact of those movements on the spread of foot-and-mouth disease. Because we know the pastoralists and the region in which they move, we can detect the errors and explain the patterns in the data.

For data analytics to be useful, it needs to be theory- or problem-driven, not simply driven by data that is easily available. It should be more like ethnographic research, with data analysts getting out of their labs and engaging with the world they aim to understand.


Mark Moritz, Associate Professor of Anthropology, The Ohio State University

This article was originally published on The Conversation.