Written words and other information can be encoded in synthetic molecules and then recovered by analysing the chemicals.
This means that microscopic bits of plastic could potentially hold much more data than is stored on today’s computer hard drives, which use cumbersome codes and relatively large magnetic particles to store information, says Eric Anslyn of the University of Texas at Austin.
Currently, data is stored using binary code – long strings of 0s and 1s. Its simplicity makes the code easy to decipher, but this approach requires significant space on a hard drive, says Anslyn.
His approach may be a space saver, although the initial aim wasn’t to encode data at all. Anslyn had been attempting to create complex molecules that would make products like pharmaceuticals and dishwasher detergents more effective.
But when he discussed his work with computer programmer friends, Anslyn realised that the compounds he was working with – made from elements including hydrogen, nitrogen, oxygen, and the hydrogen isotope deuterium – could each represent symbolic values for storing information.
Various molecules built from these could become their own code language based on a rich “molecular alphabet” of 16 characters – a hexadecimal code. That’s eight times as many characters as the two used in the binary system, and because each hexadecimal character carries as much information as four binary digits, the approach is particularly efficient for storing data.
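To make the arithmetic concrete, here is a minimal Python sketch comparing how many symbols are needed to represent one byte (8 bits) in a 2-character and a 16-character alphabet. The function name `symbols_needed` is purely illustrative and does not come from the study.

```python
import math


def symbols_needed(n_bits: int, alphabet_size: int) -> int:
    """Return how many symbols from an alphabet of the given size
    are needed to represent n_bits of information."""
    bits_per_symbol = math.log2(alphabet_size)  # 1 for binary, 4 for hexadecimal
    return math.ceil(n_bits / bits_per_symbol)


# One byte of data in each alphabet.
binary_symbols = symbols_needed(8, 2)
hex_symbols = symbols_needed(8, 16)
print(binary_symbols, hex_symbols)  # prints: 8 2
```

In other words, a 16-character alphabet needs a quarter as many symbols as binary to hold the same data.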
What is more, the liquid chromatography-mass spectrometry (LC/MS) analytical system he was already using could easily analyse and sequence such complex substances.
Inspired by the possibilities, Anslyn’s team developed software that would encode regular text symbols into a hexadecimal “molecular language”. Then, they created molecules representing the code needed to write a simple statement: “Hello World!”.
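The team's software is not described in detail here, but the general idea of turning text into hexadecimal characters can be sketched in a few lines of Python. This sketch assumes a standard UTF-8 byte encoding, with each byte written as two hexadecimal characters; the mapping the researchers actually used may differ, and the function name `text_to_hex_symbols` is hypothetical.

```python
def text_to_hex_symbols(text: str) -> list[str]:
    """Convert text to a list of hexadecimal characters (0-9, A-F).

    Assumption: UTF-8 bytes, two hexadecimal characters per byte.
    This is an illustration, not the researchers' actual scheme.
    """
    return list(text.encode("utf-8").hex().upper())


symbols = text_to_hex_symbols("Hello World!")
print("".join(symbols))  # prints: 48656C6C6F20576F726C6421
print(len(symbols))      # prints: 24
```

Under these assumptions, the 12-character message becomes 24 hexadecimal characters, each of which would then be represented by one of 16 molecular building blocks.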
A number of molecules were needed to store the message. To keep them in the correct order for reading, the team used a special plate containing a regular array of wells and placed the molecules in the wells sequentially – a bit like the way a mechanical hard disk drive uses physical location to store a computer’s data.
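The role of the wells can be illustrated with a short Python sketch in which position, rather than anything in the molecule itself, records the order of the message. Several details are assumptions made for illustration only: a 96-well layout (rows A to H, columns 1 to 12), four hexadecimal characters per molecule, and a UTF-8 text encoding. None of these figures is taken from the study.

```python
ROWS = "ABCDEFGH"            # assumed 96-well plate: 8 rows ...
COLS = 12                    # ... by 12 columns
SYMBOLS_PER_MOLECULE = 4     # hypothetical number of hex characters per molecule


def well_name(index: int) -> str:
    """Map a running index to a well label, filling the plate row by row."""
    row, col = divmod(index, COLS)
    return f"{ROWS[row]}{col + 1}"


def store(text: str) -> dict[str, str]:
    """Split the hex-encoded text into chunks and assign them to wells in order."""
    hex_string = text.encode("utf-8").hex().upper()
    chunks = [
        hex_string[i:i + SYMBOLS_PER_MOLECULE]
        for i in range(0, len(hex_string), SYMBOLS_PER_MOLECULE)
    ]
    if len(chunks) > len(ROWS) * COLS:
        raise ValueError("message does not fit on one plate")
    return {well_name(i): chunk for i, chunk in enumerate(chunks)}


def read(plate: dict[str, str]) -> str:
    """Read the wells in row-by-row order and decode the text."""
    ordered = sorted(plate, key=lambda w: (ROWS.index(w[0]), int(w[1:])))
    hex_string = "".join(plate[w] for w in ordered)
    return bytes.fromhex(hex_string).decode("utf-8")


plate = store("Hello World!")
print(plate["A1"])   # prints: 4865
print(read(plate))   # prints: Hello World!
```

Because the reading order comes from the well positions, the contents of each well can be analysed separately and the message reassembled afterwards.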
Encouraged by how easily the software reconstructed the words after the molecules had been sequenced with LC/MS, the researchers moved on to a more complex sentence.
An avid Jane Austen fan, Anslyn chose what he describes as an “apt but timeless quote” from his favourite author’s 1814 novel Mansfield Park: “If one scheme of happiness fails, human nature turns to another; if the first calculation is wrong, we make a second better: we find comfort somewhere.”
The researchers gave a chemically encoded version of the sentence to a colleague who wasn’t involved with the project. Armed with the new software, the colleague successfully read the Austen quote in full.
Other teams have previously developed prototypes of molecular storage, but using binary. Anslyn says the hexadecimal version has “mind-blowing” potential for storing data in a smaller physical space – partly because the basic concept of the molecular code itself is so simple and familiar.
“We always write in symbols, and molecules are just another set of symbols that we can assemble – not just for building molecules analogous to those found in nature, but to create our own inventions,” he says. Or even, apparently, to store the literary inventions of 19th-century novelists.
Journal reference: Cell Reports Physical Science, DOI: 10.1016/j.xcrp.2021.100393