http://www.usatoday.com/story/news/nation/2013/01/23/dna-information-storage/1858801/ The best part about this is what it's going to do to science fiction.
There's been SF about stuff like this before, and the concept of biological computation has been around for years, as has the idea of encoding non-biological information in DNA -- that's not new (in fiction). But it is very neat that it's reached non-fiction. /edit -- there was this really good Canadian TV series that used the concept years ago -- I forget the name of it though, but it involved doctors and scientists dealing with all sorts of medical and ecological mysteries. It wasn't computation in that case, but secret messages hidden in the RNA of manufactured viruses. /edit "Regenesis" is the show (if my mind were a sieve, you could drive a truck through the holes in it.) What I thought was odd about the article is that it talks about 0s and 1s, when I'd think you'd use base 4 math. But that may just be carelessness on the part of the writer. Base 2 works well for electronic systems but ideally, you'd want the base to be as large as is possible/practical. If there were 10 nucleobases instead of 4, you'd want to use base 10.
If there were 10, you would want to use base 8. The reason is parity. Otherwise you are likely to have errors. Check this for a long and boring mathematics article that explains it in much detail. https://en.wikipedia.org/wiki/Reed–Solomon_error_correction
Or base 16, and have 6 unused values. Either way, the whole "DNA as data storage" (not that it wasn't a sort of data storage in the first place) is kind of weird, but I think it can be potentially useful. Not for full-size computer stuff (servers and that sort of thing), but as a novelty with potential applications in smaller things.
The conversion between base 4 and base 2 is mind-numbingly simple. Base 4, base 8, base 16, etc. can all be seen as shorthand for base 2, since each digit just stands for a fixed group of bits. That is why the fact that DNA uses base 4 illustrates the natural significance of base 2 more than it does that of base 4. The fact that DNA has the potential to be used as a compact form of information storage has been known for some time. I'm not sure it'll be able to compete at the high end with existing technologies like holographic discs, though.
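To make the "shorthand for base 2" point concrete, here is a minimal sketch in Python. The assignment of bit pairs to A, C, G and T is an arbitrary choice of mine for illustration, not the scheme used in the article, and the function names are made up.

```python
# Hypothetical mapping: which base stands for which bit pair is arbitrary.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def bytes_to_dna(data: bytes) -> str:
    """Write each byte as 8 bits, then read the bits two at a time as one base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def dna_to_bytes(dna: str) -> bytes:
    """Reverse the mapping: each base becomes 2 bits, each 8 bits become a byte."""
    bits = "".join(BASE_TO_BITS[base] for base in dna)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


print(bytes_to_dna(b"Hi"))  # CAGACGGC
```

Every byte comes out as exactly four bases, and no arithmetic is involved beyond regrouping the bits.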
I would say a freakshow. But the point I was making is that if data can be stored in a few cells of biological matter that is otherwise inert, data smuggling can be done without any risk of discovery. Make several thousand such cells and inject them somewhere that will not be possible to inspect, like inside your breasts. (Male AND female apply equally here, both genders have breasts.) If anyone decided to try to inspect them thoroughly, they would be begging for a sexual harassment lawsuit. Likewise, any attempt to extract samples from the hundreds of different areas where a tiny sample could be injected would require a drastic change of laws that would probably result in massive protests and civil unrest. I am not trying to get us on political/legal subjects. But most other industries that use data storage devices have no need for this sort of undetectable storage. Espionage is the only one that would capitalize on it. What would change this is if they somehow made self-replicating organic storage devices that retained the data across millions of generations. Then it would outshine the best data storage of the day, with the exception of how the hell you access the data. With this sort, you need a microscope and loads of time.
I should point out that the story this all stems from is an outright lie. The idea of using DNA for data storage happened in Star Trek almost a decade ago. And in other science fiction works many decades ago. Basically once people understood DNA could be altered, the idea was there. The claim that these "scientists" came up with the idea in a bar is absurd. Perhaps they remembered reading about it from some sci-fi a decade or more ago, but they are not the first to come up with the idea. Calling them plagiarists is too kind. A fucking search engine would confirm that the idea was not new. They are scammers. And they sully the name of scammers selling wonder-placebos by claiming to be the first. If they want bragging rights, they can demonstrate that it works publicly. Until they do, this is a hoax in my eyes. If they want their names to go down in history with great reverence, they can work on actually making an automated system to extract the data. That would be much more of a feat than manufacturing a billion cells of bio-junk and twisting a few of each batch chemically until they have the pairs they want. As the article reads, we have exactly nothing but their word that this is anything other than dust in a vial. /rant off (I know I get worked up over little things. Sorry. But I am a skeptic, and until they can prove it, I will look at them with the harshest criticism tolerable on these forums.)
An idea is fine, but giving credit to its invention vs. actually putting it into practice -- well, it all depends on how detailed the original idea gets. For example, Arthur C. Clarke has long been credited as the 'inventor' of geosynchronous satellites, even though he never patented it, because it was such a simple idea, and satellites already existed. On the other hand, powered flight was not such a simple thing to get done by the existing technology of the time, so simply writing about it is not enough to gain credit for inventing it. No one would claim that John Brunner invented the computer virus even though he wrote about them before they actually existed in reality. People wrote about atom bombs before such a bomb existed (a couple of people almost went to jail over it, in fact). But they didn't invent it. They just pieced together existing clues from public scientific and technical documents. I think the same has to be said about encoding data in DNA as about flight. It's not enough to have the concept, one has to prove that concept to get credit as its inventor. The engineering of it has to be exact enough to prove that it can be used for that purpose. Engineers get no respect lol. It's always those damn scientists with their sexy 'slide rules' and 'theories' and 'equations'. Never the guy who figures out how to build the darn thing who gets the girls and the Ferraris (j/k)
I just want to put it out there; but I mean, sure, they didn't come up with the idea - but they (as scientists) probably came up with the idea of doing it via X - alternatively, this was 20+ years ago that they met in a pub. As the paper does not specify, I'd be wary about casting aspersions on this being a wild publicity stunt. I mean, everyone knows that stuff falls down, right? Yet I don't see people screaming and shouting at Newton for "discovering gravity". Secondly, and perhaps more importantly, given that two independent teams have done this (as a prototyped technology) we have fairly convincing evidence that it's possible. Thirdly: If you wanted to smuggle information over borders... you really don't need a super complex method to do this, lol. If you're known to be a spy, you're likely to be stopped. If you're not, you're not likely to be searched. I mean, seriously. The main theory behind developing DNA storage would be to encrypt vast amounts of knowledge into a very small amount of space - so for example, you're an organisation in charge of [some stupid big modifier]bytes of information in the near future. You want to back this up. You could A) do it with [some big number] of HDDs, or B) use a few strands of DNA. Sure, the latter is more expensive, but 20 years ago a computer was expensive compared to a far superior mobile phone today.
When you think real spies, you shouldn't be thinking James Bond, but look at the true stories: "The Falcon and the Snowman", or "Breach". In both cases, there was no need for the spy to smuggle the info out of the country, just out of the facilities where they worked to someone working in the embassy who was paying them. DNA is not a very practical means of transporting data in those typical cases. Generally, they were dealing with actual documents or film containing photos of documents. Once it's outside the facility where the info originates, there's very little that can be done to stop it. The best you can hope for is to catch your spy. Because no one wants to start messing directly with ambassadors and diplomats.
Thank you for the clarification, Mining. But I never meant to imply that real spies work like movies portray them. There is no glory in photocopying documents and shoving them up your expletives as a means to get the raw data to your side of whatever. The film industry would also have us believe a golden gun is somehow more dangerous than a steel one, and never explodes by being made of an inferior metal... MacGyver science is not science at all. But when I read the article above I read "MacGyver made a fucking ICBM with a multistage engine and a dozen warheads out of a matchstick and some pocket lint." But even with your explanation of the truth behind this, the fact remains that it is currently not possible, and even if it were, it would be aimed at covert and secretive organizations with much to hide and unlimited resources. If you need it to be encrypted, you are wanting it hidden. That is the most simplistic breakdown I can come up with. Just to nit-pick, encryption never reduces the size of the data storage required. I know that is not what you meant, but I am in a nit-picky mood. I keep an encrypted partition on my drive with my passwords and logins for all the sites and services I use. But aside from that I have zero use for encryption. But because of this I know quite a bit about it. (It gave me something useful to read up on a decade back.) The first weak link will always be the point of attack in any given case where it is an option. If the authorities ever asked me for my password I would refuse, but upon being instructed to decrypt the data by a court of law I would do exactly that. I would still never give up the password even if ordered to do so by a court, as that opens the door for them to require me to come up with a new one every time they decide they have reason again. (Once you give up a password you can never use that one again.) Here in the US they have been careful not to let this sort of case get to a trial.
Because they would likely risk a loss that cannot be endured by our law enforcement agencies. (Not monetary, but in the precedent it would set that could and would allow likely criminals to avoid giving up their encrypted content.) My worst offenses include and are limited to being an ass everywhere I go and speaking my mind in situations where some would prefer I did not. I doubt my passwords will ever be requested. But finally, let us get back to the real topic. Can anyone come up with a realistic goal of DNA as a data storage mechanism that does not involve espionage? If we already had Star Trek scanner devices, we could capitalize on this and have the entire Internet in all its (un)glory in a jar the size of a finger. But the weak point remains the need to manually read the DNA via a microscope.
It's really not about encryption in the traditional sense (keeping something secret). It's more about the fact that something stored in DNA is very small. To quote the article: It's not about storing information in a living person (or mutant creature, as I joked), but about storing lots and lots of data in a relatively small container designed for that purpose. So examples of applications for this could be storing huge libraries of information digitally, at a fraction of today's cost to house that data.
Ah, the idea itself is not original at all and artificial DNA (well, RNA mostly) has been around for some time. I guess if they managed to make predefined chains of our beloved bases reliably and quite fast, that would be something. But everybody who has to do something with DNA has thought of how awesome it would be to store information in such a compact way. I also can't see this replacing other means of data storage for your average person anytime soon, since setting up an automated reader/encoder would be hugely, hugely costly. I still shudder at the numbers they threw at us in the practical parts concerning the sequencers. Also, you wouldn't want to inject this into a living thing. DNA is stored in cells, and guess what the body does with those? Things I can think of that could happen: 1) Your spycell is recognized as a pathogen (in this case that just means foreign to your body), your macrophages take care of it. This is the most likely thing. 2) Yay, cells! Let's try to read the DNA and build some proteins! Except it's all gibberish and I rather think that the mitosis wouldn't really work either, so I guess they'd self-destruct after some time. 3) You put that DNA into something that is not a cell, but more like a cover of proteins. Remember what happens with anything your body can't make use of? It goes out the other side. 4) Put it in a hull that resembles some kind of transmitter/hormone, but stays inert in the receptor. Which is basically a drug. Something like this might just work, but I'm not sure how you would place the DNA there such that it is not destroyed. This would also give the term "Information Overdose" a whole new meaning. Other things to keep in mind: DNA is far, far from inert, as portrayed here. Thousands of mutations happen in our body every second. Chemicals, heat, sunlight, reproduction errors... All of these change the DNA sequence, because the molecules are basically very similar.
Usually, this has no effect because our body has tons of repair mechanisms in place to counter this, as well as most of our DNA being just repetitions of simple patterns (Junk-DNA would be the trivial, but misleading name) and the actual encoding part being something like 2-3% if I remember correctly. This redundancy is needed, because even small mutations can turn important information into gibberish. Just imagine you delete 1 base: the whole sequence starting there is now totally wrong, because everything has been moved to the left. The whole output is now wrong. Imagine if that happened a few hundred times over the course of your DNA strand, and you can also add some random insertions. TL;DR: I'm very sceptical about this. The idea is old, afaik something like this has been possible for some time, albeit slowly and with a lot of costs. Also, I don't think that there are too many practical usages for this at this time, due to lots of difficulties and problems.
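The "delete 1 base and everything after it is wrong" point is easy to show with a toy example. This is only a sketch: the 15-base sequence is made up, and `codons` is a hypothetical helper that reads a sequence in triplets the way a ribosome reads codons.

```python
def codons(seq: str) -> list[str]:
    """Split a sequence into consecutive triplets, dropping any short tail."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


original = "ATGGCCTTAGGCAAA"            # made-up sequence, 5 triplets
mutated = original[:4] + original[5:]   # delete the single base at index 4

before = codons(original)
after = codons(mutated)
print(before)
print(after)

# Count the triplets that survive unchanged at the same position.
unchanged = sum(1 for a, b in zip(before, after) if a == b)
print(unchanged, "of", len(before), "triplets unchanged")
```

Only the triplet before the deletion survives. Every triplet from the deletion onwards is read out of frame, which is why a storage scheme would need some way to detect and resynchronise after insertions and deletions, not just substitutions.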
The idea of storing information in DNA shouldn't be new... that is like saying that storing data on the platters of a hard drive is novel. DNA has evolved (or been designed if you will) to store data. This is almost expected, as we are really progressing in the ease of artificially attaching base pairs in order. The information was actually converted to BASE 3, which is described in great detail in the paper. This helps for a number of reasons, but principal among them was redundancy and error checking. By using base 3, you ensure that there is no coding that requires more than 2 of the same base pairs in a row. Also, the widely reported information density of 2.2 petabytes per gram is extremely extrapolated. If you look, they had at best only several MB of data encoded in DNA, because if you think about it a gram of DNA is a ridiculous amount, and while it is still fundamentally cool to think about, we have no technology that could 'write' that amount on a realistic timescale. Finally, I wanted to point out that the information density of DNA is pretty impressive, but there should be even better ways to do it. Just like optical 3D storage has yet to be attained effectively, there is huge information potential in adding dimensionality to the storage. The near perfect example is HIV. It is a remarkably small amount of RNA that increases the amount of information transferred by taking advantage of how proteins fold to create proteins that perform different tasks based on where the encoding began and ended. There is nothing like this in the computer world as far as I can tell. Could you imagine a computer code that was so well written that it read a small string of code to create additional code? That code in turn would read every third bit of the original code and every second bit of itself to make another program that would read itself backwards along with every prime bit to create more code.
Different variations of reading the bits in the right order would further build the program on. Eventually you would have expanded yourself into tons of code out of the same small amount of stuff. Remember, all you are working with are 1's and 0's. If you could design a complicated enough algorithm that simply chose new ones and zeroes out of an existing string of them, you could write more choosing programs indefinitely. -And that is kind of how HIV works, except that it creates proteins instead of code, and those proteins further read the code and do other tasks (like modifying other proteins already created or already present). And we have yet to even scratch the surface of protein folding. We have a long way to go. DNA is truly amazing, and from what we currently know, or assume, most of our genome isn't used or expressed in any way, and even all of that combined would be (some wild guess) only a few micrograms.
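On the base 3 point earlier in this post, here is a rough sketch of how a base-3 code can avoid repeated bases. It assumes a rotating rule where each base-3 digit (trit) picks one of the three bases that differ from the previous base. That is my reading of the idea, not a copy of the paper's actual table, and the starting base "A" and the function names are arbitrary. Note that this particular rule is stricter than "no more than 2 in a row": it never repeats a base at all.

```python
BASES = "ACGT"


def trits_to_dna(trits, previous="A"):
    """Encode base-3 digits so that no base ever equals the one before it."""
    out = []
    for trit in trits:
        # Exactly three bases differ from the previous one; the trit picks one.
        choices = [b for b in BASES if b != previous]
        previous = choices[trit]
        out.append(previous)
    return "".join(out)


def dna_to_trits(dna, previous="A"):
    """Decode by asking which of the three allowed bases was actually used."""
    trits = []
    for base in dna:
        choices = [b for b in BASES if b != previous]
        trits.append(choices.index(base))
        previous = base
    return trits


print(trits_to_dna([0, 0, 0, 0]))  # CACA
print(trits_to_dna([2, 2, 2]))     # TGT
```

Even an input of all-identical digits produces an alternating sequence, which is the whole point: long runs of one base are apparently what the writing and reading machinery handles worst.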
Actually (though it's super not how most people do stuff because it's pretty hard to do + non-standard) http://www.muppetlabs.com/~breadbox/software/tiny/teensy.html
SHA in all forms is *NOT* a cryptographic mechanism. It is a hashing mechanism. To our computer-unaided eyes they all appear to be gibberish, but we know the difference upon closer examination. And just because you have the SHA-1 hash, that in no way means you have the data it represents. An example follows: File name unimportant.iso Size 7.13 Gigabytes. SHA-1 Hash is 00F0060F1F0040749C90047CD50038B08F0014DA Even with that hash, you have no idea what the file is beyond the data I provided. If you search for that hash you may find it,* but you still have to download it to have the data. Thus it is in no way a compression or encryption scheme. The original file remains untouched after hashing. *Edit* Here, look for yourselves if you want to. https://en.wikipedia.org/wiki/SHA-1 *Edit* Just another edit to point out that I am not upset. Reading my post now it sounds like I am a raging nutcase. But I just like to be clear. I have no hostility towards you or anyone else, Kaidelong. Also, that SHA-1 was a real hash from a real ISO image of a commercial game. I back up everything I have so it does not get destroyed in the case of a drive failure or me somehow losing my logins to the sites where I got it from... I am now changing it since I do not want you added to a list of people who searched for the hash. So the hash looks fake because it is made up. I just replaced four pairs of characters with zeros. That should make it close enough to non-existent. (And likely not a real SHA-1 hash now anyway.)
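If anyone wants to check the "a hash is not the data" point for themselves, a minimal sketch with Python's standard `hashlib` module follows. The two inputs are arbitrary stand-ins of mine, not the ISO mentioned above.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 digest of data as a hexadecimal string."""
    return hashlib.sha1(data).hexdigest()


small = b"hello"                 # 5 bytes
large = b"\x00" * 10_000_000     # 10 million bytes

small_digest = sha1_hex(small)
large_digest = sha1_hex(large)

# Both digests are 40 hex characters (160 bits), whatever the input size.
print(len(small), "bytes ->", len(small_digest), "hex chars")
print(len(large), "bytes ->", len(large_digest), "hex chars")
```

Since a 5 byte input and a 10 million byte input both come out as 160 bits, the digest cannot contain enough information to rebuild the input, so it cannot be a compression scheme.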
I know you can't recover the plaintext from the ciphertext. I didn't really study hash algorithms much; most of my work was with the Rabin cryptosystem and the disambiguation associated with it. However, cryptographic hash algorithms can be seen as one-way cryptosystems where the message sent is hidden but the ciphertext cannot ever be decrypted. That is useful for dealing with situations where the content of a message is not important, only that both parties have the correct message.