Archive for the ‘Biology’ Category

DNA Hacks — More Bits per Basepair

Thursday, December 14th, 2006

Eric Kool (what a name, I wonder if he has a brother named Joe) at Stanford University has created a clever hack on DNA where, instead of storing the customary two bits per base pair, it can store three. He inserts a benzene ring into the chemical structure of the bases to create an “expanded” base set, increasing the set of bases from C, G, T, and A to include xC, xG, xT, and xA. So instead of being able to store just A-T/G-C pairs, a piece of DNA can now store xA-T, A-xT, xG-C, and G-xC combinations (within the expanded helix, x-x combinations and combinations with no x at all are disallowed due to spacing design rules imposed by the rigidity of the deoxyribose backbone). It’s like StrataFlash for your cell nucleus. Of course, there are no polymerases in the cell that can handle replicating these, and there are no metabolic pathways to synthesize these nucleotides, but Rome wasn’t built in a day either.
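To make the bit counting concrete, here is a minimal Python sketch. It assumes the simplest reading of the scheme: information per position is log2 of the number of letters one strand may carry, and the partner on the other strand is fully determined by the pairing rule described above. The names `xdna_partner` and `bits_per_pair` are made up for illustration and do not come from the Nature article.

```python
import math

NATURAL = ["A", "C", "G", "T"]
EXPANDED = ["xA", "xC", "xG", "xT"]

# Watson-Crick partners for the natural bases.
PARTNER = {"A": "T", "T": "A", "G": "C", "C": "G"}


def xdna_partner(base):
    """Partner under the rule in the post: exactly one base per pair is expanded."""
    if base.startswith("x"):
        # An expanded base pairs with the natural complement (xA-T, xG-C, ...).
        return PARTNER[base[1:]]
    # A natural base pairs with the expanded complement (A-xT, G-xC, ...).
    return "x" + PARTNER[base]


def bits_per_pair(alphabet):
    """Information per position, assuming every letter is equally usable."""
    return math.log2(len(alphabet))


print(bits_per_pair(NATURAL))             # 2.0
print(bits_per_pair(NATURAL + EXPANDED))  # 3.0
print(xdna_partner("xA"), xdna_partner("G"))  # T xC
```

Doubling the alphabet from four letters to eight buys exactly one extra bit per position, which is where the jump from two bits to three comes from.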

Okay, okay, so this wasn’t a Name that Ware–it’s coming soon, I promise, and it’s a pretty interesting one too, I think–but when I read the article in Nature, I thought it was just too cool not to write a short post about it. The thought that something as evolved and taken for granted as DNA can be improved upon is pretty exciting; there’s apparently a lot more to explore out there! Presumably, there is some marked downside to xDNA; otherwise, evolution would have picked up on it…perhaps the metabolic overhead of creating and maintaining all of these extra base pairs wasn’t worth the gain in coding efficiency. Small viruses could probably benefit from more coding density, but there’s that nasty interoperability problem of xDNA with regular DNA. Then again, evolution tends to get stuck in local minima, and perhaps xDNA is in fact superior but chance never lined up to put all the right factors together in a single cell to create a sustainable xDNA line. I wonder if there is some alien lifeform out there (or perhaps a yet undiscovered species on this good planet) that uses the xDNA coding scheme.

Here’s the image from the Nature article, which gives you a better idea of how this stuff works:


Saturday, May 13th, 2006

So I’ve been admonished in the past for posting ponderings and opinions on my blog–I guess the problem is that my comments are not a priori peer-reviewed, and it seems a lot like I’m just pontificating to an audience on my personal peeves. I think, however, that writing to the blog helps me organize my internal thoughts, and I enjoy the a posteriori commentary on my posts, which can be more embarrassing and candid than any private peer review. Well, either way, if you don’t like reading about my opinions or don’t want to be influenced by them, skip this post and the one after it.

I was reading Nature again (a lot of my pondering posts seem to start there!) and being a hacker as well as an armchair quarterback in molecular biology and genomics, I’m gently amused by the surprise that the genetics community is registering about the results from the Human Genome Project. Simply put, there was a prevailing notion that once we had the entire genetic sequence written out, we would crack the code on all sorts of diseases and be able to trace out the function of a cell–and perhaps the human body–from the ground truth of the genetic code.

However, for the past year I have read numerous articles that contain a phrase similar to this: researchers were surprised to find that having the source code told them nothing about how the network was configured. Or better yet, having the source code wasn’t useful because the code is self-modifying. Simply put, the Human Genome Project is like having the source code to your OS, but humans are complex networks of cellular machines; many diseases and problems arise from a failure of the network or a failure of the configuration of the OS, which is not apparent from the source code alone.

I guess, to some extent, it’s not surprising that biologists are peeling the onion instead of cutting through it. I remember back in college, I took a couple of molecular biology courses. It was interesting to see the approach of the typical pre-med/biology student toward biology: lots of rote memorization, with no attention at all to system design. It’s like trying to study computer architecture by memorizing the configuration of all the transistors in a standard cell library, without understanding why you’d use one element over another.

My personal experience is that there is a significant amount of architecture in biology. When people found out I had none of the organic chemistry or genetics prerequisites for the molecular biology class, they looked at me like I was crazy. However, I survived the class with relatively little studying, the difference being that I looked at molecular biology from a system standpoint. I tried to look for high-level patterns, and totally skipped memorizing the basic patterns–because for the tests, we were allowed to bring in an 8.5×11 sheet with notes. I wrote the basic organic chemistry operations on there, as well as the basic formulae and chemical reaction sequences I would need, so I didn’t have to memorize them. The class also focused a lot of its attention on the design of an experiment–how do you analyze a complex system and determine its features given a set of limited techniques? I remember we had a number of difficult questions about using radioactive carbon labeling to try and determine the metabolic path of a molecule. The techniques you use to design these experiments are very similar to those you use when reverse engineering a hardware system.

Epigenomics is a field that I think is very interesting and exciting, and is closing in on the idea of a “biological architect”. Epigenomics is the study of the tertiary and quaternary genetic code, to borrow terms from protein folding (okay, for you real biologists out there, I am really pushing it). It turns out that DNA is indeed self-modifying and carries information beyond the genetic code. For example, methyl (CH3) groups get added to bases in your DNA, which modifies the rate of gene expression from that segment of DNA. Also, DNA has a very complex 3-D structure. Those Hollywood views we have of DNA being this beautiful, perfect double-helix are eminently misleading. DNA is twisted upon itself, tied in knots, and bound up by histones (protein complexes that act like DNA katamari). Given that chemical machinery is essentially a mechanical computer, the 3-D morphology of a molecule is as much part of the programming as is its composition. So Epigenomics in my view should be the study of all the factors that aren’t coded in the genome–sort of like a study of all the different configurations of an OS and how they affect the race conditions, callbacks and stability of that OS. Stepping beyond that, we have the network context and ultimately the user behavior. A human cell is many orders of magnitude more complex than the internet, and a single cell is a far cry from a human being. We are a long way off from understanding the human genome and what it really means in the context of the human network, which means there will be a lot of interesting and exciting work for years to come.

And so I ponder on this beautiful, mellow Saturday afternoon in San Diego as I procrastinate on my long list of things to do…

Freakin’ Cool DNA Tricks

Saturday, March 18th, 2006

So tonight I took in my copy of Nature from the mail and opened it up…and saw smiley faces. I thought this was the coolest thing I had seen in a while: Paul W.K. Rothemund, a researcher at Caltech, has figured out a way to cause DNA to fold into arbitrary patterns (Nature calls it DNA origami). The patterns include, of all things, a nano-scale smiley face (I’m glad to see he has a sense of humor! The article actually has a number of other very interesting patterns that demonstrate the utter generality of the technique). And when I say nano-scale smiley, I mean a smiley face that’s about 100 nm across. Given that fabs struggle to print a single rectangular strip of polysilicon that is 65 nm wide, this is a really remarkable result. Although I haven’t had a chance to fully digest the article (after all, I’m reading this the evening of St. Patty’s day), the technique used is ingenious. They take the well-characterized 7-kilobase DNA sequence of the bacteriophage M13mp18 and use a combination of computer scripts to determine a set of oligonucleotides (short DNA sequences) that they can easily synthesize to “staple” the structure into the desired pattern. As far as I can tell, they take a gob of these genomes and mix it with these oligos in a test tube, shake, and voilà, they self-assemble into patterns–such as the smiley face below (pictures from the Nature article linked above–hope this is fair use…)

The scale bar on the top image is 100 nm(!). The mind boggles. Even though one can obviously see defects in the self-assembled arrays, they are so tiny that it sort of doesn’t matter. Five or six of these smiley faces easily fit onto a single small-sized transistor’s gate from one of the silicon die shown in the blog post below this one. And these are complex structures, not some simple rectangle of polysilicon. Fortunately, guys like André DeHon have thought about how to leverage such phenomenal small-scale integration through fault-tolerance techniques that map well into the computer domain. Oh, and I forgot to mention, the author of the article notes that there are many forms of chemically modified DNA available that could be incorporated into these scaffolds to enable even more wacky self-assembled features.

I think maybe it’s past time for me to move up the periodic table one row and get into Carbon hacking. The good work of guys like Rothemund, Knight, and Endy is bringing synthetic biology to a level where even I might be able to play with it and create something neat, if not just artistic and fun.

After sleeping on the thought for a night, I realized one of the really important aspects of this work is that they used an arbitrary DNA ring as the substrate for these structures…M13mp18 happens to be well known, but I suppose any fully sequenced DNA strand might also work. It’d be funny to see smiley faces made with my Y chromosome (despite being my smallest, gimpiest chromosome, it’s still 23 million base pairs…maybe a bit too big for this technique). At any rate, the key is that you only need to synthesize a couple hundred short segments of DNA (a relatively easy task) to pin down a very long segment of sequenced DNA (something that is currently hard to synthesize) into an arbitrary shape.
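The stapling idea boils down to base pairing, which can be sketched in a few lines of Python. This is a toy illustration only, not Rothemund’s actual design scripts: it ignores helical geometry, crossover placement, and everything else a real origami layout has to respect. The function names and the 16-base “scaffold” are made up for the example. The sketch just shows that a short oligo built from the reverse complements of two distant scaffold segments will pair with both, and so can pull them together.

```python
# Watson-Crick complements for the four natural bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Sequence of the strand that pairs antiparallel with `seq`."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def staple_for(scaffold, first, second, length):
    """Hypothetical helper: one oligo that pairs with two scaffold segments.

    The segments are `length` bases long and start at indices `first` and
    `second`. The halves are ordered as they would be if the two segments
    were adjacent on the scaffold.
    """
    seg_a = scaffold[first:first + length]
    seg_b = scaffold[second:second + length]
    return reverse_complement(seg_b) + reverse_complement(seg_a)


toy_scaffold = "ATGCCGTAAGGTTCCA"  # made-up sequence, not M13mp18
print(staple_for(toy_scaffold, 0, 8, 4))  # ACCTGCAT
```

A couple hundred of these, each chosen to tie together a different pair of spots on the long strand, is the “relatively easy” synthesis job mentioned above.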


Tuesday, March 29th, 2005

Plasmids. The mere mention of the word brings tingles to the spine of the bio-engineer. Little circles of DNA, waiting to be cleaved, replicated and spliced. One of my old friends, Melina Fan, started a non-profit called AddGene whose sole mission is the collection of these tiny morsels of life. I think she looks at it more from the biologist’s perspective, that being her background, but as an engineer and hacker, I look at it more as the start of a DigiKey or findchips of the bio-world. Well, it’s not yet a full-service broker for all parts bio-engineering, but maybe someday I’ll be able to order my cell membrane lipopolysaccharide coding sequences from them, along with my DNA polymerase kit, to hack a little local flavor into my favorite bio-sensor transducer bacteria. Someday…could be fun!