Sandwalk: Educating an Intelligent Design Creationist: Rare Transcripts

Thursday, April 11, 2013

Educating an Intelligent Design Creationist: Rare Transcripts

I'm replying to a post by andyjones (More and more) Function, the evolution-free gospel of ENCODE. This was the fourth post in a series and I'm working my way through five issues that Intelligent Design Creationists need to understand.

Educating an Intelligent Design Creationist: Introduction
Educating an Intelligent Design Creationist: Pervasive Transcription

Andyjones says he didn't know that many of the unusual transcripts are very rare. That's a shame because it's one of the very important things you need to know in order to have an intelligent opinion about junk DNA. Here's a question from andyjones ...

The second point is interesting, but I have to ask the question: given the fact that we don’t know everything about the genome, isn’t it precisely those parts that are rarely transcribed that would give most difficulty when it comes to determining their functions?

The simple answer to your question is "yes" but that doesn't mean we don't have clues. The best explanation depends on how rare the transcripts are and on whether there's another, equally reasonable, explanation that accounts for their existence. What we can say right now is that the presence of these rare transcripts is consistent with junk DNA. We can also say that there's no reasonable functional explanation for huge numbers of transcripts that are present at less than one copy per cell. Think about that for a minute. It means that right now there are only two scientifically reasonable explanations: (1) junk DNA/RNA, and (2) we don't know if they have a function. It is scientifically incorrect to say that these transcribed regions are functional and therefore junk DNA is refuted.¹

Let's review the evidence for transcript abundance. The number of specific mRNA molecules per mammalian cell ranges from tens of thousands to several hundred to ten or fewer. These three classes are quite common. It's been known since experiments in the 1970s that there are very few highly abundant mRNAs (e.g. ovalbumin in oviducts, globin mRNA in erythrocytes). Most fall into the intermediate class and a small number are low abundance mRNAs. It's unlikely that a steady-state level of only a few mRNAs could support enough protein synthesis to make much of a difference but it can't be ruled out. It's even more unlikely that such rare transcripts could perform a regulatory function since they would have to be ten times more abundant than the mRNA they regulate.

One of the most widely read discussions of transcript abundance comes from a paper written by my colleagues Ben Blencowe and Tim Hughes (van Bakel et al. 2010). They have a nice figure to illustrate the difference between the mass of RNA being analyzed and its complexity (= the number of different sequences). In this case they are looking at poly A⁺ RNA, which is supposed to be almost exclusively mRNA that encodes protein. [I wrote up some things about this paper when it first came out: Junk RNA or Imaginary RNA.]

You would expect that the bulk of this RNA would correspond to exons and that's exactly what they find. In the experiment with human RNA they show that 88% of the mass of RNA is transcribed from exons. That figure is also 88% in the mouse experiment. Now look at the parts of the genome that are covered by these preparations of RNA. That's shown on the right in the figure below. In this case, 51% of the sequences represented in the preparation are complementary to introns. There shouldn't be any introns in mRNA so this represents mostly contamination or artifact. What it says is that 5.8% of the bulk RNA (red bars) covers more than half of the total complexity of the RNA preparation.

Each individual bit of intron RNA is extremely rare. The intron fraction is high complexity, low abundance.

The other three fractions (EST exon, EST intron, and other) reveal a similar pattern. These are unannotated gene sequences that are most likely junk DNA. They make up 6.4% of the bulk of RNA (human) but account for 26% of the total genome coverage. These are very rare bits of RNA from all over the genome. They are probably present at much less than one copy per cell.
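To make the "high complexity, low abundance" point concrete, here is a rough back-of-the-envelope sketch using only the human percentages quoted above. It divides each fraction's share of RNA mass by its share of genome coverage. The exon coverage figure is not quoted in this post, so the sketch infers it by subtraction, which is an assumption, and the function and variable names are made up for illustration.

```python
# Human poly A+ figures quoted in the post (van Bakel et al. 2010).
# Each entry: (percent of RNA mass, percent of genome coverage).
fractions = {
    "intron": (5.8, 51.0),
    "EST exon + EST intron + other": (6.4, 26.0),
}

def mass_per_coverage(mass_pct, coverage_pct):
    """Percent of RNA mass contributed per percent of genome coverage."""
    return mass_pct / coverage_pct

# ASSUMPTION: exon coverage is not given in the post; it is inferred here
# as whatever is left after the intron and "other" fractions.
exon_coverage = 100.0 - sum(cov for _, cov in fractions.values())
exon_density = mass_per_coverage(88.0, exon_coverage)

for name, (mass, cov) in fractions.items():
    density = mass_per_coverage(mass, cov)
    fold = exon_density / density
    print(f"{name}: {density:.2f} mass% per coverage%, "
          f"roughly {fold:.0f}-fold below exons")
```

The absolute numbers mean little on their own, but the ratio shows how thinly the non-exon RNA is spread over the regions it covers.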

If one looks at poly A⁺ RNA from many different tissues, as the ENCODE project did, then what you find is that you eventually begin to saturate the known protein-encoding genes but the rare transcripts from intergenic regions continue to cover more and more of the genome until, eventually, it looks like almost all of the genome is transcribed in one tissue or another. This is "pervasive transcription."

As van Bakel et al. (2010) point out ...

... the fact that such pervasive transcription would only be detected at sequencing depths more than two orders of magnitude above current levels suggests that these transcripts may largely be attributed to biological and/or technical background. Indeed, the vast majority of intergenic and intronic seqfrags have very low sequence coverage (Figure 2E, 2F), exemplified by the fact that 70% (human) to 80% (mouse) of the transcribed area in these regions is detected by a single RNA-Seq read in only one sample, much of which is consistent with random placement.

If you can only detect one single transcript of a particular region then transcription of that part of the genome must be exceedingly rare.
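The "consistent with random placement" idea can be illustrated with a toy model. This is not the analysis van Bakel et al. performed. It assumes equal-sized regions and read counts per region that follow a Poisson distribution, and the function name is invented for this sketch. It asks: of the regions detected at all, what fraction were hit by exactly one read?

```python
import math

def single_read_fraction(mean_reads):
    """Among regions hit by at least one read, the fraction hit exactly once,
    assuming read counts per region are Poisson with the given mean."""
    p_one = mean_reads * math.exp(-mean_reads)
    p_at_least_one = 1.0 - math.exp(-mean_reads)
    return p_one / p_at_least_one

# Sparse random placement (mean well below one read per region) leaves
# most detected regions as singletons; dense coverage does not.
for mean_reads in (0.1, 0.5, 1.0, 3.0):
    frac = single_read_fraction(mean_reads)
    print(f"mean reads per region {mean_reads}: "
          f"{frac:.2f} of detected regions have a single read")
```

In this toy model, a high proportion of single-read regions is what you expect when reads land sparsely and at random, which is the pattern the quoted passage describes for intergenic and intronic regions.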

This does not mean that all intergenic transcription is rare because van Bakel et al. (2010) go to great lengths to identify some 16,000 sites where transcription is frequent enough to suggest functionality. However, this is only a small fraction of the genome. The rest of the transcripts are consistent with models of random transcription.

The same rationale applies to bulk RNA (i.e. not poly A⁺ RNA) except that you detect a lot more intergenic transcription.

Because our genomes have introns, Alu elements, and endogenous retroviruses, these things must be doing us some good. Because a region is transcribed, its transcript must have some fitness benefit, however remote. Because residue N of protein P is leucine in species A and isoleucine in species B, there must be some selection-based explanation. This approach enshrines “panadaptationism,” which was forcefully and effectively debunked by Gould and Lewontin in 1979 but still informs much of molecular and evolutionary genetics, including genomics.

Ford Doolittle (2013)

The authors suggest that the observed results can be more satisfactorily explained by accidental or spurious transcription leading to rare transcription of an intergenic region of the genome. ("Pervasive transcription of intergenic regions as described in previous studies occurs at a significantly reduced level and is of a random character.")

They also suggest that the appropriate null model in these studies should be accidental transcription ("To be conservative, a null hypothesis should perhaps be that novel transcripts—particularly those that are small and low-abundance—are a by-product rather than an independent functional unit. Searching for phenotypes caused by genetic perturbation may be the most useful approach to disproving the null hypothesis.") The onus should be on those who claim function to support their case. It's not up to the opponents of pervasive functional transcription to prove lack of function. Function is not the default option as long as you understand that transcription is not perfect.²

There's an important point here. It's not sufficient just to show that one's RNA prep covers a large part of the genome. You also have to include quantitative data so the result can be realistically evaluated. When van Bakel et al. (2010) challenge "pervasive transcription" they are not challenging the data that show RNA hybridization to the bulk of the genome. What they are challenging is the interpretation, because much of this coverage is consistent with rare, random transcription. That's not "pervasive transcription" in their minds.

John Mattick challenged the conclusions of van Bakel et al. (2010) (Clark et al. 2011) and my colleagues responded (van Bakel et al. 2011). This debate is well known³ but the controversy is completely ignored in the summary papers from the ENCODE project last September.

Some opponents of junk DNA (i.e. proponents of function for all/most transcripts) are aware of the problem and own up to the difficulty of defining function. I want to close with a quotation from Willingham and Gingeras (2006) to prove that good scientists discuss both sides of a controversy.

Noncoding RNAs and Their Functions

A key question hangs like an ominous cloud over these observations of widespread transcription: are these transcripts biologically functional, or are they the transcriptional noise of a less than precise set of biological processes? Recent experiments in mice in which megabase “gene desert” regions have been deleted underscore the relevance of this question. Deletion of 1.5 Mb and 0.8 Mb genomic intervals, which together contain 1243 noncoding sequences conserved between rodent and primate, resulted in viable mice with no obvious deleterious phenotypes (Nobrega et al., 2004). However, if history is our guide, then the answer to this question may be complex.

1. That's why The Myth of Junk DNA is not a science book. It's basically an argument from ignorance where "we don't know" is translated to mean "it must be functional."

2. This concept is not new. Michael White wrote about the proper null hypothesis some years ago in Genomic Junk And Transcriptional Noise. I elaborated a little bit in my blog post about his paper [see How to Frame a Null Hypothesis].

3. Jonathan Wells devotes several pages to attacking the reputation of Hughes and Blencowe and their colleagues.

Clark, M.B., Amaral, P.P., Schlesinger, F.J., Dinger, M.E., Taft, R.J., Rinn, J.L., Ponting, C.P., Stadler, P.F., Morris, K.V. and Morillon, A. (2011) The reality of pervasive transcription. PLoS Biology 9, e1000625. [doi: 10.1371/journal.pbio.1000625]

Doolittle, W.F. (2013) Is junk DNA bunk? A critique of ENCODE. Proc. Natl. Acad. Sci. (USA) 110:5294-5300. [doi: 10.1073/pnas.1221376110]

Gould, S.J. and Lewontin, R.C. (1979) The spandrels of San Marco and the Panglossian paradigm: a critique of the adaptationist programme. Proc. Royal Soc. (London) Series B. Biological Sciences 205:581-598.

Willingham, A.T. and Gingeras, T.R. (2006) TUF love for “junk” DNA. Cell 125:1215-1220. [doi: 10.1016/j.cell.2006.06.009]

van Bakel, H., Nislow, C., Blencowe, B.J. and Hughes, T.R. (2010) Most “dark matter” transcripts are associated with known genes. PLoS Biology 8(5): e1000371. [doi: 10.1371/journal.pbio.1000371]

van Bakel, H., Nislow, C., Blencowe, B.J. and Hughes, T.R. (2011) Response to “the reality of pervasive transcription”. PLoS Biology 9, e1001102. [doi: 10.1371/journal.pbio.1001102]

14 comments :

Mikkel Rumraket Rasmussen said...: I think you have mistyped "we don't know" in your 1. footnote.; Thursday, April 11, 2013 11:23:00 AM
Mikkel Rumraket Rasmussen said...: Anyway, very informative post Larry, thank you. Not just ID proponents or creationists are receiving education here.; Thursday, April 11, 2013 11:24:00 AM
Faizal Ali said...: Creationists really have a problem with the concept of the null hypothesis, don't they? Theists do in general, come to think of it.; Thursday, April 11, 2013 11:40:00 AM
Diogenes said...: Is that Willingham and Gineras or Gingeras?; Thursday, April 11, 2013 12:00:00 PM
Diogenes said...: Larry, have you interpreted the bar graph correctly?

Now look at the parts of the genome that are covered by these preparations of RNA. That's shown on the right in the figure below. In this case, 51% of the sequences represented in the preparation are complementary to introns. There shouldn't be any introns in mRNA so this represents mostly contamination or artifact. What it says is that 3% of the bulk RNA (red bars) covers more than half of the total complexity of the RNA preparation.

If by red bars you mean orange bars, it looks like 5.8% to me.

And this:

The other three fractions (EST exon, EST intron, and other) reveal a similar pattern. These are unannotated gene sequences that are most likely junk DNA. They make up 3% of the bulk of RNA (human) but cover 26% of the total genome coverage.

To me it looks like 3.3% + 0.9% + 2.2% = 6.4%.; Thursday, April 11, 2013 12:04:00 PM
Diogenes said...: Jonathan Wells devotes several pages to attacking the reputation of Hughes and Blencowe and their colleagues.

Ooh ooh ooh! Can you copy in some choice bits? I love bitchy Wells ad hominems!; Thursday, April 11, 2013 12:05:00 PM
Larry Moran said...: Thanks.; Thursday, April 11, 2013 12:21:00 PM
Larry Moran said...: Is that a rhetorical question? :-)

Thanks.; Thursday, April 11, 2013 12:23:00 PM
Larry Moran said...: Thanks. I fixed that.; Thursday, April 11, 2013 12:23:00 PM
Donald Forsdyke said...: What could be the role of rare transcripts? Following up on my previous comment, one host RNA molecule with a sequence complementary to a viral transcript (enough for two helical turns), should suffice to activate intracellular alarms (e.g. see Marcus, P. (1983) Interferon induction by viruses: one molecule of dsRNA as the threshold for induction. Interferon 5, 115-180).

The interferon signal can alert other cells not yet infected by the virus. Thus, even if, on average, there were less than one specific antiviral host RNA molecule/cell, if the initially infected cell had that RNA, the host would be alerted. This could be adaptive (http://post.queensu.ca/~forsdyke/theorimm4.htm).; Thursday, April 11, 2013 12:55:00 PM
Diogenes said...: That's interesting, but if only two helical turns are needed, then that would be a small fraction of the nucleotides within pervasive transcription.; Thursday, April 11, 2013 1:01:00 PM
Donald Forsdyke said...: For the actual calculation see section 13 of our 2001 paper at http://post.queensu.ca/~forsdyke/EBV.htm; Thursday, April 11, 2013 2:18:00 PM
Larry Moran said...: Don says,

one host RNA molecule with a sequence complementary to a viral transcript (enough for two helical turns), should suffice to activate intracellular alarms

Have you done some calculations to see how long it would take for this single RNA molecule to find the viral transcript in a typical mammalian cell with probably a BILLION other RNA binding sites? I'm thinking many days at 37°C.; Thursday, April 11, 2013 5:11:00 PM
Donald Forsdyke said...: Perhaps someone out there would attempt the calculation, taking into account the crowded nature of the cytosol, which the hand of Nature is likely to have optimized (pH, salt concentration, etc.) to favour what is perhaps the kinetically most important reaction in cells - that between tRNA anticodon loops and mRNA codons.

When we biochemists carry out RNA-RNA hybridizations in our plastic Eppendorf tubes, we try to simulate these reaction conditions by adding crowding agents, such as polyethylene glycol, and fine-tuning pH and salt concentrations. How closely the results obtained from such systems correspond to what would obtain in real cells is problematic. Fancy biophysical techniques (FRET analysis) are beginning to cast light on this, but pending such studies I vote for minutes or hours, not days!; Thursday, April 11, 2013 6:40:00 PM

