Tuesday, July 29, 2014

The most important rule for publishing a paper on alternative splicing

I'm not a big fan of alternative splicing. I think it falls into the same category as pervasive transcription: most of it is accidental [see Alternative Splicing and Why IDiots Don't Understand How Science Works and A Challenge to Fans of Alternative Splicing]. The error rate for splicing is known to be high [Splicing Error Rate May Be Close to 1%].

I just read a paper about alternative splicing in Science (Lo et al., 2014) and it annoys me that a key piece of data was left out. The missing data is the abundance of the rare transcripts that are presumed to be genuine alternatively spliced variants. Are they present at more than one copy per cell?

We need to know this in order to decide whether the detection of alternatively spliced variants is biologically significant. You should not be able to publish a paper on this topic without presenting your data on relative and absolute abundance [see Extraordinary Claims about Human Genes and How to Evaluate Genome Level Transcription Papers]. Surely this is obvious now that we have just been through the ENCODE publicity hype disaster.


Lo, W.-S., Gardiner, E., Xu, Z., Lau, C.-F., Wang, F., Zhou, J. J., Mendlein, J. D., Nangle, L. A., Chiang, K. P., Yang, X.-L., Au, K.-F., Wong, W. H., Guo, M., Zhang, M., and Schimmel, P. (2014) Human tRNA synthetase catalytic nulls with diverse functions. Science 345:328-332. [doi: 10.1126/science.1252943]

7 comments :

Anonymous said...

If it was present at about 1 copy/cell and this was not due to NMD, couldn't this be functional? I would think it could produce tens of thousands of proteins before it was degraded.
What's the lowest abundance for a known functional mRNA? And I vaguely recall that there are proteins present in less than 100 copies/cell. Is this correct?

Peter said...

They have Western blots for at least some of the splice variants showing that they're expressed at levels detectable by Coomassie staining. From the materials and methods you can see they started from 10^7 cells (well it says 107 cells, but I assume that's a typesetting error).

So, let's work it out. The detection limit for Coomassie is ~10 ng.

1 g is 6 x 10^23 daltons (Avogadro's number)
10 ng is 6 x 10^23 x 10^-8 = 6 x 10^15 daltons

The proteins they're looking at are a few tens of kilodaltons, so say 6 x 10^5 daltons to make the numbers easy. That means the band detected by Western blot represents a minimum of 1 x 10^10 protein molecules.

Given a starting input of 10^7 cells, there must be a minimum of 1000 copies of the alternative splice variant protein per cell, assuming the immunoprecipitation, Western blot and subsequent staining were all 100% perfectly efficient.
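The estimate above can be sketched in a few lines of Python so the inputs are easy to vary. This is only an illustration of the arithmetic in this comment: the function name and parameter names are made up for the example, the inputs (10 ng detection limit, 6 x 10^5 daltons per protein, 10^7 cells) are the round figures quoted above, and perfect recovery at every step is assumed, as stated.

```python
# Rounded value of Avogadro's number, as used in the comment (daltons per gram).
DALTONS_PER_GRAM = 6e23


def min_copies_per_cell(detection_limit_g, protein_mass_da, n_cells):
    """Lower bound on protein copies per cell implied by a visible band.

    Hypothetical helper for illustration; assumes 100% efficiency for the
    immunoprecipitation, blotting and staining steps.
    """
    band_mass_da = detection_limit_g * DALTONS_PER_GRAM  # mass of the band in daltons
    molecules_in_band = band_mass_da / protein_mass_da   # number of protein molecules
    return molecules_in_band / n_cells                   # spread over the input cells


# Figures from the comment: 10 ng, 6 x 10^5 Da per protein, 10^7 cells.
copies = min_copies_per_cell(10e-9, 6e5, 1e7)
print(f"minimum copies per cell: {copies:,.0f}")  # about 1,000
```

The result scales inversely with the assumed protein mass, so a protein of 6 x 10^4 daltons (60 kDa) would give a minimum ten times higher with the same detection limit and cell number.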

Larry Moran said...

It will always be possible to argue for function no matter what level is detected. However, if there is less than one copy per cell, it's not very likely that an RNA molecule has a biological function.

Larry Moran said...

Yes. They looked for the variant proteins in Jurkat cells (T cell cancer cells). They found some of them at about 5% of the concentration of the normal aminoacyl tRNA synthetases. If your calculation is correct then each cell would contain 20,000 copies of each of the twenty synthetases. That doesn't seem very likely but maybe these are big cells that make lots and lots of proteins at a high rate.

Nevertheless, the supplemental material does contain some information on expression levels of the alternative transcripts and the predicted polypeptides. None of these values inspires confidence that they are looking at anything other than accidental splicing and defective polypeptides.

It would be interesting to see if the same set of transcripts was produced in other mammals. As a general rule, alternative splice patterns are not conserved.

SPARC said...

Based on my own experience, I am skeptical about the importance of alternative splicing. In particular, I doubt that many peptides encoded by alternatively spliced transcripts possess relevant functions.
Back when PCR was new, I did Northern blots, like everybody else, to detect sparc mRNA. I never saw anything but the two mRNAs that differ in length because two alternative polyA signals are used. According to ENSEMBL, the human sparc gene expresses 9 different mRNAs, 5 of which contain ORFs that would result in proteins of 303, 115, 149, 111 and 53 amino acids, respectively. Of the others, one is processed but has no ORF and the rest contain retained introns. Unfortunately, I don't have any data on the abundance of these transcripts. I am not aware that anybody has ever purified anything other than the protein originally described as sparc/BM-40/osteonectin, and people have purified tons of it because it is heavily expressed by the murine EHS tumor and is one of the most abundant proteins in bone. Even if all of the above-mentioned mRNAs were translated, what would the expressed proteins look like? I currently cannot say whether the shorter ORFs would encode single domains of sparc. While there is some correlation between the structure of extracellular matrix proteins like sparc and the exon-intron structure of the underlying genes, many other proteins will just fold into something completely different if the sequence encoded by a skipped exon is missing from the final protein.
Additional observations that should raise doubts about the relevance of peptides encoded by alternatively spliced mRNAs come from rescue experiments. In many model organisms, full knockouts of genes are available, and their phenotypes can be rescued by expressing the respective cDNA. The latter will, of course, only express the full-length protein. If proteins encoded by alternatively spliced transcripts were broadly relevant, the wild-type phenotype should regularly fail to be fully restored in such cells. Since this is not the case, it seems unlikely that many peptides translated from alternative transcripts play important roles.

SPARC said...

BTW, are alternative splicing databases curated to exclude or remove transcripts prone to NMD?

SPARC said...

Are you aware of the following paper:
Gonzàlez-Porta M, Frankish A, Rung J, Harrow J, Brazma A. Transcriptome analysis of human tissues and cell lines reveals one dominant transcript per gene. Genome Biol. 2013 Jul 1;14(7):R70