Genomics
12/9/09 11:41 AM
Endangered mammal

The sequence and de novo assembly of the giant panda genome

A giant panda.

Article on Nature

The giant panda, Ailuropoda melanoleuca, is at high risk of extinction because of human population expansion and destruction of its habitat. The latest molecular census of its population, using faecal samples and nine microsatellite loci, estimated only 2,500–3,000 individuals, confined to several small mountain habitats in Western China.

The giant panda has several unusual biological and behavioural traits, including a famously restricted diet, made up primarily of bamboo, and a very low fecundity rate. Moreover, the panda holds a unique place in evolution, and there has been continuing controversy about its phylogenetic position. At present, very little genetic information is available for the panda, even though such information is essential for a detailed understanding of its biology.

A major limitation in obtaining extensive genetic data is the prohibitive cost of sequencing and assembling large eukaryotic genomes. The development of next-generation massively parallel sequencing technologies, including the Roche/454 Genome Sequencer FLX Instrument, the ABI SOLiD System, and the Illumina Genome Analyser, has significantly improved sequencing throughput, reduced costs, and advanced research in many areas, including large-scale resequencing of human genomes, transcriptome sequencing, messenger RNA and microRNA expression profiling, and DNA methylation studies. However, the read length of these technologies, which is much shorter than that of traditional capillary Sanger sequencing, has prevented their use as the sole sequencing technology in de novo assembly of large eukaryotic genomes.

Here, using only Illumina Genome Analyser sequencing technology, we have generated and assembled a draft genome sequence for the giant panda with an assembled N50 contig size reaching 40 kilobases (kb), and an N50 scaffold size of 1.3 megabases (Mb). This represents the first, to our knowledge, fully sequenced genome of the family Ursidae and the second of the order Carnivora.
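
For readers unfamiliar with the N50 figures quoted above, the following is a minimal sketch of how the statistic is commonly computed: sort the contig (or scaffold) lengths from longest to shortest and report the length at which the running total first reaches half of all assembled bases. The function name n50 and the toy lengths are assumptions for illustration only; they are not taken from the panda assembly or from the authors' pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the largest length L such that sequences of length >= L
    together cover at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating covered bases.
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point issues with total / 2.
        if 2 * running >= total:
            return length


# Hypothetical contig lengths in base pairs (illustrative, not panda data).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # prints 70: 80 + 70 = 150, half of the 300 total
```

Read this way, a contig N50 of 40 kb means that half of the assembled bases lie in contigs at least 40 kb long, and the same reading applies to the 1.3 Mb scaffold N50.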

We also carried out several analyses using the complete sequence data, including genome content, evolutionary analyses, and investigation of some of the genetic features underlying the panda's unique biology. The work presented here should aid further research on the genetic basis of the panda's biology, and contribute to disease control and conservation efforts for this endangered species.

Furthermore, our demonstration that next-generation sequencing technology can allow accurate de novo assembly of the giant panda genome will have far-reaching implications for promoting the construction of reference sequences for other animal and plant genomes in an efficient and cost-effective way.
