Among the Eucalyptus species, E. camaldulensis, known as river red gum, is naturally distributed across most of the Australian mainland and is planted in many tropical and subtropical countries (Butcher PA, et al., 2002). Because it is diploid (2n=22) and amenable to Agrobacterium-mediated genetic transformation (Mullins KV, et al., 1997), E. camaldulensis is well suited to molecular genetic analysis and genetic engineering. To quickly survey the genetic information carried by this plant and to accelerate molecular breeding, we analyzed the structure of the whole genome of E. camaldulensis (Hirakawa H, et al., 2011).
The genetic information in the genome of E. camaldulensis was investigated by sequencing the genome and cDNA using a combination of the conventional Sanger method and next-generation sequencing methods, followed by intensive bioinformatics analyses. The non-redundant genomic sequences thus obtained had a total length of 655,922,307 bp and consisted of 81,246 scaffolds and 121,194 singlets. These sequences accounted for approximately 92% of the gene-containing regions, with an average G+C content of 33.6%. A total of 77,121 complete and partial structures of protein-encoding genes were deduced.
- References
- Butcher PA, et al. (2002) Heredity 88: 402–412. [Link]
- Mullins KV, et al. (1997) Plant Cell Rep. 16: 787–791. [Link]
- Hirakawa H, et al. (2011) Plant Biotechnology 28: 471–480. [Link]
- Accession numbers
- Sanger sequences
- ESTs: FY782538-FY841121 (58,584 entries)
- BACs: BADO01000001-BADO01274001 (274,001 entries)
- Roche GS FLX Titanium
- Paired-end (3kb): DRA000466
- Paired-end (8kb): DRA000467
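The entry counts quoted for the Sanger accession ranges can be cross-checked from the range endpoints alone. The following is a minimal sketch, assuming each accession is a run of uppercase letters followed by digits and that the range is inclusive and contiguous; the function name `accession_range_size` is hypothetical and not part of any database tool.

```python
import re

# Assumed accession shape: uppercase letter prefix followed by digits,
# e.g. "FY782538" or "BADO01000001" (illustrative pattern only).
_ACCESSION = re.compile(r"([A-Z]+)(\d+)")


def accession_range_size(accession_range: str) -> int:
    """Count entries in an inclusive range such as 'FY782538-FY841121'."""
    first, last = accession_range.split("-")
    m_first = _ACCESSION.fullmatch(first)
    m_last = _ACCESSION.fullmatch(last)
    if m_first is None or m_last is None:
        raise ValueError(f"unrecognized accession in {accession_range!r}")
    if m_first.group(1) != m_last.group(1):
        raise ValueError("range endpoints have different prefixes")
    start, end = int(m_first.group(2)), int(m_last.group(2))
    if end < start:
        raise ValueError("range end precedes range start")
    # Inclusive range: both endpoints are counted.
    return end - start + 1


if __name__ == "__main__":
    for label, rng in [
        ("ESTs", "FY782538-FY841121"),
        ("BACs", "BADO01000001-BADO01274001"),
    ]:
        print(f"{label}: {accession_range_size(rng):,} entries")
```

For the two ranges listed above, this arithmetic gives 58,584 and 274,001 entries respectively.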
Assembly statistics
Statistics of the assembly of EUC_r1.0
Scaffolds | |
---|---|
Total length (bp) | 624,468,648 |
Total number | 81,246 |
Average length (bp) | 7,686 |
Maximum length (bp) | 708,721 |
N50 (bp) | 18,024 |
G+C content (%) | 33.6 |
Singlets | |
Total length (bp) | 119,756,130 |
Total number | 121,194 |
Average length (bp) | 988 |
Maximum length (bp) | 34,761 |
G+C content (%) | 39.4 |
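The quantities in the table above (total length, number of sequences, average and maximum length, N50, G+C content) can be recomputed from any set of sequences. The sketch below is an illustration under stated assumptions, not the pipeline used for EUC_r1.0: N50 is taken in its common sense (the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total length), and G+C content is computed over unambiguous A, C, G and T bases only. The function name `assembly_stats` is hypothetical.

```python
from typing import Dict, Iterable


def assembly_stats(sequences: Iterable[str]) -> Dict[str, float]:
    """Summarize a set of sequences with the metrics used in the table."""
    seqs = [s.upper() for s in sequences]
    if not seqs:
        raise ValueError("no sequences given")

    lengths = sorted((len(s) for s in seqs), reverse=True)
    total = sum(lengths)

    # N50: walk from the longest sequence down until half the total is covered.
    running = 0
    n50 = 0
    for length in lengths:
        running += length
        if running * 2 >= total:
            n50 = length
            break

    # G+C content over unambiguous bases (assumption: N and other
    # ambiguity codes are excluded from the denominator).
    gc = sum(s.count("G") + s.count("C") for s in seqs)
    acgt = gc + sum(s.count("A") + s.count("T") for s in seqs)
    gc_percent = 100.0 * gc / acgt if acgt else 0.0

    return {
        "total_length": total,
        "total_number": len(lengths),
        "average_length": total / len(lengths),
        "maximum_length": lengths[0],
        "n50": n50,
        "gc_percent": gc_percent,
    }


if __name__ == "__main__":
    toy = ["A" * 6, "C" * 5, "G" * 4, "T" * 3, "AC"]
    for key, value in assembly_stats(toy).items():
        print(f"{key}: {value}")
```

As a consistency check on the table itself, dividing each total length by the corresponding total number reproduces the reported averages (about 7,686 bp for scaffolds and 988 bp for singlets).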
Sequencing strategy