As their name implies, ultraconserved elements (UCEs) are highly conserved regions of organismal genomes shared among evolutionarily distant taxa - for instance, birds share many UCEs with humans. UCEs were first described in a wonderful manuscript by Gil Bejerano et al. (2004) from David Haussler’s group and subsequently identified in several classes of organisms outside the group of original taxa (Siepel et al. 2005) used to identify these genomic elements. The 28-way vertebrate genome alignment (Miller et al. 2007) identified additional regions of high conservation.
We have discovered (see Citations) that we can collect data from UCEs and the DNA adjacent to UCE locations (flanking DNA), and that these data are useful for reconstructing the evolutionary history and population-level relationships of many organisms. Because UCEs are conserved across disparate taxa, UCEs are also
That's an extremely good question, and one to which we do not entirely know the answer (Dermitzakis et al. 2005). UCEs have been associated with gene regulation (Pennacchio et al. 2006) and development (Sandelin et al. 2004, Woolfe et al. 2004), and we generally assume that UCEs must be important by the very nature of their near-universal conservation across extremely divergent taxa. However, gene knockouts of UCE loci in mice resulted in viable, fertile offspring (Ahituv et al. 2007), suggesting that their role in the biology of the genome may be cryptic.
You can identify UCEs in organismal genome sequences by aligning several genomes to each other, scanning the resulting genome alignments for areas of very high (95-100%) sequence conservation, and filtering on user-defined criteria, such as length (e.g., Bejerano et al. 2004). There are a number of different ways to do this, and we have detailed one approach to identifying UCEs for use as genetic markers and designing baits to target them (Faircloth 2017). A tutorial showing how to use this approach is available as part of the PHYLUCE documentation.
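The scan-and-filter idea can be sketched in a few lines of Python. This is only a toy illustration, not the approach described in Faircloth 2017 or the code in PHYLUCE: it assumes the genomes have already been aligned into equal-length strings, and the function names, the fixed sliding window, and the default thresholds are all hypothetical choices made for the example.

```python
def column_is_conserved(column):
    """Return True when every sequence shows the same unambiguous base."""
    bases = {base.upper() for base in column}
    return len(bases) == 1 and bases <= set("ACGT")


def conserved_regions(alignment, window=20, min_identity=0.95, min_length=60):
    """Find highly conserved stretches in a multiple sequence alignment.

    alignment    -- list of equal-length aligned sequences (hypothetical input)
    window       -- sliding-window size, in alignment columns
    min_identity -- fraction of identical columns a window must reach
    min_length   -- user-defined length filter applied to merged regions

    Returns a list of (start, end) half-open column intervals.
    """
    n_cols = len(alignment[0])
    if any(len(seq) != n_cols for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    # One boolean per column: is the column identical across all genomes?
    identical = [column_is_conserved(col) for col in zip(*alignment)]

    regions = []
    for start in range(n_cols - window + 1):
        end = start + window
        identity = sum(identical[start:end]) / window
        if identity < min_identity:
            continue
        # Merge windows that overlap or touch the previous region.
        if regions and start <= regions[-1][1]:
            regions[-1] = (regions[-1][0], end)
        else:
            regions.append((start, end))

    # Filter on length, as in the "user-defined criteria" step.
    return [(s, e) for s, e in regions if e - s >= min_length]


if __name__ == "__main__":
    # Three toy "genomes": two conserved blocks separated by variable columns.
    genomes = [
        "ACGTACGTAC" + "AAAAA" + "GGCCTTAAGGCC",
        "ACGTACGTAC" + "CCCCC" + "GGCCTTAAGGCC",
        "ACGTACGTAC" + "GGGGG" + "GGCCTTAAGGCC",
    ]
    print(conserved_regions(genomes, window=5, min_identity=1.0, min_length=10))
```

Real genome alignments are far larger and contain gaps, repeats, and missing taxa, so a production workflow would work from whole-genome alignment files rather than in-memory strings.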
From a set of UCEs shared among a taxonomic group, we design sequence capture (AKA solution hybrid selection sensu Gnirke et al. 2009) baits that are similar in sequence to the UCE loci we are targeting. These bait sets differ in number and composition, depending on the types of questions we are asking and the taxa with which we are working. Once we design a bait set, we follow sequence capture protocols to enrich DNA libraries for the target UCEs, usually in multiplex. Following enrichment, we sequence the DNA enriched for UCEs using massively parallel sequencing, usually on Illumina platforms.
The most complex part of using UCEs to understand evolutionary relationships, population structure, and population relationships is analyzing the DNA sequence data. We have created several software packages and we're working on tutorials to help get you started. Many of the steps, at this point, require that you are comfortable working with computer software on the command line. We encourage everyone interested to get the software and contribute to the effort of documenting, improving, and extending our computer code.
The manuscripts listed below are the primary citations establishing UCEs as useful phylogenomic markers, markers from which we can collect empirical data, markers we can use to infer shallow-level relationships, and markers that can do all of these things across a variety of vertebrate and invertebrate lineages.
Smith BT, Harvey MG, Faircloth BC, Glenn TC, Brumfield RT. 2014. Target Capture and Massively Parallel Sequencing of Ultraconserved Elements (UCEs) for Comparative Studies at Shallow Evolutionary Time Scales. Syst Biol 63:83-95. doi:10.1093/sysbio/syt061.
Faircloth BC, McCormack JE, Crawford NG, Harvey MG, Brumfield RT, Glenn TC. 2012. Ultraconserved Elements Anchor Thousands of Genetic Markers Spanning Multiple Evolutionary Timescales. Syst Biol 61:717-726. pmid:22232343. doi:10.1093/sysbio/sys004.
McCormack JE, Faircloth BC, Crawford NG, Gowaty PA, Brumfield RT, Glenn TC. 2012. Ultraconserved Elements Are Novel Phylogenomic Markers that Resolve Placental Mammal Phylogeny when Combined with Species Tree Analysis. Genome Res 22:746-754. pmid:22207614. doi:10.1101/gr.125864.111.
Below are several commercial laboratories offering UCE enrichment as a service. Generally speaking, these commercial vendors will accept DNA extracts for enrichment, conduct the library preparation and enrichment steps, sequence the enriched libraries, and return the sequence data to you. We do not derive any referral revenue from these companies, but we list each here to help interested labs get started.
Below are several probe designs that we have used to study relationships among amniotes/tetrapods (e.g. Crawford
You can now buy each of these probe sets direct from Arbor Biosciences in the form of a capture kit. Arbor Biosciences has even made a discounted "pilot" sized kit available for labs who want to do some test enrichments.
Below is the probe design that we used to understand relationships among the early diverging teleosts (Faircloth
You can now buy this probe set direct from Arbor Biosciences in the form of a capture kit. Arbor Biosciences has even made a discounted "pilot" sized kit available for labs who want to do some test enrichments.
Below are a number of bait designs targeting UCEs in different invertebrate groups. The bait designs described below derive from Faircloth 2017.
You can buy many of these probe sets direct from Arbor Biosciences in the form of capture kits. Arbor Biosciences has even made a discounted "pilot" sized kit available for labs who want to do some test enrichments.
Described as part of Faircloth 2017. First use has not yet been published.
Below are several software packages we have developed to help analyze data collected from UCE loci. All computer code is available under a flexible open-source license (BSD). We welcome all contributions, including improving the code, fixing bugs, improving usability, and improving the documentation, which is rather sparse at the moment. If you are interested in helping, please contact us through Twitter (@ultraconserved) and/or post an issue on GitHub for the respective package.
Note: All software packages are likely to contain bugs - use at your own risk.
Our main code repository for analyzing data collected from UCE loci. Contains command-line applications for assembling contigs from sequence data, finding which contigs align to UCEs, aligning UCE contigs, and preparing data for downstream analysis in mrbayes, raxml, and cloudforest.
Report an issue with phyluce.
A program for automated cleaning of fastq files from sequencing. Removes adapter contamination using scythe and trims reads for quality using sickle. Concatenates reads into an "interleaved" fastq.gz file for use with velvet.
Report an issue with illumiprocessor.
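For readers unfamiliar with the "interleaved" fastq.gz layout mentioned in the illumiprocessor description, the sketch below shows what interleaving means: each read 1 record is followed immediately by its read 2 mate in a single file. This is not illumiprocessor's code and does not reproduce its adapter or quality trimming; the function names and file paths are hypothetical, and the example assumes two gzipped FASTQ files with four lines per record and mates in the same order.

```python
import gzip
from itertools import islice, zip_longest


def read_fastq_records(handle):
    """Yield FASTQ records as lists of four lines (header, bases, '+', quals)."""
    while True:
        record = list(islice(handle, 4))
        if not record:
            return
        if len(record) != 4:
            raise ValueError("truncated FASTQ record")
        yield record


def interleave_fastq(read1_path, read2_path, out_path):
    """Write read 1 / read 2 mates alternately into one gzipped FASTQ file.

    Returns the number of read pairs written.
    """
    pairs = 0
    with gzip.open(read1_path, "rt") as r1, \
            gzip.open(read2_path, "rt") as r2, \
            gzip.open(out_path, "wt") as out:
        records = zip_longest(read_fastq_records(r1), read_fastq_records(r2))
        for rec1, rec2 in records:
            # zip_longest pads the shorter file with None, which signals
            # that the two inputs do not contain the same number of reads.
            if rec1 is None or rec2 is None:
                raise ValueError("read 1 and read 2 files differ in length")
            out.writelines(rec1)
            out.writelines(rec2)
            pairs += 1
    return pairs
```

A call such as `interleave_fastq("sample_R1.fastq.gz", "sample_R2.fastq.gz", "sample_interleaved.fastq.gz")` (file names invented for illustration) would produce one file in which lines 1-4 are the first read 1 record and lines 5-8 are its mate.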