What are UCEs?

As their name implies, ultraconserved elements (UCEs) are highly conserved regions of organismal genomes shared among evolutionarily distant taxa; for instance, birds share many UCEs with humans. UCEs were first described in a wonderful manuscript by Gil Bejerano et al. (2004) from David Haussler’s group and subsequently identified in several classes of organisms outside the group of original taxa (Siepel et al. 2005) used to identify these genomic elements. The 28-way vertebrate genome alignment (Miller et al. 2007) identified additional regions of high conservation.

Why are UCEs useful?

We have discovered (see Citations) that we can collect data from UCEs and the DNA adjacent to UCE locations (flanking DNA), and that these data are useful for reconstructing the evolutionary history and population-level relationships of many organisms. Because UCEs are conserved across disparate taxa, UCEs are also universal genetic markers in the sense that the locations (or loci) that we can target in humans are identical, in many cases, to the loci that we can target in ducks or snakes or lizards.

What do UCEs do?

That's an extremely good question, and one to which we do not entirely know the answer (Dermitzakis et al. 2005). UCEs have been associated with gene regulation (Pennacchio et al. 2006) and development (Sandelin et al. 2004, Woolfe et al. 2005), and we generally assume that UCEs must be important by the very nature of their near-universal conservation across extremely divergent taxa. However, gene knockouts of UCE loci in mice resulted in viable, fertile offspring (Ahituv et al. 2007), suggesting that their role in the biology of the genome may be cryptic.

How do I identify UCEs?

You can identify UCEs in organismal genome sequences by aligning several genomes to each other, scanning the resulting genome alignments for areas of very high (95-100%) sequence conservation, and filtering on user-defined criteria, such as length (e.g., Bejerano et al. 2004). If you want to use these regions as genetic markers, it is best to remove UCEs that appear to be duplicates of one another, which we loosely define as being in more than one spot within each genome that you aligned. The resulting loci are the highly conserved regions that we target for use as molecular markers.
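The scanning and length-filtering steps can be illustrated with a minimal sketch. It assumes the genome alignment is already available as equal-length aligned strings and handles only the strictest case (100% identity in every column); the function name `conserved_runs` and the `min_length` parameter are hypothetical and are not taken from any of our software. Duplicate removal is not shown.

```python
from typing import List, Tuple


def conserved_runs(aligned: List[str], min_length: int) -> List[Tuple[int, int]]:
    """Return half-open (start, end) column intervals in which every aligned
    sequence carries the same base (no gaps, no Ns) for at least
    `min_length` consecutive columns."""
    if not aligned:
        return []
    width = len(aligned[0])
    if any(len(seq) != width for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")

    runs = []
    start = None  # first column of the run currently being extended
    for col in range(width):
        column = {seq[col].upper() for seq in aligned}
        identical = len(column) == 1 and column.isdisjoint({"-", "N"})
        if identical:
            if start is None:
                start = col
        else:
            # The run just ended; keep it only if it passes the length filter.
            if start is not None and col - start >= min_length:
                runs.append((start, col))
            start = None
    # Close a run that extends to the end of the alignment.
    if start is not None and width - start >= min_length:
        runs.append((start, width))
    return runs


# Toy alignment of three "genomes"; column 7 differs among them.
toy = [
    "ACGTACGTTTGACCA",
    "ACGTACGATTGACCA",
    "ACGTACG-TTGACCA",
]
print(conserved_runs(toy, min_length=5))
```

Relaxing the identity requirement toward 95% would mean scoring windows of columns rather than demanding that every column match, which this sketch does not attempt.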

How do I collect UCE data?

From the resulting set of UCEs shared among a taxonomic group, we design sequence capture (AKA solution hybrid selection sensu Gnirke et al. 2009) probes that are similar in sequence to the UCE loci we are targeting. These probe sets differ in number and composition, depending on the types of questions we are asking and the taxa with which we are working. Once we design a probe set, we follow sequence capture protocols to enrich DNA libraries for the target UCEs, usually in multiplex. Following enrichment, we sequence the DNA enriched for UCEs using massively parallel sequencing.

Get protocols »

How do I analyze UCE data?

The most complex part of using UCEs to understand evolutionary relationships, population structure, and population relationships is analyzing the DNA sequence data. We have created several software packages and we're working on tutorials to help get you started. Many of the steps, at this point, require that you are comfortable working with computer software on the command line. We encourage everyone interested to get the software and contribute to the effort of documenting, improving, and extending our computer code.

Get computer software »


We compiled a list of questions you may have that go beyond the details provided above. If you do not find your answer in the FAQ, please get in contact with us via Twitter (@ultraconserved).

Frequently asked questions »


Smith BT, MG Harvey, BC Faircloth, TC Glenn, RT Brumfield. 2014. Target Capture and Massively Parallel Sequencing of Ultraconserved Elements (UCEs) for Comparative Studies at Shallow Evolutionary Time Scales. Syst Biol 63:83-95. doi:10.1093/sysbio/syt061.


Faircloth BC, Sorenson L, Santini F, Alfaro ME. 2013. A Phylogenomic Perspective on the Radiation of Ray-Finned Fishes Based upon Targeted Sequencing of Ultraconserved Elements (UCEs). PLoS ONE 8: e65923. doi:10.1371/journal.pone.0065923.


McCormack JE, Harvey MG, Faircloth BC, Crawford NG, Glenn TC, Brumfield RT. 2013. A Phylogeny of Birds Based on Over 1,500 Loci Collected by Target Enrichment and High-Throughput Sequencing. PLoS ONE 8:e54848. pmid: 23382987 doi:10.1371/journal.pone.0054848.


Crawford NG, Faircloth BC, McCormack JE, Brumfield RT, Winker K, Glenn TC. 2012. More than 1000 ultraconserved elements provide evidence that turtles are the sister group of archosaurs. Biol Lett 8:783-786. pmid: 22593086 doi:10.1098/rsbl.2012.0331.


Faircloth BC, McCormack JE, Crawford NG, Harvey MG, Brumfield RT, Glenn TC. 2012. Ultraconserved Elements Anchor Thousands of Genetic Markers Spanning Multiple Evolutionary Timescales. Syst Biol 61:717-726. pmid: 22232343 doi:10.1093/sysbio/sys004.


McCormack JE, Faircloth BC, Crawford NG, Gowaty PA, Brumfield RT, Glenn TC. 2012. Ultraconserved Elements Are Novel Phylogenomic Markers that Resolve Placental Mammal Phylogeny when Combined with Species Tree Analysis. Genome Res 22: 746–754. pmid: 22207614 doi: 10.1101/gr.125864.111.



Here's a list of UCE-related talks at various conferences. Slides are available for each talk by clicking on the linked talk title. These talks should give a general idea of where we're going with new research projects that use UCEs.

Go to talks »


Below are several commercial laboratories offering UCE enrichment as a service. Generally speaking, these commercial vendors will accept DNA extracts for enrichment, conduct the library preparation and enrichment steps, sequence the enriched libraries, and return the sequence data to you. We do not derive any referral revenue from these companies, but we list each here to help interested labs get started.

Below are several probe designs that we have used to study relationships among amniotes/tetrapods (e.g., Crawford et al. 2012, McCormack et al. 2013). We are constantly evaluating the utility of given probe sets and probe designs, in addition to expanding the number of UCE loci we are targeting. We have several larger probe sets in the works, and we are also working on optimizing probe sets based on their capture success, phylogenetic utility, etc. Please check back for updates.

You can now buy each of these probe sets direct from MYcroarray in the form of a capture kit. MYcroarray has even made a discounted "pilot" sized kit available for labs who want to do some test enrichments.

Order enrichment kits from MYcroarray »


2,560 probes targeting 2,386 UCEs

(Tetrapods-UCE-2.5Kv1)


The linked FASTA file (ZIP archive) contains probe sequences (120 nt) designed for synthesis as part of an Agilent SureSelect or MYcroarray MyBait target enrichment kit. We used these probes for our in silico analysis of the placental mammal phylogeny, our in vitro analysis of extant bird groups, and our in vitro analysis of the phylogenetic position of turtles. By their deposition in Dryad, all probes are available under a CC0 license and are thus freely available for you to use, without restriction.

Note: We designed probes from UCEs by including flanking sequence from chickens. Because of the highly conserved nature of UCEs and their flanking sequence, we have found these probes work well across amniotes.

Get 2,560 probe set for amniotes »

5,472 probes targeting 5,060 UCEs

(Tetrapods-UCE-5Kv1)


The linked FASTA file (ZIP archive) contains probe sequences (120 nt) designed for synthesis as part of an Agilent SureSelect or MYcroarray MyBait target enrichment kit. We used these probes for our in silico analysis of the primate phylogeny, and the 2,560 probes targeting 2,386 loci are a subset of this larger set of probes. All probes are available under a CC0 license and are thus freely available for you to use, without restriction.

Note 1: We designed probes from UCEs by including flanking sequence from chickens. Because of the highly conserved nature of UCEs and their flanking sequence, we have found these probes work well across amniotes.

Note 2: Although this probe set is not, yet, referenced in a publication, we have been using it for some time across a variety of taxa with much success.

Get 5,472 probe set for amniotes »

Below is the probe design that we used to understand relationships among the early diverging teleosts (Faircloth et al. 2013). We are constantly evaluating the utility of given probe sets and probe designs, in addition to expanding the number of UCE loci we are targeting. We have several larger probe sets in the works for fishes, and we are also working on optimizing probe sets based on their capture success, phylogenetic utility, etc. Please check back for updates.

You can now buy this probe set direct from MYcroarray in the form of a capture kit. MYcroarray has even made a discounted "pilot" sized kit available for labs who want to do some test enrichments.

Order enrichment kits from MYcroarray »


2,001 probes targeting 500 UCEs

(Actinopts-UCE-0.5Kv1)


The linked FASTA file (ZIP archive) contains probe sequences (120 nt) designed for synthesis as part of an Agilent SureSelect or MYcroarray MyBait target enrichment kit. We used these probes for our analysis of early diverging teleosts. By their deposition in Dryad, all probes are available under a CC0 license and are thus freely available for you to use, without restriction.

Note: We designed probes from UCEs by including flanking sequence from medaka. Because of the highly conserved nature of UCEs and their flanking sequence, we have found these probes work well across fishes.

Get 2,001 probe set for fish »

Below are several software packages we have developed to help analyze data collected from UCE loci. All computer code is available under a flexible open-source license (BSD). We welcome all contributions, whether they improve the code, fix bugs, improve usability, or improve the documentation, which is rather sparse at the moment. If you are interested in helping, please contact us through Twitter (@ultraconserved) and/or post an issue on GitHub for the respective package.

Note: All software packages are likely to contain bugs - use at your own risk.


phyluce

Our main code repository for analyzing data collected from UCE loci. Contains command-line applications for assembling contigs from sequence data, finding which contigs align to UCEs, aligning UCE contigs, and preparing data for downstream analysis in mrbayes, raxml, and cloudforest.

Report an issue with phyluce.

Get phyluce »

splitaake Helper code

A program for demultiplexing massively parallel sequencing reads tagged with edit distance or Hamming distance sequence tags - tailored to edit distance tags (see Tags). Capable of demultiplexing hundreds of sequence tagged libraries at once.
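To make the idea of tag-based demultiplexing concrete, here is a minimal sketch that assigns a read to a sample by comparing the start of the read against each tag. It is not splitaake's code: it uses the simpler Hamming distance rather than edit distance, and the names `hamming`, `assign_tag`, and `max_mismatches`, as well as the example tags, are hypothetical.

```python
from typing import Dict, Optional


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def assign_tag(read: str, tags: Dict[str, str], max_mismatches: int = 1) -> Optional[str]:
    """Return the sample whose tag best matches the start of `read`.

    Returns None when no tag is within `max_mismatches`, or when two tags
    are equally close (an ambiguous read)."""
    best_name = None
    best_dist = max_mismatches + 1
    tie = False
    for name, tag in tags.items():
        prefix = read[: len(tag)]
        if len(prefix) < len(tag):
            continue  # read is shorter than the tag
        dist = hamming(prefix, tag)
        if dist < best_dist:
            best_name, best_dist, tie = name, dist, False
        elif dist == best_dist and dist <= max_mismatches:
            tie = True
    return None if tie else best_name


# Hypothetical tags that differ from each other at every position.
tags = {"sample_a": "ACGTAC", "sample_b": "TGCATG"}
print(assign_tag("ACGTTCGGGATTA", tags))  # one mismatch against sample_a's tag
print(assign_tag("GGGGGGAAAA", tags))     # too far from both tags
```

Hamming distance only counts substitutions; edit distance also tolerates insertions and deletions within the tag, at the cost of a more involved comparison.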

Report an issue with splitaake.

Get splitaake »

illumiprocessor Helper code

A program for automated cleaning of fastq files from sequencing. Removes adapter contamination using scythe and trims reads for quality using sickle. Concatenates reads into an "interleaved" fastq.gz file for use with velvet.
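The interleaving step can be sketched as follows. This is an illustration rather than illumiprocessor's own code: it assumes two uncompressed, already-cleaned FASTQ files whose records are in the same order, and the function names `fastq_records` and `interleave` are hypothetical.

```python
import gzip
from itertools import islice, zip_longest
from typing import Iterator, List, TextIO


def fastq_records(handle: TextIO) -> Iterator[List[str]]:
    """Yield FASTQ records as lists of four lines without trailing newlines."""
    while True:
        record = [line.rstrip("\n") for line in islice(handle, 4)]
        if not record:
            return
        if len(record) != 4:
            raise ValueError("truncated FASTQ record")
        yield record


def interleave(r1_path: str, r2_path: str, out_path: str) -> int:
    """Write read 1 and read 2 records alternately to a gzipped FASTQ file.

    Returns the number of read pairs written."""
    pairs = 0
    with open(r1_path) as r1, open(r2_path) as r2, gzip.open(out_path, "wt") as out:
        for rec1, rec2 in zip_longest(fastq_records(r1), fastq_records(r2)):
            if rec1 is None or rec2 is None:
                raise ValueError("R1 and R2 contain different numbers of reads")
            out.write("\n".join(rec1) + "\n")
            out.write("\n".join(rec2) + "\n")
            pairs += 1
    return pairs
```

A call such as `interleave("sample_R1.fastq", "sample_R2.fastq", "sample.interleaved.fastq.gz")` (file names hypothetical) would produce one file in which each read 1 record is immediately followed by its mate.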

Report an issue with illumiprocessor.

Get illumiprocessor »

cloudforest Helper code

A program started by Nick Crawford with contributions from Brant Faircloth for parallel computation of individual genetrees and the estimation of a species tree from gene tree data. Allows analysis and bootstrapping of large datasets. Uses elastic map reduce or multiprocessing to analyze data in parallel. MPI version coming soon.
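As a rough illustration of what bootstrapping a multi-locus dataset can involve, the sketch below resamples loci with replacement to build replicate datasets. This is one common scheme and is an assumption here; the description above does not say how cloudforest itself resamples, and the name `bootstrap_loci` and the example locus names are hypothetical.

```python
import random
from typing import List, Sequence


def bootstrap_loci(loci: Sequence[str], replicates: int, seed: int = 0) -> List[List[str]]:
    """Build `replicates` bootstrap datasets, each made by drawing
    len(loci) loci with replacement from the original set."""
    rng = random.Random(seed)  # seeded so that replicates are reproducible
    n = len(loci)
    return [
        [loci[rng.randrange(n)] for _ in range(n)]
        for _ in range(replicates)
    ]


loci = ["uce-1", "uce-2", "uce-3", "uce-4"]
for replicate in bootstrap_loci(loci, replicates=3):
    print(replicate)
```

Because each replicate is independent of the others, the gene tree and species tree estimation for each one can be handed to a separate worker, which is what makes this kind of analysis a natural fit for multiprocessing or map-reduce.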

Report an issue with cloudforest.

Get cloudforest »

Standard workflow (Illumina)

  • Library preparation for Illumina sequencing with on-bead reactions and NEB/Kapa reagents
  • Sequence capture enrichment protocol with universal blockers
  • Post-enrichment limited-cycle PCR
  • Post-enrichment validation using qPCR

References

  • Bejerano G, Pheasant M, Makunin I, Stephen S, Kent WJ, et al. (2004) Ultraconserved elements in the human genome. Science 304: 1321–1325. doi:10.1126/science.1098119.
  • Sandelin A, Bailey P, Bruce S, Engström PG, Klos JM, et al. (2004) Arrays of ultraconserved non-coding regions span the loci of key developmental genes in vertebrate genomes. BMC Genomics 5: 99. doi:10.1186/1471-2164-5-99.
  • Dermitzakis ET, Reymond A, Antonarakis SE (2005) Opinion: Conserved non-genic sequences — an unexpected feature of mammalian genomes. Nat Rev Genet 6: 151–157. doi:10.1038/nrg1527.
  • Siepel A, Bejerano G, Pedersen JS, Hinrichs AS, Hou M, et al. (2005) Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res 15: 1034–1050. doi:10.1101/gr.3715005.
  • Woolfe A, Goodson M, Goode D, Snell P, McEwen G, et al. (2005) Highly Conserved Non-Coding Sequences Are Associated with Vertebrate Development. PLoS Biol 3: 116–130. doi:10.1371/journal.pbio.0030007.
  • Pennacchio LA, Ahituv N, Moses AM, Prabhakar S, Nobrega MA, et al. (2006) In vivo enhancer analysis of human conserved non-coding sequences. Nature 444: 499–502. doi:10.1038/nature05295.
  • Ahituv N, Zhu Y, Visel A, Holt A, Afzal V, et al. (2007) Deletion of Ultraconserved Elements Yields Viable Mice. PLoS Biol 5: e234. doi:10.1371/journal.pbio.0050234.
  • Miller W, Rosenbloom K, Hardison RC, Hou M, Taylor J, et al. (2007) 28-way vertebrate alignment and conservation track in the UCSC Genome Browser. Genome Res 17: 1797–1808. doi:10.1101/gr.6761107.
  • Gnirke A, Melnikov A, Maguire J, Rogov P, LeProust EM, et al. (2009) Solution hybrid selection with ultra-long oligonucleotides for massively parallel targeted sequencing. Nature Biotechnology 27: 182–189. doi:10.1038/nbt.1523.
  • Blumenstiel B, Cibulskis K, Fisher S, DeFelice M, Barry A, et al. (2010) Targeted exon sequencing by in-solution hybrid selection. Curr Protoc Hum Genet Chapter 18: Unit18.4. doi:10.1002/0471142905.hg1804s66.