Exciting announcement of a new dirt-cheap machine-less DNA sequencing technology. If their promises hold true, this technology will be a game-changer for those of us working with “emerging” non-model organisms because:
- it supposedly provides 100,000bp long reads. This will eliminate most of the scaffolding issues we have when assembling de novo genome sequences.
- using the USB thumb-chip version, no machine is required. Thus when you are out in the field, you can sequence right then and there – a potential workaround for worrying about tissue sample export permits… at least until new regulations appear!

High error rates are problematic in short reads because they introduce ambiguity, making it challenging to align and assemble these reads. However, this is much less of an issue with longer reads: a 100,000bp region will remain uniquely identifiable even with Oxford Nanopore’s currently high error rate of 4%. And because for assembly you will use multiple reads representing the same region, the error rate of the assembled sequence will be low: if two reads overlap, your consensus sequence for the overlapping region will have an error rate of 4% * 4% = 0.16% (supposing that errors get lower quality scores than correct sequence). If you have 3x coverage or more, you can resolve most errors unambiguously…
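The back-of-the-envelope arithmetic above can be sketched in a few lines of Python. This is only an illustration under strong assumptions that are mine, not Oxford Nanopore's: errors are independent between reads, and quality scores always flag the erroneous base, so the consensus is wrong at a position only when every read covering it is wrong there. The function name `consensus_error_rate` is made up for this example.

```python
def consensus_error_rate(per_read_error, coverage):
    """Chance that every read covering a position is wrong there.

    Assumes independent errors and that quality scores always let us
    pick the correct base when at least one read has it right.
    """
    if not 0.0 <= per_read_error <= 1.0:
        raise ValueError("per_read_error must be between 0 and 1")
    if coverage < 1:
        raise ValueError("coverage must be at least 1")
    return per_read_error ** coverage


if __name__ == "__main__":
    # Per-position consensus error for 1x to 5x coverage at a 4% read error rate
    for depth in range(1, 6):
        rate = consensus_error_rate(0.04, depth)
        print(f"{depth}x coverage: {rate:.6%}")
```

Real errors are unlikely to be fully independent (some sequence contexts are probably harder than others), so treat these numbers as optimistic.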
New year, new country, new job: I am now a Lecturer at Queen Mary University of London. I will continue to use genomics and bioinformatics approaches to examine the interplay between social evolution and genome evolution. Get in touch if you’re interested in working with me in a great place.

And a few nice papers on which I am coauthor are now out.

Using modern molecular tools on emerging (non-model) organisms makes it possible to address exciting new questions. But the data aren’t as perfect as they should be. In particular, genomes created from Roche 454, Illumina or ABI SOLiD sequence are fragmented: You wish you’d get a FASTA file with one long sequence per chromosome. Dream on! You get sequences for dozens to thousands of scaffolds. Each scaffold is a series of contigs, separated by stretches of unresolved NNNNNNNNN sequence (usually repetitive sequences). But the assembler knows these contigs are adjacent thanks to paired reads.

Genome fragmentation can make things challenging. Some tips from my experience with ant genomes:
How can you determine what is inside the unresolved poly-NNNNN sequence without genome walking or PCR and sequencing? Getting the whole thing will be difficult. But it’s easy to get a little:
- Select resolved genomic sequence.
- BLASTN against the raw unassembled reads (need help setting up a custom BLAST server?)
- Among the reads that match your query, some will give you sequence that extends inside the unresolved NNNNNN region. If the sequence is known, why wasn’t it shown? Because its repetitive nature made the sequence ambiguous for the assembly software.
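The first step (selecting resolved sequence next to a gap) is easy to script. Below is a minimal Python sketch, not the exact procedure I used: it finds runs of N in a scaffold and writes the resolved sequence flanking each run as FASTA records that you could then BLASTN against the raw reads. The function names, the 200bp flank length and the minimum gap length of 10 Ns are arbitrary choices for illustration.

```python
import re


def gap_flanks(scaffold, flank_len=200, min_gap=10):
    """Return (gap_start, gap_end, left_flank, right_flank) for each N run.

    Coordinates are 0-based, end-exclusive. Flanks are simply the
    flank_len bases on either side of the gap, so they may be shorter
    near scaffold ends or contain Ns if another gap is close by.
    """
    pattern = re.compile("N{%d,}" % min_gap, re.IGNORECASE)
    flanks = []
    for match in pattern.finditer(scaffold):
        start, end = match.span()
        left = scaffold[max(0, start - flank_len):start]
        right = scaffold[end:end + flank_len]
        flanks.append((start, end, left, right))
    return flanks


def flanks_to_fasta(scaffold_name, flanks):
    """Format gap flanks as FASTA text to use as BLASTN queries."""
    records = []
    for start, end, left, right in flanks:
        if left:
            records.append(f">{scaffold_name}_gap{start}_left\n{left}")
        if right:
            records.append(f">{scaffold_name}_gap{start}_right\n{right}")
    return "\n".join(records) + "\n" if records else ""


if __name__ == "__main__":
    # Toy scaffold: two contigs separated by 12 unresolved bases
    toy = "ACGT" * 5 + "N" * 12 + "TTGCA" * 4
    print(flanks_to_fasta("toy_scaffold", gap_flanks(toy, flank_len=8)))
```

Reads that match the left flank and run past its end are the ones that reach into the gap; for the right flank, look at reads extending before its start.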
Are two scaffolds adjacent?
- Perhaps some paired reads do link them – but there were insufficient data for the assembler to be sure. With 454 assemblies, check newbler’s output in the 454Scaffolds.txt and 454PairStatus.txt files. These report where all paired reads map.
- Or map independently obtained transcriptome or proteome data onto your genome. Oksana Riba-Grognuz developed an easy way of visualizing RNA mapped to genomes.
- Or check a closely related species – perhaps the region is better assembled there.
How good is good enough? Some sequence/data/scaffolds/models are missing or mediocre! But no biological dataset is ever perfect. If you’re trying to make your emerging model organism’s data perfect… you’ll get nowhere fast. The 20% effort that brings you 80% of the way will probably be good enough to answer your exciting biological question.
Many interesting talks and stimulating discussions during Shenzhen’s Social Insect Genomics Conference which coincided with the release of Sanne Nygaard’s Acromyrmex echinatior leaf-cutter ant genome paper showing adaptations linked to fungal farming. More excitement is on its way with next generation sociogenetics projects bubbling up around the world & across the phylogeny!

Had a great two weeks visiting John Wang’s lab at Academia Sinica, Taiwan, and joining National Taiwan University’s International Symposium on Social Insects for wonderfully stimulating talks by Jo Billen, Lars Chittka, James Nieh, Kenji Matsuura & Bob Vander Meer. The symposium gave me the opportunity to share some thoughts about sequencing genomes with high throughput technologies in the journal of the Taiwan Entomological Society, Formosan Entomologist.

In genomic news, the Acromyrmex echinatior leafcutter ant genome, led by Sanne Nygaard & Koos Boomsma, is in press! The data are already on Fourmidable, and Fourmidable’s ant genome BLAST interface was updated to the latest SequenceServer.
Two papers just out! Our Solenopsis invicta fire ant genome paper is out in PNAS. Win! And a study on fire ant Odorant Binding Proteins in PLoS ONE. Anurag Priyam and I are developing a generic BLAST web interface in Ruby. It’s already super useful for our Fourmidable ant genome database, and I’m sure it will be for others working with non-model organisms (easy to use; less of a hassle to set up than GMOD…). Using the server, you can BLAST ant genome sequences (and predicted genes).

Photo of fire ants on their genome © Romain Libbrecht & Yannick Wurm
Jakartans use “fogger” chemical sprays to fight dengue-transmitting Aedes mosquitoes.

Photo © Hermitianta P. Putra
Whatever was meant to target the mosquitoes made cockroaches jump into our swimming pool and drown themselves, and made “normal” ants frantically run in circles. This suggests that whatever they use for fogging is a generalist poison that screws up all insect brains and likely affects larger things as well. That is bad for controlling dengue, because many small animals compete with the mosquitoes for food and reproductive space, or may even eat them (e.g. larvae from other insects, or chick-chacks).
Probably for these reasons, fogging for dengue is not advised:
So please stop fogging!
The best approach is to eliminate all possible breeding sites: get rid of even small amounts of stagnant water (in containers, trash, leaves, flower pot dishes…).

Aedes aegypti © Wikipedia