Ants, Genomes & Evolution

@ Queen Mary University London

Recruiting 3 year Postdoc for Pollinator Population Genomics and Transcriptomics

We are recruiting an evolutionary-minded person with strong bioinformatics skills for a 3 year postdoc position analysing large pollinator genomics and transcriptomics datasets. This is a NERC-funded position in collaboration with Richard Gill, Nigel Raine and Lars Chittka.

Full ad is here

Apply by August 17th.

July 31, 2014

Reference Letters

Current or former students very regularly ask me for a reference to help them apply for a job or a new study program. The process is facilitated & the letter is improved by the following advice.

If you need a reference letter from me, I need you to write a first draft. First, you are best positioned to know what makes you great for what you're applying for. Second, you'll end up with a better letter if my time is spent revising something than if I try to create something from scratch.

Your draft should be in the form of a letter from me about you (yes, it can feel awkward to write like this). You basically need to say that you are great and justify why. Some general tips:

  • Please respect the style guidelines given by Strunk & White's "The Elements of Style". Keep it concise.
  • Use a spell-checker and a grammar-checker (on strict mode!).
  • It's better if the examples you use are relevant to the degree you're applying to.
  • Don't highlight weaknesses. E.g. if you have a "C" in something don't mention it.
  • Whatever you do, don't lie. Any lies will come back to hurt you 1000-fold (karma).
  • Send it as a document I can edit (not a PDF).


Introductory paragraph. This should include:

  • Why I am writing
  • Why I know you well (e.g., I am your advisor/supervisor).
  • Which degree you are doing and when you are expected to graduate.
  • The last sentence should be a small list of ideas (see below), summarizing why you are great for the opportunity you're applying to. This also announces the structure of the next few paragraphs.

One paragraph per idea (no ping-ponging back and forth!). Some examples of ideas:

  • evidence that you are serious & hardworking (e.g., based on your project)
  • academic achievements (e.g., coursework or overall grades, your perception of your final grade ("first?"))
  • extra-curricular activities (jobs, volunteering)
  • evidence that you're a kind person (helping others).

Conclusion: a quick summary stating that you're great for the degree/program/job because of the 3 or 4 ideas.

January 27, 2014

Recruiting Postdoc & potential PhD students

Positions to be filled:

Please get in touch by email with a CV for more details. The NERC PhD application deadline is mid-February. Cheers, Yannick.


January 16, 2014

Long time no update

No real update here for a while! Major events include:

Perhaps hope for more regular updates in 2014? We'll see.

December 23, 2013

Oxford Nanopore sequencing - a revolution for non-model organisms?

Exciting announcement of a new dirt-cheap machine-less DNA sequencing technology. If their promises hold true, this technology will be a game-changer for those of us working with "emerging" non-model organisms because:

  • it supposedly provides 100,000bp long reads. This will eliminate most of the scaffolding issues we have when assembling de novo genome sequences.

  • using the USB thumb-chip version, no machine is required. Thus when you are out in the field, you can sequence right then and there - a potential workaround for worrying about tissue sample export permits... at least until new regulations appear!


High error rates are problematic in short reads because they introduce ambiguity, making it challenging to align and assemble these reads. However, this is much less of an issue with longer reads: a 100,000bp region will remain uniquely identifiable even with Oxford Nanopore's currently high error rate of 4%. And because for assembly you will use multiple reads representing the same region, the error rate of the assembled sequence will be low: if two reads overlap, your consensus sequence for the overlapping region will have an error rate of 4% * 4% = 0.16% (supposing that the errors give lower quality scores than normal sequence). If you have 3x coverage or more, you can resolve most errors unambiguously...
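
The back-of-the-envelope calculation above can be written as a tiny Python sketch. It assumes (as the calculation does) that errors in different reads are independent, so the chance that every read covering a position is wrong at that position is the per-read error rate raised to the coverage. The function name is made up for illustration, and real sequencing errors may well be correlated, so treat the numbers as rough.

```python
def coincident_error_rate(per_read_error: float, coverage: int) -> float:
    """Probability that all `coverage` reads are wrong at the same position.

    Assumption: errors are independent between reads. Real errors can be
    systematic, in which case this underestimates the true rate.
    """
    if coverage < 1:
        raise ValueError("coverage must be at least 1")
    return per_read_error ** coverage


if __name__ == "__main__":
    # 4% is the per-read error rate quoted in the post.
    for coverage in (1, 2, 3):
        rate = coincident_error_rate(0.04, coverage)
        print(f"{coverage}x coverage: {rate:.4%}")
```

With two overlapping reads this reproduces the 0.16% figure; a third read pushes the chance of a coincident error lower still.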

February 18, 2012

New publications & New job

New year, new country, new job: I am now a Lecturer at Queen Mary University of London. I will continue to use genomics and bioinformatics approaches to examine the interplay between social evolution and genome evolution. Get in touch if you're interested in working with me in a great place.


And a few nice papers on which I am coauthor are now out.

TIGs ant genomes

February 7, 2012

Genome analyses for emerging model organisms

Using modern molecular tools on emerging (non-model) organisms makes it possible to address exciting new questions. But the data aren't as perfect as they should be. In particular, genomes created from Roche 454, Illumina or ABI SOLiD sequence are fragmented: you wish you'd get a FASTA file with one long sequence per chromosome. Dream on! You get sequences for dozens to thousands of scaffolds. Each scaffold is a series of contigs, separated by stretches of unresolved NNNNNNNNN sequence (usually repetitive sequences). But the assembler knows these contigs are adjacent thanks to paired reads.
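
To make the scaffold/contig relationship concrete, here is a minimal Python sketch that cuts one scaffold sequence back into its contigs at the runs of N. The function name and the `min_gap` parameter are hypothetical choices for illustration; it works on a plain string rather than parsing a FASTA file.

```python
import re


def split_scaffold(scaffold_seq: str, min_gap: int = 1):
    """Return (start, end, sequence) for each contig in a scaffold.

    A run of at least `min_gap` N characters is treated as an unresolved
    gap. Coordinates are 0-based, end-exclusive.
    """
    gap_pattern = re.compile(r"[Nn]{%d,}" % min_gap)
    contigs = []
    pos = 0
    for gap in gap_pattern.finditer(scaffold_seq):
        if gap.start() > pos:  # skip empty pieces (e.g. scaffold starts with N)
            contigs.append((pos, gap.start(), scaffold_seq[pos:gap.start()]))
        pos = gap.end()
    if pos < len(scaffold_seq):  # sequence after the last gap
        contigs.append((pos, len(scaffold_seq), scaffold_seq[pos:]))
    return contigs


if __name__ == "__main__":
    for start, end, seq in split_scaffold("ACGTNNNNNGGCCNNTT"):
        print(start, end, seq)
```

Raising `min_gap` keeps isolated ambiguous bases inside a contig instead of splitting on every single N.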


Genome fragmentation can make things challenging. Some tips from my experience with ant genomes:

How can you determine what is inside the unresolved poly-NNNNN sequence without genome walking or PCR and sequencing? Getting the whole thing will be difficult. But it's easy to get a little:

  • Select resolved genomic sequence.

  • BLASTN against the raw unassembled reads (need help setting up a custom BLAST server?)

  • Among the reads that match your query, some will give you sequence that extends inside the unresolved NNNNNN region. If the sequence is known, why wasn't it shown? Because its repetitive nature made the sequence ambiguous for the assembly software.
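
The first step above (selecting resolved sequence next to a gap) can be sketched in Python as follows. This is only an illustration: the function names, the 200bp flank length and the 10bp minimum gap length are assumptions, not values from any particular pipeline.

```python
import re


def gap_flank_queries(name: str, seq: str, flank: int = 200, min_gap: int = 10):
    """Return (header, sequence) records for resolved sequence beside each gap.

    For every run of at least `min_gap` Ns, take up to `flank` bases on
    each side, trimmed so that the query itself contains no N.
    """
    records = []
    gaps = re.finditer(r"[Nn]{%d,}" % min_gap, seq)
    for i, gap in enumerate(gaps, start=1):
        left = seq[max(0, gap.start() - flank):gap.start()]
        right = seq[gap.end():gap.end() + flank]
        # Keep only the resolved part directly adjacent to this gap.
        left = left[left.upper().rfind("N") + 1:]
        first_n = right.upper().find("N")
        if first_n != -1:
            right = right[:first_n]
        if left:
            records.append((f"{name}_gap{i}_left", left))
        if right:
            records.append((f"{name}_gap{i}_right", right))
    return records


def to_fasta(records) -> str:
    """Format (header, sequence) pairs as FASTA text."""
    return "\n".join(f">{header}\n{sequence}" for header, sequence in records)


if __name__ == "__main__":
    scaffold = "ACGTACGTAC" + "N" * 12 + "GGCCGGCCGG"
    print(to_fasta(gap_flank_queries("scaffold1", scaffold, flank=5)))
```

The resulting FASTA records are the queries to BLASTN against the raw reads, for example with NCBI BLAST+ via `blastn -query flanks.fa -db raw_reads`, where both file names are placeholders and a nucleotide database built from the reads is assumed to exist.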

Are two scaffolds adjacent?

  • Perhaps some paired reads do link them - but there were insufficient data for the assembler to be sure. With 454 assemblies, check newbler's output in the 454Scaffolds.txt and 454PairStatus.txt files. These report where all paired reads map.

  • Or map independently obtained transcriptome or proteome data onto your genome. Oksana Riba-Grognuz developed an easy way of visualizing RNA mapped to genomes.

  • Or check a closely related species - perhaps the region is better assembled there.

How good is good enough? Some sequence/data/scaffolds/models are missing or mediocre! But no biological dataset is ever perfect. If you're trying to make your emerging model organism's data perfect... you'll get nowhere fast. The 20% effort that brings you 80% of the way will probably be good enough to answer your exciting biological question.

September 22, 2011 Tags : genomics sequencing

Social insect genomics conference 2011

Many interesting talks and stimulating discussions during Shenzhen's Social Insect Genomics Conference which coincided with the release of Sanne Nygaard's Acromyrmex echinatior leaf-cutter ant genome paper showing adaptations linked to fungal farming. More excitement is on its way with next generation sociogenetics projects bubbling up around the world & across the phylogeny!


July 3, 2011

May Taiwan Conf & June genome updates

Had a great two weeks visiting John Wang's lab at Academia Sinica, Taiwan, and joining National Taiwan University's International Symposium on Social Insects for wonderfully stimulating talks by Jo Billen, Lars Chittka, James Nieh, Kenji Matsuura & Bob Vander Meer. The symposium gave me the opportunity to share some thoughts about sequencing genomes with high throughput technologies in the journal of the Taiwan Entomological Society, Formosan Entomologist.


In genomic news, the Acromyrmex echinatior leafcutter ant genome paper, led by Sanne Nygaard & Koos Boomsma, is in press! The data are already on Fourmidable, and Fourmidable's ant genome BLAST interface was updated to the latest SequenceServer.

June 9, 2011

Fire ant genome published

Two papers just out! Our Solenopsis invicta fire ant genome paper is out in PNAS. Win! And a study on fire ant Odorant Binding Proteins in PLoS ONE. Anurag Priyam and I are developing a generic BLAST web interface in Ruby. It's already super useful for our Fourmidable ant genome database, and I'm sure it will be for others working with non-model organisms (easy to use; less of a hassle to set up than GMOD...). Using the server, you can BLAST ant genome sequences (and predicted genes).


Photo of fire ants on their genome (C) Romain Libbrecht & Yannick Wurm

February 21, 2011 Tags : work
