Forums
Birding
Bird Taxonomy and Nomenclature
Scolopaci
<blockquote data-quote="l_raty" data-source="post: 2571576" data-attributes="member: 24811"><p>Trying to crawl my way through this... ;) (Sorry: quite long!) <a href="https://tspace.library.utoronto.ca/handle/1807/24247" target="_blank">Gibson (2010)</a> and Gibson & Baker (2012) offered only a purely Bayesian analysis, which is not something I trust a lot, so I was interested to see how the results would look if analysed with Maximum Likelihood.</p><p></p><p>The analysis was based on 5 genes: 12S-rRNA (12s, mitochondrial), cytochrome oxidase subunit-1 (cox1/coi/barcode, mitochondrial), cytochrome b (cytb, mitochondrial), NADH dehydrogenase subunit-2 (nd2, mitochondrial), and Recombination Activating Gene 1 (rag1, nuclear, exon). Reconstructing their exact dataset is not really possible, however, because (1) their cox1 sequences were taken from a "concurrent DNA barcoding study in [thei]r lab" (presumably <a href="https://tspace.library.utoronto.ca/handle/1807/30580" target="_blank">Rebecca Elbourne's MSc thesis</a>), and (2) for the other genes, they clearly took many of their sequences from earlier datasets; in both cases, they did not detail which sequences they used. 
Sequences specifically associated with this work appear in <a href="http://www.ncbi.nlm.nih.gov/nuccore/" target="_blank">GenBank</a> under accession numbers JQ962980-JQ963056: although these were presumably all used in the analysis, they undoubtedly only represent a small subset of the analysed matrix (77 sequences in total; the full data matrix has 86 taxa * 5 genes = 430 cells).</p><p></p><p>Cox1 sequences, for all the taxa in the tree except <em>Hydrophasianus chirurgus</em> and <em>Microparra capensis</em> (both also lacking in Rebecca Elbourne's thesis), can easily be retrieved from <a href="http://www.boldsystems.org/" target="_blank">BOLD</a>. Except in a few rare cases, several sequences are available for each taxon, and their congruence can readily be checked. These data look entirely problem-free to me. As to the other 4 genes and included taxa, a list of what can be retrieved from <a href="http://www.ncbi.nlm.nih.gov/nuccore/" target="_blank">GenBank</a> appears in the attached .txt file (mostly interesting to build your own idea about possible gaps in the sampling).</p><p></p><p>I presumed that, where sequences are not available either in GenBank or in BOLD, the gene must have been coded as missing data for the given taxon in the analysis (i.e., that no part of the data was withheld from the public). On the other hand, it is certain that a part of what can be accessed in GenBank, in particular sequences published by entirely different research groups, was not used in the analysis, even though it existed.</p><p></p><p style="text-align: center">================</p><p></p><p>I first started with some exploratory analysis of the non-cox1 data, building trees for each gene (in some cases for specific parts of the genes), trying to compare sequences derived from the same taxon, looking at alignments, etc. The data, unfortunately (and as I already wrote above), does not appear completely problem-free. 
What follows is a list of issues involving sequences that were possibly/presumably included in the analysis.</p><ul> <li data-xf-list-type="ul">- <em>Phalaropus tricolor</em>, cytb: AY894240 (voucher: RCA87-192, 870 bp, Pereira & Baker 2005)<br /> - <em>Gallinago gallinago</em>, cytb: FJ603652-54 (voucher: multiple, 702 bp, Baker et al. 2009)<br /> - <em>Gallinago gallinago</em>, cytb: FJ787309-10 (voucher: 5984/5312, 943 bp, Hering & Päckert 2010)<br /> - <em>Gallinago gallinago</em>, cytb: AF194445-6 (voucher: /, 277 bp, Austin unpubl.)<br /> - <em>Gallinago delicata</em>, cytb: FJ603651 (voucher: 1B-2986, 702 bp, Baker et al. 2009)<br /> <br /> These are all near-identical; in trees, they appear embedded in <em>Gallinago</em>.<br /> <strong>Conclusion</strong>: AY894240 (Pereira & Baker 2005) is presumably misidentified, and is <em>G. gallinago/delicata</em>.</li> </ul> <ul> <li data-xf-list-type="ul">- <em>Limnodromus scolopaceus</em>, 12s: EF373090 (voucher: MKP-1523, 552 bp, Baker et al. 2007)<br /> - <em>Limnodromus scolopaceus</em>, 12s: AF285806 (voucher: /, 462 bp, Spellman & Winker 2001)<br /> - <em>Phalaropus tricolor</em>, 12s: DQ674581 (voucher: /, 1042 bp, Fain & Houde 2007)<br /> <br /> EF373090 is near-identical to DQ674581, but highly divergent from AF285806; in trees, EF373090 and DQ674581 cluster with <em>Phalaropus tricolor</em> AY894155, then with tringines, not with scolopacines.<br /> <strong>Conclusion</strong>: EF373090 (Baker et al. 2007) is presumably misidentified, and is <em>Phalaropus tricolor</em>.</li> </ul> <ul> <li data-xf-list-type="ul">- <em>Limnodromus scolopaceus</em>, cytb: EF373140 (voucher: MKP-1523, 925 bp, Baker et al. 
2007)<br /> - <em>Limnodromus scolopaceus</em>, cytb: AF285819 (voucher: /, 1019 bp, Spellman & Winker 2001)<br /> <br /> EF373140 is highly divergent from AF285819; in trees, it appears as the sister group of (<em>Phalaropus fulicarius</em> JQ963055 + <em>Phalaropus lobatus</em> AY894239), and associated to tringines, not to scolopacines; for AF285819, see below.<br /> <strong>Conclusion</strong>: EF373140 (Baker et al. 2007) is problematic/misidentified; considering that Baker et al.'s 12s "<em>Limnodromus scolopaceus</em>" appears to be a misidentified <em>Phalaropus tricolor</em>, it seems reasonable to me to assume that the same applies to their cytb.</li> </ul> <ul> <li data-xf-list-type="ul">- <em>Gallinago gallinago</em>, 12s: EF373082 (voucher: MKP-1590, 554 bp, Baker et al. 2007)<br /> - <em>Gallinago gallinago</em>, 12s: DQ674576 (voucher: /, 1044 bp, Fain & Houde 2007)<br /> - <em>Gallinago gallinago</em>, 12s: FJ603664-66 (voucher: multiple, 675 bp, Baker et al. 2009)<br /> - <em>Limnodromus scolopaceus</em>, 12s: AF285806 (voucher: /, 462 bp, Spellman & Winker 2001)<br /> <br /> EF373082 is near-identical to AF285806, but highly divergent from DQ674576 and FJ603664-66 (which are all near-identical to one another); in trees, EF373082 and AF285806 appear as the sister group of (<em>Limnodromus griseus</em> JQ962988 + <em>Limnodromus</em> sp. DQ674578).<br /> <strong>Conclusion</strong>: EF373082 (Baker et al. 2007) is presumably misidentified, and is <em>Limnodromus scolopaceus</em>.</li> </ul> <ul> <li data-xf-list-type="ul">- <em>Gallinago gallinago</em>, cytb: EF373132 (voucher: MKP-1590, 943 bp, Baker et al. 2007)<br /> - <em>Gallinago gallinago</em>, cytb: FJ603652-54 (voucher: multiple, 702 bp, Baker et al. 
2009)<br /> - <em>Gallinago gallinago</em>, cytb: FJ787309-10 (voucher: 5984/5312, 943 bp, Hering & Päckert 2010)<br /> - <em>Gallinago gallinago</em>, cytb: AF194445-6 (voucher: /, 277 bp, Austin unpubl.)<br /> - <em>Gallinago delicata</em>, cytb: FJ603651 (voucher: 1B-2986, 702 bp, Baker et al. 2009)<br /> - <em>Limnodromus scolopaceus</em>, cytb: AF285819 (voucher: /, 1019 bp, Spellman & Winker 2001)<br /> <br /> EF373132 is highly divergent from FJ603652-54, FJ787309-10, AF194445-6, and FJ603651 (which are all near-identical to one another); in trees, it clusters first with AF285819, then with <em>Limnodromus griseus</em> JQ963049. However, EF373132 and AF285819 are not near-identical; they start the same, remain so over the first 345 bp of EF373132, then start diverging more and more until the end of the sequence (suggesting a sequencing problem in one of them); the overall distance between them is about 5%; in trees, this divergence is reconstructed as fully autapomorphic for AF285819; looking at the distances between the two sequences and their nearest BLAST matches also shows that AF285819 is globally more distant from all closely-related shorebird sequences than EF373132.<br /> <strong>Conclusion</strong>: This one is a tough call. EF373132 is clearly not <em>G. gallinago</em>, and is at least mostly <em>L. scolopaceus</em>; as Baker et al.'s 12s "<em>G. gallinago</em>" appears to be, in its entirety, a misidentified <em>L. scolopaceus</em>, it seems reasonable to consider that the same could apply to their cytb. But this doesn't solve everything: besides this consideration, either EF373132 or AF285819 still appears to be incorrect due to a sequencing problem; the apparent autapomorphic character of the divergence of AF285819 leads me to think that a sequencing problem occurred there. IOW: I think the most likely explanation for what I see is that (1) EF373132 (Baker et al. 
2007) is misidentified, and is <em>Limnodromus scolopaceus</em> (but I see no strong suggestion that it suffers other problems); and (2) AF285819 (Spellman & Winker 2001), although correctly identified, suffers from a sequencing problem.</li> </ul> <ul> <li data-xf-list-type="ul">- <em>Gallinago delicata</em>, cytb: JQ963043 (voucher: JGS-1783, 852 bp, Gibson & Baker 2012)<br /> - <em>Gallinago gallinago</em>, cytb: FJ603652-54 (voucher: multiple, 702 bp, Baker et al. 2009)<br /> - <em>Gallinago gallinago</em>, cytb: FJ787309-10 (voucher: 5984/5312, 943 bp, Hering & Päckert 2010)<br /> - <em>Gallinago gallinago</em>, cytb: AF194445-6 (voucher: /, 277 bp, Austin unpubl.)<br /> - <em>Gallinago delicata</em>, cytb: FJ603651 (voucher: 1B-2986, 702 bp, Baker et al. 2009)<br /> <br /> JQ963043 is made of two sequenced fragments (353 and 411 bp respectively), separated by an 88 bp non-sequenced gap (a "poly-N" in the sequence). This sequence is congruent with the other sequences of<em> G. gallinago/delicata</em> listed above, except for the 213 last bp of the first sequenced fragment: this part differs by 15 substitutions (and has an additional 13 unidentified positions, which suggests a problem as well).<br /> <strong>Conclusion</strong>: there was apparently a problem with the sequencing of this part of JQ963043 (Gibson & Baker 2012).</li> </ul> <ul> <li data-xf-list-type="ul">- <em>Numenius minutus</em>, cytb: EF373145 (voucher: S-072-78498, 963 bp, Baker et al. 2007)<br /> - <em>Numenius arquata</em>, cytb: AF417929 (voucher: /, 1143 bp, Chen et al. 2003)<br /> <br /> These two sequences are near-identical; in trees, they appear embedded in the "large curlew" group (clading with<em> N. madagascariensis</em> AF417925, as the sister group of <em>N. americanus</em> JQ963052) which is the position of <em>N. 
arquata</em> (both based on traditional morphology/biogeography, and on all other available mt genes).<br /> <strong>Conclusion</strong>: EF373145 is presumably misidentified, and is <em>Numenius arquata</em>.</li> </ul> <ul> <li data-xf-list-type="ul">- <em>Numenius tahitiensis</em>, 12s: JQ962997 (voucher: BTCU-113, 550 bp, Gibson & Baker 2012)<br /> - <em>Limosa limosa islandica</em>, 12s: JQ962990 (voucher: MKP-1596, 555 bp, Gibson & Baker 2012)<br /> <br /> These two sequences are near-identical; in trees, they appear embedded in <em>Limosa</em>.<br /> <strong>Conclusion</strong>: JQ962997 (Gibson & Baker 2012) is presumably misidentified, and is a western <em>L. limosa</em>, probably <em>L. l. islandica</em>.</li> </ul> <ul> <li data-xf-list-type="ul">- <em>Phalaropus lobatus</em>, rag1: AY894222 (voucher: JMP-2057, 885 bp, Pereira & Baker 2005)<br /> - <em>Microparra capensis</em>, rag1: EF373194 (voucher: MKP-1479, 2611 bp, Baker et al. 2007)<br /> - <em>Tringa stagnatilis</em>, rag1: AY894219 (voucher: MKP-1353, 885 bp, Pereira & Baker 2005)<br /> <br /> AY894222 and EF373194 are near-identical (where EF373194 overlaps with the others = first 854 bp of EF373194, last 854 bp of the other two); in trees based on this fragment, these three sequences cluster together with high support, outside both Scolopacidae and Jacanidae. BLAST searches based on these sequences produce odd results, with <em>Haematopus ater</em> AY228794 appearing systematically among the closest matches.<br /> <strong>Conclusion</strong>: where they overlap, these three sequences are presumably wrong, but I can't identify the problem precisely. Note that the rest of EF373194, as far as can be judged, behaves "normally" (i.e., <em>Microparra</em> clusters with <em>Irediparra</em> and <em>Actophilornis</em> in Jacanidae) and may well be correct.</li> </ul><p>Of course it can be difficult to be sure that an apparently wrong sequence in GenBank was actually used in a published analysis. 
This is particularly true for simple misidentifications, as sequences may just as well have been swapped at the time of deposition, and the analysis might be fully correct. Here, however, some aspects of the published tree also suggest that the dataset was not "fully clean". The relationships within <em>Numenius</em> in this tree, in particular, are clearly wrong as far as I'm concerned, but not unexpected given the problems described above (<em>tahitiensis</em> "most divergent", attracted towards godwits due to its 12s <em>Limosa</em> sequence; <em>minutus</em> probably closer to "large curlews" than it really is, due to its <em>arquata</em> cytb). The long branches leading to <em>Gallinago gallinago</em> and <em>G. delicata</em>, as well as the "less-than-1" PP associated with this node, might also be linked to the inclusion of wrong sequences (these taxa are not supposed to differ at all genetically). Similarly, the long branch leading to, and the weakness of the tree reconstruction (low PPs) around <em>Tringa stagnatilis</em> might easily be due to the rag1 sequence AY894219.</p><p></p><p style="text-align: center">================</p><p></p><p>Taking the above into consideration, I reconstructed a dataset, using cox1 sequences from BOLD, and sequences of the other 4 genes, as available in GenBank. I omitted the sequences that looked problematic to me, or used them as being what I believed they are, as explained above. I divided this dataset into 9 partitions as in Gibson & Baker 2012: one for the 12s, one for 1st and 2nd positions of each of the 4 coding genes, one for 3rd positions of each of the 4 coding genes; I selected models for these partitions based on the AICc criterion in TreeFinder; then I reconstructed a "best" ML tree, and ran a 100-replicate bootstrap analysis. 
The unbootstrapped tree and the consensus tree from the bootstrap analysis are attached.</p><p></p><p>The results are rather similar to Gibson & Baker's, with the main exception of the curlews (I omitted the two problematic sequences, which resulted in a tree consistent with single-gene analyses - <em>minutus</em> most divergent, <em>tahitiensis</em> sister to whimbrels). The support for internal nodes, in ML, is rather weak. The <em>Calidris</em> radiation, in particular, also appears basically unresolved (which is rather unsurprising, given that many of the taxa actually stand in the matrix based on a cox1 sequence only).</p><p></p><p>(My take on high PP/low BS is that this indicates that a "best" tree is clearly identified given the dataset, but that this tree should probably be expected to be highly sensitive to the addition of new data.) </p><p></p><p>OK, I'll stop with this for now (but I'd welcome thoughts on the above).</p><p></p><p>Cheers, L -</p></blockquote><p></p>
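The EF373132 / AF285819 comparison in the quoted post (identical at the start, then diverging more and more towards the end) is the kind of pattern a windowed p-distance makes easy to see. What follows is only a minimal sketch, assuming two sequences that are already aligned to the same length; the toy sequences and the function names are hypothetical and are not the actual GenBank records or the method used in the post.

```python
def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites, skipping gaps and ambiguous bases (e.g. N)."""
    valid = set("ACGT")
    compared = 0
    differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in valid and y in valid:
            compared += 1
            differing += x != y
    return differing / compared if compared else float("nan")


def windowed_p_distance(a: str, b: str, window: int, step: int) -> list[tuple[int, float]]:
    """Return (window start, p-distance) pairs along two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (start, p_distance(a[start:start + window], b[start:start + window]))
        for start in range(0, len(a) - window + 1, step)
    ]


# Hypothetical toy data: identical over the first 20 bp, then increasingly different.
ref = "ACGTACGTAC" * 4
qry = ref[:20] + "ACGTTCGTAC" + "TCGAACGTTC"

profile = windowed_p_distance(ref, qry, window=10, step=10)
for start, dist in profile:
    print(f"sites {start:>3}-{start + 9:>3}: p-distance = {dist:.2f}")
```

A profile that stays at zero and then climbs steadily, rather than being spread evenly along the sequence, is what would point to a sequencing problem or a chimeric sequence rather than to ordinary divergence.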
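The 9-partition scheme described in the post (one partition for 12s, plus one for codon positions 1+2 and one for position 3 of each of the four protein-coding genes) can be sketched as below. The gene lengths and their order in the concatenated alignment are hypothetical placeholders, each coding gene is assumed to start at codon position 1, and the output is a plain dictionary of site indices, not TreeFinder's own partition syntax.

```python
def build_partitions(genes):
    """Map partition names to 0-based site indices of a concatenated alignment.

    genes: list of (name, length_in_bp, is_protein_coding), in concatenation order.
    Assumption: every protein-coding gene starts at codon position 1.
    """
    partitions = {}
    offset = 0
    for name, length, is_coding in genes:
        sites = range(offset, offset + length)
        if is_coding:
            # 1st and 2nd codon positions together, 3rd positions apart.
            partitions[f"{name}_pos12"] = [s for s in sites if (s - offset) % 3 != 2]
            partitions[f"{name}_pos3"] = [s for s in sites if (s - offset) % 3 == 2]
        else:
            # Non-coding rRNA gene: a single partition.
            partitions[name] = list(sites)
        offset += length
    return partitions


# Hypothetical alignment lengths, for illustration only.
GENES = [
    ("12s", 550, False),
    ("cox1", 648, True),
    ("cytb", 900, True),
    ("nd2", 1041, True),
    ("rag1", 885, True),
]

partitions = build_partitions(GENES)
for part_name, part_sites in partitions.items():
    print(f"{part_name:<10} {len(part_sites):>5} sites")
```

Each partition would then get its own substitution model, which is where a criterion such as AICc comes in.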