Peters' Check-list of birds of the world (machine readable)

kweetal

Well-known member
Europe
Check-list, in 15 parts

Below each Peters name, I inserted the corresponding name from IOC v14.2, Clements v2023b, HBW/BirdLife v8.1, and Howard & Moore 4.1. These come from IOC's 'vs_other_list' spreadsheet. There are still OCR errors so: caveat emptor. Each Peters name (green line) has a link into the original BHL page.

TWIMC ...
 
I'm trying to extract data from these files; JTidy tells me there's an extra < before a="" in lines like this:

<span class="genus_synonym" <a="" id="Pterocnemia"></span><span class="prefix">1:6:1254</span><span class="taxonname_author_and_year"> Pterocnemia G. R. Gray, 1871</span>
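For anyone working from a copy that still has this defect, a minimal workaround sketch in Python follows. It assumes the stray `<a=""` always appears in exactly the shape shown above (directly after the span's class attribute) and that the `id` was meant to sit on the span itself; neither assumption is confirmed by the files' author, and `repair_stray_anchor` is a made-up helper name.

```python
import re

# Assumption: the defect always looks like  <span class="..." <a="" id="...">
STRAY_A = re.compile(r'(<span\s+class="[^"]*")\s*<a=""\s*')

def repair_stray_anchor(line: str) -> str:
    """Drop the stray `<a=""` so that the id attribute belongs to the span."""
    return STRAY_A.sub(r'\1 ', line)

broken = '<span class="genus_synonym" <a="" id="Pterocnemia"></span>'
fixed = repair_stray_anchor(broken)
print(fixed)  # <span class="genus_synonym" id="Pterocnemia"></span>
```

Lines without the defect pass through unchanged, so it should be safe to run over a whole file before handing it to a strict parser.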
 
Hmyeah, that's wrong of course (found it in the source too). Sorry about that. I will generate new files, within a day or so. I've replaced the v1 file already.
 
I've changed all 15 files so they pass the validator at w3.org

These books were produced over 50+ years, starting in 1931, and the format of the files (and thus my parsing) reflects that. Let me know if you spot more trouble in the HTML or data, especially errors in taxonomic names. (Eventually, it may be easier for me to produce what you need.)

What I still want to do is link (all?) synonyms to the original publication via BHL or other online sites. It'll probably take some time.
 
I've extracted basic taxonomic information from your files -- resulting text file attached. Not sure what I plan to do with it, I would need to match it with modern data. Which I see you have in there already, but it would need more scraping to extract it.
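A rough sketch of the kind of scraping involved, using only Python's standard library. It is not the script actually used for the attached file. The class names `prefix` and `taxonname_author_and_year` are taken from the line quoted earlier in the thread; `TaxonSpanParser` and `extract` are hypothetical names, and the sample line assumes the stray `<a=""` has already been removed.

```python
from html.parser import HTMLParser

class TaxonSpanParser(HTMLParser):
    """Collect the text of <span> elements whose class is in WANTED."""

    WANTED = {"prefix", "taxonname_author_and_year"}

    def __init__(self):
        super().__init__()
        self._stack = []    # class of each open span (None if not wanted)
        self.records = []   # [class, text] pairs in document order

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            cls = dict(attrs).get("class")
            wanted = cls in self.WANTED
            self._stack.append(cls if wanted else None)
            if wanted:
                self.records.append([cls, ""])

    def handle_endtag(self, tag):
        if tag == "span" and self._stack:
            self._stack.pop()

    def handle_data(self, data):
        # Only keep text that sits directly inside a wanted span.
        if self._stack and self._stack[-1] is not None:
            self.records[-1][1] += data

def extract(html: str):
    parser = TaxonSpanParser()
    parser.feed(html)
    parser.close()
    return [(cls, text.strip()) for cls, text in parser.records]

sample = ('<span class="genus_synonym" id="Pterocnemia"></span>'
          '<span class="prefix">1:6:1254</span>'
          '<span class="taxonname_author_and_year"> Pterocnemia G. R. Gray, 1871</span>')
for cls, text in extract(sample):
    print(cls, "|", text)
```

Splitting the name from the author and year, and pulling out the IOC/Clements/HBW/H&M lines below each Peters name, would need further rules that depend on markup not shown in this thread.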
 

Attachment: Peters_check-list_of_birds_of_the_world.txt (1.3 MB)
That file I produced contained a certain amount of information matching Peters subspecies with IOC 14.2 subspecies, but it was far from complete. It looked like the original data in the HTML had used some automated process to do that matching, but it didn't match a monotypic species to a nominate subspecies or vice versa. And there was a lot of that. To be fair, Avibase stumbles sometimes with identifying that relationship as well.
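One possible way to bridge the monotypic/nominate gap, sketched in Python. The idea is to reduce a nominate trinomial (species and subspecies epithets identical) to its binomial before comparing. `species_key` and `same_taxon` are hypothetical helpers, and "Aus bus" is a placeholder name, not a real bird.

```python
def species_key(name: str) -> str:
    """Reduce a binomial or a nominate trinomial to 'Genus species'.

    Non-nominate trinomials are returned unchanged (whitespace normalised).
    """
    parts = name.split()
    if len(parts) == 3 and parts[1] == parts[2]:
        return " ".join(parts[:2])
    return " ".join(parts)

def same_taxon(a: str, b: str) -> bool:
    """True if both names reduce to the same key."""
    return species_key(a) == species_key(b)

# Placeholder names only.
print(same_taxon("Aus bus", "Aus bus bus"))   # True: monotypic vs nominate
print(same_taxon("Aus bus", "Aus bus cus"))   # False: a different subspecies
```

This treats a species and its nominate subspecies as the same thing, which is only appropriate when one of the two lists regards the species as monotypic; otherwise it conflates two ranks, so matches made this way are best flagged for review.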

Anyway, my work identified 8904 species in Peters and 10731 species in IOC matching them. So there are about 500 species in IOC which so far don't match anything in that Peters data, for which there could be several reasons.

If anybody wants to have a look at my work so far just start a conversation and I'll send you the outputs.
 
Of course, I'm curious if I did anything wrong. Are you saying that from the 15 files that I posted, there are taxa missing (i.e., they are in Peters but not in 'my' files)?

BTW, one wouldn't expect to match up all Peters names with IOC 14.2 names, if only because IOC has 225 species and 519 subspecies published /after/ 1985, which is the year of the last volume (other volumes are older so the total will be higher).

Quite a few genus names were changed with the taxon remaining the same. And there are of course cases where a Peters subspecies epithet becomes an IOC species epithet; that is, a subspecies is upgraded to species status. This latter case causes 2 lines in the IOC 'vs other lists' spreadsheet IIRC.
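Both cases (genus changed, subspecies upgraded) could be caught by matching on epithets and ignoring the genus. A minimal Python sketch, with hypothetical function names and placeholder taxa ("Aus", "Dus", etc. are not real names):

```python
def candidate_epithets(peters_name: str) -> set:
    """Species epithet and, if present, subspecies epithet of a Peters name."""
    return set(peters_name.split()[1:])

def candidates(peters_name: str, ioc_species: list) -> list:
    """IOC binomials whose species epithet equals any Peters epithet."""
    keys = candidate_epithets(peters_name)
    result = []
    for name in ioc_species:
        parts = name.split()
        if len(parts) >= 2 and parts[1] in keys:
            result.append(name)
    return result

# Placeholder data only.
ioc = ["Dus bus", "Dus cus", "Eus fus"]
print(candidates("Aus bus", ioc))       # ['Dus bus']: genus changed
print(candidates("Aus bus cus", ioc))   # ['Dus bus', 'Dus cus']: cus upgraded
```

These are only candidates. The same epithet occurs in unrelated genera, and an epithet's ending can change for gender agreement when a species moves to another genus, so exact string comparison produces both false positives and misses. Restricting candidates to the same family would cut down the noise.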

(The matches from the four world lists that I added to the names in the 15 files are simply what is contained in the IOC 14.2 'vs other lists' spreadsheet. I did not do any matching (between my Peters texts and the IOC data) myself although I suppose it would be possible.)
 
I can't say that you did anything wrong. I scraped your HTML documents and extracted the relevant data into a text document; I don't have any information about Peters names beyond those documents. As I was going through the document I do recall noticing a couple of Peters species which had been entirely synonymized over the years, but I didn't keep track of them. What I did was take all of the entries which were unmatched and put them into Avibase to see what IOC taxon it would match them to. As for Peters taxa, I ignored anything which had been synonymized, expecting that using the senior synonym would take care of the matching. Although I might have to revisit that decision.

What I plan to do next is to get the Peters data into my database, where I can match it with IOC. I'm only working at the species level, so there are some messy matches; look at Leucocarbo atriceps/atriventer for example. But once I get those sorted out I will be able to clean up the data.
 
As I was going through the document I do recall noticing a couple of Peters species which had been entirely synonymized over the years, but I didn't keep track of them.
It appears that "a couple" means "upwards of a hundred". Clearly many of them are synonyms which can't be disregarded.
 
