Latest IOC Diary Updates

Another suspicious bunch:
Code:
 7496 | Accip | Lophospiza trivirgata niasensis Mayr, (1949)    -- 4x: should be (Mayr, 1949)
 7497 | Accip | Lophospiza trivirgata javanica Mayr, (1949)     
 7498 | Accip | Lophospiza trivirgata microsticta Mayr, (1949) 
 7499 | Accip | Lophospiza trivirgata palawana Mayr, (1949)

These three "Authority" entries have no comma:
Code:
  id  |    Genus     | Species (Scientific) | Subspecies |      Authority   
------+--------------+----------------------+------------+---------------------
 3760 | Heteroscenes |                      |            | Cabanis & Heine 1862
 3762 | Caliechthrus |                      |            | Cabanis & Heine 1862
 8338 |              |                      | cnephaeus  | Deignan 1950
                   cnephaeus is: Otus lettia cnephaeus
(id is the row in the spreadsheet master_ioc_list, 14.2)

Code:
  id  |  X | br |                                  Species                                
------+----+----+---------------------------------------------------------------------------
 8829 |    |    | Trogon chrysochloros muriciensis Dickens, Bitton, Bravo & Silviera, 2021  --> Silviera should be Silveira

https://academic.oup.com/zoolinnean/article/193/2/499/6161254?login=false
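For anyone who wants to reproduce the two format checks above outside a database, here is a minimal sketch in Python. It assumes a well-formed authority is either `Name(s), YYYY` or `(Name(s), YYYY)`; real data may contain other legitimate forms that this pattern does not cover. The `classify` function and the `SAMPLES` list are hypothetical names, and the sample strings are copied from the rows above, plus the corrected form `(Mayr, 1949)`.

```python
import re

# Assumed well-formed shapes: "Name, 1949" or "(Name, 1949)".
WELL_FORMED = re.compile(r"^(?:[^()]+, \d{4}|\([^()]+, \d{4}\))$")


def classify(authority):
    """Return a short label describing the format of an authority string."""
    if WELL_FORMED.match(authority):
        return "ok"
    if re.search(r"\(\d{4}\)$", authority):
        # e.g. "Mayr, (1949)": only the year is inside the parentheses
        return "parenthesis around year only"
    if re.search(r"[^,] \d{4}\)?$", authority):
        # e.g. "Deignan 1950": year follows the name without a comma
        return "no comma before year"
    return "other"


# Sample strings taken from the rows quoted in this post.
SAMPLES = [
    "Mayr, (1949)",
    "(Mayr, 1949)",
    "Cabanis & Heine 1862",
    "Deignan 1950",
    "Dickens, Bitton, Bravo & Silviera, 2021",
]

for text in SAMPLES:
    print(f"{classify(text):32} {text}")
```

Note that the last sample passes the format check: a misspelt author name such as Silviera needs a different kind of test, for example a comparison against the source publication.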

 
Thanks for these and your previous findings. All needed will be fixed in 15.1.
 
Cool stuff. How are you doing this, e.g. is it a Bash script?

Suggest you might develop this into a script which can run all tests against the list each time (i.e. rather than doing it ad hoc). I'm sure you don't but shout if you need help to do that.
 
Indeed, I already re-used a few tests from earlier versions. Also, my downstream programs, while not necessarily tests, do display hiccups on faulty data which will set me investigating. Tests would be more useful if run by IOC, but that would mean Excel and Windows, neither of which I have. Maybe I could write TAP tests but it's really overkill, the data is pretty good and I know it well.

I simply load the master_ioc_list*.xlsx spreadsheet, via a perl export to csv (and a bit of bash too) into postgres, only adding 1 column for an indispensable sequence number (we must have the order fixed). Now I have the full power of postgres SQL to query the data.

Not for the faint-hearted, I suppose, but anyone who wants to try can get my bash/perl/sql programs (required: perl, bash, postgresql).
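As a rough illustration of the load step described above (CSV in, one extra sequence column, then SQL queries), here is a minimal sketch. It is not the perl/bash/postgres tooling mentioned in this post: it swaps in Python with the SQLite module from the standard library, purely so that it runs without a database server. The column names, the `load` function and the inline `CSV_TEXT` are assumptions for illustration; the three data rows are taken from the table earlier in the thread.

```python
import csv
import io
import sqlite3

# Hypothetical stand-in for a CSV exported from the spreadsheet;
# the column names are assumed, not the real spreadsheet headers.
CSV_TEXT = """genus,subspecies,authority
Heteroscenes,,Cabanis & Heine 1862
Caliechthrus,,Cabanis & Heine 1862
,cnephaeus,Deignan 1950
"""


def load(csv_text):
    """Load CSV text into an in-memory table, adding a sequence number."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ioc ("
        "seq INTEGER PRIMARY KEY, genus TEXT, subspecies TEXT, authority TEXT)"
    )
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [
        (seq, r["genus"], r["subspecies"], r["authority"])
        for seq, r in enumerate(reader, start=1)  # seq fixes the file order
    ]
    conn.executemany("INSERT INTO ioc VALUES (?, ?, ?, ?)", rows)
    return conn


conn = load(CSV_TEXT)

# Example query: authorities that contain no comma at all.
no_comma = conn.execute(
    "SELECT seq, authority FROM ioc WHERE authority NOT LIKE '%,%' ORDER BY seq"
).fetchall()
print(no_comma)
```

The `seq` column plays the role of the added sequence number: it records the order in which rows appear in the file, independently of any numbering contained in the data itself.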
 
Indeed, I already re-used a few tests from earlier versions. Also, my downstream programs, while not necessarily tests, do display hiccups on faulty data which will set me investigating. Tests would be more useful if run by IOC, but that would mean Excel and Windows, neither of which I have.
Not necessarily. (I usually save as CSV and then use a foreign data wrapper to access it in PG.)

Agreed, it would be more useful if done by IOC. The obvious next step might be to consolidate the tests into a postgres function or procedure, and/or translate them to SQLite + R (RSQLite).

Do we have IOC people listening in?

Edit: it might be possible to access the Excel file directly using ogr2ogr or a similar approach (i.e. without CSV export/PG import).
 
postgres' fdw works too of course, but you can't have indexes (so things will be slow) and more importantly no sequence (which I deem necessary to force the order - and I wouldn't want to rely on the contained IOC-provided numbers, after all those are part of what should be tested).

I don't think direct access from postgres of .xlsx works well enough yet (bound to be slow too). Admittedly haven't tried it for a while (a year or two).

A SQLite version (import, test, report) might be feasible but for the moment my data-sniffing in postgres, and sending a few messages to IOC/David Donsker (by email or via this forum) seems best to me. After all SQLite is not the powerhouse that is postgres.

But it's good you brought up systematic testing. I'll give it some thought. Let's also not forget that nobody asked for it :).
 
We're getting into techie weeds here, but: when you have 100 million rows, think SQL. When you have 100 thousand rows, think "load it all into memory and write simple code in your preferred language". There's no real reason to use databases and SQL queries for a tiny dataset like this. (I've been a software engineer of sorts - amateur then professional - since the 80s, so I remember when this wasn't true, but today you've got gigabytes of memory and for a <1GB dataset databases are just getting in the way.)
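To make the in-memory idea concrete, and to tie it to the earlier suggestion of running every test on each release, here is a minimal sketch of a check runner in plain Python. Everything in it is hypothetical: the row layout (sequence number, name, authority), the three checks, and `MAX_YEAR`, which is assumed to be the release year of the list. Row 4 is an invented bad row added only so that the checks have something to report. The lower bound of 1758 is the starting point of zoological nomenclature.

```python
import re
from collections import Counter

MAX_YEAR = 2024  # assumption: release year of the list being checked

# Hypothetical rows: (sequence number, name, authority).
ROWS = [
    (1, "Heteroscenes", "Cabanis & Heine 1862"),
    (2, "Caliechthrus", "Cabanis & Heine 1862"),
    (3, "Otus lettia cnephaeus", "Deignan 1950"),
    (4, "Otus lettia cnephaeus", "Deignan,  1950 "),  # invented bad row
]


def check_whitespace(rows):
    """Flag rows with leading, trailing or doubled spaces in any field."""
    return [
        (seq, "stray whitespace")
        for seq, name, auth in rows
        if any(field != " ".join(field.split()) for field in (name, auth))
    ]


def check_duplicate_names(rows):
    """Flag every row whose name occurs more than once."""
    counts = Counter(name for _, name, _ in rows)
    return [(seq, "duplicate name") for seq, name, _ in rows if counts[name] > 1]


def check_year(rows):
    """Flag authorities without a four-digit year in the plausible range."""
    findings = []
    for seq, _, auth in rows:
        match = re.search(r"\d{4}", auth)
        if not match or not 1758 <= int(match.group()) <= MAX_YEAR:
            findings.append((seq, "implausible year"))
    return findings


CHECKS = [check_whitespace, check_duplicate_names, check_year]


def run_all(rows):
    """Run every registered check and return the findings sorted by row."""
    findings = []
    for check in CHECKS:
        findings.extend(check(rows))
    return sorted(findings)


for seq, problem in run_all(ROWS):
    print(seq, problem)
```

Adding a test then means writing one more function and appending it to `CHECKS`, so the whole set can be rerun against each new version of the list.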
 
postgres' fdw works too of course, but you can't have indexes (so things will be slow) and more importantly no sequence (which I deem necessary to force the order - and I wouldn't want to rely on the contained IOC-provided numbers, after all those are part of what should be tested).

I don't think direct access from postgres of .xlsx works well enough yet (bound to be slow too). Admittedly haven't tried it for a while (a year or two).
Well, you only need to use FDW to get the data into PG. Since these are quite small tables, reading them into temp tables would work if you need speed or sequences.
A SQLite version (import, test, report) might be feasible but for the moment my data-sniffing in postgres, and sending a few messages to IOC/David Donsker (by email or via this forum) seems best to me. After all SQLite is not the powerhouse that is postgres.
Just thinking about the ability for Windows-based people to roll their own. RSQLite might be less daunting. R + PG might be more daunting, as the PG installation is more involved. I think a standalone, no-install PG is OK, but IIRC it was a bit painful to set up, and I can't remember how easy it was to get it to talk to R and pgAdmin (i.e. all of these on Windows).
But it's good you brought up systematic testing. I'll give it some thought. Let's also not forget that nobody asked for it :).
"Nobody asked for it", but presumably they are interested in improving quality. I assume they act on your feedback? The next logical step is something they can run themselves; it's just a question of how to make this as painless as possible. PG is great but it has a steep learning curve, especially if you don't know SQL. An R Markdown (RMD) file, where you just have to change a few parameters (like the path to the file) in a pre-prepared script and press "render", would be much easier.
 
We're getting into techie weeds here, but: when you have 100 million rows, think SQL. When you have 100 thousand rows, think "load it all into memory and write simple code in your preferred language". There's no real reason to use databases and SQL queries for a tiny dataset like this. (I've been a software engineer of sorts - amateur then professional - since the 80s, so I remember when this wasn't true, but today you've got gigabytes of memory and for a <1GB dataset databases are just getting in the way.)
Indeed. Perhaps R + DuckDB, or even just pure R (I find its regex a bit painful). R is also surprisingly slow sometimes on small datasets like these.
 
Maybe we shouldn't inflict our technical pipedreams on an uninterested party :)

I for one will just keep sending IOC/David a message when I see something amiss in the data. I don't think my wizardry is transmissible to an ornithologist, or even necessarily to an ornithologists' organisation.
 
Much appreciated!!
 
There are two records with (Leotaud, 1866). I think these should be (Léotaud, 1866), as AviBase has it; it also links to the source document where the title page removes any doubt:
Code:
  id   |             Species              |    Authority  
-------+----------------------------------+-----------------
 12858 |   Dendroplex picus altirostris   | (Leotaud, 1866)   --> 2x should be (Léotaud, 1866)
 15458 |   Cnemotriccus fuscatus cabanisi | (Leotaud, 1866)
 
Quick question, where can I find the source/material for things such as WGAC 1136? I would like to know what is behind the split of Mongolian Gull for example. I can't really find anything relating to any publications etc on the WGAC website. Perhaps someone can point me in the right direction.
 
My understanding is this is not available for public appraisal yet, but will all be published when the official WGAC checklist is made available, in something like a year or so from now.
 
The link to the "full ssp" version of the list doesn't seem to be working at the moment. All the others downloaded from Master Lists – IOC World Bird List ok but not the ssp version.
I can see no reason why we should have lost the link. We'll check.

The link currently points to https://www.worldbirdnames.org/IOC_Names_IOC_Names_File_Plus-14.2_full-ssp.xlsx
The file is at https://www.worldbirdnames.org/IOC_Names_File_Plus-14.2_full-ssp.xlsx

EDIT - the problem appears to have been corrected now.
 
My understanding is this is not available for public appraisal yet, but will all be published when the official WGAC checklist is made available, in something like a year or so from now.

Thanks for the info. That's sad, so we'll just have to suck it up and accept all these 'secret' decisions! 🙃
 
