Saturday, June 24, 2017

#Wikipedia - Sister projects in search results

The Wikipedia Signpost reports that the Discovery team has extended the search results on Wikipedia. What is new is that English Wikipedia now includes results from Wikisource, Wiktionary, Wikiquote and Wikivoyage, and that is indeed welcome news.

There is one puzzling part in the information: "Wikidata and Wikispecies are not within the scope of this feature." It is puzzling because Wikidata search results have augmented search for years in many Wikipedias, including the English Wikipedia, for the people who added the little bit of magic that Magnus provided.

As you can see in the screenshot of the search for Wilbur R. Leopold, an award was conferred on him and the origin of this factoid is the article on the award. Thanks to Wikidata, information is available for Mr Leopold. There are so many references in Wikidata that have no article in a Wikipedia or any other project that from a search perspective it is probably the next frontier.

When wiki links, red links and even black links can be associated with Wikidata items, it becomes even easier to add precision to the search results. Adding these links is the low-hanging fruit for improving quality in Wikimedia projects anyway.
Thanks,
     GerardM


Sunday, June 18, 2017

#Wikidata - John P. A. Ioannidis and his awards

I am a self-confessed award junkie. They are imho important because they are an indication of who is notable and who is less so.

Three awards are associated with professor Ioannidis in Wikidata. One award was also conferred on Hans Rosling and this gives me added confidence in Mr Ioannidis and other recipients of the Chanchlani Global Health Research Award.

Professor Ioannidis throws cold water on much of scientific practice and consequently on its practitioners. One of his papers has the title "Why Most Published Research Findings Are False", and it is inherently a challenge as well to what we write in the Wikipedias and Wikidata.

At Wikidata a wholesale import is happening of papers, science facts and their authors. This is a great idea, particularly when papers that dismiss many of the nonsense papers get a prominent place. The result will be that the Neutral Point Of View gets another twist; it balances what we include with actual science.
Thanks,
     GerardM

Saturday, June 17, 2017

#Wikidata vs #GeoNames - the first to throw a stone

Wikidata has some vocal people vilifying GeoNames. They insist that no data from GeoNames is included in Wikidata because "the quality is so bad". In my last post I wrote down assertions about Wikidata. One of them is that "Never mind how "bad" an external data source is, when they are willing to cooperate on the identification and curation of mutual differences, they are worthy of collaboration".

I wrote an email to Marc Wick, the founder of GeoNames, and with his permission I can publish our mail exchange.

Hoi,
The import of data from GeoNames into Wikipedia has been controversial. People say that the quality of the GeoNames data is not "good enough". It resulted in the deletion of thousands of articles from the Swedish Wikipedia. I am not Swedish, I did not follow their discussions but the problem is it sours collaboration with other parties because "their data might not be 100%".
This happened in the past, I care for the future. In Wikidata we do link to GeoNames (example Almere [1]).
There are several ways in which we can help each other and potentially even benefit from a collaboration. Wikidata is licensed with a CC-0 license and therefore GeoNames can have all our data and do with it as they please.
My initial proposal is for a comparison of the shared data. The data where GeoNames differs from Wikidata is potentially problematic. Concentrating on these differences together will improve both our and your data.
Would you be interested?
Thanks,
       GerardM
       Gerard Meijssen
His answer is everything I could hope for:
Hi Gerard
Thanks a lot for your email. A couple of weeks ago I have started to parse the wikidata extract and look for the matching attributes. Unfortunately I got interrupted and have not yet looked at the result of the parsing. I will continue as soon as I find the time.
The goal is to add the wikidata identifier to the alternatenames table with pseudo language code 'wkdt'. What I have noted so far is that sometimes the geonameids in wikidata go to the wrong concept. For instance going to the city feature when the article is speaking about the administrative division or vice versa. This is one of the things I would like to check before adding the wikidataid as alternatename. GeoNames also has links to wikipedia.
I don't think wikipedia should import all geonames features, not all of them are relevant enough to justify a wikipedia article.
Best Regards 
Not only is there an interest to collaborate; Marc is checking the links in Wikidata referring to GeoNames and as can be expected he finds issues. As I asserted, this is to be expected and collaboration is the only way forward for optimal results.
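A minimal sketch of what concentrating on the differences could look like, assuming both sides can export records keyed by GeoNames id. The record shape, the field names and the sample values are made up for illustration; they are not the actual GeoNames or Wikidata formats.

```python
from typing import Dict, List, Tuple

# Hypothetical record shape: GeoNames id -> {"name": ..., "feature": ...}
Record = Dict[str, str]


def find_differences(
    wikidata: Dict[str, Record], geonames: Dict[str, Record]
) -> List[Tuple[str, str, str, str]]:
    """Return (geonames_id, field, wikidata_value, geonames_value) for every
    shared field that differs between records sharing a GeoNames id."""
    differences = []
    for geonames_id in sorted(wikidata.keys() & geonames.keys()):
        ours, theirs = wikidata[geonames_id], geonames[geonames_id]
        for field in sorted(ours.keys() & theirs.keys()):
            if ours[field] != theirs[field]:
                differences.append(
                    (geonames_id, field, ours[field], theirs[field])
                )
    return differences


# Made-up sample data, for illustration only.
wikidata_sample = {
    "1001": {"name": "Exampletown", "feature": "city"},
    "1002": {"name": "Sampleville", "feature": "city"},
}
geonames_sample = {
    "1001": {"name": "Exampletown", "feature": "administrative division"},
    "1002": {"name": "Sampleville", "feature": "city"},
    "1003": {"name": "Elsewhere", "feature": "city"},
}

for row in find_differences(wikidata_sample, geonames_sample):
    print(row)
```

Only records present on both sides are compared; the mismatch it reports is the city versus administrative division kind of issue Marc describes. The list of differences is the work list both communities could curate together.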
Thanks,
      GerardM

Tuesday, June 13, 2017

#Wikidata some assertions

Wikidata is no different from any community; there are differences of opinion. Everybody has his or her own perspective but there are assertions that can be made that have a more universal resonance.

The assertions below represent the underlying arguments I use in my blog posts and in the discussions I take part in. They are the ones I feel are not necessarily "political" or have a negative impact.
Thanks,
       GerardM
  1. There is no data store without problems, this includes Wikipedia and Wikidata.
  2. The data we hold is best understood by applying set theory. The data in Wikidata consists of many subsets; probably the most valuable subset for the WMF are the interwiki links.
  3. The error rate in each subset can be assessed and is by definition different from the overall Wikidata error rate.
  4. The absence of data often indicates a bias in the data Wikidata holds. A good example is the lack of data relevant to the global south.
  5. Given the huge influx of data from Wikipedia, the biggest imports have been from English Wikipedia and it is one reason for the existing biases in Wikidata.
  6. An absence of data prevents the application of tools. Tools may suggest writing a Wikipedia article, tools may compare data with other sources.
  7. Concentrating on the differences between Wikidata and any other data source is the most effective way of improving the quality of existing data in either data set.
  8. Having an application for the data in Wikidata is the best way for improving the usefulness for a subset of data.
  9. Each contributor to Wikidata works on the data set(s) of his/her own choice; these data sets interact in the whole of Wikidata. This may raise issues and this cannot always be avoided.
  10. Examples of problematic data must be seen in the light of the total of the data set they are part of. Statistically they may be irrelevant.
  11. Never mind how "bad" an external data source is, when they are willing to cooperate on the identification and curation of mutual differences, they are worthy of collaboration.
  12. Wikidata improves continually and as such it is "purrfect" but it will never be perfect.
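
Assertions 2, 3 and 10 can be illustrated with a small worked example. This is only a sketch: the subset names and all the numbers are hypothetical and not measurements of Wikidata.

```python
# Hypothetical subsets: name -> (number of statements, number known to be wrong).
subsets = {
    "interwiki links": (1000, 10),
    "awards": (200, 20),
    "GeoNames links": (300, 30),
}


def error_rate(total: int, errors: int) -> float:
    """Fraction of statements that are wrong; 0.0 for an empty subset."""
    return errors / total if total else 0.0


# Error rate of each subset on its own.
per_subset = {
    name: error_rate(total, errors)
    for name, (total, errors) in subsets.items()
}

# Overall error rate: all errors divided by all statements.
all_statements = sum(total for total, _ in subsets.values())
all_errors = sum(errors for _, errors in subsets.values())
overall = error_rate(all_statements, all_errors)

for name, rate in per_subset.items():
    print(f"{name}: {rate:.1%}")
print(f"overall: {overall:.1%}")
```

With these made-up numbers the large subset has a low error rate and pulls the overall figure down, while the smaller subsets are considerably worse than the overall number suggests. A single overall error rate therefore says little about any one subset.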

Monday, June 12, 2017

#Causegraph, another way of looking at #Wikidata


Causegraph is a tool to visualize and analyze cause/influence relationships using Wikidata. If you have not seen it yet, give it a spin.

Randomly looking at the galaxy of relations, I found a Charles Frédéric Bassenge. He is in Wikidata because he is the father of Pauline Runge, and she is in Wikidata because she has an entry in WikiTree. What amazes me most is the quality of the data for the father and his absence in WikiTree.

Causegraph works on the basis of there being a direct relation between two persons. For Jacob Palis, the doctoral students and doctoral advisers are included and not the other TWAS award winners.

What is really good is that it is regularly updated. It would be even better if it were a Labs tool. This might enable real time updates .. <grin> there is always a wish for more and better </grin>
Thanks,
       GerardM

Sunday, June 11, 2017

How #Wikipedia gets into @Africa


This is a map showing how fiber is getting into Africa. The blind spots are where the Internet does not go. The red lines are where the future for Wikimedia lies.
Thanks,
        GerardM

#Wikidata - Premio Almirante Álvaro Alberto

The Premio Almirante Álvaro Alberto is named after admiral Álvaro Alberto da Mota e Silva. They are both notable for their own reasons.

The award was mentioned in an article on the German Wikipedia for César Camacho. The award was not known to Wikidata and was added. The website of the conferring organisation gives me the impression that it is the "National Council for Scientific and Technological Development" and part of the Brazilian ministry of sciences. When you look for it in Wikidata, it is embarrassing.

The admiral is probably a child of his time. He was military and also a very relevant scientist. As a military man he held the rank of vice admiral and as a scientist he was twice the president of the academy of scientists. He was also very much involved in the Brazilian nuclear program.

When you consider the notability of Brazil, it is astounding how little is known in Wikidata. Many politicians have been added for Brazil; national senators and deputies. 

Brazil is one of the top twenty countries in the world, I think. When you consider any and all of the "lesser" countries, it is obvious that we know even less. When Wikipedia and by inference Wikidata is about the sum of all knowledge, there is a lot of white space where all our tools have no impact.
Thanks,
     GerardM

Saturday, June 10, 2017

#Wikidata - #diversity of #science - Professor Govind Swarup

Professor Swarup received the TWAS prize. The TWAS Prize is an annual award instituted in 1985 by The World Academy of Sciences to recognise excellence in scientific research in the global South. It follows that when attention is given to scientists like Mr Swarup, it should be easy to link to other scientists, particularly those from the global south.

With twenty-two awards Mr Swarup does not disappoint. Many of the awards are from India; one of the conferring organisations, the Indian Science Congress Association, lists 41 awards. Its rule that a scientist can now receive only one award in his lifetime indicates how many scientists are recognised by the ISCA.

Making the TWAS prize winners more complete by adding the awards helps to improve the diversity of scientists. It is not only women who have not been fully recognised; it is also the scientists from the global south.
Thanks,
      GerardM

Monday, June 05, 2017

#Wikimedia - Felix Andries Vening Meinesz and the sum of all #knowledge

Mr Vening Meinesz was an important Dutch scientist. There are a few pointers to his relevance; he was a member of several august bodies and he received many awards, from several countries.

One of the medals is the "Alexander Agassiz Medal". When you look at the English Wikipedia article, you find no red links while many award winners do not have an article. Otto S. Pettersson for instance is/was known to Wikidata but he was not associated with the award. When you google for another awardee, it is most likely that the 1926 award winner was Jacob Bjerknes, not Wilhelm Bjerknes. Even sources get it wrong.

In many ways, awards are rather boring. Getting all the information right is a lot of work and when it is to be written in Wikipedia articles, there are too many Wikipedias and awards for all the awards to get an article in all of them.

When awards are fed to Wikipedia articles from Wikidata, as is done for sources, it becomes a lot more manageable. Increasingly Wikidata knows about more awards for more people. What does it take to reach the necessary tipping point? Which Wikipedia will consider this first?
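
As a rough sketch of how the awards for one person could be pulled from Wikidata, the helper below builds a SPARQL query on the "award received" property (P166) for the Wikidata Query Service. The function name is hypothetical and the item id in the example is only a placeholder; it is not how any Wikipedia does this today.

```python
def awards_query(item_id: str, language: str = "en") -> str:
    """Build a SPARQL query listing the awards (P166) recorded for an item."""
    # Accept only ids of the form Q<digits>, so nothing else ends up in the query.
    if not (item_id.startswith("Q") and item_id[1:].isdigit()):
        raise ValueError(f"not a Wikidata item id: {item_id!r}")
    return (
        "SELECT ?award ?awardLabel WHERE {\n"
        f"  wd:{item_id} wdt:P166 ?award .\n"
        "  SERVICE wikibase:label "
        f'{{ bd:serviceParam wikibase:language "{language}". }}\n'
        "}"
    )


# Placeholder item id, for illustration only.
print(awards_query("Q42"))
```

The resulting text can be pasted into the Wikidata Query Service. A template or Lua module on a Wikipedia would read the same statements directly; the point is only that one statement per award serves every language.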
Thanks,
     GerardM

#Wikidata - a #young face for #science


These are members of the "Jonge Academie". They are Dutch scientists and for this academy to remain young, membership expires after ten years. Similar academies exist in several other countries, such as Pakistan and Belgium.

With politicians riding roughshod over scientific facts, rejuvenation of science is important. The notion that scientists are old is all too easy and it is equally easy to dismiss young scientists for a lack of relevance. Check out these Dutch scientists, they are relevant and will be for a long time.
Thanks,
      GerardM

Saturday, June 03, 2017

#Wikipedia - Bias and the King #Faisal International #Prize

The King Faisal International Prize is an international award recognising five distinct areas. They are: Service to Islam, Islamic Studies, Arabic language and literature, Medicine and Science.

When a Wikipedia is to write in a NPOV way about the King Faisal International Prize, all five categories need to be included. Just listing Medicine and Science and having the article as a "science award" ignores the scientific realities in the other three categories or prevents the inclusion of other theological or literature awards.

This is an unfounded bias and remediation is needed in order to achieve NPOV.
Thanks,
     GerardM

Thursday, June 01, 2017

#Wikipedia - another #German #Award


It is funny in its own way that the only award winners that have no "red" or "blue" link on this award page are Wikipedians; German Wikipedians. They won the 2016 GDCh-Preis für Journalisten und Schriftsteller.

Typically we do not give much attention to our achievements and as such the understated attention can be understood. At Wikidata we need to have an item in order to recognise an award winner. As there was a photo of the award ceremony, it was obvious to add it to the item for these award winners as well.
Thanks,
      GerardM