Sunday, August 31, 2014

#Wikidata - my #workflow enriching Wikidata using tools

As I have other commitments, I do not have the same amount of time to do what I used to do. The workflow I use is now quite stable and dependable so I am happy to publish it. It is fairly easy and obvious. You can do this too.

Objectives are important; mine are:
  • make Wikidata more informative by adding relevant statements
  • provide the basis for further usage of data
My workflow is based on the people who died in 2014, as reported in categories. ToolScript informs me about all items that do not have a date of death. Every line represents an item; typically they are human, but horses and other critters are included as well. I click the Reasonator icon, and the links to articles provide me with the first lines of each article. Typically the dates of birth and death are included. When the text is not in English, I copy it and use Google Translate. From the translated text I copy the dates of birth and death. I click on the Q number in Reasonator and add these dates in Wikidata.

The ToolScript can easily point to 2013 or any other year. Obviously you can make your own script to do whatever.
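A script of your own for this can be quite small. The sketch below is not ToolScript; it is a minimal Python illustration that assumes the items are already loaded as simplified dictionaries. The `claims` layout, the function name and every Q number in the sample data are invented for the example. P31 (instance of), P570 (date of death) and Q5 (human) are the real Wikidata identifiers.

```python
# Minimal sketch: list the items that lack a date of death (P570).
# The item layout is a simplified stand-in for Wikidata data, and all
# Q numbers in SAMPLE_ITEMS are hypothetical.

DATE_OF_DEATH = "P570"
INSTANCE_OF = "P31"
HUMAN = "Q5"

SAMPLE_ITEMS = [
    {"id": "Q900000001", "claims": {"P31": ["Q5"], "P570": ["2014-08-18"]}},
    {"id": "Q900000002", "claims": {"P31": ["Q5"]}},
    # Not a human (hypothetical class id), e.g. a horse or another critter.
    {"id": "Q900000003", "claims": {"P31": ["Q900000009"]}},
]


def missing_date_of_death(items, humans_only=False):
    """Return the ids of items without a P570 claim, in input order."""
    result = []
    for item in items:
        claims = item.get("claims", {})
        if claims.get(DATE_OF_DEATH):
            continue  # a date of death is already present
        if humans_only and HUMAN not in claims.get(INSTANCE_OF, []):
            continue  # skip non-humans when asked to
        result.append(item["id"])
    return result


if __name__ == "__main__":
    print(missing_date_of_death(SAMPLE_ITEMS))
```

Pointing such a script at 2013 instead of 2014 would only change which items are loaded; the filter itself stays the same.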

Once somebody is registered as dead, I look at the article for interesting categories. They can be anything from "Alma mater university x" to "player of Whatever FC". Most interesting are the implied facts that are NOT yet recorded for the dearly departed. Any category may contain hundreds of other items for which we do not yet know said fact. The first thing to do is to document the category; it can be on any wiki. Documenting is done by adding a statement with "is a list of" "human" and a qualifier like "alma mater" "University X". Reasonator will show at most the first 500 entries of the resulting query.
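To make the shape of such a documented category concrete, here is a small Python sketch. It is only an illustration: the dictionary layout and both function names are invented, and the category and university Q numbers are hypothetical. P360 is "is a list of" and Q5 is "human"; I am assuming P69 is the property meant by "alma mater".

```python
# Sketch: describe what a category contains as one statement plus a qualifier.
# P360 = "is a list of", Q5 = human, P69 = assumed to be the "alma mater" property.
# The dictionary layout is invented for illustration; it is not Wikidata's JSON.

IS_A_LIST_OF = "P360"
HUMAN = "Q5"


def category_statement(category_id, qualifier_property, qualifier_value):
    """Build 'category is a list of human', qualified by one property/value pair."""
    return {
        "item": category_id,
        "property": IS_A_LIST_OF,
        "value": HUMAN,
        "qualifiers": {qualifier_property: qualifier_value},
    }


def implied_claim(statement):
    """Return the (property, value) every member of the category is expected to have."""
    ((prop, value),) = statement["qualifiers"].items()
    return prop, value


if __name__ == "__main__":
    # Hypothetical ids for the category and for "University X".
    stmt = category_statement("Q900000010", "P69", "Q900000011")
    print(implied_claim(stmt))
```

The qualifier is what turns a category into a to-do list: every member that lacks the implied claim is a candidate for a new statement.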

When many entries are still missing, Autolist2 is the tool to use. From the Reasonator page of the category, copy the name of the category and the P and Q values to the appropriate spots. Make sure that the right wiki has been selected (en in the example). Consider the depth; depth 0 is safest. Make sure that the WDQ mode is on "AND" and press "Run". This will generate the list that is selected for processing. Check the list and copy the P and Q values to the control box. Click "Process commands" when you feel comfortable with the results. Once the process starts, you will find the changes in the Reasonator page for the item you add statements for; in the illustrated example this is the New Zealand Order of Merit.
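The logic of that run can be sketched with plain sets. This is not Autolist2's code. It assumes that "AND" means the intersection of the category members with the query result, and that the query returns the items still lacking the statement. The function names, the sample Q numbers and the target value are hypothetical; P166 (award received) is a real property.

```python
# Sketch of the "category AND query" idea with plain Python sets.
# Assumption: the query side returns items that do NOT yet have the claim.

def combine_and(category_members, query_result):
    """Items that are in the category and also in the query result."""
    return set(category_members) & set(query_result)


def plan_statements(category_members, query_result, prop, value):
    """Return (item, property, value) tuples to review before processing."""
    selected = combine_and(category_members, query_result)
    # Sort by the numeric part of the Q number for a stable, readable list.
    ordered = sorted(selected, key=lambda qid: int(qid[1:]))
    return [(qid, prop, value) for qid in ordered]


if __name__ == "__main__":
    members = {"Q900000001", "Q900000002", "Q900000003"}  # hypothetical
    lacking = {"Q900000002", "Q900000003", "Q900000004"}  # hypothetical
    for command in plan_statements(members, lacking, "P166", "Q900000020"):
        print(command)
```

Checking the planned list before anything is written mirrors the "check the list" step above; nothing in this sketch edits Wikidata.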

For best results, use the category in the "local language", where most entries are often found, like this example for people who work(ed) at the University of Innsbruck.

With a workflow like this you are more effective. The work is documented and slowly but surely Wikidata becomes truly informative.

Friday, August 29, 2014

#Wikidata - Adolf Butenandt, Nobel laureate, professor and student

For many professors we know in Wikidata which university employs or employed them. This data has been added one category at a time. Often this has been repeated for categories about the same university from different Wikipedias.

At the same time information has been added for the universities where people studied. However, there is an increasing number of professors for whom it is not known where they studied.

Professor Butenandt is a case in point; he studied at the university of Marburg and the university of Göttingen. It is known on one Wikipedia and not on others. Given that categories are linked as well, it is fairly easy to signal missed opportunities.

Thanks to this query by Magnus, we know about 23,351 professors without an alma mater. For Mr Butenandt information has been or will be added and, obviously, there is much more work left to do.

Wednesday, August 27, 2014

#Wikipedia - Professor Hermann Buhl "Leichtathlet"

Mr Buhl died in Tirol wandering through the Alps.  He used to be an athlete of repute and became a professor at the Julius Maximilians-Universiteit.

It is obvious that Mr Buhl was a professor because of his presence in a category. It is not obvious in the same way where he studied and what he taught. The text assumes that the reader knows a lot about the DDR; without that knowledge such things do not become obvious.

Every Wikipedia has its notability criteria, and the German Wikipedia is no different. Mr Buhl is certainly notable as an athlete, but his career did not end there. He is probably notable as well for the "latter" part of his career. Some would argue that he started to contribute in a meaningful way when he taught at university.

Monday, August 25, 2014

#Wikimedia - "Share in the sum of all available knowledge"

When we are to focus on the available knowledge we have to share, statistics are key. They cut the crap and focus on numbers. Given that information can be made out of data, it is relevant to know how much additional information is available that is easily understood by people who can read English. Two reports are relevant; one shows the number of links in English and the other shows the number of labels in English [1].

At this time there are 757,967 items with an English label and without an article. This is 4.7% of the total number of items Wikidata holds. At the same time 58% of the items do not have a label in English.
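These percentages also allow a rough back-of-the-envelope estimate of the totals behind them. The Python sketch below only uses the figures quoted here; because the percentages are rounded, the results are approximations (roughly 16 million items in total, a bit over 9 million of them without an English label), not reported numbers.

```python
# Rough estimate derived from the figures in the text; both percentages are
# rounded, so the outcomes are approximations only.

items_with_label_no_article = 757_967
share_of_all_items = 0.047           # 4.7% of all items
share_without_english_label = 0.58   # 58% of all items

estimated_total = items_with_label_no_article / share_of_all_items
estimated_without_label = estimated_total * share_without_english_label

print(f"estimated total items: {estimated_total:,.0f}")
print(f"estimated items without an English label: {estimated_without_label:,.0f}")
```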

Not having a label does not mean that we cannot provide meaningful information. The name of a Dutch or Spanish person is for instance perfectly understood; it is typically written exactly the same in English. Reasonator understands this and always presents a label anyway.

It is fairly easy to start sharing this "missing" information. It is already done in many Wikipedias. The suggestion to share more information has been put to all Wikipedias, and several "communities" do not think it is a good idea. In effect they prefer an inferior product providing a subset of the information that should be available to all our readers.

[1] It shows the numbers for other languages as well, and the statistics are near real time. It takes a minute for them to be presented to you.

Sunday, August 24, 2014

#Reasonator - A new metric for #Wikimedia

Denny wrote a really good article in the Signpost. It includes a "TL;DR" that I am happy to quote.
TL;DR: We should focus on measuring how much knowledge we allow every human to share in, instead of number of articles or active editors. A project to measure Wikimedia's success has been started. We can already start using this metric to evaluate new proposals with a common measure.
The point Denny makes is great; we aim to enable every human being to share in the sum of all knowledge and we should measure the extent to which we are achieving this goal. When you read the article carefully it does not say Wikipedia, it says Wikimetrics. The point Denny makes is very much that we need to focus on what it takes to bring information to people.

Presenting data that is available to us as information is what Reasonator does. It relies on what is known in Wikidata about articles that exist in any Wikipedia. To make this understood to a person, the number of available statements and the number of available labels for an item are key.

When Wikimetrics is to appreciate the potential of Wikidata and the approach Reasonator takes, it should include three bits of information:
  • the number of statements per item
  • the number of labels per language
  • how items are covered with labels in a language
With such an approach the graph will be substantially different. Not one language covers 50% of all the topics known to Wikidata and consequently the graph will show that there is much more work for us to do. It will also indicate that the amount of information that is available for a public that can read English is much larger and the amount available to people who can only read Gujarati is much less.
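The three numbers listed above are simple to compute once the items are at hand. The Python sketch below is only an illustration on invented sample data; the item layout, the function names and the sample values are assumptions, and it is not part of Wikimetrics or Reasonator.

```python
# Sketch: the three suggested measures, computed on made-up sample items.
from collections import Counter

SAMPLE_ITEMS = [
    {"id": "Q900000001", "labels": {"en": "A", "nl": "A"}, "statements": 5},
    {"id": "Q900000002", "labels": {"nl": "B"}, "statements": 3},
    {"id": "Q900000003", "labels": {"en": "C", "nl": "C", "gu": "C"}, "statements": 0},
    {"id": "Q900000004", "labels": {}, "statements": 4},
]


def statements_per_item(items):
    """Average number of statements per item (0.0 for an empty list)."""
    if not items:
        return 0.0
    return sum(item["statements"] for item in items) / len(items)


def labels_per_language(items):
    """Count, per language code, how many items have a label in it."""
    counts = Counter()
    for item in items:
        counts.update(item["labels"].keys())
    return counts


def coverage(items, language):
    """Fraction of items that have a label in the given language."""
    if not items:
        return 0.0
    return labels_per_language(items)[language] / len(items)


if __name__ == "__main__":
    print(statements_per_item(SAMPLE_ITEMS))
    print(labels_per_language(SAMPLE_ITEMS))
    print(coverage(SAMPLE_ITEMS, "en"), coverage(SAMPLE_ITEMS, "gu"))
```

Comparing `coverage` for English with `coverage` for Gujarati on real data would show exactly the gap described above.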

#Wikidata - Ameyo Adadevoh a physician from #Nigeria

When a Mr Sawyer arrived in Lagos and showed symptoms of ebola, Mrs Adadevoh took control of the situation and thanks to her efforts ebola was largely contained. In the end it did not save her; as a physician in the frontline of the fight against ebola she became a victim herself.

Mrs Adadevoh is another hero of our times. When you google about ebola and Nigeria, there are two things that are of interest. Sadly there are the opinion pieces that see a conspiracy in the coming of Mr Sawyer to Nigeria, but more positive is the information about the efforts to contain ebola in Nigeria and what you can do to avoid becoming infected; personal hygiene is key.

There is a call to ensure that hospital staff are immunised. It is quite obvious that no country can really afford to lose key people like Mrs Adadevoh. It is equally obvious that all doctors and nurses who have to deal with ebola patients need to be protected. Without them, containing and treating ebola is impossible.

Saturday, August 23, 2014

#Wikidata - the #beta label lister

At the hackathon of #Wikimania2014, work was done on a new version of the label lister. It is a gadget that allows you to edit labels and aliases in other languages. It proved to be an indispensable tool to me. Today I learned that the new label lister is now available.

The most wonderful thing is that it became much more compact; you do not need to click as much anymore and it "just works". In the screenshot you see Mrs Bundschuh, a former member of the Landtag of Bavaria, and as you can see it is trivially easy to add a label in your language.

I hope that functionality like the label lister will make it into a core feature of Wikidata.

Monday, August 18, 2014

#Twitter - #WikiParliaments.. but what about #Wikidata and #Austria?

Twitter advertised several things that I might like. WikiParliaments could be one of them. Today I learned that Othmar Tödling died. He was a member of the "Nationalrat" of Austria. As such he might be very much of interest to WikiParliaments.

Politicians are human too; they die. When they do, it is often noted in a category what function they held. Today I started adding statements for those humans who hold or held the function of parliamentarian in Austria.

My hope is that people who care about parliaments will make these items even prettier and embellish them with even more statements and qualifiers.