Wednesday, August 27, 2014

#Wikipedia - Professor Hermann Buhl "Leichtathlet"

Mr Buhl died in Tirol while hiking through the Alps. He used to be an athlete of repute and became a professor at the Julius Maximilians-Universiteit.

It is obvious that Mr Buhl was a professor because of his presence in a category. It is less obvious where he studied and what he taught. The text assumes a lot of knowledge about the DDR before such things become clear.

Every Wikipedia has its notability criteria, and the German Wikipedia is no different. Mr Buhl is certainly notable as an athlete, but his career did not end there. He is probably notable as well for the "latter" part of his career. Some would argue that he started to contribute in a meaningful way when he taught at university.
Thanks,
      GerardM

Monday, August 25, 2014

#Wikimedia - "Share in the sum of all available knowledge"

When we are to focus on the available knowledge we have to share, statistics are key. They cut the crap and focus on numbers. Given that information can be made out of data, it is relevant to know how much additional information is available that is easily understood by people who can read English. Two reports are relevant; one shows the number of links in English and the other shows the number of labels in English [1].

At this time there are 757,967 items with an English label and without an article. This is 4.7% of the total number of items Wikidata holds. At the same time 58% of all items do not have a label in English.
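As a rough check on these figures, the total number of items can be estimated from the count and the percentage quoted above. This is only a sketch: both percentages are rounded, so the derived numbers are approximations and not official statistics.

```python
# Back-of-the-envelope check. The 4.7% and 58% figures are rounded,
# so everything derived from them is approximate.
labelled_without_article = 757_967
share_of_total = 0.047               # 4.7% of all items
share_without_english_label = 0.58   # 58% of all items

# If 757,967 items are 4.7% of the total, the total follows by division.
estimated_total = labelled_without_article / share_of_total
estimated_without_label = estimated_total * share_without_english_label

print(f"estimated total items: {estimated_total:,.0f}")
print(f"estimated items without an English label: {estimated_without_label:,.0f}")
```

The estimate comes out at roughly 16 million items in total, of which more than 9 million would lack an English label.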

Not having a label does not mean that we cannot provide meaningful information. The name of a Dutch or Spanish person is for instance perfectly understood; it is typically written exactly the same in English. Reasonator understands this and always presents a label anyway.

It is fairly easy to start sharing this "missing" information. It is already done in many Wikipedias. The suggestion to share more information has been put to all Wikipedias and several "communities" do not think it is a good idea. In effect they prefer an inferior product providing a subset of the information that should be available to all our readers.
Thanks,
      GerardM

[1] It shows the numbers for other languages as well, and the statistics are near real time. It takes a minute for them to be presented to you.

Sunday, August 24, 2014

#Reasonator - A new metric for #Wikimedia

Denny wrote a really good article in the Signpost. It includes a "TL;DR" that I am happy to quote.
TL;DR: We should focus on measuring how much knowledge we allow every human to share in, instead of number of articles or active editors. A project to measure Wikimedia's success has been started. We can already start using this metric to evaluate new proposals with a common measure.
The point Denny makes is great; we aim to enable every human being to share in the sum of all knowledge and we should measure the extent to which we are achieving this goal. When you read the article carefully it does not say Wikipedia, it says Wikimetrics. The point Denny makes is very much that we need to focus on what it takes to bring information to people.

Presenting data that is available to us as information is what Reasonator does. It relies on what is known in Wikidata about articles that exist in any Wikipedia. To make this understood to a person, the number of available statements and the number of available labels for an item are key.

When Wikimetrics is to appreciate the potential of Wikidata and the approach Reasonator takes, it should include three bits of information:
  • the number of statements per item
  • the number of labels per language
  • how items are covered with labels in a language
With such an approach the graph will be substantially different. Not one language covers 50% of all the topics known to Wikidata and consequently the graph will show that there is much more work for us to do. It will also indicate that the amount of information that is available for a public that can read English is much larger and the amount available to people who can only read Gujarati is much smaller.
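The three measures listed above could be computed along these lines. This is a minimal sketch on made-up data; the item identifiers, the data layout and the function names are assumptions for illustration and not how Wikimetrics or Reasonator work.

```python
from collections import Counter

# Hypothetical toy data: a statement count and the labels per language code.
items = {
    "item-1": {"statements": 12, "labels": {"en": "A", "nl": "A", "gu": "A"}},
    "item-2": {"statements": 3, "labels": {"en": "B", "nl": "B"}},
    "item-3": {"statements": 0, "labels": {"nl": "C"}},
    "item-4": {"statements": 5, "labels": {}},
}

def statements_per_item(items):
    """Average number of statements per item."""
    return sum(item["statements"] for item in items.values()) / len(items)

def labels_per_language(items):
    """Number of items that have a label, counted per language."""
    counts = Counter()
    for item in items.values():
        counts.update(item["labels"].keys())
    return counts

def label_coverage(items):
    """Share of all items that have a label, per language."""
    total = len(items)
    return {lang: n / total for lang, n in labels_per_language(items).items()}

print(statements_per_item(items))
print(labels_per_language(items))
print(label_coverage(items))
```

In this toy set no language reaches full coverage, which is the kind of gap such a graph would make visible.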
Thanks,
       GerardM

#Wikidata - Ameyo Adadevoh a physician from #Nigeria

When a Mr Sawyer arrived in Lagos and showed symptoms of ebola, Mrs Adadevoh took control of the situation and thanks to her efforts ebola was largely contained. In the end it did not save her; as a physician in the frontline of the fight against ebola she became a victim herself.

Mrs Adadevoh is another hero of our times. When you google about ebola and Nigeria, there are two things that are of interest; sadly there are the opinion pieces that see a conspiracy in the coming of Mr Sawyer to Nigeria, but more positive is the information about the efforts to contain ebola in Nigeria and what you can do to avoid becoming infected; personal hygiene is key.

There is a call to ensure that hospital staff are immunised. It is quite obvious that no country can really afford to lose key people like Mrs Adadevoh. It is equally obvious that all doctors and nurses who have to deal with ebola patients need to be protected. Without them containing and treating ebola is impossible.
Thanks,
    GerardM

Saturday, August 23, 2014

#Wikidata - the #beta label lister


At the hackathon of #Wikimania2014, work was done on a new version of the label lister. It is a gadget that allows you to edit labels and aliases in other languages. It proved to be an indispensable tool to me. Today I learned that the new label lister is now available.

The most wonderful thing is that it became much more compact; you do not need to click as much anymore and it "just works". In the screenshot you see Mrs Bundschuh, a former member of the Landtag of Bavaria, and as you can see it is trivially easy to add a label in your language.

I hope that functionality like the label lister will make it into a core feature of Wikidata.
Thanks,
     GerardM

Monday, August 18, 2014

#Twitter - #WikiParliaments.. but what about #Wikidata and #Austria?

Twitter advertised several things that I might like. WikiParliaments could be one of them. Today I learned that Othmar Tödling died. He was a member of the "Nationalrat" of Austria. As such he might be very much of interest to WikiParliaments.

Politicians are human too; they die. When they do, it is often noted in a category what function they held. Today I started adding statements for those humans who hold or held the function of parliamentarian in Austria.

My hope is that people who care about parliaments will make these items even prettier and embellish them with even more statements and qualifiers.
Thanks,
     GerardM

Sunday, August 17, 2014

#MediaWiki - #MediaViewer rehashed

Some things are plain stupid, sometimes I am and sometimes someone else is. I filed a bug about my experience of the MediaViewer. For me it is a show stopper; it prevents me from using it easily.

The problem is that Chrome shows a really awful URL for an image with funny characters in its title. When I look at it using the MediaViewer it is bad but it looks fine when I look at it from the Commons page.
  • File:%C3%89cole_normale_sup%C3%A9rieure_de_Paris,_26_January_2013.jpg
  • File:École normale supérieure de Paris, 26 January 2013.jpg
According to the Bugzilla triage I must be stupid because it works; it complies with specifications and indeed, technically, it works. It just stopped working for me.

Several reactions are possible. My choice was to shrug, mutter "it is the user experience, stupid" and get on with my life. Others find it a precursor to the invasion of an evil overlord who does not understand the world and prepare for war.
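The two titles above are the same string in two forms: the first is the percent-encoded UTF-8 version of the second. A small sketch using Python's standard library shows the relationship; it says nothing about how MediaViewer or Chrome handle the URL internally.

```python
from urllib.parse import unquote

encoded = "File:%C3%89cole_normale_sup%C3%A9rieure_de_Paris,_26_January_2013.jpg"

# Decode the UTF-8 percent-escapes, then show underscores as spaces,
# which is how the title is displayed on the Commons page.
readable = unquote(encoded).replace("_", " ")
print(readable)
```

Both forms are technically valid, which is the point made in the triage; only the decoded form is pleasant to read.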

By filing a bug, by posting this blog I have rid myself of my frustrations. I know several developers; I met many of them at Wikimania and I know they are really dedicated and mean well. I also know that such things pass. I am sure someone will see the light or Google will fix Chrome (if that is where the bug lives). In the end I do not look at images that often as a result.
Thanks,
       GerardM

#Wikidata - sources or confidence

At this time Wikidata has more than 36,396,372 statements; these statements are associated with some 15,335,451 items. The majority of these items have fewer than five statements and, even worse, for many items it is not known what they are about.

When you consider the quality of this data, there are two schools of thought. There are those who insist on sources with every statement, and there are those who have confidence in the validity of the data because they know where it came from.

Either way, when you want to assert that a specific approach is superior, it becomes a numbers game, and understanding the relative merits is what it is all about. When something is sourced, you can be confident that it was very probably correct at the time of the sourcing. There is however no certainty that the data remains stable. Confidence can be maintained by regularly comparing the data with what the source has to say.

When the data is regularly compared, it does not matter that much if Wikidata has source information itself. The source is typically one of the Wikipedias and they are said to have sources; this may provide us with enough reasons for confidence. The comparison of data increases this confidence, particularly when multiple sources prove to be in agreement.
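A regular comparison of this kind can be sketched as a simple difference report between two sets of records. The identifiers, property names and values below are made up for illustration, and the function is an assumption about how such a comparison might look, not a description of any existing tool.

```python
def compare(wikidata, source):
    """Return {item: {property: (wikidata_value, source_value)}} for every mismatch."""
    differences = {}
    for item in sorted(wikidata.keys() | source.keys()):
        ours = wikidata.get(item, {})
        theirs = source.get(item, {})
        for prop in sorted(ours.keys() | theirs.keys()):
            if ours.get(prop) != theirs.get(prop):
                differences.setdefault(item, {})[prop] = (ours.get(prop), theirs.get(prop))
    return differences

# Hypothetical records keyed by a shared identifier.
wikidata = {
    "item-a": {"date of death": "2014-06-02", "occupation": "actor"},
    "item-b": {"date of death": None},
}
source = {
    "item-a": {"date of death": "2014-06-02", "occupation": "actor"},
    "item-b": {"date of death": "2014-08-12"},
}

report = compare(wikidata, source)
for item, props in report.items():
    for prop, (ours, theirs) in props.items():
        print(f"{item}: {prop} differs (Wikidata: {ours}, source: {theirs})")
```

Items that agree do not show up in the report, so a short report is itself a reason for confidence.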

Practically, the basic building blocks to start comparing exist. It has been done before by Amir and he produced long lists of differences. Three things are needed to establish new best practices:
  • a well-defined place needs to be found where such reports can be found
  • communities need to understand that it raises confidence in their project
Thanks,
   GerardM

#Wikidata - giving a #category an application

Many #Wikimedia categories have interlanguage links. Obviously these linked categories do not all have the same content. Someone has to add the articles; sometimes it gets done and sometimes it doesn't. Often articles just do not exist.

When the facts that are implicit in what a category is about make it to all the items in all the categories, typically you have a superset in Wikidata. It does not stop there; items in Wikidata may be included that are not in any of those linked categories.

This is all theoretical unless ... unless you can query Wikidata and use the results. Much data has been added to Wikidata based on the content of categories, and queries have been used to identify missing items; this is done using AutoList2. This is one application; it is used by some of the "advanced" users of Wikidata.

What is even more interesting is showing what Wikidata things should be in a category. This is done using Reasonator. At this time for over 690 categories statements are included that define a query. This query is already complex enough that the Wikidata functionality will not be able to express the results.

These queries could be of use to "advanced" Wikipedians because it is a basis for identifying articles that have not been categorised or articles that still need to be written in their Wikipedia. For everyone else it is just interesting; this information exists and it is readily available. It is one way of learning that Wikidata knows for instance about 121,922 politicians.
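The idea can be illustrated with plain sets. The sketch below uses made-up item identifiers and assumes that the category members, the existing articles and the query result are already available; it does not call AutoList2, Reasonator or any Wikidata API.

```python
# Hypothetical data: which items sit in the linked category per Wikipedia,
# and which items have an article there at all.
category_members = {
    "en": {"item-1", "item-2"},
    "de": {"item-2", "item-3"},
    "nl": {"item-3"},
}
articles = {
    "en": {"item-1", "item-2", "item-3"},
    "de": {"item-2", "item-3"},
    "nl": {"item-3"},
}

# Hypothetical result of the query that defines what belongs in the category.
query_result = {"item-1", "item-2", "item-3", "item-4"}

# All linked categories together form a superset of each single category.
superset = set().union(*category_members.values())

def gaps(lang):
    """Return (uncategorised, unwritten) items for one Wikipedia."""
    uncategorised = (query_result & articles[lang]) - category_members[lang]
    unwritten = query_result - articles[lang]
    return uncategorised, unwritten

print(sorted(superset))
print(gaps("en"))
```

In this example the query knows one item that is in none of the linked categories, and the English Wikipedia has one article that is not yet categorised.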
Thanks,
      GerardM

Saturday, August 16, 2014

#Wikidata - application for its long tail

When Lauren Bacall died this week, it was all over the news. When Marjorie Stapp died on June 2, 2014 it was noted in the English Wikipedia only yesterday. Today it is known to Wikidata, and several bits of information were added to the item about Mrs Stapp as well.

Among those statements is her identifier in the IMDB. The IMDB does not know yet about the demise of Mrs Stapp and it is not unlikely that there are more actors and actresses we know about that have died. Providing external sources like the IMDB with an RSS feed of the changes that are made in Wikidata is not hard.
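To give an idea of how little is needed, here is a minimal sketch that turns a list of changes into an RSS 2.0 document with Python's standard library. The change record layout, the feed title and the example.org link are assumptions for illustration; no such feed is known to exist in this form.

```python
import xml.etree.ElementTree as ET

def build_feed(changes):
    """Build a minimal RSS 2.0 feed from a list of change records."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # title, link and description are the required channel elements in RSS 2.0.
    ET.SubElement(channel, "title").text = "Changes to items with an external identifier"
    ET.SubElement(channel, "link").text = "https://example.org/feed"  # placeholder
    ET.SubElement(channel, "description").text = "Recent changes of interest to an external source"
    for change in changes:
        entry = ET.SubElement(channel, "item")
        ET.SubElement(entry, "title").text = f'{change["label"]}: {change["property"]} changed'
        ET.SubElement(entry, "description").text = f'{change["property"]} is now {change["value"]}'
        # The guid is not a URL here, so it is marked as not being a permalink.
        ET.SubElement(entry, "guid", isPermaLink="false").text = change["id"]
    return ET.tostring(rss, encoding="unicode")

# Hypothetical change record, using the date mentioned in the text.
changes = [
    {"id": "change-1", "label": "Marjorie Stapp",
     "property": "date of death", "value": "2014-06-02"},
]

feed = build_feed(changes)
print(feed)
```

An external source could poll such a feed and compare each entry with its own record for the same identifier.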

When we share our information in this way, we gain friends. With these new friends we may do friendly things like noting differences between the data that we hold. Equally important, we add a reason why people might maintain the data that is in Wikidata. As our data gains in application, we will grow and diversify our community.
Thanks,
     GerardM