Tuesday, July 26, 2016

Have #Wikipedia share the sum of available #knowledge

If Wikipedia is to succeed in sharing the sum of all knowledge, it has to first share the sum of available knowledge. To do this Wikipedians have to become more inclusive. They have to realise that Wikipedia is not about them but about its readers.

Typically the question "What do readers want?" is answered by what readers find. This answer has one flaw: it assumes that Wikipedia includes what people seek and it forgets what people seek and do not find. This is a lost opportunity on many levels. To start with, Wikipedia is not singular and a subject may exist in another language. As we do not know what is missed, we do not know what to write to satisfy an existing demand. Finally, more and more subjects do not even have a Wikipedia article although information about them is available in other projects.

A partial solution to these issues has been around for a long time. It extends search by adding results from Wikidata. It allows you to find data in any script from any project. When there is no article, it shows information using the Reasonator. It is relatively easy to revive this and it will make even more sense when its results are counted as positive results.

Once Wikipedians consider Wikidata as a tool, they will find that both red links and wiki links may link to Wikidata items. Typically they are the same links for the same subject in any language. This is relevant to editors because it is one way to clarify what links exist to an article, and it is only one step further to annotate them as statements in Wikidata and thereby document such links. Editors will find a lot of erroneous links, and fixing them will improve overall quality.

The good news is that the links between wiki links and Wikidata items already exist. What is lacking is a process to verify that these wiki links are good. Adding links to statements for red links is technically not that hard. It will add some turmoil at the Wikidata end; many items will be added and will have to be merged eventually. One benefit of this approach is that it is not necessary for everyone to collaborate, yet it will benefit the people that matter most: all the readers of all the Wikipedias.
Thanks,
      GerardM


Saturday, July 23, 2016

#Wikipedia - #GMO controversy as a red herring

#Wikipedia has/had a big discussion on the safety of GMO food. When you read what the Signpost has to say, it is only about whether it is safe for people to eat this stuff.

The problem is that many promises have been made and this is only one issue, not even the most relevant one. Read the article "20 years of failure" by Greenpeace or read its rebuttal to what some Nobel Prize winners had to say.

Whether it is safe to eat is only one question. Whether it will do us any good is more relevant. It does not bring us a more reliable food supply. It will not bring us more resilience against climate change and it is very much in doubt that "golden rice" actually brings additional vitamins, while a balanced diet does.

The important point of Greenpeace is that it backs its assertions with science. It is not in it for the money and its aim? A world that we can live in.
Thanks,
       GerardM

Monday, July 18, 2016

#Wikipedia - Dr Mary Meeker and SOI testing

Mrs Meeker and her husband Robert Meeker worked on a system used in education. She is known for applying Guilford's Structure of Intellect theory ("SI") to creating assessments and curriculum materials for use in teaching children and adults. The premise of SI is that intelligence comprises many underlying mental abilities or factors, organized along three dimensions: Operations (e.g., comprehension), Content (e.g., semantic), and Products (e.g., relations). If you are interested, read the article.

The article compared her work to the debunked Myers Briggs Type Indicator. This is something we should not do. The article on Mrs Laurie Helgoe provides all the arguments needed to restrict information on that indicator to that article. It is not best practice to use tools that are ambiguous in their results, and therefore using the indicator in a comparison is not in the interest of our readers.
Thanks,
      GerardM

Sunday, July 17, 2016

#Wikipedia - notability of Mrs Laurie Helgoe

When popular knowledge gets debunked, it makes for notability. Mrs Helgoe debunked the Myers Briggs Type Indicator. It is used a lot to classify human personality traits, even by those who should know better.

It is quite something when research shows how wrong popular methods are. Instead of representing 25-30% of the population, introverts make up 57% of the population. It means that Myers Briggs is off by roughly 100%; the real share is about twice the assumed one.

The critique of the article for Mrs Helgoe has it that it is an orphan; no articles link to it. Having read the article, it is more valid to find fault with the Myers Briggs article; it does state that the method is not valid but it more or less glosses over that fact.

The problem with the Myers Briggs article is that it attempts to explain the method used, a method that is invalid.
Thanks,
     GerardM

#Reasonator - the perspective on #Wikidata people do not get

#Wikidata is where Wikimedia data lives. It started with a big service to Wikipedia: it centralised its interwiki data and this was a huge step forward in its quality. There is still a lot of work being done on improving it even further because many of the remaining problems need a different perspective.

The next official challenge is to provide data to infoboxes. This problem is utterly different from the challenge of replacing interwiki links. It is impossible to import all the data from infoboxes all at once and start improving. The quality of the data in infoboxes is worse but that is not the problem.

So people have imported oodles of data and the quality is as expected: poor but improving. One problem is that all the work is happening at Wikidata and it does not transfer to Wikipedia. There is not even an official way to have a good look at the data available at Wikidata. The unofficial tool is Reasonator; it is currently broken, which is why I am reflecting.

Reasonator provides an intelligible perspective on the data of an item. It makes many problems "obvious". It shows imported statements and it shows all the references to the item that is shown. It allows you to see all (with a maximum of 500) statements that share common properties.

With a functional Reasonator, many people work on data from Wikipedia with a Wikidata perspective. If Wikidata is to fulfil its promise of considerably improving the quality of Wikipedia's data, the first thing to do is change objectives and perspective. The perspective could be Wikipedia based and the objective is not replacing data in infoboxes but quality. The good thing is that it is actually possible to achieve this.

A few observations: all wikilinks are in effect links between Wikidata items. Many of the links indicate that an article "needs" to be in a category and consequently this can be automated.

Why do this? When people look at all the wikilinks with a Wikidata perspective, it will make a lot of faulty links obvious. For instance, a painter of the 16th century did not receive a 20th century award. Quality will improve. As more statements and possibly items are created, it will affect every article about the same and related topics.
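Such a check is easy to picture in code. The sketch below is only an illustration under assumptions: the `Person` and `AwardStatement` classes, their field names and the sample data are hypothetical, and none of it reflects how Reasonator or Wikidata actually store or check data.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Person:
    name: str
    year_of_death: Optional[int]  # None when unknown or still alive


@dataclass
class AwardStatement:
    person: Person
    award: str
    year: int  # year the award was conferred


def suspicious_awards(statements: List[AwardStatement]) -> List[AwardStatement]:
    """Return award statements dated after the recipient's death."""
    return [
        s for s in statements
        if s.person.year_of_death is not None and s.year > s.person.year_of_death
    ]


# Hypothetical sample data, invented purely for this example.
painter = Person(name="Example Painter", year_of_death=1570)
scientist = Person(name="Example Scientist", year_of_death=None)
statements = [
    AwardStatement(painter, "Example Award", 1985),    # faulty link
    AwardStatement(scientist, "Example Award", 2015),  # plausible
]

flagged = suspicious_awards(statements)
for s in flagged:
    print(f"{s.person.name} died in {s.person.year_of_death} "
          f"but is linked to {s.award} ({s.year})")
```

Posthumous awards do exist, so a hit from a check like this is a reason to look at the link, not proof that it is wrong.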

It needs only one thing: a Reasonator-like view of the data from a Wikipedia point of view.
Thanks,
      GerardM

Thursday, July 14, 2016

#Wikidata - Virginia Berninger; Samuel Torrey Orton award 2015

The Samuel Torrey Orton award is conferred by the International Dyslexia Association. It is named after Samuel Orton who was a pioneer in the field of dyslexia.

Mrs Berninger was added to Wikidata because she is the 2015 recipient of the award. It is my intent that Wikidata slowly but surely knows about the more recent award winners, one at a time. It so happened that two of my projects intersected; adding information about female psychologists and awards. Mrs Margaret J. Snowling received the award and this bit of data was added.

My notion of quality for Wikidata is that items need their statements and that more links are better. This allows for all kinds of statements: linking awards to the conferring organisation, to the website of an award or an organisation, and to other awardees.

The funny thing is that adding Mrs Berninger may encourage Wikipedians to write an article about her or at least add her to the list of award winners :)
Thanks,
     GerardM

Monday, July 11, 2016

#Wikidata - Margaret D. Foster - a #female #scientist

I was asked to blog about Mrs Foster. The argument was: "This article missed all the points why Mrs Foster is notable". One great feature of the improved article is a picture that was lovingly restored by Adam Cuerden.

Well, to be honest, I remember a presentation by Rosie Stephenson-Goodknight where she argued that the first step to get some gender balance is to write an article, warts and all. It does not have to be perfect; the least it can do is be there and invite scorn and improvements.

This sentiment is part of the original Wikipedia ethos; it is good to have stubs and red links. It is good to have a start to improve upon. In this sentimental spirit I improved the data on Mrs Foster on Wikidata a bit. I used Autolist to add the content of a few categories and I added some universities she had attended.

So yes, the article has improved and it is exactly why both Wikipedia and Rosie are a success.
Thanks,
      GerardM

Saturday, July 09, 2016

#Wikidata - Bródy Sándor-díj

When #Wikidata has really succeeded, it includes all the data of all the Wikipedias. The Sándor Bródy Prize is known on three Wikipedias and it is reasonable that the Hungarian Wikipedia has the most information.

The last known winner, Gábor Kálmán, won the prize in 2012. Currently it is a red link. There is no information about who won in later years and my Hungarian is not good enough to find out whether the prize was conferred.

All this came out of a recent idea that in order to improve the quality of Wikidata for awards, we should add all the winners of awards for 2015. Lydia suggested asking on Twitter for a query and both Magnus and Wikidatafacts provided a SPARQL query. For the Sándor Bródy Prize no winners were known; this was remedied with the "Linked Items" tool. As the objective was to only add the last winner, Mr Kálmán and the date for 2012 were added. There are some 13,881 awards known without a 2015 winner.
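The idea of listing awards without a winner for a given year can be sketched in a few lines of Python. This is a minimal illustration and not the SPARQL query that Magnus or Wikidatafacts provided; the function name `awards_missing_winner` and the data layout are assumptions made for this example, and the two records are the ones mentioned in these posts.

```python
from typing import Dict, List, Tuple

# Award name -> list of (winner, year) pairs. The layout is hypothetical;
# the two entries are the ones mentioned in these blog posts.
AwardData = Dict[str, List[Tuple[str, int]]]


def awards_missing_winner(awards: AwardData, year: int) -> List[str]:
    """Return the awards that have no recorded winner for the given year."""
    missing = []
    for name, winners in awards.items():
        years = {won_in for _winner, won_in in winners}
        if year not in years:
            missing.append(name)
    return sorted(missing)


known_awards: AwardData = {
    "Sándor Bródy Prize": [("Gábor Kálmán", 2012)],
    "Samuel Torrey Orton award": [("Virginia Berninger", 2015)],
}

missing_2015 = awards_missing_winner(known_awards, 2015)
print(missing_2015)
```

Run against the full set of awards, a list like this is what gives a number such as the one above, and the number shrinks as winners are added.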

The objective for the Sándor Bródy Prize has not been achieved. However, the quality of the data has improved considerably. To make it as good as the information on the Hungarian Wikipedia, dates have to be added and two items have to be added to fill in for existing red links.

The point of all this is that it is possible to quantify a lack of data in Wikidata and by inference a lack of quality. As time goes by, people can use these queries as a tool to make improvements or people will just add data and as a consequence the quality will improve. Either way it is obvious that it takes time and effort to get the desired quality. However, on a micro level, it is possible for Wikidata to be better than any of the other projects because its data for a specific award is better. For the Sándor Bródy Prize all it takes is two items and a few dates.
Thanks,
      GerardM