Thursday, September 08, 2016

#Wikimedia - the need for #scepticism

It is all over the news: another psychology study debunked. With two thirds of the repeated studies being debunked, much of the literature of psychology is no longer valid. The source for the article I read is Mr Eric-Jan Wagenmakers, professor at the University of Amsterdam.

The NWO, the Netherlands Organisation for Scientific Research, is providing 3 million euro to repeat key research. The problem is that science is in love with what is new and with quick results. Three million is at best a start.

When science cannot be relied on, collaboration with scientists and universities easily becomes controversial. The programs taught inherently have a point of view, and a conflict of interest is often easily established. Consider: when doctors prescribe substances that are FDA approved, it seems obvious that these substances have a positive effect on patients. Then consider that we have a Wikipedian in Residence at Cochrane, which makes its reputation by debunking much of the use of such substances. We provide end user information, and it seems obvious that just repeating the list of FDA approved substances without further information is not at all in our users' best interest. It is even likely that we are liable for misinformation in several jurisdictions.

There is a need to be sceptical about sources. It is important that we not only improve the technology behind our sources; we also need the ability to mark information as debunked and have that marking filter through our projects and into the information we provide. Remember, debunked is not a POV; it comes with sources of its own.
Thanks,
       GerardM

Sunday, September 04, 2016

#Diversity - A Women's Hall of Fame

Wikipedia has a category of some 40 Women's Halls of Fame. They list women from the past and the present who are seen as exemplary. For all the women who have an English article there is now a statement indicating that they are seen as such.

For many women who are on these lists there is no article. Obviously, when the objective is to have quality articles on notable women, it is good to have lists of articles that could be written.

There are such lists, and the best thing is that there is some form of automated maintenance. The Women in Red project has such lists. Many of their lists find their basis in Wikidata, and it is therefore possible to add people to their lists by adding key data.

All the women who have articles are now known as such. The next thing is to add the missing articles, the red links. So far I have added items for them one by one and stated what they are known for. Obviously this is a stub. More information is needed to state what they are known for, where they lived, why they are notable. It is not only how you enrich the data, it is also how you increase diversity.
Thanks,
      GerardM

#Wikidata - the conflict of interest in medical information

According to the Clinical Evidence handbook, 12% of the 2500 substances and treatments most prescribed by doctors are not proven effective. There is a massive conflict of interest when unsubstantiated facts are allowed in Wikidata. Arguments like "it is NPOV" are used to defend the practice, or that "it is harmful for patients" when they can find out that a substance is no better than a placebo but does have negative side effects.

When an external source knows about a substance, it is fine to link to that source. This is not the same as importing the data wholesale, particularly when the data is so obviously and categorically problematic.

The Wikimedia Foundation has a responsibility, and it does not lie in indicating what substances are prescribed. When we are to include information, it is not on the basis that it has been approved for use but on the basis that it is actually proven to be beneficial. An error rate of 12% on such vital information is not acceptable.
Thanks,
      GerardM

Sunday, August 28, 2016

#Wikidata - La Galería de las Mujeres de Costa Rica

#Marketing is something the #Wikimedia Foundation does not do. That does not mean that concepts like KPI are foreign to the WMF. Take the list from the English article "La Galería de las Mujeres de Costa Rica": the women listed are "women who have broken gender stereotypes and advanced human rights principles".

A lot of effort goes into fighting for a diverse Wikipedia where women are given proper attention. If I were a marketing man, I would say that lists like this provide pointers to people who want to help. I would be happy with a list that shows all the people who currently have an article, and I would be ecstatic with an auto-updating list that shows all the missing articles.

The funny thing is that technically it is not that hard to produce. It is not even that hard to include the technology in MediaWiki, but it takes a marketing man to drive the point home that you have to engage people, and that it shows the quality of a Wikipedia project when we know where we are lacking and where we should concentrate.
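
To illustrate how little is needed, here is a minimal sketch in Python of such an auto-updating list of missing articles. It only builds a SPARQL query for the Wikidata Query Service and reads the labels from a result in the standard SPARQL JSON format. The function names are made up for this example, and it assumes that inclusion in a hall of fame is recorded with the property "award received" (P166), which may not be how a given list is modelled.

```python
import re
from typing import Dict, List


def missing_articles_query(hall_of_fame_qid: str,
                           lang: str = "en",
                           property_id: str = "P166") -> str:
    """Build a SPARQL query for people linked to a hall of fame item
    who have no article on the Wikipedia of the given language.

    property_id defaults to P166 ("award received"); this is an
    assumption about how the list is modelled in Wikidata.
    """
    if not re.fullmatch(r"Q[1-9][0-9]*", hall_of_fame_qid):
        raise ValueError("expected a Wikidata item id such as Q42")
    # The prefixes wd:, wdt:, schema:, wikibase: and bd: are predefined
    # on the Wikidata Query Service, so none are declared here.
    return f"""
SELECT ?person ?personLabel WHERE {{
  ?person wdt:{property_id} wd:{hall_of_fame_qid} .
  FILTER NOT EXISTS {{
    ?article schema:about ?person ;
             schema:isPartOf <https://{lang}.wikipedia.org/> .
  }}
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "{lang}". }}
}}
ORDER BY ?personLabel
""".strip()


def labels_from_results(results: Dict) -> List[str]:
    """Read the labels from a SPARQL JSON result document."""
    return [row["personLabel"]["value"]
            for row in results["results"]["bindings"]]
```

Because the query runs against live data, the list changes whenever an article is written or an item is added; nobody has to maintain it by hand.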
Thanks,
     GerardM

Tuesday, August 23, 2016

#Wikidata - Colorado Women's Hall of Fame

There is a continuous effort underway in #Wikipedia to celebrate notable women. When women are seen as a role model, it is obvious that they deserve attention.

The Colorado Women's Hall of Fame is an organisation that celebrates women and every year 10 more women are included. The article on the organisation includes a list and it includes many red links. So more can be done, not only in Wikipedia but also in Wikidata.

As Wikidata is maturing, SPARQL is now of sufficient quality that many of the tools developed by Magnus are transitioning to SPARQL. This takes time, and in the meantime some tools are discontinued or no longer fully function. Linked Items is one such tool. It creates a list of the items that are found in a Wikipedia text. It is ideal when a text based file full of wiki links exists. It is just a matter of copying in the links and it will generate a list of Wikidata items for you. It is then necessary to restrict the items that are used, and for this it was possible to use WDQ, the engine that could when SPARQL for Wikidata was a distant dream. Sadly it does not work anymore.

A solution is to take the list of items and copy it to PetScan, the tool Magnus favours. It uses SPARQL and it is something of a Swiss army knife for data. When you are used to earlier tools like Autolist, many of your assumptions are wrong and it takes time to discover how the tool works. It does work, and that is why there are now a large number of women who are known to be in the Colorado Women's Hall of Fame.
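
The first step of a tool like Linked Items, collecting the link targets from a piece of wikitext, can be sketched in a few lines of Python. This is not how Magnus's tool is implemented; it is a simplified illustration with a made-up function name, it only understands plain `[[...]]` links, and the set of skipped namespaces is an assumption.

```python
import re
from typing import List

# Matches [[Target]], [[Target|label]] and [[Target#Section|label]];
# only the target before any "|" or "#" is captured.
WIKILINK = re.compile(r"\[\[([^\[\]|#]+)[^\[\]]*\]\]")

# Namespaces that do not point at articles (assumed, not exhaustive).
SKIPPED_NAMESPACES = {"file", "image", "category"}


def link_targets(wikitext: str) -> List[str]:
    """Return the distinct article titles linked from the text,
    in order of first appearance."""
    seen = set()
    titles = []
    for raw in WIKILINK.findall(wikitext):
        title = raw.strip().replace("_", " ")
        if not title:
            continue
        prefix, sep, _ = title.partition(":")
        if sep and prefix.strip().lower() in SKIPPED_NAMESPACES:
            continue
        # On Wikipedia the first letter of a title is not case sensitive.
        title = title[0].upper() + title[1:]
        if title not in seen:
            seen.add(title)
            titles.append(title)
    return titles
```

The resulting titles still have to be matched to Wikidata items, and the red links among them will have no item to match.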
Thanks,
      GerardM

Sunday, August 14, 2016

#Wikidata - #quality is not abstract

There is a new "Request for Comments" on quality for Wikidata. It is an attempt to describe quality in a top down approach. It is about words, it is abstract and well, I wish them well.

Wikidata has qualities. When you understand Wikidata by what it is and what it does, you understand the not so abstract qualities it has. Its principal aim is to bring structure to the data that is in the Wikimedia projects.

The first quality that Wikidata brought was that it replaced the text based interwiki links. The improvement was important; in a short space of time the quality of these interwiki links improved and the associated number of edits went down. The quality of the interwiki links is not absolute but there has been no research on the follow up.

Interwiki links represent a connection between articles of Wikimedia projects that are about the same subject. Within a Wikipedia or a Wikisource there are links that are in essence similar to Wikidata statements. When a university is mentioned, the subject may be a student or staff at that university, and when the statement has been made there is a reason for inclusion in categories. We can research the concurrence of such statements and wiki links. Quality improves when the concurrence improves.
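
One possible way to put a number on this concurrence is the share of the wiki links in an article that are also backed by a statement on the article's item. The Python sketch below is only an illustration of that idea; the function name and the choice of measure are assumptions, not an established metric.

```python
from typing import Iterable


def concurrence(linked_items: Iterable[str],
                statement_items: Iterable[str]) -> float:
    """Fraction of the items linked from an article that also occur
    as the value of a statement on the article's Wikidata item."""
    linked = set(linked_items)
    if not linked:
        return 0.0  # no links, so nothing to compare
    return len(linked & set(statement_items)) / len(linked)
```

Measured again after statements have been added, a higher value would indicate that more of the links in the text are documented in Wikidata.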

When enough data is available, it becomes possible to use Wikidata statements in templates. Templates and info boxes expect high quality data in Wikidata, and the available data is typically not good enough. When it is easy to make statements for wiki links and red links, the data in an info box will grow with the added statements.

We do need to work on the quality for our readers. This is best done by leveraging the data we have and engaging our communities not only to link articles together but also to expand these links with the statements that bind them together.

Yes, we will have to solve abstract issues, but the reality is that they are not so abstract. Issues have their basis in what we have, and we have to understand them in the light of what we hope to achieve: serving the world with the sum of all our available knowledge.
Thanks,
       GerardM

Monday, August 08, 2016

Is convergence between #Wikipedia and #Wikidata possible?

Wikidata is piggybacking on Wikipedia, I was told. This is true; much data is imported from any and all of the Wikipedias, and thereby Wikidata changes for the better. It improves in quality and becomes much more than what any single Wikipedia has to offer. At the same time Wikidata is rather awkward in its use, and there has been too much thinking in terms of what people know and expect for their own project.

Perspectives evolve. I tend to think of Wikidata as not yet good enough for most purposes. It is incomplete and its quality is inconsistent when we consider statements about its items. The remedy is obvious; work on the areas that are relevant and where Wikidata can easily make a difference.

That is a fine road map for me, but Wikipedians also use Wikidata; they even need to use Wikidata. When they add an article about a person, the authority control data is served from Wikidata, and they have to add the information to Wikidata if it is to show. So what can be done to make this easy, so that the use of Wikidata and Wikipedia may converge?

One aspect that seems important is that Wikidata information needs to function in whatever edit mode. The biggest motivational handicap I found is that most of what I did does not have a visible effect. It is much more rewarding when effects are more noticeable. All wiki links in an article link to other articles that have items of their own. Why not have a toggle that either shows these links with their relations or not? For the brave hearts that take an interest it is cool. The others do not even have to notice.

When such links are annotated, they result in statements, and such statements may even imply categories or other subsequent functionality. Currently bots only harvest from Wikipedia, but why not have them add to the Wikipedias in a predetermined way? It makes for a much more dynamic editing process and it will definitely improve quality.

What do you think?
Thanks,
      GerardM

Tuesday, July 26, 2016

Have #Wikipedia share the sum of available #knowledge

If Wikipedia is to succeed in sharing the sum of all knowledge, it has to first share the sum of available knowledge. To do this Wikipedians have to become more inclusive. They have to realise that Wikipedia is not about them but about its readers.

Typically the question "What do readers want?" is answered by what readers find. This answer has one flaw. It assumes that Wikipedia includes what people seek, and it forgets what people seek and do not find. This is a lost opportunity on many levels. To start with, Wikipedia is not singular, and a subject may exist in another language. As we do not know what is missed, we do not know what to write to satisfy an existing demand. Finally, more and more available information does not even have a Wikipedia article, but it is available in other projects.

A partial solution to these issues was around for a long time. It extends search by adding results from Wikidata. It allows you to find data in any script from any project. When there is no article, it shows information using the Reasonator. It is relatively easy to revive this, and it will make even more sense when its results are included as positive results.

Once Wikipedians consider Wikidata as a tool, they will find that both red links and wiki links may link to Wikidata items. Typically they are the same links for the same subject in any language. This is relevant to editors because it is one way to clarify what links exist to an article, and it is only one step further to annotate them as statements in Wikidata and thereby document such links. They will find a lot of erroneous links, and fixing them will improve overall quality.

The good news is that the links between wiki links and Wikidata items already exist. What is lacking is a process to verify that these wiki links are good. Adding links to statements for red links is technically not that hard. It will add some turmoil at the Wikidata end; many items will be added and will have to be merged eventually. One benefit of this approach is that it is not necessary for everyone to collaborate, but it will benefit the people that matter most: all the readers of all the Wikipedias.
Thanks,
      GerardM