Sunday, August 28, 2016

#Wikidata - La Galería de las Mujeres de Costa Rica

#Marketing is something the #Wikimedia Foundation does not do. That does not mean that concepts like KPI are foreign to the WMF. Take the list in the English article "La Galería de las Mujeres de Costa Rica"; the women listed are "women who have broken gender stereotypes and advanced human rights principles".

A lot of effort goes into fighting for a diverse Wikipedia where women are given proper attention. If I were a marketing man, I would say that lists like this provide pointers to people who want to help. I would be happy with a list that shows all the people who currently have an article, and I would be ecstatic if I had a list of all the missing articles that updated automatically.

The funny thing is that technically it is not that hard to produce. It is not even that hard to include the technology in MediaWiki, but it takes a marketing man to drive the point home that you have to engage people, and that it shows the quality of a Wikipedia project when we know where we are lacking and where we should concentrate.
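To give an idea of how little is needed, here is a minimal sketch in Python. It builds a SPARQL query for everyone who received a given award, with their English Wikipedia article when there is one, and then splits the answer into "has an article" and "missing". P166 ("award received") is an existing Wikidata property, but whether this gallery is recorded that way is an assumption, the Q-id is a placeholder, and the function names and the simplified row format are made up for the illustration. Sending the query to the Wikidata query service is left out.

```python
from typing import Dict, List, Tuple


def build_query(award_qid: str, wiki_url: str = "https://en.wikipedia.org/") -> str:
    """Return a SPARQL query for people with an award and their article, if any."""
    return f"""
SELECT ?item ?itemLabel ?article WHERE {{
  ?item wdt:P166 wd:{award_qid} .
  OPTIONAL {{
    ?article schema:about ?item ;
             schema:isPartOf <{wiki_url}> .
  }}
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
""".strip()


def split_by_article(rows: List[Dict[str, str]]) -> Tuple[List[str], List[str]]:
    """Split simplified result rows into (labels with an article, labels without)."""
    have = [row["itemLabel"] for row in rows if row.get("article")]
    missing = [row["itemLabel"] for row in rows if not row.get("article")]
    return have, missing


if __name__ == "__main__":
    # "Q0000" is a placeholder, not the real item for the gallery.
    print(build_query("Q0000"))
    # Hypothetical rows, flattened from what a query service might return.
    sample = [
        {"itemLabel": "Person A", "article": "https://en.wikipedia.org/wiki/Person_A"},
        {"itemLabel": "Person B"},
    ]
    print(split_by_article(sample))
```

Run on a schedule, the second list is the auto-updating list of missing articles.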
Thanks,
     GerardM

Tuesday, August 23, 2016

#Wikidata - Colorado Women's Hall of Fame

There is a continuous effort underway in #Wikipedia to celebrate notable women. When women are seen as role models, it is obvious that they deserve attention.

The Colorado Women's Hall of Fame is an organisation that celebrates women and every year 10 more women are included. The article on the organisation includes a list and it includes many red links. So more can be done, not only in Wikipedia but also in Wikidata.

As Wikidata matures, SPARQL is now of sufficient quality that many of the tools developed by Magnus are transitioning to it. This takes time, and meanwhile some tools are discontinued or no longer fully function. Linked Items is one such tool. It creates a list of the items that are linked from a Wikipedia text. It is ideal when a text file full of wiki links exists: it is just a matter of copying in the links and it generates a list of Wikidata items for you. You then need to restrict the items that are used, and for this it was possible to use WDQ, the query engine that could do this when SPARQL for Wikidata was still a distant dream. Sadly it does not work anymore.
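The first half of what Linked Items did, pulling the link targets out of a piece of wiki text, is easy to sketch in Python. This is not how the tool itself was written; the regular expression, the function name and the sample names are assumptions for the illustration, and it handles only plain [[Title]], [[Title|label]] and [[Title#Section]] links.

```python
import re
from typing import List

# Capture the page title; ignore an optional #section and an optional |label.
WIKILINK = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|[^\[\]]*)?\]\]")

# Links into these namespaces are not articles about a subject.
SKIPPED_PREFIXES = ("File:", "Image:", "Category:")


def link_targets(wikitext: str) -> List[str]:
    """Return the unique article titles linked from wikitext, in order of appearance."""
    seen = set()
    targets = []
    for match in WIKILINK.finditer(wikitext):
        title = match.group(1).strip().replace("_", " ")
        if not title or title.startswith(SKIPPED_PREFIXES) or title in seen:
            continue
        seen.add(title)
        targets.append(title)
    return targets


if __name__ == "__main__":
    # Hypothetical list text; the names are invented.
    text = "* [[Jane Example]]\n* [[Mary Sample|Dr. Sample]]\n* [[File:Hall.jpg|thumb]]"
    print(link_targets(text))
```

The titles still have to be matched to Wikidata items. One option is the wbgetentities call of the Wikidata API, which accepts site and title pairs.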

A solution is to take the list of items and copy it into PetScan, the tool Magnus favours. It uses SPARQL and it is something of a Swiss army knife for data. When you are used to earlier tools like Autolist, many of your assumptions are wrong and it takes time to discover how the tool works. It does work, and that is why a large number of women are now known in Wikidata to be in the Colorado Women's Hall of Fame.
Thanks,
      GerardM

Sunday, August 14, 2016

#Wikidata - #quality is not abstract

There is a new "Request for Comments" on quality for Wikidata. It is an attempt to describe quality in a top down approach. It is about words, it is abstract and well, I wish them well.

Wikidata has qualities. When you understand Wikidata by what it is and what it does, you understand the not so abstract qualities it has. Its principal aim is to bring structure to the data that is in the Wikimedia projects.

The first quality that Wikidata brought was that it replaced the text based interwiki links. The improvement was important; in a short space of time the quality of these interwiki links improved and the associated number of edits went down. The quality of the interwiki links is not absolute but there has been no research on the follow up.

Interwiki links represent a connection between articles of Wikimedia projects that are about the same subject. Within a Wikipedia or a Wikisource there are links that are in essence similar to Wikidata statements. When a university is mentioned, the subject may be a student or staff member at that university, and once the statement has been made there is a reason for inclusion in categories. We can research the concurrence of such statements and wiki links. Quality improves when the concurrence improves.

When enough data is available, it becomes possible to use Wikidata statements in templates. Templates and info boxes expect high quality data in Wikidata, and the available data is typically not good enough. When it is easy to attach statements to wiki links and red links, the data in an info box will grow with the added statements.

We do need to work on quality for our readers. This is done best by leveraging the data we have and engaging our communities not only to link articles together but also to expand these links with the statements that bind them together.

Yes, we will have to solve abstract issues, but the reality is that they are not so abstract. Issues have their basis in what Wikidata is; we have to understand them in light of what we hope to achieve: serving the world with the sum of all our available knowledge.
Thanks,
       GerardM

Monday, August 08, 2016

Is convergence between #Wikipedia and #Wikidata possible?

Wikidata is piggybacking on Wikipedia, I was told. This is true; much data is imported from any and all of the Wikipedias and thereby Wikidata changes for the better. It improves in quality and becomes much more than what any single Wikipedia has to offer. At the same time Wikidata is rather awkward to use, and there has been too much thinking in terms of what people know and expect for their own project.

Perspectives evolve. I tend to think of Wikidata as not yet good enough for most purposes. It is incomplete and its quality is inconsistent when we consider statements about its items. The remedy is obvious; work on the areas that are relevant and where Wikidata can easily make a difference.

That is a fine road map for me, but Wikipedians also use Wikidata; they even need to use Wikidata. When they add an article about a person, the authority control data is served from Wikidata, and they have to add the information to Wikidata if it is to show. So what can be done to make this easy, so that the use of Wikidata and Wikipedia may converge?

One aspect that seems important is that Wikidata information needs to function in whatever edit mode. The biggest motivational handicap I found is that most of what I did does not have a visible effect. It is much more rewarding when effects are more noticeable. All wiki links in an article link to other articles that have items of their own. Why not have a toggle that shows these links with or without their relations? For the brave hearts that take an interest it is cool; the others do not even have to notice.

When such links are annotated, they result in statements, and such statements may even imply categories or other subsequent functionality. Currently bots only harvest from Wikipedia, but why not have them add to the Wikipedias in a predetermined way? It makes for a much more dynamic editing process and it will definitely improve quality.

What do you think?
Thanks,
      GerardM

Tuesday, July 26, 2016

Have #Wikipedia share the sum of available #knowledge

If Wikipedia is to succeed in sharing the sum of all knowledge, it has to first share the sum of available knowledge. To do this Wikipedians have to become more inclusive. They have to realise that Wikipedia is not about them but about its readers.

Typically the question "What do readers want?" is answered by what readers find. This answer has one flaw. It assumes that Wikipedia includes what people seek and it forgets what people seek and do not find. This is a lost opportunity on many levels. To start with, Wikipedia is not singular and a subject may exist in another language. As we do not know what is missed, we do not know what to write to satisfy an existing demand. Finally, more and more subjects do not even have a Wikipedia article while information about them is available in other projects.

A partial solution to these issues was around for a long time. It extends search by adding results from Wikidata. It allows you to find data in any script from any project. If there was no article, it showed information using the Reasonator. It is relatively easy to revive this, and it will make even more sense when its results are counted as positive results.

Once Wikipedians consider Wikidata as a tool, they will find that both red links and wiki links may link to Wikidata items. Typically they are the same links for the same subject in any language. This is relevant to editors because it is one way to clarify what links exist to an article, and it is only one step further to annotate them as statements in Wikidata and thereby document such links. They will find a lot of erroneous links and it will improve overall quality.

The good news: the links between wiki links and Wikidata items already exist. What is lacking is a verification process that these wiki links are good. Adding links to statements for red links is technically not that hard. It will add some turmoil at the Wikidata end; many items will be added and will have to be merged eventually. One benefit of this approach is that it is not necessary for everyone to collaborate, but it will benefit the people that matter most: all the readers of all the Wikipedias.
Thanks,
      GerardM


Saturday, July 23, 2016

#Wikipedia - #GMO controversy as a red herring

#Wikipedia has/had a big discussion on the safety of GMO food. When you read what the Signpost has to say, it is only about whether it is safe for people to eat this stuff.

The problem is that many promises have been made and this is only one issue, not even the most relevant issue. Read the article "20 years of failure" by Greenpeace or read its rebuttal to what some Nobel Prize winners had to say.

Whether it is safe to eat is only one question. Whether it will do us any good is more relevant. It does not bring us a more reliable food supply. It will not bring us more resiliency against climate change, and it is very much in doubt that "golden rice" actually brings additional vitamins, while a balanced diet does.

The important point of Greenpeace is that it backs its assertions with science. It is not in it for the money and its aim? A world that we can live in.
Thanks,
       GerardM

Monday, July 18, 2016

#Wikipedia - Dr Mary Meeker and SOI testing

Mrs Meeker and her husband Robert Meeker worked on a system used in education. She is known for applying Guilford's Structure of Intellect theory ("SI") to creating assessments and curriculum materials for use in teaching children and adults. The premise of SI is that intelligence comprises many underlying mental abilities or factors, organized along three dimensions: Operations (e.g., comprehension), Content (e.g., semantic), and Products (e.g., relations). If you are interested, read the article.

The article compared her work to the debunked Myers Briggs Type Indicator. This is something we should not do. The article on Mrs Laurie Helgoe provides all the arguments needed to restrict information on that indicator to its own article. It is not best practice to use tools that are ambiguous in their results, and therefore using the indicator as a comparison is not in the interest of our readers.
Thanks,
      GerardM