How to represent boolean data (new datatype for Wikidata?)
Closed, Declined (Public)

Description

Occasionally, a possible boolean datatype is mentioned ( https://en.wikipedia.org/wiki/Boolean_data_type ).

The alternative currently suggested is: item-datatype with specific items for applicable values.

On Wikidata: https://www.wikidata.org/wiki/Help:Data_type#Boolean

Event Timeline

Boolean is rarely useful for actually modeling information. We can say more by linking to specific items. An off-the-cuff example: a "wifi" property was proposed as a boolean. Instead, a "network" property taking the possible types of network (wifi, ethernet, etc.) as values could have been proposed, especially since we also have the special value "noValue" for cases where a location does not have network connectivity.

Where needed (and I struggle to think of a valid case), we could do this with specific items.

This task should be declined.

Item-datatype should do, but I don't think there is a ticket that states that ;)

Lydia_Pintscher subscribed.

Then let's make it this one :)

I agree that most cases are likely even better served by using items.

Boolean is rarely useful for actually modeling information.

In 2020, I would like to challenge this argument. For such an important data type, one supported by almost all programming languages and database formats, it seems too simple an argument to just say *"It's barely helpful and we can use item"*. It is especially hard to justify when we have URL, Monolingual text and string, but no Boolean.

There are many cases where a boolean is extremely helpful, e.g. to model one of the binary outcomes of an activity:

  1. Barack Obama participated in the 2008 US election and was elected (true); we could create an "election result" property whose value is a boolean, true or false.
  2. The Brazilian President tested positive for COVID-19.
  3. Some company is NOT (false) a listed company (as opposed to not defining the property stating that the company is listed anywhere).
  4. Booleans could also be used in calculated properties such as "given the Qnum of a company, tell me whether the company has at least one woman and one man on its board of directors".

Going 1 by 1:

  1. This is really two properties, and we already have the first: "head of state", qualified from 2009 to present. The second is "participated in election", linking to item X.
  2. "has illness X"
  3. "listing: noValue" as opposed to someValue
  4. company has board, board has parts person A/person B (from time C), persons A/B are male/female.

None of this requires a boolean.
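
To illustrate point 4, here is a minimal sketch that assumes a toy in-memory dictionary rather than real Wikidata data. Every item ID and property name in it is a hypothetical placeholder. It only shows that the yes/no answer can be derived from item-valued statements instead of being stored as a boolean.

```python
# Toy stand-in for item-valued statements. All IDs and keys are hypothetical
# placeholders, not real Wikidata items or properties.
ITEMS = {
    "company": {"board member": ["person_a", "person_b"]},
    "other_company": {"board member": ["person_b"]},
    "person_a": {"gender": "female"},
    "person_b": {"gender": "male"},
}


def board_has_woman_and_man(company_id, items=ITEMS):
    """Derive the yes/no answer from linked items instead of storing it."""
    members = items[company_id].get("board member", [])
    # Collect the gender value of every board member that has one stated.
    genders = {items[m].get("gender") for m in members}
    # True only if both values occur among the board members.
    return {"female", "male"} <= genders
```

In this sketch the boolean is the result of a query over the data, not a value stored in it.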

As I said, I struggle to think of a use case where one is necessary that is not already covered by "property: someValue" and "property: noValue", both of which are available to all properties.
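
As a rough sketch of how that covers the "listed company" case: in the Wikibase JSON format, each main snak carries a `snaktype` of `value`, `somevalue` or `novalue`. The helper below is hypothetical, as is the property ID `P0` used in the usage example. It maps those snak types to yes / no / not stated.

```python
def has_value(entity, property_id):
    """Return True, False, or None (nothing stated) for a property.

    `entity` is a dict shaped like Wikibase entity JSON: statements live
    under entity["claims"][property_id], each with a "mainsnak".
    """
    statements = entity.get("claims", {}).get(property_id, [])
    if not statements:
        return None  # nothing stated: unknown, which is not the same as "no"
    snaktypes = {s["mainsnak"]["snaktype"] for s in statements}
    if snaktypes == {"novalue"}:
        return False  # explicitly stated to have no value
    return True  # at least one "value" or "somevalue" statement


# Usage with the hypothetical property ID "P0" standing in for "listing".
unlisted = {"claims": {"P0": [{"mainsnak": {"snaktype": "novalue", "property": "P0"}}]}}
listed = {"claims": {"P0": [{"mainsnak": {"snaktype": "somevalue", "property": "P0"}}]}}
unknown = {"claims": {}}
```

The three results keep "explicitly not listed" distinct from "nothing has been stated", which is the distinction raised in example 3.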