
Modul:Sandlådan/SM5POR/Qutil

From Wikipedia

Documentation


Introduction


This Lua module is inspired by the apparent need for uniform interpretation and use of Wikidata items and properties in Wikipedia and other Wikimedia Foundation projects, as well as in independent, third-party applications.

The aim is to codify standard procedures as established by the Wikidata project, and to make them easily accessible for Wikipedia editors, Lua programmers and others using information from Wikidata for any purpose, be it to create a multilingual, online encyclopedia on any topic conceivable, or to build a highly specialized database application combining data from or about different fields of science, organizations, cultures, countries, languages, communities, individuals, or simply all possible walks of life.

Module status


As of July 2020, the Qutil module is not part of an official Wikimedia Foundation project; it is the author's voluntary contribution, offered in the hope that it is useful to other Wikipedia or Wikidata editors. Other modules may already implement more or less the same functionality; if so, this module may be seen as an independent effort to verify the correctness of earlier code, to compare performance, or simply as an educational tool.

The module may also include experimental implementations of procedures not yet confirmed as official Wikidata standard; where this is the case, the corresponding functions will be marked Experimental or Unofficial as appropriate in the documentation, and their future redesign, deprecation or removal should not be entirely ruled out.

This module is intended to provide language-independent functionality, not restricted to any particular natural (or constructed) human language, writing system, visual layout, or target audience. For practical reasons though, the Lua code is written using English-language function and variable names, inline commentary, and diagnostic messages (used for development only); the same applies to this module documentation. For the benefit of end-users, a separate Lutil module is planned to present any message, factual statement, or other verbal information in the end-user's preferred languages.

Basic concepts


A fundamental part of the Wikidata database is its implementation of a generic ontology relating different database items (objects by convention named using the letter 'Q' and a natural number, hence the name of this module) to each other via a limited set of property objects.

Properties

Statements

Data types

Qualifiers

Constraints

References

Transitive properties

Physical transitivity

Geographical transitivity

Corporate transitivity

Experimental

Philosophical transitivity

Inheritable properties

Experimental

Exclusionary inheritance

Conditional inheritance

Additive inheritance

Technical reference

System requirements

Diagnostic mode

Simulated mode

Calling conventions

Exported variables

While implemented as variables, these are meant to be used as constants or flags in your Lua code. As they are also used internally by the Qutil module, changing them may lead to unpredictable or erratic results, and is thus not recommended.

asymmetricPropertyClass

qutil.asymmetricPropertyClass

The class of asymmetric Wikidata properties.

qutil.classRoot

The root of the Wikidata class tree, of which every other Wikidata class is supposed to be a subclass (direct or indirect).

disambiguationClass

qutil.disambiguationClass

The class of Wikipedia pages detailing the possible alternatives for interpreting an ambiguous word or expression.

To be moved to the Wutil module when it's created.

transitiveOverProperty

qutil.transitiveOverProperty

The property defining valid inheritance paths.

transitivePropertyClass

qutil.transitivePropertyClass

The class of transitive Wikidata properties.

A number of flags, meant to be selectively ORed together into a binary integer, are exported for use with the getClaims function, and are documented with that function.
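As a minimal sketch of how such flags combine, the snippet below copies the four flag values from the module source into a local table named `flg`, which is only a stand-in for the real `qutil` table so that the example runs outside Scribunto. The helper `hasFlag` is hypothetical; the module itself tests flags with `bit32.btest`.

```lua
-- Flag values as assigned in the module source; `flg` stands in for `qutil`.
local flg = { flgNoInherit = 1, flgQualifiers = 2, flgRank = 4, flgReferences = 8 }

-- Each flag is a distinct power of two, so adding distinct flags gives the
-- same result as a bitwise OR.
local flags = flg.flgQualifiers + flg.flgReferences

-- Hypothetical portable bit test using plain arithmetic.
local function hasFlag(word, flag)
	return math.floor(word / flag) % 2 == 1
end

print(flags)
print(hasFlag(flags, flg.flgQualifiers))
print(hasFlag(flags, flg.flgRank))
```

Addition is only equivalent to OR as long as no flag is added twice; inside Scribunto, `bit32.bor` avoids that pitfall.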

Exported functions


qutil.context( frame )

qutil.getClaims( entity, properties, flags )

Fetch values of listed properties for entity. Exact access methods and data returned, including different kinds of metadata, depend on flags specified.

entity
Wikidata Item (Q), Property (P) or Lexeme (L) entity id for which the property values are to be fetched; a string.
properties
A list of property ids to be processed in parallel; a table.
flags
A set of flags to control the details of the function, ORed together as an integer.

By default, with no flags set, getClaims will return a list with the same length as that of the list of properties given in the call, each cell holding the result from retrieving the main values of the corresponding property, either as explicitly assigned to the entity, or as implicitly inherited from other entities along a defined path of transitivity.

flgNoInherit
When set, retrieve only property values explicitly assigned to the entity itself, not those inherited via paths of transitivity.
flgQualifiers
When set, retrieve any qualifiers with their values accompanying the main statements.
flgRank
When set, retrieve all statements stored regardless of rank.
flgReferences
When set, retrieve any reference statements provided in support of the main statements.
local result = qutil.getClaims("Q34", {"P1442"}, qutil.flgQualifiers)
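The following sketch shows one way a caller might walk the returned list, one cell per requested property. It assumes each cell holds statements in the usual Wikibase layout (`mainsnak.datavalue.value.id` for item values). The function `fakeGetClaims` and its sample data are hypothetical stand-ins so that the example runs outside Scribunto; nothing here is fetched from Wikidata.

```lua
-- Hypothetical stand-in for qutil.getClaims: returns one cell per requested
-- property, each cell a list of statements (illustrative sample data only).
local function fakeGetClaims(entity, properties, flags)
	local sample = {
		P31 = { { mainsnak = { datavalue = { value = { id = "Q3624078" } } } } },
	}
	local result = {}
	for i, property in ipairs(properties) do
		result[i] = sample[property] or {}
	end
	return result
end

-- Collect the item ids of the main values, skipping statements without one.
local function mainItemIds(statements)
	local ids = {}
	for _, statement in ipairs(statements) do
		local dv = statement.mainsnak and statement.mainsnak.datavalue
		if dv and type(dv.value) == "table" and dv.value.id then
			ids[#ids + 1] = dv.value.id
		end
	end
	return ids
end

local properties = { "P31", "P1442" }
local result = fakeGetClaims("Q34", properties, 0)
for i, property in ipairs(properties) do
	print(property, table.concat(mainItemIds(result[i]), ", "))
end
```

Guarding against a missing `datavalue` matters because statements marked "no value" or "unknown value" carry none.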

getInheritedProperties


qutil.getInheritedProperties( item, props, prop0, propt, propf ) Experimental

Fetch requested property values props for item by tracing them along the inheritance path defined by prop0, propt, and propf.

item
Wikidata Item (Q) or Property (P) object id for which the properties are to be fetched; a string.
props
A list of property retrieval paths (format described below) to be processed in parallel; either a table or nil (a shorthand representation for the unconditional, additive null property path, used to trace the entire inheritance tree between item and the applicable root).
prop0
First-step parental Property object id to initiate inheritance path traversal; either a string or nil (in which case inheritance path traversal will begin with propt).
propt
Primary Property object id to use repeatedly for successive retrieval of parental object ids; a string or nil (in which case traversal will immediately make use of propf, if available, and otherwise terminate). Should be transitive to yield meaningful results.
propf
Final Property to be used once after a prop0 or propt request fails to yield a single useable parent id (taking specified constraints into account); a string or nil (in which case no further parent id request will be made).
Property retrieval path format

A table with up to three fields as follows:

  1. A list of Property id strings, possibly empty (indicating a null path); a table (must not be nil).
  2. A flag word indicating what type of inheritance to apply; a number.
  3. A list of conditions to apply to either the Property value obtained or to the parental object found; a table (further described below).
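As an illustration of the nesting described in the list above, a retrieval path might be written as shown here. The flag word values are not defined in this documentation, so the `0` used below is only a placeholder, and the third field is left out because the condition format is still undefined.

```lua
-- Field 1: list of Property id strings (an empty list denotes a null path).
-- Field 2: flag word for the inheritance type; 0 is a placeholder value.
-- Field 3: conditions; omitted here since their format is not yet defined.
local path = { { "P1442" }, 0 }

-- A null path: the property list is empty but must still be a table.
local nullPath = { {}, 0 }

-- `props` is a list of such paths, to be processed in parallel.
local props = { path, nullPath }

print(#props, #props[1][1], #props[2][1])
```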
Property retrieval condition format

(to be further defined)

local result = qutil.getInheritedProperties("Q34", {{{"P1442"}}}, "P31", "P279", nil)
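To make the roles of prop0, propt and propf concrete, here is a simplified sketch of the traversal order on a toy graph. It is not the module's implementation: it follows only the first parent at each step, collects visited items rather than property values, and uses a hypothetical `graph` table with made-up ids in place of Wikidata lookups.

```lua
-- Toy data: graph[item][property] lists parent ids (all names hypothetical).
local graph = {
	A = { first = { "B" } },
	B = { step = { "C" } },
	C = { last = { "D" } },
	D = {},
}

local function parentsOf(item, property)
	return property and graph[item] and graph[item][property] or {}
end

-- Return the chain of items visited from `item` towards the root.
local function tracePath(item, prop0, propt, propf)
	local visited, seen = { item }, { [item] = true }
	local property = prop0 or propt   -- prop0 is used for the first step only
	local final = false
	while true do
		local parents = parentsOf(item, property)
		if #parents == 0 and propf and not final then
			parents = parentsOf(item, propf)   -- propf is tried once, as a last resort
			final = true
		end
		if #parents == 0 or seen[parents[1]] then break end
		item = parents[1]
		seen[item] = true
		visited[#visited + 1] = item
		if final then break end
		property = propt                  -- all later steps use propt
	end
	return visited
end

print(table.concat(tracePath("A", "first", "step", "last"), " > "))
```

The `seen` table guards against cycles in the data, which would otherwise make the loop run forever.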

Code samples

Optimization

Processing time

Memory requirements

Data caching

Module internals

Future development

Backwards compatibility

Meta-module considerations

Translation

Portability

Compatible modules

  • The Butil module (bibliographic context relations)
  • The Cutil module (corporate context relations)
  • The Dutil module (database management)
  • The Futil module (file handling)
  • The Gutil module (graphical processing)
  • The Hutil module (historical context relations)
  • The Jutil module (juvenile context relations)
  • The Lutil module (language processing)
  • The Mutil module (map processing)
  • The Nutil module (network management)
  • The Putil module (property management)
  • The Rutil module (robot management)
  • The Sutil module (scientific context relations)
  • The Tutil module (tabular processing)
  • The Vutil module (visualization processing)
  • The Wutil module (Wikimedia project relations)

Application ideas


These are so far merely proposed names and content descriptions for anticipated experimental ("laboratory") end-user application modules. For electrical sheep, paranoid androids and the technological singularity, see the Ilab module. To design a planet from core to stratosphere, or a galactic cluster populated and colonized by various intellectual property lawyer species, use the Ulab module.

  • The Alab module (art laboratory environment)
  • The Elab module (education laboratory environment)
  • The Ilab module (intelligence laboratory environment)
  • The Olab module (organization laboratory environment)
  • The Ulab module (universe laboratory environment)
  • The Ylab module (youth laboratory environment)

The software module described here, as well as this documentation, is available under CC0 (effectively public domain). To avoid confusion and duplicated work due to multiple forks or versions being distributed simultaneously, you are still both welcome and encouraged to contact the author to discuss potential coordination or cooperation.

local bit32 = require("bit32")

local cache = require("Modul:Sandlådan/SM5POR/Cache")

local diag = require("Modul:Sandlådan/SM5POR/Diag")

local gutil = require("Modul:Sandlådan/SM5POR/Gutil")

local diaglevel = 0

-- teq is recursive

local teq

teq = function(a, b)
	if type(a) ~= "table" or type(b) ~= "table" then
		return a == b
	end
	-- every key of a must have an equal value in b
	for k, v in pairs(a) do
		if not teq(v, b[k]) then
			return false
		end
	end
	-- b must not have keys that a lacks
	for k in pairs(b) do
		if a[k] == nil then
			return false
		end
	end
	return true
end

local traceview = function(s)
	if diaglevel > 5 then
		return mw.addWarning(s)
	else
		return ""
	end
end

local z = function(x)
	return tostring(x)
end

-- ztring is recursive

local ztring

ztring = function(t)
	local r
	if type(t) == "table" then
		local k, v
		local o = {}
		local i = 0
		for k, v in pairs(t) do
			i = i + 1
			o[i] = ztring(v)
		end
		r = "<" .. table.concat(o, " ") .. ">"
	else
		r = tostring(t)
	end
	return r
end

local tablefind = function(t, element)
	local k, v
	for k, v in pairs(t) do
		if teq(v, element) then
			return k
		end
	end
	return nil
end

local formatValue = function(pv)
	return string.format("%s %s %s", z(pv.rank), z(pv.type), z(pv.id))
end

local typesize = function(x)
	-- t and i are declared local so they do not leak into the global table
	local t = {}
	local i = 0
	for k, v in pairs(x) do
		i = i + 1
		t[i] = type(v) .. " [" .. tostring(#v) .. "] = " .. formatValue(v)
	end
	return t
end

local formatValueList = function(vlist)
	-- all work tables and counters are local to avoid leaking globals
	local t1 = {}
	local i1 = 0
	for k1, v1 in pairs(vlist) do
		local t2 = {}
		local i2 = 0
		for k2, v2 in pairs(v1[2]) do
			i2 = i2 + 1
			t2[i2] = v2[1] .. gutil.formatList(typesize(v2[2]))
--			t2[i2] = v2[1] .. gutil.formatList(formatValue(v2[2]))
		end
		i1 = i1 + 1
		t1[i1] = v1[3] .. " " .. v1[1] .. gutil.formatList(t2)
	end
	return gutil.formatList(t1)
end

local getStatements = function(item, prop, items)
	local stmts = mw.wikibase.getBestStatements(item, prop)
	local k
	local v
	local t = {}
	local i = 0
	for k, v in pairs(stmts) do
		if items then
			if v.mainsnak.datavalue then
				v = v.mainsnak.datavalue.value.id
			else
				v = nil
			end
		end
		if v then
			i = i + 1
			t[i] = v
		end
	end
	return t
end

local getBareProperty = function(item, prop)
	local stmts = mw.wikibase.getBestStatements(item, prop)
	return stmts
end

local getParents = function(item, prop)
	return getStatements(item, prop, true)
end

local getValues = function(item, prop)
	return getStatements(item, prop, false)
end

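-- Unfinished: intended to follow a property path starting at item0.
-- The item is never advanced along the path, so this returns the values
-- of the last property in the path as read from the starting item.
local getValuesR = function(item0, path)
	local item = item0
	local tmp = {}
	for _, v in ipairs(path) do
		tmp = getValues(item, v)
	end
	return tmp
end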

local listItemValues = function(snaklist)
	local t = {}
	local i = 0
	for k, v in pairs(snaklist) do
		-- skip "no value" / "unknown value" statements, which lack a datavalue
		if v.mainsnak.datavalue then
			i = i + 1
			t[i] = v.mainsnak.datavalue.value.id
		end
	end
	return t
end

-- getInhProp is recursive

local getInhProp

getInhProp = function(item, props, prop0, propt, propf, gen)
	local props2
	if props then
		props2 = {}
	end
	local p2 = 0
	local goal = {}
	local g = 0
	local parents = {}
	local final
	local done = 0
	if props and #props > 0 then
		done = 1
		local k, v, stmts
		for k, v in pairs(props) do
			done = 2
			stmts = getValues(item, v)
			if #stmts > 0 then
				done = 3
				g = g + 1
				goal[g] = {v, stmts}
			else
				done = 4
				p2 = p2 + 1
				props2[p2] = v
			end
		end
	end
	if props == nil or #props2 > 0 then
		done = 5
		if prop0 then
			done = 6
			parents = getParents(item, prop0)
		else
			done = 7
			parents = getParents(item, propt)
		end
		if propf and #parents == 0 then
			done = 8
			final = true
			parents = getParents(item, propf)
		end
	end
	local found = {}
	local k0, v0, k1, v1
	local sp = {}
	local i = 0
	if props == nil or #goal > 0 then
		done = 16
		i = i + 1
		sp[i] = {item, goal, gen}
	end
	if props2 == nil or #props2 > 0 then
		done = 9
		for k0, v0 in pairs(parents) do
			done = 10
			if final then
				done = 11
				found = getInhProp(v0, props2, nil, nil, nil, gen+1)
			else
				done = 12
				found = getInhProp(v0, props2, nil, propt, propf, gen+1)
			end
			-- merge this parent's results before visiting the next parent,
			-- otherwise only the last parent's results survive
			for k1, v1 in pairs(found) do
				done = 14
				if tablefind(sp, v1) == nil then
					done = 15
					i = i + 1
					sp[i] = v1
				end
			end
		end
	end
	if diaglevel > 4 then
--		traceview(string.format("%s %s %s %s %s [props2 %s, goal %s, parents %s, final %s, done %d] => %s",
		traceview(string.format("%s %s %s %s %s => %s",
								z(item),
								ztring(props),
								z(prop0),
								z(propt),
								z(propf),
--								ztring(props2),
--								ztring(goal),
--								ztring(parents),
--								z(final),
--								done,
								ztring(sp)))
	end
	return sp
end

-- props and top are keyed by property id
local getTransitiveProperties = function(item, props, top)
	local b
	local r = {}
	local item2
	local c
	for k, v in pairs(props) do
		b = getBareProperty(item, k)
		if b then
			r[k] = b
		end
		if top[k] then
			-- not yet used: groundwork for following the transitive-over property
			item2 = getBareProperty(item, top[k])
			c = getBareProperty(item, k)
		end
	end
	return r
end

local qutil = {}

-- Exported variables meant as constants

qutil.asymmetricPropertyClass = "Q18647519"

qutil.classRoot = "Q35120"

qutil.disambiguationClass = "Q4167410"

qutil.transitiveOverProperty = "P6609"

qutil.transitivePropertyClass = "Q18647515"

-- Exported variables meant as flags

qutil.flgNoInherit = 1
qutil.flgQualifiers = 2
qutil.flgRank = 4
qutil.flgReferences = 8

-- Exported real variables

-- qutil.inheritedProperties = {}

-- Additional local functions

local sumflags = function(flgsym)
	return 0
end

local listObjectEntities = function(values)
	local k
	local v
	local r = {}
	local i = 1
	for k, v in pairs(values) do
		r[i] = v.mainsnak.datavalue.value.id
		i = i + 1
	end
	return r
end

local getItemClaim = function(entity, property, flags)
	local r
	if bit32.btest(flags, qutil.flgRank) then
		r = cache.call(60, mw.wikibase.getAllStatements, entity, property)
	else
		r = cache.call(300, mw.wikibase.getBestStatements, entity, property)
	end
	return r
end

local getLexemeClaim = function(entity, property, flags)
	return nil
end

local entitydisp = {}

entitydisp["L"] = getLexemeClaim
entitydisp["P"] = getItemClaim
entitydisp["Q"] = getItemClaim

local filterResults = function(results, entity, property, flags)
	local k1
	local v1
	local k2
	local v2
	local k8 = 1
	local r = {}
	local r2
	local flgRank = bit32.btest(flags, qutil.flgRank)
	local flgRef = bit32.btest(flags, qutil.flgReferences)
	local flgQual = bit32.btest(flags, qutil.flgQualifiers)
	local selected = {qualifiers=flgQual, references=flgRef, rank=flgRank}
	selected["qualifiers-order"] = flgQual
	local optional = {}
	for k1, v1 in pairs(selected) do
		optional[k1] = true
	end
	-- results is nil for unsupported entity types (e.g. lexemes, for now)
	for k1, v1 in pairs(results or {}) do
		r2 = {}
		for k2, v2 in pairs(v1) do
			if not optional[k2] or selected[k2] then
				r2[k2] = v2
			end
		end
		r[k8] = r2
		k8 = k8 + 1
	end
	return r
end

local getLocalClaim = function(entity, property, flags)
	local getClaimFunc
	local results
	getClaimFunc = entitydisp[string.sub(entity, 1, 1)]
	if getClaimFunc then
		results = getClaimFunc(entity, property, flags)
	end
	return filterResults(results, entity, property, flags)
end

-- Exported functions

qutil.buildPropertyPathSet = function(prop)
	return {{prop}}
end

qutil.context = function(frame)
	local item = frame.args["item"]
	local lang
	if item then
		lang = frame.args["lang"]
	else
		item = mw.wikibase.getEntity()
		if diaglevel > 3 then
			mw.addWarning("item (" .. type(item) .. "): " .. tostring(item))
			-- local args = frame:getParent().args
		end
	end
	if item == nil and diaglevel > 2 then
		mw.addWarning("Wikidata item not found")
	end
	return lang, item
end

qutil.formatValueList = formatValueList

qutil.getEntityClass = function(entity)
	return qutil.getEntityItems(entity, "P31")
end

qutil.getEntityItems = function(entity, property)
	return listItemValues(mw.wikibase.getBestStatements(entity, property))
end

qutil.getInheritedProperties = function(item, props, prop0, propt, propf)
	return getInhProp(item, props, prop0, propt, propf, 0)
end

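-- NOTE: getTransitiveOverProperties is not defined anywhere in this module,
-- so the two functions below will fail until it is implemented.
qutil.getProperties = function(item, properties)
	return getTransitiveProperties(item, properties, getTransitiveOverProperties(properties, qutil.transitiveOverProperty))
end

qutil.getProperty = function(item, property)
	-- getTransitiveProperties reads the property ids from the table keys
	local properties = {}
	properties[property] = true
	return getTransitiveProperties(item, properties, getTransitiveOverProperties(properties, qutil.transitiveOverProperty))
end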

local cacheTransitiveOverProperties = function(properties)
	if not qutil.inheritedProperties then
		qutil.inheritedProperties = {}
	end
	local k1
	local v1
	local k2
	local v2
	local np1
	local np2
	for k1, v1 in pairs(properties) do
		-- the cache is keyed by property id (v1), not by list index (k1)
		if not qutil.inheritedProperties[v1] then
			np1 = listObjectEntities(getLocalClaim(v1, qutil.transitiveOverProperty, 0))
			if #np1 > 0 then
				qutil.inheritedProperties[v1] = np1
				for k2, v2 in pairs(np1) do
					if not qutil.inheritedProperties[v2] then
						np2 = listObjectEntities(getLocalClaim(v2, qutil.transitiveOverProperty, 0))
						if #np2 > 0 then
							qutil.inheritedProperties[v2] = np2
						end
					end
				end
			end
		end
	end
end

qutil.getClaims = function(entity, properties, flags)
	flags = flags or 0	-- no flags set by default
	local flgInherit = not bit32.btest(flags, qutil.flgNoInherit)
	local k
	local v
	local transitive = {}
	local additional = {}
	local p = {}
	local r = {}
	if flgInherit then
		cacheTransitiveOverProperties({"P31"})
		for k, v in pairs(qutil.inheritedProperties) do
			-- v is a list of property ids; check whether it contains k itself
			if tablefind(v, k) then
				transitive[#transitive+1] = k
			end
		end
		for k, v in pairs(properties) do
			if qutil.inheritedProperties[v] then
				additional[#additional+1] = v
			end
		end
	end
	for k, v in pairs(properties) do
		r[k] = getLocalClaim(entity, v, flags)
	end
	return r
end

qutil.qtest = function(frame)
	local r
	local properties = {}
	properties[1] = frame.args["property"]	-- Lua lists start at index 1
	for i = 1, 80 do
		r = diag.var('result', qutil.getClaims(frame.args["entity"], properties, sumflags(frame.args["flags"])))
	end
--	qutil.inheritedProperties = {tre=4}
--	r = diag.var("top", qutil.inheritedProperties)
	local foo = {}
	foo["bar"] = 42
	foo["baz"] = "ooo"
--	r = diag.var("foo", foo)
--	r = diag.var("inline", {"one"; "two"; "three"})
	r = cache.tabulate(cache.statistics(0))
--	r = diag.var("_G", _G)
	return r
end

return qutil