(Page created by
Simon Grant
, December 2020. If you want to help develop this, please e-mail me.)
(last edited: 2021-08-20, by Simon)
Various people I've been talking with recently, including John Waters, share an interest (for whatever reason) in developing software to support something like a distributed knowledge commons. There is the unmissable Wikipedia, which everyone knows, but we couldn't possibly use that to compile the live knowledge that needs to be shared for our radical purposes. One thing we have all noticed is people developing limited knowledge compilations in their own systems -- whether in existing wiki software (take the
P2P Foundation wiki
as an example, one I have worked on a lot) or in other blog or content management systems (resources like
LowImpact
-- there are many others).
Two relatively recent developments caught my attention. One is Ward Cunningham's idea of the
Federated Wiki
; the other is the recently popular commercial software,
Roam Research
. I don't see either as
the
answer, but both have interestingly different functionality.
The question I want to address here is: what functionality do we need to implement an effective ICT system that can embody a healthy, growing, live, knowledge commons -- of the kind that could be used for our shared purposes? But this page is not about listing and refining those purposes, integrating all the relevant writings: that belongs elsewhere.
If this page could start to develop into a list of requirements that has broad agreement, then we could move on in two ways:
For instance, due to the popularity of Roam Research, much work has already been done to produce open source emulations based on existing software -- see for instance
an article (mid 2020) from Ness Labs
, and
a piece (early 2020) on Reddit
. We aren't aiming, though, to stimulate another crowd of individual developers each building their own system, but rather to come together first in conversation about what really matters, and about how to develop systems that deliver it without all the duplication, re-invention, wasted time and energy, and frustration.
Some issues should go without saying: for instance, different software implementing these requirements can offer different editing interfaces, as long as those interfaces comply with the requirements. WYSIWYG, yes; HTML, yes; lightweight markup, yes; all depending on user preferences and familiarity.
First, ask me for the edit link. Please fill in details and issues, briefly if possible, and linking to other places rather than copying any material in. If there is a feature that you think is vital, please add it at the bottom, copying the template there. Please note and explain any existing implementations of this feature, and add your own evaluation comments: is this feature essential; really important but not absolutely vital enough to be in an MVP; or important but not essential. Please don't put in other features that would be just nice to have. It's going to be hard enough with what we have already!
Please add right here any links to articles that cover the same or very similar ground to here.
Every time a link to another wiki is published, the linked wiki looks at the referrer and, if it passes the acceptance criteria (e.g. its domain is on the whitelist and the link is well formatted), creates a backlink.
Essential and central to this concept. Roam Research does this, I think, but only within a single installation; something similar happens with some blog systems. This would make linking feasible and maintainable across domains, effectively making it easy to traverse wikis across multiple hosts.
In outline:
Optionally, a link checker could be run every so often to delete dead links or alert the owners to them. Ideally, some alert or visual cue would enable editors (and possibly readers) to see when a referred-to page has changed.
It might be possible to delegate some of the work to a SaaS API, but I doubt that would be useful given that it should be a relatively simple process to identify the referrer from the header and then insert an automatically formatted back-link. This might require only the addition of a few lines of code to one or two functions in the wiki engine. It might be possible to create a simple plugin to overload the functions affected.
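As a concrete illustration of how small that addition might be, here is a minimal, engine-agnostic sketch in Python. The whitelist, the backlinks store and the function name are assumptions, not features of any particular wiki engine.

```python
# A minimal sketch of the referrer check described above. The whitelist,
# the backlinks store and the function name are illustrative assumptions.
from urllib.parse import urlsplit

DOMAIN_WHITELIST = {"wiki.example.org", "commons.example.net"}   # illustrative

def backlink_from_referrer(referrer: str, target_page: str, backlinks: dict) -> bool:
    """Record a backlink if the Referer header passes the acceptance criteria."""
    if not referrer:
        return False
    parts = urlsplit(referrer)
    if parts.scheme not in ("http", "https") or parts.netloc not in DOMAIN_WHITELIST:
        return False                     # fails the acceptance criteria
    backlinks.setdefault(target_page, set()).add(referrer)
    return True

# Usage, inside whatever function serves a page request:
# backlink_from_referrer(request_headers.get("Referer", ""), "SomePage", backlinks_store)
```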
Fedwiki
has automatic backlinks.
MediaWiki
has two apparently unrelated mechanisms (though only within one wiki), and I see no need for them to be separate:
Some kind of update checking would be useful, so that any reader knows whether a linked page has been changed since the last revision / update of the currently viewed page. More below.
MassiveWiki
will do this later...
Every link should have a relationship / predicate.
Fundamental to the Semantic Web. Personally, I feel sure it would make a huge difference to findability (though this is not yet generally accepted within the fedwiki community).
In practical terms, this means supporting people looking for specific relationships between pages, including support or critique, for example. I'm thinking of the little-known work of Andrew Ravenscroft and the dialogue game he called "
InterLoc
", where all replies had to start with a framing.
The types of links would need to be very carefully selected, perhaps with SKOS as a basis; logical and conversational relationships would also need to be included. In scientific terms, for example, we would want ‘gives evidence for’ and ‘is a counterexample’.
Implemented in several places, e.g.
Semantic MediaWiki
; ... but it is not clear to me how well that fits the requirement.
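To make the idea of typed links a little more concrete, here is a minimal sketch using the Python rdflib library (assuming rdflib version 6 or later). The kc: vocabulary and the predicate names such as givesEvidenceFor are purely illustrative, not an existing standard.

```python
# A sketch of typed ("semantic") links between pages, expressed as RDF
# triples with rdflib. The kc: vocabulary is a hypothetical one.
from rdflib import Graph, Namespace, URIRef

KC = Namespace("https://example.org/kc/terms/")   # hypothetical vocabulary

g = Graph()
g.bind("kc", KC)

page_a = URIRef("https://wiki.example.org/RegenerativeFarming")
page_b = URIRef("https://commons.example.net/SoilCarbonStudy")

g.add((page_b, KC.givesEvidenceFor, page_a))          # typed, directional link
g.add((page_a, KC.discusses, URIRef("https://example.org/kc/topics/soil")))

print(g.serialize(format="turtle"))
```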
The page content and the page metadata need to be separated, at least in principle, however it is implemented.
Readers need to be able to browse content without the cognitive load of having to read or skip over metadata. Metadata and other semantic material needs to be available not only to machines (as already enabled by e.g. RDFa) but also to people. Metadata can't always be sensibly represented as content, but is really useful for various reasons.
I envisage a separate metadata page, including the kind of information that is currently shown in "View history" pages; backlinks if they aren't displayed on the page itself (should be one or the other); permissions; etc. Editing and viewing history are treated separately, below.
Patchy.
MediaWiki
is not clean about this, and that may be producing usability problems.
Fedwiki
displays the JSON corresponding to the page on demand. The JSON seems to include the totality of information about the page. And it is all displayed one way or another when in wiki writing mode.
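As a sketch of what "separated in principle" might look like, here is one possible arrangement in Python, with the readable content in one file and the metadata in a JSON sidecar. All the field names are illustrative assumptions, not any existing schema.

```python
# One possible arrangement: readable content in one file, metadata in a
# JSON sidecar. Field names are illustrative only.
import json
from pathlib import Path

page_name = "KnowledgeCommonsRequirements"
content = "Readable wiki markup goes here, with no metadata mixed in.\n"
metadata = {
    "title": page_name,
    "last_edited": "2021-08-20",
    "editors": ["simon"],
    "backlinks": ["https://wiki.example.org/SomeOtherPage"],
    "permissions": {"read": "public", "edit": ["simon"]},
}

Path(f"{page_name}.txt").write_text(content, encoding="utf-8")
Path(f"{page_name}.meta.json").write_text(json.dumps(metadata, indent=2), encoding="utf-8")
```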
Have no separate category page type. Categories are displayed by ensuring that all back-links with the appropriate relationship are shown on the page.
MediaWiki is confusing on this, in that you can add content to a category page. Simpler = better.
Extremely easy: if back-links are there as above, then this is just a matter of removing what becomes redundant functionality
Not really an issue: it would be simply a matter of some wikis deprecating some functionality, perhaps to be replaced by page type (as follows)
Even Wikipedia has clearly but informally established page types, among which disambiguation is an obvious one. Decide carefully on a minimal set of page types, and stick with those.
Helps comprehension, provided all the types are patently obvious in meaning.
Here is a possible list of what a page can be 'about'.
This needs thinking through in conjunction with semantic links. It fits in with
my top ontology
, so I'm biased here.
I know of no good set of page types applicable to knowledge commons.
The page owner(s) should be able to control permissions for pages
Essential for any restricted or non-public use, and to control editing rights. Without access control, any page is open to abuse. Lack of boundaries here also violates one of Ostrom's principles.
Should be fairly obvious for anyone who has implemented this kind of thing (not me).
And – maybe should be a separate point – maybe little or no harm
Fedwiki
takes the radical approach of having pages owned by just one person, so there is never any question of access control.
DokuWiki
and other wiki systems have various levels of access control
The
Google Docs
system of sharing and access permissions is another familiar model.
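For what it's worth, here is a minimal sketch of how per-page permission checks might look, assuming the page metadata carries a permissions record like the one sketched under the metadata requirement above. This reflects no particular existing implementation.

```python
# A minimal sketch (not any existing implementation) of per-page permission
# checks, assuming page metadata carries a "permissions" record.
from typing import Optional

def can_read(page_metadata: dict, user: Optional[str]) -> bool:
    readers = page_metadata.get("permissions", {}).get("read", [])
    if readers == "public":
        return True                      # anyone, including anonymous readers
    return user is not None and user in readers

def can_edit(page_metadata: dict, user: Optional[str]) -> bool:
    editors = page_metadata.get("permissions", {}).get("edit", [])
    if editors == "public":
        return True
    return user is not None and user in editors

# Usage: can_edit({"permissions": {"read": "public", "edit": ["simon"]}}, "simon") -> True
```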
Any self-respecting wiki has an edit history, which includes the ability to revert to previous versions. This is just an 'undo' function, extended back in time indefinitely, but also with the potential to extend to other users.
Lots of reasons, among which, to track editor contributions, for reputation reasons.
Group edits sensibly – e.g. one editor's uninterrupted edits on a single day could be merged at the end of the day.
As for how to hold the history, that's a good question. It would be great to see some kind of "diff" system that is preferably not line-oriented (though line-oriented could do for starters).
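As a hint that a non-line-oriented diff need not be exotic, here is a small sketch using Python's standard difflib at the level of words; a real edit history would need something richer.

```python
# A sketch of a word-oriented (rather than line-oriented) diff using the
# standard library's difflib.
import difflib

def word_diff(old: str, new: str):
    old_words, new_words = old.split(), new.split()
    matcher = difflib.SequenceMatcher(None, old_words, new_words)
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op != "equal":
            yield op, " ".join(old_words[i1:i2]), " ".join(new_words[j1:j2])

for change in word_diff("the quick brown fox", "the slow brown fox jumps"):
    print(change)
# ('replace', 'quick', 'slow') and ('insert', '', 'jumps')
```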
Fedwiki
has a very rich edit history.
MediaWiki
"View history" sort of works, but could it be simpler?
Google Docs
"Version history" is also very useful, but may be harder to implement.
The system should be able to record when I view pages and the links I take, irrespective of which browser or hardware I am using. A site may also record page impressions served, and any anonymous information about where they are served to.
This would enable meaningful and useful personal history traces, which could be superposed on a page graph. This would be really useful in terms of tracking interests and learning. Obviously, it would need to be able to be securely private, but also sharable.
Maybe we could think of a "my history" page as just another wiki page on 'my' server, but instead of being updated with backlinks, it would be updated every time I visited a page.
Of course, there is also the possibility of tracking visitors in some way. What are the established methods, involving (essential?) cookies or not?
All browsers seem to record viewing history for that specific browser, but cross-browser histories seem in the past to have been delegated to bookmarking sites. There must be a lot of work already done somewhere on piecing together viewing statistics, and it would seem sensible to include this kind of facility as standard in the wiki software.
Some other ideas are faintly connected to this and to Edit history – like
xAPI
and
ActivityPub
– see also
https://en.wikipedia.org/wiki/Comparison_of_software_and_protocols_for_distributed_social_networking
though it's not social networking we're doing here.
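A rough sketch of the "my history" page idea mentioned above: each page visit appends a small record to a history file held on 'my' server. The file location and field names are assumptions, loosely in the spirit of xAPI statements rather than conforming to them.

```python
# A rough sketch of appending a visit record to a personal history store;
# the file location and field names are assumptions.
import json
import time
from pathlib import Path
from typing import Optional

HISTORY_FILE = Path("my-history.jsonl")   # hypothetical per-person history store

def record_visit(page_url: str, came_from: Optional[str] = None) -> None:
    entry = {
        "visited": page_url,
        "referrer": came_from,
        "at": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
    }
    with HISTORY_FILE.open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")

record_visit("https://wiki.example.org/KnowledgeCommons",
             came_from="https://commons.example.net/Search")
```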
Various different semantic aspects of the page graph should be able to be viewed graphically.
Some people work much better that way, and I've heard it called for over and over again.
It could just be a matter of ensuring that semantic data can be exported in a form that is recognised by an existing graphing tool. The page graph is not a page, so coming into a page from the page graph does not create a backlink, but its use is recorded in the viewing history.
What has this kind of tool built in?
Cmaptools
can show a graph with each node linked to a web page, but maybe the reverse is not so easy?
https://edotor.net/
looks interesting; the Graphviz DOT language it renders does support labelled arcs (via the edge label attribute), so it may be able to display the semantic links required above.
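A minimal sketch of exporting the typed page graph as labelled DOT, which such a tool could then render; the triples and predicate names are made up for illustration.

```python
# A sketch of exporting the typed page graph as Graphviz DOT with labelled
# arcs; the triples are made up for illustration.
triples = [
    ("SoilCarbonStudy", "givesEvidenceFor", "RegenerativeFarming"),
    ("CounterCase2019", "isCounterexampleTo", "RegenerativeFarming"),
]

def to_dot(triples):
    lines = ["digraph pagegraph {"]
    for source, predicate, target in triples:
        lines.append(f'  "{source}" -> "{target}" [label="{predicate}"];')
    lines.append("}")
    return "\n".join(lines)

print(to_dot(triples))   # the output can be pasted into any DOT renderer
```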
Where content is quoted or reused by someone else, it should be as automatic as possible to include provenance and attribution information, or at least to be able to trace back to original sources.
Ideas often get messed around when they are quoted and reused. While we can't stop people plagiarising ideas, if it were really easy and automatic to cite the source, more people would do it.
I know very little about how fedwiki does it. What is apparent is that fedwiki can compare two similar pages, to show what is the same and what different. While this is useful, by itself it doesn't facilitate e.g. compilation or comparison pages, where what you want is to be able easily to see sources from several different pages.
Fedwiki
does this in a very interesting way.
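As an illustration only, the provenance that travels with a quoted passage might be as simple as a small record like the following; the field names are assumptions, not an existing standard (terms from PROV-O or Dublin Core could equally be used).

```python
# Purely illustrative: a provenance record that might travel with a quoted
# passage. Field names are assumptions, not an existing standard.
quoted_block = {
    "text": "Knowledge in the commons needs to be findable.",
    "source_url": "https://wiki.example.org/KnowledgeCommonsRequirements",
    "source_revision": "2021-08-20T10:15:00Z",   # revision identifier at quoting time
    "author": "Simon Grant",
    "licence": "CC BY-SA 4.0",                   # hypothetical licence field
}
```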
Provide effective means for likely users to be able to find the information they are likely to be interested in. To relate to the rest of the world, pages need to be searchable by major search engines, as well as any search facility that is created specifically to cover knowledge commons.
Knowledge in the commons needs to be findable. To the extent that commoners fail to find the information they are looking for, the knowledge commons has failed in its primary purpose.
Google and many other services start from text strings, and use various other techniques to make the search better. Internal text-string search is common, so presumably poses no great challenges; but what about AI-enhanced internal search, and, much more challenging, cross-wiki search?
Semantic search such as
SPARQL
is an obvious path to explore. But how can this work cross-server? Would we need some kind of specialised server keeping an always-updated record of the complete page graph across the current knowledge commons? (and could that effectively define the extent and boundaries of this commons?)
The challenges seem to be adding AI to local search, and doing any kind of search across a distributed knowledge commons. Of course, the semantic and interlinked nature of this work will make browsing-based search far more effective, and will therefore provide an alternative to a lot of the AI functionality, an alternative that should be more effective in some contexts (but not in others).
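If such an aggregating SPARQL endpoint existed, querying it would be straightforward. Here is a sketch using only the Python standard library; the endpoint URL and the kc: vocabulary are hypothetical.

```python
# A sketch of querying a hypothetical aggregating SPARQL endpoint; the
# endpoint URL and the kc: vocabulary are assumptions.
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://sparql.example.org/kc/query"   # hypothetical aggregator

query = """
PREFIX kc: <https://example.org/kc/terms/>
SELECT ?page WHERE {
  ?page kc:givesEvidenceFor <https://wiki.example.org/RegenerativeFarming> .
}
"""

request = urllib.request.Request(
    ENDPOINT,
    data=urllib.parse.urlencode({"query": query}).encode("utf-8"),
    headers={"Accept": "application/sparql-results+json"},
)
with urllib.request.urlopen(request) as response:
    results = json.load(response)

for binding in results["results"]["bindings"]:
    print(binding["page"]["value"])
```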
This may not be in the MVP. Carry over provenance information along with selected text, so that it is automatically included in the wiki you are creating. For browsers displaying local files, allow only relative links.
This goes along with attribution tracking, above. As there, the easier it is to do, the more likely people are to do it, and therefore to allow the system to track attribution easily.
I don't know whether selecting text from a browser automatically includes URL information or not. If not, then perhaps all distributed wiki sites could include that in the client-side code? I would not include editing history with this. Editing history can be kept at the source server.
In principle, I would guess this would depend on the
DOM
.
Fedwiki
allows something similar, but its DOM is incompatible with the rest of the web.
As a follow-on from edit history, attribution tracking and drag-and-drop, provide an indication on the quoting page of whether a part of a quoted page (that has been excerpted and put into the quoting page) has been changed.
A page author borrowing / reusing material from another page would find it helpful to know if that material has been changed, and be free to decide whether to reflect that change in the quoting page. Probably not MVP.
This doesn't necessarily need to be separate functionality, because one page could look at another to see if a quoted section has been changed or not. The question is how notification of changes is managed. The quoting page could do an explicit check for each of the passages that are quoted, and if the source passage has changed (not just the whole quoted page) then that passage can be marked in some way. The quoting page owner can then decide whether to update the passage in line with the updating in the quoted page, or not. Either way, the status of the quoted passage can be seen (directly or indirectly) by the reader of the quoting page.
It's perhaps not vital, and there are no known implementations of exactly this, though Fedwiki does something related.
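One plausible mechanism, sketched below: the quoting page keeps the quoted text (or a hash of it) and periodically checks whether that passage still appears verbatim in the current source text. How the current source text is fetched, and storing a hash instead of the full passage, are left out here.

```python
# A sketch only: check whether a quoted passage still appears verbatim in
# the current text of the source page.
def passage_has_changed(quoted_text: str, current_source_text: str) -> bool:
    normalise = lambda s: " ".join(s.split())   # ignore whitespace-only edits
    return normalise(quoted_text) not in normalise(current_source_text)

# The quoting page could run this for each quoted passage when rendering,
# and mark changed passages for the page owner to review.
```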
In conjunction with edit history (see above), provide one or preferably several ways in which interested users can easily be alerted to what has changed, possibly in conjunction with a watch-list.
People who have contributed to a jointly authored page, or any page which is editable by others, may have an interest in keeping track of other contributions. The point is that they may have established their own understanding of the page topic or material, and to maintain the effectiveness of this page in their own personal knowledge bank, they need to ensure that no contributions have materially changed the concepts; or if they have, they would want the opportunity either to become familiar with the new material or to make further changes to restore comprehensibility.
This is closely related to the above point about change indications, as it would be helpful for other linked wikis to be alerted when something has changed on a linked page, so that they can in turn alert their owners.
Provided users have an effective ID, it is very easy to allow people to record their interest in any number of individual pages, as MediaWiki does. For some sites, an RSS or Atom feed would be useful, as many people use these to scan for changes that may be of interest. This can work however coarse- or fine-grained the change record may be.
Mediawiki
offers registered users the ability to “watch” pages, meaning that every time that page is changed, the user receives an e-mail. Probably several other wiki systems do likewise.
Some wiki systems (research to be filled in here) offer an RSS or Atom feed of changes, similar to a blog.
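Producing such a feed needs nothing beyond a list of recent changes and some XML generation. A rough Atom sketch using the Python standard library, with illustrative URLs and timestamps:

```python
# A sketch of generating an Atom feed of recent changes with the standard
# library. The change records and URLs are illustrative.
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
ET.register_namespace("", ATOM)

def changes_feed(site_url, changes):
    feed = ET.Element(f"{{{ATOM}}}feed")
    ET.SubElement(feed, f"{{{ATOM}}}title").text = "Recent changes"
    ET.SubElement(feed, f"{{{ATOM}}}id").text = site_url
    ET.SubElement(feed, f"{{{ATOM}}}updated").text = changes[0]["when"]
    for change in changes:
        entry = ET.SubElement(feed, f"{{{ATOM}}}entry")
        ET.SubElement(entry, f"{{{ATOM}}}title").text = change["page"]
        ET.SubElement(entry, f"{{{ATOM}}}id").text = change["url"]
        ET.SubElement(entry, f"{{{ATOM}}}updated").text = change["when"]
        ET.SubElement(entry, f"{{{ATOM}}}link", href=change["url"])
    return ET.tostring(feed, encoding="unicode")

print(changes_feed("https://wiki.example.org/",
                   [{"page": "FrontPage", "url": "https://wiki.example.org/FrontPage",
                     "when": "2021-08-20T10:15:00Z"}]))
```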
The system should be compatible enough with web standards to allow some meaningful interoperability, e.g. import and export, with the rest of the web.
HTML, PDF and plain text are all very different in structure –
lightweight markup languages
were developed to bridge some of the gap between text and HTML. Word processor export to HTML varies enormously in quality and verboseness. If a knowledge commons is to be built in a reasonable time, there needs to be import from other formats. Unless there is effective export to other formats, people are likely to be cautious about committing time to what would be seen as a 'walled garden'.
There is no need for a basic knowledge commons system to comply with the full
WHATWG
version of the
web page DOM
. For export, it should be relatively easy for a knowledge commons wiki to export to a subset of HTML, and from there, existing software such as
Pandoc
can take it on to many different formats. Import looks potentially more troublesome. It is possible to filter HTML in various ways, so if a clearly defined subset of HTML can be mapped to the wiki format, it should be possible first to filter the HTML so that only those elements occur, and then to convert the result into the wiki format.
It would make sense to me if wiki pages had a DOM representation consistent with the HTML generated.
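To give a flavour of the import route just described, here is a rough sketch of filtering HTML down to a small allowed subset using Python's standard html.parser. The allowed set is illustrative, and a real importer would also need attribute sanitising and a mapping stage (e.g. via Pandoc).

```python
# A sketch of filtering incoming HTML down to an allowed subset of elements
# before mapping it to wiki format. The allowed set is illustrative only.
from html.parser import HTMLParser

ALLOWED = {"p", "a", "em", "strong", "ul", "ol", "li",
           "h1", "h2", "h3", "blockquote", "code", "pre"}

class SubsetFilter(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED:
            if tag == "a":
                href = dict(attrs).get("href", "")
                self.out.append(f'<a href="{href}">')
            else:
                self.out.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in ALLOWED:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

filterer = SubsetFilter()
filterer.feed('<div class="fancy"><p>Keep <b>this</b> <a href="https://example.org">link</a>.</p></div>')
print("".join(filterer.out))   # <p>Keep this <a href="https://example.org">link</a>.</p>
```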
Unfortunately,
fedwiki
sucks on this issue. Each paragraph-like element has a type, with a default type of 'paragraph' which is ended by a carriage return. However, two of the types are 'HTML' and 'markdown', and HTML in particular is able to produce very long and complex items, themselves including paragraphs and other material, without that being accessible to the fedwiki system. In essence, using the HTML element breaks the whole concept of paragraph entities. Lists are a total joke!
Have a way by which people can add comments on specific parts of the content without affecting that content; and (more ambitiously) suggest changes that can be accepted very easily.
Commenting and suggesting modes are very well used in Google Docs, for the simple reason that they are very useful. In Wikipedia you have to actually change the text, and risk being reverted. Where there is a suggestion facility, it makes it clearer that the suggester is not certain whether it is an improvement in the eyes of others, and less face is lost if a suggestion is rejected than if an edit is reverted.
I have no idea how this is implemented, but it clearly isn't simple. It will also make the data model quite a lot more complex.
Commenting is more widespread. Apart from Google Docs and similar, you could see Git as effectively serving the need for suggestion, but you don't get the immediacy that you get with Google Docs. You could also see Fedwiki as meeting a very similar need, in that someone can make their own fork of a page, change some part of it, and the original author can bring the changed version back to theirs. But that is the only way you can do it on Fedwiki.
template:
requirement short description
text on why it is desirable
text on how
How vital is this? Where has it been done before?