(Page created by
Simon Grant
, December 2020. If you want to help develop this, please e-mail me.)
(last edited: 2021-08-20, by Simon)
Various people I've been talking with recently, including John Waters, share an interest (for whatever reason) in developing software to support something like a distributed knowledge commons. There is the unmissable Wikipedia, which everyone knows, but we couldn't possibly use that to compile the live knowledge that needs to be shared for our radical purposes. One thing we have all noticed is people developing limited knowledge compilations in their own systems -- whether in existing wiki software (take the
P2P Foundation wiki
as an example, one I have worked on a lot) or in other blog or content management systems (resources like
LowImpact
-- there are many others).
Two relatively recent developments caught my attention. One is Ward Cunningham's idea of the
Federated Wiki
; the other is the recently popular commercial software,
Roam Research
. I don't see either as
the
answer, but both have interestingly different functionality.
The question I want to address here is: what functionality do we need to implement an effective ICT system that can embody a healthy, growing, live, knowledge commons -- of the kind that could be used for our shared purposes? But this page is not about listing and refining those purposes, integrating all the relevant writings: that belongs elsewhere.
If this page could start to develop into a list of requirements that has broad agreement, then we could move on in two ways:
For instance, due to the popularity of Roam Research, much work has already been done to produce open source emulations based on existing software -- see for instance
an article (mid 2020) from Ness Labs
, and
a piece (early 2020) on Reddit
. We aren't aiming, though, to stimulate another crowd of individual developers each building their own system, but rather to come together first in conversation about what really matters, and about how to develop systems that deliver it without all the duplication, re-invention, wasted time and energy, and frustration.
Some issues should go without saying: for instance, different software implementing these requirements can offer different editing interfaces, as long as those interfaces comply with the requirements. WYSIWYG, yes; HTML, yes; lightweight markup, yes; all depending on user preferences and familiarity.
First, ask me for the edit link. Please fill in details and issues, briefly if possible, and linking to other places rather than copying any material in. If there is a feature that you think is vital, please add it at the bottom, copying the template there. Please note and explain any existing implementations of this feature, and add your own evaluation comments: is this feature essential; really important but not absolutely vital enough to be in an MVP; or important but not essential. Please don't put in other features that would be just nice to have. It's going to be hard enough with what we have already!
Please add right here any links to articles that cover the same or very similar ground to here.
Every time a link to another wiki is published, the linked wiki looks at the referrer and, if it passes the acceptance criteria (e.g. its domain is on the whitelist and the link is well formatted), creates a backlink.
Essential and central to this concept. Roam Research does this, I think, but only within a single installation; something similar happens with some blog systems. This would make linking feasible and maintainable across domains, effectively making it easy to traverse wikis across multiple hosts.
In outline:
Optionally, a link checker could be run every so often to delete dead links or alert the owners to them. Ideally, some alert or visual cue would enable editors (and possibly readers) to see when a referred-to page has changed.
It might be possible to delegate some of the work to a SaaS API, but I doubt that would be useful given that it should be a relatively simple process to identify the referrer from the header and then insert an automatically formatted back-link. This might require only the addition of a few lines of code to one or two functions in the wiki engine. It might be possible to create a simple plugin to overload the functions affected.
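As a concrete illustration of how small that addition might be, here is a minimal, engine-agnostic sketch in Python. The whitelist, the backlinks store and the function name are assumptions, not features of any particular wiki engine.

```python
# A minimal sketch of the referrer check described above. The whitelist,
# the backlinks store and the function name are illustrative assumptions.
from urllib.parse import urlsplit

DOMAIN_WHITELIST = {"wiki.example.org", "commons.example.net"}   # illustrative

def backlink_from_referrer(referrer: str, target_page: str, backlinks: dict) -> bool:
    """Record a backlink if the Referer header passes the acceptance criteria."""
    if not referrer:
        return False
    parts = urlsplit(referrer)
    if parts.scheme not in ("http", "https") or parts.netloc not in DOMAIN_WHITELIST:
        return False                     # fails the acceptance criteria
    backlinks.setdefault(target_page, set()).add(referrer)
    return True

# Usage, inside whatever function serves a page request:
# backlink_from_referrer(request_headers.get("Referer", ""), "SomePage", backlinks_store)
```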
Fedwiki
has automatic backlinks.
MediaWiki
has two apparently unrelated mechanisms (though only within one wiki), and I see no need for them to be separate:
Some kind of update checking would be useful, so that any reader knows whether a linked page has been changed since the last revision / update of the currently viewed page. More below.
MassiveWiki
will do this later...
Every link should have a relationship / predicate.
Fundamental to the Semantic Web. Personally, I feel sure it would make a huge difference to findability (though this is not yet generally accepted within the fedwiki community).
In practical terms, this means supporting people looking for specific relationships between pages, including support or critique, for example. I'm thinking of the little-known work of Andrew Ravenscroft and the dialogue game he called "
InterLoc
", where all replies had to start with a framing.
The types of links would need to be very carefully selected, perhaps with SKOS as a basis; logical and conversational relationships would also need to be included. In scientific terms, for example, we would want ‘gives evidence for’ and ‘is a counterexample’.
Implemented in several places, e.g.
Semantic MediaWiki
; ... but it is not clear to me how well that fits the requirement.
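To make the idea of typed links a little more concrete, here is a minimal sketch using the Python rdflib library (assuming rdflib version 6 or later). The kc: vocabulary and the predicate names such as givesEvidenceFor are purely illustrative, not an existing standard.

```python
# A sketch of typed ("semantic") links between pages, expressed as RDF
# triples with rdflib. The kc: vocabulary is a hypothetical one.
from rdflib import Graph, Namespace, URIRef

KC = Namespace("https://example.org/kc/terms/")   # hypothetical vocabulary

g = Graph()
g.bind("kc", KC)

page_a = URIRef("https://wiki.example.org/RegenerativeFarming")
page_b = URIRef("https://commons.example.net/SoilCarbonStudy")

g.add((page_b, KC.givesEvidenceFor, page_a))          # typed, directional link
g.add((page_a, KC.discusses, URIRef("https://example.org/kc/topics/soil")))

print(g.serialize(format="turtle"))
```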
The page content and the page metadata need to be separated, at least in principle, however it is implemented.
Readers need to be able to browse content without the cognitive load of having to read or skip over metadata. Metadata and other semantic material needs to be available not only to machines (as already enabled by e.g. RDFa) but also to people. Metadata can't always be sensibly represented as content, but is really useful for various reasons.
I envisage a separate metadata page, including the kind of information that is currently shown in "View history" pages; backlinks if they aren't displayed on the page itself (should be one or the other); permissions; etc. Editing and viewing history are treated separately, below.
Patchy.
MediaWiki
is not clean about this, and that may be producing usability problems.
Fedwiki
displays the JSON corresponding to the page on demand. The JSON seems to include the totality of information about the page. And it is all displayed one way or another when in wiki writing mode.
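As a sketch of what "separated in principle" might look like, here is one possible arrangement in Python, with the readable content in one file and the metadata in a JSON sidecar. All the field names are illustrative assumptions, not any existing schema.

```python
# One possible arrangement: readable content in one file, metadata in a
# JSON sidecar. Field names are illustrative only.
import json
from pathlib import Path

page_name = "KnowledgeCommonsRequirements"
content = "Readable wiki markup goes here, with no metadata mixed in.\n"
metadata = {
    "title": page_name,
    "last_edited": "2021-08-20",
    "editors": ["simon"],
    "backlinks": ["https://wiki.example.org/SomeOtherPage"],
    "permissions": {"read": "public", "edit": ["simon"]},
}

Path(f"{page_name}.txt").write_text(content, encoding="utf-8")
Path(f"{page_name}.meta.json").write_text(json.dumps(metadata, indent=2), encoding="utf-8")
```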
Have no separate category page type. Categories are displayed by ensuring that all back-links with the appropriate relationship are shown on the page.
MediaWiki is confusing on this, in that you can add content to a category page. Simpler = better.
Extremely easy: if back-links are there as above, then this is just a matter of removing what becomes redundant functionality
Not really an issue: it would be simply a matter of some wikis deprecating some functionality, perhaps to be replaced by page type (as follows)
Even Wikipedia has clearly but informally established page types, among which disambiguation is an obvious one. Decide carefully on a minimal set of page types, and stick with those.
Helps comprehension, provided all the types are patently obvious in meaning.
Here is a possible list of what a page can be 'about'.
This needs thinking through in conjunction with semantic links. It fits in with
my top ontology
, so I'm biased here.
I know of no good set of page types applicable to knowledge commons.
The page owner(s) should be able to control permissions for pages
Essential for any restricted or non-public use, and to control editing rights. Without access control, any page is open to abuse. Lack of boundaries here also violates one of Ostrom's principles.
Should be fairly obvious for anyone who has implemented this kind of thing (not me).
And – maybe should be a separate point – maybe little or no harm
Fedwiki
takes the radical approach of having pages owned by just one person, so there is never any question of access control.
DokuWiki
and other wiki systems have various levels of access control
The
Google Docs
system of sharing and access permissions is another familiar model.
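For what it's worth, here is a minimal sketch of how per-page permission checks might look, assuming the page metadata carries a permissions record like the one sketched under the metadata requirement above. This reflects no particular existing implementation.

```python
# A minimal sketch (not any existing implementation) of per-page permission
# checks, assuming page metadata carries a "permissions" record.
from typing import Optional

def can_read(page_metadata: dict, user: Optional[str]) -> bool:
    readers = page_metadata.get("permissions", {}).get("read", [])
    if readers == "public":
        return True                      # anyone, including anonymous readers
    return user is not None and user in readers

def can_edit(page_metadata: dict, user: Optional[str]) -> bool:
    editors = page_metadata.get("permissions", {}).get("edit", [])
    if editors == "public":
        return True
    return user is not None and user in editors

# Usage: can_edit({"permissions": {"read": "public", "edit": ["simon"]}}, "simon") -> True
```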
Any self-respecting wiki has an edit history, which includes the ability to revert to previous versions. This is just an 'undo' function, extended back in time indefinitely, but also with the potential to extend to other users.
Lots of reasons, among which, to track editor contributions, for reputation reasons.
Group edits sensibly – e.g. one editor's uninterrupted edits on a single day could be merged at the end of the day.
As for how to hold the history, that's a good question. It would be great to see some kind of "diff" system that is preferably not line-oriented (though line-oriented could do for starters).
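As a hint that a non-line-oriented diff need not be exotic, here is a small sketch using Python's standard difflib at the level of words; a real edit history would need something richer.

```python
# A sketch of a word-oriented (rather than line-oriented) diff using the
# standard library's difflib.
import difflib

def word_diff(old: str, new: str):
    old_words, new_words = old.split(), new.split()
    matcher = difflib.SequenceMatcher(None, old_words, new_words)
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op != "equal":
            yield op, " ".join(old_words[i1:i2]), " ".join(new_words[j1:j2])

for change in word_diff("the quick brown fox", "the slow brown fox jumps"):
    print(change)
# ('replace', 'quick', 'slow') and ('insert', '', 'jumps')
```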
Fedwiki
has a very rich edit history.
MediaWiki
"View history" sort of works, but could it be simpler?
Google Docs
"Version history" is also very useful, but may be harder to implement.
The system should be able to record when I view pages and the links I take, irrespective of which browser or hardware I am using. A site may also record page impressions served, and any anonymous information about where they are served to.
This would enable meaningful and useful personal history traces, which could be superposed on a page graph. This would be really useful in terms of tracking interests and learning. Obviously, it would need to be able to be securely private, but also sharable.
Maybe we could think of a "my history" page as just another wiki page on 'my' server, but instead of being updated with backlinks, it would be updated every time I visited a page.
Of course, there is also the possibility of tracking visitors in some way. What are the established methods, involving (essential?) cookies or not?
All browsers seem to record viewing history for that specific browser, but cross-browser histories seem in the past to have been delegated to bookmarking sites. There must be a lot of work already done somewhere on piecing together viewing statistics, and it would seem sensible to include this kind of facility as standard in the wiki software.
Some other ideas are faintly connected to this and to Edit history – like
xAPI
and
ActivityPub
– see also
https://en.wikipedia.org/wiki/Comparison_of_software_and_protocols_for_distributed_social_networking
though it's not social networking we're doing here.
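A rough sketch of the "my history" page idea mentioned above: each page visit appends a small record to a history file held on 'my' server. The file location and field names are assumptions, loosely in the spirit of xAPI statements rather than conforming to them.

```python
# A rough sketch of appending a visit record to a personal history store;
# the file location and field names are assumptions.
import json
import time
from pathlib import Path
from typing import Optional

HISTORY_FILE = Path("my-history.jsonl")   # hypothetical per-person history store

def record_visit(page_url: str, came_from: Optional[str] = None) -> None:
    entry = {
        "visited": page_url,
        "referrer": came_from,
        "at": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
    }
    with HISTORY_FILE.open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")

record_visit("https://wiki.example.org/KnowledgeCommons",
             came_from="https://commons.example.net/Search")
```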
Various different semantic aspects of the page graph should be able to be viewed graphically.
Some people work much better that way, and I've heard it called for over and over again.
It could just be a matter of ensuring that semantic data can be exported in a form that is recognised by an existing graphing tool. The page graph is not a page, so coming into a page from the page graph does not create a backlink, but its use is recorded in the viewing history.
What has this kind of tool built in?
Cmaptools
can show a graph with each node linked to a web page, but maybe the reverse is not so easy?
https://edotor.net/
looks interesting; the Graphviz DOT language it renders does support labelled arcs (via the edge label attribute), so it may be able to display the semantic links required above.
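A minimal sketch of exporting the typed page graph as labelled DOT, which such a tool could then render; the triples and predicate names are made up for illustration.

```python
# A sketch of exporting the typed page graph as Graphviz DOT with labelled
# arcs; the triples are made up for illustration.
triples = [
    ("SoilCarbonStudy", "givesEvidenceFor", "RegenerativeFarming"),
    ("CounterCase2019", "isCounterexampleTo", "RegenerativeFarming"),
]

def to_dot(triples):
    lines = ["digraph pagegraph {"]
    for source, predicate, target in triples:
        lines.append(f'  "{source}" -> "{target}" [label="{predicate}"];')
    lines.append("}")
    return "\n".join(lines)

print(to_dot(triples))   # the output can be pasted into any DOT renderer
```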
Where content is quoted or reused by someone else, it should be as automatic as possible to include provenance and attribution information, or at least to be able to trace back to original sources.
Ideas often get messed around when they are quoted and reused. While we can't stop people plagiarising ideas, if it were really easy and automatic to cite the source, more people would do it.
I know very little about how fedwiki does it. What is apparent is that fedwiki can compare two similar pages, to show what is the same and what different. While this is useful, by itself it doesn't facilitate e.g. compilation or comparison pages, where what you want is to be able easily to see sources from several different pages.
Fedwiki
does this in a very interesting way.
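As an illustration only, the provenance that travels with a quoted passage might be as simple as a small record like the following; the field names are assumptions, not an existing standard (terms from PROV-O or Dublin Core could equally be used).

```python
# Purely illustrative: a provenance record that might travel with a quoted
# passage. Field names are assumptions, not an existing standard.
quoted_block = {
    "text": "Knowledge in the commons needs to be findable.",
    "source_url": "https://wiki.example.org/KnowledgeCommonsRequirements",
    "source_revision": "2021-08-20T10:15:00Z",   # revision identifier at quoting time
    "author": "Simon Grant",
    "licence": "CC BY-SA 4.0",                   # hypothetical licence field
}
```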
Provide effective means for likely users to be able to find the information they are likely to be interested in. To relate to the rest of the world, pages need to be searchable by major search engines, as well as any search facility that is created specifically to cover knowledge commons.
Knowledge in the commons needs to be findable. To the extent that commoners fail to find the information they are looking for, the knowledge commons has failed in its primary purpose.
Google and many other services start from text strings, and use various other techniques to make the search better. Internal text-string search is common, so presumably poses no great challenges; but what about AI-enhanced internal search, and, much more challenging, cross-wiki search?
Semantic search such as
SPARQL
is an obvious path to explore. But how can this work cross-server? Would we need some kind of specialised server keeping an always-updated record of the complete page graph across the current knowledge commons? (and could that effectively define the extent and boundaries of this commons?)
The challenges seem to be adding AI to local search, and doing any kind of search across a distributed knowledge commons. Of course, the semantic and interlinked nature of this work will make browsing-based search far more effective, and will therefore provide an alternative to a lot of the AI functionality, an alternative that should be more effective in some contexts (but not in others).
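If such an aggregating SPARQL endpoint existed, querying it would be straightforward. Here is a sketch using only the Python standard library; the endpoint URL and the kc: vocabulary are hypothetical.

```python
# A sketch of querying a hypothetical aggregating SPARQL endpoint; the
# endpoint URL and the kc: vocabulary are assumptions.
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://sparql.example.org/kc/query"   # hypothetical aggregator

query = """
PREFIX kc: <https://example.org/kc/terms/>
SELECT ?page WHERE {
  ?page kc:givesEvidenceFor <https://wiki.example.org/RegenerativeFarming> .
}
"""

request = urllib.request.Request(
    ENDPOINT,
    data=urllib.parse.urlencode({"query": query}).encode("utf-8"),
    headers={"Accept": "application/sparql-results+json"},
)
with urllib.request.urlopen(request) as response:
    results = json.load(response)

for binding in results["results"]["bindings"]:
    print(binding["page"]["value"])
```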
This may not be in the MVP. Carry over provenance information along with selected text, so that it is automatically included in the wiki you are creating. For browsers displaying local files, allow only relative links.
This goes along with attribution tracking, above. As there, the easier it is to do, the more likely people are to do it, and therefore to allow the system to track attribution easily.
I don't know whether selecting text from a browser automatically includes URL information or not. If not, then perhaps all distributed wiki sites could include that in the client-side code? I would not include editing history with this. Editing history can be kept at the source server.
In principle, I would guess this would depend on the
DOM
.
Fedwiki
allows something similar, but its DOM is incompatible with the rest of the web.
As a follow-on from edit history, attribution tracking and drag-and-drop, provide an indication on the quoting page of whether a part of a quoted page (that has been excerpted and put into the quoting page) has been changed.
A page author borrowing / reusing material from another page would find it helpful to know if that material has been changed, and be free to decide whether to reflect that change in the quoting page. Probably not MVP.
This doesn't necessarily need to be separate functionality, because one page could look at another to see if a quoted section has been changed or not. The question is how notification of changes is managed. The quoting page could do an explicit check for each of the passages that are quoted, and if the source passage has changed (not just the whole quoted page) then that passage can be marked in some way. The quoting page owner can then decide whether to update the passage in line with the updating in the quoted page, or not. Either way, the status of the quoted passage can be seen (directly or indirectly) by the reader of the quoting page.
It's perhaps not vital, and there are no known implementations of exactly this, though Fedwiki does something related.
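One plausible mechanism, sketched below: the quoting page keeps the quoted text (or a hash of it) and periodically checks whether that passage still appears verbatim in the current source text. How the current source text is fetched, and storing a hash instead of the full passage, are left out here.

```python
# A sketch only: check whether a quoted passage still appears verbatim in
# the current text of the source page.
def passage_has_changed(quoted_text: str, current_source_text: str) -> bool:
    normalise = lambda s: " ".join(s.split())   # ignore whitespace-only edits
    return normalise(quoted_text) not in normalise(current_source_text)

# The quoting page could run this for each quoted passage when rendering,
# and mark changed passages for the page owner to review.
```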
In conjunction with edit history (see above), provide one or preferably several ways in which interested users can easily be alerted to what has changed, possibly in conjunction with a watch-list.
People who have contributed to a jointly authored page, or any page which is editable by others, may have an interest in keeping track of other contributions. The point is that they may have established their own understanding of the page topic or material, and to maintain the effectiveness of this page in their own personal knowledge bank, they need to ensure that no contributions have materially changed the concepts; or if they have, they would want the opportunity either to become familiar with the new material or to make further changes to restore comprehensibility.
This is closely related to the above point about change indications, as it would be helpful for other linked wikis to be alerted when something has changed on a linked page, so that they can in turn alert their owners.
Provided users have an effective ID, it is very easy to allow people to record their interest in any number of individual pages, as MediaWiki does. For some sites, an RSS or Atom feed would be useful, as many people use these to scan for changes that may be of interest. This can work however coarse- or fine-grained the change record may be.
Mediawiki
offers registered users the ability to “watch” pages, meaning that every time that page is changed, the user receives an e-mail. Probably several other wiki systems do likewise.
Some wiki systems (research to be filled in here) offer an RSS or Atom feed of changes, similar to a blog.
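Producing such a feed needs nothing beyond a list of recent changes and some XML generation. A rough Atom sketch using the Python standard library, with illustrative URLs and timestamps:

```python
# A sketch of generating an Atom feed of recent changes with the standard
# library. The change records and URLs are illustrative.
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
ET.register_namespace("", ATOM)

def changes_feed(site_url, changes):
    feed = ET.Element(f"{{{ATOM}}}feed")
    ET.SubElement(feed, f"{{{ATOM}}}title").text = "Recent changes"
    ET.SubElement(feed, f"{{{ATOM}}}id").text = site_url
    ET.SubElement(feed, f"{{{ATOM}}}updated").text = changes[0]["when"]
    for change in changes:
        entry = ET.SubElement(feed, f"{{{ATOM}}}entry")
        ET.SubElement(entry, f"{{{ATOM}}}title").text = change["page"]
        ET.SubElement(entry, f"{{{ATOM}}}id").text = change["url"]
        ET.SubElement(entry, f"{{{ATOM}}}updated").text = change["when"]
        ET.SubElement(entry, f"{{{ATOM}}}link", href=change["url"])
    return ET.tostring(feed, encoding="unicode")

print(changes_feed("https://wiki.example.org/",
                   [{"page": "FrontPage", "url": "https://wiki.example.org/FrontPage",
                     "when": "2021-08-20T10:15:00Z"}]))
```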
The system should be compatible enough with web standards to allow some meaningful interoperability, e.g. import and export, with the rest of the web.
HTML, PDF and plain text are all very different in structure –
lightweight markup languages
were developed to bridge some of the gap between text and HTML. Word processor export to HTML varies enormously in quality and verboseness. If a knowledge commons is to be built in a reasonable time, there needs to be import from other formats. Unless there is effective export to other formats, people are likely to be cautious about committing time to what would be seen as a 'walled garden'.
There is no need for a basic knowledge commons system to comply with the full
WHATWG
version of the
web page DOM
. For export, it should be relatively easy for a knowledge commons wiki to export to a subset of HTML, and from there, existing software such as
Pandoc
can take it on to many different formats. Import looks potentially more troublesome. It is possible to filter HTML in various ways, so if a clearly defined subset of HTML can be mapped to the wiki format, it should be possible first to filter the HTML so that only those elements occur, and then to convert the result into the wiki format.
It would make sense to me if wiki pages had a DOM representation consistent with the HTML generated.
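To give a flavour of the import route just described, here is a rough sketch of filtering HTML down to a small allowed subset using Python's standard html.parser. The allowed set is illustrative, and a real importer would also need attribute sanitising and a mapping stage (e.g. via Pandoc).

```python
# A sketch of filtering incoming HTML down to an allowed subset of elements
# before mapping it to wiki format. The allowed set is illustrative only.
from html.parser import HTMLParser

ALLOWED = {"p", "a", "em", "strong", "ul", "ol", "li",
           "h1", "h2", "h3", "blockquote", "code", "pre"}

class SubsetFilter(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED:
            if tag == "a":
                href = dict(attrs).get("href", "")
                self.out.append(f'<a href="{href}">')
            else:
                self.out.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in ALLOWED:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

filterer = SubsetFilter()
filterer.feed('<div class="fancy"><p>Keep <b>this</b> <a href="https://example.org">link</a>.</p></div>')
print("".join(filterer.out))   # <p>Keep this <a href="https://example.org">link</a>.</p>
```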
Unfortunately,
fedwiki
sucks on this issue. Each paragraph-like element has a type, with a default type of 'paragraph' which is ended by a carriage return. However, two of the types are 'HTML' and 'markdown', and HTML in particular is able to produce very long and complex items, themselves including paragraphs and other material, without that being accessible to the fedwiki system. In essence, using the HTML element breaks the whole concept of paragraph entities. Lists are a total joke!
Have a way by which people can add comments on specific parts of the content without affecting that content; and (more ambitiously) suggest changes that can be accepted very easily.
Commenting and suggesting modes are very well used in Google Docs, for the simple reason that they are very useful. In Wikipedia you have to actually change the text, and risk being reverted. Where there is a suggestion facility, it makes it clearer that the suggester is not certain whether it is an improvement in the eyes of others, and less face is lost if a suggestion is rejected than if an edit is reverted.
I have no idea how this is implemented, but it clearly isn't simple. It will also make the data model quite a lot more complex.
Commenting is more widespread. Apart from Google Docs and similar, you could see Git as effectively serving the need for suggestion, but you don't get the immediacy that you get with Google Docs. You could also see Fedwiki as meeting a very similar need, in that someone can make their own fork of a page, change some part of it, and the original author can bring the changed version back to theirs. But that is the only way you can do it on Fedwiki.
template:
requirement short description
text on why it is desirable
text on how
How vital is this? Where has it been done before?