In less than a week, a new proposal for a group project will be discussed at WordCamp Europe / #WCEU (WP Cafe in Berlin @ 15:30 on Friday, the 21.06.2019):
TAGSEO is a collaborative group project to develop, manage and improve advanced search capabilities, search functionality and search engine optimization within WordPress. Put simply, it is an open source search engine based on WP data structures.
About two years ago, I asked Matt Mullenweg a question. Basically, I suggested WordPress ought to recognize its role as an open source search engine, and to realize that this is something the world actually desperately needs. There’s a video of the Q&A event in Paris on the homepage of wordpress.tv, and I created a deep link to my particular question here.
To summarize, Matt’s
answer was excellent. He addressed all of the main relevant issues
with great technical expertise. And he also addressed one of the main
critical shortcomings – and so that issue is something I want to
Matt acknowledged that “You could have sites that tie together better than they do today”.
In the following, I want to compartmentalize various aspects of the WordPress community. By and large, WordPress consists of humans and technology, along with the interactions among them: between humans and other humans, between humans and technology, and between one kind of technology and another. All in all, it really is quite a complex system, but at first glance – taking a wide scope, or a so-called “bird’s eye view” – one can identify a sort of triangle, with each corner representing different “players” in a game of interaction:
Authors and such
organization and such
Obviously, this is a
vast oversimplification, because these groups are actually not
completely distinct from one another. For example, when an internet
user (normally considered to be a “reader” or “audience”
participant) types in a URL or uses their mouse to move the cursor or
to click on something, they are actually sending
information rather than receiving
Yet the reason why I
consider these groups useful is because they embody what I consider
to be quite clear information asymmetries within the WordPress
I apologize for this
diagram – that’s why I call it “Completely Ridiculous and
Cringeworthy”. If it doesn’t make you cringe, then congratulations!
Let me explain. It’s
basically just a bunch of strings that flew into my head, having to
do with interactions within the WordPress community. They are
arranged in a way that might indicate something about the types of
interactions that take place. So if a creator takes out their
smartphone and shoots a picture, and then posts that picture to their
blog, then that might qualify as a “tech” (hardware) interaction.
Yet if the same person uses their smartphone to open a browser
window, then that technology is being used to consume information. In
both of these cases, WordPress might also be involved. When WordPress
delivers HTML to a browser, then that would be a “user”-oriented
interaction. When WordPress reads information from a database, that
would generally be an interaction with (something like) an “author”.
The crux of the matter
becomes more obvious when I point out that even though WordPress
software is open source, information asymmetries continue to exist
among these three subgroups – perhaps in many ways, but at any rate
in one very crucial way: the amount of data available to each
In the remainder of this
blog post, I wish to focus on one type of data that is
especially unevenly distributed: the semantic organization of blogs
via “tag” and “category” data.
Before I can
underscore how valuable such information is to what Tim Berners-Lee
once envisioned as a “semantic web”, I need to point out that the
semantic organization of a blog is not an issue related to a single
blog post. Instead, it is only when someone is able to survey all
of the blog posts that each individual blog post becomes meaningful –
much in the same way as the words uttered in a sentence become
meaningful if and only if the entire sentence can be considered
meaningful. Another good example of this phenomenon is almost any
classification system. The conundrum of whether particular life forms
ought to be classified as plants or animals only makes sense for
classifying life forms, not for rocks, (inorganic) chemical compounds
or elements (which are all considered to be non-living). In all of
these cases, the meaning of something is at least in part derived
from what it is not,
there is a dichotomy of content
the optical “illusion” of foreground vs. background (commonly
known from the vase vs. two faces silhouette image) only makes sense
when contrasting black vs. white.
something is categorized as “business” makes sense insofar as it
perhaps is also)
categorized as “politics”, or maybe “health” becomes
particularly meaningful if and only if it is clearly stated that
“health” is a “science” and that all instances of the
semantic category “science” actually refer to “sciences
excluding health”. (For those interested in learning more about the
science of thesaurus design and/or vocabulary control [both subsets
of the field of information science], such annotations have been
traditionally referred to as “scope notes” 😉 )
What does all of this have
to do with the purported information asymmetry?
I’m glad I asked! 😀
Without being able to view all of the category information and/or all of the tag information, a user or reader cannot really make a reasonable estimate of what this particular blog post is about. If a particular blog post is tagged “fake news”, does that mean it is not about “propaganda”? If another particular blog post is tagged “spam”, does that mean it is also about “advertising”? The way WordPress works now, the author always knows, the user never (or at least extremely rarely*) knows, and the WordPress organization sometimes knows.
There are valid reasons for such asymmetries. As Esther Dyson pointed out well over a decade ago, the price of copying data is basically nil. If someone made all the data in their database freely available, then this data would exist in innumerable copies at the drop of a hat. Yet even though providing carte blanche access to everything may seem risky, bots like Google will almost certainly pore over anything and everything remotely available in order to make a pretty penny off of it anyways.
Google is not your friend.
Facebook is not your friend. There is a long list of companies and
actors out there, none of whom is your friend. Does the WordPress
community need to move towards establishing and promoting
friendly relationships? I think it should, and I even have a concrete
suggestion about how to do it.
I think WordPress should
establish something like an open marketplace for ideas. Now this is
very complicated, very abstract, not yet very clear and also not yet
completely worked out, so please bear with me.
First of all, there should
be very low barriers to entry – think “free” (but it would
actually not be 100% free, as there’s no such thing).
Second, it would be – at least potentially – low risk. Market participants would basically be allowed to introduce themselves to each other. “OMG!” I hear you say: “NO WAY!! I get enough robocalls and other spam already!” Just a minute – remember I said it would not be 100% free. In order to introduce themselves (i.e., their blog), participants would need to lay bare their entire list of categories and tags – metaphorically speaking, they would be required to drop their pants and expose themselves. Perhaps there might also be an opportunity for a third party to verify that the information provided is not spoofed in some way, but even without verification it seems quite obvious that the recipient of such an introduction could quickly and easily ascertain whether there is any chance at all that such a “semantic friendship” might make sense (or whether on the other hand it seems too risky).
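To make the idea of such an introduction a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration only: the `Introduction` record, the `overlap` measure (a plain Jaccard similarity over the disclosed vocabularies) and the `worth_considering` threshold are hypothetical names, not part of WordPress or of any existing plugin. It simply shows how a recipient might quickly gauge whether a “semantic friendship” has any chance of making sense.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Introduction:
    """Hypothetical payload: a blog lays bare its full list of categories and tags."""

    blog_url: str
    categories: frozenset[str] = field(default_factory=frozenset)
    tags: frozenset[str] = field(default_factory=frozenset)

    def vocabulary(self) -> frozenset[str]:
        # The complete semantic vocabulary the blog has disclosed.
        return self.categories | self.tags


def overlap(mine: Introduction, theirs: Introduction) -> float:
    """Jaccard similarity of the two vocabularies, between 0.0 and 1.0."""
    a, b = mine.vocabulary(), theirs.vocabulary()
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def worth_considering(mine: Introduction, theirs: Introduction,
                      threshold: float = 0.2) -> bool:
    # The threshold is an arbitrary, assumed cut-off; a real recipient
    # would probably want to look at the actual terms as well.
    return overlap(mine, theirs) >= threshold
```

A recipient could call `worth_considering(my_blog, incoming)` before deciding anything. Verification by a third party, as mentioned above, is deliberately left out of this sketch.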
If the recipient were to
decide to accept the offer of friendship, then both participants
would be able to view each other’s current semantic organization and
also to receive updates regarding changes in the semantic
organization of each other’s websites… – but wait: there’s even
I think it would be very valuable if such friends could declare which semantic categories they especially agree on. For example, perhaps friends do not particularly agree on “health” topics, but they pretty much completely agree when the topic is “sports”.
Let’s imagine that one market participant has 100 such friends, that they agree with 5 of them on the topic “business”, and that with 1 of these 5 they also agree on “politics”. Now consider what could happen when a user visits this participant’s website and views an article (or blog post) about business. Imagine how useful it could be for that user to see that there are 5 related blogs, 1 of which is also related from a second perspective. Now imagine an adjustable slider were also available, offering new and different perspectives by linking to other categories of information that are available from (and perhaps even linked to or aligned with “business” by) those 5 friends who agree on the “business” category. In this way, the categories of agreement could function as springboards to other categories which are not even on the map for the current blog, thereby providing additional, new perspectives related to the current topic. For example, if another group of friends agrees with the current blog on the topic “real estate”, and the current blog’s focus is New York, then other blogs may offer unique perspectives from Los Angeles, San Francisco, Miami, etc.
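The scenario with 100 friends can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `Friend`, `related_blogs`, `springboards` and `demo_friends` are hypothetical names invented here, the URLs are placeholders, and the adjustable slider is not modeled at all. The sketch shows just the lookup of friends who agree on a category, and the “springboard” categories those friends have that the current blog lacks.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Friend:
    """Hypothetical record of one semantic friendship."""

    blog_url: str
    categories: frozenset[str]  # the friend's own, fully disclosed categories
    agreed: frozenset[str]      # categories both blogs declared agreement on


def related_blogs(friends: list[Friend], category: str) -> list[Friend]:
    """Friends who declared agreement with the current blog on `category`."""
    return [f for f in friends if category in f.agreed]


def springboards(own_categories: frozenset[str], friends: list[Friend],
                 category: str) -> dict[str, list[str]]:
    """For friends agreeing on `category`, list the categories they have
    that are not on the map for the current blog."""
    result: dict[str, list[str]] = {}
    for f in related_blogs(friends, category):
        extra = f.categories - own_categories
        if extra:
            result[f.blog_url] = sorted(extra)
    return result


def demo_friends() -> list[Friend]:
    """Made-up data: 100 friends, 5 agree on "business", 1 of those also on "politics"."""
    friends = []
    for i in range(100):
        agreed = set()
        if i < 5:
            agreed.add("business")
        if i == 0:
            agreed.add("politics")
        categories = set(agreed)
        if i == 1:
            categories.add("real estate")  # a category the current blog lacks
        friends.append(Friend(f"https://friend{i}.example",
                              frozenset(categories), frozenset(agreed)))
    return friends
```

With this made-up data, `related_blogs(demo_friends(), "business")` yields the 5 related blogs, and filtering those again by “politics” leaves the single friend who agrees from both perspectives.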
All of this is simply one
hypothetical example, and all of it is pretty much fantasy
out-of-the-box thinking. Why not try it? What could go wrong? What
other ways might also be useful? Where do we go from here? Anywhere?
Nowhere? Somewhere? Maybe somewhere better than what we have right