Feature Proposal: Want to view a page that lists all internal links that don't yet have a created wiki page.


Keeping track of links that go to pages that are not yet created.

Description and Documentation

It seems to me that a single page listing all internal links that don't yet have a wiki page would be a great way to see which areas of the wiki need attention and further development.





-- Contributors: KeithParker - 24 Aug 2012


That should be easy to do, I guess, since all dead links point to "/bin/edit/". Or might there be a conflict, or a problem with traffic usage? I would also appreciate such a "wanted pages" file!

-- LieVen - 22 Jul 2014

Actually it's not generally easy to do.

I've been writing a Foswiki Store using SQL, and because I wanted to provide fast link searching I also needed to track dead links. This requires scanning a topic on every save to find any potential links and storing them in the DB.

So, when my store is finished, I plan to offer this capability as a feature.
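
To make the scan-on-save idea concrete, here is a minimal sketch in Python with SQLite. It is not the store described above: the table layout, the `save_topic` hook and the simplified WikiWord pattern are all assumptions made for illustration.

```python
import re
import sqlite3

# Simplified WikiWord pattern (an assumption; Foswiki's real link rules are richer).
WIKIWORD = re.compile(r"\b[A-Z][a-z0-9]+(?:[A-Z][a-z0-9]*)+\b")

def open_db(path=":memory:"):
    """Create the two hypothetical tables: known topics and topic-to-topic links."""
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE IF NOT EXISTS topics (name TEXT PRIMARY KEY)")
    db.execute("CREATE TABLE IF NOT EXISTS links "
               "(source TEXT, target TEXT, UNIQUE (source, target))")
    return db

def save_topic(db, name, text):
    """Hypothetical save hook: record the topic and replace its outgoing links."""
    with db:  # one transaction per save
        db.execute("INSERT OR IGNORE INTO topics (name) VALUES (?)", (name,))
        db.execute("DELETE FROM links WHERE source = ?", (name,))
        db.executemany("INSERT OR IGNORE INTO links VALUES (?, ?)",
                       [(name, target) for target in set(WIKIWORD.findall(text))])

def wanted_topics(db):
    """Link targets that have no topic yet, i.e. the 'wanted pages' list."""
    rows = db.execute("SELECT DISTINCT target FROM links "
                      "WHERE target NOT IN (SELECT name FROM topics) "
                      "ORDER BY target")
    return [row[0] for row in rows]
```

Because the links are stored per source topic, creating a missing topic later removes it from the wanted list without rescanning anything else.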

-- JulianLevens - 22 Jul 2014

Julian is right that it isn't all that easy to do; here's why:
  1. The most obvious approach is to search all topics. This would be horribly slow.
  2. While internal links can be identified using a regular expression, it's not so easy to identify the context in which they are used (for example, inside verbatim blocks).
  3. Internal links can be composed using preferences. For example, I might have two words defined using preferences:
  • Set BOO = Boo
  • Set JUM = Jum
and then write %BOO%%JUM%. When the topic is viewed, this will be visible as an internal link, thus: BooJum. However there's nothing in the saved topic to identify it as such - and the value of either of those preferences might be overridden in different viewing contexts such that the link doesn't exist, or points somewhere else.
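
Problem 3 can be demonstrated in a few lines of Python. This is a toy model, not Foswiki's macro expansion: the `expand` helper and the simplified WikiWord pattern are assumptions, and only the BOO and JUM preferences come from the example above.

```python
import re

# Simplified WikiWord pattern (an assumption; Foswiki's real link rules are richer).
WIKIWORD = re.compile(r"\b[A-Z][a-z0-9]+(?:[A-Z][a-z0-9]*)+\b")
MACRO = re.compile(r"%([A-Z]+)%")

def expand(text, prefs):
    """Replace %NAME% with its preference value; unknown names are left as-is."""
    return MACRO.sub(lambda m: prefs.get(m.group(1), m.group(0)), text)

raw = "See %BOO%%JUM% for details."

# Scanning the saved text finds no link at all.
found_raw = WIKIWORD.findall(raw)

# Expanded with the preferences above, the link appears.
found_default = WIKIWORD.findall(expand(raw, {"BOO": "Boo", "JUM": "Jum"}))

# With JUM overridden in some other context, the link is gone again.
found_override = WIKIWORD.findall(expand(raw, {"BOO": "Boo", "JUM": "k"}))
```

Here `found_raw` is empty, `found_default` contains BooJum, and `found_override` is empty again because "Book" is not a WikiWord. The same saved text yields different links depending on the preferences in force.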

Setting aside the complexity of problems 2 and 3 for another day, you could easily write an offline (command-line) tool to do this using grep to scan the .txt files (at some cost to your server when it runs). Script left as an exercise for the reader.
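
One possible shape for such an offline tool, sketched in Python rather than grep. It assumes a web is a directory of `TopicName.txt` files, uses a simplified WikiWord pattern, and only looks at plain WikiWords within one web, so bracketed and Web.Topic links are ignored. Stripping verbatim blocks is a small nod to problem 2; problem 3 is not addressed at all.

```python
import re
from pathlib import Path

# Simplified WikiWord pattern (an assumption; Foswiki's real link rules are richer).
WIKIWORD = re.compile(r"\b[A-Z][a-z0-9]+(?:[A-Z][a-z0-9]*)+\b")
VERBATIM = re.compile(r"<verbatim\b.*?</verbatim>", re.DOTALL | re.IGNORECASE)

def wanted_in_web(web_dir):
    """Map each missing topic name to the topics in web_dir that mention it."""
    web = Path(web_dir)
    existing = {path.stem for path in web.glob("*.txt")}
    wanted = {}
    for path in sorted(web.glob("*.txt")):
        text = path.read_text(encoding="utf-8", errors="replace")
        text = VERBATIM.sub("", text)  # WikiWords in verbatim blocks are not links
        for word in set(WIKIWORD.findall(text)):
            if word not in existing:
                wanted.setdefault(word, []).append(path.stem)
    return wanted
```

Calling `wanted_in_web` on a web's data directory returns a dictionary such as `{"BooJum": ["TopicOne", "WebHome"]}`, which could be written out as a "wanted pages" topic. Every topic file is read on each run, so the server cost mentioned above still applies.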

-- CrawfordCurrie - 23 Jul 2014

Hi, first question: is there any progress with this issue? I guess not.

Next, I am not a programmer, but honestly, I'm sure there must be an easy way. I will not mention MediaWiki because I know they use an entirely different basis for programming. I would like to comment on your explanation:
  1. If it's slow, it could be done using a cron job that runs the search only once a day, or even one smart enough to search only the topics that have been edited since its last run.
  2. About internal links - what about doing a search of the finished HTML files? Those will indeed have even more links (like "edit", "copyright notice", etc), but any link will be clearly displayed as a link. Actually, there is free software on the web which does that automatically.
  3. Actually, I just tried that and noticed that it's difficult indeed, since dead links are not recognized as dead links, because they still lead somewhere. What I'd actually need is a search of the HTML files that reveals all the links which have the class "FoswikiNewLink". Any suggestions?
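
The "only topics edited since the last search" idea from point 1 could look roughly like this in Python. It assumes topics are `.txt` files in one directory and uses a hypothetical stamp file whose modification time records the previous run.

```python
from pathlib import Path

def changed_topics(web_dir, stamp_file):
    """Return topic files modified since the stamp file was last touched.

    The stamp file (a hypothetical marker, kept outside the topic files)
    is touched afterwards so the next run starts from this moment.
    """
    stamp = Path(stamp_file)
    last_run = stamp.stat().st_mtime if stamp.exists() else 0.0
    changed = sorted(path for path in Path(web_dir).glob("*.txt")
                     if path.stat().st_mtime > last_run)
    stamp.touch()
    return changed
```

A daily cron job would call this and rescan only the returned files. Note that this narrows only the scanning: when a topic is created or deleted, links in unchanged topics change status too, so the job still needs a stored list of links per topic to stay correct.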

-- LieVen - 19 Nov 2015

The problem is that HTML files do not really exist. Topics are assembled from INCLUDEd pieces, all macros are expanded, and finally the Foswiki::Render code renders the expanded TML into HTML. Apart from the page cache, that rendered HTML is not saved. So there is nothing to scan.

What might be possible is an extension that uses a completePageHandler to scan the rendered page for the links with "FoswikiNewLink". Those could then be logged for offline processing, or recorded into a database.
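
The scanning step of such an extension can be illustrated in Python; a real extension would be written in Perl against the plugin API. The sketch assumes the new-link class sits either on the `<a>` itself or on a `<span>` wrapping it, and it compares the class name case-insensitively because the exact spelling and markup may differ between versions and skins.

```python
from html.parser import HTMLParser

class NewLinkFinder(HTMLParser):
    """Collect hrefs of links marked with the new-link CSS class."""

    def __init__(self, css_class="foswikiNewLink"):
        super().__init__()
        self.css_class = css_class.lower()
        self.hrefs = []
        self._span_depth = 0  # > 0 while inside a <span> carrying the class

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").lower().split()
        marked = self.css_class in classes
        if tag == "span":
            if marked or self._span_depth:
                self._span_depth += 1
        elif tag == "a" and attrs.get("href") and (marked or self._span_depth):
            self.hrefs.append(attrs["href"])

    def handle_endtag(self, tag):
        if tag == "span" and self._span_depth:
            self._span_depth -= 1

def new_links(html, css_class="foswikiNewLink"):
    """Return the hrefs of all new-topic links found in a rendered page."""
    finder = NewLinkFinder(css_class)
    finder.feed(html)
    finder.close()
    return finder.hrefs
```

Matching on the class rather than on "/bin/edit/" in the URL avoids picking up the page's own Edit link. The returned hrefs could be appended to a log or written to a database, as suggested above.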

The problem is quite difficult, even with something like this. Consider the following.

TalkComment is an INCLUDE that generates a link using %BASETOPIC%Talk. It's a practical application, because maybe the ...Talk topic is generated only when someone comments on the page. Wherever TalkComment is actually included, a different link gets generated: TopicOneTalk, TopicTwoTalk ... and so on. And if I view the include directly, the link is TalkCommentTalk.

And this doesn't even touch on the subject of View Templates. Links might be rendered as part of a template, taking data from $formfields from the topic.

Anyway, it's really a very difficult issue to solve. Other wikis have it easier because they don't have capabilities for building wiki applications that are quite as powerful as Foswiki's.

-- GeorgeClark - 19 Nov 2015
Topic revision: r6 - 19 Nov 2015, GeorgeClark