This question about Topic Markup Language and applications: Answered

Formatted Search with regular expressions

I have problems with regular expressions.

I need to search for all topics whose names start with "AbteilungsPortal", and I want to show the first line (the heading) of each topic rather than a fixed number of characters, as the summary does.

The header starts with + and ends with a CR/LF

Thanks

Andreas


%SEARCH{
  "^[^a-zA-Z0-9]*AbteilungsPortal"
  type="regex"
  format="$pattern(^([^a-zA-Z0-9]*?AbteilungsPortal.*?).*)"
}%

In the search string:
  • First ^ anchors the search at the beginning of the topic (or if we had multiple="on", it would anchor to beginning of the line)
  • [^a-zA-Z0-9] matches non-alphanumeric characters
  • The following * means match the non-alphanumeric characters zero or more times
  • Then AbteilungsPortal must follow that pattern.
In the pattern string:
  • ^ anchors the match to the beginning
  • ( begins the pattern to be extracted
    • [^a-zA-Z0-9] matches non-alphanumerics
    • Following * means match the non-alphanumerics zero or more times
    • ? makes the match "non-greedy" (in combination with * - match zero or more times until the first occurrence of AbteilungsPortal)
    • .*? : . means "any character", * means "zero or more times", ? means "non-greedy"
  • ) finishes the pattern to be extracted
  • .* finishes the regex. In Foswiki, we must always finish $pattern() in this way
Result shown on this topic:

Searched: ^[^a-zA-Z0-9]*AbteilungsPortal
Number of topics: 1
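
The two expressions above can also be tried outside Foswiki. The sketch below uses Python's `re` module, whose syntax is close to, but not identical to, the Perl regular expressions Foswiki uses. The sample topic texts and the helper names `search_hits` and `extract` are made up for illustration, and the assumption that `.` also matches newlines inside `$pattern()` is not confirmed in this thread.

```python
import re

# The two expressions from the %SEARCH above, copied verbatim.
SEARCH_RE = r"^[^a-zA-Z0-9]*AbteilungsPortal"
PATTERN_RE = r"^([^a-zA-Z0-9]*?AbteilungsPortal.*?).*"


def search_hits(topic_text):
    """Return True if the search string matches at the start of the text."""
    return re.search(SEARCH_RE, topic_text) is not None


def extract(topic_text):
    """Return what the parenthesised group captures, or '' if nothing matches.

    Assumption: re.DOTALL is set so that '.' also matches newlines.
    """
    m = re.search(PATTERN_RE, topic_text, flags=re.DOTALL)
    return m.group(1) if m else ""


# Hypothetical topic texts, not taken from a real wiki.
sample = "---+ AbteilungsPortal Elt\nSome body text\n"
other = "See AbteilungsPortal for details\n"

print(search_hits(sample))    # only non-alphanumerics precede the word
print(search_hits(other))     # letters come first, so the anchored match fails
print(repr(extract(sample)))  # the text captured by the parentheses
```

In this sketch `extract(sample)` returns `'---+ AbteilungsPortal'`: the lazy `.*?` inside the parentheses is allowed to match nothing, because the trailing `.*` absorbs the rest of the text. To capture more, something after the `.*?` inside the group has to force it to keep going, for example a line ending.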

-- PaulHarvey - 04 Apr 2010

Dear Paul,

I think I did not explain my problem very well.

In the company I work for we have a lot of departments (German: Abteilung).

Each department will have a portal topic. The topic names are AbteilungsPortalA, AbteilungsPortalB, AbteilungsPortalC, and so on. A portal topic starts with "---+!! KA-1 Machinery construction", for example.

My own try:
%SEARCH{"AbteilungsPortal" scope="topic" nonoise="on" format="[[$topic]] $pattern(.*?---\+!!*([\n\r]+).*)"}%
I search for topics whose names contain "AbteilungsPortal", and I want to show the first line of each topic found.

Andreas

UPDATE:

I got it...mostly. (It helps to read the manual carefully!)
%SEARCH{
  "AbteilungsPortal" 
   scope="topic" 
   nonoise="on" 
   format="[[$topic]] $pattern(.*?([:blank:].*?([\n\r]+)).*)"
}%

New problems:

The topic I'm looking for ("AbteilungsPortalElt") starts with a heading1:
---+!! ELT-Abteilung
---++ Internal documents

The search-result is

AbteilungsPortalElt LT-Abteilung

The 'E' is suppressed! If the text starts with 'A' or 'K', it is shown correctly.

I also tried to use the string found as a link.
%SEARCH{
  "AbteilungsPortal" 
   scope="topic" 
   nonoise="on" 
   format="[[$topic][$pattern(.*?([:blank:].*?([\n\r]+)).*)]]"
}%

But this did not work?!?

Andreas


I think it is a current bug that [:classes:] are not recognised by $pattern(), and I don't know if it is an easy fix (we don't want to prevent future non-grep search algorithms - Development.NormaliseRegexSyntax and Development.AddMatchOperatorToQueryLanguage have some background).

Anyway, $pattern() is treating the [:blank:] literally: matching the :, b, l, a, n, k characters. I would suggest using \s instead, but I seem to recall that $pattern() doesn't handle that notation either; I could be wrong though. That is why I wrote a pattern to match non-alphanumeric characters: [^a-zA-Z0-9]
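
The "LT-Abteilung" result reported earlier in this thread is consistent with that reading. The Python sketch below reproduces it under two assumptions that are not confirmed here: that `$pattern()` matches case-insensitively, and that `.` also matches newlines. Python's `re` has no POSIX classes at all, so it always reads `[:blank:]` as a plain set of the characters `:`, `b`, `l`, `a`, `n`, `k`. The helper name `first_group` is made up for illustration.

```python
import re

# The pattern from the question, copied verbatim.
PATTERN = r".*?([:blank:].*?([\n\r]+)).*"


def first_group(text):
    """Return the text captured by the outer parentheses, or '' if no match."""
    # Assumption: case-insensitive matching, and '.' matches newlines.
    m = re.search(PATTERN, text, flags=re.IGNORECASE | re.DOTALL)
    return m.group(1) if m else ""


# Topic beginnings quoted in the thread.
elt = "---+!! ELT-Abteilung\n---++ Internal documents\n"
ka = "---+!! KA-1 Machinery construction\n"

print(repr(first_group(elt)))  # capture starts at the first of : b l a n k
print(repr(first_group(ka)))   # 'K' is in the set, so the heading is whole
```

Here 'E' is not one of the six characters, so the capture for the first topic starts at the 'L', while a heading that begins with 'K' or 'A' is captured whole. The captured text also ends with the newline, because `([\n\r]+)` sits inside the outer parentheses; that may be why the `[[$topic][...]]` link form does not render, though this thread does not confirm it.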

-- PaulHarvey - 06 Apr 2010

Try writing the class as [[:blank:]] - classes have to be within double square brackets. I'm not sure about the rest of the regex.

-- GeorgeClark - 07 Apr 2010

Dear Paul,

Your idea with [^a-zA-Z0-9] is good. Now I get the results I want!
  • Note you may want to also try George's note that the character classes look like [[:blank:]] instead of [:blank:]. This would be better, especially because [^a-zA-Z0-9] does not contain accented characters, etc.
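
The point about accented characters can be seen in a small Python sketch. The first pattern is one possible way to skip the heading markup with `[^a-zA-Z0-9]`; it is a guess, not necessarily the expression used above. The second is a hypothetical alternative that names the heading markup (`---`, one or more `+`, an optional `!!`) explicitly. The accented heading is invented sample data, and Python's `re` is only an approximation of the Perl regular expressions Foswiki uses.

```python
import re

# Guess at a pattern that skips leading non-alphanumerics (not from the thread).
SKIP_NON_ALNUM = r"^[^a-zA-Z0-9]*(.*?)[\n\r]"
# Hypothetical alternative: name the heading markup instead.
SKIP_HEADING_MARKUP = r"^---\++(?:!!)? *(.*?)[\n\r]"


def heading(pattern, text):
    """Return the first captured group, or '' if the pattern does not match."""
    m = re.search(pattern, text)
    return m.group(1) if m else ""


plain = "---+!! ELT-Abteilung\n---++ Internal documents\n"
accented = "---+!! Änderungen\n---++ Intern\n"  # invented example

print(heading(SKIP_NON_ALNUM, plain))          # heading text without markup
print(heading(SKIP_NON_ALNUM, accented))       # the leading accented letter is skipped too
print(heading(SKIP_HEADING_MARKUP, accented))  # accented letter is kept
```

With the first pattern the accented heading comes back as "nderungen", because "Ä" is outside a-z, A-Z and 0-9 and is swallowed together with the markup. Inside `$pattern()` either expression would still need the trailing `.*` described earlier in this thread.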

I made some attempts in the Sandbox. If you (or somebody else) have time, please have a look! SandboxAndreas

-- AndreasEllguth - 07 Apr 2010

Thank you for the very clear questions you wrote.

I have moved them into the Support web, because they are a nice series of questions that could be useful to other users. I hope you don't mind.

NewlinesAndFormattedSearch

-- PaulHarvey - 07 Apr 2010

QuestionForm

Subject Topic Markup Language and applications
Extension
Version Foswiki 1.0.9
Status Answered
Topic revision: r9 - 07 Apr 2010, PaulHarvey