Configuring Apache mod_proxy and mod_proxy_html for Foswiki

  • Tip Category - Installation and Upgrading
  • Tip Added By - PaulHarvey - 27 Feb 2012 - 00:09
  • Extensions Used -
  • Useful To - Experts
  • Tip Status - New
  • Related Topics -

Problem

You want to access Foswiki through a reverse proxy, for example to present some subwebs as the root of another domain.

Context

You are running Apache and can change its configuration.

Solution

QUESTION: Why bother with a reverse proxy when VirtualHostingContrib is available?

ANSWER: In this example, example.org/Foo is really a proxy for realwiki.org/Web/SubWeb/Foo, so a subweb of an existing installation appears as the root of another domain. Most people probably don't want this.

  • realwiki.org/ is the Foswiki installation you want to proxy
  • realwiki.org/ serves both http & https
  • example.org/ is the site we want to proxy from
  • example.org/System proxies realwiki.org/System
  • example.org/Main proxies realwiki.org/Main
  • example.org/Foo proxies realwiki.org/Web/SubWeb/Foo

The following fragment is common to both the http & https vhosts. Put it in a separate conf file and include it from each vhost, so that the http and https configurations cannot diverge:
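# Uncomment the next two lines for verbose mod_proxy_html diagnostics while debugging.
#LogLevel info
#ProxyHTMLLogVerbose On
# INFLATE;proxy-html;DEFLATE is necessary if the realwiki.org server is serving gzip compressed pages.
SetOutputFilter INFLATE;proxy-html;DEFLATE
# Act as a reverse proxy only, never as a forward proxy.
ProxyRequests Off
# Rewrite the domain of cookies set by realwiki.org so that they apply to example.org.
ProxyPassReverseCookieDomain realwiki.org example.org
# Rewrite links in the proxied HTML so that /Web/SubWeb/ appears as the site root.
ProxyHTMLURLMap http://example.org/Web/SubWeb/  http://example.org/
ProxyHTMLURLMap https://example.org/Web/SubWeb/ https://example.org/
ProxyHTMLURLMap /Web/SubWeb/ /
# Redirect requests that still use the long /Web/SubWeb/... form to the short URL.
RewriteEngine On
RewriteRule ^/Web/SubWeb(.*)$ $1 [R]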
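
One way to share the fragment between the two vhosts is Apache's Include directive. This is only a sketch: the file name conf/foswiki-proxy-common.conf is hypothetical (relative paths are resolved against ServerRoot), and the certificate directives and per-vhost Location sections are left as placeholder comments.

```apache
# Hypothetical layout: the common fragment above is saved as
# conf/foswiki-proxy-common.conf (name and location are assumptions).
<VirtualHost *:80>
    ServerName example.org
    Include conf/foswiki-proxy-common.conf
    # ... http Location sections go here ...
</VirtualHost>

<VirtualHost *:443>
    ServerName example.org
    SSLEngine On
    # SSLCertificateFile / SSLCertificateKeyFile for example.org go here.
    Include conf/foswiki-proxy-common.conf
    # ... https Location sections go here ...
</VirtualHost>
```

With this layout, a change to the shared directives is made once and picked up by both vhosts.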

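The following is an example https config. Repeat it for your http vhost, replacing https with http in the realwiki.org URLs (SSLProxyEngine is only needed when proxying to https URLs).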
SSLProxyEngine On

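# These scripts don't return HTML at all, so don't rewrite their responses.
# Note: the Apache documentation says that ProxyPass inside <LocationMatch> does not
# interpret the regexp, and names ProxyPassMatch as the directive to use in that case.
<LocationMatch "/bin/(rest|query)(auth)?">
    Order deny,allow
    Allow from all
    ProxyPass https://realwiki.org/bin
    ProxyPassReverse https://realwiki.org/bin
</LocationMatch>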

# All other scripts return HTML (we hope), but mod_proxy_html ignores any <!DOCTYPE sent by the realwiki server and forces its own.
# We force it to XHTML-Transitional; otherwise it tries too hard to clean the markup, and that strips important attributes off some HTML tags.
<Location /bin>
    Order deny,allow
    Allow from all
    ProxyPass https://realwiki.org/bin
    ProxyPassReverse https://realwiki.org/bin
    ProxyHTMLDoctype "<!DOCTYPE html PUBLIC '-//W3C//DTD XHTML 1.0 Transitional//EN' 'http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd'>" XHTML
</Location>

# Don't rewrite anything from pub.
<Location /pub>
    Order deny,allow
    Allow from all
    ProxyPass https://realwiki.org/pub
    ProxyPassReverse https://realwiki.org/pub
</Location>

# In this config, we are presenting the Main web as-is.
<Location /Main>
    Order deny,allow
    Allow from all
    ProxyPass https://realwiki.org/Main
    ProxyPassReverse https://realwiki.org/Main
    ProxyHTMLDoctype "<!DOCTYPE html PUBLIC '-//W3C//DTD XHTML 1.0 Transitional//EN' 'http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd'>" XHTML
</Location>

# In this config, we are presenting the System web as-is.
<Location /System>
    Order deny,allow
    Allow from all
    ProxyPass https://realwiki.org/System
    ProxyPassReverse https://realwiki.org/System
    ProxyHTMLDoctype "<!DOCTYPE html PUBLIC '-//W3C//DTD XHTML 1.0 Transitional//EN' 'http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd'>" XHTML
</Location>

# In this config, we are presenting the Web/SubWeb/ web as the root of the domain.
<Location />
    Order deny,allow
    Allow from all
    ProxyPass https://realwiki.org/Web/SubWeb/
    ProxyPassReverse https://realwiki.org/Web/SubWeb/
    ProxyHTMLDoctype "<!DOCTYPE html PUBLIC '-//W3C//DTD XHTML 1.0 Transitional//EN' 'http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd'>" XHTML
</Location>
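
The Order and Allow lines above are Apache 2.2 access-control syntax. On Apache 2.4 the same "allow everyone" intent is normally written with Require from mod_authz_core (the old directives only keep working if mod_access_compat is loaded). As a sketch, assuming Apache 2.4 and otherwise unchanged directives, the /Main section might look like this:

```apache
# Apache 2.4 variant of the /Main section (assumption: the other directives
# behave as they do in the 2.2 setup described above).
<Location /Main>
    Require all granted
    ProxyPass https://realwiki.org/Main
    ProxyPassReverse https://realwiki.org/Main
    ProxyHTMLDoctype "<!DOCTYPE html PUBLIC '-//W3C//DTD XHTML 1.0 Transitional//EN' 'http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd'>" XHTML
</Location>
```

The same substitution would apply to each of the other Location and LocationMatch sections.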

Known Uses

Known Limitations

PaulHarvey thinks that mod_proxy_html is doomed. Although I'm using it (apparently) without issues now, it tries to rewrite responses which are clearly NOT HTML (e.g. when the Content-Type header clearly states something like "application/json"), and it rather crudely strips out the DOCTYPE and replaces it with its own. I can't help but think that there will be corner cases which will get harder and harder to identify and isolate.
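
One possible way to keep the filter away from non-HTML responses, on Apache 2.4 only, is to let mod_filter select it by Content-Type. This is an untested sketch and not part of the setup described above. It assumes mod_filter and mod_headers are loaded, it would replace the SetOutputFilter line in the common fragment, and the filter name foswiki_html is made up for illustration.

```apache
# Ask realwiki.org for uncompressed responses, so that no INFLATE/DEFLATE
# pair is needed around proxy-html (assumption: mod_headers is loaded).
RequestHeader unset Accept-Encoding

# Run proxy-html only when the response Content-Type starts with text/html.
# "foswiki_html" is a hypothetical filter name.
FilterDeclare foswiki_html
FilterProvider foswiki_html proxy-html "%{CONTENT_TYPE} =~ m|^text/html|"
FilterChain foswiki_html
```

This only addresses the content-type concern. The DOCTYPE replacement still happens for every response that does pass through proxy-html.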

See Also

Topic revision: r1 - 27 Feb 2012, PaulHarvey