Quo vadis, or why Foswiki (maybe) does not want PSGI.

Don't panic. The title above may come as a surprise (even more so from me, who has been shouting about PSGI for years). But after reading much of the past IRC and Foswiki.Development discussions, learning about the current Foswiki development philosophy and (honestly) learning, more or less, how Foswiki works internally, the title reflects some of my concerns.

Because my English grammar is terrible:
  • I created a few images to make clearer what I mean, in the hope of avoiding misunderstanding (1 image = 1000 words). They are SVG, so easily editable (I just moved them to the topic QuoVadisPsgiFoswikiImages as a workaround for Tasks.Item13908).
  • To make sure the reader understands what I mean, this topic has grown too big. I would have no time to read one like it myself.
  • So, big sorry for the "too much"; everybody knows this trivia.

Anyway, it would be nice
  • if you read it through (hopefully it contains some usable info)
  • and repair (shorten, reword) some of my outpourings (this is a wiki, right?)

To start, a quote from an older version of the PSGI::FAQ:

My framework already does CGI, FCGI and mod_perl. Why do I want to support PSGI?

If your web application framework already supports most server environments, performance is good, and the backends are well tested, there may not be a direct benefit for you to support PSGI immediately -- though you would be able to remove any code that overlaps with PSGI backends.

Currently we have a well-tested Foswiki::Engine::* codebase for plain CGI, FCGI and mod_perl. So the following text tries to address this part: "though you would be able to remove any code that overlaps with PSGI backends".

The benefits of PSGI - for the web-app development

Now, forget Foswiki for a while. Probably everybody who is interested in PSGI already knows it deeply enough, so feel free to skip this chapter. I included it only because it is handy to have references for the later content, and mainly to define the terminology.

The Intro into PSGI

  • PSGI is the Perl version of Python's WSGI (Web Server Gateway Interface)
  • For other languages: Rack (Ruby), LuaWSGI, JWSGI (Java), JSGI/Jack (JavaScript), Clack (Common Lisp) and many others.
  • PSGI is the specification (a document)
  • Plack:: is the Perl namespace for the reference implementation (Perl code)
  • THE hero: Tatsuhiko Miyagawa; see his CPAN page, his blog, the Plack advent calendar and plackperl.org.

PSGI defines the interface, only 2 things:
  1. how the PSGI-server should execute a (PSGI compliant) application, and
  2. how the (PSGI compliant) application should respond to the PSGI-server
The PSGI compliant application ("PSGI application" for short):
  • is a Perl code reference,
    • It takes exactly one argument: the environment (a hash reference), usually called $env
  • The application should return an array reference containing exactly three values:
    • the response status (e.g. 200)
    • an array-ref containing the headers
    • an array-ref with the body, e.g. the content of the page (in fact, it could be a code reference or an IO-like object too, but this is a more advanced topic)
    • Everything is bytes(!), i.e. not Perl's internal wide characters (which means the "border" is well-defined)

Because the server executes the application by calling the code reference and expects an array-ref in return, the server must be written in Perl.
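To make the calling convention concrete, here is a minimal sketch of a toy "container" in core Perl. It is not how Plack does it; the sub name run_once and the $env values are made up for illustration, and only the plain array-ref response form is handled.

```perl
use strict;
use warnings;

# A PSGI application: a code reference that takes $env and returns
# [ status, [ headers ], [ body chunks ] ].
my $app = sub {
    my $env = shift;
    return [
        200,
        [ 'Content-Type' => 'text/plain' ],
        [ "You asked for $env->{PATH_INFO}\n" ],
    ];
};

# A toy "container" (hypothetical helper): builds a minimal $env, calls the
# application and serialises the returned array reference as an HTTP response.
sub run_once {
    my ($psgi_app, $path) = @_;
    my %env = (
        REQUEST_METHOD    => 'GET',
        SCRIPT_NAME       => '',
        PATH_INFO         => $path,
        QUERY_STRING      => '',
        SERVER_NAME       => 'localhost',
        SERVER_PORT       => 5000,
        SERVER_PROTOCOL   => 'HTTP/1.0',
        'psgi.version'    => [ 1, 1 ],
        'psgi.url_scheme' => 'http',
        'psgi.input'      => \*STDIN,
        'psgi.errors'     => \*STDERR,
    );

    my ($status, $headers, $body) = @{ $psgi_app->(\%env) };

    my $out  = "HTTP/1.0 $status\r\n";
    my @pairs = @$headers;                 # headers come as a flat name/value list
    while (my ($name, $value) = splice(@pairs, 0, 2)) {
        $out .= "$name: $value\r\n";
    }
    return $out . "\r\n" . join('', @$body);
}

print run_once($app, '/hello');
```

A real container does the same three steps (prepare $env, call the coderef, process the array-ref); it only differs in where the request comes from and where the response goes.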

And because the server also prepares the $env environment, much like a Java servlet container does, the server is often called an application server, application container, PSGI server or PSGI container. The "server" (or container) in PSGI terminology doesn't mean the web-server!

Each PSGI based application has two parts:
  1. the PSGI-server (container) part
  2. and the PSGI-application part

So, the terminology:

The terminology

Web server
Web servers accept HTTP requests issued by web clients, dispatching those requests to web applications if configured to do so, and return HTTP responses to the request-initiating clients. It is an application that processes requests and responses according to HTTP(S).
PSGI Server (Container)
A PSGI Server is a Perl program that provides an environment for a PSGI application to run in, calls the PSGI application and processes the returned array-ref. In the following text I will use the term PSGI-Container for this, and the term "web-server" for the ... ehm, web-server :).
PSGI applications
are web applications conforming to the PSGI interface, which prescribes that they take the form of a code reference with defined input and output.

Forms of the PSGI container

Because PSGI defines only the communication between the PSGI-container (aka PSGI-server) and the application, the PSGI container could be implemented in many ways:
  • it could be a standalone web-server & PSGI-container written entirely (or partly) in Perl. In such a case the combined PSGI+web-server
    • handles the communication with the web client
    • and also acts as a PSGI-container, i.e. calls the application's code-ref and provides the $env hashref to the app
  • it could be embedded into some web-server (like mod_perl); in such a case
    • the web server (e.g. Apache) handles the communication with the web client
    • and the PSGI-Container (as an embedded module in the web-server) calls the application (calls the code ref)
  • it could be connected to the web server (for example via FCGI); in such a case
    • the PSGI Server/Container (as an FCGI script) handles the communication with the web-server via FCGI
    • and calls the application coderef (PSGI)
  • it could also be invoked by the web server (plain old CGI)
    • in such a case, the web-server invokes the PSGI-Container as a CGI script
    • and the PSGI-container (invoked as a CGI script) calls the application code-ref

It is a very clean interface for developers, and therefore it is easy to develop the above containers for the different web-server deployment scenarios. But you don't need to: they're already done!
  • Just search MetaCPAN for Plack::Handler::
  • Plack::Handler:: modules exist (CGI, FCGI, Apache1, Apache2, CLI ... and many more) for nearly ANY possible usage scenario.

The basic PSGI benefits

As a result, the above means:
  • The PSGI-application must "know" only PSGI, so it is easy to develop; the developer doesn't need to care about the different "Engines".
  • The PSGI-Container handles (and hides) the details of the communication with the web-server and TRANSLATES the specific web-server interface (e.g. Apache's $r or FCGI sockets) to PSGI and vice versa.
  • The Perl-native constructs (subroutine reference and hash reference "in", array reference "out") open a whole new world for development: layering and splitting applications into smaller and more manageable parts, the middlewares.
  • And of course, the PSGI-application is runnable using any PSGI-container regardless of its implementation (standalone, embedded etc.), so deployment is easy.
  • In practice, the PSGI application is usually saved in app.psgi, which returns the code reference that gets called by the PSGI-container

The CGI.pm based apps and the PSGI

CGI and PSGI are different beasts.
                      CGI                                        PSGI
calling mechanism     fork/exec                                  subroutine reference
input "protocol"      STDIN + %ENV                               hash reference ($env)
output "protocol"     headers, empty line, content on STDOUT     array reference

The calling mechanism

As for the calling mechanism: the CGI app must be "wrapped" into a subroutine reference which can be called by the PSGI-container.

Two approaches:
  1. The CGI file from /cgi-bin is wrapped into a subref which is executed in a forked environment, i.e. it works exactly like the classic /cgi-bin/script (so it is slow, but works without any changes to the /cgi-bin/scripts)
  2. The developer replaces the /cgi-bin/scripts with a manually crafted subroutine reference, i.e. develops the wrapper subroutine(s) himself (for Foswiki this means 5 lines of code)

The "Protocol" translation

Because of the differences between the CGI and PSGI protocols, running any CGI-based application (such as Foswiki) inside a PSGI-container requires a protocol translation based on the table above.

The "input" side

The input side (from the CGI.pm point of view) is easy. Basically the "translator" could use two approaches:
  1. Create an environment which is understandable to the original CGI.pm, so the web-application continues to use the original CGI.pm:
    • just populate %ENV from $env
    • and redirect $env->{'psgi.input'} to *STDIN
    • and CGI.pm will be happy and the application will work.
  2. Override the methods in CGI.pm:
    • in the web-app, every use CGI must be replaced with something like use MyApp::CGI (more below)
    • and MyApp::CGI subclasses CGI.pm and overrides the needed original CGI subroutines with new ones which understand PSGI.

In both cases the needed modifications to the CGI-application are none or small:
  • in case 1, no modification is needed at all,
  • in case 2, it is usually {grin} enough to replace the use statements, but it depends on the level of CGI-hacking in the app itself.

In both cases the web-application continues to use the original CGI constructs like $q->param and such, so from the web-app developer's point of view nothing (or very little) changes on the "input" side.
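Approach 1 can be sketched in a few lines of core Perl. This is only an illustration of the idea (populate %ENV, point STDIN at psgi.input); the helper name with_cgi_input is made up, and a real translator has to take care of more details.

```perl
use strict;
use warnings;

# Run a CGI-style code block with %ENV and STDIN set up from a PSGI $env.
# Only the "input" half of the translation is shown (hypothetical helper).
sub with_cgi_input {
    my ($env, $cgi_code) = @_;

    # Copy the plain CGI variables; the psgi.* / psgix.* keys mean nothing to CGI.
    local %ENV = (
        %ENV,
        map  { ( $_ => $env->{$_} ) }
        grep { !/^psgix?\./ && defined $env->{$_} } keys %$env
    );

    # Make the request body readable from STDIN, as CGI.pm expects.
    local *STDIN = $env->{'psgi.input'};

    return $cgi_code->();
}

# Usage: a fake POST request whose body lives in an in-memory filehandle.
my $body = 'name=wiki';
open my $in, '<', \$body or die "cannot open in-memory handle: $!";
my $env = {
    REQUEST_METHOD => 'POST',
    CONTENT_LENGTH => length($body),
    'psgi.input'   => $in,
    'psgi.version' => [ 1, 1 ],
};

my $seen = with_cgi_input($env, sub {
    read(STDIN, my $buf, $ENV{CONTENT_LENGTH});
    return "$ENV{REQUEST_METHOD} $buf";
});
print "$seen\n";
```

Because %ENV and *STDIN are localized, they are restored as soon as the wrapped code returns, so the next request starts clean.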

The "output" side

The main problems are on the output side of CGI.pm. Because CGI simply prints to STDOUT, CGI-based applications could:
  • use the print-as-you-go approach, i.e. the application outputs something to STDOUT whenever it wants
  • collect the output in some variable and do the final print in one place (this is the Foswiki case ... ehm ... just mostly)

For translating the output of a CGI.pm based app to PSGI there are again basically two approaches:
  1. the "translator" redirects *STDOUT of the CGI.pm based app to some file, and when the CGI.pm app finishes the translator will
    • parse the captured output for: 1.) status code 2.) headers 3.) body
    • create the needed array-ref from the parsed parts
    • and return the array-ref to the PSGI-container
    • because the CGI app "knows nothing" about what happens to its output, no modification is needed in the web-application
  2. the web-application itself is modified to return the array-ref.
    • This is a very easy process when the CGI application already collects its output and the real print is done in only one place.
    • It could be much more work for print-as-you-go type applications.
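Approach 1 for the output side can be sketched like this in core Perl: capture STDOUT, then split the captured text into status, headers and body. The sub names are made up for illustration, the capture goes to an in-memory handle instead of a file, and the header parsing is deliberately simple (no folded headers, for example).

```perl
use strict;
use warnings;

# Capture everything a CGI-style code block prints to STDOUT.
sub capture_stdout {
    my ($cgi_code) = @_;
    my $captured = '';
    open my $fh, '>', \$captured or die "cannot open in-memory handle: $!";
    {
        local *STDOUT = $fh;       # restored when this block is left
        $cgi_code->();
    }
    close $fh;
    return $captured;
}

# Turn "headers, empty line, content" into [ status, [ headers ], [ body ] ].
sub cgi_output_to_psgi {
    my ($raw) = @_;
    my ($head, $body) = split /\r?\n\r?\n/, $raw, 2;
    my $status = 200;              # CGI default when no Status: header is sent
    my @headers;
    for my $line (split /\r?\n/, $head // '') {
        my ($name, $value) = $line =~ /^([^:\s]+):\s*(.*)$/ or next;
        if (lc($name) eq 'status') {
            $status = $1 if $value =~ /^(\d{3})/;
        }
        else {
            push @headers, $name => $value;
        }
    }
    return [ $status, \@headers, [ $body // '' ] ];
}

# Usage: a print-as-you-go "CGI script" wrapped without touching its code.
my $raw = capture_stdout(sub {
    print "Status: 404 Not Found\r\n";
    print "Content-Type: text/plain\r\n\r\n";
    print "no such topic";
});
my $res = cgi_output_to_psgi($raw);
print "$res->[0] @{ $res->[1] } $res->[2][0]\n";
```

This also shows the price of approach 1: nothing reaches the PSGI-container until the wrapped code has finished printing.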

What about Foswiki?

Foswiki as a CPAN:CGI based application
  • heavily uses CPAN:CGI for two things:
    • as an interface, i.e. for communication with the web-server
    • for HTML generation, like CGI::start_table etc.
To make Foswiki a PSGI compatible application we could
  • use some of the above methods to continue to support CGI (in some form)
  • or not /more below/

The CPAN modules

For both of the above CGI-PSGI translation approaches there are already developed helper modules on CPAN; both are simple and small
  • CPAN:CGI::Emulate::PSGI
    • on the input it translates $env to %ENV + *STDIN
    • executes the wrapper subroutine (calls the CGI-app)
    • captures STDOUT into a file
    • parses the file
    • and returns the array-ref to the PSGI-container
    • no changes are needed to the web-application
  • CPAN:CGI::PSGI
    • it subclasses CGI.pm and replaces the needed subroutines to understand PSGI
    • in the CGI-application every use CGI must be replaced with use CGI::PSGI
    • the CGI-application must be modified to return the array-ref (i.e. replace every print to STDOUT by collecting the output)
    • unfortunately(?) CGI::PSGI does only "clean overrides", and some CGI-hacks in the app could cause problems
    • it doesn't import the HTML generation functions, so after the replacement calls such as CGI::h1(...) will not work

Foswiki mostly collects the output and the print is done in writeCompletePage, but not always! For example, error messages are often written immediately when the error happens. In other words Foswiki (as usual) is not homogeneous and uses a mixture of the collecting-output approach and the print-as-you-go approach, so the needed modifications are a bigger task.

In-house approach

Hybrid approach (in-house development)
  • basically does the same as CGI::PSGI (overrides the CGI.pm subroutines)
  • but could take the in-house CGI-hacks into account
  • could also import the HTML generation functions (but see(!): http://foswiki.org/Development/ReduceImpactOfCGIDotPMinFoswiki)
  • so the core and the Extensions could continue to use CGI-like constructs
  • but it needs to be said:
    • the core and every extension using CGI directly must be changed to return the needed array-ref
    • this is somewhat "doable" for Foswiki's GitHub extensions (but what about users' in-house developed extensions?) (though http://foswiki.org/Development/ReduceImpactOfCGIDotPMinFoswiki impacts the in-house extensions too)

Summary for CGI based Foswiki inside a PSGI-container

In the above approaches,
  • no changes (case: CPAN:CGI::Emulate::PSGI) or some changes (case: CPAN:CGI::PSGI or in-house dev) are needed in the Foswiki core and Extensions
  • Foswiki continues to use the CGI.pm methods /overridden in the case of CPAN:CGI::PSGI, or the original in the case of CPAN:CGI::Emulate::PSGI/
  • the use of middlewares is (very) limited, because they can be used only before the "translation" and Foswiki knows nothing about them (it still uses CGI), e.g. the scheme is:

  • for CPAN:CGI::Emulate::PSGI no modifications are needed, i.e. CPAN:CGI::Emulate::PSGI provides the compatibility layer
  • but any other approach needs modifications, not only to the CORE but also to any Extension which uses CGI.pm (this especially impacts users' in-house extensions), and for the reasons outlined above it isn't possible to have a "compatibility" layer. CPAN:CGI simply works differently from PSGI.

  • two protocol translations happen, for example, for the connected (FCGI) PSGI
  • we don't need such a thing: we already have a well-tested FCGI based solution (and also mod_perl)

So the whole PSGI-fied Foswiki (which continues to use CGI internally, overridden or the "original") is a mostly harmless but strange solution.
  • It is nice, because it could be executed, for example, under Starman
  • cool for hacking (see the bottom of this topic)
  • but otherwise it does not bring any major advancement to Foswiki (no really important middlewares, so no "code reduction" etc.)

Dropping CGI and going for a full PSGI Foswiki

A possible approach (but a tremendous amount of work): re-develop the whole Foswiki as a native PSGI app
  • and forget CGI.pm altogether. This is (more or less) in line with http://foswiki.org/Development/ReduceImpactOfCGIDotPMinFoswiki .
  • (also forget Foswiki in its current form)
  • divide Foswiki into smaller and more manageable parts
  • start to heavily use existing Plack::Middlewares (in the context of the previous bullet)
But that means
  • pulling many parts out of the core into middlewares
  • modifying every extension using CGI and/or Foswiki::Request (a problem with user-developed extensions that are not on Foswiki's GitHub)
    • For example: PSGI apps natively use CPAN:Hash::MultiValue. It is a very nice (inside-out) object. It would be nice to use it directly (and not $q->multi_param), but that means replacing every $req->param and such in every source file.
  • Foswiki::Request should subclass Plack::Request, but Plack::Request uses the $req->session method for session management (i.e. persistent data storage, a la CGI::Session), which is extremely confusing next to Foswiki's current $request->{session} (the Foswiki singleton) and the like... plus many other problematic things, name collisions and so on...
  • etc..etc...etc...

Andrew's (partial PSGI) approach, quoting from MakeFoswikiPSGIConformant (I checked his approach in detail in his repository). The idea is to:
  • Remove Foswiki::Engine::*
  • Foswiki::Engine will be the entry point for psgi
  • Replace Foswiki::Request and Foswiki::Response with CPAN:Plack::Request and CPAN:Plack::Response
    • Although for compatibility the Foswiki:: objects will be subclasses of the Plack:: for backwards compatibility, and may provide compatibility functions

Later, in comments:
  • Feedback Required: Do Foswiki::Request and Foswiki::Response need to be backwards compatible, or can we replace them with Plack::Request and Plack::Response?
  • I initially tried to make FW::Request and FW::Response subclasses of the Plack equivalents, overriding to provide compatibility, but there are too many cases where they use the same method name for verry different methods, so I don't think this is viable.

Of course it isn't viable for the CGI.pm based Foswiki (read: Foswiki::Request ISA CGI). Andrew ran into the same problems as outlined above (and others would come, for example CGI::Session, cookie management, REST, the whole CGI...).

SO, trying to force the current Foswiki into being a (partly) PSGI-based app is IMHO much more work than starting all over again and developing the key (PSGI) parts cleanly from the start.

I'm surely not a PSGI expert (I'm not a Perl expert at all), but because I use PSGI in all my non-Foswiki web-app development, I'm pretty sure that there is no (usable) "partial" approach. Foswiki will either continue to use CGI or should go full PSGI. Any "partial approach" that tries to rewrite Foswiki::Request and the like into something that understands PSGI is doomed to extinction, or at least doomed to "too many problems for minor advancements".

It’s the weird color-scheme that freaks me. Every time you try to operate one of these weird black controls, which are labeled in black on a black background, a small black light lights up in black to let you know you’ve done it!

So maybe(?) the best would be to start FoswikiNG:
  • choosing some well-known and actively developed PSGI framework which provides a standard solution for the common needs
    • Request/response & error handling
    • Logging (configurable backends)
    • Caching (configurable backends)
    • Configuration management (universal for every application) related: this and this
    • Utilities (encoding, URL & HTML escaping, etc...)
    • allows easy integration (pure PSGI, modular, standards based, etc...)
  • collecting the usable middlewares, and getting rid of the parts of the current Foswiki which could be solved with existing middlewares and/or the "framework", for things like:
    • session management
    • authentication
    • access & (partly) authorization (who is allowed to run the Foswiki, by configurable rules)
    • antibot guard helpers
    • security (Block CSRF & XSRF attacks, signed cookies, allow CrossOrigin headers, etc...)
    • REST request router & content negotiation (but see Web::Machine below)
    • rate-limiting (specific) HTTP requests.
    • rewriting rules (like Apache's mod_rewrite)
    • serving static files
    • compressing / packing (& caching) static resources like CSS, JS etc.
    • watermarking images
    • debug panels (memory, ajax, git status, env, NYTProf, etc...)
    • etc... etc... etc...
    • (currently a few hundred middlewares exist on CPAN)
  • because the framework and the middlewares are universal, this allows developing other applications and/or simpler integration of Foswiki into some bigger ecosystem
  • and Foswiki should be just a simple and more manageable application
    • which renders the MACROS & TML
    • and does the "wiki-logic" (with a cleaner design)

Additional things to consider:

The full-PSGI rework touches nearly EVERY part of Foswiki
  • which isn't directly related to MACRO rendering and TML,
  • e.g. HTTP states, directory layout, config management, loggers, caching, REST handling, session management, etc.

As a result we could get FoswikiNG, which
  • is 100% compatible with the current macros, TML and GitHub-based Extensions, i.e. no changes are needed to users' topics
  • is ??% compatible with users' in-house developed extensions which use only the defined API calls and callbacks,
  • is 0% (!!!) compatible with in-house developed extensions which use Foswiki::Request or CGI; they need to be modified (maybe heavily)

In the "rework" context, it is also worth considering some proposed major changes to Foswiki:

Again: the full PSGI is an extreme rework, meaning many, many hours and even more concentrated developer effort (from all devs). IMHO, this isn't in the category "branch and start hacking". The result could be mostly backward compatible, but ... many open questions.

So, discuss
  • WHAT DO YOU THINK
  • WHAT ARE THE PRIORITIES
  • WHAT should/could BE DONE
  • and WHAT ISN'T.
  • How to continue (if at all)?
  • Ideas?
  • Wishes?

PLEASE TALK (best if you go into deep technical details)!


Some other things (no need to read anything below)

Below are some more examples and use cases for PSGI. No need to read them. They are only here for further illustration.

How the middlewares work

Best shown with code. So install Plack and start hacking:
  • Debian users: just install the "starman" Debian package, it will install all required dependencies
  • the hardcore Perl users: cpanm Task::Plack (it will blow up in perl 5.22+ but don't worry, you will get the important modules)

The "hello world" example

# my PSGI-application
my $app = sub {        # the $app - coderef
    my $env = shift;   # accepts one arg - the $env

    # and returns the needed 3-member arrayref
    return [
        200,                                    # status
        ['Content-Type' => 'text/plain'],       # headers
        [                                       # content
            "Hello world!",
            # the key must be quoted: a bare X-TRUSTED is parsed as a subtraction
            "Type:", $env->{'X-TRUSTED'} // "untrusted"
        ],
    ];
};
  • Save the above as app.psgi
    • (just remember: app.psgi must return the coderef, so don't put the usual 1; at its end!)
  • and run plackup
  • visit http://localhost:5000
  • it will print: Hello world!Type:untrusted

Extending

The next step: add another code reference to the end of the SAME PSGI-application as above

# this is my original application code reference
my $app = sub {        # the $app - coderef
    my $env = shift;   # accepts one arg - the $env

    # and returns the needed 3-member arrayref
    return [
        200,                                    # status
        ['Content-Type' => 'text/plain'],       # headers
        [                                       # content
            "Hello world!",
            "Type:", $env->{'X-TRUSTED'} // "untrusted"
        ],
    ];
};

# and added another code reference
my $mw = sub {
    my $env = shift;  # also accepts one arg - the $env

    # does something with the $env
    # (quoted key, and an escaped dot so that only 127.x.x.x matches)
    $env->{'X-TRUSTED'} = "trusted" if $env->{REMOTE_ADDR} =~ /\A127\./;

    # call the "original" $app with the modified $env and store the returned arrayref in $res
    my $res = $app->($env);

    # ... do something with the content in the $res (here: uppercase every body chunk)
    #s/(.*)/\U$1/ for $res->[2]->@*;
    s/(.*)/\U$1/ for( @{$res->[2]} ); #arrgh - gac410 doesn't like the new perl syntax... :) :)

    return $res; # return the (modified) arrayref from the $app
};

Note:
  • the app.psgi file now returns the $mw coderef
  • so,
    • the PSGI-Server calls the returned $mw coderef,
    • and the original $app is called by the $mw as $app->($env), i.e. the $app is "wrapped" by the $mw
    • Plack::Request is intentionally not used; this simple-code way is more "descriptive"

The very important "points of view"

From the psgi-container's point of view
  • the middleware looks like the application
  • e.g. the psgi-server "thinks":
    • it is a coderef so I can call it
    • and if it returns me a 3-member arrayref, I'll be happy
  • i.e. the server sees no difference between the middleware and the application
From the $app application's point of view
  • the middleware is like the "psgi-server"
  • e.g. the web-app "thinks"
    • I got called (via my coderef)
    • it gave me the $env hashref
    • so I will return the 3-member arrayref
  • i.e. the $app by default sees no difference between the server and the middleware

Wonderful and transparent possibilities - and very-very-very powerful.
  • such wrapping can be done as many times as wanted
  • each wrapping subroutine adds some "layer" around the $app

Such layering possibilities are one of the most important features of PSGI. They allow dividing the application into well-defined layers which are clearer and more manageable, and GREATLY simplify the application itself, i.e. speed up the development process.

The middlewares
  • could be universal, i.e. usable by any PSGI application (currently a few hundred middlewares exist on CPAN)
  • or could be application specific (used only for the clear separation of the discrete application layers)

For example, for Foswiki:
  • why should Foswiki care about authentication, when already developed authentication middlewares exist?
  • why should Foswiki care about sessions, when Plack::Middleware::Session exists? It is (honestly) much, much more powerful and more configurable than Foswiki's current session management and, more importantly, solves one of the main Foswiki problems: integration of Foswiki into a bigger application ecosystem with shared sessions (which are needed, for example, for authentication).
  • and many others

The Plack::Builder

Easy Middlewares

use Plack::Builder;
my $app = sub { ... };

builder {
      enable "Deflater";
      enable "Session", store => "File";
      enable "Debug", panels => [ qw(DBITrace Memory Timer) ];
      enable "+My::Own::Plack::Middleware";
      $app;
};
  • each enable line adds one middleware (adds one layer),
  • The short name - like "Deflater" - loads the Plack::Middleware::Deflater module.
  • the topmost middleware is called by the psgi-server

Mounting applications

  • the Plack::Builder also allows "mounting" different PSGI applications to different URIs
use Plack::Builder;
my $app1 = sub { ... };
my $app2 = sub { ... };
my $app3 = sub { ... };
builder {
    mount "/path1" => builder { $app1 };
    mount "http://otherhost.com/" => builder { $app2 };  #NOTE(!): it's like Apache's virtualhost
    mount "/" => builder { $app3 };
};
  • a correctly written PSGI application, again, just works regardless of its mount-point.
  • extremely powerful feature
    • the admin develops (configures) his apps locally (e.g. on his notebook)
    • and the deployment to the production server is fast and painless (usually nothing needs to be changed)
  • allows code reuse, e.g. imagine:
use Plack::Builder;
my $wiki = sub { ... };
my $list_of_wikis = sub { ... };
my $wikihosts = $cfg->get('wiki_hosts');

builder {
    # mount "http://foo.com/" => $wiki;
    # mount "http://bar.com/" => $wiki;
    # mount "http://baz.com/" => $wiki;
    # mount "http://quu.com/" => $wiki;
    # mount "http://qxy.com/" => $wiki;

    # or the shorter:
    mount $_ => $wiki for (@$wikihosts);

    mount "/" => builder { $list_of_wikis };
};
  • ONE subroutine (once in memory per server process) for MANY different domains.
  • of course, the $wiki must not use request singletons (unfortunately not the case for Foswiki)
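Why a correctly written application does not care about its mount-point can be seen in a small core-Perl sketch of path-based mounting. This is not the Plack::Builder implementation; the helper mount_apps is made up, it handles only path prefixes (no host names), and it assumes the dispatcher moves the matched prefix from PATH_INFO to SCRIPT_NAME.

```perl
use strict;
use warnings;

# Build a dispatcher from [ prefix => app ] pairs; the longest prefix wins.
sub mount_apps {
    my @mounts = sort { length($b->[0]) <=> length($a->[0]) } @_;
    return sub {
        my $env  = shift;
        my $path = $env->{PATH_INFO} // '';
        for my $mount (@mounts) {
            my ($prefix, $app) = @$mount;
            $prefix =~ s{/\z}{};                  # "/" becomes the empty prefix
            next unless $path eq $prefix || index($path, "$prefix/") == 0;

            # Move the matched prefix from PATH_INFO to SCRIPT_NAME, so the
            # mounted app sees paths relative to its own mount point.
            my %sub_env = (
                %$env,
                SCRIPT_NAME => ($env->{SCRIPT_NAME} // '') . $prefix,
                PATH_INFO   => substr($path, length $prefix),
            );
            return $app->(\%sub_env);
        }
        return [ 404, [ 'Content-Type' => 'text/plain' ], [ 'Not Found' ] ];
    };
}

# Usage: the same kind of app mounted at two places.
my $echo = sub {
    my $env = shift;
    return [ 200, [ 'Content-Type' => 'text/plain' ],
             [ "script=$env->{SCRIPT_NAME} path=$env->{PATH_INFO}" ] ];
};
my $site = mount_apps( [ '/wiki' => $echo ], [ '/' => $echo ] );

print $site->({ SCRIPT_NAME => '', PATH_INFO => '/wiki/Main/WebHome' })->[2][0], "\n";
print $site->({ SCRIPT_NAME => '', PATH_INFO => '/about' })->[2][0], "\n";
```

An application that builds its links from SCRIPT_NAME and routes on PATH_INFO never needs to know where it was mounted; an application with absolute paths or request singletons baked in does.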

The typical real world scenario - via reverse proxy

Probably the most common usage scenario. Scheme:

The reverse-proxy web-server
  • could serve static files (no need to bother Perl with simple file-serving)
  • does the SSL offloading (CPU intensive)
  • forwards plain HTTP to the backend web-server (on the same machine or on some other host)
  • the RP is configured only once for the web-app's lifetime (e.g. /someapp -> localhost:8080/app)
  • its configuration is simple: the RP web-server's admin need not care about the application details (routing inside the app)
  • the reverse-proxy doesn't need to be restarted for an application restart
    • seamless application modification/upgrade, without the need for any admin rights on the web-server
    • simple "cut-offs", temporary shutdowns etc.
  • the reverse-proxy adds only a few microseconds of overhead, which is negligible
  • the reverse proxy could be security hardened; it doesn't reveal the application web-server's details and takes care of things that the ordinary application admin (e.g. the Foswiki admin) should not (and often cannot) care about.

The perl-application web-server
  • very simple deployment (unfortunately Foswiki's bad design, absolute paths in the LSC, makes the deployment more complicated)
  • runs on an unprivileged port (and could run on another (internal) machine)
  • therefore it could be restarted by an ordinary non-root application admin

Usually the following four things are needed for a PSGI web-app deployment:
  • one non-root account (usually with VPN or SSH access)
  • a C compiler (for building Perl with plenv and for the XS modules, so you don't depend on the "quality" of the system's Perl packages; that is not the app admin's business and doesn't depend on the local sysadmin's knowledge (or lack thereof))
  • a configured reverse-proxy, e.g. /apppath => apphost:8080/apppath (the port number and URI path)
  • site integration details (LDAP/AD, SSO, mail-server, etc...)

Running the app as an unprivileged (ordinary) user has many benefits:
  • the organization does not need to provide root access,
  • the org's main web-server doesn't need to be restarted for an application restart
  • it is easier to comply with the organization's security policy
  • an ordinary user with a $HOME is easy to manage (plain login, no need to use su/sudo)
  • many others...

Of course, the above is for the enterprise installations.

In some hosted scenarios where using a reverse-proxy isn't possible, no problem: the PSGI app could be deployed in embedded or connected mode too, but losing some of the "unprivileged user" benefits.

The position/role of a PSGI FRAMEWORK

Although the application is only a subroutine reference, there are many things that are common to many web-applications, and re-implementing them in every app goes against the DRY principle. Therefore application developers usually use some PSGI compliant "web-framework". (Un?)fortunately there are tens of different web-app frameworks. Some frameworks aren't developed directly as "pure" PSGI frameworks but are adapted to run in a PSGI environment (example: Mojolicious).

A "good" PSGI web framework:
  • uses app.psgi natively, i.e. an application developed with the framework is usable directly from any app.psgi
  • doesn't enforce any data-storage backend (there are many frameworks that force you to use an SQL database)
  • doesn't enforce any templating system (the user can choose any)
  • doesn't enforce (but allows) any design pattern (MVC, MVP, MVVM, etc.)
  • provides an interface to common things like:
    • flexible configuration (global, local, plugins, different layers for development, deployment, etc., with well-defined serialized storage, for example Config::App, Config::Merge, etc.; because it is loaded only once it can be "slow" but must be powerful)
    • a flexible and relocatable filesystem environment and hierarchy for the app (config dirs, perl-libs, tmp, logs etc...)
    • a logging interface (configurable to log4perl or Log::Dispatch etc.)
    • a caching interface (usually CHI + Memoize)
    • testing
    • utility functions for common web-based "things" (URL decoding, HTML escaping, URI escaping etc.) and debugging (development dumping/logging utilities to STDERR or to a log, etc.)

The image for the "PSGI-compliant web app" could be changed to:

Here the "white" squares are already existing CPAN modules (and scripts). Such an architecture minimizes the development effort (no reinventing the wheel again and again), i.e. it "frees" developer hours for the project.

Such an application:
  • from the run_the_web_app.pl
    • (using the framework) reads the config at startup and provides a global interface to the config values
    • sets up the app environment (filesystem hierarchy, loads perl modules at startup, etc.)
    • sets up the logging and caching
    • and finally starts the configured PSGI server (plackup, Starman, etc.)
  • the PSGI server loads the app.psgi and executes it, where
    • many things are done at the middleware level (and the middlewares are configured through the globally accessible, framework-provided config utilities)
    • and the developer really only has to develop the main application logic.
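
The split described above (common concerns in middleware, only the main logic in the application) can be sketched in plain Perl without any framework. This is only a minimal illustration, not how Plack implements it; the helper name add_header and the X-Handled-By header are made up for the example.

```perl
use strict;
use warnings;

# The "main application logic": a PSGI app is just a coderef
# that takes $env and returns [ status, headers, body ].
my $app = sub {
    my ($env) = @_;
    return [ 200, [ 'Content-Type' => 'text/plain' ], ["Hello from $env->{PATH_INFO}\n"] ];
};

# A middleware is a function that takes an app and returns a new app.
# add_header() is a hypothetical helper, not part of Plack.
sub add_header {
    my ($inner, $name, $value) = @_;
    return sub {
        my ($env) = @_;
        my $res = $inner->($env);                # call the wrapped app
        push @{ $res->[1] }, $name => $value;    # post-process the response headers
        return $res;
    };
}

# Wrap the app once; the server only ever sees the outermost coderef.
my $wrapped = add_header($app, 'X-Handled-By' => 'sketch');
my $res     = $wrapped->({ PATH_INFO => '/demo' });
print "$res->[0] @{ $res->[1] }\n";
```

In a real app.psgi the same wrapping is what `enable` in Plack::Builder arranges; the application coderef itself stays unaware of it.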

Common (real world) app.psgi

A typical multi-app PSGI setup looks like the following scripts:

the run_the_web_app.pl

use My::Framework::Config qw($config);
use Plack::Runner;
my $runner = Plack::Runner->new;
$runner->parse_options( $config->get_server_args() );
$runner->run;

and the app.psgi

use My::Framework::Config qw($cfg);    # the PSGI framework loads the config
                                       # (usually a blessed hashref)
                                       # before(!) the app.psgi is even processed.
                                       # good practice for easy deployment:
                                       # the config is usually in a well-known serialized format
                                       # like YAML or IniFiles and so on...
                                       # all(!) applications should(!) use it
use Plack::Builder;

my $app1 = My::App1::Instance->get();
my $app2 = My::App2::Instance->get();
# etc...
my $main = sub { ... };

builder {
   enable "Session";         # session management is usually COMMON (!) for all apps

   enable "Static" .....;    # serving static public files (like from /pub)
                             # but usually it is handled by the rev-proxy itself

   enable "Auth::Some ... "; # central authentication (login) - (Who are you?) - or 401

                             # NOTE: The authorization (What can you do?) is split between
                             # - some middlewares (e.g. may the user with the given identity run the app?)
                             # - and (ofc) the app's internal logic (e.g. 403 - this is an admin-only function)

   enable "Other::Middlewares::Common::For::All::Apps";

   mount "/app1" => builder {
      enable "Access", rules => ... # some rules based on the $cfg and/or the above Auth
      enable "Some::App::Specific::Middleware";
      $app1;
   };

   mount "/app2" => builder {
      enable "Access", rules => ... # another rules ...
      enable "Another::Set::Of::Middlewares";
      $app2;
   };

   mount "/" => $main;
}

PSGI allows easy integration of many PSGI-compliant applications. But this also means:
  • the startup script (run_the_web_app.pl) and the app.psgi are usually precisely tailored to the local environment
  • it is possible to "ship" both as simple example scripts that allow running the web app
  • but usually they're modified to take the installation-specific environment into account
  • i.e.: with great power comes great responsibility. :)
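
How several apps can live side by side under `mount`: roughly, the request path is matched against a prefix, and the matched app sees the prefix in SCRIPT_NAME and the remainder in PATH_INFO. The following is a simplified sketch of that idea in plain Perl, not the actual code behind Plack::Builder; mount_apps is a hypothetical name.

```perl
use strict;
use warnings;

# mount_apps() is a hypothetical, simplified stand-in for what
# `mount` does: pick an app by path prefix.
sub mount_apps {
    my (%map) = @_;
    # Try longer prefixes first so "/app1" wins over "/".
    my @prefixes = sort { length($b) <=> length($a) } keys %map;
    return sub {
        my ($env) = @_;
        my $path = $env->{PATH_INFO} // '';
        for my $prefix (@prefixes) {
            my $p = $prefix eq '/' ? '' : $prefix;
            next unless $path eq $p || index($path, "$p/") == 0;
            # The mounted app sees the prefix as SCRIPT_NAME
            # and only the remainder as PATH_INFO.
            my %sub_env = (
                %$env,
                SCRIPT_NAME => ($env->{SCRIPT_NAME} // '') . $p,
                PATH_INFO   => substr($path, length $p),
            );
            return $map{$prefix}->(\%sub_env);
        }
        return [ 404, [ 'Content-Type' => 'text/plain' ], ['Not Found'] ];
    };
}

# Usage: an echo app mounted twice shows what each mount point sees.
my $echo = sub {
    my ($env) = @_;
    return [ 200, [ 'Content-Type' => 'text/plain' ], ["$env->{SCRIPT_NAME}|$env->{PATH_INFO}"] ];
};
my $site = mount_apps('/app1' => $echo, '/' => $echo);
```

Because each mounted app only sees its own SCRIPT_NAME/PATH_INFO pair, the same application coderef can be mounted at a different location in each installation without changes.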

More information - Links

For more information, check:


For the hackers...

The current fast-path Foswiki under PSGI

Foswiki just works with CPAN:CGI::Emulate::PSGI under any PSGI-server.

Some (not really) problems:
  • Because Foswiki does its own "bootstrapping", an unconfigured Foswiki can't run under Starman: Starman preforks N workers, so after the config is saved the other workers are still unconfigured. Solvable with $env->{'psgix.harakiri'}, but it is still better to do the configuration under "plackup".
  • The server must be restarted after changes to the LSC (LocalSite.cfg), for the same reason as above. Solvable, but it would be better to change "when and how" the configuration is loaded.
A bigger problem:
  • During the "protocol translation", CPAN:CGI::Emulate::PSGI attaches a tempfile to the CGI's STDOUT. I.e. the response from the CGI app first goes into an IO::File->new_tmpfile, which is then parsed by CPAN:CGI::Parse::PSGI. The tempfile creation slows down the responses a bit (but not much: it is comparably fast to the current FCGI, and on an SSD filesystem it is really FAST, like any other persistent solution).
The real problem:
  • it leaks memory like the Titanic. :( (I don't yet know why and where, and honestly my Perl isn't good enough to find the leaks.)
  • i.e. using CPAN:Plack::Middleware::SizeLimit is an absolute must (but on OS X it slows down the requests, because it uses CPAN:Process::SizeLimit::Core and this module does a poor job on OS X: it forks the top command for every memory check). :(
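
The psgix.harakiri idea mentioned above can be sketched as a small middleware: when the server advertises the extension, the app may set psgix.harakiri.commit to ask for the current worker to be retired after the request. The sketch below is an assumption about how one might use it, not tested against Foswiki; restart_worker_when and the /bin/configure path test are hypothetical.

```perl
use strict;
use warnings;

# restart_worker_when() is a hypothetical middleware: it wraps $app and asks
# the server to retire the current worker when $needs_restart->($env, $res) is true.
sub restart_worker_when {
    my ($app, $needs_restart) = @_;
    return sub {
        my ($env) = @_;
        my $res = $app->($env);
        # psgix.harakiri is true only if the server supports the extension;
        # psgix.harakiri.commit asks it to kill this worker after the request.
        if ($env->{'psgix.harakiri'} && $needs_restart->($env, $res)) {
            $env->{'psgix.harakiri.commit'} = 1;
        }
        return $res;
    };
}

# Usage: retire the worker after any request to a (hypothetical) configure URL.
my $guarded = restart_worker_when(
    sub { return [ 200, [ 'Content-Type' => 'text/plain' ], ['saved'] ] },
    sub { my ($env) = @_; return ($env->{PATH_INFO} // '') =~ m{\A/bin/configure}; },
);
```

Note that this only retires the worker that served the request; the other preforked workers keep their old state, which is one reason why configuring under plackup remains the simpler route.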

This is my app.psgi; I also use it for quick hacking:

use 5.014;
use warnings;

use Cwd;

use Plack::Builder;
use Plack::Request;
use Plack::Response;

use Plack::App::Directory;
use CGI::Emulate::PSGI;

use Data::Dumper;
use JSON;
use Encode;

sub PE { print STDERR @_ }   #helper...

my($root);

BEGIN {
   $root = getcwd;
   die "Wrong directory (cwd: $root)! Can't find $root/bin/setlib.cfg" unless -f "$root/bin/setlib.cfg";
   $ENV{FOSWIKI_SCRIPTS} = "$root/bin";
   require "$root/bin/setlib.cfg";
   $Foswiki::cfg{Engine} = 'Foswiki::Engine::CGI';
   mkdir "$root/NYT" unless -d "$root/NYT";   #NYTProf dir
}

# the 5-line manually crafted Foswiki/CGI wrapper
my $foswiki = CGI::Emulate::PSGI->handler(sub {
   CGI::initialize_globals();
   use Foswiki();
   use Foswiki::UI ();
   $Foswiki::engine->run();
   CGI::initialize_globals();
});

builder {
   # NOTE: the limit is given in KB, so 128000000 is roughly 122 GiB; lower it to get a real limit
   enable SizeLimit => ( max_process_size_in_kb => '128000000', check_every_n_requests => 100, log_when_limits_exceeded => 1);

   enable "Static", path => qr{/pub/}, root => "$root";

   enable sub {   #inline middleware (hacking)
      my $app = shift;
      return sub { my $env = shift;

         # quick runtime hacking of the LSC
         # without the need edit LSC or run the configure
         #$Foswiki::cfg{Password} = '$apr1$lkVoNZWx$F8o4t7huQT7o9q8cTY8Ps/' # password: q
         $Foswiki::cfg{Trace}{LoginManager} = 1;
         #PE Dumper $Foswiki::cfg{SwitchBoard};

         # dump something from the input, like the configure's JSON-RPC
         #my $req = Plack::Request->new($env);
         #if($req && $req->content_type &&  $req->content_type =~ /json/i ) {
         #   PE "REQUEST:\n",JSON->new->utf8->pretty->encode( decode_json( $req->content ) );
         #}

         #call the next...
         my $res = $app->($env);

         #hack/dump the body content @{$res->[2]} or the headers @{$res->[1]} directly here
         #PE Dumper $res->[1] ;   #dump the headers
         return $res;

         #or play with the response object
         #my $response = Plack::Response->new(@$res);
         #if( ($response->content_type // '') =~ /json/i ) {
         #   # assumes the body is an arrayref of already-encoded JSON chunks
         #   PE "RESPONSE:\n", JSON->new->utf8->pretty->encode( decode_json( join '', @{ $res->[2] } ) );
         #}
         #return $response->finalize;
      };
   };

   enable 'Rewrite', rules => sub { s!^/!/bin/view/! unless m!^(/bin/|/NYT/)! };   #short URL's

   #mount "/NYT" => Plack::App::Directory->new(root=>"$root/NYT")->to_app;   #uncomment for NYTProf, also below in Debug

   mount "/bin" => builder {
      enable 'Debug', panels => [
         qw( Environment Response Memory Timer Session GitStatus Ajax ),
         #['Profiler::NYTProf', base_URL => '/NYT', root => "$root/NYT", minimal  => 0]
      ];
      $foswiki;
   };
   mount "/" => sub { return [ 302, [ 'Location' => "/Main/WebHome" ], [ '' ] ]; };
};

  • The above script is saved as ~/bin/foswiki.psgi
  • In the .profile:
    fastfw() { plackup "$@" ~/bin/foswiki.psgi; }
  • to quickly hack on Foswiki:
    • untar Foswiki
    • cd /to/foswiki/root/dir
    • fastfw [ -p someport ]
    • i.e. it is something like tools/lighttpd.pl, but far more flexible (read: hackable)... :)

-- Main.JozefMojzis - 30 Dec 2015 - 02:17

Thanks Jozef for this in-depth analysis and collection of thoughts. Even the topic title is well chosen :)

Some very important points are made here; some are missing:

  1. downsize Foswiki's code base and replace Foswiki's home-made calling semantics with a standard one
  2. facilitate a completely new class of unit tests
  3. better scalability and resource management when deploying on a large scale

I am not so sure whether any of the PSGI middleware layers/plugins will be of much benefit (to the end user) as we are almost covering them all using Foswiki plugins & infrastructure. At least there is some substantial feature overlap between both communities.

When facing such a big undertaking, the motivation has to be much stronger than PSGI middleware.

Before we decide on such a rewrite we have to make clear which real problems we are facing now and increasingly over time in terms of the next 5-10 years. That's where I'd motivate this initiative. So which critical problems are we already suffering from today?

  1. cgi ... is history, but not yet for Foswiki
  2. mod_perl ... I would not recommend its use as loading Foswiki into the web-server is not scaling well. And this is an apache-only solution.
  3. fcgi ... is currently the only calling semantics that I'd recommend and that is supported by all major web servers (apache, nginx, ...)

Alas, FCGI and even more so FCGI::ProcManager are problematic with regard to their process life cycle, scalability and configurability. I once looked into FCGI::ProcManager and was a bit worried by its code quality. It doesn't seem to be well maintained upstream either.

So for me none of the three (major) engines offers a sustainable base for Foswiki for the next decade. I am not expecting much from FCGI.pm anymore other than following the same trajectory as CGI.pm.

That's why I'd consider a full PSGI rewrite. Retrofitting Foswiki into YAEL (yet another emulation layer) would imho not provide much benefit, but rather suffer from the problems of both worlds.

-- MichaelDaum - 30 Dec 2015

I would like to revive this topic a bit by asking JozefMojzis a question. I had a conversation with George Clark today about the issues of sharing a common codebase across several virtual hosts. George raised a few issues related to the manageability of such setups, like the need to upgrade all webs of each virtual host simultaneously upon every major code upgrade. Taking into consideration the possible tens and hundreds of thousands of pages to be converted, this is a serious challenge for an admin.

Another big issue, which actually triggered the discussion, is that Foswiki currently doesn't support per-virtual-host configuration. In other words, every virtual host has to be located in a directory of its own. I was thinking about fixing this deficiency but must admit that it's not a piece of cake. Yet the result will not be compatible with the current $Foswiki::cfg approach and has to be very carefully considered before implementation is started.

-- VadimBelman - 19 Mar 2016

I'd be extremely concerned if we redesign so that I MUST use foswiki managed vhosts to manage multiple foswiki hosts on a single server. The concept of sharing a single foswiki codebase for multiple vhosts may be handy in some circumstances, but I'd never agree to force it onto everyone, and would most likely never use that feature myself.

  • Isolation of servers. I trust vmware / qemu. I trust OS-provided "jails". I generally trust Apache and fcgi. I'm less trusting of mod_perl, and if we go and roll our own virtualization code, when will we see our first CVEs for leakage?
  • Separation of hosts for upgrade purposes. "one change at a time"
  • Support for "other software". I use apache vhosts for other code. sympa email lists, limesurvey, etc. Yes I can set up proxies, but that's more complexity.
Anyway, I'm definitely NOT sold on any thoughts of merging the VirtualHostingContrib or its concepts into mainline Foswiki.

-- GeorgeClark - 19 Mar 2016

Guys, mostly when I write something it is totally misunderstood. Really. Like George's comment above. It is clear to anybody who knows even the absolute basics of PSGI that any "forcing" is total nonsense... Also, I never said anything like "I MUST use foswiki managed vhosts to manage multiple foswiki hosts on a single server". So I have no idea where George got the "MUST" from. Again, it is nonsense. Using a common codebase is a possibility which is (once again) nothing magical... Please understand: any PSGI app is nothing but a code reference. The whole of Foswiki will be called/used/run (read as you wish) just as one subroutine. And a subroutine (already loaded in memory) is reusable by principle (of course, with different data). Free your mind - we are leaving the forking CGI and heading to the 21st century with a persistent perl interpreter, even with python based servers (read uwsgi), SCSS and JS based frameworks ("bower" is a must currently), and modern OO which allows simpler and faster development (THANX VRURG!), and much more...

This is like the misunderstanding about event-loop based web apps. Comparing event-loop based servers with VirtualHostingContrib is like comparing Moose with Higher-Order Perl (read: lisp-like functional perl programming), which is great but a totally different approach. Continuous misunderstanding. The same goes for Unicode::Collate::Locale /omg - 2 years :( /, or the CLDR, or CHI::Memoize, etc. etc. Or for Moose: everyone talks about it as something "magical", even though the whole of Moose is just perl, with no magic at all... Or for Moose's slowness (which is total bullshit in a typical current web environment: yes, it has a startup penalty, but who cares about that in persistent PSGI?). Perl's big-gun framework (which powers, for example, MetaCPAN itself), Catalyst, is Moose based... Is MetaCPAN slow in any way?! I would be happy if Foswiki's ajax calls were as fast as in the (slow, Moose based) MetaCPAN... :(

Giving up. It is pointless to start talking about, for example, Jemplate, which is great for generating (once per release) the JavaScript needed for the ajaxified calls, because the reaction will be something like: we don't want another templating system, even if it would be used only once per new release, etc. etc. And Jemplate generates really wonderful (browser-side) JavaScript code, and we don't have enough JS coders. Or, by using a simple Extension with JavaScript::Duktape (server-side JavaScript), we could write things like %JS{ any javascript construction with functions, loops, everything }% while keeping the left-to-right, inside-out rendering logic but gaining a full-blown and extremely fast topic language...

Simply put, Foswiki somehow doesn't want to use anything that helps (besides pure manpower) and develops (nearly) everything in-house, byte by byte.

The problem probably isn't with your (the reader's) knowledge; the problem is with my English and with my ability to express my thoughts clearly, "diplomatically" and in a precisely understandable way. This is my fault. I like modern and powerful trends in web development, I like to use powerful CPAN modules. Simply, my approach isn't compatible with the rest of the team (except vrurg). Again, this is my fault, but sorry, I don't have time to learn the current Foswiki way of thinking (like calling Some::Fcking::Long::Package::Name::function with the full package path, development with minimal encapsulation and so on...).

Therefore I decided to leave the project. I will continue to use it as a normal user, but without trying to suggest improvements, hunting for bugs and the like. I'm not a core developer, so there is no manpower loss to you or to the (mostly great) project. I was a controversial person anyway, and I wasn't helping the otherwise really friendly and great team at all. :)

Excuse me for all the confusion, problems and misunderstandings that I created.

Crossing fingers - keep up the good work. :) So Long, and Thanks for All the Fosbits... :)

/PS: it is probably best to delete this comment, as it is off-topic for the PSGI brainstorming :)/

-- JozefMojzis - 19 Mar 2016

yeesh. Please don't leave the project over me being concerned, or making obvious my lack of understanding on how these things work.

-- GeorgeClark - 19 Mar 2016

Jozef, before you make a hasty decision, think about another aspect of this situation. Let's presume that you're correct in your accusations. This is not totally true, but the assumption helps underline my thought. So, once again: let's accept that the project resembles a stationary system which tries to oppose any change. Any such system has a tendency to collapse sooner or later. The only way to prevent this is to have an internal opposition – and this is you! ;) This is a joke, but you know what they say: "Any joke contains some joke in it!" ;)

Now that I have tangled it enough, it is important to say that the project needs an idea generator. Not just 'let's make this piece better than it is now' but something bigger. So far this is what you do. Don't be upset because your ideas are not instantly accepted. These are seeds to be planted – not every seed sprouts. And of those which do, not every one gives the fruit we would expect. But this should not stop us from planting them.

PS. BTW, you didn't respond to my question. ;)

-- VadimBelman - 19 Mar 2016

Vadim, I just finished a personal email to George, but it seems I need to clarify some things here too.

The pure truth is that I have too many different activities:
  • currently developing (two different!) brand-new web applications for an intranet
  • playing with Forex trading (needs much time :( when trying to marry CPAN:Finance::FXCM::Simple with http://geniustrader.org )
  • designing furniture in SketchUp
  • sometimes helping on Stack Overflow with bash/perl ...
  • playing World of Warcraft :) /more precisely, I want to start playing again after a year's pause - will need much time :)/
  • have 4 dogs and 3 cats and one wife :)

I'm not contributing any code.

So, in short:
  • I simply don't have time for Foswiki (now)
  • my leaving has nothing to do with George's or anybody's comment
  • maybe, when I finish some of the above bullets - and when you finish the Moo conversion :)
  • I will return to the project (to upset some people again) ;) /kidding/

Please delete these comments - they're really off-topic for PSGI. ;)

-- JozefMojzis - 19 Mar 2016

My most important thought that applies to ANYONE who stops by. "Contributing Code" is not the only value to the project. All the code in the world, without someone to test it, translate it, document it, someone to use it, and someone to suggest ways to improve it, is worthless. So your contributions are IMHO at least as valuable as that contributed by the coders. Regardless of whether it's finding typos in docs, proposing the next new thing, or plain down and dirty coding, the project is successful only as each small part makes up the whole. </soapbox>

-- GeorgeClark - 19 Mar 2016

From the IRC:
Engines are not what start the app – it's app which uses engine to communicate with the outside world. Driver concept.
This is OK when we are talking about CGI + FCGI and so on. But definitely NOT for PSGI. A PSGI app - by definition - must be
  • a coderef
  • which accepts exactly one arg - the $env
  • and returns a 3-member array ref.
This is by definition. So (for example) we don't have a PSGI app until the Foswiki web app returns the 3-member array ref.
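
The three requirements above can be written down as a small check in plain Perl. This is a minimal sketch that covers only the simple response shape described here (it ignores filehandle bodies and delayed/streaming responses, which the PSGI spec also allows); is_psgi_response is a hypothetical helper, not part of Plack.

```perl
use strict;
use warnings;

# A PSGI app: a coderef that takes exactly one argument, $env ...
my $app = sub {
    my ($env) = @_;
    # ... and returns a 3-member array ref: status, headers, body.
    return [ 200, [ 'Content-Type' => 'text/plain' ], ["Hello\n"] ];
};

# is_psgi_response() is a hypothetical helper that checks the simple
# (non-streaming) response shape: [ $status, \@headers, \@body ].
sub is_psgi_response {
    my ($res) = @_;
    return ref($res) eq 'ARRAY'
        && @$res == 3
        && $res->[0] =~ /\A[1-5]\d\d\z/      # three-digit HTTP status
        && ref($res->[1]) eq 'ARRAY'
        && @{ $res->[1] } % 2 == 0           # headers come as name/value pairs
        && ref($res->[2]) eq 'ARRAY';        # body as a list of chunks
}

print is_psgi_response( $app->({}) ) ? "looks like a PSGI response\n" : "not a PSGI response\n";
```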

And when the Foswiki app fulfills all 3 of the above requirements, we have a PSGI app, and you don't need other engines, because you can execute the above CODEREF under already existing (already developed) "engines", e.g. CGI, FCGI and so on...

So PSGI is exactly the opposite of the "driver" concept!!!
  • The engine comes first (it is started by the deployment environment), and therefore the engine ISN'T PART (!!!) of the web application (Foswiki)
  • that engine (which is started before the Foswiki code is even read) JUST CALLS the application coderef, etc... see above.

So, once you have a PSGI app, the "driver concept" is wrong (unnecessary, redundant, overcomplicated). Simply, the main PURPOSE of PSGI is: do not care about the deployment environment. A PSGI app works in every deployment (even as CGI). Or in other words: the app should NOT care about the environment - it should be designed just as a PSGI app.

To move Foswiki to PSGI, one needs to read through Plack::Request, Hash::MultiValue, Plack::Builder and similar basic parts of the PSGI implementation; otherwise all talk is pointless. We can't negotiate our views when we don't know, understand and accept the basics of the PSGI design and philosophy.

-- JozefMojzis - 25 Apr 2016

Also, please keep in mind: one PSGI server can run MANY different PSGI applications (e.g. Foswiki and some other PSGI apps). So caring about the "engines implementation" inside Foswiki is pointless.

Maybe my text isn't understandable and is written in bad English. Therefore I'll try to simplify it as much as possible. Imagine that the WHOLE of Foswiki is JUST ONE SUBROUTINE. Let's use it as:
my $app = Foswiki->new->to_app; #the $app contains the reference to the subroutine = coderef
When the $app subroutine is called, it MUST return the 3-member arrayref. That is all (roughly).

Now imagine what happens when you want to run the above PSGI app as Apache/CGI:
  • Apache forks the CGI script
  • the CGI script contains only
use Plack::Handler::CGI;
my $app = Foswiki->new->to_app; #the $app contains the reference to the subroutine
Plack::Handler::CGI->new->run($app);  #This is standard CPAN module - already developed - in the Plack:: namespace...
and you're done. This CGI script is just like any other CGI script. What does Plack::Handler::CGI do under the hood? It is also very simple:
  • translates the %ENV it got from Apache into the $env
  • connects STDIN to psgi.input
  • calls the $app subroutine with $env as its arg (the subroutine returns the status, headers and body of the response (the HTML) as an arrayref)
  • after the Foswiki subroutine finishes and returns the headers and the HTML, the handler prints the status code, headers, an empty line and the HTML to STDOUT (so for Apache it is exactly like any other CGI script).
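
The four steps above fit into a few lines of plain Perl. The following is a deliberately simplified sketch of the idea, not the real Plack::Handler::CGI (which sets more psgi.* keys and supports streaming responses); run_as_cgi is a hypothetical name, and the filehandles are passed in as parameters only to keep the sketch easy to try out.

```perl
use strict;
use warnings;

# run_as_cgi() is a hypothetical, simplified illustration of the four steps above.
sub run_as_cgi {
    my ($app, $cgi_env, $in, $out) = @_;

    # 1. translate the CGI environment into the PSGI $env
    my %env = (
        %$cgi_env,
        'psgi.version'    => [ 1, 1 ],
        'psgi.url_scheme' => ($cgi_env->{HTTPS} // 'off') =~ /\A(?:on|1)\z/i ? 'https' : 'http',
        # 2. connect the request body (STDIN under CGI) to psgi.input
        'psgi.input'      => $in,
        'psgi.errors'     => \*STDERR,
    );

    # 3. call the application coderef with $env as its only argument
    my ($status, $headers, $body) = @{ $app->(\%env) };

    # 4. print the status, the headers, an empty line and the body
    print {$out} "Status: $status\r\n";
    my @pairs = @$headers;
    while (my ($name, $value) = splice @pairs, 0, 2) {
        print {$out} "$name: $value\r\n";
    }
    print {$out} "\r\n";
    print {$out} $_ for @$body;
    return;
}

# In a real CGI script the call would be:
#   run_as_cgi($app, \%ENV, \*STDIN, \*STDOUT);
```

The application never learns that it was started as a CGI script; all of the deployment-specific work happens in the handler, before and after the single call to the coderef.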

The same applies to FCGI, mod_perl or any other possible PSGI deployment. The handlers (like the above Plack::Handler::CGI) are already developed. Because the Plack::Handler::* modules are called *first* by the deployment environment, the $app doesn't care about the engine implementation, and that GREATLY simplifies the development and the maintenance (only one "engine").

Roughly that's all. It is (really) simple: one just has to understand how it works and where the "borders" are, and give up the maintenance nightmare of 4 different "inside-app engines". ;)

-- JozefMojzis - 25 Apr 2016
 
Topic revision: r12 - 25 Apr 2016, JozefMojzis