Feature Proposal: Split Foswiki.pm and reconsider the session object.

Motivation

This subject is related to the NewOODesignPlan topic.

Description and Documentation

Foswiki.pm is overloaded. It's a mixture of static API functions and methods of the object at the core of all Foswiki code: the session object. That object isn't really a session in the sense some may expect. We need to change this.

Background

Aside from what is stated in the previous section, I would like to mention another serious design flaw: the total overuse of BEGIN blocks. Not only are they difficult to debug (Komodo, for example, cannot debug them at all), but they are most likely incompatible with future PSGI and multi-domain hosting. Consider, for example, the %macros hash, which is global. If a plugin registers its own macro, that macro becomes available across all handled domains, regardless of which plugins each domain uses.
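
A minimal sketch of the leak described above, and of one way a per-application registry could avoid it. Everything here is an assumption made for illustration: the Sketch::App class, its registerMacro() and hasMacro() methods, the GARDENTIP macro and the example domains are hypothetical, and the global hash is only a stand-in for the real %macros.

```perl
use strict;
use warnings;

# Stand-in for today's process-wide registry (hypothetical, not Foswiki code).
our %macros;

# Hypothetical per-application registry: each app object owns its macros.
package Sketch::App {
    sub new {
        my ( $class, %args ) = @_;
        return bless { domain => $args{domain}, macros => {} }, $class;
    }

    sub registerMacro {
        my ( $this, $name, $handler ) = @_;
        $this->{macros}{$name} = $handler;
        return $this;
    }

    sub hasMacro {
        my ( $this, $name ) = @_;
        return exists $this->{macros}{$name} ? 1 : 0;
    }
}

package main;

# A plugin enabled for one domain only registers a macro globally...
$macros{GARDENTIP} = sub { 'Water in the morning' };

# ...and every other domain served by this process now sees it too.
my $leaked = exists $macros{GARDENTIP} ? 1 : 0;

# With per-app registries the registration stays with its own domain.
my $appA = Sketch::App->new( domain => 'a.example' );
my $appB = Sketch::App->new( domain => 'b.example' );
$appA->registerMacro( GARDENTIP => sub { 'Water in the morning' } );

print "global registry visible everywhere: $leaked\n";
print "a.example has macro: ", $appA->hasMacro('GARDENTIP'), "\n";
print "b.example has macro: ", $appB->hasMacro('GARDENTIP'), "\n";
```

The design point is only that registration state hangs off an object created per request or per domain rather than off a package variable filled in at load time.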

The View

Foswiki.pm has to be split. I propose the original module to be a container for:

  • StaticMethods aka API functions.
  • System-wide, domain-independent constants.
  • A few execution environment adjustments.

The key words here are 'domain independent'. If something doesn't fit this demand – it doesn't fit into Foswiki.pm.

The code related to what we now call the session I would extract and put into a Foswiki::App class, which would be the alpha and omega of processing a single request.

Why App? Because Application is too long! :) Who needs another reason, really? :) OK, for those who actually do, I would give the following reasons:

  • It puts together all the other code (frameworks, modules, contribs, and plugins), as an application does.
  • It goes through the basic stages of an application life cycle: prepare, init, run, shutdown.
  • It actually handles the task of organizing the interaction with a user by processing their requests.

Eventually the startup code of a CGI script shall consist of a few lines:
#!/usr/bin/env perl
use Foswiki::App;
Foswiki::App->run;

It is then the job of the Foswiki::App constructor to do the rest:

  1. Determine the environment we're running in.
  2. Set up basic error handling.
  3. Read the configuration.
  4. Initialize the base structures and objects.
  5. Initiate the request handling.
  6. Collect the garbage, clean up, turn off the iron, the kitchen stove... Oops, it's from another topic. ;)
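
The steps above can be sketched roughly as follows. This is not the actual Foswiki::App: the Sketch::App class, its stage method names and the log it keeps are all hypothetical. For simplicity the sketch drives the stages from a run() class method rather than from the constructor itself, which is one possible reading of the proposal, not the only one.

```perl
use strict;
use warnings;

# Hypothetical sketch only; names and stages are assumptions, not Foswiki code.
package Sketch::App {
    sub new {
        my ( $class, %args ) = @_;
        return bless { env => $args{env} || {%ENV}, log => [] }, $class;
    }

    # Each stage merely records that it ran; real stages would do the work.
    sub _stage {
        my ( $this, $name ) = @_;
        push @{ $this->{log} }, $name;
        return $this;
    }

    sub detectEnvironment  { $_[0]->_stage('environment') }
    sub setupErrorHandling { $_[0]->_stage('errors') }
    sub readConfig         { $_[0]->_stage('config') }
    sub initObjects        { $_[0]->_stage('init') }
    sub handleRequest      { $_[0]->_stage('request') }
    sub cleanup            { $_[0]->_stage('cleanup') }

    # Class method used by the startup script: build the app, walk the
    # stages in order, and always clean up, even after a failure.
    sub run {
        my ( $class, %args ) = @_;
        my $this = $class->new(%args);
        my $ok   = eval {
            $this->detectEnvironment;
            $this->setupErrorHandling;
            $this->readConfig;
            $this->initObjects;
            $this->handleRequest;
            1;
        };
        my $err = $@;
        $this->cleanup;
        die $err unless $ok;
        return $this;
    }
}

package main;

my $app = Sketch::App->run( env => {} );
print join( ' -> ', @{ $app->{log} } ), "\n";
```

Nothing in this sketch runs at compile time, which is the property the proposal is after: all state is created when run() is called, not when the module is loaded.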

And no BEGIN blocks unless there is no other way. use locale doesn't fit this model because it depends on %Foswiki::cfg. But:

  1. It was said that the locale pragma is no good and that a different solution must be used.
  2. For the time being, until locale can be wiped out, its use might be regulated by some kind of environment variable. Apache can set the environment on a per-virtual-host basis, for example. But then again: if I understand the problem correctly, any persistent environment such as mod_perl or FastCGI would have the same problem of sharing the locale from the first time a module was loaded.

So, Carthage locale has to be removed!

Another big change with impact on plugins: $Foswiki::Plugins::SESSION will disappear. Plugin handlers that are not supplied with a $topicObject (aka $meta) parameter would need to get an additional one, like the preload() or tag handlers.
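
To illustrate the difference for plugin authors, here is a small sketch contrasting a handler that reads the global with one that receives the application object as a parameter. The handler names, the Sketch::Plugins package and the plain hash used as an "app" are hypothetical; real handler signatures are not defined by this proposal yet.

```perl
use strict;
use warnings;

package Sketch::Plugins;
our $SESSION;    # stand-in for $Foswiki::Plugins::SESSION

package main;

# Current style: the handler reaches for the global session.
sub legacyHandler {
    my ($text) = @_;
    return "$text (served by $Sketch::Plugins::SESSION->{host})";
}

# Proposed style: the application object arrives as an explicit parameter.
sub explicitHandler {
    my ( $app, $text ) = @_;
    return "$text (served by $app->{host})";
}

my $app = { host => 'a.example' };    # hypothetical minimal "app"

$Sketch::Plugins::SESSION = $app;
my $old = legacyHandler('Hello');

undef $Sketch::Plugins::SESSION;      # the global is gone...
my $new = explicitHandler( $app, 'Hello' );    # ...and this still works

print "$old\n$new\n";
```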

The above condition could be ignored if there is a 100% guarantee that a process (i.e. a system process identified by its PID) serves no more than one request at a time. That means that no frameworks like POE or the like are used and no threads are involved. Though in the latter case the variable could be declared thread-local. In the POE case, replacing the variable's content upon context switching would resolve the problem too.

Unfortunately I'm running out of time and have to stop here. But hopefully the seed has been sown and the idea is clear.

New classes emerging from Foswiki.pm

In this section I will introduce the new classes brought to life in the process of splitting Foswiki.pm. Due to lack of time, and because the new classes might be something of a moving target, a detailed description should not be expected within this proposal. Yet most of their functionality, including method names, will come from Foswiki.pm and as such should be familiar to anybody interested.

Foswiki::Macros

Macros handling and expansion.

Foswiki::UI

Converted to a class. Request handling functions are to be moved into Foswiki::App. It would be a great place for UI-related stuff like generic page generation in place of CGI-based generation.

I'm also considering moving pure UI-related code into Foswiki::UI, such as:

  • user handling
  • methods like getSkin() or inlineAlert()

Changes in behavior

Some classes would do things differently or their role is to be changed.

Foswiki::Engine

Engine was previously the initiator of request processing. This role is handed over to Foswiki::App. Engine used to set up Foswiki::Request; this is a job that Foswiki::Request itself would do better. Engine is to become a mere mediator between the Foswiki core and the environment we're running under. In other words, it will serve as a source of information (what connection is being used, what query parameters are supplied, what cookies are being sent) and as the output destination (i.e. the write() method).

Foswiki::Request

This class would take a more active role in setting itself up, from parsing the path info string to fetching and setting various parameters like connection attributes (secure/insecure, port number), action name, etc.
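
As a rough illustration of a request object that sets itself up, the sketch below derives its attributes from a CGI-style environment hash. The Sketch::Request class and its accessors are hypothetical, and the parsing is deliberately simplified (the last path component is always taken as the topic); the real path info rules in Foswiki are more involved.

```perl
use strict;
use warnings;

# Hypothetical sketch; not the actual Foswiki::Request.
package Sketch::Request {
    sub new {
        my ( $class, %args ) = @_;
        my $env  = $args{env} || {};
        my $this = bless {}, $class;

        # Action name: last component of the script path, e.g. /bin/view.
        ( $this->{action} ) = ( $env->{SCRIPT_NAME} // '' ) =~ m{([^/]+)$};

        # Path info: the last component is the topic, the rest is the web.
        my @parts = grep { length } split m{/}, $env->{PATH_INFO} // '';
        $this->{topic} = pop @parts;
        $this->{web}   = join '/', @parts;

        # Connection attributes.
        $this->{secure} = lc( $env->{HTTPS} // '' ) eq 'on' ? 1 : 0;
        $this->{port}   = $env->{SERVER_PORT};
        return $this;
    }

    sub action { $_[0]{action} }
    sub web    { $_[0]{web} }
    sub topic  { $_[0]{topic} }
    sub secure { $_[0]{secure} }
    sub port   { $_[0]{port} }
}

package main;

my $req = Sketch::Request->new(
    env => {
        SCRIPT_NAME => '/bin/view',
        PATH_INFO   => '/Main/Gardeners/WebHome',
        HTTPS       => 'on',
        SERVER_PORT => 443,
    }
);
printf "%s %s.%s secure=%d port=%d\n",
  $req->action, $req->web, $req->topic, $req->secure, $req->port;
```

In this arrangement the engine would only hand over the raw environment; interpreting it becomes the request's own job.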

Examples


$this->app->macro('SEARCH');

# Let's assume that %macros converted into a Foswiki::Macros object.
$this->app->macros->expand($topic); # $topic->expandMacros is still a comfortable wrapper.

# But if the above is considered an overkill:
$this->app->expandMacros($topic); # We can still have this form. Or preserve it as the only one.

# Passing additional app parameter to a newly created object every time might be pretty boring. Why not to automate it?
$this->create(
    'Foswiki::Meta',
    web => $this->web,
);

# create() method would basically be like this:
sub create {
    my $this = shift;
    my $class = shift;
    return $class->new(
        app => $this->app,
        @_
    );
}

# In certain cases data structures would require some kind of chaining (parent/child, prev/next).
# That won't burden the code either:
sub createChild {
    my $this = shift;
    my $class = shift;
    # Reuse create() so the new object gets the app, and record its parent.
    my $newObj = $this->create(
        $class,
        parent => $this,
        @_
    );
    return $this->addChild( $newObj );
}

$this->createChild( 'Foswiki::Meta', web => $subWeb );  

Etc., etc., etc...

Impact

%WHATDOESITAFFECT%

Implementation

-- Contributors: VadimBelman - 04 Mar 2016

Discussion

It is safe to assume one PID processes only one request at a time. So Foswiki::Plugins::SESSION could stay unless there is some other reason to remove it.

-- MichaelDaum - 04 Mar 2016

Take the following "Hello world" PSGI application (install the prerequisites with cpanm Task::Plack).
use Plack::Builder;

my $app = sub {
    my $env = shift;
    return [
      200,
      [
         'Content-Type' => 'text/plain'
      ],
      [
         "Hello World from $env->{HTTP_HOST} PID: $$",
      ]
   ];
};

builder {
   mount 'http://mylocal1/' => $app;
   mount 'http://mylocal2/' => $app;
   mount 'http://mylocal3/' => $app;
   mount '/' => $app;
}
As you can see, the same application is mounted four times: for three different hostnames and as the default.

For testing, add this line to your /etc/hosts: 127.0.0.1 localhost mylocal1 mylocal2 mylocal3

Now run plain starman and open http://mylocal1:5000/ in one browser and http://mylocal2:5000/ in another. Reload the pages a few times in both browsers and you will see different PIDs.

Now stop starman, run the twiggy command and reload both browsers a few times. You will always see the same PID in both.

This is because different HTTP servers are used. starman is a preforking server (aka many workers), while twiggy is AnyEvent based (one "worker"). Even the basic "development" server (the plain plackup command) will show only one PID.

I.e. no, it isn't safe to assume that "one PID processes only one request at a time", or maybe I didn't understand correctly what you mean... ;)

-- JozefMojzis - 04 Mar 2016

The question is whether ALL of Foswiki is reentrant or not. From what I know, all of these calling semantics run one thread at a time, even for event-based execution. At least that's what I'd assume. Otherwise we'd probably have been stuffed all along, not only with Foswiki::Plugins::SESSION.

-- MichaelDaum - 04 Mar 2016

Once in the memory, or not once in the memory, that is the question. Not exactly Hamlet, but ... :)

Imagine a scenario:
  • we want to set up Foswiki for 500 different "local gardeners groups".
  • such wikis are low traffic, but every group wants to have its own wiki, because they're otherwise "enemies"... :) :)
  • so each group will register its own domain.
  • the traditional way (aka Apache name-based virtual hosts + FCGI), we will have 500 copies of Foswiki and 500 copies of the perl interpreter in memory, which isn't very efficient.

So, it would be nice to be able to run Foswiki as
my $foswiki = sub { ... };
builder {
   mount "$_" => $foswiki for (@all_500_domains);
}
aka having only as many perl interpreters in memory as there are "workers" running under starman (usually 10-20 processes). I.e. it would be possible to run dozens of different Foswikis for many different hosts with only a few real workers; call this "memory-efficient wiki farming".

I don't know how much work is needed to achieve this. Maybe it will be "too much work". We need to decide.

-- JozefMojzis - 04 Mar 2016

This is already doable using VirtualHostingContrib and works fine. I think this is hunting ghosts unless you can prove that there actually is a problem. I doubt that as far as I can see.

-- MichaelDaum - 04 Mar 2016

I just
  1. showed an example of the PID thing. AFAIK, the AnyEvent based twiggy server doesn't use threads. It is a simple select-based event loop, so the PID is always the same.
  2. added a wish/question about "how many times perl is loaded into memory for multiple virtual hosts"...

That's all. If it is already done, great! But we still don't have a PSGI-based Foswiki, so I have doubts about the "done". :)

-- JozefMojzis - 04 Mar 2016

Regarding locales, see: CleanUpFoswikiLocales

-- JulianLevens - 05 Mar 2016 - 08:51

Just to make it 100% clear:

  • each request owns the perl interpreter exclusively.
  • we do not use concurrent threads within the same perl process, neither for one request nor for multiple independent requests being served at the same time
  • two requests hitting Foswiki at the very same second will be handled by two independent processes
  • each process runs a request from start to end, i.e. it initializes all global variables within Foswiki ... including Foswiki::Plugins::SESSION
  • there is no such thing as not initializing Foswiki::Plugins::SESSION yet still serving a Foswiki request
  • of course the same process will be free to handle the next request ... but only AFTER the previous one has finished completely ... which includes cleaning up Foswiki::Plugins::SESSION and all objects that hang from it.
  • even when a Foswiki::App is called within an event loop, that loop will occupy the perl process exclusively
  • only when the Foswiki::App comes back from an event loop will it be ready to serve the next request
  • we only have to make sure that each event loop leaves perl behind in a sane state, i.e. clean up global vars, destroy objects, etc.

That's why you will see the same PID being used for multiple requests ... but only one after the other, never concurrent.

This all means that your server will only be able to handle as many concurrent requests as there are perl processes loaded into memory. Anything else is queued up.

A sensible number of Foswiki workers depends on the number of CPUs in your server rather than on the number of domains hosted on it.

-- MichaelDaum - 05 Mar 2016

Just to make it 100% clear for event-loop based servers too :).

Michael, you're precise and 100% right as long as we are talking about the traditional, old-school (early '90s, Apache-like prefork) model. With this model, the only way to scale web servers to handle more requests and/or serve long-running requests is the "one request = one process (or thread)" method. As requests come in, each gets assigned to one of the already preforked processes, which does nothing other than handle that request from start to finish.

Imagine some long-running requests (for example a streaming multipart XMLHttpRequest or similar long-polling Comet). The response doesn't finish for a long time. With the old-school (aka blocking) logic this is solvable only by increasing the process (thread) count; otherwise the server would become unresponsive because of the (as you said) "..but only one after the other, never concurrent."

Because forking processes (or creating threads) is expensive, and the processes (threads) often do nothing but wait (while occupying memory), many modern web servers use another method: single-process (single-thread) event-loop based servers.

Single-process event-loop based servers use non-blocking IO, which means one process can serve dozens of concurrent requests. That means: one process = many requests concurrently. For such web servers/apps the following is *not true*:
  • each request owns the perl interpreter exclusively
  • two requests hitting the web app at the very same second will be handled by two independent processes

They're called "event loop" servers because the IO is non-blocking and the server just loops over the created sockets and checks their state (aka waits for events), a la: connected, able to read, able to send, finished, etc.

Everything is done in the same process, concurrently, for many (partially) served requests. The main point of an event-loop based server is concurrency across requests. Therefore the statement "we only have to make sure that each event loop leaves behind perl in a sane state, i.e. clean up global vars, destroy objects, etc." isn't true either... Please google for non-blocking web servers.
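
A toy simulation (not a real server, and not Foswiki code) of the hazard being argued here: two requests are split into steps, as they would be around a non-blocking wait, and a tiny round-robin loop interleaves them in one process. The global $SESSION, makeRequest() and the example hosts are hypothetical.

```perl
use strict;
use warnings;

our $SESSION;    # stand-in for a process-wide session variable

# Build a "request" as two steps with a yield point (e.g. waiting on IO)
# between them.
sub makeRequest {
    my ( $host, $seen ) = @_;
    return [
        sub { $SESSION = { host => $host } },                  # request starts
        sub { push @$seen, "$host saw $SESSION->{host}" },     # after the wait
    ];
}

my @seen;
my @requests = map { makeRequest( $_, \@seen ) } qw(a.example b.example);

# Minimal round-robin "event loop": run one step of each request in turn.
while ( grep { @$_ } @requests ) {
    for my $req (@requests) {
        my $step = shift @$req or next;
        $step->();
    }
}

print "$_\n" for @seen;
# prints:
#   a.example saw b.example
#   b.example saw b.example
```

The first request resumes after the second one has overwritten the global, so it reads the wrong session. Whether this can happen in practice depends on the server and on whether anything re-establishes the global when a request resumes.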

Of course, the Foswiki team could decide: we will never support modern non-blocking web servers, so we will not bother with them. In that case, just ignore all of the above. So long, and thanks for all the bits.. :)

-- JozefMojzis - 07 Mar 2016

Did you check psgi.multithread and psgi.multiprocess in your psgi app's environment?

-- MichaelDaum - 07 Mar 2016

The relevant variables are server dependent and are set as:
| variable | plackup | twiggy | starman | corona |
| psgi.multiprocess | 0 | 0 | 1 | 0 |
| psgi.multithread | 0 | 0 | 0 | 1 |
| psgi.nonblocking | 0 | 1 | 0 | 1 |
I.e. they're correctly set for the event-loop based twiggy.
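
For anyone who wants to check these values on their own server, a minimal PSGI application along the following lines should do. The three keys are defined by the PSGI specification; the variable names and the direct call at the end are only for illustration, and the environment passed there is made up to mimic the twiggy column.

```perl
use strict;
use warnings;

# PSGI application: report the concurrency-related keys of the environment.
my $app = sub {
    my $env  = shift;
    my $body = join '', map {
        sprintf "%s: %s\n", $_, $env->{$_} ? 1 : 0
    } qw(psgi.multiprocess psgi.multithread psgi.nonblocking);
    return [ 200, [ 'Content-Type' => 'text/plain' ], [$body] ];
};

# Called directly here with a hand-made environment, for illustration only.
my $res = $app->(
    {
        'psgi.multiprocess' => 0,
        'psgi.multithread'  => 0,
        'psgi.nonblocking'  => 1,
    }
);
print $res->[2][0];
```

To run it under a real server, the direct call would be dropped and the file saved as a .psgi file whose last expression is $app.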

-- JozefMojzis - 07 Mar 2016

That means only Corona, which uses coroutines, is a problem, but not PSGI/Plack in general. Is that so? Most seem to be using a kind of worker pool, such as Starman. As such I'd consider them in the same class as FCGI.

-- MichaelDaum - 07 Mar 2016

Damn, RSS didn't display changes in the topic to me. Therefore I'm late to jump on the discussion train.

I think too much is being spent on discussing the global $SESSION variable. We can keep it, as Michael is totally right: as long as it is one request at a time per process, there's nothing to worry about. Even if someday we manage to develop an event-driven Foswiki, keeping $SESSION current is only a matter of setting it to the proper value on each context switch. Nothing to speak about.
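
A minimal sketch of what "setting it to the proper value on each context switch" could look like, assuming a cooperative scheduler that knows which session belongs to which task. The scheduler loop, makeTask() and the task structure are hypothetical; Perl's local is used so that the global is also restored after every slice.

```perl
use strict;
use warnings;

our $SESSION;    # stand-in for the global session variable

# Each task carries its own session plus the steps still to run.
sub makeTask {
    my ( $host, $seen ) = @_;
    return {
        session => { host => $host },
        steps   => [
            sub { },    # first slice: nothing to observe yet
            sub { push @$seen, "$host saw $SESSION->{host}" },
        ],
    };
}

my @seen;
my @tasks = map { makeTask( $_, \@seen ) } qw(a.example b.example);

# Round-robin scheduler: before running a slice, point the global at the
# session owned by the task being resumed.
while ( grep { @{ $_->{steps} } } @tasks ) {
    for my $task (@tasks) {
        my $step = shift @{ $task->{steps} } or next;
        local $SESSION = $task->{session};    # the context switch
        $step->();
    }
}

print "$_\n" for @seen;
# prints:
#   a.example saw a.example
#   b.example saw b.example
```

This only holds if every resumption goes through the scheduler; a callback fired from elsewhere would still see whatever value the global happens to have.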

I don't think we will ever go in the multithreading direction, as perl5 and multithreading are nearly mutually exclusive things. Yet even if this happens some day, $SESSION could be declared a thread-local variable, which is good enough for me too.

So, I think it shall stop now. What I'm mostly looking for here is objections to, or proposals for, the model itself. I would especially like to focus on the OO API where ->app or $SESSION becomes the root of it all. I.e., using an example from the initial ImproveOOModel topic, getUrlHost() would be replaced with $app->req->urlHost, or by using Moo's delegation we may have a shortcut $app->getUrlHost.
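
A small sketch of the two call forms, written in plain Perl so that it runs without extra modules. The Sketch::Request and Sketch::App classes and their constructor arguments are hypothetical; the delegation is hand-written here, and the comment shows how it could be declared with Moo's handles option instead.

```perl
use strict;
use warnings;

# Hypothetical request class exposing urlHost.
package Sketch::Request {
    sub new {
        my ( $class, %args ) = @_;
        return bless {%args}, $class;
    }

    sub urlHost {
        my $this   = shift;
        my $scheme = $this->{secure} ? 'https' : 'http';
        return "${scheme}://$this->{host}";
    }
}

# Hypothetical application class holding the request.
package Sketch::App {
    sub new {
        my ( $class, %args ) = @_;
        return bless { req => Sketch::Request->new(%args) }, $class;
    }

    sub req { $_[0]{req} }

    # Hand-written delegation; with Moo this could be declared as
    #   has req => ( is => 'ro', handles => { getUrlHost => 'urlHost' } );
    sub getUrlHost { $_[0]->req->urlHost }
}

package main;

my $app = Sketch::App->new( host => 'wiki.example', secure => 1 );
print $app->req->urlHost, "\n";    # long form
print $app->getUrlHost,   "\n";    # delegated shortcut
```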

-- VadimBelman - 07 Mar 2016

So, the decision:
  • We will NOT support event-loop based PSGI web servers, like CPAN:Twiggy .
  • We will support only PSGI web servers that use the traditional one request == one process model, like starman .

I don't really understand the reason for the decision, but OK. Because I'm really very limited by my English, I'm unable to explain things understandably, even the simple ones. The lack of English knowledge is my fault, so it is pointless to blab more here. :)

I will be happy with any Foswiki; even the current one is great... :) :)

-- JozefMojzis - 07 Mar 2016

Jozef, you got it wrong. What I'm saying is that if we speak about non-threaded environments, then whichever framework is responsible for request handling, there will be no more than one request served per process at any given moment. This is true even in event-driven environments, simply by design. It means that there is always a way to have $SESSION initialized correctly.

For threading the situation would be different but solvable too.

So, nobody is imposing any limits. We would only need different solutions for different environments. That in turn means we can happily leave the variable where it is now and keep the same functionality.

-- VadimBelman - 07 Mar 2016

I have committed a first outline of the Foswiki::App class. It's just a few-cell embryo (or a pen esquisse, whichever one prefers) of the future core object. Yet it could trigger some discussion even now.

An important note: it's going to be much less compatible with the current design than expected. Otherwise it wouldn't be worth the effort.

-- VadimBelman - 17 Mar 2016
 
Topic revision: r24 - 11 Apr 2016, VadimBelman