Feature Proposal: For Store Utilities multiple store instances will be valuable

Motivation

I have been hacking change_store.pl and a copy renamed test_store.pl (amended for performance testing). Both of these would benefit from the ability to pass different config options to the Store, even identical store implementations.

For example, change_store.pl assumes it has to monkey around with config values (the data and pub directories); this is not generic.

Description and Documentation

Amend all $Foswiki::cfg{Store} references in store code to use $this->{cfg}, and amend new() to accept a config hash.

In Foswiki.pm there is the line:
$this->{store} = $base->new();

Which would need to become:
$this->{store} = $base->new($Foswiki::cfg{Store});

The new() method would need amending, e.g. in PlainFile.pm, from:
sub new {
    my $class = shift;
    my $this  = $class->SUPER::new(@_);

    # Compatibility with old config settings
    unless ( defined $Foswiki::cfg{Store}{filePermission} ) {
        $Foswiki::cfg{Store}{filePermission} =
          $Foswiki::cfg{RCS}{filePermission};
        $Foswiki::cfg{Store}{dirPermission} = $Foswiki::cfg{RCS}{dirPermission};
    }
    return $this;
}
To:
sub new {
    my $class = shift;
    my $this  = $class->SUPER::new(@_);

    $this->{cfg} = shift;
    # Compatibility with old config settings
    unless ( defined $this->{cfg}{filePermission} ) {
        $this->{cfg}{filePermission} =
          $Foswiki::cfg{RCS}{filePermission};
        $this->{cfg}{dirPermission} = $Foswiki::cfg{RCS}{dirPermission};
    }
    return $this;
}

Other code in the store that references $Foswiki::cfg{Store} would be changed to use $this->{cfg}.
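As a minimal, self-contained sketch of the idea (not the actual Foswiki code), the class below keeps the config hash it was constructed with and reads every setting through $this->{cfg}. The package name Demo::Store, the dataDir method and the '/foswiki/newdata' path are hypothetical names chosen for illustration only.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a store class; not the real Foswiki::Store API.
package Demo::Store;

sub new {
    my ( $class, $cfg ) = @_;

    # Each instance keeps a reference to the config hash it was given.
    return bless { cfg => $cfg }, $class;
}

# Store code reads its settings from $this->{cfg}, never from a global.
sub dataDir {
    my $this = shift;
    return $this->{cfg}{data};
}

package main;

# Two instances of the same implementation with different configs.
my $source = Demo::Store->new( { data => '/foswiki/data' } );
my $target = Demo::Store->new( { data => '/foswiki/newdata' } );

print $source->dataDir, "\n";
print $target->dataDir, "\n";
```

A tool such as change_store.pl could then hold a source and a target store at the same time without touching any global between calls.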

This also raises issues about the namespacing of Store in $Foswiki::cfg. The options are:
  1. $Foswiki::cfg{Store}{PlainFile}
  2. $Foswiki::cfg{PlainFileStore}, leaving {Store} for core store config values (e.g. ImplementationClasses)
  3. $Foswiki::cfg{RCS} either updated to {RCSStore}, or left as {RCS} as a legacy exception. In the latter case, should {RCS} be reserved for RCS stores, or should other file-system-based stores borrow some of its values as default settings?

It has been suggested that multiple Foswiki instances are an alternative, but I can see that causing more headaches. The IRC discussion is here: http://irclogs.foswiki.org/bin/irclogger_log/foswiki?date=2013-10-14,Mon&sel=4#l0

Examples

Impact


Implementation

-- Contributors: JulianLevens - 15 Oct 2013

Discussion

It seems to me your goal is not to pass a config hash, but to modify the existing config hash based on some implementation-specific requirements that may not be captured by $Foswiki::cfg.

I'm uncomfortable with this for several reasons.
  1. $Foswiki::cfg is a static structure that (with the exception of a couple of glaring holes) is never written to by a running Foswiki. This is a good pattern.
  2. A running Foswiki can't afford the time to do this sort of analysis. That's what configure is for.
  3. A $Foswiki::cfg maintainer (viz. the admin who runs configure) loses control of aspects of the config.
Note, however, that these constraints don't necessarily apply outside the context of an HTTP-request-handler. Stand-alone scripts can already modify the Foswiki::cfg hash - that's how the Unit Tests set up their environment - it's just a case of doing it in the right place....

BTW I already did some work on namespacing in 1.2.0 - please look at the trunk config hash.

-- CrawfordCurrie - 23 Oct 2013

I've apparently not been very clear.

I have no intention of amending $Foswiki::cfg in action; it is, as you say, a read-only device. Indeed, part of my motivation is to remove such modification: change_store.pl does change it on the fly between reading a topic from one store and writing the topic to the target store. Each store should know its own directory independently.

Please bear in mind that we also need to consider {Store}{ImplementationClasses}, which may well mean multiple stores acting in concert as one logical store. This issue affects my work on VersatileStore quite a lot, as Versatile is a topic-only store: it requires another store to handle attachments. Currently I'm using PlainFileStore as the attachment back-end, so the implementation classes list PlainFileStore and VersatileStore. During Foswiki instantiation this means that VersatileStore inherits from PlainFileStore. Therefore many method calls are passed directly to PlainFileStore, while others are handled by VersatileStore with SUPER calls to PlainFileStore as required.

An alternative config would be to have a cloud-based attachment store.

In a testing situation I may well test VersatileStore plus PlainFileStore against VersatileStore plus CloudStore. In this situation I would have two different databases for VersatileStore and two different configurations for the other stores. It is therefore not sufficient to simply change the directory configuration between calls to the store; the database configuration would have to change as well. change_store.pl and test_store.pl cannot in general know what needs to change in the config from one type of store to another. Therefore the store should accept a config hash during instantiation. The hashes passed to each store are separate entities with no shared refs, and they are still read-only. Each store instantiation will therefore read the correct config values when required.
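One way to get config hashes that are separate entities with no shared refs and are still read-only can be sketched with two core Perl modules: Storable::dclone for the deep copy and Hash::Util::lock_hash_recurse for the locking. This is an illustration only; the storeConfig helper, the key names and the connection strings are assumptions, not part of Foswiki.

```perl
use strict;
use warnings;
use Storable   qw(dclone);
use Hash::Util qw(lock_hash_recurse);

# Hypothetical shared configuration; keys and values are illustrative only.
my %cfg = (
    VersaTest1 => { connection => 'dbi:mysql:test1', dbuser => 'user1' },
    VersaTest2 => { connection => 'dbi:mysql:test2', dbuser => 'user2' },
);

# Return a private, read-only copy of one named store config.
sub storeConfig {
    my ($name) = @_;
    my $copy = dclone( $cfg{$name} );    # deep copy: no refs shared with %cfg
    lock_hash_recurse(%$copy);           # any later write will die
    return $copy;
}

my $first  = storeConfig('VersaTest1');
my $second = storeConfig('VersaTest2');

# An attempt to modify a locked config raises an exception.
my $ok = eval { $first->{dbuser} = 'changed'; 1 };
print "write rejected\n" unless $ok;
print "$first->{dbuser} $second->{dbuser}\n";
```

Note that lock_hash_recurse only descends into nested hashes, so array values inside a config would need separate handling.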

It is worth noting in the above scenario that VersatileStore is instantiated twice. Therefore it is critical that each instantiation has a different config. It is not enough to assume that a different namespace for each store will suffice.

However, that does give me an alternative idea. Rather than passing a whole config, simply pass the name of the store config to use.

For example:
  • Foswiki::cfg{Store}{Versatile} etc as the default
  • Foswiki::cfg{Store}{VersaTest1} as the alternative

In this situation the store would be passed the alternative name and config would need to be referenced via $Foswiki::cfg{Store}{$this->{name}}. This would be for {Store} specific config options only. For example, cfg{OS} is Foswiki global. This does limit store testing to using the same OS.

Indeed, I think this is a better option, as I do not have to maintain a separate config file somehow. The standard config could end up littered with many extra store configs; however, that is only going to occur for a developer testing many stores.
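The name-based lookup can be sketched as follows. This is a hedged illustration: %cfg stands in for $Foswiki::cfg, and the Demo::NamedStore class, its cfgValue method and the 'fwtestuser' value are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical stand-in for $Foswiki::cfg; values are illustrative only.
my %cfg = (
    OS    => 'UNIX',    # global setting, shared by every store config
    Store => {
        Versatile  => { dbuser => 'fwdbuser' },      # the default
        VersaTest1 => { dbuser => 'fwtestuser' },    # the alternative
    },
);

package Demo::NamedStore;

sub new {
    my ( $class, $name ) = @_;

    # The store is told which named config to use; fall back to its default.
    return bless { name => $name // 'Versatile' }, $class;
}

# Store-specific options are looked up via the config name.
sub cfgValue {
    my ( $this, $key ) = @_;
    return $cfg{Store}{ $this->{name} }{$key};
}

package main;

my $default = Demo::NamedStore->new();
my $test    = Demo::NamedStore->new('VersaTest1');

print $default->cfgValue('dbuser'), "\n";
print $test->cfgValue('dbuser'),    "\n";
```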

This requires a way to instantiate a store as a set of implementation classes for test/change_store.pl type situations.

The config structure would become like the following:
$Foswiki::cfg{Stores}{ImplementationClasses} = 1;
$Foswiki::cfg{Stores}{ImplementationClass}{PlainFile} = 100;
$Foswiki::cfg{Stores}{ImplementationClass}{Versatile} = 200;
...
$Foswiki::cfg{Store}{PlainFile}{data} = '/foswiki/data';
$Foswiki::cfg{Store}{PlainFile}{pub} = '/foswiki/pub';
...
$Foswiki::cfg{Store}{Versatile}{connection} = 'dbi:mysql:...';
$Foswiki::cfg{Store}{Versatile}{dbuser} = 'fwdbuser';
...

I need to come back later and finish this :)

-- JulianLevens - 23 Oct 2013

After further thought I may even combine the hash and name ideas.

-- JulianLevens - 23 Oct 2013

Indeed I will combine the hash and name ideas.

The proposal has become to standardise the store related config parameters and pass the appropriate subset as a hash ref.

The config structure would become like the following:
$Foswiki::cfg{Store}{ImplementationClasses} = 1;
$Foswiki::cfg{Store}{ImplementationClass}{PlainFile} = 100;
$Foswiki::cfg{Store}{ImplementationClass}{Versatile} = 200;
...
$Foswiki::cfg{Store}{PlainFile}{data} = '/foswiki/data';
$Foswiki::cfg{Store}{PlainFile}{pub} = '/foswiki/pub';
...
$Foswiki::cfg{Store}{Versatile}{connection} = 'dbi:mysql:...';
$Foswiki::cfg{Store}{Versatile}{dbuser} = 'fwdbuser';
...

The idea is to then allow:

$Foswiki::cfg{StoreTest}{ImplementationClasses} = 1;
$Foswiki::cfg{StoreTest}{ImplementationClass}{PlainFile} = 100;
$Foswiki::cfg{StoreTest}{ImplementationClass}{Versatile} = 200;
...
$Foswiki::cfg{StoreTest}{PlainFile}{data} = '/foswiki/testdata';
$Foswiki::cfg{StoreTest}{PlainFile}{pub} = '/foswiki/testpub';
...
$Foswiki::cfg{StoreTest}{Versatile}{connection} = 'dbi:mysql:...';
$Foswiki::cfg{StoreTest}{Versatile}{dbuser} = 'fwdbuser2';
...

And either in addition or alternatively
$Foswiki::cfg{StoreVersaPlain}{ImplementationClasses} = 1;
$Foswiki::cfg{StoreVersaPlain}{ImplementationClass}{PlainFile} = 100;
$Foswiki::cfg{StoreVersaPlain}{ImplementationClass}{Versatile} = 200;
...
$Foswiki::cfg{StoreVersaPlain}{PlainFile}{data} = '/foswiki/plaindata';
$Foswiki::cfg{StoreVersaPlain}{PlainFile}{pub} = '/foswiki/plainpub';
...
$Foswiki::cfg{StoreVersaPlain}{Versatile}{connection} = 'dbi:mysql:...';
$Foswiki::cfg{StoreVersaPlain}{Versatile}{dbuser} = 'fwdbuser3';
...

I would also suggest a config param as follows:
$Foswiki::cfg{Store}{ImplementationConfig} = 'Store'; # Defaults to 'Store' if not provided

Foswiki.pm would pass $Foswiki::cfg{ $Foswiki::cfg{Store}{ImplementationConfig} || 'Store' } when new()ing the store.

test_store.pl and change_store.pl both initialise Foswiki, so the default store would be created. In addition, parameters can then be passed to these (and other) store tools, e.g. StoreTest (or simply Test). Remember we need the ability to test different store configurations, not just a single store.

I am suggesting that any primary ::cfg key beginning with Store should be reserved for store configs. If a secondary key in a store config begins with Implementation, then it is reserved for the overall config of that store and cannot be used as a store name (alas, ImplementationStore will not be valid).

Foswiki::Store will need a new subroutine that is passed a StoreConfig name (Store, StoreNew, StoreTest etc.). It will examine $Foswiki::cfg{$StoreConfig}{ImplementationClasses} etc., build the appropriate class structure as is currently done in Foswiki.pm, and return the new() instantiated store.

Any store tool that needs to create an additional store config can call this again as required with alternate StoreConfig names.
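A rough sketch of such a subroutine follows, under stated assumptions: the Demo::* packages stand in for real store classes, %cfg stands in for $Foswiki::cfg, the name newStore is hypothetical, and classes are assumed to be chained in ascending order of their ImplementationClass weight so that the highest-weighted class ends up as the most derived one.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the real store classes.
package Demo::Store;
sub new { my ( $class, $cfg ) = @_; return bless { cfg => $cfg }, $class }
sub readTopic      { return 'base' }
sub saveAttachment { return 'base' }

package Demo::Store::PlainFile;
sub saveAttachment { return 'PlainFile' }

package Demo::Store::Versatile;
sub readTopic { return 'Versatile' }

package main;

# Hypothetical stand-in for $Foswiki::cfg.
my %cfg = (
    StoreTest => {
        ImplementationClass => { PlainFile => 100, Versatile => 200 },
        PlainFile           => { data      => '/foswiki/testdata' },
        Versatile           => { dbuser    => 'fwdbuser2' },
    },
);

# Build the class chain for a named store config and instantiate it.
sub newStore {
    my ($cfgName) = @_;
    my $storeCfg = $cfg{$cfgName} or die "No store config '$cfgName'";
    my $weights  = $storeCfg->{ImplementationClass};

    my $base = 'Demo::Store';
    for my $name ( sort { $weights->{$a} <=> $weights->{$b} } keys %$weights ) {
        my $module = "Demo::Store::$name";
        no strict 'refs';
        @{ $module . '::ISA' } = ($base);    # each class inherits from the previous
        $base = $module;
    }
    return $base->new($storeCfg);
}

my $store = newStore('StoreTest');
print ref($store), "\n";
print $store->readTopic,      "\n";    # handled by Versatile
print $store->saveAttachment, "\n";    # falls through to PlainFile
```

Because @ISA is a per-class global, two configs that chain the same classes in different orders could not coexist in one process with this approach.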

Another thought is to insist that a store should inspect its own config options for parameters before looking for defaults elsewhere in $Foswiki::cfg. This allows finer control while still allowing global defaults.

BTW: I see value in some NullStores, both reading and writing, but more on that in the next exciting episode ...

-- JulianLevens - 24 Oct 2013

A long, long time ago I implemented multiple stores (or more accurately, store-per-web), though the core wasn't ready for it at the time and I had to abandon it. What I learned was:
  1. The store implementation is currently associated with the Foswiki instance. For multiple stores, you need to associate the store implementation with the data. There is a cost inherent in this that must be minimised.
  2. You can't do it piecemeal - you have to consider the implications throughout the ecosystem. For example, some plugins manipulate files-on-disc, assuming that is the implementation.
  3. Hacking it - as is done in change_store.pl - is not a general solution. Not even close.
Passing a store config is just one step in achieving this, but it has to be done taking into account these requirements - there has to be a long term plan, or we will get it wrong.

Regarding your notes above. I am not a fan of leaking store identities into the top level of the config hash - it's untidy. Keep everything under the {Store} namespace.

I don't like the idea of falling back to defaults in the core code. That's configure's job - to assemble the detailed configuration from a user-friendly input subset.

-- CrawfordCurrie - 25 Oct 2013

First up, I agree with the {Store} namespace for all of this.

As for multiple active stores in a Foswiki instance: well that certainly ups the stakes.

I was only considering multiple configs from the tools perspective.

As you say you need to associate the store implementation with the data and that needs an intermediate layer.

The obvious (possibly naive) option would be something like:
$Foswiki::cfg{Store}{Webs}{Main} = 'StoreConfig1';
$Foswiki::cfg{Store}{Webs}{System} = 'StoreConfig2';
$Foswiki::cfg{Store}{Webs}{Sandbox} = 'StoreConfig2';
$Foswiki::cfg{Store}{Webs}{'*'} = 'StoreConfig3';

So the core would instantiate many store configs and quickly choose the correct one via a hash ($session->{Store}{$web}->readTopic). If this intermediate layer is much more complex then, as you say, it's quite an overhead.
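The lookup itself can be sketched in a few lines. This is only an illustration: %webs mirrors the {Store}{Webs} structure above, and the storeConfigForWeb helper is a hypothetical name.

```perl
use strict;
use warnings;

# Mirrors the proposed $Foswiki::cfg{Store}{Webs} mapping.
my %webs = (
    Main    => 'StoreConfig1',
    System  => 'StoreConfig2',
    Sandbox => 'StoreConfig2',
    '*'     => 'StoreConfig3',
);

# Return the store config name for a web, falling back to the '*' entry.
sub storeConfigForWeb {
    my ($web) = @_;
    return $webs{$web} // $webs{'*'};
}

print storeConfigForWeb('Main'),  "\n";
print storeConfigForWeb('Trash'), "\n";
```

This ignores subwebs; whether a subweb should inherit its parent's store would be a further design decision.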

However, what's the driver for multiple stores anyway? The obvious answer is simply to increase capacity. However, VersatileStore has an SQL back end, which in turn has clustering etc. available and can increase capacity by two orders of magnitude, so I question the demand for this capability.

I grant you that VersatileStore is hardly proven yet. It is certainly designed to cope with the current store requirements: absolutely no typing (nothing reliable), with fields being defined, redefined and deleted at the whim of the user. It will also have the ability to add extra tables to provide better indexing of data forms (amongst others). In addition, it will have the ability to connect to other databases, treating the rows of those databases as Foswiki data forms.

Returning to multiple active stores: what are the demands driving this?

It also seems to me, defining distinct store configurations is orthogonal to the needs of multiple active stores, not that I'm certain of this.

I am also concerned that the community does not have the capacity to support many, many stores. I am optimistic that PlainFileStore will become the new default store and part of the core. VersatileStore will hopefully become the default store upgrade — not actually core but almost, with the community happy to maintain it.

That is not to deny some specialist stores will have value, but they will be maintained by specific individuals rather than the community. Of course, this is all in a state of flux, but I cannot see the community maintaining more than one or two stores at any one time.

Can you explain why defining distinct store configurations is not orthogonal to the needs of multiple active stores? Better yet, can you suggest some design ideas which would show where it's not?

-- JulianLevens - 25 Oct 2013

I don't have a great depth of experience with implementing new stores, but I have hacked enough on various plugins and the core to know that Crawford is right: you do need to consider the broader picture and have a long term plan. For example, although fewer than they were, there are still assumptions about the store, spread through the core and core plugins, documentation and downloadable extensions.

Why would you want multiple stores? Here is just one reason, no doubt there are others: the various stores have different strengths, and you may have different webs with different use cases and different requirements and there may be stores well-matched to each but no store that does everything well. So you end with a compromise unless you can use different stores for different webs. For the specific instance I have in mind, separate Foswiki installations may also have done the job.

I agree that conceptually, defining distinct store configurations does seem orthogonal to using multiple active stores. However, there is almost always more than one way to do it, and the way in which distinct store configurations are defined will affect performance when using multiple active stores. Of course it will affect performance when only one store is active, but it may affect performance differently when multiple stores are active. I cannot think of a situation where that would happen, but neither can I say with any confidence that it could never happen. Then again - maybe this is actually a side-issue that is not related to the proposal.

I am raising a concern because there is a date of commitment but the "Description and Documentation" is different to what is addressed in the later part of the discussion. I do not think the proposal is doomed, but I do think it needs further discussion, and I am concerned that people may be somewhat distracted with the 1.1.9 release.

-- MichaelTempest - 01 Nov 2013

Thanks for the extra feedback.

I appreciate the theoretical need for alternative stores. I'm not so sure we are close to a real practical need for it. My experience is limited here and I may simply be unaware of real demand.

Note that there are significant other considerations here. Some store actions may cross two stores: e.g. moveTopic and moveWeb. If the move is from same to same then the store action is likely to be quite efficient. OTOH, if it's moving the data across then that's potentially a significant action with many reads and saves going on. Of course you could insist web moves only occur within the same store.

I've actually started writing some code around this proposal; after all, I need to improve the tools I'm using to help me complete Versatile. This has also helped me refine the idea and I'm sure there's more to come. I'll need to update this proposal with the final idea, although it's essentially the same one.

I can see how I can leave a door open for one store to be a master store in a multi-config set-up, albeit it's rather basic.

The problem I have right now is that I cannot see why it's not orthogonal. If it is orthogonal then the concern you raise would be moot. If someone can suggest multi-store designs where the store config is not orthogonal, that would a) prove it requires more thought and b) suggest ways in which the design needs modification.

As it stands I have nowhere to go.

-- JulianLevens - 01 Nov 2013

Just to clarify - my concern is twofold:
  1. The proposal description at the top of this topic and the Topic Summary do not reflect the latest thinking that has come out of the discussion. This is straightforward to address.
  2. I think this needs more thought and the impending release may be a distraction. I will remove my concern after the release of 1.1.9 or once there is more participation in this discussion, whichever happens first.

Back to the discussion...

With regards to orthogonality, I read through the discussion again (and again) and I can see that I was confused. Sorry about that. I agree with your analysis. I cannot see whether it is orthogonal or not, though. How about updating the proposal to state that using multiple stores simultaneously is orthogonal to this proposal and beyond the scope of this proposal? In particular, it would mean that associating a store with data rather than with a Foswiki instance would be beyond the scope of this proposal.

Moving beyond orthogonality:

Regarding modifications to $Foswiki::cfg, I would like to note that the part of PlainFile::new() that modifies $Foswiki::cfg is not part of this proposal. That code appears to provide backwards-compatibility when using the PlainFile store with older versions of Foswiki.

Would a configure Checker be the right way to make sure that $Foswiki::cfg{ImplementationConfig} is set to something sensible? I think that is what Crawford suggested...

What happens to $Foswiki::cfg{PubDir} and $Foswiki::cfg{DataDir} (and all of the code that uses them, including non-core plugins) if those properties move into $Foswiki::cfg{Store} ? Should that information move into $Foswiki::cfg{Store} or should the same information be encoded in more than one place in $Foswiki::cfg ? If they don't move there, then what do $Foswiki::cfg{PubDir} and $Foswiki::cfg{DataDir} mean if not using a file-based store? As an example, $Foswiki::cfg{MimeTypesFileName} and $Foswiki::cfg{Htpasswd}{FileName} both default to files beneath $Foswiki::cfg{DataDir} and both of those (as far as I can see) should be associated with the Foswiki instance and not with specific data and therefore should not be associated with the store(s). I do think these questions fall within the scope of this proposal and I'm sorry they are somewhat nebulous.

FWIW, I suggest $Foswiki::cfg{PubDir} and $Foswiki::cfg{DataDir} should be moved somewhere beneath $Foswiki::cfg{Store} such that they only exist for file-based stores. Perhaps Foswiki::Configure::Load::readConfig() should remap them much like it does for other deprecated configuration items. I am not sure what to do with $Foswiki::cfg{MimeTypesFileName} and $Foswiki::cfg{Htpasswd}{FileName} - perhaps move them to the same location as LocalSite.cfg.
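A remapping step of that kind might look like the sketch below. It is purely illustrative: %cfg stands in for $Foswiki::cfg, the remapLegacyKeys name and the target locations under {Store}{PlainFile} are assumptions, and this is not how Foswiki::Configure::Load::readConfig() is actually written.

```perl
use strict;
use warnings;

# Hypothetical stand-in for $Foswiki::cfg as loaded from an older config file.
my %cfg = (
    DataDir => '/foswiki/data',
    PubDir  => '/foswiki/pub',
);

# Old top-level key => new location. The target paths are assumed.
my %remap = (
    DataDir => [ 'Store', 'PlainFile', 'data' ],
    PubDir  => [ 'Store', 'PlainFile', 'pub' ],
);

sub remapLegacyKeys {
    my ($cfg) = @_;
    for my $old ( sort keys %remap ) {
        next unless exists $cfg->{$old};
        my @path = @{ $remap{$old} };
        my $leaf = pop @path;

        # Walk down to the target hash, creating levels as needed.
        my $node = $cfg;
        $node = $node->{$_} //= {} for @path;

        # An explicitly configured new-style value wins over the legacy one.
        $node->{$leaf} //= $cfg->{$old};
    }
    return $cfg;
}

remapLegacyKeys( \%cfg );
print "$cfg{Store}{PlainFile}{data} $cfg{Store}{PlainFile}{pub}\n";
```

The legacy keys are left in place here so that existing plugins keep working, which does mean the same information is held in two places.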

-- MichaelTempest - 02 Nov 2013

This captures some of the work I did in this area: https://github.com/Jlevens/StoreToolsContrib

I also changed Foswiki.pm to do this:

    # construct the store object
    $this->{store} = Foswiki::Store::newConfig();

With this sub added to Foswiki::Store (this was in Nov 2013, it's now July 2015 and the Foswiki core has changed quite a bit, so beware):
# JulianLevens: I considered possibility of allowing a Store to be defined in the hierarchy twice with different
# config params via:
#   my $module = $cfg->{$class}->{module} || $class;
# and then using $module instead of $class. However, while the config is a hierarchy of Stores ultimately you
# instantiate only one store. Therefore there is one $this and no way to disambiguate between the two different
# configs of these stores. (There might be a sophisticated and complicated and maintenance nightmare way of doing it, but ...)
#
# In a running FW a NullStore could be placed between stores to block operations.
#     PlainFile, NullStore, Versatile      because if Versatile by default passes on topic saves then NullStore stops this, PlainFile just handles attachments
#
# During change_store for the source Config
#     PlainFile, NullStore                 this NullStore would effectively convert PlainFile to a topic only store, just change topics to new store

sub newConfig {
    my ($cfgName) = @_;
    use Class::Load qw/try_load_class/;

    # print "newConfig=$cfgName\n";
    $cfgName = ($Foswiki::cfg{Store}{ConfigName} || 'Default') if !$cfgName || !$Foswiki::cfg{StoreConfig}{$cfgName};
    print STDERR "newConfig=$cfgName\n";
    
    my $cfg = $Foswiki::cfg{StoreConfig}{$cfgName};
    my @classes =
        sort { ($cfg->{$a}{order} // 0) <=> ($cfg->{$b}{order} // 0) }
        keys( %{ $cfg } );

    if(scalar @classes == 0) {
        ASSERT( undef, "no Store config/module found" ) if DEBUG;
    }

    my $base = 'Foswiki::Store';
    my $storeCfg = {};
    # this allows us to add an arbitrary set of mixins for things
    # like recordChanges
    foreach my $class (@classes) {
        next if substr($class,0,1) eq '_';

        # Copy all config values from $Foswiki::cfg copying and overwriting:
        #    1 {Store} keys as Foswiki defaults
        #    2 {$cfgName}{_defaults} as defaults for this particular Store instantiation
        #    3 {$cfgName}{$class} as the particular values for this Store class within the set of classes of the config
        #
        # Therefore in general, for any StoreClass: $this->{cfg}{StoreClass}{dirPermission} should be used and the relevant value will be set
        # note that StoreClass is a literal, the StoreClass knows its own name so we have: PlainFile, Versatile, RcsLite etc
        #
        # It can also be valid to refer to $this->{cfg}{_defaults} where a config level item is appropriate rather than store level
        # e.g. $this->{cfg}{_defaults}{QueryAlgorithm} can be set differently per store Config. The default sub query in this file
        # uses this, indeed it cannot refer to a specific Store.
        #
        # This does not exclude a QueryAlgorithm per Store, but in that case a Store will need to implement its own 'sub query'.
        
        @{ $storeCfg->{$class} }{ keys %{ $Foswiki::cfg{Store}} } = values %{ $Foswiki::cfg{Store} };
        @{ $storeCfg->{$class} }{ keys %{ $Foswiki::cfg{Store}{$class} } } = values %{ $Foswiki::cfg{Store}{$class} };
        @{ $storeCfg->{$class} }{ keys %{ $cfg->{_defaults}} } = values %{ $cfg->{_defaults} };
        @{ $storeCfg->{$class} }{ keys %{ $cfg->{$class}} } = values %{ $cfg->{$class} };

        $storeCfg->{$class}{NameFilter} = $Foswiki::cfg{NameFilter};
        $storeCfg->{$class}{WebPrefsTopicName} = $Foswiki::cfg{WebPrefsTopicName};
        $storeCfg->{$class}{CharSet} = $Foswiki::cfg{Site}{CharSet}; # SMELL more thinking required
        # $storeCfg->{$class}{AccessControl} ??? For tools usage only ???
        # $storeCfg->{$class}{PrefsBackEnd} ??? for tools usage only ???
        # $storeCfg->{$class}{RCS} = $Foswiki::cfg{RCS} # already superseded by {Store}{RCS} (configure required to run so maybe not enough coverage)
        # $storeCfg->{$class}{UsersWebName} = $Foswiki::cfg{UsersWebName}; # Must be the same across all stores

        @{ $storeCfg->{_defaults} }{ keys %{ $Foswiki::cfg{Store} } } = values %{ $Foswiki::cfg{Store} };
        @{ $storeCfg->{_defaults} }{ keys %{ $cfg->{_defaults} } } = values %{ $cfg->{_defaults} };
        
        $storeCfg->{_name} = $cfgName;

        my $module = $storeCfg->{$class}{module} // "Foswiki::Store::$class";
        my ( $ok, $error ) = try_load_class($module);
        ASSERT( $ok, $error ) if DEBUG;
        if ($ok) {
            no strict 'refs';
            @{ $module . '::ISA' } = ($base);
            use strict 'refs';
            $base = $module;
        }
        else {
            print "<<$error>>\n";
            #just ignore it and move on to the next class..
            #Foswiki::Func::Log(...)
        }
    }
    ASSERT( $base, "no Store config object created" ) if DEBUG;

    return $base->new($storeCfg);
}

A Store would be initialised with $this->{cfg} and therefore all code inside a store like $Foswiki::cfg{Foo}{Bar} would be replaced by $this->{cfg}{Foo}{Bar}.
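The layering that newConfig performs with hash slices (global {Store} values first, then per-config _defaults, then per-class values, each overwriting the previous) can be illustrated with a small stand-alone sketch. The mergeLayers helper and the example values are hypothetical and are not part of the code above.

```perl
use strict;
use warnings;

# Merge config layers left to right; later layers override earlier ones.
sub mergeLayers {
    my @layers = @_;
    my %merged;
    for my $layer (@layers) {
        # Hash-slice assignment, as used in newConfig.
        @merged{ keys %$layer } = values %$layer;
    }
    return \%merged;
}

# Illustrative values only.
my %global   = ( dirPermission  => 0755, filePermission => 0644 );
my %defaults = ( filePermission => 0600 );
my %class    = ( data           => '/foswiki/testdata' );

my $storeCfg = mergeLayers( \%global, \%defaults, \%class );

printf "%o %o %s\n",
  $storeCfg->{dirPermission},
  $storeCfg->{filePermission},
  $storeCfg->{data};
```

This is a shallow merge: a nested hash in a later layer replaces the whole nested hash from an earlier layer instead of being merged key by key.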

This worked well but this proposal has not been approved. I need to convert the StoreToolsContrib to copy all relevant values from one Foswiki::cfg to another between calls to store functions. It will work but will be seriously inelegant and I suspect a greater maintenance issue. OTOH, one step at a time.

-- JulianLevens - 11 Jul 2015

The commit date was reset to today to restart the clock. The last objection was mainly due to timing in the release. The developer has left the project, so the objection was removed.

-- GeorgeClark - 07 Mar 2016

I've parked this for two reasons:
  1. It needs to be extended to cover how config is handled as a hash throughout Foswiki, not just in Stores
  2. It keeps appearing as discussion points in Moo proposals
    • The management of ::cfg (or equivalent) may even need to be broken into separate hashes (or some such)

-- JulianLevens - 09 Mar 2016 - 13:40

 
Topic revision: r16 - 09 Mar 2016, JulianLevens
The copyright of the content on this website is held by the contributing authors, except where stated elsewhere.