Item8803: http://translate.foswiki.org/ is down

Priority: Normal
Current State: Closed
Released In: n/a
Target Release: n/a
Applies To: Web Site
Component: Pootle
Branches:
Reported By: RaulFRodriguez
Waiting For: AndreUlrich, Main.KwangErnLiew, Main.OlivierRaginel
Last Change By: OlivierRaginel
This has been down since 27th February.

When this happened in the past, I used to bug AndreUlrich, but he has now told me that the installation runs on the official Foswiki.org server.

Andre directed me to OliverKrueger, but this is still down. I do not know why.

Since the page is advertised in several places on foswiki.org, it is not good for it to return an error to visitors who follow those links.

Also, the purpose of this service is not met, since the translation platform is not available.

-- RaulFRodriguez - 30 Mar 2010

It's running. Not sure what happened, as all the services were up and running. It needed a restart kick for some odd reason.

-- KwangErnLiew - 31 Mar 2010

I experienced the same problem, but I thought it was my fault. The solution was monit:

Pootle is started automatically via "/etc/init.d/monit start" during boot-up.

Monit regularly tests whether Pootle is still running (it checks that port 8080 is available). We set an interval of 23 seconds in /etc/default/monit:

set logfile syslog facility log_daemon

check host git.localhost with address 127.0.0.1
        start program = "/etc/init.d/pootle start"
        stop program = "/etc/init.d/pootle stop"
        if failed port 8080 proto http
                then restart
        if 12 restarts within 15 cycles then timeout 

Also check that "/var/lib/pootle/" and "/etc/pootle" belong to "pootle:pootle".

-- AndreUlrich - 31 Mar 2010
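
The ownership check mentioned at the end of the comment above could be sketched roughly as follows. This is only an illustration: the function name list_wrong_owner is made up here, and the paths and the "pootle:pootle" owner are simply the ones quoted above.

```shell
#!/bin/sh
# List everything under DIR that is NOT owned by the expected user and group.
# No output means the ownership is consistent.
# Usage: list_wrong_owner DIR USER GROUP   (hypothetical helper name)
list_wrong_owner() {
  find "$1" \( ! -user "$2" -o ! -group "$3" \) -print
}

# Example, using the paths and owner quoted in the comment above:
#   list_wrong_owner /var/lib/pootle pootle pootle
#   list_wrong_owner /etc/pootle pootle pootle
```

Any path printed by the function would be a candidate for a chown before restarting Pootle.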

Thanks Andre. I'll try to monitor this and see what's wrong.

I don't quite want to install monit for just one service TBH. Will whip out a simple script when I understand what's going on.

So far the service is still up and running.

-- KwangErnLiew - 31 Mar 2010
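
A "simple script" of the kind mentioned above might look roughly like this sketch. It is not what was actually deployed: the function name watchdog and the log file path are assumptions, while the port and the init script are the ones from the monit configuration quoted earlier. Logging a timestamp on every failure is meant to help with finding the root cause rather than only restarting.

```shell
#!/bin/sh
# Run CHECK_CMD; if it fails, log a timestamped line and run RESTART_CMD.
# Usage: watchdog LOGFILE CHECK_CMD RESTART_CMD   (hypothetical helper name)
watchdog() {
  logfile=$1
  check=$2
  restart=$3
  if sh -c "$check" >/dev/null 2>&1; then
    return 0                      # service answered, nothing to do
  fi
  echo "$(date '+%Y-%m-%d %H:%M:%S') check failed, restarting" >> "$logfile"
  sh -c "$restart"
}

# Example, run from cron every few minutes (the log path is an assumption):
#   watchdog /var/log/pootle-watchdog.log \
#     "curl -fs --max-time 20 http://127.0.0.1:8080/" \
#     "/etc/init.d/pootle stop; /etc/init.d/pootle start"
```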

I have set up a cron job on my server that runs a bash script every day, checking the translate.foswiki.org site for new strings to translate or validate. It uses sed and curl to check the status, and sends the result to me via XMPP. So I also get notified if the script fails because the server is down.

Here is a reduced version of my script that only checks whether a Pootle user can log in successfully, and otherwise sends a notice to a designated address:

#!/usr/local/bin/bash

# Pootle account used for the login test, and where to send alerts
TRANSLATOR="RaulFRodriguez"
PASSWORD="mypassword"
LANGUAGE="fr"
XMPPADDRESS="my_xmpp_address@example.com"

# Absolute paths to the commands (FreeBSD locations)
CURL="/usr/local/bin/curl"
SED="/usr/bin/sed"
ECHO="/bin/echo"
SENDXMPP="/usr/local/bin/sendxmpp"

COOKIES="/tmp/$TRANSLATOR-cookies.txt"

# Log in and store the session cookie
POST="username=$TRANSLATOR&password=$PASSWORD&language=$LANGUAGE&islogin=true"
$CURL -s -c "$COOKIES" "http://translate.foswiki.org/login.html" --data "$POST"

# Fetch a project page using the stored session cookie
STATUSPAGE=`$CURL -s -b "$COOKIES" "http://translate.foswiki.org/$LANGUAGE/foswiki/index.html?editing=1"`

# $STATUSPAGE is left unquoted so that the page collapses into a single line;
# sed then replaces that whole line if it contains the user name
ISLOGGED=`$ECHO $STATUSPAGE | $SED -e "s/^.*$TRANSLATOR.*$/logged in/"`
if [ "$ISLOGGED" == "logged in" ]
then
  $ECHO "$ISLOGGED"
  rm "$COOKIES"
else
  $ECHO "Error cannot login"
  $ECHO "Error, could not login as $TRANSLATOR using http://translate.foswiki.org/login.html" | $SENDXMPP -s "Pootle problem" "$XMPPADDRESS"
fi

If the server runs on FreeBSD (as I think it does), the command paths should be fine; otherwise change them to the actual ones. If you do not want to install and use the sendxmpp client (it is fairly simple and convenient if you have a Jabber or Gmail address, and on FreeBSD you can install it from the ports at /usr/ports/net-im/sendxmpp), you can easily adapt the script to send the notice in some other way, or to restart the Pootle service without further review.

Otherwise, I will bug you when this happens again (it looks like a common problem with Pootle, or at least with the way Pootle is used here), since my script will be bugging me every day ;-)

-- RaulFRodriguez - 31 Mar 2010

The server is down again. Please restart it.

-- RaulFRodriguez - 02 May 2010

Right. In pootle.prefs, the preference for the Turkish translation wasn't indented correctly. This is now fixed.

2010-05-02 06:06:21: Listening on port 8080
2010-05-02 06:06:21: To use the server, open a web browser at http://musmo.foswiki.org:8080/
2010-05-02 06:06:21: Traceback (most recent call last):
  File "/var/lib/python-support/python2.5/jToolkit/web/simplewebserver.py", line 613, in getserver
    server = getserverwithprefs(options.prefsfile, options.instance, httpd)
  File "/var/lib/python-support/python2.5/jToolkit/web/__init__.py", line 160, in getserverwithprefs
    serverprefs.parsefile(prefsfile)
  File "/var/lib/python-support/python2.5/jToolkit/prefs.py", line 428, in parsefile
    self.parse(contents)
  File "/var/lib/python-support/python2.5/jToolkit/prefs.py", line 419, in parse
    self.parseassignments()
  File "/var/lib/python-support/python2.5/jToolkit/prefs.py", line 356, in parseassignments
    self.raiseerror("indent without preceding :", tokennum)
  File "/var/lib/python-support/python2.5/jToolkit/sparse.py", line 225, in raiseerror
    raise ParserError(self, message, tokennum)
ParserError: indent without preceding : at line 602, char 10 (token indent(3))

2010-05-02 06:06:21: Error initializing server, exiting: indent without preceding : at line 602, char 10 (token indent(3))

-- KwangErnLiew - 02 May 2010
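
The "indent without preceding :" error above suggests that, in pootle.prefs, a deeper indentation level is only allowed after a line ending in a colon. Under that assumption (and assuming "#" starts a comment line), a rough pre-flight check before restarting Pootle could look like the sketch below. The function name check_prefs_indent is made up, and this is a heuristic, not the jToolkit parser.

```shell
#!/bin/sh
# Report lines whose indentation increases although the previous
# non-blank, non-comment line does not end with ":".
# Usage: check_prefs_indent FILE   (hypothetical helper name)
check_prefs_indent() {
  awk '
    /^[ \t]*$/ { next }                  # skip blank lines
    /^[ \t]*#/ { next }                  # skip comment lines (assumed syntax)
    {
      rest = $0
      sub(/^[ \t]+/, "", rest)           # strip leading whitespace
      indent = length($0) - length(rest) # width of the indentation
      if (seen && indent > prev_indent && prev !~ /:[ \t]*$/)
        printf "line %d: indent without preceding colon\n", NR
      prev = $0
      prev_indent = indent
      seen = 1
    }
  ' "$1"
}

# Example (path is an assumption based on the directories named in this task):
#   check_prefs_indent /etc/pootle/pootle.prefs
```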

Hello, I am setting this task back to "New" since the "translate" server is down again.

It has been down since at least 14 May 2010, 02:02am (when my daily cron job started reporting failures to my XMPP address). So it looks like I am the only one monitoring the state of this server with my remote script :-(

The quick fix is to reboot the server, as I understand it. Please do so.

But it needs either to be fixed for good, or to be monitored by the people in charge of its maintenance, so that you don't rely only on me to tell you when something needs to be done (more than 3 days after the server became inaccessible, in this case).

Thanks,

-- RaulFRodriguez - 17 May 2010

Restarted. Will see if it makes sense configuring some monit or nagios on musmo, but this should be discussed with Kwang as it's his machine.

Ok, read the full task, and it seems Kwang is not very keen on having monit for just one service. We could also monitor the bot :-)

-- OlivierRaginel - 17 May 2010

Sorry, was on a business trip. Will be back to base tomorrow.

I'd prefer to dig down to the root cause of why the service hangs/stops unexpectedly rather than run a service that restarts the service. Does anyone even know? Some time ago, I checked for bug reports but nothing came up.

Thanks Olivier for the heads up. :-)

-- KwangErnLiew - 26 May 2010

Kwang, or anybody in InfrastructureTaskTeamGroup, can you please restart the Pootle service? Trying to access the pages of http://translate.foswiki.org again gives the error: "The page you are looking for is temporarily unavailable. Please try again later."

My daily cron job notified me that this has been down since at least today at 02:02am. Thanks.

-- RaulFRodriguez - 17 Jun 2010

Shortly after I reported the problem yesterday, the service was running again, but only for a few hours. It appears to be down again.

-- RaulFRodriguez - 18 Jun 2010

During my vacation period, the Pootle server has had downtime from:

  • 19 Jul 2010 to 31 Jul 2010
  • 02 Aug 2010 to 10 Aug 2010

Also:

  • 20 Aug 2010 to 21 Aug 2010

It looks like, when I'm not around, it takes the people managing the server some time to notice that something is wrong with Pootle.

Since a new version of Pootle is now running, I hope this problem with downtime is a thing of the past. Thanks for that :-)

I need to adapt my monitoring script to the new Pootle interface. Hopefully, if I'm around to receive the downtime notices, I will be able to inform the InfrastructureTaskTeamGroup as I used to.

For the moment, I am closing this task :-)

-- RaulFRodriguez - 22 Sep 2010

The Pootle server translate.foswiki.org has been down for 2 days.

Please restart it.

-- RaulFRodriguez - 11 Oct 2010

Hello Raul,

I can't help with the pootle server, but it might be better to specify individuals from InfrastructureTaskTeam instead of 'InfrastructureTaskTeamGroup' in this task.

That way, somebody will get an E-mail (I don't think our tasks system knows how to send E-mails to 'InfrastructureTaskTeamGroup')

-- PaulHarvey - 11 Oct 2010

Added the names of the people I think may know how to do this. Guys, could you document it for the rest of the infra team, please?

-- CrawfordCurrie - 12 Oct 2010

It did notify me, and it was rectified ASAP.

-- KwangErnLiew - 12 Oct 2010

This time the machine was down, so only Kwang could do something, and we notified him as soon as we knew (ok, only bugged him on IRC, next time maybe some mail?)

Anyway, we should implement some monitoring... I'll look into this.

-- OlivierRaginel - 12 Oct 2010

Thanks guys and Kwang :-)

Olivier, since my bash script (similar to the one I proposed in my comment of 31 Mar 2010 above) is now adapted to the new Pootle interface, I can provide it to you if you would find it helpful.

Just let me know. Otherwise, I am sure you will come up with a nice and more efficient way to do this!

-- RaulFRodriguez - 12 Oct 2010

For the moment I configured monit to do the same, but maybe post your bash script somewhere like here so I can have a look at what it checks.

Here is what I configured in monit:

# Check a remote host availability by issuing a ping test and check the 
# content of a response from a web server. Up to three pings are sent and 
# connection to a port and an application level network check is performed.
#
check host translate.foswiki.org with address 82.94.190.217
  if failed icmp type echo count 3 with timeout 3 seconds then alert
  if failed url http://translate.foswiki.org/projects/foswiki/
     and content == 'Interface translations for Foswiki.'
     then alert

And alerts are directed to root, which should be redirected to the sysadmin list.

-- OlivierRaginel - 29 Oct 2010
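
For comparison, the content test in the monit configuration above can be approximated outside monit with curl and grep. This is only a sketch of the same idea, not a replacement for monit: the function names contains_marker and check_page are made up, and the URL and marker string are the ones from the configuration above.

```shell
#!/bin/sh
# Succeed if the text on stdin contains the literal string given as $1.
contains_marker() {
  grep -F -- "$1" >/dev/null
}

# Fetch URL ($1) and succeed only if the body contains MARKER ($2).
# -f makes curl fail on HTTP errors; --max-time bounds a hanging server.
check_page() {
  curl -fsS --max-time 30 "$1" | contains_marker "$2"
}

# Example (the mail command and recipient are assumptions):
#   check_page "http://translate.foswiki.org/projects/foswiki/" \
#     "Interface translations for Foswiki." \
#     || echo "Pootle content check failed" | mail -s "Pootle problem" root
```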

Olivier, probably your test that http://translate.foswiki.org/projects/foswiki/ contains a string 'Interface translations for Foswiki.' will be sufficient.

My bash script is meant to log in as a Pootle user and check whether there are suggested strings pending validation; if there are, it sends a notice to my XMPP (Jabber) address. When the script cannot log in, it sends a notice complaining about that too.

Here is a reduced version of the script that only checks whether logging in is possible, and complains if not (you can change "$SENDXMPP" to the location of your "mail" command, if you prefer to send an e-mail, and replace $XMPPADDRESS with the root e-mail address).

#!/usr/local/bin/bash

# Pootle account used for the login test, and where to send alerts
TRANSLATOR="RaulFRodriguez"
PASSWORD="mypassword"
LANGUAGE="fr"
XMPPADDRESS="my_xmpp_address@example.com"

# Absolute paths to the commands (FreeBSD locations)
CURL="/usr/local/bin/curl"
SED="/usr/bin/sed"
ECHO="/bin/echo"
SENDXMPP="/usr/local/bin/sendxmpp"

COOKIES="/tmp/$TRANSLATOR-cookies.txt"

# Fetch the login form and extract its CSRF token
# ($LOGINPAGE is left unquoted so that the page collapses into a single line for sed)
LOGINPAGE=`$CURL -s -c "$COOKIES" "http://translate.foswiki.org/accounts/login/"`
CSRFTOKEN=`$ECHO $LOGINPAGE | $SED -e "s/^.*name='csrfmiddlewaretoken' value='//" -e "s/' \/>.*$//"`

# Log in, sending back the token and the cookie received with the form
POST="csrfmiddlewaretoken=$CSRFTOKEN&username=$TRANSLATOR&password=$PASSWORD&language=&login=Connexion&next="
$CURL -s -b "$COOKIES" -c "$COOKIES" "http://translate.foswiki.org/accounts/login/" --data "$POST"

# Fetch a page using the stored session cookie
STATUSPAGE=`$CURL -s -b "$COOKIES" "http://translate.foswiki.org/$LANGUAGE/foswiki/review.html"`

# The page only offers "Log Out" when the login succeeded
ISLOGGED=`$ECHO $STATUSPAGE | $SED -e "s/^.*Log Out.*$/logged in/"`
if [ "$ISLOGGED" == "logged in" ]
then
  $ECHO "$ISLOGGED"
  rm "$COOKIES"
  exit
else
  $ECHO "Error cannot login"
  $ECHO "Error, could not login as $TRANSLATOR. Login there: http://translate.foswiki.org/accounts/login/" | $SENDXMPP -s "Pootle problem" "$XMPPADDRESS"
fi

You may need to adapt the paths to the commands. These are for a FreeBSD box.

-- RaulFRodriguez - 29 Oct 2010

Good idea Raul. What I suggest is that the next time your script alerts you, you let me know, and I'll check whether monit alerted / did something. Also, I've installed this monit on the foswiki.org server, thus on a FreeBSD box too. I saw that AndreUlrich installed a monit instance on musmo too, but I wanted to have something external.

So, I'll close this task, and Raul, next time it fails, please send me a mail with the timestamp of your alert, and I'll check the logs.

Thanks,

-- OlivierRaginel - 07 Nov 2010

I messaged Olivier, but I keep receiving notices every day from my script complaining that translations cannot be checked since Nov. 10, 2010 02:00:03 GMT +1. Maybe Olivier is a bit busy currently.

1.

On Nov. 10, 2010 the page http://translate.foswiki.org/accounts/login/ was accessible, as was the page http://translate.foswiki.org/projects/foswiki/, which indeed contained 'Interface translations for Foswiki.'

But logging in to Pootle was impossible and returned:

"Server error

An error has occurred. Please wait a moment."

(it appeared in French, so that is a rough translation of "Une erreur s'est produite. Veuillez patienter.")

2.

Today accessing http://translate.foswiki.org/accounts/login/ or http://translate.foswiki.org/projects/foswiki/ results in a timeout and browser message saying the site may be temporarily unavailable. Same for http://translate.foswiki.org/

I am reopening this Item8803.

-- RaulFRodriguez - 12 Nov 2010

Some time on 13 Nov 2010 someone must have dealt with the server by restarting Pootle or something, since I don't get any errors after 13 Nov.

I am leaving the task open, since it looks like the type of check done with monit did not catch the problem that affected Pootle from Nov. 10, 2010.

-- RaulFRodriguez - 17 Nov 2010
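
The two failure modes described above (pages reachable but the application answering "Server error", versus the whole site timing out) could be told apart by looking at the HTTP status code rather than only at the page content. The sketch below assumes that Pootle's "Server error" page was delivered with a 5xx status, which this task does not confirm; the function names are made up.

```shell
#!/bin/sh
# Map an HTTP status code to a coarse state.
classify_status() {
  case "$1" in
    000)     echo "unreachable" ;;   # curl reports 000 when no HTTP response arrived
    2??|3??) echo "ok" ;;
    5??)     echo "server-error" ;;  # the application answered, but with an error
    *)       echo "unexpected" ;;
  esac
}

# Print only the HTTP status code for URL ($1).
http_status() {
  curl -s -o /dev/null --max-time 30 -w '%{http_code}' "$1"
}

# Example:
#   classify_status "$(http_status http://translate.foswiki.org/accounts/login/)"
```

A plain GET of the login page would still not have caught the 10 Nov case, where only the login itself failed, so a check of this kind complements a login test rather than replacing it.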

Yeah, Kwang managed to get a hold of the people in charge of the datacenter where the machine hosting pootle is located, and have them reboot it (machine was down).

I will try to do something more advanced with monit, and close the task when done.

-- OlivierRaginel - 17 Nov 2010

Before anyone reports again, I am aware that it is down (again). Seems that a HDD crashed and everything is down...even when it's running on RAID10. Great. :/

Update: Right, so not everything got saved. Gotta restore from backup. Doing it now. :-)

22/11/2010 Update: It's all back up and running. :-)

-- KwangErnLiew - 20 Nov 2010

I just still need to upgrade monit's checking with something more advanced, and to test it as monit didn't even report it down, afair.

-- OlivierRaginel - 18 Dec 2010

Just wanted to report that today at 02:00:11am CET, my script reported that it could not login at http://translate.foswiki.org/accounts/login/.

Now the server seems to be up, and login succeeds too. So I guess that monit did its job, whatever the problem was (unless someone took care of it manually).

-- RaulFRodriguez - 11 Jan 2011

Kwang upgraded the machine, as he wrote to some ML. The upgrade went bad, but he managed to fix it this morning, so now everything is back to normal.

Monit hasn't done anything this time :-) And I still have to enhance its logic, hence I'm keeping this open.

-- OlivierRaginel - 11 Jan 2011

Just wanted to make sure that someone was aware that the http://translate.foswiki.org/ server is down with a nasty Python error.

Since we are working on finalising the translations for the coming release, it would be good to have this up as soon as possible :-)

This has been down at least since 02:00am GMT+2

-- RaulFRodriguez - 01 Apr 2011

Olivier, did Monit notice anything about the recent crash?

-- RaulFRodriguez - 08 Apr 2011

No, I don't think monit noticed. Anyway, Kwang expressed his wish to have pootle removed from his server, after the crash you're talking about (which wasn't due to pootle, but got fixed by gmc afaik).

So, I've set up a test installation on gmc's foswiki.org server and I will migrate after 1.1.3 is released and fully translated. I will then install proper monitoring on the server.

-- OlivierRaginel - 08 Apr 2011

Having trouble migrating as I'm not sure where the DB is stored, and not sure if we shouldn't just trash it all and copy the users.

Andre, what do you think? Is there anything interesting in the old installation apart from translation and users?

-- OlivierRaginel - 14 May 2011

Hm, maybe the configuration of which user is able to edit which language would get lost too.

-- AndreUlrich - 21 May 2011

FYI, the following directories are frequently backed-up...

/root/ /etc/mysql/ /etc/pootle/ /etc/apache2/ /etc/cron.d/ /var/spool/cron/ /home/ /usr/share/pootle/ /var/lib/pootle/

Restoring these files to their place works and is quick and clean.

-- KwangErnLiew - 21 May 2011
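
How those directories are actually backed up is not stated in this task. A minimal sketch with tar, assuming a single compressed archive is acceptable, could look like this (the function name backup_dirs and the archive path are made up; the directory list is the one given above). Note that this covers only files, so the contents of a database would need a separate dump.

```shell
#!/bin/sh
# Archive the given directories into one compressed tarball.
# Usage: backup_dirs ARCHIVE DIR...   (hypothetical helper name)
backup_dirs() {
  archive=$1
  shift
  tar -czf "$archive" "$@"
}

# Example (archive path is an assumption):
#   backup_dirs /root/pootle-backup.tar.gz /etc/pootle /etc/apache2 \
#     /usr/share/pootle /var/lib/pootle
# Restore, as root, with paths relative to /:
#   tar -xzpf /root/pootle-backup.tar.gz -C /
```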

Except we're planning on migrating to a newer pootle on FreeBSD, so I doubt this will work. Pootle uses a MySQL database? But yes, I should start doing some work on this today. Andre, I'll ping you for some testing if you can (nothing urgent).

-- OlivierRaginel - 21 May 2011

As long as the files from those directories are migrated/ported, it should all be fine. We don't want any surprises, right? ;-)

-- KwangErnLiew - 21 May 2011

Is the translate site down? What should I do if I have to make an update of my translation?

-- ChYang - 07 Jun 2011

Yes, the site is down currently. It seems like Kwang's machine crashed again, before I had the chance to migrate it. Either you wait for him to fix this, or you use SVN to directly commit your changes. I'm afraid I have no idea on a timeline though...

-- OlivierRaginel - 07 Jun 2011

I've migrated the data from Kwang's server to foswiki.org, but of course, lucky me, pootle is broken on FreeBSD because it needs an old lucene (2.x) and it has 3.x.

Trying to find a fix...

-- OlivierRaginel - 11 Jun 2011

Oh, I forgot to update that task. So, pootle has been migrated, but I haven't restored the cron jobs to commit stuff. I will do it right away.

It's now using MySQL as a database backend, and memcached. I haven't configured monit to monitor it yet, but I do not expect it to go down, as the main website never does.

I'll close the task when I've implemented the monitoring.

-- OlivierRaginel - 18 Jun 2011

The "translate" site has been down for the last 3 days according to my bot that alerts me of new translation strings.

The homepage of http://translate.foswiki.org/ shows a Python error trace complaining that it is not able to "extract file(s) to egg cache".

Just wanted to bring this to your attention in case someone hasn't noticed yet :-)

-- RaulFRodriguez - 23 Sep 2011

Ok, I've configured pootle to use WSGI instead of relying on the old way of using mod_proxy and a PootleServer. It shouldn't crash anymore, as long as apache is up.

But I don't remember testing auto-commit and stuff like that. Andre, would be great if you could have a look and let me know. Or I'll try to remember next week...

-- OlivierRaginel - 23 Sep 2011

Ok, as I've migrated the site to the main foswiki.org server, and it seems to be working fine so far, I'll close that one. There are still some minor issues, especially if people commit in SVN, it should get integrated into pootle, but it might need some manual intervention.

Anyway, this is good enough until we migrate to git :-)

Closing...

-- OlivierRaginel - 03 Dec 2011

 
Topic revision: r49 - 03 Dec 2011, OlivierRaginel