17-10-2009, 06:12 PM   #8
James
RYL Systems Administrator

I am aware of this problem; we've been seeing it intermittently since RYL was moved to new servers earlier this year. In short, the most likely cause seems to be a bug in the PHP framework the web server runs, although there are so many variables that it's hard to be sure of the precise cause, despite hours of diagnostic work on my part. It is also possible that it is caused by a bug in RYL's own code or in another system such as Live Help. Because it occurs so rarely, tracking down the precise cause is proving exceptionally difficult.

Since we serve so many pages per second, if PHP crashes too often within a short space of time, the server triggers something called 'rapid fail protection'. This means that all web requests are immediately denied with a 500 error code for a certain period of time, until the application pools which serve the RYL website recover. This protects the web server from the excessive load caused by failing web requests. It also explains why, on the rare occasions RYL misbehaves, you'll see a lot of 500 errors during that time.
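
For anyone curious how this kind of protection works in principle, here is a minimal Python sketch. It is a toy model under stated assumptions, not the web server's actual implementation: the class name `RapidFailGuard`, the failure threshold, the time window and the cooldown are all hypothetical values chosen purely for illustration.

```python
from collections import deque


class RapidFailGuard:
    """Toy model of rapid-fail protection (hypothetical, not the server's real code)."""

    def __init__(self, max_failures=5, window_seconds=60.0, cooldown_seconds=30.0):
        # All three thresholds are illustrative assumptions.
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self.cooldown_seconds = cooldown_seconds
        self._failures = deque()   # timestamps of recent crashes
        self._tripped_at = None    # time protection kicked in, if it has

    def record_crash(self, now):
        """Note a crash at time `now`; trip if too many fall inside the window."""
        self._failures.append(now)
        # Forget crashes that have fallen outside the window.
        while self._failures and now - self._failures[0] > self.window_seconds:
            self._failures.popleft()
        if len(self._failures) >= self.max_failures:
            self._tripped_at = now

    def status_for_request(self, now):
        """Return the HTTP status a request arriving at time `now` would get."""
        if self._tripped_at is not None:
            if now - self._tripped_at < self.cooldown_seconds:
                return 500  # denied immediately, the application is never invoked
            # Cooldown is over: treat the pool as recovered and start afresh.
            self._tripped_at = None
            self._failures.clear()
        return 200


# Three crashes within ten seconds trip the guard.
guard = RapidFailGuard(max_failures=3, window_seconds=10, cooldown_seconds=30)
for t in (0, 2, 4):
    guard.record_crash(t)
print(guard.status_for_request(5))   # 500
print(guard.status_for_request(40))  # 200
```

The point of failing fast is that a denied request costs almost nothing, whereas a request that reaches a crashing application ties up resources before it fails anyway.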

I am working on upgrading the web server to PHP 5.3.0 this afternoon, which may improve the situation (assuming PHP crashes are in fact the cause). If you notice RYL dropping offline intermittently over the next few hours, don't be concerned; this will simply be a side effect of the work I am undertaking.

As usual, I'll keep you all posted on further developments and continue to monitor the situation closely.

James
