Tech Meeting Minutes 20111027
ScotGrid Tech Meeting 27 October 2011
Mark - chair
Sam - minutes
APEL publishing has been fixed. (It was claiming to be publishing to the accounting portal while actually not doing anything of the sort.)
Three tickets. Ticket 75671: missing library at ECDF. Andy: I think this is them, not us. The ticket is on hold while they investigate. They need to fix their path to call the right gcc.
The other tickets are the spacetoken changes for ATLAS and the CVMFS tracking ticket.
Andy: will be meeting with the ECDF technical team today. Things will move along with CVMFS then, so hopefully we'll soon be in a testing phase.
Everything else seems to be going well. Looking forward to the end of the accounting period.
Mark notes that corrective actions needing downtime should probably be performed in November: after the accounting period, and before the stress of the new procurement.
Mark is going to raise the metrics issue at the Ops Team meeting. We'll agree internally what needs to be done for the Quarterly reports (especially for shared sites, where there is disagreement over how to represent the site).
One ticket: the enmr.eu tag modification issue.
David: the problem with this is that what Christophe is trying to do is an lcg-tags query, which only works intermittently. There doesn't seem to be any discernible pattern to when it succeeds and when it fails. We now have some longer-term test results from him.
David: the other main thing for us was a "weird event" on Tuesday, where we started failing a chunk of jobs and also experienced a kernel panic on the software area NFS server (which also breaks ATLAS jobs, as they need to read their customisation scripts from it, despite CVMFS). Not sure of the causal flow here.
We've also begun a process of rolling upgrades to the front-end services to bring them up to recent releases.
Mark: we got access to master at Durham, which enabled Stuart to finally crack the weirdness on the Durham worker nodes: a security vulnerability in a kernel which is not actually used at Durham had been "mitigated" in a way that broke 32-bit code! Mike, meanwhile, has managed to fix ce01. Durham is now mostly working, and will get some test jobs from ATLAS soon.
While Mike is at Durham on his own, we'll be providing support for the Grid systems remotely from Glasgow.
Mark reminded people of the existence of the ScotGrid Google Calendar, which we should start using more. This will be particularly important for coordinating support for Mike.
We're also reminded to submit any CHEP abstracts that aren't up yet.
Disaster planning: last year, the entirety of Scotland broke due to snow. We should probably look at contingency plans for this year, just in case of similar issues.
The second annual ScotGrid F2F meeting is tentatively scheduled for some time in February, in Glasgow this time.
Chat log:
[10:58:00] Mark Mitchell Service will start at 11:02 Please standby
[11:01:19] Andrew Washbrook ok
[11:02:40] Sam Skipsey That's 11:02 Russian East Coast time.
[11:04:49] Sam Skipsey Warning, "facts" presented in this meeting may be subject to "inaccuracies".
[11:05:32] David Crooks Russian East Coast time is, apparently, already 10:03 pm
[11:05:41] David Crooks (in Anadyr at least)
[11:05:45] Sam Skipsey I'm astonished it even exists!
[11:06:09] David Crooks It's "Magadan Time" apparently
[11:06:26] Sam Skipsey Everyone knows Russia is just a cunning propaganda exercise used to explain away the giant hole in the Earth between the Ukraine and India.
[11:17:41] Wahid Bhimji joined
[11:18:01] Wahid Bhimji Sorry! I forgot to come
[11:20:41] Mark Mitchell no Gold star today Wahid
[11:25:15] Andrew Washbrook sorry the postman has arrived - i have to pick a parcel!
[11:25:21] Andrew Washbrook will be back in 60secs
[11:26:13] Andrew Washbrook i am back
[11:28:08] Wahid Bhimji ok