Tech Meeting Minutes 20111117
ScotGrid Tech Meeting 17 November 2011
One ticket open: 75542. Spacetoken tracking.
Wahid: Brian asked me to reduce scratchdisk and move the space to datadisk. Since we have more use for localgroupdisk, I was going to put it in localgroupdisk instead. There are politics here, though, since localgroupdisk doesn't count towards the "space for global ATLAS" figures.
We're also not running many jobs at the moment, due to contention from local physics users. CVMFS is progressing with no significant problems. We had a conversation with ECDF about how best to spend networking money, so we have a plan for spending.
76487 - shared area problem at Durham (LHCb), no space left on device.
Mike: it looks like the experiment software share has filled up. (500GB!) Looks like ATLAS have filled up the space, so we're going to add Alessandro De Salvo to the ticket so he can remove ATLAS stuff to let LHCb work ;) (The medium term fix is to move to CVMFS).
Another issue: the worker node fixes that Stuart made seem not to have stuck. It's possible that the master is redoing the "fixes" that broke them in the first place?
75488: authentication problems for compchem.
Durham IT decided to panic about ssh, and block port 22 in their firewall, so Mike was distracted last week trying to beat sense into them.
Dave: We've had an "exciting" week.
Our ticket still open is the enmr one. For some reason, their proxies aren't working for doing lcg_tags things. It looks like an enmr specific problem.
Monday was an interesting experience. Despite our efforts to prevent it, updates to the CREAM CEs pulled in Torque 2.5. (Torque 2.5 is incompatible with 2.4.) In the end, it became easier to move the entire batch system, live, from 2.4 to 2.5, which we did that afternoon, mostly thanks to Stuart typing very rapidly. There are some remaining niggles to do with the var paths used by Torque also changing.
svr023 is currently blanked, and is being worked on with the EMI WMS install. Dependency issues are currently stalling this.
There's also the security issue about BDIIs. We've applied the suggested patch, which seems fine except on our DPM - the BDII can't work with the new permissions.
disk062 is somewhat busy. It's possible it has an issue with one of the raidsets.
- Network spend.
STFC had the money, which is now in the JANET pot, which we can bid against. The money is to buy networking equipment to support research, either in the cluster or connecting to the wider MANs. It may be harder to buy disk than we thought, since JANET are now overseeing the budget.
Mike has not received any documents concerning this. The email went to Frank Crowse? Mark will resend the email to Mike, who really should have been on the list.
Our awards have been set. Both this and the network spend have to happen concurrently.
- ScotGrid New Year Meeting
Provisional date: 9th February (or the 16th). Provisional location: Glasgow.
[11:02:09] David Crooks joined
[11:05:50] Mike Johnson yes, microphone issues as usual
[11:10:02] Wahid Bhimji ok
[11:21:08] Sam Skipsey Alessandro de Salvo
[11:21:30] Wahid Bhimji Alessandro De Salvo
[11:21:54] Wahid Bhimji Alessandro.DeSalvo@roma1.infn.it
[11:32:47] Wahid Bhimji hes the collaboration board memeber
[11:34:59] Wahid Bhimji well this is a chance to buy new hardware
[11:42:29] Wahid Bhimji left
[11:42:31] Mike Johnson left