Tech Meeting Minutes 20110407

From ScotGrid

Present: Graeme, David, Wahid, Andy, Sam, Peter

https://indico.cern.ch/conferenceDisplay.py?confId=133726

Backlink: http://www.scotgrid.ac.uk/wiki/index.php/ScotGrid_Technical_Meetings

No Simpsons trivia question this week: Mark is on holiday.

Site Issues

Durham

  • Peter is "installing the Cream CE"
  • Having some DPM CGI-SOAP errors with biomed and e-nmr. (Might be Grid-France issue?) (The e-nmr problem may be a permissions issue with the tags directory.)

  • The issue of "cross-site login" was brought up again.
  • The issue of published pledges was brought up. It was emphasised that the pledge publication is purely for processing by the brains of project managers and so on.

  • The vagaries of CMS Tier-2 use were discussed.



ECDF

  • Andy asked if we got CMS jobs regularly at Glasgow (we do, but we've never been *full* of CMS jobs). ECDF has had the opposite problem in the past: too many CMS jobs (possibly because ECDF constantly looks as if it has free space).

  • Storage issues were:
    1) A "slight" outage on Monday, due to /var filling up.
    2) Some jobs failing due to lost files. Files were recovered with the ATLAS integrity tool.

  • Wahid also did the "clearing out old DB requests" config for their DPM. (DPM REQCLEAN 3m in /etc/shift.conf)

  • Second CREAM CE is basically up and will be in production fairly soon.
  • Speaking to Orlando about new virtual instances for ECDF. In the short term, for a SQUID, CVMFS, maybe ARGUS.

  • Pseudo-whole-node queue available for ATLAS testing, being investigated. Scheduling latency is very high, and batch system tuning is critical (made worse by pilot jobs).

http://www.gridpp.rl.ac.uk/blog/tag/whole-node-jobs/
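As a follow-up to the /var outage noted above, a minimal sketch of a disk-usage check that could be run from cron on a storage head node. This is an illustration only, not what ECDF actually runs: the function names and the 90% threshold are assumptions made for the example.

```python
import os
import shutil


def usage_fraction(path):
    """Return the used fraction (0.0 to 1.0) of the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        # Some pseudo-filesystems report zero size; treat them as empty.
        return 0.0
    return usage.used / usage.total


def is_nearly_full(path, threshold=0.9):
    """True when usage meets or exceeds `threshold` (0.9 is an assumed default)."""
    return usage_fraction(path) >= threshold


if __name__ == "__main__":
    # Paths to watch; /var is the one that filled up in the incident above.
    for mount in ("/var",):
        if not os.path.exists(mount):
            continue  # skip paths that do not exist on this host
        frac = usage_fraction(mount)
        status = "WARNING" if is_nearly_full(mount) else "ok"
        print(f"{mount}: {frac:.0%} used [{status}]")
```

The threshold would need tuning per host, and a real deployment would more likely hook into the site's existing monitoring than print to stdout.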

Glasgow

  • Sam erroneously claimed we don't have a whole node queue. (We do, but it wasn't usable by anyone…)

  • Mostly things are okay. There was an issue with the WMS (filled sandbox quotas). A power outage to fix the power issue permanently will be happening at some point. The Legendary Final Disk Server will be installed tomorrow (end of an era).

  • svr018 will be moving to 1.8.0 and the disk servers will be moving to SL5 slowly now the LFDS is arriving.

  • We allocated a couple of slots to ATLAS analysis on Graeme's suggestion, to encourage more analysis.

AOB

  • None Reported. No Comments.