Tech Meeting Minutes 20100603

From ScotGrid

Present: Graeme, Sam, Andrew, Wahid

Indico: http://indico.cern.ch/conferenceDisplay.py?confId=96377

Open Problems

  • LHCb upload problems at Glasgow. Sam has discovered a strong correlation between upload failures and network traffic peaks. The network traffic seems to be internal, so why it affects outbound gridftp is still a mystery.
  • Glasgow were hit by very high load on a DPM disk server overnight. This caused two directory race-condition problems and many failed transfers. Worryingly, the problem appears to be strongly correlated with SL5. Sam suspects it may be related to xfs on SL5.

Infrastructure

  • Squid at ECDF: Wahid reports that access to the Lancaster squid is now working.
  • CREAM: Still waiting for extra servers from the systems team. Need to discuss the recommended prologue/epilogue scripts that CREAM on SGE would like the batch system to have. ACTION: Andy to investigate and raise with Orlando.
  • ECDF will have 150TB in StoRM. We aim to get them an official ATLAS datashare. ADC will not send data to sites with less than 25TB (soon to be 40TB) in MC/DATADISK. Graeme's initial suggestion: MCDISK and DATADISK 50TB each; PRODISK 10TB; SCRATCHDISK 20TB; LOCALGROUPDISK 18TB; HOTDISK 2TB.
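
As a quick sanity check on the proposed ECDF split, the arithmetic can be sketched in Python. The figures are those quoted above. The function and variable names are made up for illustration, and applying the ADC minimum to MCDISK and DATADISK individually, rather than to their combined size, is an assumption.

```python
# Proposed ATLAS space-token split for ECDF, in TB (figures from the minutes).
allocation_tb = {
    "MCDISK": 50,
    "DATADISK": 50,
    "PRODISK": 10,
    "SCRATCHDISK": 20,
    "LOCALGROUPDISK": 18,
    "HOTDISK": 2,
}

TOTAL_STORM_TB = 150   # space ECDF will have in StoRM
ADC_MIN_NOW_TB = 25    # current ADC minimum for MC/DATADISK
ADC_MIN_SOON_TB = 40   # minimum expected to apply soon


def check_allocation(alloc, total, minimum):
    """Return (sums_to_total, tokens_below_minimum).

    Assumption: the ADC minimum is applied to MCDISK and DATADISK
    separately, not to their combined size.
    """
    below = [t for t in ("MCDISK", "DATADISK") if alloc[t] < minimum]
    return sum(alloc.values()) == total, below


if __name__ == "__main__":
    for minimum in (ADC_MIN_NOW_TB, ADC_MIN_SOON_TB):
        fits, below = check_allocation(allocation_tb, TOTAL_STORM_TB, minimum)
        print(f"minimum {minimum}TB: total matches={fits}, below minimum={below}")
```

Under these assumptions the six tokens account for the full 150TB, and both MCDISK and DATADISK stay above the 40TB threshold.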

glexec

  • Jose is working on this, but needs to make sure he submits the initial job with Role=pilot.

Procurements

  • Glasgow's cooling-off period is about to end...

AOB