Tech Meeting Minutes 20100720

From ScotGrid

Present: Graeme, Sam, David, Wahid

Indico: http://indico.cern.ch/conferenceDisplay.py?confId=107488

Security

The current kernel vulnerability is of great concern.

  • Glasgow will test the CERN kernel and aim to patch and reboot the WNs, UI and CREAM CE today.
  • ECDF will wait for the RH kernel. The site is closed to user workloads, but will run ATLAS production.
  • Durham: no news yet, but they may have gone into downtime?
  • ACTION: Please keep everyone informed of what's happening.

Availability

  • Not a great quarter for us, but problems seem to be understood and largely dealt with. See
    • [1] http://pprc.qmul.ac.uk/~lloyd/gridpp/plots/SAM_A_3Q10_UKI-SCOTGRID-ECDF.png
    • [2] http://pprc.qmul.ac.uk/~lloyd/gridpp/plots/SAM_A_3Q10_UKI-SCOTGRID-GLASGOW.png
    • [3] http://pprc.qmul.ac.uk/~lloyd/gridpp/plots/SAM_A_3Q10_UKI-SCOTGRID-DURHAM.png

Infrastructure

  • Glasgow have had unexplained authentication errors on their lcg-CEs, which cause Condor to cancel healthy jobs. The cause is not yet understood.
  • ATLAS is making much more use of CREAM.
    • Glasgow plan to move to 2 CREAM + 1 lcg-CE.
  • ECDF ce02 is unhealthy; mw05 is OK. Retire ce02 as an lcg-CE and move to CREAM.
  • Sites need to update to the new glite-APEL service.
    • May require running with glite-MON in parallel during migration.

AOB