Tech Meeting Minutes 20101202
Present: Sam, Mark, David, Peter, Wahid, Andy
Backlink: http://www.scotgrid.ac.uk/wiki/index.php/ScotGrid_Technical_Meetings
Hot Topics
* Glasgow
  o svr026, the new CREAM CE, was reporting memory issues. Memtest was run on the server and no issues were found. Svr026 will be re-entered into production on 02/12/2010.
  o WMS MyProxy ticket (GGUS 63640): reassigned to RAL to investigate hostname issues, as there is a problem there. Status: "Waiting for reply".
  o svr023 matching non-existent CEs (GGUS 63931). David found an old BDII setting for this machine in YAIM. He'll figure out where it gets put and reconfigure it.
  o Svr014 at Glasgow requires its CREAM install to be upgraded once Svr026 is back in full production.
* Durham
  o BDII not yet changed to publish GRIDPP tag. Peter will update this shortly.
Actions and Deployment
* Glasgow
  o glite-APEL now done. David has a few notes to add to the official instructions which he will put in the blog. Now green to go at other sites. ACTION.
  o Disk deployment - not yet done, but in final tests. Half of the Disks 62 - 71 now tested completely.
  o This then frees up enough space to tackle:
    + Migration of SL4 disk servers to SL5
    + Restructure partitions on smaller servers to 10TB, closer to the 15TB partitions on the new servers.
    + Draining good for data distribution across servers!
* ECDF
  o glite-APEL: Andy has requested a VM for this.
    + Clarified that all the batch-specific publisher tweaks are on the CE.
    + Work to commence shortly for this install.
  o CREAM. Needed to install a patch for the software to allow full service testing. Patch should be installed this week or early next week.
  o ATLAS Analysis:
    + New disk is deployed
    + Production has moved to the new disk servers
    + Wahid did a small analysis test (30 jobs)
      # Saturated links out of disk servers, so need to get their additional interfaces working (each server has 4x1Gb links)
      # NFS server was under very heavy load - now reinstalling ATLAS s/w to new machine
* Durham
  o Will try to do glite-APEL upgrade early next week.
* Graeme noted major ATLAS downtime at RAL, 6-7 Dec. Good time for sites to take downtime if needed.
  o Mark thinks Glasgow might do so, to deal with heat flow issues.
* Stuart is chasing up stuck jobs in CREAM - all ATLAS pilots.
* Graeme noted impending security challenge. ACTION to review procedure document.
- Peter to commence Certificate upgrades at Durham.
- Third CREAM CE to be installed at Glasgow on Svr008 once Svr026 and Svr014 are back in full production.