RAL Tier1 weekly operations castor 29/04/2016

From GridPP Wiki

Latest revision as of 09:43, 8 December 2016

AD discussion>>>>

batch farm > 10k job slots on farm (15 > 26k)

most throughput from FTS

batch farm & FTS about equal number of files

CMS throttled - 1200 job slots (direct IO) - 1200 as it works, MoU should be 3k

ATLAS - fair share 7-8k jobs ... possibly previous max load + 20%?

LHCb - ok at the moment

ALICE use farm (quite significant) but don't really use CASTOR

Non-LHC VOs - mostly going to tape as 'archive' >>> will probably go to Echo

2 definitions of efficiency: success/total or CPU time/wall time - CMS raising issues with both
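
The two efficiency definitions above can be sketched as follows. This is only an illustration: the `Job` record, its field names and the sample numbers are hypothetical and are not taken from the batch farm accounting.

```python
from dataclasses import dataclass


@dataclass
class Job:
    # Hypothetical per-job accounting record (field names are assumptions)
    succeeded: bool
    cpu_seconds: float
    wall_seconds: float


def success_efficiency(jobs):
    """Definition 1: successful jobs / total jobs."""
    if not jobs:
        return 0.0
    return sum(1 for j in jobs if j.succeeded) / len(jobs)


def cpu_efficiency(jobs):
    """Definition 2: total CPU time / total wall time."""
    total_wall = sum(j.wall_seconds for j in jobs)
    if total_wall == 0:
        return 0.0
    return sum(j.cpu_seconds for j in jobs) / total_wall


# Made-up sample: three of four jobs succeed, with varying CPU usage
sample_jobs = [
    Job(True, 3000.0, 3600.0),
    Job(True, 1800.0, 3600.0),
    Job(False, 600.0, 1200.0),
    Job(True, 900.0, 1200.0),
]

print(success_efficiency(sample_jobs))  # 0.75
print(cpu_efficiency(sample_jobs))      # 0.65625
```

The two numbers can diverge: jobs waiting on IO can all succeed while still showing low CPU time/wall time, so the two definitions need tracking separately.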


AP - come up with all potential solutions even if ££££

  • 2014 disk servers can be put into CASTOR - possibly CMS .. for IO throughput

reduce the number of drives used (CASTOR partitions) on above machines

ATLAS log files could be put onto Echo

find problem workflows

get Echo working

second RAID in hardware