CMS&Atlas host disk problem
Author | Message |
---|---|
Joined: 28 Nov 08 Posts: 20 Credit: 9,777,865 RAC: 3,769 |
Hello, I still have problems with LHC@Home, mainly with Atlas and CMS VBox tasks. The problem lies in how the combination of BOINC and LHC tasks works with different disks.

Computer ID: 10570926
8 processors allowed (meaning 8 simultaneous single-processor tasks)

Here's what I've found so far. I broke the whole process down into steps:

1. BOINC contacts the server and downloads tasks (in the case of LHC it downloads many tasks, like 8 or so, at the same time)
2. BOINC starts the task or tasks (depending on whether they are multi-threaded or not)
3. the LHC task first copies the disk image to the BOINC/slots/ directory
4. after the image is copied it registers a VM in VBox Manager and sets up parameters (base memory, processors, attached disks etc.)
5. the VM starts the boot-up process
6. the VM runs and does its work
7. the VM finishes the work and shuts down
8. after the VM shuts down there are an extra 5-6 min during which I don't know exactly what's going on (there's very little CPU activity and no disk or Ethernet activity... I think some kind of result preparation?)
9. then the VM is deregistered from VBox Manager and a computational error comes up in BOINC Manager (this error is not so important right now)
10. the result is reported to the LHC server

In my case:

step 1 is not critical, as the internet connection is slower than the disk data speed
step 2 - after the jobs were downloaded, BOINC Manager started 8 CMS tasks at the same time (see below for detailed analysis)

The Atlas disk image is around 2.54 GB, the CMS disk image is around 2.8 GB. Starting eight Atlas or CMS jobs at the same time is not advisable in my case, as writing 8 VM disk images to the BOINC/slots/ directory completely overwhelms the disk for a long time. The disk cannot handle so many write requests. As is seen below, different disks have different write queues. SSDs and even SD cards can handle 8 write requests, but HDDs cannot.

Is it possible to do one of the following:
|
Joined: 15 Jun 08 Posts: 2141 Credit: 175,402,068 RAC: 104,454 |
To avoid overloading your disk I/O you should:

1. Not start many VMs concurrently. Instead, start only one at a time.
2. Not stop many VMs concurrently. Instead, stop only one at a time.
3. Avoid context switches that automatically lead to (1.) or (2.).

Since the BOINC client doesn't support this, (1.) and (2.) have to be done manually or by a (self-made) script. The client's behavior in case of (3.) might be better if multithreaded tasks, e.g. N-Body Simulation, are configured to use fewer cores.

A more complex solution would be to run separate BOINC clients for load-critical projects.
|
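The self-made script mentioned for (1.) could look roughly like the following. This is only a minimal sketch under several assumptions, not a feature of the BOINC client: it assumes `boinccmd` is on the PATH, that the tasks were suspended beforehand, and that `PROJECT_URL` matches the project URL reported by your own client. The function names and the gap length are made up for illustration.

```python
import subprocess
import time
from typing import Callable, Iterable, List

# Assumption: replace with the project URL your own client reports.
PROJECT_URL = "https://lhcathome.cern.ch/lhcathome/"


def resume_task(task_name: str) -> None:
    """Resume one suspended task via boinccmd (assumed to be on PATH)."""
    subprocess.run(
        ["boinccmd", "--task", PROJECT_URL, task_name, "resume"],
        check=True,
    )


def staggered_resume(
    task_names: Iterable[str],
    gap_seconds: float,
    resume: Callable[[str], None] = resume_task,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Resume tasks one at a time, waiting gap_seconds between starts.

    The gap should be long enough for one VM image copy and boot to finish,
    so that only one task writes to BOINC/slots/ at any moment.
    Returns the task names in the order they were resumed.
    """
    started: List[str] = []
    for index, name in enumerate(task_names):
        if index > 0:
            sleep(gap_seconds)  # let the previous VM finish its disk copy
        resume(name)
        started.append(name)
    return started
```

A call such as `staggered_resume(["task_a", "task_b"], gap_seconds=300)` would resume two (hypothetical) task names five minutes apart. A fixed gap is the simplest choice; how long it needs to be depends on how quickly the disk can copy one image, so it has to be tuned per host.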