Message boards : Number crunching : Bandwidth and ram for vb and native tasks?
wolfman1360

Joined: 17 Feb 17
Posts: 16
Credit: 119,862
RAC: 0
Message 41038 - Posted: 23 Dec 2019, 21:02:43 UTC

So all Windows and Linux machines have finally run dry over here. Time to actually migrate over to LHC.
Do all tasks using native and/or VirtualBox apps require constant internet access? I have 75 down and a claimed 6 up, yet I only get around 3-4. Will this be a problem with around five 8-core machines and one 16-core? I'm assuming CMS, Theory and ATLAS use similar amounts of bandwidth, with more incoming than outgoing.

Now for RAM: from my reading, ATLAS takes the most for a single-core task, with CMS behind that and Theory taking the least - even less so for the native tasks under Linux.
Just as an example: my 12-thread i7-8750H has 32 GB of RAM and runs Windows 10. Should I be fine running 10 single-core Theory or CMS tasks, but stick to 3 four-core ATLAS tasks to keep RAM usage in check?

Thanks. I will of course be experimenting slowly. I don't want to start off with more than the machines or internet can handle. All of the machines I'm referencing apart from the laptop are dedicated crunchers.
ID: 41038
Jim1348

Joined: 15 Nov 14
Posts: 449
Credit: 12,154,810
RAC: 6,566
Message 41039 - Posted: 23 Dec 2019, 21:32:26 UTC - in response to Message 41038.  
Last modified: 23 Dec 2019, 21:53:02 UTC

Now for ram - from my reading, Atlas takes the most for a singlecore task with CMS behind that and theory taking the least, even less so for native tasks under Linux.

I can only speak for Linux:

Native ATLAS takes 2 GB, and CMS (VBox) takes 3 GB. At least that is what they show in BoincTasks, but actual usage is less. I expect you will need to plan for the worst case in order to run the number you want.
Theory (native) is very small - averaging around 20 MB, though occasionally more, and SixTrack only about 100 MB in Linux.
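A rough budget from those figures, as a sketch - the 4 GB kept free for the OS is my own assumption, not a measured value, and actual per-task usage varies:

```shell
# Worst-case RAM budget for a 32 GB host, using the per-task figures above.
# The 4 GB reserved for the OS is an assumption.
awk 'BEGIN {
    total = 32; os = 4                 # GB installed vs. GB kept free
    printf "CMS (3 GB/task):   %d tasks\n", int((total - os) / 3)
    printf "ATLAS (2 GB/task): %d tasks\n", int((total - os) / 2)
}'
```

So on paper a 32 GB machine fits around 9 CMS or 14 native ATLAS single-core tasks before the OS reserve is touched.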
ID: 41039
lazlo_vii
Joined: 20 Nov 19
Posts: 21
Credit: 1,074,330
RAC: 0
Message 41040 - Posted: 23 Dec 2019, 22:10:38 UTC
Last modified: 23 Dec 2019, 22:36:19 UTC

The RAM usage is well known but I think the bandwidth is not. You can get a total daily transfer history from BOINC on the command line:

boinccmd --get_daily_xfer_history


But it will not be broken down by project or sub-project; it will just show the daily transfer totals for that host. If you have a proxy server running for your BOINC clients, you might be able to save a lot of network traffic on the native workloads.
ID: 41040
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Joined: 15 Jun 08
Posts: 1477
Credit: 79,155,788
RAC: 81,326
Message 41041 - Posted: 23 Dec 2019, 22:14:39 UTC - in response to Message 41038.  

Do all tasks using native and / or virtual box require constant internet access?

Yes.


I have 75 down and a claimed 6 up, yet only get around 3-4.

I guess this means 75 Mbit/s download and 3-4 Mbit/s upload?
75 down is far from being a problem in this case.

3-4 up might be a problem:
ATLAS uploads can be up to 240 MB/task (currently around 100-120 MB/task).
CMS generates (roughly) 1 MB/min.

Other LHC@home tasks need much less upload bandwidth.
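As a rough sanity check on those numbers - my own back-of-the-envelope arithmetic, assuming the full 3 Mbit/s is available to BOINC:

```shell
# Worst-case ATLAS upload of 240 MB at 3 Mbit/s:
# 240 MB * 8 bit/byte = 1920 Mbit; 1920 Mbit / 3 Mbit/s = 640 s (~11 min).
# CMS at ~1 MB/min is a steady ~0.13 Mbit/s per task.
awk 'BEGIN {
    printf "ATLAS 240 MB upload: %.0f s (%.1f min)\n", 240 * 8 / 3, 240 * 8 / 3 / 60
    printf "CMS per task:        %.2f Mbit/s\n", 1 * 8 / 60
}'
```

So a single ATLAS upload ties up a 3 Mbit/s line for about ten minutes, while a handful of CMS tasks fit comfortably.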
ID: 41041
wolfman1360

Joined: 17 Feb 17
Posts: 16
Credit: 119,862
RAC: 0
Message 41043 - Posted: 24 Dec 2019, 0:01:03 UTC - in response to Message 41040.  

I can only speak for Linux:

Native ATLAS takes 2 GB, and CMS (VBox) takes 3 GB. At least that is what they show in BoincTasks, but actual usage is less. I expect you will need to plan for the worst-case in order to run the number you want.
Theory (native) is very small - averaging around 20 MB, though occasionally more, and SixTrack only about 100 MB in Linux.

Okay - so apart from Theory, which I thought took tons more even on Linux, I had everything almost right, if a little backwards. Thanks.
The RAM usage is well known but I think the bandwidth is not. You can get a total daily transfer history from BOINC on the command line:

boinccmd --get_daily_xfer_history


But it will not be broken down by project and/or sub-project. It will just show the daily transfer totals for that host. If you have a proxy server running for your BOINC clients you might be able to save a lot of network traffic with it on the native workloads.

Oh, that will be a lifesaver - thank you. I plan on making LHC exclusive, at least for a little while, apart from maybe CPDN. Local proxies are far, far down the road for me. This will be by far one of the most involved projects I've joined - or maybe that's me overthinking and overcomplicating everything.

I guess this means 75 Mbit/s download and 3-4 Mbit/s upload?
75 down is far from being a problem in this case.

3-4 up might be a problem.

Yes, that's what I meant - Mbit.
I do have other machines that are in datacenters and internet isn't an issue. One day that 300/25 will come my way and I can be happy. And maybe a few more Ryzens too.
I'm aiming to use around 75% of each machine if at all possible. Should I be aiming for more or less? Basically 2 threads left free on an 8-thread machine and similar on the 16, though I might bump that up to 3.
ID: 41043
Jim1348

Joined: 15 Nov 14
Posts: 449
Credit: 12,154,810
RAC: 6,566
Message 41044 - Posted: 24 Dec 2019, 0:16:31 UTC - in response to Message 41043.  

I see you are on Ubuntu 18.04.3. I have not had much luck getting native ATLAS to work on recent installs (native Theory is OK).
Let me know what you come up with.
ID: 41044
wolfman1360

Joined: 17 Feb 17
Posts: 16
Credit: 119,862
RAC: 0
Message 41045 - Posted: 24 Dec 2019, 3:14:19 UTC - in response to Message 41044.  
Last modified: 24 Dec 2019, 3:15:29 UTC

I see you are on Ubuntu 18.04.3. I have not had much luck getting native ATLAS to work on recent installs (native Theory is OK).
Let me know what you come up with.

I'd answer that, but my 4 Linux machines haven't received any ATLAS tasks. 3 out of 4 received Theory, the last one wasn't able to get anything but SixTrack, and none of them could pull CMS. One of my Windows machines did get 2 ATLAS tasks (ironically, I have 2 CPU cores set in preferences with unlimited jobs - I guess I have to make an app_config for what I want specifically on each machine).
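For reference, such a per-machine app_config.xml might look something like this - a sketch only; the app name "ATLAS" and the 2-task limit are my assumptions, so check the actual app name in your client_state.xml or the project's applications page before using it:

```xml
<!-- app_config.xml in the LHC@home project directory (sketch, app name assumed) -->
<app_config>
  <app>
    <name>ATLAS</name>
    <max_concurrent>2</max_concurrent>  <!-- run at most 2 ATLAS tasks at once -->
  </app>
</app_config>
```

After creating it, use "Read config files" in BOINC Manager (or restart the client) so the limit takes effect.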
I still need to do some fiddling - I think BOINC is failing to recognize that Intel VT-x is enabled on one machine, but I also know that's in the checklist, so I'll get to that at some point.
For some reason I figured this would be a lot more involved, but once everything was set up correctly - and really, all the setup involved was the few commands in Linux - everything went smoothly.
My next question: in regards to native tasks, are there any tweaks folks recommend making to the default.local file, or is it fine as is? I just grabbed the one referenced over at https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4971
Thanks.
ID: 41045
lazlo_vii
Joined: 20 Nov 19
Posts: 21
Credit: 1,074,330
RAC: 0
Message 41062 - Posted: 24 Dec 2019, 20:58:40 UTC - in response to Message 41043.  
Last modified: 24 Dec 2019, 21:06:23 UTC

...I do have other machines that are in datacenters and internet isn't an issue. One day that 300/25 will come my way and I can be happy. And maybe a few more Ryzens too.
I'm aiming to use around 75% of each machine if at all possible. Should I be aiming for more or less? Basically 2 threads left free on an 8-thread machine and similar on the 16, though I might bump that up to 3.


I started LHC@home with one Ryzen 3700X (my desktop) running at 25-75% load for 10 days straight and then added a second (my server) running at 75-100% load. 15 days after that (25 days total) I had one million points for this project. So yes, running at less than 100% is very doable, especially if you have lots of threads and RAM, or need your systems for other tasks.

EDIT: I forgot to add that when running the server at 75%, the "load average" (how many threads were asking my kernel for a CPU at once) averaged a little less than 13. When I turned LHC up to 100%, the load average would quickly spike into the low 20s and stay there. That's not good on a system that can only handle 16 threads.
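If you want to watch this yourself, the figures come straight from the kernel - standard Linux commands, nothing LHC-specific:

```shell
# The first three fields of /proc/loadavg are the 1-, 5- and 15-minute load
# averages: runnable tasks (plus, on Linux, tasks in uninterruptible sleep).
cat /proc/loadavg
# Compare against the number of hardware threads this box has:
nproc
# A load average persistently above the nproc value means tasks are
# queueing for CPU time, which is the situation described above.
```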
ID: 41062
wolfman1360

Joined: 17 Feb 17
Posts: 16
Credit: 119,862
RAC: 0
Message 41064 - Posted: 24 Dec 2019, 21:22:30 UTC - in response to Message 41062.  

...I do have other machines that are in datacenters and internet isn't an issue. One day that 300/25 will come my way and I can be happy. And maybe a few more Ryzens too.
I'm aiming for theoretically using up around 75% of each machine if at all possible. Should I be aiming at more or less? Basically 2 threads not being used on an 8 threads machine and similar on 16 - though might bump that up to 3 not in use.


I started LHC@home with one Ryzen 3700X (my desktop) running at 25-75% load for 10 days straight and then added a second (my server) running at 75-100% load. 15 days after that (25 days total) I had one million points for this project. So yes, running at less than 100% is very doable, especially if you have lots of threads and RAM, or need your systems for other tasks.

EDIT: I forgot to add that when running the server at 75%, the "load average" (how many threads were asking my kernel for a CPU at once) averaged a little less than 13. When I turned LHC up to 100%, the load average would quickly spike into the low 20s and stay there. That's not good on a system that can only handle 16 threads.

I'm not understanding your last point - but then I'm not a solid Linux user and a lot of things are still over my head. I'm assuming you mean the project was calling too much of the CPU into action and it couldn't keep up.
Is there a big difference in credit between task types? For instance, does ATLAS pay more than SixTrack? How are your Ryzens configured? My machines have 16-32 GB of RAM, with my Ryzen 7 having 64 GB total, so it should handle 7 two-core ATLAS tasks just fine, with my other i7s running 1 or 2 ATLAS tasks each depending on whether they have 16 or 32 GB of RAM at 75% CPU usage.

It's a quiet Christmas Eve here, so I'm currently throwing together an old Xeon W3520 I found in the parts drawer. I think I have an i7 from the same time period here somewhere... they will be two good space heaters in my office in the cold Canadian winter; I just need to find a cooler for the latter and see how my electric bill fares. No cases on either of these - not enough room, ironically. They will of course both run Linux; my main desktop and laptop will continue running Windows, at least for now, while I slowly but surely navigate the Linux command line and learn how to break things, then fix them. Old as these machines are, "every little bit helps" is my motto.
ID: 41064
lazlo_vii
Joined: 20 Nov 19
Posts: 21
Credit: 1,074,330
RAC: 0
Message 41088 - Posted: 27 Dec 2019, 0:26:22 UTC - in response to Message 41064.  
Last modified: 27 Dec 2019, 0:26:40 UTC

I'm not understanding your last point - but then I'm not a solid Linux user and a lot of things are still over my head. I'm assuming you mean the project was calling too much of the CPU into action and it couldn't keep up.
Is there a big difference in credit between task types? For instance, does ATLAS pay more than SixTrack? How are your Ryzens configured? My machines have 16-32 GB of RAM, with my Ryzen 7 having 64 GB total, so it should handle 7 two-core ATLAS tasks just fine, with my other i7s running 1 or 2 ATLAS tasks each depending on whether they have 16 or 32 GB of RAM at 75% CPU usage.

It's a quiet Christmas Eve here, so I'm currently throwing together an old Xeon W3520 I found in the parts drawer. I think I have an i7 from the same time period here somewhere... they will be two good space heaters in my office in the cold Canadian winter; I just need to find a cooler for the latter and see how my electric bill fares. No cases on either of these - not enough room, ironically. They will of course both run Linux; my main desktop and laptop will continue running Windows, at least for now, while I slowly but surely navigate the Linux command line and learn how to break things, then fix them. Old as these machines are, "every little bit helps" is my motto.


Here is a short history of load averages in Linux. It has two things going for it: first, it is very well written, with a simple-to-understand summary on the first page and concrete examples with code snippets later on; second, it was the very first article that came up when I did a Google search for "Unix load average explained".

http://www.brendangregg.com/blog/2017-08-08/linux-load-averages.html
ID: 41088
wolfman1360

Joined: 17 Feb 17
Posts: 16
Credit: 119,862
RAC: 0
Message 41097 - Posted: 28 Dec 2019, 3:07:46 UTC - in response to Message 41088.  

I'm not understanding your last point - but then I'm not a solid Linux user and a lot of things are still over my head. I'm assuming you mean the project was calling too much of the CPU into action and it couldn't keep up.
Is there a big difference in credit between task types? For instance, does ATLAS pay more than SixTrack? How are your Ryzens configured? My machines have 16-32 GB of RAM, with my Ryzen 7 having 64 GB total, so it should handle 7 two-core ATLAS tasks just fine, with my other i7s running 1 or 2 ATLAS tasks each depending on whether they have 16 or 32 GB of RAM at 75% CPU usage.

It's a quiet Christmas Eve here, so I'm currently throwing together an old Xeon W3520 I found in the parts drawer. I think I have an i7 from the same time period here somewhere... they will be two good space heaters in my office in the cold Canadian winter; I just need to find a cooler for the latter and see how my electric bill fares. No cases on either of these - not enough room, ironically. They will of course both run Linux; my main desktop and laptop will continue running Windows, at least for now, while I slowly but surely navigate the Linux command line and learn how to break things, then fix them. Old as these machines are, "every little bit helps" is my motto.


Here is a short history of load averages in Linux. It has two things going for it: first, it is very well written, with a simple-to-understand summary on the first page and concrete examples with code snippets later on; second, it was the very first article that came up when I did a Google search for "Unix load average explained".

http://www.brendangregg.com/blog/2017-08-08/linux-load-averages.html


That was quite informative - very interesting reading there.
Thanks for that.
ID: 41097
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Joined: 15 Jun 08
Posts: 1477
Credit: 79,155,788
RAC: 81,326
Message 41098 - Posted: 28 Dec 2019, 9:18:57 UTC - in response to Message 41045.  

In regards to native, are there any tweeks folks recommend making to the default.local file or is it fine as is?

It's a project recommendation to use the Cloudflare proxies instead of the original Stratum 1 CERN servers - even if a local proxy is in use.

The default CVMFS package is made for various scenarios and therefore does not include the required settings.
Hence, LHC@home users are asked to configure them locally.

Create the file /etc/cvmfs/domain.d/cern.ch.local with the following content:
CVMFS_SERVER_URL="http://s1cern-cvmfs.openhtc.io/cvmfs/@fqrn@;http://s1ral-cvmfs.openhtc.io/cvmfs/@fqrn@;http://s1bnl-cvmfs.openhtc.io/cvmfs/@fqrn@;http://s1fnal-cvmfs.openhtc.io/cvmfs/@fqrn@;http://s1unl-cvmfs.openhtc.io/cvmfs/@fqrn@;http://s1asgc-cvmfs.openhtc.io:8080/cvmfs/@fqrn@;http://s1ihep-cvmfs.openhtc.io/cvmfs/@fqrn@"

Run (as root):
cvmfs_config reload

Check if the configuration from /etc/cvmfs/domain.d/cern.ch.local is used:
cvmfs_config showconfig |grep 'CVMFS_SERVER_URL'
ID: 41098
Alessio Mereghetti
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 29 Feb 16
Posts: 157
Credit: 2,392,714
RAC: 152
Message 41177 - Posted: 6 Jan 2020, 9:00:01 UTC - in response to Message 41041.  

Hi,
Just a reminder that SixTrack tasks are CPU-intensive only (with some use of the disk for checkpoint/restart), with very limited bandwidth requirements and negligible internet access - just downloading input files (typically <1 MB) and uploading results (typically <100 kB). No 24/7 access needed.
Cheers,
A.
ID: 41177



©2020 CERN