Message boards : Number crunching : boinc - enhancing research workloads for the benefit of mankind & humanity - Computer Optimisation - CPU , GPU & RAM - PC, Mac & ARM development
QuantumEthos

Joined: 26 Dec 11
Posts: 86
Credit: 134,289
RAC: 180
Message 30489 - Posted: 25 May 2017, 11:00:13 UTC

from :
http://bit.ly/HPC-Dev - High Performance Computing

http://bit.ly/tRNG-Dev - T/C/RNG Devices "we need random seeds"

***

boinc - enhancing research workloads for the benefit of mankind & humanity - Computer Optimisation - CPU , GPU & RAM - PC, Mac & ARM development

HPC - High Performance Computation for beneficial goals and obvious worth.

(Guide, experimentation, developer kit's and manuals)


Observing the workloads of many beneficial projects, we find that the workload data set is commonly small.
In addition to the memory set being smaller or larger than a machine can compute optimally, we find that feature sets such as FMA and AVX have commonly not been implemented.

Some projects, such as Asteroids@home and SETI@home, use enhanced instruction sets such as AVX, and memory loads that benefit from the 4 GB or more of RAM that is available on decent gaming machines and home laptops.

Not all modern machines have loads of RAM. However, research and university establishments use sufficiently powerful machines that can glow on the BOINC record in full glory with a 256 MB to 768 MB workload.

In addition, these machines are commonly Opteron or Xeon systems, and servers may have SPARC- or PowerPC-specific hardware and instruction sets.

To take examples: below we can see that workloads include small data arrays, in the 40 MB to 79 MB range.

In line with servers and gaming rigs, we have 1 GB of RAM per core. Of course, not all problems require a larger array in the workload, and some machines have only 256 MB per core!

However much RAM you allocate to the projected workload, small memory loads can and will be sufficient for data swapping and/or paging (like DNA replicators).
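
To make the sizing idea concrete, here is a minimal Python sketch that picks a workload size from the RAM available per core. The function name, the 25% reserve for the operating system and the list of candidate sizes are all assumptions for illustration; only the 256 MB, 768 MB and 1 GB per core figures come from the text above.

```python
def workload_mb(total_ram_mb, cores, reserve_fraction=0.25,
                sizes_mb=(64, 256, 768, 1024)):
    """Pick the largest candidate workload size that fits one core's share of RAM.

    reserve_fraction and sizes_mb are illustrative assumptions, not BOINC settings.
    """
    per_core = total_ram_mb * (1 - reserve_fraction) / cores
    fitting = [s for s in sizes_mb if s <= per_core]
    # Fall back to the smallest size when nothing fits comfortably.
    return max(fitting) if fitting else min(sizes_mb)


print(workload_mb(8192, 8))   # 8 GB machine, 8 cores
print(workload_mb(1024, 4))   # 1 GB machine, 4 cores
```

A project could ship several workload sizes and let the host choose in this way, rather than fixing one size for every machine.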

Some tasks can benefit substantially from larger thread and data models; to my mind, DNA and mapping data are fine examples of specific workloads where memory counts.

In addition, the thread count can be 4 or some other number, and I suggest that a single task can use more than one core and instruction set (NEON, for example, or simultaneous multithreading, SMT, on the FPU).

Specific workload optimisation, or rather generic optimisation with SSE, AVX, FPU threading and precision tuning, would be very cool while we deal with the workload-running app.

In particular, the Ryzen multi-core is a new and exciting product.

So take care to read the guides in the lower half of this post: AVX2, RDSEED, ADX and additional encryption instructions are some of the most exciting changes in the AMD Ryzen architecture.

Further thought ... Efficiency :

Add a MHz/Dhrystone/MIPS performance-per-watt figure to each system;
projects can then further optimise workloads to improve energy and environmental efficiency versus work carried out.

Work Hours x Mhz / (efficiency per watt)
-------
Hours / % of projects finished with work completed
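
As a rough illustration of the performance-per-watt figure suggested above, this Python sketch divides a benchmark score by power draw and ranks hosts by the result. The host names, core counts and wattages are made-up placeholders, not measurements; only the idea of a MIPS-per-watt figure comes from this post.

```python
def mips_per_watt(mips: float, watts: float) -> float:
    """Benchmark throughput (MIPS) delivered per watt of power draw."""
    if watts <= 0:
        raise ValueError("power draw must be positive")
    return mips / watts


# Hypothetical hosts: (name, total MIPS, measured watts). Illustrative values only.
hosts = [
    ("desktop", 8 * 11000, 95.0),
    ("laptop", 4 * 6000, 28.0),
    ("server", 32 * 9000, 320.0),
]

# Most efficient host first.
ranked = sorted(hosts, key=lambda h: mips_per_watt(h[1], h[2]), reverse=True)
for name, mips, watts in ranked:
    print(f"{name}: {mips_per_watt(mips, watts):.1f} MIPS/W")
```

A scheduler or statistics page could publish such a figure next to credit and RAC so that efficiency becomes visible per host.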

Also bear in mind that GPUs need watt efficiency and task management to optimise power used versus work done.

worker priority should always be :

efficiency + merit of the work
--------
time / % necessity
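
The priority ratio above can be written as a small function. This is only a sketch: the scales for efficiency and merit, and the reading of "% necessity" as a fraction between 0 and 1, are my assumptions, since the post does not define units.

```python
def worker_priority(efficiency: float, merit: float,
                    hours: float, necessity: float) -> float:
    """(efficiency + merit) / (hours / necessity), following the ratio above.

    necessity is treated as a fraction in (0, 1]; hours must be positive.
    """
    if hours <= 0 or not (0 < necessity <= 1):
        raise ValueError("hours must be > 0 and necessity in (0, 1]")
    return (efficiency + merit) / (hours / necessity)


# Same efficiency and merit; the more necessary task ranks higher.
urgent = worker_priority(0.8, 0.9, hours=10, necessity=1.0)
routine = worker_priority(0.8, 0.9, hours=10, necessity=0.5)
print(urgent > routine)
```

With this reading, a longer run time lowers priority and a higher necessity raises it, which matches the intent of the ratio.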

Please examine the issue further.


Rupert S

https://www.worldcommunitygrid.org

https://boinc.berkeley.edu/

http://esa-space.blogspot.com/

HPC Computing work load Photos http://bit.ly/HPCImpact

http://bit.ly/HPC-Dev

http://bit.ly/tRNG-Dev

http://esa-space.blogspot.ru/2017/04/rng-and-random-web.html - we need Chaos Seeds : Random seeds for our work

HPC Best Practices..

http://www.intertwine-project.eu/best-practice-guides

AMD Platform Optimization - please read for all developers

https://community.amd.com/thread/213045 - particular instruction differences for microcode optimisation

http://32ipi028l5q82yhj72224m8j.wpengine.netdna-cdn.com/wp-content/uploads/2017/03/GDC2017-Optimizing-For-AMD-Ryzen.pdf - code optimisation a few very important lessons... may seem simple to some but obviously is not to be taken for granted.

http://support.amd.com/TechDocs/24593.pdf - AMD64 Architecture Programmer’s Manual Volume 2: System Programming

CPU Optimisation - utility and function.

http://www.agner.org/optimize/ - code optimisation for all programmers on X86,X86-64bit and some others.

http://www.agner.org

for example : Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 htt pni ssse3 fma cx16 sse4_1 sse4_2 popcnt aes f16c syscall nx lm avx sse4a osvw xop wdt fma4 topx page1gb rdtscp bmi1

11000 Mips & 2700 FPU Mips - per Core
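
A feature line like the one above can be checked programmatically before choosing an optimised code path. The sketch below is a minimal Python example that parses the list quoted in this post; the function name `supported` is hypothetical, and on Linux a similar flag line can be read from /proc/cpuinfo.

```python
# Feature list copied from the example above.
FLAGS_LINE = (
    "fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat "
    "pse36 clflush mmx fxsr sse sse2 htt pni ssse3 fma cx16 sse4_1 sse4_2 "
    "popcnt aes f16c syscall nx lm avx sse4a osvw xop wdt fma4 topx "
    "page1gb rdtscp bmi1"
)


def supported(flags_line: str, wanted: list[str]) -> dict[str, bool]:
    """Report which of the wanted feature names appear in a flags line."""
    flags = set(flags_line.lower().split())
    return {name: name in flags for name in wanted}


# AVX and FMA are listed; AVX2, RDSEED and ADX are not on this processor.
report = supported(FLAGS_LINE, ["avx", "fma", "avx2", "rdseed", "adx"])
print(report)
```

An application could use such a report to select, say, an AVX build where available and fall back to SSE2 elsewhere.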

**
Compilers and Make

https://cmake.org/

http://llvm.org/
http://llvm.org/docs/FAQ.html

https://gcc.gnu.org/

**
An article that took some deep learning itself o_O; anyway, very interesting.
HIP C++ will, we think, be simpler than OpenCL as a higher-level code port,
with CUDA code machine-converted to 99.6%.

http://www.anandtech.com/show/10831/amd-sc16-rocm-13-released-boltzmann-realized

**
PC/Mac/Windows/Linux/Android

https://www.khronos.org/news/events/2016-isc-high-performance

https://www.khronos.org/assets/uploads/developers/library/2008_siggraph_bof_opengl/OpenCL%20and%20OpenGL%20SIGGRAPH%20BOF%20Aug08.pdf HPC Report

https://www.microsoft.com/en-us/download/details.aspx?id=54507 Microsoft HPC Pack 2016 including linux

https://technet.microsoft.com/en-us/library/cc514029(v=ws.11).aspx all HPC Packs 2016,2012 to 2008 info and download

https://msdn.microsoft.com/en-us/library/ff976568.aspx Microsoft High Performance Computing for Developers - info and downloads

https://docs.microsoft.com/en-us/azure/virtual-machines/windows/hpcpack-cluster-active-directory - information and virtualisation

**
OpenVX for high performance Computing : Multi platform spec
"OpenVX for HPC Neural Nets and processing .... a new way to deliver on research, gaming & processing of data and images"


https://www.khronos.org/news/tags/tag/OpenVX

https://www.khronos.org/news/press/openvx-1.2-specification-cross-platform-acceleration-power-efficient-vision

**
Open CL "GPU Development" links

https://www.khronos.org/blog/iwocl-where-you-learn-the-latest-on-opencl

https://www.khronos.org/opencl/

https://www.khronos.org/opencl/resources for SDK, learning & optimisation resources.

http://developer.amd.com/tools-and-sdks/opencl-zone/amd-accelerated-parallel-processing-app-sdk/opencl-optimization-guide/

https://github.com/RadeonOpenCompute - ROCm: Platform for GPU Enabled HPC and UltraScale Computing

http://gpuopen.com/professional-compute/

http://gpuopen.com/compute-product/hcrng/

https://bitbucket.org/multicoreware/hcrng

http://gpuopen.com/compute-product/clrng/

Installing the AMD SDK improves compute performance. Optimise your code!

https://streamhpc.com/blog/2017-05-21/amd-open-sourced-rocms-opencl-driver-stack/

https://github.com/RadeonOpenCompute/ROCm-OpenCL-Runtime/blob/amd-master/README.md

http://developer.amd.com/tools-and-sdks/opencl-zone/

http://developer.amd.com/tools-and-sdks/opencl-zone/amd-accelerated-parallel-processing-app-sdk/

http://gpuopen.com/games-cgi/

http://developer.amd.com/tools-and-sdks/graphics-development/

http://hgpu.org information and interesting learning & source

http://dspace.princeton.edu/jspui/bitstream/88435/dsp01wm117r22g/1/Jia_princeton_0181D_11168.pdf Optimisation for parallel computing information.

https://arxiv.org/pdf/1705.05249 - CLBlast: A Tuned OpenCL BLAS Library demonstration.

HIP - HSA - the CUDA Compatible C++ for Heterogeneous Computing

http://developer.amd.com/wordpress/media/2012/09/7637-HIP-Datasheet-V1_4-US-Letter.pdf

http://developer.amd.com/wordpress/media/2012/10/hsa10.pdf - a full guide

http://www.hsafoundation.com/

http://www.hsafoundation.com/hsa-developer-tools/

https://github.com/HSAFoundation/HSA-docs-AMD/wiki#initial-implementation

https://github.com/HSAFoundation/HSAIL-Tools

https://github.com/RadeonOpenCompute/ROCK-Kernel-Driver - Driver for kernel

http://www.amd.com/Documents/SDN-Whitepaper.pdf - Smart Software Defined Networks

http://support.amd.com/TechDocs/55766_SEV-KM%20API_Spec.pdf - Secure Encrypted Virtualization Key Management

http://support.amd.com/TechDocs/Protecting%20VM%20Register%20State%20with%20SEV-ES.pdf - PROTECTING VM REGISTER STATE WITH SEV-ES

http://support.amd.com/TechDocs/50742_15h_Models_60h-6Fh_BKDG.pdf - bios and kernel drivers

**
ARM Development software/SDK's & tools - HPC

https://developer.arm.com/products/software-development-tools

https://developer.arm.com/products/software-development-tools/hpc for high performance computing (ideal for Boinc)

https://developer.arm.com/products/software-development-tools/compilers for both HPC and APP development.

https://developer.arm.com/products/system-design/fixed-virtual-platforms

https://www.synopsys.com/verification/virtual-prototyping/vdk/vdk-for-arm.html

https://www.synopsys.com/designware-ip/technical-bulletin/designware-hybrid-ip.html

**
IOT links - (internet of things)

https://www.infoq.com/articles/thread-protocol-for-home-automation

http://wso2.com/wso2_resources/wso2_whitepaper_a-reference-architecture-for-the-internet-of-things.pdf

**
Linux arch reference material

https://www.ibm.com/developerworks/library/l-linuxuniversal/

**
Agency GPL

https://code.nasa.gov/

**
Workers :

https://www.upwork.com/hire/driver-development-freelancers/

http://www.wcgsig.com/342585.gif

Update 2:

For a comparison of the GFLOPS/MIPS throughput of various BOINC tasks:

Here we show the relevance of the code or function used. AVX, for example, processes several values per instruction (SIMD), and the FPU pipeline of the AMD FX and Ryzen processors serves more than one thread.

http://bit.ly/HPCImpact (original non edited photos ...)

and set 2 (newer) http://bit.ly/2HPCImpact ....

See the work throughput in GFLOPS compared with code efficiency per task!

Sometimes entropy is needed to fulfil the task, one would imagine (for example on Android): http://bit.ly/tRNG-Dev
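
For the entropy point, here is a minimal Python sketch of drawing a seed from the operating system's entropy source and then using it for a reproducible generator. How any particular BOINC project actually seeds its work units is not stated here, so treat the per-work-unit use as an assumption.

```python
import random
import secrets


def fresh_seed(bits: int = 64) -> int:
    """Draw a seed from the operating system's entropy source."""
    return secrets.randbits(bits)


seed = fresh_seed()
rng = random.Random(seed)              # reproducible generator for one work unit
sample = [rng.random() for _ in range(3)]

# Recording the seed lets the same random stream be replayed for validation.
replay = random.Random(seed)
replayed = [replay.random() for _ in range(3)]
print(sample == replayed)
```

The seed itself is unpredictable, but once recorded it makes the run repeatable, which matters when results are cross-checked between hosts.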

The improvement of the BOINC and World Community Grid projects has been observed, noted and, one feels, improved upon.

Further improvement should be implemented as soon as possible, to improve work-versus-output efficiency.

Thank you kindly, programmers, workers and scientists, for your perseverance and effort.

RS

http://bit.ly/BoincStudies - Result Studies

https://browser.geekbench.com/v4/compute/743093 GPU Function
https://browser.geekbench.com/v4/cpu/2831836 CPU Function

QuantumEthos
Joined: 26 Dec 11
Posts: 86
Credit: 134,289
RAC: 180
Message 30574 - Posted: 31 May 2017, 15:15:28 UTC - in response to Message 30489.  

https://www.youtube.com/watch?v=mLQGXlxemlg - Optimizing HPC Service Delivery, by a lifetime supercomputing tech

MAGIC Quantum Mechanic
Joined: 24 Oct 04
Posts: 649
Credit: 18,203,292
RAC: 14,930
Message 30580 - Posted: 1 Jun 2017, 4:01:19 UTC - in response to Message 30574.  


Makes sense... us computer geeks here in the Great NW breathe the air of Boeing and Microsoft (actually longer with Boeing for me), since I lived in Redmond when it was just trees, way before it became
Microsoft, 1 Microsoft Way, Redmond, WA
(years ago my neighbor was Grandma Boeing)

Has to be why I have been here at SixTrack almost 13 years 24/7
Volunteer Mad Scientist For Life

QuantumEthos
Joined: 26 Dec 11
Posts: 86
Credit: 134,289
RAC: 180
Message 30595 - Posted: 2 Jun 2017, 17:41:53 UTC - in response to Message 30580.  


yeh, older does not mean deadbeat :-D

QuantumEthos
Joined: 26 Dec 11
Posts: 86
Credit: 134,289
RAC: 180
Message 30621 - Posted: 3 Jun 2017, 18:45:27 UTC - in response to Message 30595.  

CPU Optimisation - utility and function.

http://gpuopen.com/compute-product/codexl/ - CodeXL is a code-efficiency analyser, optimiser and debugger for GPU, CPU and system.

http://bit.ly/CoXLPhoto - CodeXL in action photos

http://support.amd.com/TechDocs/24593.pdf - AMD64 Architecture Programmer’s Manual Volume 2: System Programming

http://www.agner.org/optimize/ - code optimisation for all programmers on X86,X86-64bit and some others.

Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 12 Jul 11
Posts: 843
Credit: 1,578,195
RAC: 42
Message 30626 - Posted: 4 Jun 2017, 12:08:38 UTC - in response to Message 30621.  

Well we are far from trying to optimise GPU code.
First let me explain that we have a tracking loop over turns
(up to 1,000,000 hoping for 10,000,000 soon) which contains
a large number of inner loops over particles, currently up to 64.
Luckily these loops over particles can be parallelised as each
particle is totally independent. In addition the original author F. Schmidt
pre-calculated everything possible before entering the tracking loop.
Each turn involves some 10,000 steps over a varying number of inner loops,
e.g. straight section, quadrupole, beam-beam interaction, power supply ripple, etc etc
of which there are about 50 different possibilities. A straight section is really just
a multiply and add, whereas beam beam involves hundreds or more FLOPs.
The first idea would be to use a much larger number of particles to best
utilise the GPU. This however would produce a large amount of I/O and
use a lot of disk space, but maybe not insurmountable. However all the code is
Fortran, the outer loop calls subroutines (could inline), and has many tests/branches.
It would be great if the main loop fitted entirely into the GPU and we would have
rare Host access for I/O or BOINC checkpoint and progress calls or when
one or more particles are lost.
My colleague Riccardo is actively looking at redoing in C which would also allow
much more portability and also allow to be parallel on multi-core systems.
For the moment we just run tasks in parallel, which works rather well (apart
from some current infrastructure problems). I hope to come up with
some numbers next week on GPU testing.
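
The loop structure described above (turns outside, lattice elements inside, then independent particles) can be sketched in a few lines of Python. This is a toy model under my own assumptions, not SixTrack code: the two element types, the aperture test at the end of each turn and all names are hypothetical.

```python
def track(particles, lattice, turns, aperture=1.0):
    """Toy tracking loop: turns (outer) x lattice elements x particles (inner).

    particles: list of [x, xp] pairs (position, slope), each independent.
    lattice:   list of ("drift", length) or ("kick", strength) elements.
    Returns the surviving particles and the number lost.
    """
    alive = [list(p) for p in particles]
    lost = 0
    for _ in range(turns):
        for kind, value in lattice:
            for p in alive:                  # no particle depends on another
                if kind == "drift":
                    p[0] += value * p[1]     # straight section: multiply and add
                elif kind == "kick":
                    p[1] -= value * p[0]     # thin focusing kick
        # Particles outside the aperture are dropped (the rare host-side event).
        survivors = [p for p in alive if abs(p[0]) < aperture]
        lost += len(alive) - len(survivors)
        alive = survivors
    return alive, lost


lattice = [("drift", 1.0), ("kick", 0.5)]
particles = [[0.01 * i, 0.0] for i in range(1, 5)] + [[2.0, 0.0]]
alive, lost = track(particles, lattice, turns=100)
print(len(alive), lost)
```

Because the innermost loop touches each particle independently, it is the natural place for SIMD, multi-core or GPU parallelism; the branch on element type is the kind of test that makes a straight GPU port harder.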

The code itself has been regularly measured and optimised; for example we
re-ordered array indices to optimise memory access and rewrote the Error Function
of a Complex Number to be faster but with adequate precision.
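
To show why index order matters for memory access, this Python sketch computes element offsets for a Fortran-style column-major array. Which arrays were actually re-ordered is not stated above, so the particle/coordinate layout and the function names are assumptions.

```python
def column_major_offset(i, j, n_rows):
    """Flat offset of element (i, j) in a Fortran-style (column-major) array."""
    return i + j * n_rows


def particle_stride(order, n_particles, n_coords):
    """Distance in memory between the same coordinate of consecutive particles."""
    if order == "particle_first":      # array(particle, coord)
        return (column_major_offset(1, 0, n_particles)
                - column_major_offset(0, 0, n_particles))
    if order == "coord_first":         # array(coord, particle)
        return (column_major_offset(0, 1, n_coords)
                - column_major_offset(0, 0, n_coords))
    raise ValueError(order)


print(particle_stride("particle_first", 64, 6))  # contiguous access
print(particle_stride("coord_first", 64, 6))     # strided access
```

With the particle index first, a loop over particles walks memory contiguously, which is what caches and vector units prefer.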

Portability does come at a price but ensures accuracy of results. I shall publish
measurements in an upcoming paper. I am sure we gain much more from being portable
and being able to use almost any IEEE 754 compliant processor.
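
One general reason that identical results across processors take care is that IEEE 754 addition is not associative, so a compiler or vector unit that regroups a sum can change the last bits. The short Python example below shows the effect with doubles; that this is the specific issue behind the portability cost mentioned above is my assumption.

```python
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c     # summed left to right
right = a + (b + c)    # same numbers, regrouped

print(left == right)   # False on IEEE 754 doubles
print(left, right)
```

Over millions of turns such last-bit differences can grow, which is why a fixed evaluation order is worth some speed.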

On the issue of SixTrack and/or experiments this will shortly be under discussion at
CERN I am sure. Currently SixTrack has many more Hosts/volunteers, is simple to install,
and has been around for 13 years. Not everyone loves VMbox. Not a big deal at
present as we rarely have enough SixTrack work to keep all volunteers busy.

I hope to re-address all this in some weeks after current BOINC infrastructure issues
are resolved and we have the new "super" SixTrack with much broader application,
e.g. collimation studies, and we support a much wider range of platforms (macOS, ARM)
and use features such as AVX.

Eric.

QuantumEthos
Joined: 26 Dec 11
Posts: 86
Credit: 134,289
RAC: 180
Message 30723 - Posted: 10 Jun 2017, 14:09:15 UTC - in response to Message 30626.  

Thank you for the reply! Regarding the use of VirtualBox, there is a new product from Berkeley Lab called Singularity (http://singularity.lbl.gov/) that handles repeatable-condition containers and has low overhead for a virtualised data set.

As to the particle spread, one should possibly consider the multi-core and multi-threaded core model specific to the Ryzen and Intel processors.

One could imagine that the multi-threaded nature of ARM server cores, combined with multi-threaded, multi-headed ARM CPUs and GPU run-script environments, is a new and uncompromising land of opportunity and challenge.

Many of the instructions in the FMA4 and vector instruction sets process more values in parallel at lower precision.

http://32ipi028l5q82yhj72224m8j.wpengine.netdna-cdn.com/wp-content/uploads/2017/03/GDC2017-Optimizing-For-AMD-Ryzen.pdf - code optimisation a few very important lessons... may seem simple to some but obviously is not to be taken for granted.

Compilers and Make compliant with SMT and other HPC Standards

https://cmake.org/

http://llvm.org/
http://llvm.org/docs/FAQ.html

https://gcc.gnu.org/

*not free obviously .. intel*
https://software.intel.com/en-us/articles/intel-advisor-roofline

ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 29 Aug 05
Posts: 483
Credit: 3,757,636
RAC: 4,704
Message 30729 - Posted: 10 Jun 2017, 20:55:24 UTC - in response to Message 30723.  

Thank you for the reply! Regarding the use of VirtualBox, there is a new product from Berkeley Lab called Singularity (http://singularity.lbl.gov/) that handles repeatable-condition containers and has low overhead for a virtualised data set.

Some CMS users have reported problems when their jobs land at sites running Singularity -- to the point that they blacklist sites they know to run the product. I have not heard yet whether the problem has been identified, nor solved.

QuantumEthos
Joined: 26 Dec 11
Posts: 86
Credit: 134,289
RAC: 180
Message 30732 - Posted: 11 Jun 2017, 8:03:18 UTC - in response to Message 30729.  

Well... there is Docker CE (Community Edition), and it comes in a server edition too!

So what do the project and, on behalf of BOINC, the system's maintainers think about using Docker CE?

Obviously the professional version could be used to support the main project, and the CE edition of Docker for the user.

https://store.docker.com/editions/community/docker-ce-desktop-windows

https://www.ctl.io/developers/blog/post/what-is-docker-and-when-to-use-it/

https://www.digitalocean.com/community/tutorials/how-to-install-and-use-docker-getting-started

https://www.howtoforge.com/tutorial/how-to-use-docker-introduction/

QuantumEthos
Joined: 26 Dec 11
Posts: 86
Credit: 134,289
RAC: 180
Message 30733 - Posted: 11 Jun 2017, 8:52:07 UTC - in response to Message 30732.  

And QEMU would obviously be of use on many projects because of its machine emulation and virtualisation.
It comes in flavours including Windows, Mac and Linux.

http://www.qemu.org/

Laurence
Project administrator
Project developer
Joined: 20 Jun 14
Posts: 196
Credit: 206,991
RAC: 141
Message 30745 - Posted: 12 Jun 2017, 7:45:56 UTC - in response to Message 30723.  
Last modified: 12 Jun 2017, 7:46:23 UTC

Thank you for the reply! Regarding the use of VirtualBox, there is a new product from Berkeley Lab called Singularity (http://singularity.lbl.gov/) that handles repeatable-condition containers and has low overhead for a virtualised data set.


Containers are not virtualization. Our challenge is that 85% of the volunteers have Windows and the HEP applications only run on Linux. This project is constrained by the production code that the experiments are using. You may be interested in following the work of the HEP Software Foundation.

Laurence
Project administrator
Project developer
Joined: 20 Jun 14
Posts: 196
Credit: 206,991
RAC: 141
Message 30746 - Posted: 12 Jun 2017, 7:50:42 UTC - in response to Message 30733.  

And QEMU would obviously be of use on many projects because of its machine emulation and virtualisation.
It comes in flavours including Windows, Mac and Linux.

http://www.qemu.org/



Why would this be better than VirtualBox?

QuantumEthos
Joined: 26 Dec 11
Posts: 86
Credit: 134,289
RAC: 180
Message 30751 - Posted: 12 Jun 2017, 9:25:59 UTC - in response to Message 30746.  

QEMU operates within the virtualisation component of windows ....
has multiple machines to emulate .... & is reliable..

I have noticed that VirtualBox is not the fastest machine on the planet and needs improvement.

QEMU fits right into the Linux kernel also.

So with a little improvement it should be ideal.

Laurence
Project administrator
Project developer
Joined: 20 Jun 14
Posts: 196
Credit: 206,991
RAC: 141
Message 30752 - Posted: 12 Jun 2017, 9:35:24 UTC - in response to Message 30751.  

QEMU operates within the virtualisation component of windows ....
has multiple machines to emulate .... & is reliable..


Is it easier to install than VirtualBox? Does it require any BIOS changes?

QuantumEthos
Joined: 26 Dec 11
Posts: 86
Credit: 134,289
RAC: 180
Message 30753 - Posted: 12 Jun 2017, 9:36:57 UTC - in response to Message 30745.  

Hmm! So ideally the HEP applications would be run under a specially installed Linux kernel and system on a virtual drive or a real Linux partition, which would be run under the Windows virtualisation client.

For example, it would be possible to use an efficient AWS (Amazon)-type system image.

QuantumEthos
Joined: 26 Dec 11
Posts: 86
Credit: 134,289
RAC: 180
Message 30754 - Posted: 12 Jun 2017, 9:39:57 UTC - in response to Message 30752.  
Last modified: 12 Jun 2017, 9:43:55 UTC

QEMU is a virtualiser and can run other OS images.

Look here: http://wiki.qemu.org/Documentation

https://qemu.weilnetz.de/doc/qemu-doc.html

[VENETO] boboviz
Joined: 7 May 08
Posts: 11
Credit: 17,331
RAC: 15
Message 30816 - Posted: 17 Jun 2017, 21:47:18 UTC - in response to Message 30626.  


My colleague Riccardo is actively looking at redoing in C which would also allow
much more portability and also allow to be parallel on multi-core systems.
For the moment we just run tasks in parallel, which works rather well (apart
from some current infrastructure problems). I hope to come up with
some numbers next week on GPU testing.


VERY interesting!

QuantumEthos
Joined: 26 Dec 11
Posts: 86
Credit: 134,289
RAC: 180
Message 30817 - Posted: 17 Jun 2017, 22:38:24 UTC - in response to Message 30816.  


Take a company like Boeing, which has worked on high performance computing for many successful years: they learn and also adapt.

https://www.youtube.com/watch?v=mLQGXlxemlg - Optimizing HPC Service Delivery, by a lifetime supercomputing tech

HPC Best Practices..

http://www.intertwine-project.eu/best-practice-guides

Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 12 Jul 11
Posts: 843
Credit: 1,578,195
RAC: 42
Message 30818 - Posted: 18 Jun 2017, 1:43:24 UTC - in response to Message 30817.  

Have not got the numbers (yet), but anecdotally, both Boeing and CERN were
members of the Cray User Advisory committee many many years ago at the
beginning of the end of the mainframe era. (I am 76 years old so I am afraid I
may be a bit slow to adapt :-) My priorities are RFP,
Reliability, no use if it fails
Functionality, needs to do what you want
Performance, as fast as possible.
Eric.

Erich56
Joined: 18 Dec 15
Posts: 783
Credit: 5,512,922
RAC: 8,960
Message 30819 - Posted: 18 Jun 2017, 6:32:33 UTC - in response to Message 30818.  

My priorities are RFP,
Reliability, no use if it fails
Functionality, needs to do what you want
Performance, as fast as possible.
Eric.

Fully d'accord :-) :-) :-)


©2018 CERN