Publications by Marc Paterno
Comparing VEGAS and CUHRE for Genz 1 (absolute value) in 5d
Purpose of this document: This document compares the speed of the VEGAS and CUHRE algorithms, as implemented in the CUBA (http://www.feynarts.de/cuba/) library and wrapped by cubacpp (https://bitbucket.org/mpaterno/cubacpp). The algorithms: vegas is the Vegas algorithm of Lepage, as implemented in the CUBA library. This version uses ...
4633 sym 3 img
Timing analysis of June 30 run
Purpose of this document: This document presents an analysis of the HEPnOS large-run timing data, to determine whether more work is needed on the code before we write our paper. Read the performance analysis data: We read the raw data and make the per-rank global dataframe. raw <- readRDS("theta_es_7168_2020-06-30_01.rds"); ranks <- make_global_df...
3113 sym R (1095 sym/18 pcs) 13 img
Mathematica integration of Genz_1_abs in 5D
Purpose of this document: This document looks at the Genz_1 (absolute value) integral in 5D as solved by Mathematica. This integral cannot be solved analytically, so we try using Mathematica’s facilities for numerical integration with controlled error bounds. Data: The Mathematica routine NIntegrate takes an argument PrecisionGoal that specifie...
2903 sym 3 img
Analysis of per-rank performance of eventselection
1 The data: We are looking at a run using 16 nodes running the HEPnOS daemon, with 512 targets, and 112 nodes running eventselection, each node running 64 ranks. The dataset used is the 1691 subrun sample from the NOvA ND. raw <- readRDS("theta_es_7168_2020-06-30_01.rds"); global <- make_global_df(raw); events <- make_events_df(raw) 2 Total job run...
925 sym R (1617 sym/8 pcs) 4 img
PandAna Performance part 3
1 Introduction: This document presents an analysis of the reading performance of PandAna. PandAna uses the Python package h5py (http://h5py.org) to read tabular data stored in HDF5 (https://portal.hdfgroup.org/display/HDF5/HDF5). We have run a sample PandAna application on Haswell nodes of Cori at NERSC, using the same code and a varying number of...
2664 sym R (1718 sym/11 pcs) 5 img
Scaling analysis of per-rank performance of eventselection
1 The data: raw_112_64 <- read_raw_dataframes("campaign_4/112_64/timing*.dat"); raw_008_64 <- read_raw_dataframes("campaign_4/008_64/timing*.dat"); events_112_64 <- make_events_df(raw_112_64); events_008_64 <- make_events_df(raw_008_64). We are looking at a run using 16 nodes running the HEPnOS daemon, with 512 targets. We use two runs of the eventsel...
1398 sym R (3333 sym/11 pcs) 2 img
Phase 1 regions
1 Purpose: This document contains an analysis of the Phase 1 behavior of the parallel CUHRE algorithm. This analysis is for the DES integrand “SigmaMiscent”, which has 7 variables of integration. 2 Read data: Note: reading the RDS file is relatively slow. Use the function rds_to_feather to convert the RDS file into a (larger, but faster to read) ...
2339 sym R (2267 sym/11 pcs) 6 img
Programming GPUs: A taste of CUDA and Kokkos
Programming GPUs: A taste of CUDA and Kokkos. Marc Paterno, Fermi National Accelerator Laboratory, 2022-01-21. Introduction: Goal of this presentation. The goal of this presentation is to give the flavor of GPU programming, as opposed to both CPU (“normal”) programming and the use of GPU-accelerated libraries or tools (such as are common in the mac...
15179 sym R (3035 sym/12 pcs) 2 img
Analysis of m-Cubes error estimates
Analysis of mCubes error estimates, 2021-12-06. 1 Read the data: We read the raw data and transform it into the dataframe x: ## Rows: 4,100 ## Columns: 21 ## $ id <fct> f2 6D, f2 6D, f2 6D, f2 6D, f2 6D, f2 6D, f2 6D, f2 6D… ## $ integrand <fct> f2, f2, f2, f2, f2, f2, f2, f2, f2, f2, f2, f2, f2, f2… ## $ ndim <int...
1999 sym R (1772 sym/1 pcs) 5 img