<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Documentation – EC3 Applications and Tools</title><link>/users/compute/orchestration/im/ec3/apps/</link><description>Recent content in EC3 Applications and Tools on Documentation</description><generator>Hugo -- gohugo.io</generator><atom:link href="/users/compute/orchestration/im/ec3/apps/index.xml" rel="self" type="application/rss+xml"/><item><title>Users: ECAS</title><link>/users/compute/orchestration/im/ec3/apps/ecas/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>/users/compute/orchestration/im/ec3/apps/ecas/</guid><description>
&lt;p>The following guide is intended for researchers who want to use ECAS, a complete
environment enabling data analysis experiments, in the EGI cloud.&lt;/p>
&lt;p>ECAS (ENES Climate Analytics Service) is part of the EOSC-hub service catalog
and aims to:&lt;/p>
&lt;ol>
&lt;li>provide server-based computation,&lt;/li>
&lt;li>avoid data transfer, and&lt;/li>
&lt;li>improve reusability of data and workflows.&lt;/li>
&lt;/ol>
&lt;p>It relies on &lt;a href="https://ophidia.cmcc.it/">Ophidia&lt;/a>, a data analytics framework for
eScience, which provides declarative, server-side, and parallel data analysis,
jointly with an internal storage model able to efficiently deal with
multidimensional data and a hierarchical data organization to manage large data
volumes (&amp;ldquo;datacubes&amp;rdquo;), and on JupyterHub, to give users access to ready-to-use
computational environments and resources.&lt;/p>
&lt;p>Thanks to the Elastic Cloud Compute Cluster (EC3) platform, operated by the
&lt;a href="https://www.upv.es/index-en.html">Polytechnic University of Valencia (UPV)&lt;/a>,
researchers will be able to rely on the EGI Cloud Compute service to scale up to
larger simulations without being worried about the complexity of the underlying
infrastructure.&lt;/p>
&lt;p>This guide will show how to:&lt;/p>
&lt;ul>
&lt;li>deploy an elastic ECAS cluster of VMs that automatically installs and
configures the whole set of ECAS environment services, i.e. JupyterHub, PyOphidia,
and several Python libraries such as &lt;code>numpy&lt;/code>, &lt;code>matplotlib&lt;/code> and &lt;code>Basemap&lt;/code>;&lt;/li>
&lt;li>perform data-intensive analysis using the Ophidia HPDA framework;&lt;/li>
&lt;li>access the ECAS JupyterHub interface to create and share documents containing
live code, equations, visualizations and explanatory text.&lt;/li>
&lt;/ul>
&lt;h2 id="deploy-an-ecas-cluster-with-ec3">Deploy an ECAS cluster with EC3&lt;/h2>
&lt;p>In the latest release of the EC3 platform, a new Ansible recipe is available
for researchers interested in deploying an ECAS cluster on the EGI Infrastructure.
The next sections provide details on how to configure and deploy an ECAS cluster
on EGI resources.&lt;/p>
&lt;h3 id="configure-and-deploy-the-cluster">Configure and deploy the cluster&lt;/h3>
&lt;p>To configure and deploy a Virtual Elastic Cluster using EC3, access the
&lt;a href="https://servproject.i3m.upv.es/ec3-ltos/index.php">EC3 platform front page&lt;/a> and
click on the &lt;strong>&amp;quot;Deploy your cluster&amp;quot;&lt;/strong> link as shown in the figure below:&lt;/p>
&lt;p>&lt;img src="ecas-front.png" alt="EC3 front page">&lt;/p>
&lt;p>A wizard will guide you through the cluster configuration process. Specifically,
the general wizard steps include:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>LRMS selection&lt;/strong>: choose &lt;strong>ECAS&lt;/strong> from the list of LRMSs (Local Resource
Management System) that can be automatically installed and configured by EC3.&lt;/li>
&lt;/ul>
&lt;p>&lt;img src="ecas-lrms.png" alt="LRMS selection">&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Endpoint&lt;/strong>: the provider endpoints on which to deploy the ECAS elastic
cluster. The endpoints serving the &lt;code>vo.access.egi.eu&lt;/code> VO are dynamically
retrieved from the cloud information system using REST APIs.&lt;/li>
&lt;/ul>
&lt;p>&lt;img src="ecas-endpoint.png" alt="FedCloud endpoint">&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Operating System&lt;/strong>: choose EGI CentOS7 as cluster OS.&lt;/li>
&lt;/ul>
&lt;p>&lt;img src="ecas-os.png" alt="Cluster OS">&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Instance details&lt;/strong>, in terms of CPU and RAM to allocate for the front-end
and the working nodes.&lt;/li>
&lt;/ul>
&lt;p>&lt;img src="ecas-instance.png" alt="Instance details">&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Cluster&amp;rsquo;s size and name&lt;/strong>: the name of the cluster and the maximum number of
nodes of the cluster, without including the frontend. This value indicates the
maximum number of working nodes that the cluster can scale to. Initially, the
cluster is created with the frontend and only one working node: the other
working nodes will be powered on on-demand.&lt;/li>
&lt;/ul>
&lt;p>&lt;img src="ecas-size.png" alt="Cluster size and name">&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Resume and Launch&lt;/strong>: a summary of the chosen cluster configuration. To start
the deployment process, click the Submit button.&lt;/li>
&lt;/ul>
&lt;p>&lt;img src="ecas-summary.png" alt="Cluster summary">&lt;/p>
&lt;p>When the frontend node of the cluster has been successfully deployed, you
will be notified with the credentials needed to access it via SSH.&lt;/p>
&lt;p>&lt;img src="ecas-end.png" alt="Cluster ssh credentials">&lt;/p>
&lt;p>The cluster details are available by clicking on the &amp;quot;Manage your deployed
clusters&amp;quot; link on the front page:&lt;/p>
&lt;p>&lt;img src="ecas-manage.png" alt="Cluster management">&lt;/p>
&lt;div class="alert alert-primary" role="alert">
&lt;h4 class="alert-heading">Note&lt;/h4>
The configuration of the cluster may
take some time. Please wait for its completion before starting to use the
cluster.
&lt;/div>
&lt;h3 id="accessing-the-cluster">Accessing the cluster&lt;/h3>
&lt;p>To access the frontend of the cluster:&lt;/p>
&lt;ul>
&lt;li>download the SSH private key provided by the EC3 portal;&lt;/li>
&lt;li>change its permissions to &lt;code>600&lt;/code>;&lt;/li>
&lt;li>access via SSH providing the key as identity file for public key
authentication.&lt;/li>
&lt;/ul>
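&lt;p>As a minimal sketch of the first two steps, assuming the key downloaded from
the EC3 portal was saved as &lt;code>key.pem&lt;/code> in the current directory (the helper name
&lt;code>prepare_key&lt;/code> is hypothetical and only used for illustration):&lt;/p>

```shell
# Hypothetical helper: restrict the private key to its owner, then show the result.
# Assumption: the key was saved as key.pem unless another path is given.
prepare_key() {
    key_file="${1:-key.pem}"
    # SSH refuses private keys that are readable by other users.
    chmod 600 "$key_file"
    # The listing should start with -rw-------
    ls -l "$key_file"
}
```

&lt;p>Calling &lt;code>prepare_key key.pem&lt;/code> before the SSH session keeps the permission
change and its verification in one place.&lt;/p>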
&lt;div class="highlight">&lt;pre tabindex="0" style="background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-shell" data-lang="shell">&lt;span style="display:flex;">&lt;span>&lt;span style="color:#ce5c00;font-weight:bold">[&lt;/span>user@localhost EC3&lt;span style="color:#ce5c00;font-weight:bold">]&lt;/span>$ ssh -i key.pem cloudadm@&amp;lt;YOUR_CLUSTER_IP&amp;gt;
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>Last login: Mon Nov &lt;span style="color:#0000cf;font-weight:bold">18&lt;/span> 11:37:29 &lt;span style="color:#0000cf;font-weight:bold">2019&lt;/span> from torito.i3m.upv.es
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#ce5c00;font-weight:bold">[&lt;/span>cloudadm@oph-server ~&lt;span style="color:#ce5c00;font-weight:bold">]&lt;/span>$ sudo su -
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#ce5c00;font-weight:bold">[&lt;/span>root@oph-server ~&lt;span style="color:#ce5c00;font-weight:bold">]&lt;/span>&lt;span style="color:#8f5902;font-style:italic">#&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Both the frontend and the working nodes are configured by Ansible. This
process usually takes some time. You can monitor the status of the cluster
configuration using the &lt;code>is_cluster_ready&lt;/code> command-line tool:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-shell" data-lang="shell">&lt;span style="display:flex;">&lt;span>&lt;span style="color:#ce5c00;font-weight:bold">[&lt;/span>root@oph-server ~&lt;span style="color:#ce5c00;font-weight:bold">]&lt;/span>&lt;span style="color:#8f5902;font-style:italic"># is_cluster_ready&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>Cluster is still configuring.
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>The cluster is successfully configured when the command returns the following
message:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-shell" data-lang="shell">&lt;span style="display:flex;">&lt;span>&lt;span style="color:#ce5c00;font-weight:bold">[&lt;/span>root@oph-server ~&lt;span style="color:#ce5c00;font-weight:bold">]&lt;/span>&lt;span style="color:#8f5902;font-style:italic"># is_cluster_ready&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>Cluster configured!
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>As Slurm is used as the workload manager, you can check the status of the
working nodes with the &lt;code>sinfo&lt;/code> command, which reports information about Slurm
nodes and partitions.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-shell" data-lang="shell">&lt;span style="display:flex;">&lt;span>&lt;span style="color:#ce5c00;font-weight:bold">[&lt;/span>root@oph-server ~&lt;span style="color:#ce5c00;font-weight:bold">]&lt;/span>&lt;span style="color:#8f5902;font-style:italic"># sinfo&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>debug* up infinite &lt;span style="color:#0000cf;font-weight:bold">1&lt;/span> down* oph-io2
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>debug* up infinite &lt;span style="color:#0000cf;font-weight:bold">1&lt;/span> idle oph-io1
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="accessing-the-scientific-eco-system">Accessing the scientific eco-system&lt;/h3>
&lt;p>ECAS provides two different ways to access its scientific eco-system:
the Ophidia client (&lt;code>oph_term&lt;/code>) and JupyterHub.&lt;/p>
&lt;h4 id="perform-some-basic-operations-with-ophidia">Perform some basic operations with Ophidia&lt;/h4>
&lt;p>Run the Ophidia terminal as the &lt;code>ophuser&lt;/code> user.&lt;/p>
&lt;p>&lt;img src="ecas-oph_term.png" alt="Ophidia terminal">&lt;/p>
&lt;p>The default parameters are already defined as environment variables inside the
&lt;code>.bashrc&lt;/code> file:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-shell" data-lang="shell">&lt;span style="display:flex;">&lt;span>&lt;span style="color:#204a87">export&lt;/span> &lt;span style="color:#000">OPH_SERVER_HOST&lt;/span>&lt;span style="color:#ce5c00;font-weight:bold">=&lt;/span>&lt;span style="color:#4e9a06">&amp;#34;127.0.0.1&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#204a87">export&lt;/span> &lt;span style="color:#000">OPH_SERVER_PORT&lt;/span>&lt;span style="color:#ce5c00;font-weight:bold">=&lt;/span>&lt;span style="color:#4e9a06">&amp;#34;11732&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#204a87">export&lt;/span> &lt;span style="color:#000">OPH_PASSWD&lt;/span>&lt;span style="color:#ce5c00;font-weight:bold">=&lt;/span>&lt;span style="color:#4e9a06">&amp;#34;abcd&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#204a87">export&lt;/span> &lt;span style="color:#000">OPH_USER&lt;/span>&lt;span style="color:#ce5c00;font-weight:bold">=&lt;/span>&lt;span style="color:#4e9a06">&amp;#34;oph-test&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Create an empty container and a new datacube with random data and dimensions.&lt;/p>
&lt;p>&lt;img src="ecas-container-1.png" alt="Create container">&lt;/p>
&lt;p>&lt;img src="ecas-container-2.png" alt="Create random datacube">&lt;/p>
&lt;p>Now you can submit your first data transformation operation: let&amp;rsquo;s reduce
the whole datacube to a single value per grid point by averaging along the
time dimension:&lt;/p>
&lt;p>&lt;img src="ecas-reduce.png" alt="Reduce datacube">&lt;/p>
&lt;p>Let&amp;rsquo;s have a look at the environment by listing the datacubes and containers in
the session:&lt;/p>
&lt;p>&lt;img src="ecas-list.png" alt="List objects in session">&lt;/p>
&lt;p>By default, the Ophidia terminal will use the last output datacube PID. So, you
can use the &lt;code>oph_explorecube&lt;/code> operator to visualize the first 100 values.&lt;/p>
&lt;p>&lt;img src="ecas-explore.png" alt="Explore cube">&lt;/p>
&lt;p>For further details about the Ophidia operators, please refer to the official
&lt;a href="https://ophidia.cmcc.it/">documentation&lt;/a>.&lt;/p>
&lt;h4 id="accessing-the-jupyter-interface">Accessing the Jupyter interface&lt;/h4>
&lt;p>To access the Jupyter interface, open the browser at
&lt;code>https://&amp;lt;YOUR_CLUSTER_IP&amp;gt;:443/jupyter&lt;/code> and log in to the system using the
username and password specified in the &lt;code>jupyterhub_config.py&lt;/code> configuration
file (see the &lt;code>c.Authenticator.whitelist&lt;/code> and &lt;code>c.DummyAuthenticator.password&lt;/code>
lines) located under the &lt;code>/root&lt;/code> folder.&lt;/p>
&lt;p>&lt;img src="ecas-jupyterhub.png" alt="JupyterHub interface">&lt;/p>
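&lt;p>The two configuration lines mentioned above can be printed from a root shell
on the frontend. The sketch below assumes the file is
&lt;code>/root/jupyterhub_config.py&lt;/code>; the function name &lt;code>show_jupyter_credentials&lt;/code>
is hypothetical and only used for illustration:&lt;/p>

```shell
# Hypothetical helper: print the JupyterHub user list and password lines.
# Assumption: run as root on the frontend; the default path follows the text above.
show_jupyter_credentials() {
    config_file="${1:-/root/jupyterhub_config.py}"
    # Match only the two settings named in this guide.
    grep -E 'c\.Authenticator\.whitelist|c\.DummyAuthenticator\.password' "$config_file"
}
```

&lt;p>Run &lt;code>show_jupyter_credentials&lt;/code> without arguments to read the default
location, or pass another path if the file lives elsewhere on your deployment.&lt;/p>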
&lt;p>From JupyterHub in ECAS you can do several things such as:&lt;/p>
&lt;ul>
&lt;li>create and run a Jupyter Notebook exploiting PyOphidia and other Python
libraries for data manipulation, analysis and visualization (e.g. NumPy,
matplotlib, Cartopy);&lt;/li>
&lt;li>browse the directories, download and update files in the home folder;&lt;/li>
&lt;li>execute operators and workflows directly from the Ophidia Terminal;&lt;/li>
&lt;li>access a read-only data repository hosted in a Onedata space and perform
any analysis on this shared data.&lt;/li>
&lt;/ul>
&lt;p>The ECAS space shared in the ECAS environment through the Onedata services is
available at the &lt;code>onedata/ecas_provider/ECAS_space&lt;/code> folder located under the
&lt;code>/data&lt;/code> directory.&lt;/p>
&lt;p>&lt;img src="ecas-space.png" alt="Onedata ECAS space">&lt;/p>
&lt;p>To get started with the ECAS environment capabilities, open the
&lt;code>ECAS_Basics.ipynb&lt;/code> notebook available under the &lt;code>notebooks/&lt;/code> folder in the home
directory.&lt;/p>
&lt;p>&lt;img src="ecas-jupyter.png" alt="ECAS notebooks">&lt;/p>
&lt;h3 id="accessing-the-grafana-ui">Accessing the Grafana UI&lt;/h3>
&lt;p>This section will show how to monitor the ECAS environment and the resource
usage and get aggregated information over time.&lt;/p>
&lt;p>To access the Grafana monitoring interface, open the browser at
&lt;code>https://&amp;lt;YOUR_CLUSTER_IP&amp;gt;:3000&lt;/code> and log in to the system using the &lt;em>admin&lt;/em>
username and the password specified in the &lt;code>.grafana_pwd&lt;/code> file located under the
&lt;code>/root&lt;/code> folder.&lt;/p>
&lt;p>&lt;img src="grafana-login.png" alt="Grafana UI">&lt;/p>
&lt;p>The Grafana-based monitoring system provides two dashboards in order to monitor
the ECAS cluster both at system and application level.&lt;/p>
&lt;ul>
&lt;li>The &lt;strong>infrastructure dashboard&lt;/strong> provides information about the percentage of
CPU, RAM, swap and disk used on each node (the frontend and the working
nodes).&lt;/li>
&lt;/ul>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th style="text-align:center">frontend node&lt;/th>
&lt;th style="text-align:center">working node&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td style="text-align:center">&lt;img src="ecas-infra-server.png" alt="Front-end infra metrics">&lt;/td>
&lt;td style="text-align:center">&lt;img src="ecas-infra-io.png" alt="wn infra metrics">&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;ul>
&lt;li>The &lt;strong>application dashboard&lt;/strong> shows information about which operator/workflow
is being executed and its current execution status and provides aggregated
information over time (e.g. number of total, completed and failed
workflows/tasks, hourly weighted average of running cores).&lt;/li>
&lt;/ul>
&lt;p>&lt;img src="ecas-application-dashboard.png" alt="Application metrics dashboard">&lt;/p>
&lt;h3 id="destroy-the-cluster">Destroy the cluster&lt;/h3>
&lt;p>To destroy the running cluster use the &lt;code>delete&lt;/code> action from the cluster
management page.&lt;/p>
&lt;p>&lt;img src="ecas-manage.png" alt="Destroy cluster">&lt;/p>
&lt;h2 id="references">References&lt;/h2>
&lt;ul>
&lt;li>&lt;a href="https://ophidia.cmcc.it/">Ophidia&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://github.com/ECAS-Lab">GitHub: ECAS-Lab&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://github.com/OphidiaBigData/ansible-role-ophidia-cluster">GitHub: Ansible role Ophidia cluster&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://www.grycap.upv.es/ec3">EC3&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://www.github.com/grycap/ec3">GitHub EC3&lt;/a>&lt;/li>
&lt;/ul></description></item><item><title>Users: HTC</title><link>/users/compute/orchestration/im/ec3/apps/htc/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>/users/compute/orchestration/im/ec3/apps/htc/</guid><description>
&lt;h2 id="templates">Templates&lt;/h2>
&lt;p>We will build a torque cluster on one of the EGI Cloud providers using EC3.
Create a directory to store the EC3 configuration and initialize it with the
&lt;a href="../../../../../../getting-started/cli">FedCloud client&lt;/a>:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-shell" data-lang="shell">&lt;span style="display:flex;">&lt;span>mkdir -p torque
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#204a87">cd&lt;/span> torque
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>fedcloud ec3 init --site &amp;lt;your site&amp;gt; --vo &amp;lt;your vo&amp;gt;
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>We will use the following templates:&lt;/p>
&lt;ol>
&lt;li>&lt;code>torque&lt;/code> (from the EC3 default templates),&lt;/li>
&lt;li>&lt;code>nfs&lt;/code> (from the EC3 default templates),&lt;/li>
&lt;li>&lt;code>ubuntu-1604&lt;/code> (user&amp;rsquo;s template),&lt;/li>
&lt;li>&lt;code>cluster_configure&lt;/code> (user&amp;rsquo;s template).&lt;/li>
&lt;/ol>
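&lt;p>Before launching, it can help to confirm that the two user templates exist
under the &lt;code>templates/&lt;/code> directory. The following is a minimal sketch, not part of
EC3 itself; the function name &lt;code>check_templates&lt;/code> is hypothetical, and it assumes
you run it from the directory created above:&lt;/p>

```shell
# Hypothetical pre-flight check for the user-provided RADL templates.
# Assumption: run from the directory that contains the templates/ folder.
check_templates() {
    missing=0
    for name in ubuntu-1604 cluster_configure; do
        if [ ! -f "templates/${name}.radl" ]; then
            echo "missing: templates/${name}.radl"
            missing=1
        fi
    done
    # Exit status 0 when both files are present, 1 otherwise.
    return "$missing"
}
```

&lt;p>The &lt;code>torque&lt;/code> and &lt;code>nfs&lt;/code> templates are not checked here because they come
from the EC3 defaults.&lt;/p>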
&lt;p>The content of the user templates is shown below (make sure that you adapt them to your needs):&lt;/p>
&lt;p>&lt;code>templates/ubuntu-1604.radl&lt;/code> specifies the VM image to use in the deployment:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-plaintext" data-lang="plaintext">&lt;span style="display:flex;">&lt;span>description ubuntu-1604 (
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> kind = &amp;#39;images&amp;#39; and
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> short = &amp;#39;Ubuntu 16.04&amp;#39; and
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> content = &amp;#39;FEDCLOUD Image for EGI Ubuntu 16.04 LTS [Ubuntu/16.04/VirtualBox]&amp;#39;
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>)
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>system front (
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> cpu.arch = &amp;#39;x86_64&amp;#39; and
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> cpu.count &amp;gt;= 4 and
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> memory.size &amp;gt;= 8196 and
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> disk.0.os.name = &amp;#39;linux&amp;#39; and
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> disk.0.image.url = &amp;#39;ost://&amp;lt;url&amp;gt;/&amp;lt;image_id&amp;gt;&amp;#39; and
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> disk.0.os.credentials.username = &amp;#39;ubuntu&amp;#39;
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>)
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>system wn (
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> cpu.arch = &amp;#39;x86_64&amp;#39; and
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> cpu.count &amp;gt;= 2 and
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> memory.size &amp;gt;= 2048m and
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> ec3_max_instances = 10 and # maximum number of working nodes in the cluster
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> disk.0.os.name = &amp;#39;linux&amp;#39; and
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> disk.0.image.url = &amp;#39;ost://&amp;lt;url&amp;gt;/&amp;lt;image_id&amp;gt;&amp;#39; and
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> disk.0.os.credentials.username = &amp;#39;ubuntu&amp;#39;
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>)
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;code>templates/cluster_configure.radl&lt;/code> customises the torque deployment to match our
needs:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-plaintext" data-lang="plaintext">&lt;span style="display:flex;">&lt;span>configure front (
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>@begin
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>---
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> - vars:
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> - USERS:
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> - { name: user01, password: &amp;lt;PASSWORD&amp;gt; }
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> - { name: user02, password: &amp;lt;PASSWORD&amp;gt; }
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>[..]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> tasks:
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> - user:
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> name: &amp;#34;{{ item.name }}&amp;#34;
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> password: &amp;#34;{{ item.password }}&amp;#34;
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> shell: /bin/bash
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> append: yes
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> state: present
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> with_items: &amp;#34;{{ USERS }}&amp;#34;
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> - name: Install missing dependences in Debian system
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> apt: pkg={{ item }} state=present
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> with_items:
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> - build-essential
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> - mpich
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> - gcc
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> - g++
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> - vim
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> become: yes
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> when: ansible_os_family == &amp;#34;Debian&amp;#34;
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> - name: SSH without password
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> include_role:
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> name: grycap.ssh
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> vars:
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> ssh_type_of_node: front
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> ssh_user: &amp;#34;{{ user.name }}&amp;#34;
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> loop: &amp;#39;{{ USERS }}&amp;#39;
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> loop_control:
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> loop_var: user
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> - name: Updating the /etc/hosts.allow file
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> lineinfile:
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> path: /etc/hosts.allow
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> line: &amp;#39;sshd: XXX.XXX.XXX.*&amp;#39;
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> become: yes
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> - name: Updating the /etc/hosts.deny file
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> lineinfile:
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> path: /etc/hosts.deny
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> line: &amp;#39;ALL: ALL&amp;#39;
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> become: yes
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>@end
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>)
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>configure wn (
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>@begin
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>---
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> - vars:
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> - USERS:
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> - { name: user01, password: &amp;lt;PASSWORD&amp;gt; }
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> - { name: user02, password: &amp;lt;PASSWORD&amp;gt; }
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>[..]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> tasks:
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> - user:
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> name: &amp;#34;{{ item.name }}&amp;#34;
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> password: &amp;#34;{{ item.password }}&amp;#34;
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> shell: /bin/bash
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> append: yes
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> state: present
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> with_items: &amp;#34;{{ USERS }}&amp;#34;
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> - name: Install missing dependences in Debian system
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> apt: pkg={{ item }} state=present
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> with_items:
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> - build-essential
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> - mpich
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> - gcc
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> - g++
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> - vim
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> become: yes
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> when: ansible_os_family == &amp;#34;Debian&amp;#34;
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> - name: SSH without password
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> include_role:
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> name: grycap.ssh
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> vars:
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> ssh_type_of_node: wn
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> ssh_user: &amp;#34;{{ user.name }}&amp;#34;
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> loop: &amp;#39;{{ USERS }}&amp;#39;
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> loop_control:
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> loop_var: user
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> - name: Updating the /etc/hosts.allow file
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> lineinfile:
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> path: /etc/hosts.allow
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> line: &amp;#39;sshd: XXX.XXX.XXX.*&amp;#39;
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> become: yes
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> - name: Updating the /etc/hosts.deny file
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> lineinfile:
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> path: /etc/hosts.deny
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> line: &amp;#39;ALL: ALL&amp;#39;
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> become: yes
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>@end
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>)
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="create-the-cluster">Create the cluster&lt;/h2>
&lt;p>Deploy the cluster using the EC3 Docker image:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-shell" data-lang="shell">&lt;span style="display:flex;">&lt;span>$ docker run -it -v &lt;span style="color:#000">$PWD&lt;/span>:/root/ -w /root &lt;span style="color:#4e9a06">\
&lt;/span>&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#4e9a06">&lt;/span> grycap/ec3 launch torque_cluster &lt;span style="color:#4e9a06">\
&lt;/span>&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#4e9a06">&lt;/span> torque nfs ubuntu-1604 refresh cluster_configure &lt;span style="color:#4e9a06">\
&lt;/span>&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#4e9a06">&lt;/span> -a auth.dat
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>Creating infrastructure
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>Infrastructure successfully created with ID: 529c62ec-343e-11e9-8b1d-300000000002
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>Front-end state: launching
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>Front-end state: pending
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>Front-end state: running
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>IP: 212.189.145.XXX
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>Front-end configured with IP 212.189.145.XXX
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>Transferring infrastructure
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>Front-end ready!
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>To access the cluster, use the command:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-shell" data-lang="shell">&lt;span style="display:flex;">&lt;span>$ docker run -ti -v &lt;span style="color:#000">$PWD&lt;/span>:/root/ -w /root grycap/ec3 ssh torque_cluster
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>Warning: Permanently added &lt;span style="color:#4e9a06">&amp;#39;212.189.145.140&amp;#39;&lt;/span> &lt;span style="color:#ce5c00;font-weight:bold">(&lt;/span>ECDSA&lt;span style="color:#ce5c00;font-weight:bold">)&lt;/span> to the list of known hosts.
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>Welcome to Ubuntu 14.04.5 LTS &lt;span style="color:#ce5c00;font-weight:bold">(&lt;/span>GNU/Linux 3.13.0-164-generic x86_64&lt;span style="color:#ce5c00;font-weight:bold">)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> * Documentation: https://help.ubuntu.com/
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>Last login: Tue Feb &lt;span style="color:#0000cf;font-weight:bold">19&lt;/span> 13:04:45 &lt;span style="color:#0000cf;font-weight:bold">2019&lt;/span> from servproject.i3m.upv.es
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="configuration-of-the-cluster">Configuration of the cluster&lt;/h2>
&lt;h3 id="enable-password-based-authentication">Enable password-based authentication&lt;/h3>
&lt;p>Set &lt;code>PasswordAuthentication&lt;/code> to &lt;code>yes&lt;/code> in &lt;code>/etc/ssh/sshd_config&lt;/code>:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-plaintext" data-lang="plaintext">&lt;span style="display:flex;">&lt;span># Change to no to disable tunnelled clear text passwords
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>PasswordAuthentication yes
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Then restart the SSH daemon:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-shell" data-lang="shell">&lt;span style="display:flex;">&lt;span>sudo service ssh restart
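&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="configure-the-number-of-processors-of-the-cluster">Configure the number of processors of the cluster&lt;/h3>
&lt;p>Set the number of processors (&lt;code>np&lt;/code>) of each working node in the Torque server&amp;rsquo;s &lt;code>nodes&lt;/code> file:&lt;/p>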
&lt;div class="highlight">&lt;pre tabindex="0" style="background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-shell" data-lang="shell">&lt;span style="display:flex;">&lt;span>$ cat /var/spool/torque/server_priv/nodes
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>wn1 &lt;span style="color:#000">np&lt;/span>&lt;span style="color:#ce5c00;font-weight:bold">=&lt;/span>XX
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>wn2 &lt;span style="color:#000">np&lt;/span>&lt;span style="color:#ce5c00;font-weight:bold">=&lt;/span>XX
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#ce5c00;font-weight:bold">[&lt;/span>...&lt;span style="color:#ce5c00;font-weight:bold">]&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>To obtain the number of CPU cores (&lt;code>np&lt;/code>) in Linux, use the command:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-shell" data-lang="shell">&lt;span style="display:flex;">&lt;span>$ lscpu &lt;span style="color:#000;font-weight:bold">|&lt;/span> grep -i CPU
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>CPU op-mode&lt;span style="color:#ce5c00;font-weight:bold">(&lt;/span>s&lt;span style="color:#ce5c00;font-weight:bold">)&lt;/span>: 32-bit, 64-bit
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>CPU&lt;span style="color:#ce5c00;font-weight:bold">(&lt;/span>s&lt;span style="color:#ce5c00;font-weight:bold">)&lt;/span>: &lt;span style="color:#0000cf;font-weight:bold">16&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>On-line CPU&lt;span style="color:#ce5c00;font-weight:bold">(&lt;/span>s&lt;span style="color:#ce5c00;font-weight:bold">)&lt;/span> list: 0-15
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>CPU family: &lt;span style="color:#0000cf;font-weight:bold">6&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>Model name: Intel&lt;span style="color:#ce5c00;font-weight:bold">(&lt;/span>R&lt;span style="color:#ce5c00;font-weight:bold">)&lt;/span> Xeon&lt;span style="color:#ce5c00;font-weight:bold">(&lt;/span>R&lt;span style="color:#ce5c00;font-weight:bold">)&lt;/span> CPU E5520 @ 2.27GHz
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>CPU MHz: 2266.858
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>NUMA node0 CPU&lt;span style="color:#ce5c00;font-weight:bold">(&lt;/span>s&lt;span style="color:#ce5c00;font-weight:bold">)&lt;/span>: 0-3,8-11
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>NUMA node1 CPU&lt;span style="color:#ce5c00;font-weight:bold">(&lt;/span>s&lt;span style="color:#ce5c00;font-weight:bold">)&lt;/span>: 4-7,12-15
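&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#8f5902;font-style:italic"># Alternative sketch, assuming GNU coreutils is installed on the node: nproc prints only the number of usable CPUs&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>$ nproc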
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="test-the-cluster">Test the cluster&lt;/h3>
&lt;p>Create a simple test script:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-shell" data-lang="shell">&lt;span style="display:flex;">&lt;span>$ cat test.sh
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#8f5902;font-style:italic">#!/bin/bash&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#8f5902;font-style:italic">#PBS -N job&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#8f5902;font-style:italic">#PBS -q batch&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#8f5902;font-style:italic">#cd $PBS_O_WORKDIR/&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>hostname -f
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>sleep &lt;span style="color:#0000cf;font-weight:bold">5&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Submit it to the batch queue:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-shell" data-lang="shell">&lt;span style="display:flex;">&lt;span>qsub -l &lt;span style="color:#000">nodes&lt;/span>&lt;span style="color:#ce5c00;font-weight:bold">=&lt;/span>&lt;span style="color:#0000cf;font-weight:bold">2&lt;/span> test.sh
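&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#8f5902;font-style:italic"># Optional check (a sketch, assuming the standard Torque client tools are on the PATH): list the submitted job and its state&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>qstat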
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="destroy-the-cluster">Destroy the cluster&lt;/h2>
&lt;p>To destroy the running cluster, use the command:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="background-color:#f8f8f8;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-shell" data-lang="shell">&lt;span style="display:flex;">&lt;span>$ docker run -it -v &lt;span style="color:#000">$PWD&lt;/span>:/root/ -w /root grycap/ec3 destroy torque_cluster
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>WARNING: you are going to delete the infrastructure &lt;span style="color:#ce5c00;font-weight:bold">(&lt;/span>including frontend and nodes&lt;span style="color:#ce5c00;font-weight:bold">)&lt;/span>.
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>Continue &lt;span style="color:#ce5c00;font-weight:bold">[&lt;/span>y/N&lt;span style="color:#ce5c00;font-weight:bold">]&lt;/span>? y
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>Success deleting the cluster!
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div></description></item></channel></rss>