<div class="reveal">
<div class="slides">
<section id="title-slide">
<h1 class="title">Docker for research</h1>
<p class="subtitle">… and data analysis</p>
<p class="author">J. Fernando Sánchez (<a href="mailto:jf.sanchez@upm">jf.sanchez@upm</a>)</p>
<p class="date">2018</p>
</section>
<section><section id="intro" class="title-slide slide level1 white" data-background="img/intro.jpg"><h1>Intro</h1></section><section id="before-we-begin" class="slide level2">
<h2>Before we begin</h2>
<p>Code available at:</p>
<p><a href="" class="uri"></a></p>
<p>Live demos at:</p>
<p><strong><a href="" class="uri"></a></strong></p>
<p><a href="" class="uri"></a></p>
<p><a href="" class="uri"></a></p>
<p>Feel free to log in, but try not to break them for now 😉</p>
</section><section id="my-name-is-fernando-and" class="slide level2">
<h2>My name is Fernando and…</h2>
<p><img data-src="img/im-a-researcher.jpg" /></p>
</section><section id="at-grupo-de-sistemas-inteligentes" class="slide level2">
<h2>At Grupo de Sistemas Inteligentes</h2>
<div class="columns">
<div class="column" style="width:50%;">
<p><img data-src="img/gsi.png" /></p>
</div><div class="column" style="width:50%;">
<ul>
<li>Machine Learning and Big Data</li>
<li>NLP and Sentiment Analysis</li>
<li>Social Network Analysis</li>
<li>Agents and Simulation</li>
<li>Linked Data and Semantic Technologies</li>
</ul>
<p><a href="" class="uri"></a></p>
</div>
</div>
</section><section id="and-i-docker" class="slide level2">
<h2>And I ❤ Docker</h2>
<div class="columns">
<div class="column" style="width:50%;">
<p><img data-src="img/docker.jpg" /></p>
</div><div class="column" style="width:50%;">
<ul>
<li>Docker+research for 3+ years</li>
<li>Advocate for ~2 years</li>
<li>Internal infrastructure: Ansible, k8s and Docker</li>
<li>Teach (with) it</li>
</ul>
</div>
</div>
</section><section id="about-this-talk" class="slide level2">
<h2>About this talk</h2>
<p>Takeaway: <strong><em>you can set up a multi-user data analysis environment with isolation in minutes</em></strong></p>
<p>Plus: using Docker to run and share experiments is even easier</p>
<p>Related Meetups:</p>
<p><a href="">Big Data and Machine Learning with Docker</a></p>
<p><a href="">Using Docker in Machine Learning Projects</a></p>
</section></section>
<section><section id="for-researchers" class="title-slide slide level1 white" data-background="img/research.jpg" style="color:white"><h1>For researchers</h1></section><section id="experiment-publish-repeat" class="slide level2">
<h2>Experiment, publish, repeat</h2>
<p><img data-src="img/peerreview.jpg" /></p>
</section><section id="reproducibility" class="slide level2">
<figure>
<img data-src="img/goodluck.png" alt="@ianholmes" /><figcaption><a href="">@ianholmes</a></figcaption>
</figure>
</section><section id="obstacles" class="slide level2">
<div class="columns">
<div class="column" style="width:50%;">
<ul>
<li><strong>Missing data</strong></li>
<li>Bleeding edge tools and libraries</li>
<li>Throwaway software
<ul>
<li>Little to no documentation</li>
<li>Multiple languages</li>
</ul></li>
</ul>
</div><div class="column" style="width:50%;">
<figure>
<img data-src="img/will_it_work.png" alt="" style="height:80.0%" /><figcaption><a href="" class="uri"></a></figcaption>
</figure>
</div>
</div>
</section><section id="obstacles-1" class="slide level2">
<p><img data-src="img/noidea-pc.png" /></p>
</section><section id="is-it-a-problem" class="slide level2">
<h2>Is it a problem?</h2>
<figure>
<img data-src="img/reproducibility.jpg" alt="" style="height:80.0%" /><figcaption><a href=""></a></figcaption>
</figure>
</section><section id="jupyter-notebooks" class="slide level2">
<h2>Jupyter notebooks</h2>
<p><img data-src="img/jupyter-screenshot.png" /></p>
</section><section id="jupyter-architecture" class="slide level2">
<h2>Jupyter architecture</h2>
<figure>
<img data-src="img/jupyter-architecture.png" alt="" /><figcaption><a href="" class="uri"></a></figcaption>
</figure>
</section><section id="docker-to-the-rescue" class="slide level2">
<h2>Docker to the rescue</h2>
<figure>
<img data-src="img/dockerrescue.png" alt="" /><figcaption><a href=""></a></figcaption>
</figure>
</section><section id="jupyterdocker-stacks" class="slide level2">
<p><img data-src="img/dockerstacks.png" style="height:50.0%" /></p>
</section><section id="reproducible-environment" class="slide level2">
<h2>Reproducible environment</h2>
<div class="sourceCode" id="cb1"><pre class="sourceCode bash"><code class="sourceCode bash"><a class="sourceLine" id="cb1-1" data-line-number="1"><span class="ex">docker</span> run --rm -p 8888:8888 \</a>
<a class="sourceLine" id="cb1-2" data-line-number="2"> -v <span class="va">${WDIR}</span>/:/home/jovyan/work/ \</a>
<a class="sourceLine" id="cb1-3" data-line-number="3"> jupyter/scipy-notebook</a></code></pre></div>
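<p>The command above can also be kept next to the experiment as a small launcher script. The sketch below is an assumption, not part of the talk: <code>jupyter_run_command</code> and its <code>tag</code> parameter are hypothetical names. The only thing it adds is an explicit image tag, since an unpinned image can change between runs.</p>

```python
import os
import shlex


def jupyter_run_command(workdir, image="jupyter/scipy-notebook", tag="latest", port=8888):
    """Build the `docker run` argument list for a throwaway Jupyter container.

    Hypothetical helper: `tag` lets you pin the image version you tested with.
    """
    host_dir = os.path.abspath(workdir)  # Docker bind mounts need an absolute path
    return [
        "docker", "run", "--rm",
        "-p", f"{port}:8888",                   # host port -> notebook port
        "-v", f"{host_dir}:/home/jovyan/work",  # share the experiment directory
        f"{image}:{tag}",
    ]


if __name__ == "__main__":
    cmd = jupyter_run_command(".")
    # Print a copy-pasteable shell command; nothing is started here.
    print(shlex.join(cmd))
    # To launch directly instead: subprocess.run(cmd, check=True)
```

<p>Committing a script like this with the notebooks documents how the environment was started, which helps with the "partially reproducible" caveat of moving image tags.</p>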
</section><section id="and-friendly-too" class="slide level2">
<h2>And friendly, too</h2>
<div class="sourceCode" id="cb2"><pre class="sourceCode yaml"><code class="sourceCode yaml"><a class="sourceLine" id="cb2-1" data-line-number="1"><span class="fu">version:</span><span class="at"> </span><span class="st">&#39;2&#39;</span></a>
<a class="sourceLine" id="cb2-2" data-line-number="2"><span class="fu">services:</span></a>
<a class="sourceLine" id="cb2-3" data-line-number="3">  <span class="fu">jupyter:</span></a>
<a class="sourceLine" id="cb2-4" data-line-number="4">    <span class="fu">image:</span><span class="at"> jupyter/scipy-notebook</span></a>
<a class="sourceLine" id="cb2-5" data-line-number="5">    <span class="fu">volumes:</span></a>
<a class="sourceLine" id="cb2-6" data-line-number="6">      <span class="kw">-</span> <span class="st">&quot;./.nbconfig:/home/jovyan/.jupyter/nbconfig&quot;</span></a>
<a class="sourceLine" id="cb2-7" data-line-number="7">      <span class="kw">-</span> <span class="st">&quot;./work:/home/jovyan/work/&quot;</span></a>
<a class="sourceLine" id="cb2-8" data-line-number="8">    <span class="fu">ports:</span></a>
<a class="sourceLine" id="cb2-9" data-line-number="9">      <span class="kw">-</span> <span class="st">&quot;8888:8888&quot;</span></a></code></pre></div>
<div class="sourceCode" id="cb3"><pre class="sourceCode bash"><code class="sourceCode bash"><a class="sourceLine" id="cb3-1" data-line-number="1"><span class="ex">docker-compose</span> up</a></code></pre></div>
</section><section id="related-projects" class="slide level2">
<h2>Related projects</h2>
<ul>
<li>Using Docker images to share trained systems</li>
</ul>
<figure>
<img data-src="img/gym.png" alt="" height="500" /><figcaption><a href="" class="uri"></a></figcaption>
</figure>
</section></section>
<section><section id="for-small-groups" class="title-slide slide level1 white" data-background="img/group.jpeg"><h1>For small groups</h1></section><section id="requirements" class="slide level2">
<ul>
<li>Shared environments</li>
<li>Resource sharing</li>
<li>Easy configuration</li>
</ul>
<p>And <strong>little to no overhead</strong></p>
</section><section id="isolation" class="slide level2">
<p><img data-src="img/noidea.jpg" /></p>
</section><section id="jupyterhub" class="slide level2">
<div class="columns">
<div class="column" style="width:60%;">
<figure>
<img data-src="img/jhub-parts.png" alt="" height="500" /><figcaption><a href="" class="uri"></a></figcaption>
</figure>
</div><div class="column" style="width:40%;">
<h4 id="authenticators">Authenticators</h4>
<h4 id="spawners">Spawners</h4>
</div>
</div>
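<p>The two extension points on this slide map to two settings in JupyterHub's config file. A minimal sketch, assuming JupyterHub with the separate <code>dockerspawner</code> package installed. The dummy authenticator, the volume name and the image are illustrative assumptions rather than the setup used in the talk, and option names may differ in older releases.</p>

```python
# jupyterhub_config.py, loaded with: jupyterhub -f jupyterhub_config.py
c = get_config()  # noqa: F821 (injected by JupyterHub when it loads this file)

# Authenticator: decides who may log in.
# Assumption for demos only: the dummy authenticator accepts any credentials.
c.JupyterHub.authenticator_class = "jupyterhub.auth.DummyAuthenticator"

# Spawner: decides where each user's notebook server runs.
# DockerSpawner starts one container per user, which gives the isolation.
c.JupyterHub.spawner_class = "dockerspawner.DockerSpawner"
c.DockerSpawner.image = "jupyter/scipy-notebook"
c.DockerSpawner.remove = True  # delete the container when the server stops

# Keep each user's work in a named volume (hypothetical volume name);
# DockerSpawner expands {username} per user.
c.DockerSpawner.volumes = {"jupyterhub-user-{username}": "/home/jovyan/work"}

# The spawned containers must be able to reach the hub's API;
# the exact networking settings depend on the deployment.
c.JupyterHub.hub_ip = "0.0.0.0"
```

<p>Swapping the authenticator class (for example, for an OAuth or LDAP one) changes who can log in without touching how servers are spawned, and vice versa.</p>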
</section><section id="more-infrastructure" class="slide level2">
<h2>More infrastructure</h2>
<p><img data-src="img/docker-gitlab.jpg" class="noborder" height="250" /> <img data-src="img/nextcloud.jpg" class="noborder" height="250" /></p>
<p><img data-src="img/sharelatex.jpg" class="noborder" height="250" /></p>
</section></section>
<section><section id="demo" class="title-slide slide level1" data-background="img/party.jpg"><h1>Demo</h1></section><section id="its-demo-time" class="slide level2">
<h2>It's demo time</h2>
<p><img data-src="img/demogods.jpg" style="height:80.0%" /></p>
<p><a href="" class="uri"></a> <a href="" class="uri"></a></p>
</section></section>
<section><section id="other-tools" class="title-slide slide level1"><h1>Other tools</h1></section><section id="zeppelin" class="slide level2">
<ul>
<li>Alternative to Jupyter</li>
</ul>
<figure>
<img data-src="img/zeppelin.png" alt="" /><figcaption><a href="" class="uri"></a></figcaption>
</figure>
</section><section id="cocalc" class="slide level2">
<ul>
<li>Alternative to Jupyter</li>
</ul>
<figure>
<img data-src="img/cocalc.png" alt="" height="500" /><figcaption><a href="" class="uri"></a></figcaption>
</figure>
</section><section id="docker-nvidia" class="slide level2">
<ul>
<li>CUDA for Docker</li>
</ul>
<figure>
<img data-src="img/dockernvidia.png" alt="" /><figcaption><a href="" class="uri"></a></figcaption>
</figure>
</section><section id="jupyter-binder" class="slide level2">
<h2>Jupyter Binder</h2>
<ul>
<li>Custom Jupyter from git repositories</li>
</ul>
<figure>
<img data-src="img/binder.png" alt="" height="500" /><figcaption><a href="" class="uri"></a></figcaption>
</figure>
</section><section id="knowledge-repo" class="slide level2">
<figure>
<img data-src="img/knowledgerepo.png" alt="" /><figcaption><a href="" class="uri"></a></figcaption>
</figure>
</section></section>
<section><section id="conclusions" class="title-slide slide level1"><h1>Conclusions</h1></section><section id="lessons-learned" class="slide level2">
<h2>Lessons learned</h2>
<ul>
<li>Docker + Docker-compose
<ul>
<li>Reproducible environments (partially)</li>
<li>Reduced tooling / experience</li>
<li>Ephemeral containers force you to automate/document installation</li>
<li>Shared environments</li>
<li>Web interface (zero knowledge)</li>
</ul></li>
</ul>
</section><section id="whats-missing" class="slide level2">
<h2>What's missing?</h2>
<ul>
<li>Roles and permissions</li>
</ul>
</section><section id="thanks-for-listening" class="slide level2">
<h2>Thanks for listening!</h2>
<p><a href="" class="uri"></a></p>
<p><a href=""></a></p>
</section></section>
</div>
</div>
