1
0
mirror of https://github.com/balkian/lab-in-a-box.git synced 2024-12-22 04:48:12 +00:00
lab-in-a-box/slides/slides.html
J. Fernando Sánchez 83b7b60897 Add slides
2018-05-21 19:30:55 +02:00

378 lines
17 KiB
HTML
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta name="generator" content="pandoc">
<meta name="author" content="J. Fernando Sánchez (jf.sanchez@upm)">
<meta name="dcterms.date" content="2018-01-01">
<title>Docker for research</title>
<meta name="apple-mobile-web-app-capable" content="yes">
<meta name="apple-mobile-web-app-status-bar-style" content="black-translucent">
<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no, minimal-ui">
<link rel="stylesheet" href="reveal.js/css/reveal.css">
<style type="text/css">
code{white-space: pre-wrap;}
span.smallcaps{font-variant: small-caps;}
span.underline{text-decoration: underline;}
div.column{display: inline-block; vertical-align: top; width: 50%;}
</style>
<style type="text/css">
a.sourceLine { display: inline-block; line-height: 1.25; }
a.sourceLine { pointer-events: none; color: inherit; text-decoration: inherit; }
a.sourceLine:empty { height: 1.2em; position: absolute; }
.sourceCode { overflow: visible; }
code.sourceCode { white-space: pre; position: relative; }
div.sourceCode { margin: 1em 0; }
pre.sourceCode { margin: 0; }
@media screen {
div.sourceCode { overflow: auto; }
}
@media print {
code.sourceCode { white-space: pre-wrap; }
a.sourceLine { text-indent: -1em; padding-left: 1em; }
}
pre.numberSource a.sourceLine
{ position: relative; }
pre.numberSource a.sourceLine:empty
{ position: absolute; }
pre.numberSource a.sourceLine::before
{ content: attr(data-line-number);
position: absolute; left: -5em; text-align: right; vertical-align: baseline;
border: none; pointer-events: all;
-webkit-touch-callout: none; -webkit-user-select: none;
-khtml-user-select: none; -moz-user-select: none;
-ms-user-select: none; user-select: none;
padding: 0 4px; width: 4em;
color: #aaaaaa;
}
pre.numberSource { margin-left: 3em; border-left: 1px solid #aaaaaa; padding-left: 4px; }
div.sourceCode
{ }
@media screen {
a.sourceLine::before { text-decoration: underline; }
}
code span.al { color: #ff0000; font-weight: bold; } /* Alert */
code span.an { color: #60a0b0; font-weight: bold; font-style: italic; } /* Annotation */
code span.at { color: #7d9029; } /* Attribute */
code span.bn { color: #40a070; } /* BaseN */
code span.bu { } /* BuiltIn */
code span.cf { color: #007020; font-weight: bold; } /* ControlFlow */
code span.ch { color: #4070a0; } /* Char */
code span.cn { color: #880000; } /* Constant */
code span.co { color: #60a0b0; font-style: italic; } /* Comment */
code span.cv { color: #60a0b0; font-weight: bold; font-style: italic; } /* CommentVar */
code span.do { color: #ba2121; font-style: italic; } /* Documentation */
code span.dt { color: #902000; } /* DataType */
code span.dv { color: #40a070; } /* DecVal */
code span.er { color: #ff0000; font-weight: bold; } /* Error */
code span.ex { } /* Extension */
code span.fl { color: #40a070; } /* Float */
code span.fu { color: #06287e; } /* Function */
code span.im { } /* Import */
code span.in { color: #60a0b0; font-weight: bold; font-style: italic; } /* Information */
code span.kw { color: #007020; font-weight: bold; } /* Keyword */
code span.op { color: #666666; } /* Operator */
code span.ot { color: #007020; } /* Other */
code span.pp { color: #bc7a00; } /* Preprocessor */
code span.sc { color: #4070a0; } /* SpecialChar */
code span.ss { color: #bb6688; } /* SpecialString */
code span.st { color: #4070a0; } /* String */
code span.va { color: #19177c; } /* Variable */
code span.vs { color: #4070a0; } /* VerbatimString */
code span.wa { color: #60a0b0; font-weight: bold; font-style: italic; } /* Warning */
</style>
<link rel="stylesheet" href="reveal.js/css/theme/white.css" id="theme">
<link rel="stylesheet" href="style.css"/>
<!-- Printing and PDF exports -->
<script>
var link = document.createElement( 'link' );
link.rel = 'stylesheet';
link.type = 'text/css';
link.href = window.location.search.match( /print-pdf/gi ) ? 'reveal.js/css/print/pdf.css' : 'reveal.js/css/print/paper.css';
document.getElementsByTagName( 'head' )[0].appendChild( link );
</script>
<!--[if lt IE 9]>
<script src="reveal.js/lib/js/html5shiv.js"></script>
<![endif]-->
</head>
<body>
<div class="reveal">
<div class="slides">
<section id="title-slide">
<h1 class="title">Docker for research</h1>
<p class="subtitle">… and data analysis</p>
<p class="author">J. Fernando Sánchez (<a href="mailto:jf.sanchez@upm">jf.sanchez@upm</a>)</p>
<p class="date">2018</p>
</section>
<section><section id="intro" class="title-slide slide level1 white" data-background="img/intro.jpg"><h1>Intro</h1></section><section id="before-we-begin" class="slide level2">
<h2>Before we begin</h2>
<p>Code available at:</p>
<p><a href="https://github.com/balkian/lab-in-a-box" class="uri">https://github.com/balkian/lab-in-a-box</a></p>
<p>Live demos at:</p>
<p><strong><a href="https://github.todevnull.com" class="uri">https://github.todevnull.com</a></strong></p>
<p><a href="https://lab.todevnull.com" class="uri">https://lab.todevnull.com</a></p>
<p><a href="https://hub.todevnull.com" class="uri">https://hub.todevnull.com</a></p>
<p>Feel free to log in, but try not to break them for now 😉</p>
</section><section id="my-name-is-fernando-and" class="slide level2">
<h2>My name is Fernando and…</h2>
<p><img data-src="img/im-a-researcher.jpg" /></p>
</section><section id="at-grupo-de-sistemas-inteligentes" class="slide level2">
<h2>At Grupo de Sistemas Inteligentes</h2>
<div class="columns">
<div class="column" style="width:50%;">
<p><img data-src="img/gsi.png" /></p>
</div><div class="column" style="width:50%;">
<ul>
<li>Machine Learning and Big Data</li>
<li>NLP and Sentiment Analysis</li>
<li>Social Network Analysis</li>
<li>Agents and Simulation</li>
<li>Linked Data and Semantic Technologies</li>
</ul>
</div>
</div>
<p><a href="http://www.gsi.dit.upm.es" class="uri">http://www.gsi.dit.upm.es</a></p>
</section><section id="and-i-docker" class="slide level2">
<h2>And I ❤ Docker</h2>
<div class="columns">
<div class="column" style="width:50%;">
<p><img data-src="img/docker.jpg" /></p>
</div><div class="column" style="width:50%;">
<ul>
<li>Docker+research for 3+ years</li>
<li>Advocate for ~2 years</li>
<li>Internal infrastructure: ansible, k8s and docker</li>
<li>Teach (with) it</li>
</ul>
</div>
</div>
</section><section id="about-this-talk" class="slide level2">
<h2>About this talk</h2>
<p>Takeaway: <strong><em>you can set up a multi-user data analysis environment with isolation in minutes</em></strong></p>
<p>Plus: using docker to perform and share experiments is even easier</p>
<p>Related Meetups:</p>
<p><a href="https://www.meetup.com/Docker-Madrid/events/240357800/">Big Data and Machine Learning with Docker</a></p>
<p><a href="https://www.meetup.com/Docker-Madrid/events/237067604/">Using Docker in Machine Learning Projects</a></p>
</section></section>
<section><section id="for-researchers" class="title-slide slide level1 white" data-background="img/research.jpg" style="color:white"><h1>For researchers</h1></section><section id="experiment-publish-repeat" class="slide level2">
<h2>Experiment, publish, repeat</h2>
<p><img data-src="img/peerreview.jpg" /></p>
</section><section id="reproducibility" class="slide level2">
<h2>Reproducibility</h2>
<figure>
<img data-src="img/goodluck.png" alt="@ianholmes" /><figcaption><a href="https://twitter.com/ianholmes/status/288689712636493824">@ianholmes</a></figcaption>
</figure>
</section><section id="obstacles" class="slide level2">
<h2>Obstacles</h2>
<div class="columns">
<div class="column" style="width:50%;">
<ul>
<li><strong>Missing data</strong></li>
<li>Bleeding edge tools and libraries</li>
<li>Throwaway software
<ul>
<li>Hacky</li>
<li>Little to no documentation</li>
</ul></li>
<li>Multiple languages</li>
</ul>
</div><div class="column" style="width:50%;">
<figure>
<img data-src="img/will_it_work.png" alt="https://xkcd.com/1742/" style="height:80.0%" /><figcaption><a href="https://xkcd.com/1742/" class="uri">https://xkcd.com/1742/</a></figcaption>
</figure>
</div>
</div>
</section><section id="obstacles-1" class="slide level2">
<h2>Obstacles</h2>
<p><img data-src="img/noidea-pc.png" /></p>
</section><section id="is-it-a-problem" class="slide level2">
<h2>Is it a problem?</h2>
<figure>
<img data-src="img/reproducibility.jpg" alt="https://www.nature.com/" style="height:80.0%" /><figcaption><a href="https://www.nature.com/news/1-500-scientists-lift-the-lid-on-reproducibility-1.19970">https://www.nature.com/</a></figcaption>
</figure>
</section><section id="jupyter-notebooks" class="slide level2">
<h2>Jupyter notebooks</h2>
<p><img data-src="img/jupyter-screenshot.png" /></p>
</section><section id="jupyter-architecture" class="slide level2">
<h2>Jupyter architecture</h2>
<figure>
<img data-src="img/jupyter-architecture.png" alt="http://jupyter.readthedocs.io" /><figcaption><a href="http://jupyter.readthedocs.io" class="uri">http://jupyter.readthedocs.io</a></figcaption>
</figure>
</section><section id="docker-to-the-rescue" class="slide level2">
<h2>Docker to the rescue</h2>
<figure>
<img data-src="img/dockerrescue.png" alt="towardsdatascience.com" /><figcaption><a href="https://towardsdatascience.com/how-docker-can-help-you-become-a-more-effective-data-scientist-7fc048ef91d5">towardsdatascience.com</a></figcaption>
</figure>
</section><section id="jupyterdocker-stacks" class="slide level2">
<h2>Jupyter/docker-stacks</h2>
<p><img data-src="img/dockerstacks.png" style="height:50.0%" /></p>
</section><section id="reproducible-environment" class="slide level2">
<h2>Reproducible environment</h2>
<div class="sourceCode" id="cb1"><pre class="sourceCode bash"><code class="sourceCode bash"><a class="sourceLine" id="cb1-1" data-line-number="1"><span class="ex">docker</span> run --rm -p 8888:8888 \</a>
<a class="sourceLine" id="cb1-2" data-line-number="2"> -v <span class="va">$(</span><span class="ex">WDIR</span><span class="va">)</span>/:/home/jovyan/work/ \</a>
<a class="sourceLine" id="cb1-3" data-line-number="3"> jupyter/scipy-notebook</a></code></pre></div>
</section><section id="and-friendly-too" class="slide level2">
<h2>And friendly, too</h2>
<div class="sourceCode" id="cb2"><pre class="sourceCode yaml"><code class="sourceCode yaml"><a class="sourceLine" id="cb2-1" data-line-number="1"><span class="fu">version:</span><span class="at"> </span><span class="st">&#39;2&#39;</span></a>
<a class="sourceLine" id="cb2-2" data-line-number="2"><span class="fu">services:</span></a>
<a class="sourceLine" id="cb2-3" data-line-number="3"> <span class="fu">jupyter:</span></a>
<a class="sourceLine" id="cb2-4" data-line-number="4"> <span class="fu">image:</span><span class="at"> jupyter/scipy-notebook</span></a>
<a class="sourceLine" id="cb2-5" data-line-number="5"> <span class="fu">volumes:</span></a>
<a class="sourceLine" id="cb2-6" data-line-number="6"> <span class="kw">-</span> <span class="st">&quot;./.nbconfig:/home/jovyan/.jupyter/nbconfig&quot;</span></a>
<a class="sourceLine" id="cb2-7" data-line-number="7"> <span class="kw">-</span> <span class="st">&quot;./work:/home/jovyan/work/&quot;</span></a>
<a class="sourceLine" id="cb2-8" data-line-number="8"> <span class="fu">ports:</span></a>
<a class="sourceLine" id="cb2-9" data-line-number="9"> <span class="kw">-</span> <span class="st">&quot;8888:8888&quot;&quot;</span></a></code></pre></div>
<div class="sourceCode" id="cb3"><pre class="sourceCode bash"><code class="sourceCode bash"><a class="sourceLine" id="cb3-1" data-line-number="1"><span class="ex">docker-compose</span> up</a></code></pre></div>
</section><section id="related-projects" class="slide level2">
<h2>Related projects</h2>
<ul>
<li>Using docker images to share trained systems</li>
</ul>
<figure>
<img data-src="img/gym.png" alt="https://gym.openai.com" height="500" /><figcaption><a href="https://gym.openai.com" class="uri">https://gym.openai.com</a></figcaption>
</figure>
</section></section>
<section><section id="for-small-groups" class="title-slide slide level1 white" data-background="img/group.jpeg"><h1>For small groups</h1></section><section id="requirements" class="slide level2">
<h2>Requirements</h2>
<ul>
<li>Shared environments</li>
<li>Resource sharing</li>
<li>Easy configuration</li>
<li>Versioning</li>
<li>Backups</li>
</ul>
<p>And <strong>little to no overhead</strong></p>
</section><section id="isolation" class="slide level2">
<h2>Isolation</h2>
<p><img data-src="img/noidea.jpg" /></p>
</section><section id="jupyterhub" class="slide level2">
<h2>Jupyterhub</h2>
<div class="columns">
<div class="column" style="width:60%;">
<figure>
<img data-src="img/jhub-parts.png" alt="http://jupyterhub.readthedocs.io/" height="500" /><figcaption><a href="http://jupyterhub.readthedocs.io/" class="uri">http://jupyterhub.readthedocs.io/</a></figcaption>
</figure>
</div><div class="column" style="width:40%;">
<h4 id="authenticators">Authenticators</h4>
<ul>
<li>Local</li>
<li>OAuth</li>
<li>LDAP</li>
<li>JWT</li>
</ul>
<h4 id="spawners">Spawners</h4>
<ul>
<li>Local</li>
<li>Docker</li>
<li>Kubernetes</li>
<li>Marathon</li>
</ul>
</div>
</div>
</section><section id="more-infrastructure" class="slide level2">
<h2>More infrastructure</h2>
<p><img data-src="img/docker-gitlab.jpg" class="noborder" height="250" /> <img data-src="img/nextcloud.jpg" class="noborder" height="250" /></p>
<p><img data-src="img/sharelatex.jpg" class="noborder" height="250" /></p>
</section></section>
<section><section id="demo" class="title-slide slide level1" data-background="img/party.jpg"><h1>Demo</h1></section><section id="its-demo-time" class="slide level2">
<h2>Its demo time</h2>
<p><img data-src="img/demogods.jpg" style="height:80.0%" /></p>
<p><a href="https://github.todevnull.com" class="uri">https://github.todevnull.com</a> <a href="https://github.com/balkian/lab-in-a-box" class="uri">https://github.com/balkian/lab-in-a-box</a></p>
</section></section>
<section><section id="other-tools" class="title-slide slide level1"><h1>Other tools</h1></section><section id="zeppelin" class="slide level2">
<h2>Zeppelin</h2>
<ul>
<li>Alternative to Jupyter</li>
</ul>
<figure>
<img data-src="img/zeppelin.png" alt="https://zeppelin.apache.org/" /><figcaption><a href="https://zeppelin.apache.org/" class="uri">https://zeppelin.apache.org/</a></figcaption>
</figure>
</section><section id="cocalc" class="slide level2">
<h2>CoCalc</h2>
<ul>
<li>Alternative to Jupyter</li>
</ul>
<figure>
<img data-src="img/cocalc.png" alt="https://cocalc.org/" height="500" /><figcaption><a href="https://cocalc.org/" class="uri">https://cocalc.org/</a></figcaption>
</figure>
</section><section id="docker-nvidia" class="slide level2">
<h2>Docker-Nvidia</h2>
<ul>
<li>CUDA for docker</li>
</ul>
<figure>
<img data-src="img/dockernvidia.png" alt="https://github.com/NVIDIA/nvidia-docker" /><figcaption><a href="https://github.com/NVIDIA/nvidia-docker" class="uri">https://github.com/NVIDIA/nvidia-docker</a></figcaption>
</figure>
</section><section id="jupyter-binder" class="slide level2">
<h2>Jupyter Binder</h2>
<ul>
<li>Custom Jupyter from git repositories</li>
</ul>
<figure>
<img data-src="img/binder.png" alt="https://mybinder.org/" height="500" /><figcaption><a href="https://mybinder.org/" class="uri">https://mybinder.org/</a></figcaption>
</figure>
</section><section id="knowledge-repo" class="slide level2">
<h2>Knowledge-Repo</h2>
<figure>
<img data-src="img/knowledgerepo.png" alt="http://knowledge-repo.readthedocs.io/" /><figcaption><a href="http://knowledge-repo.readthedocs.io/" class="uri">http://knowledge-repo.readthedocs.io/</a></figcaption>
</figure>
</section></section>
<section><section id="conclusions" class="title-slide slide level1"><h1>Conclusions</h1></section><section id="lessons-learned" class="slide level2">
<h2>Lessons learned</h2>
<ul>
<li>Docker + Docker-compose
<ul>
<li>Reproducible environments (partially)</li>
<li>Reduced tooling / experience</li>
<li>Ephemeral containers force you to automate/document installation</li>
</ul></li>
<li>Jupyterhub
<ul>
<li>Shared environments</li>
<li>Web interface (zero knowledge)</li>
</ul></li>
</ul>
</section><section id="whats-missing" class="slide level2">
<h2>Whats missing?</h2>
<ul>
<li>Roles and permissions</li>
<li><p>Backups</p></li>
<li>Ideas:
<ul>
<li>Kubernetes?</li>
<li>OpenShift?</li>
</ul></li>
</ul>
</section><section id="thanks-for-listening" class="slide level2">
<h2>Thanks for listening!</h2>
<p><a href="https://github.com/balkian/lab-in-a-box" class="uri">https://github.com/balkian/lab-in-a-box</a></p>
<p><a href="mailto:jf.sanchez@upm.es">jf.sanchez@upm.es</a></p>
</section></section>
</div>
</div>
<script src="reveal.js/lib/js/head.min.js"></script>
<script src="reveal.js/js/reveal.js"></script>
<script>
// Full list of configuration options available at:
// https://github.com/hakimel/reveal.js#configuration
Reveal.initialize({
// Push each slide change to the browser history
history: true,
height: 800,
// Optional reveal.js plugins
dependencies: [
{ src: 'reveal.js/lib/js/classList.js', condition: function() { return !document.body.classList; } },
{ src: 'reveal.js/plugin/zoom-js/zoom.js', async: true },
{ src: 'reveal.js/plugin/notes/notes.js', async: true }
]
});
</script>
</body>
</html>