mirror of https://github.com/balkian/lab-in-a-box.git synced 2024-09-28 22:51:44 +00:00
J. Fernando Sánchez 83b7b60897 Add slides
2018-05-21 19:30:55 +02:00

---
title: Docker for research
subtitle: ... and data analysis
author: J. Fernando Sánchez (<jf.sanchez@upm>)
tags: [Docker, CI, research]
date: 2018
abstract: Talk about docker for research and data analysis
---
# Intro { .white data-background="img/intro.jpg"}
## Before we begin
Code available at:
Live demos at:
Feel free to log in, but try not to break them for now 😉
## My name is Fernando and...
## At Grupo de Sistemas Inteligentes
:::::::::::::: {.columns}
::: {.column width="50%"}
:::
::: {.column width="50%"}
- Machine Learning and Big Data
- NLP and Sentiment Analysis
- Social Network Analysis
- Agents and Simulation
- Linked Data and Semantic Technologies
:::
::::::::::::::
## And I ❤ Docker
:::::::::::::: {.columns}
::: {.column width="50%"}
:::
::: {.column width="50%"}
* Docker+research for 3+ years
* Advocate for ~2 years
* Internal infrastructure: ansible, k8s and docker
* Teach (with) it
:::
::::::::::::::
## About this talk
Takeaway: ***you can set up a multi-user data analysis environment with isolation in minutes***
Plus: using docker to perform and share experiments is even easier
Related Meetups:
[Big Data and Machine Learning with Docker](https://www.meetup.com/Docker-Madrid/events/240357800/)
[Using Docker in Machine Learning Projects](https://www.meetup.com/Docker-Madrid/events/237067604/)
# For researchers {.white data-background="img/research.jpg" style="color:white"}
<!-- ## Research is about data -->
<!-- ![The scientific method](img/scientificmethod.png){.noborder height="500px"} -->
## Experiment, publish, repeat
## Reproducibility
## Obstacles
:::::::::::::: {.columns}
::: {.column width="50%"}
* **Missing data**
* Bleeding edge tools and libraries
* Throwaway software
* Hacky
* Little to no documentation
* Multiple languages
:::
::: {.column width="50%"}
![<https://xkcd.com/1742/>](img/will_it_work.png){ height=80% }
:::
::::::::::::::
## Obstacles
## Is it a problem?
![[https://www.nature.com/](https://www.nature.com/news/1-500-scientists-lift-the-lid-on-reproducibility-1.19970)](img/reproducibility.jpg){ height=80% }
## Jupyter notebooks
## Jupyter architecture
## Docker to the rescue
## Jupyter/docker-stacks
![](img/dockerstacks.png){ height=50% }
## Reproducible environment
```
docker run --rm -p 8888:8888 \
    -v $(WDIR)/:/home/jovyan/work/ \
```
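
The command on this slide is cut off after the volume flag, so the image it runs is not shown. Below is a minimal sketch of the full invocation, assuming the same `jupyter/scipy-notebook` image that the compose file on the next slide uses. The helper name `jupyter_run_command` and its parameters are hypothetical and only serve the illustration.

```python
import os
import shlex


def jupyter_run_command(workdir, image="jupyter/scipy-notebook", port=8888):
    """Build the `docker run` argument list for a throwaway Jupyter container.

    Assumption: `image` defaults to the image named in the compose example;
    the original command does not say which image it used.
    """
    host_dir = os.path.abspath(workdir)  # docker needs an absolute host path
    return [
        "docker", "run",
        "--rm",                                # delete the container on exit
        "-p", f"{port}:8888",                  # host port -> notebook port
        "-v", f"{host_dir}:/home/jovyan/work/",  # keep notebooks on the host
        image,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(jupyter_run_command("work")))
```

Passing the returned list to `subprocess.run(cmd, check=True)` would start the container. Only the mounted work directory outlives it, which is what makes the environment itself disposable.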
## And friendly, too
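```yaml
version: '2'
services:
  jupyter:  # service name was lost in extraction; "jupyter" is a placeholder
    image: jupyter/scipy-notebook
    volumes:
      - "./.nbconfig:/home/jovyan/.jupyter/nbconfig"
      - "./work:/home/jovyan/work/"
    ports:
      - "8888:8888"
```

```
docker-compose up
```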
## Related projects
* Using docker images to share trained systems
![<https://gym.openai.com>](img/gym.png){ height=500px }
# For small groups { .white data-background="img/group.jpeg" }
## Requirements
* Shared environments
* Resource sharing
* Easy configuration
* Versioning
* Backups
And **little to no overhead**
## Isolation
## JupyterHub
:::::::::::::: {.columns}
::: {.column width="60%"}
![<http://jupyterhub.readthedocs.io/>](img/jhub-parts.png){ height=500px }
:::
::: {.column width="40%"}
#### Authenticators
* Local
* OAuth
#### Spawners
* Local
* Docker
* Kubernetes
* Marathon
:::
::::::::::::::
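
Choosing the Docker spawner from the list above takes only a few lines of `jupyterhub_config.py`. The sketch below assumes the `dockerspawner` package is installed and reuses the `jupyter/scipy-notebook` image. The `configure` helper and the volume naming pattern are illustrative choices, not something the slides prescribe. Authentication is left at JupyterHub's default (local system users).

```python
def configure(c):
    """Apply a small Docker-based setup to a JupyterHub config object."""
    # Start one container per user instead of a local process.
    c.JupyterHub.spawner_class = "dockerspawner.DockerSpawner"
    # Assumption: same image as the single-user examples.
    # (Older dockerspawner releases name this option `container_image`.)
    c.DockerSpawner.image = "jupyter/scipy-notebook"
    # Containers are disposable; user files live in named volumes instead.
    c.DockerSpawner.remove = True
    c.DockerSpawner.notebook_dir = "/home/jovyan/work"
    c.DockerSpawner.volumes = {
        "jupyterhub-user-{username}": "/home/jovyan/work",
    }
    # The hub has to listen on an address the user containers can reach.
    c.JupyterHub.hub_ip = "0.0.0.0"
    return c


# JupyterHub injects get_config() when it loads jupyterhub_config.py;
# anywhere else this file only defines the helper.
if "get_config" in globals():
    configure(get_config())
```

Keeping the settings in a function lets the same file be imported and checked without a running hub.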
## More infrastructure
![](img/docker-gitlab.jpg){.noborder height="250px"}
![](img/nextcloud.jpg){.noborder height="250px"}
![](img/sharelatex.jpg){.noborder height="250px"}
# Demo { data-background="img/party.jpg"}
## It's demo time
![](img/demogods.jpg){ height=80% }
# Other tools
## Zeppelin
* Alternative to Jupyter
## CoCalc
* Alternative to Jupyter
![<https://cocalc.org/>](img/cocalc.png){ height=500px }
## Docker-Nvidia
* CUDA (GPU) support inside Docker containers
## Jupyter Binder
* Custom Jupyter from git repositories
![<https://mybinder.org/>](img/binder.png){ height=500px }
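
A Binder launch link is just a URL that encodes the repository and a git ref. The sketch below builds one for a GitHub repository, using the `https://mybinder.org/v2/gh/<user>/<repo>/<ref>` pattern. The function name `binder_url` is hypothetical, and the slides do not state whether the example repository actually builds on Binder.

```python
from urllib.parse import quote


def binder_url(user, repo, ref="HEAD"):
    """Return the mybinder.org launch URL for a GitHub repository."""
    # safe="" also escapes "/", so a branch such as "feature/x"
    # stays a single path segment in the URL.
    parts = (quote(part, safe="") for part in (user, repo, ref))
    return "https://mybinder.org/v2/gh/" + "/".join(parts)


if __name__ == "__main__":
    # Example repository: the one these slides are mirrored from.
    print(binder_url("balkian", "lab-in-a-box"))
```

Pointing `ref` at a tag or commit instead of a branch pins the environment, which is the point of using Binder to share an experiment.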
## Knowledge-Repo
# Conclusions
## Lessons learned
* Docker + Docker-compose
    * Reproducible environments (partially)
    * Less tooling and experience needed
    * Ephemeral containers force you to automate/document installation
* JupyterHub
    * Shared environments
    * Web interface (no prior knowledge needed)
## What's missing?
* Roles and permissions
* Backups
* Ideas:
    * Kubernetes?
    * OpenShift?
## Thanks for listening!