What is it?
Binder re-creates a custom computing environment for the reproducible execution of notebooks (and potentially many other types of applications). Users who create notebooks in the EGI Notebooks service to analyze data can publish them in a GitHub repository and share a link to it. Anyone can then use that link in the EGI Binder service to reproduce the same data analysis.
The service builds on BinderHub, an open source tool that builds Docker images from a Git repository and makes them available through your browser.
EGI Binder offers a service similar to the publicly accessible mybinder.org site. However, EGI Binder has the following additional features:
- Access with academic user accounts: log in via Check-in, which is connected to eduGAIN and social media accounts.
- Access to scalable storage: selected storage spaces of EGI DataHub are directly available under the datahub folder, simplifying access to shared data from Binder notebooks.
- Guaranteed capacity: environments are guaranteed 2 GB of RAM and can use up to 4 GB.
- Persistent sessions: There is no hard limit on the session time per user, although sessions will be shut down automatically after 1 hour of inactivity (see session limitations at the public mybinder.org service).
- Access to the rest of EGI services: a personal access token is available in the Binder session to interact with the rest of the EGI infrastructure.
- Community Binder environments: user communities can have their own customized Binder service instance from EGI, with extra features on request (such as access to GPUs or integration with community-specific data repositories and services). EGI offers consultancy, supports the setup of these instances, and provides operational oversight for them.
Binder facilitates the sharing and reproducibility of digital data analysis:
- Users can define their computational analysis in the EGI Notebooks service.
- Once the notebook is ready for publishing, it can be shared in a GitHub repository.
- Optionally, users can use the Zenodo-GitHub integration to generate DOIs that can be cited in publications and discovered by fellow researchers.
- Anyone can use the link to the GitHub repository or Zenodo DOI to reproduce the computational analysis in EGI Binder.
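The last step above relies on a launch link. As a minimal sketch, the snippet below builds such links following the URL layout that BinderHub deployments commonly use (`/v2/gh/<org>/<repo>/<ref>` for GitHub and `/v2/zenodo/<doi>/` for Zenodo). The host name `binder.example.org` and both function names are placeholders introduced for illustration; replace the host with the address of the Binder instance you use.

```python
from urllib.parse import quote

# Hypothetical host used for illustration only; replace it with the
# address of the Binder instance you actually use.
BINDER_HOST = "https://binder.example.org"


def github_launch_url(org, repo, ref="HEAD", host=BINDER_HOST):
    """Build a BinderHub-style launch URL for a GitHub repository."""
    base = host.rstrip("/")
    # The ref (branch, tag or commit) is fully percent-encoded so that
    # branch names containing "/" stay within one path segment.
    return f"{base}/v2/gh/{quote(org)}/{quote(repo)}/{quote(ref, safe='')}"


def zenodo_launch_url(doi, host=BINDER_HOST):
    """Build a BinderHub-style launch URL for a Zenodo DOI."""
    base = host.rstrip("/")
    # The "/" inside the DOI is kept as-is (quote() leaves it unescaped).
    return f"{base}/v2/zenodo/{quote(doi)}/"


if __name__ == "__main__":
    print(github_launch_url("EGI-Federation", "binder-example"))
```

Running the script prints a link for the EGI-Federation/binder-example repository on the placeholder host.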
Access to the service
EGI’s Binder has the same access conditions as the centrally operated Notebooks service from EGI. Before using the service, you need to have an EGI account and be a member of one of the supported resource pools (also known as Virtual Organisations). Follow the instructions on the EGI Binder login page to get access.
Creating a Binder repository
Binder starts from a code repository that contains the code or notebook you’d like to run, together with a set of configuration files that specify the exact computational environment your code needs.
Binder then creates a reproducible container using repo2docker, and generates a user session to interact with the container in the browser.
The configuration for building the container supports specifying conda environments; installing Python, R and Julia environments; installing additional OS packages; and even complete custom Dockerfiles to bring any application to the system. The code repository can be hosted on popular Git hosting platforms like GitHub and GitLab, and can also be referenced with a DOI from Zenodo, FigShare or Dataverse. You can learn more about configuring your repository for Binder in the Binder user documentation.
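To check which configuration files a repository already provides, a small helper like the one below can be useful. It is only a sketch: the file names are a non-exhaustive selection of those described in the repo2docker documentation, the function name `find_binder_config` is introduced here for illustration, and the lookup rule (a top-level `binder/` or `.binder/` folder takes precedence over the repository root) reflects repo2docker's documented behaviour as understood at the time of writing.

```python
from pathlib import Path

# A non-exhaustive selection of configuration file names recognised by
# repo2docker (conda, pip, Julia, R, OS packages, hooks, Dockerfile).
CONFIG_FILES = (
    "environment.yml", "requirements.txt", "Pipfile", "setup.py",
    "Project.toml", "install.R", "DESCRIPTION", "apt.txt",
    "runtime.txt", "postBuild", "start", "Dockerfile",
)


def find_binder_config(repo_root):
    """Return (config_folder, file_names) for a local repository checkout.

    If a top-level "binder" or ".binder" folder exists, only that folder
    is inspected; otherwise the repository root is used.
    """
    root = Path(repo_root)
    folder = root
    for name in ("binder", ".binder"):
        if (root / name).is_dir():
            folder = root / name
            break
    found = sorted(n for n in CONFIG_FILES if (folder / n).is_file())
    return folder, found


if __name__ == "__main__":
    folder, files = find_binder_config(".")
    print(f"Configuration folder: {folder}")
    print("Files found:", ", ".join(files) or "none")
```

An empty result means the helper found none of the listed files, so the repository probably still needs a configuration file before it can be built as intended.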
You can start by forking the EGI-Federation/binder-example GitHub repository to create your own reproducible environment. To run this directly on EGI’s Binder, click on the button below:
You can create such a link to share your notebooks from the Binder interface. As shown in the screenshot below, you can copy the URL displayed while the build is in progress:
The binder-examples organisation on GitHub contains more sample repositories for common configurations that can help you get started.
Your notebooks running in Binder have outgoing internet connectivity, so you can connect to external services to bring data in for analysis or to deposit the notebooks’ output.
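As an illustration of bringing data into a session, the sketch below downloads a file over the network using only the Python standard library. The function name `fetch` and the example URL are placeholders and do not refer to a real EGI service.

```python
import shutil
import urllib.request
from pathlib import Path


def fetch(url, dest, timeout=60):
    """Download `url` to the local path `dest` and return that path."""
    dest = Path(dest)
    # Create the target folder if it does not exist yet.
    dest.parent.mkdir(parents=True, exist_ok=True)
    # Stream the response to disk instead of loading it all into memory.
    with urllib.request.urlopen(url, timeout=timeout) as response, \
            open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest


# Example call (placeholder URL):
# fetch("https://data.example.org/sample.csv", "data/sample.csv")
```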
Every session that you start will also provide access to your spaces in the DataHub under a folder named datahub. Those spaces configured to be mounted locally will be made available automatically. Check the documentation for the Notebook’s DataHub support for more information.
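From inside a session, the mounted spaces can be inspected like any other directory. The sketch below lists them; it assumes the datahub folder sits in the user's home directory, which is an assumption and not something stated above, so pass the actual path if it differs. The function name `list_spaces` is introduced here for illustration.

```python
from pathlib import Path


def list_spaces(datahub_dir=None):
    """Return the names of the spaces (sub-folders) found in the datahub folder."""
    # Assumption: the folder lives in the home directory unless a path is given.
    base = Path(datahub_dir) if datahub_dir else Path.home() / "datahub"
    if not base.is_dir():
        return []
    return sorted(p.name for p in base.iterdir() if p.is_dir())


if __name__ == "__main__":
    for space in list_spaces():
        print(space)
```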