Pre-installed Packages¶
Package installation procedures vary across programming languages. Basic Python packages such as NumPy, pandas, scikit-learn, and matplotlib are installed on the main DataHub. The R hubs support packages such as shiny, dplyr, tidyr, and RSQLite.
You can list the installed Python packages from a Jupyter notebook:
!pip list
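If you need the same information inside Python code rather than as shell output, the standard library's `importlib.metadata` module can enumerate the installed distributions. The following is a minimal sketch, assuming Python 3.8 or later; the helper name `installed_packages` is illustrative and not something the hub provides.

```python
from importlib import metadata


def installed_packages():
    """Return a {name: version} dict of distributions visible to this kernel."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip entries whose metadata has no Name field
            packages[name.lower()] = dist.version
    return packages


if __name__ == "__main__":
    # Print in the familiar "name==version" form, sorted by name.
    for name, version in sorted(installed_packages().items()):
        print(f"{name}=={version}")
```

Looking up a single package is then a dictionary access, for example `installed_packages().get("pandas")`, which returns `None` when the package is not installed.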
You can query R for the list of installed packages as well:
installed.packages()
You can check the packages installed in Julia by accessing the Julia Hub.
Install Packages¶
There are two methods for installing packages. If you will be using a package regularly in your course, we recommend using the long-term installation method.
Temporary Installation¶
You can run language-specific commands to install packages on the hubs. This installs the software within the running environment; however, the changes are not permanent. The environment is reset when users restart their servers.
Refrain from installing Python packages via pip install --user. Packages installed this way may interfere with the functioning of the Jupyter server itself.
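As an illustration of a temporary installation from Python, one common pattern is to invoke pip through the interpreter that runs the current kernel, without the `--user` flag. This is a minimal sketch; the helper names and the `some-package` name are hypothetical. In a notebook cell, the `%pip install` magic is a comparable alternative.

```python
import subprocess
import sys


def pip_install_command(package, version=None):
    """Build a pip command that targets the current interpreter's environment."""
    requirement = f"{package}=={version}" if version else package
    # No "--user" flag, in line with the guidance above.
    return [sys.executable, "-m", "pip", "install", requirement]


def install_temporarily(package, version=None):
    """Run the install; per the text above, it is lost when the server restarts."""
    subprocess.check_call(pip_install_command(package, version))


# Example call, commented out so that running this snippet has no side effects.
# "some-package" is a placeholder name, not a real recommendation:
# install_temporarily("some-package")
```

Building the command separately from running it makes it easy to inspect exactly what will be executed before anything is installed.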
Long-term Installation¶
You can request that additional packages be installed in any of the environments in the user Docker images. Software installed this way persists across server restarts. The packages will remain in the user images for at least the length of the academic term. We periodically review the list of installed packages between semesters.
On Reproducibility¶
Make sure to specify a version for any library you request. If you do not, the deployment process may break at some point during the semester. Omitting a version does not give the user environment the latest version at all times; it will only have the latest version that existed on the date the CI process ran.
If you want to use an unreleased version of a library, specify the corresponding git SHA of that library’s repository.
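To illustrate what a fully pinned request can look like, the sketch below checks a list of pip-style requirement lines and reports any that lack either an exact `==` version or a full git commit SHA. The function name, the regular expressions, and the placeholder repository URL and SHA are assumptions made for illustration; the hub's actual request format may differ.

```python
import re

# Exact pin such as "pandas==2.2.2" (optionally with extras, e.g. "pkg[extra]==1.0").
EXACT_PIN = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]+\])?==[^=\s]+$")
# pip VCS requirement that ends in a full 40-character commit SHA.
GIT_SHA_PIN = re.compile(r"^git\+https://\S+@[0-9a-f]{40}(#\S*)?$")


def unpinned(requirements):
    """Return the requirement lines not pinned to an exact version or a SHA."""
    problems = []
    for line in requirements:
        spec = line.strip()
        if not spec or spec.startswith("#"):
            continue  # ignore blank lines and comment lines
        if not (EXACT_PIN.match(spec) or GIT_SHA_PIN.match(spec)):
            problems.append(spec)
    return problems


requested = [
    "pandas==2.2.2",
    "scikit-learn",  # no version given, so it is reported
    # Placeholder repository and SHA, for illustration only:
    "git+https://github.com/example-org/example-lib@"
    "0123456789abcdef0123456789abcdef01234567",
]
print(unpinned(requested))  # → ['scikit-learn']
```

Running a check like this before submitting a request catches loose specifiers such as `numpy>=1.26`, which would otherwise resolve to whatever version is current when the image is built.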