- **Entering a Personal Access Token (PAT) in Jupyter is no longer necessary**
When the kernel starts, the kernel manager uses its own kernel client and creates the Spark Session and other artifacts via the REST API and the secure SSH tunnels *[(DEMO)](docs/v2/news/start-kernelspec.md)*.
- **Native Windows support**
Anaconda and Jupyter on Windows 10 (with OpenSSH) can be used with *JupyterLab Integration* *[(DEMO)](docs/v2/news/windows.md)*.
- **Docker support**
No local Anaconda or *JupyterLab Integration* installation needed; the quickest way to test *JupyterLab Integration* *[(DEMO)](docs/v2/news/docker.md)*.
- **Browsers**
- **DBFS browser with file preview**
The DBFS browser no longer uses a sidecar and allows previewing many text files like csv, sh, py, ... *[(DEMO)](docs/v2/news/dbfs-browser.md)*
- **Database browser with schema and data preview**
The Database browser no longer uses a sidecar and allows previewing the table schema and sample rows of the data *[(DEMO)](docs/v2/news/database-browser.md)*
- **MLflow browser**
An MLflow experiments browser that converts all runs of an experiment into a pandas DataFrame to query and compare the best runs. *[(DEMO - Intro)](docs/v2/news/mlflow-browser-1.md)*, *[(DEMO - Keras)](docs/v2/news/mlflow-browser-2.md)*, *[(DEMO - MLlib)](docs/v2/news/mlflow-browser-3.md)*
- **dbutils**
- **Support for `dbutils.secrets`**
`dbutils.secrets` allows hiding credentials from your code (see the sketch after this list) *[(DEMO)](docs/v2/news/dbutils.secrets.md)*
- **Support for `dbutils.notebook`**
Higher compatibility with Databricks notebooks:
- `dbutils.notebook.exit` stops "Running all cells" *[(DEMO)](docs/v2/news/dbutils.notebook.exit.md)*
- `dbutils.notebook.run` allows running `.py` and `.ipynb` files from notebooks in *JupyterLab Integration* (see the sketch after this list) *[(DEMO)](docs/v2/news/dbutils.notebook.run.md)*
- **Support for kernels without Spark**
Create a *JupyterLab Integration* kernel specification with `--nospark` if no Spark Session is required on the remote cluster, e.g. for Deep Learning *[(DEMO)](docs/v2/news/with-and-without-spark.md)*
- **Support of Databricks Runtimes 6.4 and higher (incl 7.0)**
The changed initialisation behaviour of DBR 6.4 and above (*pinned* mode) is now supported
- **JupyterLab 2.1 is now default**
Bumped JupyterLab to the latest version
- **Experimental features**
- **Scala support (*experimental*)**
The `%%scala` magic sends Scala code to the same Spark context (see the sketch after this list) *[(DEMO)](docs/v2/news/scala-magic.md)*
- **%fs support (*experimental*)**
The `%fs` or `%%fs` magic is supported as a shortcut for `dbutils.fs.xxx` (see the sketch after this list) *[(DEMO)](docs/v2/news/fs-magic.md)*
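As a minimal sketch of the `dbutils.secrets` and `dbutils.notebook` features above (assuming `dbutils` is available in the kernel as in Databricks notebooks; the scope, key, notebook path and arguments are placeholders):

```python
# Read a credential from a secret scope instead of hard-coding it
# ("demo-scope" and "db-password" are placeholder names).
password = dbutils.secrets.get(scope="demo-scope", key="db-password")

# Run another .py or .ipynb file with a timeout (seconds) and arguments,
# following the Databricks dbutils.notebook.run(path, timeout, arguments) signature
# ("./prepare_data.ipynb" and the arguments are placeholders).
result = dbutils.notebook.run("./prepare_data.ipynb", 60, {"table": "events"})

# Stop "Running all cells" here and hand a value back to a calling notebook.
dbutils.notebook.exit("done")
```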
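And a short sketch of the experimental magics as notebook cells (the DBFS path and the Scala snippet are illustrative only):

```python
# Cell 1: the %fs line magic is a shortcut for dbutils.fs, e.g. listing a DBFS folder
%fs ls /databricks-datasets

# Cell 2: the %%scala cell magic sends its body to the same Spark context;
# it must be the first line of its own cell, e.g.
#   %%scala
#   println(spark.version)
```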
## V1.0.x (December 2019)
- **Use *Databricks CLI* profiles that contain URLs and tokens**
*JupyterLab Integration* uses the officially supported *Databricks CLI* configuration to retrieve the Personal Access Tokens and URLs for remote cluster access (an example profile is shown after this list). Personal Access Tokens are not copied to the remote cluster
- **Create and manage Jupyter kernel specifications for remote Databricks clusters**
*JupyterLab Integration* allows creating Jupyter kernel specifications for remote Databricks clusters via SSH. Kernel specifications can also be reconfigured or deleted
- **Configure SSH locally and remotely**
*JupyterLab Integration* allows creating a local SSH key pair and configuring the cluster with the public key for SSH access. Injecting the public key will restart the remote cluster
- **Create a Spark session and attach notebooks to it**
With *JupyterLab Integration*, one needs to provide the Personal Access Token in the browser to authenticate the creation of a Spark Session. The current notebook is then connected to this Spark Session.
- **Mirror a remote Databricks environment**
*JupyterLab Integration* can mirror the versions of Data Science related libraries to a local conda environment. A blacklist and a whitelist allow controlling which libraries are actually mirrored
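For reference, a *Databricks CLI* profile as used above is simply a named section in `~/.databrickscfg`; the profile name, URL and token below are placeholders:

```ini
[demo]
host = https://dbc-01234567-89ab.cloud.databricks.com
token = dapi0123456789abcdef0123456789abcdef
```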
# Release documentation

## Release process

### 1 Tests
- On macOS, set the open file limits:
  ```bash
  $ sudo launchctl limit maxfiles 65536 200000
  ```
- On Windows one needs `OpenSSH 8` for the tests:
  ```cmd
  C:\>choco install openssh
  C:\>set SSH=C:\Program Files\OpenSSH-Win64\ssh.exe
  C:\>cmd /c ""%SSH%"" -V
  OpenSSH_for_Windows_8.0p1, LibreSSL 2.6.5
  ```
Note: This assumes that [https://chocolatey.org](https://chocolatey.org) is installed
- Copy `<root>/tests/config-template.yaml` to `<root>/config.yaml` and edit it accordingly
- Depending on whether the tests should run against AWS or Azure, set one of
  ```bash
  $ export CLOUD=aws
  $ export CLOUD=azure
  ```
or under Windows one of
  ```cmd
  C:\>set CLOUD=aws
  C:\>set CLOUD=azure
  ```
- Start clusters
  ```bash
  python tests/00-create-clusters.py
  ```
or restart clusters:
  ```bash
  python tests/01-restart-clusters.py
  ```
- Create a secret scope and key for the tests (if they do not already exist)
  ```bash
  python tests/05-create-secret-scope.py
  ```
- Execute tests
Note: For dev tests (when the current version is not yet published to PyPI), enable `30-install-wheel_test.py`, i.e. comment out the skip marks decorating the test (see the sketch after this list).
Execute the tests
  ```bash
  pytest -v -o log_cli=true
  ```
- Remove clusters
  ```bash
  python tests/99-destroy-clusters.py
  ```
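As referenced in the *Execute tests* step above, enabling the dev test only means commenting out the `pytest` skip mark in `tests/30-install-wheel_test.py`; the test name and reason string below are hypothetical, only the pattern matters:

```python
import pytest

# Comment out the skip mark to run the dev test against the locally built wheel:
# @pytest.mark.skip(reason="only needed for unpublished dev builds")
def test_install_wheel():
    # hypothetical test body: install the locally built wheel on the cluster and import it
    ...
```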
### 2 Python package
In case the Jupyter labextensions and/or the Python code have been changed:
1. Run tests
   ```bash
   make tests
   ```
2. Clean environment
   ```bash
   make clean    # delete all temp files
   make prepare  # commit deletions
   ```
3. Bump version of databrickslabs_jupyterlab
- A new release candidate with rc0
  ```bash
  make bump part=major|minor|patch
  ```
- A new build
  ```bash
  make bump part=build
  ```
- A new release
  ```bash
  make bump part=release
  ```
- A new release without release candidate
  ```bash
  make bump part=major|minor|patch version=major.minor.patch
  ```
4. Create distribution
   ```bash
   make dist
   ```
5. Create and tag release
   ```bash
   make release
   ```
6. Deploy to PyPI
   ```bash
   make upload
   ```
### 3 Labextension
1. Change directory to `databrickslabs_jupyterlab_status`
2. Follow steps 2-6 of section *2 Python package*.
### 4 Docker image
1. Create docker image
   ```bash
   make docker
   ```
2. Publish image
   ```bash
   make upload_docker
   ```
### 5 Push changes
1. Push repo and tag
   ```bash
   git push --no-verify
   git push origin --no-verify --tags
   ```