General
This release contains the core functionality of the Petals platform described in [our paper](https://arxiv.org/pdf/2209.01188.pdf).
What's Changed
* Rudimentary decentralization by justheuristic in https://github.com/bigscience-workshop/petals/pull/9
* Update model by dbaranchuk in https://github.com/bigscience-workshop/petals/pull/17
* Chained rpc_forward & rpc_backward by dbaranchuk in https://github.com/bigscience-workshop/petals/pull/18
* Implement block selection on servers by borzunov in https://github.com/bigscience-workshop/petals/pull/20
* LM head module by dbaranchuk in https://github.com/bigscience-workshop/petals/pull/19
* Measure and cache network & compute throughput by borzunov in https://github.com/bigscience-workshop/petals/pull/21
* Shallow prompt tuning with run example on SST-2 by dbaranchuk in https://github.com/bigscience-workshop/petals/pull/22
* Minimalistic automated tests by justheuristic in https://github.com/bigscience-workshop/petals/pull/23
* Clean up readme by justheuristic in https://github.com/bigscience-workshop/petals/pull/24
* [Test CI] add instructions to test the full model by justheuristic in https://github.com/bigscience-workshop/petals/pull/25
* Fix default branch in CI by justheuristic in https://github.com/bigscience-workshop/petals/pull/26
* Fix CI runs in master by justheuristic in https://github.com/bigscience-workshop/petals/pull/27
* CI: use GIT_REF_NAME instead of GIT_HEAD_REF by justheuristic in https://github.com/bigscience-workshop/petals/pull/28
* Add GenerationMixin class by artek0chumak in https://github.com/bigscience-workshop/petals/pull/29
* Decouple make_sequence and move to RemoteSequenceManager by justheuristic in https://github.com/bigscience-workshop/petals/pull/30
* Fix is_subsequence by dbaranchuk in https://github.com/bigscience-workshop/petals/pull/32
* Miscellaneous fixes to automatic tests by justheuristic in https://github.com/bigscience-workshop/petals/pull/35
* Efficient forward & backward by dbaranchuk in https://github.com/bigscience-workshop/petals/pull/36
* Pack of Inference Changes by artek0chumak in https://github.com/bigscience-workshop/petals/pull/37
* Support various backend dtypes & async serialization by dbaranchuk in https://github.com/bigscience-workshop/petals/pull/38
* Use "PETALS" as the readme title by borzunov in https://github.com/bigscience-workshop/petals/pull/40
* Integrate mixed-8bit model by dbaranchuk in https://github.com/bigscience-workshop/petals/pull/39
* Rename 350m -> 560m by dbaranchuk in https://github.com/bigscience-workshop/petals/pull/43
* Make pytest outputs more verbose by justheuristic in https://github.com/bigscience-workshop/petals/pull/44
* Distributed prompt tuning by dbaranchuk in https://github.com/bigscience-workshop/petals/pull/42
* Reduce vocabulary size in test model, fix bug in routing when overlapped by justheuristic in https://github.com/bigscience-workshop/petals/pull/45
* Convert actual model weights by dbaranchuk in https://github.com/bigscience-workshop/petals/pull/46
* [quickfix 1/n] remove expensive assertions in inference code by justheuristic in https://github.com/bigscience-workshop/petals/pull/48
* [Fix] Make distributed sequence classification not create the full BLOOM model by dbaranchuk in https://github.com/bigscience-workshop/petals/pull/49
* Fix recovering for sequential_backward by dbaranchuk in https://github.com/bigscience-workshop/petals/pull/50
* Inference: require max sequence length instead of assuming 2048 by justheuristic in https://github.com/bigscience-workshop/petals/pull/52
* Add shallow prefix-tuned inference by artek0chumak in https://github.com/bigscience-workshop/petals/pull/55
* Remove transformer block, implement as a sequence of size 1 by GreenFatGuy in https://github.com/bigscience-workshop/petals/pull/54
* Update readme for the 1st public release by borzunov in https://github.com/bigscience-workshop/petals/pull/57
* Use latest version of Petals scheme, shrink Petals logo by borzunov in https://github.com/bigscience-workshop/petals/pull/59
* Update bullet points with feedback from Tim and other people by borzunov in https://github.com/bigscience-workshop/petals/pull/61
* Update readme with arxiv link and more discussions by borzunov in https://github.com/bigscience-workshop/petals/pull/62
* Warn that current instructions involve 6B model but we will replace them soon by borzunov in https://github.com/bigscience-workshop/petals/pull/63
* Add deep prompt inference by artek0chumak in https://github.com/bigscience-workshop/petals/pull/66
* Fix calling rpc_info multiple times by justheuristic in https://github.com/bigscience-workshop/petals/pull/60
* Make attention cache wait until memory is freed by justheuristic in https://github.com/bigscience-workshop/petals/pull/53
* Build cpuonly from bitsandbytes main by justheuristic in https://github.com/bigscience-workshop/petals/pull/70
* Priority tasks by GreenFatGuy in https://github.com/bigscience-workshop/petals/pull/47
* Update dependency versions by justheuristic in https://github.com/bigscience-workshop/petals/pull/71
* Fix protobuf version by justheuristic in https://github.com/bigscience-workshop/petals/pull/74
* Add prompt tuning example on Personachat dataset by artek0chumak in https://github.com/bigscience-workshop/petals/pull/69
* Quality of life changes: update readme, simplify run_server interface by justheuristic in https://github.com/bigscience-workshop/petals/pull/75
* Use bitsandbytes==0.34.0, update readme by justheuristic in https://github.com/bigscience-workshop/petals/pull/76
* Make small readability & style changes to the instructions by borzunov in https://github.com/bigscience-workshop/petals/pull/77
* Rebalance swarm when necessary by borzunov in https://github.com/bigscience-workshop/petals/pull/34
* Update hivemind to 1.1.2, mark `model` argument as required by borzunov in https://github.com/bigscience-workshop/petals/pull/81
* Fix "Too many open files" during rebalancing by borzunov in https://github.com/bigscience-workshop/petals/pull/83
* Add colab-related changes by artek0chumak in https://github.com/bigscience-workshop/petals/pull/80
* Enable rebalancing by default by borzunov in https://github.com/bigscience-workshop/petals/pull/84
* Implement exponential backoff for forward & backward by borzunov in https://github.com/bigscience-workshop/petals/pull/85
* Add SST-2 ipynb example by artek0chumak in https://github.com/bigscience-workshop/petals/pull/86
* Fix floating point issues in block_selection.py by borzunov in https://github.com/bigscience-workshop/petals/pull/89
* Implement timeouts in forward/backward by borzunov in https://github.com/bigscience-workshop/petals/pull/90
* Force reinstall of hivemind in example notebooks by artek0chumak in https://github.com/bigscience-workshop/petals/pull/88
* Make inference, forward, and backward fully fault-tolerant by borzunov in https://github.com/bigscience-workshop/petals/pull/91
* Use public swarm by default by borzunov in https://github.com/bigscience-workshop/petals/pull/92
* Make ServerState announcements work better by borzunov in https://github.com/bigscience-workshop/petals/pull/93
* Require hivemind with fixed compression and protobuf working on Colab by borzunov in https://github.com/bigscience-workshop/petals/pull/94
* Try to fix protobuf versions once again by borzunov in https://github.com/bigscience-workshop/petals/pull/95
* Add Beam Search decoding algorithm by artek0chumak in https://github.com/bigscience-workshop/petals/pull/87
* Improve server's logging by borzunov in https://github.com/bigscience-workshop/petals/pull/96
* Add various server timeouts, lower --max_batch_size and --inference_max_length defaults by borzunov in https://github.com/bigscience-workshop/petals/pull/97
* Fix dtype- and device-related client issues by borzunov in https://github.com/bigscience-workshop/petals/pull/98
* Make Petals a pip-installable package (attempt 2) by borzunov in https://github.com/bigscience-workshop/petals/pull/102
* Fix dtypes in backend schemas by borzunov in https://github.com/bigscience-workshop/petals/pull/99
* Fix ptune with `low_cpu_mem_usage=True` (as in Colab) by borzunov in https://github.com/bigscience-workshop/petals/pull/103
* Add Dockerfile by mryab in https://github.com/bigscience-workshop/petals/pull/82
* Remove unused imports, add missing arguments to docstrings by mryab in https://github.com/bigscience-workshop/petals/pull/108
* Expose request_timeout to DistributedBloomConfig by artek0chumak in https://github.com/bigscience-workshop/petals/pull/105
* Optimize RemoteSequenceManager by justheuristic in https://github.com/bigscience-workshop/petals/pull/106
* Hotfix span selection by justheuristic in https://github.com/bigscience-workshop/petals/pull/110
* Patch Linear8bit to enable CxB backward by justheuristic in https://github.com/bigscience-workshop/petals/pull/111
* Fix Linear8bitlt state config, update tests by justheuristic in https://github.com/bigscience-workshop/petals/pull/112
* Measure throughput for different configs, devices, and dtypes separately by borzunov in https://github.com/bigscience-workshop/petals/pull/114
* Support --load_in_8bit on pre-Turing GPUs by justheuristic in https://github.com/bigscience-workshop/petals/pull/113
* Fix tile size on Ampere by justheuristic in https://github.com/bigscience-workshop/petals/pull/116
* Make server use smart defaults by borzunov in https://github.com/bigscience-workshop/petals/pull/115
* Suppress quantization warning and fix dtype defaults in compute benchmark by borzunov in https://github.com/bigscience-workshop/petals/pull/117
* Choose --num_blocks for bigscience/bloom-petals automatically by borzunov in https://github.com/bigscience-workshop/petals/pull/119
* Require hivemind==1.1.4 with p2pd v0.3.13 by borzunov in https://github.com/bigscience-workshop/petals/pull/121
* Rework readme, move code example to the top, link draft of Colab by borzunov in https://github.com/bigscience-workshop/petals/pull/118
* Remove "-r" when installing Petals in examples by mryab in https://github.com/bigscience-workshop/petals/pull/122
* Update notebooks to use full BLOOM-176B by artek0chumak in https://github.com/bigscience-workshop/petals/pull/104
* Call block.load_state_dict only once by mryab in https://github.com/bigscience-workshop/petals/pull/124
* Add checks for forward() inputs on the client side by justheuristic in https://github.com/bigscience-workshop/petals/pull/123
* Fix typos with codespell by mryab in https://github.com/bigscience-workshop/petals/pull/126
* Set dht.num_workers = n_layer, update_period = 150, expiration = 300 by borzunov in https://github.com/bigscience-workshop/petals/pull/125
* Avoid synchronous updates, ban peers based on request outcome by justheuristic in https://github.com/bigscience-workshop/petals/pull/127
* Revert to hivemind==1.1.3 for stability by borzunov in https://github.com/bigscience-workshop/petals/pull/129
* Clear trigger before engaging in update by justheuristic in https://github.com/bigscience-workshop/petals/pull/130
* Fix inference and rpc_info() fault tolerance by borzunov in https://github.com/bigscience-workshop/petals/pull/131
* Set default --step_timeout to 5 min by borzunov in https://github.com/bigscience-workshop/petals/pull/133
* Don't ban servers in case of client-caused handler errors by borzunov in https://github.com/bigscience-workshop/petals/pull/134
* Allow .generate() to reuse existing inference session by borzunov in https://github.com/bigscience-workshop/petals/pull/132
* Fix waiting until free memory is available by borzunov in https://github.com/bigscience-workshop/petals/pull/136
* Fix "could not unlink the shared memory file" during rebalancing by borzunov in https://github.com/bigscience-workshop/petals/pull/135
* Add Docker commands, use permanent Discord links by borzunov in https://github.com/bigscience-workshop/petals/pull/137
* Update texts in "Terms of use" and "Privacy and security" sections by borzunov in https://github.com/bigscience-workshop/petals/pull/138
* Show route on client by borzunov in https://github.com/bigscience-workshop/petals/pull/139
* Update Anaconda instructions by borzunov in https://github.com/bigscience-workshop/petals/pull/140
* Use common folder for all caches, make it a volume in Dockerfile by borzunov in https://github.com/bigscience-workshop/petals/pull/141
* Suppress asyncio error logs by default by borzunov in https://github.com/bigscience-workshop/petals/pull/142
* Add link to privacy & security Wiki by borzunov in https://github.com/bigscience-workshop/petals/pull/144
* Improve block size calculations by borzunov in https://github.com/bigscience-workshop/petals/pull/149
* Fix OOMs during server rebalancing by borzunov in https://github.com/bigscience-workshop/petals/pull/150
* Bump transformers to 4.25.1 by justheuristic in https://github.com/bigscience-workshop/petals/pull/151
* Clean up disk space by borzunov in https://github.com/bigscience-workshop/petals/pull/152
* Fix arguments in remove_old_models.py by mryab in https://github.com/bigscience-workshop/petals/pull/153
* Add missing methods for SamplingAlgorithm, fix docstrings by mryab in https://github.com/bigscience-workshop/petals/pull/107
* Reset MemoryCache during rebalancings by borzunov in https://github.com/bigscience-workshop/petals/pull/154
* Check reachability automatically and give advice on how to fix it by borzunov in https://github.com/bigscience-workshop/petals/pull/155
* Fix logging: do not duplicate lines, enable colors in Colab by borzunov in https://github.com/bigscience-workshop/petals/pull/156
* Update advanced notebooks by artek0chumak in https://github.com/bigscience-workshop/petals/pull/148
* Downgrade CUDA in Docker image to 11.0.3 by mryab in https://github.com/bigscience-workshop/petals/pull/145
* Switch to speedtest-cli by justheuristic in https://github.com/bigscience-workshop/petals/pull/157
* Fix issues related to `petals` as a module by borzunov in https://github.com/bigscience-workshop/petals/pull/159
* Alloc inference cache as one contiguous buffer by borzunov in https://github.com/bigscience-workshop/petals/pull/160
* Fix typos in the example notebooks by artek0chumak in https://github.com/bigscience-workshop/petals/pull/161
* Hot fix: Increase hivemind.P2P's startup_timeout for Colab, remove absent initial peer by borzunov in https://github.com/bigscience-workshop/petals/pull/162
* Shield alloc & free from cancellation by borzunov in https://github.com/bigscience-workshop/petals/pull/163
* Update wording in readme by borzunov in https://github.com/bigscience-workshop/petals/pull/165
* Correct grammar in readme by vadi2 in https://github.com/bigscience-workshop/petals/pull/166
* Add link to chat.petals.ml by borzunov in https://github.com/bigscience-workshop/petals/pull/168
* Fix code example in readme by borzunov in https://github.com/bigscience-workshop/petals/pull/169
* Fix instruction for developers by justheuristic in https://github.com/bigscience-workshop/petals/pull/170
New Contributors
* dbaranchuk made their first contribution in https://github.com/bigscience-workshop/petals/pull/17
* borzunov made their first contribution in https://github.com/bigscience-workshop/petals/pull/20
* artek0chumak made their first contribution in https://github.com/bigscience-workshop/petals/pull/29
* GreenFatGuy made their first contribution in https://github.com/bigscience-workshop/petals/pull/54
* mryab made their first contribution in https://github.com/bigscience-workshop/petals/pull/82
* vadi2 made their first contribution in https://github.com/bigscience-workshop/petals/pull/166
**Full Changelog**: https://github.com/bigscience-workshop/petals/commits/v1.0.0