Tensorlink

Latest version: v0.1.3

0.1.3

What's Changed
- `.env` file renamed to `.tensorlink.env` to prevent conflicts with other projects.
- Validators can now create job requests on behalf of users.
- Added `DistributedValidator` (`ml/validator.py`), enabling validators to handle ML tasks.
  - This reflects progress on issues **21** and **30**.
- Laid the groundwork for APIs to handle inference and fully hosted jobs.
- Improved model-parsing formatting to ease network limitations (**Issue 18**).
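
For projects upgrading from an earlier release, the rename above may mean moving an existing settings file. The following is a minimal sketch, assuming the old `.env` in the given directory was created by Tensorlink and not by another tool. The helper name `migrate_env_file` is hypothetical and not part of Tensorlink.

```python
from pathlib import Path


def migrate_env_file(directory: str = ".") -> bool:
    """Rename a legacy `.env` to `.tensorlink.env` inside `directory`.

    Returns True if a file was renamed. Does nothing when there is no
    legacy file or when `.tensorlink.env` already exists, so an existing
    new-style file is never overwritten.
    """
    base = Path(directory)
    old_path = base / ".env"             # assumed legacy location
    new_path = base / ".tensorlink.env"  # name introduced in 0.1.3

    if not old_path.is_file() or new_path.exists():
        return False

    old_path.rename(new_path)
    return True
```

Check that the `.env` file really belongs to Tensorlink before running this, since avoiding clashes with other projects' `.env` files is the reason for the rename.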

**Bug Fixes & Performance Improvements**
- Improved connection monitoring and timeout detection.
- Improved modularity of the job monitor and node activity tracking.
- Enhanced handling of worker and user timeouts during active jobs (**Issue 19**).
- Moved the connection monitor to a dedicated file for better structure.
- Tweaked node bootstrapping and connectivity.

**Full Changelog**: https://github.com/smartnodes-lab/tensorlink/compare/v0.1.2...v0.1.3

0.1.2.post1

What's New
- Isolated proposal management and job monitoring from `validator.py`.
- Updated contracts and interactions to the new deployment.
- Refactored and modularized code.
- Refactored tests.

Limitations
- **General Issues**: Bugs, performance hiccups, and limited network availability are expected in this early release.
- **Model Support**:
  - Tensorlink supports **scriptable PyTorch models** (`torch.jit.script`) and select open-source Hugging Face models (excluding API-key-dependent models).
  - This is due to security and serialization constraints in untrusted P2P interactions. We're working on custom serialization methods to support all PyTorch model types. [14]
- **Model Size**: Current public jobs are best suited for models under ~1 billion parameters.
- **Worker Allocation**: Public jobs are limited to one worker.
  - Data-parallel acceleration is disabled for public tasks but can be enabled for local jobs or private clusters.
- **Latency**: Internet speeds and latency may impact performance, particularly for complex tasks.
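
To gauge whether a model falls under the ~1 billion parameter guideline above, its parameters can be counted before a job is submitted. This is a minimal sketch and not part of Tensorlink: the helper names and the `limit` default are assumptions, and the functions rely only on the `parameters()` and `numel()` methods that `torch.nn.Module` and `torch.Tensor` provide.

```python
def count_parameters(model) -> int:
    """Total number of scalar parameters in a torch.nn.Module-like object."""
    return sum(p.numel() for p in model.parameters())


def within_public_job_guideline(model, limit: int = 1_000_000_000) -> bool:
    """True if the model has fewer than `limit` parameters.

    The default mirrors the approximate 1B figure from the release notes;
    it is a guideline, not a limit enforced by this code.
    """
    return count_parameters(model) < limit
```

With PyTorch installed, `count_parameters(torch.nn.Linear(1024, 1024))` returns 1049600: a 1024 x 1024 weight matrix plus 1024 biases.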

0.1.1

What's Changed

- **New Worker and Validator Executables**: Introduced executables for UNIX systems. [16]
- **Idle GPU Mining**: Workers can now execute GPU mining scripts (or other intensive tasks) when idle.
  - Mining scripts are only executed as root if the main logic is also run as root (not recommended). [20]
- **Configurable Mining**: A `config.json` file bundled with the miner binary lets users specify worker addresses and mining file paths.

Limitations

- **General Issues**: Bugs, performance hiccups, and limited network availability are expected in this early release.
- **Model Support**:
  - Tensorlink supports **scriptable PyTorch models** (`torch.jit.script`) and select open-source Hugging Face models (excluding API-key-dependent models).
  - This is due to security and serialization constraints in untrusted P2P interactions. We're working on custom serialization methods to support all PyTorch model types. [14]
- **Model Size**: Current public jobs are best suited for models under ~1 billion parameters.
- **Worker Allocation**: Public jobs are limited to one worker.
  - Data-parallel acceleration is disabled for public tasks but can be enabled for local jobs or private clusters.
- **Latency**: Internet speeds and latency may impact performance, particularly for complex tasks.
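
Whether a model counts as scriptable can be checked locally before submitting it. This is a minimal sketch built on PyTorch's own `torch.jit.script`. The helper name `is_scriptable` is hypothetical, and passing this check does not guarantee that Tensorlink will accept the model.

```python
def is_scriptable(model) -> bool:
    """Return True if `torch.jit.script` can compile `model`."""
    # Imported here so the helper can be defined without PyTorch installed.
    import torch

    try:
        torch.jit.script(model)
    except Exception:
        # Scripting raises several error types for unsupported Python
        # constructs; any failure is treated as "not scriptable".
        return False
    return True
```

For example, `is_scriptable(torch.nn.Linear(4, 2))` returns `True`, while models whose `forward` relies on Python features that TorchScript does not support will return `False`.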

Future Plans

- **Expanded Capacity**: We're actively scaling the network, with the next update set to support larger models and enable more complex workflows.
- **Enhanced Model Support**: Improvements in serialization methods will unlock broader compatibility for all PyTorch model types.
- **Worker Allocation Updates**: Future releases will improve worker distribution for public jobs, allowing for more robust task execution.

0.1.0

Notes

Tensorlink provides tools for streamlined distributed model training and inference in PyTorch, with built-in support for Hugging Face. This release introduces foundational concepts and structures designed to simplify the creation of scalable, distributed machine learning workflows. Tensorlink also lets you create nodes that can access and contribute to public distributed machine learning resources.

Key Features:
- `DistributedModel`: A flexible wrapper for `torch.nn.Module` designed to simplify distributed machine learning workflows.
  - Provides methods for parsing, distributing, and integrating PyTorch models across devices.
  - Supports standard model operations (e.g., `forward`, `backward`, `parameters`).
  - Automatically manages partitioning and synchronization of model components across nodes.
  - Seamlessly supports both data and model parallelism.

- `DistributedOptimizer`: An optimizer wrapper built for `DistributedModel` to ensure synchronized parameter updates across distributed nodes.
  - Compatible with native PyTorch and Hugging Face optimizers.

- Node Types (`tensorlink.nodes`): Tensorlink provides three key node types to enable robust distributed machine learning workflows:
  - `UserNode`: Handles job submissions and result retrieval, facilitating interaction with `DistributedModel` for training and inference. Required for public network participation.
  - `WorkerNode`: Manages active jobs and connections to users, and processes data for model execution.
  - `ValidatorNode`: Secures and coordinates training tasks and node interactions, ensuring job integrity on the public network.

- **Public Computational Resources**: By default, Tensorlink nodes are integrated with a smart contract-secured network, enabling:
  - Incentive mechanisms to reward contributors for sharing computational power.
  - Access to both free and paid machine learning resources.
  - Configuration options for private networks, supporting local or closed group machine learning workflows.

Limitations:

- Bugs, performance issues, and limited network availability are expected.
- **Model Support**: Tensorlink currently supports scriptable PyTorch models (`torch.jit.script`) and select open-source Hugging Face models that do not require API keys.
  - **Why?** Security and serialization constraints for untrusted P2P interactions. We're actively working on custom serialization methods to support all PyTorch model types. Feedback and contributions to accelerate this effort are welcome!
- **Job Constraints**:
  - **Model Size**: Due to limited worker availability in this initial release, public jobs are best suited for models under ~1 billion parameters.
    - **Future Plans**: We are actively expanding network capacity, and the next update (expected soon) will increase this limit, enabling support for larger models and more complex workflows.
  - **Worker Allocation**: Public jobs are currently limited to one worker. Data-parallel acceleration is temporarily disabled for public tasks but can be enabled for local jobs or private clusters.
- Internet latency and connection speeds can significantly impact the performance of public jobs, which may become problematic for certain training and inference scenarios.

📚 Get Started: [Documentation](https://smartnodes.ca/docs)
🤝 Contribute: [GitHub Repository](https://github.com/smartnodes-lab/tensorlink)
