Architecture
Milvus 2.0 is a cloud-native vector database with storage and computation separated by design. All components in this refactored version of Milvus are stateless to enhance elasticity and flexibility.
The system is organized into four layers:
- Access layer
- Coordinator service
- Worker nodes
- Storage
**Access layer:** The front layer of the system and endpoint to users. It comprises peer proxies for forwarding requests and gathering results.
**Coordinator service:** The coordinator service assigns tasks to the worker nodes and functions as the system's brain. It has four coordinator types: root coord, data coord, query coord, and index coord.
**Worker nodes:** Worker nodes are dumb executors that follow the instructions from the coordinator service. There are three types of worker nodes, each responsible for a different job: data nodes, query nodes, and index nodes.
**Storage:** The cornerstone of the system that all other functions depend on. It has three storage types: meta storage, log broker, and object storage. Kudos to the open-source communities of etcd, Pulsar, MinIO, and RocksDB for building this fast, reliable storage.
> For more information about how the system works, see [Milvus 2.0 Architecture](https://milvus.io/docs/v2.0.0/architecture_overview.md).
New Features
**SDK**
- Object-relational mapping (ORM) PyMilvus
The PyMilvus-ORM APIs operate directly on collections, partitions, and indexes, helping users focus on building an effective data model rather than on implementation details.
**Core Features**
- Hybrid Search between scalar and vector data
Milvus 2.0 supports storing scalar data. Operators such as GREATER, LESS, EQUAL, NOT, IN, AND, and OR can be used to filter scalar data before a vector search is conducted. Currently supported data types include bool, int8, int16, int32, int64, float, and double. Support for string/VARBINARY data will be offered in a later version.
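As a rough illustration of the filter-then-search idea described above, here is a minimal pure-Python sketch. It is not the Milvus API or its implementation: the `rows` data, the field names `price` and `in_stock`, and the `hybrid_search` helper are all hypothetical, and a brute-force Euclidean scan stands in for a real vector index.

```python
import math

# Hypothetical in-memory rows: scalar fields plus a 2-D embedding.
rows = [
    {"id": 1, "price": 10, "in_stock": True,  "embedding": [0.0, 0.0]},
    {"id": 2, "price": 25, "in_stock": True,  "embedding": [1.0, 1.0]},
    {"id": 3, "price": 40, "in_stock": False, "embedding": [0.1, 0.1]},
    {"id": 4, "price": 55, "in_stock": True,  "embedding": [0.2, 0.1]},
]


def hybrid_search(rows, predicate, query, top_k):
    """Filter rows on scalar fields, then rank the survivors by distance."""
    # Step 1: the scalar filter narrows the candidate set.
    candidates = [r for r in rows if predicate(r)]
    # Step 2: brute-force nearest-neighbor ranking (Euclidean distance).
    candidates.sort(key=lambda r: math.dist(r["embedding"], query))
    return [r["id"] for r in candidates[:top_k]]


# Equivalent in spirit to "price GREATER 20 AND in_stock EQUAL true".
result = hybrid_search(
    rows,
    predicate=lambda r: r["price"] > 20 and r["in_stock"],
    query=[0.0, 0.0],
    top_k=2,
)
print(result)  # [4, 2]
```

Row 3 is closer to the query than row 2, but it is excluded because it fails the scalar filter before any distance is computed.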
- Match query
Unlike the search operation, which returns similar results, the match query operation returns exact matches. Match query can be used to retrieve vectors by ID or by condition.
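To contrast this with similarity search, the sketch below shows exact-match retrieval over a small in-memory table. It is only a conceptual model: `rows`, the `year` field, and the `match_query` helper are hypothetical names, not part of any Milvus SDK.

```python
# Hypothetical in-memory rows with an ID, one scalar field, and a vector.
rows = [
    {"id": 1, "year": 2019, "embedding": [0.1, 0.2]},
    {"id": 2, "year": 2020, "embedding": [0.3, 0.4]},
    {"id": 3, "year": 2021, "embedding": [0.5, 0.6]},
]


def match_query(rows, predicate):
    """Return every row that satisfies the predicate exactly.

    No distance or ranking is involved: a row either matches or it does not.
    """
    return [r for r in rows if predicate(r)]


# Retrieve vectors by ID.
by_id = match_query(rows, lambda r: r["id"] in {1, 3})
# Retrieve vectors by a condition on a scalar field.
by_condition = match_query(rows, lambda r: r["year"] >= 2020)

print([r["id"] for r in by_id])         # [1, 3]
print([r["id"] for r in by_condition])  # [2, 3]
```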
- Tunable consistency
Distributed databases make tradeoffs between consistency and availability/latency. Milvus offers four consistency levels (from strongest to weakest): strong, bounded staleness, session, and consistent prefix. You can define your own read consistency by specifying the read timestamp. As a rule of thumb, the weaker the consistency level, the higher the availability and the better the performance.
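One way to picture how a read timestamp can express these levels is sketched below. The mapping from level to timestamp, the `staleness` window, and the helper names `read_timestamp` and `can_serve` are assumptions made for illustration. They are not taken from the Milvus source or SDK.

```python
from enum import Enum


class Consistency(Enum):
    STRONG = "strong"
    BOUNDED = "bounded staleness"
    SESSION = "session"
    PREFIX = "consistent prefix"


def read_timestamp(level, now, last_write_ts=0, staleness=5):
    """Pick the timestamp a read must be able to see (assumed mapping)."""
    if level is Consistency.STRONG:
        return now                       # must see everything written so far
    if level is Consistency.BOUNDED:
        return max(now - staleness, 0)   # may lag by at most `staleness`
    if level is Consistency.SESSION:
        return last_write_ts             # must see this client's own writes
    return 0                             # consistent prefix: any synced state


def can_serve(synced_ts, read_ts):
    """A node can answer once it has consumed data up to the read timestamp."""
    return synced_ts >= read_ts


# A node that has synced up to t=96, while the current time is t=100
# and this client's last write happened at t=90.
for level in Consistency:
    ts = read_timestamp(level, now=100, last_write_ts=90)
    print(level.value, ts, can_serve(96, ts))
```

In this model only the strong level has to wait for the node to catch up, which is consistent with the rule of thumb that weaker levels give lower latency.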
- Time travel
Time travel allows you to access historical data at any point within a specified time period, making it possible to query past data, restore it, and back it up.
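The idea of reading data as of a past timestamp can be sketched with an append-only, timestamped log. This is a conceptual model only: the log layout, the `insert`/`delete` operation names, and the `snapshot_at` helper are hypothetical and do not describe how Milvus stores or replays its data.

```python
# Hypothetical append-only log: (timestamp, operation, id, vector).
log = [
    (10, "insert", 1, [0.1, 0.2]),
    (20, "insert", 2, [0.3, 0.4]),
    (30, "delete", 1, None),
    (40, "insert", 3, [0.5, 0.6]),
]


def snapshot_at(log, ts):
    """Rebuild the data set as it looked at time `ts` by replaying the log."""
    state = {}
    for entry_ts, op, key, value in sorted(log, key=lambda e: e[0]):
        if entry_ts > ts:
            break  # ignore everything that happened after the target time
        if op == "insert":
            state[key] = value
        elif op == "delete":
            state.pop(key, None)
    return state


print(sorted(snapshot_at(log, 25)))  # [1, 2]
print(sorted(snapshot_at(log, 35)))  # [2]
print(sorted(snapshot_at(log, 40)))  # [2, 3]
```

Querying at `ts=25` still returns row 1 even though it was deleted later, which is the property that makes restore and backup from a past point possible.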
**Miscellaneous**
- Installation of Milvus 2.0 with Helm or Docker Compose.
- Compatibility with Prometheus and Grafana for monitoring and alerts.
- Milvus Insight
Milvus Insight is a graphical management system for Milvus. It features visualization of cluster states, meta management, data queries and more. Milvus Insight will eventually be open sourced.
Breaking Changes