### Hadoop catalog with Cloud Storage
- For S3, please refer to [Hadoop-catalog-with-s3](./hadoop-catalog-with-s3.md) for more details.
- For GCS, please refer to [Hadoop-catalog-with-gcs](./hadoop-catalog-with-gcs.md) for more details.
- For OSS, please refer to [Hadoop-catalog-with-oss](./hadoop-catalog-with-oss.md) for more details.
- For Azure Blob Storage, please refer to [Hadoop-catalog-with-adls](./hadoop-catalog-with-adls.md) for more details.
### How to customize your own HCFS file system fileset?
Developers and users can customize their own HCFS file system fileset by implementing the `FileSystemProvider` interface shipped in the [gravitino-catalog-hadoop](https://repo1.maven.org/maven2/org/apache/gravitino/catalog-hadoop/) jar. The `FileSystemProvider` interface is defined as follows:
```java
// Create a FileSystem instance using the properties you have set when creating the catalog.
FileSystem getFileSystem(@Nonnull Path path, @Nonnull Map<String, String> config)
    throws IOException;

// The scheme name of the file system provider. 'file' for Local file system,
// 'hdfs' for HDFS, 's3a' for AWS S3, 'gs' for GCS, 'oss' for Aliyun OSS.
String scheme();

// Name of the file system provider. 'builtin-local' for Local file system, 'builtin-hdfs' for HDFS,
// 's3' for AWS S3, 'gcs' for GCS, 'oss' for Aliyun OSS.
// You need to set the catalog property `filesystem-providers` to enable this file system.
String name();
```
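For illustration, a minimal custom provider might look like the sketch below. The class, package, scheme, and provider names are hypothetical, and the interface's package is assumed from the service-file name documented in the SPI step that follows; adjust both to the Gravitino version you build against.

```java
package com.example;

import java.io.IOException;
import java.util.Map;
import javax.annotation.Nonnull;
import org.apache.gravitino.catalog.fs.FileSystemProvider; // package assumed; see the SPI note below
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class MyFileSystemProvider implements FileSystemProvider {

  @Override
  public FileSystem getFileSystem(@Nonnull Path path, @Nonnull Map<String, String> config)
      throws IOException {
    // Copy the catalog properties into a Hadoop configuration and build the file system.
    Configuration configuration = new Configuration();
    config.forEach(configuration::set);
    return FileSystem.newInstance(path.toUri(), configuration);
  }

  @Override
  public String scheme() {
    return "myfs"; // Hypothetical URI scheme handled by this provider.
  }

  @Override
  public String name() {
    return "my-provider"; // List this value in the `filesystem-providers` catalog property.
  }
}
```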
`FileSystemProvider` implementations are loaded through the Java SPI mechanism. You need to create a file named `org.apache.gravitino.catalog.fs.FileSystemProvider` in the `META-INF/services` directory of the jar file. The content of the file is the fully qualified class name of the custom file system provider.
For example, for an S3 provider such as `S3FileSystemProvider`, the file contains that provider's fully qualified class name.
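Continuing with the hypothetical `com.example.MyFileSystemProvider` sketched above, the service file `META-INF/services/org.apache.gravitino.catalog.fs.FileSystemProvider` would contain a single line:

```text
com.example.MyFileSystemProvider
```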
After implementing the `FileSystemProvider` interface, you need to put the jar file into the `$GRAVITINO_HOME/catalogs/hadoop/libs` directory. Then you can set the `filesystem-providers` property to use your custom file system provider.
### Authentication for Hadoop Catalog
The Hadoop catalog supports multi-level authentication to control access, allowing different authentication settings for the catalog, schema, and fileset. The priority of authentication settings is as follows: catalog < schema < fileset. Specifically:
- **Catalog**: The default authentication is `simple`.
- **Schema**: Inherits the authentication setting from the catalog if not explicitly set. For more information about schema settings, please refer to [Schema properties](#schema-properties).
- **Fileset**: Inherits the authentication setting from the schema if not explicitly set. For more information about fileset settings, please refer to [Fileset properties](#fileset-properties).
The default value of `authentication.impersonation-enable` is `false` at the catalog level; for schemas and filesets, the default value is inherited from the parent. A value set explicitly by the user overrides the parent value, following the same priority mechanism as the authentication settings above.
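For example (the property values below are purely illustrative), the effective settings resolve like this:

```text
catalog:  authentication.type = simple     (set explicitly)
schema:   authentication.type = kerberos   (set explicitly, overrides the catalog value)
fileset:  authentication.type = (not set)  -> inherits kerberos from its schema
```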
### Catalog operations

Refer to [Catalog operations](./manage-fileset-metadata-using-gravitino.md#catalog-operations) for more details.
## Schema

### Schema capabilities

The Hadoop catalog supports creating, updating, deleting, and listing schemas.
### Schema properties
| Property name | Description | Default value | Required | Since Version |
|---------------------------------------|----------------------------------------------------------------------------------------------------------------|---------------------------|----------|------------------|
| `location` | The storage location managed by Hadoop schema. | (none) | No | 0.5.0 |
| `authentication.impersonation-enable` | Whether to enable impersonation for this schema of the Hadoop catalog. | The parent(catalog) value | No | 0.6.0-incubating |
| `authentication.type`                 | The type of authentication for this schema of the Hadoop catalog, currently we only support `kerberos`, `simple`. | The parent(catalog) value | No       | 0.6.0-incubating |
| `authentication.kerberos.principal` | The principal of the Kerberos authentication for this schema. | The parent(catalog) value | No | 0.6.0-incubating |
| `authentication.kerberos.keytab-uri`  | The URI of the keytab for the Kerberos authentication for this schema.                                          | The parent(catalog) value | No       | 0.6.0-incubating |
| `credential-providers` | The credential provider types, separated by comma. | (none) | No | 0.8.0-incubating |
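For example, a schema's property map might look like the following sketch (the location, principal, and keytab URI are placeholders, not values from a real deployment):

```java
Map<String, String> schemaProperties = Map.of(
    "location", "hdfs://namenode:9000/user/warehouse/sales_schema",
    "authentication.type", "kerberos",
    "authentication.kerberos.principal", "gravitino@EXAMPLE.COM",
    "authentication.kerberos.keytab-uri", "file:///etc/security/keytabs/gravitino.keytab");
```

Any fileset created under this schema inherits these authentication settings unless it sets its own.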
### Schema operations

Refer to [Schema operations](./manage-fileset-metadata-using-gravitino.md#schema-operations) for more details.
## Fileset

### Fileset capabilities

- The Hadoop catalog supports creating, updating, deleting, and listing filesets.
### Fileset properties
| Property name | Description | Default value | Required | Since Version |
|---------------------------------------|--------------------------------------------------------------------------------------------------------|--------------------------|----------|------------------|
| `authentication.impersonation-enable` | Whether to enable impersonation for the Hadoop catalog fileset. | The parent(schema) value | No | 0.6.0-incubating |
| `authentication.type` | The type of authentication for Hadoop catalog fileset, currently we only support `kerberos`, `simple`. | The parent(schema) value | No | 0.6.0-incubating |
| `authentication.kerberos.principal` | The principal of the Kerberos authentication for the fileset. | The parent(schema) value | No | 0.6.0-incubating |
| `authentication.kerberos.keytab-uri`  | The URI of the keytab for the Kerberos authentication for the fileset.                                  | The parent(schema) value | No       | 0.6.0-incubating |
| `credential-providers` | The credential provider types, separated by comma. | (none) | No | 0.8.0-incubating |
Credential providers can be specified in several places, as listed below. Gravitino checks the `credential-providers` setting in the following order of precedence:
1. Fileset properties
2. Schema properties
3. Catalog properties
### Fileset operations

Refer to [Fileset operations](./manage-fileset-metadata-using-gravitino.md#fileset-operations) for more details.
---
title: How to sign and verify Gravitino releases
slug: /how-to-sign-releases
license: "This software is licensed under the Apache License version 2."
---
These instructions provide a guide to signing and verifying Apache Gravitino releases to enhance the security of releases. A signed release enables people to confirm the author of the release and guarantees that the code hasn't been altered.
## Prerequisites
Before signing or verifying a Gravitino release, ensure you have the following prerequisites installed:
- GPG/GnuPG
- Release artifacts
### Platform support
These instructions are for macOS. You may need to make adjustments for other platforms.
1. **How to Install GPG or GnuPG:**
[GnuPG](https://www.gnupg.org) is an open-source implementation of the OpenPGP standard and allows you to encrypt and sign files or emails. GnuPG, also known as GPG, is a command line tool.
Check to see if GPG is installed by running the command:
```shell
gpg --help
```
If GPG/GnuPG isn't installed, run the following command to install it. You only need to do this step once.
```shell
brew install gpg
```
## Signing a release
1. **Create a Public/Private Key Pair:**
Check to see if you already have a public/private key pair by running the command:
```shell
gpg --list-secret-keys
```
If you get no output, you'll need to generate a public/private key pair.
Use this command to generate a public/private key pair. This is a one-time process. Set the key expiry to 5 years and omit a comment; all other defaults are acceptable.
```shell
gpg --full-generate-key
```
Here is an example of generating a public/private key pair by using the previous command.
```shell
gpg (GnuPG) 2.4.3; Copyright (C) 2023 g10 Code GmbH
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Please select what kind of key you want:
(1) RSA and RSA
(2) DSA and Elgamal
(3) DSA (sign only)
(4) RSA (sign only)
(9) ECC (sign and encrypt) *default*
(10) ECC (sign only)
(14) Existing key from card
Your selection?
Please select which elliptic curve you want:
(1) Curve 25519 *default*
(4) NIST P-384
(6) Brainpool P-256
Your selection?
Please specify how long the key should be valid.
0 = key does not expire
<n> = key expires in n days
<n>w = key expires in n weeks
<n>m = key expires in n months
<n>y = key expires in n years
Key is valid for? (0) 5y
Key expires at Mon 13 Nov 16:08:58 2028 AEDT
Is this correct? (y/N) y
GnuPG needs to construct a user ID to identify your key.
Real name: John Smith
Email address: john@apache.org
Comment:
You selected this USER-ID:
"John Smith <johnapache.org>"
Change (N)ame, (C)omment, (E)mail or (O)kay/(Q)uit? o
We need to generate a lot of random bytes. It is a good idea to perform
some other action (type on the keyboard, move the mouse, utilize the
disks) during the prime generation; this gives the random number
generator a better chance to gain enough entropy.
We need to generate a lot of random bytes. It is a good idea to perform
some other action (type on the keyboard, move the mouse, utilize the
disks) during the prime generation; this gives the random number
generator a better chance to gain enough entropy.
gpg: revocation certificate stored as '/Users/justin/.gnupg/openpgp-revocs.d/CC6BD9B0A3A31A7ACFF9E1383DF672F671B7F722.rev'
public and secret key created and signed.
pub ed25519 2023-11-15 [SC] [expires: 2028-11-13]
CC6BD9B0A3A31A7ACFF9E1383DF672F671B7F722
uid           John Smith <john@apache.org>
sub   cv25519 2023-11-15 [E] [expires: 2028-11-13]
```
:::caution important
Keep your private key secure and saved somewhere other than just on your computer. Don't forget your key password, and also securely record it somewhere. If you lose your keys or forget your password, you won't be able to sign releases.
:::
2. **Sign a release:**
To sign a release, use the following command for each release file:
```shell
gpg --detach-sign --armor <filename>.[zip|tar.gz]
```
For example, to sign the Gravitino 0.2.0 release, you would use this command:

```shell
gpg --detach-sign --armor gravitino.0.2.0.zip
```
This generates an .asc file containing a PGP signature. Anyone can use this file and your public signature to verify the release.
3. **Generate hashes for a release:**
Use the following command to generate hashes for a release:
```shell
shasum -a 256 <filename>.[zip|tar.gz] > <filename>.[zip|tar.gz].sha256
```
For example, to generate a hash for the Gravitino 0.2.0 release, you would use this command:

```shell
shasum -a 256 gravitino.0.2.0.zip > gravitino.0.2.0.zip.sha256
```
4. **Copy your public key to the KEYS file:**
The KEYS file contains all the public keys used to sign previous releases. You only need to do this step once. Execute the following command to copy your public key to a KEY file and then append it to the KEYS file.
```shell
gpg --output KEY --armor --export <youremail>
cat KEY >> KEYS
```
5. **Publish hashes and signatures:**
Upload the generated .asc and .sha256 files along with the release artifacts and KEYS file to the release area.
## Verifying a release
1. **Import public keys:**
Download the KEYS file. Import the public keys used to sign all previous releases with this command. It doesn't matter if you have already imported the keys.
```shell
gpg --import KEYS
```
2. **Verify the signature:**
Download the .asc and release files. Use the following command to verify the signature:
```shell
gpg --verify <filename>.[zip|tar.gz].asc
```
The output should contain the text "Good signature from ...".
For example, to verify the Gravitino 0.2.0 zip file, you would use this command:

```shell
gpg --verify gravitino.0.2.0.zip.asc
```
3. **Verify the hashes:**
Check if the hashes match, using the following command:
```shell
diff -u <filename>.[zip|tar.gz].sha256 <(shasum -a 256 <filename>.[zip|tar.gz])
```
For example, to verify the Gravitino 0.2.0 zip file, you would use this command:

```shell
diff -u gravitino.0.2.0.zip.sha256 <(shasum -a 256 gravitino.0.2.0.zip)
```
This command compares the two hashes; if it produces no output, the hashes match.
---
title: "OceanBase catalog"
slug: /jdbc-oceanbase-catalog
keywords:
- jdbc
- OceanBase
- metadata
license: "This software is licensed under the Apache License version 2."
---
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
## Introduction
Apache Gravitino provides the ability to manage OceanBase metadata.
:::caution
Gravitino saves some system information in schema and table comments, such as `(From Gravitino, DO NOT EDIT: gravitino.v1.uid1078334182909406185)`. Please don't change or remove this message.
:::
## Catalog

### Catalog capabilities
- Gravitino catalog corresponds to the OceanBase instance.
- Supports metadata management of OceanBase (4.x).
- Supports DDL operation for OceanBase databases and tables.
- Supports table index.
- Supports [column default value](./manage-relational-metadata-using-gravitino.md#table-column-default-value) and [auto-increment](./manage-relational-metadata-using-gravitino.md#table-column-auto-increment).
### Catalog properties
You can pass any property not defined by Gravitino to an OceanBase data source by adding the `gravitino.bypass.` prefix as a catalog property. For example, the catalog property `gravitino.bypass.maxWaitMillis` passes `maxWaitMillis` to the data source.
Check the relevant data source configuration in [data source properties](https://commons.apache.org/proper/commons-dbcp/configuration.html).
If you use a JDBC catalog, you must provide `jdbc-url`, `jdbc-driver`, `jdbc-user` and `jdbc-password` to catalog properties.
Besides the [common catalog properties](./gravitino-server-config.md#gravitino-catalog-properties-configuration), the OceanBase catalog has the following properties:
| Configuration item | Description | Default value | Required | Since Version |
|----------------------|---------------------------------------------------------------------------------------------------------------------------------------|---------------|----------|------------------|
| `jdbc-url` | JDBC URL for connecting to the database. For example, `jdbc:mysql://localhost:2881` or `jdbc:oceanbase://localhost:2881` | (none) | Yes | 0.7.0-incubating |
| `jdbc-driver` | The driver of the JDBC connection. For example, `com.mysql.jdbc.Driver` or `com.mysql.cj.jdbc.Driver` or `com.oceanbase.jdbc.Driver`. | (none) | Yes | 0.7.0-incubating |
| `jdbc-user` | The JDBC user name. | (none) | Yes | 0.7.0-incubating |
| `jdbc-password` | The JDBC password. | (none) | Yes | 0.7.0-incubating |
| `jdbc.pool.min-size` | The minimum number of connections in the pool. `2` by default. | `2` | No | 0.7.0-incubating |
| `jdbc.pool.max-size` | The maximum number of connections in the pool. `10` by default. | `10` | No | 0.7.0-incubating |
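For example, an OceanBase catalog could be configured with a property map like the following sketch (the connection values are placeholders):

```java
Map<String, String> catalogProperties = Map.of(
    "jdbc-url", "jdbc:oceanbase://localhost:2881",
    "jdbc-driver", "com.oceanbase.jdbc.Driver",
    "jdbc-user", "admin",
    "jdbc-password", "password",
    "jdbc.pool.max-size", "20");
```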
:::caution
Before using the OceanBase Catalog, you must download the corresponding JDBC driver to the `catalogs/jdbc-oceanbase/libs` directory.
Gravitino doesn't package the JDBC driver for OceanBase due to licensing issues.
:::
### Catalog operations

Refer to [Manage Relational Metadata Using Gravitino](./manage-relational-metadata-using-gravitino.md#catalog-operations) for more details.
## Schema

### Schema capabilities
- Gravitino's schema concept corresponds to the OceanBase database.
- Supports creating schema, but does not support setting comment.
- Supports dropping schema.
- Supports cascade dropping schema.
### Schema properties
- Doesn't support any schema property settings.
### Schema operations

Refer to [Manage Relational Metadata Using Gravitino](./manage-relational-metadata-using-gravitino.md#schema-operations) for more details.
## Table

### Table capabilities
- Gravitino's table concept corresponds to the OceanBase table.
- Supports DDL operation for OceanBase tables.
- Supports index.
- Supports [column default value](./manage-relational-metadata-using-gravitino.md#table-column-default-value) and [auto-increment](./manage-relational-metadata-using-gravitino.md#table-column-auto-increment).
### Table properties
- Doesn't support table properties.
### Table column types
| Gravitino Type | OceanBase Type |
|-------------------|---------------------|
| `Byte` | `Tinyint` |
| `Byte(false)` | `Tinyint Unsigned` |
| `Short` | `Smallint` |
| `Short(false)` | `Smallint Unsigned` |
| `Integer` | `Int` |
| `Integer(false)` | `Int Unsigned` |
| `Long` | `Bigint` |
| `Long(false)` | `Bigint Unsigned` |
| `Float` | `Float` |
| `Double` | `Double` |
| `String` | `Text` |
| `Date` | `Date` |
| `Time` | `Time` |
| `Timestamp` | `Timestamp` |
| `Decimal` | `Decimal` |
| `VarChar` | `VarChar` |
| `FixedChar` | `FixedChar` |
| `Binary` | `Binary` |
:::info
OceanBase doesn't support the Gravitino `Boolean`, `Fixed`, `Struct`, `List`, `Map`, `Timestamp_tz`, `IntervalDay`, `IntervalYear`, `Union`, or `UUID` types.
Since 0.6.0-incubating, data types other than those listed above are mapped to Gravitino's **[External Type](./manage-relational-metadata-using-gravitino.md#external-type)**, which represents an unresolvable data type.
:::
### Table column auto-increment
:::note
In OceanBase, setting an auto-increment column requires also defining a unique index on that column; otherwise, an error will occur.
:::
<Tabs groupId='language' queryString>
<TabItem value="json" label="Json">
```json
{
"columns": [
{
"name": "id",
"type": "integer",
"comment": "id column comment",
"nullable": false,
"autoIncrement": true
},
{
"name": "name",
"type": "varchar(500)",
"comment": "name column comment",
"nullable": true,
"autoIncrement": false
}
],
"indexes": [
{
"indexType": "primary_key",
"name": "PRIMARY",
"fieldNames": [["id"]]
}
]
}
```
</TabItem>
<TabItem value="java" label="Java">
```java
Column[] cols = new Column[] {
Column.of("id", Types.IntegerType.get(), "id column comment", false, true, null),
Column.of("name", Types.VarCharType.of(500), "Name of the user", true, false, null)
};
Index[] indexes = new Index[] {
Indexes.of(IndexType.PRIMARY_KEY, "PRIMARY", new String[][]{{"id"}})
};
```
</TabItem>
</Tabs>
Table indexes
- Supports PRIMARY_KEY and UNIQUE_KEY.
<Tabs groupId='language' queryString>
<TabItem value="json" label="Json">
```json
{
"indexes": [
{
"indexType": "primary_key",
"name": "PRIMARY",
"fieldNames": [["id"]]
},
{
"indexType": "unique_key",
"name": "id_name_uk",
"fieldNames": [["id"] ,["name"]]
}
]
}
```
</TabItem>
<TabItem value="java" label="Java">
```java
Index[] indexes = new Index[] {
Indexes.of(IndexType.PRIMARY_KEY, "PRIMARY", new String[][]{{"id"}}),
Indexes.of(IndexType.UNIQUE_KEY, "id_name_uk", new String[][]{{"id"} , {"name"}}),
};
```
</TabItem>
</Tabs>
### Table operations
:::note
The OceanBase catalog does not support creating partitioned tables in the current version.
:::
Refer to [Manage Relational Metadata Using Gravitino](./manage-relational-metadata-using-gravitino.md#table-operations) for more details.
#### Alter table operations
Gravitino supports these table alteration operations (a construction sketch follows the note below):
- `RenameTable`
- `UpdateComment`
- `AddColumn`
- `DeleteColumn`
- `RenameColumn`
- `UpdateColumnType`
- `UpdateColumnPosition`
- `UpdateColumnNullability`
- `UpdateColumnComment`
- `UpdateColumnDefaultValue`
- `SetProperty`
:::info
- You cannot submit the `RenameTable` operation at the same time as other operations.
- If you change a column from nullable to non-nullable, there may be compatibility issues.
:::
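As a sketch of how such changes are expressed with the Java client (the factory method names are assumed from `org.apache.gravitino.rel.TableChange`; check the guide linked above for the exact API of your version):

```java
TableChange renameTable = TableChange.rename("customer_archive");
TableChange updateComment = TableChange.updateComment("archived customer data");
TableChange widenIdColumn = TableChange.updateColumnType(new String[] {"id"}, Types.LongType.get());
// Pass the changes to TableCatalog#alterTable(...) as described in the linked guide.
```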
---
title: "MySQL catalog"
slug: /jdbc-mysql-catalog
keywords:
- jdbc
- MySQL
- metadata
license: "This software is licensed under the Apache License version 2."
---
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
## Introduction
Apache Gravitino provides the ability to manage MySQL metadata.
:::caution
Gravitino saves some system information in schema and table comments, such as `(From Gravitino, DO NOT EDIT: gravitino.v1.uid1078334182909406185)`. Please don't change or remove this message.
:::
## Catalog

### Catalog capabilities
- Gravitino catalog corresponds to the MySQL instance.
- Supports metadata management of MySQL (5.7, 8.0).
- Supports DDL operation for MySQL databases and tables.
- Supports table index.
- Supports [column default value](./manage-relational-metadata-using-gravitino.md#table-column-default-value) and [auto-increment](./manage-relational-metadata-using-gravitino.md#table-column-auto-increment).
- Supports managing MySQL table features through table properties, such as using `engine` to set the MySQL storage engine.
### Catalog properties
You can pass any property not defined by Gravitino to a MySQL data source by adding the `gravitino.bypass.` prefix as a catalog property. For example, the catalog property `gravitino.bypass.maxWaitMillis` passes `maxWaitMillis` to the data source.
Check the relevant data source configuration in [data source properties](https://commons.apache.org/proper/commons-dbcp/configuration.html).
When you use Gravitino with Trino, you can pass Trino MySQL connector configuration using the `trino.bypass.` prefix. For example, use `trino.bypass.join-pushdown.strategy` to pass `join-pushdown.strategy` to the Gravitino MySQL catalog in the Trino runtime.
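For instance, a MySQL catalog's property map could forward settings to both the DBCP pool and the Trino connector, as in this sketch (values are illustrative):

```java
Map<String, String> catalogProperties = Map.of(
    "gravitino.bypass.maxWaitMillis", "5000",            // forwarded to the DBCP data source
    "trino.bypass.join-pushdown.strategy", "EAGER");     // forwarded to Trino's MySQL connector
```

The required `jdbc-url`, `jdbc-driver`, `jdbc-user`, and `jdbc-password` properties described below still apply.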
If you use a JDBC catalog, you must provide `jdbc-url`, `jdbc-driver`, `jdbc-user` and `jdbc-password` to catalog properties.
Besides the [common catalog properties](./gravitino-server-config.md#gravitino-catalog-properties-configuration), the MySQL catalog has the following properties:
| Configuration item | Description | Default value | Required | Since Version |
|----------------------|--------------------------------------------------------------------------------------------------------|---------------|----------|---------------|
| `jdbc-url` | JDBC URL for connecting to the database. For example, `jdbc:mysql://localhost:3306` | (none) | Yes | 0.3.0 |
| `jdbc-driver` | The driver of the JDBC connection. For example, `com.mysql.jdbc.Driver` or `com.mysql.cj.jdbc.Driver`. | (none) | Yes | 0.3.0 |
| `jdbc-user` | The JDBC user name. | (none) | Yes | 0.3.0 |
| `jdbc-password` | The JDBC password. | (none) | Yes | 0.3.0 |