Mage-ai

Latest version: v0.9.76


0.9.4

![Solar Flare](https://media.giphy.com/media/l2Sq7UcchYPqoR3vq/giphy-downsized.gif)

🎉 Features

Azure Data Lake streaming pipeline support

**Docs:** https://docs.mage.ai/guides/streaming/destinations/azure_data_lake

Mage now supports Azure Data Lake as a streaming destination!

![Azure Data Lake Destination](https://media.graphassets.com/output=format:jpg/resize=height:800,fit:max/b79fIgRl2XTTQyHIXY9w)
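
For orientation, a streaming destination config for this pipeline type could look roughly like the sketch below; apart from `connector_type`, the key names are assumptions, so follow the docs linked above for the authoritative schema.

```yaml
# Rough sketch of an Azure Data Lake streaming destination config
# (key names other than connector_type are assumed; see the docs link above)
connector_type: azure_data_lake
account_name: my_storage_account                          # assumed key
access_key: "{{ env_var('AZURE_STORAGE_ACCESS_KEY') }}"   # assumed key
container: raw-events                                     # assumed key
prefix: streaming/demo                                    # assumed key
```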

Pipeline Tags

Tags can now be applied to pipelines. Users can leverage the pipeline view to apply filters or group pipelines by tag.

![Pipeline Tags](https://media.graphassets.com/HFLDE5fSACfPR1riLuHZ)

![Pipeline Tags](https://media.graphassets.com/zHQamvHMRvGUoA5QwPMy)
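
Tags end up in the pipeline's configuration; a minimal sketch of a pipeline `metadata.yaml` carrying tags (layout assumed) looks like:

```yaml
# Sketch: pipeline metadata.yaml with tags (structure assumed)
name: demo_pipeline
uuid: demo_pipeline
type: python
tags:
- finance
- daily
```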

Support for custom k8s executor job prefixes

You can now prefix your k8s executor jobs! Here’s an example k8s executor config file:

```yaml
k8s_executor_config:
  job_name_prefix: data-prep
  resource_limits:
    cpu: 1000m
    memory: 2048Mi
  resource_requests:
    cpu: 500m
    memory: 1024Mi
  service_account_name: default
```


See [the documentation](https://docs.mage.ai/production/configuring-production-settings/compute-resource#kubernetes-executor) for further details.

Removed data integration config details from logs

Mage no longer prints data integration settings in logs: a big win for security. 🔒

![image](https://github.com/mage-ai/mage-ai/assets/59450879/a11cf4c3-b625-4ebe-9855-ddd2d9754b3d)

💅 Other bug fixes & polish

Cloud deployment
- Fix k8s job deletion error.
- Fix fetching AWS events while editing trigger.
- Fixes for Azure deployments.
- Integrate with Azure Key Vault to support `azure_secret_var` syntax: [docs](https://docs.mage.ai/production/configuring-production-settings/secrets#azure-key-vault) (see the sketch after this list).
- Pass resource limits from main ECS task to dev tasks.
- Use network configuration from main ECS service as a template for dev tasks.
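
As a sketch of the `azure_secret_var` syntax referenced above (mirroring the existing `aws_secret_var` macro; the secret name and its placement in `io_config.yaml` are illustrative only):

```yaml
# Sketch: interpolating an Azure Key Vault secret (secret name is a placeholder)
default:
  POSTGRES_PASSWORD: "{{ azure_secret_var('postgres-password') }}"
```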

Integrations
- Fix Postgres schema resolution error; schema names containing characters such as hyphens now resolve correctly.
- Escape reserved column names in Postgres.
- Snowflake strings are now cast as `VARCHAR` instead of `VARCHAR(255)`. The MySQL loader now uses `TEXT` for strings to avoid truncation.
- Use AWS session token in io.s3.
- Fix an issue with the database setting when running ClickHouse SQL blocks.

Other
- Fix multiple upstream block callback error. Input variables will now be fetched one block at a time.
- Fix data integration metrics calculation.
- Improve variable serialization and deserialization; this fixes kernel crashes due to OOM errors.
- User quote: "A pipeline that was taking ~1h runs in less than 2 min!"
- Fix a trigger edit bug that would reset fields in the trigger.
- Fix default filepath in ConfigFileLoader (Thanks Ethan!)
- Move `COPY` step to reduce Docker build time.
- Validate env values in trigger config.
- Fix overview crashing.
- Fix cron settings when editing in trigger.
- Fix editing pipeline’s executor type from settings.
- Fix block pipeline policy issue.

🗣️ Shout outs

- ethanbrown3 made their first contribution in https://github.com/mage-ai/mage-ai/pull/2976 🎉
- erictse made their first contribution in https://github.com/mage-ai/mage-ai/pull/2977 🥳

0.9.0

![image](https://media.giphy.com/media/v1.Y2lkPTc5MGI3NjExMXluYjk2Znk0OGVtMHlkNDUxd3R1cXczcTQ0bXd5ZTFzazJub201ciZlcD12MV9pbnRlcm5hbF9naWZfYnlfaWQmY3Q9Zw/itOLXOfzqYgtVoiS4d/giphy.gif)

Workspace management

You can now use Mage with multiple workspaces in the cloud. Mage has a built-in workspace manager that can be enabled in production. This feature is similar to **[multi-development environments](https://docs.mage.ai/developing-in-the-cloud/cloud-dev-environments/overview)**, but some settings can be shared across workspaces. For example, the project owner can set workspace-level permissions for users. The additional features currently supported are:

- workspace level permissions
- workspace level git settings

Upcoming features:

- common workspace metadata file
- customizable permissions and roles
- pipeline level permissions

Doc: https://docs.mage.ai/developing-in-the-cloud/workspaces/overview

![Untitled](https://media.graphassets.com/DODu9phUROSOZDD7wO1e)

![Untitled](https://media.graphassets.com/OYLX9d1WRzKiMg6gpyVQ)

**Pipeline monitoring dashboard**

Add "Overview" page to dashboard providing summary of pipeline run metrics and failures.

![Untitled](https://media.graphassets.com/jURVZhYFQ4m7bOGLBMAO)

**Version control application**

Support all Git operations through the UI. Authenticate with GitHub, then pull from a remote repository, push local changes to it, and create pull requests.

Doc: https://docs.mage.ai/production/data-sync/github

![Untitled](https://media.graphassets.com/0trCSf4zR2C5zq9JFomW)

**New Relic monitoring**

- Set the `ENABLE_NEW_RELIC` environment variable to enable or disable New Relic monitoring.
- Users need to follow the New Relic guide to create a configuration file with a `license_key` and app name.

Doc: https://docs.mage.ai/production/observability/newrelic

![Untitled](https://media.graphassets.com/dyv28I9eRSOJw3TSO9lU)

![Untitled](https://media.graphassets.com/MHT9Y8Rf6DCqeIrgeT2w)

Authentication

**Active Directory OAuth**

Enable signing in to Mage with a Microsoft Active Directory account.

Doc: https://docs.mage.ai/production/authentication/microsoft

![Untitled](https://media.graphassets.com/CvybxdeRVaqNgxXkks9Q)

LDAP

https://docs.mage.ai/production/authentication/overview#ldap

- Update default LDAP user access from editor to no access. Add an environment variable `LDAP_DEFAULT_ACCESS` so that the default access can be customized.

**Add option to sync from Git on server start**

There are two ways to configure Mage to sync from Git on server start:

- Toggle `Sync on server start up` option in Git settings UI
- Set `GIT_SYNC_ON_START` environment variable (options: 0 or 1)
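
For example, the environment-variable route could be wired up in a `docker-compose.yml` like this (a sketch; adapt the service definition to your deployment):

```yaml
# Sketch: enable Git sync on server start via environment variable
services:
  mage:
    image: mageai/mageai:latest
    environment:
      GIT_SYNC_ON_START: "1"
```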

Doc: https://docs.mage.ai/production/data-sync/git#git-settings-as-environment-variables

![Untitled](https://media.graphassets.com/1ZTJnNl9T6CUPjES6SkO)

Data integration pipeline

**Mode Analytics Source**

Shout out to [Mohamad Balouza](https://github.com/mohamad-balouza) for contributing the Mode Analytics source to the Mage data integration pipeline.

Doc: https://github.com/mage-ai/mage-ai/blob/master/mage_integrations/mage_integrations/sources/mode/README.md

**OracleDB Destination**

Doc: https://github.com/mage-ai/mage-ai/blob/master/mage_integrations/mage_integrations/destinations/oracledb/README.md

**MinIO support for S3 in Data integrations pipeline**

Support using S3 source to connect to MinIO by configuring the `aws_endpoint` in the config.

Doc: https://github.com/mage-ai/mage-ai/blob/master/mage_integrations/mage_integrations/sources/amazon_s3/README.md

Bug fixes and improvements

- **Snowflake:** Use `TIMESTAMP_TZ` as the column type for Snowflake datetime columns.
- **BigQuery:** No longer require a key file for the [BigQuery](https://github.com/mage-ai/mage-ai/tree/master/mage_integrations/mage_integrations/sources/bigquery) source and destination. When Mage is deployed on GCP, it can authenticate with the service account.
- **Google Cloud Storage:** Allow authenticating with [Google Cloud Storage](https://github.com/mage-ai/mage-ai/blob/master/mage_integrations/mage_integrations/destinations/google_cloud_storage/README.md) using a service account.
- **MySQL**
- Fix inserting DOUBLE columns into MySQL destination
- Fix comparing datetime bookmark column in MySQL source
- Use backticks to wrap column name in MySQL
- **MongoDB source:** Add authSource and authMechanism options for [MongoDB source](https://github.com/mage-ai/mage-ai/blob/master/mage_integrations/mage_integrations/sources/mongodb/README.md).
- **Salesforce source:** Fix loading sample data for Salesforce source
- Improve visibility into non-functioning "test connection" and "load sample data" features for integration pipelines:
  - Show an unsupported error if "Test connection" is not implemented for an integration source.
  - Update error messaging for "Load sample data" to let users know that it may not be supported for the currently selected integration source.
- Interpolate pipeline name and UUID in data integration pipelines. Doc: https://docs.mage.ai/data-integrations/configuration#variable-names

SQL block

**OracleDB Loader Block**

Added OracleDB Data Loader block

![Untitled](https://media.graphassets.com/AIiZjYfaQSGcTHlE8raV)

Bug fixes

- **MSSQL:** Fix MSSQL SQL block schema handling. The schema was not properly set when checking table existence; `dbo` is now used as the default schema if no schema is set.
- **Trino:** Fix inserting datetime column into Trino
- **BigQuery:** Throw exception in BigQuery SQL block
- **ClickHouse:** Support automatic table creation for ClickHouse data exporter

DBT block

DBT ClickHouse

Shout out to [Daesgar](https://github.com/Daesgar) for contributing support for running ClickHouse DBT models in Mage.

![Untitled](https://media.graphassets.com/rLdMkeobSYWkwtUo4boH)

**Add DBT generic command block**

Add a DBT block that can run any generic command

![Untitled](https://media.graphassets.com/iUxh8itLTlq8YKbkAG41)

Bug fixes and improvements

- Fix bug: Running DBT block preview would sometimes not use sample limit amount.
- Fix bug: Existing upstream block would get overwritten when adding a dbt block with a ref to that existing upstream block.
- Fix bug: Duplicate upstream block added when new block contains upstream block ref and upstream block already exists.
- Use UTF-8 encoding when logging output from DBT blocks.

Notebook improvements

- Turn on output to logs when running a single block in the notebook

![Untitled](https://media.graphassets.com/5Xa1Oy3STIGmiIRx6nSW)

![Untitled](https://media.graphassets.com/bhISkb0PQyyMhlEX2RCe)

- When running a block in the notebook, provide an option to only run the upstream blocks that haven’t been executed successfully.

![Untitled](https://media.graphassets.com/LeggxmI8SaqZ5hVagQr3)

- Change the color of a custom block from the UI.

![Untitled](https://media.graphassets.com/Xok9gmiEQkqUNiMTIhBB)

- Show what pipelines are using a particular block
- Show block settings in the sidekick when selecting a block
- Show which pipelines a block is used in
- Create a block cache class that stores block to pipeline mapping

![Untitled](https://media.graphassets.com/Twy9p5x4QYC8wUvsQw5Q)

- Enhanced pipeline settings page and block settings page
- Edit pipeline and block executor type and interpolate
- Edit pipeline and block retry config from the UI
- Edit block name and color from block settings

![Untitled](https://media.graphassets.com/wDH3FmQPQpW5VB6tRMUX)

![Untitled](https://media.graphassets.com/UsQo7hxuQvSLOkEo5Pg2)

- Enhance dependency tree node to show callbacks, conditionals, and extensions

![Untitled](https://media.graphassets.com/J2ZVk4cmQQGbG4iwAWaV)

- Save trigger from UI to code

![Untitled](https://media.graphassets.com/TbjTMd2RwmQchHWnLDwM)


Cloud deployment

- Allow setting service account name for [k8s executor](https://docs.mage.ai/production/configuring-production-settings/compute-resource#kubernetes-executor)
- Example k8s executor config:

```yaml
k8s_executor_config:
  resource_limits:
    cpu: 1000m
    memory: 2048Mi
  resource_requests:
    cpu: 500m
    memory: 1024Mi
  service_account_name: custom_service_account_name
```


- Support customizing the timeout seconds in [GCP cloud run config](https://docs.mage.ai/production/configuring-production-settings/compute-resource#gcp-cloud-run-executor).
- Example config

```yaml
gcp_cloud_run_config:
  path_to_credentials_json_file: "/path/to/credentials_json_file"
  project_id: project_id
  timeout_seconds: 600
```


- Check ECS task status after running the task.

**Streaming pipeline**

- Fix copy output in streaming pipeline. Catch the deepcopy error (`TypeError: cannot pickle '_thread.lock' object in the deepcopy from the handle_batch_events_recursively`) and fall back to the copy method.

Spark pipeline

- Fix an issue with setting custom Spark pipeline config.
- Fix testing Spark DataFrame. Pass the correct Spark DataFrame to the test method.

Other bug fixes & polish

- Add json value macro. Example usage: `"{{ json_value(aws_secret_var('test_secret_key_value'), 'k1') }}"`
- Allow slashes in block_uuid when downloading block output. The regex for the block output download endpoint would not capture block_uuids with slashes in them, so this fixes that.
- Fix renaming block.
- Fix user auth when disable notebook edits is enabled.
- Allow `JWT_SECRET` to be modified via env var. The `JWT_SECRET` for encoding and decoding access tokens was hardcoded; the fix allows users to update it through an [environment variable](https://docs.mage.ai/development/environment-variables).
- Hide duplicate shortcut items in editor context menu
- Before (after running the block a few times and removing/adding block connections):

![Untitled](https://media.graphassets.com/U0qz5mdSIOgkMpVKA0Ey)

- After (after following the same steps and running the block a few times and removing/adding block connections):

![Untitled](https://media.graphassets.com/Iqb5GcQjRfCGrZkisH31)

- When changing the name of a block or creating a new block, auto-create non-existent folders if the block name is using nested block names.
- Fix trigger count in pipeline dashboard
- Fix copy text for secrets
- Fix git sync `asyncio` issue
- Fix Circular Import when importing `get_secret_value` method
- Shorten branch name in the header. If the branch name is longer than 21 characters, show an ellipsis.
- Replace hard-to-read dark blue font in code block output with much more legible yellow font.
- Show error popup if error occurs when updating pipeline settings.
- Update tree node when block status changes
- Prevent sending notification multiple times for multiple block failures

0.8.93

![Image](https://media.giphy.com/media/JHVyTBvMRgQ9UnWTOf/giphy.gif)

Conditional block

Add conditional block to Mage. The conditional block is an "Add-on" block that can be added to an existing block within a pipeline. If the conditional block evaluates as False, the parent block will not be executed.

Doc: https://docs.mage.ai/development/blocks/conditionals/overview

![Untitled](https://media.graphassets.com/fBtlFN8HTBug2RovcxLX)

![Untitled](https://media.graphassets.com/Ijt5ACNIQPiLYdpeo9pO)

![Untitled](https://media.graphassets.com/GPIWZwSxRLmWnKvaeR5m)

Download block output

For standard pipelines (not currently supported in integration or streaming pipelines), you can save the output of a block that has been run as a CSV file. You can save block output from the Pipeline Editor page or the Block Runs page.

Doc: https://docs.mage.ai/orchestration/pipeline-runs/saving-block-output-as-csv

![Untitled](https://media.graphassets.com/4IBYbOdQ3yTReYfZzmzw)

![Untitled](https://media.graphassets.com/GvFLkzwGQqKDcIiseWpQ)

Customize pipeline-level Spark config

Mage supports customizing the Spark session for a pipeline by specifying `spark_config` in the pipeline's `metadata.yaml` file. The pipeline-level `spark_config` overrides the project-level `spark_config` if specified.

Doc: https://docs.mage.ai/integrations/spark-pyspark#custom-spark-session-at-the-pipeline-level
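
A minimal sketch of such an override in a pipeline's `metadata.yaml` (values are placeholders; the available keys mirror the project-level `spark_config` shown further down in these notes):

```yaml
# Sketch: pipeline-level spark_config overriding the project-level settings
spark_config:
  app_name: 'my pipeline spark app'
  spark_master: 'yarn'
  others:
    spark.executor.memory: '4g'
```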

Data integration pipeline

Oracle DB source

Doc: https://github.com/mage-ai/mage-ai/blob/master/mage_integrations/mage_integrations/sources/oracledb/README.md

Download file data in the API source

Doc: https://github.com/mage-ai/mage-ai/tree/master/mage_integrations/mage_integrations/sources/api

Personalize notification messages

Users can customize the notification templates of different channels (Slack, email, etc.) in the project's metadata.yaml. Here are the supported variables that can be interpolated in the message templates: `execution_time`, `pipeline_run_url`, `pipeline_schedule_id`, `pipeline_schedule_name`, `pipeline_uuid`.

Example config in project's metadata.yaml

```yaml
notification_config:
  slack_config:
    webhook_url: "{{ env_var('MAGE_SLACK_WEBHOOK_URL') }}"
  message_templates:
    failure:
      details: >
        Failure to execute pipeline {pipeline_run_url}.
        Pipeline uuid: {pipeline_uuid}. Trigger name: {pipeline_schedule_name}.
        Test custom message.
```


![Untitled](https://media.graphassets.com/4j7FVwkQZCw13uScI2kO)

Doc: https://docs.mage.ai/production/observability/alerting-slack#customize-message-templates

**Support MSSQL and MySQL as the database engine**

Mage stores orchestration data, user data, and secrets data in a database. In addition to SQLite and Postgres, Mage now supports using MSSQL and MySQL as the database engine.

MSSQL docs:

- https://docs.mage.ai/production/databases/default#mssql
- https://docs.mage.ai/getting-started/setup#using-mssql-as-database

MySQL docs:

- https://docs.mage.ai/production/databases/default#mysql
- https://docs.mage.ai/getting-started/setup#using-mysql-as-database
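
As a sketch, the database engine is typically selected through the `MAGE_DATABASE_CONNECTION_URL` environment variable; the connection URL below is an illustrative placeholder, so check the docs above for the exact driver strings:

```yaml
# Sketch: pointing Mage at MySQL for orchestration data via docker-compose
services:
  mage:
    image: mageai/mageai:latest
    environment:
      # MySQL example (placeholder credentials); MSSQL uses an mssql+pyodbc-style URL
      MAGE_DATABASE_CONNECTION_URL: "mysql+pymysql://user:password@host:3306/mage"
```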

Add MinIO and Wasabi support via S3 data loader block

Mage now supports connecting to MinIO and Wasabi by specifying the `AWS_ENDPOINT` field in the S3 config.

Doc: https://docs.mage.ai/integrations/databases/S3#minio-support
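
A sketch of what the S3 section of `io_config.yaml` might look like when pointed at MinIO (endpoint and credentials are placeholders; key names follow Mage's usual io_config conventions):

```yaml
# Sketch: io_config.yaml S3 settings aimed at a MinIO endpoint
default:
  AWS_ACCESS_KEY_ID: minio_access_key
  AWS_SECRET_ACCESS_KEY: minio_secret_key
  AWS_ENDPOINT: http://localhost:9000
  AWS_REGION: us-east-1
```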

Use dynamic blocks with replica blocks

To maximize block reuse, you can use dynamic and replica blocks in combination.

- https://docs.mage.ai/design/blocks/dynamic-blocks
- https://docs.mage.ai/design/blocks/replicate-blocks

![Untitled](https://media.graphassets.com/f5lRRy5lQAC8K0Gkcucu)

Other bug fixes & polish

- The command `CREATE SCHEMA IF NOT EXISTS` is not supported by MSSQL. Provided a default command in `BaseSQL.build_create_schema_command`, and an overridden implementation in `MSSQL.build_create_schema_command` containing compatible syntax. (Kudos to [gjvanvuuren](https://github.com/gjvanvuuren))
- Fix streaming pipeline `kwargs` passing so that RabbitMQ messages can be acknowledged correctly.
- Interpolate variables in streaming configs.
- Git integration: Create known hosts if it doesn't exist.
- Do not create duplicate triggers when DB query fails on checking existing triggers.
- Fix bug: when there are multiple downstream replica blocks, those blocks are not getting queued.
- Fix block uuid formatting for logs.
- Update WidgetPolicy to allow editing and creating widgets without authorization errors.
- Update sensor block to accept positional arguments.
- Fix variables for GCP Cloud Run executor.
- Fix MERGE command for Snowflake destination.
- Fix encoding issue of file upload.
- Always delete the temporary DBT profiles dir to prevent file browser performance degradation.

0.8.86

![Image](https://media.giphy.com/media/roh7bs2cEW2ReFgYRN/giphy-downsized.gif)

Replicate blocks

Support reusing the same block multiple times in a single pipeline.

Doc: https://docs.mage.ai/design/blocks/replicate-blocks
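
In the pipeline's `metadata.yaml`, a replica is roughly a block entry that points back at the original block; the `replicated_block` key below is a best guess at the naming, so confirm it against the doc above:

```yaml
# Sketch: a block and one replica of it in a pipeline metadata.yaml
blocks:
- uuid: load_data
  type: data_loader
  language: python
- uuid: load_data_eu             # replica uuid (illustrative)
  type: data_loader
  language: python
  replicated_block: load_data    # key name assumed; see replicate-blocks doc
```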

![Untitled](https://media.graphassets.com/ufT5VQlTTFK70zZZAHkF)

![Untitled](https://media.graphassets.com/kwFLC5WDSiGq0wTwXKB5)

Spark on Yarn

Support running Spark code on a Yarn cluster with Mage.

Doc: https://docs.mage.ai/integrations/spark-pyspark#hadoop-and-yarn-cluster-for-spark

Customize retry config

Mage supports configuring automatic retries for block runs in the following ways:

1. Add `retry_config` to project’s `metadata.yaml`. This `retry_config` will be applied to all block runs.
2. Add `retry_config` to the block config in pipeline’s `metadata.yaml`. The block level `retry_config` will override the global `retry_config`.

Example config:

```yaml
retry_config:
  # Number of retry times
  retries: 0
  # Initial delay before retry. If exponential_backoff is true,
  # the delay time is multiplied by 2 for the next retry
  delay: 5
  # Maximum time between the first attempt and the last retry
  max_delay: 60
  # Whether to use exponential backoff retry
  exponential_backoff: true
```


Doc: https://docs.mage.ai/orchestration/pipeline-runs/retrying-block-runs#automatic-retry

DBT improvements

- When running a DBT block with language YAML, interpolate and merge the user-defined `--vars` in the block's code into the variables that Mage automatically constructs.
- Example block code of different formats

```bash
--select demo/models --vars '{"demo_key": "demo_value", "date": 20230101}'
--select demo/models --vars {"demo_key":"demo_value","date":20230101}
--select demo/models --vars '{"global_var": {{ test_global_var }}, "env_var": {{ test_env_var }}}'
--select demo/models --vars {"refresh":{{page_refresh}},"env_var":{{env}}}
```


- Doc: https://docs.mage.ai/dbt/run-single-model#adding-variables-when-running-a-yaml-dbt-block
- Support `dbt_project.yml` custom project names and custom profile names that are different than the DBT folder name
- Allow user to configure block to run DBT snapshot

Dynamic SQL block

Support using dynamic child blocks for SQL blocks

Doc: https://docs.mage.ai/design/blocks/dynamic-blocks#dynamic-sql-blocks

Run blocks concurrently in separate containers on Azure

If your Mage app is deployed on Microsoft Azure with Mage’s **[terraform scripts](https://github.com/mage-ai/mage-ai-terraform-templates/tree/master/azure)**, you can choose to launch separate Azure container instances to execute blocks.

Doc: https://docs.mage.ai/production/configuring-production-settings/compute-resource#azure-container-instance-executor

Run the scheduler and the web server in separate containers or pods

- Run scheduler only: `mage start project_name --instance-type scheduler`
- Run web server only: `mage start project_name --instance-type web_server`
  - The web server can be run in multiple containers or pods
- Run both server and scheduler: `mage start project_name --instance-type server_and_scheduler`

Support all operations on folder

Support “Add”, “Rename”, “Move”, and “Delete” operations on folders.

![Untitled](https://media.graphassets.com/YUnvzFbR2SBZ61y1Eton)

Configure environments for triggers in code

Allow specifying `envs` value to apply triggers only in certain environments.

Example:

```yaml
triggers:
- name: test_example_trigger_in_prod
  schedule_type: time
  schedule_interval: "daily"
  start_time: 2023-01-01
  status: active
  envs:
  - prod
- name: test_example_trigger_in_dev
  schedule_type: time
  schedule_interval: "hourly"
  start_time: 2023-03-01
  status: inactive
  settings:
    skip_if_previous_running: true
    allow_blocks_to_fail: true
  envs:
  - dev
```


Doc: https://docs.mage.ai/guides/triggers/configure-triggers-in-code#create-and-configure-triggers

Replace current logs table with virtualized table for better UI performance

- Use virtual table to render logs so that loading thousands of rows won't slow down browser performance.
- Fix formatting of logs table rows when a log is selected (the log detail side panel would overly condense the main section, losing the place of which log you clicked).
- Pin logs page header and footer.
- Tested performance using Lighthouse Chrome browser extension, and performance increased 12 points.

Other bug fixes & polish

- Add indices to schedule models to speed up DB queries.
- “Too many open files” issue
- Check for "Too many open files" error on all pages calling "displayErrorFromReadResponse" util method (e.g. pipeline edit page), not just Pipelines Dashboard.

![Untitled](https://media.graphassets.com/vnPZfdyQiid2ocrw4cQJ)

- Update terraform scripts to set the `ULIMIT_NO_FILE` environment variable to increase maximum number of open files in Mage deployed on AWS, GCP and Azure.
- Fix git_branch resource blocking page loads. The `git clone` command could cause the entire app to hang if the host wasn't added to known hosts. The `git clone` command is updated to run as a separate process with a timeout, so it won't block the entire app if it's stuck.
- Fix bug: when adding a block in between blocks in pipeline with two separate root nodes, the downstream connections are removed.
- Fix DBT error: `KeyError: 'file_path'`. Check for `file_path` before calling `parse_attributes` method to avoid KeyError.
- Improve the coding experience when working with Snowflake data provider credentials. Allow more flexibility in Snowflake SQL block queries. Doc: https://docs.mage.ai/integrations/databases/Snowflake#methods-for-configuring-database-and-schema
- Pass parent block’s output and variables to its callback blocks.
- Fix missing input field and select field descriptions in charts.
- Fix bug: Missing values template chart doesn’t render.
- Convert `numpy.ndarray` to `list` if column type is list when fetching input variables for blocks.
- Fix runtime and global variables not available in the keyword arguments when executing block with upstream blocks from the edit pipeline page.

View full [Changelog](https://www.notion.so/What-s-new-7cc355e38e9c42839d23fdbef2dabd2c)

0.8.83

![image](https://media.giphy.com/media/U2Olc7gWU5pFzpDufV/giphy-downsized.gif)

Support more complex streaming pipeline

More complex streaming pipelines are now supported in Mage. You can use more than one transformer and more than one sink in a streaming pipeline.

Here is an example streaming pipeline with multiple transformers and sinks.

![Untitled](https://media.graphassets.com/4ZWHyLpgTESu0rB1sXnP)

Doc for streaming pipeline: [https://docs.mage.ai/guides/streaming/overview](https://docs.mage.ai/guides/streaming/overview)

Custom Spark configuration

Allow using a custom Spark configuration to create the Spark session used in the pipeline.

```yaml
spark_config:
  # Application name
  app_name: 'my spark app'
  # Master URL to connect to
  # e.g., spark_master: 'spark://host:port', or spark_master: 'yarn'
  spark_master: 'local'
  # Executor environment variables
  # e.g., executor_env: {'PYTHONPATH': '/home/path'}
  executor_env: {}
  # Jar files to be uploaded to the cluster and added to the classpath
  # e.g., spark_jars: ['/home/path/example1.jar']
  spark_jars: []
  # Path where Spark is installed on worker nodes,
  # e.g. spark_home: '/usr/lib/spark'
  spark_home: null
  # List of key-value pairs to be set in SparkConf
  # e.g., others: {'spark.executor.memory': '4g', 'spark.executor.cores': '2'}
  others: {}
```


Doc for running PySpark pipeline: [https://docs.mage.ai/integrations/spark-pyspark#standalone-spark-cluster](https://docs.mage.ai/integrations/spark-pyspark#standalone-spark-cluster)

Data integration pipeline

DynamoDB source

A new data integration source, DynamoDB, has been added.

Doc: [https://github.com/mage-ai/mage-ai/blob/master/mage_integrations/mage_integrations/sources/dynamodb/README.md](https://github.com/mage-ai/mage-ai/blob/master/mage_integrations/mage_integrations/sources/dynamodb/README.md)

Bug fixes

- Use `timestamptz` as data type for datetime column in Postgres destination.
- Fix BigQuery batch load error.

**Show file browser outside edit pipeline**

Improved the Mage file editor so that users can edit files without going into a pipeline.

![Untitled](https://media.graphassets.com/gsAACptRJa3IZ94tRjAR)

**Add all file operations**

![Untitled](https://media.graphassets.com/ItoUlNwmSq6dXnRHYcrT)

**Speed up writing block output to disk**

Mage uses Polars to speed up writing block output (DataFrame) to disk, reducing the time of fetching and writing a DataFrame with 2 million rows from 90s to 15s.

**Add default `.gitignore`**

Mage automatically adds the default `.gitignore` file when initializing a project:


```
.DS_Store
.file_versions
.gitkeep
.log
.logs/
.preferences.yaml
.variables/
__pycache__/
docker-compose.override.yml
logs/
mage-ai.db
mage_data/
secrets/
```


Other bug fixes & polish

- Include trigger URL in slack alert.

![Untitled](https://media.graphassets.com/0Uzv5GaWSd2B5192P4N9)

- Fix race conditions for multiple runs within one second
- If DBT block is language YAML, hide the option to add upstream dbt refs
- Include event_variables in individual pipeline run retry
- Callback block:
  - Include parent block uuid in callback block kwargs.
  - Pass parent block’s output and variables to its callback blocks.
- Delete GCP cloud run job after it's completed.
- Limit the code block output from print statements to avoid sending excessively large payload request bodies when saving the pipeline.
- Lock typing extension version to fix error `TypeError: Instance and class checks can only be used with runtime protocols`.
- Fix git sync and also updates how we save git settings for users in the backend.
- Fix MySQL ssh tunnel: close ssh tunnel connection after testing connection.

View full [Changelog](https://www.notion.so/What-s-new-7cc355e38e9c42839d23fdbef2dabd2c)

0.8.78

![image](https://media.giphy.com/media/z9g6xLr5C0H1m/giphy.gif)

MongoDB code templates

Add code templates to fetch data from and export data to MongoDB.

Example MongoDB config in `io_config.yaml`:

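A minimal sketch of such a config (key names follow Mage's usual `io_config.yaml` naming but should be treated as assumptions; credentials are placeholders):

```yaml
# Sketch: MongoDB connection settings in io_config.yaml (key names assumed)
default:
  MONGODB_DATABASE: database
  MONGODB_HOST: host
  MONGODB_PORT: 27017
  MONGODB_USER: user
  MONGODB_PASSWORD: password
  MONGODB_COLLECTION: collection
```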