Safety vulnerability ID: 50835
The information on this page was manually curated by our Cybersecurity Intelligence Team.
Synapseml 0.10.0 updates its NPM dependency 'prismjs' to v1.27.0 to include a security fix.
Latest version: 1.0.9
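To decide whether an installed release predates the fix described above, a version comparison along the following lines may help. This is a minimal sketch: `parse_version` and `is_affected` are hypothetical helper names (not part of SynapseML or any scanner), and it assumes versions are plain dotted numbers, optionally followed by a `-` or `+` suffix.

```python
import re

# First release that ships prismjs 1.27.0, according to this advisory.
FIXED_VERSION = (0, 10, 0)


def parse_version(text):
    """Turn '0.9.5' or 'v0.10.0' into a 3-tuple of ints, ignoring '-'/'+' suffixes."""
    core = text.split("-")[0].split("+")[0]
    parts = [int(n) for n in re.findall(r"\d+", core)[:3]]
    parts += [0] * (3 - len(parts))  # pad '0.10' to (0, 10, 0)
    return tuple(parts)


def is_affected(installed):
    """Return True when the installed version is older than the fixed release."""
    return parse_version(installed) < FIXED_VERSION
```

Comparing integer tuples rather than raw strings matters here, because a string comparison would wrongly rank `0.9.5` above `0.10.0`.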
Synapse Machine Learning
<img width="100%" src="https://mmlspark.blob.core.windows.net/graphics/emails/email_header_synapseml.jpg" alt="SynapseML" href="https://github.com/Azure/mmlspark">
Building production-ready distributed machine learning pipelines can be a challenge for even the most seasoned researcher or engineer. We are excited to announce the release of SynapseML v0.10.0 (previously MMLSpark), an open-source library that aims to simplify the creation of massively scalable machine learning pipelines. SynapseML unifies several existing ML frameworks and new Microsoft algorithms in a single, scalable API that’s usable across Python, R, Scala, Java, .NET, C#, and F#.
Highlights
|<img width="600" src="https://mmlspark.blob.core.windows.net/graphics/emails/OpenAI_Logo.svg"> | <img width="400" src="https://mmlspark.blob.core.windows.net/graphics/emails/Microsoft_.NET_logo.svg"> | <img width="500" src="https://mmlspark.blob.core.windows.net/graphics/emails/MLflow-logo-final-black.png"> | <img width="500" src="https://mmlspark.blob.core.windows.net/graphics/emails/binder.svg"> |
|:--:|:--:|:--:|:--:|
|**OpenAI Language Models** | **.NET, C#, and F# Support** | **Full MLFlow Support** | **Live Demos in Browser** |
| Embed 175-billion parameter models into your databases with ease | Use or train any SynapseML model from .NET | Quick and easy MLOps, model management, and autologging | Explore the SynapseML library with zero setup |
| [Learn More](https://microsoft.github.io/SynapseML/docs/features/cognitive_services/CognitiveServices%20-%20OpenAI/) | [Getting Started Guide](https://microsoft.github.io/SynapseML/docs/reference/dotnet-setup/) | [Explore the Docs](https://microsoft.github.io/SynapseML/docs/mlflow/introduction/) | [Run in Browser](https://aka.ms/synapseml-binder) |
New Features
General ✨
- SynapseML now supports .NET, C#, F#, and other .NET ecosystem languages in addition to Scala, Python, and R. Please see our [Setup Guide](https://microsoft.github.io/SynapseML/docs/reference/dotnet-setup/) and [LightGBM from .NET example](https://microsoft.github.io/SynapseML/docs/next/getting_started/dotnet_example/) for more details. ([#1539](https://github.com/Microsoft/SynapseML/issues/1539), [#1156](https://github.com/Microsoft/SynapseML/issues/1156), [#1443](https://github.com/Microsoft/SynapseML/issues/1443))
- SynapseML is now usable from your browser with zero setup using Binder. [Quickly explore our demos in Binder](https://aka.ms/synapseml-binder). ([#1487](https://github.com/Microsoft/SynapseML/issues/1487), [#1493](https://github.com/Microsoft/SynapseML/issues/1493))
Azure Cognitive Services for Big Data 🧠
- Added OpenAI GPT-3 Sentence Completion Transformer. Use this feature to embed [175-billion parameter language models](https://azure.microsoft.com/en-us/services/cognitive-services/openai-service/) into distributed pipelines and databases [to solve a variety of general purpose NLP tasks](https://docs.microsoft.com/en-us/azure/cognitive-services/openai/how-to/completions) across natural language and code. ([#1495](https://github.com/Microsoft/SynapseML/issues/1495), [#1541](https://github.com/Microsoft/SynapseML/issues/1541))
- Added [an example of Sentence Completion with GPT-3](https://microsoft.github.io/SynapseML/docs/features/cognitive_services/CognitiveServices%20-%20OpenAI/) ([#1564](https://github.com/Microsoft/SynapseML/issues/1564))
- Added support for Form Recognizer V3.0 ([1269](https://github.com/Microsoft/SynapseML/issues/1269))
- Improved MVAD usability with async training and better data validation ([1477](https://github.com/Microsoft/SynapseML/issues/1477))
- Upgraded the univariate anomaly detection version to v1.1-preview ([1440](https://github.com/Microsoft/SynapseML/issues/1440))
- Added a [multivariate anomaly detection sample notebook](https://microsoft.github.io/SynapseML/docs/features/cognitive_services/CognitiveServices%20-%20Multivariate%20Anomaly%20Detection/) ([#1365](https://github.com/Microsoft/SynapseML/issues/1365))
- Added a [Text to Speech example](https://microsoft.github.io/SynapseML/docs/features/cognitive_services/CognitiveServices%20-%20Overview/#text-to-speech-sample) to cognitive service overview ([1350](https://github.com/Microsoft/SynapseML/issues/1350))
- Added opinion mining to TextSentiment Models ([1449](https://github.com/microsoft/SynapseML/pull/1449))
- Fixed Azure Maps schemas ([1553](https://github.com/Microsoft/SynapseML/issues/1553))
- Removed modelID param validators in FormRecognizerV3 ([1551](https://github.com/Microsoft/SynapseML/issues/1551))
- Fixed form recognizer and form ontology learner issues ([1506](https://github.com/Microsoft/SynapseML/issues/1506))
- Fixed `setServiceName` python method in OpenAI ([1498](https://github.com/Microsoft/SynapseML/issues/1498))
- Fixed error in Text Analytics Analyze schema
- Improved error handling for MVAD ([1448](https://github.com/Microsoft/SynapseML/issues/1448), [#1391](https://github.com/Microsoft/SynapseML/issues/1391))
- Removed unused concurrency parameter for MVAD ([1383](https://github.com/Microsoft/SynapseML/issues/1383))
- Improved robustness of flood risk notebook by adding polling ([1427](https://github.com/Microsoft/SynapseML/issues/1427))
Responsible AI at Scale 😇
- Added partial dependence plots (PDP) to allow for understanding how independent variables affect a model's prediction ([1426](https://github.com/Microsoft/SynapseML/issues/1426))
- Updated ICE/PDP documentation with PDP-based feature importance and additional examples ([1441](https://github.com/Microsoft/SynapseML/issues/1441), [#1352](https://github.com/Microsoft/SynapseML/issues/1352))
- Added [a notebook for ICE and PDP feature explainers](https://microsoft.github.io/SynapseML/docs/features/responsible_ai/Interpretability%20-%20PDP%20and%20ICE%20explainer/) ([#1318](https://github.com/Microsoft/SynapseML/issues/1318))
- Updated data balance documentation to better describe how it can be used to ensure model fairness ([1540](https://github.com/Microsoft/SynapseML/issues/1540))
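
The idea behind the partial dependence plots mentioned above can be illustrated without any Spark machinery. The sketch below is not SynapseML's explainer API; `partial_dependence` and `toy_model` are hypothetical names, and the model is a made-up linear function. For each candidate value of one feature, every row is overwritten with that value and the predictions are averaged.

```python
def partial_dependence(predict, rows, feature, grid):
    """For each grid value, set `feature` to it in every row and average the predictions."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = dict(row)       # copy so the input rows stay untouched
            modified[feature] = value  # hold the feature of interest fixed
            total += predict(modified)
        curve.append((value, total / len(rows)))
    return curve


def toy_model(row):
    """Made-up model used only for illustration."""
    return 2.0 * row["age"] + row["income"]


rows = [{"age": 30, "income": 1.0}, {"age": 50, "income": 3.0}]
pdp = partial_dependence(toy_model, rows, "age", [20, 40])
print(pdp)  # [(20, 42.0), (40, 82.0)]
```

The averaged curve shows how the prediction moves as `age` changes while the other features keep their observed distribution, which is the quantity a PDP plots.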
MLFlow 🔃
- Added [documentation for MLFlow autologging](https://microsoft.github.io/SynapseML/docs/mlflow/autologging/) ([#1508](https://github.com/Microsoft/SynapseML/issues/1508))
- Added [documentation on the SynapseML-MLFlow integration](https://microsoft.github.io/SynapseML/docs/mlflow/introduction/) ([#1428](https://github.com/Microsoft/SynapseML/issues/1428))
LightGBM on Spark 🌳
- Added the ability to pass in generic argument strings to LightGBM enabling many complex parameterizations ([1444](https://github.com/Microsoft/SynapseML/issues/1444))
- Added seed parameters to LightGBM ([1387](https://github.com/Microsoft/SynapseML/issues/1387))
- Added a method to get LightGBM native model string directly ([1515](https://github.com/Microsoft/SynapseML/issues/1515))
- Fixed issue with validation data creation during `useSingleDataset` mode ([1527](https://github.com/Microsoft/SynapseML/issues/1527))
- Fixed multiclass training with initial scores ([1526](https://github.com/Microsoft/SynapseML/issues/1526))
- Fixed saving LightGBM model iterations with early stopping ([1497](https://github.com/Microsoft/SynapseML/issues/1497))
- Fixed issue where chunk size parameter was incorrectly specified during data copy ([1490](https://github.com/Microsoft/SynapseML/issues/1490))
- Fixed issue that occurred when an empty partition was chosen as the main worker in `singleDatasetMode` ([1458](https://github.com/Microsoft/SynapseML/issues/1458))
- Fixed bug with data repartitioning in `LightGBMRanker` ([1368](https://github.com/Microsoft/SynapseML/issues/1368))
- Fixed outdated docs for `useSingleDatasetMode` (1562)
- Refactored LightGBM class structure to improve logging and debugging ([1557](https://github.com/Microsoft/SynapseML/issues/1557))
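
For the generic argument strings mentioned above, LightGBM's native parameter format is a sequence of `key=value` pairs. The helper below is a hedged sketch of building such a string from a dictionary; `to_lightgbm_args` is a hypothetical name, and how the string is handed to the estimator (to my understanding, a pass-through parameter on the SynapseML LightGBM estimators) is an assumption to verify against the library docs.

```python
def to_lightgbm_args(params):
    """Join a dict into a space-separated 'key=value' string in LightGBM's native style."""
    parts = []
    for key, value in params.items():
        if isinstance(value, bool):
            value = str(value).lower()  # LightGBM expects 'true'/'false'
        parts.append(f"{key}={value}")
    return " ".join(parts)


args = to_lightgbm_args({"force_row_wise": True, "min_data_in_bin": 5})
print(args)  # force_row_wise=true min_data_in_bin=5
```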
Vowpal Wabbit 🐇
- Fixed issues with the `saveNativeModel` for the VWRegressionModel [1364](https://github.com/Microsoft/SynapseML/issues/1364) ([#1366](https://github.com/Microsoft/SynapseML/issues/1366))
- Fixed issues with building quadratic interaction terms ([1460](https://github.com/Microsoft/SynapseML/issues/1460))
Isolation Forests 🌲
- Added an [Isolation Forest Multivariate Anomaly Detection sample notebook](https://microsoft.github.io/SynapseML/docs/features/isolation_forest/IsolationForest%20-%20Multivariate%20Anomaly%20Detection/) ([#1483](https://github.com/Microsoft/SynapseML/issues/1483))
Additional Updates
Maintenance 🔧
- Removed unused debugging code ([1546](https://github.com/Microsoft/SynapseML/issues/1546))
- Removed Synapse test exclusion for Explanation Dashboard notebook ([1531](https://github.com/Microsoft/SynapseML/issues/1531))
- Made python style checks verbose ([1532](https://github.com/Microsoft/SynapseML/issues/1532))
- Fixed library checking while installing library on Databricks cluster ([1488](https://github.com/Microsoft/SynapseML/issues/1488))
- Upgraded and fixed Dockerfiles ([1472](https://github.com/Microsoft/SynapseML/issues/1472))
- Added Developer Docker Image build to pipeline ([1480](https://github.com/Microsoft/SynapseML/issues/1480))
- Fixed ADO area path in Issue Linker ([1464](https://github.com/Microsoft/SynapseML/issues/1464))
- Fixed master version badge display
- Improved Databricks error reporting
- Updated azure cli to stop build errors
- Fixed SSL handshake flakiness
- Added `itsdangerous` as a dependency to ADB tests ([1412](https://github.com/Microsoft/SynapseML/issues/1412))
- Turned on debug for PR to work item workflow
- Pointed PR linker to official implementation
- Changed GitHub action trigger from pull_request_target to pull_request ([1413](https://github.com/Microsoft/SynapseML/issues/1413))
- Fixed issue where Unit Tests were not executing ([1409](https://github.com/Microsoft/SynapseML/issues/1409))
- Added Azure DevOps PR linker ([1394](https://github.com/Microsoft/SynapseML/issues/1394))
- Updated GH PAT name ([1389](https://github.com/Microsoft/SynapseML/issues/1389))
- Re-enabled Synapse E2E Tests ([1517](https://github.com/Microsoft/SynapseML/issues/1517))
- Updated SynapseE2E Tests to Spark 3.2 ([1362](https://github.com/Microsoft/SynapseML/issues/1362))
- Fixed ADO issue/PR linking ([1463](https://github.com/Microsoft/SynapseML/issues/1463))
- Cleaned up extra MVAD models and improved network resiliency ([1457](https://github.com/Microsoft/SynapseML/issues/1457))
- Updated azure blob client version ([1563](https://github.com/Microsoft/SynapseML/issues/1563))
- Fixed docker security vulnerability ([1561](https://github.com/Microsoft/SynapseML/issues/1561))
- Streamlined scalastyle hook ([1530](https://github.com/Microsoft/SynapseML/issues/1530))
- Updated CODEOWNERS ([1523](https://github.com/Microsoft/SynapseML/issues/1523))
- Updated OpenAI resource info ([1525](https://github.com/Microsoft/SynapseML/issues/1525))
- Fixed semantic PR checking ([1503](https://github.com/Microsoft/SynapseML/issues/1503))
- Updated docker images to remain compliant ([1500](https://github.com/Microsoft/SynapseML/issues/1500))
- Added component governance explicitly to build so timeout variable works ([1489](https://github.com/Microsoft/SynapseML/issues/1489))
- Fixed path for notebook test files in gitignore ([1485](https://github.com/Microsoft/SynapseML/issues/1485))
- Increased component governance timeout ([1482](https://github.com/Microsoft/SynapseML/issues/1482))
- Added conda caching to build
- Stopped build from failing after 1 hour
- Fixed flaking MVAD test
- Refactored build pipeline definitions
- Split Synapse tests into multiple tests ([1377](https://github.com/Microsoft/SynapseML/issues/1377))
- Moved from ADO Pipelines to GitHub Workflows ([1406](https://github.com/Microsoft/SynapseML/issues/1406))
Website Improvements 💻
- Fixed MathJax expressions rendering ([1343](https://github.com/Microsoft/SynapseML/issues/1343))
- Fixed google analytics gtags ([1434](https://github.com/Microsoft/SynapseML/issues/1434))
- Corrected placement of BingSiteAuth.xml config ([1445](https://github.com/Microsoft/SynapseML/issues/1445), [#1439](https://github.com/Microsoft/SynapseML/issues/1439))
- Fixed website security and upgraded Docusaurus ([1545](https://github.com/Microsoft/SynapseML/issues/1545))
- Moved Geospatial Services to its own folder ([1345](https://github.com/Microsoft/SynapseML/issues/1345))
- Bumped minimist from 1.2.5 to 1.2.6 in /website ([1455](https://github.com/Microsoft/SynapseML/issues/1455))
- Bumped node-forge from 1.2.1 to 1.3.0 in /website ([1451](https://github.com/Microsoft/SynapseML/issues/1451))
- Bumped prismjs from 1.25.0 to 1.27.0 in /website ([1430](https://github.com/Microsoft/SynapseML/issues/1430))
- Bumped follow-redirects from 1.14.7 to 1.14.8 in /website ([1402](https://github.com/Microsoft/SynapseML/issues/1402))
- Bumped nanoid from 3.1.23 to 3.2.0 in /website ([1355](https://github.com/Microsoft/SynapseML/issues/1355))
- Bumped shelljs from 0.8.4 to 0.8.5 in /website ([1347](https://github.com/Microsoft/SynapseML/issues/1347))
- Bumped follow-redirects from 1.14.1 to 1.14.7 in /website ([1348](https://github.com/Microsoft/SynapseML/issues/1348))
- Bumped cross-fetch from 3.1.4 to 3.1.5 in /website ([1496](https://github.com/Microsoft/SynapseML/issues/1496))
- Bumped async from 2.6.3 to 2.6.4 in /website ([1481](https://github.com/Microsoft/SynapseML/issues/1481))
- Pinned onnxmltools to a specific version ([1524](https://github.com/Microsoft/SynapseML/issues/1524))
Bug Fixes 🐞
- Fixed twitter sentiment detection notebook ([1544](https://github.com/Microsoft/SynapseML/issues/1544))
- Fixed issue with `DataConversion` serialization ([1505](https://github.com/Microsoft/SynapseML/issues/1505))
- Fixed typos in `TestBase` ([1501](https://github.com/Microsoft/SynapseML/issues/1501))
- Fixed issue in `GridSpace` python API ([1470](https://github.com/Microsoft/SynapseML/issues/1470))
- Fixed reflective class loading in IntelliJ ([1456](https://github.com/Microsoft/SynapseML/issues/1456))
- Removed verbose `ComputeModelStatistics` output and converted `scoredLabelsCol` to DoubleType ([1361](https://github.com/Microsoft/SynapseML/issues/1361))
- Fixed flaking in geospatial notebooks
Code Style 🎶
- Improved style checks using pre-commit ([1538](https://github.com/Microsoft/SynapseML/issues/1538), [#1528](https://github.com/Microsoft/SynapseML/issues/1528), [#1535](https://github.com/Microsoft/SynapseML/issues/1535))
- Formatted code and notebooks with Black style checker ([1522](https://github.com/Microsoft/SynapseML/issues/1522), [#1520](https://github.com/Microsoft/SynapseML/issues/1520))
Documentation 📘
- Tabularized badges for readability ([1486](https://github.com/Microsoft/SynapseML/issues/1486))
- Added a PR template ([1418](https://github.com/Microsoft/SynapseML/issues/1418))
- Improved installation readme ([1369](https://github.com/Microsoft/SynapseML/issues/1369), [#1422](https://github.com/Microsoft/SynapseML/issues/1422))
- Added a Security readme ([1511](https://github.com/Microsoft/SynapseML/issues/1511))
- Updated the Azure Synapse readme ([1372](https://github.com/Microsoft/SynapseML/issues/1372))
- Removed reference to custom maven resolver
- Added pointer to docs on synapse pool configuration
- Fixed typos in readme ([1516](https://github.com/Microsoft/SynapseML/issues/1516))
Contributor Spotlight
We are excited to highlight the contributions of the following SynapseML contributors:
| <img width="200px" src="https://mmlspark.blob.core.windows.net/graphics/people/serena_color.jpg"> |<img width="200px" src="https://mmlspark.blob.core.windows.net/graphics/people/ric.jpg"> | <img width="200px" src="https://mmlspark.blob.core.windows.net/graphics/people/puneet.jpg"> |
|:--:|:--:|:--:|
| **Serena Ruan** | **Ric Serradas** | **Puneet Pruthi** |
| Serena is a Software Engineer II on the Synapse team in Beijing and a force of nature. In this release, Serena has continued her prolific contribution streak by adding language support for .NET, C#, and F# and integrating SynapseML with MLFlow. Additionally, Serena has contributed several features to the MLFlow and Spark.NET open-source communities so that these systems can work better for every user. These contributions are just some of the many amazing things Serena has accomplished during this release, and her devotion and craft are pivotal to the ecosystem. | Ric is a Senior Engineering Manager on the OneNote team with a shining personality and drive to collaborate. In just a few weeks Ric hit the ground running by setting up an automated link between GitHub and Azure DevOps, building the first working version of SynapseE2E tests, and re-writing our entire build in GH Actions. Furthermore, Ric worked tirelessly through nights and weekends to land his contributions. | Puneet is a Senior Engineer on the SynapseML team with a knack for engineering systems and dockerization. Puneet's contributions to the library include architecting the new binder integration, driving our Synapse E2E tests to completion, and improving SynapseML’s infrastructure around community engagement. Puneet is constantly thinking of ways to improve the community and we value his effort. |
| <img width="200px" src="https://mmlspark.blob.core.windows.net/graphics/people/mark_n.jpg"> |<img width="200px" src="https://mmlspark.blob.core.windows.net/graphics/people/keerthi.jpg"> | <img width="200px" src="https://mmlspark.blob.core.windows.net/graphics/people/yagna.jpg"> |
| **Mark Niehaus** | **Keerthi Yanda** | **Yagna Oruganti** |
| Mark is a Senior Software Engineer on the SynapseML team with a deep knowledge of the .NET ecosystem and infrastructure development. In this release, Mark architected SynapseML’s .NET binding blob publishing strategy, drove the OpenAI GPT-3 bindings to completion, and wrote [a detailed GPT-3 walkthrough](https://microsoft.github.io/SynapseML/docs/features/cognitive_services/CognitiveServices%20-%20OpenAI/). Mark completed these projects while supporting the Time Series Insights service, speaking to his ability to keep multiple plates spinning at a time. | Keerthi is a Software Engineer II on the SynapseML team. Despite joining Microsoft just a few months ago, Keerthi has quickly learned the SynapseML ropes to take command of our integration with the Azure Synapse platform. Huge kudos to her for braving long build times and daunting error messages to make sure SynapseML works out of the box on Synapse Analytics clusters. | Yagna is a Senior Data and Applied Scientist on the Industry AI team with a talent for building solutions that integrate many community tools to solve customer challenges. Yagna's first contribution to SynapseML was [a masterpiece of a demo](https://microsoft.github.io/SynapseML/docs/features/isolation_forest/IsolationForest%20-%20Multivariate%20Anomaly%20Detection/) showing how to use Isolation Forests, MLFlow, Tabular SHAP, and the interpret-ml explanation dashboard in a single anomaly detection example. |
Acknowledgements
We would like to acknowledge the developers and contributors, both internal and external, who helped create this version of SynapseML:
Serena Ruan serena-ruan, Eric Dettinger, Scott Votaw svotaw, Puneet Pruthi ppruthi, Ric Serradas riserrad, Mark Niehaus niehaus59, Kyle Rush k-rush, Keerthi Yanda KeerthiYandaOS, Yagna Oruganti YagnaDeepika, Jason Wang memoryz, Ilya Matiach imatiach-msft, Yazeed Alaudah yalaudah, Elena Zherdeva ezherdeva, Kashyap Patel ms-kashyap, Martha Laguna martthalch marthalc, Alex Li liyzcj, Maria Guirguis maguir, Alexandra Savelieva alsavelv, netang, Sudhindra Kovalam SudhindraKovalam, Markus Cozowicz eisber, Tom Finley, Markus Weimer, Jeff Zheng, James Verbus jverbus, Chris Hoder, Misha Desai, Nellie Gustafsson, Eren Orbey, Beverly Kodhek, Louise Han jr-MS, Justyna Lucznik, Kim Manis, Mitrabhanu Mohanty, Bogdan Crivat, Anand Raman, William T. Freeman, James Montemagno, Luis Quintanilla, Dennis Kennedy, Ryan Hurey, Jarno Ensio, Brian Mouncer, Steve Suh suhsteve, Akshaya Annavajhala (AK), Guolin Ke, Tara Grumm, Niharika Dutta Niharikadutta, Andrew Fogarty, Juanyong Duan, Weichen Xu WeichenXu123, Spark.NET Team, ONNX Team, Azure Global, Vowpal Wabbit Team, LightGBM Team, MSFT Garage Team, MSR Outreach Team, Speech SDK Team, MLflow Team
Learn More
| <img width="500" src="https://mmlspark.blob.core.windows.net/graphics/emails/synapseml_website.jpg"> |<img width="500" src="https://mmlspark.blob.core.windows.net/graphics/emails/philosphy.svg"> | <img width="500" src="https://mmlspark.blob.core.windows.net/graphics/emails/dotnet_blog_card.jpg"> |
|:--:|:--:|:--:|
| Visit [our website](https://aka.ms/spark) for the latest docs, demos, and examples | Read more about SynapseML's GA release in the [Microsoft Research Blog](https://www.microsoft.com/en-us/research/blog/synapseml-a-simple-multilingual-and-massively-parallel-machine-learning-library) | [Learn more](https://aka.ms/synapseml-dotnet-blog) about our .NET bindings and code generation system. |
| <img width="500" src="https://mmlspark.blob.core.windows.net/graphics/emails/synapseml_demo_series.jpg"> |<img width="500" src="https://mmlspark.blob.core.windows.net/graphics/emails/large_scale_paper.jpg"> | <img width="500" src="https://mmlspark.blob.core.windows.net/graphics/emails/openai_blog_card.jpg"> |
| Watch [a demonstration](https://www.youtube.com/watch?v=iXnBLwp7f88) of SynapseML to create a multilingual search engine. | Read our [Paper from IEEE Big Data '21](https://arxiv.org/pdf/2009.08044.pdf) | [Explore our integration with the Azure OpenAI Service](https://aka.ms/synapseml-openai-docs)|