Cloudian Adds Open Source PyTorch Support
ML library integrates with local data lakes running on HyperStore S3-compatible object storage solution.
This is a Press Release edited by StorageNewsletter.com on March 7, 2024 at 2:01 pm
Cloudian, Inc. announced an open-source software contribution that integrates PyTorch, the ML library, with local data lakes running on its HyperStore S3-compatible object storage.
This breakthrough simplifies the ML workflow and reduces costs by allowing data scientists and AI developers to run ML on data resident in local Cloudian object storage, without the need to move and stage the data into another system. The ML tasks can also run on local compute resources such as AWS Outposts and Local Zones.
AWS Outposts and Local Zones users can employ Python and ML libraries to analyze data within a local HyperStore S3-compatible storage system without the cumbersome step of moving data to a separate staging area, streamlining the data processing pipeline and significantly accelerating the ML workflow. Cloudian is a certified Service Ready partner for AWS Outposts and Local Zones, and is commercially available through the AWS Marketplace.
This open-source contribution bridges the gap between distributed S3-compatible object storage systems and ML compute platforms, eliminating the dependency on a dedicated parallel file system for ML workflows. By enabling direct access to a cost-effective, scalable data repository, the company is simplifying the ML process, reducing both complexity and costs associated with data analysis.
Key benefits of the development include:
- Simplified workflow: Eliminates the need for data staging, thus simplifying the workflow and reducing the cost of real-time analysis and model training.
- Integration: Allows direct use of PyTorch with data held locally in the firm’s HyperStore S3-compatible storage.
- Local performance: Run ML models locally with AWS Outposts and Local Zones for low latency and high-speed access to data.
“We are excited to offer the machine learning community a tool that integrates two of their most important needs: the computational power of PyTorch and the storage flexibility of Cloudian S3-compatible systems,” said Jon Toor, CMO. “By connecting these platforms, we are enabling a more efficient and streamlined approach to machine learning.”
The company contributed enhancements to AWS Labs’ open-source S3-Connector-for-PyTorch. The enhancements enable PyTorch ML algorithms to access data in the firm’s HyperStore object storage system via the AWS S3 API.
The enhanced S3 connector is available from the GitHub repositories of AWS Labs and Cloudian.
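As a rough illustration of what accessing HyperStore data via the S3 API could look like from PyTorch code, the sketch below points the connector at a custom S3 endpoint instead of AWS. It is not taken from the announcement: the endpoint URL, region, bucket, and prefix are hypothetical placeholders, the `ObjectStoreConfig` and `open_map_dataset` names are invented for this example, and it assumes the `s3torchconnector` package is installed and that its `S3MapDataset.from_prefix` accepts an `endpoint` keyword.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectStoreConfig:
    """Connection settings for an S3-compatible store (all values hypothetical)."""
    endpoint: str      # URL of the local S3-compatible endpoint
    region: str        # region name passed to the S3 client
    bucket: str        # bucket holding the training data
    prefix: str = ""   # optional key prefix inside the bucket

    @property
    def uri(self) -> str:
        # Build the s3://bucket/prefix URI that the dataset is read from.
        return f"s3://{self.bucket}/{self.prefix.lstrip('/')}"


def open_map_dataset(cfg: ObjectStoreConfig):
    """Return a map-style dataset over the objects under cfg.uri.

    Assumption: s3torchconnector is installed and S3MapDataset.from_prefix
    takes `region` and `endpoint` keywords. The import is deferred so the
    configuration above can be built and inspected without the package.
    """
    from s3torchconnector import S3MapDataset
    return S3MapDataset.from_prefix(cfg.uri, region=cfg.region, endpoint=cfg.endpoint)


# Hypothetical values; replace with the local object store's real settings.
cfg = ObjectStoreConfig(
    endpoint="https://s3.hyperstore.example.com",
    region="us-east-1",
    bucket="training-data",
    prefix="/images/train/",
)
print(cfg.uri)
```

Calling `open_map_dataset(cfg)` would then be expected to return a dataset that can be handed to a standard PyTorch `DataLoader`, with objects read in place from the local store rather than copied to a staging area first. Credentials are assumed to come from the usual AWS-style configuration (for example environment variables) rather than from this code.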
Resource:
Blog: Streamline PyTorch Machine Learning Workflows with Cloudian and AWS Hybrid Edge