How Delta Sharing Enables Secure End-to-End Collaboration | Databricks Blog (2024)

In today's digital landscape, secure data sharing is critical to operational efficiency and innovation. Databricks and the Linux Foundation developed Delta Sharing as the first open source approach to data sharing across data, analytics and AI. Databricks provides secure data exchange, facilitating seamless sharing across platforms, clouds and regions. Enterprises of all sizes trust Delta Sharing, which supports a broad spectrum of applications and diverse data formats. This flexibility makes it a reliable tool for organizations seeking to harness the full potential of their data assets.

In this blog, we review Delta Sharing's security architecture through three sharing scenarios: Databricks customer to Databricks customer (D2D), Databricks customer to open sharing (D2O), and cross-cloud data sharing. We also summarize the benefits of implementing Delta Sharing as part of a modern data collaboration strategy, such as enhanced operational efficiency through streamlined, secure data exchanges across platforms and clouds, and reduced complexity and risk. This secure framework accelerates time to insight, enabling quicker decision-making while maintaining robust privacy protections that foster trust among stakeholders. Additionally, Delta Sharing's flexibility supports a diverse range of data formats and applications, making it adaptable to evolving business needs in a secure manner. Each scenario includes a customer testimonial that highlights first-hand experience of the solution's impact. This blog focuses on Databricks Delta Sharing, where the data provider uses the managed version of the Databricks platform.

Databricks to Databricks Data Sharing (D2D)

The D2D scenario exemplifies secure, streamlined data exchange between two Databricks customers within the Databricks ecosystem. It features Databricks-managed connections and a no-token exchange system, ensuring both simplicity and security.

Using D2D sharing, customers benefit from Delta Sharing's native integration with Unity Catalog (UC), which provides unified governance and security for sharing operations. Sharing is not limited to data: Unity Catalog goes beyond datasets to include volumes, notebooks, and AI models. Delta Sharing for intra-account sharing is turned on by default, while external sharing is available when activated with the required admin-level access. To set up Databricks Delta Sharing, you need at least one Databricks workspace that is enabled for Unity Catalog and a metastore, along with an admin role or the CREATE SHARE and CREATE RECIPIENT privileges (see the documentation for account setup).

Unity Catalog provides a unified governance layer throughout, from the initial steps of creating a recipient and establishing shares to the crucial act of granting access. The Delta Sharing service processes API requests, conducts thorough authorization checks, and keeps detailed activity logs. All of these steps ensure operations are as transparent as they are secure, much like a well-oiled machine that you can trust to keep your sharing ecosystem running smoothly.
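To make the provider-side flow concrete (create a share, add a table, create a recipient, grant access), here is a minimal sketch that only builds the corresponding Databricks SQL statements as strings. The share, table, and recipient names and the sharing identifier are hypothetical placeholders, and the helper functions are illustrative rather than part of any Databricks library. Check the statements against the current SQL reference before running them in a notebook or SQL editor.

```python
from typing import List


def quote_ident(name: str) -> str:
    """Backtick-quote one identifier part, escaping embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def d2d_share_statements(
    share: str, table: str, recipient: str, sharing_identifier: str
) -> List[str]:
    """Return the provider-side SQL statements for a D2D share, in order.

    `table` is expected as a three-level name: catalog.schema.table.
    `sharing_identifier` is the value the recipient sends to the provider
    (assumed here to look like "<cloud>:<region>:<uuid>").
    """
    parts = table.split(".")
    if len(parts) != 3:
        raise ValueError("table must be given as catalog.schema.table")
    if "'" in sharing_identifier:
        raise ValueError("sharing identifier must not contain quotes")

    full_table = ".".join(quote_ident(p) for p in parts)
    return [
        f"CREATE SHARE IF NOT EXISTS {quote_ident(share)}",
        f"ALTER SHARE {quote_ident(share)} ADD TABLE {full_table}",
        f"CREATE RECIPIENT IF NOT EXISTS {quote_ident(recipient)} "
        f"USING ID '{sharing_identifier}'",
        f"GRANT SELECT ON SHARE {quote_ident(share)} "
        f"TO RECIPIENT {quote_ident(recipient)}",
    ]


if __name__ == "__main__":
    # All names below are made up for illustration.
    for stmt in d2d_share_statements(
        "sales_share",
        "main.sales.orders",
        "partner_co",
        "aws:us-west-2:00000000-0000-0000-0000-000000000000",
    ):
        print(stmt + ";")
```

Building the statements as data keeps the sketch runnable anywhere. In a Databricks notebook, each string could be passed to `spark.sql(...)` by a user holding the CREATE SHARE and CREATE RECIPIENT privileges.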

Data Access: For post-authorization data access, Unity Catalog is again a crucial element. Once Unity Catalog grants authorization, the method of access is determined, either cloud tokens or pre-signed URLs, based on factors such as asset type and sharing arrangement. For cloud tokens, a read-only, scoped-down SAS token is minted by the provider's UC and then forwarded to the recipient's compute plane. This provides secure, limited-time storage access to the table root directory. Similarly, with pre-signed URLs, a list of relevant URLs is created and sent to the recipient's compute plane, providing secure, temporary access to the storage files. By strategically using the security features of different cloud services, such as Azure SAS tokens and AWS pre-signed URLs, you can ensure that only authorized individuals can access the data in a secure setting across regions and clouds. Moreover, the interactions are confined to the recipient's and provider's control planes, and the exchange is a privileged operation that cannot be triggered by external agents, thus protecting against external breaches. This methodology underscores the system's adaptability, ensuring that data sharing is both flexible and secure while accommodating a wide array of business needs.
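The idea behind a pre-signed URL, a link that carries its own expiry and a signature only the storage service can verify, can be illustrated with a small self-contained sketch. This is a simplified HMAC scheme for intuition only. It is not the actual AWS or Azure signing algorithm, and the host name, secret, and function names are hypothetical.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse

# Demo-only secret; a real storage service keeps its signing keys private.
SECRET = b"demo-only-secret"
HOST = "https://storage.example.com"  # hypothetical storage endpoint


def _signature(path: str, expires: int) -> str:
    """HMAC over the method, path, and expiry (read-only: GET only)."""
    message = f"GET\n{path}\n{expires}".encode("utf-8")
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def sign_url(path: str, ttl_seconds: int, now: Optional[int] = None) -> str:
    """Return a URL granting read access to `path` for `ttl_seconds`."""
    issued = int(time.time()) if now is None else now
    expires = issued + ttl_seconds
    query = urlencode({"expires": expires, "sig": _signature(path, expires)})
    return f"{HOST}{path}?{query}"


def verify_url(url: str, now: Optional[int] = None) -> bool:
    """Accept the URL only if it is unexpired and the signature matches."""
    current = int(time.time()) if now is None else now
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires = int(params["expires"][0])
        sig = params["sig"][0]
    except (KeyError, IndexError, ValueError):
        return False
    if current > expires:
        return False  # short-lived: access lapses on its own
    return hmac.compare_digest(sig, _signature(parsed.path, expires))


if __name__ == "__main__":
    url = sign_url("/tables/orders/part-0001.parquet", ttl_seconds=900)
    print(url)
    print("valid now:", verify_url(url))
```

Because the path and expiry are both covered by the signature, a recipient cannot extend the lifetime or point the link at a different file without invalidating it.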


Coastal Community Bank selected Delta Sharing to meet the rigorous data sharing, compliance and security demands of its network of partners. Coastal chose Cavallo Technologies to help develop a modern data platform. Rob Cavallo, President at Cavallo Technologies, explains that Coastal needed a flexible solution for now and into the future. Read the Coastal Community Bank case study.

"In some ways, Coastal [Community Bank] was asking for a paradox: enable easy collaboration yet meet the highest security standards for consumer financial data. It's critical to ensure the platform is performant and cost-effective for today's workloads while also adaptable enough to handle future use cases not yet imagined. In the end, the Databricks Data Intelligence Platform was the only platform we found that empowered us to do that."

— Rob Cavallo, President at Cavallo Technologies

Secure Data Sharing, Beyond Tables

Delta Sharing supports more than just tabular data, embracing a more holistic approach to data collaboration with the inclusion of non-tabular data assets such as volumes, notebooks, and AI models. These asset types are currently only supported in the D2D sharing framework, where they enhance the collaborative ecosystem. AI models are shared in a similar manner to volumes, while notebooks feature a unique sharing mechanism. Notebooks can be previewed by recipients through a pre-signed URL, rendering the content as HTML in a pop-up window for immediate access. For deeper integration, notebooks can also be imported into the recipient's environment, utilizing base64 encoding and API calls for a seamless transition.
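As a rough illustration of the "base64 encoding and API calls" step for notebook import, the sketch below builds the JSON body for a workspace import request. It assumes the import goes through the Databricks Workspace API import endpoint (`/api/2.0/workspace/import`); the article does not say which API Delta Sharing uses internally. The target path and notebook source are hypothetical, and no network call is made.

```python
import base64
import json


def build_import_payload(notebook_source: str, target_path: str) -> dict:
    """Build a request body for a workspace notebook import.

    The notebook source is base64-encoded so it can travel safely inside
    a JSON document. Field names follow the Workspace API import request;
    treating that endpoint as the one used here is an assumption.
    """
    encoded = base64.b64encode(notebook_source.encode("utf-8")).decode("ascii")
    return {
        "path": target_path,
        "format": "SOURCE",
        "language": "PYTHON",
        "content": encoded,
        "overwrite": False,  # do not clobber an existing notebook
    }


if __name__ == "__main__":
    payload = build_import_payload(
        "print('hello from a shared notebook')\n",
        "/Users/someone@example.com/shared_notebook",  # hypothetical path
    )
    print(json.dumps(payload, indent=2))
```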

AI model sharing is facilitated by generating a secure, read-only, scoped-down SAS token that is minted by the provider's UC and then forwarded to the recipient's compute plane. This approach ensures secure and efficient access and avoids the need for extraneous copies of the model by allowing a one-time copy to the Model Registry in the recipient's UC. This copy of the model can then be deployed to multiple regions to optimize the inference process, enhance performance with reduced latency and deliver faster response times by leveraging regional data centers closer to the end users. Discovering, accessing, and utilizing shared volumes and AI models with Delta Sharing demonstrates both similar and tailored approaches that match each data type, promoting a secure and versatile platform for data sharing and collaboration.

Databricks to Open Data Sharing (D2O)

Transitioning to the open sharing scenario, D2O upholds strict security protocols for a Databricks customer sharing data with external third-party users not on Databricks. D2O enables recipients to directly connect to shared data using Delta Sharing connectors that support various systems like pandas, Tableau, Apache Spark, Rust, or others that support the open protocol, without first needing a specific compute platform.

Upon creating an open recipient in Databricks, a secure, one-time activation URL is generated, allowing the recipient to download a credential file that contains a Delta Sharing endpoint address and a token. In case of a security breach, providers have the ability to take immediate action, such as changing a recipient's credentials or withdrawing their read permissions to prevent any further issues.
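The downloaded credential file is a small JSON document. The sketch below parses one and keeps the token out of printed output. The field names (`shareCredentialsVersion`, `endpoint`, `bearerToken`, `expirationTime`) follow the open Delta Sharing profile format as I understand it, and the sample values are made up.

```python
import json
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical sample; a real file is downloaded from the activation URL.
SAMPLE_PROFILE = """
{
  "shareCredentialsVersion": 1,
  "endpoint": "https://sharing.example.com/delta-sharing/",
  "bearerToken": "hypothetical-token",
  "expirationTime": "2030-01-01T00:00:00.0Z"
}
"""


@dataclass(frozen=True)
class ShareProfile:
    endpoint: str
    # repr=False keeps the token out of logs and tracebacks.
    bearer_token: str = field(repr=False)
    expiration_time: Optional[str] = None
    version: int = 1


def parse_profile(text: str) -> ShareProfile:
    """Parse and validate the contents of a credential (profile) file."""
    data = json.loads(text)
    required = ("shareCredentialsVersion", "endpoint", "bearerToken")
    missing = [key for key in required if key not in data]
    if missing:
        raise ValueError(f"profile is missing fields: {', '.join(missing)}")
    return ShareProfile(
        endpoint=data["endpoint"].rstrip("/"),
        bearer_token=data["bearerToken"],
        expiration_time=data.get("expirationTime"),
        version=int(data["shareCredentialsVersion"]),
    )


if __name__ == "__main__":
    print(parse_profile(SAMPLE_PROFILE))  # token is not shown
```

Since the file contains a bearer token, it should be stored like any other secret. If it leaks, the provider can rotate the recipient's credentials as described above.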

Data Access Workflow: When a recipient queries a shared table using one of these mentioned connectors, Delta Sharing verifies the recipient using tokens from the credential file, and provides pre-signed URLs for accessing the data. This approach ensures compatibility with various open source connectors, safeguarding the integrity and security of the shared assets. (See more on sharing and accessing data.)
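On the recipient side, a query through the open source Python connector can be sketched as follows. The connector addresses a table as `<profile-file>#<share>.<schema>.<table>`. The file name and the share, schema, and table names below are hypothetical, and the `delta_sharing` import is deferred so the helper can be read and tested without the package installed (`pip install delta-sharing`).

```python
def table_url(profile_path: str, share: str, schema: str, table: str) -> str:
    """Build the table address the Delta Sharing connectors expect."""
    for part in (share, schema, table):
        if not part or "." in part or "#" in part:
            raise ValueError(f"invalid name component: {part!r}")
    return f"{profile_path}#{share}.{schema}.{table}"


def load_shared_table(profile_path: str, share: str, schema: str, table: str):
    """Load a shared table into a pandas DataFrame.

    The connector authenticates with the token in the profile file; the
    server responds with pre-signed URLs that the connector then reads.
    """
    import delta_sharing  # deferred: only needed when actually querying

    return delta_sharing.load_as_pandas(
        table_url(profile_path, share, schema, table)
    )


if __name__ == "__main__":
    # Hypothetical names; running the load requires a real profile file.
    print(table_url("config.share", "sales_share", "sales", "orders"))
```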

Cox Automotive Europe is part of Cox Automotive, the world's largest automotive service organization, and uses Delta Sharing to centrally manage and audit data shared outside its enterprise data services team while ensuring robust security and governance. Read the Cox Automotive case study.

"Delta Sharing makes it easy to securely share data with business units and subsidiaries without copying or replicating it. It enables us to share data without the recipient having an identity in our workspace."

— Robert Hamlet, Lead Data Engineer at Cox Automotive

Cross-Cloud Data Sharing

Enterprises are increasingly adopting cross-cloud strategies, driven by the need to support diverse functionalities across different cloud platforms, facilitate partnerships, or integrate data from another organization, post-acquisition. This shift toward a multicloud environment underscores the importance of organizations implementing robust solutions like Delta Sharing to enable seamless and secure sharing both internally and externally. Implementing a cross-cloud strategy is often essential for our clients to maintain operational continuity, foster innovation, and drive growth in an interconnected digital ecosystem, while having the ability to leverage the unique strengths of each cloud service.

For many of our clients who adopt cross-cloud strategies, Delta Sharing's open, cross-platform sharing capabilities, which seamlessly support multicloud environments, are a clear differentiator and advantage. Delta Sharing is equally effective whether sharing data internally within a single cloud or externally across multiple cloud platforms, ensuring a secure and efficient data exchange process in both scenarios. Databricks has heard from many customers about their data sharing needs within multicloud environments and how Delta Sharing helps promote interoperability and enhance security across their cloud ecosystem.

One of these Databricks customers is Deutsche Börse, an international exchange organization and market infrastructure provider. Once they implemented Delta Sharing, which enabled them to openly share and collaborate with their customers, the business impact was transformative.

"Having a platform that allows secure data sharing with fine-grained access controls, the highest security standards, and privacy assurance opens up new possibilities. We can now engage in conversations on customized solutions where in the past, we would have said, 'Unfortunately, our clients don't want to share their data and models with us, or we don't want to share more granular data or our models for confidentiality reasons.'"

— Jan Stiebing, head of Business Strategy and M&A at Deutsche Börse

In this customer example and in many others, Delta Sharing is able to bridge gaps for data sharing and collaboration that were once considered insurmountable, all while maintaining the highest standards of security and privacy. Deutsche Börse also offers several market data listings on Databricks Marketplace.

Network and Storage Configuration

Delta Sharing enables secure data sharing across various cloud environments, integrating seamlessly with the cloud's native storage security architecture. It does so without requiring significant modifications to your existing security framework. This approach is designed for organizations using Databricks on cloud platforms such as Azure, AWS, and GCP, aligning with Unity Catalog's requirements. The Databricks Data Intelligence Platform supports data sharing through cloud storage solutions (ADLS Gen2, S3, GCS) with an emphasis on private communication channels or IP address whitelisting for enhanced security.

The network and storage configuration for Delta Sharing outlined below works across both intra-cloud and cross-cloud scenarios. Intra-cloud sharing facilitates secure data exchange within the same cloud ecosystem using private endpoints, storage firewalls, and network gateways, ensuring no public access is allowed. In cross-cloud sharing scenarios, Delta Sharing leverages NAT gateway egress IPs and supports existing cross-cloud private connections, such as site-to-site VPNs or dedicated links, to enable secure data access across different cloud platforms and on-premises networks. This comprehensive and secure approach allows a wide range of network infrastructures to engage in Delta Sharing efficiently, promoting both flexibility and security.
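The storage firewall idea for cross-cloud sharing, where only the recipient's NAT gateway egress IPs may reach the provider's storage, can be modeled with Python's standard `ipaddress` module. This is a conceptual sketch of an allowlist check, not a cloud firewall configuration. The addresses come from the reserved documentation ranges and stand in for real egress IPs.

```python
import ipaddress
from typing import Iterable, List, Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]


def parse_allowlist(cidrs: Iterable[str]) -> List[Network]:
    """Turn CIDR strings into network objects, rejecting malformed input."""
    return [ipaddress.ip_network(cidr, strict=True) for cidr in cidrs]


def is_allowed(source_ip: str, allowlist: List[Network]) -> bool:
    """Return True only if `source_ip` falls inside an allowed network."""
    address = ipaddress.ip_address(source_ip)
    return any(address in network for network in allowlist)


# Hypothetical egress IPs of the recipient's NAT gateways.
RECIPIENT_EGRESS = parse_allowlist(["203.0.113.10/32", "198.51.100.0/28"])

if __name__ == "__main__":
    for ip in ("203.0.113.10", "198.51.100.7", "192.0.2.1"):
        verdict = "allow" if is_allowed(ip, RECIPIENT_EGRESS) else "deny"
        print(f"{ip}: {verdict}")
```

Anything not explicitly listed is denied, which mirrors the "no public access" posture described above.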

[Figure: cross-cloud network configuration example]

Data Filtering

In Delta Sharing, data filtering is crucial for providing flexible and secure access, with two primary methods:

  • Partition Filtering: Enables sharing specific table partitions that align with recipient properties, known as parameterized partition sharing. This strategy allows data providers to share the needed data portions in a flexible manner, facilitating controlled access.
  • Dynamic Views: Enables sharing of any subset of data with recipients via dynamic functions such as current_recipient, offering fine-grained control over data access and improved manageability.

Both methods allow access restrictions based on specific recipient properties, ensuring data is shared only with intended recipients and in the appropriate context. These approaches enhance Delta Sharing's security and flexibility, allowing for tailored data access that meets unique recipient needs.
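The effect of recipient-based filtering can be sketched in plain Python. The function below simulates what a dynamic view filtering on a recipient property (for example, a `country` property read through current_recipient) would return. It is a model of the behavior, not Databricks code, and the property name, column name, and rows are hypothetical.

```python
from typing import Dict, List

Row = Dict[str, object]


def rows_for_recipient(
    rows: List[Row],
    recipient_properties: Dict[str, str],
    column: str,
    property_key: str,
) -> List[Row]:
    """Return only the rows whose `column` matches the recipient's property.

    If the recipient has no value for `property_key`, nothing is returned,
    so a misconfigured recipient sees no data rather than all of it.
    """
    wanted = recipient_properties.get(property_key)
    if wanted is None:
        return []
    return [row for row in rows if row.get(column) == wanted]


if __name__ == "__main__":
    orders = [
        {"order_id": 1, "country": "DE", "amount": 120.0},
        {"order_id": 2, "country": "US", "amount": 75.5},
        {"order_id": 3, "country": "DE", "amount": 310.0},
    ]
    german_partner = {"country": "DE"}
    print(rows_for_recipient(orders, german_partner, "country", "country"))
```

The same share definition serves every recipient; only the recipient's properties change what each one can read.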

Security, Flexibility, and Seamless Integration with Delta Sharing

In conclusion, Delta Sharing is a key component of the Databricks Data Intelligence Platform and stands out for its secure, flexible, and cross-platform data sharing capabilities, supporting modern data strategies. In addition to supporting other platforms via open-source connectors, Delta Sharing enables customers to share structured and unstructured data, as well as AI models. All of these capabilities clearly differentiate Delta Sharing from other data exchange platforms. As a result, Delta Sharing is widely trusted by clients across different industries, reflected in customer testimonials, highlighting the significant impact on operational efficiency and innovation. As the data sharing landscape continues to evolve, Delta Sharing is built for the future, prioritizing security, flexibility, and seamless integration across diverse data sharing ecosystems. This steadfast commitment positions Delta Sharing as an indispensable asset in harnessing the power of data to advance the digital objectives of enterprises worldwide.

To learn more about how to implement Delta Sharing within your organization, check out the latest resources including new eBooks and related blogs below, or deep dive into the Delta Sharing documentation.

  • Read O'Reilly's technical guide, Data Sharing and Collaboration with Delta Sharing
  • Read A New Approach to Data Sharing, Second Edition eBook
  • Read Accelerate Industry Innovation with Data Sharing eBook

If you are already a Delta Sharing customer, you can also reach out to the team with questions or to provide feedback at [emailprotected].


