Snowflake’s growth trajectory has been nothing short of remarkable. Since 2012, the company has seen rapid market adoption and has attracted a diverse range of clients, from startups to Fortune 500 giants. Some of its notable customers include Adobe, Airbnb, BlackRock, Dropbox, PepsiCo, ConAgra Foods, Novartis and Yamaha. In India, Snowflake caters to the needs of companies such as Porter, Swiggy and Urban Company. The rapid expansion is a testament to Snowflake’s ability to address the ever-increasing demands of the data-driven world we live in.
But today, we are stepping into the age of generative AI, and Snowflake too is gearing up to bring the best of the technology to its long list of customers. Torsten Grabs, senior director of product management at Snowflake, told AIM that with the advent of generative AI, less technical users will increasingly be able to interact successfully with computers, and that this is probably the broadest and biggest impact he expects from generative AI and large language models (LLMs) across the board. He added that generative AI has affected Snowflake on two distinct levels.
Firstly, as at almost every other company, generative AI is leading to productivity improvements at Snowflake. Grabs anticipates that developers working on Snowflake will benefit the most from generative AI. The concept is akin to Microsoft’s Copilot and AWS’s CodeWhisperer, where a coding assistant aids productivity by comprehending natural language and engaging in interactive conversations to facilitate faster and more precise code creation.
Moreover, Snowflake is harnessing generative AI to enhance conversational search capabilities. For instance, when accessing the Snowflake marketplace, it employs conversational methods to identify suitable datasets that address your business needs effectively. “There’s another layer that I think is actually very critical for everybody in the data space, which is around applying LLMs to the data that’s being stored or managed in a system like Snowflake,” Grabs said. The big opportunity for Snowflake lies in leveraging generative AI to offer enhanced insights into the data managed and stored within these systems.
Conversing with your data
On May 24, 2023, Snowflake acquired Neeva AI with the aim of accelerating search capabilities within Snowflake’s Data Cloud platform by leveraging Neeva’s expertise in generative AI-based search technology. “We recognised the necessity of integrating robust search functionality directly into Snowflake, making it an inherent and valuable capability. Partnering with Neeva AI further enriched our approach, combining their expertise in advanced search with generative AI, benefiting us in multiple dimensions,” Grabs said.
Grabs believes the Neeva AI acquisition is going to bring a host of benefits to Snowflake’s customers. Most importantly, it will provide them the ability to talk to their data essentially in a conversational way. “It’s analogous to the demonstration we presented, where a conversation with the marketplace utilizes metadata processed by the large language model to discover relevant datasets,” Grabs said.
Now consider scaling this process and going beyond metadata, involving proprietary sensitive data. By employing generative AI, Snowflake’s customers can engage in natural language conversations to gain precise insights about their enterprise’s data.
Building LLMs for customers
Building on generative AI capabilities, Snowflake, at its annual user conference called ‘Snowflake Summit 2023’ also announced a new LLM built from Applica’s generative AI technology to help customers understand documents and put their unstructured data to work. “We have specifically built this model for document understanding use cases and we started with TILT base model that we leveraged and then built on top of it,” Grabs said.
When compared to the GPT models from OpenAI or other models developed by labs such as Anthropic, Snowflake’s LLM offers a few distinct advantages. For example, the GPT models are trained on the entirety of publicly available internet data, resulting in broad capabilities but high resource demands. Their resource-intensive nature also makes them costly to operate. Many of these resources are allocated to aspects irrelevant to your specific use case. Grabs believes that a more tailored, specialised model designed for your specific use case allows for a narrower model with a reduced resource footprint, leading to increased cost-effectiveness.
“This approach is also poised to yield significantly superior outcomes due to its tailor-made design for the intended use case. Furthermore, the model can be refined and optimised using your proprietary data. This principle isn’t confined solely to the document AI scenarios; rather, it’s a pattern that will likely extend more widely across various use cases.”
In many instances, these specialised models are expected to surpass broad foundational models in both accuracy and result quality. Additionally, they are likely to prove more resource-efficient and cost-effective to operate. “Our document AI significantly aids financial institutions in automating approval processes, particularly for mortgages. Documents are loaded into the system, the model identifies document types (e.g., salary statements), extracts structured data, and suggests approvals. An associate reviews and finalises decisions, streamlining the process and enhancing efficiency.”
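The mortgage workflow Grabs describes (identify the document type, extract structured data, suggest a decision, then hand over to an associate) can be sketched in a few lines. This is only an illustration of the control flow: the keyword classifier, the regular expression, the `min_salary` threshold and all names below are hypothetical stand-ins, not Snowflake's document AI model or API.

```python
import re
from dataclasses import dataclass


@dataclass
class Review:
    doc_type: str
    fields: dict
    suggestion: str                   # "approve" or "refer"
    needs_human_signoff: bool = True  # an associate always finalises the decision


def classify(text: str) -> str:
    # Hypothetical stand-in for the model's document-type prediction.
    return "salary_statement" if "salary" in text.lower() else "unknown"


def extract_fields(text: str) -> dict:
    # Hypothetical stand-in for structured extraction:
    # take the first amount that follows the word "salary".
    match = re.search(r"salary\D*(\d[\d,]*)", text, flags=re.IGNORECASE)
    if not match:
        return {}
    return {"monthly_salary": int(match.group(1).replace(",", ""))}


def suggest(doc_type: str, fields: dict, min_salary: int = 3000) -> str:
    # Hypothetical approval rule; a real policy would come from the lender.
    if doc_type == "salary_statement" and fields.get("monthly_salary", 0) >= min_salary:
        return "approve"
    return "refer"


def process(text: str) -> Review:
    doc_type = classify(text)
    fields = extract_fields(text)
    return Review(doc_type, fields, suggest(doc_type, fields))


if __name__ == "__main__":
    print(process("Monthly salary: 4,500 USD"))
```

The point of the sketch is the last field: the automated steps only produce a suggestion, and the human review stays in the loop.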
Addressing customers’ concerns
While generative AI has garnered significant interest, enterprises, including Snowflake’s clients, which encompass 590 Forbes Global 2000 companies, remain concerned about the potential risks tied to its utilisation. “I think some of the top concerns for pretty much all of the customers that I’m talking to is around security, privacy, data governance and compliance,” Grabs said.
This presents a significant challenge, especially concerning advanced commercial LLMs. These models are often hosted in proprietary cloud services that require interaction. For enterprise clients with sensitive data containing personally identifiable information (PII), the prospect of sending such data to an external system outside their control and unfamiliar with their cybersecurity processes raises concerns. This limitation hinders the variety of data that can interact with such systems and services.
“Our long-standing stance has been to avoid dispersing data across various locations within the data stack or across the cloud. Instead, we advocate for bringing computation to the data’s location, which is now feasible with the abundant availability of compute resources,” Grabs said. Unlike a decade or two ago when compute was scarce, the approach now is to keep data secure and well-governed in its place and then bring computation to wherever the data resides.
He believes this argument extends to generative AI and LLMs as well. “We would like to offer the state-of-the-art LLMs and side by side the compelling open-source options that operate within the secure confines of the customer’s Snowflake account. This approach ensures that the customer’s proprietary or sensitive data remains within the security boundary of their Snowflake account, offering them peace of mind.”
Moreover, on the flip side, another crucial aspect to consider is the protection of proprietary intellectual property (IP) within commercial LLMs. The model’s code, weights, and parameters often involve sensitive proprietary information. “With our security model integrated into native apps on the marketplace, we can ensure that commercial LLM vendors’ valuable IP remains undisclosed to customers utilising these models within their Snowflake account. Our role in facilitating the compute for both parties empowers us to maintain robust security and privacy boundaries among all participants involved in the process,” Grabs concluded.
Even though enterprise data sources, such as resource planning and customer relationship management systems are critical for analytics, retrieving data from them is a tall order.
As innovations continue to rock the data space, Informatica SuperPipe for Snowflake was devised to get mission-critical data out of hard-to-get places, with replication and ingestion up to 3.5 times faster than before, according to Rik Tamm-Daniels, general vice president of ecosystem alliances and technology at Informatica Inc.
“One of the big ones you mentioned is SuperPipe for Snowflake, and we think about the different types of needs for data integration,” he stated. “Reducing the latency of data, making it more real-time, that’s what SuperPipe’s all about. We see up to about three and a half times faster performance than our previous kind of change data capture replication technology. It’s a huge leap forward, leveraging some of the latest Snowpipe streaming capabilities from Snowflake.”
Tamm-Daniels spoke with theCUBE industry analysts Lisa Martin and Dave Vellante at Snowflake Summit, during an exclusive broadcast on theCUBE, SiliconANGLE Media’s livestreaming studio. They discussed how Informatica has partnered with Snowflake Inc. to enhance the intelligent data management sector. (* Disclosure below.)
As generative artificial intelligence and large language models – think ChatGPT – continue to gain steam, Informatica seeks to revamp the data management sector using AI. This can be illustrated by the fact that the company recently rolled out Claire GPT and extended its Claire copilot capabilities, according to Tamm-Daniels.
“When you think about the LLM space, there are really two angles for us in generative AI,” he noted. “The first is those models need data … we’re also invested heavily in using generative AI to really revolutionize data management, and so we announced our Claire GPT and Claire Copilot at Informatica World back in early May to address those kinds of opportunities.”
By incorporating generative AI into the data management cloud interface, Informatica is able to turn metrics into a pipeline of integration and connections. This is highly transformative because a text box offers more options, Tamm-Daniels pointed out.
“Claire GPT, the idea is I think one of the big transformative things about generative AI is it actually lets you take some very complex and nuanced requests, express them in pretty significant descriptive English language descriptions, and then actually turn them into something actionable, executable,” he noted. “On the Claire copilot, that’s all about … how do we bring the power of generative AI to help make better decisions, to help have assistive technology, to recommend data quality transformations or items that you be concerned about.”
(* Disclosure: Snowflake Inc. and Informatica Inc. sponsored this segment of theCUBE. Neither Snowflake, Informatica, nor other sponsors have editorial control over content on theCUBE or SiliconANGLE.)
UC San Francisco is the leading university dedicated to advancing health worldwide through preeminent biomedical research, graduate-level education in the life sciences and health professions, and excellence in patient care.
Within our overarching advancing health worldwide mission, UCSF is devoted at every level to serving the public.
UCSF’s commitment to public service dates to the founding of its predecessor institution, Toland Medical College, in 1864. Born out of the overcrowded and unsanitary conditions of Gold Rush-era San Francisco, Toland Medical College trained doctors to elevate the standards of public health in the burgeoning city.
By 1873, the University of California acquired the college and forged a partnership with San Francisco General Hospital that continues to this day and serves as a model for delivering leading-edge care at a public safety-net hospital.
Today UCSF’s public mission goes beyond San Francisco and delivers a substantial impact on a national and global level by innovating health care approaches for the world’s most vulnerable populations, training the next generation of doctors, nurses, dentists, pharmacists and scientists; supporting elementary and high school education; and translating scientific discoveries into better health for everyone.
In his 2016 State of the University Address, Chancellor Sam Hawgood announced that UCSF is embracing a common set of values to set a clear direction for all members of the UCSF community as we work together to fulfill our mission. This set of overarching values aligns with UCSF’s Principles of Community and Code of Ethics.
PRIDE values are:
Professionalism: To be competent, accountable, reliable and responsible, interacting positively and collaboratively with all colleagues, students, patients, visitors and business partners.
Respect: To treat all others as you wish to be treated, being courteous, kind and acting with utmost consideration for others.
Integrity: To be honest, trustworthy and ethical, always doing the right thing, without compromising the truth, and being fair and sincere.
Diversity: To appreciate and celebrate differences in others, creating an environment of equity and inclusion with opportunities for everyone to reach their potential.
Excellence: To be dedicated, motivated, innovative and confident, giving your best every day, encouraging and supporting others to excel in everything they do.
Azure Synapse Analytics and Snowflake are two commonly recommended ETL tools for businesses that need to process large amounts of data. Choosing between the two will depend on the unique strengths of these services and your company’s needs. These are the key differences between Synapse and Snowflake, including their features and where they excel.
Azure Synapse Analytics (formerly known as Azure SQL Data Warehouse) is a data analytics service from Microsoft. It’s part of the Azure platform, which includes products like Azure Databricks, Cosmos DB and Power BI.
Microsoft describes it as offering a “… unified experience to ingest, explore, prepare, transform, manage, and serve data for immediate BI and machine learning needs.” The service is one of the most popular tools available for information warehousing and the management of big data systems.
Key features of Azure Synapse Analytics include:
Snowflake is another popular big data platform, developed by a company of the same name. It’s a fully managed platform as a service used for various applications — including data warehousing, lake management, data science and secure sharing of real-time information.
A Snowflake data warehouse is built on the Amazon Web Services, Microsoft Azure or Google Cloud infrastructure. Cloud storage and compute power can scale independently.
Like most available data platforms, Snowflake is built with key trends in business intelligence in mind, including automation, segmentation of intelligence workflows and growing use of anything-as-a-service tools.
The major competitors of Snowflake include Dremio, Firebolt, and Palantir.
Key features of Snowflake’s platform include:
|Features||Azure Synapse Analytics||Snowflake|
|Control over infrastructure||Yes||Limited|
|Integration with Azure||Yes||No|
|Built-in security features||Yes||Yes|
|Ease of use||Limited||Yes|
|Real-time and streaming data processing||Yes||Yes|
Azure Synapse offers different pricing tiers and categories based on region, type of service, storage, unit of time and other factors. The prepurchase plans are available in six tiers starting with 5,000 Synapse Commit Units (SCUs) for $4,750. The highest tier is priced at $259,200 for 260,000 SCUs.
The pricing for data integration capabilities offered by Azure Synapse Analytics is based on data pipeline activities, integration runtime hours, operation charges, and data flow cluster size and execution. Each activity has separate charges. For example, Basic Data Flows are charged at $0.257 per vCore-hour, while Standard Data Flows are charged at $0.325 per vCore-hour.
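To make the per-vCore-hour rates concrete, here is a rough cost estimate for a single data flow run. The two rates are the ones quoted above; the cluster size and run time are hypothetical, and the calculation ignores the other charge categories (pipeline activities, integration runtime hours, operations) that the text mentions.

```python
# Rates in USD per vCore-hour, as quoted in the text above.
DATA_FLOW_RATES_USD = {"basic": 0.257, "standard": 0.325}


def data_flow_cost(tier: str, vcores: int, hours: float) -> float:
    """Estimated data flow execution charge in USD for one run."""
    rate = DATA_FLOW_RATES_USD[tier]
    return rate * vcores * hours


if __name__ == "__main__":
    # Hypothetical workload: an 8-vCore cluster running for 2 hours.
    for tier in DATA_FLOW_RATES_USD:
        cost = data_flow_cost(tier, vcores=8, hours=2)
        print(f"{tier}: ${cost:.2f}")
```

Because the charge scales with both cluster size and duration, halving either one halves the data flow portion of the bill.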
The pricing for Snowflake is divided into four tiers, the pricing of which depends on the preferred platform and region. For example, if you prefer the Microsoft Azure platform and are located in the U.S. West region, you will pay the following:
You can choose to pay an extra $50 per terabyte per month for on-demand storage or $23 per terabyte per month for capacity storage.
The two extract, transform and load products have a lot in common, but they differ in specific features, strengths, weaknesses and popular use cases.
Synapse Analytics and Snowflake are built for a range of data analysis and storage applications, but Snowflake is a better fit for conventional business intelligence and analytics. It includes near-zero maintenance with features like automatic clustering and performance optimization tools.
Businesses that use Snowflake for storage and analysis may not need a full-time administrator who has deep experience with the platform.
By comparison, native integration with Spark Pool and Delta Lake makes Synapse Analytics an excellent choice for advanced big data applications, including artificial intelligence, machine learning and data streaming. However, the platform will require much more labor and attention from analytics teams.
A Synapse Analytics administrator who is familiar with the platform and knows how to effectively manage the service will likely be necessary for a business to benefit fully. Setup of the Synapse Analytics platform will also likely be more involved, meaning businesses may need to wait longer to see results.
Snowflake isn’t built to run on a specific architecture and will run on top of three major cloud platforms: AWS, Microsoft Azure’s Cloud platform and Google Cloud. A layer of abstraction separates the Snowflake storage and compute credits from the real cloud resources from a business’s provider of choice.
Each virtual Snowflake warehouse has its own independent compute cluster. They don’t share resources, so the performance of one warehouse shouldn’t impact the performance of another.
Comparatively, Azure Synapse Analytics is built specifically for Azure Cloud. It’s designed from the ground up for integration with other Azure services. Snowflake will also integrate with many of these services, but it lacks some of the capabilities that make Synapse Analytics’ integration with Azure so seamless.
Snowflake has built-in auto-scaling capabilities and an auto-suspend feature that will allow administrators to dynamically manage warehouse resources as their needs change. It uses a per-second billing model, and being able to quickly scale storage and compute up or down can provide immediate cost savings.
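A small calculation shows why per-second billing combined with auto-suspend can save money. The sketch below assumes a 60-second minimum charge each time a warehouse resumes, which is my understanding of Snowflake's billing rules and should be checked against the current documentation; the credits-per-hour figure and the usage pattern are hypothetical.

```python
def billed_seconds(run_seconds: int, minimum_seconds: int = 60) -> int:
    """Seconds billed for one resume-to-suspend interval.

    Assumes a minimum charge per resume (60 seconds here, an assumption).
    """
    if run_seconds <= 0:
        return 0
    return max(run_seconds, minimum_seconds)


def credits_used(intervals, credits_per_hour: float) -> float:
    """Credits consumed across several active intervals of one warehouse."""
    total_seconds = sum(billed_seconds(s) for s in intervals)
    return credits_per_hour * total_seconds / 3600


if __name__ == "__main__":
    # Hypothetical hour: three bursts of activity, suspended in between.
    bursts = [45, 600, 1800]  # seconds of actual work per burst
    rate = 2.0                # hypothetical credits per hour for this warehouse size
    with_suspend = credits_used(bursts, rate)
    always_on = credits_used([3600], rate)
    print(f"with auto-suspend: {with_suspend:.2f} credits")
    print(f"always on:         {always_on:.2f} credits")
```

In this made-up hour the warehouse is billed for 2,460 seconds instead of 3,600, so the saving comes entirely from the idle time that auto-suspend removes.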
The zero-copy cloning feature from Snowflake allows administrators to create a copy of tables, schemas and databases without duplicating the underlying data. This allows for even greater scalability.
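The idea behind zero-copy cloning can be illustrated with a toy model in which a table is just a list of references to immutable data blocks. This is a conceptual sketch only: the `Table` class and its methods are invented for illustration and do not describe Snowflake's internal storage format.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    """Toy table: an ordered list of references to immutable data blocks."""
    blocks: list = field(default_factory=list)

    def clone(self) -> "Table":
        # Copy only the list of references; the blocks themselves are shared.
        return Table(blocks=list(self.blocks))

    def append_rows(self, rows) -> None:
        # New data goes into a new immutable block owned by this table only.
        self.blocks.append(tuple(rows))


if __name__ == "__main__":
    source = Table()
    source.append_rows([1, 2, 3])

    copy = source.clone()
    copy.append_rows([4, 5])

    # The first block is the same object in both tables: no data was duplicated.
    print(copy.blocks[0] is source.blocks[0])
    # Changes to the clone do not affect the source.
    print(len(source.blocks), len(copy.blocks))
```

Creating the clone costs only as much as copying the reference list, and storage grows only when one side writes new data.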
Azure offers strong scalability but lacks some of the features that make Snowflake so flexible. Serverless SQL Pools and Spark Pools in Azure have automatic scaling by default. However, Dedicated SQL Pools require manual scaling.
To review Azure Synapse Analytics and Snowflake, we analyzed various factors, including the core functionality, scalability, ease of use, integration capabilities, security tools and customer support. We also analyzed the pricing structure of each solution, including its licensing costs and any extra charges for add-on services.
A company deciding between Synapse and Snowflake is in a good position. Both platforms are excellent data storage and analysis services, with features necessary for many business intelligence and analysis workflows.
However, the two do differ when it comes to specific strengths and ideal use cases. Snowflake excels for companies that want to perform more traditional business intelligence analytics and will benefit from excellent scalability.
With Snowflake, you get a more user-friendly interface but are dependent on cloud service availability. As Snowflake is cloud-native, you also have limited direct control over the infrastructure. Businesses that need granular control over their infrastructure optimization will find this a key disadvantage of Snowflake.
Azure Synapse Analytics has a steeper learning curve than Snowflake, and scalability may be more challenging, depending on the type of pool a business uses. However, it’s an excellent choice for companies working with AI, ML and data streaming and will likely perform better than Snowflake for these applications.
Snowflake (NYSE:SNOW) plunged after its most recent earnings release in spite of strong results, but has since recovered its losses. Wall Street appears concerned about management's conservative guidance, which calls for a sizable slowdown in the company’s growth rate. Customers are adapting to the high interest rate environment by undergoing “data optimization,” but management is of the view that these headwinds are near term in nature. It's curious that the company is facing headwinds given its perceived exposure to the growth of artificial intelligence, but it's possible that its role in enabling AI is overstated. The company's recently announced partnership with Microsoft (MSFT) may help to clarify its AI positioning. I reiterate my buy rating but again emphasize that this name trades quite richly relative to tech peers.
SNOW has lost some of its hype, but even after the steep correction from highs, the stock still trades at rich valuations.
I last covered SNOW in April, where I rated the stock a buy on account of the solid growth and strong cash flow margins. The growth story is facing some near-term hiccups, but the strong balance sheet and cash generation help to even out the story.
In its most recent quarter, SNOW generated 48% YOY revenue growth to $624 million, with product revenues of $590 million coming ahead of guidance for $573 million.
SNOW saw remaining performance obligations (‘RPOs’) decline sequentially - a surprising development given the company’s historically high growth rates. Many tech companies have reported elongated sales cycles including smaller deal sizes - these headwinds have finally caught up with SNOW.
SNOW continued to rapidly grow its customer base, something that I expect to help revenue growth accelerate as the tough macro environment subsides.
The company’s dollar-based net revenue retention rate declined yet again down to 151%.
SNOW delivered yet another quarter with strong free cash flow generation - prepaid revenues are a nice perk of being an enterprise tech business.
Unlike many tech peers which have slowed down hiring or engaged in large headcount reductions, SNOW continues to invest aggressively in growth, with headcount continuing to expand sequentially. It's highly impressive that the company can keep driving margin expansion in spite of both a tough macro environment as well as continued aggressive investment.
SNOW ended the quarter with $5 billion of cash and investments vs. no debt, representing a strong balance sheet.
Looking ahead, management reduced full-year guidance to $2.6 billion in revenue (down from $2.7 billion) and its operating margin to 5% (down from 6%). Analysts on the call seemed concerned that this latest set of guidance was not conservative enough, to which management responded that they have not changed their forecast methodology, which typically aims for precision without much upside surprise. That said, management did state that they are assuming continued weakness through the end of the year.
Management noted that some of its customers are choosing to delete data to save costs where they otherwise would not have previously. Management believes this to be a near-term trend, especially given the rising enthusiasm for generative AI. Regarding that last point, management views SNOW as being attractively positioned for the growth of generative AI due to models needing to be trained on large amounts of data.
Yet I must wonder why the company is not seeing stronger fundamentals given what should have been an easily understandable correlation with generative AI. It appears that Wall Street has begun to hold some doubt and the company has moved quickly to address these issues, notably announcing a partnership with AI leader Nvidia (NVDA) subsequent to the quarter. Management noted that they did not see material headwinds to their business until April, explaining the guidance miss. The company repurchased $192 million of stock at an average price of $136 per share. Management stated that the program is in place in order to reduce dilution, but I'm of the view that external M&A is a superior choice given the relative premium of the stock price. Management has targeted around 2% in annual dilution starting in fiscal 2024.
At their investor day, management reiterated confidence in their ability to achieve $10 billion of product revenue in fiscal 2029 with ever-increasing margin expectations.
I suspect that part of the stock’s initial weakness may be due to specific commentary regarding the company having greater exposure to AWS (AMZN) than Azure (MSFT). That problem appears to be resolved. Subsequent to the quarter end, SNOW announced an expanded partnership with MSFT with a focus on generative AI. Given MSFT's head start in generative AI, I see this expanded investment in Azure solutions and integrations as being very important to ensuring that SNOW can fully capture the generative AI opportunity. SNOW already is viewed as being one of the clear leaders in managing data and by partnering with the clear leader in generative AI, SNOW positions itself to grow alongside this new market (and separately, also fixes any prior damage to the growth story).
As a data warehouse and data lake, SNOW offers direct exposure to the growth of data. And with the company's partnership with Microsoft Azure, it has become a legitimate play on generative AI as well.
Backed by investors like Warren Buffett, SNOW has historically been something of “tech royalty,” with a secular growth story considered cleaner than most.
This is reflected by the fact that the stock is still trading at around 21x revenues.
Based on 25% growth exiting fiscal 2029, a 30% long-term net margin, and a 1.5x price to earnings growth ratio (‘PEG ratio’), I could see the stock trading at 11.3x sales by then. That implies a stock price of around $405 per share in January 2029, implying around 16.5% potential annual upside over the next 5.5 years.
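The arithmetic behind that multiple is not spelled out, but one reading that reproduces the figure is: price-to-earnings equals the PEG ratio times the growth rate in percent, and price-to-sales equals price-to-earnings times the net margin. The sketch below works through that reading and also backs out the starting share price implied by the $405 target and the 16.5% annual return. The formulas and function names are my assumptions about the method, not something the analysis states.

```python
def price_to_sales(growth_pct: float, peg: float, net_margin: float) -> float:
    """P/S implied by P/E = PEG * growth (in %) and P/S = P/E * net margin."""
    price_to_earnings = peg * growth_pct
    return price_to_earnings * net_margin


def implied_start_price(target_price: float, annual_return: float, years: float) -> float:
    """Share price today that compounds to target_price at annual_return."""
    return target_price / (1 + annual_return) ** years


if __name__ == "__main__":
    ps = price_to_sales(growth_pct=25, peg=1.5, net_margin=0.30)
    print(f"Implied price-to-sales multiple: {ps:.2f}x")  # 11.25x, quoted as 11.3x

    start = implied_start_price(target_price=405, annual_return=0.165, years=5.5)
    print(f"Implied starting share price: ${start:.0f}")
```

Under this reading, a 1.5x PEG on 25% growth gives a 37.5x earnings multiple, and a 30% net margin converts that into roughly 11.3x sales.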
What are the key risks? Valuation is an obvious risk as I can see the stock declining 40% to trade in-line with peers on a growth-adjusted basis. At this point, a new risk is if the data optimization headwinds prove to be more long term in nature. What if the “data can only grow” thesis falls apart? Lastly, we mustn’t ignore competition risk as cloud providers boost their own data offerings, something that they may be more incentivized to do due to rapid interest in their LLMs. I reiterate my buy rating for SNOW though note the rich relative valuation and high reliance on the company to hit consensus estimates.
With more and more solutions entering the enterprise software market, organizations have used many data sources for their operational processes. To properly transfer and share your organizational data and information between software systems, using an effective ETL solution is a necessity.
This resource will analyze two of the top ETL tools, Databricks and Snowflake, so you can see which would better satisfy your data extraction, transformation and loading needs.
Databricks ETL is a data and AI solution that organizations can use to accelerate the performance and functionality of ETL pipelines. The tool can be used in various industries and provides data management, security and governance capabilities.
Snowflake is software that provides users with a data lake and warehousing environment for their data processing, unification and transformation. It is designed to simplify complex data pipelines and can be used with other data integration tools for greater functionality.
|Features||Databricks||Snowflake|
|Focus on data warehousing||No||Yes|
|Real-time data analytics||Yes||No|
|Built-in machine learning||Yes||No|
After a free trial, Databricks can be purchased as a pay-as-you-go solution, with pricing based on compute usage. Alternatively, customers can purchase the software through a committed use plan, in which they commit to certain levels of usage in exchange for discounts.
Snowflake offers similar pricing models for its software. The Data Cloud service can be purchased through Snowflake On Demand, a usage-based, pay-as-you-go model with no long-term commitment, or through pre-purchased capacity. The capacity option lets customers commit to a set amount of usage up front and promises discounts on the software’s overall cost.
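The trade-off between the two pricing models can be made concrete with a little arithmetic. The sketch below compares a pay-as-you-go bill with a committed-use bill. The unit rate, discount and usage figures are hypothetical placeholders, not published Databricks or Snowflake prices, and the rule that an unused commitment is still paid for in full is an illustrative assumption.

```python
def pay_as_you_go_cost(units_used: float, rate_per_unit: float) -> float:
    """Bill only for the usage actually consumed."""
    return units_used * rate_per_unit


def committed_cost(units_used: float, units_committed: float,
                   rate_per_unit: float, discount: float) -> float:
    """Bill at a discounted rate, but never for less than the commitment.

    Assumption: usage above the commitment is billed at the same
    discounted rate. Real contracts may treat overage differently.
    """
    discounted_rate = rate_per_unit * (1 - discount)
    billable_units = max(units_used, units_committed)
    return billable_units * discounted_rate


def break_even_usage(units_committed: float, discount: float) -> float:
    """Usage level at which both models cost the same.

    Below this level pay-as-you-go is cheaper; above it the
    commitment is cheaper (under the assumptions stated above).
    """
    return units_committed * (1 - discount)


if __name__ == "__main__":
    # Hypothetical numbers: 2.0 per unit, 25% discount, 1000 units committed.
    for usage in (600, 750, 1000):
        print(usage,
              pay_as_you_go_cost(usage, 2.0),
              committed_cost(usage, 1000, 2.0, 0.25))
```

Under these assumed numbers, a team that expects to use well under three quarters of its commitment would pay less on the usage-based plan, which is why forecasting usage matters before committing.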
The Databricks solution allows users to make full use of their data by eliminating the silos that can complicate data. Data silos traditionally separate data engineering, analytics, BI, data science and machine learning. Companies can avoid proprietary walled gardens and other restrictions by removing these silos and allowing users to access and manage their structured and unstructured data through the Databricks platform. Users simply sync their data through a Databricks Data Lake connection for full access and automatic data update capabilities.
Snowflake supports data transformation both during loading and after it is loaded into the platform environment. The software has integration with many popular tools and solutions for easy data extraction and transformation into the target database through native connectivity with Snowflake. Snowflake takes care of multiple integration operations, including the preparation, migration, movement and management of data. In addition, the system provides capabilities for data loading from external and internal file locations, bulk loading, continuous loading and other data loading options (Figure A).
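The difference between transforming data during loading and transforming it after loading can be sketched in a few lines. The example below uses Python's built-in `sqlite3` module purely as a stand-in database; it does not use Snowflake or its connectors, and the table names, sample rows and cleaning rules are all hypothetical.

```python
import sqlite3

# Hypothetical raw records as they might arrive from a source file.
RAW_ROWS = [("1", " alice ", "12.50"), ("2", "bob", "7.25")]


def transform(row):
    """Clean one raw record: cast types and normalize the name."""
    order_id, customer, amount = row
    return int(order_id), customer.strip().upper(), float(amount)


def load_with_transform(conn, rows):
    """Transform each record in flight, then load only the cleaned rows."""
    conn.execute(
        "CREATE TABLE orders_clean (id INTEGER, customer TEXT, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO orders_clean VALUES (?, ?, ?)",
        [transform(r) for r in rows],
    )


def load_then_transform(conn, rows):
    """Load the raw strings first, then transform with SQL in the database."""
    conn.execute("CREATE TABLE orders_raw (id TEXT, customer TEXT, amount TEXT)")
    conn.executemany("INSERT INTO orders_raw VALUES (?, ?, ?)", rows)
    conn.execute(
        "CREATE TABLE orders_clean AS "
        "SELECT CAST(id AS INTEGER) AS id, "
        "       UPPER(TRIM(customer)) AS customer, "
        "       CAST(amount AS REAL) AS amount "
        "FROM orders_raw"
    )


def fetch_clean(conn):
    """Return the cleaned rows in a stable order."""
    return conn.execute(
        "SELECT id, customer, amount FROM orders_clean ORDER BY id"
    ).fetchall()


if __name__ == "__main__":
    for loader in (load_with_transform, load_then_transform):
        conn = sqlite3.connect(":memory:")
        loader(conn, RAW_ROWS)
        print(loader.__name__, fetch_clean(conn))
```

Both paths end with the same cleaned table. The second keeps the raw data available inside the database, which is useful when transformation rules are likely to change later.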
Databricks gives users multiple methods for visualizing their data, including choropleth maps, marker maps, heatmaps, counters, pivot tables, charts, cohorts, markers, funnels, box plots, sunbursts, sankeys and word clouds. Once users store their data within their Databricks SQL data lake, they can create and save visualizations of their stored data (Figure B). Users can then edit, clone, customize or aggregate their visualizations. When they are happy with their visualizations, users can download them as image files or add them to their platform dashboards.
With the Snowflake web interface, Snowsight, users can visualize their data and query results as charts. Snowsight supports bar charts, line charts, scorecards, scatterplots and heat grids. Users can configure their data visualizations by adjusting their chart columns, column attributes and chart appearance. For example, to view data from specific time periods, users can select the buckets of time in the inspector panel to adjust the display without needing to modify their query. In addition, aggregation functions allow the system to determine single values from data points in a chart, and users can download their charts as .png files.
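The idea of changing the time bucket without changing the query can be illustrated outside of any particular tool. The sketch below is not Snowsight's implementation; it is a minimal Python example, with hypothetical sample data and function names, showing how the same rows collapse to one value per bucket under different bucket sizes and aggregation functions.

```python
from collections import defaultdict
from datetime import datetime

# Format strings that truncate a timestamp to the chosen bucket.
BUCKET_FORMATS = {"day": "%Y-%m-%d", "month": "%Y-%m", "year": "%Y"}


def bucket_key(ts: datetime, bucket: str) -> str:
    """Map a timestamp to the label of the bucket it falls into."""
    if bucket not in BUCKET_FORMATS:
        raise ValueError(f"unsupported bucket: {bucket}")
    return ts.strftime(BUCKET_FORMATS[bucket])


def aggregate(rows, bucket: str, agg=sum):
    """Group (timestamp, value) rows by bucket and reduce each group
    to a single value with the given aggregation function."""
    groups = defaultdict(list)
    for ts, value in rows:
        groups[bucket_key(ts, bucket)].append(value)
    return {key: agg(values) for key, values in sorted(groups.items())}


# Hypothetical query result: the same rows are reused for every chart.
ROWS = [
    (datetime(2023, 1, 5), 10),
    (datetime(2023, 1, 5), 5),
    (datetime(2023, 1, 20), 7),
    (datetime(2023, 2, 1), 3),
]

if __name__ == "__main__":
    print(aggregate(ROWS, "day"))
    print(aggregate(ROWS, "month"))
    print(aggregate(ROWS, "month", agg=max))
```

The query result (`ROWS`) stays fixed; only the bucket and the aggregation function change, which mirrors the kind of adjustment described above.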
The Databricks SQL analytics platform uses machine learning to allow users to create queries in ANSI SQL and develop visualizations and dashboards using their accessible data. The visualizations allow users to gain insights and lightweight reporting from their data lake. However, users may prefer to utilize their existing third-party BI tools by connecting them to the platform. Tools like Microsoft PowerBI or Tableau can be used for analysis and reporting directly on the Databricks data lake.
Snowflake delivers insights on data through the Snowflake Data Cloud, a data platform that can be deployed across AWS, Google Cloud and Azure. It supports several workloads: Data Engineering, Data Science, Data Lake, Applications, and Data Sharing and Exchange. Its visualization tools let users gain insight from their data through queries (Figure C). Additionally, Snowflake can be used together with other software systems for a broader range of analysis capabilities.
This is a technical review using compiled literature researched from relevant databases. The information provided within this article is gathered from vendor websites or based on an aggregate of user feedback to ensure a high-quality review.
So which ETL solution is better for your organization? The best way to choose a software solution for any purpose is to first identify your organization’s specific needs and requirements.
For example, if your organization requires a cloud-based system for its data processing, Snowflake Data Cloud can enable your team to transform and manage data through its online interface.
However, if your organization wishes to use its ETL solution to process big data batches, Databricks may be the better option. This is because Databricks has many functions and integrations for processing and analyzing big data sets.
Other factors to consider are the third-party products you want to use with your ETL solution. Ensure that the solution you choose has integration capabilities for each of your existing tools so that you can gain value from each of your data sources. Through thorough consideration of your organization’s needs, you can determine the best ETL solution to support your data operations.