
Exclusive Interview With WekaIO CEO Liran Zvibel

Growing fast as a serious commercial alternative to leading HPC file storage such as Lustre and IBM Spectrum Scale

Liran Zvibel is co-founder of WekaIO, Inc. and has been CEO since November 2017, based in Campbell, CA. He was CTO of the company, based in Tel Aviv, Israel, from its inception to November 2017. Before that, he spent almost 3 years at Fusic as VP R&D and co-founder. From 2005 to mid-2011, he worked for XIV in various technical engineering positions. He was at the IDF as a software engineer and commander for 5 years starting in 2000, spent 1 year at InSightec in 1999, and was a system administrator and programmer at Tel Aviv University in 1998.

StorageNewsletter.com: Any message regarding Covid-19?
Zvibel:
First, I would like to start by wishing that all your readers stay well and healthy during these very challenging times. As mandated throughout our geographies, Weka employees are now honoring the shelter-in-place recommendations and working from home. But we had a business continuity plan in place and are honoring our service and support SLAs, and all of our systems are operational. Some of our customers are seeing especially increased load during these times, and we support them all to ensure that they provide best-in-breed service to their customers.

WekaIO was founded 6 years ago, could you update us on the company? Revenues, number of employees, offices… and the dynamic around these metrics
Weka came out of stealth in June 2017 with a handful of early fans of the product. Our mission was and is to fix storage. Storage, as we know it today, is riddled with compromises induced by lack of ability to scale as well as difficulty in achieving high enough performance for shared use cases, and we started by focusing on large scale AI/ML projects. Little did we know that many of the mature markets were suffering the same performance and scale issues. Weka grew 400% in fiscal year 2018 and another 600% in fiscal year 2019, with growth in both our customers and partner ecosystem. During that same period, we grew our employee base by over 200%, resulting in offices in California, Boston, MA, New York and London, UK, as well as a large engineering group in Tel Aviv. Customer adoption and growth in the partner ecosystem are being fueled by the increasing adoption of NVMe-native storage systems to enable AI, machine learning, scientific research, and high-performance data analytics (HPDA), especially for high-end financial customers.

Does your initial mission change over time?
From the very beginning, our journey was always to build a company that looks to solve the large problems that plague the storage market.

At Weka we built the first storage company to solve all workloads – we have a single product that is faster than DAS and SAN, provides the semantics and ‘sharability’ of NAS, with the scale and economics of object storage. The exact same software suite runs on standard server hardware sold by most OEMs and also runs natively on the AWS public cloud. Our snap-to-object feature enables built-in backup and archive, data migration (so cloud-bursting or cloud DR), and enterprise grade multi-protocol support with encryption enabled, running the most demanding enterprise applications.

So we are steadfast in our mission, which is to solve the big storage problems for our customers, which we believe requires a totally new approach to manage the scale and performance of emerging applications. Organizations and enterprises that utilize legacy storage systems that are not optimized for today’s accelerated datacenters will be hindered in their ability to extract value from their data. We are committed to helping our customers maximize the investment in high-powered IT infrastructure, such as GPU and FPGA-enabled servers, so they can innovate faster and solve previously unsolvable problems.

What about your business, how was 2019 in terms of revenue growth, market penetration, partnerships and deployments/installations? Do you suffer from long sales and decision cycles?
In 2019 alone, we saw record-breaking 600% growth. This is directly related to the value we show customers when they try our product. Weka succeeds when customers have a need for performance at scale. Use cases include AI, machine learning, genomics and life sciences research, and financial analytics. What we see very often is that once customers start using our product, they quickly expand the system to accommodate other use cases. On average, the first expansion happens within nine months of the first purchase. Such expansion is a significant driver of top line growth for the company.

A second major driver of growth has been a pivot to partner-led sales engagement. Weka’s ecosystem is growing very fast. We fill a technology gap for our partners, where our product is clearly differentiated and offers unique value compared with existing NAS, NVMe-oF block storage, and other parallel file systems. The typical sales cycle through one of our partners is less than four months, and AWS customers are actually up and running in minutes with our on-demand service.

A small percentage of Weka deals have had longer sales cycles but these are typically very complex projects where there may be numerous technologies and influencers involved in the purchase decision. For example, Genomics England had a complete re-design of its entire data center including networks, compute and storage.

But you made some team adjustments recently, what was the trigger for that? It has impacted your image a bit
This past year was a big turning point for the company. We made the strategic decision to become 100% channel focused. As a result, we invested heavily in recruiting channel partners with practice expertise and in OEM-led and OEM-fulfilled sales. We aligned our existing resources and added to them to meet the three primary use cases we are working with: AI and machine learning, scientific research, and high-performance data analytics (HPDA). The shift in focus is already proving effective in accelerating the landing of new customers and growing our ecosystem.

You have raised $66.7 million so far, with a recent round in 2019. Any need for a new round soon to accelerate your market presence and stay independent?
At this stage, no. We have raised a significant amount of money and, in fact, have a runway well into 2021. Our market presence comes from continued innovation and hard work. As the industry evolves, so too shall the Weka product and company. Future fundraising is never out of the question but it is not something we are currently thinking about.

How many total installations do you have? How many Weka nodes do you deploy on average? And how many clients are connected to WekaFS data server nodes? What is the average storage capacity deployed?
In terms of on-premises installations, Weka has over 130PB under management. 78% of the capacity is on HDD-based object storage and 22% on the NVMe tier. These numbers highlight the value WekaFS brings, as cost is reduced dramatically while keeping record-breaking performance. 58% of the customers are flash only, while 42% run a hybrid model. We find customers start with NVMe flash, and as they add more applications or grow their data set, that is when they tend to add the object tier. We are fast approaching 100 customers for this particular structure. On AWS we are consumed through the AWS Marketplace, and while we are aware of some heavy users (e.g. Untold Studios, Tre Altamira, Baylor Labs), we do not have access to independent customer statistics, and only receive aggregate usage from AWS. All that said, on AWS we have also crossed the 100PB mark significantly.
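
As a rough illustration of what the on-premises split above works out to, here is a minimal sketch. It assumes the "over 130PB" figure is treated as exactly 130PB, so the results are lower bounds, and the function name `tier_split` is hypothetical, not part of any Weka tooling.

```python
def tier_split(total_pb, object_share_pct):
    """Split a total capacity (in PB) into object-tier and flash-tier parts."""
    object_pb = total_pb * object_share_pct / 100
    flash_pb = total_pb - object_pb
    return object_pb, flash_pb


# Assumption: "over 130PB" is rounded down to 130PB for the arithmetic.
object_pb, flash_pb = tier_split(130, 78)
print(f"HDD-based object tier: {object_pb:.1f} PB")
print(f"NVMe tier: {flash_pb:.1f} PB")
```

Under that assumption, roughly 101PB sits on the object tier and about 29PB on NVMe.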

With regard to node counts, on-premises most customers use between 10 and 20 storage nodes; some run significantly bigger systems and some start with our base systems. We even have customers who run the software converged with compute applications. As for client count, we have some customers that run 100s of GPU-filled servers, achieving performance figures bigger than those that state-funded supercomputers have listed on the IO500 table.

Confirming your performance capabilities, already demonstrated with SPEC SFS numbers, you achieved a leading position in the IO500. Any details on this?
We are proud of our benchmark achievements. We hold three record-breaking titles: the IO500, SPEC SFS and the STAC-M3 benchmarks. In addition, we were highlighted by NVIDIA at SC-2019 doing 73GB/s to a single DGX-2, utilizing GPUDirect. This number currently is over 80GB/s, fully saturating the network links. And stay tuned, the internal benchmarks we are running this year are showing even better results as we continue to optimize the data path.

One thing these benchmark results prove is that we deliver increasingly better performance on systems with smaller footprints, which holds significant value for our customers.

As for how we do it, Weka is the first storage company to completely re-create the stack in a way that is optimized for fast networking and NVMe. Where all other solutions have an inherent dependence on data locality, Weka optimizes by spreading workloads across the network, and coupled with our unprecedented way of scaling metadata operations (either directories or 4KB IOs) we are able to show tremendous results.

Another important component in our performance is our ability to do fine-grained load balancing, and the IO500 numbers were achieved on the AWS public cloud. Unlike other solutions, we don’t suffer from ‘noisy-neighbors’ syndrome, as we have the ability to steer work to the best performing instances. Similarly, this is how we support heterogeneous HW from several generations on premises.

Weka is the first storage product to solve running mixed workloads, scale and load balancing, and we allow our customers to run multiple workloads on shared infrastructure, which brings considerable cost savings and simplification compared with adding more silos, as is done with traditional solutions.

WekaFS demonstrated high performance, what is the comfort zone where the product excels?
Performance is just one of our strengths, and it just happens to be amplified because it is so easy to showcase our superiority on the public, audited benchmarks. Our other strong points are not as easily demonstrated in one-off tests, so they don’t get as much of a spotlight in the press.

Our scalability is unprecedented even when compared to object storage systems, and especially to any other NAS system. We can store 100s of billions of files in a single directory, supporting trillions of files in exabyte-sized namespaces. We have several customers that have directories that are actually bigger than the bucket size limitation for some leading object storage solutions.

By tiering to object storage we are showing amazing economics, but our strongest feature there is actually our snap-to-object functionality. We allow saving our standard snapshots to an object store in a way that any other Weka system can pick up that saved snapshot and keep running from it, either as a read-only snapshot or as a writable clone. We provide an RPO of about 1 minute, and we have customers that use that functionality for workload migration and DR to another data center or to the AWS cloud. By allowing customers to tier and save snapshots to two distinct and independent buckets, we allow a seamless 3-2-1 archiving and backup policy, or an easy way to back up on-premises and to AWS as well.
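
For readers unfamiliar with the 3-2-1 rule mentioned above (at least three copies, on at least two kinds of media, with at least one copy off-site), the following is a minimal sketch of the check. The `Copy` type, the function name and the example layout are all hypothetical illustrations, not Weka configuration or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    name: str   # label for this copy of the data
    media: str  # kind of storage media holding it
    site: str   # physical or cloud location


def satisfies_3_2_1(copies, primary_site):
    """Return True if the copies meet the 3-2-1 backup rule."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.media for c in copies}) >= 2
    has_offsite = any(c.site != primary_site for c in copies)
    return enough_copies and enough_media and has_offsite


# Hypothetical layout: a primary flash tier plus two independent buckets.
layout = [
    Copy("primary filesystem", "nvme", "datacenter-a"),
    Copy("on-premises bucket", "hdd-object", "datacenter-a"),
    Copy("cloud bucket", "cloud-object", "aws"),
]
print(satisfies_3_2_1(layout, primary_site="datacenter-a"))
```

In this assumed layout the two buckets supply the second and third copies, and the cloud bucket supplies the off-site copy.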

Weka has great multiprotocol support for Windows, supporting AD and LDAP integration, and synchronizing ACLs across Windows (NTFS ACLs) and Linux (POSIX ACLs), which is functionality that until recently only a very few high-end NAS systems provided.

Finally, we provide best-in-class encryption. We support encryption with keys managed by a KMS (key management system, e.g. HashiCorp Vault), where a different KMS can be configured for each logical FS on the system. Our encryption covers data in flight all the way to the clients, as well as at rest, and even on the object storage. Our customers can save snapshots to AWS S3 that are unusable unless a key is provided from their on-premises KMS.

WekaIO develops a new generation of parallel file system, remind us of your advantages over Lustre or IBM GPFS/Spectrum Scale, the 2 famous products deployed in technical IT?
Past generations of parallel file systems were created in the HDD era. We provide much greater performance density, as can be seen in the IO500 numbers, and the two biggest differences are around metadata performance and the performance of small-file IO/s. Also, we support mixed workloads well, and don’t require any tuning parameters. Where traditional parallel FS require the admin to optimize for a specific IO workload, Weka is fast for any kind of IO, and also runs mixed workloads extremely well (small IOs will come in with very low latency and a high IO/s number, concurrently with large streaming IOs). We show customers that when they deploy us, we successfully replace several clusters of the past generation with one Weka system that runs each application better than before.

New kinds of applications have tremendous file counts, and thanks to our incredible metadata scaling we support them very well. The previous generations of parallel file systems have a fixed amount of metadata ‘power’, and once that threshold is hit, metadata performance starts suffering.

Weka provides multiprotocol support with SMB and NFS, allowing it to run in an enterprise environment, with native support for encryption, which is now required in many environments.

Finally, the operational overhead of running a Weka system is dramatically reduced compared to the past solutions. For example, Weka has a proactive support model with monitoring done on a dedicated cloud to ensure business continuity for our customers. This is now table stakes for enterprise storage solutions, but not common in the traditional parallel FS world.

What are the next few key features you plan to add in the coming months?
Today we provide best-of-breed performance, scale and features. The next areas to optimize would be cost, ease of deployment in more clouds, and container-based workloads.

DDN is a leader in HPC storage and you meet them very often for sure, could you update us on the competitive landscape? DDN, Lustre, IBM GPFS/Spectrum Scale, BeeGFS, Panasas or other players like Quobyte?
We bring unprecedented performance and scale to enterprise workloads to be leveraged by large corporations. These products have not seen significant success in our customer base within the verticals we focus on (AI/ML, finance and genomics), as they lack many of the features and manageability required by enterprise IT, and also the support model alongside ‘fit-and-finish’. We have very limited exposure to these in our customers, and are mostly replacing the more traditional big-box storage incumbents.

HPE is one of your key partners, but it acquired Cray and actively promotes ClusterStor, a Lustre-based product, and even other products to tactically push its hardware. What is the status of your partnership with them?
We have a strong relationship with HPE, and have resources committed to that relationship. HPE has a huge portfolio of products and we have a clear view of where the Cray ClusterStor product and Weka play. We have worked closely with the Cray team to help position the products correctly within HPE sales. ClusterStor is the best Lustre-based solution in the market, and is positioned for large scale HPC deployments, while Weka fills the gap in the technical compute space where performance density and enterprise features are a must. We have many engagements with the HPE team, particularly in AI and finance.

I know one of your favorite arguments is NFS limitations, but some players like VAST Data and Qumulo, both listed on IO500, and even the emerging Stellus Technologies deliver impressive performance. What is your opinion on that?
We love the fact that more vendors take part in running benchmarks, and indeed a storage system that leverages NFS can run these benchmarks. If you compare the actual results, especially around performance density, you can see that there is at least an order of magnitude difference between Weka and these solutions. We encourage every vendor to publish performance numbers on audited benchmarks such as SPEC, STAC and the IO500, as it helps customers make informed choices.

What about pNFS, a hope for quite a long time to boost NFS and make it parallel without adding any software layer on the client?
Firstly, we have seen a huge change in customer reception to putting a client on their compute nodes, so it is no longer the major limiter it was seen as many years ago. But pNFS has never really emerged. I am not sure how actively it is being worked on, and I have not come across anyone actually using it in recent years. pNFS was created over a decade ago, when the technology trade-offs were significantly different than today.

WekaFS supports S3 storage as a capacity tier, who are the top 3 partners you are deployed with on premises?
Quantum ActiveScale (formerly WDC ActiveScale), Scality and Cloudian. However, our largest capacity is with AWS. Last year we reached over 100PB of ingest into the AWS cloud, as Weka is also providing backup and DR capability.

WekaFS runs on AWS but not yet on Azure or GCP, what is your cloud strategy? This has started to become a new battle for HPC vendors, and even for cloud providers, some of which have their own solutions
We need to qualify this statement: AWS is the only cloud vendor where we have an active program that gives customers seamless deployment through their marketplace. That said, we have tested and run our product on Microsoft Azure and IBM COS, and it can be deployed as a dedicated service on their instances. Weka will continue to build out its portfolio; it is simply a matter of time and resources, not technology limitations.

What could we expect from Weka in the coming months?
Continuous innovation for sure, more benchmark validation that we are the world’s fastest file system, a larger partner ecosystem with more OEM announcements in the pipeline, and we’ll make strategic investments in the company to penetrate key markets, especially in Europe and APJ.
