Israeli cloud data warehouse startup Firebolt announced today that it has closed a $127M Series B financing round, bringing total funding to $164M. The round comes just 6 months after the company’s $37M Series A in December 2020. Lead investors for this round are Dawn Capital and K5 Global, with participation from previous investors Zeev Ventures, TLV Partners, Bessemer Venture Partners, and Angular Ventures. The additional money will be used to expand Firebolt’s product, engineering and go-to-market teams.
Geek speaks
ZDNet spoke with Firebolt CEO Eldad Farkash. Similar to our last interaction with Farkash in December, the conversation focused on a combination of market and technology factors in the cloud data warehouse world.
Farkash shared some interesting insights. First, he explained that, even if it may seem fast, a 6-month interval between Series A and B funding rounds is considered conservative in the Israeli startup scene. But beyond this inside-startup-baseball tidbit, Farkash opined that data warehousing isn’t just about business intelligence-oriented analytics anymore. He feels that cloud data warehousing also, and perhaps more importantly, encompasses the need for applications to perform analytics on the fly for operational purposes.
Developers, democratization and data
Take, to paraphrase Farkash’s example, the scenario of an online game with thousands of concurrent users. The publisher of that game may want to perform just-in-time analytics on usage patterns, scoring trends and more, and might use the results to drive promotions or even game content on the fly. Because that kind of analytics is so tightly coupled to the workings of the game itself, it’s developers, not analysts, who need to be able to perform it, and they need it to be super-fast, as it can easily impact the performance of gameplay itself.
This has caused a new kind of democratization of data: one where mainstream developers need a service that can provide them with this data access and performance. And that service needs to be easily consumable via APIs, and cost-efficient, so that developers need not be overly sparing in their use of it.
Set the table scan
Farkash explained that this type of analytics brings about nuanced workloads, where some queries may be aggregations and others may be point queries. Because of huge and ever-growing data volumes, though, even the point queries may involve scanning data volumes on the order of half a terabyte. This combination of factors means that most queries are very reliant on table scans, but that many will also need to prune data aggressively. That, in turn, means that most data warehouses and data lakes (focused on broad aggregations) and certainly most operational databases (focused on writing data and relatively narrow point queries when they read it) can be ill-suited to the task.
We knew from our last conversation that Firebolt implements its own file format (FFF), which makes liberal use of indexing. This avoids a downside of the highly adopted Parquet file format: its columnar structure aids broad table scans, but its single-expression partitioning scheme may limit point query performance. With FFF, Firebolt makes table scanning the priority, without sacrificing data pruning capabilities.
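To make the scan-versus-prune trade-off concrete, here is a minimal sketch in Python. It is not Firebolt's FFF format or any vendor's actual implementation; it assumes a generic technique in which data is stored in sorted blocks and a small index records each block's minimum and maximum key. The names `Block`, `build_blocks`, `total_score` and `scores_for_player` are hypothetical, chosen to echo the online game scenario above.

```python
from dataclasses import dataclass
from typing import List, Tuple

Row = Tuple[int, int]  # (player_id, score); a hypothetical schema


@dataclass
class Block:
    rows: List[Row]
    min_id: int  # smallest player_id in the block
    max_id: int  # largest player_id in the block


def build_blocks(rows: List[Row], block_size: int) -> List[Block]:
    """Sort rows by player_id and cut them into fixed-size blocks,
    recording each block's key range as a tiny index."""
    ordered = sorted(rows)
    blocks = []
    for start in range(0, len(ordered), block_size):
        chunk = ordered[start:start + block_size]
        blocks.append(Block(chunk, chunk[0][0], chunk[-1][0]))
    return blocks


def total_score(blocks: List[Block]) -> int:
    """Aggregation query: has to scan every block."""
    return sum(score for block in blocks for _, score in block.rows)


def scores_for_player(blocks: List[Block], player_id: int) -> Tuple[List[int], int]:
    """Point query: skip any block whose key range cannot contain
    player_id. Returns the matching scores and the number of blocks read."""
    hits: List[int] = []
    blocks_read = 0
    for block in blocks:
        if player_id < block.min_id or player_id > block.max_id:
            continue  # pruned using the index alone, no rows touched
        blocks_read += 1
        hits.extend(score for pid, score in block.rows if pid == player_id)
    return hits, blocks_read


rows = [(i % 100, i) for i in range(1000)]  # 100 players, 10 scores each
blocks = build_blocks(rows, block_size=50)  # 20 blocks
hits, blocks_read = scores_for_player(blocks, 42)
print(len(blocks), blocks_read, len(hits))
```

In this toy setup the aggregation reads all 20 blocks, while the point query reads only the one block whose key range covers the requested player. An engine serving both kinds of query needs the first path to be fast and the second to skip as much data as possible.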
Too much ain’t enough
Firebolt claims its performance is two orders of magnitude higher than its competitors’. While in many BI scenarios that performance increase may offer diminishing returns, it’s essential for the big data operational analytics use cases that Firebolt targets. And while Firebolt is confident it has the fastest engine out there, Farkash admits the quest for great performance isn’t over, saying that “there’s still so much to battle.”
One assumes the cool $127M of funding the company has closed on will give it robust ammunition for the fight.