Data Lakes vs. Data Warehouses: What’s Right for Your Association?
Associations looking for a new data-management solution might consider data lakes or data warehouses, but they also need to decide which option fits their needs and how leadership can help strengthen the final result.
The data lake is having a moment in the association space, and you might be wondering if it makes sense to bring it into your organization.
To complicate matters, your research might turn up lots of similar-sounding technologies, such as data warehouses and data fabrics, which could create confusion over what exactly you need.
“Realize Data Lake Dreams on a Beer Budget,” an upcoming learning lab at the 2022 ASAE Annual Meeting & Exposition in Nashville, aims to help clarify the benefits of a data-management approach for organizations large and small.
Yes, small. Despite these kinds of data-management systems often being associated with large organizations, Byron Smith, the senior director of IT and operations at the Association for the Advancement of Blood & Biotherapies (AABB) and one of the presenters in the learning lab, says the strategies can work for any organization—if it has the structure and support for them. He characterized it as a matter of operational maturity, not size.
“It almost doesn’t matter what size you are,” Smith said. “What matters is what you can bring to bear, with support from your leadership, to make use of the information that you’re using.”
Smith and his copresenters—Barbara J. Armentrout, CAE, managing partner of Mesa Group, and Rajeev Gupta, a Salesforce developer with Aplusify—described data management as an ongoing challenge for many organizations.
“Associations need to constantly analyze and evaluate the data that is collected for different business lines and units to identify what can and must be shared, recycled, and reused for multiple purposes,” the trio wrote in an email.
Cloud-Enabled Data Structures
The two most common types of cloud-enabled data structures are data lakes and data warehouses, which sound similar and share some functionality but have different goals.
A data warehouse stores information in a carefully organized, well-managed structure that is defined in advance, while a data lake holds structured and unstructured data in one place, to be accessed as needed.
Data warehouses tend to be better for commercial firms that work heavily with structured data, Armentrout said. Organizations such as associations and nonprofits tend to have more unstructured data—information from social media, video transcripts, and help desk transcripts, for example—which means they may benefit from a data lake approach.
“Data lakes reflect more information flow than pure information storage,” she noted.
Gupta added that a data warehouse approach requires a lot of planning that data lakes do not.
“A data warehouse requires associations to think about (and agree on) the data structure of the stored data,” Gupta said. “However, lakes allow unstructured data to be stored for future consumption and analysis.”
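To make Gupta's distinction concrete, here is a minimal Python sketch. It is an illustration only, not something the presenters described: an in-memory SQLite table stands in for a warehouse, a plain list stands in for lake storage, and the table layout, field names, and helper functions are all hypothetical.

```python
import json
import sqlite3

# Warehouse-style: the structure is agreed on before any data is loaded.
# (Hypothetical schema; an in-memory SQLite table stands in for a warehouse.)
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE members ("
    "member_id INTEGER PRIMARY KEY, name TEXT NOT NULL, join_year INTEGER NOT NULL)"
)

def load_into_warehouse(record):
    """Return True if the record fits the agreed schema, False otherwise."""
    try:
        warehouse.execute(
            "INSERT INTO members (member_id, name, join_year) VALUES (?, ?, ?)",
            (record.get("member_id"), record.get("name"), record.get("join_year")),
        )
        return True
    except sqlite3.IntegrityError:
        # A missing required field violates a NOT NULL constraint.
        return False

# Lake-style: raw records are kept as-is and interpreted when they are read.
# (A list of JSON strings stands in for cloud object storage.)
lake = []

def load_into_lake(source, payload):
    """Store any payload untouched, tagged only with where it came from."""
    lake.append(json.dumps({"source": source, "payload": payload}))

def read_from_lake(source):
    """Apply structure at read time by selecting one source's payloads."""
    entries = (json.loads(raw) for raw in lake)
    return [entry["payload"] for entry in entries if entry["source"] == source]

member = {"member_id": 1, "name": "Pat Example", "join_year": 2019}
transcript = {"caller": "unknown", "text": "I cannot log in to the member portal."}

member_fits = load_into_warehouse(member)          # fits the agreed schema
transcript_fits = load_into_warehouse(transcript)  # rejected: no name or join_year

load_into_lake("membership", member)
load_into_lake("help_desk", transcript)            # stored for future analysis
help_desk_records = read_from_lake("help_desk")
```

In this sketch the help desk transcript cannot enter the warehouse table without a schema change, but the lake accepts it immediately and leaves the question of how to interpret it for later.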
Keep Things Clean in Your Data Lake
While data lakes don’t require deeply organized data the way data warehouses do, you still need to keep the data in them clean. The trio explained that cleaning and maintaining data should be a core part of a data lake approach, because neglecting it lets bad or irrelevant records build up.
“Before you flow everything into a data lake, you absolutely must perform essential basics, ranging from data governance to data cleaning,” the presenters wrote. “And once you’ve got your data lake, you must constantly monitor it to be sure it’s not getting muddied or polluted with bad or irrelevant data.”
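As a rough illustration of that clean-then-monitor routine, the Python sketch below normalizes records, drops incomplete and duplicate entries before loading, and then reports what share of a batch would fail the same checks. The required fields, the rules, and the function names are assumptions made for this example; a real governance policy would define its own.

```python
def clean(records):
    """Normalize emails, drop incomplete rows, and remove duplicates.

    Hypothetical rule: every record needs a member_id and an email.
    """
    seen = set()
    cleaned = []
    for record in records:
        email = str(record.get("email") or "").strip().lower()
        member_id = record.get("member_id")
        if member_id is None or not email:
            continue  # incomplete: fails the required-field rule
        key = (member_id, email)
        if key in seen:
            continue  # duplicate of a record already kept
        seen.add(key)
        cleaned.append({**record, "email": email})
    return cleaned

def bad_record_rate(records):
    """Monitoring check: share of records that would not survive cleaning."""
    if not records:
        return 0.0
    return 1 - len(clean(records)) / len(records)

incoming = [
    {"member_id": 1, "email": " Pat@Example.org "},
    {"member_id": 1, "email": "pat@example.org"},   # duplicate after normalizing
    {"member_id": 2, "email": ""},                  # missing email
    {"member_id": 3, "email": "lee@example.org"},
]

ready_to_load = clean(incoming)
rate = bad_record_rate(incoming)
```

Running the same rate check on a schedule against what is already in the lake is one simple way to notice when it starts getting "muddied," though the threshold that should trigger action is a judgment call for each organization.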
Some of the biggest challenges in data management come from the people you hope will use these tools.
For example, Armentrout mentioned groups that have spent lots of time implementing data lakes without talking to the organizational stakeholders. “Then the budget kind of runs dry at that point, not to mention the energy level,” she said. “And they still don’t have users who are ready to take advantage of this fantastic resource.”
As for leaders, AABB’s Smith recommends that they avoid being pulled in by shiny objects when building an association’s data framework. When they instead act as stewards throughout the process, they can help encourage uptake.
“We’re not talking about an overnight situation,” he said. “We’re talking about a long-term, performance-based realization of any kind of products that are the result of the data lake process.”
Ultimately, moving to a data lake or data warehouse may sound good, but if it’s not helping you advance your strategic goals, there could be deeper problems at play on the digital transformation front.
“The data lake is buying the car. It’s not your digital transformation. It’s an important step on that journey,” Armentrout said.