What is ETL and how can it streamline your DAM?
By Kyle HENKE
Everyone with a Digital Asset Management system wants the workflow to be consistent, repeatable, and sustainable. The answer might be something called “ETL.” ETL stands for “Extract, Transform, Load,” and it’s a data-integration strategy and workflow that helps bring together data, content, and assets from different systems or databases. It might sound simple, but successfully executing ETL requires a systematic approach, with attention to detail along every step in the process.
Organizations often store data in multiple locations, or across services that are not integrated with one another. That disconnection is further complicated if your DAM is used by different departments within your organization, each with its own set of workflows, assets, and goals. A lack of integration complicates accessibility, creates extra work, and requires additional resources to identify missing content. Implemented correctly, ETL can solve those problems by establishing a common workflow and centralizing storage into a single data warehouse.
The ETL process brings together all your data from all your systems, drives, and servers (that’s “extract”); standardizes and organizes your data into a consistent form and confirms its integrity (that’s “transform”); and then moves all of it into a single centralized system (that’s “load”). ETL allows data to travel between systems and applications seamlessly, improving user experience, strengthening your knowledge base, and enhancing discoverability and accessibility.
But it’s important to understand the commitment necessary for a successful implementation of the ETL process. ETL is worth the trouble, but it does take time and resources. Before you jump into ETL, ask (and answer) these key questions:
- How many assets do you have?
- What types of files and formats?
- Where are they?
If you can answer those questions, your DAM managers can begin the ETL process in earnest.
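One way to start answering those three questions is a quick file inventory. The sketch below is only an illustration: the `inventory` function and its report layout are hypothetical, and it assumes your assets sit on folders or mounted drives that a script can walk. Assets that live inside a hosted DAM or CMS would need that system's own export or reporting tools instead.

```python
from collections import Counter
from pathlib import Path


def inventory(roots):
    """Count files by extension and by root folder across several locations."""
    by_type = Counter()
    by_location = Counter()
    for root in roots:
        root = Path(root)
        for path in root.rglob("*"):
            if path.is_file():
                # Normalize the extension so ".JPG" and ".jpg" count together.
                ext = path.suffix.lower() or "(no extension)"
                by_type[ext] += 1
                by_location[str(root)] += 1
    return {
        "total": sum(by_type.values()),      # How many assets do you have?
        "by_type": dict(by_type),            # What types of files and formats?
        "by_location": dict(by_location),    # Where are they?
    }
```

Calling `inventory` with a list of folder paths returns one dictionary that covers all three questions, which can be a useful starting point for estimating the time and resources the rest of the process will take.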
The first step in the ETL process is to bring your assets together from all the systems you manage. Assets could come from your digital asset management (DAM) system, visual library, content management system (CMS), product information management system (PIM), document warehouse, project drives, etc. You take that existing content from your different platforms and applications, structured and unstructured, and bring it into a single database.
Extraction can be performed manually by your team, using the export methods built into each application or system, or through ETL tools that can automate much, if not all, of the extraction process. Once extraction is complete, you can move on to the transformation process.
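As a rough illustration of the automated route, the sketch below assumes each system can export its records as CSV. The system names, column names, and sample rows are all made up for the example. Each row is tagged with the system it came from and otherwise left untouched, since reconciling the differing column names belongs to the transformation step.

```python
import csv
import io


def extract(exports):
    """Read each system's CSV export and tag every row with its source system."""
    records = []
    for system, csv_text in exports.items():
        for row in csv.DictReader(io.StringIO(csv_text)):
            records.append({"source_system": system, **row})
    return records


# Hypothetical exports; real ones would be read from files or an API.
SAMPLE_EXPORTS = {
    "dam": "filename,title\nlaunch.mp4,Launch video\n",
    "cms": "file,rights\nlaunch.mp4,Internal use only\n",
}

staged = extract(SAMPLE_EXPORTS)
```

Keeping the `source_system` tag on every record makes it possible to trace an asset back to where it originated once everything is in one place.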
During the transformation process, you will take all your data and perform tasks that will standardize and organize your assets. First, you need to create standards for your data that can be used across the board on your assets, regardless of the original system or application. Standards create consistency, which improves accessibility and discovery of assets in your DAM and other platforms.
- Standardize: Formatting, description, and content type should all be determined and designated. Image assets should be described and handled the same way, whether the file is a TIFF or a JPEG. Now is the time to define and implement your standards.
- Dedupe: Identify redundant or duplicative assets often found across your datasets and/or utilized in multiple applications and platforms. Centralizing your assets means only one version of an asset will be necessary.
- Verify: Flag any discrepancies or other issues with your assets, including checksum verification errors, unusable data, or file integrity issues. It’s essential to identify and resolve these issues as early as possible.
- Organize: Sort and organize your assets based on factors you previously standardized and determined were beneficial for your systems going forward.
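The four tasks above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production pipeline: the `FIELD_MAP` schema, the record layout (including the `content` and `expected_sha256` keys), and the sample records are all hypothetical. It uses a SHA-256 checksum both to verify integrity and to spot duplicates.

```python
import hashlib
from collections import defaultdict
from pathlib import PurePath

# Hypothetical mapping from each system's field names to one shared schema.
FIELD_MAP = {"file": "filename", "name": "filename",
             "desc": "description", "caption": "description"}


def standardize(record):
    """Rename fields to the shared schema and normalize the format label."""
    out = {FIELD_MAP.get(key, key): value for key, value in record.items()}
    out["format"] = PurePath(out["filename"]).suffix.lower().lstrip(".")
    return out


def transform(records):
    """Standardize, verify, dedupe, and organize. Returns (organized, flagged)."""
    seen = set()
    organized = defaultdict(list)
    flagged = []
    for record in map(standardize, records):
        digest = hashlib.sha256(record["content"]).hexdigest()
        # Verify: flag assets whose bytes do not match the recorded checksum.
        expected = record.get("expected_sha256")
        if expected and expected != digest:
            flagged.append(record["filename"])
            continue
        # Dedupe: identical bytes mean the same asset, whatever system held it.
        if digest in seen:
            continue
        seen.add(digest)
        record["sha256"] = digest
        # Organize: group assets by their standardized format.
        organized[record["format"]].append(record)
    return dict(organized), flagged


# Made-up records standing in for the output of the extraction step.
records = [
    {"file": "logo.TIFF", "desc": "Company logo", "content": b"logo-bytes"},
    {"name": "logo_copy.tiff", "caption": "Logo", "content": b"logo-bytes"},
    {"filename": "clip.mp4", "content": b"video-bytes", "expected_sha256": "0" * 64},
]
organized, flagged = transform(records)
```

Here the second logo is dropped as a byte-for-byte duplicate of the first, and the video is flagged because its checksum does not match the recorded one. Note that a checksum only catches identical files; a TIFF and a JPEG of the same image would need a different matching rule, such as a shared asset ID.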
The transformation process is intended to simplify and build consistency in your assets. How one format or content type looks and behaves, its metadata, and how assets are accessed should be consistent across all platforms and applications.
Now is the time to move all your assets onto a centralized server that your DAM and other systems can call and access. “Load” is where you ingest your assets into a single centralized repository or database that is agile enough to push out relevant asset information or content to your other systems. The “Load” process is where the data integration is solidified and made beneficial for your organization and end-users.
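To make the load step concrete, here is a small sketch that uses Python's built-in `sqlite3` module with an in-memory database as a stand-in for the central repository. The `assets` table, its columns, and the sample records are assumptions made for the example; a real repository would have its own schema and ingest tools.

```python
import sqlite3


def load(conn, records):
    """Insert transformed records into one central table, keyed by checksum."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS assets (
               sha256 TEXT PRIMARY KEY,
               filename TEXT NOT NULL,
               format TEXT,
               description TEXT
           )"""
    )
    with conn:  # commit on success, roll back on error
        conn.executemany(
            "INSERT OR REPLACE INTO assets (sha256, filename, format, description) "
            "VALUES (:sha256, :filename, :format, :description)",
            records,
        )


conn = sqlite3.connect(":memory:")  # stand-in for the central repository
load(conn, [
    {"sha256": "abc123", "filename": "logo.tiff", "format": "tiff",
     "description": "Company logo"},
    {"sha256": "def456", "filename": "launch.mp4", "format": "mp4",
     "description": "Launch video"},
])
# Loading the same asset again replaces the row instead of duplicating it.
load(conn, [
    {"sha256": "abc123", "filename": "logo.tiff", "format": "tiff",
     "description": "Updated logo"},
])
rows = conn.execute(
    "SELECT filename, description FROM assets ORDER BY filename"
).fetchall()
```

Keying the table on the checksum means a repeated load updates the existing entry, so the central store keeps a single version of each asset for the other systems to query.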
ETL In Action: An Example
An “asset” is more than just a media file (e.g., the image or video). Assets include associated metadata files, permissions, rights, and usage information, connections to similar or related assets, and more. The lifecycle of each asset, and how assets connect and integrate, are vital to a successful DAM. Your DAM needs access to all your assets, and ETL provides a great roadmap to bring everything together.
As an example, say you have a video file. Is the video asset the only important piece of content to all end-users? Doubtful. Your users will want to know what the content is about, how and where they have permission to use the video, maybe whether or not it contains that one shot of that one building. And very likely, without ETL, that all feels pretty scattered.
With ETL, three systems – your visual library, DAM, and CMS – are in sync with centralized asset storage. Now you have all of your related assets, the full picture, in one place. With ETL, you can:
- Use the video asset and see all accompanying metadata (from your visual library).
- Use related b-roll, outtakes, etc. used to make that final video asset (from your DAM).
- Easily get your hands on copyright and usage rights information (from your CMS).
Bottom Line: Why ETL?
- ETL creates a one-stop data warehouse and sets up your DAM to truly be the platform that sets the direction for your other systems.
- ETL enables you to stop storing multiple versions of assets across different applications and systems, a practice that fragments your assets, data, and information.
- With ETL, all of your systems will be able to synchronize with your assets, ensuring those systems then have the best and most complete data possible.
- ETL ensures that your users can find, use, and repurpose your assets efficiently, enterprise-wide.