FolderSizes – Advances in Disk Space Analysis Performance

The release of FolderSizes 8 represented a major milestone for our market-leading storage analysis and reporting product. Not only did we introduce a range of valuable new features, but we also re-engineered the product’s file system analysis engine from scratch, with stunning results. (Note that actual performance varies with storage device speed and other factors.)

Since the public release of FolderSizes 8, we’ve received a number of inquiries from users asking how FolderSizes is able to perform as well as it does, especially when compared to similar products. Today I’d like to provide a high-level overview of where we are and how we got here.

“SuperThreading” – A New Approach

The new FolderSizes 8 disk space analysis subsystem was built from the ground up to maximize parallelism and concurrency at all stages. It employs a user-customizable thread pool to minimize thread startup costs, and then intelligently allocates work using proprietary algorithms that maximize effectiveness and minimize the potential for thread contention.
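To make the thread pool idea concrete, here is a minimal sketch in C++. It is not the FolderSizes implementation, whose work allocation algorithms are proprietary and not described here. The names `ThreadPool`, `submit`, `wait_idle`, and `parallel_total` are hypothetical, and the single shared queue is a simplifying assumption. The sketch shows only the general technique: workers are started once and reused, so each unit of work avoids the cost of starting a thread.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical fixed-size pool (illustration only, not FolderSizes code).
class ThreadPool {
public:
    explicit ThreadPool(std::size_t worker_count) {
        assert(worker_count > 0);
        for (std::size_t i = 0; i < worker_count; ++i) {
            workers_.emplace_back([this] { run(); });
        }
    }

    ~ThreadPool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        work_ready_.notify_all();
        for (auto& worker : workers_) worker.join();
    }

    ThreadPool(const ThreadPool&) = delete;
    ThreadPool& operator=(const ThreadPool&) = delete;

    // Queue a task; a task may itself submit further tasks (e.g. subfolders).
    void submit(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push(std::move(task));
            ++pending_;
        }
        work_ready_.notify_one();
    }

    // Block until every submitted task has finished running.
    void wait_idle() {
        std::unique_lock<std::mutex> lock(mutex_);
        idle_.wait(lock, [this] { return pending_ == 0; });
    }

private:
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                work_ready_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // stopping and queue drained
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();  // run outside the lock so workers do not block each other
            {
                std::lock_guard<std::mutex> lock(mutex_);
                if (--pending_ == 0) idle_.notify_all();
            }
        }
    }

    std::mutex mutex_;
    std::condition_variable work_ready_;
    std::condition_variable idle_;
    std::queue<std::function<void()>> tasks_;
    std::vector<std::thread> workers_;
    std::size_t pending_ = 0;
    bool stopping_ = false;
};

// Sum a list of sizes with one task per entry, using a lock-free counter
// so the workers do not contend on a mutex for the running total.
std::uint64_t parallel_total(const std::vector<std::uint64_t>& sizes,
                             std::size_t worker_count) {
    std::atomic<std::uint64_t> total{0};
    ThreadPool pool(worker_count);
    for (std::uint64_t size : sizes) {
        pool.submit([&total, size] { total.fetch_add(size, std::memory_order_relaxed); });
    }
    pool.wait_idle();
    return total.load();
}
```

In a real scanner each task would enumerate one folder and submit its subfolders as new tasks. A single shared queue is the simplest design, but it can itself become a point of contention under heavy load, which is presumably one reason a production engine would use a more elaborate allocation scheme.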

This advanced threading system operates against a proprietary in-memory database model that iteratively constructs a complete working picture of the file system(s) being analyzed. The database dynamically adapts its internal storage model based upon the kind of workload assigned, ensuring that a minimal subset of information is stored in memory at any time, thereby vastly increasing scalability. Special memory compaction and data storage algorithms further ensure optimal memory usage.
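As a rough illustration of keeping only a minimal subset of information in memory, the sketch below stores one compact record per folder and folds each file’s size into running totals instead of keeping a record per file. This is an assumption made for illustration, not a description of the actual FolderSizes database, and the names `FolderIndex`, `add_folder`, and `add_file` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory folder model (illustration only).
// Folders live in one flat vector and refer to their parent by index,
// which avoids a separate heap allocation and pointer set per node.
class FolderIndex {
public:
    static constexpr std::size_t kNoParent = static_cast<std::size_t>(-1);

    // Add a folder under `parent` (or as a root) and return its index.
    std::size_t add_folder(std::string name, std::size_t parent = kNoParent) {
        assert(parent == kNoParent || parent < nodes_.size());
        nodes_.push_back(Node{std::move(name), parent, 0, 0});
        return nodes_.size() - 1;
    }

    // Record a file by adding its size to the folder and all ancestors.
    // The file itself is not stored, only its contribution to the totals.
    void add_file(std::size_t folder, std::uint64_t bytes) {
        assert(folder < nodes_.size());
        for (std::size_t i = folder; i != kNoParent; i = nodes_[i].parent) {
            nodes_[i].total_bytes += bytes;
            nodes_[i].file_count += 1;
        }
    }

    std::uint64_t total_bytes(std::size_t folder) const {
        return nodes_.at(folder).total_bytes;
    }

    std::uint64_t file_count(std::size_t folder) const {
        return nodes_.at(folder).file_count;
    }

private:
    struct Node {
        std::string name;
        std::size_t parent;
        std::uint64_t total_bytes;  // size of all files at or below this folder
        std::uint64_t file_count;   // number of files at or below this folder
    };

    std::vector<Node> nodes_;
};
```

A workload that needs per-file detail, such as a largest-files report, would have to retain more than this. That is the kind of trade-off an adaptive storage model is meant to handle.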

All FolderSizes source code is written in C/C++ and compiled directly to 32-bit or 64-bit machine code using the most powerful optimizing compilers available today. Optimization occurs not only at compile time but also through runtime performance profiling, which is used to optimize the placement of “hot” code paths based on real-world program execution scenarios. These efforts result in fast program execution with minimal computational overhead and resource consumption. In other words, FolderSizes is not only fast, it’s also very efficient.

FolderSizes was the first product of its type to provide native 64-bit support – an accomplishment that many competing solutions have still not achieved.

FolderSizes also works very hard to optimize both local and network analysis workloads. It understands the semantic differences between these environments and adapts accordingly, ensuring the best possible experience regardless of whether you’re analyzing a local drive, a critical corporate NAS (network attached storage) device, or anything in between.

Collaboration, Experience and Pedigree

This work was made possible through years of cooperation with many of the largest organizations in the world. FolderSizes provides storage analysis capabilities to the likes of ExxonMobil, Chevron, NASA, Comerica Bank, Marathon Oil, and thousands of other companies with massive data storage requirements. Our close working relationship with these customers has helped us build the most advanced and thoroughly battle-tested disk space analysis product in the world.

Not only is FolderSizes stress-tested against massive, real-world storage subsystems, but we also maintain an extensive internal testing environment that includes numerous “extreme” state conditions. This has helped us develop FolderSizes into a tool that performs with a high level of accuracy in environments where other products simply crash and burn.

FolderSizes is developed and published by Key Metric Software, a Traverse City, MI company specializing in file system analysis and reporting products. Initial work on FolderSizes began early in 2001, with the first formal release published in 2003. That’s over fifteen years of iterative development on FolderSizes. In fact, you can review the product release notes at any time to see exactly how the tool has evolved over the years.

We’re very proud of how FolderSizes has advanced over the years, and even more excited about the future.