[sam-qfs-discuss] kernel performance when aggregating free'd md blocks

Stuart Anderson anderson at ligo.caltech.edu
Tue Jan 1 15:52:00 PST 2008


I am very keen to start using ma/md filesystems with 10^7 to 10^8 files in
them again. However, if there is still an O(N^2) algorithm (my conjecture) in
the garbage collection of freed small DAU's, that is a non-starter for us,
and we will have to stick with ma/mr and live with the compromise between
wasted disk space and performance on large files.
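
To make the conjecture concrete, here is a toy model of how free-block
aggregation could go quadratic. It is not the QFS code and is not based on
the QFS source: the names (coalesce_naive, coalesce_indexed) and the ratio
of 8 small blocks per large block are assumptions chosen purely for
illustration. It only shows that rescanning the whole free list on every
free costs O(N^2) steps when the frees are scattered across many large
blocks, whereas a per-group counter does the same work in O(N).

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

SMALL_PER_LARGE = 8  # illustrative ratio only, not taken from QFS


def coalesce_naive(freed: Iterable[int]) -> Tuple[Set[int], int]:
    """Rescan the entire free list after every free.

    Returns (set of completed large-block indices, scan steps performed).
    Assumes each small-block address is freed at most once.
    """
    free_list: List[int] = []
    large: Set[int] = set()
    steps = 0
    for addr in freed:
        free_list.append(addr)
        group = addr // SMALL_PER_LARGE
        members = 0
        for other in free_list:  # linear scan on every single free
            steps += 1
            if other // SMALL_PER_LARGE == group:
                members += 1
        if members == SMALL_PER_LARGE:
            # All small blocks of this group are free: promote to a large block.
            free_list = [a for a in free_list if a // SMALL_PER_LARGE != group]
            large.add(group)
    return large, steps


def coalesce_indexed(freed: Iterable[int]) -> Tuple[Set[int], int]:
    """Same result, but with a per-group counter: one step per free."""
    counts: Dict[int, int] = defaultdict(int)
    large: Set[int] = set()
    steps = 0
    for addr in freed:
        steps += 1
        group = addr // SMALL_PER_LARGE
        counts[group] += 1
        if counts[group] == SMALL_PER_LARGE:
            del counts[group]
            large.add(group)
    return large, steps
```

Feeding both functions a scattered free order (for example, the first small
block of every group, then the second of every group, and so on) keeps the
naive free list long for most of the run, which is where the step counts
diverge.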

I would appreciate it if anyone can comment on the performance of freeing
(rm or release) a large number of dual-allocated small files in the current
4.6.25 or later code base. In particular, is it still possible to overwhelm
the free block list aggregating code?

Thanks.


On Fri, Apr 20, 2007 at 11:47:17AM -0700, Stuart Anderson wrote:
> > 
> > >1) The last time we tried tweaking the Solaris kernel parameters to increase
> > >   the various caching parameters we got burned by what we suspect is an
> > >   extremely inefficient algorithm for reclaiming DAU's for md
> > >   dual-allocation devices when a large number, O(1M), of small files are
> > >   deleted. It is conjectured that the algorithm to reclaim lots of small
> > >   DAU's into a smaller number of large DAU's was causing the problem. In
> > >   one instance it took up to a _week_ for the kernel to stop spinning as
> > >   it freed up the space. Is this still a problem? Or, based on your
> > >   understanding of free DAU aggregation for md devices, is this guess
> > >   impossible and we should look for some other explanation?
> > 
> > I believe our testing group has recreated this problem, but your results
> > are much worse.  This could be due to the fact that the only activity in
> > our filesystem during the test was the file deletes.
> 
> We believe this also happened when running release on several million
> small files which tends to confirm the theory about aggregating small DAU's,
> though we did not reproduce that in a controlled fashion.  Another reason we
> may have had a longer kernel processing time than your test case was that we
> first discovered this while running with increased kernel caching parameters
> for metadata--which we no longer do. This was also on an older machine
> (8x1.2GHz V880). However, even after going back to default kernel parameters
> I believe we were still getting day-long kernel loops before we simply
> learned not to run rm -r and release -r on these filesystems. Instead we
> wrote scripts that feed the file lists to rm/release in ~100k lists.
> 
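
For anyone wanting to try the same workaround, the batching described above
can be sketched roughly as follows. This is not our actual script; the
function names, the per_exec and pause_s parameters, and the file name
filelist.txt are all hypothetical. Each ~100k batch is split further into
smaller invocations so that no single command line grows too long.

```python
import subprocess
import time
from itertools import islice
from typing import Iterable, Iterator, List, Sequence


def batches(items: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield successive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be >= 1")
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def run_in_batches(command: Sequence[str], paths: Iterable[str],
                   batch_size: int = 100_000, per_exec: int = 1_000,
                   pause_s: float = 0.0, dry_run: bool = False) -> int:
    """Feed `paths` to `command` one batch at a time.

    Returns the number of paths handed to the command. With dry_run=True
    nothing is executed, which is useful for checking the batching first.
    """
    handled = 0
    for batch in batches(paths, batch_size):
        # Split each batch again so a single argv never gets too long.
        for args in batches(batch, per_exec):
            if not dry_run:
                subprocess.run([*command, *args], check=True)
            handled += len(args)
        # Hypothetical pause between batches (0 by default).
        time.sleep(pause_s)
    return handled
```

Usage would be along the lines of
run_in_batches(["rm"], (line.rstrip("\n") for line in open("filelist.txt"))),
with ["release"] substituted for ["rm"] as needed.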

-- 
Stuart Anderson  anderson at ligo.caltech.edu
http://www.ligo.caltech.edu/~anderson

