Copying files is slow and slows to a crawl over time for large numbers of files [$200]

Bug #1404588 reported by Sergey "Shnatsel" Davidoff
This bug affects 23 people
Affects: Files
Status: Confirmed
Importance: High
Assigned to: Unassigned
Milestone:

Bug Description

Copying a lot of files via Pantheon Files becomes slower and slower over time.

I've created 250,000 100-byte files on tmpfs for testing, and kicked off copying to another tmpfs. It started off at speeds over 100Kb/s but halfway through it's just 4Kb/s (!) and dropping.

Profiling with sysprof shows that all this time is spent in g_list_last(), which probably means that we're abusing a linked list somewhere and that it has to walk the entire list of already copied files, one by one, for each next file copied.
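To make the suspected cost concrete, here is a minimal sketch of the pattern described above. It is a plain Python model, not code from Files or GLib: `Node`, `append_by_walking` and `TailList` are hypothetical names, and whether Files really appends once per copied file is still only the hypothesis drawn from the profile. The model counts how many nodes are visited when every append first walks to the end of the list, and compares that with a list that keeps a pointer to its tail.

```python
class Node:
    """One cell of a singly linked list (a stand-in for a linked-list element)."""
    __slots__ = ("value", "next")

    def __init__(self, value):
        self.value = value
        self.next = None


def append_by_walking(head, value):
    """Append after walking from the head to the last node.

    Returns (head, steps), where steps is the number of nodes visited
    while looking for the tail.
    """
    node = Node(value)
    if head is None:
        return node, 0
    steps = 1
    last = head
    while last.next is not None:
        last = last.next
        steps += 1
    last.next = node
    return head, steps


def total_steps_walking(n):
    """Total nodes visited while appending n items one at a time."""
    head, total = None, 0
    for i in range(n):
        head, steps = append_by_walking(head, i)
        total += steps
    return total


class TailList:
    """The same list, but it remembers its tail, so an append visits no nodes."""

    def __init__(self):
        self.head = None
        self.tail = None
        self.length = 0

    def append(self, value):
        node = Node(value)
        if self.tail is None:
            self.head = node
        else:
            self.tail.next = node
        self.tail = node
        self.length += 1
```

In this model, appending n items by walking visits n(n-1)/2 nodes in total, so 250,000 appends would visit roughly 31 billion nodes, while the tail-pointer variant does constant work per append. That quadratic growth would match the observed pattern of each additional file costing more than the one before it.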

Testcase:
mkdir ~/created-files ~/copy-here
sudo mount -t tmpfs -o size=1G,mode=0777 tmpfs ~/created-files
sudo mount -t tmpfs -o size=1G,mode=0777 tmpfs ~/copy-here
cd ~/created-files
split -b 100 SOME-BIG-FILE
# open Pantheon Files and copy "created-files" folder into "copy-here"
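If no suitable big file is at hand for `split`, the small files can also be generated directly. The following is a rough sketch under assumptions: the helper name `create_small_files`, the `file-NNNNNN` naming scheme and the filler payload are all made up for illustration, and the target directory would be the tmpfs mount from the steps above.

```python
import os
import tempfile


def create_small_files(directory, count, size=100):
    """Create `count` files of `size` bytes each in `directory`; return their paths."""
    os.makedirs(directory, exist_ok=True)
    payload = b"x" * size  # filler content; the bytes themselves do not matter
    paths = []
    for i in range(count):
        path = os.path.join(directory, f"file-{i:06d}")
        with open(path, "wb") as handle:
            handle.write(payload)
        paths.append(path)
    return paths


if __name__ == "__main__":
    # Small count for a quick trial; the report used about 250,000 files.
    with tempfile.TemporaryDirectory() as scratch:
        created = create_small_files(os.path.join(scratch, "created-files"), 1000)
        print(len(created), "files created")
```

To reproduce the report, point `directory` at the mounted `~/created-files` and raise `count` to 250000.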

This is a synthetic test case, but I had over 250,000 files during my last backup for OS reinstallation, so this *is* a real-life scenario.

ProblemType: Bug
DistroRelease: elementary OS 0.3
Package: pantheon-files 0.1.5.1+r1680+pkg35~ubuntu0.3.1 [origin: LP-PPA-elementary-os-daily]
ProcVersionSignature: Ubuntu 3.13.0-43.72-generic 3.13.11.11
Uname: Linux 3.13.0-43-generic x86_64
ApportVersion: 2.14.1-0ubuntu3.6
Architecture: amd64
CrashDB: pantheon_files
CurrentDesktop: Pantheon
Date: Sun Dec 21 04:42:10 2014
ExecutablePath: /usr/bin/pantheon-files
GsettingsChanges:

InstallationDate: Installed on 2014-12-10 (10 days ago)
InstallationMedia: elementary OS 0.3 "Freya" - Daily amd64 (20141209)
SourcePackage: pantheon-files
UpgradeStatus: No upgrade log present (probably fresh install)

Revision history for this message
Sergey "Shnatsel" Davidoff (shnatsel) wrote :
summary: - Copying a lot of files slows to a crawl
+ Copying a lot of files slows to a crawl over time
Changed in pantheon-files:
importance: Undecided → High
Revision history for this message
Sergey "Shnatsel" Davidoff (shnatsel) wrote :

Replacing GList with GSequence data structure might be a way to hotfix this without changing huge amounts of code.

Changed in pantheon-files:
status: New → Confirmed
Revision history for this message
Jeremy Wootten (jeremywootten) wrote : Re: Copying a lot of files slows to a crawl over time [$100]

A bounty of $100 has been placed on this bug

summary: - Copying a lot of files slows to a crawl over time
+ Copying a lot of files slows to a crawl over time [$100]
Changed in pantheon-files:
assignee: nobody → cmm2 (cmm2)
milestone: none → freya-rc1
description: updated
Changed in pantheon-files:
status: Confirmed → In Progress
Revision history for this message
Giulio Sant (giulio-sant) wrote :

Might this be related to a similar issue concerning very slow file transfer to a USB stick?

summary: - Copying a lot of files slows to a crawl over time [$100]
+ Copying files is slow and slows to a crawl over time for large numbers
+ of files [$100]
Changed in pantheon-files:
status: In Progress → Confirmed
assignee: cmm2 (cmm2) → nobody
milestone: freya-rc1 → none
Revision history for this message
Jeremy Wootten (jeremywootten) wrote : Re: Copying files is slow and slows to a crawl over time for large numbers of files [$100]

I have changed the bug description to clarify that the bounty relates to obtaining a significant improvement in file copying performance in general, not just for file counts on the order of 100,000. Even with comparatively small numbers of files (100 to 1000), Files is very much slower than other well-known file managers. I have increased the bounty to reflect the widened scope.

summary: Copying files is slow and slows to a crawl over time for large numbers
- of files [$100]
+ of files [$200]
RabbitBot (rabbitbot-a)
Changed in pantheon-files:
status: Confirmed → Fix Committed
Cody Garver (codygarver)
Changed in pantheon-files:
milestone: none → freya-rc1
Changed in pantheon-files:
status: Fix Committed → Confirmed
Revision history for this message
Cody Garver (codygarver) wrote :

Not solved, but it was improved somewhat, so I'm bumping it from the milestone

Changed in pantheon-files:
milestone: freya-rc1 → none
Changed in pantheon-files:
milestone: none → loki-beta1
Revision history for this message
Jeremy Wootten (jeremywootten) wrote :

I assume both source and destination were open in a Files view during the copy? Different tabs or different windows? Icon View or other?

Changed in pantheon-files:
milestone: loki-beta1 → loki+1-beta1
Revision history for this message
Matt Spaulding (madsa) wrote :

So I ran a couple benchmarks to see if I could figure out what the problems might be here. Wrote a simple program, basically a "cp" clone using g_file_copy to benchmark copy speeds against "cp" itself. What I found is that g_file_copy has very similar performance to "cp" (copied 10,000 small files at about 160kB/sec), so no problems there. Seems more like this has to do with all the queuing and locking going on in the file manager. Been swapping out various data structures and benchmarking and seeing some small performance increases. Removing some locking from the deep counter and switching out the marlin file queue for a thread-safe GAsyncQueue improved things a bit. I've been getting between 40kB/sec and 60kB/sec with those changes. It might also be worth swapping out the GIOScheduler stuff since that is deprecated. Not sure if that will bring any speed increase with it.
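For anyone wanting a comparable baseline, a measurement of this kind can be sketched as follows. This is not the benchmark program described above, which used g_file_copy from C; it is an assumed stand-in that uses Python's `shutil.copyfile`, and the function name `copy_tree_throughput` is hypothetical. It copies every regular file in one directory to another and reports the file count, bytes copied and elapsed time, from which a kB/sec figure can be derived.

```python
import os
import shutil
import tempfile
import time


def copy_tree_throughput(src_dir, dst_dir):
    """Copy every regular file in src_dir to dst_dir; return (files, bytes, seconds)."""
    os.makedirs(dst_dir, exist_ok=True)
    files = 0
    copied = 0
    start = time.perf_counter()
    with os.scandir(src_dir) as entries:
        for entry in entries:
            if entry.is_file():
                shutil.copyfile(entry.path, os.path.join(dst_dir, entry.name))
                files += 1
                copied += entry.stat().st_size
    elapsed = time.perf_counter() - start
    return files, copied, elapsed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        src = os.path.join(scratch, "src")
        dst = os.path.join(scratch, "dst")
        os.makedirs(src)
        # 10,000 files of 100 bytes each, similar in shape to the test above.
        for i in range(10_000):
            with open(os.path.join(src, f"file-{i:05d}"), "wb") as handle:
                handle.write(b"x" * 100)
        files, copied, elapsed = copy_tree_throughput(src, dst)
        rate = copied / 1000 / elapsed if elapsed > 0 else float("inf")
        print(f"{files} files, {copied} bytes in {elapsed:.2f}s ({rate:.0f} kB/s)")
```

Numbers from a script like this only give a rough floor for plain copying on a given machine; they say nothing about the queuing and locking overhead inside the file manager, which is where the slowdown appears to be.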

Revision history for this message
Jeremy Wootten (jeremywootten) wrote :

Matt: Thanks for having a go at this. Just for clarity, the target is to get Files to be at least comparable to other popular filemanagers in performance in this aspect, say within 75%? This assumes that other features like "undo" that might affect speed are also comparable.

Revision history for this message
Matt Spaulding (madsa) wrote :

Okay, thank you for the clarification. Which file managers should I run comparisons against? At least in my tests with Nautilus, its copy speeds with large numbers of files are very poor, comparable to what we're seeing with Files.

Revision history for this message
Jeremy Wootten (jeremywootten) wrote :

Matt: I was thinking of Thunar and PCManFM primarily, although I admit I have not done a comparison recently. I assumed Nautilus was superior at the time the bug was filed, but perhaps things have changed. It is a fairly old bug now.

If Files is (now) already comparable to the best file managers under the conditions quoted in the bug then I would be willing to change the target to a more modest improvement and/or fixing of associated memory leakages.

Revision history for this message
Vishal Rao (vishalrao) wrote :

Since the code seems to be on github now, I've got a patch if anyone would like to try and comment? See https://github.com/vjr/files/commit/c972549cf42ac68cacc24b1bed1d080f5542d2c1

Revision history for this message
Jeremy Wootten (jeremywootten) wrote :

Hello Vishal, thank you for your work. I see you have created a pull request on GitHub. We will be concentrating on getting integrated testing working and more complete before merging any major changes (we have only just migrated) but I should be able to test your fix fairly soon. It would be good to get this issue fixed. There is a general desire to replace GList with another structure where possible in elementary code. There are quite a lot of GLists inherited in Files. If you have any test conditions and comparative performance timings please include them in your PR.

Revision history for this message
Vishal Rao (vishalrao) wrote :

Hi Jeremy. I almost missed your comment because LP didn't notify me by default :-) I was paying attention to the github PR location mostly and it has been almost a whole day since your post.

Anyhow, thank you for looking into my patch when you can get the opportunity.

If nothing else, I hope it will point you or other devs to at least one area for improvement in Files.

OK I will update the PR with a comment about whatever "informal testing" I did, but in general, copying a large number of files was slowing Files to a crawl and making it unresponsive. With this patch the operation was happening "very quickly" - sorry I don't have performance timings.

Note that this does not address copying speed of a single large file, of course, that remains the same.

Will update the PR with a comment/notes shortly. Thanks again.

Revision history for this message
Jeremy Wootten (jeremywootten) wrote :

OK, thanks! Future communications will be in GitHub.

Cody Garver (codygarver)
tags: added: bounty
Revision history for this message
James Buren (braewoods) wrote :

Is this still needing to be resolved?

Revision history for this message
Jeremy Wootten (jeremywootten) wrote :

There have been a number of incremental improvements to Files, not specifically related to this bug, that have, I believe, ameliorated the problem. However, if you are able to demonstrate a significant, quantified and reproducible performance improvement in this or another related aspect of Files, I would be inclined to award the bounty. There is no mechanism for withdrawing a bounty on BountySource, as far as I am aware, so it is difficult to close this bug. As per previous comments, all work and communication should now be through the GitHub repository: https://github.com/elementary/files.
