Swap Algorithm Improvement

Registered by Saugata Das on 2012-07-06

The Linux swap implementation does not currently handle flash-based storage such as eMMC or UFS efficiently, which results in high write amplification and a shorter device lifetime. The following improvements should be considered for the swap algorithm:

1. Ensure that the swap partition start address is erase block aligned
2. Ensure the swap meta-data (header) and clusters are erase block aligned
=> One proposal is to include the erase block size information in the swap headers during mkswap

3. Ensure the write pages are super page size aligned
=> One proposal is to include the super page information in the swap headers during mkswap

4. Discard each page as soon as it is freed
=> Today, a complete cluster is discarded just before it is reused. Evaluate whether it is feasible to discard each page as soon as it is freed, and whether doing so helps performance

5. Write the swap pages in order within a cluster. Start writing to the next cluster only after the current cluster has been completely written
=> This is currently handled by the IO scheduler (deadline scheduling)
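
The alignment requirements in points 1 to 3 reduce to rounding arithmetic. The following is a minimal sketch only, assuming a 4 MiB erase block and a 16 KiB super page (illustrative values, real devices differ); the helper names are hypothetical and are not taken from the kernel or util-linux:

```python
ERASE_BLOCK = 4 * 1024 * 1024   # assumed erase block size in bytes (illustrative)
SUPER_PAGE = 16 * 1024          # assumed super page size in bytes (illustrative)
PAGE_SIZE = 4096                # assumed swap page size in bytes


def align_up(offset: int, boundary: int) -> int:
    """Round offset up to the next multiple of boundary."""
    return (offset + boundary - 1) // boundary * boundary


def is_aligned(offset: int, boundary: int) -> bool:
    """Return True if offset is an exact multiple of boundary."""
    return offset % boundary == 0


def first_data_offset(partition_start: int, header_bytes: int = PAGE_SIZE) -> int:
    """Absolute device offset of the first erase block boundary after the swap header.

    Hypothetical helper: the swap header occupies the first page of the
    partition, so data clusters would begin at the next erase block boundary.
    """
    return align_up(partition_start + header_bytes, ERASE_BLOCK)


# Number of swap pages that would have to be written together to fill one super page.
PAGES_PER_SUPER_PAGE = SUPER_PAGE // PAGE_SIZE
```

With these assumed sizes, a partition starting at 1 MiB would place its first data cluster at the 4 MiB boundary, and four swap pages would make up one super page write.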

Once the above changes are implemented, check the write pattern to eMMC (specifically for point 5) to decide the next steps.
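
One way to check the write pattern for point 5 is to summarize a captured list of writes. This is a sketch under stated assumptions: the input is a list of (byte offset, length) pairs in issue order, obtained by some block tracing tool, and a cluster is taken to be 256 pages of 4 KiB. The function name and the chosen metrics are hypothetical:

```python
from typing import Dict, Iterable, Tuple

PAGE_SIZE = 4096                  # assumed swap page size in bytes
CLUSTER_BYTES = 256 * PAGE_SIZE   # assumed cluster size: 256 pages


def summarize_writes(writes: Iterable[Tuple[int, int]]) -> Dict[str, int]:
    """Count how many writes continue exactly where the previous one ended,
    and how often consecutive writes land in different clusters."""
    total = 0
    sequential = 0
    cluster_switches = 0
    prev_end = None       # byte offset just past the previous write
    prev_cluster = None   # cluster index containing the last byte of the previous write
    for offset, length in writes:
        cluster = offset // CLUSTER_BYTES
        if prev_end is not None:
            if offset == prev_end:
                sequential += 1
            if cluster != prev_cluster:
                cluster_switches += 1
        prev_end = offset + length
        prev_cluster = (offset + length - 1) // CLUSTER_BYTES
        total += 1
    return {
        "writes": total,
        "sequential": sequential,
        "cluster_switches": cluster_switches,
    }
```

A trace with many cluster switches relative to the number of clusters touched would suggest that clusters are not being filled in order before the next one is started.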

Related discussion in Kernel mailing list:

Blueprint information

Arnd Bergmann
Saugata Das
Needs approval
Venkatraman S
Series goal:
Milestone target:
Started by: Venkatraman S on 2012-10-09

Roadmap id: CARD-154
Headline: Swap algorithm on eMMC and UFS improved
 * TODO: which tree is this going to, or which maintainer needs to take it?
 * Prove that it is more efficient by showing test results (Appala)
[jakub-pavelek 2013-01-02] Blocked, no developer. Can go deferred if not resolved


Work Items

Work items for 12.10:
[svenkatr] Update util-linux libblkid to parse MMC erase block information if it is present in sysfs: DONE
[svenkatr] Update util-linux mkswap tool to embed erase block information into swap header padding: DONE
[svenkatr] Update kernel swapon syscall to read erase block information from partition header: DONE
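
The flow of the three completed items (read the erase size from sysfs, store it in the swap header padding, read it back at swapon time) can be sketched as follows. This is not the actual patch set: it assumes the value comes from the MMC `preferred_erase_size` sysfs attribute, that it is stored as a little-endian 32-bit integer in the first padding word of the version 1 swap header, and all function names are hypothetical:

```python
import os
import struct
from typing import Optional

# Offset of the padding array in the v1 swap header:
# bootbits[1024] + version/last_page/nr_badpages (3 x u32) + uuid[16] + volume[16]
PADDING_OFFSET = 1024 + 3 * 4 + 16 + 16


def read_erase_size(disk: str, sysfs_root: str = "/sys/block") -> Optional[int]:
    """Return the erase size in bytes reported for disk, or None if unavailable.

    Assumption: the attribute lives at <sysfs_root>/<disk>/device/preferred_erase_size.
    """
    path = os.path.join(sysfs_root, disk, "device", "preferred_erase_size")
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def embed_erase_size(header: bytearray, erase_size: int) -> None:
    """Store erase_size in the first padding word (hypothetical placement)."""
    struct.pack_into("<I", header, PADDING_OFFSET, erase_size)


def extract_erase_size(header: bytes) -> int:
    """Read back the value stored by embed_erase_size (0 means not recorded)."""
    return struct.unpack_from("<I", header, PADDING_OFFSET)[0]
```

Passing `sysfs_root` as a parameter keeps the reader testable against a temporary directory instead of a real device.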

Work items for backlog:
[svenkatr] Upstream the above patches to util-linux: TODO
[svenkatr] Align swap extent starting block to erase block: TODO
[svenkatr] Study kernel swap code to understand the super block organization: TODO
[svenkatr] Ensure write pages are superblock aligned: TODO
[appala-bade] Investigate how to build a test/benchmark for the existing implementation, then run it against the new implementation to measure the improvement: TODO
[appala-bade] Implement tests/benchmarks: TODO
[appala-bade] Document (e.g. in Google Doc or other shared document) the results: TODO
