Preload included in the default software selection of Linux Mint

Registered by Rovanion on 2010-08-01

This blueprint proposes including the small daemon Preload in the default software selection shipped with Linux Mint. It also explains what Preload does and how it does it.

Simply put:
Preload is a program that runs in the background and makes applications launch faster. It predicts which applications you are most likely to launch next and caches their files in RAM before you launch them.

Use case:
A user has just started her computer. Preload sees that nothing besides the base system is running. It has previously seen that the web browser tends to be started next in that situation, so it predicts that the user will start her web browser. Preload then asks the Linux kernel to cache the web browser's files. When the user launches her web browser, the cached files are read from RAM instead of from the hard drive, which avoids the slowest part of the launch. Preload then predicts that Adobe Flash will probably be used next and prefetches that too.

Description from the thesis written by Behdad Esfahbod for his Master of Science in Computer Science:
There are two fairly isolated components in preload: the data gathering and model training component, and the predictor. These two are connected together using a shared probabilistic model. The former component trains the model online based on data gathered by on-going monitoring of user actions, while the latter uses the model to make predictions and perform prefetching.

The data gathering component will gather information about running applications periodically, once each cycle where a cycle is a tunable parameter that defaults to twenty seconds. The list of running applications is produced by filtering the list of the processes running on the system, and for each application, the list of its file-backed memory maps is fetched, and used to update the model parameters.
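The data gathering step described above can be sketched roughly as follows. This is only an illustration and not Preload's actual code: the Model class, the gather function and the snapshot layout are hypothetical names made up for this example, and the map listing is assumed to look like the text of /proc/<pid>/maps on Linux.

```python
from dataclasses import dataclass, field

CYCLE_SECONDS = 20  # default cycle length mentioned in the thesis excerpt


@dataclass
class Model:
    # Hypothetical model state: application path -> file-backed maps seen so far
    maps: dict = field(default_factory=dict)
    # Application path -> number of cycles in which it was seen running
    cycles_running: dict = field(default_factory=dict)


def parse_file_backed_maps(maps_text):
    """Return the sorted unique file paths in /proc/<pid>/maps-style text."""
    paths = set()
    for line in maps_text.splitlines():
        # Columns: address, perms, offset, dev, inode, optional pathname
        fields = line.split(None, 5)
        # Anonymous mappings have no pathname; [stack], [heap] etc. are not files
        if len(fields) == 6 and fields[5].startswith("/"):
            paths.add(fields[5].strip())
    return sorted(paths)


def gather(model, snapshot):
    """Update the model from one cycle's snapshot: {application path: maps text}."""
    for exe, maps_text in snapshot.items():
        model.maps.setdefault(exe, set()).update(parse_file_backed_maps(maps_text))
        model.cycles_running[exe] = model.cycles_running.get(exe, 0) + 1
```

A real daemon would call something like gather once every CYCLE_SECONDS with a fresh snapshot of the running processes. The filtering of small or short-lived processes is left out here.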

The predictor component also takes action once every cycle, and uses the trained model and the list of currently running applications. For every application that is not running, the predictor derives a probability that this application is going to be started during the next cycle. The predictor then uses these per-application probabilities to assign probabilities to their maps, and sorts the maps based on their probabilities, and proceeds with prefetching the top ones into main memory. Memory statistics and system load are used to decide how much prefetching is performed in each cycle, to minimize the effect of preload on the system load.
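As a rough sketch of the prediction step above, the following shows one plausible way to turn per-application probabilities into a ranked list of maps and pick the top ones within a memory budget. The function names are hypothetical, and the way probabilities are combined (a map is needed if at least one application using it starts, treating applications as independent) is an assumption made for illustration, not necessarily the exact formula Preload uses.

```python
def rank_maps(app_start_prob, app_maps, running):
    """Score each map by the chance that some non-running application using it starts."""
    not_needed = {}
    for app, p in app_start_prob.items():
        if app in running:
            continue  # only applications that are not running are predicted
        for m in app_maps.get(app, ()):
            # Probability that none of the applications using this map start
            not_needed[m] = not_needed.get(m, 1.0) * (1.0 - p)
    scores = {m: 1.0 - q for m, q in not_needed.items()}
    # Most likely maps first
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


def choose_prefetch(ranked, map_size, budget_bytes):
    """Greedily pick the highest-ranked maps that still fit in the budget."""
    chosen = []
    remaining = budget_bytes
    for m, _score in ranked:
        size = map_size[m]
        if size <= remaining:
            chosen.append(m)
            remaining -= size
    return chosen
```

A library shared by several likely applications ends up ranked above the maps that only one of them uses, which is why shared maps tend to be prefetched first in this sketch.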

Known issues:
Preload can conflict with ureadahead, causing boot time to skyrocket, in some cases to as much as five minutes. This can be avoided by using readahead-fedora, as the Linux Mint Debian Edition does, or by setting model.memfree to a lower value, which makes Preload less aggressive during boot.
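To see why a lower model.memfree makes Preload less aggressive, here is a minimal sketch of how a per-cycle prefetch budget could be derived from memory statistics. It assumes the budget is a percentage-weighted sum of total, free and cached memory. Only model.memfree is named in this blueprint; the memtotal and memcached parameters, the function names and all numbers are assumptions used for illustration.

```python
def clamp_percent(value):
    """Keep a percentage setting within -100..100."""
    return max(-100, min(100, value))


def prefetch_budget_kb(total_kb, free_kb, cached_kb, memtotal, memfree, memcached):
    """Assumed budget formula: weighted sum of memory statistics, never negative."""
    budget = (clamp_percent(memtotal) * total_kb // 100
              + clamp_percent(memfree) * free_kb // 100)
    budget = max(0, budget)
    budget += clamp_percent(memcached) * cached_kb // 100
    return max(0, budget)


# Right after boot most memory is free, so memfree dominates the budget.
# The numbers below are illustrative, not measured values.
aggressive = prefetch_budget_kb(1_000_000, 400_000, 0, memtotal=-10, memfree=50, memcached=0)
gentle = prefetch_budget_kb(1_000_000, 400_000, 0, memtotal=-10, memfree=30, memcached=0)
```

Under these assumptions, lowering memfree shrinks the amount of data Preload is allowed to prefetch while plenty of memory is still free, which is the situation during boot.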

History:
Preload was originally developed both as a Google Summer of Code project in cooperation with Fedora and as a project for Behdad Esfahbod's Master's degree in Computer Science. It has since been developed further outside of those two contexts.

Resources:
At the University of Toronto, Preload was tested on a desktop computer with an Intel Pentium M 1.7GHz processor and 512MB of RAM, so it was designed and tested to have a very modest resource footprint on a computer that most people today would consider old. For example, as mentioned earlier, it scales back prefetching when the system is under load. By default, it does not consider applications smaller than 2MB, nor does it consider applications that do not run for long enough, such as a cron job.

Much of this information was collected from this thesis: www.techthrob.com/tech/preload_files/preload.pdf

Hopefully this gives a complete overview, and the next release of Linux Mint will run faster with the help of Preload.

Blueprint information

Status: Not started
Approver: Clement Lefebvre
Priority: Undefined
Drafter: None
Direction: Needs approval
Assignee: Clement Lefebvre
Definition: New
Series goal: Proposed for julia
Implementation: Unknown
Milestone target: None
