Delta Service
Background

Related Work

xproxy: an xdelta based proxy.

Debian Package Management System

Debian uses a straightforward package management system. It consists of a repository (the Debian Archive) of available packages, usually accessible via HTTP(S); a set of tools (such as apt and dpkg) for package retrieval and management on the client machine; and a cache that stores retrieved packages locally.

The Debian Archive provides a list of all available packages (…). A Debian binary package is a … The cache (…) …

Public Package Archives

Older versions of Debian packages are publicly available in two major archives: Debian Snapshot Archive and Ubuntu Launchpad.

Analysis

Architecture Approaches

Two approaches are considered. Both rely on a so-called Delta Service, which provides patches that transform an old version of a package, available locally in the cache, into a new version of the same package. These patches are downloaded and applied to unpacked Debian packages. Applying a patch yields the unpacked content of the new package version. The two approaches differ in how they process that unpacked content further.

Approach 1: Providing Patched and Repackaged Packages

This approach relies on a service on the local machine that acts as a proxy between the package management tools and the Debian Archive. On a request for a Debian package, the proxy checks whether an old version of that package is available in the cache. If so, it requests a patch from the Delta Service. The older version from the cache is unpacked into a temporary directory and the patch is applied to that directory, yielding the unpacked content of the new package. This directory is then carefully repacked to create exactly the same package (with the same checksum) as the original package in the Debian Archive. The result is returned to APT, which can then proceed with its usual task.
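The request handling of such a proxy can be sketched roughly as follows. This is only a minimal illustration in Python, not the actual implementation: `serve_package` and all of its callable parameters (`find_cached_old`, `fetch_patch`, `rebuild`, `fetch_full`) are hypothetical names, and SHA-256 is assumed as the checksum. The sketch only shows the control flow: try the delta path first, verify the repacked result, and fall back to a full download whenever anything does not match.

```python
import hashlib
from typing import Callable, Optional


def sha256_hex(data: bytes) -> str:
    """Hex digest used to compare a rebuilt package with the archive's checksum."""
    return hashlib.sha256(data).hexdigest()


def serve_package(
    name: str,
    expected_sha256: str,
    find_cached_old: Callable[[str], Optional[bytes]],    # hypothetical: cache lookup
    fetch_patch: Callable[[str, bytes], Optional[bytes]],  # hypothetical: Delta Service client
    rebuild: Callable[[bytes, bytes], bytes],              # hypothetical: unpack, patch, repack
    fetch_full: Callable[[str], bytes],                    # hypothetical: plain archive download
) -> bytes:
    """Return the requested package, preferring the delta path."""
    old = find_cached_old(name)
    if old is not None:
        patch = fetch_patch(name, old)
        if patch is not None:
            try:
                candidate = rebuild(old, patch)
            except Exception:
                candidate = None  # patching or repacking failed
            # Only hand the rebuilt package on if it is byte-identical
            # to the original, judged by its checksum.
            if candidate is not None and sha256_hex(candidate) == expected_sha256:
                return candidate
    # No old version, no patch, or a mismatch: download the original package.
    return fetch_full(name)
```

Passing the individual steps in as callables keeps the sketch independent of any particular patch format or archive layout.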
Approach 2: Providing Patched Package Content

In this approach, the local package management tools are extended by so-called Delta Plugins. They intercept package retrieval and redirect it to retrieve the appropriate patch files instead. As in the first approach, the patch is applied to a directory holding the unpacked content of the old package from the cache. The difference is that the unpacked content is not repackaged before installation. Instead, the Delta Plugins hand the directory to the package management tools in its unpacked form for further processing and installation. While the package management tools work on the patched package content, the Delta Plugins start repackaging the package (with low priority) to store it in the cache later.

Difference Between Both Approaches

The first approach has a significant performance disadvantage, because it repacks packages before they can be installed. During repacking, a lot of effort goes into compression, and the new standard compression method LZMA in particular is known to use a lot of processing power. On network connections with a throughput of about 10 Mbit/s or more, LZMA compression can even take longer than downloading the full package. Here the second approach has the advantage that unpacked content can be installed immediately after patching. Compression is then needed only if the unpacked content contains compressed files that were worth patching, and in most cases those are not compressed with LZMA.

The main advantage of the first approach is that it does not interfere with the basic functionality of the package management tools. The delta proxy can act as a fully functional, locally available Debian Archive providing ordinary Debian packages, which can be checked for integrity in the usual way. If anything goes wrong, it can still download the actual package from its original source. This makes the first approach highly reliable and trustworthy. In contrast, the plugins in the second approach have to be integrated into the existing package management tools and require a new method for integrity checks on downloaded and patched package content.

Key Topics to be Further Analysed
Identifying Derivative Packages

An updated package can have the same file name as its predecessor or a different one, for example when the version number is part of the file name. The relationship between older and newer packages is of course known to the package provider, but not necessarily to the client that initiates the download. A simple example is a manual download through a web browser: even though the person at the PC knows that the two packages are related, the web browser does not.
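Where file names follow the common Debian archive convention `package_version_architecture.deb`, a client could at least guess the relationship from the names alone. The following Python sketch illustrates that heuristic; `parse_deb_filename` and `is_derivative_candidate` are hypothetical helper names, and the sketch assumes files have not been renamed, which is exactly what cannot be guaranteed for manual downloads.

```python
from typing import NamedTuple, Optional


class DebFileName(NamedTuple):
    package: str
    version: str
    architecture: str


def parse_deb_filename(filename: str) -> Optional[DebFileName]:
    """Split 'package_version_architecture.deb'; return None if it does not fit."""
    base = filename.rsplit("/", 1)[-1]  # drop any leading path or URL part
    if not base.endswith(".deb"):
        return None
    parts = base[: -len(".deb")].split("_")
    if len(parts) != 3 or not all(parts):
        return None
    return DebFileName(*parts)


def is_derivative_candidate(old: str, new: str) -> bool:
    """Heuristic: same package and architecture, but a different version."""
    a, b = parse_deb_filename(old), parse_deb_filename(new)
    if a is None or b is None:
        return False
    return (
        a.package == b.package
        and a.architecture == b.architecture
        and a.version != b.version
    )
```

For example, `is_derivative_candidate("hello_2.10-1_amd64.deb", "hello_2.10-2_amd64.deb")` is true, while two files with different package names are never matched. A name-based guess like this says nothing about how similar the contents really are.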
Holger Machens, 02-Jan-2021