Ade Malsasa Akbar
Senior author, Open Source enthusiast.
Thursday, March 3, 2016 at 23:11

One benefit of knowing the structure inside a Debian package is that you learn more about your system. For those who are not familiar with it, a Debian package is a .deb archive file packaged for Debian-family systems. Later you will also understand, in some respects, how apt-get and dpkg work. This matters because it relates directly to your package management system. This article is mainly written for those who want to start learning Debian packaging.

Requirement


To see the structure of a Debian package, you need a Debian package. Get one from anywhere, e.g. your /var/cache/apt/archives/ directory, or download one from http://packages.ubuntu.com. Here, we use the emacs_46.1_all.deb package from the official Ubuntu 15.04 repository. You can open any .deb file with File Roller in Ubuntu or with another archive manager you have.
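If no graphical archiver is at hand, the outer layer of a .deb can also be inspected with a few lines of Python. This is only a rough sketch: it assumes the common layout in which a .deb is an ar archive holding a small version file plus one control tarball and one data tarball (the source of the DEBIAN and usr trees discussed in this article). The function name list_ar_members is made up for this illustration.

```python
import io

AR_MAGIC = b"!<arch>\n"

def list_ar_members(stream):
    """Return (name, size) for each member of an ar archive read from a binary stream."""
    if stream.read(len(AR_MAGIC)) != AR_MAGIC:
        raise ValueError("not an ar archive")
    members = []
    while True:
        header = stream.read(60)  # every ar member header is 60 bytes long
        if len(header) < 60:
            break
        # Bytes 0-15 hold the member name, bytes 48-57 the size in decimal.
        name = header[0:16].decode("ascii").rstrip().rstrip("/")
        size = int(header[48:58].decode("ascii").strip())
        members.append((name, size))
        # Member data is padded to an even number of bytes.
        stream.seek(size + (size % 2), io.SEEK_CUR)
    return members

# Example (assumes the file exists in the current directory):
# with open("emacs_46.1_all.deb", "rb") as deb:
#     print(list_ar_members(deb))
```

The sketch only lists the members; it does not unpack the tarballs inside them.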


1. Directory Structure 

 

In general, every Debian package contains two top-level directories named usr and DEBIAN. They hold the files and subdirectories of the package. There are reasons why these two exist, and there are consequences of having them.

usr



The usr directory here corresponds to the /usr directory in every Debian system. It contains other directories such as bin, lib, and share. The bin (/usr/bin) directory contains binary files; in the context of a Debian package, this is where the program is stored. The lib (/usr/lib) directory contains library files. The share (/usr/share) directory contains further directories such as icons and doc. Note that these explanations are only general. Special cases are possible, such as bin containing a shell script or lib containing a binary executable.

The directory tree generally would be like this:

*-usr
  - bin  
  - lib
  - share
    - doc
    - icons
    - info
    - man

The structure above is the same as in our Debian system. Every directory is copied into our system when the package is installed. Installing is easier with this identical structure than it would be if the package had a directory structure different from the Debian system.


DEBIAN



Every Debian packager (the person who creates a Debian package) must create this DEBIAN directory (all uppercase) inside the package. In other words, the DEBIAN directory must exist in a Debian package. The DEBIAN directory always contains a control file. If a package structure doesn't contain a control file, it cannot be built into a Debian package. This control file is the most important metadata storage for a Debian package.

control

An example of a control file, from the emacs_46.1_all.deb package:
Package: emacs
Source: emacs-defaults
Version: 46.1
Architecture: all
Maintainer: Ubuntu Developers <ubuntu-devel-discuss@lists.ubuntu.com>
Original-Maintainer: Rob Browning <rlb@defaultvalue.org>
Installed-Size: 25
Depends: emacs24 | emacs24-lucid | emacs24-nox
Section: editors
Priority: optional
Description: GNU Emacs editor (metapackage)
 GNU Emacs is the extensible self-documenting text editor.
 This is a metapackage that will always depend on the latest
 recommended Emacs release.
 
From the control file above, we know at least that the package name is emacs, the version is 46.1, the package is maintained by Ubuntu Developers, the package depends on emacs24 (or emacs24-lucid or emacs24-nox), and, from the Description field, that this package is just a metapackage.
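Reading those fields by eye can also be done in code. The following is a minimal sketch, not the parser dpkg itself uses: it assumes a single stanza of "Field: value" lines in which continuation lines start with a space, as in the example above. The name parse_control is invented for this illustration, and the sample text is a shortened copy of the control file shown earlier.

```python
def parse_control(text):
    """Parse one control stanza into a dict of field name -> value."""
    fields = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank or whitespace-only lines
        if line[0] in " \t":
            # Continuation line: it belongs to the previous field.
            if current is not None:
                fields[current] += "\n" + line.strip()
            continue
        name, _, value = line.partition(":")
        current = name.strip()
        fields[current] = value.strip()
    return fields

SAMPLE = """\
Package: emacs
Source: emacs-defaults
Version: 46.1
Architecture: all
Depends: emacs24 | emacs24-lucid | emacs24-nox
Description: GNU Emacs editor (metapackage)
 GNU Emacs is the extensible self-documenting text editor.
"""

control = parse_control(SAMPLE)
print(control["Package"], control["Version"])  # prints: emacs 46.1
```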

One of the most important jobs of a package management system is handling dependencies. This job needs the information in the Depends: field above. Here the package says it needs emacs24 to be installed first, because emacs24 is a dependency of this package. The vertical bars mark emacs24-lucid and emacs24-nox as alternatives to emacs24: at least one of the three must be installed for this package to be installed.
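The meaning of the Depends: field can be sketched in a few lines. This is a simplified illustration, not how apt or dpkg actually resolve dependencies: it assumes commas separate requirements and vertical bars separate alternatives, and it ignores version constraints altogether. The function names are invented for this example.

```python
def parse_depends(value):
    """Split a Depends value into groups; each group is a list of alternatives."""
    groups = []
    for clause in value.split(","):
        alternatives = []
        for alt in clause.split("|"):
            # Drop a version constraint such as "(>= 2.4)"; keep only the name.
            name = alt.split("(")[0].strip()
            if name:
                alternatives.append(name)
        if alternatives:
            groups.append(alternatives)
    return groups

def is_satisfied(groups, installed):
    """True if every group has at least one alternative in the installed set."""
    return all(any(name in installed for name in group) for group in groups)

groups = parse_depends("emacs24 | emacs24-lucid | emacs24-nox")
print(groups)                                 # one group with three alternatives
print(is_satisfied(groups, {"emacs24-nox"}))  # prints: True
```

Having any one of the three alternatives installed is enough for this simplified check to pass.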

md5sums

Like control, this is basically a plain text file. Unlike control, it lists the files in the package, each with its MD5 hash (a checksum that changes if the file's content changes). This file is important for checking that the contents of the package are the original, valid ones, and not corrupted or altered (e.g. with an appended virus). It is also useful as an index of the files inside a package.

As an example, here is the md5sums file of another package, emacs24_24.4+1-4ubuntu5_i386.deb:
ed1d6463a0b0988caa53fd54bdfab11c  usr/bin/emacs24-x
1de51f5410205017391d924d6025b209  usr/share/applications/emacs24.desktop
593501773dfe68c26137b599f9426484  usr/share/doc/emacs24/README.Debian
0a6306d9d7e821c2f6656bcbaba74a8d  usr/share/doc/emacs24/copyright
bc0280a4d6386cb2399b0bb88697f79b  usr/share/emacs/24.4/etc/DOC
7ab22db3e21e699cc15b2a1812586220  usr/share/lintian/overrides/emacs24
fbef85cc4395011769e17ef249d233f4  usr/share/menu/emacs24

It shows the hash of every file (the seemingly random characters in the left column). This lets the user check that every file from the package is valid. This is one of the common ways GNU/Linux distributions protect users from malware (viruses etc.) in their package management systems.
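How such a check might work can be sketched as follows. This is an illustration only, assuming the "hash, two spaces, relative path" layout shown above; the function name verify_md5sums and the throwaway file usr/bin/hello are invented for the example, and each file is read fully into memory for simplicity.

```python
import hashlib
import os
import tempfile

def verify_md5sums(md5sums_text, root):
    """Return the relative paths whose MD5 under `root` is missing or differs."""
    bad = []
    for line in md5sums_text.splitlines():
        if not line.strip():
            continue
        expected, path = line.split(None, 1)  # hash, then the rest of the line
        path = path.strip()
        try:
            with open(os.path.join(root, path), "rb") as handle:
                actual = hashlib.md5(handle.read()).hexdigest()
        except OSError:
            actual = None  # a missing or unreadable file counts as a mismatch
        if actual != expected:
            bad.append(path)
    return bad

if __name__ == "__main__":
    # Build a tiny throwaway tree so the sketch runs without a real package.
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "usr", "bin"))
        with open(os.path.join(root, "usr", "bin", "hello"), "wb") as handle:
            handle.write(b"hello\n")
        listing = hashlib.md5(b"hello\n").hexdigest() + "  usr/bin/hello\n"
        print(verify_md5sums(listing, root))  # an empty list means every file matched
```

To check an unpacked package, pass the text of its md5sums file and the directory that contains its usr tree.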


2. Post-Installed Directory Structure


After installing a package, we can see where each file is stored in our Debian system. We can look at this from several aspects:

  • The man page of this program (emacs) is stored in /usr/share/man/man1 in our Debian system. The man page for emacs consists of several gzip-compressed (.gz) files, each named *emacs* and containing a text file. These man page files come from the corresponding usr directory inside the Debian package mentioned above.
  • The info* page of this program (emacs) is stored in /usr/share/info/emacs24 in our Debian system. The info page directory contains gzip-compressed (.gz) files, similar to the man page.
  • The binary file (the actual emacs program) is stored as /usr/bin/emacs.
  • The icon file of emacs is stored in /usr/share/icons/hicolor/{resolution}/apps/emacs24.{png|svg}. We know this from the emacs24-common_24.4+1-4ubuntu5_all.deb package. This icon is what you see in the menu when emacs is installed in your Debian system.
  • The desktop file of emacs is stored as /usr/share/applications/emacs24.desktop. We know this from the emacs24_24.4+1-4ubuntu5_i386.deb package.
*) info is the GNU replacement for man. In oversimplified terms, there is man in UNIX, and there is info in GNU. See http://www.gnu.org/software/texinfo/.

man files


icon file

3. Relationship with dpkg

 

Preinstall


Before you install a package, the control file inside the Debian package is what provides dpkg with the package's metadata. Beyond that, the program that builds a Debian package is itself part of dpkg, more precisely dpkg-deb.

Postinstall


After you install a package, the content of its control file is copied into the dpkg database, the /var/lib/dpkg/status file. For example, after installing emacs_46.1_all.deb above, we find its entry in the dpkg status file at line 16624.


The entry is very similar to the control file mentioned above. dpkg reads the control file, does many other jobs, and copies the control file content into its local database. This database is a record of all installed packages in the system. In other words, it is a collection of the control files of every installed package. In our Ubuntu 15.04 system, this database is 43387 lines long.

No two installed Debian systems share exactly the same dpkg status. All Debian users will have different contents in their dpkg status files, except on freshly installed systems, because it is almost impossible that they will always install exactly the same packages over time. Briefly, that's how dpkg works.
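To get a feel for the status database, it can be read as blank-line separated stanzas, one per package. The sketch below is an assumption-laden illustration, not dpkg's own logic: it treats a stanza as installed only when its Status field reads "install ok installed", and the second sample stanza (a package called oldtool) is hypothetical data invented for the example.

```python
def installed_packages(status_text):
    """Map package name -> version for stanzas marked 'install ok installed'."""
    result = {}
    for stanza in status_text.split("\n\n"):  # stanzas are separated by blank lines
        fields = {}
        for line in stanza.splitlines():
            # Only "Field: value" lines; continuation lines start with a space.
            if line and not line[0].isspace() and ":" in line:
                key, _, value = line.partition(":")
                fields[key] = value.strip()
        if fields.get("Status") == "install ok installed" and "Package" in fields:
            result[fields["Package"]] = fields.get("Version", "")
    return result

SAMPLE_STATUS = """\
Package: emacs
Status: install ok installed
Version: 46.1
Depends: emacs24 | emacs24-lucid | emacs24-nox
Description: GNU Emacs editor (metapackage)
 GNU Emacs is the extensible self-documenting text editor.

Package: oldtool
Status: deinstall ok config-files
Version: 1.0
"""

print(installed_packages(SAMPLE_STATUS))  # only the emacs stanza is reported

# On a real system (reading the file needs no root access):
# with open("/var/lib/dpkg/status", encoding="utf-8") as status:
#     print(len(installed_packages(status.read())))
```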

4. Relationship with apt

 

apt has some "companions". Among them are apt-get and apt-cache, and there are many more. We limit our analysis of the relationship to these two.

apt-get


One of the jobs of apt-get is to download Debian packages. The best-known command, sudo apt-get install, downloads the Debian packages from the remote repository into the local directory /var/cache/apt/archives/. This is the apt cache directory. If you are looking for the .deb files stored by apt-get, look in this directory.
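The cache directory can be listed with a short script. The sketch below assumes the name_version_architecture.deb naming seen in this article (e.g. emacs_46.1_all.deb); the function names are invented for the illustration. Note that file names in the cache may carry escaped characters, which this sketch leaves untouched.

```python
import glob
import os

def parse_deb_filename(path):
    """Split 'name_version_arch.deb' into (name, version, arch)."""
    base = os.path.basename(path)
    if not base.endswith(".deb"):
        raise ValueError("not a .deb file name: " + base)
    name, version, arch = base[:-len(".deb")].split("_", 2)
    return name, version, arch

def cached_packages(cache_dir="/var/cache/apt/archives"):
    """List (name, version, arch) for every .deb found in the apt cache directory."""
    pattern = os.path.join(cache_dir, "*.deb")
    return sorted(parse_deb_filename(path) for path in glob.glob(pattern))

print(parse_deb_filename("emacs24_24.4+1-4ubuntu5_i386.deb"))
# prints: ('emacs24', '24.4+1-4ubuntu5', 'i386')
```

Calling cached_packages() on a machine without cached packages simply returns an empty list.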


apt-cache


apt-cache is a program for querying the apt database. The apt database is located at /var/lib/apt/lists/. This directory contains many large text files. What do they contain? Here is an example from the archive.ubuntu.com_ubuntu_dists_vivid_main_binary-i386_Packages file. Yes, the file name is indeed long.


You can see that the metadata of the emacs_46.1_all.deb package is contained here. This information is basically the same as the control contents in emacs_46.1_all.deb. apt-get uses this information to search for packages and to resolve dependencies.
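Because these list files are large, it helps to scan them one stanza at a time instead of loading everything at once. The following is a rough sketch of such a lookup, not what apt-cache really does internally: it assumes stanzas are separated by blank lines and start their fields with "Field: value", and the names iter_stanzas and find_package are invented for this example.

```python
def iter_stanzas(lines):
    """Yield each stanza (a list of lines) from an iterable of lines."""
    stanza = []
    for line in lines:
        line = line.rstrip("\n")
        if line.strip():
            stanza.append(line)
        elif stanza:
            yield stanza  # a blank line closes the current stanza
            stanza = []
    if stanza:
        yield stanza

def find_package(lines, name):
    """Return the fields of the first stanza for `name` as a dict, or None."""
    wanted = "Package: " + name
    for stanza in iter_stanzas(lines):
        if wanted in stanza:
            fields = {}
            for line in stanza:
                if not line[0].isspace():  # skip continuation lines
                    key, _, value = line.partition(":")
                    fields[key] = value.strip()
            return fields
    return None

# Example, assuming the list file named in this article exists on your system:
# path = "/var/lib/apt/lists/archive.ubuntu.com_ubuntu_dists_vivid_main_binary-i386_Packages"
# with open(path, encoding="utf-8") as packages:
#     print(find_package(packages, "emacs"))
```

Passing an open file works because a file object yields its lines one by one, so only a single stanza is held in memory at a time.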