ACC SHELL
<HTML>
<HEAD>
<TITLE>Backup module specification</TITLE>
</HEAD>
<BODY BGCOLOR=#FFFFFF>
<H1>Backup module specification</H1>
<P>
<I>Author: Ladislav Slezak <<A HREF="mailto:lslezak@suse.cz">
lslezak@suse.cz</A>></I>
</P>
<P>
Changes:
</P>
<P>
<UL>
<LI><B>2003-02-17:</B> Profiles</LI>
<LI><B>2003-02-06:</B> Star support (ACL backup), documentation update</LI>
<LI><B>2002-02-01:</B> Added default excluded directories and file systems</LI>
<LI><B>2002-01-03:</B> Multi volume problems with tar utility</LI>
<LI><B>2001-12-18:</B> Changed archive file internal structure</LI>
<LI><B>2001-11-29:</B> Changed solutions of some problems</LI>
<LI><B>2001-10-31:</B> Multi volume archive</LI>
<LI><B>2001-10-25:</B> Partition table storing updated</LI>
<LI><B>2001-10-15:</B> New options in <I>Backup settings dialog</I></LI>
<LI><B>2001-10-05:</B> Screenshots update</LI>
<LI><B>2001-09-26:</B> Detailed batch mode description</LI>
<LI><B>2001-09-06:</B> Initial version</LI>
</UL>
</P>
<HR>
<H2>Backup and system recovery solutions</H2>
<H3>Backup</H3>
<P>Utilities like dump, kbackup and others are indended for full system backup.
These utilities do not use rpm database to get changed files, but they support
incremental backup or client/server architecture or other features.</P>
<H4>YaST1</H4>
<P>YaST1 backup module compares file flags stored in rpm database with actual
file properties and backups only files which differ, the result is small backup
archive.</P>
<P>
How is backup done in YaST1:
<OL>
<LI>get list of installed packages</LI>
<LI>get list of files and modification flags for each installed package</LI>
<LI>select from list only modified files</LI>
<LI>remove duplicate entries from file list</LI>
<LI>find files which do not belong to any package (on demand):
iterate through directory content (start at root directory), for each:</LI>
<UL>
<LI>file - check whether it is in package files list, if no add it to the backup list</LI>
<LI>directory - check if it is in exclude directory list or it is on excluded filesystem,
if no then get recursive file list from it
</LI></LI></UL></LI>
<LI>display list of files to backup, user choose files from list</LI>
<LI>display archiving options</LI>
<LI>run <I>tar</I> to create backup archive</LI>
</OL>
</P>
<P>YaST1 can not restore files from backup archive. Restoration of system
is done by new system installation and manually updating files from backup
archive. There is no way how to check whether installed packages are same
as in backup time or if they are installed in same place (not relocated to
another directory). </P>
<H3>System recovery utilities</H3>
<P><A HREF="http://mkcdrec.ota.be"><B>Make CDROM recovery</B></A>
makes a bootable recovery ISO images, with full backup of system. System can be booted
from CD and complete recovered from backup. Others utilities only make bootable
floppy/CD with useful utilities, for example
<A HREF="http://freshmeat.net/redir/recoveryispossible!/8898/url_tgz/">rip</A> or
<A HREF="ftp://ibiblio.org/pub/Linux/system/recovery/zdisk-2.03.tar.gz">zdisk</A>.
<A HREF="http://www.stud.uni-hannover.de/user/76201/gpart">Gpart</A> utility
can guess lost partition table from disk content, this is useful if no partition backup is available.
</P>
<H2>Documentation</H2>
<P>YaST1 source, tar, fdisk, e2image and rpm manual pages,
<A HREF="http://www.linuxdoc.org/HOWTO/mini/Partition-Rescue/index.html">
Partition rescue mini HOWTO</A>,
<A HREF="http://www.rpmdp.org/rpmbook/node32.html">Maximum RPM - Verification</A>,
<A HREF="http://bugzilla.suse.de/show_bug.cgi?id=9556">Bugzilla entry</A></P>
<H2>Features</H2>
<P>List of features is available <A HREF="backup_features.txt">here</A>.</P>
<H2>Implementation</H2>
<H3>Backup</H3>
<P>Backup archive contains files which differ from files in RPM packages and files which
do not belong to any RPM package.</P>
<P>Changed files which belong to one package are stored to one subarchive with name
<UL>
<I><package_name>-<package_version>-<date_of_backup>-<num>.<ext></I>
</UL>
Currently <I>num</I> is 0 - this value is for future enhacement (backup archive
created by package update have similar structure, so it is possible to include
already created backup files to archive and this number can be used to differ
file names). File extension <I>ext</I> is one of '.tar.gz', '.tar.bz2',
'.tar' or '.star.gz', '.star.bz2', 'star' - depends on selected archiver (tar or star).</P>
<P>Some additional information will be stored in subdirectory <I>info</I> in the archive.
This directory should contain files:
<UL>
<LI><I>comment</I> - user comment, description of backup archive</LI>
<LI><I>date</I> - date of backup</LI>
<LI><I>hostname</I> - name of machine where archive was created</LI>
<LI><I>installed_packages</I> list of installed packages - output from <TT>rpm -qa</TT> command</LI>
<LI><I>files</I> list of all files present in the backup archive (useful for multiple volume archive)</LI>
<LI><I>packages_info</I> - list of files (with absolute path) and packages to which they belong
<P>
Format of <I>packages_info</I> file:<BR>
<PRE>Package: <I>package_name_with_version</I>
Installed: <I>installation_prefix</I>
<I>file1
file2</I>
Package: <I>package_name_with_version</I>
Installed: <I>installation_prefix</I>
<I>file1
file2
file3
...</I>
Nopackage:
<I>file1
file2
...</I>
</PRE>
</LI>
</UL>
</P>
<H4>Files not owned by any package</H4>
<P>Files not owned by any package are stored in NOPACKAGE subcharive. It is
posible to exclude directory or file system type from search. This allows to
exclude from backup /tmp directory or directories with mounted NFS file
system.</P>
<P>Default excluded directories are: /media, /windows, /tmp, /var/lock, /var/run, /var/tmp and
/var/cache. Default excluded file system is iso9660, ntfs and file systems with
"nodev" flag in /proc/filesystems file (e.g. nfs, proc, tmpfs...).</P>
<H4>Access Control Lists (ACLs)</H4>
<P>Backup module can store ACLs using star archiver - in this case subarchives
(with *.star extension) have exustar format and they are created with -acl flag
to store ACLs.</P>
<H4>System backup</H4>
<P>System backup will be stored in <I>system</I> subdirectory in the archive. This subdirectory can contain files:
<UL>
<LI><I>partition_table_<device>.txt</I> - partition table info obtained by <TT>sfdisk -d <device></TT> (human readable, can be used as input file for sfdisk - see sfdisk man page)</LI>
<LI><I>partition_table_<device></I> - partition table stored by <TT>dd if=/dev/<device> of=<file> bs=512 count=1</TT> (raw data)</LI>
<LI><I>e2image_<device>.tar.bz2</I> - compressed ext2 metadata obtained by <TT>e2image <device></TT></LI>
</UL>
</P>
<P>All package archives, files in <I>info</I> and <I>system</I> subdirectories are archived to one big archive by tar utility.</P>
<P>It is possible to store only list of changed files to file - script
<I>backup_archive.pl</I> outputs file names from <I>packages_info</I> file.
This list of files can be used to create another archive type.</P>
<H4>Autoinstallation profile</H4>
<P>Backup module also creates autoinstallation XML profile. Profile file name
is same as backup archive except extension which is ".xml".</P>
<P>Autoinstallation profile contains installation settings which can be used in automatic
installation withou user interaction.
</P>
<H4>Profiles</H4>
<P>Yast2 Backup module contains profile manager. After backup archive is
created it is possible to store entered settings to profile with specified
name.</P>
<P>When backup module find any stored profile after start it opens profile dialog.
Action which can be selected:
<UL>
<LI>don't use profile - start backup with default settings, standard workflow</LI>
<LI>use profile without changes - read settings from profile, start backup immediately</LI>
<LI>allow changes before use - read settings from profile, start standard workflow to allow changes</LI>
</UL>
</P>
<H4>Problems and solutions</H4>
<P>
<UL>
<LI>How to detect changed files? Most reliable detecting of changed files
is MD5 sum stored in RPM database. But computing MD5 sum of all files takes
long time. There is option to use MD5 sum or only use size and modification time flags
- this is faster, but less reliable.
</LI>
<LI>What if some package was updated after backup? For restoring must
be present package of the same version!<BR>
Solution: Store package version with list of changed files
</LI>
<LI>What if relocation was used in installation? (Package is reinstalled
in another place.)<BR>
Solution: Store installation prefix
</LI>
<LI>What if some file is owned by more packages?
File can be owned by multiple packages, but if in one packge it has different
modify time (content is equal) then this file will be unnecessary backed up.
(This is possible if MD5 sum is not used.)
<BR>
Solution: Get list of files which are owned by more than one package. If a file has
only different modify time in the RPM database, but MD5 sum and size tests passes
and it's in the duplicate list then do not backup it. This check takes some time,
but there should be a minimum of files which are owned by more than one package.
</LI>
<LI>Symbolic link problem:
If a file in RPM package is installed in directory which is symbolic link to another
directory then it seems like file in linked directory is not in any package (because it
was not reported by <EM>rpm -qal</EM> command), but <EM>rpm -qf</EM> tells
right owner.<BR>
Solution: Stat all files owned by packages, remember device and inode numbers.
While searching files not owned by any package check if it is in list of all
files. If not check if its device and inode is in the list.
</LI>
<LI>Updated/other packages:
If a package was updated or it isn't available on the installation sources it
is not enought to store only changed files because RPM file cannot be available
at restoration time (e.g. package update was modified, lost installation
sources...).<BR>
Solution: Get list of available packagest on the installation sources and backup
unavailable packages completly.
</LI>
</UL>
</P>
<H4>Multi volume archive</H4>
<P>Size of created archive can be larger than size of backup media. In this
case archive file can be manually splitted to more files, but for beginners
is more acceptable automatic splitting.</P>
<P>Tar archiver creates multiple volume archives if <I>-M</I> or <I>-L</I> is used.
Each volume is standalone archive - files can be extracted without need of
other volumes (except files which were splitted to more volumes).
</P>
<P>Each volume is named according to pattern <EM><volume_number>_<archive_name></EM>.
If user enters archive name <EM>/tmp/archive.tar</EM> then volume names are:
<EM>/tmp/01_archive.tar</EM>, <EM>/tmp/02_archive.tar</EM>, ...</P>
<P>There is small problem with tar and multi volume archives: Volume size is
multiply of block size, which is default 20 * 512 bytes = 10kiB. So if requested
volume size is 32kiB (with parametr -L 32) created volume will have size 40kiB.
There is option to set block size, but with parameter -b 1 (block size 1*512B)
or -b 2 tar exits on SIGSEGV. With -b 4 it seems that it works, so default
block size is set to 4*512 bytes. This was tested with tar version 1.13.18.</P>
<P>Solution is to create smaller volumes - if requested volume size is 35kiB
create volumes with size 34kiB (with 2kiB blocks).</P>
<H4>Architecture</H4>
<P>
Backup module has two parts: <I>backup scripts</I> and <I>YaST2 frontend</I>.
</P>
<P>
<I>Backup scripts</I> are written in Perl. First script (<EM>backup_search.pl</EM>)
collects files to backup, second script (<EM>backup_archive.pl</EM>)creates
archive file. Two scripts are required so that user can exclude some files from
backup after search. These scripts are independent on YaST2 and they can be used in
batch mode backup. In batch mode all parametres are entered on command line and
backup is done without any interaction with user. This feature can be used in
shell scripts to do backup automatically.
</P>
<P>
<I>YaST2 frontend</I> collects information from user, then it runs backup scripts and
displays progress of backup.
</P>
<H3>Restore</H3>
<P>YaST2 retore module will use some information about system stored in backup archive
and will allow simple restoring of files. Restoring partition table or restoring ext2 critical areas
are dangerous operations, therefore these operations should be done only manually by expert user.
Restoring partition table is described in
<A HREF="http://www.linuxdoc.org/HOWTO/mini/Partition-Rescue/x92.html#AEN95">
Partition rescue mini HOWTO - chapter 8.1</A>
Ext2 data are stored in image obtained by e2image utility, see e2image man page.
</P>
<P>More information about restoring can be found in Yast2 restore module documentation.</P>
<H2>Workflow</H2>
<H3>Overview</H3>
<P>This is backup workflow overview. (<A HREF="workflow.dia">source</A>)</P>
<P>
<IMG SRC="workflow.png">
</P>
<H3>Dialog details</H3>
<P>(Outdated Qt designer workflow dialogs are available <A HREF="ui.tar.bz2">here</A>.)</P>
<P><A NAME="archive_options"></A><B>Archive options</B> - Archive file name,
achive type and optional archive description are entered in this dialog.
<BR><IMG SRC="dialog01.png">
</P>
<P><A NAME="archive_file_options"></A><B>Archive file options</B> - Multi volume archive
options
<BR><IMG SRC="dialog02.png">
</P>
<P><A NAME="backup_options"></A><B>Backup options</B> -
Number of selected parts to backup affects number of dialogs in workflow.
<BR><IMG SRC="dialog03.png">
</P>
<P><A NAME="exclude_dir"></A><B>Exclude directory</B> - Directories can be excluded from search.
<BR><IMG SRC="dialog04.png">
</P>
<P><A NAME="exclude_fs"></A><B>Exclude file system</B> - Filesystems can be excluded from search.
<BR><IMG SRC="dialog05.png">
</P>
<P><A NAME="system"></A><B>System backup</B> - Partition table and
ext2 partitions can be selected here.
<BR><IMG SRC="dialog06.png">
</P>
<P><A NAME="searching_modif"></A><B>Searching modified files</B> - Progress bar is displayed
while searching modified files.
<BR><IMG SRC="dialog07.png">
</P>
<P><A NAME="searching_files"></A><B>Searching files</B> - Progress of searching files
is displayed here.
<BR><IMG SRC="dialog08.png">
</P>
<P><A NAME="file_selection"></A><B>File selection</B> - All found files are
displayed here.
<BR><IMG SRC="dialog09.png">
</P>
<P><A NAME="creating_archive"></A><B>Creating archive</B> -
Progess is displayed while creating archive.
<BR><IMG SRC="dialog10.png">
</P>
<P><A NAME="summary"></A><B>Backup summary</B> - Results
of backup procedure are displayed here.
<BR><IMG SRC="dialog11.png">
</P>
<P><A NAME="summary_details"></A><B>Summary details</B> - Detailed results are displayed here.
<BR><IMG SRC="dialog12.png">
</P>
</BODY>
</HTML>
ACC SHELL 2018