Vipul Kumar Bhardwaj (V.K.B) LINUX


  • 8/7/2019 Vipul Kumar Bhardwaj (V.K.B) LINUX


    UNIT I

    UNIX :-

    The Unix operating system was originally developed at Bell Laboratories, which was once a part of the telecommunications giant AT&T.

    UNIX has become a very powerful and popular multitasking and multiuser operating system for a wide variety of hardware platforms.

    UNIX is case sensitive, and its commands are conventionally written in lower case.

    UNIX does not contain an un-erase or undo option. UNIX has many of the same features as DOS, but they are given different names.

    History Of UNIX and LINUX :- In 1969, Ken Thompson and Dennis Ritchie wrote the first version of UNIX in PDP-7 assembler language. Around this time C was conceived by Dennis Ritchie, and in 1973 Dennis Ritchie and Ken Thompson rewrote the entire UNIX kernel in C, breaking away from the tradition of using assembly language for the same.

    Many vendors such as IBM, Sun and others purchased this source code of UNIX and developed their own versions of UNIX.

    In 1991 Linus Torvalds developed a UNIX-like kernel and called it LINUX.

    UNIX is a trademark administered by The Open Group, and UNIX refers to a computer operating system that conforms to a particular specification. The specification is mainly concerned with POSIX (Portable Operating System Interface), developed by the IEEE (Institute of Electrical and Electronics Engineers).

    Many UNIX-like systems are commercially available, such as IBM's AIX, HP's HP-UX and Sun's Solaris.

    Following are some of the characteristics shared by typical UNIX programs and systems :-

    1. Simplicity : UNIX utilities are very simple and, as a result, small and easy to understand. KISS, "Keep It Small and Simple", is the technique followed by UNIX.

    2. Focus : Each UNIX utility concentrates on one task, as per the needs and demands of the user.

    3. Reusable Components :

    4. Filters : Many UNIX applications can be used as filters, which means that the input given to the application comes out transformed as its output.

    5. Open File Formats : UNIX programs use configuration files and data files which are plain ASCII text or XML.
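    The filter idea can be illustrated with a minimal Python sketch. The function name upper_filter is an assumption made for this example; it simply copies text from one stream to another and transforms it on the way, which is what a UNIX filter does between its standard input and standard output.

```python
import io


def upper_filter(source, sink):
    """Copy every line from source to sink, converted to upper case."""
    for line in source:
        sink.write(line.upper())


# In-memory streams stand in for standard input and standard output here.
result = io.StringIO()
upper_filter(io.StringIO("unix\nlinux\n"), result)
print(result.getvalue(), end="")
```

    Calling upper_filter(sys.stdin, sys.stdout) instead would let the same function sit in the middle of a shell pipeline.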

    LINUX :-

    Q : What is a kernel? Illustrate its functioning in the UNIX environment.

    Linux is a freely distributed implementation of a UNIX-like kernel; a kernel is the core of the operating system.

    Linux was developed by Linus Torvalds at the University of Helsinki. This development was inspired by Andy Tanenbaum's MINIX, a small UNIX-like system.

    Linux is actually just a KERNEL, which can easily be installed; later on, various other freely distributed software programs can also be installed to make a complete Linux installation.

    Some of the most popular LINUX distributions for the Intel x86 family of processors are :

    Red Hat Enterprise Linux.
    Fedora, a community cousin of Red Hat.
    Novell SUSE Linux and its openSUSE variant.
    Ubuntu Linux.
    Slackware.
    Gentoo.
    Debian GNU/Linux.

    Linux is compatible with POSIX ("Portable Operating System Interface [for Unix]"). The first version of the LINUX KERNEL became available in 1991.


    File management, program management and user interaction are traditional features of an OS, but LINUX adds 2 more features, i.e. Multiuser & Multitasking.

    BASIC COMPONENTS OF LINUX ARE ::::::

    The KERNEL --- It is the core program that runs programs and manages hardware devices or resources.

    The ENVIRONMENT --- It provides the interface to the user, receives the commands from the user and sends those commands to the kernel for execution.

    The FILE STRUCTURE --- A file structure organizes the way files are stored on a storage device; here files are organized into directories.

    It is true that UNIX and most of its applications were written in the C language, but C is not the only option available to LINUX or UNIX programmers. Various languages available to the LINUX programmer :

    Ada, C, C++, Eiffel, Forth, Fortran, Icon, Java, JavaScript, Lisp, Modula 2, Modula 3, Oberon, Objective C, Pascal, Perl, PostScript, Prolog, Python, Ruby, Smalltalk, PHP, Tcl/Tk, Bourne Shell

    Linux applications are represented by two special types of files :-

    1) Executable Files : Programs that can be run directly by the computer; they correspond to .exe files on Windows.

    2) Scripts : They are collections of instructions for another program, an interpreter, to follow.

    UNIX uses a simple colon (:) character to separate the entries in the PATH variable, rather than the semicolon (;) that MS-DOS and Windows use.

    For Example : /usr/local/bin:/bin:/usr/bin:.:/home/neil/bin:/usr/X11R6/bin

    The above PATH variable contains entries for the standard program locations, the current directory (.), a user's home directory and the X Window System.

    Linux uses a forward slash (/) to separate directories, whereas Windows uses a backslash (\).
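    As a small sketch of how such a PATH value is used, the Python lines below split the example value, written as one colon-separated string, into its entries and search them in order. The helper name find_program is an assumption for this illustration; a real shell performs a similar search when a command name is typed.

```python
import os

# The example PATH value, written as one colon-separated string.
path_value = "/usr/local/bin:/bin:/usr/bin:.:/home/neil/bin:/usr/X11R6/bin"

# Entries are separated by ":" on UNIX and Linux.
entries = path_value.split(":")


def find_program(name, directories):
    """Return the first path in directories that holds an executable called name."""
    for directory in directories:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


print(len(entries), "entries; the current directory is entry number", entries.index(".") + 1)
```

    The order of the entries matters: the first directory that contains a matching executable wins.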

    CHARACTERISTICS OF LINUX OR FEATURES OF LINUX

    1. Multi-Tasking ::

    Linux supports true pre-emptive multitasking. All the processes run entirely independently of each other, without having to care about other processes.

    2. Multi-User ::

    Linux allows a number of users to work with the system at the same time. Linux is able to concurrently and independently execute several applications belonging to two or more users.

    3. Multi-Processing ::

    Linux runs in a multiprocessor environment, which means that the OS can distribute several applications across several processors.

    4. Architecture Independence ::

    Linux runs on almost all platforms that are able to process bits and bytes. This degree of hardware independence is achieved by very few other operating systems.

    5. Demand Load Executables ::

    Only those parts of a program which are actually required for execution are loaded into memory. Whenever a new process is created using fork(), memory is not required immediately; instead the memory of the parent process is used jointly by both processes.


    6. PAGING ::

    Despite the best efforts to use physical memory efficiently, it may happen that the memory is fully occupied. LINUX then looks for 4KB pages of memory which can be freed. Pages whose content has already been stored on hard disk are discarded.

    7. Dynamic CACHE For Hard disk :: Linux dynamically adjusts the size of the cache memory to suit the current memory usage situation. If no memory is available at a given time, the size of the cache is reduced and the freed memory is provided. Once the memory is released, the area of the cache is increased again.

    8. Shared Libraries :: Libraries are collections of routines needed by a program for processing data.

    9. Memory Protected Mode :: Linux uses the processor's memory protection mechanism to prevent a process from accessing memory allocated to the system kernel or other processes.

    10. Support For National Keyboards & Fonts ::

    11. Different File Systems :: Linux supports a variety of file systems. Some of the file systems supported by Linux are Ext2 and Proc.

    FILE STRUCTURES OF LINUX ::

    The file structure of any O.S. includes the arrangement of files & folders. Linux organizes files into a hierarchically connected set of directories. Each directory may contain a set of files or other directories. Because of the similarities to a tree, such a structure is often called a tree structure or a parent-child structure.

    The system directories present in the tree structure are :

    / = Begins the file system structure, called the root.
    /home = Contains users' home directories.
    /bin = Holds all the standard commands and the utility programs.
    /usr = Holds those files and commands used by the system; this breaks down into several sub-directories :
    /usr/bin = Holds the user oriented commands and the utility programs.
    /usr/sbin = Holds system administration commands.
    /usr/lib = Holds libraries for programming languages.
    /usr/doc = Holds LINUX documentation.
    /usr/man = Holds the manual (man) files.
    /usr/spool = Holds spooled files such as those generated for printing jobs and network transfers.
    /sbin = Holds system administration commands for booting the system.
    /var = Holds the files that vary, such as mailbox files.
    /dev = Holds file interfaces for devices such as terminals and printers.
    /etc = Holds system configuration files and any other system files.

    The following directories belong to the kernel source tree rather than to the root file system :

    fs/ = The virtual file system is in the fs directory.
    init/ = Contains all the functions needed to start the kernel, like start_kernel().
    net/ = Contains the implementation of various network protocols.
    arch/ = Architecture dependent code is held in the sub-directories of arch/.
    mm/ = Contains the Memory Management sources for the kernel.

    TEXT EDITORS :::::

    A text editor is used to create and manage text files and documents. The most popular of all the editors available is the VI (Visual) editor, along with its improved version VIM (Vi IMproved). One more editor preferred by programmers is the Emacs editor, which also provides a graphical interface to the programmer.

    In Linux the applications provided by the system are found in /usr/bin, and the applications added by system administrators for a specific host computer or a local network are found in /usr/local/bin or /opt.

    Functions of an editor =


    1. CREATING A FILE = A file is created through an editor: data is written and later stored on a storage medium, wherever needed.

    2. OPENING AN EXISTING FILE = An editor allows an existing file to be opened and the changes made to it to be saved. Multiple files can also be opened.

    3. COPYING AND PASTING TEXT = It should allow us to copy and paste text, because this facilitates document creation.

    4. SEARCHING FOR A TEXT =

    5. HANDLING LARGE AMOUNT OF DATA =

    PROCESS AND TASK STRUCTURE ::

    A process is usually defined as an instance of a program in execution; thus, if 16 users are using the vi editor at once, there are 16 separate processes, although they can share the same executable code. Each process has some unique information, which is stored in its task_struct. In the task_struct, the state field describes what is currently happening to the process. The possible process states are :

    1. TASK_RUNNING :- The process is either executing on the CPU or waiting to be executed.

    2. TASK_INTERRUPTIBLE :- The process is sleeping until some condition becomes true. Raising a hardware interrupt, releasing a system resource the process is waiting for, or delivering a signal are examples of conditions that might wake up the process, that is, put its state back to TASK_RUNNING.

    3. TASK_UNINTERRUPTIBLE :- As the name signifies, the process is sleeping and cannot be woken up by a signal; it continues only when the condition it is waiting for occurs.

    4. TASK_STOPPED :- Process execution has been stopped.

    5. TASK_ZOMBIE :- Process execution is terminated, but the parent process has not yet collected its exit status with wait(). The kernel cannot discard the data contained in the dead process's task_struct because the parent could need it.
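    From user space these states are visible as one-letter codes in the output of ps and in /proc/<pid>/status. The sketch below maps those letters to the kernel state names; the function name state_from_status and the sample text are assumptions made for the illustration.

```python
# One-letter codes shown by ps and /proc/<pid>/status, mapped to kernel state names.
STATE_NAMES = {
    "R": "TASK_RUNNING",
    "S": "TASK_INTERRUPTIBLE",
    "D": "TASK_UNINTERRUPTIBLE",
    "T": "TASK_STOPPED",
    "Z": "TASK_ZOMBIE",
}


def state_from_status(status_text):
    """Return the kernel state name found in text laid out like /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("State:"):
            letter = line.split()[1]
            return STATE_NAMES.get(letter, "unknown")
    return "unknown"


# Invented sample text; a real file contains many more lines.
sample = "Name:\tvi\nState:\tS (sleeping)\nPid:\t2269\n"
print(state_from_status(sample))
```

    On a Linux machine the same function can be fed the contents of /proc/self/status to see the state of the calling process.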

    Q : What is the process table in the Linux kernel?

    A : Every process occupies exactly one entry in the process table. In Linux, this is statically organized and restricted in size to NR_TASKS, which denotes the maximum number of processes.

    In older versions of the Linux kernel all the processes present could be traced by searching the task[] process table for entries. In the newer versions this information is stored in the linked lists next_task and prev_task, which can be found in the task_struct structure.

    The entry task[0] has a special significance in Linux. task[0] is the INIT_TASK, which is the first to be generated when the system is booted and has something of a special role to play.

    Q : What is an inode? How is it used for storage of regular files?

    A : All entities in Linux are treated as files. The information related to all these files is stored in an inode table on the disk. For each file there is an inode entry in the table. Inodes contain information such as the file's owner and access rights.

    struct inode {
    ..
    }

    The struct inode will contain :

    1. i_dev = This component is the description of the device on which the file is located.
    2. i_ino = This component identifies the file within the device.
    3. i_mode = This component holds the file type and the access rights.
    4. i_uid = This component shows the user id of the owner.
    5. i_gid = This component shows the group id.
    6. i_size = This component shows the size in bytes.
    7. i_mtime = This component shows the time of last modification.
    8. i_atime = This component shows the time of last access.
    9. i_ctime = This component shows the time of last modification to the inode.
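    Most of these inode fields can be inspected from a user program. The following Python sketch uses os.stat on a temporary file; the pairing of each st_ field with an i_ field in the printed labels is an informal correspondence for illustration, not a statement about the kernel's internal layout.

```python
import os
import stat
import tempfile

# Create a small temporary file so the example does not depend on any existing path.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b"hello inode\n")
    file_name = handle.name

info = os.stat(file_name)

print("device (i_dev)  :", info.st_dev)
print("inode  (i_ino)  :", info.st_ino)
print("mode   (i_mode) :", stat.filemode(info.st_mode))
print("owner  (i_uid)  :", info.st_uid)
print("group  (i_gid)  :", info.st_gid)
print("size   (i_size) :", info.st_size)
print("mtime  (i_mtime):", info.st_mtime)

os.remove(file_name)
```

    Note that the file name is not part of this information; it lives in the directory entry that points to the inode.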


    Q : What are interrupts? Define slow and fast interrupts.

    A : An interrupt is a special signal generated by a hardware device. Interrupts are used to allow the hardware to communicate with the Operating System. There are two types of interrupts in Linux, slow and fast ::

    1. Slow Interrupts -- Slow interrupts are the usual kind. After a slow interrupt has been processed, additional activities requiring attention are carried out by the system. For example, the timer interrupt.

    2. Fast Interrupts -- Fast interrupts are used for short, less complex tasks. While they are being handled, all other interrupts are blocked, unless they are explicitly enabled. A typical example is the keyboard interrupt.

    Q : What is the BOOTING process of the Linux system?

    A : The booting process for Linux is :

    1. There is something magical about booting a LINUX system. First of all, LILO (LInux LOader) finds the Linux kernel and loads it into memory.

    2. It then begins at the entry point start:. As the name suggests, this is assembler code responsible for initializing the hardware.

    3. Once the essential hardware parameters have been established, the processor is switched into Protected Mode by setting the protected mode bit in the machine status word.

    4. The code then initiates a jump to the start address of the 32-bit code for the actual operating system kernel and continues from startup_32.

    5. Once the initialization is complete, the first C function, start_kernel(), is called.

    6. All areas of the kernel are then initialized. The process now running is process 0; it generates a kernel thread which executes the init() function.

    7. The init() function carries out the remaining initialization. It starts the bdflush() and kswap daemons, which are responsible for synchronization of the buffer cache contents with the file system and for swapping.

    8. Then the system call setup is used to initialize file systems and to mount the root file system. After that, an attempt is made to execute one of the programs /etc/init, /bin/init or /sbin/init.

    Q : Give the DATA STRUCTURES IN LINUX.

    A : The main data structures in Linux are ::

    a) The task Structure.
    b) Process Table.
    c) Files & inodes.
    d) Queues & Semaphores.
    e) System Time & Timers.

    Q : Define the system calls getpid, nice, pause, fork, execve, exit, wait.

    A : SYSTEM CALL = System calls are part of the kernel. The system call interface is an example of an APPLICATION PROGRAMMING INTERFACE (API). An API is a set of calls with strictly defined parameters which allows an application to request access to a service. System calls differ from library functions and shell commands: a system call runs inside the kernel and can interact directly with the hardware, whereas a library function or shell command can reach the hardware only through system calls.

    Basically 3 steps are involved in running each process :

    1. fork() - Creates a copy of the process that invokes it.

    2. exec() - This system call overlays a process with a further program whose name is the argument of exec().

    3. wait() - The 3rd stage is wait(); it makes the parent process wait for the child process to exit.
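    The three steps can be tried out from Python, whose os module wraps the same system calls. This is only a sketch for a Linux or UNIX machine: the program started in the child (the Python interpreter itself, asked to exit with code 7) is an arbitrary stand-in chosen for the example.

```python
import os
import sys

pid = os.fork()                      # step 1: duplicate the calling process

if pid == 0:
    # Child: step 2, replace this process image with a new program.
    try:
        os.execv(sys.executable, [sys.executable, "-c", "raise SystemExit(7)"])
    finally:
        os._exit(127)                # reached only if execv itself failed
else:
    # Parent: step 3, wait for the child and collect its exit code.
    waited_pid, status = os.waitpid(pid, 0)
    exit_code = os.WEXITSTATUS(status)
    print("child", waited_pid, "exited with code", exit_code)
```

    fork() returns 0 in the child and the child's PID in the parent, which is how the two copies tell themselves apart.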

    getpid - The getpid call is a very simple call; it merely reads a value from the task structure and returns it.

    nice - The system call nice is a little more complicated; nice expects as its argument a number by which the static priority of the current process is to be modified.

    pause - A call to pause interrupts the execution of the program until the process is reactivated by a signal. This merely amounts to setting the status of the current process to TASK_INTERRUPTIBLE and then calling the scheduler.


    fork - The system call fork is the only way of starting a new process. This is done by creating an identical copy of the process that has called fork. Fork is a very demanding system call. All the data of the process have to be copied, and these can easily run to a few megabytes.

    execve - This system call enables a process to change its executing program. Linux supports a number of executable file formats, including the widely used COFF (Common Object File Format) and ELF (Executable and Linkable Format).

    exit - A process is always terminated by this kernel function call. It merely has to release the resources that have been claimed by the process and, if necessary, inform other processes.

    wait - The system call wait enables a process to wait for the end of a child process and interrogate the exit code supplied. Depending on the argument given, the wait call will wait for the specified child process.
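    Two of the simpler calls, getpid and nice, can be tried directly from Python's os module. This is a minimal sketch; passing an increment of 0 to nice is a choice made here so that the example only reports the current value and changes nothing.

```python
import os

pid = os.getpid()          # the ID of this process
parent = os.getppid()      # the ID of the process that started it
niceness = os.nice(0)      # an increment of 0 leaves the priority unchanged

print("pid", pid, "parent", parent, "nice value", niceness)
```

    A positive increment such as os.nice(5) would lower the priority of the calling process.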

    Q : What is the output of the command ps?

    A : The ps command shows which processes are running at any instant. Linux assigns a unique number to every process running in memory. This is called the process ID or simply PID.

    PID   TTY    TIME  COMMAND
    2269  tty01  0:05  sh
    2345  tty01  0:00  ps

    PID : Process ID.
    TTY : ID of the terminal from which the processes were launched.
    TIME : The CPU time that the process has used since it was launched.
    COMMAND : The names of the processes.
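    Because the ps output is a plain text table, it is easy to process further. The sketch below parses the sample listing into dictionaries keyed by column name; the function name parse_ps is an assumption made for this example.

```python
sample_output = """\
PID   TTY    TIME  COMMAND
2269  tty01  0:05  sh
2345  tty01  0:00  ps
"""


def parse_ps(text):
    """Turn ps-style text into a list of dictionaries keyed by column name."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        # Split into at most len(header) fields so a command with spaces stays whole.
        fields = line.split(None, len(header) - 1)
        rows.append(dict(zip(header, fields)))
    return rows


processes = parse_ps(sample_output)
for process in processes:
    print(process["PID"], process["COMMAND"])
```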

    Q : What are links? What is the difference between Hard Links & Symbolic Links?

    A : If a file needs to be referred to by different file names, so that it can be accessed from different directories, a link to the file is created with the help of the ln command.

    $ ln original-file-name link-name

    Hard Links & Symbolic Links :

    A link within one disk (file system) and one user environment is called a Hard Link. A hard link may in some situations fail when you try to link to a file in some other user's directory. A file in one file system cannot be hard-linked to a file in another file system.

    To resolve this problem symbolic links are used, created with ln -s. A symbolic link holds the pathname of the file to which it is linking. With a symbolic link we may link to a file in another user's directory that is located on another file system.
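    The difference between the two kinds of link can be observed with a short Python sketch that works inside a temporary directory. The file names are invented for the example; os.link and os.symlink correspond to ln and ln -s.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as directory:
    original = os.path.join(directory, "original.txt")
    hard = os.path.join(directory, "hard-link.txt")
    soft = os.path.join(directory, "soft-link.txt")

    with open(original, "w") as handle:
        handle.write("data\n")

    os.link(original, hard)      # like: ln original.txt hard-link.txt
    os.symlink(original, soft)   # like: ln -s original.txt soft-link.txt

    # A hard link is a second name for the same inode.
    same_inode = os.stat(original).st_ino == os.stat(hard).st_ino
    link_count = os.stat(original).st_nlink

    # A symbolic link is a separate file that stores a pathname.
    soft_is_link = os.path.islink(soft)
    soft_target = os.readlink(soft)

    # After the original name is removed, the hard link still reaches the data,
    # while the symbolic link points at a name that no longer exists.
    os.remove(original)
    hard_still_readable = os.path.exists(hard)
    soft_now_dangling = not os.path.exists(soft)

print(same_inode, link_count, soft_is_link, hard_still_readable, soft_now_dangling)
```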

    DEVICE DRIVERS ::::::

    A device driver is an interface between a device and the O.S. A device driver is software which operates hardware. There is a wide variety of hardware available for LINUX computers, and each piece of hardware has its own device driver. Without these, an operating system would have no means of input or output and no file system.

    Q : Explain character and block devices under Linux.

    A : Block Devices - Block devices are those which transfer data block-wise and provide the facility of random access. Block devices are divided into a specific number of equal-sized blocks, and each block has a unique number. Examples of block devices are RAM disks, hard disks, floppy disks, CD-ROMs etc.

    Character Devices - These devices, on the other hand, process data character by character and sequentially. Linux does not maintain a buffer area for them. Some character devices maintain their own buffer for internal block transfers, but these blocks are sequential in nature and cannot be accessed randomly. For example, an inkjet printer and a laser printer print line-wise and page-wise respectively, so the characters are stored in a buffer and, when the required limit is reached, the device sends the whole block of data for printing. Some character devices are printers, scanners, sound cards, monitors and the PC speaker.

    Q : In the context of Linux device drivers, write a short note on the following : Polling, Interrupt, Interrupt Sharing, Bottom Halves, Task Queues, DMA, setup, open, read, IOCTL, init, release, write, select.

    A :

    POLLING : In polling the driver constantly checks the hardware. The driver defines a timeout and continuously checks the hardware until the timeout limit is reached. When the timeout limit is exceeded, the timeout error handling produces the appropriate error message, for example "printer is out of paper" in the case of a printer.

    In polling mode processor time gets wasted, but it is sometimes the fastest way of communicating with the hardware.
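    The polling loop with a timeout can be sketched in user-space Python. Everything here is illustrative: FakePrinter is a hypothetical stand-in for a device status register, and a real driver would poll hardware from kernel code rather than sleep between checks.

```python
import time


def poll_until_ready(is_ready, timeout_seconds, interval_seconds=0.01):
    """Check is_ready() repeatedly; return True once ready, False on timeout."""
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        if is_ready():
            return True
        time.sleep(interval_seconds)
    return False


class FakePrinter:
    """Hypothetical device that reports ready after a fixed number of checks."""

    def __init__(self, checks_until_ready):
        self.remaining = checks_until_ready

    def ready(self):
        self.remaining -= 1
        return self.remaining <= 0


if poll_until_ready(FakePrinter(3).ready, timeout_seconds=1.0):
    print("device ready")

if not poll_until_ready(lambda: False, timeout_seconds=0.05):
    print("timeout: printer is out of paper")
```

    The wasted processor time mentioned above is visible in the loop: the caller does nothing useful while it keeps checking.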

    INTERRUPT : The use of interrupts is only possible if they are supported by the hardware. Here the device informs the CPU via an interrupt channel (IRQ) that it has finished an operation.

    For example, with a serial mouse, every movement sends data to the serial port, triggering an IRQ. The data from the serial port is read first by the handling ISR (interrupt service routine), which passes it through to the application program.

    INTERRUPT SHARING : Various hardware devices may use the same IRQ number. If different devices which use the same interrupt are used on the same PCI board, they conflict with each other. In this case interrupt sharing makes it possible to use both devices. For this, if one device is using the PCI bus, the second device waits for the bus to be freed.

    BOTTOM HALVES :

    TASK QUEUES : The task queue is a dynamic extension of the concept of bottom halves. Use of bottom halves is somewhat difficult because their number is limited to only 32, and some are already assigned to fixed tasks. Task queues allow a number of functions to be entered in a queue and processed one after another at a later time.

    Before a function can be entered in a task queue, a tq_struct structure must be created and initialized.
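    The idea of entering functions in a queue and running them later can be shown with a user-space analogy in Python. This is not the kernel interface: the names queue_task and run_task_queue and the use of a deque are assumptions for the sketch, with each queued pair playing the role of one tq_struct entry.

```python
from collections import deque

task_queue = deque()
log = []


def queue_task(function, data):
    """Enter a function and its argument in the queue for later execution."""
    task_queue.append((function, data))


def run_task_queue():
    """Process the queued functions one after another, in the order they were entered."""
    while task_queue:
        function, data = task_queue.popleft()
        function(data)


queue_task(log.append, "first")
queue_task(log.append, "second")
run_task_queue()
print(log)
```

    Unlike the fixed set of 32 bottom halves, such a queue can hold any number of entries.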

    DMA mode : Direct Memory Access, or DMA, is the hardware mechanism that allows peripheral components to transfer their I/O data directly to and from main memory without the need for the system processor to be involved in the transfer. Use of this mode is ideal for multitasking, as the CPU can take care of other tasks during the data transfer.

    In a DMA operation the data transfer takes place without CPU intervention; the data bus is directly driven by the I/O device and the DMAC (Direct Memory Access Controller).

    setup() : The setup function must initialize the hardware devices in the computer and set up the environment for the execution of the kernel program, although the BIOS has already initialized most hardware devices. It is often desirable to pass parameters to a device driver or to the Linux kernel in general. These parameters come in the form of a command line from the LInux LOader (LILO).

    init() - The init() function is only called during kernel initialization, but is responsible for important tasks. This function tests for the presence of a device, generates internal device driver structures and registers the device. The call to the init() function must be carried out in one of the following functions, depending on the type of device driver :

    Character Devices : chr_dev_init()
    Block Devices : blk_dev_init()
    SCSI Devices : scsi_dev_init()
    Network Devices : net_dev_init()

    The init() function is also the right place to test whether a device supported by the driver is present at all. This applies especially to devices which cannot be connected or changed during operation, such as hard disks.

    open() - The open function is responsible for administering all the devices and is called as soon as a process opens a device file. If only one process can work with a given device, open() must reject any further attempt to open it. When a device can be used by a number of processes at the same time, open() should set up the necessary wait queues.


    release() - The release function is only called when the file descriptor for the device is released. The tasks of this function comprise cleaning-up activities global in nature, such as clearing wait queues. For some devices it can also be useful to pass through to the device all the data still in the buffers.

    read() & write() - The read() and write() functions perform a similar task, that is, copying data from and to application code. read() is called when data is taken from a device and write() when data is sent to it; input devices such as the mouse and keyboard support only read operations, and output devices such as printers and monitors support only write operations.

    ioctl() - Each device has its own characteristics, which may consist of different operation modes and certain basic settings. It may also be that device parameters such as IRQs, I/O addresses and so on need to be set at run-time. ioctl calls usually only change variables global to the driver or global device settings.

    select() - The select() function checks whether data can be read from the device or written to it. If the device is free or the argument wait is NULL, the device will only be checked: if it is ready for the function concerned, select() returns 1, otherwise 0. If wait is not NULL, the process must be held up until the device becomes available.
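    The "only check, do not wait" behaviour can be seen from user space with the select system call, which ends up in the driver function described above. In this Python sketch a pipe stands in for a device; that substitution is an assumption made to keep the example self-contained.

```python
import os
import select

read_end, write_end = os.pipe()

# A timeout of 0 means "only check, do not wait".
readable, _, _ = select.select([read_end], [], [], 0)
ready_before = read_end in readable          # nothing has been written yet

os.write(write_end, b"x")
readable, _, _ = select.select([read_end], [], [], 0)
ready_after = read_end in readable           # one byte is now waiting

print(ready_before, ready_after)

os.close(read_end)
os.close(write_end)
```

    Leaving out the timeout argument would make the call block until the descriptor becomes ready, which corresponds to the case where wait is not NULL.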

    Q : Define paging under LINUX.

    A :

    1. The RAM in a computer has always been limited and, compared to fixed disks, relatively expensive.

    2. Particularly in multi-tasking operating systems, the limit of working memory is quickly reached. Thus it was not long before someone hit on the idea of offloading temporarily unused areas of primary storage (RAM) to secondary storage.

    3. The traditional procedure for this used to be the so-called swapping, which involves saving entire processes from memory to a secondary medium and reading them in again. This approach does not solve the problem of running processes with large memory requirements in the available primary memory. Besides this, saving and reading in whole processes is very inefficient.

    4. When new hardware architectures (VAX) were introduced, the concept of demand paging was developed.

    5. Under the control of the Memory Management Unit (MMU) the entire memory is divided up into pages, with only complete pages of memory being read in or saved as required.

    6. As all modern processor architectures, including the x86 architecture, support the management of paged memory, demand paging is employed by Linux.

    7. Pages of memory in the kernel segment cannot be saved, for the simple reason that the routines and data structures which read memory pages back from secondary storage must always be present in primary memory.

    8. Linux can save pages to external media in 2 ways :

    a. In the first, a complete block device is used as the external medium. This will typically be a partition on the hard disk.

    b. The second uses fixed length files in a file system for its external storage. The term swap space may refer to either a swap device or a swap file.

    9. Using a swap device is more efficient than using a swap file. In a swap device, a page is always saved to consecutive blocks, whereas in a swap file, the individual blocks may be given various block numbers depending on how the particular file system fragmented the file when it was set up. These blocks then need to be found via the swap file's inode. On a swap device, the first block is given directly by the offset for the page of memory to be saved or read in.
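    The arithmetic behind point 9 can be written down in a few lines of Python. This is a simplified sketch: the 4KB page size is the one mentioned in these notes, while the function names and the idea of numbering page slots from zero are assumptions for the illustration.

```python
PAGE_SIZE = 4096   # 4KB pages


def pages_needed(size_in_bytes):
    """Number of whole pages required to hold size_in_bytes (rounded up)."""
    return -(-size_in_bytes // PAGE_SIZE)


def swap_device_offset(page_slot):
    """Byte offset of a page slot on a swap device, where slots are consecutive."""
    return page_slot * PAGE_SIZE


print(pages_needed(10000), swap_device_offset(3))
```

    On a swap file no such direct formula exists, because the blocks of each page have to be looked up through the file's inode.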

    Working With Files ::::: ::

    In Linux everything is treated as a file, so besides a user's program files and data files there are also some special files, such as those containing information about directory contents or about the various input/output devices.

    Basically there are three types of files ::


    Ordinary Files - All files created by a user come under the ordinary category; these may be data files, program files or even executable files. Changes can be made to such files.

    Directory Files - Linux automatically creates a directory file whenever a directory gets created. This file has the same name as the directory and contains the information related to the files present in the directory. A directory file cannot be modified directly by a user, but is modified by the Linux Operating System whenever a new file is created in the directory.

    Special Files - Most of the system files in LINUX are special files; they are typically associated with I/O devices and can be found in the /dev directory.

    In Linux everything is a file, and every file has a name and some properties, or administrative information, i.e. the file's creation/modification date and its permissions. All these properties are stored in the file's inode. The system uses the inode number to reference every file; the name of the file is only for the user's sake.

    The inode number of a file can be seen with ls -i.

    The tilde (~) is the notation used for getting straight to the HOME directory. For another user, typing ~user gives that user's home directory. The /home directory is itself a subdirectory of the root directory, /, which sits at the top of the directory hierarchy and contains all the system's files and folders.

    The THREE device files found in both UNIX and LINUX are /dev/console, /dev/tty and /dev/null.

    1. /dev/console :- This device represents the system console. Error messages and diagnostics are often sent to this device. Each UNIX system has a designated terminal to receive the console messages.

    2. /dev/tty :- This special file is an alias for the controlling terminal of a process.

    3. /dev/null :- Unwanted output is redirected to this device; everything written to it is discarded.

    System Calls And Device Drivers :: Files and devices can be accessed using a small number of functions. These functions are known as system calls. At the heart of the operating system, the KERNEL, are a number of device drivers. The low-level functions used to access the device drivers, the system calls, include :-

    open : Open a file or device.
    read : Read from an open file or device.
    write : Write to a file or device.
    close : Close the file or device.
    ioctl : Pass control information to a device driver. Each driver has its own set of ioctl commands.
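    Python's os module exposes thin wrappers around these calls, so the sequence open, write, close, open, read, close can be tried directly. The sketch below is for a Linux or UNIX machine; the file name demo.txt and the texts written are invented for the example.

```python
import os
import tempfile

directory = tempfile.mkdtemp()
path = os.path.join(directory, "demo.txt")

# open: create the file for writing; the call returns a small integer file descriptor.
fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
written = os.write(fd, b"hello device\n")    # write: returns the number of bytes written
os.close(fd)                                 # close

fd = os.open(path, os.O_RDONLY)
data = os.read(fd, 100)                      # read: at most 100 bytes
os.close(fd)

# Writing to /dev/null succeeds, but the data is thrown away.
null_fd = os.open("/dev/null", os.O_WRONLY)
discarded = os.write(null_fd, b"unwanted output\n")
os.close(null_fd)

os.remove(path)
os.rmdir(directory)
print(written, data, discarded)
```

    The same four calls work on an ordinary file and on a device file, which is the point of treating devices as files.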

    /proc file system -----

    /proc/cpuinfo -> This file gives the complete details of the processors available.
    /proc/meminfo -> This file gives information about the memory usage.
    /proc/version -> This file gives information about the kernel version.

    These are files rather than commands; their contents can be displayed with cat, for example $ cat /proc/version.
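    Because the /proc entries are plain text, a program can read them like any other file. The following sketch parses text laid out like /proc/meminfo; the function name parse_meminfo and the sample numbers are invented for the illustration.

```python
def parse_meminfo(text):
    """Return a dictionary such as {"MemTotal": 2048000}; sizes are in kB."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            values[name.strip()] = int(fields[0])
    return values


# Invented sample in the layout of /proc/meminfo; real values differ per machine.
sample = "MemTotal:  2048000 kB\nMemFree:  512000 kB\n"
info = parse_meminfo(sample)
print(info["MemFree"] * 100 // info["MemTotal"], "percent free")
```

    On a Linux system the function can be given the real data with open("/proc/meminfo").read().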

    UTILITIES ::: Linux programs and sources are commonly distributed in a file whose name contains the version number, with an extension of .tar.gz or .tgz. These are gzipped TAR (TAPE ARCHIVE) files, also known as TARBALLS.

    $ tar cvf filename.tar files-to-be-tarred

    Now, to make it smaller, we may use a compression program, i.e. gzip.

    $ gzip filename.tar

    This will create a file filename.tar.gz.

    This tar.gz can be renamed to .tgz.

    $ mv filename.tar.gz filename.tgz

    Now to retrieve the files from the .tar.gz compression we can do the following ::


    $ mv filename.tgz filename.tar.gz

    $ gzip -d filename.tar.gz

    $ tar xvf filename.tar

    Or we may do this in a single line :::----

    $ tar xzvf filename.tgz

    This extracts the archive and, because of the v option, lists the contents of the tarred file as it goes.

    OPTIONS OF THE tar ZIPPER ::::

    c :: Creates a new archive.

    f :: Specifies that the target is a file rather than a device.

    t :: Lists the contents of the archive without actually extracting them.

    v :: (VERBOSE MODE) tar displays messages as it processes the files.

    x :: Extracts the files from an archive.

    z :: Filters the archive through gzip.
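    The options above can be tried out together in one round trip. The following is a small sketch only; the directory and file names (project, readme.txt, restore) are invented for the example, and the -C option, which tells tar to change directory first, is assumed to be available as it is in GNU tar.

```shell
# Scratch area with one small file to archive.
workdir=$(mktemp -d)
mkdir "$workdir/project"
echo "hello" > "$workdir/project/readme.txt"

# c = create, f = name of the archive file.
tar cf "$workdir/project.tar" -C "$workdir" project

# Compress the archive, then rename .tar.gz to .tgz.
gzip "$workdir/project.tar"
mv "$workdir/project.tar.gz" "$workdir/project.tgz"

# t = list the contents, z = filter through gzip.
listing=$(tar tzf "$workdir/project.tgz")

# x = extract, here into a separate directory.
mkdir "$workdir/restore"
tar xzf "$workdir/project.tgz" -C "$workdir/restore"
restored=$(cat "$workdir/restore/project/readme.txt")

echo "$listing"
echo "$restored"
```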

    RPM packages ::::: (Red Hat Package Manager)

    Each RPM package is stored in a file with an .rpm extension. Package files usually follow a naming convention with the following structure :

    Name-version-release.architecture.rpm
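    To make the convention concrete, the parts of a package file name can be pulled apart with the shell's parameter expansion. This is only a sketch: the file name below is a made-up example, and the splitting assumes that neither the version nor the release contains a hyphen.

```shell
# Hypothetical package file name, used only for illustration.
package="bash-5.1.8-6.el9.x86_64.rpm"

base=${package%.rpm}     # drop the .rpm extension
arch=${base##*.}         # text after the last dot
rest=${base%.*}          # everything before the architecture
release=${rest##*-}      # text after the last hyphen
rest=${rest%-*}          # name-version
version=${rest##*-}      # text after the last remaining hyphen
name=${rest%-*}          # what is left is the package name

echo "name=$name version=$version release=$release architecture=$arch"
```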

    KERNEL ARCHITECTURE :::::::::
    Most UNIX kernels are monolithic: each kernel layer is integrated into the whole kernel program and runs in KERNEL mode on behalf of the current process. Microkernel operating systems demand only a very small set of functions from the kernel, generally including a simple scheduler and an inter-process communication mechanism.

    Microkernel-oriented operating systems are, however, generally slower than monolithic ones.

    Linux was not designed on the drawing board but developed in an evolutionary manner and continues to develop today.

    Each and every function of the kernel has been altered and expanded again and again to get rid of the BUGS.

    The actual KERNEL provides only the necessary functionality like IPC (Inter Process Communication) and MM (Memory Management).

    The KERNEL is the heart of LINUX, and the basic functions of the KERNEL are :
    I/O Management
    Process Management
    Memory Management
    File Management

    [Figure: layered view, from top to bottom: USER, SHELL / APPLICATION, LINUX KERNEL]

    LINUX SHELL ----

    The computer understands the language of 0s and 1s, called BINARY language. In the early days, commands were given to the computer system in this language, but it was very difficult for people to give long commands and even to read and write them. So the Operating System has a special program called the SHELL, which acts as an interpreter between the KERNEL and the USER.

    The SHELL is a command line interpreter that executes commands read from the standard input device or from a file.

    The SHELL is not a part of the KERNEL but uses it for the execution of programs, the creation of files etc.

    Several SHELLS AVAILABLE IN LINUX ARE :::

    1. BASH (Bourne Again Shell) : It is the most common shell.

    2. CSH (C Shell) : Its language is similar to the C language.

    3. KSH (Korn Shell)

    4. TCSH

    For comparison, COMMAND.COM is the SHELL of MS-DOS, but it is not a powerful one.

    Normally shells are interactive, which means that the shell accepts commands from the user and executes them. If all the commands are instead stored in a file and the shell is told to execute that file, rather than the user entering the commands again and again, the file is known as a SHELL SCRIPT.

    SHELL SCRIPTS :- If we have a sequence of linux commands that are used frequently, we can store them in a file. It is then possible to have the shell read the file and execute the commands in it, and such a file is called a SCRIPT.

    A shell script allows input and output manipulation, variables, and powerful flow-of-control and iteration constructs for programming.

    Numerous shell scripts are loaded into the system by default in the folder /etc/rc.d; all these files are useful in the booting up of the system.

    A shell is basically a program that acts as the interface between the user and the Linux system, enabling the user to enter the commands that are to be executed. A linux shell resembles the Windows command prompt but is more useful than the latter. On linux it is quite feasible to have different shells for various users.

    In linux the standard shell that is always installed is /bin/sh, which on many distributions is bash (the GNU Bourne Again SHell) from the GNU suite of tools. The version of bash can be checked with the following command :-

    $ /bin/bash --version

    Whenever a user is created we can assign a shell to that user from the GUI mode, by simply creating the user from the user manager.

    Output from a command can be redirected by the use of the (>) sign,

    $ ls -l > output.txt

    The command above saves the output in the file output.txt rather than showing it on the screen. If the output should be added to the end of an existing file instead of overwriting it, the (>>) sign is used.

    $ ps >> output.txt

    The above command appends the output of ps to the end of output.txt.

    Executing the Shell Script ---

    1. To create the shell script we need to write it in a text file using any editor like VI.

    2. Execution of the script can then be done in two ways :

    a. At the command prompt, type

    $ bash filename

    This line tells the BASH shell to read the file and execute the script.

    b. At the command prompt, type

    $ chmod u+x filename
    $ ./filename
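    Both ways of running a script, together with the > and >> redirections, can be tried in one go. The sketch below is only an illustration: the script name greet.sh and its contents are invented, a here-document stands in for the editor, and it assumes that bash is installed and that the temporary directory allows files to be executed.

```shell
workdir=$(mktemp -d)
script="$workdir/greet.sh"

# Step 1: write the script into a text file (normally done in an editor).
cat > "$script" <<'EOF'
#!/bin/sh
echo "hello from the script"
EOF

# Step 2a: ask bash to read the file; > creates or overwrites output.txt.
bash "$script" > "$workdir/output.txt"

# Step 2b: make the file executable and run it directly; >> appends.
chmod u+x "$script"
"$script" >> "$workdir/output.txt"

# Count how many times the greeting was written to the output file.
line_count=$(grep -c 'hello from the script' "$workdir/output.txt")
echo "$line_count"
```

    The output file ends up with one line from each run, because the second redirection appends instead of overwriting.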

    VARIABLES ---
    In the bash shell variables do not need to be explicitly declared; they can be created at any point of time by a simple assignment of a value. There must be no spaces around the = sign.

    Syntax --- variablename=value

    REFERENCING VARIABLE ---
    The $ symbol is used to refer to the contents of a variable. Ex. To assign the value of one variable to another the syntax would be

    variable1=${variable2}

    ECHO COMMAND ---
    This command is used to display messages on the screen,

    Eg - $ echo "Hello World"

    This command displays the text enclosed between the double quotes. By default it displays the text and then puts a new line character after it.

    Expr COMMAND ---
    Most shells do not support numeric values. All variables are treated as character strings; therefore, the declaration a=24 means that a contains the two characters 2 & 4. With the help of the expr command we can apply mathematical rules to evaluate arithmetic expressions. For ex.

    a=4
    b=5

    $ expr $a + $b

    OR

    $ c=`expr $a + $b`

    Arithmetic tests that can be performed are

    -eq Equal To
    -ne Not Equal To
    -gt Greater Than
    -ge Greater Than Or Equal To
    -lt Less Than
    -le Less Than Or Equal To
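    The pieces above (assignment, referencing, echo, expr and the arithmetic tests) fit together as in the following short sketch. The variable names are arbitrary, and $( ) is used here as an equivalent of the backquote form of command substitution.

```shell
# Assignment: no spaces around the = sign.
a=4
b=5
greeting="Hello World"

# Referencing: copy the value of one variable into another.
copy=${greeting}

# expr treats the strings "4" and "5" as integers and prints their sum.
sum=$(expr "$a" + "$b")

# Arithmetic test: -lt means "less than".
if [ "$a" -lt "$b" ]; then
    comparison="a is less than b"
else
    comparison="a is not less than b"
fi

echo "$copy"
echo "$sum"
echo "$comparison"
```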

    Block Device ::
    Block special files, block device files or block devices correspond to devices through which the system moves data in the form of blocks. Linux makes use of a dynamic cache system which employs primary memory left unused by the kernel and the processes as a buffer for block devices.

    Some important data structures used by the BLOCK DEVICE layer are

    a) gendisk :-
    b) hd_struct :- This stores the information about a partition on the disk.
    c) block_device :-
    d) buffer_head :-
    e) bio :-
    f) bio_vec :-

    Q > EXPLAIN THE LINUX ARCHITECTURE

    Linux Architecture ::::

    1). Kernel :: The core of a linux system is the kernel, which controls the resources of the computer, allocating them to different users & tasks. It makes it easier for programs to interact with the hardware platforms. However, the user does not interact directly with the KERNEL; instead an interactive program called the SHELL is used.

    [Figure: layers of a Linux system: SHELL, UTILITIES, APPLICATIONS, HARDWARE]

    2). SHELL :: Linux has a simple user interface that provides the services a user demands. It is through the shell that the user interacts with the hardware, without being required to know the details of the hardware's internals. Some of the common shells used are :-

    i. BOURNE SHELL = This is the original command processor developed at the AT&T lab and named after its developer, Stephen Bourne. This shell is officially distributed with UNIX systems. The executable file name is sh.

    ii. C SHELL = This is another command processor, developed at the University of California by William Joy. The name C is given because the programming language used is similar to C in syntax. The executable file name is csh.

    iii. KORN SHELL = This command processor was also developed at the AT&T lab, by David Korn. This shell is a combination of both the C SHELL and the BOURNE SHELL. The executable file name is ksh.

    iv. RESTRICTED SHELL = Whenever a user is meant to have limited or no access to the linux server, the shell used is the Restricted Shell. This shell is mainly used for guest users who are not part of the system.

    v. BASH (BOURNE AGAIN SHELL) = This shell is an enhancement of the Bourne shell and is the default shell for most LINUX SYSTEMS. This shell has the capability of storing the history of the commands that were executed earlier. The executable file name is bash.

    vi. TCSH (TENEX C SHELL) = This shell is an enhancement of the C shell; like the C shell, it is not compatible with the Bourne shell. The executable file name is either csh or tcsh.

    vii. A SHELL = This command processor was developed by Kenneth Almquist and distributed with the Berkeley (BSD) versions of UNIX. It is said to be a lightweight clone of the Bourne shell. It is mainly used on computers which have very little memory.

    viii. Z SHELL = The Z shell has some of the best features of the Korn shell and has the largest number of utilities.
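    Which of these shells are actually installed differs from machine to machine. A small sketch for checking, using the executable names listed above (zsh is assumed here as the executable name of the Z shell); the loop only reports what it finds and makes no claim about any particular system.

```shell
installed=""
for shell_name in sh bash csh ksh tcsh zsh; do
    # command -v prints the path of an executable if it can be found.
    if path=$(command -v "$shell_name"); then
        installed="$installed $shell_name"
        echo "$shell_name -> $path"
    fi
done
echo "installed shells:$installed"
```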

    3). LINUX UTILITIES AND APPLICATION PROGRAMS ::

    These are the programs that the user requires on a day to day basis; they are invoked through the respective shells.

    VARIOUS COMMANDS USED IN LINUX :::

    USER LOGIN COMMANDS

    login = It logs into the system.
    logout = It logs out of the system.
    passwd = It is used for setting up a new password.
    who = It displays the currently logged in users.
    who am i = It displays the name of the user through which you are logged in.

    DIRECTORIES AND FILES COMMANDS

    mkdir = It creates a new directory.
    rmdir = It removes a directory.
    rm = It is used to remove a file.
    mv = It is used to move a file from a source to any destination.
    cd = This command is used for changing the directory.
    pwd = Print working directory.

    ls = This command is used for listing the files in a folder.
    cp = This command is used for copying a file.
    sort = This command is used for sorting the lines of a file.
    cat = This command lists the contents of a file.
    lp = This command is used to print the contents of a file through a printer.
    less = This command displays the contents of a file one screen at a time and also allows scrolling.
    history = This command shows the history of the commands that have been executed on the machine by a particular login.
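    A short session using the directory and file commands listed above might look like the following sketch. Everything happens inside a temporary directory, and the names notes, fruits.txt, copy.txt and renamed.txt are invented for the example.

```shell
workdir=$(mktemp -d)

mkdir "$workdir/notes"                  # create a new directory
cd "$workdir/notes" || exit 1           # change into it
pwd                                     # print the working directory

printf 'pear\napple\nmango\n' > fruits.txt
cp fruits.txt copy.txt                  # copy a file
mv copy.txt renamed.txt                 # move (rename) the copy
listing=$(ls)                           # list the files in the folder
sorted=$(sort fruits.txt)               # sort the lines of the file
first_sorted=$(echo "$sorted" | head -n 1)
cat renamed.txt                         # show the contents of a file

rm fruits.txt renamed.txt               # remove the files
cd "$workdir" || exit 1
rmdir notes                             # remove the now empty directory
```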

    INFORMATION COMMANDS

    learn = Self learning instructions about linux.
    apropos =
    man = This command gives detailed information about various commands and their usage.
    date = It prints and sets the date.
    cal = This command prints the calendar for any year.
    calendar = This command opens up the diary of appointments along with the reminder service.

    PROCESS MANAGEMENT

    ps = This command prints the status of various processes.
    kill = This command is used to terminate or cancel a process.
    wait = This command waits for the background processes to finish.
    sleep =
    batch = This command executes queued commands when the system load permits.
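    The process management commands can be combined as in this small sketch, which starts background processes with &, terminates one with kill and collects exit statuses with wait. sleep is used here only as a convenient process that does nothing for a while; the status 143 mentioned in the comment is the usual shell convention of 128 plus the signal number 15 (SIGTERM).

```shell
# Start a long-running background process and remember its process id.
sleep 30 &
long_pid=$!

# Terminate it, then collect its exit status with wait.
kill "$long_pid"
killed_status=0
wait "$long_pid" || killed_status=$?    # 128 + 15 for SIGTERM

# A background process that finishes on its own returns status 0.
sleep 1 &
finished_status=0
wait $! || finished_status=$?

echo "killed process status  : $killed_status"
echo "finished process status: $finished_status"
```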

    DEBUGGING COMMANDS

    cc = This command invokes the C compiler.
    f77 = This command invokes the FORTRAN compiler.
    lint = This command verifies the program.
    as = This command invokes the assembler.
    PAS ! = This command invokes the PASCAL compiler.
    BAS ! = This command invokes the BASIC compiler.

    Memory Management :::

    Q:: Define the architecture-independent memory model.
    A:: A typical computer today has at its disposal a number of levels of memory with different access times.

    The first level mostly consists of cache memory within the processor.
    A second level of memory is usually implemented using SRAM chips with a fast access time of around 20ns.
    In almost all cases the actual main memory consists of inexpensive DRAM chips with an access time of around 70ns.
    As far as programming is concerned, the cache levels are transparent once they have been initialized by the BIOS code. For this reason the cache levels are not mapped by the architecture-independent memory model, and the term physical memory is used to refer to RAM in general.

    Memory Management is primarily concerned with the allocation of main memory to requesting processes. Two important features of memory management functions are-

    1. Protection
    2. Sharing

    Some of the main issues related to the memory management are :

    Pages Of Memory :-

    The physical memory is divided into pages; the size of a memory page is defined by the PAGE_SIZE macro. For the x86 processor the size is set to 4KB, while the ALPHA processor uses 8KB.
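    The page size of the running system can be queried from the shell. A minimal sketch, assuming the getconf utility is installed; the second value is just arithmetic on the result.

```shell
# PAGESIZE is the standard getconf name for the size of a memory page in bytes.
page_size=$(getconf PAGESIZE)

# Number of pages needed to hold one mebibyte of memory.
pages_per_mib=$((1024 * 1024 / page_size))

echo "page size: $page_size bytes"
echo "pages per MiB: $pages_per_mib"
```

    On a typical x86 machine this reports 4096 bytes, which matches the 4KB mentioned above.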

    Virtual Address Space :-
    In the abstract memory model, the virtual address space is structured as a kernel segment plus a user segment. Code and data for the kernel can be accessed in the kernel segment, and the code and data for the process in the user segment.
    When the code is being processed, the segment selector is already set and only offsets are used. In the kernel, however, access is needed not only to data in the kernel segment but also to data in the user segment.

    MEMORY ADDRESS :
    Programmers casually refer to a memory address as a way to access the memory cells. On x86 microprocessors, we have three kinds of memory addresses

    1. Logical Addresses : This is an address which is included in the machine language instructions to specify the address of an operand or of an instruction. Each logical address consists of a segment and an offset that denotes the distance from the start of the segment to the actual address.

    2. Linear Addresses : This address is a single 32-bit unsigned integer that can be used to address up to 4GB, that is up to 2^32 memory cells. Linear addresses are usually represented in hexadecimal notation. Their values range from 0x00000000 to 0xffffffff.

    3. Physical Addresses : Physical addresses are used to address memory cells included in memory chips. They correspond to the electrical signals sent along the address pins of the microprocessor to the memory bus. Physical addresses are represented as 32-bit unsigned integers.

    CONVERTING THE LINEAR ADDRESS :
    The linear address needs to be converted into a physical address by either the processor or a separate MMU (Memory Management Unit). In the architecture-independent model this page conversion is a three-level process in which the linear address is split into four parts----

    a. The first part is used as an index into the page directory.
    b. The second part of the address serves as an index into a page middle directory.
    c. The third part is used as an index into the page table.
    d. The fourth part of the address gives the offset within the selected page of memory.

    Linux adopted a three-level paging model so that paging is feasible on 64-bit architectures.
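    The splitting of a linear address can be illustrated with plain shell arithmetic. The sketch below assumes the classic two-level layout of 32-bit x86 with 4KB pages, that is 10 bits of page directory index, 10 bits of page table index and 12 bits of offset, with the page middle directory effectively folded away; the address 0xC0101234 is an arbitrary example value.

```shell
# Example linear address (arbitrary value chosen for this sketch).
linear=$((0xC0101234))

# Top 10 bits: index into the page directory.
dir_index=$(( (linear >> 22) & 0x3FF ))

# Next 10 bits: index into the page table.
table_index=$(( (linear >> 12) & 0x3FF ))

# Lowest 12 bits: offset within the 4KB page.
offset=$(( linear & 0xFFF ))

echo "directory index: $dir_index"
echo "table index    : $table_index"
echo "page offset    : $offset"
```

    Putting the three parts back together as (dir_index << 22) | (table_index << 12) | offset gives the original address again.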

    The x86 processor only supports a two-level conversion of the linear address, while the Alpha processor supports three-level conversion because it supports linear addresses with a width of 64 bits.

    The three-level paging model defines three types of paging table :

    1. Page Global Directory-
    The Page Global Directory includes the addresses of several Page Middle Directories. It is of 12-bit length. The functions available for modifying the Page Global Directory are :-

    i. pgd_alloc() : Allocates a Page Directory.
    ii. pgd_bad() : Can be used to test whether the entry in the Page Directory is valid.
    iii. pgd_clear() : Deletes the entry in the Page Directory.
    iv. pgd_free() : Releases the page of memory used by the Page Directory.
    v. pgd_none() : Tests whether the entry has been initialized.

    2. Page Middle Directory-
    The Page Middle Directory includes the addresses of several Page Tables. It is of 13-bit length. The functions used for handling the Page Middle Directory are :

    i. pmd_alloc() : Allocates a Page Middle Directory to manage memory in the user area.
    ii. pmd_bad() : Tests whether the entry in the Page Middle Directory is valid.
    iii. pmd_clear() : Deletes the entry in the Page Middle Directory.
    iv. pmd_free() : Releases the Page Middle Directory for memory in the user segment.
    v. pmd_offset() : Returns the address of an entry in the Page Middle Directory to which the argument is allocated.
    vi. pmd_none() : Tests whether the entry in the Page Middle Directory has been set.

    3. Page Table-
    Each page table entry points to a page frame. It is of 25-bit length. The dirty attribute is set when the contents of the memory page have been modified. A page table entry contains a number of flags which describe the legal access modes to the memory page and its state. The page protections available are

    PAGE_NONE
    PAGE_SHARED
    PAGE_COPY
    PAGE_READONLY
    PAGE_KERNEL

    The following functions have been defined to manipulate the page table entries and their attributes:

    i. mk_pte() : Returns a page table entry generated from the memory address.
    ii. pte_alloc() : Allocates a new page table.
    iii. pte_clear() : Clears a page table entry.
    iv. pte_dirty() :
    v. pte_free() : Releases the page table.

    VIRTUAL ADDRESS SPACE :::::::
    The virtual address space of the Linux Operating system is divided into two segments-

    A. Kernel Segment
    B. User Segment

    For the virtual address space of a linux process, a distinction is therefore made between the kernel segment and the user segment.

    1) User Segment : In user mode, a process can access only the user segment. As the user segment contains the data and code for the process, it is different for every process, and this means in turn that the page directories, or at least the individual page tables, for the different processes must also be different. In the system call fork(), the parent process's page directories and page tables are copied for the child process. The system call fork() has an alternative, clone(): both system calls generate a new thread, but with clone() the old thread and the thread generated by clone() can fully share the memory.

    2) Virtual Memory : All linux systems provide a useful abstraction called virtual memory, which acts as a logical layer between the application memory requests and the hardware Memory Management Unit (MMU). Virtual memory has many purposes and advantages ::

    a. Several processes can be executed concurrently.
    b. It is possible to run applications whose memory needs are larger than the available physical memory.
    c. A process can execute a program whose code is only partially loaded in memory.
    d. Processes can share a single memory image of a library or program.
    e. Programs can be relocatable, that is, they can be placed anywhere in physical memory.
    f. Programmers can write machine-independent code, since they do not need to be concerned about physical memory organization.

    A virtual memory area is defined by the data structure vm_area_struct :

    struct vm_area_struct
    {
        struct mm_struct *vm_mm;    /* address space this area belongs to */
        unsigned long vm_start;     /* start address of the area */
        unsigned long vm_end;       /* end address of the area */
        /* further fields omitted */
    };

    vm_start and vm_end determine the start and end addresses of the virtual memory area managed by the structure.
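    On a running Linux system the virtual memory areas of a process can be inspected through /proc/PID/maps, where each line begins with the start and end address of one area, written in hexadecimal and separated by a hyphen. The sketch below reads the first area of the process that does the reading (here the head command); it assumes /proc is mounted, and the variable names are illustrative.

```shell
# First line of the memory map of the reading process itself.
first_area=$(head -n 1 /proc/self/maps)

# The first field has the form start-end.
range=${first_area%% *}
vm_start=${range%-*}
vm_end=${range#*-}

# Size of the area in bytes; the addresses are hexadecimal.
area_size=$(( 0x$vm_end - 0x$vm_start ))

echo "area $vm_start-$vm_end is $area_size bytes long"
```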

    3) System Call brk : The system call brk can be used to find the current value of the brk pointer (the end of the process's data segment) or to set it to a new value. If the argument is smaller than the pointer to the end of the process code, the current value of brk will be returned. Otherwise an attempt will be made to set a new value.

    4) Mapping Function : The C library provides three functions in the header file sys/mman.h

    #include <sys/mman.h>

    5) Kernel Segment : A linux system call is generally initiated by the software interrupt 0x80 being triggered. The processor then reads the gate descriptor stored in the interrupt descriptor table.

    Access to the user segment can be made using the put_user() and get_user() functions mentioned earlier.

    Q: DEFINE STATIC AND DYNAMIC MEMORY ALLOCATION IN THE KERNEL SEGMENT ???
    A: Static Memory Allocation in the kernel segment ---

    In the system kernel it is necessary to allocate memory for the kernel process.

    Before a kernel generates its first process when it is run, it calls initialization routines for a range of kernel components.

    These routines, called from start_kernel(), are able to reserve memory in the kernel segment.

    Dynamic Memory Allocation in the kernel segment ---

    The basic functions used for dynamic memory allocation are kmalloc() and kfree().
    The kmalloc() function attempts to reserve the memory.
    The memory reserved in this way can be released again by the function kfree().
    __get_free_pages() -- If no free pages are available, the function __get_free_pages() is called, which copies other pages to secondary memory and so frees pages for the request. In the linux kernel the function __get_free_pages() can only be used to reserve contiguous areas of memory.

    kmalloc() originally reserved at most one page of memory; this situation was improved by the function vmalloc() and its counterpart vfree().

    The advantage of the vmalloc() function is that the size of the area of memory requested can be better adjusted to actual needs than when using kmalloc().

    Q: DEFINE THE UPDATE AND BDFLUSH PROCESSES ??
    A: The update process is a Linux process which at periodic intervals calls the system call bdflush with an appropriate parameter. The interval used by update as a default under Linux is five seconds.
    bdflush is implemented as a kernel thread and is started during kernel initialization.

    SYSTEM CALL ---
    A system call is the transition of a process from user mode to system mode.

    There are four main system calls that are mostly used :
    i. The system call fork()
    ii. The system call execve()
    iii. The system call exit()
    iv. The system call wait()
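    The shell itself relies on these four calls whenever it runs something, so their effect can be observed without writing C. In the sketch below a background subshell stands for a forked child, sh -c stands for a fork followed by execve of a new program, and the exit statuses 3 and 7 are arbitrary values chosen so that they can be recognised afterwards.

```shell
# fork + exit: the subshell is a child process that exits with status 3.
( exit 3 ) &
child_pid=$!

# wait: the parent collects the exit status of that child.
child_status=0
wait "$child_pid" || child_status=$?

# fork + execve: the shell starts a new program (another sh) and waits for it.
exec_status=0
sh -c 'exit 7' || exec_status=$?

echo "status of forked child : $child_status"
echo "status of exec'd program: $exec_status"
```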

    UNIT II (IPC- INTER PROCESS COMMUNICATION)

    Q: Define IPC.
    A:

    Race Condition : Many processes may require one resource, but it is important to make sure that the resource is used by only one process at a time. For ex. a printer is to be used by one process at a time. We use IPC basically to eliminate the Race Condition. The Linux IPC facility provides many methods for multiple processes to communicate with each other.

    Features of IPC (Inter Process Communication) are :-

    Resource Sharing :
    If processes are meant to share a resource (let's say a printer), it is important to make sure that no more than one process is accessing the resource, i.e. sending data to the printer, at any given time.
    If different processes send data at the same time a race condition arises, and communication between the processes must prevent it. Eliminating this race condition is only one possible use of Inter-Process Communication.

    Synchronization In The Kernel :
    As the kernel manages the system resources, access by processes to these resources must be synchronized.
    A process will not be interrupted by the scheduler so long as it is executing a system call.

    Connection-Less Data Exchange :
    In connection-less data exchange a process sends data packets, which may be given a destination address or a message type, and leaves it to the infrastructure to deliver them.

    Connection-Oriented Data Exchange :
    In connection-oriented data exchange the two parties to the connection must set up a connection before communication can start.

    Available methods for Connection-Oriented Data Exchange are :-
    1) Pipes
    2) Named Pipes (FIFO)
    3) Domain Sockets
    4) Stream Sockets
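    The first two methods in the list, pipes and named pipes, can be tried directly from the shell. A minimal sketch: the | operator creates an anonymous pipe between two processes, and mkfifo creates a named pipe that unrelated processes can open by its file name. The name channel and the message text are invented for the example.

```shell
# Anonymous pipe: printf writes into the pipe, sort reads from it.
piped=$(printf 'b\na\n' | sort | head -n 1)

# Named pipe (FIFO): a special file that two processes open by name.
fifo_dir=$(mktemp -d)
mkfifo "$fifo_dir/channel"

# The writer runs in the background and blocks until a reader opens the FIFO.
echo "hello through the fifo" > "$fifo_dir/channel" &
writer_pid=$!

# The reader receives everything the writer sent.
received=$(cat "$fifo_dir/channel")
wait "$writer_pid"

echo "$piped"
echo "$received"
rm -r "$fifo_dir"
```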

    Synchronization In The Kernel ::
    Because the kernel manages the system resources, access by processes to these resources must be synchronized. Normally a process will not be interrupted by the scheduler as long as it is executing a system call. This only happens if it blocks or if it calls schedule() to explicitly allow the execution of other processes. In kernel programming it should be remembered that functions like __get_free_pages() & down() can block a process.
    Processes in the kernel can be interrupted by interrupt handling routines: this can result in Race Conditions even if the process is not executing any functions that can block.
    The basic synchronization mechanism in multiprocessor systems, also used in other Operating Systems, is called the spin lock. These locks carry out the mutual exclusion of processes in the kernel.
    The critical section can only be executed by the process that is in possession of the spin lock; this concept is called Mutex. The process of testing and setting the lock variable is atomic. If the spin lock cannot be set, the processor waits in a loop until the lock variable is released again. The wait for the release of the lock variable is known as busy waiting.
    The spin lock is thus an atomic synchronization mechanism in the Linux kernel.
    Read & write locks are an alternative to spin locks.
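    The idea of an atomic test-and-set combined with busy waiting can be imitated in user space, purely as an analogy and not as the kernel's spin lock implementation. In the sketch below mkdir plays the role of the atomic operation, since creating a directory either succeeds or fails as a whole, and two background processes increment a shared counter file 20 times each. The function and file names are invented for the example.

```shell
lock_dir="$(mktemp -d)/lock"
counter_file=$(mktemp)
echo 0 > "$counter_file"

increment_counter() {
    i=0
    while [ "$i" -lt 20 ]; do
        # Busy waiting: retry until the atomic mkdir succeeds.
        until mkdir "$lock_dir" 2>/dev/null; do :; done

        # Critical section: only the lock holder reads and updates the counter.
        n=$(cat "$counter_file")
        echo $((n + 1)) > "$counter_file"

        # Release the lock.
        rmdir "$lock_dir"
        i=$((i + 1))
    done
}

# Two processes compete for the same counter.
increment_counter &
pid1=$!
increment_counter &
pid2=$!
wait "$pid1" "$pid2"

total=$(cat "$counter_file")
echo "$total"
```

    Without the lock, the two processes could read the same old value and one of the increments would be lost, which is exactly the race condition described above.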

    IMPLEMENTATIONS OF IPC in LINUX take different forms ::
    1) Communication By Files
    2) Pipes
    3) System V IPC
        a. Semaphores
        b. Message Queues
        c. Shared Memory

    SYSTEM V IPC --- IPC is an abbreviation that stands for Inter Process Communication. The classical forms of inter-process communication

    Semaphores
    Message Queues
    Shared Memory

    were implemented in a special variant of UNIX. These were later integrated into System V and are known as System V IPC. System V IPC denotes a set of system calls that allows a user mode process to:

    Synchronize itself with other processes by means of semaphores.
    Send messages to other processes or receive messages from them through message queues.
    Share a memory area with other processes through shared memory.

    The IPC data structures are created dynamically when a process requests an IPC resource; such a resource may be used by any process, including those that do not share the parent process that created the resource.
    The access permissions to the resources are managed by the kernel in the structure ipc_perm.

    A. Semaphores :: Semaphores are counters used to provide controlled access to shared data structures for multiple processes. The value of a semaphore is positive if the protected resource is available, and is negative or zero if the protected resource is currently not available. A process that wants

    B. Message Queues ::

    C. Shared Memory ::-- Shared memory is the fastest form of IPC. It is also the most useful IPC mechanism, as it allows two or more processes to access some data structures by placing them in a shared memory segment.
    The shmget() function is invoked to get the IPC identifier of a shared memory segment, optionally creating it if it does not already exist. A drawback of shared memory is that the processes need to use additional synchronization mechanisms to ensure that race conditions do not arise.

    Q: Define the system call ptrace.
    A: Execution tracing is a technique that allows a program to monitor the execution of another program. The traced program can be executed step-by-step, until a signal is received, or until a system call is invoked, and its memory can be read & modified. Execution tracing is widely used by debuggers, together with other techniques like the insertion of break points in the debugged program and run time access to its variables. In linux execution tracing is performed through the ptrace() system call.

    The ptrace system call provides a mechanism by which a parent process may observe and control the execution of another process.

    Before running execve the child calls ptrace with the first argument equal to PTRACE_TRACEME. This tells the kernel that the process is being traced.

    In the system call ptrace() the commands can be :
    PTRACE_TRACEME == This command starts execution tracing for the current process. It mainly indicates that this process is to be traced by its parent.
    PTRACE_ATTACH == This command makes a process the child process of the calling process.
    PTRACE_PEEKTEXT == This command reads a 32 bit value from the text segment at the location addr in the child's memory.
    PTRACE_PEEKDATA == This command reads a 32 bit value from the data segment at the location addr in the child's memory.
    PTRACE_POKETEXT == This command writes a 32 bit value to the text segment at the location addr in the child process's memory.
    PTRACE_POKEDATA == This command writes a 32 bit value to the data segment at the location addr in the child process's memory.
    PTRACE_POKEUSR ==
    PTRACE_CONT == This command resumes execution.
    PTRACE_KILL == This command kills the traced process by sending SIGKILL to the child process.

    When a monitored event occurs, the traced program is stopped and a SIGCHLD signal is sent to its parent. When the parent wishes to resume the child's execution, it can use one of the PTRACE_CON

    Debuggers such as gdb are based on the ptrace() system call. Since the ptrace() system call is dependent on the processor architecture, it is defined in the file arch/i386/kernel/ptrace.c.
    The ptrace() system call can be called through the function

    long ptrace(enum __ptrace_request request, pid_t pid, void *addr, void *data)

    ptrace() is called with FOUR arguments, as shown above. The first argument determines the behavior of ptrace and how the other arguments are used. The request can be any of the following: PTRACE_TRACEME, PTRACE_ATTACH, PTRACE_KILL, PTRACE_PEEKTEXT, PTRACE_PEEKDATA, PTRACE_POKETEXT, PTRACE_POKEDATA, PTRACE_CONT.

    UNIT III (LINUX FILE SYSTEM)

    The actual representation of data in LINUX memory works out the same Linux sticks closely to itsmodel unix because the management structure for the file system are very similar to the logicalstructure of a unix file system. Older Linux Kernels the structure still managed in a static table. Witthe introduction of modules it became desirable to load new file system after Linux system had

    started running.I. Every Operating System has its own file system and each of these naturally claims to be fas

    , better and more secure than its predecessors.II. Linux retains UNIXs standard file system model.III. The large number of the file systems supported has been undoubtedly one of the main reaso

    why linux has gained acceptance so quickly, as it becomes difficult for a user to convert thedata into the or as per the new file system.

    IV. The range of file systems supported is made possible by the unified interface to the LINUXkernel. This interface is the Virtual File System.

    V. The Linux Virtual File System has been designed around the object oriented principles, andhas 2 components :

    a. A set of definitions that define what a file object is allowed to look like.

    b. A layer of software to manipulate the objects.VI. The three main object types defined by the Virtual File System are the

    a. Inode Objectb. The File Object Structuresc. The File System Object

    VII.Every object of one of these types contains a pointer to a function table.VIII.The function table lists the address of the actual functions that implement various operation

    for those particular object.IX. Thus, a VFS software layer can perform an operation on one of these objects by calling the

    appropriate function from the function table, without having to know in advance exactly withwhat kind of an object it is dealing.

    X. The VFS doesnt know and even doesnt care that whether an inode represents a networked

    file, disk file, socket file etc.XI. The file system object represents a connected set of files that forms a self contained directo

    hierarchy and its main responsibility is to give access to inodes.XII.The Virtual File System identifies every inode by a unique file system inode number and it

    finds the inode corresponding to a particular inode number by asking the file system object treturn the inode with that number.

    XIII. The inode and file objects are the mechanisms used to access files. An inode object represents the file as a whole, and a file object represents a point of access to the data in the file. A process cannot access an inode's data contents without first obtaining a file object pointing to the inode. The file object keeps track of where in the file the process is currently reading or writing, so that sequential file I/O can be followed.

    XIV. VIRTUAL FILE SYSTEM

    [Figure: several processes running in user mode, above the boundary to system mode.]
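
    The function-table dispatch described in items VII to XIII can be sketched in a few lines. This is only a toy model in Python, not kernel code: the names Inode, FileObject, FileSystem, DISK_OPS, ZERO_OPS and vfs_read are all hypothetical and chosen for illustration. It shows that the generic layer calls through the table without knowing what kind of object it has, and that each file object keeps its own position.

```python
# Toy model of VFS-style dispatch. All names here are illustrative
# assumptions, not identifiers taken from the Linux kernel.

class Inode:
    """Represents a file as a whole and points to its function table."""
    def __init__(self, number, data, ops):
        self.number = number
        self.data = data
        self.ops = ops          # the "function table" for this object


class FileObject:
    """Represents one point of access to the data of an inode."""
    def __init__(self, inode):
        self.inode = inode
        self.pos = 0            # current position for sequential I/O


class FileSystem:
    """A connected set of files; its main job is to hand out inodes."""
    def __init__(self):
        self._inodes = {}

    def add(self, inode):
        self._inodes[inode.number] = inode

    def get_inode(self, number):
        return self._inodes[number]


def disk_read(file, count):
    # Ordinary file: return stored bytes and advance the position.
    chunk = file.inode.data[file.pos:file.pos + count]
    file.pos += len(chunk)
    return chunk


def zero_read(file, count):
    # Device-like object: produce zero bytes on demand.
    file.pos += count
    return b"\0" * count


DISK_OPS = {"read": disk_read}
ZERO_OPS = {"read": zero_read}


def vfs_read(file, count):
    """Generic layer: look up the operation in the table and call it."""
    return file.inode.ops["read"](file, count)
```

    Two file objects opened on the same inode read independently, because the position lives in the file object and not in the inode.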

    Each file system starts with a BOOT block. This block is reserved for the code required to boot the operating system.

    Since a file system should usually be able to exist on any block-oriented device, and in principle it always has the same structure on each device, the boot block is present whether or not the computer is booted from the device.

    All the information which is essential for managing the file system is held in the superblock.

    The superblock is followed by a number of inode blocks containing the inode structures for the file system. The remaining blocks of the device provide the space for the data. These data blocks thus contain ordinary files along with the directory entries and the indirect blocks.

    The file system is the most visible aspect of an operating system; it provides the mechanism for online storage of, and access to, both the data and the programs of the operating system.

    Each file system starts with a boot block, which is required to boot the operating system.

    The range of file systems supported is made possible by the unified interface to the Linux kernel, the Virtual File System Switch (VFS). The VFS is a kernel software layer that handles all the system calls related to a standard Linux file system; its main strength is providing a common interface to several kinds of file systems.

    The common file model consists of the following structure types:

    1) Mounting

    2) The superblock structure

    3) The inode structure

    4) The file structure

    Mounting : Before a file can be accessed, the file system containing the file must be mounted.

    A file system is mounted either by the system call mount or by the function mount_root().

    The function mount_root() mounts the first (root) file system. It is called by the system call setup, which is invoked just once, after the init process has been created by the kernel function init().

    Every mounted file system is represented by a super_block structure.

    The Superblock Structure : (SUPERBLOCK OPERATIONS)

    The superblock structure provides the functions for accessing the file system. The functions in the super_operations structure serve to read and write an individual inode, to write the superblock and to read file system information. This means that the superblock operations contain functions to transfer the specific representation of the superblock and inode on the data media to their general form in memory and vice versa. As a result, this layer completely hides the actual representations.

    In each block group the superblock is followed by the block group descriptors, which provide information on the block groups. Each block group is described by a 32-byte descriptor.

    [Figure: the Virtual File System layered above the individual file systems (Ext, proc, minix, msdos), the buffer cache and the device. On-disk layout: boot block, superblock, inode blocks.]

    All the information which is essential for managing the file system is held in the superblock, and every mounted file system is represented by a super_block structure. The superblock is initialized by the function read_super() in the VFS. It contains information on the entire file system, such as the block size, the access rights and the time of the last change. The superblock also holds a reference to the file system's root inode.

    Some of the important operations on the super_block structure are as follows:

    write_super() -- This function saves the information of the superblock.

    put_super() -- This function is called by the VFS when unmounting the file system.

    read_inode() -- This function initializes the inode structure.

    notify_change() -- This function acknowledges the changes made to the inode via system calls.

    write_inode() -- This function saves the inode structure.

    The inode structure :

    Some possible operations on the inode structure:

    create() -- Creates a new inode for a file.

    lookup() -- Searches for the inode of a given file.

    link() -- Sets up a hard link.

    unlink() -- Deletes the specified file in the specified directory.

    symlink() -- Creates a symbolic link.

    In memory, the inodes are managed in two ways. First, they are managed in a doubly linked circular list starting with first_inode, which is accessed via the entries i_next and i_prev. The complete list of inodes is scanned through in the following way.

    Struct inode *next* file

    { next = file1;For(i=0;i


    The main difference between Ext2fs and ffs (Fast File System) lies in the disk allocation policy.

    In ffs the disk is allocated to files in blocks of 8 KB, with blocks being subdivided into fragments of 1 KB to store small files.

    In contrast, Ext2fs does not use fragments at all but performs all its allocation in smaller units. The default block size on Ext2fs is 1 KB, although 2 KB and 4 KB blocks are also supported.

    As Linux was initially developed under MINIX, the MINIX file system was the first Linux file system. It restricted partitions to a maximum of 64 MB and filenames to no more than 14 characters, and hence the search for a better file system started. The result was the Ext file system, the first to be designed for Linux. The Ext file system allowed partitions of up to 2 GB and filenames of up to 255 characters; it included several significant extensions but offered unsatisfactory performance. At this time the Second Extended File System (Ext2) was introduced, in 1994; it was efficient as well as robust.

    Significant features of Ext2 are:

    1) Block Fragmentation :: System administrators usually choose large block sizes for accessing recent disks. As a result, small files stored in large blocks waste a lot of disk space. This problem is solved by allowing several files to be stored in different fragments of the same block.

    2) Access Control Lists :: Instead of classifying the users of a file under three classes Owner,Group and others, an ACCESS CONTROL LIST(ACL) is associated with each file to specify theaccess rights for any specific users or a combination of users.

    3) Handling Of Encrypted and Compressed Files :: This file system includes a new option which can be specified while creating a file and which allows users to store compressed or encrypted versions of the file on the disk.

    4) Logical Deletion :: An undelete option lets users easily recover the data of a deleted file if needed.

    STRUCTURE OF THE Ext2 FILESYSTEM

    The first block in any Ext2 partition is never managed by the Ext2 file system, since it is reserved for the partition boot sector. The rest of the Ext2 partition is split into block groups. Block groups reduce file fragmentation, since the kernel tries to keep the data blocks belonging to a file in the same block group if possible. Each block in a block group contains one of the following pieces of information:

    A copy of the file system's superblock.

    A copy of the group of block group descriptors.

    A data block bitmap.

    A group of inodes.

    An inode bitmap.

    A chunk of data belonging to a file, i.e. a data block.

    An Ext2 disk superblock is stored in an ext2_super_block structure, which contains the number of inodes, the file system size in blocks, the number of reserved blocks, the free blocks counter, the free inodes counter, the block size, the fragment size and various other information.

    EXTENSIONS OF THE EXT2 FILESYSTEM

    Ext2fs has additional file attributes beyond those which exist in standard UNIX file systems:

    a) EXT2_SECRM_FL :- If a file has this attribute, its data blocks are first overwritten with random bytes before they are released via the truncate function.

    b) EXT2_UNRM_FL :- This attribute will eventually be used to implement the restoration of deleted files.

    c) EXT2_SYNC_FL :- If a file has this attribute, all write requests are performed synchronously, that is, they are not delayed by the buffer cache.

    d) EXT2_APPEND_FL :- Files with this attribute cannot be deleted, renamed or relinked, but data may still be appended to them.

    DIRECTORIES IN Ext2 FILE SYSTEM

    In this file system directories are administered using a singly linked list.

    It is possible for a directory entry to be longer than is required to store the filename.

    An entry is deleted by setting the inode number to zero and removing the directory entry from the linked list.

    In this file system a directory is a special kind of file that stores filenames along with their inode numbers.
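
    The directory handling described above can be sketched as follows. This is a simplified model, not the on-disk format: the class DirEntry, the functions walk, lookup, delete and example_block, and the record lengths used in the example are assumptions made for illustration. Each entry stores the distance to the next entry (here called rec_len), which is what makes the entries a singly linked list and lets an entry be longer than its filename requires.

```python
from dataclasses import dataclass

# Simplified, hypothetical model of a directory block. The field name
# rec_len and all sizes below are illustrative assumptions.

@dataclass
class DirEntry:
    inode: int      # 0 marks an unused entry
    rec_len: int    # distance in bytes to the next entry (the "link")
    name: str


def example_block():
    """Return a directory block as a mapping from byte offset to entry."""
    return {
        0: DirEntry(2, 12, "."),
        12: DirEntry(2, 12, ".."),
        24: DirEntry(11, 20, "notes.txt"),
        44: DirEntry(12, 16, "a.out"),
    }


def walk(block):
    """Follow the rec_len links from the start of the block."""
    offset = 0
    while offset in block:
        entry = block[offset]
        yield offset, entry
        offset += entry.rec_len


def lookup(block, name):
    """Return the inode number stored for name, or None."""
    for _, entry in walk(block):
        if entry.inode != 0 and entry.name == name:
            return entry.inode
    return None


def delete(block, name):
    """Zero the inode number and unlink the entry from the list."""
    prev = None
    for _, entry in walk(block):
        if entry.inode != 0 and entry.name == name:
            entry.inode = 0
            if prev is not None:
                # The predecessor absorbs the space, so the walk skips it.
                prev.rec_len += entry.rec_len
            return True
        prev = entry
    return False
```

    After a deletion the predecessor entry is longer than its own name needs, which is one way an entry can end up longer than required.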

    BLOCK ALLOCATION IN THE Ext2 FILE SYSTEM

    One of the most common problems encountered in all file systems is the fragmentation of files, i.e. the scattering of files into small pieces as a result of the constant deleting and creating of files. This problem is usually solved by the use of DEFRAGMENTATION PROGRAMS. The Ext2 file system uses two algorithms to limit the fragmentation of files:

    I. Target-Oriented Allocation :- This algorithm always looks for a new data block in the area of a target block. If the target block is itself free, it is allocated. Otherwise a free block is sought within 32 blocks of the target block. If this fails, the block allocation routine tries to find a free block which is at least in the same block group as the target block. Only when no such free block is available are the remaining block groups investigated.

    II. Pre-allocation :- If a free block is found, up to eight (8) following blocks are reserved, provided they are free. When the file is closed, the remaining blocks still reserved are released. This also guarantees that as many data blocks as possible are collected into one cluster.
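
    The two algorithms can be sketched roughly as below. This is a minimal model under stated assumptions, not the Ext2 implementation: the free map is a plain list of booleans, the search within 32 blocks is assumed to run forward from the target, and the function names first_free, allocate, preallocate and release_unused are hypothetical. Only the numbers 32 and 8 come from the text.

```python
WINDOW = 32     # search distance around the target block (from the text)
PREALLOC = 8    # maximum number of following blocks to reserve (from the text)


def first_free(free, lo, hi):
    """Return the first free block index in [lo, hi), or None."""
    for b in range(max(lo, 0), min(hi, len(free))):
        if free[b]:
            return b
    return None


def allocate(free, target, group_size):
    """Target-oriented allocation on a list of booleans (True = free)."""
    if free[target]:
        choice = target                                   # target is free
    else:
        # Assumption: the window is searched forward from the target.
        choice = first_free(free, target + 1, target + 1 + WINDOW)
        if choice is None:
            start = (target // group_size) * group_size   # same block group
            choice = first_free(free, start, start + group_size)
        if choice is None:
            choice = first_free(free, 0, len(free))       # any other group
    if choice is not None:
        free[choice] = False
    return choice


def preallocate(free, block):
    """Reserve up to PREALLOC free blocks directly following block."""
    reserved = []
    b = block + 1
    while b < len(free) and free[b] and len(reserved) < PREALLOC:
        free[b] = False
        reserved.append(b)
        b += 1
    return reserved


def release_unused(free, reserved):
    """On close, give back the reserved blocks that were never used."""
    for b in reserved:
        free[b] = True
```

    Searching near the target first and reserving the following blocks both aim at the same goal: keeping the blocks of one file next to each other.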

    EXTENSIONS OF EXT2 FILE SYSTEM

    EXT2_SECRM_FL

    EXT2_UNRM_FL

    EXT2_COMPR_FL

    EXT2_SYNC_FL

    EXT2_APPEND_FL

    EXT2_NOATIME_FL

    EXT2_IMMUTABLE_FL

    MODULES AND DEBUGGING :::

    Modules consist of object code that can be linked and removed at run time, usually comprising a number of functions. This object code is integrated into the already running kernel with equal rights, which means that it runs in system mode. One advantage of implementing device drivers or file systems as modules is that only the documented interface can be used.

    For the user, modules enable a small and compact kernel to be used, with other functions only being added as and when required.

    What are modules ?? And their implementation in KERNEL??

    MODULES are components of Linux that can be loaded and attached to the kernel as needed. To add support for a new device, one can simply instruct the kernel to load its module. Such use of modules has the added advantage of reducing the size of the kernel program.

    Implementation of modules in the KERNEL ::

    Linux provides three system calls, create_module, init_module and delete_module, for the implementation of Linux modules.

    The administration of modules under Linux uses a list of all the loaded modules. For the user the procedure divides into four phases:

    1) The user process fetches the content of the object file into its own address space. To get the data and the code into a form in which they can actually be executed, the actual load address must be added at various points. This process is termed Relocating.

    2) The system call create_module is then used, mainly to obtain the final address of the object module and also to reserve memory for it.

    3) The load address received from create_module is used to relocate the object file. If the process is a user process, it is loaded in the user area; if it is a kernel process, it is loaded in the kernel segment.

    When a module is already in use and another wishes to use it, the module loaded earlier is used. This mechanism is known as MODULE STACKING.

    4) Once the preliminary work is complete, the object module can be loaded using the system call init_module. The cleanup() function is called when the module is deinstalled.

    By using the system call delete_module, a module that has been loaded can be removed again.
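
    The phases above can be mimicked with a small bookkeeping model. This is a toy in Python and makes no claim about the real system call signatures: the class ToyKernel, its methods create, init and delete, and the function relocate are hypothetical stand-ins for create_module, init_module, delete_module and the relocation step. It only shows the order of the steps and why a module that another module still uses cannot be removed.

```python
# Toy bookkeeping model of the module life cycle. The class and method
# names are illustrative assumptions; the real system calls differ.

def relocate(offsets, load_address):
    """Turn module-relative offsets into absolute addresses."""
    return [load_address + off for off in offsets]


class ToyKernel:
    def __init__(self, base=0x1000):
        self.next_addr = base
        self.modules = {}               # the list of loaded modules

    def create(self, name, size):
        """Reserve memory and return the final load address."""
        if name in self.modules:
            raise ValueError("module already exists: " + name)
        addr = self.next_addr
        self.next_addr += size
        self.modules[name] = {"addr": addr, "users": 0,
                              "uses": [], "cleanup": None}
        return addr

    def init(self, name, init_fn, cleanup_fn, uses=()):
        """Record dependencies (stacking) and run the module's init."""
        mod = self.modules[name]
        for dep in uses:
            self.modules[dep]["users"] += 1
        mod["uses"] = list(uses)
        mod["cleanup"] = cleanup_fn
        init_fn()

    def delete(self, name):
        """Remove a module unless another module still uses it."""
        mod = self.modules[name]
        if mod["users"] > 0:
            return False
        if mod["cleanup"] is not None:
            mod["cleanup"]()
        for dep in mod["uses"]:
            self.modules[dep]["users"] -= 1
        del self.modules[name]
        return True
```

    In this model a stacked module must be removed before the module it depends on, which mirrors the reason for keeping a list of loaded modules.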

    Define KERNEL DAEMON.

    The kernel daemon is a process which automatically carries out the loading and removing of modules without the system user noticing it.

    The communication between the Linux kernel and the kernel daemon is carried out by means of IPC (Inter Process Communication).

    The kernel daemon opens a message queue with a new flag, IPC_KERNELD.

    The kernel sends messages to the kernel daemon with the kerneld_send function.

    A specific request is stored in the kerneld_msg struct, which includes:

    mtype : This component contains the message type.

    id : This indicates whether the kernel expects an answer.

    pid : This component contains the pid of the process.

    DEBUGGING :- Debugging is the process in which a program is searched for errors; if an error occurs at run time, it is rectified or a warning is given. For debugging, the program is usually loaded into a debugger such as gdb and run step by step until an error is found. There are types of program for which this procedure cannot be used, including real-time applications, parallel processes and software which runs without a host operating system. The most common debugging technique is MONITORING. According to the SEI (Software Engineering Institute) and the IEEE, every significant piece of software will initially contain defects, typically around 2 per 100 lines of code.

    Types Of Errors ----

    A bug usually arises from a small number of causes, each of which suggests a specific method of detection and removal:

    SPECIFICATION ERRORS

    DESIGN ERRORS

    CODING ERRORS

    One debugging aid is printk(): the code is checked, and on the occurrence of an error an appropriate message is printed. For example, whenever a process in the kernel segment accesses the data or code of a process in the user segment, the verify_area() function is called to check the area concerned; if an error is found, printk() is called.

    UNIT IV (MULTIPROCESSING)


    Multiprocessing originated in the mid-1950s at a number of companies. In the early 1960s Burroughs Corporation introduced a symmetric MIMD (Multiple Instruction, Multiple Data) multiprocessor with 4 CPUs and up to 16 memory modules connected via a crossbar switch. As multiprocessing systems were developed, technology also advanced the ability to shrink the processor and operate at much higher clock rates.

    One of the cheaper ways to improve hardware performance is to put more than one CPU on the board. This can be done either by making the different CPUs take on different jobs, known as ASYMMETRICAL MULTIPROCESSING, or by making them all run in parallel doing the same job, known as SYMMETRICAL MULTIPROCESSING (SMP).

    In a multiprocessor system one processor is chosen by the BIOS; it is called the BSP (Boot Processor) and is used for system initialization. All the other processors are called APs (Application Processors) and are initially halted by the BIOS.

    Doing asymmetrical multiprocessing effectively requires specialized knowledge about the tasks the computer should do, which is unavailable in a general-purpose operating system such as Linux. Symmetric multiprocessing, on the other hand, is relatively easy to implement.

    Symmetric Multi-Processing :::: Most systems are single-processor systems, i.e. they have only one CPU. Sometimes, however, applications require more processing power. To handle this, multiple processors are used in close communication, sharing the computer bus, the clock and sometimes memory and peripheral devices. The most common technique for building such a multiprocessor system is SMP, in which each processor runs an identical copy of the operating system and the copies communicate with one another as needed.

    In a symmetrical multiprocessing environment the CPUs share the same memory, and as a result code running on one CPU can affect the memory used by another. We can no longer be certain that a variable we set to a certain value in the previous line still has that value; the other CPU might have changed it while we weren't looking.

    A Symmetric Multiprocessing architecture is simply one where two or more identical processors connect to one another through a shared memory. Each processor has equal access to the shared memory.
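
    The consequence of shared memory can be illustrated with threads standing in for CPUs. This is only an analogy in Python, assuming a hypothetical function run_workers: several workers update one shared counter, and a lock makes each update exclusive so that no increment is lost to another worker changing the value in between.

```python
import threading


def run_workers(workers=4, increments=10000):
    """Let several threads update one shared counter under a lock."""
    counter = {"value": 0}          # memory shared by all workers
    lock = threading.Lock()

    def work():
        for _ in range(increments):
            # Only one worker at a time may read, modify and write back.
            with lock:
                counter["value"] += 1

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["value"]
```

    With the lock in place the result is always workers times increments. Without it, the read-modify-write sequence of one worker could interleave with that of another, which is the situation the paragraph above warns about.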

    [Figure: the BSP and an AP connected by the ICC bus.]

    SMP systems are mainly of two types ---

    LOOSELY COUPLED MULTIPROCESSING = The earliest Linux SMP systems were loosely coupled multiprocessor systems. These are constructed from multiple stand-alone systems connected by a high-speed interconnect. This type of architecture is also called a cluster. Building a loosely coupled multiprocessor architecture is easy, but such systems have their limitations. The biggest drawback is the communications fabric.

    [Figure: CPU 1 and CPU 2, each with a block labelled "Local", and the I/O APIC.]

    TIGHTLY COUPLED MULTIPROCESSING = Tightly coupled multiprocessing refers to chip-level multiprocessing. The idea behind it is to scale the loosely coupled architecture down to chip level. On a single integrated circuit, multiple chips, shared memory and an interconnect form a tightly integrated core for multiprocessing.

    1. Difference between MEMORY SYMMETRY & I/O SYMMETRY.

    Memory Symmetry : All processors share the same main memory, which means in particular that all physical addresses are the same. As a result all processors execute the same operating system, and all data and applications are visible to all processors and can be used or executed on every processor.

    I/O Symmetry : All processors share the same I/O subsystem. I/O symmetry allows the reduction of a possible I/O bottleneck. However, some multiprocessor systems assign all interrupts to one single processor, while others use the I/O APIC (Advanced Programmable Interrupt Controller). All CPUs are connected by the ICC (Interrupt Controller Communications) bus.

    2. Coarse-Grained Locking & Finer-Grained Locking.

    For the correct functioning of a multitasking system it is important that data in the kernel can be modified by only one processor at a time, so that identical resources are not allocated twice. For this, Coarse-Grained Locking is used: sometimes the whole kernel is locked so that only one process can be present in the kernel.

    Finer-Grained Locking is used for multiprocessing and real-time operating systems.

    The development of multiprocessor Linux kernels was meant to retain the following three rules:

    1. No process running in kernel mode is interrupted by any other process running in kernel mode, except when it releases control and sleeps.

    2. Interrupt handling can interrupt a process running in kernel mode, but in the end control is returned to the same process. A process can even block interrupts and so make sure that it will not be interrupted.

    3. Interrupt handling cannot be interrupted by a process running in kernel mode. This means that the interrupt handling will be processed completely, or at most be interrupted by another interrupt of higher priority.

    A single semaphore is used by all processes to monitor the transition to kernel mode. This semaphore ensures that no process running in kernel mode can be interrupted by another process.

    CHANGES TO THE KERNEL ::--

    In order to implement Symmetric Multiprocessing in Linux, some changes have to be made:

    1. Kernel Initialization : The first problem with the implementation of multiprocessor operation arises when starting the kernel. All the processors must be started, because the BIOS has halted all the Application Processors and initially only the Boot Processor is running.

    a. Only the BOOT processor enters the kernel starting function start_kernel().


    b. After start_kernel() has executed the normal Linux initialization, smp_init() is called.

    c. The smp_init() function activates all the other processors by calling smp_boot_cpus().

    d. After smp_boot_cpus() has checked all the information, every processor is started by calling the do_boot_cpu() function.

    But how can a halted processor be started?

    Starting the halted processors is made possible by the APIC (Advanced Programmable Interrupt Controller), which allows each processor to send other processors a so-called inter-processor interrupt. Because of this facility it is possible to send each processor an INIT.

    2. Scheduling : The Linux scheduler works with a task structure, which contains the information related to all the processes that are to be executed or are under execution. The changes include:

    a. A processor component is added to the task structure. It contains the number of the running processor, or the constant NO_PROC_ID if no processor has been assigned to the process.

    b. The last_processor component contains the number of the processor which last ran the task.

    c. Each processor works through the schedule and is assigned a new task which is executable and has not been assigned to any other processor.

    Since each processor now possesses its own active process, the current symbol, which normally points to the current process, expands to ----

    current_set[smp_processor_id()];

    where the smp_processor_id() function supplies the number of the currently running processor.

    3. Message Exchange Between Processors : Messages in the form of inter-processor interrupts are handled via interrupts 13 and 16. On the i386 processor, interrupt 13 had the task of informing the system about FPU errors. Since the i486, which is the smallest processor supported by the Intel Multiprocessing Specification, this has been carried out by interrupt 16, the only one used in SMP mode.

    4. Entering Kernel Mode : The kernel is protected by a single semaphore. All interrupt handlers, syscall routines and exception handlers need this semaphore and wait in a loop until the semaphore is free.

    5. Interrupt Handling : At system start, all the interrupts are forwarded only to the BSP. Each SMP operating system must therefore switch the APIC into SMP mode, so that other processors too can handle interrupts.

    Linux does not use this operating mode; that is, during the whole time the system is operating, interrupts are only delivered to the BSP (Boot Processor). This compromises the latency time, since incoming interrupts can only be handled when no processor, or only the BSP, is in the kernel. If there is an AP in the kernel, the interrupt handling routine must wait until the AP has left the kernel.
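
    The scheduling changes listed under item 2 (Scheduling) can be sketched as a small model. This is an illustration in Python under stated assumptions, not the kernel scheduler: the class Task, the function schedule, the selection order and the numeric value chosen for NO_PROC_ID are hypothetical. Only the names processor, last_processor, NO_PROC_ID and the per-processor current_set array come from the text.

```python
NO_PROC_ID = -1     # assumed value; the text only names the constant


class Task:
    def __init__(self, name):
        self.name = name
        self.processor = NO_PROC_ID        # processor currently running it
        self.last_processor = NO_PROC_ID   # processor that ran it last


def schedule(cpu, run_queue, current_set):
    """Give cpu an executable task that no other processor is running."""
    prev = current_set[cpu]
    if prev is not None:
        prev.last_processor = cpu
        prev.processor = NO_PROC_ID
    # Prefer a task other than the one just released (an assumption).
    candidates = [t for t in run_queue
                  if t.processor == NO_PROC_ID and t is not prev]
    chosen = candidates[0] if candidates else prev
    if chosen is not None:
        chosen.processor = cpu
    current_set[cpu] = chosen      # one "current" task per processor
    return chosen
```

    Indexing current_set by the processor number plays the role of current_set[smp_processor_id()]: each processor sees its own current task.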

    SPOOLING ::::

    NETWORK IMPLEMENTATION

    SOCKET STRUCTURE :::

    Sockets are used to handle communication links between application over the network .Communication between the client and the server is through the socket . The socket progra