Upload
scarlett-lawson
View
224
Download
5
Embed Size (px)
Citation preview
1
Java .class File Format
陳正佳
2
Java Virtual Machine
• the cornerstone of Sun's Java programming language. • a component of the Java technology responsible for A. Java's cross-platform delivery, B. the small size of its compiled code, C. Java's ability to protect users from malicious programs. • JVM knows nothing of the Java programming language, only of a particular file format, the class file format.
32
Class file format
A class file contains » 1. Java Virtual Machine instructions (bytecodes)» 2. a symbol table (Constant pool)» 3. other ancillary information.
Javap options» 1. -c bytecode» 2. -s internal type signatures» 3. -verbose stack size and number of local
variables and args.
42
Class file format
Each class file contains one Java type, either a class or an interface.
A class file consists of a stream of 8-bit bytes.
All 16-bit, 32-bit, and 64-bit quantities are constructed by reading in two, four, and eight consecutive 8-bit bytes, respectively.
53
Class File Format
Multibyte data items are always stored in big-endian order, where the high bytes come first.
u1, u2,and u4 represent an unsigned one-, two-, or four-byte quantity, respectively.
These types may be read by methods such as
readUnsignedByte, readUnsignedShort, and
readInt of the interface java.io.DataInput.
6
class SumI {
public static void main (String[] args) {int count=10;int sum =0;for (int index=1;index<count;index++)
sum=sum+index;System.out.println("Sum="+sum);
} // method main
}
74
Class File Structure
ClassFile {
u4 magic; u2 minor_version; u2 major_version;
u2 constant_pool_count;
cp_info constant_pool[constant_pool_count-1];
u2 access_flags; u2 this_class; u2 super_class;
u2 interfaces_count; u2 interfaces[interfaces_count];
u2 fields_count; field_info fields[fields_count];
u2 methods_count; method_info methods[methods_count];
u2 attributes_count;
attribute_info attributes[attributes_count];
}
85
Magic
magic :u4
The magic item supplies the magic
number identifying the class file format;
it has the value 0xCAFEBABE.
95
Version
minor_version :u2, major_version:u2
The values of the minor_version and
major_version items are the minor
and major version numbers of the
compiler that produced this class file.
105
Constant Pool
constant_pool_count :u2» must > 0.» indicates the number of entries in the
constant_pool table of the class file,» where the constant_pool entry at index zero
is included in the count but is not present in the constant_pool table of the class file.
» i.e., if count = 13 => pool[1] … pool[12]
115
Constant Pool
constant_pool[]» a table of variable-length structures» representing various
– numeric literals– string constants, – class/interface/type names, – field reference,– method reference and – other constants that are referred to within the
ClassFile structure and its substructures.
125
Constant Pool
constant_pool[0], » reserved for internal use by a JVM implementation. That
entry is not present in the class file.
The first entry in the class file is constant_pool[1]. Each constant_pool[i] is a variable-length structure
whose format is indicated by its first "tag" byte.
135
Access Flags
access_flags :u2» a mask of modifiers used with class and
interface declarations.
145
Access Flags
The access_flags modifiers Flag Name Value Meaning Used By ACC_PUBLIC 0x0001 Is public; Class, interface ACC_FINAL 0x0010 Is final; Class ACC_SUPER 0x0020 superclass Class, =1 for new JVM interface ACC_INTERFACE 0x0200 interface. Interface ACC_ABSTRACT 0x0400 Is abstract; Class, interface
155
Access Flags
The access_flags modifiers Flag Name Value Meaning Used By ACC_PUBLIC 0x0001 Is public; Class, interface ACC_FINAL 0x0010 Is final; Class ACC_SUPER 0x0020 superclass Class, =1 for new JVM interface ACC_INTERFACE 0x0200 interface. Interface ACC_ABSTRACT 0x0400 Is abstract; Class, interface
165
Access Flags (continued)
The access_flags modifiers Flag Name Value Meaning Used By
ACC_SYNTHETIC 0x1000 synthetic
; Not present in the source code.
ACC_ANNOTATION 0x2000 annotation type.
ACC_ENUM 0x4000 enum type.
175
this class
this_class :u2» a valid index into the constant_pool table. » The entry at that index must be a
CONSTANT_Class_info structure representing the class or interface defined by this class file.
185
super_class
super_class» For a class, the value of the super_class
item either must be zero or must be a valid index into the constant_pool table.
» If the value of the super_class item is nonzero, the constant_pool entry at that index must be a CONSTANT_Class_info structure representing the superclass of the class defined by this class file.
195
super_class
super_class» Neither the superclass nor any of its
superclasses may be a final class.» If the value of super_class is zero, then this
class file must represent the class java.lang.Object,
» the only class or interface without a superclass.
205
interfaces
interfaces_count» the number of direct superinterfaces of this class or
interface type.
interfaces[]» Each value in the interfaces array must be a valid
index into the constant_pool table. » The constant_pool entry at each value of
interfaces[i], where i < interfaces_count, must be
a CONSTANT_Class_info structure
215
Fields
fields_count» gives the number of field_info structures in the fields
table.
fields[]» Each entry a variable-length field_info
structure giving a complete description of a field in the class or interface type.
» includes only those fields that are declared by this class or interface.
» does not include fields inherited from superclasses or superinterfaces.
225
Methods
methods_count» gives the number of method_info structures in the
methods table.
methods[]» Each entry a variable-length method_info
structure giving a complete description of Java Virtual Machine code for a method in the class or interface.
» The method_info structures represent all methods, both instance methods, class (static) methods, and constructor methods declared by this class or interface type.
235
Methods
methods[]» includes only those methods explicitly declared
by this class.» Interfaces have only the single method <clinit>,
the interface initialization method. » Constructor methods have the a common name
<init>.» does not include items representing methods
that are inherited from superclasses or superinterfaces.
245
Class Attributes
attributes_count » the number of attributes in the attributes
table of this class
attributes[]» Each value of the attributes table must be a
variable-length attribute structure. » A ClassFile structure can have any number
of attributes associated with it.
25
Internal form of fully qualified names
Replace all dots with /.
EX: a.b.C ==> a/b/C
26
Descriptor A descriptor is a string representing the
type of a field or method. Descriptors are represented in the class
file format using UTF-8 strings. Grammar:
FieldType ::= BaseType| ObjectType
| ArrayType
27
Fields Descriptors BaseType ::= B | C | D | F | I | J | S | Z
» B for byte; Z for boolean» C D F I J S for char, double, float, int, long
and short, respectively. ObjectType ::= Lclassname; ArrayType ::= [ComponentType Ex:
» int[][] ==> [[I» Object[] ==> [java/lang/Object;
28
Field and method descriptors
FieldDescriptor ::= FieldType ComponentType ::= FieldType MethodDescriptor ::= ( ParameterDescriptor* ) ReturnDescriptor ParameterDescriptor ::= FieldDescriptor ReturnDescriotor ::= FieldDescriptor | V
» V for void method Ex: 1. Object mymethod(int i, double d, Thread t) ==> (IDLjava/lang/Thread;)Ljava/lang/Object; 2. void com.Clazz.m(int i) ==> (I)V.
29
The Constant Pool JVM instructions do not rely on the runtime
layout of classes, interfaces, class instances, or arrays.
Instead, instructions refer to symbolic information in the constant_pool table.
All constant_pool table entries have the following general format:
cp_info { u1 tag; u1 info[]; }
30
Constant pool Tags Constant Type Value CONSTANT_Utf8 1 CONSTANT_Integer 3 CONSTANT_Float 4 CONSTANT_Long 5 CONSTANT_Double 6 CONSTANT_Class 7 CONSTANT_String 8 CONSTANT_Fieldref 9 CONSTANT_Methodref 10 CONSTANT_InterfaceMethodref 11 CONSTANT_NameAndType 12
31
The CONSTANT_Utf8_info Structure
Used to represent constant string values. utf-8 encoding :1. 1~127 (xxx xxxx) ==> 1 bytes : 0xxx xxxx 2. 0, \u0080~\u07ff ( xxx xxxx xxxx) ==> 2 bytes: 110x xxxx 10xx xxxx 0 => 1100 0000 1000 00003. \u0800 ~ \uffff (xxxx xxxx xxxx xxxx)
==> 3 bytes: 1110 xxxx 10xx xxxx 10xx xxxx
32
Format CONSTANT_Utf8_info {
u1 tag; // = 1
u2 length;
u1 bytes[length]; }
Notes: tag = 1. length : number of bytes in the encoded utf8 string.
bytes[] bytes[] : contains the bytes of the string. No byte may have the value (byte)0 or lie in the range
(byte)0xf0-(byte)0xff.
33
The CONSTANT_String_info Structure
CONSTANT_String_info { u1 tag; // = 8u2 string_index; }
string_index : » a valid index into a CONSTANT_Utf8_info
34
CONSTANT_Integer_info and CONSTANT_Float_info
CONSTANT_Integer_info { u1 tag; // = 3
u4 bytes; }
CONSTANT_Float_info { u1 tag; // =4 u4 bytes; }
35
CONSTANT_Long_info and CONSTANT_Double_info
CONSTANT_Long_info { u1 tag; // = 5
u8 bytes; }
CONSTANT_Double_info { u1 tag; // =6 u8 bytes; }
36
CONSTANT_Class_info Used to represent a class or an interface:
CONSTANT_Class_info { u1 tag; // = 7u2 name_index; }
name_index is an index into an utf-8 entry encoding a fully qualified class name.
include also array type: [[I, [Lcom/Cl;, etc.
37
CONSTANT_NameAndType_info
used to represent a field or method, without indicating which class or interface type it belongs to:
CONSTANT_NameAndType_info { u1 tag; // = 12 u2 name_index; // index into utf8 string of a simple name u2 descriptor_index; // index into utf8 string
of a filed/method descriptor }
38
Constant fieldref, methodRef
and interfaceMethodRef info CONSTANT_Fieldref_info { u1 tag; // = 9 u2 class_index; // index into a class_info
entry u2 name_and_type_index; } CONSTANT_Methodref_info { u1 tag; // = 10 u2 class_index; u2 name_and_type_index; } CONSTANT_InterfaceMethodref_info { u1 tag;
//11 u2 class_index; u2 name_and_type_index; }
39
Fields Each field is described by a field_info structure. No two fields in one class file may have the same
name and descriptor. Format: field_info {
» u2 access_flags; » u2 name_index; // index to utf8 entry for name» u2 descriptor_index;// index to uft8 entry for field
type » u2 attributes_count; » attribute_info attributes[attributes_count]; }
40
Methods Each method, including each instance
initialization method(<init>) and the class or interface initialization method(<clinit>) , is described by a method_info structure.
Format: method_info { u2 access_flags; u2 name_index; // index into utf8 entry for
name u2 descriptor_index; // index into utf8 entry
u2 attributes_count; attribute_info attributes[attributes_count]; }
41
AttributesAttributes are used in the ClassFile,
field_info, method_info, and Code_attribute structures of the class file format.
General Format: Attribute_info { u2 attribute_name_index; // into utf8 u4 attribute_length; // excluding init 6
bytes u1 info[attribute_length]; }
42
Types of attributes class attributes:
» synthetic, deprecated, sourceFile» innerClasses
field attributes:» constantValue, deprecated
method attributes:» code attributes, deprecated, » Exception attributes
code attributes:» lineNumberTable, localVariableTable
43
ConstantValue Attribute ConstantValue_attribute { u2 attribute_name_index; // index to
utf8 : “conatantValue” u4 attribute_length; // = 2 u2 constantvalue_index; // into primitive(Integer_entry,…) // or string_entry }
44
Code Attribute a variable-length attribute used in the
attribute table of method_info structures. A Code attribute contains
» the Java virtual machine instructions and » auxiliary information for a single method, » instance initialization method , or class or
interface initialization method. Every JVM implementation must recognize
Code attributes. native or abstract, => no this attribute in
its method_info structure» O/W=> has exactly one Code attribute.
45
Code_attribute { u2 attribute_name_index; // into utf8:
“Code” u4 attribute_length; // exclude initial 6
bytes u2 max_stack;// max # of stack slots needed u2 max_locals;
// maximum # of local variables needed // Note: double and long count 2. u4 code_length; u1 code[code_length];
46
u2 exception_table_length; { u2 start_pc; u2 end_pc; //exclusive u2 handler_pc; u2 catch_type;
// index to cp entry f type constant_class_info // zero means called by all exceptions// used to implement finally-clause
} exception_table[exception_table_length]; u2 attributes_count; attribute_info attributes[attributes_count]; }
47
The (uncaught checked)Exception Attribute
appear in the attribute table of a method_info structure.
indicates which checked exceptions a method may throw.
1 Exception attribute in each method_info structure.
Format: Exceptions_attribute {
u2 attribute_name_index; // into utf8:”Exception” u4 attribute_length; u2 number_of_exceptions; u2 exception_index_table[number_of_exceptions];
// each entry an index to cp of class_info }
48
The InnerClasses Attribute a variable-length attribute in the attributes table of
the ClassFile structure. record any inner class/interface referred (or
declared) in this class/interface» If the constant pool of a class or interface refers to any
class or interface that is not a member of a package, its ClassFile structure must have exactly one InnerClasses attribute in its attributes table.
» If a class has members that are classes or interfaces, its constant_pool table (and hence its InnerClasses attribute) must refer to each such member, even if that member is not otherwise mentioned by the class.
49
Format:InnerClasses_attribute { u2 attribute_name_index; // to utf8: “innerClasses u4 attribute_length; u2 number_of_classes; { u2 inner_class_info_index; // into class_info // for each innerclass C in
cp u2 outer_class_info_index; // into class_info // containing class of C
u2 inner_name_index; // into utf8 for simple // name (no / no $); anonymous => zero
u2 inner_class_access_flags; } classes[number_of_classes]; }
50
The Syntheticand Deprecated Attribute
A class member that does not appear in the source code must be marked using a Synthetic attribute.
Synthetic_attribute { » u2 attribute_name_index; // utf8(“synthetic”) » u4 attribute_length; // = zero}
The Deprecated attribute has the following format:
Deprecated_attribute { » u2 attribute_name_index; // utf8(“Deprecated”) » u4 attribute_length; // = 0 }
51
The SourceFile Attribute
SourceFile_attribute { u2 attribute_name_index; // into utf8(“SourceFile”) u4 attribute_length; // = 2 u2 sourcefile_index; // into utf8 entry }
52
LineNumberTable Attribute LineNumberTable_attribute {
» u2 attribute_name_index ; // utf8(“LineNumberTable”)
» u4 attribute_length; » u2 line_number_table_length; » { u2 start_pc; » u2 line_number; }
line_number_table[line_number_table_length];» }
53
The LocalVariableTable Attribute
LocalVariableTable_attribute { » u2 attribute_name_index; //
utf8(“LocalVariableTable”) » u4 attribute_length; » u2 local_variable_table_length; » { u2 start_pc; u2 length; // scope must have» // value in [start_pc, start_pc + length] » u2 name_index; // utf8 for var name » u2 descriptor_index; // uft8 for type» u2 index; // local var index» } local_variable_table[local_variable_table_length];
}