C2: The Server Compiler. 5th JVM Source Code Reading Meetup (第5回JVMソースコードリーディングの会). @ytoshima

JVM code reading -- C2


Material for the fifth JVM source code reading meetup.


Page 1: JVM code reading -- C2

C2: The Server Compiler

5th JVM Source Code Reading Meetup (第5回JVMソースコードリーディングの会)

@ytoshima

Page 2: JVM code reading -- C2

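Compilation trigger

The interpreter counts method invocations (the invoke* bytecodes) and backward branches (goto to a negative offset). When a count exceeds its threshold, a compilation request is submitted to CompileBroker. The request is placed on a queue, and a CompilerThread watching that queue starts the compilation. Compilation is designed to run in parallel with GC activity as well.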
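The counter-and-queue flow described above can be sketched in a few lines. This is only an illustration under stated assumptions: `TriggerSketch`, `CompileRequest`, and the threshold parameters are hypothetical names, not HotSpot's classes or flags, and the sketch enqueues when a counter first reaches its threshold rather than reproducing the real policy.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a compile request; not HotSpot's CompileTask.
struct CompileRequest {
    std::string method;
    bool osr;  // true when triggered by a backward branch (on-stack replacement)
};

// Hypothetical policy: counts events per method and enqueues one request
// the first time a counter reaches its threshold.
class TriggerSketch {
public:
    TriggerSketch(int invoke_threshold, int backedge_threshold)
        : invoke_threshold_(invoke_threshold),
          backedge_threshold_(backedge_threshold) {}

    // Called by the "interpreter" on every invoke* of method m.
    void on_invoke(const std::string& m) {
        if (++invocations_[m] == invoke_threshold_) queue_.push_back({m, false});
    }

    // Called by the "interpreter" on every backward goto inside method m.
    void on_back_branch(const std::string& m) {
        if (++backedges_[m] == backedge_threshold_) queue_.push_back({m, true});
    }

    // What a compiler thread would do: take the oldest pending request, if any.
    bool take(CompileRequest& out) {
        if (queue_.empty()) return false;
        out = queue_.front();
        queue_.pop_front();
        return true;
    }

    std::size_t pending() const { return queue_.size(); }

private:
    int invoke_threshold_;
    int backedge_threshold_;
    std::unordered_map<std::string, int> invocations_;
    std::unordered_map<std::string, int> backedges_;
    std::deque<CompileRequest> queue_;
};
```

A real VM would guard the queue with a lock and wake the compiler thread; the sketch leaves that out to keep the counting logic visible.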

Page 3: JVM code reading -- C2

Compilation trigger

    // Method invocation
    SimpleCompPolicy::method_invocation_event
      CompileBroker::compile_method(...)

    // OSR
    SimpleCompPolicy::method_back_branch_event
      CompileBroker::compile_method(...)

Page 4: JVM code reading -- C2

Compilation trigger

    CompileBroker::compile_method
      CompileBroker::compile_method_base
        :
        CompileQueue* queue = compile_queue(comp_level);
        task = create_compile_task(queue, compile_id, method,
                                   osr_bci, comp_level,
                                   hot_method, hot_count, comment,
                                   blocking);

Page 5: JVM code reading -- C2

Compile

    Compile::Compile
      :
      if ((jvms = cg->generate(jvms)) == NULL)   // Parse
      :
      Optimize();
      :
      Code_Gen();

Page 6: JVM code reading -- C2

Compiler data structures

    Node
    MachNode
    Type
    ciObject
    Phase
    JVMState
    nmethod

Page 7: JVM code reading -- C2

Compiler data structures

Compile::build_start_state + aload_0 (this)


Page 8: JVM code reading -- C2

Compiler data structures

Graph Text Representation


Page 9: JVM code reading -- C2

Compiler data structures

    $ ~/jdk1.7.0-b147/fastdebug/bin/java -XX:+PrintCompilation -XX:+PrintIdeal -XX:CICompilerCount=1 sum

        214   1       sum::doit (22 bytes)
    VM option '+PrintIdeal'
    ...
     21  ConI  === 0 [[ 180 ]] #int:0
     180 Phi   === 184 21 70 [[ 179 ]] #int !orig=[159],[139],[66] !jvms: sum::doit @ bci:10
     179 AddI  === _ 180 181 [[ 178 ]] !orig=[154],[137],70,[145] !jvms: sum::doit @ bci:12
     178 AddI  === _ 179 181 [[ 177 ]] !orig=[153],[146],[135],86,[71] !jvms: sum::doit @ bci:14
     177 AddI  === _ 178 181 [[ 176 ]] !orig=[165],[152],86,[71] !jvms: sum::doit @ bci:14
     148 ConI  === 0 [[ 87 ]] #int:97
     176 AddI  === _ 177 181 [[ 190 ]] !orig=[168],[152],86,[71] !jvms: sum::doit @ bci:14

    // <idx> <node type> === <in[]> [[out[]]] <additional desc>
    // jvms = JVMState; root()->dump(9999) dumps the IR as above

Real example: https://gist.github.com/1369656
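To make the `<idx> <node type> === <in[]> [[out[]]]` layout concrete, here is a minimal sketch that prints one line in that shape. `DumpNode` and `dump_line` are hypothetical names invented for this illustration; the real dump also prints the type, `!orig`, and `!jvms` fields, which are omitted here.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical miniature node: an index, a type name, and edge lists.
struct DumpNode {
    int idx;
    std::string name;
    std::vector<const DumpNode*> in;   // inputs; a null input prints as "_"
    std::vector<const DumpNode*> out;  // users of this node
};

// Formats "<idx> <name> === <inputs> [[ <outputs> ]]".
std::string dump_line(const DumpNode& n) {
    std::ostringstream os;
    os << n.idx << ' ' << n.name << " ===";
    for (const DumpNode* i : n.in) {
        if (i) {
            os << ' ' << i->idx;
        } else {
            os << " _";
        }
    }
    os << " [[";
    for (const DumpNode* o : n.out) os << ' ' << o->idx;
    os << " ]]";
    return os.str();
}
```

Reading a dump line back is then mechanical: the numbers before `[[` are the nodes this node uses, and the numbers inside `[[ ]]` are the nodes that use it.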

Page 10: JVM code reading -- C2

Ideal Graph Visualizer

At level 4, the IR can be displayed at each stage of parsing and at each stage of optimization.

Graph display options

Page 11: JVM code reading -- C2

Ideal Graph Visualizer

    public int getValue() { return value; }

    enum { Control, I_O, Memory, FramePtr, ReturnAdr, Parms };

Page 12: JVM code reading -- C2

Ideal Graph Visualizer

Final Code: the graph now consists of machine-dependent nodes such as Mach* nodes, Epilog, Prolog, and Ret.

Page 13: JVM code reading -- C2

Compiler data structures

Ideal Graph Visualizer
http://ssw.jku.at/General/Staff/TW/igv.html

etc/idealgraphvisualizer.conf:

    default_options="-J-Xmx400m --branding idealgraphvisualizer"

You need a fastdebug or a debug build. In the OpenJDK build directory:

    $ make fastdebug_build
    OR
    $ make debug_build

Page 14: JVM code reading -- C2

Compiler data structures

Options to generate data for Ideal Graph Visualizer

    -XX:PrintIdealGraphLevel=0      [0: none, 4: most verbose]
    -XX:PrintIdealGraphPort=4444
    -XX:PrintIdealGraphAddress="127.0.0.1"
    -XX:PrintIdealGraphFile=<path to IR xml file>

IdealGraphVisualizer listens on port 4444 by default.

Page 15: JVM code reading -- C2

Compiler data structures

    // -XX:PrintIdealGraphFile=<path>; IdealGraphVisualizer can display this
    <graphDocument>
      <group>
        <properties>
          <p name="name"> virtual jint Call.doit()</p>
        </properties>
        <graph name="Bytecode 0: aload_0">
          <nodes>
            <node id="159337448">
              <properties>
                <p name="name"> Root</p>
                <p name="type"> bottom</p>
                <p name="idx"> 0</p>
                ...
              </properties>
            </node>
            <node id="159414516"> ... </node>
            <node id="159337448"> ... </node>
            ...
          </nodes>
          <edges>
            <edge index="0" to="159337448" from="159337448"></edge>
            <edge index="0" to="159414516" from="159414516"></edge>

Example: https://gist.github.com/1369620

Page 16: JVM code reading -- C2

Compiler data structures

    Node
      # _in       : Node**       // use-def
      # _out      : Node**       // def-use
      # _cnt      : node_idx_t   // # of required inputs
      # _max      : node_idx_t   // actual input array length
      # _outcnt   : node_idx_t
      # _outmax   : node_idx_t
      - _class_id : jushort
      - _flags    : jushort

    // _in is ordered; the position of each input matters
    // 340 subclasses in total <-> 62 in C1
    // _class_id is a 16-bit value used to determine the node type (ideal, mach)
    // _flags is enum NodeFlags: Flag_is_Copy, Flag_is_Call,
    //   Flag_is_macro, Flag_is_Con ...

Page 17: JVM code reading -- C2

Compiler data structures

    // Insert a new required input at the end
    void Node::ins_req( uint idx, Node *n ) {
      assert( is_not_dead(n), "can not use dead node");
      add_req(NULL);                // Make space
      ...
      _in[idx] = n;                 // Stuff over old required edge
      if (n != NULL) n->add_out((Node *)this); // Add reciprocal def-use edge
    }

    void add_out( Node *n ) {
      if (is_top()) return;
      if( _outcnt == _outmax ) out_grow(_outcnt);
      _out[_outcnt++] = n;
    }
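The point of `ins_req` and `add_out` is that every use-def edge gets a reciprocal def-use edge. A minimal sketch of that invariant, using a hypothetical `EdgeNode` type with `std::vector` instead of HotSpot's hand-managed arrays, might look like this:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical miniature of a node with edges in both directions.
struct EdgeNode {
    std::vector<EdgeNode*> in;   // use-def: the nodes this node reads
    std::vector<EdgeNode*> out;  // def-use: the nodes that read this node

    // Append an input and record the reciprocal def-use edge on the definition.
    void add_req(EdgeNode* def) {
        in.push_back(def);
        if (def) def->out.push_back(this);
    }

    // Replace input i while keeping both directions consistent.
    void set_req(std::size_t i, EdgeNode* def) {
        EdgeNode* old = in.at(i);
        if (old) {
            // Remove exactly one matching def-use edge from the old definition.
            for (std::size_t k = 0; k < old->out.size(); ++k) {
                if (old->out[k] == this) {
                    old->out.erase(old->out.begin() + static_cast<std::ptrdiff_t>(k));
                    break;
                }
            }
        }
        in[i] = def;
        if (def) def->out.push_back(this);
    }
};
```

Because a node may use the same definition twice (for example `x + x`), the sketch removes one `out` entry per replaced input rather than all of them.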

Page 18: JVM code reading -- C2

Compiler data structures

RegionNode: a merge of control flow.
PhiNode: the merge of data that accompanies a control merge. It points to the corresponding RegionNode.

Page 19: JVM code reading -- C2

Compiler data structures

    Node   // Optimize functions

    // more ideal node, canonicalize
    virtual Node *Ideal(PhaseGVN *phase, bool can_reshape);

    // set of values this node can take
    virtual const Type *Value( PhaseTransform *phase ) const;

    // existing node which computes the same function
    virtual Node *Identity( PhaseTransform *phase );

Page 20: JVM code reading -- C2

Compiler data structures

Node

Ideal is defined in: Add*Node, MinINode, StartNode, ReturnNode, RethrowNode, SafePointNode, AllocateArrayNode, LockNode, UnlockNode, RegionNode, PhiNode, PCTableNode, NeverBranchNode, CMove*Node, ConstraintCastNode, CheckCastPPNode, Conv?2?Node, Div?Node, Mod?Node, IfNode, LoopNode, CountedLoopNode, LoopLimitNode, Load*Node, Store*Node, ClearArrayNode, StrIntrinsicNode, MemBarNode, MergeMemNode, Mul*Node, And*Node, LShift*Node, URShift*Node, RootNode, HaltNode, Sub*Node, Cmp*Node, BoolNode

Many subclasses define their own Ideal, Value, and Identity to suit their purpose.

Page 21: JVM code reading -- C2

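Compiler data structures

AddNode Ideal

    Convert "(x+1)+2" into "x+(1+2)"
    Convert "(x+1)+y" into "(x+y)+1"
    Convert "x+(y+1)" into "(x+y)+1"

[Figure: two node graphs, each built from two Add nodes over x, Con1, and Con2, showing the expression before and after the transformation]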
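The three rewrites listed for AddNode::Ideal can be tried out on a toy expression tree. The sketch below is an illustration only: `Expr`, `var`, `con`, `add`, and `ideal_add` are hypothetical names, the tree is immutable rather than a graph edited in place, and constant folding assumes the sum of the two constants does not overflow.

```cpp
#include <cassert>
#include <memory>

// Hypothetical toy expression: a variable, a constant, or an addition.
struct Expr {
    enum Kind { Var, Con, Add } kind;
    int value;                       // constant value (Con) or variable id (Var)
    std::shared_ptr<Expr> lhs, rhs;  // operands (Add only)
};
using ExprPtr = std::shared_ptr<Expr>;

ExprPtr var(int id) { return std::make_shared<Expr>(Expr{Expr::Var, id, nullptr, nullptr}); }
ExprPtr con(int v)  { return std::make_shared<Expr>(Expr{Expr::Con, v, nullptr, nullptr}); }
ExprPtr add(ExprPtr a, ExprPtr b) { return std::make_shared<Expr>(Expr{Expr::Add, 0, a, b}); }

// Returns a reshaped expression, or nullptr when no rule applies
// (mirroring the "return NULL if nothing changed" convention of Ideal).
ExprPtr ideal_add(const ExprPtr& n) {
    if (n->kind != Expr::Add) return nullptr;
    const ExprPtr& l = n->lhs;
    const ExprPtr& r = n->rhs;
    // (x + c1) + c2  ==>  x + (c1 + c2); assumes c1 + c2 does not overflow
    if (l->kind == Expr::Add && l->rhs->kind == Expr::Con && r->kind == Expr::Con)
        return add(l->lhs, con(l->rhs->value + r->value));
    // (x + c1) + y  ==>  (x + y) + c1
    if (l->kind == Expr::Add && l->rhs->kind == Expr::Con)
        return add(add(l->lhs, r), l->rhs);
    // x + (y + c1)  ==>  (x + y) + c1
    if (r->kind == Expr::Add && r->rhs->kind == Expr::Con)
        return add(add(l, r->lhs), r->rhs);
    return nullptr;
}
```

All three rules push constants toward the right and toward the root, which is what lets neighbouring constants meet and fold.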

Page 22: JVM code reading -- C2

Compiler data structures

AddINode Ideal

    Node* in1 = in(1);
    Node* in2 = in(2);
    int op1 = in1->Opcode();
    int op2 = in2->Opcode();
    // Fold (con1-x)+con2 into (con1+con2)-x
    if ( op1 == Op_AddI && op2 == Op_SubI ) {
      // Swap edges to try optimizations below
      in1 = in2;
      in2 = in(1);
      op1 = op2;
      op2 = in2->Opcode();
    }
    if( op1 == Op_SubI ) {
      // "(a-b)+(c-d)" into "(a+c)-(b+d)"
      // "(a-b)+(b+c)" into "(a+c)"
      // "(a-b)+(c+b)" into "(a+c)"

Page 23: JVM code reading -- C2

Compiler data structures

    const Type *AddNode::Value(...)
      // Either input is TOP ==> the result is TOP
      // Either input is BOTTOM ==> the result is the local BOTTOM
      // Check for an addition involving the additive identity

Page 24: JVM code reading -- C2

Compiler data structures

    Node *AddNode::Identity(...)
      // If either input is a constant 0, return the other input.

      const Type *zero = add_id();  // The additive identity
      if( phase->type( in(1) )->higher_equal( zero ) ) return in(2);
      if( phase->type( in(2) )->higher_equal( zero ) ) return in(1);
      return this;

Page 25: JVM code reading -- C2

Compiler data structures

    Node *AddINode::Identity(...)
      // Fold (x-y)+y  OR  y+(x-y)  into  x

      if( in(1)->Opcode() == Op_SubI && phase->eqv(in(1)->in(2),in(2)) ) {
        return in(1)->in(1);
      }
      else if( in(2)->Opcode() == Op_SubI && phase->eqv(in(2)->in(2),in(1)) ) {
        return in(2)->in(1);
      }
      return AddNode::Identity(phase);

Page 26: JVM code reading -- C2

Compiler data structures

    Node *PhaseGVN::transform_no_reclaim
      // Return a node which computes the same function
      // as this node, but in a faster or cheaper fashion.

      while( 1 ) {
        Node *i = k->Ideal(this, /*can_reshape=*/false);
        if( !i ) break;
        ...
      }
      const Type *t = k->Value(this);  // Get runtime Value set
      k->raise_bottom_type(t);
      Node *i = k->Identity(this);
      if (i != k) return i;

      i = hash_find_insert(k);
      if( i && (i != k)) return i;

Every Node created during Parse is passed through transform, so GVN runs while parsing.
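The last step of transform_no_reclaim, hash_find_insert, is what makes two nodes with the same opcode and inputs collapse into one. A minimal sketch of that idea follows. `ValueTable`, `Op`, and the integer node ids are hypothetical; an ordered `std::map` stands in for the real hash table, and commutative inputs are ordered inside the table purely to keep the example short.

```cpp
#include <cassert>
#include <map>
#include <tuple>
#include <utility>

// Hypothetical node key: an opcode plus the ids of its two inputs.
enum class Op { Add, Sub };
using Key = std::tuple<Op, int, int>;

class ValueTable {
public:
    // Returns the id of an existing node that computes the same value, or
    // records candidate_id under this key and returns it.
    int find_insert(Op op, int in1, int in2, int candidate_id) {
        // Addition is commutative, so put its inputs in a canonical order first.
        if (op == Op::Add && in2 < in1) std::swap(in1, in2);
        auto result = table_.emplace(Key{op, in1, in2}, candidate_id);
        return result.first->second;  // existing entry wins over the candidate
    }

private:
    std::map<Key, int> table_;  // ordered map avoids writing a custom hash
};
```

When the returned id differs from the candidate, the caller would drop the freshly built node and use the existing one, which is the `if( i && (i != k)) return i;` line on the slide.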

Page 27: JVM code reading -- C2

Compiler data structures

    Node                 // raise_bottom_type related
      TypeNode           // Type* _type
        ConNode          // ConINode, ConPNode ..
        PhiNode          // TypePtr* _adr_type
                         // int _inst_id, _inst_index, _inst_offset
        ConvI2LNode
      MemNode            // TypePtr* _adr_type
        LoadNode         // Type* _type
          LoadPNode      // load obj or arr
          LoadINode

https://gist.github.com/1369608

Page 28: JVM code reading -- C2

Compiler data structures

Node
^RegionNode

A RegionNode can be mapped to a basic block. Its inputs are control sources. A PhiNode has an input that points to a RegionNode. The data inputs merged by a PhiNode correspond one-to-one with the inputs of the RegionNode. Input 0 of a PhiNode is the RegionNode, and input 0 of the RegionNode is the RegionNode itself.

    PhiNode* has_phi() const

^LoopNode   // Simple Loop Header
    short _loop_flags
^RootNode

Page 29: JVM code reading -- C2

Compiler data structures

MultiNode

SafePointNode

Page 30: JVM code reading -- C2

Compiler data structures

TypeNode
^ConNode

Page 31: JVM code reading -- C2

Compiler data structures

Node
^ProjNode   // project a single elem out of a tuple or signature type
^ParmNode   // incoming Parameters

    const uint _con;        // The field in the tuple we are projecting
    const bool _is_io_use;  // Used to distinguish between the projections
                            // used on the control and io paths from a macro node

Page 32: JVM code reading -- C2

Compiler data structures

Node
^MergeMem

    // (See comment in memnode.cpp near MergeMemNode::MergeMemNode for semantics.)
    in(AliasIdxTop) = in(1) is always the top node
    in(0) is NULL
    in(AliasIdxBot) is a "wide" memory state.
    For in(AliasIdxRaw) = in(3) and above, mem state for alias type <N> or top

    base_memory()   // wide state
    memory_at(N)    // for alias type <N>

Identity: if base is empty, returns base; otherwise returns this
Ideal: Simplify stacked MergeMem

Page 33: JVM code reading -- C2

Compiler data structures

TypeNode
^PhiNode

Merges values arriving from different control paths. Slot 0 is the controlling RegionNode.

Page 34: JVM code reading -- C2

Compiler data structures

    class ConINode : public ConNode {
    public:
      ConINode( const TypeInt *t ) : ConNode(t) {}
      virtual int Opcode() const;

      // Factory method:
      static ConINode* make( Compile* C, int con ) {
        return new (C, 1) ConINode( TypeInt::make(con) );
      }

    class ConNode : public TypeNode {
    public:
      ConNode( const Type *t ) : TypeNode(t,1) {
        init_req(0, (Node*)Compile::current()->root());
        init_flags(Flag_is_Con);
      }

    class TypeNode : public Node {
      const Type* const _type;
      TypeNode( const Type *t, uint required ) : Node

Page 35: JVM code reading -- C2

Compiler data structures

    // Add pointer plus integer to get pointer.  NOT commutative, really.
    // So not really an AddNode.  Lives here, because people associate it with
    // an add.
    class AddPNode : public Node {
    public:
      enum { Control,    // When is it safe to do this add?
             Base,       // Base oop, for GC purposes
             Address,    // Actually address, derived from base
             Offset } ;  // Offset added to address
      AddPNode( Node *base, Node *ptr, Node *off ) : Node(0,base,ptr,off) {
        init_class_id(Class_AddP);
      }

Identity: if one input is 0, return in(Address), otherwise this
Ideal:
    If the left input is an addition of a constant, flatten the expression tree
    If the address is a raw pointer and NULL, use CastX2PNode(offset)
    If the right input is an addition of a constant, change (ptr + (offset+con)) into (ptr + offset) + con

Page 36: JVM code reading -- C2

Compiler data structures

    // Return from subroutine node
    class ReturnNode : public Node {
    public:
      ReturnNode( uint edges, Node *cntrl, Node *i_o, Node *memory,
                  Node *retadr, Node *frameptr );
      virtual int Opcode() const;
      virtual bool is_CFG() const { return true; }

Page 37: JVM code reading -- C2

Compiler data structures

    JVMState
      JVMState*      _caller     // for scope chains
      uint           _depth, _locoff, _stkoff, _monoff,
      uint           _scloff     // offset of scalar objs
      uint           _endoff
      uint           _sp
      int            _bci
      ReexecuteState _reexecute
      ciMethod*      _method
      SafePointNode* _map

Page 38: JVM code reading -- C2

Compiler data structures

    class Type {
    public:
      enum TYPES {
        Bad = 0, Control,
        Top,
        Int, Long, Half, NarrowOop,
        Tuple, Array,
        AnyPtr, RawPtr, OopPtr, InstPtr, AryPtr, KlassPtr,
        Function, Abio, Return_Address, Memory,
        FloatTop, FloatCon, FloatBot,
        DoubleTop, DoubleCon, DoubleBot,
        Bottom, lasttype
      };
    private:
      const Type *_dual;
    protected:
      const TYPES _base;

Page 39: JVM code reading -- C2

Compiler data structures

    class Type {
      :
    public:
      TYPES base();
      static const Type *make(enum TYPES);
      static int cmp(Type*, Type*);
      int higher_equal( Type *t)
      const Type *meet(Type *t);
      virtual const Type *widen(Type *old, Type* limit)
      virtual const Type *narrow(Type *old)

Page 40: JVM code reading -- C2

Compiler data structures

    class Dict;
    class Type;
    class   TypeD;
    class   TypeF;
    class   TypeInt;
    class   TypeLong;
    class   TypeNarrowOop;
    class   TypeAry;
    class   TypeTuple;
    class   TypePtr;
    class     TypeRawPtr;
    class     TypeOopPtr;
    class       TypeInstPtr;
    class       TypeAryPtr;
    class       TypeKlassPtr;

Page 41: JVM code reading -- C2

Compiler data structures

    Phase
      Compile
      GraphKit
      PhaseCFG
      PhaseBlockLayout
      PhaseCoalesce
        PhaseAggressiveCoalesce
        PhaseConservativeCoalesce
      PhaseIFG
      PhaseLive
      PhaseMacroExpand
      PhaseRegAlloc
        PhaseChaitin
      PhaseRemoveUseless
      PhaseTransform
        PhaseIdealLoop
        Matcher
        PhaseValues
          PhaseGVN
            PhaseIterGVN
              PhaseCCP
        PhasePeephole
      PhaseStringOpts

Page 42: JVM code reading -- C2

Compiler data structures

    class Phase : public StackObj {
    public:
      enum PhaseNumber { Compiler, Parser, Remove_Useless, ... }
    protected:
      enum PhaseNumber _pnum;
    public:
      Compile * C;
    }

Page 43: JVM code reading -- C2

Compiler data structures

    class Compile : public Phase {
      const int        _compile_id;
      ciMethod*        _method;
      int              _entry_bci;
      const TypeFunc*  _tf;
      InlineTree*      _ilt;
      Arena            _comp_arena;
      ConnectionGraph* _congraph;
      uint             _unique;
      Arena            _node_arena;
      RootNode*        _root;
      Node*            _top;
      :
    }

Page 44: JVM code reading -- C2

Compiler data structures

    class Compile : public Phase {
      :
      PhaseGVN*         _initial_gvn;
      Unique_Node_List  _for_igvn;
      WarmCallInfo*     _warm_calls;
      PhaseCFG*         _cfg;
      Matcher*          _matcher;
      PhaseRegAlloc*    _regalloc;
      OopMapSet*        _oop_map_set;
      :
    }

Page 45: JVM code reading -- C2

Compiler data structures

    class PhaseTransform : public Phase {
    protected:
      Arena*     _arena;
      Node_Array _nodes;
      Type_Array _types;
      ConINode*  _icons[...];
      ConLNode*  _lcons[...];
      ConNode*   _zcons[...];
      :
    }

Page 46: JVM code reading -- C2

Compiler data structures

    class PhaseTransform : public Phase {
    public:
      const Type* type(const Node* n) const;
      const Type* type_or_null(const Node* n) const;
      void set_type(const Node* n, const Type* t);
      void set_type_bottom(const Node* n);
      void ensure_type_or_null(const Node* n);
      ConNode* makecon(const Type* t);
      ConINode* intcon(jint i);
      ConLNode* longcon(jlong l);
      virtual Node *transform(Node *) = 0;
      :
    }

Page 47: JVM code reading -- C2

Compiler data structures

Adds the ability to manage values in a table.

    class PhaseValues : public PhaseTransform {
    protected:
      NodeHash       _table; // for value-numbering
    public:
      bool   hash_delete(Node *n);
      bool   hash_insert(Node *n);
      Node  *hash_find_insert(Node* n);
      Node  *hash_find(Node* n);
      :
    }

Page 48: JVM code reading -- C2

Compiler data structures

Local, pessimistic GVN-style optimization.

    class PhaseGVN : public PhaseValues {
    public:
      Node  *transform(Node *n);
      Node  *transform_no_reclaim(Node *n);
      :
    }

Page 49: JVM code reading -- C2

Compiler data structures

Iterative local, pessimistic GVN-style optimization combined with ideal transformations.

    class PhaseIterGVN : public PhaseGVN {
    private:
      bool _delay_transform;
      virtual Node *transform_old(Node *a_node);
      void subsume_node(Node *old, Node *nn);
    protected:
      virtual Node *transform(Node *a_node);
      void init_worklist(Node *a_root);
      virtual const Type* saturate(Type*, Type*, Type*)
    public:
      Unique_Node_List _worklist;
      void optimize();
      :
    }

Page 50: JVM code reading -- C2

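Parse

The first pass identifies the blocks. The second pass visits each block and processes the bytecodes in it: it creates objects of Node subclasses, creates and updates JVMState, and optimizes along the way. So that values propagate well, blocks are visited in an order that ensures, as far as possible, that the blocks flowing into a block have already been parsed.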

Page 51: JVM code reading -- C2

Parse

    #0 Parse::do_one_bytecode()
    #1 Parse::do_one_block()
    #2 Parse::do_all_blocks()
    #3 Parse::Parse(JVMState*, ciMethod*, float) ()
    #4 ParseGenerator::generate(JVMState*)
    #5 Compile::Compile(ciEnv*, C2Compiler*, ciMethod*, int, bool, bool)

Page 52: JVM code reading -- C2

Parse do_one_bytecode

    switch (bc()) {
    case Bytecodes::_nop:
      // do nothing
      break;
    case Bytecodes::_lconst_0:
      push_pair(longcon(0));
      break;
    :
    case Bytecodes::_iconst_5: push(intcon( 5)); break;
    case Bytecodes::_bipush:   push(intcon(iter().get_constant_u1())); break;
    case Bytecodes::_sipush:   push(intcon(iter().get_constant_u2())); break;

There are also static functions, such as makecon and intcon, that return nodes representing constants.

Page 53: JVM code reading -- C2

Parse do_one_bytecode

    case Bytecodes::_ldc:
    case Bytecodes::_ldc_w:
    case Bytecodes::_ldc2_w:
      // If the constant is unresolved, run this BC once in the interpreter.
      {
        ciConstant constant = iter().get_constant();
        if (constant.basic_type() == T_OBJECT &&
            !constant.as_object()->is_loaded()) {
          int index = iter().get_constant_pool_index();

Page 54: JVM code reading -- C2

Parse do_one_bytecode

    case Bytecodes::_aload_0:
      push( local(0) );
      break;
    :

    case Bytecodes::_aload:
      push( local(iter().get_index()) );
      break;

push and local ultimately change the state of the JVMState and the SafePointNode.

iter() can be used to fetch the operands of a bytecode.

Page 55: JVM code reading -- C2

Parse do_one_bytecode

    case Bytecodes::_fstore_0:
    case Bytecodes::_istore_0:
    case Bytecodes::_astore_0:
      set_local( 0, pop() );
      break;

    :
    case Bytecodes::_fstore:
    case Bytecodes::_istore:
    case Bytecodes::_astore:
      set_local( iter().get_index(), pop() );
      break;

Page 56: JVM code reading -- C2

Parse do_one_bytecode

    case Bytecodes::_pop:  _sp -= 1;   break;
    case Bytecodes::_pop2: _sp -= 2;   break;
    case Bytecodes::_swap:
      a = pop();
      b = pop();
      push(a);
      push(b);
      break;
    case Bytecodes::_dup:
      a = pop();
      push(a);
      push(a);
      break;
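During parsing the operand stack holds nodes rather than runtime values, so pop, swap, and dup only shuffle references. A small sketch of that abstract stack, with a hypothetical `StackSketch` class and integer node ids standing in for `Node*`, could look like this:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical abstract operand stack: slots hold node ids, not values.
class StackSketch {
public:
    explicit StackSketch(std::size_t max_stack) : slots_(max_stack, -1), sp_(0) {}

    void push(int node) { assert(sp_ < slots_.size()); slots_[sp_++] = node; }
    int pop() { assert(sp_ > 0); return slots_[--sp_]; }

    // _pop and _pop2 only move the stack pointer.
    void discard(std::size_t n) { assert(n <= sp_); sp_ -= n; }

    // _swap: exchange the two topmost entries.
    void swap_top() { int a = pop(); int b = pop(); push(a); push(b); }

    // _dup: duplicate the topmost entry.
    void dup() { int a = pop(); push(a); push(a); }

    std::size_t sp() const { return sp_; }

    // Entry at the given depth below the top (0 is the top itself).
    int peek(std::size_t depth_from_top) const { return slots_[sp_ - 1 - depth_from_top]; }

private:
    std::vector<int> slots_;
    std::size_t sp_;
};
```

No node is created by any of these operations, which is why stack-shuffling bytecodes leave no trace in the ideal graph.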

Page 57: JVM code reading -- C2

Parse do_one_bytecode

    case Bytecodes::_baload: array_load(T_BYTE);   break;
    case Bytecodes::_caload: array_load(T_CHAR);   break;
    case Bytecodes::_iaload: array_load(T_INT);    break;
    case Bytecodes::_saload: array_load(T_SHORT);  break;
    case Bytecodes::_faload: array_load(T_FLOAT);  break;
    case Bytecodes::_aaload: array_load(T_OBJECT); break;
    case Bytecodes::_laload: {
      a = array_addressing(T_LONG, 0);
      if (stopped())  return;     // guaranteed null or range check
      _sp -= 2;                   // Pop array and index
      push_pair( make_load(control(), a, TypeLong::LONG, T_LONG, TypeAryPtr::LONGS));
      break;
    }

Page 58: JVM code reading -- C2

Parse do_one_bytecode

    case Bytecodes::_bastore: array_store(T_BYTE);  break;
    case Bytecodes::_castore: array_store(T_CHAR);  break;
    case Bytecodes::_iastore: array_store(T_INT);   break;
    case Bytecodes::_sastore: array_store(T_SHORT); break;
    case Bytecodes::_fastore: array_store(T_FLOAT); break;
    case Bytecodes::_aastore: {
      d = array_addressing(T_OBJECT, 1);
      if (stopped())  return;     // guaranteed null or range check
      array_store_check();
      c = pop();                  // Oop to store
      b = pop();                  // index (already used)
      a = pop();                  // the array itself
      const TypeOopPtr* elemtype  = _gvn.type(a)->is_aryptr()->elem()->make_oopptr();
      const TypeAryPtr* adr_type = TypeAryPtr::OOPS;
      Node* store = store_oop_to_array(control(), a, d, adr_type, c, elemtype, T_OBJECT);

Page 59: JVM code reading -- C2

Parse do_one_bytecode

    case Bytecodes::_getfield:
      do_getfield();
      break;

    case Bytecodes::_getstatic:
      do_getstatic();
      break;

    case Bytecodes::_putfield:
      do_putfield();
      break;

    case Bytecodes::_putstatic:
      do_putstatic();
      break;

Page 60: JVM code reading -- C2

Parse do_one_bytecode

    // implementation of _get* and _put* bytecodes
    void do_getstatic() { do_field_access(true,  false); }
    void do_getfield () { do_field_access(true,  true); }
    void do_putstatic() { do_field_access(false, false); }
    void do_putfield () { do_field_access(false, true); }

Page 61: JVM code reading -- C2

Parse do_one_bytecode

    Parse::do_field_access
      Parse::do_get_xxx(Node* obj, ciField* field, bool is_field)
        Node *adr = basic_plus_adr(obj, obj, offset);
        :
        Node* ld = make_load(NULL, adr, type, bt, adr_type, is_vol);

    Node* GraphKit::basic_plus_adr(Node* base, Node* ptr, Node* offset) {
      // short-circuit a common case
      if (offset == intcon(0))  return ptr;
      return _gvn.transform( new (C, 4) AddPNode(base, ptr, offset) );
    }

Page 62: JVM code reading -- C2

Parse do_one_bytecode

    // factory methods in "int adr_idx"
    Node* GraphKit::make_load(Node* ctl, Node* adr, const Type* t, BasicType bt,
                              int adr_idx, bool require_atomic_access) {
      Node* mem = memory(adr_idx);
      Node* ld;
      if (require_atomic_access && bt == T_LONG) {
        ld = LoadLNode::make_atomic(C, ctl, mem, adr, adr_type, t);
      } else {
        ld = LoadNode::make(_gvn, ctl, mem, adr, adr_type, t, bt);
      }
      return _gvn.transform(ld);
    }

Page 63: JVM code reading -- C2

Parse do_one_bytecode

    Node* GraphKit::memory(uint alias_idx) {
      MergeMemNode* mem = merged_memory();
      Node* p = mem->memory_at(alias_idx);
      _gvn.set_type(p, Type::MEMORY);  // must be mapped
      return p;
    }

Page 64: JVM code reading -- C2

Parse do_one_bytecode

    _iadd
      b = pop(), a = pop()
      push(_gvn.transform( new (C, 3) AddINode(a,b)))

    // GraphKit::pop()
    Node* pop() { ..; return _map->stack(_map->_jvms,--_sp); }
    // SafePointNode::stack
    Node *stack(JVMState* jvms, uint idx) const {
      return in(jvms->stkoff() + idx);
    }

Page 65: JVM code reading -- C2

Parse do_one_bytecode

    case Bytecodes::_iinc:        // Increment local
      i = iter().get_index();     // Get local index
      set_local( i, _gvn.transform(
          new (C, 3) AddINode(
              _gvn.intcon(iter().get_iinc_con()), local(i) ) ) );
      break;

Page 66: JVM code reading -- C2

Parse do_one_bytecode

    _goto, _goto_w
      int target_bci = (bc() == Bytecodes::_goto) ?
          iter().get_dest() : iter().get_far_dest();
      // If this is a backwards branch in the bytecodes, add Safepoint
      maybe_add_safepoint(target_bci);
      // Update method data
      profile_taken_branch(target_bci);
      // Add loop predicate if it goes to a loop
      if (should_add_predicate(target_bci)){
        add_predicate();
      }
      // Merge the current control into the target basic block
      merge(target_bci);
      ...

Page 67: JVM code reading -- C2

Parse do_one_bytecode

    _goto, _goto_w
      ...
      // See if we can get some profile data and hand it off to the next block
      Block *target_block = block()->successor_for_bci(target_bci);
      if (target_block->pred_count() != 1)  break;
      ciMethodData* methodData = method()->method_data();
      if (!methodData->is_mature())  break;
      ciProfileData* data = methodData->bci_to_data(bci());
      assert( data->is_JumpData(), "" );
      int taken = ((ciJumpData*)data)->taken();
      taken = method()->scale_count(taken);
      target_block->set_count(taken);
      break;

Page 68: JVM code reading -- C2

Parse do_one_bytecode

    case _ifnull:    btest = BoolTest::eq; goto handle_if_null;
    case _ifnonnull: btest = BoolTest::ne; goto handle_if_null;
    handle_if_null:
      // If this is a backwards branch in the bytecodes, add Safepoint
      maybe_add_safepoint(iter().get_dest());
      a = null();
      b = pop();
      c = _gvn.transform( new (C, 3) CmpPNode(b, a) );
      do_ifnull(btest, c);
      break;

Page 69: JVM code reading -- C2

Parse do_one_bytecode

    case _if_acmpeq: btest = BoolTest::eq; goto handle_if_acmp;
    case _if_acmpne: btest = BoolTest::ne; goto handle_if_acmp;
    handle_if_acmp:
      // If this is a backwards branch in the bytecodes, add Safepoint
      maybe_add_safepoint(iter().get_dest());
      a = pop();
      b = pop();
      c = _gvn.transform( new (C, 3) CmpPNode(b, a) );
      do_if(btest, c);
      break;

Page 70: JVM code reading -- C2

Parse do_one_bytecode

    case Bytecodes::_tableswitch:
      do_tableswitch();
      break;

    case Bytecodes::_lookupswitch:
      do_lookupswitch();
      break;

Page 71: JVM code reading -- C2

Parse do_one_bytecode

    case Bytecodes::_invokestatic:
    case Bytecodes::_invokedynamic:
    case Bytecodes::_invokespecial:
    case Bytecodes::_invokevirtual:
    case Bytecodes::_invokeinterface:
      do_call();
      break;
    case Bytecodes::_checkcast:
      do_checkcast();
      break;
    case Bytecodes::_instanceof:
      do_instanceof();
      break;

Page 72: JVM code reading -- C2

Parse do_one_bytecode

    public class Call {
      public static void main(String[] args) {
        Call c = new Call();
        for (int i = 0; i < 100000; i++) {
          c.doit();
        }
      }
      int doit() {
        return getClass().hashCode();
      }
    }

getClass is inlined and becomes a LoadKlass followed by a memory access. hashCode becomes a static call.

Page 73: JVM code reading -- C2

Parse do_one_bytecode

    case Bytecodes::_anewarray:
      do_anewarray();
      break;
    case Bytecodes::_newarray:
      do_newarray((BasicType)iter().get_index());
      break;
    case Bytecodes::_multianewarray:
      do_multianewarray();
      break;
    case Bytecodes::_new:
      do_new();
      break;

Page 74: JVM code reading -- C2

Parse do_one_bytecode

    case Bytecodes::_jsr:
    case Bytecodes::_jsr_w:
      do_jsr();
      break;

    case Bytecodes::_ret:
      do_ret();
      break;

Page 75: JVM code reading -- C2

Parse do_one_bytecode

    case Bytecodes::_monitorenter:
      do_monitor_enter();
      break;

    case Bytecodes::_monitorexit:
      do_monitor_exit();
      break;

Page 76: JVM code reading -- C2

Optimize

    PhaseIterGVN igvn(initial_gvn)
    igvn.optimize()
    ConnectionGraph::do_analysis(this, &igvn)   // EscapeAnalysis
    igvn.optimize()
    PhaseIdealLoop ideal_loop(igvn, true)
    PhaseIdealLoop ideal_loop(igvn, ...)
    PhaseIdealLoop ideal_loop(igvn, ...)
    PhaseCCP ccp( &igvn )
    PhaseMacroExpand mex(igvn)


Page 78: JVM code reading -- C2

Optimize

    PhaseIterGVN igvn(initial_gvn)
    igvn.optimize()
      // Take a node off the worklist and transform it.
      // If the node changed, update the edge information
      // and put its users on the worklist.
      while( _worklist.size() ) {
        Node *n  = _worklist.pop();
        if (++loop_count >= K * C->unique()) { // check the bound
          ...
        }
        if (n->outcnt() != 0) {
          Node *nn = transform_old(n);
        } else if (!n->is_top()) {
          remove_dead_node(n);
        }
      }
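The worklist loop above is a fixed-point iteration: a node is revisited only when one of its inputs changed. The following sketch shows the same shape on a toy graph where the only transformation is folding an addition of two constants. `WNode` and `optimize` are hypothetical names, and nodes are identified by their index in a vector.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical node: either a known constant or the sum of two other nodes.
struct WNode {
    bool is_const;
    int value;               // meaningful only when is_const is true
    int in1, in2;            // input node indices (ignored for constants)
    std::vector<int> users;  // def-use edges: nodes that read this one
};

// Folds every addition whose inputs are both constants.
// Returns the number of nodes that were folded.
int optimize(std::vector<WNode>& g) {
    std::vector<int> worklist;
    for (std::size_t i = 0; i < g.size(); ++i) worklist.push_back(static_cast<int>(i));
    int folds = 0;
    while (!worklist.empty()) {
        int n = worklist.back();
        worklist.pop_back();
        WNode& node = g[n];
        if (node.is_const) continue;  // nothing left to improve
        const WNode& a = g[node.in1];
        const WNode& b = g[node.in2];
        if (a.is_const && b.is_const) {
            node.is_const = true;
            node.value = a.value + b.value;
            ++folds;
            // The node changed, so its users may now be foldable as well.
            for (int u : node.users) worklist.push_back(u);
        }
    }
    return folds;
}
```

The order in which nodes come off the worklist does not affect the result here, only how many times a node is visited before it settles.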

Page 79: JVM code reading -- C2

Optimize

    Node *PhaseIterGVN::transform_old( Node *n )

Points to note: the can_reshape argument passed to Ideal is true, and a node that evaluates to a constant is replaced through subsume_node, which redirects its users to the new node.


Page 81: JVM code reading -- C2

Optimize

    ConnectionGraph::do_analysis(this, &igvn)   // EscapeAnalysis

      if (congraph->compute_escape()) {
        // There are non escaping objects.
        C->set_congraph(congraph);
      }

LockNode and UnlockNode consult congraph; if the locked object is non-escaping, the lock operation is removed. Locking and unlocking a local object is pointless.

Page 82: JVM code reading -- C2

Optimize

    ConnectionGraph::compute_escape()

Returns false if there is no Java object allocation.
Puts AddP, MergeMem, and similar nodes on the worklist, together with their outputs.

Examines each node on the worklist in detail.

Registers the nodes in GrowableArray<PointsToNode> _nodes, classifies them as GlobalEscape, ArgEscape, or NoEscape, and propagates the state to reachable nodes.

    // comment in escape.hpp
    // flags: PrintEscapeAnalysis PrintEliminateAllocations

Page 83: JVM code reading -- C2

Optimize

    class ConnectionGraph: public ResourceObj

      // escape state of a node
      PointsToNode::EscapeState escape_state(Node *n);

      // other information we have collected
      bool is_scalar_replaceable(Node *n) {
        if (_collecting || (n->_idx >= nodes_size()))
          return false;
        PointsToNode* ptn = ptnode_adr(n->_idx);
        return ptn->escape_state() == PointsToNode::NoEscape &&
               ptn->_scalar_replaceable;
      }


Page 85: JVM code reading -- C2

Optimize

    PhaseIdealLoop::PhaseIdealLoop(PhaseIterGVN &igvn, bool do_split_ifs)

      build_and_optimize(do_split_ifs);

Page 86: JVM code reading -- C2

Optimize

    // Convert to counted loops where possible
    PhaseIdealLoop::is_counted_loop( Node *x, IdealLoopTree *loop )

PhaseIdealLoop::is_counted_loop attempts the conversion to a CountedLoop. counted_loop is also called recursively on the child loops.

    void PhaseIdealLoop::do_peeling( IdealLoopTree *loop, Node_List &old_new )
    // Peels off the first iteration. Illustrated in loopTransform.cpp.

    void PhaseIdealLoop::do_unroll( IdealLoopTree *loop, Node_List &old_new, bool adjust_min_trip )
    void PhaseIdealLoop::do_maximally_unroll( IdealLoopTree *loop, Node_List &old_new )

    // Eliminate range-checks and other trip-counter vs loop-invariant tests.
    void PhaseIdealLoop::do_range_check( IdealLoopTree *loop, Node_List &old_new )
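What peeling and unrolling do to a loop is easiest to see at source level. The three functions below compute the same sum; they are hand-written illustrations of the transformations, not output of C2, and the names are made up for this sketch. The unroll factor of 2 is an arbitrary choice.

```cpp
#include <cassert>

// The loop as written: sum of 0 .. n-1.
int sum_loop(int n) {
    int sm = 0;
    for (int i = 0; i < n; i++) sm += i;
    return sm;
}

// Peeling: the first iteration is pulled out in front of the loop.
int sum_peeled(int n) {
    int sm = 0;
    int i = 0;
    if (i < n) {  // peeled copy of the loop body and its exit test
        sm += i;
        i++;
    }
    for (; i < n; i++) sm += i;
    return sm;
}

// Unrolling by 2: the main loop does two iterations per trip, and a
// cleanup loop handles the leftover iteration when n is odd.
int sum_unrolled(int n) {
    int sm = 0;
    int i = 0;
    for (; i + 1 < n; i += 2) {
        sm += i;
        sm += i + 1;
    }
    for (; i < n; i++) sm += i;
    return sm;
}
```

Knowing the trip count in advance is what makes the unrolled form safe to generate, which is why the conversion to a counted loop comes first.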

Page 87: JVM code reading -- C2

Optimize -- PhaseIdealLoop

    static int doit() {
      int sm = 0;
      for (int i = 0; i < 100; i++) sm += i;
      return sm;
    }

After Parsing

Page 88: JVM code reading -- C2

Optimize -- PhaseIdealLoop

    static int doit() {
      int sm = 0;
      for (int i = 0; i < 100; i++) sm += i;
      return sm;
    }

After CountedLoop

Page 89: JVM code reading -- C2

Optimize -- PhaseIdealLoop

Optimization Finished

Unrolling?


Page 91: JVM code reading -- C2

Optimize

    PhaseCCP ccp( &igvn )
    ccp.do_transform
      C->set_root( transform(C->root())->as_Root() );

Replaces whatever can be replaced with a constant.


Page 93: JVM code reading -- C2

Optimize

    PhaseMacroExpand mex(igvn)
    mex.expand_macro_nodes()
      ...
      eliminate_allocate_node
        scalar_replacement

Processes the results of escape analysis. Converts allocations into stack operations?

Page 94: JVM code reading -- C2

Code_Gen

    Matcher m(proj_list)
    m.match()
    PhaseCFG cfg(node_arena(), root())
    cfg.Dominators()
    cfg.Estimate_Block_Frequency()
    cfg.GlobalCodeMotion(m,unique(),proj)
    PhaseChaitin regalloc(unique, cfg, m)
    regalloc->Register_Allocate()
    PhaseBlockLayout
    PhasePeephole
    Output

Page 95: JVM code reading -- C2

Matcher

    #0 in addI_eRegNode::Expand(State*, Node_List&, Node*) ()
    #1 in Matcher::ReduceInst(State*, int, Node*&) ()
    #2 in Matcher::match_tree(Node const*) ()
    #3 in Matcher::xform(Node*, int) ()
    #4 in Matcher::match() ()
    #5 in Compile::Code_Gen() ()
    #6 in Compile::Compile(ciEnv*, C2Compiler*, ciMethod*, int, bool, bool) ()

Page 96: JVM code reading -- C2

Matcher

    // x86_32.ad, an ADLC file
    instruct addI_eReg(eRegI dst, eRegI src, eFlagsReg cr) %{
      match(Set dst (AddI dst src));
      effect(KILL cr);

      size(2);
      format %{ "ADD    $dst,$src" %}
      opcode(0x03);
      ...
    %}

Page 97: JVM code reading -- C2

PhaseCFG

    PhaseCFG::build_cfg()

Builds the CFG (Control Flow Graph) from the RegionNodes and the StartNode, so that the subsequent, more machine-oriented operations can be carried out.

Page 98: JVM code reading -- C2

PhaseCFG

    class PhaseCFG : public Phase
      + _num_blocks   : uint
      + _blocks       : RootNode*
      + _bbs          : Block_Array
      + _broot        : Block*
      + _rpo_ctr      : uint
      + _root_loop    :
      + _node_latency : GrowableArray<uint>*

Page 99: JVM code reading -- C2

PhaseCFG

    class Block : public CFGElement
      + _nodes     : Node_List
      + _succs     : Block_Array
      + _num_succs : uint
      + _pre_order : uint     // Pre-order DFS #

      + _dom_depth : uint
      + _idom      : Block*

      + _loop      : CFGLoop*
      + _rpo       : uint
      :  // reg pressure, etc

Page 100: JVM code reading -- C2

PhaseCFG

    PhaseCFG::Dominators()
    // Lengauer & Tarjan algorithm
    // Sets _dom_depth and _idom of each Block
    // This is the data that code motion is based on

    PhaseCFG::Estimate_Block_Frequency()
    // Computes block frequencies from the probabilities of the IfNodes
    // and stores them in the _freq field of Block's parent class
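To show what `_idom` ends up holding, here is a sketch that computes immediate dominators for a small CFG. Note that it deliberately uses the simple iterative set-intersection method instead of the Lengauer & Tarjan algorithm named above, because it is much shorter; the result is the same, only the running time is worse. The function name and the predecessor-list input format are assumptions made for this example, and every block is assumed to be reachable from block 0.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// preds[b] lists the predecessors of block b; block 0 is the entry.
// Returns idom[b] for each block, with idom[0] == 0 by convention.
std::vector<int> immediate_dominators(const std::vector<std::vector<int>>& preds) {
    const std::size_t n = preds.size();
    if (n == 0) return {};
    // dom[b][d] == true means block d dominates block b.
    // Start from "everything dominates everything" except at the entry.
    std::vector<std::vector<bool>> dom(n, std::vector<bool>(n, true));
    dom[0].assign(n, false);
    dom[0][0] = true;
    bool changed = true;
    while (changed) {
        changed = false;
        for (std::size_t b = 1; b < n; ++b) {
            // dom(b) = {b} united with the intersection of dom(p) over all preds p.
            std::vector<bool> next(n, true);
            for (int p : preds[b])
                for (std::size_t d = 0; d < n; ++d)
                    next[d] = next[d] && dom[p][d];
            next[b] = true;
            if (next != dom[b]) {
                dom[b] = next;
                changed = true;
            }
        }
    }
    // The strict dominators of b form a chain; the immediate dominator is
    // the deepest one, i.e. the one that itself has the most dominators.
    std::vector<int> idom(n, 0);
    for (std::size_t b = 1; b < n; ++b) {
        std::size_t best_count = 0;
        for (std::size_t d = 0; d < n; ++d) {
            if (d == b || !dom[b][d]) continue;
            std::size_t count = 0;
            for (std::size_t k = 0; k < n; ++k) count += dom[d][k] ? 1 : 0;
            if (count > best_count) {
                best_count = count;
                idom[b] = static_cast<int>(d);
            }
        }
    }
    return idom;
}
```

For a diamond-shaped CFG the join block's immediate dominator is the block above the branch, which is the earliest place code motion may hoist a computation used at the join.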

Page 101: JVM code reading -- C2

PhaseCFG

    PhaseCFG::GlobalCodeMotion
      schedule_early
      schedule_late

Page 102: JVM code reading -- C2

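Register allocation

Briggs-Chaitin graph-coloring register allocation

The interference graph of the variables' live ranges is colored with as many colors as there are registers; if no coloring can be found, spills are added and the attempt is repeated. C2 uses an improved version of this scheme.

I have not read this part yet.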
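The basic coloring idea can be sketched with a plain greedy pass. This is not the Briggs-Chaitin algorithm (there is no simplify/select ordering, no coalescing, and no retry after inserting spill code); it only illustrates how interference limits register choice and when a live range has to be spilled. The function name, the adjacency-list input, and `kSpilled` are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

const int kSpilled = -1;  // hypothetical marker for "no register available"

// interferes[a] lists the live ranges that are live at the same time as a.
// Returns one register number per live range, or kSpilled when none is free.
std::vector<int> color(const std::vector<std::vector<int>>& interferes, int num_regs) {
    std::vector<int> reg(interferes.size(), kSpilled);
    for (std::size_t lr = 0; lr < interferes.size(); ++lr) {
        // Mark the registers already taken by interfering live ranges.
        std::vector<bool> used(static_cast<std::size_t>(num_regs), false);
        for (int other : interferes[lr])
            if (reg[other] != kSpilled) used[static_cast<std::size_t>(reg[other])] = true;
        // Take the lowest-numbered free register, if there is one.
        for (int r = 0; r < num_regs; ++r) {
            if (!used[static_cast<std::size_t>(r)]) {
                reg[lr] = r;
                break;
            }
        }
    }
    return reg;
}
```

Three mutually interfering live ranges need three registers; with only two, the sketch marks the third as spilled, which is the point where a real allocator would insert spill code and try again.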

Page 103: JVM code reading -- C2

Output

Replace the StartNode with a MachPrologNode
Set up the unverified entry point
Place a MachEpilogNode before each return
ScheduleAndBundle()
BuildOopMap()
Fill_buffer()
    Prepare a CodeBuffer
    for (i=0; i < _cfg->num_blocks; i++)
      for (j = 0; j < last_inst; j++)
        …
        n->emit(*cb, _regalloc)

-XX:+PrintOptoAssembly to dump instructions

https://gist.github.com/1376858

Page 104: JVM code reading -- C2

Appendix

[Figure: "bytecode size vs arena use". x-axis: bytecode size (0 to 7000); y-axis: bytes (0 to 14000000); series: comp, node, res]

Page 105: JVM code reading -- C2

[Figure: "compiler memory use". x-axis: unique (number of nodes), 0 to 30000; y-axis: bytes (0 to 14000000); series: comp_arena, node_arena, res_area]