| Prepared by: | Thomas Rodriguez (never) on Mon Aug 25 15:16:37 PDT 2008 |
|---|---|
| Workspace: | /export/ws/rpo |
| Compare against: | /export/ws/baseline |
| Summary of changes: | 1334 lines changed: 917 ins; 234 del; 183 mod; 29413 unchg |
| Patch of changes: | 6384206.patch |
| Author comments: |
6384206: Phis which are later unneeded are impairing our ability to inline based on static types
Summary:
Reviewed-by:

This changes C2's parser into a real reverse post order (RPO) parser. Previously C2 used a pre order which was close to RPO but not quite the same. RPO parsing helps guarantee that, as often as possible, all predecessors of a block are parsed before we start parsing it. This helps code fold up more quickly and feeds more exact types into inlining.

There's also a new lightweight bytecode analysis during ciTypeFlow that identifies phis at loop heads which aren't needed. This also helps with our inlining decisions. It's really a band-aid until we introduce real post-parse inlining.

Additionally, ciTypeFlow now performs real loop detection, which is used to drive loop head cloning in ciTypeFlow. Previously the cloning was based on bcis, which was unreliable, and the cloning could introduce irreducible loops. I added a new assert in loop opts to check that we don't create irreducible loops during optimization, but unfortunately it currently fails periodically because of the split_flow_path optimization in cfgnode.cpp. Until I can do something about that I'm going to leave the assert commented out.

I also modified CTW so that the CompileTheWorldStartAt flag only controls when compilation begins, which helps with reproducing bugs under CTW. If you start loading later then it's often the case that you don't end up with the same inlining decisions in other compiles. I also changed it to make the code not entrant after the compile, which helps keep the code cache from filling up. After this change only 3 of the jar files managed to fill up the code cache during CTW.

I added some LogCompilation output to track the loops we encounter during compilation. The current output looks like this:

    <loop_tree>
    <loop idx='7598' >
      <loop idx='8114' inner_loop='1' pre_loop='7614' >
      </loop>
      <loop idx='9506' inner_loop='1' main_loop='9506' >
      </loop>
      <loop idx='8049' inner_loop='1' post_loop='7614' >
      </loop>
    </loop>
    <loop idx='8265' inner_loop='1' pre_loop='7630' >
    </loop>
    <loop idx='9118' inner_loop='1' main_loop='9118' >
    </loop>
    <loop idx='8189' inner_loop='1' post_loop='7630' >
    </loop>
    <loop idx='7601' inner_loop='1' >
    </loop>
    </loop_tree>

I haven't finished the performance results yet, though mostly they should be neutral and that's what I've seen so far. The main effect of this is to get static binding of call sites for which we previously relied on type profiling to get the right results. In all the interesting cases I've looked at, previously we generally got the right answer but now we consistently get the right answer, so it might make our performance more stable.

Tested with CTW, runThese and nsk on i586, amd64, sparc and sparcv9. |
| Bug id: | 6384206: Phis which are later unneeded are impairing our ability to inline based on static types |
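The author comments say that RPO parsing helps ensure a block's predecessors are parsed before the block itself. The following is a minimal, standalone sketch of that property, not code from this change: `Cfg`, `reverse_post_order`, `pre_order` and `late_predecessor_edges` are hypothetical names, and the plain depth-first pre order used for comparison is only an assumption standing in for C2's previous ordering, which the comments describe as "close to RPO but not quite the same".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CFG representation: succs[i] lists the successors of block i,
// and block 0 is the entry block.
using Cfg = std::vector<std::vector<int>>;

// Depth-first walk that records a block only after all of its successors
// have been visited, which yields a post order.
static void post_order_visit(const Cfg& succs, int block,
                             std::vector<bool>& visited,
                             std::vector<int>& order) {
  visited[block] = true;
  for (int s : succs[block]) {
    if (!visited[s]) post_order_visit(succs, s, visited, order);
  }
  order.push_back(block);
}

// Blocks reachable from the entry, in reverse post order.
std::vector<int> reverse_post_order(const Cfg& succs) {
  std::vector<bool> visited(succs.size(), false);
  std::vector<int> order;
  if (!succs.empty()) post_order_visit(succs, 0, visited, order);
  return std::vector<int>(order.rbegin(), order.rend());
}

// Depth-first walk that records a block when it is first reached (pre order).
static void pre_order_visit(const Cfg& succs, int block,
                            std::vector<bool>& visited,
                            std::vector<int>& order) {
  visited[block] = true;
  order.push_back(block);
  for (int s : succs[block]) {
    if (!visited[s]) pre_order_visit(succs, s, visited, order);
  }
}

// Blocks reachable from the entry, in depth-first pre order.
std::vector<int> pre_order(const Cfg& succs) {
  std::vector<bool> visited(succs.size(), false);
  std::vector<int> order;
  if (!succs.empty()) pre_order_visit(succs, 0, visited, order);
  return order;
}

// Counts edges pred -> block where pred does not come strictly before block
// in the given order, i.e. predecessors that would not have been parsed yet
// when the block is reached.
int late_predecessor_edges(const Cfg& succs, const std::vector<int>& order) {
  std::vector<int> position(succs.size(), -1);
  for (std::size_t i = 0; i < order.size(); ++i) {
    position[order[i]] = static_cast<int>(i);
  }
  int late = 0;
  for (int b : order) {
    for (int s : succs[b]) {
      if (position[b] >= position[s]) ++late;
    }
  }
  return late;
}
```

On a diamond-shaped graph such as `{{1, 2}, {3}, {3}, {}}`, the pre order reaches the merge block 3 before its predecessor 2, while the reverse post order visits both predecessors first. Once a loop is added, the back edge is the only predecessor edge that reverse post order cannot place ahead of its target.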
src/share/vm/ci/ciTypeFlow.hpp
241 lines changed: 219 ins; 7 del; 15 mod; 692 unchg
src/share/vm/ci/ciTypeFlow.cpp
740 lines changed: 545 ins; 109 del; 86 mod; 2274 unchg
src/share/vm/ci/ciMethodBlocks.hpp
4 lines changed: 2 ins; 1 del; 1 mod; 120 unchg
src/share/vm/ci/ciMethodBlocks.cpp
9 lines changed: 5 ins; 0 del; 4 mod; 389 unchg
src/share/vm/opto/parse.hpp
19 lines changed: 10 ins; 3 del; 6 mod; 546 unchg
src/share/vm/opto/parse1.cpp
165 lines changed: 58 ins; 60 del; 47 mod; 2059 unchg
src/share/vm/opto/compile.hpp
3 lines changed: 3 ins; 0 del; 0 mod; 732 unchg
src/share/vm/opto/compile.cpp
2 lines changed: 2 ins; 0 del; 0 mod; 2530 unchg
src/share/vm/opto/bytecodeInfo.cpp
37 lines changed: 0 ins; 37 del; 0 mod; 506 unchg
src/share/vm/opto/doCall.cpp
1 line changed: 0 ins; 0 del; 1 mod; 863 unchg
src/share/vm/opto/cfgnode.cpp
5 lines changed: 4 ins; 0 del; 1 mod; 2002 unchg
src/share/vm/opto/graphKit.cpp
2 lines changed: 0 ins; 0 del; 2 mod; 3175 unchg
src/share/vm/opto/loopTransform.cpp
2 lines changed: 2 ins; 0 del; 0 mod; 1731 unchg
src/share/vm/opto/loopnode.cpp
46 lines changed: 36 ins; 6 del; 4 mod; 2832 unchg
src/share/vm/opto/loopopts.cpp
4 lines changed: 4 ins; 0 del; 0 mod; 2711 unchg
src/share/vm/opto/memnode.cpp
4 lines changed: 4 ins; 0 del; 0 mod; 3902 unchg
src/share/vm/classfile/classLoader.cpp
48 lines changed: 21 ins; 11 del; 16 mod; 1247 unchg
src/share/vm/includeDB_compiler2
2 lines changed: 2 ins; 0 del; 0 mod; 1102 unchg
This code review page was prepared using /net/smite.sfbay/never/bin/hgwebrev (vers 23.12-hg-never).