1
UC Berkeley
Detecting Large-Scale System Problems by Mining Console Logs
Wei Xu (徐葳)* Ling Huang†
Armando Fox* David Patterson* Michael Jordan*
*UC Berkeley † Intel Labs Berkeley
Jan 18, 2009, IBM Research Beijing
2
Why console logs?
• Detecting problems in large scale Internet services often requires detailed instrumentation
• Instrumentation can be costly to insert & maintain
  • Often combine open-source building blocks that are not all instrumented
  • High code churn
• Can we use console logs in lieu of instrumentation?
+ Easy for developer, so nearly all software has them
– Imperfect: not originally intended for instrumentation
3
The ugly console log
HODIE NATUS EST RADICI FRATER
today unto the Root a brother is born.
* "that crazy Multics error message in Latin." http://www.multicians.org/hodie-natus-est.html
4
Result preview
200 nodes, >24 million lines of logs
Parse → Detect → Visualize
Abnormal log segments, summarized in a single-page visualization
• Fully automatic process without any manual input
5
Outline
• Key ideas and general methodology
• Methodology illustrated with Hadoop case study
  • Four step approach
  • Evaluation
• An extension: supporting online detection
  • Challenges
  • Two stage detection
  • Evaluation
• Related work and future directions
6
Our approach and contribution
Parsing → Feature Creation → Machine Learning → Visualization
• A general methodology for processing console logs automatically (SOSP’09)
• A novel method for online detection (ICDM ’09)
• Validation on real systems
• Working on production data from real companies
7
Key insights for analyzing logs
• The log contains the necessary information to create features
  • Identifiers
  • State variables
  • Correlations among messages (sequences)

NORMAL: receiving blk_1 → received blk_1
ERROR: receiving blk_2

• Console logs are inherently structured
  • Determined by the log printing statement
8
Outline
• Key ideas and general methodology
• Methodology illustrated with Hadoop case study
  • Four step approach
  • Evaluation
• An extension: supporting online detection
  • Challenges
  • Two stage detection
  • Evaluation
• Related work and future directions
9
Step 1: Parsing
• Free text → semi-structured text
• Basic idea

Log.info(“Receiving block ” + blockId);
Receiving block blk_1
→ Receiving block (.*) [blockId]
→ Type: Receiving block; Variables: blockId(String)=blk_1

• Non-trivial in object oriented languages
  • String representation of objects
  • Sub-classing
• Our solution
  • Type inference on source code
  • Pre-compute all possible message templates
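The parsing step above can be sketched as template matching: each log-printing statement yields a regular-expression template, and a runtime log line is matched against the precomputed set. A minimal sketch; the template list and function names are hypothetical, not the paper's actual implementation.

```python
import re

# Hypothetical templates precomputed from log-printing statements, e.g.
# Log.info("Receiving block " + blockId)  ->  "Receiving block (.*)"
TEMPLATES = [
    ("Received block", re.compile(r"Received block (\S+) of size (\d+)"),
     ["blockId", "size"]),
    ("Receiving block", re.compile(r"Receiving block (\S+)"), ["blockId"]),
]

def parse(line):
    """Return (message type, {variable: value}) for the first matching template."""
    for msg_type, pattern, names in TEMPLATES:
        m = pattern.match(line)
        if m:
            return msg_type, dict(zip(names, m.groups()))
    return None, {}

parse("Receiving block blk_1")  # -> ("Receiving block", {"blockId": "blk_1"})
```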
10
Parsing: Scale parsing with map-reduce
[Figure: time used (min) vs. number of nodes, 0–100 nodes]
24 Million lines of console logs= 203 nodes * 48 hours
11
Parsing: Language specific parsers
• Source code based parsers
  • Java (OO language)
  • C / C++ (macros and complex format strings)
  • Python (scripting/dynamic languages)
• Binary based parsers
  • Java byte code
  • Linux + Windows binaries using printf format strings
Step 2: Feature creation - Message count vector
• Identifiers are widely used in logs
  • Variables that identify objects manipulated by the program
  • e.g., file names, object keys, user ids
• Grouping by identifiers
  • Similar to execution traces
• Identifiers can be discovered automatically
12
receiving blk_1
receiving blk_1
received blk_1
received blk_1
receiving blk_2
received blk_2
receiving blk_2

Feature creation – Message count vector example

blk_1: Receiving blk_1, Receiving blk_1, Received blk_1, Received blk_1
blk_2: Receiving blk_2, Receiving blk_2, Received blk_2
13
• Numerical representation of these “traces”
• Similar to bag of words model in information retrieval

blk_1: 0 1 2 0 0 2 0 0 0 0 0 0 0 0 2 2
blk_2: 0 0 1 2 0 0 2 0 0 0 0 0 0 0 1 2
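The grouping-and-counting step can be sketched directly: group parsed (identifier, message type) pairs by identifier and count each message type, giving one vector per identifier. The data below is illustrative, not taken from the actual trace.

```python
from collections import Counter, defaultdict

# (identifier, message type) pairs as produced by the parsing step (illustrative)
parsed = [
    ("blk_1", "receiving"), ("blk_1", "received"), ("blk_1", "received"),
    ("blk_2", "receiving"), ("blk_2", "receiving"), ("blk_2", "received"),
]

msg_types = sorted({t for _, t in parsed})   # one dimension per message type
groups = defaultdict(Counter)
for ident, msg_type in parsed:
    groups[ident][msg_type] += 1

# One count vector per identifier, like a bag-of-words document vector
vectors = {ident: [cnt[t] for t in msg_types] for ident, cnt in groups.items()}
# msg_types == ["received", "receiving"]; vectors == {"blk_1": [2, 1], "blk_2": [1, 2]}
```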
Step 3: Machine learning – PCA anomaly detection
• Most of the vectors are normal
• Detecting abnormal vectors
• Principal Component Analysis (PCA) based detection
  • PCA captures normal patterns in these vectors
  • Based on correlations among dimensions of the vectors
14
NORMAL: receiving blk_1 → received blk_1
ERROR: receiving blk_2
0 1 2 0 0 2 0 0 0 0 0 0 0 0 2 2
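A minimal sketch of PCA-based detection on message count vectors, assuming NumPy: find the principal subspace of the (mostly normal) vectors, and flag vectors whose residual, the part lying outside that subspace, is unusually large. The simple mean-plus-3-sigma threshold here stands in for the Q-statistic threshold typically used with this method.

```python
import numpy as np

def pca_detect(X, k=1, alpha=3.0):
    """Flag rows of X that lie unusually far from the top-k principal subspace.
    Sketch only: a mean + alpha*std threshold stands in for the Q-statistic."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:k].T                        # top-k principal directions (columns)
    residual = Xc - Xc @ P @ P.T        # component outside the "normal" subspace
    spe = (residual ** 2).sum(axis=1)   # squared prediction error per vector
    return spe > spe.mean() + alpha * spe.std()

# Mostly normal vectors ("receiving" and "received" counts move together),
# plus one abnormal vector where the correlation is broken
X = np.array([[1, 1]] * 7 + [[2, 2]] * 7 + [[3, 3]] * 6 + [[3, 0]], dtype=float)
flags = pca_detect(X)                   # only the last vector is flagged
```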
15
Evaluation setup
• Experiment on Amazon’s EC2 cloud
  • 203 nodes × 48 hours
  • Running standard map-reduce jobs
  • ~24 million lines of console logs
  • ~575,000 HDFS blocks
• 575,000 vectors
  • ~680 distinct ones
• Manually labeled each distinct case
  • Normal/abnormal
  • Tried to learn why it is abnormal
  • For evaluation only
PCA detection results
16
Anomaly | Description | Actual | Detected
1 | Forgot to update namenode for deleted block | 4297 | 4297
2 | Write block exception then client gives up | 3225 | 3225
3 | Failed at beginning, no block written | 2950 | 2950
4 | Over-replicate-immediately-deleted | 2809 | 2788
5 | Received block that does not belong to any file | 1240 | 1228
6 | Redundant addStoredBlock request received | 953 | 953
7 | Trying to delete a block, but the block no longer exists on data node | 724 | 650
8 | Empty packet for block | 476 | 476
9 | Exception in receiveBlock for block | 89 | 89
10 | PendingReplicationMonitor timed out | 45 | 45
11 | Other anomalies | 108 | 107
Total anomalies | | 16916 | 16808
Normal blocks | | 558223 |

False Positives | Description | Count
1 | Normal background migration | 1397
2 | Multiple replicas (for task / jobdesc files) | 349
Total | | 1746
How can we make the results easy for operators to understand?
Step 4: Visualizing results with decision tree
17
[Figure: decision tree over message count vectors. Each node tests the count of one message type, e.g.
“writeBlock # received exception”,
“#: Got exception while serving # to #:#”,
“Unexpected error trying to delete block #. BlockInfo not found in volumeMap”,
“addStoredBlock request received for # on # size # But it does not belong to any file”,
“Starting thread to transfer block # to #”,
“# Verification succeeded for #”,
“Receiving block # src: # dest: #”;
threshold branches (0 vs. ≥1, ≤2 vs. ≥3) lead to OK or ERROR leaves.]
18
The methodology is general
• Surveyed a number of software apps
  • Linux kernel, OpenSSH
  • Apache, MySQL, Jetty
  • Cassandra, Nutch
• Our methods apply to virtually all of them
• Same methodology worked on multiple features / ML algorithms
  • Another feature: “state count vector”
  • Graph visualization
• WIP: Analyzing production logs at Google
19
Outline
• Key ideas and general methodology
• Methodology illustrated with Hadoop case study
  • Four step approach
  • Evaluation
• An extension: supporting online detection
  • Challenges
  • Two stage detection
  • Evaluation
• Related work and future directions
What makes it hard to do online detection?
Parsing → Feature Creation → PCA Detection → Visualization
Message count vector feature is based on sequences, NOT individual messages
20
Online detection: when to run detection?
• Cannot wait for the entire trace
  • Can last an arbitrarily long time
• How long do we have to wait?
  • Long enough to keep correlations
  • A wrong cut = a false positive
• Difficulties
  • No obvious boundaries
  • Inaccurate message ordering
  • Variations in session duration
21
receiving blk_1
received blk_1
reading blk_1
reading blk_1
deleting blk_1
deleted blk_1
receiving blk_1
received blk_1
Time
Frequent patterns help determine session boundaries
• Key insight: most messages/traces are normal
  • Strong patterns
• “Make common paths fast”
  • Tolerate noise
22
23
Two stage detection overview
Free text logs (200 nodes)
→ Parsing
→ Frequent pattern based filtering
   • Dominant cases: OK
   • Non-pattern cases → PCA Detection
      • Normal cases: OK
      • Real anomalies: ERROR
Stage 1 - Frequent patterns (1): Frequent event sets
24
receiving blk_1
received blk_1
reading blk_1
deleting blk_1
deleted blk_1
receiving blk_1
received blk_1
reading blk_1
error blk_1
(Time →)

1. Coarse cut by time
2. Find frequent item set
3. Refine time estimation
Repeat until all patterns are found; remaining events go to PCA Detection
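Stage 1 can be sketched as frequent item-set counting: take the per-identifier event sets produced by a coarse cut in time, and keep the sets that recur in enough sessions as the frequent patterns used for filtering. The data and the 50% support threshold below are illustrative.

```python
from collections import Counter

# Per-identifier event sets obtained from a coarse cut by time (illustrative)
sessions = [
    frozenset({"receiving", "received"}),
    frozenset({"receiving", "received"}),
    frozenset({"receiving", "received"}),
    frozenset({"deleting", "deleted"}),
    frozenset({"receiving", "error"}),
]

support = Counter(sessions)
min_support = 0.5  # fraction of sessions a set must appear in (illustrative)
patterns = {s for s, c in support.items() if c / len(sessions) >= min_support}
# patterns == {frozenset({"receiving", "received"})}
```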
25
Stage 1 - Frequent patterns (2) : Modeling session duration time
• Cannot assume Gaussian
  • 99.95th percentile estimate is off by half
  • 45% more false alarms
• Mixture distribution
  • Power-law tail + histogram head
[Figures: histogram of session duration (count vs. duration) and tail plot Pr(X ≥ x) vs. duration]
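The head-plus-power-law-tail idea can be sketched as follows: keep the empirical samples for the head and fit a Pareto tail to the largest samples, then read a high percentile off the fitted tail. The Hill estimator used here is one standard choice for fitting the tail exponent; the paper's exact fitting procedure may differ.

```python
import math
import random

def tail_percentile(durations, q, tail_frac=0.1):
    """Estimate the q-quantile of session duration: empirical head plus a
    Pareto (power-law) tail fitted to the top tail_frac of the samples
    with the Hill estimator. Illustrative sketch only."""
    xs = sorted(durations)
    n = len(xs)
    k = max(2, int(n * tail_frac))           # number of tail samples
    tail = xs[-k:]
    xmin = tail[0]
    alpha = k / sum(math.log(x / xmin) for x in tail)  # Hill estimator
    # Tail model: P(X > x) ~= (k/n) * (x/xmin)^(-alpha); invert at 1 - q
    return xmin * ((k / n) / (1.0 - q)) ** (1.0 / alpha)

random.seed(0)
durations = [random.paretovariate(2.0) for _ in range(10000)]
p9995 = tail_percentile(durations, 0.9995)   # true value is about 45
```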
Frequent pattern matching filters most of the normal events
Free text logs (200 nodes)
→ Parsing*
→ Frequent pattern based filtering (100% of events)
   • 86% match a dominant pattern: OK
→ remaining 14% → PCA Detection
   • 13.97% non-pattern normal cases: OK
   • 0.03% real anomalies: ERROR
26
27
Frequent patterns in HDFS
Frequent Pattern | 99.95th percentile duration | % of messages
Allocate, begin write | 13 sec | 20.3%
Done write, update metadata | 8 sec | 44.6%
Delete | – | 12.5%
Serving block | – | 3.8%
Read exception | – | 3.2%
Verify block | – | 1.1%
Total | | 85.6%

• Covers most messages
• Short durations
(Total events ~20 million)
28
Detection latency
• Detection latency is dominated by the wait time
[Figure: detection latency for single event patterns, frequent patterns (matched), frequent patterns (timed out), and non-pattern events]
29
Detection accuracy
| True Positives | False Positives | False Negatives | Precision | Recall
Online | 16,916 | 2,748 | 0 | 86.0% | 100.0%
Offline | 16,808 | 1,746 | 108 | 90.6% | 99.3%
(Total trace = 575,319)

• Ambiguity on “abnormal”
  • Manual labels: “eventually normal”
  • >600 FPs in online detection are due to very long latency
    • E.g. a write session takes >500 sec to complete (99.99th percentile is 20 sec)
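The precision and recall figures follow directly from the counts in the table:

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from true/false positive and false negative counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall

# Counts from the evaluation table
online = precision_recall(16916, 2748, 0)    # precision ≈ 0.860, recall = 1.0
offline = precision_recall(16808, 1746, 108) # precision ≈ 0.906, recall ≈ 0.994
```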
30
Outline
• Key ideas and general methodology
• Methodology illustrated with Hadoop case study
  • Four step approach
  • Evaluation
• An extension: supporting online detection
  • Challenges
  • Two stage detection
  • Evaluation
• Related work and future directions
31
Related work: system monitoring and problem diagnosis
(Axes: heavier instrumentation →, finer granularity data →)
• (Passive) data collection: syslog, SNMP, Chukwa [Boulon08]
• Event as time series: [Hellerstein02], [Lim08], [Yamanishi05]
• Numerical traces: SystemHistory [Cohen05], Fingerprints [Bodik10], [Lakhina04]
• Path-based problem detection: Pinpoint [Chen04]
• Statistical debugging: [Zheng06], AjaxScope [Kiciman07]
• Configurable tracing frameworks: DTrace, X-Trace [Fonseca07], VM-based tracing [King05]
• Runtime predicates: D3S [Liu08]
• Replay-based: deterministic replay [Altekar09]
• Alternative logging: PeerPressure [Wang04], Heat-ray [Dunagan09], [Kandula09]
• Source/binary: aspect-oriented programming, assertions
32
Related work: log parsing
• Rule based and scripts
  • Microsoft Log Parser, Splunk, …
• Frequent pattern based
  • [Vaarandi04] R. Vaarandi. A breadth-first algorithm for mining frequent patterns from event logs. In Proc. of INTELLCOMM, 2004.
• Hierarchical
  • [Fu09] Q. Fu, et al. Execution anomaly detection in distributed systems through unstructured log analysis. In Proc. of ICDM, 2009.
  • [Makanju09] A. A. Makanju, et al. Clustering event logs using iterative partitioning. In Proc. of KDD, 2009.
  • [Fisher08] K. Fisher, et al. From dirt to shovels: fully automatic tool generation from ad hoc data. In Proc. of POPL, 2008.
• Key benefit of using source code
  • Handles rare messages, which are likely to be more useful in problem detection
33
Related work: modeling logs
• As free text
  • [Stearley04] J. Stearley. Towards informatic analysis of syslogs. In Proc. of CLUSTER, 2004.
• Visualize global trends
  • [CretuCiocarlie08] G. Cretu-Ciocarlie, et al. Hunting for problems with Artemis. In Proc. of WASL, 2008.
  • [Tricaud08] S. Tricaud. Picviz: finding a needle in a haystack. In Proc. of WASL, 2008.
• A single event stream (time series)
  • [Ma01] S. Ma, et al. Mining partially periodic event patterns with unknown periods. In Proc. of ICDE, 2001.
  • [Hellerstein02] J. Hellerstein, et al. Discovering actionable patterns in event data, 2002.
  • [Yamanishi05] K. Yamanishi, et al. Dynamic syslog mining for network failure monitoring. In Proc. of ACM KDD, 2005.
• Multiple streams / state transitions
  • [Tan08] J. Tan, et al. SALSA: analyzing logs as state machines. In Proc. of WASL, 2008.
  • [Fu09] Q. Fu, et al. Execution anomaly detection in distributed systems through unstructured log analysis. In Proc. of ICDM, 2009.
34
Related work: techniques that inspired this project
• Path based analysis
  • [Chen04] M. Y. Chen, et al. Path-based failure and evolution management. In Proc. of NSDI, 2004.
  • [Fonseca07] R. Fonseca, et al. X-Trace: a pervasive network tracing framework. In Proc. of NSDI, 2007.
• Anomaly detection and kernel methods
  • [Gartner03] T. Gärtner. A survey of kernels for structured data. SIGKDD Explorations, 2003.
  • [Twining03] C. J. Twining, et al. The use of kernel principal component analysis to model data distributions. Pattern Recognition, 36(1):217–227, 2003.
• Vector space information retrieval
  • [Papineni01] K. Papineni. Why inverse document frequency? In NAACL, 2001.
  • [Salton87] G. Salton, et al. Term weighting approaches in automatic text retrieval. Cornell technical report, 1987.
• Using source code as alternative info
  • [Tan07] L. Tan, et al. /*iComment: bugs or bad comments?*/ In Proc. of ACM SOSP, 2007.
35
Future directions: console log mining
• Distributed log stream processing
  • Handle large scale clusters + partial failures
• Integration with existing monitoring systems
  • Allows existing detection algorithms to apply directly
• Logs from the entire software stack
• Handling logs across software versions
• Allowing feedback from operators
• Providing suggestions on improving logging practice
36
Beyond console logs: textual information in SW dev-operation
[Diagram: code and papers]
37
Summary
http://www.cs.berkeley.edu/~xuw/
Wei Xu <[email protected]>

Parsing → Feature Creation → Machine Learning → Visualization

• A general methodology for processing console logs automatically (SOSP ’09)
• A novel method for online detection (ICDM ’09)
• Validation on real systems
38
Backup slides
39
Rare messages are important
Receiving block blk_100 src: … dest:...
Receiving block blk_200 src: … dest:...
Receiving block blk_100 src: …dest:…
Received block blk_100 of size 49486737 from …
Received block blk_100 of size 49486737 from …
Pending Replication Monitor timeout block blk_200
40
Compare to natural language (semantics) based approach
• We use console logs solely as a trace of program execution
• Semantic meanings are sometimes misleading
  • Does “Error” really mean it?
  • Missing link between the message and the actual program execution
  • Hard to completely eliminate manual search
41
Compare to text based parsing
• Existing work infers the structure of log messages without source code
  • Patterns in strings
  • Word weighting
  • Learning the “grammar” of logs
• Pros: no source code analysis needed
• Cons: unable to handle rare messages
  • 950 possible message types, only <100 appeared in our log
  • Per-operation features require highly accurate parsing
42
Better logging practices
• We do not require developers to change their logging habits
• Use logging
  • Traces vs. error reporting
  • Distinguish your messages from others’
    • printf(“%d”, i);
• Use a logging framework
  • Levels, timestamps etc.
• Use correct levels (think in a system context)
  • Unnecessary WARNINGs and ERRORs
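The "use a logging framework" advice can be sketched with Python's standard-library logging (the deck's context is Java and log4j; this is just an illustration): a framework gives levels, timestamps, a consistent format that automated parsers can rely on, and lazy %-style formatting that keeps the message template separate from the variables.

```python
import logging

# A logging framework gives levels, timestamps, and a consistent format,
# unlike a bare printf("%d", i) that is indistinguishable from other output.
logging.basicConfig(
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
    level=logging.INFO,
)
log = logging.getLogger("datanode")

# Template and variables stay separate, which keeps messages machine-parsable
log.info("Receiving block %s src: %s dest: %s", "blk_1", "10.0.0.1", "10.0.0.2")
log.warning("PendingReplicationMonitor timed out block %s", "blk_2")
```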
43
Ordering of events
• Message count vectors do not capture any ordering/timing information
• Cannot guarantee the global ordering on distributed systems– Needs significant change to the logging infrastructure
44
Too many alarms -- what can operators do with them?
• Operators see clusters of alarms instead of individual ones
  • Easy to cluster similar alarms
  • Most of the alarms are the same
• Better understanding of problems
  • Deterministic bugs in the system
    • Provides more informative bug reports
    • Either fix the bug or the log
  • Transient failures
    • Clustered around single nodes/subnets/times
• Better understanding of normal patterns
45
Case studies
• Surveyed a number of software apps
  • Linux kernel, OpenSSH
  • Apache, MySQL, Jetty
  • Hadoop, Cassandra, Nutch
• Our methods apply to virtually all of them
46
Log parsing process
Step 1: Log parsing. Challenges in object-oriented languages
47
LOG.info(“starting:” + transact);
starting: xact 325 is PREPARING
starting: xact 325 is STARTING at Node1:1000

Template: starting: (.*) [transact] [Transaction] [Participant.java:345]
  Transaction → xact (.*) is (.*) [tid, state] [int, String]
  Transaction → xact (.*) is (.*) at (.*) [tid, state, node] [int, String, Node]
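The sub-classing challenge above can be sketched as recursive template matching: a variable whose inferred type has its own string representation is parsed again with that type's templates. The template tables and function names below are hypothetical illustrations, not the paper's implementation.

```python
import re

# Hypothetical templates per inferred type; a variable whose type has its own
# string representation is parsed recursively with that type's templates.
TYPE_TEMPLATES = {
    "Transaction": [
        (re.compile(r"xact (\d+) is (\w+) at (\S+)$"), ["tid", "state", "node"]),
        (re.compile(r"xact (\d+) is (\w+)$"), ["tid", "state"]),
    ],
}
TOP_TEMPLATES = [
    (re.compile(r"starting: (.*)$"), [("transact", "Transaction")]),
]

def parse_value(value, type_name):
    for pattern, names in TYPE_TEMPLATES.get(type_name, []):
        m = pattern.match(value)
        if m:
            return dict(zip(names, m.groups()))
    return value  # opaque type: keep the raw string

def parse(line):
    for pattern, var_specs in TOP_TEMPLATES:
        m = pattern.match(line)
        if m:
            return {name: parse_value(v, t)
                    for (name, t), v in zip(var_specs, m.groups())}
    return None

parse("starting: xact 325 is PREPARING")
# -> {"transact": {"tid": "325", "state": "PREPARING"}}
```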
48
Intuition behind PCA
49
Another decision tree
50
Existing work on textual log analysis
• Frequent item set mining
  – R. Vaarandi. SEC: a lightweight event correlation tool. IEEE Workshop on IP Operations and Management, 2002.
• Temporal properties analysis
  – Y. Liang, A. Sivasubramaniam, and J. Moreira. Filtering failure logs for a BlueGene/L prototype. In Proc. of IEEE DSN, 2005.
  – C. Lim, et al. A log mining approach to failure analysis of enterprise telephony systems. In Proc. of IEEE DSN, 2008.
• Statistical modeling of logs
  – R. K. Sahoo, et al. Critical event prediction for proactive management in large-scale computer clusters. In Proc. of ACM KDD, 2003.
51
Real world logging tools
• Generation
  • log4j, Jakarta Commons Logging, logback, SLF4J, …
  • Standard web server log formats (W3C, Apache)
• Collection: syslog and beyond
  • syslog-ng (better availability)
  • nsyslog, socklog, … (better security)
• Analysis
  – Query languages for logs
    • Microsoft Log Parser
    • SQL and continuous query / complex event processing / GUI: Talend Open Studio
  – Manually specified rules
    • A. Oliner and J. Stearley. What supercomputers say: a study of five system logs. In Proc. of IEEE DSN, 2007.
    • logsurfer, Swatch, OSSEC (an open source host intrusion detection system)
  – Searching the logs
    • Splunk
  – Log visualization
    • Syslog Watcher, …
52
• [1] R. S. Barga, J. Goldstein, M. Ali, and M. Hong. Consistent streaming through time: A vision for event stream processing. In Proceedings of CIDR, volume 2007, pages 363–374, 2007.
• [2] D. Borthakur. The hadoop distributed file system: Architecture and design. Hadoop Project Website, 2007.
• [3] R. Dunia and S. J. Qin. Multi-dimensional fault diagnosis using a subspace approach. In Proceedings of American Control Conference, 1997.
• [4] R. Feldman and J. Sanger. The Text Mining Handbook: Advanced Approaches in Analyzing Unstructured Data. Cambridge Univ. Press, 12 2006.
• [5] R. Fonseca, G. Porter, R. H. Katz, S. Shenker, and I. Stoica. Xtrace: A pervasive network tracing framework. In In NSDI, 2007.
• [6] T. Gärtner. A survey of kernels for structured data. SIGKDD Explor. Newsl., 5(1):49–58, 2003.
• [7] T. Gärtner, J. W. Lloyd, and P. A. Flach. Kernels and distances for structured data. Mach. Learn., 57(3):205–232, 2004.
• [8] M. Gilfix and A. L. Couch. Peep (the network auralizer): Monitoring your network with sound. In LISA ’00: Proceedings of the 14th USENIX conference on System administration, pages 109–118, Berkeley, CA, USA, 2000. USENIX Association.
• [9] L. Girardin and D. Brodbeck. A visual approach for monitoring logs. In LISA ’98: Proceedings of the 12th USENIX conference on System administration, pages 299–308, Berkeley, CA, USA, 1998. USENIX Association.
• [10] C. Gulcu. Short introduction to log4j, March 2002. http://logging.apache.org/log4j.
• [11] J. Hellerstein, S. Ma, and C. Perng. Discovering actionable patterns in event data, 2002.
53
• [12] J. E. Jackson and G. S. Mudholkar. Control procedures for residuals associated with principal component analysis. Technometrics, 21(3):341–349, 1979.
• [13] A. Lakhina, M. Crovella, and C. Diot. Diagnosing network-wide traffic anomalies. In SIGCOMM ’04: Proceedings of the 2004 conference on Applications, technologies, architectures, and protocols for computer communications, pages 219–230, New York, NY, USA, 2004. ACM.
• [14] A. Lakhina, K. Papagiannaki, M. Crovella, C. Diot, E. D. Kolaczyk, and N. Taft. Structural analysis of network traffic flows. In SIGMETRICS ’04/Performance’04: Proceedings of the joint international conference on Measurement and modeling of computer systems, pages 61–72, New York, NY, USA, 2004. ACM.
• [15] Y. Liang, A. Sivasubramaniam, and J. Moreira. Filtering failure logs for a bluegene/L prototype. In DSN ’05: Proceedings of the 2005 International Conference on Dependable Systems and Networks, pages 476–485, Washington, DC, USA, 2005. IEEE Computer Society.
• [16] C. Lim, N. Singh, and S. Yajnik. A log mining approach to failure analysis of enterprise telephony systems. In Proceedings of International conference on dependable systems and networks, June 2008.
• [17] S. Ma and J. L. Hellerstein. Mining partially periodic event patterns with unknown periods. In Proceedings of the 17th International Conference on Data Engineering, pages 205–214, Washington, DC, USA, 2001. IEEE Computer Society.
• [18] I. Mierswa, M. Wurst, R. Klinkenberg, M. Scholz, and T. Euler. Yale: Rapid prototyping for complex data mining tasks. In L. Ungar, M. Craven, D. Gunopulos, and T. Eliassi-Rad, editors, KDD ’06: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 935–940, New York, NY, USA, August 2006. ACM.
54
• [19] A. Oliner and J. Stearley. What supercomputers say: A study of five system logs. In DSN ’07: Proceedings of the 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, pages 575–584, Washington, DC, USA, 2007. IEEE Computer Society.
• [20] J. E. Prewett. Analyzing cluster log files using logsurfer. In in Proceedings of the 4th Annual Conference on Linux Clusters, 2003.
• [21] R. K. Sahoo, A. J. Oliner, I. Rish, M. Gupta, J. E. Moreira, S. Ma, R. Vilalta, and A. Sivasubramaniam. Critical event prediction for proactive management in large-scale computer clusters. In KDD ’03: Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining, pages 426–435, New York, NY, USA, 2003. ACM.
• [22] J. Stearley. Towards informatic analysis of syslogs. In CLUSTER ’04: Proceedings of the 2004 IEEE International Conference on Cluster Computing, pages 309–318, Washington, DC, USA, 2004. IEEE Computer Society.
• [23] J. Stearley and A. J. Oliner. Bad words: Finding faults in spirit’s syslogs. In CCGRID ’08: Proceedings of the 2008 Eighth IEEE International Symposium on Cluster Computing and the Grid (CCGRID), pages 765–770, Washington, DC, USA, 2008. IEEE Computer Society.
• [24] T. Takada and H. Koike. Tudumi: information visualization system for monitoring and auditing computer logs. pages 570–576, 2002.
• [25] C. J. Twining and C. J. Taylor. The use of kernel principal component analysis to model data distributions. Pattern Recognition, 36(1):217–227, January 2003.
• [26] R. Vaarandi. Sec - a lightweight event correlation tool. IP Operations and Management, 2002 IEEE Workshop on, pages 111–115, 2002.
• [27] R. Vaarandi. A data clustering algorithm for mining patterns from event logs. IP Operations and Management, 2003. (IPOM 2003). 3rd IEEE Workshop on, pages 119–126, Oct. 2003.
55
• [28] R. Vaarandi. A breadth-first algorithm for mining frequent patterns from event logs. In F. A. Aagesen, C. Anutariya, and V.Wuwongse, editors, INTELLCOMM, volume 3283 of Lecture Notes in Computer Science, pages 293–308. Springer, 2004.
• [29] I. H. Witten and E. Frank. Data mining: practical machine learning tools and techniques with Java implementations. Morgan Kaufmann Publishers Inc., 2000.
• [30] K. Yamanishi and Y. Maruyama. Dynamic syslog mining for network failure monitoring. In KDD ’05: Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining, pages 499–508, New York, NY, USA, 2005. ACM.
• [31] P. Bodik, G. Friedman, L. Biewald, H. Levine, G. Candea, K. Patel, G. Tolle, J. Hui, A. Fox, M. I. Jordan, and D. Patterson. Combining visualization and statistical analysis to improve operator confidence and efficiency for failure detection and localization. volume 0, pages 89–100, Los Alamitos, CA, USA, 2005. IEEE Computer Society.
• [32] M. Y. Chen, A. Accardi, E. Kiciman, J. Lloyd, D. Patterson, A. Fox, and E. Brewer. Path-based failure and evolution management. In NSDI ’04: Proceedings of the 1st conference on Symposium on Networked Systems Design and Implementation, pages 23–23, Berkeley, CA, USA, 2004. USENIX Association.
• [33] I. Cohen, S. Zhang, M. Goldszmidt, J. Symons, T. Kelly, and A. Fox. Capturing, indexing, clustering, and retrieving system history. SIGOPS Oper. Syst. Rev.,39(5):105–118, 2005.
• [34] E. Kiciman and B. Livshits. Ajaxscope: a platform for remotely monitoring the client-side behavior of web 2.0 applications. In SOSP ’07: Proceedings of twentyfirst ACM SIGOPS symposium on Operating systems principles, pages 17–30, New York, NY, USA, 2007. ACM.
56
• [35] L. Tan, D. Yuan, G. Krishna, and Y. Zhou. /*icomment: bugs or bad comments?*/. In SOSP ’07: Proceedings of twenty-first ACM SIGOPS symposium on Operating systems principles, pages 145–158, New York, NY, USA, 2007. ACM.
• [36] K. Papineni. Why inverse document frequency? In NAACL ’01: Second meeting of the North American Chapter of the Association for Computational Linguistics on Language technologies 2001, pages 1–8, Morristown, NJ, USA, 2001. Association for Computational Linguistics.
• [37] G. Salton and C. Buckley. Term weighting approaches in automatic text retrieval. Technical report, Ithaca, NY, USA, 1987.
• [Dunagan09] J. Dunagan, et al. Heat-ray: Combating Identity Snowball Attacks using Machine Learning, Combinatorial Optimization and Attack Graphs, In Proc. of SOSP 2009
• [Kandula09] Detailed Diagnosis in Enterprise Networks Srikanth Kandula (Microsoft Research), Ratul Mahajan (Microsoft Research), Patrick Verkaik (UC San Diego), Sharad agarwal (Microsoft Research), Jitu Padhye (Microsoft Research), Paramvir Bahl (Microsoft Research), In Proc. of SigComm 2009
• [Kiciman07] Emre Kiciman and Benjamin Livshits, AjaxScope: A Platform for Remotely Monitoring the Client-side Behavior of Web 2.0 Applications. In Proc. of SOSP 2007
• [King05] Samuel T. King, George W. Dunlap, Peter M. Chen, "Debugging operating systems with time-traveling virtual machines", Proceedings of the 2005 Annual USENIX Technical Conference , April 2005 (presentation by Sam King).