Making the Case for PHP at Yahoo!

Michael J. Radwin
PHPCon 2002, October 25, 2002
http://public.yahoo.com/~radwin/talks/

Speaker Info

• Michael J. Radwin
  – engineer for Yahoo! since 1998
  – technical lead for the Apache web server
  – co-leading the PHP crusade at Y!
• Contact Info:
  – [email protected]
  – http://www.radwin.org/michael/

Outline

• Motivation
• History: from proprietary to Open Source
• Choosing a new server-side scripting language
  – what the ideal system would look like
  – languages we didn't choose
  – why we picked PHP
• Scaling PHP
• Lessons learned

Motivation

What's so special about Yahoo!?

World's Biggest Site

• World's most trafficked Internet destination
  – Nielsen//NetRatings 8/2002
• Users
  – 201M unique users
  – 93M active registered users
• Pageviews
  – more than 1.5 billion a day

Huge Production Network

• 4500+ servers
• 16 co-locations
  – USA: Sunnyvale, Santa Clara, San Diego, Washington DC, Dallas
  – Intl: England, Central America, South America, Taiwan, Hong Kong, Singapore, China, Australia, India, Japan, Korea

Complicated Software

• Site
  – 74 properties
    • mail, shopping, sports, news, games, pets, etc.
  – 25 int'l sites
  – 13 languages
• Code
  – 8.1M lines of C/C++
  – 3.0M lines of Perl
  – 612 developers

More about Y! Server Software

It didn't start out so complex…

Y! Server Software: 1994-1995

• FreeBSD 2.1 (on Intel x86)
• Filo server and Filo pages
  – 676 lines of C
  – optimized for speed
  – HTML + ads
• CGIs for "dynamic" content
  – Search & Suggest A Site
• advertisements client/server
  – yRPC homegrown RPC

Early Years: Static Content

Y! Server Software: 1996-1998

• FreeBSD 2.1 and 2.2
• Apache 1.1
• Lots of home-grown software
  – free stuff wouldn't scale, immature
• yScript1 page
  – similar to Apache SSI
  – HTML + ads + personalization
  – content via include & DBM files
• advertisements client/server
• UDB (user data base)
  – NFS-mounted flat files

Dynamic Content, Personalization

Y! Server Software: 1999-2000

• FreeBSD 4.1
  – a few Solaris boxes (Mail, Geo)
• Apache 1.3.x
• yScript2 pages
  – like yScript1, but more powerful
  – interactive forms
  – business logic in C++
• mod_python (Maps, YP)
• UDB goes client/server
  – yRPC homegrown RPC

Boom Years: Communications, Commerce, Communities

Tradeoffs: App Logic in C++

• Advantages
  – fast execution speed
  – strongly typed, mature language
• Disadvantages
  – edit, compile, link, debug cycle
  – not conducive to rapid prototyping
  – too easy to make mistakes with memory

Example: my.yahoo.com

[Diagram. Components: browser; load balancer; web server (yScript); user database server (user prefs); ad server (ads); web server (news, weather, sports scores, stock quotes). Connection labels: yRPC (twice), feeds (twice).]

Yahoo! in 2002

Moving towards Open Source

Yahoo!'s Open Source Paradox

• Open Source software runs our business
  – Perl
  – Apache
  – FreeBSD
  – GCC (+ GNU toolset)
• Yet we seem to build a lot of our own stuff, too
  – RPC
  – server-side page languages
  – databases

Are We Re-inventing the Wheel?

• When Y! started in '94
  – free stuff did not scale
  – too immature
  – small community
• How about today?
  – performance
  – integration
  – legacy & inertia
  – "Not Invented Here" syndrome

Costs of Proprietary Languages

• Maintenance
  – 3 different variants
  – C++ bugs
• Training overhead
  – engineers
  – design folks
• No integration
  – authoring tools, DBs
• Limited functionality
  – yScript2 lacks subroutines!

Moving to Open Source

• Open Source tech eventually matures
  – Y! replaced Filo server with Apache in 1996
  – replacing some DBM and Oracle with MySQL
• Server-side languages natural next step
  – features, performance, integration, community
• Y! is a cheap company
  – economic recession 2001-2002
  – can't afford to waste engineering resources

Choosing a Language

How we ended up picking PHP

Language Criteria

1. C/C++ extensions
2. loops, conditionals
3. complex data-types
4. pleasant syntax
5. runs on FreeBSD
6. high performance
7. robust, sand-boxed
8. interpreted (or dynamically compiled)
9. low training costs
10. i18n support
11. clean separation of presentation/content/app semantics
12. doesn't require CS degree to use

Why not Apache mod_include?

• Pros
  – built into Apache, easy to learn/use
• Limited language (no loops, subroutines)
• Doesn't interface with Y! code
  – Ads, User Database, etc.
• Poor performance
  – parses file every time you hit page

Why not ASP or Cold Fusion?

• Pros
  – lots of 3rd-party integration
  – professional support
• Cons
  – CF has ugly syntax
  – $$ for languages
  – $$ for Microsoft Windows

Why not Perl?

• Pros
  – FreeBSD support and performance is great
  – huge CPAN library
  – we already use it for offline processing
• Cons
  – There's More Than One Way To Do It
  – poor sandboxing, easy to screw up server
  – wasn't designed as web scripting language

Why not JSP, Servlets, or J2EE?

• Pros
  – strongly typed
  – good performance (JIT), sandboxing
  – works w/lots of off-the-shelf software
• But
  – you can't really use Java w/o threads
  – Threads support on FreeBSD is not great

Why not XSLT or ClearSilver?

• Pro: separates HTML presentation from app logic
• XSLT
  – complicated to set up and understand
• ClearSilver
  – small developer community
• Neither is "procedural" language
  – totally different models from PHP/ASP/JSP/yScript2
  – difficult transition for Y! engineering

So Why Did We Pick PHP?

1. Designed for server side web scripting
2. Large, Open Source developer community
   – integration, libraries
   – documentation & training
3. Debugging & profiling tools
4. Simple and clear syntax (fits Y! paradigm)
5. Performs well in our tests
   – efficient (with acceleration)
   – small enough memory footprint

Benchmarking PHP

"But is it as fast as yScript2?"

Performance Tests

• Languages
  – PHP 4.1.2 (w/Accel)
  – yScript2 (proprietary)
  – YSP (mod_perl)
• Hardware
  – Pentium III 800 MHz
  – 512 MB RAM
  – FreeBSD 4.3

Performance Tests

• 33K input script, 41K output
• Included and evaluated 3 other files
  – header, navbar, footer
• Echoed environment variables
• Pseudo-personalization
  – "Hello, mradwin"
• Called external C++ library for Ads/UDB
  – network delay to fetch data

Performance: Requests

[Chart: requests/sec (y-axis, 0 to 350) versus concurrent requests (x-axis, 25 to 500). Series: PHP, YSP, HF2k, Network max, yScript2.]

Performance: Transfer Rate

[Chart: transfer rate in kb/s (y-axis, 0 to 14000) versus concurrent requests (x-axis, 25 to 500). Series: PHP, YSP, HF2k, yScript2.]

Performance: Processing Time

[Chart: processing time in ms (y-axis, 0 to 9000) versus concurrent requests (x-axis, 25 to 500). Series: PHP, YSP, HF2k, yScript2.]

Performance: Memory

[Chart: active virtual memory in kbytes (y-axis, 0 to 1000000) versus concurrent requests (x-axis, 25 to 500). Series: PHP, YSP, HF2k, yScript2.]

Performance: Scaling PHP

• Profile your code

    foreach ($_SERVER as $k => $v)
      if (substr($k, 0, 5) == "HTTP_")
        $str .= substr($k, 5) . ": $v\n";

  versus:

      if (strncmp($k, "HTTP_", 5) == 0)

• Implement C and C++ extensions
  – when you're willing to trade flexibility for speed
• Use an Accelerator
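
The "Profile your code" comparison on the Scaling PHP slide can be tried out with a small timing script. The following is only a sketch under stated assumptions: the function names, the sample array, and the round count are hypothetical and do not come from the talk, and it uses type declarations from newer PHP releases rather than the PHP 4 the slides target. It does not claim which variant wins; timings depend on the PHP version and hardware.

```php
<?php
// Hypothetical helper names; the slide shows only the loop bodies.

/** Collect HTTP_* entries, testing the prefix with substr(). */
function headers_via_substr(array $server): string
{
    $str = '';
    foreach ($server as $k => $v) {
        if (substr($k, 0, 5) == 'HTTP_') {
            $str .= substr($k, 5) . ": $v\n";
        }
    }
    return $str;
}

/** Collect HTTP_* entries, testing the prefix with strncmp(). */
function headers_via_strncmp(array $server): string
{
    $str = '';
    foreach ($server as $k => $v) {
        if (strncmp($k, 'HTTP_', 5) == 0) {
            $str .= substr($k, 5) . ": $v\n";
        }
    }
    return $str;
}

/** Run $fn on $input $rounds times and return the elapsed seconds. */
function time_it(callable $fn, array $input, int $rounds): float
{
    $start = microtime(true);
    for ($i = 0; $i < $rounds; $i++) {
        $fn($input);
    }
    return microtime(true) - $start;
}

// Stand-in for $_SERVER so the script also runs from the command line.
$sample = [
    'HTTP_HOST'      => 'example.com',
    'HTTP_ACCEPT'    => 'text/html',
    'SERVER_NAME'    => 'localhost',
    'REQUEST_METHOD' => 'GET',
];

printf("substr:  %.4f s\n", time_it('headers_via_substr', $sample, 100000));
printf("strncmp: %.4f s\n", time_it('headers_via_strncmp', $sample, 100000));
```

Both functions build the same string, so any difference in the printed times comes from the prefix test alone.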

Lessons Learned

4 months after we started using PHP

Early Adopters

• PHP for new properties
  – remember.yahoo.com for Sep 11 2002
• Internal tools
  – content mgmt, package repository, aclviewer
• Most Y! properties integrating slowly
  – no plans to rewrite entire site
  – mix PHP, Apache DSOs, yScript1 & yScript2 pages

Coding PHP Takes Discipline

• Shallow learning curve
  – very easy to get some pages up quickly
• But mixed app/presentation problematic
  – PHP code and HTML forever intertwined
  – coding conventions help
    • *.inc for function and class libraries
    • *.php for web pages (call functions, echo $vars)
  – use Smarty to enforce separation?
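
The *.inc / *.php convention from the Coding PHP Takes Discipline slide might look roughly like this. It is a minimal sketch, not Yahoo!'s actual code: the function name and the file name greeting.inc are hypothetical, and both halves are shown in one script so it runs on its own. It uses newer PHP syntax than the PHP 4 of the talk.

```php
<?php
// Library part: under the slide's convention this would live in a *.inc
// file (the name greeting.inc is hypothetical). It computes values and
// emits no markup.

/** Build the greeting text, escaped for safe inclusion in HTML. */
function greeting_for(string $username): string
{
    return 'Hello, ' . htmlspecialchars($username, ENT_QUOTES, 'UTF-8');
}

// Page part: under the convention this would live in a *.php file that
// first pulls in the library (for example with require_once) and then
// only calls functions and echoes variables.
$greeting = greeting_for('mradwin');
echo "<html><body><p>$greeting</p></body></html>\n";
```

Keeping the logic in functions means the page part stays a thin layer of markup and echo statements, which is the separation the slide asks for.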

PHP != Perl

• The "implement twice" problem
  – much offline processing done in Perl
  – example: tax/shipping calculation for Shopping
• PEAR != CPAN
  – installer doesn't work in PHP 4.2.x
  – repository smaller, less mature than CPAN
• Surprises for people used to coding Perl

Giving Back to Open Source

• We customize Open Source software we use
  – often improvements are not sent back
  – many are gross Y!-specific hacks
• Improving our relationship with OS community
  – FreeBSD (Peter Wemm)
  – Apache (Sander van Zoest)
  – PHP (Rasmus Lerdorf)
  – MySQL (Jeremy Zawodny)

Questions and Answers

Slides online at: http://public.yahoo.com/~radwin/talks/

Legal Mumbo-Jumbo

• Text of this presentation is Copyright © 2002 Michael J. Radwin.
• Clip art is Copyright © 2002 Microsoft Corporation.
• Yahoo!, the Yahoo! logo, the "Jumpin' Y Guy" logo, and other Yahoo! logos, product & service names are trademarks of Yahoo! Inc.
• The Yahoo! Engineering logo is Copyright © 2000 John "JR" Conlin.
• The PHP logo is Copyright © 2001, 2002 The PHP Group.
• The Open Source, Apache Feather, Active Server Pages, Cold Fusion, "Powered By FreeBSD", mod_perl, Apache::ASP, Mason, Java, W3C, Neotonic, and ionCube logos are Copyright © their respective owners.