2. Agenda: Amazon Redshift
3. Amazon Redshift
4. Amazon Redshift among the AWS database services: Amazon DynamoDB, Amazon RDS, Amazon ElastiCache, Amazon Redshift. Keywords: SQL, NoSQL, 3, SSD
5. Amazon Redshift: MPP; CPU, Disk, Network I/O; AWS
6. MPP: SELECT * FROM lineitem; [diagram: six CPUs]
7. SELECT * FROM lineitem; SELECT * FROM part; [diagram: six CPUs; 50]
8. [diagram: seven CPUs]
9. RDBMS / Redshift (the same sample table is shown for each):
    orderid  name    price
    1        Book    100
    2        Pen     50
    n        Eraser  70
10. [diagram: six CPUs]
13. DW2 (SSD), NEW
    Node type     vCPU  ECU  Memory (GB)  Storage      I/O        Price / hour
    DW1 - Dense Storage
    dw1.xlarge       2  4.4           15  2 TB HDD     0.30 GB/s  $1.250
    dw1.8xlarge     16   35          120  16 TB HDD    2.40 GB/s  $10.000
    DW2 - Dense Compute
    dw2.large        2    7           15  0.16 TB SSD  0.20 GB/s  $0.330
    dw2.8xlarge     32  104          244  2.56 TB SSD  3.70 GB/s  $6.400
14. 3RI; 2; dw2.large; 730; BI / Data Integration: http://aws.amazon.com/redshift/partners; 3RI 25%; NEW
15. ALL; EVEN, DISTKEY; 3 *
16. n; SELECT * FROM
17. max_cursor_result_set_size (MB)
    dw1.8xlarge: 450 GB * 4 (A, B, C, D) = 1800 GB
    dw1.8xlarge: 225 GB * 8 (A, B, C, D, E, F, G, H) = 1800 GB
18. I/O; UNLOAD, S3; Fetch size
    http://docs.aws.amazon.com/redshift/latest/mgmt/working-with-parameter-groups.html#max-cursor-result-set-size-param
19. LZO; COPY: S3, LZOP
20. COPY: MANIFEST; S3, COPY
    {
      "entries": [
        {"url":"s3://mybucket-alpha/2013-10-04-custdata", "mandatory":true},
        {"url":"s3://mybucket-alpha/2013-10-05-custdata", "mandatory":true},
        {"url":"s3://mybucket-beta/2013-10-04-custdata", "mandatory":true},
        {"url":"s3://mybucket-beta/2013-10-05-custdata", "mandatory":true}
      ]
    }
21. COPY: JSON (COPY, JSONPath); Amazon EMR
    copy sales
    from 'emr://j-1H7OUO3B52HI5/myoutput/part*'
    credentials 'aws_access_key_id=;aws_secret_access_key=';
    (callouts on the path: ID, HDFS)
22. Cross-Region COPY: S3, DynamoDB, COPY
    copy customer
    from 's3://mybucket/customer/customer.tbl.'
    credentials 'aws_access_key_id=;aws_secret_access_key='
    gzip
    delimiter '|'
    region 'us-east-1';
23. Cross-Region; 35
25. EVEN, DISTKEY, ALL
26. EVEN vs. DISTKEY
    select trim(name) tablename, slice, sum(rows)
    from stv_tbl_perm
    where name='part'
    group by name, slice
    order by slice;

    EVEN:
     tablename | slice |   sum
    -----------+-------+---------
     part      |     0 | 1600000
     part      |     1 | 1600000
     part      |   126 | 1600000
     part      |   127 | 1600000

    DISTKEY=p_partkey:
     tablename | slice |   sum
    -----------+-------+---------
     part      |     0 | 1596925
     part      |     1 | 1597634
     part      |   126 | 1610452
     part      |   127 | 1596154
27. (1) DISTKEY (2) ALL
    select sum(l_extendedprice * (1 - l_discount)) as revenue
    from lineitem, part
    where (p_partkey = l_partkey ...
28. DISTKEY
    part:     6200995 | almond pale linen | Manufacturer#3 | Brand#32
    lineitem: 5024338535 | 6200995 | 0.01 | 0.08 | A | F | 1992-01-02 | 1992-02-14
    part:     2201039 | almond pale linen | Manufacturer#1 | Brand#11
    lineitem: 121932093 | 2201039 | 0.05 | 0.43 | D | E | 1994-07-11 | 1994-08-23
29. ALL [diagram: part and lineitem, shown twice; l_partkey, p_partkey]
30. ALL, EVEN, DISTKEY
31. Workload Management
32. WLM: User Group A, Long Query Group; Short-running queue, Long-running queue
33. WLM
34. WLM
35. 49 SQL; WLM: 1, 10, 25, 35, 49
36. [chart: 1, 3, 5, ..., 49 vs. 1, 10, 25, 35, 49]
38. MUJI Passport
40. POS; 3,000; 4; POS; 2
41. NTT DOCOMO DWH: http://www.slideshare.net/minoruetoh/nttr4public
42. Redshift; Workload Management
43. https://aws.amazon.com/jp/documentation/redshift/
    https://forums.aws.amazon.com/forum.jspa?forumID=155&start=0
    https://forums.aws.amazon.com/thread.jspa?threadID=132076&tstart=25
44. Webinar; AWS: http://aws.amazon.com/jp/aws-jp-introduction/
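The COPY manifest on slide 20 is plain JSON, so it can be generated rather than written by hand. The following is a minimal sketch in Python, assuming the bucket names and date-prefixed file names from that slide. The helper name `build_manifest` is hypothetical, and uploading the result to S3 is left out.

```python
import json


def build_manifest(urls, mandatory=True):
    """Return a COPY manifest (as a dict) with one entry per S3 URL."""
    return {"entries": [{"url": u, "mandatory": mandatory} for u in urls]}


# Inputs mirroring slide 20: two buckets, two dated files in each.
buckets = ["mybucket-alpha", "mybucket-beta"]
dates = ["2013-10-04", "2013-10-05"]
urls = [f"s3://{b}/{d}-custdata" for b in buckets for d in dates]

manifest = build_manifest(urls)
manifest_json = json.dumps(manifest, indent=2)
print(manifest_json)
```

The printed JSON would then be stored in S3 and referenced by a COPY command that uses the MANIFEST option.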
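The per-slice row counts on slide 26 can be summarized as a single skew figure. The sketch below is in Python and uses only the four slices printed on the slide (0, 1, 126, 127), so the result is illustrative rather than a measurement of all 128 slices. The helper name `skew_ratio` and the choice of "fullest slice divided by mean" as the metric are assumptions, not something the slides define.

```python
def skew_ratio(rows_per_slice):
    """Ratio of the fullest slice to the mean slice; 1.0 means perfectly even."""
    mean = sum(rows_per_slice) / len(rows_per_slice)
    return max(rows_per_slice) / mean


# Row counts copied from slide 26 (only the slices shown there).
even_rows = [1600000, 1600000, 1600000, 1600000]
distkey_rows = [1596925, 1597634, 1610452, 1596154]

print("EVEN skew:   ", skew_ratio(even_rows))
print("DISTKEY skew:", skew_ratio(distkey_rows))
```

On these sample rows the DISTKEY=p_partkey layout stays within about one percent of an even spread.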