Amazon Redshift: AWS Black Belt Tech Webinar 2014 (formerly the Meister Series). Amazon Data Services Japan K.K., Technical Division, Enterprise Department, 橋 徹平

AWS Black Belt Tech Series: Amazon Redshift

1. Amazon Redshift: AWS Black Belt Tech Webinar 2014 (formerly the Meister Series)

2. Agenda Amazon Redshift Amazon Redshift
3. Amazon Redshift
4. Amazon Redshift: Amazon DynamoDB, Amazon RDS, Amazon ElastiCache, Amazon Redshift; SQL, NoSQL; 3; SSD
5. Amazon Redshift: MPP; CPU, Disk, Network I/O; AWS
6. MPP
    SELECT * FROM lineitem;
    CPU CPU CPU CPU CPU CPU
7.
    SELECT * FROM lineitem;
    CPU CPU CPU CPU CPU CPU
    SELECT * FROM part;
    50
8. CPU CPU CPU CPU CPU CPU CPU
9. RDBMS, Redshift
    RDBMS
    orderid  name    price
    1        Book    100
    2        Pen     50
    n        Eraser  70

    Redshift
    orderid  name    price
    1        Book    100
    2        Pen     50
    n        Eraser  70
10. CPU CPU CPU CPU CPU CPU
11. - -
12.
13. DW2 SSD
                   vCPU  ECU  Memory (GB)  Storage     I/O       Price / hour
    DW1 - Dense Storage
    dw1.xlarge     2     4.4  15           2TB HDD     0.30GB/s  $1.250
    dw1.8xlarge    16    35   120          16TB HDD    2.40GB/s  $10.000
    DW2 - Dense Compute
    dw2.large      2     7    15           0.16TB SSD  0.20GB/s  $0.330
    dw2.8xlarge    32    104  244          2.56TB SSD  3.70GB/s  $6.400
    NEW
14. 3RI 2 dw2.large 730; BI, Data Integration http://aws.amazon.com/redshift/partners 3RI 25% NEW
15. ALL, EVEN, DISTKEY 3 *
16. n SELECT * FROM
17. max_cursor_result_set_size (MB)
    A B C D: dw1.8xlarge, 450GB * 4 = 1800GB
    A B C D E F G H: dw1.8xlarge, 225GB * 8 = 1800GB
18. I/O; UNLOAD, S3; Fetch size
    http://docs.aws.amazon.com/redshift/latest/mgmt/working-with-parameter-groups.html#max-cursor-result-set-size-param
19. - LZO; LZO COPY; S3, LZOP, COPY
20. COPY MANIFEST; S3, COPY
    {
      "entries": [
        {"url":"s3://mybucket-alpha/2013-10-04-custdata", "mandatory":true},
        {"url":"s3://mybucket-alpha/2013-10-05-custdata", "mandatory":true},
        {"url":"s3://mybucket-beta/2013-10-04-custdata", "mandatory":true},
        {"url":"s3://mybucket-beta/2013-10-05-custdata", "mandatory":true}
      ]
    }
21. COPY; JSON COPY, JSONPath; Amazon EMR
    copy sales from 'emr://j-1H7OUO3B52HI5/myoutput/part*'
    credentials 'aws_access_key_id=;aws_secret_access_key=';
    ID, HDFS
22. Cross-Region COPY; S3, DynamoDB, COPY
    copy customer from 's3://mybucket/customer/customer.tbl.'
    credentials 'aws_access_key_id=;aws_secret_access_key='
    gzip delimiter '|' region 'us-east-1';
23. Cross-Region 35
24.
25. EVEN, DISTKEY, ALL
26. EVEN vs. DISTKEY
    select trim(name) tablename, slice, sum(rows)
    from stv_tbl_perm
    where name='part'
    group by name, slice
    order by slice;

    EVEN
     tablename | slice |   sum
    -----------+-------+---------
     part      |     0 | 1600000
     part      |     1 | 1600000
     part      |   126 | 1600000
     part      |   127 | 1600000

    DISTKEY=p_partkey
     tablename | slice |   sum
    -----------+-------+---------
     part      |     0 | 1596925
     part      |     1 | 1597634
     part      |   126 | 1610452
     part      |   127 | 1596154
27.
    1. DISTKEY
    2. ALL
    select sum(l_extendedprice * (1 - l_discount)) as revenue
    from lineitem, part
    where (p_partkey = l_partkey ...
28. DISTKEY
    part:     6200995 | almond pale linen | Manufacturer#3 | Brand#32
    lineitem: 5024338535 | 6200995 | 0.01 | 0.08 | A | F | 1992-01-02 | 1992-02-14
    part:     2201039 | almond pale linen | Manufacturer#1 | Brand#11
    lineitem: 121932093 | 2201039 | 0.05 | 0.43 | D | E | 1994-07-11 | 1994-08-23
29. ALL: part, lineitem, part, lineitem; l_partkey, l_partkey, p_partkey, p_partkey
30. ALL, EVEN, DISTKEY
31. Workload Management
32. WLM: User Group A; Short-running queue; Long-running queue; Long Query Group
33. WLM
34. WLM ->
35. 49 SQL; WLM 1, 10, 25, 35, 49
36. 1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47 49 vs. 1 10 25 35 49
37.
38. MUJI Passport -> ->
39.
40. POS POS 3,000 4 POS 2
41. NTT DOCOMO DWH http://www.slideshare.net/minoruetoh/nttr4public
42. Redshift Workload Management
43.
    https://aws.amazon.com/jp/documentation/redshift/
    https://forums.aws.amazon.com/forum.jspa?forumID=155&start=0
    https://forums.aws.amazon.com/thread.jspa?threadID=132076&tstart=25
44. Webinar AWS http://aws.amazon.com/jp/aws-jp-introduction/
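
The EVEN vs. DISTKEY row counts on slide 26 and the part/lineitem pairing on slide 28 can be illustrated with a small simulation. This is only a sketch under stated assumptions: the slice count, the key range, and the use of CRC32 as the hash are all hypothetical stand-ins, and none of this reflects how Redshift actually hashes distribution keys. It shows round-robin placement giving identical counts per slice, hash placement giving roughly equal counts, and rows with the same key value landing on the same slice.

```python
import zlib
from collections import Counter

NUM_SLICES = 8  # hypothetical; the slide-26 output shows slices 0 through 127


def even_slice(row_number: int) -> int:
    """EVEN-style placement: rows are dealt out round-robin by arrival order."""
    return row_number % NUM_SLICES


def key_slice(dist_key: int) -> int:
    """DISTKEY-style placement: equal key values always map to the same slice.

    CRC32 is a stand-in hash chosen for illustration (an assumption),
    not the function Redshift uses.
    """
    return zlib.crc32(str(dist_key).encode("ascii")) % NUM_SLICES


def rows_per_slice(slice_ids) -> list:
    """Count how many rows were assigned to each slice, in slice order."""
    counts = Counter(slice_ids)
    return [counts[s] for s in range(NUM_SLICES)]


part_keys = range(1, 80_001)  # hypothetical p_partkey values

even_counts = rows_per_slice(even_slice(i) for i, _ in enumerate(part_keys))
key_counts = rows_per_slice(key_slice(k) for k in part_keys)

print("EVEN   :", even_counts)   # identical count on every slice
print("DISTKEY:", key_counts)    # roughly, but not exactly, equal counts

# A part row and a lineitem row that share key 6200995 (slide 28)
# are placed on the same slice, so they can be joined locally.
part_slice = key_slice(6200995)
lineitem_slice = key_slice(6200995)
print("same slice:", part_slice == lineitem_slice)
```

The per-slice counts play the role of the `sum(rows)` column that the `stv_tbl_perm` query on slide 26 returns for each slice.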