
Posted by 168主编 on 2017-5-24 19:06:01

GreenPlum Cluster: gpfdist in Practice

Author: 黄杉


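gpfdist is Greenplum's parallel file distribution server. It lets data loads make full use of the cluster's parallelism and load bandwidth. A Greenplum cluster already has gpfdist installed by default, but if you want to run it on a separate server you have to install it there as well, by downloading and installing the loaders package.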


1. Download
Download page: https://network.pivotal.io/products/pivotal-gpdb#/releases/4540/file_groups/561
Choose the loaders package whose version matches your Greenplum Database. The loaders package includes the gpfdist component.

2. Install the prerequisite components
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------wget http://pyyaml.org/download/libyaml/yaml-0.1.7.tar.gztar -xvf yaml-0.1.7.tar.gzcd yaml-0.1.7./configure                                                                           makemake install

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
(1) Unpack the archive
unzip greenplum-loaders-4.3.8.2-build-1-RHEL5-x86_64.zip

(2) Create the installation directory
mkdir /data/greenplum
chown -R gpadmin:gpadmin /data/greenplum

(3) Run the installer
sh greenplum-loaders-4.3.8.2-build-1-RHEL5-x86_64.bin -y

(4) Check the installed components; gpfdist and gpload should both be present
$ ll /data/greenplum/bin
total 756
drwxr-xr-x 4 gpadmin gpadmin   4096 May 10  2016 ext
-rwxr-xr-x 1 gpadmin gpadmin 663372 May 10  2016 gpfdist
-rwxr-xr-x 1 gpadmin gpadmin    311 May 10  2016 gpload
-rwxr-xr-x 1 gpadmin gpadmin 100338 May 10  2016 gpload.py
$
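
The check in step (4) can also be scripted, which is handy when the loaders are installed on several hosts. The following is a minimal sketch, not part of the loaders package: the function name check_loaders_bin is hypothetical, and it only tests that gpfdist and gpload exist and are executable in the directory you pass in.

```shell
# Hypothetical helper: report whether the loader tools are present and executable.
# Argument 1 is the bin directory of the loaders installation.
check_loaders_bin() {
    bin_dir=$1
    missing=0
    for tool in gpfdist gpload; do
        if [ -x "$bin_dir/$tool" ]; then
            echo "ok: $bin_dir/$tool"
        else
            echo "missing: $bin_dir/$tool" >&2
            missing=1
        fi
    done
    # Return 0 only when both tools were found.
    return "$missing"
}

# Example call, using the path from the install step:
# check_loaders_bin /data/greenplum/bin
```

The function returns a non-zero status when either tool is missing, so it can be chained with && in an install script.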


3. Usage
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Start command:
nohup /data/greenplum/bin/gpfdist -d /home/gpadmin/ -p 8090 > /home/gpadmin/gpfdist.log &

Startup output:
$ nohup /data/greenplum/bin/gpfdist -d /home/gpadmin/ -p 8090 > /home/gpadmin/gpfdist.log &
27003
$
$ more /home/gpadmin/gpfdist.log
2017-05-12 14:10:31 27003 INFO Before opening listening sockets - following listening sockets are available:
2017-05-12 14:10:31 27003 INFO IPV6 socket: [::]:8090
2017-05-12 14:10:31 27003 INFO IPV4 socket: 0.0.0.0:8090
2017-05-12 14:10:31 27003 INFO Trying to open listening socket:
2017-05-12 14:10:31 27003 INFO IPV6 socket: [::]:8090
2017-05-12 14:10:31 27003 INFO Opening listening socket succeeded
2017-05-12 14:10:31 27003 INFO Trying to open listening socket:
2017-05-12 14:10:31 27003 INFO IPV4 socket: 0.0.0.0:8090
Serving HTTP on port 8090, directory /home/gpadmin
$
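
Because gpfdist is started in the background, a script that goes on to create external tables may want to confirm that the server is actually up first. A minimal sketch follows, assuming the log contains the "Serving HTTP on port ..." line shown in the startup output; the function names gpfdist_started and wait_for_gpfdist are hypothetical.

```shell
# Hypothetical helper: succeed if the log shows gpfdist serving the given port.
# The trailing comma in the pattern keeps port 809 from matching 8090.
gpfdist_started() {
    grep -q "Serving HTTP on port $2," "$1" 2>/dev/null
}

# Hypothetical helper: poll the log once per second, up to a number of tries
# (default 10), and fail if the line never appears.
wait_for_gpfdist() {
    wait_log=$1
    wait_port=$2
    tries=${3:-10}
    while [ "$tries" -gt 0 ]; do
        gpfdist_started "$wait_log" "$wait_port" && return 0
        sleep 1
        tries=$((tries - 1))
    done
    return 1
}

# Example, using the log path and port from the start command:
# wait_for_gpfdist /home/gpadmin/gpfdist.log 8090 || echo "gpfdist did not start" >&2
```

This only inspects the log file, so a stale log from an earlier run would also match. Truncating the log before starting gpfdist avoids that.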


4. Create test data for an external table served by gpfdist. Prepare two text files named t01.txt and t02.txt.
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
$ pwd
/home/gpadmin/gpextdata
$ more t01.txt
1|aaa
2|zhangsan
$ more t02.txt
3|wanger
4|mazi
$
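
The two test files can be created with a few lines of shell. This is a small sketch with the same contents as shown above; the function name make_test_data is hypothetical, and the directory you pass in must be the one that the gpfdist:// URLs point to under the -d directory.

```shell
# Hypothetical helper: write the two pipe-delimited test files into a directory.
make_test_data() {
    data_dir=$1
    mkdir -p "$data_dir" || return 1
    # One record per line, fields separated by "|".
    printf '%s\n' '1|aaa' '2|zhangsan' > "$data_dir/t01.txt"
    printf '%s\n' '3|wanger' '4|mazi' > "$data_dir/t02.txt"
}

# Example call:
# make_test_data /home/gpadmin/gpextdata
```
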
On the Greenplum database, create an external table that points to t01.txt and t02.txt on the gpfdist server. The SQL statement is shown below; run it in a psql session:
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
create external table public.t01_ext_1 (
    id integer,
    name varchar(128)
)
location (
    -- Alternatives kept for reference. Line comments are used here because
    -- the "/*" inside '.../*.txt' would open a nested block comment.
    -- 'gpfdist://101.254.13.72:8090/gpextdata/test001.txt',
    -- 'gpfdist://101.254.3.72:8090/gpextdata/test002.txt'
    -- 'gpfdist://101.254.13.72:8090/gpextdata/*.txt'
    'gpfdist://101.254.13.72:8090/gpextdata/t01.txt',
    'gpfdist://101.254.13.72:8090/gpextdata/t02.txt'
)
Format 'TEXT' (delimiter as E'|' null as '' escape 'OFF')
-- Encoding 'GB18030' Log errors into public.test001_err segment reject limit 10 rows
;
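
The path part of each gpfdist:// URL is resolved relative to the directory given to gpfdist with -d, so a file at /home/gpadmin/gpextdata/t01.txt is addressed as gpextdata/t01.txt when gpfdist serves /home/gpadmin/. The sketch below illustrates that mapping. The function name gpfdist_url is hypothetical and is not a Greenplum tool.

```shell
# Hypothetical helper: turn an absolute file path into the gpfdist URL for it.
# Arguments: host, port, directory passed to "gpfdist -d", absolute file path.
gpfdist_url() {
    host=$1
    port=$2
    base=${3%/}          # drop one trailing slash from the served directory
    file=$4
    case "$file" in
        "$base"/*) rel=${file#"$base"/} ;;
        *) echo "error: $file is not under $base" >&2; return 1 ;;
    esac
    printf 'gpfdist://%s:%s/%s\n' "$host" "$port" "$rel"
}

# Example, with the host, port and directory used in this article:
# gpfdist_url 101.254.13.72 8090 /home/gpadmin/ /home/gpadmin/gpextdata/t01.txt
```

The function refuses files outside the served directory, since gpfdist could not serve them under that -d setting anyway.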

Execution:
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
(1) The external table is created successfully:
yueworld_db=# create external table public.t01_ext_1 (
yueworld_db(# id integer,
yueworld_db(# name varchar(128)
yueworld_db(# )
yueworld_db-# location (
yueworld_db(# 'gpfdist://101.254.13.72:8090/gpextdata/t01.txt',
yueworld_db(# 'gpfdist://101.254.13.72:8090/gpextdata/t02.txt'
yueworld_db(# )
yueworld_db-# Format 'TEXT' (delimiter as E'|' null as '' escape 'OFF')
yueworld_db-# ;
CREATE EXTERNAL TABLE
yueworld_db=#
yueworld_db=# select * from public.t01_ext_1;
 id |   name
----+----------
  1 | aaa
  2 | zhangsan
  3 | wanger
  4 | mazi
(4 rows)


yueworld_db=#


