Compiling, Installing, and Deploying Kudu

Big Data

2021-06-11


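Kudu is a columnar storage system open-sourced by Cloudera and a member of the Apache Hadoop ecosystem (it has graduated from the Apache Incubator and is now a top-level Apache project). It is designed for fast analytics on rapidly changing data and fills a gap in the Hadoop storage layer. The Kudu project does not publish binary packages; the two options it offers are Docker and building from source. This article builds Kudu from source and deploys it on CentOS 7.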

I. Environment Preparation

System requirements:

RHEL 7, RHEL 8, CentOS 7, CentOS 8, Ubuntu 18.04 (bionic), Ubuntu 20.04 (focal)

macOS 10.13 (High Sierra), macOS 10.14 (Mojave), macOS 10.15 (Catalina)

Java:

JDK 8
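The Linux part of the list above can be checked mechanically. The following is a minimal sketch, assuming the distribution ID and version are read from /etc/os-release; `is_supported_os` is a hypothetical helper written for this article, not something Kudu ships.

```shell
#!/bin/sh
# Sketch: decide whether an os-release ID/VERSION_ID pair is on the list above.
# is_supported_os is a hypothetical helper, not part of Kudu.
is_supported_os() {
    id=$1
    version=$2
    case "$id:$version" in
        rhel:7*|rhel:8*|centos:7*|centos:8*) return 0 ;;
        ubuntu:18.04|ubuntu:20.04) return 0 ;;
        *) return 1 ;;
    esac
}

# On a real host the values would come from /etc/os-release, for example:
#   . /etc/os-release && is_supported_os "$ID" "$VERSION_ID"
if is_supported_os centos 7; then
    echo "centos 7: on the supported list"
fi
```

The macOS entries are left out because macOS has no /etc/os-release file.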

II. Preparing the Build Environment

1. Install the required build tools

sudo yum install autoconf automake cyrus-sasl-devel cyrus-sasl-gssapi \
  cyrus-sasl-plain flex gcc gcc-c++ gdb git java-1.8.0-openjdk-devel \
  krb5-server krb5-workstation libtool make openssl-devel patch \
  pkgconfig redhat-lsb-core rsync unzip vim-common which

2. On releases older than CentOS 8, also install devtoolset

sudo yum install centos-release-scl-rh
sudo yum install devtoolset-8
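The condition in this step can be written as a small check. This is only a sketch: `needs_devtoolset` is a hypothetical helper, and it assumes the version string has the form used by VERSION_ID in /etc/os-release (for example 7 or 7.9).

```shell
#!/bin/sh
# needs_devtoolset VERSION_ID: hypothetical helper; succeeds when the
# RHEL/CentOS major version is older than 8, as described in this step.
needs_devtoolset() {
    major=${1%%.*}          # keep only the part before the first dot
    [ "$major" -lt 8 ]
}

# On a real host VERSION_ID would come from /etc/os-release.
for v in 7 7.9 8; do
    if needs_devtoolset "$v"; then
        echo "$v: install devtoolset-8"
    else
        echo "$v: devtoolset not required"
    fi
done
```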

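3. If Kudu needs NVM (non-volatile memory) support, also install memkind

Kudu needs memkind 1.8 or newer. The version available through yum may be too old, so building memkind from source is recommended.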

sudo yum install numactl-libs numactl-devel
git clone https://github.com/memkind/memkind.git
cd memkind
./build.sh --prefix=/usr
sudo yum remove memkind
sudo make install
sudo ldconfig
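To decide whether a packaged memkind is new enough, the version can be compared against the 1.8 minimum. The sketch below assumes GNU `sort -V` is available; `version_ge` is a hypothetical helper, and the `rpm` query in the comment is just one way to obtain the installed version.

```shell
#!/bin/sh
# version_ge A B: succeeds when version A >= version B.
# Relies on GNU sort -V (version sort); hypothetical helper for illustration.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example with literal versions. On a real host the first argument could be
# taken from:  rpm -q --qf '%{VERSION}' memkind
for v in 1.7.0 1.8.0 1.10.1; do
    if version_ge "$v" 1.8.0; then
        echo "memkind $v: new enough"
    else
        echo "memkind $v: too old, build from source"
    fi
done
```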

4. To build the documentation, also install Ruby

sudo yum install gem graphviz zlib-devel rh-ruby23

III. Building and Installing Kudu

1. Clone the source from Git

git clone https://github.com/apache/kudu

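Note: if the clone fails, retry it. Replacing "https" with "git" in the URL used to be a workaround, but GitHub no longer serves the unauthenticated git:// protocol.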

2. Build any missing third-party dependencies

cd kudu
build-support/enable_devtoolset.sh thirdparty/build-if-necessary.sh

Note: this step takes a long time and also downloads a lot of third-party source code.

3. Build the Kudu release version

mkdir -p build/release
cd build/release
../../build-support/enable_devtoolset.sh \
  ../../thirdparty/installed/common/bin/cmake \
  -DCMAKE_BUILD_TYPE=release ../..
make -j4
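The `-j4` above hard-codes four parallel jobs. A possible way to pick the value per machine is sketched below. `pick_jobs` is a hypothetical helper, and the rule of one job per 2 GB of RAM is an assumption made for this example, not a Kudu recommendation.

```shell
#!/bin/sh
# pick_jobs CPUS MEM_GB: hypothetical helper. Allows one job per CPU, but no
# more than one job per 2 GB of RAM (the 2 GB figure is an assumption).
pick_jobs() {
    cpus=$1
    mem_gb=$2
    by_mem=$((mem_gb / 2))
    if [ "$by_mem" -lt 1 ]; then
        by_mem=1
    fi
    if [ "$cpus" -lt "$by_mem" ]; then
        echo "$cpus"
    else
        echo "$by_mem"
    fi
}

# Read the CPU count and total memory on Linux, with fallbacks if unavailable.
cpus=$(nproc 2>/dev/null || echo 1)
mem_gb=$(awk '/^MemTotal:/ {print int($2 / 1024 / 1024)}' /proc/meminfo 2>/dev/null || true)
mem_gb=${mem_gb:-2}
echo "suggested: make -j$(pick_jobs "$cpus" "$mem_gb")"
```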

 

4. Install

sudo make DESTDIR=/opt/kudu install

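DESTDIR is optional. It names a staging root that is prepended to every install path, so the command above places the files under /opt/kudu/usr/local/. Without DESTDIR, the default locations are: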

kudu-tserver and kudu-master executables in /usr/local/sbin
Kudu command line tool in /usr/local/bin
Kudu client library in /usr/local/lib64/
Kudu client headers in /usr/local/include/kudu
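After `make install`, the locations listed above can be checked under whatever DESTDIR was used. This is a sketch only: `check_install` is a hypothetical helper, and it tests just the paths named in this article.

```shell
#!/bin/sh
# check_install ROOT: hypothetical helper that reports which of the install
# locations listed above exist under ROOT (ROOT is the DESTDIR that was used,
# or an empty string for a default install). Returns non-zero if any is missing.
check_install() {
    root=$1
    missing=0
    for path in \
        /usr/local/sbin/kudu-master \
        /usr/local/sbin/kudu-tserver \
        /usr/local/bin/kudu \
        /usr/local/lib64 \
        /usr/local/include/kudu
    do
        if [ -e "$root$path" ]; then
            echo "ok      $root$path"
        else
            echo "missing $root$path"
            missing=1
        fi
    done
    return "$missing"
}

# Usage after the install command above:
#   check_install /opt/kudu
```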

5. Build the documentation

make docs

IV. Configuration

1. Configure the master

mkdir conf
cd conf
cat >master.gflagfile<<EOF
## Comma-separated list of the RPC addresses belonging to all Masters in this cluster.
## NOTE: if not specified, configures a non-replicated Master.
#--master_addresses=172.16.70.36:7051,172.16.70.25:7051
--rpc_bind_addresses=172.16.70.36:7051

--log_dir=/data/kudu_data/master/logs
--log_filename=kudu1
--fs_wal_dir=/data/kudu_data/master/wal
--fs_data_dirs=/data/kudu_data/master/data
--enable_process_lifetime_heap_profiling=true
--heap_profile_path=/data/kudu_data/master/heap

--rpc_encryption=disabled
--rpc_authentication=disabled
#--unlock_unsafe_flags=true
#--allow_unsafe_replication_factor=true

#--max_log_size=1800
--max_log_size=2048

#--memory_limit_hard_bytes=0
--memory_limit_hard_bytes=1073741824

--default_num_replicas=3
--max_clock_sync_error_usec=10000000
--consensus_rpc_timeout_ms=30000
--follower_unavailable_considered_failed_sec=300
--leader_failure_max_missed_heartbeat_periods=3
#--block_manager_max_open_files=10240
#--server_thread_pool_max_thread_count=-1
--tserver_unresponsive_timeout_ms=60000
--rpc_num_service_threads=10
--max_negotiation_threads=50
--min_negotiation_threads=10
--rpc_negotiation_timeout_ms=3000
--rpc_default_keepalive_time_ms=65000

#--rpc_num_acceptors_per_address=1
--rpc_num_acceptors_per_address=5

#--master_ts_rpc_timeout_ms=30000
--master_ts_rpc_timeout_ms=60000

#--remember_clients_ttl_ms=60000
--remember_clients_ttl_ms=3600000

#--remember_responses_ttl_ms=60000
--remember_responses_ttl_ms=600000

#--rpc_service_queue_length=50
--rpc_service_queue_length=1000

#--raft_heartbeat_interval_ms=500
--raft_heartbeat_interval_ms=60000

#--heartbeat_interval_ms=1000
--heartbeat_interval_ms=60000
--heartbeat_max_failures_before_backoff=3

## You can avoid the dependency on ntpd by running Kudu with --use-hybrid-clock=false
## This is not recommended for production environment.
## NOTE: If you run without hybrid time the tablet history GC will not work. 
## Therefore when you delete or update a row the history of that data will be kept 
## forever. Eventually you may run out of disk space. 
--use_hybrid_clock=false

--webserver_enabled=true
--metrics_log_interval_ms=60000
--webserver_port=8051
#--webserver_doc_root=/home/kudu/www
EOF
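The flagfile keeps each default as a comment next to the overriding value, which makes it easy to lose track of what is actually set. The sketch below lists only the active flags and reports any flag set more than once. Both helpers are hypothetical and written for this article; dashes in flag names are treated the same as underscores.

```shell
#!/bin/sh
# active_flags FILE: print each uncommented --name=value line as name=value,
# with dashes in the name normalized to underscores. Hypothetical helper.
active_flags() {
    awk '/^--[^=]+=/ {
        split($0, parts, "=")
        name = substr(parts[1], 3)      # drop the leading "--"
        gsub(/-/, "_", name)
        sub(/^[^=]*=/, "")              # $0 now holds only the value
        print name "=" $0
    }' "$1"
}

# duplicate_flags FILE: print the names of flags that are set more than once.
duplicate_flags() {
    active_flags "$1" | cut -d= -f1 | sort | uniq -d
}

# Usage, run from the conf directory created above:
#   active_flags master.gflagfile
#   duplicate_flags master.gflagfile
```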

 

2. Configure the tserver

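cat >tserver.gflagfile<<EOF
## Comma-separated list of the RPC addresses of the Masters this tablet server connects to.
## NOTE: if not specified, the tablet server looks for a Master at 127.0.0.1:7051,
## which does not match the Master bind address used in this article.
#--tserver_master_addrs=172.16.70.36:7051,172.16.70.25:7051
--tserver_master_addrs=172.16.70.36:7051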
--rpc_bind_addresses=172.16.70.36:7050

--log_dir=/data/kudu_data/tserver/logs
--log_filename=kudu1
--fs_wal_dir=/data/kudu_data/tserver/wal
--fs_data_dirs=/data/kudu_data/tserver/data
--enable_process_lifetime_heap_profiling=true
--heap_profile_path=/data/kudu_data/tserver/heap

--rpc_encryption=disabled
--rpc_authentication=disabled
#--unlock_unsafe_flags=true
#--allow_unsafe_replication_factor=true

#--max_log_size=1800
--max_log_size=2048

#--memory_limit_hard_bytes=0
--memory_limit_hard_bytes=1073741824

--default_num_replicas=3
--max_clock_sync_error_usec=10000000
--consensus_rpc_timeout_ms=30000
--follower_unavailable_considered_failed_sec=300
--leader_failure_max_missed_heartbeat_periods=3
#--block_manager_max_open_files=10240
#--server_thread_pool_max_thread_count=-1
--tserver_unresponsive_timeout_ms=60000
--rpc_num_service_threads=10
--max_negotiation_threads=50
--min_negotiation_threads=10
--rpc_negotiation_timeout_ms=3000
--rpc_default_keepalive_time_ms=65000

#--rpc_num_acceptors_per_address=1
--rpc_num_acceptors_per_address=5

#--master_ts_rpc_timeout_ms=30000
--master_ts_rpc_timeout_ms=60000

#--remember_clients_ttl_ms=60000
--remember_clients_ttl_ms=3600000

#--remember_responses_ttl_ms=60000
--remember_responses_ttl_ms=600000

#--rpc_service_queue_length=50
--rpc_service_queue_length=1000

#--raft_heartbeat_interval_ms=500
--raft_heartbeat_interval_ms=60000

#--heartbeat_interval_ms=1000
--heartbeat_interval_ms=60000
--heartbeat_max_failures_before_backoff=3

## You can avoid the dependency on ntpd by running Kudu with --use-hybrid-clock=false
## This is not recommended for production environment.
## NOTE: If you run without hybrid time the tablet history GC will not work. 
## Therefore when you delete or update a row the history of that data will be kept 
## forever. Eventually you may run out of disk space. 
--use_hybrid_clock=false

--webserver_enabled=true
--metrics_log_interval_ms=60000
--webserver_port=8050
#--webserver_doc_root=/home/kudu/www
EOF

 

3. Create the kudu user and the required directories

useradd kudu
mkdir -p /data/kudu_data/{master,tserver}/{data,wal,logs,heap}
chown -R kudu:kudu /data/kudu_data/
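The directory list in the mkdir command has to stay in sync with the paths in the flagfiles. One way to avoid drift is to read the paths from the flagfile itself. This is a minimal sketch; `flagfile_dirs` is a hypothetical helper and only looks at the four path flags used in this article.

```shell
#!/bin/sh
# flagfile_dirs FILE: print the paths named by the active path flags in a
# gflagfile, one per line. fs_data_dirs may hold a comma-separated list, so
# commas are turned into line breaks. Hypothetical helper for illustration.
flagfile_dirs() {
    grep -E '^--(log_dir|fs_wal_dir|fs_data_dirs|heap_profile_path)=' "$1" |
        cut -d= -f2- | tr ',' '\n'
}

# Usage as root, from the conf directory:
#   flagfile_dirs master.gflagfile | xargs mkdir -p
#   flagfile_dirs tserver.gflagfile | xargs mkdir -p
```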

 

4. Configure the systemd services and environment variables

========= MASTER =========
cat >/usr/lib/systemd/system/kudu-master.service<<EOF
[Unit]
Description=Apache Kudu Master Server
Documentation=http://kudu.apache.org

[Service]
Environment=KUDU_HOME=/home/kudu
ExecStart=/home/kudu/build/release/bin/kudu-master --flagfile=/home/kudu/build/release/conf/master.gflagfile
TimeoutStopSec=5
Restart=on-failure
User=kudu
#LimitNOFILE=65535
#LimitNPROC=10240

[Install]
WantedBy=multi-user.target
EOF

========= TSERVER =========
cat >/usr/lib/systemd/system/kudu-tserver.service<<EOF
[Unit]
Description=Apache Kudu Tablet Server
Documentation=http://kudu.apache.org

[Service]
Environment=KUDU_HOME=/home/kudu
ExecStart=/home/kudu/build/release/bin/kudu-tserver --flagfile=/home/kudu/build/release/conf/tserver.gflagfile
TimeoutStopSec=5
Restart=on-failure
User=kudu
#LimitNOFILE=65535
#LimitNPROC=10240

[Install]
WantedBy=multi-user.target
EOF

========= ENVIRONMENT VARIABLES =========
cat >>/etc/profile<<'EOF'
export PATH=${PATH}:/home/kudu/build/release/bin
EOF
source /etc/profile
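One detail matters when a line containing ${PATH} is appended to /etc/profile through a heredoc: whether the delimiter is quoted decides when the variable is expanded. The sketch below writes to a temporary file instead of /etc/profile; `demo_var` is a made-up variable standing in for PATH.

```shell
#!/bin/sh
tmp=$(mktemp)
demo_var=expanded-now

# Unquoted delimiter: ${demo_var} is expanded while the file is being written,
# so the file stores the value the variable had at that moment.
cat >"$tmp" <<EOF
unquoted: ${demo_var}
EOF

# Quoted delimiter: the text is written literally and is only expanded later,
# when the file is sourced.
cat >>"$tmp" <<'EOF'
quoted: ${demo_var}
EOF

cat "$tmp"
```

For /etc/profile the quoted form is usually what is wanted, since the unquoted form freezes the PATH of whoever ran the command into the file.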

 

5. Start Kudu

systemctl daemon-reload
systemctl start kudu-master
systemctl start kudu-tserver
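If a service fails to start, one thing worth checking is whether the binary and flagfile named in ExecStart actually exist. The sketch below pulls both paths out of a unit file. `unit_paths` and `check_unit` are hypothetical helpers, and they assume the ExecStart layout used in this article (binary first, then `--flagfile=`).

```shell
#!/bin/sh
# unit_paths FILE: print the binary and the --flagfile path from ExecStart,
# one per line. Hypothetical helper; assumes the layout used in this article.
unit_paths() {
    awk '/^ExecStart=/ {
        sub(/^ExecStart=/, "")
        print $1
        for (i = 2; i <= NF; i++)
            if (sub(/^--flagfile=/, "", $i)) print $i
    }' "$1"
}

# check_unit FILE: report any of those paths that do not exist.
check_unit() {
    status=0
    for p in $(unit_paths "$1"); do
        if [ ! -e "$p" ]; then
            echo "missing: $p"
            status=1
        fi
    done
    return "$status"
}

# Usage:
#   check_unit /usr/lib/systemd/system/kudu-master.service
#   check_unit /usr/lib/systemd/system/kudu-tserver.service
```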

Once both services are running, their web UIs are available: the master at http://IP:8051 and the tablet server at http://IP:8050.
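The services take a moment to come up, so a script that depends on them may want to wait for the web UI ports first. This is a sketch under assumptions: `wait_for_port` is a hypothetical helper, and it relies on bash being installed for its /dev/tcp feature.

```shell
#!/bin/sh
# wait_for_port HOST PORT TRIES: hypothetical helper. Tries a TCP connection
# once per second and succeeds as soon as one is accepted; fails after TRIES
# unsuccessful attempts. Uses bash's /dev/tcp pseudo-device for the connect.
wait_for_port() {
    host=$1
    port=$2
    tries=$3
    while [ "$tries" -gt 0 ]; do
        if bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}

# Usage with the address and ports from the flagfiles above:
#   wait_for_port 172.16.70.36 8051 30 && echo "master web UI is up"
#   wait_for_port 172.16.70.36 8050 30 && echo "tserver web UI is up"
```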

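V. Using Kudu from a Client

Kudu provides demos in several languages; see the examples directory in the source tree.

To be continued...

Please credit the source when reposting: http://www.julyme.com/20210611/112.html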
