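SolidFire, founded in 2010, is an all-flash array storage vendor. Its storage controllers run on standard x86 servers, and a cluster can scale out to 100 nodes. NetApp acquired SolidFire in December 2015, and in June 2017 NetApp launched a hyper-converged appliance based on SolidFire.
SolidFire provides distributed block storage, similar to Ceph RBD. It is flexible, supports dynamic scale-out and scale-in, and performs well. It also offers many enterprise features, such as snapshots, group snapshots, a rich API, and highly flexible QoS configuration.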
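Author: 潘晓华Michael
Link: https://www.jianshu.com/p/8a393c68f2c9
Source: Jianshu (简书)
Copyright belongs to the author. For any form of reproduction, please contact the author for authorization and credit the source.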
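What is Trident?
NetApp is a Gold member of the CNCF. Trident, which it develops, is an open-source storage provisioner and orchestrator.
Without Trident, using NetApp storage from a K8s/OpenShift environment means manually creating a volume in the NetApp console, then creating a PV for it, and then creating a PVC. This requires switching between two platforms and is tedious.
Once Trident is deployed and the corresponding StorageClass is configured, the K8s/OpenShift platform can provision storage dynamically: creating a PVC that references the StorageClass is enough. Through the Trident controller, the platform calls the NetApp device's API to control the device: it creates the volume, then automatically creates the PV and binds it to the PVC so that a Pod can use it. The whole process is automatic and transparent to platform users.
Trident itself also runs as a Pod on the K8s/OpenShift platform.
Deploying and using Trident on OpenShift
Preparation
- Log in to the cluster as system:admin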
$ oc login -u system:admin
- Make sure the cluster can reach the SolidFire MVIP (management VIP) and SVIP (storage VIP)
$ telnet $MVIP 443
$ telnet $SVIP 3260
- Install the base packages
$ ansible all -m package -a 'name=lsscsi,iscsi-initiator-utils,sg3_utils,device-mapper-multipath state=present'
$ ansible all -m shell -a 'mpathconf --enable --with_multipathd y'
$ ansible all -m service -a 'name=iscsid enabled=true state=started'
$ ansible all -m service -a 'name=multipathd enabled=true state=started'
$ ansible all -m service -a 'name=iscsi enabled=true state=started'
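The reachability checks in the preparation steps can also be scripted. The following is a minimal Python sketch, assuming the addresses are available in the MVIP and SVIP environment variables as in the telnet commands; the helper names can_connect and check_endpoints are made up for illustration.

```python
import os
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_endpoints(endpoints: dict) -> dict:
    """Map each label to the reachability of its (host, port) pair."""
    return {label: can_connect(host, port)
            for label, (host, port) in endpoints.items()}


if __name__ == "__main__":
    # Assumption: MVIP and SVIP are exported in the environment.
    mvip = os.environ.get("MVIP")
    svip = os.environ.get("SVIP")
    if mvip and svip:
        print(check_endpoints({
            "MVIP:443": (mvip, 443),     # management API (HTTPS)
            "SVIP:3260": (svip, 3260),   # iSCSI portal
        }))
```

A TCP connect only shows that the port is reachable from the machine running the script; each node that will mount volumes needs the same check.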
Deployment
- Download the installer and unpack it
$ wget https://github.com/NetApp/trident/releases/download/v18.10.0/trident-installer-18.10.0.tar.gz
$ tar -xf trident-installer-18.10.0.tar.gz
$ cd trident-installer
- Configure the backend.json file used by the installer
$ cp sample-input/backend-solidfire.json setup/backend.json  # then edit the settings inside
$ cat setup/backend.json
{
  "version": 1,
  "storageDriverName": "solidfire-san",
  "Endpoint": "https://{{username}}:{{password}}@{{management VIP}}/json-rpc/11.0",
  "SVIP": "{{storage VIP}}:3260",
  "TenantName": "trident",
  "UseCHAP": true,
  "InitiatorIFace": "default",
  "Types": [
    {"Type": "Bronze", "Qos": {"minIOPS": 1000, "maxIOPS": 2000, "burstIOPS": 4000}},
    {"Type": "Silver", "Qos": {"minIOPS": 4000, "maxIOPS": 6000, "burstIOPS": 8000}},
    {"Type": "Gold", "Qos": {"minIOPS": 6000, "maxIOPS": 8000, "burstIOPS": 10000}}
  ]
}
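Before handing backend.json to the installer, it can help to sanity-check the QoS tiers. The sketch below is an illustration and not part of Trident: the function validate_qos_types is hypothetical, and the rule it checks (minIOPS <= maxIOPS <= burstIOPS, which all three tiers in this file satisfy) is an assumption about what a sensible tier looks like.

```python
# The Types section of the backend.json shown in this article.
SAMPLE_BACKEND = {
    "Types": [
        {"Type": "Bronze", "Qos": {"minIOPS": 1000, "maxIOPS": 2000, "burstIOPS": 4000}},
        {"Type": "Silver", "Qos": {"minIOPS": 4000, "maxIOPS": 6000, "burstIOPS": 8000}},
        {"Type": "Gold", "Qos": {"minIOPS": 6000, "maxIOPS": 8000, "burstIOPS": 10000}},
    ]
}

QOS_KEYS = ("minIOPS", "maxIOPS", "burstIOPS")


def validate_qos_types(backend: dict) -> list:
    """Return a list of human-readable problems found in the backend's Types."""
    problems = []
    seen = set()
    for entry in backend.get("Types", []):
        name = entry.get("Type", "<unnamed>")
        qos = entry.get("Qos", {})
        if name in seen:
            problems.append(f"{name}: duplicate Type name")
        seen.add(name)
        missing = [key for key in QOS_KEYS if key not in qos]
        if missing:
            problems.append(f"{name}: missing {', '.join(missing)}")
            continue
        # Assumed ordering rule: min <= max <= burst.
        if not (qos["minIOPS"] <= qos["maxIOPS"] <= qos["burstIOPS"]):
            problems.append(f"{name}: expected minIOPS <= maxIOPS <= burstIOPS")
    return problems


if __name__ == "__main__":
    # To check a real file, load it first with json.load(open("setup/backend.json")).
    print(validate_qos_types(SAMPLE_BACKEND) or "no problems found")
```

This catches only structural mistakes in the tiers; the dry run described next is still what verifies the endpoint, credentials, and SVIP.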
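- Create the trident project
$ oc new-project trident
- Pre-installation check
$ ./tridentctl install --dry-run -n trident
This step runs through the installation as a simulation and then deletes all the resources it created, which gives the whole environment a thorough check. The execution log follows.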
[root@master02 trident-installer]# ./tridentctl install --dry-run -n trident -d
DEBU Initialized logging. logLevel=debug
DEBU Running outside a pod, creating CLI-based client.
DEBU Initialized Kubernetes CLI client. cli=oc flavor=openshift namespace=trident version=1.11.0+d4cacc0
DEBU Validated installation environment. installationNamespace=trident kubernetesVersion=
DEBU Deleted Kubernetes configmap. label="app=trident-installer.netapp.io" namespace=trident
DEBU Namespace exists. namespace=trident
DEBU Deleted Kubernetes object by YAML.
DEBU Deleted installer cluster role binding.
DEBU Deleted Kubernetes object by YAML.
DEBU Deleted installer cluster role.
DEBU Deleted Kubernetes object by YAML.
DEBU Deleted installer service account.
DEBU Removed security context constraint user. scc=privileged user=trident-installer
DEBU Created Kubernetes object by YAML.
INFO Created installer service account. serviceaccount=trident-installer
DEBU Created Kubernetes object by YAML.
INFO Created installer cluster role. clusterrole=trident-installer
DEBU Created Kubernetes object by YAML.
INFO Created installer cluster role binding. clusterrolebinding=trident-installer
INFO Added security context constraint user. scc=privileged user=trident-installer
DEBU Created Kubernetes configmap from directory. label="app=trident-installer.netapp.io" name=trident-installer namespace=trident path=/root/trident-installer/setup
INFO Created installer configmap. configmap=trident-installer
DEBU Created Kubernetes object by YAML.
INFO Created installer pod. pod=trident-installer
INFO Waiting for Trident installer pod to start.
DEBU Trident installer pod not yet started, waiting. increment=280.357322ms message="pod not yet started (Pending)"
DEBU Trident installer pod not yet started, waiting. increment=523.702816ms message="pod not yet started (Pending)"
DEBU Trident installer pod not yet started, waiting. increment=914.246751ms message="pod not yet started (Pending)"
DEBU Trident installer pod not yet started, waiting. increment=1.111778662s message="pod not yet started (Pending)"
DEBU Pod started. phase=Succeeded
INFO Trident installer pod started. namespace=trident pod=trident-installer
DEBU Getting logs. cmd="oc --namespace=trident logs trident-installer -f"
DEBU Initialized logging. logLevel=debug
DEBU Running in a pod, creating API-based client. namespace=trident
DEBU Initialized Kubernetes API client. cli=oc flavor=openshift namespace=trident version=v1.11.0+d4cacc0
DEBU Validated installation environment. installationNamespace=trident kubernetesVersion=v1.11.0+d4cacc0
DEBU Parsed requested volume size. quantity=2Gi
DEBU Dumping RBAC fields. ucpBearerToken= ucpHost= useKubernetesRBAC=true
DEBU Namespace exists. namespace=trident
DEBU PVC does not exist. pvc=trident
DEBU PV does not exist. pv=trident
INFO Starting storage driver. backend=/setup/backend.json
DEBU config: {"Endpoint":"https://admin:root1234@99.248.106.82/json-rpc/11.0","InitiatorIFace":"default","SVIP":"99.248.82.55:3260","TenantName":"trident","Types":[{"Qos":{"burstIOPS":4000,"maxIOPS":2000,"minIOPS":1000},"Type":"Bronze"},{"Qos":{"burstIOPS":8000,"maxIOPS":6000,"minIOPS":4000},"Type":"Silver"},{"Qos":{"burstIOPS":10000,"maxIOPS":8000,"minIOPS":6000},"Type":"Gold"}],"UseCHAP":true,"storageDriverName":"solidfire-san","version":1}
DEBU Storage prefix is absent, will use default prefix.
DEBU Parsed commonConfig: {Version:1 StorageDriverName:solidfire-san BackendName: Debug:false DebugTraceFlags:map[] DisableDelete:false StoragePrefixRaw:[] StoragePrefix:<nil> SerialNumbers:[] DriverContext: LimitVolumeSize:}
DEBU Initializing storage driver. driver=solidfire-san
DEBU Configuration defaults Size=1G StoragePrefix= UseCHAP=true
DEBU Parsed into solidfireConfig DisableDelete=false StorageDriverName=solidfire-san Version=1
DEBU Decoded to &{CommonStorageDriverConfig:0xc42064e0a0 TenantName:trident EndPoint:https://admin:root1234@99.248.106.82/json-rpc/11.0 SVIP:99.248.82.55:3260 InitiatorIFace:default Types:0xc4206d26e0 LegacyNamePrefix: AccessGroups:[] UseCHAP:true DefaultBlockSize:0 SolidfireStorageDriverConfigDefaults:{CommonStorageDriverConfigDefaults:{Size:1G}}}
DEBU Set default block size. defaultBlockSize=512
DEBU Using SF API version from config file. version=11.0
DEBU Initializing SolidFire API client. cfg="{trident https://admin:root1234@99.248.106.82/json-rpc/11.0 99.248.82.55:3260 default 0xc4206d26e0 [] 512 map[]}" endpoint="https://admin:root1234@99.248.106.82/json-rpc/11.0" svip="99.248.82.55:3260"
ERRO Error detected in API response. ID=637 code=500 message=xUnknownAccount name=xUnknownAccount
DEBU Account not found, creating. error="device API error: xUnknownAccount" tenantName=trident
DEBU Created account. accountID=0 tenantName=trident
DEBU SolidFire driver initialized. AccountID=2 InitiatorIFace=default
DEBU Using CHAP, skipped Volume Access Group logic. AccessGroups="[]" SVIP="99.248.82.55:3260" UseCHAP=true driver=solidfire-san
DEBU Added pool for SolidFire backend. attributes="map[media:{Offers: ssd} IOPS:{Min: 1000, Max: 2000} snapshots:{Offer: true} clones:{Offer: true} encryption:{Offer: false} provisioningType:{Offers: thin} backendType:{Offers: solidfire-san}]" backend=solidfire_99.248.82.55 pool=Bronze
DEBU Added pool for SolidFire backend. attributes="map[clones:{Offer: true} encryption:{Offer: false} provisioningType:{Offers: thin} backendType:{Offers: solidfire-san} media:{Offers: ssd} IOPS:{Min: 4000, Max: 6000} snapshots:{Offer: true}]" backend=solidfire_99.248.82.55 pool=Silver
DEBU Added pool for SolidFire backend. attributes="map[snapshots:{Offer: true} clones:{Offer: true} encryption:{Offer: false} provisioningType:{Offers: thin} backendType:{Offers: solidfire-san} media:{Offers: ssd} IOPS:{Min: 6000, Max: 8000}]" backend=solidfire_99.248.82.55 pool=Gold
DEBU Storage driver initialized. driver=solidfire-san
INFO Storage driver loaded. driver=solidfire-san
INFO Dry run completed, no problems found.
DEBU Received EOF from pod logs. container= pod=trident-installer
INFO Waiting for Trident installer pod to finish.
DEBU Pod finished. phase=Succeeded
INFO Trident installer pod finished. namespace=trident pod=trident-installer
DEBU Deleted Kubernetes pod. label="app=trident-installer.netapp.io" namespace=trident
INFO Deleted installer pod. pod=trident-installer
DEBU Deleted Kubernetes configmap. label="app=trident-installer.netapp.io" namespace=trident
INFO Deleted installer configmap. configmap=trident-installer
INFO In-cluster installation completed.
DEBU Deleted Kubernetes object by YAML.
INFO Deleted installer cluster role binding.
DEBU Deleted Kubernetes object by YAML.
INFO Deleted installer cluster role.
DEBU Deleted Kubernetes object by YAML.
INFO Deleted installer service account.
INFO Removed security context constraint user. scc=privileged user=trident-installer
- Actual installation
$ ./tridentctl install -n trident
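This step performs the real installation. It creates the serviceaccount, clusterrolebinding, configmap, and the trident-installer pod (which is deleted once the trident deployment has been deployed), creates a PV and the trident PVC for initialization, and finally creates the trident deployment, which completes the Trident installation.
- The Trident installation supports some custom settings:
- --etcd-image specifies the etcd image (the default is quay.io/coreos/etcd, which can be slow to download)
- --trident-image specifies the Trident image
- --volume-size specifies the size of Trident's persistent storage (default 2GiB)
- --volume-name specifies the volume name (default etcd-vol)
- --pv specifies the PV name (default trident)
- --pvc specifies the PVC name (default trident)
- --generate-custom-yaml exports all the configuration in use to a setup directory without making any changes to the cluster
- --use-custom-yaml deploys Trident from all the YAML files under setup
The execution log follows.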
[root@master02 trident-installer]# ./tridentctl install -n trident
INFO Created installer service account. serviceaccount=trident-installer
INFO Created installer cluster role. clusterrole=trident-installer
INFO Created installer cluster role binding. clusterrolebinding=trident-installer
INFO Added security context constraint user. scc=privileged user=trident-installer
INFO Created installer configmap. configmap=trident-installer
INFO Created installer pod. pod=trident-installer
INFO Waiting for Trident installer pod to start.
INFO Trident installer pod started. namespace=trident pod=trident-installer
INFO Starting storage driver. backend=/setup/backend.json
INFO Storage driver loaded. driver=solidfire-san
INFO Starting Trident installation. namespace=trident
INFO Created service account.
INFO Created cluster role.
INFO Created cluster role binding.
INFO Added security context constraint user. scc=anyuid user=trident
INFO Created PVC.
INFO Controller serial numbers. serialNumbers="4BZXJB2,85Q8JB2,4BXXJB2,4BXTJB2"
INFO Created iSCSI CHAP secret. secret=trident-chap-solidfire-99-248-82-55-trident
INFO Created PV. pv=trident
INFO Waiting for PVC to be bound. pvc=trident
INFO Created Trident deployment.
INFO Waiting for Trident pod to start.
INFO Trident pod started. namespace=trident pod=trident-57ccdff48f-gtflx
INFO Waiting for Trident REST interface.
INFO Trident REST interface is up. version=18.10.0
INFO Trident installation succeeded.
INFO Waiting for Trident installer pod to finish.
INFO Trident installer pod finished. namespace=trident pod=trident-installer
INFO Deleted installer pod. pod=trident-installer
INFO Deleted installer configmap. configmap=trident-installer
INFO In-cluster installation completed.
INFO Deleted installer cluster role binding.
INFO Deleted installer cluster role.
INFO Deleted installer service account.
INFO Removed security context constraint user. scc=privileged user=trident-installer
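When the install is driven from a script, the options listed before this log can be assembled programmatically. This is a small sketch that only builds the command line and does not run it; the flag names come from the list in this article, while the function build_install_command and the registry name registry.example.com are hypothetical.

```python
import shlex


def build_install_command(namespace, dry_run=False, **options):
    """Build the tridentctl install argv from keyword options.

    Keyword names map to flags by replacing "_" with "-",
    e.g. etcd_image="..." becomes "--etcd-image ...".
    A value of True emits the bare flag; None or False omits it.
    """
    argv = ["./tridentctl", "install", "-n", namespace]
    if dry_run:
        argv.append("--dry-run")
    for key, value in options.items():
        flag = "--" + key.replace("_", "-")
        if value is True:
            argv.append(flag)
        elif value is not None and value is not False:
            argv.extend([flag, str(value)])
    return argv


if __name__ == "__main__":
    cmd = build_install_command(
        "trident",
        etcd_image="registry.example.com/coreos/etcd",  # hypothetical mirror
        volume_size="2Gi",
    )
    # Print the command for review instead of executing it.
    print(shlex.join(cmd))
```

Printing the command first makes it easy to review before passing the list to subprocess.run.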
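- After install finishes, Trident has not added the backend configured earlier; it must be added separately. (Personally I find this a bit redundant on NetApp's part: the dry run has already checked backend.json, so having install add it directly would be more convenient.)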
$ ./tridentctl -n trident create backend -f setup/backend.json
$ ./tridentctl -n trident get backend
+------------------------+----------------+--------+---------+
| NAME                   | STORAGE DRIVER | ONLINE | VOLUMES |
+------------------------+----------------+--------+---------+
| solidfire_99.248.82.55 | solidfire-san  | true   | 0       |
+------------------------+----------------+--------+---------+
- Add a basic StorageClass
Replace __BACKEND_TYPE__ in sample-input/storage-class-basic.yaml.templ with the STORAGE DRIVER value of the backend (solidfire-san in this example)
$ cat sample-input/storage-class-basic.yaml.templ
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: basic
provisioner: netapp.io/trident
parameters:
  backendType: "__BACKEND_TYPE__"
$ sed "s/__BACKEND_TYPE__/solidfire-san/" sample-input/storage-class-basic.yaml.templ | oc create -f -
- Create a StorageClass for each Type defined in the backend
$ cat storage-class-gold.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gold
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: netapp.io/trident
parameters:
  storagePools: "solidfire_99.248.82.55:Gold"  # solidfire_99.248.82.55 is the backend name; Gold is the chosen Type
$ oc create -f storage-class-gold.yaml
Check the current StorageClasses
$ oc get sc
NAME             PROVISIONER         AGE
basic            netapp.io/trident   2h
gold (default)   netapp.io/trident   1h
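Since each backend Type maps to one StorageClass, the manifests can be generated rather than written by hand. The following is a sketch under stated assumptions: it reuses the provisioner and the storagePools format ("backend name:Type") from the gold example, names each class after the lowercased Type, and emits JSON, which oc create -f - accepts as well as YAML. The function name storage_class_for_type is made up for this example.

```python
import json


def storage_class_for_type(backend_name, type_name, default=False):
    """Build a StorageClass manifest that pins volumes to one backend Type."""
    manifest = {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": type_name.lower()},
        "provisioner": "netapp.io/trident",
        "parameters": {"storagePools": f"{backend_name}:{type_name}"},
    }
    if default:
        # Mark this class as the cluster default, as the gold example does.
        manifest["metadata"]["annotations"] = {
            "storageclass.kubernetes.io/is-default-class": "true"
        }
    return manifest


if __name__ == "__main__":
    backend = "solidfire_99.248.82.55"
    items = [
        storage_class_for_type(backend, name, default=(name == "Gold"))
        for name in ("Bronze", "Silver", "Gold")
    ]
    # Wrap the manifests in a v1 List so they can be created in one call.
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": items}, indent=2))
```

Saved under a file name of your choice, the output can be piped to oc create -f - in the same way as the sed command for the basic class.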
Usage: creating a PVC
- Create the first PVC
$ cat test-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    volume.beta.kubernetes.io/storage-class: gold
    volume.beta.kubernetes.io/storage-provisioner: netapp.io/trident
    trident.netapp.io/reclaimPolicy: "Retain"
  name: testpvc
  namespace: test
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
$ oc create -f test-pvc.yaml
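Notes on creating the PVC:
- volume.beta.kubernetes.io/storage-class is a StorageClass created in the earlier steps (the basic StorageClass or the per-Type StorageClass)
- volume.beta.kubernetes.io/storage-provisioner is set to NetApp's Trident
- trident.netapp.io/reclaimPolicy sets the reclaimPolicy of the PV that gets created; the default is "Delete"; "Delete" and "Retain" are supported, "Recycle" is not
- accessModes: because SolidFire is block storage, only ReadWriteOnce is supported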
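The constraints in these notes are easy to check before submitting a claim. Below is a minimal sketch that encodes only the two rules stated here (reclaim policy limited to Delete or Retain, access mode limited to ReadWriteOnce); the function check_pvc is hypothetical and is not something Trident provides.

```python
SUPPORTED_RECLAIM_POLICIES = {"Delete", "Retain"}

# The PVC from test-pvc.yaml, expressed as a Python dict.
TEST_PVC = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {
        "name": "testpvc",
        "namespace": "test",
        "annotations": {
            "volume.beta.kubernetes.io/storage-class": "gold",
            "volume.beta.kubernetes.io/storage-provisioner": "netapp.io/trident",
            "trident.netapp.io/reclaimPolicy": "Retain",
        },
    },
    "spec": {
        "accessModes": ["ReadWriteOnce"],
        "resources": {"requests": {"storage": "1Gi"}},
    },
}


def check_pvc(manifest: dict) -> list:
    """Return the rule violations found in a PVC manifest (empty if none)."""
    problems = []
    annotations = manifest.get("metadata", {}).get("annotations", {})
    # The notes give "Delete" as the default when the annotation is absent.
    policy = annotations.get("trident.netapp.io/reclaimPolicy", "Delete")
    if policy not in SUPPORTED_RECLAIM_POLICIES:
        problems.append(f"unsupported reclaimPolicy: {policy}")
    modes = manifest.get("spec", {}).get("accessModes", [])
    if modes != ["ReadWriteOnce"]:
        problems.append(f"unsupported accessModes: {modes}")
    return problems


if __name__ == "__main__":
    print(check_pvc(TEST_PVC) or "PVC looks consistent with the notes")
```

A check like this fits naturally in a CI step that lints manifests before oc create is run.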
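SolidFire functional tests
Restoring data from a snapshot
Create a snapshot
Create a new PVC based on the snapshot
Select the snapshot and create a new volume from it
Look up the IQN of the new volume
Create a PV based on the new volume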
$ cat test-clone-pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    pv.kubernetes.io/provisioned-by: netapp.io/trident
    volume.beta.kubernetes.io/storage-class: gold
  name: test-dd-testxx-volume
spec:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 100Gi
  iscsi:
    chapAuthDiscovery: true
    chapAuthSession: true
    fsType: ext4
    iqn: iqn.2010-01.com.solidfire:fs69.test-dd-testxx-volume.169
    iscsiInterface: default
    lun: 0
    secretRef:
      name: trident-chap-solidfire-99-248-82-55-trident
      namespace: trident
    targetPortal: 99.248.82.55:3260
  persistentVolumeReclaimPolicy: Delete
  storageClassName: gold
$ oc create -f test-clone-pv.yaml
Create a PVC that uses the manually created PV
$ cat test-clone-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test111x
  namespace: test-dd
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
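Group snapshots
A group snapshot is similar to a snapshot. The difference is that it snapshots the data of multiple volumes at the same point in time, which avoids inconsistent data across volumes. On restore, the data of all the volumes is likewise restored to the moment of the backup.
Cloning the data of an existing PVC
Add the annotation trident.netapp.io/cloneFromPVC: test-pvc to create a new PVC based on the existing PVC test-pvc
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    trident.netapp.io/cloneFromPVC: test-pvc
  name: test-clone-pvc
  namespace: test-dd
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
  storageClassName: gold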
SolidFire performance tests
Test environment
- OpenShift 3.11 deployed on physical machines:
3 Masters
4 Nodes
- SolidFire, 4 nodes, model SF9605:
10 SSDs per node; each node delivers 50,000 IOPS, for a cluster maximum of 200,000 IOPS
- Every PV uses the gold StorageClass:
{"Type": "Gold", "Qos": {"minIOPS": 6000, "maxIOPS": 8000, "burstIOPS": 10000}}
dd tests
# Test command
$ dd if=/dev/zero of=/data/dd.test bs=4k count=200000 oflag=direct
- Single pod, single PV, one dd process
Create a deployment to run the test
$ cat test0-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    volume.beta.kubernetes.io/storage-class: gold
    volume.beta.kubernetes.io/storage-provisioner: netapp.io/trident
  name: test0
  namespace: test-dd
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
$ oc create -f test0-pvc.yaml  ## create the volume for the test
$ cat dd.yaml
apiVersion: apps.openshift.io/v1
kind: DeploymentConfig
metadata:
  labels:
    run: ddtest
  name: ddtest
spec:
  replicas: 1
  selector:
    run: ddtest
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        run: ddtest
    spec:
      containers:
      - command:
        - /bin/bash
        - '-c'
        - |
          #!/bin/bash
          dd if=/dev/zero of=/data/out.test1 bs=4k count=200000 oflag=direct
        image: tools/iqperf:latest
        imagePullPolicy: Always
        name: ddtest
        volumeMounts:
        - mountPath: /data
          name: volume-spq10
      volumes:
      - name: volume-spq10
        persistentVolumeClaim:
          claimName: test0
  triggers:
  - type: ConfigChange
$ oc create -f dd.yaml
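The log shown in the web console:
200000+0 records in
200000+0 records out
819200000 bytes (819 MB) copied, 68.8519 s, 11.9 MB/s
Check the cluster I/O status on the NetApp management platform, as shown in the figure (only the part after 11:32 is relevant)
IOPS: 2908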
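The dd summary line and the IOPS figure from the management platform can be cross-checked with a little arithmetic. This sketch assumes that with oflag=direct each 4k block is issued as one I/O, so IOPS is simply blocks written divided by elapsed time; the function name dd_stats is made up for this example.

```python
def dd_stats(bytes_copied: int, seconds: float, block_size: int) -> dict:
    """Derive operation count, throughput, and IOPS from a dd summary line."""
    ops = bytes_copied / block_size
    return {
        "ops": ops,
        "mb_per_s": bytes_copied / seconds / 1e6,  # dd reports decimal MB
        "iops": ops / seconds,
    }


if __name__ == "__main__":
    # Values from the log: 819200000 bytes in 68.8519 s with bs=4k (4096 bytes).
    stats = dd_stats(819_200_000, 68.8519, 4096)
    print(f"{stats['ops']:.0f} ops, {stats['mb_per_s']:.1f} MB/s, {stats['iops']:.0f} IOPS")
```

This works out to roughly 2,905 IOPS, which is in line with the 2908 read from the management platform.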
- 1 pod, 1 PV, 8 dd processes
Update the command of the DeploymentConfig from test 1 to:
...
- command:
  - '/bin/bash'
  - '-c'
  - |
    #!/bin/bash
    for i in {1..8}
    do
      dd if=/dev/zero of=/data/dd.test$i bs=4k count=200000 oflag=direct &
    done
    sleep 1000000
...
IOPS: 10000, which matches the burstIOPS of 10000 configured for the Gold Type
- 8 pods, 8 PVs, all running dd at the same time
Create a StatefulSet and use volumeClaimTemplates to create the volumes in bulk
$ cat dd-statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: testdd
  namespace: test-dd
spec:
  serviceName: testdd
  replicas: 8
  selector:
    matchLabels:
      app: testdd
  template:
    metadata:
      labels:
        app: testdd
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - command:
        - /bin/bash
        - '-c'
        - |
          #!/bin/bash
          dd if=/dev/zero of=/data/out.test1 bs=4k count=2000000 oflag=direct
        image: 'harbor.apps.it.mbcloud.com/tools/iqperf:latest'
        imagePullPolicy: Always
        name: testdd
        volumeMounts:
        - name: data
          mountPath: /data
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes:
      - ReadWriteOnce
      storageClassName: gold
      resources:
        requests:
          storage: 100Gi
IOPS: 33883
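- 8 pods, 8 PVs, 8 dd processes per pod at the same time, 64 dd processes in total
Change the command of the StatefulSet from test 3 as follows:
...
- command:
  - '/bin/bash'
  - '-c'
  - |
    #!/bin/bash
    for i in {1..8}
    do
      dd if=/dev/zero of=/data/dd.test$i bs=4k count=200000 oflag=direct &
    done
    sleep 1000000
...
IOPS: 76832, which reaches the IOPS limit set for the Gold Type (8 PVs × burstIOPS 10000 = 80000)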
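- 50 pods, 50 PVs, 3 dd processes per pod at the same time, 150 dd processes in total
Details of a single PV at this point
IOPS: 205545, which reaches the IOPS limit. At about 4,100 IOPS per PV, the ceiling here corresponds to the cluster maximum of roughly 200,000 IOPS given in the test environment, not to the per-volume limit of the Gold Type.
Summary of results:
Pods | PVs | dd processes | IOPS |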
---|---|---|---|
1 | 1 | 1 | 2908 |
1 | 1 | 8 | 10000 |
8 | 8 | 8 | 33883 |
8 | 8 | 64 | 76832 |
50 | 50 | 150 | 205545 |
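To see which limit each run is pressing against, the results can be normalised per PV and compared with a ceiling. The sketch below is only an illustration: treating the ceiling as the number of PVs times the Gold burstIOPS, capped at the cluster maximum from the test environment, is an assumption of this example and not a formula from NetApp. The function name analyse is likewise made up.

```python
GOLD_QOS = {"minIOPS": 6000, "maxIOPS": 8000, "burstIOPS": 10000}
CLUSTER_MAX_IOPS = 200_000  # 4 nodes x 50,000 IOPS, per the test environment

# (pods, PVs, dd processes, measured IOPS) from the results table.
RESULTS = [
    (1, 1, 1, 2908),
    (1, 1, 8, 10000),
    (8, 8, 8, 33883),
    (8, 8, 64, 76832),
    (50, 50, 150, 205545),
]


def analyse(pods: int, pvs: int, dd_procs: int, iops: int) -> dict:
    """Compute per-PV IOPS and the assumed ceiling for one test run."""
    per_pv = iops / pvs
    # Assumed ceiling: per-volume burst limit summed over PVs, capped by the cluster.
    ceiling = min(pvs * GOLD_QOS["burstIOPS"], CLUSTER_MAX_IOPS)
    return {"per_pv": per_pv, "ceiling": ceiling, "ratio": iops / ceiling}


if __name__ == "__main__":
    for row in RESULTS:
        r = analyse(*row)
        print(f"{row[1]:>3} PVs, {row[2]:>3} dd: {r['per_pv']:7.0f} IOPS/PV, "
              f"ceiling {r['ceiling']:>6}, {r['ratio']:.0%} of ceiling")
```

Under these assumptions the 64-process run sits at about 96% of its ceiling, while the 50-PV run is bounded by the cluster as a whole and not by the per-volume QoS.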
Database test
Test tool: mydbtest
Test configuration
$ mysql -uapp -h172.30.213.17 -papp app -e "create table t_mytest(col1 int);"
$ cat test.conf
option
name app
loop 20000
user app/app@172.30.213.17:3306:app
declare
a int 1030000
begin
#select * from t_mytest where col1 = :a; # query
insert into t_mytest set col1 = :a; # insert
end
Running the test
./mydbtest_64.bin query=test.conf degree=40
Results
# Inserting data
2019-01-23 17:59:35 Total tran=20000=312/s, qtps=40000=624/s, ela=64046 ms, avg=3202 us
Summary: SQL01 exec=800000, rows=0=0/e, avg=65 us
Summary: SQL02 exec=800000, rows=800000=100/e, avg=3135 us
Summary: exec=12307/s, qtps=24615/s
# Reading data after creating an index (of limited reference value)
2019-01-23 17:56:31 Total tran=20000=3835/s, qtps=40000=7670/s, ela=5203 ms, avg=260 us
Summary: SQL01 exec=800000, rows=22668078=2833/e, avg=174 us
Summary: SQL02 exec=800000, rows=0=0/e, avg=69 us
Summary: exec=133333/s, qtps=266666/s
The insert qtps is 24615/s, which is good performance.