やっ太郎ブログ: 2012

Wednesday, December 26, 2012

OpenStack Swift: Accidentally Deleted an Object Taro

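This is how to recover when an object's files have been deleted directly on disk in OpenStack Swift (here I deleted them on purpose). It basically shouldn't happen, but this is for the "oops, I removed it with rm" kind of situation. Note that if you delete a file by mistake and then just cp it back from another replica, it ends up unusable (not recognized as an object?), probably because a plain cp does not carry over the extended attributes where Swift keeps the object metadata, so be careful (*´ω｀*)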

# First, check the files. They're there, right?

root@yatta-swift01:/srv/node/sdb1/objects/222204# ls -l
total 4
drwxr-xr-x 3 yatta_swift yatta_swift  45 Dec 26 14:15 159
drwxr-xr-x 3 yatta_swift yatta_swift  45 Dec 26 14:15 2bb
drwxr-xr-x 3 yatta_swift yatta_swift  45 Dec 26 14:15 a82
drwxr-xr-x 3 yatta_swift yatta_swift  45 Dec 26 14:15 b24
drwxr-xr-x 3 yatta_swift yatta_swift  45 Dec 26 14:15 e20
-rw------- 1 yatta_swift yatta_swift 223 Dec 26 14:15 hashes.pkl
root@yatta-swift01:/srv/node/sdb1/objects/222204# 
root@yatta-swift01:/srv/node/sdb1/objects/222204# ls -l e20/d8ff1b40757392c4af291b9839ae1e20/1356405624.80957.data 
-rw------- 1 yatta_swift yatta_swift 211481 Dec 26 14:15 e20/d8ff1b40757392c4af291b9839ae1e20/1356405624.80957.data
root@yatta-swift01:/srv/node/sdb1/objects/222204#

#
# Try deleting the directory. The key point is to delete hashes.pkl
# along with it, since that is what the replicator checks.

root@yatta-swift01:/srv/node/sdb1/objects/222204# rm -fR e20/ hashes.pkl 
root@yatta-swift01:/srv/node/sdb1/objects/222204# ls -ltr
root@yatta-swift01:/srv/node/sdb1/objects/222204# ls -ltr
total 0
drwxr-xr-x 3 cy_yatta cy_yatta 45 Dec 26 14:15 b24
drwxr-xr-x 3 cy_yatta cy_yatta 45 Dec 26 14:15 a82
drwxr-xr-x 3 cy_yatta cy_yatta 45 Dec 26 14:15 2bb
drwxr-xr-x 3 cy_yatta cy_yatta 45 Dec 26 14:15 159

#
# After a while it gets restored
#

root@yatta-swift01:/srv/node/sdb1/objects/222204# ls -ltr
total 4
drwxr-xr-x 3 cy_yatta cy_yatta  45 Dec 26 14:15 b24
drwxr-xr-x 3 cy_yatta cy_yatta  45 Dec 26 14:15 a82
drwxr-xr-x 3 cy_yatta cy_yatta  45 Dec 26 14:15 2bb
drwxr-xr-x 3 cy_yatta cy_yatta  45 Dec 26 14:15 159
drwxr-xr-x 3 cy_yatta cy_yatta  45 Dec 26 14:42 e20
-rw------- 1 cy_yatta cy_yatta 223 Dec 26 14:42 hashes.pkl
root@yatta-swift01:/srv/node/sdb1/objects/222204# ls -l e20/d8ff1b40757392c4af291b9839ae1e20/
total 208
-rw------- 1 cy_yatta cy_yatta 211481 Dec 26 14:42 1356405624.80957.data

I could confirm it from the browser too (*´ω｀*) Though I'm not entirely sure this is the right way to recover...
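
The directories touched above follow Swift's on-disk layout. As a minimal sketch, assuming a partition power of 18 (262144 partitions) and the usual objects/<partition>/<suffix>/<hash>/ layout, the path can be derived from the object hash alone. The function name `object_dir` is made up for this note.

```python
import os

def object_dir(name_hash, part_power=18, device_root="/srv/node/sdb1"):
    """Return the on-disk directory for an object hash (illustrative helper)."""
    # The partition is the top `part_power` bits of the first 4 bytes of the hash.
    partition = int(name_hash[:8], 16) >> (32 - part_power)
    # The suffix directory is the last three hex characters of the hash.
    suffix = name_hash[-3:]
    return os.path.join(device_root, "objects", str(partition), suffix, name_hash)

print(object_dir("d8ff1b40757392c4af291b9839ae1e20"))
```

For the hash used in this post, this gives partition 222204 and suffix e20, which matches the directory that was deleted and later restored by replication.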

Friday, December 14, 2012

OpenStack Swift: Looking Up Object Info Taro

How to check an object's information in Swift, including which servers hold its replicas (*´ω｀*)

# Run the swift-object-info command against the actual data file.
# "Ring locations" shows which servers hold the replicas.
root@swift-obj03:# swift-object-info /srv/node/sdb1/objects/222228/7a4/d90536ed78c14cf9335ebff89e21f7a4/1355209796.68366.data 
Path: /AUTH_yattarou/test/20121202/22/22/22/33/test8957.jpg
  Account: AUTH_yattarou
  Container: test
  Object: 20121202/22/22/22/33/test8957.jpg
  Object hash: d90536ed78c14cf9335ebff89e21f7a4
Ring locations:
  xxx.xxx.xxx.xx1:6000 - /srv/node/sdb1/objects/222228/7a4/d90536ed78c14cf9335ebff89e21f7a4/1355209796.68366.data
  xxx.xxx.xxx.xx2:6000 - /srv/node/sdb1/objects/222228/7a4/d90536ed78c14cf9335ebff89e21f7a4/1355209796.68366.data
  xxx.xxx.xxx.xx3:6000 - /srv/node/sdb1/objects/222228/7a4/d90536ed78c14cf9335ebff89e21f7a4/1355209796.68366.data
Content-Type: image/jpeg
Timestamp: 2012-12-11 16:09:56.683660 (1355209796.68366)
ETag: 3a650f6dd89d775ef02f48963e821706 (valid)
Content-Length: 211481 (valid)
User Metadata: {}

The swift-get-nodes command gives similar information, but when I go and look at the paths it prints, the files aren't there... Maybe I'm using it wrong.

Thursday, November 22, 2012

OpenStack Swift recon Taro

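Swift has a feature called recon that apparently lets you pull information such as load and memory. Maybe I should use this for monitoring. I can't read English well, so I don't fully get it...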

# Run these from the proxy server
# Check the load average

root@swift-proxy01:~# curl http://xx.xx.xx.xx:6000/recon/load
{"5m": 2.3199999999999998, "15m": 2.27, "processes": 5311, "tasks": "2/364", "1m": 2.4700000000000002}

# Check the memory information

curl http://xx.xx.xx.xx:6000/recon/mem
{"WritebackTmp": "0 kB", "SwapTotal": "1998840 kB", "Active(anon)": "337664 kB", "SwapFree": "1998840 kB", "DirectMap4k": "6756 kB", "KernelStack": "2936 kB", "MemFree": "128024 kB", "HugePages_Rsvd": "0", "Committed_AS": "4068192 kB", "Active(file)": "4061696 kB", "NFS_Unstable": "0 kB", "VmallocChunk": "34359370072 kB", "Writeback": "0 kB", "Inactive(file)": "15142944 kB", "MemTotal": "24731684 kB", "VmallocUsed": "331016 kB", "HugePages_Free": "0", "AnonPages": "368188 kB", "Active": "4399360 kB", "Inactive(anon)": "30712 kB", "CommitLimit": "14364680 kB", "Hugepagesize": "2048 kB", "Cached": "19014360 kB", "SwapCached": "0 kB", "VmallocTotal": "34359738367 kB", "Shmem": "252 kB", "Mapped": "9884 kB", "SUnreclaim": "811208 kB", "Unevictable": "64 kB", "SReclaimable": "3741636 kB", "Mlocked": "64 kB", "DirectMap2M": "25149440 kB", "HugePages_Surp": "0", "Bounce": "0 kB", "Inactive": "15173656 kB", "PageTables": "5080 kB", "HardwareCorrupted": "0 kB", "HugePages_Total": "0", "Slab": "4552844 kB", "Buffers": "190532 kB", "Dirty": "456 kB"}

# Disk information is available too

curl http://xx.xx.xx.xx:6000/recon/diskusage
[{"device": "sdb1", "avail": 198553079808, "mounted": true, "used": 57811726336, "size": 256364806144}]

# The object replication time is available as well (´；ω；｀)
curl http://xx.xx.xx.xx:6000/recon/replication
{"object_replication_time": 21.035725466410319}

# The async information is available too, but it returns null...
# I'll look into this one.

curl http://10.200.32.41:6000/recon/async
{"async_pending": null}

# You can also query it from the command line with swift-recon

swift-recon object -r --zone 1
===============================================================================
--> Starting reconnaissance on 1 hosts
===============================================================================
[2012-11-16 14:31:19] Checking on replication
[replication_time] low: 21, high: 21, avg: 21.0, total: 21, Failed: 0.0%, no_result: 0, reported: 1
===============================================================================
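
For the monitoring idea, here is a rough sketch of turning the /recon/diskusage response into a percentage that a check could compare against a threshold. It only parses the JSON body shown above. The function name `disk_usage_percent` is hypothetical, and fetching the URL is left out.

```python
import json

def disk_usage_percent(diskusage_json):
    """Return {device: percent used} from a /recon/diskusage response body."""
    result = {}
    for entry in json.loads(diskusage_json):
        if entry.get("mounted") is not True:
            # Unmounted devices have no usable size; flag them with None.
            result[entry["device"]] = None
            continue
        result[entry["device"]] = 100.0 * entry["used"] / entry["size"]
    return result

# Sample body copied from the diskusage output in this post.
sample = ('[{"device": "sdb1", "avail": 198553079808, "mounted": true, '
          '"used": 57811726336, "size": 256364806144}]')

for device, pct in disk_usage_percent(sample).items():
    print(device, "unmounted" if pct is None else "%.1f%% used" % pct)
```

A monitoring script would then alert when the value is None or exceeds whatever limit fits the cluster.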

OpenStack Swift ACL Taro

Here is how to set ACLs in Swift.

# Create the container (readable by anyone)
swift -A http://PROXY_VIP:8080/auth/v1.0 -U yattarou:yattarou -K yattarou post -r '.r:*' Container_Name
 
# Set the index page
swift -A http://PROXY_VIP:8080/auth/v1.0 -U yattarou:yattarou -K yattarou post -m 'web-index:index.html' Container_Name
 
# Enable listings
swift -A http://PROXY_VIP:8080/auth/v1.0 -U yattarou:yattarou -K yattarou post -m 'web-listings: true' Container_Name
 
# Check the ACL
swift -A http://PROXY_VIP:8080/auth/v1.0 -U yattarou:yattarou -K yattarou stat Container_Name
 
  Account: AUTH_system
Container: Container_Name
  Objects: 1
    Bytes: 12
 Read ACL: .r:*
Write ACL: 
  Sync To: 
 Sync Key: 
Meta Web-Listings: true
Meta Web-Index: index.html
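
The same settings can be thought of as plain container headers. The sketch below assumes that the CLI's -r option maps to X-Container-Read and that each -m key:value pair maps to an X-Container-Meta-* header. The helper name `container_headers` is invented for illustration.

```python
def container_headers(read_acl=None, meta=None):
    """Build the container headers corresponding to `swift post -r ... -m ...`."""
    headers = {}
    if read_acl is not None:
        headers["X-Container-Read"] = read_acl
    for key, value in (meta or {}).items():
        # "web-index" becomes "X-Container-Meta-Web-Index"
        headers["X-Container-Meta-" + key.title()] = value
    return headers

print(container_headers(".r:*", {"web-index": "index.html", "web-listings": "true"}))
```

This lines up with the stat output above, where the metadata shows up as "Meta Web-Index" and "Meta Web-Listings".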

OpenStack Swift rebalance Taro

I put together the rebalance procedure for Swift (*´ω｀*)(*´ω｀*)

Assuming one of the replica nodes has become unrecoverable,
this covers detaching that node and adding a different one.

# Check the zones
root@swift-proxy01:/etc/swift# swift-ring-builder object.builder
object.builder, build version 3
262144 partitions, 3 replicas, 3 zones, 3 devices, 0.00 balance
The minimum number of hours before a partition can be reassigned is 1
Devices:    id  zone      ip address  port      name weight partitions balance meta
0     1    xx.xxx.xx.41  6000      sdb1   1.00     262144    0.00
1     2    xx.xxx.xx.43  6000      sdb1   1.00     262144    0.00
2     3    xx.xxx.xx.44  6000      sdb1   1.00     262144    0.00

# Remove from the ring
root@swift-proxy01:/etc/swift# swift-ring-builder object.builder remove z1-xx.xxx.xx.41/sdb1
d0z1-xx.xxx.xx.41:6000/sdb1_"" marked for removal and will be removed next rebalance.

# Add to the ring
root@swift-proxy01:/etc/swift# swift-ring-builder object.builder add z1-xx.xxx.xx.45:6000/sdb1 1
Device z1-xx.xxx.xx:6000/sdb1_"" with 1.0 weight got id 3

#rebalance
root@swift-proxy01:/etc/swift# swift-ring-builder object.builder rebalance
Reassigned 262144 (100.00%) partitions. Balance is now 0.00.

# Check the zones
# You can see that xx.xxx.xx.41 is gone and 45 has been added.
root@swift-proxy01:/etc/swift# swift-ring-builder object.builder
object.builder, build version 6
262144 partitions, 3 replicas, 3 zones, 3 devices, 0.00 balance
The minimum number of hours before a partition can be reassigned is 1
Devices:    id  zone      ip address  port      name weight partitions balance meta
1     2    xx.xxx.xx.43  6000      sdb1   1.00     262144    0.00
2     3    xx.xxx.xx.44  6000      sdb1   1.00     262144    0.00
3     1    xx.xxx.xx.45  6000      sdb1   1.00     262144    0.00
root@swift-proxy01:/etc/swift#

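# At this point the data is not actually rebalanced yet, so deploy the builder files and ring files to the proxy servers, and the ring files to the container, object and account nodes.
# After that, reloading with "swift-init all reload" starts the rebalance.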
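
To double-check a ring change like this one, the two device tables can be compared mechanically. The following is only a sketch that parses the text printed by swift-ring-builder in this post. The helper `parse_devices` is hypothetical and relies on the column order shown above.

```python
def parse_devices(builder_output):
    """Map device id -> (zone, ip, port, name) from swift-ring-builder output."""
    devices = {}
    for line in builder_output.splitlines():
        fields = line.split()
        # Device rows start with numeric id and zone and have a numeric port;
        # header and summary lines do not match this shape.
        if (len(fields) >= 5 and fields[0].isdigit()
                and fields[1].isdigit() and fields[3].isdigit()):
            dev_id, zone, ip, port, name = fields[:5]
            devices[int(dev_id)] = (int(zone), ip, int(port), name)
    return devices

before = """262144 partitions, 3 replicas, 3 zones, 3 devices, 0.00 balance
Devices:    id  zone      ip address  port      name weight partitions balance meta
0     1    xx.xxx.xx.41  6000      sdb1   1.00     262144    0.00
1     2    xx.xxx.xx.43  6000      sdb1   1.00     262144    0.00
2     3    xx.xxx.xx.44  6000      sdb1   1.00     262144    0.00"""

after = """262144 partitions, 3 replicas, 3 zones, 3 devices, 0.00 balance
Devices:    id  zone      ip address  port      name weight partitions balance meta
1     2    xx.xxx.xx.43  6000      sdb1   1.00     262144    0.00
2     3    xx.xxx.xx.44  6000      sdb1   1.00     262144    0.00
3     1    xx.xxx.xx.45  6000      sdb1   1.00     262144    0.00"""

old, new = parse_devices(before), parse_devices(after)
removed = sorted(set(old) - set(new))
added = sorted(set(new) - set(old))
print("removed:", [old[i][1] for i in removed])
print("added:", [new[i][1] for i in added])
```

Device id 0 (the .41 node) drops out and id 3 (the .45 node) appears in zone 1, which is the intended swap.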

Friday, November 9, 2012

Friday, October 26, 2012

OpenStack Swift Setup Taro

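Setup procedure for OpenStack Swift. The OS is Ubuntu 10.04 and the Swift version is 1.7.0 (Folsom). The layout is proxy:2, account:3, container:3, object:3 behind a load balancer. Following the official documentation, I built it as an 11-node setup.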


# Start with the proxy servers.
# Install the required packages

apt-get install python-software-properties
add-apt-repository ppa:swift-core/release
apt-get update
apt-get install curl gcc git-core memcached python-configobj python-coverage python-dev python-nose python-setuptools python-simplejson python-xattr sqlite3 xfsprogs python-webob python-eventlet python-greenlet python-pastedeploy python-netifaces

# Create the user
/usr/sbin/groupadd -g XXXXX yatta_swift
/usr/sbin/useradd -u XXXXX -g yatta_swift yatta_swift


# Create the directory
mkdir -p /etc/swift

# Create the hash configuration file
cat >/etc/swift/swift.conf <> /etc/fstab
mkdir -p /srv/node/sdb1
mount -a
chown -R yatta_swift: /srv/*
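
The body of the swift.conf heredoc above is cut off. For reference only, here is a sketch of generating such a file, assuming the standard [swift-hash] section with a swift_hash_path_suffix key. The helper name `render_swift_conf` is made up, and the random value is just one way to pick a suffix. The suffix has to be identical on every node and must never change afterwards.

```python
import binascii
import os

def render_swift_conf(suffix=None):
    """Return the text of a minimal swift.conf (assumed layout, see above)."""
    if suffix is None:
        # 8 random bytes rendered as 16 hex characters.
        suffix = binascii.hexlify(os.urandom(8)).decode("ascii")
    return "[swift-hash]\nswift_hash_path_suffix = %s\n" % suffix

print(render_swift_conf())
```

Generate the value once and copy the resulting file to all nodes rather than running this on each node separately.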

# Create the directory for recon ahead of time
mkdir -p /var/cache/swift
chown -R yatta_swift: /var/cache/swift/

# Proxy server configuration. Do this on both servers

vim /etc/swift/proxy-server.conf
[DEFAULT]
#cert_file = /etc/swift/cert.crt
#key_file = /etc/swift/cert.key
bind_port = 8080
workers = 8
user = yatta_swift
log_facility = LOG_LOCAL1
[pipeline:main]
pipeline = healthcheck cache tempauth staticweb proxy-server

[app:proxy-server]
use = egg:swift#proxy
allow_account_management = true
account_autocreate = true

[filter:tempauth]
use = egg:swift#tempauth
## Set the proxy's internal VIP
user_system_root = testpass .admin https:XXX.XXX.XXX.XXX:8080/
reseller_prefix = /yatta_images/
user_test_tester = testing .admin
user_test2_tester2 = testing2 .admin
user_test_tester3 = testing3

[filter:healthcheck]
use = egg:swift#healthcheck

[filter:cache]
use = egg:swift#memcache
## Add the proxy servers' IPs
memcache_servers = XXX.XXX.XXX.XXX:11211,XXX.XXX.XXX.XXX:11211

# We also serve static content, so add the staticweb module
[filter:staticweb]
use = egg:swift#staticweb



# Logging configuration
vim /etc/rsyslog.d/10-swift.conf 
# Uncomment the following to have a log containing all logs together
#local1.*   /var/log/swift/all.log

# Uncomment the following to have hourly proxy logs for stats processing
#$template HourlyProxyLog,"/var/log/swift/hourly/%$YEAR%%$MONTH%%$DAY%%$HOUR%"
#local1.*;local1.!notice ?HourlyProxyLog

local1.*;local1.!notice /var/log/swift/proxy.log
local1.notice           /var/log/swift/proxy.error
local1.*                ~

vim /etc/rsyslog.conf
$PrivDropToGroup adm

mkdir -p /var/log/swift/hourly
chown -R yatta_swift /var/log/swift
chmod -R g+w /var/log/swift
service rsyslog restart

Now on to configuring container, account and object. Before that: replication between the Swift nodes is done with rsync, so set up the following on every node except the proxies.

vim /etc/rsyncd.conf 
uid = yatta_swift
gid = yatta_swift
log file = /var/log/rsyncd.log
pid file = /var/run/rsyncd.pid
address = xxx.xxx.xx.xxx 

[account]
max connections = 100
path = /srv/node/
read only = false
lock file = /var/lock/account.lock

# Edit the rsync defaults file
vim /etc/default/rsync 
RSYNC_ENABLE=true

# Restart rsync
/etc/init.d/rsync restart

# For account ※ do this on each node
vim  /etc/swift/account-server.conf 
[DEFAULT]
# IP of the container node
bind_ip = XXX.XXX.XXX.XXX
workers = 100
log_facility = LOG_LOCAL1
mount_check = false
disable_fallocate = true
recon_cache_path = /var/cache/swift
user = yatta_swift

[pipeline:main]
pipeline = recon account-server

[app:account-server]
use = egg:swift#account

[filter:recon]
use = egg:swift#recon

[account-replicator]

[account-auditor]

[account-reaper]

# For container
vim /etc/swift/container-server.conf 
[DEFAULT]
bind_ip = xxx.xxx.xxx.xxx
mount_check = false
disable_fallocate = true
workers = 100
log_facility = LOG_LOCAL1
devices = /srv/node
user = yatta_swift
recon_cache_path = /var/cache/swift

[pipeline:main]
pipeline = recon container-server

[app:container-server]
use = egg:swift#container

[container-replicator]

[filter:recon]
use = egg:swift#recon

[container-updater]

[container-auditor]

[container-sync]

# For object
vim /etc/swift/object-server.conf
[DEFAULT]
bind_ip = xxx.xxx.xxx.xxx
workers = 100
log_facility = LOG_LOCAL1
devices = /srv/node
user = yatta_swift
mount_check = false
disable_fallocate = true
recon_cache_path = /var/cache/swift

[pipeline:main]
pipeline = recon object-server

[app:object-server]
use = egg:swift#object

[filter:recon]
use = egg:swift#recon

[object-replicator]

[object-updater]

[object-auditor]

Install the swift client

Download python-swiftclient-1.2.0.tar.gz from upstream
tar zxvf python-swiftclient-1.2.0.tar.gz
cd python-swiftclient-1.2.0
python setup.py install

Try starting Swift

# Start Swift
# On the proxies
swift-init proxy start

# On the container, object and account nodes
# ※ These should really be started individually, but this will do for now.
swift-init all start

# Generate the ring files on a proxy (if there are several, do it on just one)
swift-ring-builder account.builder create 18 3 1
swift-ring-builder container.builder create 18 3 1
swift-ring-builder object.builder create 18 3 1

swift-ring-builder account.builder add z1-xxx.xxx.xxx.xxx:6002/sdb1 1
swift-ring-builder account.builder add z2-xxx.xxx.xxx.xxx:6002/sdb1 1
swift-ring-builder account.builder add z3-xxx.xxx.xxx.xxx:6002/sdb1 1
swift-ring-builder account.builder rebalance
 
swift-ring-builder container.builder add z1-xxx.xxx.xxx.xxx:6001/sdb1 1
swift-ring-builder container.builder add z2-xxx.xxx.xxx.xxx:6001/sdb1 1
swift-ring-builder container.builder add z3-xxx.xxx.xxx.xxx:6001/sdb1 1
swift-ring-builder container.builder rebalance
 
swift-ring-builder object.builder add z1-xxx.xxx.xxx.xxx:6000/sdb1 1
swift-ring-builder object.builder add z2-xxx.xxx.xxx.xxx:6000/sdb1 1
swift-ring-builder object.builder add z3-xxx.xxx.xxx.xxx:6000/sdb1 1
swift-ring-builder object.builder rebalance

# This generates the builder files and *.ring.gz; put the builder files on each proxy and the *.ring.gz files under /etc/swift on every node.
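
The three numbers passed to "create" are the partition power, the replica count and the minimum hours between moves of a partition. As a small worked example (the function name `ring_summary` is only illustrative), a partition power of 18 gives 2^18 partitions, and with three equally weighted devices and three replicas each device ends up holding one replica of every partition:

```python
def ring_summary(part_power, replicas, weights):
    """Return (partition count, expected partition replicas per device)."""
    partitions = 2 ** part_power
    total_weight = float(sum(weights))
    # Each device's share of partition replicas is proportional to its weight.
    per_device = [partitions * replicas * w / total_weight for w in weights]
    return partitions, per_device

partitions, per_device = ring_summary(18, 3, [1, 1, 1])
print(partitions, per_device)
```

That is why swift-ring-builder reports 262144 partitions and 262144 partitions per device for this layout.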

Let's play with it a little

# Check the state of each node, from a proxy

#swift-ring-builder container.builder 
container.builder, build version 6
262144 partitions, 3 replicas, 3 zones, 3 devices, 0.00 balance
The minimum number of hours before a partition can be reassigned is 1
Devices:    id  zone      ip address     port      name weight partitions balance meta
             0     1    xxx.xxx.xxx.xxx  6001      sdb1   1.00     262144    0.00 
             1     2    xxx.xxx.xxx.xxx  6001      sdb1   1.00     262144    0.00 
             3     3    xxx.xxx.xxx.xxx  6001      sdb1   1.00     262144    0.00 

# Adding, removing and rebalancing nodes
# Remove a node
swift-ring-builder object.builder remove z1-xxx.xxx.xxx.xxx/sdb1

# Add a node (you can also change the weight this way)
swift-ring-builder object.builder add z1-xxx.xxx.xxx.xxx/sdb1 1

# rebalance. By the way, once you rebalance you can't do it again for an hour with the settings above (min_part_hours is 1).
swift-ring-builder object.builder

swift-ring-builder object.builder rebalance
Reassigned 262144 (100.00%) partitions. Balance is now 726.36.
-------------------------------------------------------------------------------
NOTE: Balance of 726.36 indicates you should push this 
      ring, wait at least 1 hours, and rebalance/repush.
-------------------------------------------------------------------------------

That's it for now. I've been testing quite a few other things too, so I'll post them later.

Tuesday, June 12, 2012

Replacing a Disk with MegaCli64 Taro

This is the procedure for replacing a disk using MegaCli64.
This time it was a preventive replacement of a disk that was throwing lots of Medium Errors.

The OS is CentOS 5.4
The RAID card is an LSI MegaRAID 9280, FW: 12.12.0-0090
Using MegaCli 8.02.16.

# First, check the state of the HDDs.
/opt/MegaRAID/MegaCli/MegaCli64 -PDList -a0

Enclosure Device ID: 252
Slot Number: 0
Drive's postion: DiskGroup: 0, Span: 0, Arm: 0
Enclosure position: 0
Device Id: 16
WWN:
Sequence Number: 2
Media Error Count: 67     ← errors are being reported
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SATA
Raw Size: 1.819 TB [0xe8e088b0 Sectors]
Non Coerced Size: 1.818 TB [0xe8d088b0 Sectors]
Coerced Size: 1.818 TB [0xe8d00000 Sectors]
Firmware state: Online, Spun Up
Is Commissioned Spare : NO
Device Firmware Level: A5C0
Shield Counter: 0
Successful diagnostics completion on :  N/A
SAS Address(0): 0x4433221100000000
Connected Port Number: 0(path0)
Inquiry Data:       MN1220F32HYULDHitachi HDS723020BLA642                 MN6OA5C0

... (omitted)

# Identify the target disk (LED on)
/opt/MegaRAID/MegaCli/MegaCli64 -Pdlocate start physdrv[252:0] -a0

# Identify the target disk (LED off)
/opt/MegaRAID/MegaCli/MegaCli64 -Pdlocate stop physdrv[252:0] -a0


# Take the target disk offline
/opt/MegaRAID/MegaCli/MegaCli64 -PDOffline -PhysDrv[252:0] -a0

Adapter: 0: EnclId-252 SlotId-0 state changed to OffLine.

Exit Code: 0x00


# Check the status
/opt/MegaRAID/MegaCli/MegaCli64 -PDList -a0 | less

Adapter #0

Enclosure Device ID: 252
Slot Number: 0
Drive's postion: DiskGroup: 0, Span: 0, Arm: 0
Enclosure position: 0
Device Id: 16
WWN:
Sequence Number: 3
Media Error Count: 67
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SATA
Raw Size: 1.819 TB [0xe8e088b0 Sectors]
Non Coerced Size: 1.818 TB [0xe8d088b0 Sectors]
Coerced Size: 1.818 TB [0xe8d00000 Sectors]
Firmware state: Offline  ← now offline.

... (omitted)

# Mark the target disk as missing
/opt/MegaRAID/MegaCli/MegaCli64 -PDMarkMissing -PhysDrv[252:0] -a0

EnclId-252 SlotId-0 is marked Missing.

Exit Code: 0x00

# Check which disks are marked missing
/opt/MegaRAID/MegaCli/MegaCli64 -PDGetMissing -aALL

    Adapter 0 - Missing Physical drives

    No.   Array   Row   Size Expected
    0     0       0     1907200 MB

# Prepare for removal
/opt/MegaRAID/MegaCli/MegaCli64 -PDPrpRmv -PhysDrv[252:0] -a0


Prepare for removal Success

Exit Code: 0x00

# Swap the HDD here.


# Change the missing state (assign the new drive to the missing slot)
/opt/MegaRAID/MegaCli/MegaCli64 -PDReplaceMissing -PhysDrv[252:0] -Array0 -row0 -a0

# Start the rebuild
/opt/MegaRAID/MegaCli/MegaCli64 -PDRbld -Start -PhysDrv[252:0] -a0

# Check the state
/opt/MegaRAID/MegaCli/MegaCli64 -PDList -a0 | less
Adapter #0

Enclosure Device ID: 252
Slot Number: 0
Drive's postion: DiskGroup: 0, Span: 0, Arm: 0
Enclosure position: 0
Device Id: 16
WWN:
Sequence Number: 12
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SATA
Raw Size: 1.819 TB [0xe8e088b0 Sectors]
Non Coerced Size: 1.818 TB [0xe8d088b0 Sectors]
Coerced Size: 1.818 TB [0xe8d00000 Sectors]
Firmware state: Rebuild   ← the status is now Rebuild

... (omitted)

# Check the rebuild progress

/opt/MegaRAID/MegaCli/MegaCli64 -PDRbld -showprog -physdrv[252:0] -a0

Rebuild Progress on Device at Enclosure 252, Slot 0 Completed 4% in 11 Minutes.

Exit Code: 0x00

That's how it goes. Note, however, that PDMarkMissing is not supported in some cases.
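
Spotting candidates for this kind of preventive replacement can be scripted. The sketch below scans -PDList style output for drives whose Media Error Count is non-zero. The function name `drives_with_media_errors` is hypothetical, and the second drive in the sample text is invented purely for the example.

```python
def drives_with_media_errors(pdlist_output):
    """Return [(enclosure, slot, media_error_count)] for drives reporting errors."""
    flagged = []
    enclosure = slot = None
    for line in pdlist_output.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Enclosure Device ID":
            enclosure = value
        elif key == "Slot Number":
            slot = value
        elif key == "Media Error Count":
            parts = value.split()
            count = int(parts[0]) if parts else 0
            if count > 0:
                flagged.append((enclosure, slot, count))
    return flagged

# First drive mirrors the output in this post; the second one is made up.
sample = """Enclosure Device ID: 252
Slot Number: 0
Media Error Count: 67
Other Error Count: 0
Enclosure Device ID: 252
Slot Number: 1
Media Error Count: 0
Other Error Count: 0
"""

for enclosure, slot, count in drives_with_media_errors(sample):
    print("physdrv[%s:%s] has %d media errors" % (enclosure, slot, count))
```

The enclosure and slot pair is the same [252:0] form that the -PhysDrv option takes in the commands above.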

Thursday, April 12, 2012

I finally got as far as building it, so here are some quick notes.

SPDY is apparently a network protocol that Google is developing,
and the pitch is that it's faster than HTTP!!

Apparently Gmail, Google Calendar, Twitter and others have implemented it, Taro.

From what I read, the goal is a speedup of nearly 50% over regular HTTP.
My prediction is that it will definitely become mainstream, Taro.

However, only Firefox 11 and (the latest?) Chrome support it,
so it may be a bit hard to use for something like image delivery on a large-scale service...

Anyway, an Apache module has been published, so I tried building it.
The environment is Ubuntu 11.04


# Preparation first
apt-get install subversion curl g++ apache2 patch binutils julius make

# In any directory you like

mkdir mod_spdy
mkdir temp
cd temp

# Fetch depot_tools
svn co http://src.chromium.org/svn/trunk/tools/depot_tools

# Add it to PATH
export PATH="$PATH":`pwd`/depot_tools
echo $PATH
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/home/yattarou/temp/depot_tools

# Fetch the SPDY code. Note that I'm running this as root
gclient config "http://mod-spdy.googlecode.com/svn/trunk/src"
Running depot tools as root is sad.
# It failed lol

# Take a look at the gclient source
# It looked like base_dir wasn't resolving properly, so brute force it.
# I hard-coded the path lol
 vim /home/yattarou/temp/depot_tools/gclient

#!/usr/bin/env bash
# Copyright (c) 2009 The Chromium Authors. All rights reserved.
# Use of this source code is governed by a BSD-style license that can be
# found in the LICENSE file.

base_dir=$(dirname "$0")

"$base_dir"/update_depot_tools

PYTHONDONTWRITEBYTECODE=1 exec python "$base_dir/gclient.py" "$@"

# Try again
/home/yattarou/temp/depot_tools/gclient config "http://mod-spdy.googlecode.com/svn/trunk/src"
Running depot tools as root is sad.

# It failed
# Look at the update_depot_tools code
# I commented out the root check.
vim /home/yattarou/temp/depot_tools/update_depot_tools

#!/usr/bin/env bash
# Copyright (c) 2011 The Chromium Authors. All rights reserved.
# Use of this source code is governed by a BSD-style license that can be
# found in the LICENSE file.

# This script will try to sync the bootstrap directories and then defer control.

if [ "$USER" == "root" ];
then
  echo Running depot tools as root is sad.
  exit
fi

# Try once more
/home/yattarou/temp/depot_tools/gclient config "http://mod-spdy.googlecode.com/svn/trunk/src"

# Looks like it worked, so
/home/udagawa/temp/depot_tools/gclient sync --force
1>________ running 'svn checkout http://mod-spdy.googlecode.com/svn/trunk/src /home/udagawa/mod_spdy/src --non-interactive --force --ignore-externals' in '/home/yattarou/mod_spdy'
...
It downloads a whole lot.

26>Checked out revision 933.
Syncing projects:  96% (26/27), done.ols/gyp

________ running '/usr/bin/python src/build/gyp_chromium' in '/home/yattarou/mod_spdy'
Updating projects from gyp files...

# Once it's downloaded


cd src/
./build_modssl_with_npn.sh

# Copy the .so
cp -rp mod_ssl.so /usr/lib/apache2/modules/
a2enmod ssl

# Create a self-signed certificate
apt-get install ssl-cert

make-ssl-cert /usr/share/ssl-cert/ssleay.cnf /tmp/selfsigned.crt
mkdir /etc/apache2/ssl
mv /tmp/selfsigned.crt /etc/apache2/ssl/

# Edit the certificate settings
vim /etc/apache2/sites-available/default-ssl

        SSLCertificateFile    /etc/apache2/ssl/selfsigned.crt
# Comment this out
        #SSLCertificateKeyFile /etc/ssl/private/ssl-cert-snakeoil.key

a2ensite default-ssl

# Restart Apache
/etc/init.d/apache2 restart

# Build SPDY
make BUILDTYPE=Release
...
# Lots of output

cp out/Release/libmod_spdy.so /usr/lib/apache2/modules/mod_spdy.so
echo "LoadModule spdy_module /usr/lib/apache2/modules/mod_spdy.so" | tee /etc/apache2/mods-available/spdy.load
echo "SpdyEnabled on" | tee /etc/apache2/mods-available/spdy.conf

a2enmod spdy
Enabling module spdy.
Run '/etc/init.d/apache2 restart' to activate new configuration!

# Restart Apache
/etc/init.d/apache2 restart


# That's it
# You can inspect all sorts of things at the URL below
chrome://net-internals/#spdy

# Also, a benchmarking tool for SPDY
# Note: when checking on Windows, it won't start unless you launch chrome.exe --enable-benchmarking
http://dev.chromium.org/developers/design-documents/extensions/how-the-extension-system-works/chrome-benchmarking-extension

From a quick check, it does feel reasonably fast.
I'll post proper test results later.

Tuesday, April 10, 2012

GlusterFS (Distribute) Setup Taro

Let's build GlusterFS.

The environment is a mix of Ubuntu 11.04 and CentOS 5.4. I just wanted to try building it,
so the machines are not uniform.

Beforehand I installed:
・Python 2.4 or later
・python-crypt
・flex
・bison

I forgot to capture the command output, so I'll add it later.


# Start with FUSE.
tar zxvf fuse-2.8.7.tar.gz
cd fuse-2.8.7
./configure
make && make install

# Load FUSE
modprobe fuse
dmesg | grep -i fuse
#fuse init (API version 7.13)

# Next, gluster.
tar zxvf glusterfs-3.2.6.tar.gz
cd glusterfs-3.2.6
./configure
make && make install

# After installation, directories and files are laid out under /etc/glusterd.
# ls -lHR
.:
total 4
drwxr-xr-x 2 root root 24 Apr 12 03:25 geo-replication
-rw-r--r-- 1 root root 42 Apr 12 03:25 glusterd.info
drwxr-xr-x 2 root root  6 Apr 12 03:25 nfs
drwxr-xr-x 2 root root  6 Apr 12 03:25 peers
drwxr-xr-x 2 root root  6 Apr 12 03:25 vols

./geo-replication:
total 4
-rwxr-xr-x 1 root root 1137 Apr 12 03:25 gsyncd.conf

./nfs:
total 0

./peers:
total 0

./vols:
total 0

# Start GlusterFS (on each server)
/etc/init.d/glusterd start

# Create a directory
mkdir /gluster_data

# Create peers (apparently this sets up trust between the nodes)

# First check the peer status
/usr/local/sbin/gluster peer status
No peers present

# Probe the peer
/usr/local/sbin/gluster peer probe xxx.xxx.xxx.xxx
Probe successful

# Check the peer status
/usr/local/sbin/gluster peer status
Number of Peers: 1

Hostname: xxx.xxx.xxx.xxx
Uuid: ad0bb29e-f91b-4053-8ab0-9a90adbd095e
State: Peer in Cluster (Connected)

# Create the volume (for now I went with the mode that uses the space of all nodes. There also seem to be stripe and replica modes. I'll post how to do those later)

/usr/local/sbin/gluster volume create gluster-vol xxx.xxx.xxx.xxx:/gluster_data xxx.xxx.xxx.xxx:/gluster_data
Creation of volume gluster-vol has been successful. Please start the volume to access data.

# Check the volume
/usr/local/sbin/gluster volume info 

Volume Name: gluster-vol
Type: Distribute
Status: Created
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: xxx.xxx.xxx.xxx:/gluster_data
Brick2: xxx.xxx.xxx.xxx:/gluster_data

# Start the volume
/usr/local/sbin/gluster volume start gluster-vol
Starting volume gluster-vol has been successful

# Check
/usr/local/sbin/gluster volume info 

Volume Name: gluster-vol
Type: Distribute
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: xxx.xxx.xxx.xxx:/gluster_data
Brick2: xxx.xxx.xxx.xxx:/gluster_data
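
To give a feel for what "Type: Distribute" means, here is a toy sketch in which each file lives on exactly one brick, chosen by hashing its name. This is not GlusterFS's actual algorithm: CRC32 is swapped in as a stand-in hash, the real implementation assigns hash ranges to bricks per directory, and the brick names below are placeholders.

```python
import zlib

def pick_brick(filename, bricks):
    """Pick one brick for a file name (toy model of a distribute volume)."""
    # Stand-in hash: CRC32, NOT the hash GlusterFS actually uses.
    index = zlib.crc32(filename.encode("utf-8")) % len(bricks)
    return bricks[index]

# Placeholder brick names standing in for the two bricks listed above.
bricks = ["server1:/gluster_data", "server2:/gluster_data"]

for name in ["a.jpg", "b.jpg", "c.jpg", "d.jpg"]:
    print(name, "->", pick_brick(name, bricks))
```

The practical consequence is that the volume's capacity is the sum of the bricks, while losing one brick loses the files that were placed on it.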

# Mount from each client
mount -t glusterfs -o log-level=WARNING,log-file=/var/log/gluster.log xxx.xxx.xxx.xxx:gluster-vol /mnt/glusterfs

# With that, it's usable for now.
 df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda5             1.9T  3.4G  1.9T   1% /
none                  5.9G  204K  5.9G   1% /dev
none                  5.9G     0  5.9G   0% /dev/shm
none                  5.9G   48K  5.9G   1% /var/run
none                  5.9G     0  5.9G   0% /var/lock
none                  5.9G     0  5.9G   0% /lib/init/rw
/dev/sdb1              51T   11G   51T   1% /gluster_data
/dev/sda1             472M   27M  422M   6% /boot
xxx.xxx.xxx.xxx:gluster-vol
                       52T   17G   52T   1% /mnt/glusterfs

# These files have been created
/etc/glusterd# ls -lhR
.:
total 4.0K
drwxr-xr-x 2 root root 24 Apr 12 03:25 geo-replication
-rw-r--r-- 1 root root 42 Apr 12 03:25 glusterd.info
drwxr-xr-x 3 root root 37 Apr 12 03:42 nfs
drwxr-xr-x 2 root root 49 Apr 12 03:34 peers
drwxr-xr-x 3 root root 24 Apr 12 03:40 vols

./geo-replication:
total 4.0K
-rwxr-xr-x 1 root root 1.2K Apr 12 03:25 gsyncd.conf

./nfs:
total 4.0K
-rw-r--r-- 1 root root 1.3K Apr 12 03:42 nfs-server.vol
drwxr-xr-x 2 root root   20 Apr 12 03:42 run

./nfs/run:
total 64K
-rw-r--r-- 1 root root 6 Apr 12 03:42 nfs.pid

./peers:
total 4.0K
-rw-r--r-- 1 root root 74 Apr 12 03:34 ad0bb29e-f91b-4053-8ab0-9a90adbd095e

./vols:
total 4.0K
drwxr-xr-x 4 root root 4.0K Apr 12 03:42 gluster-vol

./vols/gluster-vol:
total 24K
drwxr-xr-x 2 root root   66 Apr 12 03:42 bricks
-rw-r--r-- 1 root root   16 Apr 12 03:42 cksum
-rw-r--r-- 1 root root 1.2K Apr 12 03:40 gluster-vol-fuse.vol
-rw-r--r-- 1 root root  995 Apr 12 03:40 gluster-vol.xxx.xxx.xxx.xxx.gluster_data.vol
-rw-r--r-- 1 root root  995 Apr 12 03:40 gluster-vol.xxx.xxx.xxx.xxx.gluster_data.vol
-rw-r--r-- 1 root root  174 Apr 12 03:42 info
-rw-r--r-- 1 root root   12 Apr 12 03:42 rbstate
drwxr-xr-x 2 root root   39 Apr 12 03:42 run

./vols/gluster-vol/bricks:
total 8.0K
-rw-r--r-- 1 root root 75 Apr 12 03:42 xxx.xxx.xxx.xxx:-gluster_data
-rw-r--r-- 1 root root 71 Apr 12 03:42 xxx.xxx.xxx.xxx:-gluster_data

./vols/gluster-vol/run:
total 64K
-rw-r--r-- 1 root root 6 Apr 12 03:42 xxx.xxx.xxx.xxx-gluster_data.pid

# Contents of /etc/glusterd/glusterd.info
cat glusterd.info 
UUID=e0548668-aec6-49c0-96e7-46761d97015d

# Contents of /etc/glusterd/nfs/nfs-server.vol
/etc/glusterd/nfs# cat nfs-server.vol

# Info for each client
volume gluster-vol-client-0
    type protocol/client
    option remote-host xxx.xxx.xxx.xxx
    option remote-subvolume /gluster_data
    option transport-type tcp
end-volume

volume gluster-vol-client-1
    type protocol/client
    option remote-host xxx.xxx.xxx.xxx
    option remote-subvolume /gluster_data
    option transport-type tcp
end-volume

# Volume info?
volume gluster-vol-dht
    type cluster/distribute
    subvolumes gluster-vol-client-0 gluster-vol-client-1
end-volume

# Write parameters. Apparently it does delayed writes in the background.
volume gluster-vol-write-behind
    type performance/write-behind
    subvolumes gluster-vol-dht
end-volume

# Read parameters. read-ahead, so prefetching I guess.
volume gluster-vol-read-ahead
    type performance/read-ahead
    subvolumes gluster-vol-write-behind
end-volume

# Cache parameters. Seems to be parameters for the connection between client and server.
volume gluster-vol-io-cache
    type performance/io-cache
    subvolumes gluster-vol-read-ahead
end-volume

# Haven't looked into anything below here yet
volume gluster-vol-quick-read
    type performance/quick-read
    subvolumes gluster-vol-io-cache
end-volume

volume gluster-vol
    type debug/io-stats
    option latency-measurement off
    option count-fop-hits off
    subvolumes gluster-vol-quick-read
end-volume

volume nfs-server
    type nfs/server
    option nfs.dynamic-volumes on
    option rpc-auth.addr.gluster-vol.allow *
    option nfs3.gluster-vol.volume-id dc957ba2-188a-48bf-8c80-e6899ed999c7
    subvolumes gluster-vol
end-volume

# The parameters above are covered fairly well at http://www.gluster.org/community/documentation/index.php/Translators/performance so I'll summarize them later.

There seems to be a lot to tune, so I need to dig in.

LSI9285 - Potential non-optimal configuration Taro

After updating the MegaRAID 9285 FW from 21.0.1-0111 to 23.1.1-0004, the storage started beeping.

I looked into it.


# Dump the log
/opt/MegaRAID/MegaCli/MegaCli64 -AdpEventLog -GetEvents -f /home/yattarou/getlog.log -a0

# Check it (excerpt)
# The message below was logged for each physical disk (8 in total).
# As far as I can tell, it's saying the drive will be used as an emergency spare.
===========
Device ID: 14
Enclosure Index: 252
Slot Number: 5


seqNum: 0x0000070d
Time: Wed Mar 28 14:11:21 2012

Code: 0x00000196
Class: 1
Locale: 0x02
Event Description: Reminder: Potential non-optimal configuration due to drive PD 0f(e0xfc/s4) commissioned as emergency spare
Event Data:
===========

As far as I can tell, disk reads and writes and the RAID state are all fine.
But it looked like things might behave strangely when a disk dies, so I contacted LSI.

The answer: it was apparently a firmware bug, and applying the latest version from the link below fixed it.
They fixed it in under two weeks. Thank you.

http://www.lsi.com/products/storagecomponents/Pages/MegaRAIDSAS9285-8e.aspx

The version is 23.4.1-0028.

Monday, March 12, 2012

Megaraid 9285 Firmware update Taro

I tried to update the MegaRAID 9285 firmware and stumbled a bit Orz



# Try to update the MegaRAID 9285 FW from 21.0.1-0111 to 23.1.1-0004
/opt/MegaRAID/MegaCli/MegaCli64 -AdpFwFlash -f mr2208fw.rom -a0

Adapter 0: LSI MegaRAID SAS 9285-8e Vendor ID: 0x1000, Device ID: 0x005B ERROR: 
The image file is invalid and could not be flashed to the controller!!!

# It errors out and won't flash (´･ω･`)
# The cause seems to be that the MegaCli version was old (Ver 8.00.40).
# It worked after I got the latest version from the site below
http://www.lsi.com/Pages/user/eula.aspx?file=http%3a%2f%2fwww.lsi.com%2fdownloads%2fPublic%2fMegaRAID%2520Common%2520Files%2f8.02.16_MegaCLI.zip&Source=http%3a%2f%2fwww.lsi.com%2fdownloads

Wednesday, March 7, 2012

Megaraid 9280 Firmware update Taro

I updated the firmware on a MegaRAID 9280-8e.
The 9285 looks to be the same procedure.


# Check the firmware version
/opt/MegaRAID/MegaCli/MegaCli64  -AdpAllInfo -a0

Adapter #0

==============================================================================
                    Versions
                ================
Product Name    : LSI MegaRAID SAS 9280-8e
Serial No       : XXXXXXXX
FW Package Build: 12.0.1-0081

# Get the FW for the RAID card from the LSI website and unzip it.
unzip 12.12.0-0090_SAS_2108_FW_Image_APP-2.120.243-1482.zip

# Apply
MegaCli -adpfwflash -f mr2108fw.rom -a0

# Reboot
/sbin/init 6

# Verify
/opt/MegaRAID/MegaCli/MegaCli64  -AdpAllInfo -a0
Adapter #0

==============================================================================
                    Versions
                ================
Product Name    : LSI MegaRAID SAS 9280-8e
Serial No       : XXXXXXXXX
FW Package Build: 12.12.0-0090

It's done in just a few minutes

Tuesday, March 6, 2012

MegaCli Log Output Taro

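I'm using the MegaRAID 9285 (9280), but by default the RAID controller's log isn't written out anywhere.
That's a bit of a problem when something fails.

So here's the command to dump the log.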


# Dump the RAID card's event log
# -f sets the output file
/opt/MegaRAID/MegaCli/MegaCli64 -AdpEventLog -GetEvents -f /home/yattarou/geteventlog -aALL

By the way, dumping the log produces a fairly large file (around 20 MB, I think),
so watch out for that.

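Also, this is just my personal guess, but the log seems to be stored in the RAID card's RAM,
so I suspect that if a lot accumulates, old entries get dropped FIFO-style.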

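And if there's a failure where the controller itself becomes invisible, you can't dump the log at all,
so you probably need to take that into account too (*´ω｀*)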

Thursday, January 12, 2012

Turning On the Disk Service LED Taro

How to turn on the disk service LED with the controller
that comes with the DELL R310 (PERC H200 Adapter)

Install Dell's omreport and omconfig beforehand.


# Check the disk info first
# vdisk
/opt/dell/srvadmin/bin/omreport storage vdisk controller=0
Virtual Disk 0 on Controller PERC H200 Adapter (Slot 2)

Controller PERC H200 Adapter (Slot 2)
ID                        : 0
Status                    : Ok
Name                      : Virtual Disk 0
State                     : Ready
Hot Spare Policy violated : Not Assigned
Virtual Disk Bad Blocks   : Not Applicable
Secured                   : Not Applicable
Progress                  : Not Applicable
Layout                    : RAID-1
Size                      : 232.25 GB (249376538624 bytes)
Device Name               : /dev/sda
Bus Protocol              : SATA
Media                     : HDD
Read Policy               : Not Applicable
Write Policy              : Not Applicable
Cache Policy              : Not Applicable
Stripe Element Size       : 64 KB
Disk Cache Policy         : Disabled

# Check the pdisks
/opt/dell/srvadmin/bin/omreport storage pdisk controller=0
List of Physical Disks on Controller PERC H200 Adapter (Slot 2)

Controller PERC H200 Adapter (Slot 2)
ID                        : 0:0:0
Status                    : Ok
Power Status              : Not Applicable
Name                      : Physical Disk 0:0:0
State                     : Online
Failure Predicted         : No
Certified                 : Yes
Encryption Capable        : No
Secured                   : Not Applicable
Progress                  : Not Applicable
Bus Protocol              : SATA
Media                     : HDD
Mirror Set ID             : Not Applicable
Capacity                  : 232.25 GB (249376538624 bytes)
Used RAID Disk Space      : 232.25 GB (249376538624 bytes)
Available RAID Disk Space : 0.00 GB (0 bytes)
Hot Spare                 : No
Vendor ID                 : DELL
Product ID                : SAMSUNG HE253GJ                         
Revision                  : 1AJ30001
Serial No.                : S2B5J90B222087222087
Part Number               : Not Available
Negotiated Speed          : 3.00 Gbps
Capable Speed             : Not Available
Manufacture Day           : Not Available
Manufacture Week          : Not Available
Manufacture Year          : Not Available
SAS Address               : 4433221107000000

ID                        : 0:0:1
Status                    : Ok
Power Status              : Not Applicable
Name                      : Physical Disk 0:0:1
State                     : Online
Failure Predicted         : No
Certified                 : Yes
Encryption Capable        : No
Secured                   : Not Applicable
Progress                  : Not Applicable
Bus Protocol              : SATA
Media                     : HDD
Mirror Set ID             : Not Applicable
Capacity                  : 232.25 GB (249376538624 bytes)
Used RAID Disk Space      : 232.25 GB (249376538624 bytes)
Available RAID Disk Space : 0.00 GB (0 bytes)
Hot Spare                 : No
Vendor ID                 : DELL
Product ID                : SAMSUNG HE253GJ                         
Revision                  : 1AJ30001
Serial No.                : S2B5J90B222091222091
Part Number               : Not Available
Negotiated Speed          : 3.00 Gbps
Capable Speed             : Not Available
Manufacture Day           : Not Available
Manufacture Week          : Not Available
Manufacture Year          : Not Available
SAS Address               : 4433221106000000

# Turn the LED on (for pdisk, use the ID obtained from omreport pdisk)
/opt/dell/srvadmin/bin/omconfig storage pdisk action=blink controller=0 pdisk=0:0:0

# Turn the LED off
/opt/dell/srvadmin/bin/omconfig storage pdisk action=unblink controller=0 pdisk=0:0:0
