
Hadoop Configuration Notes (repost)

2011-03-09 11:20:00 | Category: Distributed Systems


After reading about Hadoop for so long, this was my first attempt at actually configuring it. I expected it to go smoothly, but it ended up taking a whole day; in the end I got it working with help from friends in a chat group and a lot of Baidu searching.

What follows is a running log of the steps.

10.1.121.36 is the master.

10.1.121.203, 204, 205, 206, 207, 208, and 218 are the slaves.

Required software packages: jdk-6u20-linux-x64.bin, hadoop-0.20.2.tar.gz, rsync-3.0.7.tar.gz, and hive-0.4.1-bin.tar.gz (each used in the steps below).

 

Create the hadoop user and group

groupadd hadoop

tail /etc/group   # check the ID of the newly added group

useradd hadoop -g 502   # 502 is the group ID found above

passwd hadoop   # the password used was ku6.ABC123

id hadoop

cd /home/hadoop/

mkdir software

cd software

 

Run on 36 (the master):

Copy the installation packages to the /home/hadoop/software directory on each node:

scp /home/hadoop/software/* root@10.1.121.201:/home/hadoop/software

scp ./jdk-6u20-linux-x64.bin root@10.1.121.205:/home/hadoop/software

scp ./jdk-6u20-linux-x64.bin root@10.1.121.207:/home/hadoop/software

scp ./jdk-6u20-linux-x64.bin root@10.1.121.208:/home/hadoop/software

Install the JDK

cd /home/hadoop/software

./jdk-6u20-linux-x64.bin

mv jdk1.6.0_20 /usr/local/jdk1.6.0_20 

ln -s /usr/local/jdk1.6.0_20/bin/java /usr/bin/java

Then run java -version to confirm the Java version.

 

Install Hadoop

scp ./hadoop-0.20.2.tar.gz root@10.1.121.205:/home/hadoop/software

scp ./hadoop-0.20.2.tar.gz root@10.1.121.207:/home/hadoop/software

scp ./hadoop-0.20.2.tar.gz root@10.1.121.208:/home/hadoop/software

 

tar -xzvf hadoop-0.20.2.tar.gz 

chown -R hadoop:hadoop hadoop-0.20.2 

mv hadoop-0.20.2 /usr/local/hadoop
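The original post does not show it, but Hadoop normally also needs JAVA_HOME set in conf/hadoop-env.sh so the daemons can find the JDK; a minimal sketch, assuming the JDK path installed above:

vi /usr/local/hadoop/conf/hadoop-env.sh
# add or uncomment this line:
export JAVA_HOME=/usr/local/jdk1.6.0_20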

 

Install rsync

 tar -xzvf rsync-3.0.7.tar.gz  

cd rsync-3.0.7

./configure

make && make install

cd ..

Configure passwordless SSH login

su hadoop

cd ~

mkdir -p ./.ssh/

chmod 700 .ssh

chmod 700 ~

ssh-keygen -t rsa

mv ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys

scp ~/.ssh/* 10.1.121.205:~/.ssh/

scp ~/.ssh/* 10.1.121.207:~/.ssh/

scp ~/.ssh/* 10.1.121.208:~/.ssh/
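To confirm the keys work in the right direction (the issue mentioned in the summary below), a quick check from the master as the hadoop user:

ssh 10.1.121.205 hostname   # should print the slave's hostname without asking for a password
# if it still prompts, permissions are a common cause; authorized_keys usually needs mode 600
chmod 600 ~/.ssh/authorized_keys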

 

mkdir -p /home/hadoop/hadoopData

chown hadoop:hadoop /home/hadoop/hadoopData

ln -s /usr/local/hadoop/bin/hadoop /usr/bin/hadoop

 

 

Set up the configuration files
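The post does not show the edits themselves, so the following is only a sketch of what the three site files typically contain for a layout like this one. The hostnames hadoopMaster/hadoopJobTacker and the /home/hadoop/hadoopData directory are taken from elsewhere in this post; the port numbers and exact values on the original cluster may have differed. Per the summary below, use hostnames here rather than raw IPs.

conf/core-site.xml:
<property>
  <name>fs.default.name</name>
  <value>hdfs://hadoopMaster:9000</value>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/hadoop/hadoopData</value>
</property>

conf/mapred-site.xml:
<property>
  <name>mapred.job.tracker</name>
  <value>hadoopJobTacker:9001</value>
</property>

conf/hdfs-site.xml:
<property>
  <name>dfs.replication</name>
  <value>3</value>
</property>

conf/masters should list hadoopMaster, and conf/slaves should list hadoopSlave201 through hadoopSlave218, one hostname per line.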

Sync the conf directory to every node:

scp /usr/local/hadoop/conf/* 10.1.121.201:/usr/local/hadoop/conf/

scp /usr/local/hadoop/conf/* 10.1.121.203:/usr/local/hadoop/conf/

scp /usr/local/hadoop/conf/* 10.1.121.204:/usr/local/hadoop/conf/

scp /usr/local/hadoop/conf/* 10.1.121.205:/usr/local/hadoop/conf/

scp /usr/local/hadoop/conf/* 10.1.121.206:/usr/local/hadoop/conf/

scp /usr/local/hadoop/conf/* 10.1.121.207:/usr/local/hadoop/conf/

scp /usr/local/hadoop/conf/* 10.1.121.208:/usr/local/hadoop/conf/

scp /usr/local/hadoop/conf/* 10.1.121.218:/usr/local/hadoop/conf/
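Since rsync was built earlier, the same sync could also be done incrementally with it instead of scp; a sketch for one node (repeat per slave):

rsync -av /usr/local/hadoop/conf/ 10.1.121.203:/usr/local/hadoop/conf/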

 

Set up HOSTS

Set the hostname on each machine, for example on 203:

hostname hadoopSlave203

Then add the following entries to /etc/hosts on every node:

10.1.121.36 hadoopMaster

10.1.121.36 hadoopJobTacker

10.1.121.201 hadoopSlave201 

10.1.121.203 hadoopSlave203 

10.1.121.204 hadoopSlave204 

10.1.121.205 hadoopSlave205 

10.1.121.206 hadoopSlave206 

10.1.121.207 hadoopSlave207 

10.1.121.208 hadoopSlave208 

10.1.121.218 hadoopSlave218 
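Note that the hostname command above only lasts until reboot. To make it persistent, assuming a RHEL/CentOS-style system (the post does not say which distribution was used), edit /etc/sysconfig/network on each node:

vi /etc/sysconfig/network
# set HOSTNAME to the name used above, e.g. on 203:
HOSTNAME=hadoopSlave203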

 

Format and start the system

su hadoop

cd /usr/local/hadoop/bin

./hadoop namenode -format

./start-dfs.sh 

./start-mapred.sh 
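As the summary notes, jps and the HTTP interfaces are the quickest health checks. On the master, jps should show NameNode, SecondaryNameNode and JobTracker; on each slave it should show DataNode and TaskTracker (daemon names as in Hadoop 0.20):

/usr/local/jdk1.6.0_20/bin/jps   # jps ships with the JDK installed above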

Test

./hadoop fs -mkdir myn

./hadoop fs -put ../ivy.xml /ivy2.xml

./hadoop fs -ls  /

./hadoop fs -lsr /

./hadoop fs -cat /ivy2.xml

./hadoop fs -put /data/hdfs_collect/pvdata/20100607/yannian.log /pgv/dayRawLog/

./hadoop fs -tail /pgv/dayRawLog/yannian.log

 

# upload the same sample file 22 times, as 001.log through 022.log
for i in $(seq -w 1 22); do
  ./hadoop fs -put /data/hdfs_collect/pvdata/20100607/20100607.hash0 /pgv/dayRawLog/0${i}.log
done
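A quick way to confirm the uploads landed, still from the bin directory:

./hadoop fs -ls /pgv/dayRawLog/
./hadoop fs -dus /pgv/dayRawLog/   # total size of the directory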

Summary

1. I set up the SSH keys in the wrong direction at first, which wasted a lot of time.

2. In the HOSTS setup, the hostname after each IP has to be written twice, which is odd (see the /etc/hosts listing below).

3. Nodes must be addressed through hosts entries or a name server; IP addresses cannot be set directly in the configuration files.

4. After every re-format of the namenode, be sure to delete all files under the hadoop_tmp directory first.

5. The log files contain very helpful information.

6. Use jps frequently to check daemon status, along with the HTTP interfaces (the two URLs below).

7. Set the hostname on every machine.

http://59.151.121.36:50070 (NameNode web UI)

http://59.151.121.36:50030 (JobTracker web UI)

 

 

The /etc/hosts entries that ended up working (with the hostname written twice after each IP, per point 2 above):

10.1.121.36 hadoopMaster hadoopMaster

10.1.121.36 hadoopJobTacker hadoopJobTacker

10.1.121.201 hadoopSlave201 hadoopSlave201

10.1.121.203 hadoopSlave203 hadoopSlave203

10.1.121.204 hadoopSlave204 hadoopSlave204

10.1.121.205 hadoopSlave205 hadoopSlave205

10.1.121.206 hadoopSlave206 hadoopSlave206

10.1.121.207 hadoopSlave207 hadoopSlave207

10.1.121.208 hadoopSlave208 hadoopSlave208

10.1.121.218 hadoopSlave218 hadoopSlave218

 


 

 

 

Hive configuration notes

tar -xzvf hive-0.4.1-bin.tar.gz 

mv hive-0.4.1-bin /usr/local/hive

chown hadoop:hadoop /usr/local/hive/ -R

su hadoop

mkdir -p /home/hadoop/hiveData

cd /usr/local/hive/bin/

 

vi hive-config.sh 

export HIVE_HOME=/usr/local/hive

export HADOOP_HOME=/usr/local/hadoop

export JAVA_HOME=/usr/local/jdk1.6.0_20

cd /usr/local/hadoop

bin/hadoop fs -mkdir       /tmp

bin/hadoop fs -mkdir       /user/hive/warehouse

bin/hadoop fs -chmod g+w   /tmp

bin/hadoop fs -chmod g+w   /user/hive/warehouse

 

vi hive-default.xml

<property>

  <name>hive.exec.scratchdir</name>

  <value>/home/hadoop/hiveData/hive-${user.name}</value>

  <description>Scratch space for Hive jobs</description>

</property>
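The HDFS warehouse directory created above is already Hive's default location; only if a different path were wanted would it need to be set in the same file. A sketch of the relevant property, shown with its default value for reference:

<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>/user/hive/warehouse</value>
  <description>location of default database for the warehouse</description>
</property>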

Test

cd /usr/local/hive/bin/

./hive

CREATE TABLE pokes (foo INT, bar STRING);  

SHOW TABLES;

 CREATE TABLE invites (foo INT, bar STRING) PARTITIONED BY (ds STRING); 

 SHOW TABLES '.*s';

DROP TABLE pokes;

DESCRIBE invites; 
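The statements above only create and inspect tables. To exercise data loading as well, a sketch following the standard Hive getting-started example (kv2.txt ships with the Hive distribution under examples/files; adjust the path if it is elsewhere, and the partition value is arbitrary):

LOAD DATA LOCAL INPATH '/usr/local/hive/examples/files/kv2.txt' OVERWRITE INTO TABLE invites PARTITION (ds='2008-08-15');

SELECT a.foo FROM invites a WHERE a.ds='2008-08-15';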

