8.9 Setting up a Spark standalone cluster environment


8.1 Installing Scala
Steps 1-4: Download and install Scala
wget http://www.scala-lang.org/files/archive/scala-2.11.6.tgz
tar xvf scala-2.11.6.tgz
sudo mv scala-2.11.6 /usr/local/scala
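Step 5: Set the Scala environment variables for the user
Edit ~/.bashrc (sudo is not needed to edit a file in your own home directory)
gedit ~/.bashrc
Add the following lines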
export SCALA_HOME=/usr/local/scala
export PATH=$PATH:$SCALA_HOME/bin
Step 6: Apply the changes to ~/.bashrc
source ~/.bashrc
8.2 Installing Spark
Steps 1-3: Download and install Spark
wget http://apache.stu.edu.tw/spark/spark-1.4.0/spark-1.4.0-bin-hadoop2.6.tgz
tar zxf spark-1.4.0-bin-hadoop2.6.tgz
sudo mv spark-1.4.0-bin-hadoop2.6 /usr/local/spark/
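Step 4: Set the Spark environment variables for the user
Edit ~/.bashrc (sudo is not needed to edit a file in your own home directory)
gedit ~/.bashrc
Add the following lines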
export SPARK_HOME=/usr/local/spark
export PATH=$PATH:$SPARK_HOME/bin
Step 5: Apply the changes to ~/.bashrc
source ~/.bashrc
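
The ~/.bashrc edits for Scala and Spark can also be scripted instead of made by hand in gedit. The following is only a minimal sketch under stated assumptions: `append_once` and `add_spark_env` are hypothetical helper names (they are not part of Scala or Spark), and the install locations are the ones used in the steps above.

```shell
# append_once FILE LINE: append LINE to FILE unless an identical line already exists.
# (Hypothetical helper for illustration only.)
append_once() {
    local file="$1" line="$2"
    touch "$file"
    # -x matches whole lines only, -F treats the pattern as a fixed string
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# add_spark_env FILE: add the four export lines from sections 8.1 and 8.2 to FILE.
add_spark_env() {
    local rc="$1"
    # Single quotes keep $PATH and $SCALA_HOME literal, so they expand at login time.
    append_once "$rc" 'export SCALA_HOME=/usr/local/scala'
    append_once "$rc" 'export PATH=$PATH:$SCALA_HOME/bin'
    append_once "$rc" 'export SPARK_HOME=/usr/local/spark'
    append_once "$rc" 'export PATH=$PATH:$SPARK_HOME/bin'
}

# Usage:
#   add_spark_env ~/.bashrc
#   source ~/.bashrc
```

Because each line is added only if it is missing, running the helper a second time leaves the file unchanged.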

8.4 Starting the spark-shell interactive shell

spark-shell
8.5 Configuring the spark-shell log output

cd /usr/local/spark/conf
cp log4j.properties.template log4j.properties 
Edit log4j.properties
sudo gedit log4j.properties
In gedit, change the root logger level (the log4j.rootCategory line) from INFO to WARN
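
The same change can be made without opening an editor. This is a minimal sketch, assuming the copied template contains a line beginning with `log4j.rootCategory=INFO` and that GNU sed is available (as on Ubuntu); `set_root_log_level` is a hypothetical helper name, not a Spark tool.

```shell
# set_root_log_level FILE LEVEL: rewrite the level on the log4j.rootCategory line.
set_root_log_level() {
    local file="$1" level="$2"
    # Replace only the level token and keep the appender list (e.g. ", console").
    sed -i "s/^\(log4j\.rootCategory=\)[A-Z]*/\1${level}/" "$file"
}

# Usage (from /usr/local/spark/conf, after copying the template):
#   set_root_log_level log4j.properties WARN
```

Only the root logger line is touched; any per-package logger settings in the file keep their own levels.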
8.6 Starting Hadoop
start-all.sh
8.7 Running spark-shell locally
Step 1: Start spark-shell
spark-shell --master local[4]
Step 2: Read a local file
val textFile=sc.textFile("/usr/local/spark/README.md")
textFile.count
Step 3: Read an HDFS file
val textFile = sc.textFile("hdfs://master:9000/user/hduser/wordcount/input/pg5000.txt")
textFile.count
8.8 Running spark-shell on Hadoop YARN
Step 1: Start spark-shell on Hadoop YARN
SPARK_JAR=/usr/local/spark/lib/spark-assembly-1.4.0-hadoop2.6.0.jar HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop MASTER=yarn-client /usr/local/spark/bin/spark-shell
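Step 2: Read a local file
Incorrect syntax (with no scheme, the path is resolved against the default file system, which is HDFS here, so the local file is not found)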
val textFile=sc.textFile("/usr/local/spark/README.md")
textFile.count
Correct syntax (the file: prefix selects the local file system)
val textFile=sc.textFile("file:/usr/local/spark/README.md")
textFile.count
Step 3: Read an HDFS file (either the short path or the full hdfs:// URI works)
val textFile = sc.textFile("/user/hduser/wordcount/input/pg5000.txt")
val textFile = sc.textFile("hdfs://master:9000/user/hduser/wordcount/input/pg5000.txt")
textFile.count
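
The difference between the forms in Steps 2 and 3 comes down to how a path without a scheme is resolved. As a rough illustration only (this is not Hadoop's actual implementation), the hypothetical `qualify_path` function below mimics the rule: a path that already starts with a scheme such as file: or hdfs: is left alone, and a bare absolute path is prefixed with the default file system. The default value `hdfs://master:9000` is taken from the examples above and is assumed to match fs.defaultFS in the cluster's core-site.xml.

```shell
# qualify_path PATH [DEFAULT_FS]: print PATH with a file system scheme attached.
# (Hypothetical helper that only imitates the resolution rule for illustration.)
qualify_path() {
    local path="$1" default_fs="${2:-hdfs://master:9000}"
    case "$path" in
        [a-z]*:/*) printf '%s\n' "$path" ;;                 # starts with a scheme, e.g. file:/ or hdfs://
        /*)        printf '%s\n' "${default_fs}${path}" ;;  # bare absolute path: use the default file system
        *)         printf 'unsupported path: %s\n' "$path" >&2; return 1 ;;
    esac
}

qualify_path /usr/local/spark/README.md
qualify_path file:/usr/local/spark/README.md
qualify_path /user/hduser/wordcount/input/pg5000.txt
```

The first call prints an hdfs:// location, which is why the bare local path fails on YARN. The second keeps its file: prefix, and the third expands to the same full URI that Step 3 spells out by hand.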

This figure is from the official Spark website: https://spark.apache.org/


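The content above is excerpted from the following book, which is well suited to Python programmers learning Spark machine learning and big data architecture. See the link below for details: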
  Python+Spark 2.0+Hadoop機器學習與大數據分析實戰
  http://pythonsparkhadoop.blogspot.tw/2016/10/pythonspark-20hadoop.html



